We present a secure LAN using sound as the physical layer for low speed applications. In particular, we show a real implementation of a point-to-point or point-to-multipoint secure acoustic network, having a short range, consuming a negligible amount of power, and requiring no specific hardware on mobile clients. The present acoustic network provides VPN-like private channels to multiple users sharing the same medium. It is based on Time-hopping CDMA, and makes use of an encrypted Bloom filter. An asymmetrical error-correction is used to supply data integrity, even in the presence of strong interference. Simulations and real experiments show its feasibility. We also provide some theoretical analysis on the principle of operation.
Information sharing and data synchronization among devices, such as mobile phones and computers, are frequently conducted through self-configurable ad-hoc wireless networks. For instance, networks based on IEEE 802.11 (Wi-Fi) or Bluetooth standards became ubiquitous due to, among other factors, their ease of use, autonomous nature and independence of fixed structure. However, they are prone to security risks such as eavesdropping [
In security-minded organizations, such as military, homeland security, R&D or even banks and other financial institutions, where it is imperative to restrict the flow of sensitive information to specific areas, it has become a necessity to find an alternative to RF communication for establishing ad-hoc networks. Non-RF wireless communication solutions resilient to out-of-room monitoring comprise free-space optical and acoustical links. A number of them are reviewed in [
The main drawback of optical links is the requirement of a clear line of sight between devices, a condition that cannot be guaranteed in some working environments. Moreover, they require the installation of additional light sensing hardware on devices not equipped with cameras, or even infra-red transceivers.
Acoustical communications, however, can operate using standard hardware microphones and speakers, ubiquitous on information devices [
A similar technology addressing these use cases is RF-based Near Field Communications (NFC) [
But to the best of our knowledge, the question of privacy has been addressed only in the application layer using security protocols. We present a physical layer approach to secure acoustical communications based on time-hopping CDMA, similar to those presented in references [
Establishing a private link among previously un-paired mobile devices based on software privacy schemes requires some degree of user interaction that is usually neglected [
An envisioned application scenario for this technology is the validation of small financial transactions such as PoS (Point of Sale) or ATMs using an unmodified mobile device (e.g. a smart phone).
The main advantage of the proposed system is its simplicity: only a sound emitter (speaker), a receiver (microphone) and a sound media channel are needed, and those are already available in computers, tablets and cell-phones (
Proposed acoustic network may have heterogeneous nodes like cell-phones and personal computers
Code Division Multiple Access (CDMA) is usually implemented either as frequency hopping or direct-sequence spread-spectrum. However, we chose a time-hopping scheme because it provides a simple way of implementing a secure transmission and of dealing with broadcast media. Indeed, security is achieved, for each user communication, by selecting a time-slot to transmit each message bit, according to a cryptographically-secure pseudo-random number generator (CS-PRNG) [
Communication channels are structured into frames with a certain number of slots/bits, where each user transmits data according to the CS-PRNG [
The size of the frame is proportional to the number of simultaneous users that the system supports. Increasing the frame size will cause the total bandwidth to be shared among more users, but transmission delay will also increase because the whole frame must be received before decoding of the data can start. Ideally, the frame size should be minimized.
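The slot selection described above can be sketched as follows. This is a minimal illustration, not the implementation used in this work: the text does not name the CS-PRNG, so HMAC-SHA256 in counter mode is assumed here as a stand-in, and the names `slot_for_bit` and `slots_for_frame` are hypothetical.

```python
import hashlib
import hmac


def slot_for_bit(key: bytes, frame_index: int, bit_index: int, frame_size: int) -> int:
    """Return the slot (0..frame_size-1) in which one message bit is sent.

    Assumption: HMAC-SHA256 keyed with the shared secret plays the role of
    the CS-PRNG; the paper does not specify which generator is used.
    """
    counter = frame_index.to_bytes(8, "big") + bit_index.to_bytes(4, "big")
    digest = hmac.new(key, counter, hashlib.sha256).digest()
    # For frame sizes that are powers of two (e.g. 256) this reduction is unbiased.
    return int.from_bytes(digest[:8], "big") % frame_size


def slots_for_frame(key: bytes, frame_index: int, n_bits: int, frame_size: int = 256) -> list[int]:
    """Slot schedule of one user for one frame; collisions between bits are possible."""
    return [slot_for_bit(key, frame_index, i, frame_size) for i in range(n_bits)]
```

Transmitter and receiver that share `key` derive the same schedule independently, while an observer without the key cannot tell which slots belong to which user.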
Whenever On-Off Keying (OOK) modulation is used (see Section 2.4), the sound media can be modeled as a Z-channel [
Symbol insertion into the Bloom filter, considering symbols with two 1s and length or redundancy of 3. Arrows indicate assigned slot by the CS-PRNG
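The insertion shown in the figure can be sketched in a few lines. This is only an illustration under stated assumptions: `random.Random` is a non-cryptographic stand-in for the CS-PRNG, the function names are hypothetical, and the receiver uses a hard decision (a digit is read as 1 only if all of its K slots are 1), which is the rule implied by the error analysis given later.

```python
import random


def choose_slots(rng: random.Random, n_digits: int, k: int, frame_size: int) -> list[list[int]]:
    """Draw K slot positions for every digit of a symbol (collisions allowed)."""
    return [[rng.randrange(frame_size) for _ in range(k)] for _ in range(n_digits)]


def insert_symbol(frame: list[int], digits: list[int], slots: list[list[int]]) -> None:
    """OR one user's symbol into the shared frame.

    Z-channel behaviour: a 0 digit transmits nothing, so it can never
    erase a 1 written by another user.
    """
    for digit, positions in zip(digits, slots):
        if digit == 1:
            for p in positions:
                frame[p] = 1


def read_symbol(frame: list[int], slots: list[list[int]]) -> list[int]:
    """Hard decision: a digit is 1 only if all K of its slots are set."""
    return [int(all(frame[p] for p in positions)) for positions in slots]


# Two users sharing one frame of M = 256 slots with K = 3 repetitions.
frame = [0] * 256
slots_a = choose_slots(random.Random(1), 5, 3, 256)
slots_b = choose_slots(random.Random(2), 5, 3, 256)
insert_symbol(frame, [1, 0, 1, 0, 0], slots_a)
insert_symbol(frame, [0, 1, 0, 0, 1], slots_b)
received_a = read_symbol(frame, slots_a)
```

In this model a transmitted 1 is always read back as 1; the only possible error is a 0 read as 1 when other transmissions happen to cover all K of its slots.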
As an enhancement of the Bloom filter stage, Hamming weight equalization can be implemented as follows. Each user parses its output stream in groups of n bits and maps them to words of m bits with exactly m1 1s. Notice that the encoding has multiple 0s and, due to the Z-channel property, they do not interfere with other users' transmissions in the physical media. Such a mapping is possible if the combinatorial number of subsets of m1 elements taken out of a set of size m is at least 2^n, i.e.
$$\binom{m}{m_1} \geq 2^n;$$
notice that there is space for error detection in the case of a strict inequality. Furthermore, the reduction of Hamming weight from an average of n/2 to a fixed m1/m has two effects:
1) Optimization of the Bloom filter algorithm, as low Hamming weight symbols lead to a reduced collision rate;
2) Since all symbols have the same Hamming weight, no information can be extracted from the statistical analysis of the number of 1s and 0s.
The encoding/decoding process can be accomplished by using a simple lookup table, so it has O(1) time complexity, but O(m2) space complexity. A frame encoded using words of weight m1 will have W1 = m1 × N × K bits in the 1 state, where N is the number of clients/encrypted channels and K is the Bloom filter repetition parameter (see
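A minimal sketch of the weight-equalizing map, built as a pair of lookup tables, is given here for illustration. The assignment of values to words (lexicographic order of the 1 positions) and the example parameters n = 3, m = 5, m1 = 2 are assumptions; the text does not fix them.

```python
from itertools import combinations
from math import comb


def build_tables(n: int, m: int, m1: int):
    """Map every n-bit value to an m-bit word with exactly m1 ones.

    Requires C(m, m1) >= 2**n; words left unused can serve for error detection.
    """
    if comb(m, m1) < 2 ** n:
        raise ValueError("not enough weight-m1 words of length m")
    words = [
        tuple(1 if i in ones else 0 for i in range(m))
        for ones in combinations(range(m), m1)
    ]
    encode = {value: words[value] for value in range(2 ** n)}
    decode = {word: value for value, word in encode.items()}
    return encode, decode


# Example: C(5, 2) = 10 >= 2**3 = 8, so two words remain unused.
encode, decode = build_tables(n=3, m=5, m1=2)
```

A received word that is absent from `decode` (for instance one with more than m1 ones) signals a detected error.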
On-off keying modulation of sound waves, following the Z-channel interface model described in Section 2.2, encodes transmitted bits as pulses. Carrier frequency can vary from 10 kHz to 16 kHz. Good results can be obtained with a rate of 1000 bps at frame level. In experiments, delay (the time for a bit to traverse the network) was very high, due to Reed-Solomon 223/255 coding, a frame sized to support up to 16 users and the low capacity of the physical media. A more sensible choice of FEC algorithm (like BCH [
As follows from the description of the communication channel, synchronization between transmitter and receiver is essential for the correct decoding of information. For this purpose, an initial synchronization pattern is sent, so the receiver can adjust parameters like phase and decision level (see
(a) The synchronization pattern is an OOK (ASK) modulation of a 1-0-1-0 sound pattern. It is shown here before pulse shaping; (b) Once synchronization is done, the actual data bits are transmitted with the same modulation but with a duty cycle of 50%
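The modulation in the figure can be illustrated with a short sketch. Several details are assumptions: the 48 kHz sample rate, a full duty cycle for the synchronization pattern, and the absence of pulse shaping; the function names are hypothetical. The 16 kHz carrier and 1000 bps rate are the values used in the experiments.

```python
import math

SYNC_PATTERN = [1, 0, 1, 0]


def ook_modulate(bits, carrier_hz=16_000.0, bit_rate=1_000.0,
                 sample_rate=48_000, duty=0.5):
    """Return audio samples: carrier on for the first `duty` part of each 1 bit."""
    samples_per_bit = int(sample_rate / bit_rate)
    on_samples = int(samples_per_bit * duty)
    out = []
    for i, bit in enumerate(bits):
        for j in range(samples_per_bit):
            t = (i * samples_per_bit + j) / sample_rate
            carrier_on = bit == 1 and j < on_samples
            out.append(math.sin(2 * math.pi * carrier_hz * t) if carrier_on else 0.0)
    return out


def build_burst(data_bits):
    """Synchronization pattern (assumed full duty) followed by 50% duty data bits."""
    return ook_modulate(SYNC_PATTERN, duty=1.0) + ook_modulate(data_bits, duty=0.5)
```

A 0 bit is pure silence, which is what gives the shared medium its Z-channel character.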
In this section, we present an analytical estimate of the upper bound of the bit error probability, taking into account only the interference from other users in the network. Moreover, we will not consider the correction capability of the Reed-Solomon code.
As explained in Section 2.3, let us assume that each user groups information bits into packets of length n. Each packet is coded using exactly m1 ones and m0 zeroes (m1 + m0 = m). Each of the resulting m binary digits is repeated K (Bloom filter length) times at randomly chosen places in a frame of length M. Repetitions of a binary digit may collide with other repetitions of the same digit or with those corresponding to another digit. Let N be the number of active users.
In order to roughly estimate the bit error rate, we shall make the following simplifying assumptions:
We shall assume that frames from different users are synchronized and, thus, each frame contains (counting collisions) W0 = N × K × m0 zeroes and W1 = N × K × m1 ones.
We shall not include in the analysis the possibility of error correction due to the fact that, in general,
More specifically, whenever an erroneous sequence of m binary digits with more than m1 1s is received, it is mapped to a randomly selected bit string of length n. Therefore, the expected number of errors will be n/2.
Under these assumptions, the bit error rate for a given user is given by
By the union bound,
Thus, let us fix our attention on one of the m0 zeroes. If the transmitted (by all users) W1 ones occupy s slots and the K repetitions of the given 0 use r (≤ K) slots, then a necessary condition for error is that s ≥ r. So let us assume that there are s 1s in a frame of M bits. Given r (fixed) places in the frame, the probability that the 1s occupy those r positions is given by
where we have assumed that M, s >> r.
If M >> K, it is not difficult to see that the K repetitions of a given 0 occupy µR ≈ K slots on average. It can also be shown that the average number of slots occupied by the W1 1s transmitted by all users is
if M and W1 are large (see Appendix).
From these equations, a rough estimate of the bit error rate for a given user is
BER estimate vs. Bloom filter repetition rate. Reed-Solomon coding is not taken into account. The theoretical calculations are an upper bound for the BER. At K = 11 the lowest bound is found. Not only is its shape quite similar to the BER found through numerical simulations, but the actual minimum BER is also found for K = 10, very close to that indicated by the theoretical calculation. Results shown are for a frame of M = 256 assuming 8 simultaneous clients
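The dependence on K can be explored numerically with a short sketch. It is assembled from the steps described above and is not necessarily the exact expression used for the figure: the probability that r given slots are all occupied is approximated by (s/M)^r, the occupied fraction by 1 − exp(−W1/M), and the estimate by (1/2) · m0 · (1 − exp(−N·K·m1/M))^K, where the factor 1/2 comes from the n/2 expected bit errors per corrupted packet and m0 from the union bound over the zeroes. The value m1 = 2 is taken from the symbol-insertion figure; m0 = 3 is a hypothetical choice that does not affect the position of the minimum.

```python
import math


def zero_flip_probability(k: int, frame_size: int, users: int, m1: int) -> float:
    """Probability that all K copies of a given 0 are covered by 1s."""
    w1 = users * k * m1                           # ones written per frame, W1 = N*K*m1
    occupied = 1.0 - math.exp(-w1 / frame_size)   # expected fraction of slots set to 1
    return occupied ** k


def ber_estimate(k: int, frame_size: int, users: int, m1: int, m0: int) -> float:
    """Rough bound: union bound over m0 zeroes, half the bits wrong on error."""
    return min(1.0, 0.5 * m0 * zero_flip_probability(k, frame_size, users, m1))


# M = 256 and N = 8 as in the figure; m1 = 2 and m0 = 3 are assumptions.
best_k = min(range(1, 31), key=lambda k: ber_estimate(k, 256, 8, 2, 3))
```

With these assumed values the minimum falls at K = 11, in line with the figure; in this approximation the optimum sits near K ≈ (M / (N · m1)) · ln 2.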
In order to study the behavior of our protocol, we developed a piece of software implementing the present proposal. We first detail the software architecture and then the results of the simulations.
In order to match the communication stack accurately, we adopted a modular software architecture where the modules are chained via POSIX standard input/outputs. This structure allows each stage to be modified separately and some of the simulation modules to be reused in the final implementation without modification. The high-level structure can be seen at
Each simulation run begins when a binary data block (random bit sequence) is fed to the first module, that is, the Reed-Solomon encoder. The output of this encoder is fed to the next block, an interleaver. In this way, the original data is transformed at each stage and successively passes through all modules until it is converted into audio. At the receiver stage the demodulator generates the binary data blocks that go in reverse order through the same modules of the modulator stage, until the receiver Reed-Solomon decoder is reached. At this stage, the system compares the original input with the output, and calculates and reports the BER.
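The final comparison step can be sketched as below. This is an illustrative helper, not the code of the simulator; the name `bit_error_rate` and the byte-string interface are assumptions.

```python
def bit_error_rate(sent: bytes, received: bytes) -> float:
    """Fraction of differing bits between the original and the decoded block."""
    if len(sent) != len(received):
        raise ValueError("blocks must have the same length")
    if not sent:
        raise ValueError("blocks must not be empty")
    # XOR each byte pair and count the 1 bits of the result.
    errors = sum(bin(a ^ b).count("1") for a, b in zip(sent, received))
    return errors / (8 * len(sent))
```

For example, two 2-byte blocks that differ in a single bit give a BER of 1/16.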
In the numerical simulations, no noise was added to the audio channel, but speaker and microphone hardware limitations were simulated using bandpass filters at the carrier frequency with a 4 kHz bandwidth. Filters were implemented using the Sound eXchange (SoX) Unix utility.
Normally over 10^6 bits need to be simulated for each client under this configuration. Computational resources needed for simulation are considerable, so the software provides a client-server model in which calculations can be shared among multiple nodes.
Simulations show a peak medium utilization of 35% (consistent with a slotted-aloha type network [
It is important to note that only client-to-client interference was simulated, with no ambient noise. OOK modulation is more sensitive to noise than other, more efficient modulation schemes, but the flexibility of the error correction architecture adapts to situations of high ambient noise by simply reducing the number of available encrypted channels, as we will discuss in the next section.
Software stack of the modulator stage
Simulation results closely tracked those of experiments with the physical implementation of the scheme (see
We present here the results of real-world scenarios, testing our protocol on two computers (a Lenovo T420 and a Lenovo X60) and one cell-phone (an HTC Status). We developed a version working on GNU/Linux and another on Android.
All measurements were conducted at a rate of 1000 bps at the physical media, with a 16 kHz carrier signal, while maintaining the other parameters at the same values that were used in the simulations, including the bandpass filter centered at 16 kHz with a 4 kHz width. The transmitted data was 4096 bits per user, with M = 256. This M, lower than in the simulated cases, was used because it reduces the delay, introduced by the error correction, before the end user starts to receive information. Indeed, the implemented Reed-Solomon code needs 256-byte blocks to perform corrections. Sound output volume was set at the maximum possible for each device, while the sound input amplification was optimized for each measurement.
We also performed an experiment to test the BER as a function of distance, using a cell-phone as the transmitter and the computer as the receiver.
We performed experiments with the system operating at different carrier frequencies; we discuss them in the following paragraphs.
Communications using a 12 kHz tone carrier were extremely susceptible to ambient noise. Indeed, a 50 cm link between a Lenovo T420 laptop and an HTC Status phone suffered a BER in excess of 15% with slight noise interference (like little bumps on a nearby table). This observation motivated the use of the highest attainable frequency. A 16 kHz carrier provided the widest range of compatibility among tested devices, because some of them could not emit at higher frequencies.
Tests were also done at the highest frequencies attainable for each device. For instance, the speakers of the Lenovo T420 and Lenovo X60 laptops proved capable of establishing a link at 19.2 kHz, but only for very short distances (20 cm). Nevertheless, this carrier frequency allowed a faster link (2000 bps) with the same BER.
The modulated sound signal can represent a nuisance to nearby persons and animals. Several different carrier frequencies were tested as a way to evaluate the level of discomfort. The 19.2 kHz carrier signal was perceived as almost non-audible, with 16 kHz being clearly audible for most people and the link at 12 kHz being the most uncomfortable. It should be noted that loudness and hence discomfort increase with the number of simultaneous users.
We presented a wireless secure communication protocol that uses sound waves as the physical transmission medium. It can be used as a point-to-point or point-to-multipoint protocol, as digital data can be transmitted using different carrier frequencies. More importantly, the protocol can use transducers and sensors like speakers and microphones already present in billions of devices like desktops, laptops and other mobile computers.
Link between two laptops (Lenovo T420 and Lenovo X60) simulating multiple client nodes. No errors were measured for 10 to 14 clients when transmitting 4096 bits
Link between laptop (Lenovo T420) and mobile phone (HTC Status) presents errors only over 60 cm. At a separation of 50 cm and below, no errors were recorded with 4096 transmitted bits
Security is built into the system, guaranteed by Time-Hopping CDMA with a CS-PRNG. Design problems, like the delay before communication starts and the limited bandwidth available in standard audio channels, can be mitigated by adjustments to system parameters. For instance, a mechanism for dynamic utilization and assignment of the channels as a function of the present users (i.e., noise) can be implemented to improve the capacity of the system. Although in the present test channel the mentioned delay was excessive (66 s) for some applications requiring short response times (e.g. banking transactions), this can be reduced by decreasing the maximum number of simultaneous users supported by the system, and implementing both a lower-delay interleaver and an outer error correction algorithm like BCH.
We validated our proposal by simulations, theoretical analysis and experiments in real scenarios. Our protocol may provide an economical alternative to NFC, RFID, two-factor authentication, ATM-banking or any other application requiring cryptographically secure network access at short distances.
This work was supported by PICT-497/2006 of the Agencia Nacional de Promoción Científica y Tecnológica (ANPCyT), Argentina.
1 More efficient soft-decoding is also possible.