Underwater Acoustic Voice Communications Using


UNDERWATER ACOUSTIC VOICE COMMUNICATIONS USING DIGITAL PULSE POSITION MODULATION

Hayri Sari and Bryan Woodward

The authors are with the Department of Electronic and Electrical Engineering, Loughborough University, Leicestershire, United Kingdom

Abstract - The paper presents a novel technique for underwater acoustic voice communications. The speech signal is compressed prior to transmission by using Linear Predictive Coding and transmission of appropriate speech parameters is achieved by Digital Pulse Position Modulation. The main emphasis here is on the reception of the multipath-dominant signal and on the demodulation process.

I. INTRODUCTION

Many underwater acoustic communication systems have been designed and manufactured commercially, mostly based on analogue technology and employing Single Side Band (SSB) modulation [1]. Technically, these systems are well behind the advanced digital technology of mobile radio communication systems. Apart from the modulation scheme adopted, analogue systems have other inherent shortcomings; for example, private communications are not easily achievable and multipath propagation can degrade the quality of the transmitted speech. These limitations can be overcome by implementing digital technology, as is now commonplace for mobile telephones. It is for these reasons that several experimental systems for digital underwater voice communications have been developed [2, 3].

Since an underwater acoustic communication channel is bandwidth-limited, transmission of quantized speech samples at high bit rates is restricted, hence speech signals must be compressed. There are internationally recognised low bit rate speech coding techniques available at transmission rates of 2.4 kbit/s to 16 kbit/s [4]. In this study, Linear Predictive Coding (LPC) at a bit rate of 2.4 kbit/s is implemented [5]. This technique is the most fundamental of the low bit rate speech coders and is based on estimations of a speech production model represented by the following equation:
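The displayed equation is not reproduced in this copy. A common all-pole form that uses exactly the symbols defined in the surrounding text is given here as an assumption; in particular, the sign convention in the denominator and the use of the transfer-function form H(z) are not confirmed by the source.

$$H(z) = \frac{G}{1 - \sum_{k=1}^{P} a[k]\, z^{-k}} \qquad (1)$$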

where G is the gain of the excitation input, each value of a[k] is a coefficient of an all-pole digital filter representing the vocal tract, and P is the number of coefficients. These parameters are calculated for every speech frame, each of duration 22.5 ms, on the assumption that the speech signal is stationary over this period. The speech parameters, i.e. ten reflection coefficients (extracted from all-pole filter coefficients), pitch period, gain and voiced/unvoiced decision, are quantized in the sequence 5/5/5/5/4/4/4/4/3/2/6/6/1 (54 bits per speech frame) to achieve a transmission rate of 2.4 kbit/s.

II. TRANSMISSION OF SPEECH PARAMETERS

Transmission of encoded speech parameters can be achieved by using various digital modulation techniques, such as ASK, FSK and PSK. Each has certain advantages and disadvantages when applied to a multipath-dominant underwater acoustic communication channel. These factors include the complexity of the implemented system, bandwidth efficiency, power efficiency and the effect of multipath propagation and inter-symbol interference. In recent years, there has been a growing number of applications of coherent PSK modulation at high data rates [6]. This is achievable owing to advances in digital signal processing technology. However, since the present system is designed to be portable, an alternative, more power-efficient, modulation technique is considered. A suitable choice is digital pulse position modulation (DPPM) [7].

The design of the prototype system is based on TMS320C31 Digital Signal Processors (DSP), one in the transmitter and the other in the receiver. They are used to execute the digital pulse position modulation / demodulation algorithms in real time as well as to digitize and compress / decompress the speech signal.

A. Digital Pulse Position Modulation and Implementation

Digital PPM has been shown to be an effective modulation format for transmitting digital information in an optical channel [8]. It has also been considered for underwater acoustic data transmission [9], including voice communications, because of its suitability for power-efficient channels [7], comparative simplicity in implementation and reduced sensitivity to multipath propagation. Digital information is transmitted by dividing each data frame duration, T_symbol, into M possible data slots, each of duration T_slot, and locating a transmission pulse in just one of these time slots. The mathematical definition of a DPPM pulse stream is given in [8] as
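The displayed equation is not reproduced in this copy. A standard way of writing a PPM pulse stream with the symbols used in the surrounding text is shown here as an assumption rather than as the authors' exact expression; the summation index n, counting symbol intervals, is introduced for illustration.

$$x(t) = \sum_{n} g\left(t - n\,T_{symbol} - t_n\right) \qquad (2)$$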

0-7803-4108-2/97/$10.00 © 1997 IEEE



where g(t) is the PPM pulse shape and t_n is the random data coded into PPM such that

$$0 \le t_n \le (M-1)\,T_{slot} \qquad (3)$$

To transmit quantized speech parameters, it is appropriate to select 8-slot DPPM, i.e. 3 bits per symbol. Therefore 18 symbols must be transmitted during each speech frame so that real-time operation may be achieved. The symbol interval is sub-divided into eight data slots and two guard slot intervals, as shown in Fig. 1.
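The frame arithmetic behind these figures can be checked with a short sketch. It assumes the quantization sequence reads 5/5/5/5/4/4/4/4/3/2 for the ten reflection coefficients followed by 6, 6 and 1 bits for pitch period, gain and voiced/unvoiced decision; that reading is an interpretation of the printed sequence that matches the stated total of 54 bits, and the variable names are illustrative only.

```python
# Bit allocation per 22.5 ms LPC frame, in the order the text lists the
# parameters (assumed reading of the printed quantization sequence).
REFLECTION_BITS = [5, 5, 5, 5, 4, 4, 4, 4, 3, 2]
PITCH_BITS, GAIN_BITS, VOICING_BITS = 6, 6, 1

FRAME_DURATION_S = 22.5e-3   # speech frame interval from the text
BITS_PER_SYMBOL = 3          # 8-slot DPPM carries 3 bits per symbol

bits_per_frame = sum(REFLECTION_BITS) + PITCH_BITS + GAIN_BITS + VOICING_BITS
bit_rate = bits_per_frame / FRAME_DURATION_S          # bit/s
symbols_per_frame = bits_per_frame // BITS_PER_SYMBOL
data_slots = 2 ** BITS_PER_SYMBOL                     # slots needed per symbol

print(bits_per_frame, round(bit_rate), symbols_per_frame, data_slots)
```

With these values the sketch gives 54 bits per frame, a rate of 2400 bit/s, 18 symbols per frame and 8 data slots per symbol, in agreement with the text.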

Fig. 1 Transmission of digital data in DPPM format (speech frame interval = 22.5 ms; data slots 000, 001, 010, 011, 100, 101, 110 and 111, followed by a guard band).

Fig. 2 Transmission of synchronisation signal and octal data (601) in DPPM format.

In defining T_symbol, careful attention must be paid to the resonance frequency and Q of the underwater transducers; these are normally referred to as a projector-hydrophone pair, which in this application each act in both transmission and reception mode. For such a transducer to be driven in its steady state, it must be excited by a sinewave of at least Q cycles duration. Another limiting factor is the rapid increase with range of the absorption loss at higher frequencies [1]. To achieve a range of several hundred meters, as is commonly claimed for the conventional analogue systems mentioned above, a transducer with a 70 kHz resonance frequency and a Q factor of 5 is used. For this application T_symbol = 1 ms, hence T_slot = 100 µs is defined. Two timer ports of the DSP, T0 and T1, are used to generate T_slot and the carrier frequency of 70 kHz, as illustrated in Fig. 2.

B. Synchronisation in DPPM

Since in the DPPM method information is encoded by the temporal slot position of the pulse in the symbol interval, accurate timing in both the modulation and the demodulation processes is essential. The question of how best to achieve DPPM synchronisation has received some attention in the literature [10], and here three aspects are considered: speech frame synchronisation, symbol synchronisation and slot synchronisation. The last of these is the most significant because it provides synchronisation of the others. However, since the speech data are in packet form, it is important to know when they are being transmitted; therefore a speech frame synchronisation signal is introduced, as shown in Figs. 1 and 2, so that synchronisation between the transmitter and receiver is updated every 22.5 ms. This is necessary because, at the receiver, the pulse position may be significantly shifted from its original location, leading to a timing error during the demodulation process. Moreover, the communication link can be intermittently interrupted and may need to be resumed.

The duration of the speech frame synchronisation signal must be carefully decided in order to distinguish it from a data burst. After the transmission of a data burst, the received signal will generally be detected as a spread-out version of the transmitted burst due to the response of the hydrophone and the effect of multipath propagation in the underwater channel. The transmitted data burst may therefore occupy several successive slot intervals. Consideration of these factors led to the choice of a speech frame synchronisation signal equal to six slot intervals, i.e. 6T_slot, and transmitted during a symbol period as shown in Figs. 1 and 2.

C. Symbol Transmission

When the speech parameters are calculated and encoded (54 bits) as defined above, they are stored in two 32-bit arrays, with an integer number of symbols in each array. As soon as the synchronisation signal is transmitted, the symbols are encoded for transmission. The symbol value and the slot counter for each symbol interval are compared with timer T0 of the DSP. When they are equal, the carrier waveform is generated via timer T1 during that slot interval. As an example, transmission of three symbols (601 in octal) is shown in Fig. 2. This procedure is repeated until all 18 symbols are transmitted.

The speech analysis and parameter transmission algorithms are simultaneously executed during the operation of the system. Priority is given to parameter transmission due to the precision requirement of DPPM. These priorities are managed through the interrupts of the DSP.
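The slot-position encoding described above, together with the maximum-energy slot decision described in Section III.B, can be illustrated with a minimal round-trip sketch. It is not the authors' DSP code: the timers and carrier are replaced by an idealised baseband envelope, the synchronisation burst is omitted, perfect timing is assumed, and the function names and the figure of four samples per slot (40 kHz sampling of 100 µs slots) are assumptions made for illustration.

```python
from typing import List

SLOTS_PER_SYMBOL = 10   # 8 data slots + 2 guard slots, as in the text
DATA_SLOTS = 8
SAMPLES_PER_SLOT = 4    # assumption: 100 us slots sampled at 40 kHz


def bits_to_symbols(bits: List[int]) -> List[int]:
    """Group a bit list (most significant bit first) into 3-bit symbol values."""
    if len(bits) % 3:
        raise ValueError("bit count must be a multiple of 3")
    return [bits[i] * 4 + bits[i + 1] * 2 + bits[i + 2]
            for i in range(0, len(bits), 3)]


def modulate(symbols: List[int]) -> List[float]:
    """Idealised envelope: 1.0 during the slot equal to the symbol value, else 0.0."""
    envelope: List[float] = []
    for value in symbols:
        if not 0 <= value < DATA_SLOTS:
            raise ValueError("symbol out of range")
        for slot in range(SLOTS_PER_SYMBOL):
            level = 1.0 if slot == value else 0.0
            envelope.extend([level] * SAMPLES_PER_SLOT)
    return envelope


def demodulate(envelope: List[float]) -> List[int]:
    """For each symbol interval, return the data slot with the largest energy."""
    per_symbol = SLOTS_PER_SYMBOL * SAMPLES_PER_SLOT
    symbols = []
    for start in range(0, len(envelope), per_symbol):
        energies = []
        for slot in range(DATA_SLOTS):
            first = start + slot * SAMPLES_PER_SLOT
            window = envelope[first:first + SAMPLES_PER_SLOT]
            energies.append(sum(x * x for x in window))
        symbols.append(energies.index(max(energies)))
    return symbols


tx = bits_to_symbols([1, 1, 0, 0, 0, 0, 0, 0, 1])   # the octal 601 example
rx = demodulate(modulate(tx))
print(tx, rx)
```

The example reproduces the three-symbol case of Fig. 2: the bits 110 000 001 map to slot positions 6, 0 and 1, and the same values are recovered from the envelope.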

III. DEMODULATION OF DPPM SIGNAL

At the receiver, an omnidirectional hydrophone is used and acoustic pulses transmitted through the underwater channel in DPPM format are detected. The input signal is amplified with a factor of G, which depends on the radiated acoustic power, the transmission loss along the channel and the sensitivity of the hydrophone. Next, the amplified signal is applied to a bandpass filter. To preserve the envelope of the signal, its bandwidth is set to be the same as that of the hydrophone, i.e. 14 kHz. In DPPM signal detection, the limited bandwidth of the receiver affects the performance of the system by introducing a finite rise time to the received signal envelope, as shown in Fig. 3. The slow signal rise time may result in an error in data decoding since the necessary timing accuracy in the pulse position may not be achieved.
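The 14 kHz bandwidth is consistent with the transducer parameters given in Section II (70 kHz resonance, Q of 5), and the same numbers show why a 100 µs slot is long enough to drive the transducer into its steady state. The sketch below works through this arithmetic. The rise-time line uses the textbook rule of thumb for a first-order response (0.35 divided by the equivalent lowpass bandwidth, i.e. 0.7 divided by the bandpass bandwidth); that rule is an assumption added here and is not a figure from the paper.

```python
CARRIER_HZ = 70e3    # transducer resonance frequency (from the text)
Q_FACTOR = 5         # transducer Q (from the text)
T_SLOT_S = 100e-6    # slot duration (from the text)

bandwidth_hz = CARRIER_HZ / Q_FACTOR       # Q = f0 / bandwidth
min_burst_s = Q_FACTOR / CARRIER_HZ        # Q carrier cycles to reach steady state
cycles_per_slot = CARRIER_HZ * T_SLOT_S    # carrier cycles contained in one slot

# Rule-of-thumb 10-90 % envelope rise time for a first-order bandpass response
# (assumption, not a value stated in the paper).
rise_time_s = 0.7 / bandwidth_hz

print(bandwidth_hz, min_burst_s, cycles_per_slot, rise_time_s)
```

Under these assumptions the minimum burst is about 71 µs (five carrier cycles), which fits inside one slot of about seven carrier cycles, while an envelope rise time of the order of 50 µs is a substantial fraction of the slot, in line with the timing concern raised above.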


Fig. 3 Transmitted and received waveforms in DPPM format

Demodulation of the DPPM signal is based on envelope detection of the passband signal. Therefore, an envelope detector is used to extract the baseband signal, which has an approximate duration of 100 µs for the data bursts. Its output is then applied to a lowpass filter with a bandwidth of 10 kHz.

Coherent detection principles are employed to demodulate the DPPM baseband signal as shown in Fig. 4. This is done by digitising at a 40 kHz sampling frequency, using an 8-bit MAXIM 153 Analogue-to-Digital Converter (ADC). As expected, the higher the sampling frequency, the better the estimation of the pulse position. Since the rising edge of the synchronisation signal is taken as the reference point in DPPM, a high sampling rate for the input signal improves the correct decoding of the transmitted data.
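The effect of the sampling frequency on timing resolution can be quantified with a short sketch using the values quoted in the text. The variable names are illustrative, and the worst-case figure simply assumes that the rising edge can fall anywhere between two consecutive samples.

```python
SAMPLE_RATE_HZ = 40e3   # ADC sampling frequency (from the text)
T_SLOT_S = 100e-6       # slot duration (from the text)
SYNC_SLOTS = 6          # length of the synchronisation signal in slots

sample_period_s = 1.0 / SAMPLE_RATE_HZ
samples_per_slot = round(T_SLOT_S * SAMPLE_RATE_HZ)
sync_samples = SYNC_SLOTS * samples_per_slot

# Largest delay between the true rising edge and the first sample after it,
# expressed as a fraction of one slot.
worst_case_edge_error = sample_period_s / T_SLOT_S

print(sample_period_s, samples_per_slot, sync_samples, worst_case_edge_error)
```

At 40 kHz each slot holds four samples and the synchronisation signal spans 24 samples, so the detected edge can lag the true edge by up to a quarter of a slot; doubling the sampling frequency would halve that uncertainty.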


Fig. 4 Baseband signal detection

A. Threshold Definition and DPPM Signal Detection

If the transmitter and receiver are not synchronised, difficulty will arise in the receiver making a decision about the input signal, i.e. whether it is the synchronisation signal, a data signal, a multipath signal or noise. The presence of the signal must therefore be verified and the slot synchronisation must be established. This is achieved by introducing a threshold-based detection method, but this is only used for recognising the synchronisation signal. However, selection of the threshold level is a difficult task that requires consideration of multipath signals and the ambient noise level of the channel.

A possible solution to this problem is the introduction of an adaptive threshold setting process. Samples of the input signal, x[n], are taken for the duration of a speech frame (22.5 ms), which is an optimum interval since it includes the synchronisation signal. From these samples, the maximum and minimum magnitudes representing the transmitted information or noise are extracted. Then, by using the maximum magnitude, a synchronisation signal, a[k], consisting of 24 samples (i.e. 6T_slot) is simulated in software. The synchronisation interval in the real input signal is estimated by using

$$\varepsilon[n] = \sum_{k=0}^{23} \left| a[k] - x[n+k] \right|, \qquad n = 0, 1, 2, \ldots, N-k \qquad (4)$$

where N is the number of samples in process, ε[n] is the magnitude difference function and its minimum value represents the location of the received synchronisation signal. Once the beginning of this signal is found, its mean amplitude is calculated and set as the threshold level. For a reliable threshold setting, the maximum value of the difference function, which defines the dissimilarity of two signals, is also used. If the maximum and minimum values of ε[n] are similar, this suggests that there is no data transmission in the channel and the processed signal is background noise. Therefore, the condition ε_max ≥ 2ε_min must be satisfied, where the multiplication factor of 2 is arbitrarily selected. Once the initial threshold setting is achieved, its value can be updated by monitoring subsequent synchronisation signals.

In the present system, the threshold level is fixed. Although this simplifies the design, it reduces the performance of the system. In the controlled test environment (a tank 9 m long x 5 m wide x 2 m deep filled with fresh water), with suitably placed transmitting and receiving transducers, the threshold level was defined after carefully studying the multipath signal magnitudes. Since coherent detection is applied to the synchronisation signal, its rising edge is detected. When the digitised input signal magnitude is equal to or greater than the threshold level, the sample could be from a synchronisation signal, a data signal or even a multipath signal. This may not be the closest sample to the rising edge of the input baseband signal, which is considered as late synchronisation. Therefore, it is important that the first sample is taken from non-signal intervals, i.e. the sample value is lower than the threshold level. The receiver then starts searching for the rising edge of the synchronisation signal at the same time as achieving slot synchronisation. However, there is still an uncertainty about the origin of the signal, i.e. synchronisation or data signal.

At this stage, it is essential to distinguish these signals from each other. Since the durations of the synchronisation signal and data signal are known, their energies during each slot interval are measured as defined by

$$E[\mathrm{slot}] = \sum_{k=0}^{3} x[k]^2, \qquad 0 \le \mathrm{slot} \le 7 \qquad (5)$$

Then a comparison-based decision is applied to identify the nature of the signal. It is expected that the energies in the six slot intervals are higher than those of non-signal transmission intervals. This condition is provided as defined in

$$\text{if}\quad \begin{cases} E[0] \ge 4\,(\text{threshold value})^2 \\ \quad\vdots \\ E[5] \ge 4\,(\text{threshold value})^2 \\ E[6] \le 4\,(\text{threshold value})^2 \\ E[7] \le 4\,(\text{threshold value})^2 \end{cases} \quad \text{then synchronisation signal}$$

If the synchronisation signal is not detected, the system will not proceed to decode DPPM data. When this occurs during transmission, the speech parameters are lost; this will introduce distortion in the synthesised speech signal.
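The detection steps of this subsection, i.e. the difference-function search of Eq. 4 followed by the slot-energy test built on Eq. 5, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the envelope samples are taken to be non-negative, x[k] in Eq. 5 is read as the k-th sample within a slot, the energy limit is read as four times the squared threshold, the synthetic signal and the fixed threshold of 0.5 are hypothetical, and the guard against a perfectly flat input is an added safeguard.

```python
from typing import List, Optional

SAMPLES_PER_SLOT = 4                           # 100 us slot sampled at 40 kHz
SYNC_SLOTS = 6                                 # sync burst lasts six slots
SYNC_SAMPLES = SYNC_SLOTS * SAMPLES_PER_SLOT   # 24 samples


def difference_function(x: List[float]) -> List[float]:
    """Eq. 4: sum over k of |a[k] - x[n+k]| for a flat template a[k]."""
    peak = max(x)   # template amplitude taken from the maximum magnitude
    return [sum(abs(peak - x[n + k]) for k in range(SYNC_SAMPLES))
            for n in range(len(x) - SYNC_SAMPLES + 1)]


def locate_sync(x: List[float], ratio: float = 2.0) -> Optional[int]:
    """Start index of the sync burst, or None if the input looks like noise."""
    eps = difference_function(x)
    eps_min, eps_max = min(eps), max(eps)
    # Reject when eps_max >= ratio * eps_min fails; the eps_max == 0 check
    # covers a perfectly flat input (added safeguard, not from the paper).
    if eps_max == 0.0 or eps_max < ratio * eps_min:
        return None
    return eps.index(eps_min)


def slot_energies(x: List[float], start: int, n_slots: int = 8) -> List[float]:
    """Eq. 5: sum of squared samples in each slot, counted from `start`."""
    return [sum(v * v for v in x[start + s * SAMPLES_PER_SLOT:
                                 start + (s + 1) * SAMPLES_PER_SLOT])
            for s in range(n_slots)]


def is_sync(energies: List[float], threshold: float) -> bool:
    """Slots 0-5 at or above 4 * threshold^2, slots 6-7 at or below it."""
    limit = SAMPLES_PER_SLOT * threshold ** 2
    return (all(e >= limit for e in energies[:SYNC_SLOTS])
            and all(e <= limit for e in energies[SYNC_SLOTS:]))


# Synthetic envelope: noise floor, a 24-sample sync burst, then one data pulse.
NOISE, PULSE = 0.05, 1.0
signal = [NOISE] * 12 + [PULSE] * 24 + [NOISE] * 40 + [PULSE] * 4 + [NOISE] * 20

start = locate_sync(signal)
burst = signal[start:start + SYNC_SAMPLES]
adaptive_threshold = sum(burst) / len(burst)   # mean amplitude of the burst
FIXED_THRESHOLD = 0.5                          # hypothetical fixed level

print(start, adaptive_threshold)
print(is_sync(slot_energies(signal, start), FIXED_THRESHOLD))   # sync burst
print(is_sync(slot_energies(signal, 76), FIXED_THRESHOLD))      # data pulse
```

In this synthetic case the minimum of the difference function falls at the first sample of the burst, the six-slot burst passes the energy test, and the single-slot data pulse starting at sample 76 fails it, which is the distinction the decision rule is meant to draw.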

B. Decoding of DPPM Data

The rising edge of the synchronisation signal, once detected, is taken as the timing reference point in decoding the DPPM signal. The threshold level comparison is not used in this process. The energy in each slot over a symbol interval is calculated as in Eq. 5. If the multipath signal does not have a destructive effect, the maximum energy should be measured during the data slot interval. The position of this slot represents the transmitted DPPM symbol value. The same energy comparison process is continued for the subsequent 17 symbol intervals. Once this is completed, the receiver starts seeking the synchronisation signal again. The decoded speech parameters are then applied to synthesise the speech signal by using Eq. 1.

During tests of the system, the multipath effect was minimised by suitable spatial positioning of the transducers in the tank. Therefore, during demodulation it was ignored.

IV. RESULTS AND DISCUSSION

Since the system is designed to operate as a portable unit, i.e. no interface is available to a computer, it does not provide the facility of error rate measurement. Therefore, performance of the DPPM transmission is based on subjective tests applied to judge the synthesised speech quality.

The system was tested in two different communication channels. The first was a noiseless channel with a wide bandwidth, i.e. a wire connection between the transmitter and the receiver to eliminate the bandwidth limitations of the transducers and multipath propagation effect. The second was a bandwidth-limited underwater acoustic channel. In the wire link case, detection and demodulation of the DPPM signal were accurately done since the rectangular envelope of the baseband signal was preserved. Five subjects detected no loss of the synchronisation signal, which would introduce gaps in the detected speech, and noted that speech intelligibility was preserved.

The system was also tested in the tank, but with only the transducers underwater (not the electronics). With these at 1 m depth and 1 m separation, correct transmission and reception were successfully achieved, as illustrated in Fig. 5. Although the same speech intelligibility was observed with synthesised through-water speech as with the wire link transmission, occasional gaps in the speech signal were noticed due to loss of the synchronisation signal. Since no error correction and reconstruction of lost parameters is included in the design, these problems were assessed subjectively. A recommended solution, especially for the loss of synchronisation, is to use the last correctly detected speech parameters. The system is capable of implementing such an algorithm and this is under consideration for future developments.
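The recommended remedy, reusing the last correctly detected speech parameters when a frame is lost, can be sketched in a few lines. The representation of a frame as a list of integers, the use of None to mark a frame whose synchronisation signal was missed, and the function name are all hypothetical choices made for illustration.

```python
from typing import List, Optional, Sequence

Frame = List[int]   # hypothetical: one frame of quantized speech parameters


def conceal_lost_frames(frames: Sequence[Optional[Frame]]) -> List[Optional[Frame]]:
    """Replace each lost frame (None) with the last correctly received one."""
    last_good: Optional[Frame] = None
    output: List[Optional[Frame]] = []
    for frame in frames:
        if frame is not None:
            last_good = frame
        # Copy so that later edits to one frame do not alter its repeats.
        output.append(list(last_good) if last_good is not None else None)
    return output


received = [[1, 2], None, None, [3, 4], None]
print(conceal_lost_frames(received))
```

A frame lost before any frame has been received cannot be concealed this way and is left as None, so a synthesiser using this scheme would still need a defined behaviour, such as silence, at the very start of a transmission.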


V. CONCLUSIONS

Using a DSP-based underwater voice communication system, transmission and reception of digital pulse position modulated speech parameters has been successfully achieved at a rate of 2.4 kbit/s. The speech signal quality was found to be synthetic, as expected. However, recent advances in low bit rate speech coding show that transmission of good quality speech in bandwidth-limited channels is a fast-growing area of research, and these coders must be considered for future studies of underwater voice communications.


VI. REFERENCES

[1] Woodward, B. Underwater Telephony: Past, Present and Future, Colloque De Physique, Colloque C2, No. 2, pp. 591-594 (1990).
[2] Woodward, B. and Sari, H. Digital underwater acoustic voice communications, IEEE J. Oceanic Eng., Vol. 21, No. 2, pp. 181-192 (1996).
[3] Goalic, A., Labat, J., Trubuil, J., Saoudi, S. and Riouaten, D. Toward a digital acoustic underwater phone, in Proc. Oceans94, pp. III.489-III.494 (1994).
[4] Spanias, S.A. Speech coding: A tutorial review, Proc. IEEE, Vol. 82, No. 10, pp. 1541-1582 (1994).
[5] Makhoul, J. Linear Prediction: A tutorial review, Proc. IEEE, Vol. 63, No. 4, pp. 561-580 (1975).
[6] Stojanovic, M. Recent advances in high-speed underwater acoustic communications, IEEE J. Oceanic Eng., Vol. 21, No. 2, pp. 125-136 (1996).
[7] Proakis, J.G. and Salehi, M. Communication Systems Engineering, Prentice-Hall, New Jersey (1994).
[8] Ling, G. and Gagliardi, R.M. Slot synchronization in optical PPM communications, IEEE Trans. Commun., Vol. COM-34, No. 12, pp. 1202-1208 (1986).
[9] Riter, S. Pulse position modulation communications via the underwater acoustic communication channel, IEEE SWIEECO Rec. 22nd Southwestern Conf. & Exhib., pp. 453-457 (1970).
[10] Georghiades, C.N. Optimum joint slot and symbol synchronization for optical PPM channel, IEEE Trans. Commun., Vol. COM-35, No. 6, pp. 518-527 (1987).

Fig. 5 Underwater acoustic voice communications: (a) Analysed speech signal at the transmitter, (b) Transmitted and received DPPM signal, (c) Synthesised speech signal at the receiver.
