SPEECH PROCESSING FOR BINAURAL HEARING AIDS, Dr P. C. Pandey, EE Dept., IIT Bombay, Feb’03


1

SPEECH PROCESSING FOR BINAURAL HEARING AIDS

Dr P. C. Pandey

EE Dept., IIT Bombay

Feb’03

2

R&D activities in SPI Lab, EE Dept, IIT Bombay

• Speech & hearing
• Healthcare instrumentation
• Impedance cardiography
• Industrial instrumentation

3

Speech & hearing

• Speech processing for improving perception by persons with sensori-neural hearing loss:

- Consonantal enhancement (with Prof SD Agashe)
- Binaural dichotic presentation

• Vocal tract shape estimation for speech training of deaf children

• Speech synthesis and study of phonemic features using HNM

• Cancellation of background noise in alaryngeal speech using spectral subtraction

4

Healthcare instrumentation

• Low cost diagnostic audiometer

• Impedance glottograph for voice pitch

• Impedance cardiograph for sports medicine.

• Intravenous drip rate indicator

• Communicator for children with cerebral palsy (with Prof GG Ray)

• Non-invasive ultrasonic thermometry system (with Prof T Anjaneyulu)

• Myoelectric hand (with Prof SR Devasahayam & R Lal)

5

Impedance cardiography

• Signal processing for improving the estimation of stroke volume from the impedance cardiogram

Industrial instrumentation

• Noninvasive measurement of single-phase fluid flow using ultrasonic cross-correlation technique (with Prof T Anjaneyulu)

• Online measurement of dielectric dissipation factor for condition monitoring of high voltage insulation (with Prof SV Kulkarni)

6

7

8

9

  

 

10

11

Speech Processing for Binaural Hearing Aids

Hearing system

Outer ear → Middle ear → Inner ear → Cochlear nerve → Brain

Hearing impairments

• Conductive
• Sensorineural
• Central
• Functional

Sensory aids for the hearing impaired

• Hearing aids
• Cochlear prosthesis
• Visual & tactile aids

12

Causes of sensorineural loss

• Loss of sensory hair cells in cochlea
• Degeneration of auditory nerve fibers

Characteristics of sensorineural loss

• Frequency dependent shifts in hearing thresholds
• Reduced dynamic range, loudness recruitment
• Poor frequency selectivity & increased spectral masking
• Reduced temporal resolution & increased temporal masking

13

Effects of increased spectral masking

• Smearing of spectral peaks and valleys due to broader auditory filters
• Reduction of internal spectral contrast
• Reduced discrimination of consonantal place feature

Effects of increased temporal masking

• Forward and backward masking of weak segments by strong ones
• Reduced ability to discriminate sub-phonemic segments like noise bursts, voice-onset-time, and formant transitions

14

Speech processing for dichotic presentation for binaural hearing aids to reduce the effects of masking

Masking takes place at the peripheral level of the auditory system

Information from the two ears gets integrated at higher levels in the perception process

Binaural dichotic presentation for persons with bilateral residual hearing:

- Speech signal split in a complementary form
- Signal components likely to mask each other presented to different ears
- Information integrated at higher levels, for better speech perception

15

Binaural dichotic presentation schemes

• Spectral splitting: filtering by 2 complementary comb filters, for better place reception

• Temporal splitting: gating by 2 complementary fading functions, for better duration reception

• Combined splitting: processing by 2 time-varying comb filters. All the sensory cells of the basilar membrane get periodic relaxation from stimulation, for better perception of consonantal duration, place, and other features

16

TEMPORAL SPLITTING WITH TRAPEZOIDAL FADING

Inter-aural fading with trapezoidal transition and inter-aural overlap

Inter-aural switching period = 20 ms, duty cycle = 70%, transition durations = 0, 1, 2, 3 ms

[Figure: temporal splitting of the signal s(n) for dichotic presentation, using the fading functions w1(n) and w2(n) to give s1(n) and s2(n); the fading-function waveforms carry interval labels L, M, and N.]
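The trapezoidal fading described on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the 10 kHz sampling rate, the function names, the convention that the 70% duty cycle includes both ramps, and the half-period offset between the two ears are all assumptions made for the example.

```python
import numpy as np

def trapezoid_gate(n_samples, period, on, ramp, offset=0):
    """Periodic trapezoidal fading function with values in [0, 1].

    period, on, ramp and offset are in samples; 'on' includes both ramps
    (an assumption, the slides do not define the duty cycle precisely).
    """
    p = (np.arange(n_samples) - offset) % period   # position within the period
    if ramp == 0:
        return (p < on).astype(float)              # rectangular gating
    rise = (p + 0.5) / ramp                        # linear fade-in
    fall = (on - p - 0.5) / ramp                   # linear fade-out
    return np.clip(np.minimum(rise, fall), 0.0, 1.0)

def temporal_split(s, fs, period_ms=20.0, duty=0.7, ramp_ms=2.0):
    """Split s into two gated signals, one per ear."""
    period = int(round(fs * period_ms / 1000))
    on = int(round(duty * period))
    ramp = int(round(fs * ramp_ms / 1000))
    w1 = trapezoid_gate(len(s), period, on, ramp)
    # Second ear: same gate, shifted by half a switching period (assumed).
    w2 = trapezoid_gate(len(s), period, on, ramp, offset=period // 2)
    return s * w1, s * w2, w1, w2

fs = 10_000                                        # assumed sampling rate, Hz
s = np.random.default_rng(0).standard_normal(400)  # stand-in for a speech signal
s1, s2, w1, w2 = temporal_split(s, fs)
```

With a 70% duty cycle and a half-period offset, the two gates overlap around each switch, so at least one ear receives the signal at every instant.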

17

Investigations with spectral splitting

• Auditory filter bandwidth based comb filters: 18 bands over 5 kHz, 256-coefficient linear phase filters, designed using frequency sampling technique

• Listening tests with hearing impaired subjects: improvement in response time, recognition scores, & reception of place feature

• Better results with perceptually balanced filters: 1 dB ripple, 30 dB attenuation, 4-6 dB crossover

• Filters with personalized frequency response: overall improvement, but not particularly for place
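A pair of complementary linear phase comb filters can be sketched with a single pass of frequency sampling, as below. This is a simplified illustration under stated assumptions: the band edges are hypothetical (18 bands with widths growing from 100 Hz to about 455 Hz, since the slides do not list the auditory-filter-based edges), the sampling rate is assumed to be 10 kHz, and the iterative adjustment of ripple, attenuation, and crossover gain mentioned in the slides is not reproduced.

```python
import numpy as np

def comb_pair(edges_hz, fs, n_taps=256):
    """Two complementary linear-phase FIR comb filters by frequency sampling.

    Alternate bands go to filter 1 and filter 2. Ideal 0/1 samples are used,
    so stopband attenuation between the DFT bins is not controlled.
    """
    k = np.arange(n_taps // 2 + 1)                 # DFT bins 0 .. N/2
    f = k * fs / n_taps                            # bin frequencies in Hz
    band = np.searchsorted(edges_hz, f, side="right") - 1
    band = np.clip(band, 0, len(edges_hz) - 2)     # band index of each bin
    a1 = (band % 2 == 0).astype(float)             # even-numbered bands
    a2 = 1.0 - a1                                  # odd-numbered bands
    a1[-1] = a2[-1] = 0.0      # even-length symmetric FIR is zero at Nyquist
    # Linear phase term for a delay of (N - 1) / 2 samples.
    phase = np.exp(-1j * np.pi * k * (n_taps - 1) / n_taps)
    h1 = np.fft.irfft(a1 * phase, n=n_taps)
    h2 = np.fft.irfft(a2 * phase, n=n_taps)
    return h1, h2, a1, a2

fs = 10_000                                        # assumed sampling rate, Hz
# Hypothetical band edges: 18 bands covering 0 to 5 kHz.
edges = np.concatenate([[0.0], np.cumsum(100.0 + (3200.0 / 153.0) * np.arange(18))])
h1, h2, a1, a2 = comb_pair(edges, fs)

s = np.random.default_rng(0).standard_normal(1000)  # stand-in for speech
s1 = np.convolve(s, h1)                             # signal for one ear
s2 = np.convolve(s, h2)                             # signal for the other ear
```

Both impulse responses are symmetric, so the two ears see the same delay of 127.5 samples and the split introduces no inter-aural phase difference.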

18

Combined splitting with time-varying filters

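[Figure: the input s(n) feeds time-varying comb filter 1 and time-varying comb filter 2, giving s1(n) and s2(n); each filter draws on a set of filter coefficients indexed 1, 2, ..., m. A magnitude versus frequency sketch shows the comb responses for the successive shiftings.]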

Sweep cycle duration = 20 ms. With m shiftings, each pair of comb filters is used for 20/m ms.
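The switching between m pre-calculated filters over a 20 ms sweep cycle can be sketched as below. The function names, the 10 kHz sampling rate, and the random placeholder filter bank are assumptions for illustration; in the scheme described here the bank would hold the shifted comb filters for one ear.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def shift_schedule(n_samples, fs, m, cycle_ms=20.0):
    """Index (0 .. m-1) of the filter in use at each sample."""
    cycle = int(round(fs * cycle_ms / 1000))       # samples per sweep cycle
    pos = np.arange(n_samples) % cycle             # position within the cycle
    return (pos * m) // cycle                      # each filter lasts cycle/m

def time_varying_fir(x, bank, idx):
    """FIR filtering where sample n uses the coefficients bank[idx[n]]."""
    taps = bank.shape[1]
    xp = np.concatenate([np.zeros(taps - 1), x])   # zero initial state
    windows = sliding_window_view(xp, taps)        # windows[n] = x[n-taps+1 .. n]
    # y[n] = sum_k h_n[k] * x[n-k], so the coefficients are reversed.
    return np.einsum("nt,nt->n", windows, bank[idx][:, ::-1])

fs = 10_000                                        # assumed sampling rate, Hz
m = 4                                              # number of shiftings
rng = np.random.default_rng(0)
bank = rng.standard_normal((m, 8))                 # placeholder coefficients
x = rng.standard_normal(400)                       # stand-in for speech
idx = shift_schedule(len(x), fs, m)
y = time_varying_fir(x, bank, idx)
```

With these assumed values each filter is active for 50 samples (5 ms) before the next one takes over, and the cycle repeats every 200 samples.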

19

[Figure: two panels, (a) and (b), each plotting intensity (dB, 0 to -40) against time (ms, 0 to 30) and frequency (kHz, 0 to 5).]

An idealized representation of the magnitude response of the pair of time-varying comb filters using 4 shiftings for the (a) left ear and (b) right ear.

20

[Figure: magnitude (dB) versus normalized frequency, with labels 1 to 4.]

21

Time-varying comb filters

Set of linear phase 256-coeff. FIR filters with pre-calculated coefficients (designed using iterative use of frequency sampling technique).

Comb filter responses optimized for min. perceived spectral distortion: low passband ripple & high stopband attenuation, inter-band crossover gains adjusted for loudness balance.

• Passband ripple < 1 dB
• Stopband attenuation > 30 dB
• Gain at inter-band crossovers: -4 to -6 dB
• Sweep cycle duration: 20 ms
• Number of shiftings: 2, 4, 8, 16

22

Listening tests for evaluation of the schemes

Test material: Closed set of 12 VCV syllables, formed with consonants / p, t, k, b, d, g, m, n, s, z, f, v / and vowel / a /

Subjects & listening condition

• Normal hearing subjects, with loss simulated by Gaussian noise at short-time (~10 ms) SNRs from 6 dB to -15 dB. Presentation at MCL (most comfortable level, 70–75 dB SPL)
• Hearing impaired subjects with bilateral sensorineural loss. Presentation at MCL
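One way to hold the SNR constant on a short-time basis is to scale the noise frame by frame, as in the sketch below. This is an assumed interpretation of "short-time SNR", not a description of the procedure actually used for the tests; the function name, the non-overlapping 10 ms frames, and the 10 kHz sampling rate are illustrative choices.

```python
import numpy as np

def add_noise_short_time_snr(s, fs, snr_db, frame_ms=10.0, rng=None):
    """Add Gaussian noise so that every frame has the requested SNR."""
    rng = np.random.default_rng() if rng is None else rng
    frame = int(round(fs * frame_ms / 1000))       # samples per frame
    out = np.asarray(s, dtype=float).copy()
    gain = 10.0 ** (-snr_db / 20.0)                # noise RMS / signal RMS
    for start in range(0, len(out), frame):
        seg = out[start:start + frame]             # view into the output
        rms = np.sqrt(np.mean(seg ** 2))           # signal RMS in this frame
        noise = rng.standard_normal(len(seg))
        noise /= np.sqrt(np.mean(noise ** 2))      # normalize to unit RMS
        seg += gain * rms * noise                  # in-place update of out
    return out

fs = 10_000                                        # assumed sampling rate, Hz
t = np.arange(400) / fs
s = np.sin(2 * np.pi * 500 * t)                    # stand-in for speech
noisy = add_noise_short_time_snr(s, fs, snr_db=-3.0, rng=np.random.default_rng(0))
```

Because the noise level tracks the signal level in each frame, weak segments are masked to the same degree as strong ones.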

Performance measurement

• Response time statistics
• Stimulus-response confusion matrix
• Recognition scores
• Relative information transmission of consonantal features
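Relative information transmission is commonly computed from the stimulus-response confusion matrix as the mutual information between stimulus and response divided by the stimulus entropy. The sketch below assumes that standard definition; the function names and the identity confusion matrix are illustrative, and the voicing labels follow the usual phonetic classification of the 12 test consonants.

```python
import numpy as np

def relative_info_transmitted(conf):
    """Mutual information I(stimulus; response) divided by stimulus entropy."""
    p = np.asarray(conf, dtype=float)
    p = p / p.sum()                                # joint probabilities
    px = p.sum(axis=1, keepdims=True)              # stimulus probabilities
    py = p.sum(axis=0, keepdims=True)              # response probabilities
    nz = p > 0                                     # skip empty cells
    mutual = np.sum(p[nz] * np.log2(p[nz] / (px @ py)[nz]))
    h_stim = -np.sum(px[px > 0] * np.log2(px[px > 0]))
    return mutual / h_stim

def collapse_by_feature(conf, labels):
    """Merge rows and columns that share a feature value (e.g. voicing)."""
    labels = np.asarray(labels)
    classes = np.unique(labels)
    onehot = (labels[:, None] == classes[None, :]).astype(float)
    return onehot.T @ np.asarray(conf, dtype=float) @ onehot

consonants = ["p", "t", "k", "b", "d", "g", "m", "n", "s", "z", "f", "v"]
voicing = [0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0, 1]     # 1 = voiced
conf = np.eye(12)                                  # illustrative: no confusions
overall = relative_info_transmitted(conf)
voicing_conf = collapse_by_feature(conf, voicing)
voicing_rit = relative_info_transmitted(voicing_conf)
```

Collapsing the matrix by a feature before computing the measure gives the information transmitted for that feature alone, which is how place, duration, voicing, manner, and nasality can be scored separately.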

23

Listening test set-up

[Figure: block diagram. A PC with PCL-208 D/A ports outputs s1(t) and s2(t), each through a lowpass filter and audio amplifier, to the subject in an acoustically isolated chamber; a subject terminal is connected to the PC over RS232C.]

24

Conclusions

• All three schemes improve response time, recognition scores, and relative information transmission, overall and for various speech features.

• Extent of improvement with a scheme is related to the nature of the loss:
- Severe high frequency hearing loss: max. improvement with temporal splitting (17.9%).
- Symmetrical low frequency hearing loss and symmetrical sloping high frequency hearing loss: max. improvement with spectral splitting (17.5%) & combined splitting with 8 shiftings (20.5%).
- Asymmetrical high frequency loss: temporal splitting (7.6%) & combined splitting (7.6%).

25

• Spectral splitting more effective in reducing perceptual load.

• Overall max. improvement in recognition scores with combined splitting with 8 shiftings.

• Temporal splitting mainly improved the duration feature perception.

• Spectral splitting mainly improved the place feature perception.

• Combined splitting with 8 shiftings improved perception of both duration and place.

• Reception of the relatively robust consonantal features (voicing, manner, and nasality) not adversely affected by splitting.

• Personalized filter response gives additional improvement

26

Next

• Listening tests with a larger number of subjects to establish relationship between processing parameters & nature of loss.

• Individualized multi-band compression.

• Implementation of the processing schemes as part of wearable hearing aids, with personalized parameter setting.

• Effect of binaural dichotic listening on non-speech signals & source localization to be investigated.

• Investigations with combination of consonant enhancement with dichotic presentation.

27

THANK YOU