Laboratory for Experimental ORL, K.U.Leuven, Belgium

Dept. of Electrotechn. Eng. ESAT/SISTA, K.U.Leuven, Belgium

Combining noise reduction and binaural cue preservation for hearing aids:

MWF-ITF, Multichannel Wiener Filter with Interaural Transfer Function

Van den Bogaert T., Doclo S., Moonen M. and Wouters J.

ICASSP 2007

download at https://gilbert.med.kuleuven.be/~u0041407/2007ICASSP.pdf Contact: [email protected]

20-04-2007


Overview

• Problem statement: binaural hearing aids, noise reduction and preservation of binaural cues
• Multichannel Wiener Filter approaches:
  – MWF: a standard N-microphone Multichannel Wiener Filter approach
  – MWF-ITF: extension of the MWF to an MWF with integrated Interaural Transfer Function
  – Experimental results: SRT and localization
    • Objective measures
    • Perceptual measures (N=5)


Problem statement

• Hearing impairment: reduction of speech intelligibility in background noise (even with amplification)
  – Signal processing to selectively enhance the useful speech signal
  – Multiple microphones available: spectral + spatial processing
  – Most hearing-impaired listeners are fitted with hearing aids at both ears
• Binaural hearing: everything relating to hearing with two ears simultaneously (vs. bilateral)
  – Binaural cues, in addition to spectral and temporal cues, play an important role in binaural noise reduction and sound localization, so it is important to preserve these cues
  – Current bilateral noise reduction systems have been reported to have a negative impact on binaural hearing [Van den Bogaert et al. 2005, Van den Bogaert et al. 2006]


Problem statement

Bilateral system:
- If adaptive, typically no control on left/right processing
- Hard to preserve interaural cues (mic mismatch, imperfections, ...)

Binaural system:
+ More microphones = more performance?
+ More control on left/right processing?
- Need of binaural link


Problem statement

• Main binaural cues
  – Interaural phase or interaural time differences (IPD/ITD)
    • ITD range: from 10 µs to 700 µs
    • Carried by the signal itself for f < 1300 Hz and/or by the low-frequency envelope for complex sounds
  – Interaural level differences (ILD)
    • ILD range: from 1 dB up to more than 30 dB
    • Physically significant for f > 2000 Hz

[Figure: signal source and listener, illustrating IPD/ITD and ILD]


Problem statement

Design criteria for binaural noise reduction in hearing aids (HAs):

- Maximize noise reduction by using all available microphone signals (binaural link assumed); 2 outputs needed
- Preserve the binaural cues
- Limit the amount of speech distortion

(HA constraints: robustness of the system, low complexity, …)


Overview of binaural noise reduction techniques

• Different approaches
  – BSS methods (Robert Aichner, ICASSP, 3 days ago)
  – Fixed beamforming [e.g. Desloge 1997]
    • Low complexity
    • Limited performance; only speech cues may be preserved (in ideal situations)
  – CASA-based techniques [e.g. Wittkop 2003]
    • Perfect preservation of speech/noise cues
    • Mostly for 2 microphones; "spectral subtraction"-like problems
  – Adaptive beamforming based on the GSC structure, passing the low-frequency part of the signal unprocessed [e.g. Welker 1997]
    • Preserves parts of the binaural cues
    • Substantial drop in noise reduction
  – Binaural multichannel Wiener filter [e.g. Doclo 2002, Spriet 2004]
    • Speech cues are preserved
    • No assumptions about positions of sources and microphones
    • Noise cues may be distorted

Extension of MWF: preservation of binaural speech and noise cues without substantially compromising noise reduction performance


Multichannel Wiener Filter

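[Figure: a hearing aid listening scenario. A speaker $S(\omega)$ and a noise source $N(\omega)$ reach the microphones of both hearing aids; the microphone signals $Y_0(\omega)$, $Y_1(\omega)$ are filtered by $\mathbf{W}_0(\omega)$, $\mathbf{W}_1(\omega)$ to give the outputs $Z_0(\omega)$, $Z_1(\omega)$]

2M microphones, each signal consisting of a speech component X and a noise component V:

$$Y_{0,m}(\omega) = X_{0,m}(\omega) + V_{0,m}(\omega), \qquad Y_{1,m}(\omega) = X_{1,m}(\omega) + V_{1,m}(\omega), \qquad m = 0 \ldots M-1$$

Filtered output: $Z_0(\omega)$, $Z_1(\omega)$

Goal: to estimate the speech component at the reference microphone of each hearing aid ($r_0$, $r_1$), typically the front omnidirectional one.

Standard multichannel Wiener filter: $\partial J_{MSE} / \partial \mathbf{W} = 0$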



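Speech distortion weighted multichannel Wiener filter

To control or reduce speech distortion, rewrite the cost function and introduce a parameter $\mu$ that trades off noise reduction against speech distortion:

$$J_{MSE}(\mathbf{W}) = E\left\{ \left\| \begin{bmatrix} X_{0,r_0} - \mathbf{W}_0^H \mathbf{X} \\ X_{1,r_1} - \mathbf{W}_1^H \mathbf{X} \end{bmatrix} \right\|^2 + \mu \left\| \begin{bmatrix} \mathbf{W}_0^H \mathbf{V} \\ \mathbf{W}_1^H \mathbf{V} \end{bmatrix} \right\|^2 \right\}$$

(first term: speech distortion; $\mu$: trade-off parameter; second term: noise reduction)

$$\frac{\partial J_{MSE}}{\partial \mathbf{W}} = 0 \;\Rightarrow\; \mathbf{W}_{SDW} = \mathbf{R}^{-1} \mathbf{r}$$

In standard hearing aid beamforming, speech distortion is typically avoided by calibrating the speech reference path and removing the speech component from the noise reference path.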



$$\mathbf{W}_{SDW} = \mathbf{R}^{-1} \mathbf{r}, \qquad \mathbf{R} = \begin{bmatrix} \mathbf{R}_x + \mu \mathbf{R}_v & \mathbf{0} \\ \mathbf{0} & \mathbf{R}_x + \mu \mathbf{R}_v \end{bmatrix}, \qquad \mathbf{r} = \begin{bmatrix} \mathbf{r}_{x0} \\ \mathbf{r}_{x1} \end{bmatrix}, \qquad \mathbf{R}_x = \mathbf{R}_y - \mathbf{R}_v$$

$\mathbf{R}$ and $\mathbf{r}$: estimate, f(VAD)

• Depends on the second-order statistics of speech and noise; no assumptions about the speech and noise sources (can be integrated in VAD)
• Perfectly preserves the interaural cues of the speech component, since each hearing aid (left and right) estimates the speech component at its own front microphone
• Shifts the interaural cues of the noise component to the cues of the speech component!

ITF-MWF: add a term related to the binaural cues of the noise component to the MWF cost function. Possible cues: ITD, ILD, Interaural Transfer Function (ITF)
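The closed-form filter $\mathbf{W}_{SDW} = \mathbf{R}^{-1}\mathbf{r}$ and the cue shift of the noise component can be illustrated numerically. The following is a minimal NumPy sketch for a single frequency bin, not the authors' implementation: the function name `sdw_mwf`, the four-microphone layout, the random steering vectors `a` and `b`, and the noise floor of 0.1 are all assumptions made for illustration.

```python
import numpy as np

def sdw_mwf(R_y, R_v, ref, mu=1.0):
    """SDW-MWF for one ear and one frequency bin: W = (R_x + mu R_v)^{-1} r_x.

    R_y, R_v : (K, K) correlation matrices of the noisy input and of the noise,
               with K the number of stacked microphones of both hearing aids.
    ref      : index of the reference microphone in the stacked vector.
    """
    R_x = R_y - R_v                      # speech statistics, R_x = R_y - R_v
    r_x = R_x[:, ref]                    # E{X X_ref^*} is the ref-th column of R_x
    return np.linalg.solve(R_x + mu * R_v, r_x)

# Hypothetical scenario: one speech source and one noise source plus a small
# uncorrelated noise floor. All values are invented for illustration.
rng = np.random.default_rng(0)
K, r0, r1 = 4, 0, 2                      # assumed reference mics (left, right)
a = rng.standard_normal(K) + 1j * rng.standard_normal(K)   # speech transfer functions
b = rng.standard_normal(K) + 1j * rng.standard_normal(K)   # noise transfer functions
R_v = np.outer(b, b.conj()) + 0.1 * np.eye(K)
R_y = np.outer(a, a.conj()) + R_v

W0 = sdw_mwf(R_y, R_v, r0)               # left-ear filter
W1 = sdw_mwf(R_y, R_v, r1)               # right-ear filter

itf_speech_in = a[r0] / a[r1]
itf_noise_in = b[r0] / b[r1]
itf_speech_out = (W0.conj() @ a) / (W1.conj() @ a)   # W0^H X / W1^H X
itf_noise_out = (W0.conj() @ b) / (W1.conj() @ b)    # W0^H V / W1^H V

print("speech ITF in/out:", itf_speech_in, itf_speech_out)
print("noise  ITF in/out:", itf_noise_in, itf_noise_out)
```

In this single-source setting the output ITF of the speech component equals its input ITF, while the output ITF of the noise component also takes the speech value, which is the cue shift described in the bullets above.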


Multichannel Wiener Filter – ITF extension

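$$J_{tot}(\mathbf{W}) = E\left\{ \left\| \begin{bmatrix} X_{0,r_0} - \mathbf{W}_0^H \mathbf{X} \\ X_{1,r_1} - \mathbf{W}_1^H \mathbf{X} \end{bmatrix} \right\|^2 + \mu \left\| \begin{bmatrix} \mathbf{W}_0^H \mathbf{V} \\ \mathbf{W}_1^H \mathbf{V} \end{bmatrix} \right\|^2 \right\} + \alpha\, E\left\{ \left| \mathbf{W}_0^H \mathbf{X} - ITF_{in}^x\, \mathbf{W}_1^H \mathbf{X} \right|^2 \right\} + \beta\, E\left\{ \left| \mathbf{W}_0^H \mathbf{V} - ITF_{in}^v\, \mathbf{W}_1^H \mathbf{V} \right|^2 \right\}$$

Goal: the ITF of the noise component at the output = ITF at the input. Under the assumption of a single noise source:

$$ITF_{in}^v = \frac{V_{0,r_0}}{V_{1,r_1}} = \frac{E\{V_{0,r_0} V_{1,r_1}^*\}}{E\{V_{1,r_1} V_{1,r_1}^*\}} = \frac{\mathbf{R}_v(r_0, r_1)}{\mathbf{R}_v(r_1, r_1)}$$

You can do this for the speech and the noise component.

How do beta and alpha influence localization and SNR improvement performance?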
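Because the extended cost function is still quadratic in the stacked filter $[\mathbf{W}_0; \mathbf{W}_1]$, it can be minimized in closed form by adding the two ITF terms to the block-diagonal matrix of the speech distortion weighted MWF. The sketch below shows one way to do this for a single frequency bin. It is an illustration under assumptions, not the authors' code: the function names `itf_cost_matrix` and `mwf_itf`, the microphone layout, and the synthetic statistics are hypothetical.

```python
import numpy as np

def itf_cost_matrix(itf, R):
    """Matrix M such that E|W0^H S - itf * W1^H S|^2 = W^H M W, with W = [W0; W1]
    and R = E{S S^H}."""
    C = np.array([[1.0, -np.conj(itf)], [-itf, abs(itf) ** 2]])
    return np.kron(C, R)

def mwf_itf(R_x, R_v, r0, r1, mu=1.0, alpha=0.0, beta=0.0):
    """Minimize J_tot for one frequency bin; returns (W0, W1)."""
    K = R_x.shape[0]
    itf_x = R_x[r0, r1] / R_x[r1, r1]          # input ITF of the speech component
    itf_v = R_v[r0, r1] / R_v[r1, r1]          # input ITF of the noise component
    R_tot = (np.kron(np.eye(2), R_x + mu * R_v)
             + alpha * itf_cost_matrix(itf_x, R_x)
             + beta * itf_cost_matrix(itf_v, R_v))
    r = np.concatenate([R_x[:, r0], R_x[:, r1]])
    W = np.linalg.solve(R_tot, r)
    return W[:K], W[K:]

# Hypothetical single speech source and single noise source (invented values).
rng = np.random.default_rng(1)
K, r0, r1 = 4, 0, 2
a = rng.standard_normal(K) + 1j * rng.standard_normal(K)
b = rng.standard_normal(K) + 1j * rng.standard_normal(K)
R_x = np.outer(a, a.conj())
R_v = np.outer(b, b.conj()) + 0.1 * np.eye(K)

def noise_itf_cost(beta):
    """Value of the noise ITF term E|W0^H V - ITF_in^v W1^H V|^2 at the optimum."""
    W0, W1 = mwf_itf(R_x, R_v, r0, r1, beta=beta)
    W = np.concatenate([W0, W1])
    M_v = itf_cost_matrix(R_v[r0, r1] / R_v[r1, r1], R_v)
    return float((W.conj() @ M_v @ W).real)

cost_beta0 = noise_itf_cost(0.0)
cost_beta10 = noise_itf_cost(10.0)
W0_base, W1_base = mwf_itf(R_x, R_v, r0, r1)   # alpha = beta = 0
print("noise ITF cost, beta=0 :", cost_beta0)
print("noise ITF cost, beta=10:", cost_beta10)
```

With alpha = beta = 0 the system decouples into the two per-ear filters of the plain speech distortion weighted MWF; raising beta lowers the noise ITF error term, in exchange for a larger value of the remaining terms of the cost function.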


Experimental results

• Identification of HRTFs:
  – Binaural recordings on CORTEX MK2 artificial head
  – 2 omnidirectional microphones on each hearing aid (d = 1 cm)
  – HRTFs measured at angles -90:15:90 and 90:30:270, 1 m from the head
  – Conditions: T60 = 140 ms (T60 = 590 ms added), fs = 16 kHz, =1
• Objective evaluation:
  – AI-weighted SNR improvement
  – ITD and ILD error
• Perceptual evaluation:
  – Headphone experiments with recorded HRTFs
  – SRT measurements (50% speech intelligibility)
  – Localization using prerecorded HRTFs: the S and N components are sent separately through the fixed filter; subjects localize S and N in the room where the HRTFs were recorded


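Experimental results

[Block diagram: frequency-domain processing]
- Left input signals and right input signals → FFT → $\mathbf{Y}(\omega) = \mathbf{X}(\omega) + \mathbf{V}(\omega)$
- VAD → off-line computation of statistics $\mathbf{R}_v(\omega)$, $\mathbf{R}_x(\omega)$
- Calculate filters for this specific scenario: $\mathbf{W}(\omega) = \begin{bmatrix} \mathbf{W}_0(\omega) \\ \mathbf{W}_1(\omega) \end{bmatrix}$, for given $\mu$, $\alpha$, $\beta$
- Frequency-domain filtering → IFFT → left output $Z_0 = Z_{x0} + Z_{v0}$, right output $Z_1 = Z_{x1} + Z_{v1}$

Stored filters are converged for a condition SxNy, with Sx = speech-weighted noise from angle x and Ny = babble noise from angle y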
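The off-line statistics and the frequency-domain filtering step can be sketched as follows. This is a hedged illustration, not the processing chain used in the experiments: the array layout `(frames, bins, microphones)`, the function names `estimate_stats` and `apply_filter`, and the idealized VAD mask are assumptions, and the FFT/IFFT framing itself is left out.

```python
import numpy as np

def estimate_stats(Y, vad):
    """Estimate R_x and R_v per frequency bin from STFT frames.

    Y   : (T, F, K) complex STFT of the K stacked microphone signals.
    vad : (T,) boolean mask, True where speech is active (assumed to be given).
    """
    outer = np.einsum('tfk,tfl->tfkl', Y, Y.conj())   # Y Y^H per frame and bin
    R_y = outer[vad].mean(axis=0)                     # speech-and-noise frames
    R_v = outer[~vad].mean(axis=0)                    # noise-only frames
    return R_y - R_v, R_v                             # R_x = R_y - R_v

def apply_filter(W, Y):
    """Frequency-domain filtering Z[t, f] = W[f]^H Y[t, f] for one ear.

    W : (F, K) filter coefficients, Y : (T, F, K) microphone STFT.
    """
    return np.einsum('fk,tfk->tf', W.conj(), Y)

# Hypothetical data: random spectra and an alternating VAD mask, for shape checks only.
rng = np.random.default_rng(2)
T, F, K = 50, 8, 4
Y = rng.standard_normal((T, F, K)) + 1j * rng.standard_normal((T, F, K))
vad = np.arange(T) % 2 == 0

R_x, R_v = estimate_stats(Y, vad)

# A filter that simply selects microphone 0 should return that channel unchanged.
W_select = np.zeros((F, K), dtype=complex)
W_select[:, 0] = 1.0
Z = apply_filter(W_select, Y)
```

In the setup described above, the filters are computed once per spatial scenario and then stored, so the speech and noise components can be passed through the same fixed filters separately.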


Experimental results: objective evaluation

• Error measures correlated with design criteria:
  – Maximize speech intelligibility: intelligibility-weighted SNR improvement (left/right)

$$\Delta SNR_L = \sum_i I_i \, \Delta SNR_{L,i}, \qquad I_i = \text{importance of the } i\text{-th frequency bin for speech intelligibility}$$

  – Minimize interaural cue distortion
    • ILD of speech and noise component

$$\Delta ILD_x = \sum_i \left| ILD_{out}^x(\omega_i) - ILD_{in}^x(\omega_i) \right|$$

    • ITD of speech and noise component (low-pass filter 1500 Hz)

$$\Delta ITD_x = \sum_i I_i \, \Delta ITD_{x,i}, \qquad \Delta ITD_{x,i} = \left| \angle E\{X_{0,r_0} X_{1,r_1}^*\} - \angle E\{Z_{x0} Z_{x1}^*\} \right|$$
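The three objective measures can be written as short functions. The sketch below is illustrative only: the band-importance weights are uniform placeholders (real weights would come from an articulation or intelligibility index table), the weights are normalized to sum to one as an assumption, and the phase difference is wrapped to the interval from -π to π, which the formula above does not specify.

```python
import numpy as np

def weighted_snr_improvement(snr_in_db, snr_out_db, importance):
    """Intelligibility-weighted SNR improvement: sum_i I_i * (SNR_out,i - SNR_in,i)."""
    w = np.asarray(importance, dtype=float)
    w = w / w.sum()                                  # assumption: weights sum to 1
    return float(np.sum(w * (np.asarray(snr_out_db) - np.asarray(snr_in_db))))

def ild_error(P0_in, P1_in, P0_out, P1_out):
    """Sum over bins of |ILD_out - ILD_in| in dB, from per-bin powers at both ears."""
    ild_in = 10 * np.log10(np.asarray(P0_in) / np.asarray(P1_in))
    ild_out = 10 * np.log10(np.asarray(P0_out) / np.asarray(P1_out))
    return float(np.sum(np.abs(ild_out - ild_in)))

def itd_error(cross_in, cross_out, importance, freqs, f_max=1500.0):
    """Importance-weighted interaural phase error (rad) below f_max.

    cross_in, cross_out : per-bin cross-spectra E{X0 X1^*} and E{Z0 Z1^*}.
    """
    keep = np.asarray(freqs) < f_max                 # low-pass: keep bins below 1500 Hz
    w = np.asarray(importance, dtype=float)[keep]
    w = w / w.sum()
    dphi = np.abs(np.angle(np.asarray(cross_in)[keep] * np.conj(np.asarray(cross_out)[keep])))
    return float(np.sum(w * dphi))

# Hypothetical four-band example with uniform placeholder weights.
freqs = np.array([500.0, 1000.0, 2000.0, 4000.0])
importance = np.ones(4)

snr_gain = weighted_snr_improvement([0, 0, 0, 0], [3, 6, 9, 6], importance)

P0_in, P1_in = np.ones(4), np.ones(4)
ild_err = ild_error(P0_in, P1_in, 2 * P0_in, P1_in)   # left output power doubled

cross_in = np.ones(4, dtype=complex)
cross_out = cross_in * np.exp(1j * 0.1)               # 0.1 rad phase shift in every bin
itd_err = itd_error(cross_in, cross_out, importance, freqs)
```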


objective evaluation: localization (S0N60)

[Figure: four surface plots for S0N60: ILD error speech component (ILD [dB], 0 to 15), ILD error noise component (ILD [dB], 0 to 15), ITD error speech component (ITD [rad], 0 to 1.5), ITD error noise component (ITD [rad], 0 to 1.5)]


objective evaluation: SRT (S0N60, T60=0.14s)

[Figure: two surface plots as a function of α and β: SNR improvement left ear and SNR improvement right ear (SNRw [dB], 0 to 25)]

$$\Delta SNR_L = \sum_i I_i \, \Delta SNR_{L,i}, \qquad I_i = \text{importance of the } i\text{-th frequency for speech intelligibility}$$

Additions:
• For T60=0.59s, left performance for S0N60 drops to 5 dB SNR AI and right performance drops to 7 dB SNR AI noise reduction
• Going from 2 to 4 microphones gives a gain of about 3 dB SNR AI to 9 dB SNR AI compared to 2-microphone performance


SRT: perceptual evaluation

[Figure: SRT improvement versus beta (0.0, 0.1, 0.3, 1.0, 10.0) for VU noise at 60 deg, alpha=0; vertical axis from 9 to 15]

– Adaptive SRT procedure to find the 50% Speech Reception Threshold
– S0N60, Dutch VU sentences, T60=140ms
– Average SRT without processing = -9.2 dB
– SRT improvements in the range 11-13 dB

The binaural speech intelligibility advantage due to the spatial separation of the speech and noise components does not seem to compensate for the loss in SNR improvement.

Addition: performance drops to around 6 dB SRT gain if T60=590ms (S0N90 tested for N=2)


Localization: perceptual evaluation

• Condition SxN0: speech arrives from angle x, with x from -90° to +90° in steps of 30°; noise arrives from 0°.

• Perceptual procedure: calculate the MWF filters offline, trained on spatial condition SxNy. Then run a telephone ring arriving from angle x and from angle y separately through the filters and store the results. Play these wav files over headphones to the subject, who is asked to localize the telephone signal.



[Figure: SxN0, four localization plots with both axes from -90 to 90 in steps of 15: Localization of Sx and Localization of N0, for alfa=0, beta=0 and for alfa=0, beta=10]



SxN0

[Figure: Loc error Sx in SxN0 and Loc error N0 in SxN0, 5 subjects. Panels show the error (°) versus x (°) from -90 to 90 for beta = 0, 0.1, 0.3, 1, 10, 100, and the error (°) versus beta (0, 0.1, 0.3, 1, 10, 100) for alpha=0 and alpha=0.5]


• Sum of localisation errors Sx and N0

• Parameters can be tuned to achieve better overall localization performance, at the cost of some noise reduction

• There is a correlation between physical and perceptual evaluation, even for localization. However, the error measures are far from perfect (they do not include diffuseness, …)

[Figure: Loc error Sx + Loc error N0 (°), 5 subjects, SxN0, versus beta (0, 0.1, 0.3, 1, 10, 100) for alpha=0 and alpha=0.5]


Conclusions

• (Speech distortion weighted) MWF preserves the speech cues, not the noise cues

• MWF-ITF enables, by constraining the filters W to a region where the noise ITF is preserved, a trade-off between preservation of speech and noise cues and noise reduction performance (a solution, but not the perfect solution: multiple spectrally overlapping noise sources, ...)

• Preserving localization cues did not show a large benefit (due to the spatial separation of speech and noise) or reduction (due to the extra constraints set on W) in SRT score.


Acknowledgements

download at https://gilbert.med.kuleuven.be/~u0041407/2007ICASSP.pdf Contact: [email protected]



objective evaluation: SRT (additions)

• The gain of going from 2 mics on one HA to 3 or 4 mics (low reverb):
  – Single noise scenario:
    • 2-mic performance: 5 to 19 dB SNR AI improvement
    • 2 to 3 mics: +2/+5 dB SNR AI improvement
    • 3 to 4 mics: +1/+4 dB SNR AI improvement
    • (max if noise source is at position of 2 mics -> adding a good SNR signal as 3rd or 4th microphone)
  – Multiple noise sources (3 noise sources):
    • 2-mic performance: around 7 dB SNR AI improvement
    • 2 to 3 mics: +2 dB SNR AI improvement
    • 3 to 4 mics: +2 dB SNR AI improvement


Localization: objective evaluation

[Figure: two plots for SxN0: ITD error speech and ITD error noise (VU_man + auditec 0deg, = 0, SNR=0dB), ITD error [%] (0 to 60) versus angle of the speech source (-80 to 80), for beta = 0, 0.1, 0.3, 1, 10, 100]

large changes direction of speech component to noise component increase weight (cf. physical and perceptual evaluation)
