Interrupted speech perception

Interrupted speech perception Su-Hyun Jin, Ph.D. University of Texas & Peggy B. Nelson, Ph.D. University of Minnesota



Page 2: Interrupted speech perception

Introduction

In everyday life, speech is not equally intelligible over time because of background noise.
  Background noise usually fluctuates, rather than staying steady, in time, amplitude, and frequency.
Normal-hearing (NH) listeners can take advantage of the fluctuating nature of noise.
  Masking release: the improvement in speech recognition in fluctuating noise compared with performance in steady noise.
  It reflects the listener's ability to resolve the background fluctuations and use the momentary decreases (dips) in the noise to extract speech information (Dubno et al., 2002; Nelson et al., 2003; Jin & Nelson, 2006).

Page 3: Interrupted speech perception

Speech and Noise (NH)

Quiet speech

Speech in noise

Speech in fluctuating noise

Page 4: Interrupted speech perception

Introduction

Effect of noise on speech perception by hearing-impaired (HI) listeners
  HI listeners show little or no release from masking in fluctuating noise.
  Even in listeners with mild hearing loss (Bacon et al., 1998; Dubno et al., 2002):
    speech recognition in steady noise was close to normal,
    but for some listeners it was significantly worse than normal in fluctuating noise.
Speech perception in fluctuating noise may provide a more sensitive measure of impairment due to hearing loss.

Page 5: Interrupted speech perception

Jin & Nelson (2006)

Investigated the relationship between the amount of masking release (MR) and hearing sensitivity and temporal resolution in NH and HI listeners.
  Sentence and consonant recognition in quiet, steady noise, and fluctuating noise
  Hearing sensitivity and forward masking

Page 6: Interrupted speech perception

Jin & Nelson (2006): Amplification and shaping

Two-stage process to amplify speech and noise for HI listeners:
  Shaping was applied based on the half-gain rule to compensate for the hearing loss configuration.
  Overall amplification was then added to bring listeners to maximum sentence recognition (90% or better) in quiet.
The process was applied to speech and noise for each HI listener individually.
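As a rough illustration of the two-stage process, the sketch below applies the half-gain rule (gain equal to half the hearing threshold at each frequency) and then adds a flat overall gain. The audiogram values, the 5 dB overall gain, and the function names are hypothetical; the slides do not give the actual filter implementation or the per-listener settings.

```python
# Hypothetical audiogram (frequency in Hz -> threshold in dB HL).
# These values are for illustration only, not data from the study.
thresholds_db_hl = {500: 30, 1000: 40, 2000: 50, 4000: 60, 8000: 70}


def half_gain_shaping(thresholds):
    """Stage 1: frequency-specific gain equal to half the threshold (half-gain rule)."""
    return {freq: 0.5 * level for freq, level in thresholds.items()}


def total_gain(thresholds, overall_gain_db=0.0):
    """Stage 2: add the same overall gain at every frequency on top of the shaping."""
    shaping = half_gain_shaping(thresholds)
    return {freq: gain + overall_gain_db for freq, gain in shaping.items()}


# The overall gain would be raised per listener until sentence recognition
# in quiet reaches its maximum; 5 dB here is an arbitrary placeholder.
gains = total_gain(thresholds_db_hl, overall_gain_db=5.0)

for freq in sorted(gains):
    print(f"{freq:>5} Hz: {gains[freq]:.1f} dB gain")
```

The same gain values would be applied to both the speech and the noise, so the signal-to-noise ratio is unchanged by the amplification.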

[Figure: HI1 threshold, HI1 speech, and HI1 noise levels (dB SPL) as a function of frequency (500 to 8000 Hz).]

Page 7: Interrupted speech perception

Jin & Nelson (2006)

Role of hearing sensitivity
  Listeners with sensorineural hearing loss showed reduced hearing sensitivity.
  Noise is more detrimental to HI listeners than to NH listeners because HI listeners already have reduced redundancy of the speech signal in quiet (Van Tasell, 1993).

Role of temporal resolution
  Compared to NH listeners, HI listeners are more affected by non-simultaneous maskers.
  Strong correlation between masking release and forward-masking threshold (Dubno et al., 2002).
    They observed that syllable recognition in fluctuating noise might be associated with age-related increases in forward-masked thresholds.

Page 8: Interrupted speech perception

Result 1: Sentence recognition
Percent correct keyword identification at -5 dB SNR

[Figure: percent correct keywords by condition (steady, 8 Hz, 16 Hz, quiet) for the NH mean and individual HI listeners (HI1 to HI9).]

Page 9: Interrupted speech perception

Result 1: Sentence recognition
Masking release at -5 dB SNR

[Figure: masking release (percent difference) by gate frequency for the NH average and individual HI listeners (HI1 to HI9).]

Page 10: Interrupted speech perception

Result 1: Speech recognition for both sentence and syllable identification

In quiet and steady noise
  There were no significant differences between the NH and HI groups (p > 0.177).
In gated noise
  Significant improvements in gated noise (over steady noise) were seen for both NH and HI listeners.
  NH listeners could take advantage of gating to regain about 80% of their quiet performance, whereas HI listeners regained only 15% to 55%.

Page 11: Interrupted speech perception

Result 1b: Percent correct syllable identification and MR at -5 dB SNR

[Figure, left panel: percent correct by condition (steady, 8 Hz, 16 Hz, quiet) for the NH mean and individual HI listeners (HI1 to HI9).]
[Figure, right panel: percent difference (MR) by gate frequency for the NH average and individual HI listeners (HI1 to HI9).]

Page 12: Interrupted speech perception

Result 2: Forward-masked (FWD) thresholds

Masked thresholds for HI listeners were higher than those for NH listeners.
  Listeners with lower thresholds in quiet showed better thresholds in the presence of noise.
The slopes of the recovery functions were shallower for HI than for NH listeners.
  HI listeners with close-to-normal recovery functions showed a relatively larger amount of masking release in fluctuating noise.

Page 13: Interrupted speech perception

Discussion

Relation between MR and other measures
  High negative correlation (r ≈ -0.8) between the amount of MR and the forward-masking results across the NH and HI groups.
  When the results of HI listeners were analyzed alone, only a few factors retained a strong correlation with MR:
    hearing thresholds at 0.5 and 1.0 kHz, and forward-masked thresholds at 2 kHz.
  The MR for sentence recognition and the MR for CV syllable recognition were analyzed separately.

Page 14: Interrupted speech perception

Result: Regression analyses

The strength of the relationship between MR and the predictors:
  For sentence recognition in gated noise, hearing sensitivity at low-to-mid frequencies (0.5 and 1 kHz) accounted for a substantial proportion of the variance in the MR.
  For consonant recognition, forward-masked thresholds contributed primarily to the variance in the MR.

Page 15: Interrupted speech perception

Discussion

HI listeners who performed close to normal in quiet and in steady noise still showed reduced masking release for both sentence and consonant recognition in gated noise.
  The overall pattern of masking release measured with sentences and with CV syllables was similar.

Page 16: Interrupted speech perception

Discussion

Several factors seemed to contribute to the MR.
  Hearing sensitivity at low-to-mid frequencies (0.5 kHz and 1.0 kHz) was strongly related to the MR for sentence recognition but not for syllable identification.
  Forward-masking thresholds were more strongly related to the MR for CV syllable identification.
    This is consistent with Dubno et al. (2002).

Page 17: Interrupted speech perception

Follow-up

Continued to investigate additional factors that might contribute to the reduced sentence recognition in fluctuating noise for HI listeners:
  Auditory integration
  Frequency resolution

Page 18: Interrupted speech perception

Follow-up

Role of spectral resolution
  Hearing impairment is often associated with reduced frequency selectivity.
    Reduced auditory representation of spectral peaks and valleys in speech (Miller et al., 1997)
  Cochlear implant listeners
    had little MR for sentence recognition in fluctuating noise (Nelson et al., 2003);
    are known to have near-normal temporal resolution (Nelson & Donaldson, 2001) but limited spectral resolution.
  Reduced MR may be related to broader auditory filters.

Page 19: Interrupted speech perception

Follow-up

Role of auditory integration
  Understanding speech in real life requires a listener to analyze complex sounds and separate the acoustic characteristics of the input signal from background noise.
  This process is known as auditory stream segregation (Bregman, 1990).
  Listeners seem to be able to segregate sounds into streams when the sounds differ in spectral cues (Rose & Moore), or in the harmonicity and fundamental frequency (F0) of a signal (Qin & Oxenham, 2003).

Page 20: Interrupted speech perception

Follow-up

Two tasks
  Auditory filter characteristics
  Auditory integration: interrupted sentence recognition
Participants
  The same NH and HI listeners who participated in Jin & Nelson (2006)
Analysis
  Examine the relationship between the MR from Jin & Nelson (2006) and both the auditory filter shape and interrupted speech recognition.

Page 21: Interrupted speech perception

Follow-up

Auditory filter characteristics
  Used the filter shape equation (Patterson et al., 1982) to estimate the equivalent rectangular bandwidth (ERB) and slope (p) of the auditory filters at 2000 and 4000 Hz.
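To make the link between the slope p and the ERB concrete, the sketch below assumes the simplest single-parameter, symmetric rounded-exponential form, W(g) = (1 + pg)·exp(-pg), where g is the frequency deviation from the centre frequency divided by the centre frequency. For that form the ERB is 4·fc/p, so a shallower slope (smaller p) means a wider filter. The p values used here are hypothetical, and the exact variant of the filter equation fitted in the study is not stated in the slides.

```python
import math


def roex_weight(g, p):
    """Assumed roex(p) filter weight at normalized frequency deviation g = |f - fc| / fc."""
    return (1.0 + p * g) * math.exp(-p * g)


def erb_from_slope(fc_hz, p):
    """Closed-form ERB of a symmetric roex(p) filter: 4 * fc / p."""
    return 4.0 * fc_hz / p


def erb_numeric(fc_hz, p, g_max=2.0, steps=20000):
    """Check the closed form by trapezoidal integration of the filter shape."""
    dg = g_max / steps
    one_side = 0.0
    for i in range(steps):
        g0 = i * dg
        g1 = g0 + dg
        one_side += 0.5 * (roex_weight(g0, p) + roex_weight(g1, p)) * dg
    # Two symmetric sides, converted from normalized units back to Hz.
    return 2.0 * one_side * fc_hz


# Hypothetical slopes: a sharper filter (larger p) and a shallower one (smaller p).
for label, p in (("sharper filter", 30.0), ("shallower filter", 12.0)):
    print(f"{label}: p = {p}, ERB at 2000 Hz = {erb_from_slope(2000.0, p):.0f} Hz")
```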

Interrupted IEEE sentence recognition (interruption by silent gaps)
  IEEE sentences were gated at rates of 1, 2, 4, 8, and 16 Hz.
  There was no noise (no masking).
  Depending on the gate frequency, whole words or only parts of words were available to listeners.
  Percent correct keyword identification was recorded.
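The gating described above can be sketched as multiplying the waveform by an on/off square wave. The example assumes a 50% duty cycle, abrupt onsets and offsets, and a 16 kHz sample rate; none of these details are given in the slides, and the constant "sentence" is only a placeholder for a real recording.

```python
def square_gate(n_samples, sample_rate_hz, gate_rate_hz, duty_cycle=0.5):
    """On/off gate: 1.0 during the 'on' part of each cycle, 0.0 during the silent gap."""
    period = sample_rate_hz / gate_rate_hz  # samples per on/off cycle
    on_samples = duty_cycle * period
    return [1.0 if (i % period) < on_samples else 0.0 for i in range(n_samples)]


def interrupt(signal, sample_rate_hz, gate_rate_hz, duty_cycle=0.5):
    """Replace the 'off' portions of the signal with silence."""
    gate = square_gate(len(signal), sample_rate_hz, gate_rate_hz, duty_cycle)
    return [s * g for s, g in zip(signal, gate)]


def count_bursts(gate):
    """Number of separate 'on' segments in a gate."""
    return sum(
        1 for i, g in enumerate(gate)
        if g == 1.0 and (i == 0 or gate[i - 1] == 0.0)
    )


sample_rate_hz = 16000                 # assumed sample rate
sentence = [1.0] * sample_rate_hz      # placeholder for one second of speech

gated_8hz = interrupt(sentence, sample_rate_hz, 8)

for rate in (1, 2, 4, 8, 16):
    gate = square_gate(len(sentence), sample_rate_hz, rate)
    burst_ms = 1000.0 * 0.5 / rate     # duration of each speech burst at 50% duty cycle
    print(f"{rate:>2} Hz: {count_bursts(gate)} bursts of {burst_ms:.1f} ms each")
```

At low gate rates each burst is long enough to carry whole words, while at high rates only fragments of words survive, which matches the description above. Applying the same gate to a noise waveform instead of the speech would give gated noise.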

Page 22: Interrupted speech perception

Speech in gated noise vs. speech interrupted by silent gaps

Fluctuating noise at 8 Hz

Interrupted sentence at 8 Hz

Page 23: Interrupted speech perception

Follow-up Result 1: Frequency resolution

Compared to the NH group, HI listeners showed greater ERBs and shallower slopes at both 2 kHz and 4 kHz.
  The average ERB for HI listeners (at both 2 kHz and 4 kHz) was about 2.2 to 2.7 times that of the NH group.

Page 24: Interrupted speech perception

Follow-up Result 2: Interrupted sentence recognition

The average percent correct scores of the NH group were higher than the scores of HI listeners at each gate frequency.
The relationship between sentence recognition in gated noise and interrupted sentence recognition was significant for NH and HI listeners together (r ≈ 0.8).
  When only the scores of HI listeners were compared, the correlation remained strong (r ≥ 0.8).

Page 25: Interrupted speech perception

Follow-up Result 2: Interrupted sentence recognition

[Figure: percent correct (%) by gate frequency (1, 2, 4, 8, 16 Hz) for the NH average and individual HI listeners (HI1 to HI9).]

Page 26: Interrupted speech perception

Discussion

Several factors seemed to contribute to the MR.
  Hearing sensitivity at low-to-mid frequencies (0.5 kHz and 1.0 kHz) and the auditory filter shape at 2 and 4 kHz were strongly related to the MR for sentence recognition.
Understanding speech interrupted by either noise or silence might require a similar underlying integration process.
  Percent correct scores for interrupted sentence recognition and for sentence recognition in gated noise were strongly correlated.
  Hearing sensitivity at the low-to-mid frequencies and the ERBs were significant predictors of both the MR and interrupted sentence recognition.

Page 27: Interrupted speech perception

Follow-up 2: Interrupted sentence recognition

The degree of MR in sentence recognition seemed to be correlated with low-to-mid frequency hearing sensitivity.
  Gaëtan & Christophe (2002) found that older listeners with mild hearing loss put more perceptual weight on the mid-frequency band (750-1750 Hz) than NH listeners did.
  Qin & Oxenham (2003) suggested that a strong pitch cue is important for a listener to segregate speech from noise.
Low-to-mid frequency information may be more important for HI listeners when understanding sentences in competing noise.
  Reducing low-frequency gain to improve comfort in noise may have unwanted consequences for HI listeners.

Page 28: Interrupted speech perception

Follow-up 2

Task: Understanding IEEE sentences
  in quiet
  in steady noise
  in gated noise
  gated with silent gaps
Participants
  10 young adult NH listeners

Page 29: Interrupted speech perception

Follow-up 2

Stimuli
  Both speech and noise were either unprocessed (natural) or processed through one of 4 different filters.
  Band-pass (bp) filter cutoff frequencies: 0.5-0.75 kHz, 0.75-1 kHz, 1-2 kHz, and 2-3 kHz

Page 30: Interrupted speech perception

Follow-up 2

Speech
  IEEE sentences spoken by ten talkers
  Presented at a comfortable level (70-75 dB SPL) for each NH listener
Noise
  Long-term spectrum of speech (SNR of -5 dB)
  Steady or fluctuating
  Gate frequency: 8 Hz
  The RMS amplitude of the noise was adjusted relative to the RMS of the target speech.
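The RMS adjustment can be sketched as scaling the noise so that the speech-to-noise RMS ratio, expressed in decibels, equals the target SNR. The waveforms below are short placeholder lists and the function names are hypothetical. Whether the study measured the RMS of the gated noise before or after gating is not stated in the slides, so this sketch simply scales whatever noise waveform it is given.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a waveform."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def scale_noise_to_snr(speech, noise, snr_db):
    """Scale the noise so that 20*log10(rms(speech) / rms(noise)) equals snr_db."""
    target_noise_rms = rms(speech) / (10.0 ** (snr_db / 20.0))
    factor = target_noise_rms / rms(noise)
    return [v * factor for v in noise]


# Placeholder waveforms (not real recordings).
speech = [0.1, -0.1] * 100
noise = [0.5, -0.5, 0.25, -0.25] * 50

scaled_noise = scale_noise_to_snr(speech, noise, snr_db=-5.0)
achieved_snr_db = 20.0 * math.log10(rms(speech) / rms(scaled_noise))
print(f"achieved SNR: {achieved_snr_db:.2f} dB")
```

At -5 dB SNR the noise RMS is about 1.78 times the speech RMS, so the noise is more intense than the target speech.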

Page 31: Interrupted speech perception

Results

Percent correct keyword identification

Condition          Quiet   Steady noise   Gated noise
Natural            100      2.2           77.4
bp 0.5-0.75 kHz     92.3    0.6           28.2
bp 0.75-1 kHz       95.5    4.1           54.4
bp 1-2 kHz          80.6    0.9           16.1
bp 2-3 kHz          96.9   12.5           76.7
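Masking release can be computed directly from the scores in the results table as the gated-noise score minus the steady-noise score. The second quantity below, the share of the quiet-minus-steady range that is recovered in gated noise, is one possible normalization and is an assumption of this sketch rather than a measure reported in the slides.

```python
# Percent correct keywords from the results table: (quiet, steady noise, gated noise).
scores = {
    "Natural": (100.0, 2.2, 77.4),
    "bp 0.5-0.75 kHz": (92.3, 0.6, 28.2),
    "bp 0.75-1 kHz": (95.5, 4.1, 54.4),
    "bp 1-2 kHz": (80.6, 0.9, 16.1),
    "bp 2-3 kHz": (96.9, 12.5, 76.7),
}


def masking_release(steady, gated):
    """MR in percentage points: gated-noise score minus steady-noise score."""
    return gated - steady


def proportion_regained(quiet, steady, gated):
    """Assumed normalization: fraction of the quiet-minus-steady range recovered in gated noise."""
    return (gated - steady) / (quiet - steady)


mr = {cond: masking_release(s, g) for cond, (q, s, g) in scores.items()}
regained = {cond: proportion_regained(q, s, g) for cond, (q, s, g) in scores.items()}

for cond in scores:
    print(f"{cond:<16} MR = {mr[cond]:5.1f} points, regained = {regained[cond]:.0%}")
```

On these numbers the 1-2 kHz band gives the smallest masking release and the natural (unfiltered) condition the largest, with the 2-3 kHz band next.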

Page 32: Interrupted speech perception

Results

In quiet, performance was similar across filter conditions (above 90%), except for speech band-pass filtered at 1-2 kHz.
In steady noise at -5 dB SNR, performance was quite low (below 5%),
  except when only mid-to-high frequency information was audible (bp 2-3 kHz).
In gated noise, performance was significantly better than in steady noise.
  NH listeners were able to use the limited spectral information in the dips of the noise to separate speech from noise,
  except in the bp 1-2 kHz condition, which gave poorer performance in both quiet and steady noise than the other filtering conditions.

Page 33: Interrupted speech perception

[Figure, panel 1: percent correct key word identification in quiet, steady noise, and gated noise for each filter condition (bp 0.5-0.75 kHz, bp 0.75-1 kHz, bp 1-2 kHz, bp 2-3 kHz, natural).]
[Figure, panel 2: masking release for each filter condition.]

Page 34: Interrupted speech perception

Key word identification of interrupted sentences vs. masking release

[Figure: masking release and gated-speech scores for each filter condition (bp 0.5-0.75 kHz, bp 0.75-1 kHz, bp 1-2 kHz, bp 2-3 kHz, natural).]

Page 35: Interrupted speech perception

Results

Performance in interrupted speech recognition showed a pattern similar to performance in gated noise.
  If only limited spectral information is available, performance is low in both gated noise and interrupted speech (for example, bp 1-2 kHz).
  With more spectral cues available in the dips, performance was better in both conditions.

Page 36: Interrupted speech perception

Discussion

NH listeners were able to use limited spectral information to understand speech in quiet and in gated noise.
Using high-frequency cues in filtered speech such as bp 2-3 kHz, NH listeners were able to segregate speech from gated noise better and showed a greater amount of masking release.
  For NH listeners, high-frequency cues might be more important than low-frequency ones for speech recognition in complex noise backgrounds.
  These cues would be less available to HI listeners.

Page 37: Interrupted speech perception

Discussion

Performance was similar for recognition of speech in gated noise and recognition of gated speech.
  Understanding interrupted speech is dominated by the information in the dips (limited information in the dips = limited masking release).
  This suggests that masking release is determined by listeners' ability to decode the information in the dips (consistent with Kwon and Turner, 2001).

Page 38: Interrupted speech perception

Future directions

Investigate recognition of narrow band-pass filtered speech by HI listeners.
  Different degrees and configurations of hearing loss
Compare the perceptual weight functions of NH and HI listeners by using sentence recognition in different types of noise.

Implications