More Perception + Fricative Acoustics


Page 1: More Perception +  Fricative Acoustics

More Perception + Fricative Acoustics

March 31, 2011

Page 2

To Begin With…

• Today:

• Some more thoughts on perception

• And then a brief review of obstruent acoustics

• On Tuesday, we’ll be doing:

• A brief description of vocal tract musculature.

• Static palatography demo!

• You’re welcome to bring in a camera, if you so desire.

• Also, a link:

• http://sakurakoshimizu.blogspot.com/

Page 3

Where were we?

• In Categorical Perception:

• All stimuli within a category boundary should be labeled the same.

Page 4

Discrimination

• Original task: ABX discrimination

• Stimuli across category boundaries should be 100% discriminable.

• (= different labels)

• Stimuli within category boundaries should not be discriminable at all.

• (= same labels)

• In practice, categorical perception means:

• the discrimination function can be determined from the identification function.
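A minimal sketch of how a discrimination function might be predicted from identification scores. It assumes two categories and uses one commonly cited form of the prediction, P(correct) = 0.5 + 0.5 × (pA − pB)², where pA and pB are the proportions of "category 1" labels given to the two stimuli. The identification values below are hypothetical, not data from the lecture.

```python
def predicted_abx(p_a: float, p_b: float) -> float:
    """Predicted proportion correct in ABX discrimination.

    p_a and p_b are the proportions of trials on which stimulus A and
    stimulus B were labeled as the same category (e.g. "b").
    """
    return 0.5 + 0.5 * (p_a - p_b) ** 2


# Hypothetical identification function along a 7-step continuum:
# proportion of "category 1" labels at each step.
ident = [1.00, 0.98, 0.95, 0.60, 0.10, 0.02, 0.00]

# Predicted discrimination for each pair of adjacent steps.
predicted = [predicted_abx(ident[i], ident[i + 1]) for i in range(len(ident) - 1)]

for i, p in enumerate(predicted, start=1):
    print(f"steps {i}-{i + 1}: predicted proportion correct = {p:.3f}")
```

Pairs that get the same label stay near chance (0.5); the predicted peak falls on the pair that straddles the category boundary.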

Page 5

Discrimination

• In this discrimination graph:

• Solid line is the observed data

• Dashed line is the predicted data

(on the basis of the identification scores)

Note: the actual listeners did a little bit better than the predictions.

Page 6

Categorical, Continued

• Categorical Perception was also found for VOT distinctions.

• And for stop/glide/vowel distinctions:

10 ms transitions: [b] percept

60 ms transitions: [w] percept

200 ms transitions: [u] percept

Page 7

Interpretation

• Main idea: in categorical perception, the mind translates an acoustic stimulus into a phonemic label (a category).

• The acoustic details of the stimulus are discarded in favor of an abstract representation.

• A continuous acoustic signal:

• Is thus transformed into a series of linguistic units:

Page 8

The Next Level

• Interestingly, categorical perception is not found for non-speech stimuli.

• Miyawaki et al.: tested perception of an F3 continuum between /r/ and /l/.

Page 9

The Next Level

• They also tested perception of the F3 transitions in isolation.

• Listeners did not perceive these transitions categorically.

Page 10

The Implications

• Interpretation: we do not perceive speech in the same way we perceive other sounds.

• “Speech is special”…

• and the perception of speech is modular.

• A module is a special processor in our minds/brains devoted to interpreting a particular kind of environmental stimulus.

Page 11

Module Characteristics

• You can think of a module as a “mental reflex”.

• A module of the mind is defined as having the following characteristics:

1. Domain-specific

2. Automatic

3. Fast

4. Hard-wired in brain

5. Limited top-down access (you can’t “unperceive”)

• Example: the sense of vision operates modularly.

Page 12

A Modular Mind Model

(Diagram, from top to bottom:)

• central processes: judgment, imagination, memory, attention

• modules: vision, hearing, touch, speech

• transducers: eyes, ears, skin, etc.

• external, physical reality

Page 13

More Evidence for Modularity

• It has also been observed that speech is perceived multi-modally.

• i.e.: we can perceive it through vision, as well as hearing (or some combination of the two).

• We’re perceiving “gestures”

• …and the gestures are abstract.

• Interesting evidence: McGurk Effect

Page 14

McGurk Effect, revealed

Audio      Visual     Perceived
ba      +  ga         da
ga      +  ba         ba, bga, gba

• Some interesting facts:

• The McGurk Effect is exceedingly robust.

• Adults show the McGurk Effect more than children.

• Americans show the McGurk Effect more than Japanese.

Page 15

Original McGurk Data

• Stimulus: auditory ba-ba + visual ga-ga

• Response types:

Auditory: ba-ba
Visual: ga-ga
Fused: da-da
Combo: gabga, bagba

• Responses (%) by age group:

Age      Auditory   Visual   Fused   Combo
3-5      19         0        81      0
7-8      36         0        64      0
18-40    2          0        98      0

Page 16

Original McGurk Data

• Stimulus: auditory ga-ga + visual ba-ba

• Response types:

Auditory: ga-ga
Visual: ba-ba
Fused: da-da
Combo: gabga, bagba

• Responses (%) by age group:

Age      Auditory   Visual   Fused   Combo
3-5      57         10       0       19
7-8      36         21       11      32
18-40    11         31       0       54

Page 17

Audio-Visual Sidebar

• Visual cues affect the perception of speech in non-mismatched conditions, as well.

• Scientific studies of lipreading date back to the early twentieth century

• The original goal: improve the speech perception skills of the hearing-impaired

• Note: visual speech cues often complement audio speech cues

• In particular: place of articulation

• However, training people to become better lipreaders has proven difficult…

• Some people get it; some people don’t.

Page 18

Sumby & Pollack (1954)

• First investigated the influence of visual information on the perception of speech by normal-hearing listeners.

• Method:

• Presented individual word tokens to listeners in noise, with simultaneous visual cues.

• Task: identify spoken word

• Clear:

• +10 dB SNR:

• +5 dB SNR:

• 0 dB SNR:

Page 19

Sumby & Pollack data

(Figure: two panels, Auditory-Only and Audio-Visual)

• Visual cues provide an intelligibility boost equivalent to a 12 dB increase in signal-to-noise ratio.
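To get a feel for what a given SNR means, here is a rough sketch of mixing a signal with noise at a target signal-to-noise ratio. The sine wave is only a stand-in for a spoken word, and the helper names (`mix_at_snr`, `db_to_power_ratio`) are made up for this illustration; this is not Sumby & Pollack's procedure.

```python
import math
import random


def db_to_power_ratio(db: float) -> float:
    """Convert a level difference in dB to a power ratio."""
    return 10 ** (db / 10)


def rms(samples):
    """Root-mean-square amplitude of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_at_snr(signal, noise, snr_db):
    """Scale the noise so the signal-to-noise ratio equals snr_db, then add it."""
    gain = rms(signal) / (rms(noise) * 10 ** (snr_db / 20))
    scaled = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(signal, scaled)]
    return mixture, scaled


random.seed(0)
fs = 16000  # assumed sampling rate in Hz
signal = [math.sin(2 * math.pi * 220 * t / fs) for t in range(fs)]  # stand-in for a word
noise = [random.gauss(0.0, 1.0) for _ in range(fs)]

for snr_db in (10, 5, 0):
    mixture, scaled = mix_at_snr(signal, noise, snr_db)
    measured = 20 * math.log10(rms(signal) / rms(scaled))
    print(f"target {snr_db:+d} dB SNR, measured {measured:+.2f} dB")

print(f"12 dB is a power ratio of about {db_to_power_ratio(12):.1f}")
```

On this reckoning, a 12 dB boost is like cutting the noise power by a factor of roughly 16.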

Page 20

Tadoma Method

• Some deaf-blind people learn to perceive speech through the tactile modality, by using the Tadoma method.

Page 21

Audio-Tactile Perception

• Fowler & Dekle: tested ability of (naive) college students to perceive speech through the Tadoma method.

• Presented synthetic stops auditorily

• Combined with mismatched tactile information:

• Ex: audio /ga/ + tactile /ba/

• Also combined with mismatched orthographic information:

• Ex: audio /ga/ + orthographic /ba/

• Task: listeners reported what they “heard”

• Tactile condition biased listeners more towards “ba” responses

Page 22

Fowler & Dekle data

(Figure: two panels, the orthographic mismatch condition (read “ba”) and the tactile mismatch condition (felt “ba”))

Page 23

Another Piece of the Puzzle

• Another interesting finding which has been used to argue for the “speech is special” theory is duplex perception.

• Take an isolated F3 transition:

and present it to one ear…

Page 24

Do the Edges First!

• While presenting this spectral frame to the other ear:

Page 25

Two Birds with One Spectrogram

• The resulting combo is perceived in duplex fashion:

• One ear hears the F3 “chirp”;

• The other ear hears the combined stimulus as “da”.

Page 26

Duplex Interpretation

• Check out the spectrograms in Praat.

• Mann and Liberman (1983) found:

• Discrimination of the F3 chirps is gradient when they’re in isolation…

• but categorical when combined with the spectral frame.

• (Compare with the F3 discrimination experiment with Japanese and American listeners)

• Interpretation: the “special” speech processor puts the two pieces of the spectrogram together.

Page 27

fMRI data

• Benson et al. (2001)

• Non-Speech stimuli = notes, chords, and chord progressions on a piano

Page 28

fMRI data

• Benson et al. (2001)

• Difference in activation for natural speech stimuli versus activation for sinewave speech stimuli

Page 29

Mirror Neurons

• In the 1990s, researchers in Italy discovered what they called mirror neurons in the brains of macaques.

• Macaques had been trained to make grasping motions with their hands.

• Researchers recorded the activity of single neurons while the monkeys were making these motions.

• Serendipity:

• the same neurons fired when the monkeys saw the researchers making grasping motions.

• a neurological link between perception and action.

• Motor theory claim: same links exist in the human brain, for the perception of speech gestures

Page 30

Motor Theory, in a nutshell

• The big idea:

• We perceive speech as abstract “gestures”, not sounds.

• Evidence:

1. The perceptual interpretation of speech differs radically from the acoustic organization of speech sounds

2. Speech perception is multi-modal

3. Direct (visual, tactile) information about gestures can influence/override indirect (acoustic) speech cues

4. Limited top-down access to the primary, acoustic elements of speech

Page 31

Moving On…

• One important lesson to take from the motor theory perspective is:

• The dynamics of speech are generally more important to perception than static acoustic cues.

• Note: visual chimerism and March Madness.

Page 32

Auditory Chimeras

• Speech waveform + music spectrum:

• Music waveform + speech spectrum:

(Audio demos of each chimera with 1, 2, 4, 8, 16, and 32 frequency bands, plus the originals)

Source: http://research.meei.harvard.edu/chimera/chimera_demos.html

Page 33

Auditory Chimeras

• Speech1 waveform + speech2 spectrum:

• Speech2 waveform + speech1 spectrum:

(Audio demos of each chimera with 1, 2, 4, 6, 8, and 16 frequency bands, plus the originals)

Page 34

Finally, Fricatives

• The last type of sound we need to consider in speech acoustics is an aperiodic, continuous noise.

• Ideally:

• Q: What would the spectrum of this waveform look like?

Page 35

White Noise Spectrum

• Technical term: White noise

• has an unlimited range of frequency components

• Analogy: white light is what you get when you combine all visible frequencies of the electromagnetic spectrum
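One way to check the "all frequencies" idea is to generate random noise and look at how its power is spread across the spectrum. The sketch below uses only the Python standard library, with a small hand-written FFT; the sample count, seed, and number of bands are arbitrary choices for illustration.

```python
import cmath
import random


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


random.seed(1)
n = 4096
noise = [random.gauss(0.0, 1.0) for _ in range(n)]  # Gaussian white noise

spectrum = fft(noise)
# Power at the positive frequencies, skipping the DC component.
power = [abs(c) ** 2 for c in spectrum[1 : n // 2]]

# Average the power within 8 equal-width frequency bands.
bands = 8
width = len(power) // bands
band_power = [sum(power[b * width : (b + 1) * width]) / width for b in range(bands)]

for b, p in enumerate(band_power, start=1):
    print(f"band {b}: mean power = {p:.0f}")
```

Any single frequency bin is very jagged, but the band averages come out roughly equal from low to high frequencies, which is what a flat spectrum means.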

Page 36

Turbulence

• We can create aperiodic noise in speech by taking advantage of the phenomenon of turbulence.

• Some handy technical terms:

• laminar flow: a fluid flowing in parallel layers, with no disruption between the layers.

• turbulent flow: a fluid flowing with chaotic property changes, including rapid variation in pressure and velocity in both space and time

• Whether or not airflow is turbulent depends on:

• the volume velocity of the fluid

• the area of the channel through which it flows

Page 37

Turbulence

• Turbulence is more likely with:

• a higher volume velocity

• less channel area

• All fricatives therefore require:

• a narrow constriction

• high airflow

Page 38

Fricative Specs

• Fricatives require great articulatory precision.

• Some data for [s] (Subtelny et al., 1972):

• alveolar constriction 1 mm

• incisor constriction 2-3 mm

• Larger constrictions result in -like sounds.

• Generally, fricatives have a cross-sectional area between 6 and 12 mm².

• Cross-sectional areas greater than 20 mm² result in laminar flow.

• Airflow = 330 cm³/sec for voiceless fricatives

• …and 240 cm³/sec for voiced fricatives
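A back-of-the-envelope sketch of how volume velocity and channel area trade off, using the Reynolds number as a rough index of how turbulence-prone the flow is. Assumptions not in the slides: the constriction is treated as a circular channel, the kinematic viscosity of air is taken as about 0.15 cm²/sec, and no particular critical value is claimed, since that depends on the geometry.

```python
import math

NU_AIR = 0.15  # cm^2/sec, approximate kinematic viscosity of air (assumed)


def particle_velocity(volume_velocity_cm3_s: float, area_mm2: float) -> float:
    """Air speed through the constriction in cm/sec (volume velocity / area)."""
    area_cm2 = area_mm2 / 100.0
    return volume_velocity_cm3_s / area_cm2


def reynolds(volume_velocity_cm3_s: float, area_mm2: float) -> float:
    """Reynolds number for a circular channel with the given area (a simplification)."""
    area_cm2 = area_mm2 / 100.0
    diameter_cm = 2.0 * math.sqrt(area_cm2 / math.pi)
    velocity = particle_velocity(volume_velocity_cm3_s, area_mm2)
    return velocity * diameter_cm / NU_AIR


# Airflow values from the slide; areas span the range quoted for fricatives.
for label, flow in (("voiceless", 330.0), ("voiced", 240.0)):
    for area in (6.0, 12.0, 20.0):
        v = particle_velocity(flow, area)
        re = reynolds(flow, area)
        print(f"{label:9s} flow={flow:.0f} cm3/sec area={area:4.0f} mm2 "
              f"velocity={v / 100:5.1f} m/sec Re={re:6.0f}")
```

The index rises with volume velocity and falls as the channel widens, which matches the two requirements for fricatives: high airflow and a narrow constriction.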

Page 39

Turbulence Sources

• For fricatives, turbulence is generated by forcing a stream of air at high velocity through either a narrow channel in the vocal tract or against an obstacle in the vocal tract.

• Channel turbulence

• produced when airflow escapes from a narrow channel and hits inert outside air

• Obstacle turbulence

• produced when airflow hits an obstacle in its path

Page 40

Channel vs. Obstacle

• Almost all fricatives involve an obstacle of some sort.

• General rule of thumb: obstacle turbulence is much noisier than channel turbulence

• [f] vs.

• Also: obstacle turbulence is louder, the more perpendicular the obstacle is to the airflow

• [s] vs. [x]

• [x] is a “wall fricative”

Page 41

Sibilants

• Alveolar, dental and post-alveolar fricatives form a special class (the sibilants) because their obstacle is the back of the upper teeth.

• This yields high intensity turbulence at high frequencies.

Page 42

(Figure: “shy” vs. “thigh”)

Page 43

Fricative Noise

• Fricative noise has some inherent spectral shaping

• …like “spectral tilt”

• Note: this is a source characteristic

• This resembles what is known as pink noise:

• Compare with white noise:
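To put a number on the tilt: if the power density of a noise falls off as 1/f raised to some exponent, the change in level between two frequencies follows directly. A small sketch, taking the standard idealizations of exponent 0 for white noise and exponent 1 for pink noise; real fricative sources only resemble these.

```python
import math


def level_change_db(f1_hz: float, f2_hz: float, exponent: float) -> float:
    """Change in power density level (dB) from f1 to f2 when power ~ 1 / f**exponent."""
    return -10.0 * exponent * math.log10(f2_hz / f1_hz)


for name, exponent in (("white", 0.0), ("pink", 1.0)):
    per_octave = level_change_db(1000.0, 2000.0, exponent)
    per_decade = level_change_db(1000.0, 10000.0, exponent)
    print(f"{name}: {per_octave:+.1f} dB per octave, {per_decade:+.1f} dB per decade")
```

White noise has no tilt; idealized pink noise drops by about 3 dB for every doubling of frequency.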

Page 44

Fricative Shaping

• The turbulence spectrum may be filtered by the resonating tube in front of the fricative.

• (Due to narrowness of constriction, back cavity resonances don’t really show up.)

• As usual, resonance is determined by length of the tube in front of the constriction.

• The longer the tube, the lower the “cut-off” frequency.

• A basic example:

• [s] vs.
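The tube-length relationship can be roughed out with the quarter-wavelength formula, f = c / 4L, treating the front cavity as a tube closed at the constriction and open at the lips. That model, the speed of sound used, and the cavity lengths below are all assumptions for illustration, not measurements of any particular fricative.

```python
SPEED_OF_SOUND = 35000.0  # cm/sec, approximate value for warm, moist air (assumed)


def front_cavity_resonance(length_cm: float) -> float:
    """Lowest resonance (Hz) of a tube closed at one end and open at the other."""
    return SPEED_OF_SOUND / (4.0 * length_cm)


# Hypothetical front-cavity lengths, from a short cavity to a longer one.
for length_cm in (1.0, 2.0, 3.0):
    f = front_cavity_resonance(length_cm)
    print(f"front cavity {length_cm:.1f} cm: lowest resonance about {f:.0f} Hz")
```

Doubling the length of the front cavity halves its lowest resonance, so a constriction further back should put the noise energy lower in the spectrum.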