
Conclusions

Constriction Type does influence AV speech perception when it is visibly distinct
• Constriction is more effective than Articulator in this stimulus context
• critical constriction degree (fricatives) shows the strongest visual influence

Active Articulator had little visual effect
• labials did not have greater effect than linguals
• However, passive articulator differences may account for the strong /v/-/D/ effects

Articulatory Phonology implications for AV speech perception/production research
• gestural parameters may offer better (or additional) guidance than phonetic features

Précis

In audio-visual (AV) speech perception the two modalities convey largely complementary information (V: Place, A: Manner). But place can have low visibility, and manner can be visible. Articulatory Phonology and ecological/direct realist views imply that examining visible vs. audible gestural structure may offer novel insights. Perceptual effects of active articulator vs. constriction degree were examined in a McGurk task using anterior consonants that differ visibly on both dimensions. Visual impact was greatest for incongruent A-V signals that used different articulators but the same constriction type, and was stronger for fricatives than for stops/glides, yet no articulator effect emerged. Thus, constriction affects AV perception, more so than active articulator, in identification of visually distinct anterior consonants.

Background

Audio-visual (AV) speech perception shows modality-specific contributions (MacDonald & McGurk, 1978; VPAM: Summerfield, 1987):
• Audio provides primarily manner information
• Visual provides place of articulation information

Yet, some qualifications re: those assumptions:

place and manner imperfectly related to visibility:
• place (POA) visibility varies
  • labials vs. non-labials
  • also some visibility for some coronals
  • face dynamics re: other POA info (below)
• manner also varies: stops - fricatives - glides
• unclear how narrowly to define POA, e.g. /b v/
  • SAME: labial (broad transcription)
  • DIFFERENT: labiodental vs. bilabial (narrow transcription)

dynamic visual speech information is distributed across the talking face/head (Yehia et al., 1998)
• correlates with tongue as well as lip and jaw movements
• this info can guide intelligible audio synthesis

Articulatory Phonology (Browman & Goldstein, 1992, 2000) suggests an alternative: A-V perception re: articulatory gestures (cf. Fowler & Dekle, 1991)
• Active articulator: lower lip vs. tongue tip/blade
• Constriction degree: closed - critical - narrow

Results: Experiment 1

Gestural Incongruity Type main effect, p < .0001
• Visual influence was strongest when A and V tokens differed in Articulator but shared the same Constriction degree

Gestural Incongruity x Articulator, p = .0084
• The preceding effect was more pronounced when the video token used lips than tongue tip

Results cont’d: Experiment 1

Constriction Type main effect, p = .0001
• Visual influence on perception was greater for fricatives than stops

Gestural Incongruity x Constriction, p < .0001
• /v/-/D/ pairs showed the strongest visual effect, followed by video stop paired with opposite-articulator fricative

Articulator x Constriction, p < .017
• Both fricatives had strong visual effects, but labial stop > lingual stop

Results: Experiment 2

Gestural Incongruity Type main effect, p < .0001
• Replicated that of Experiment 1

Constriction Type main effect, p = .0001
• Again, fricatives had a larger visual effect than stops; in Exp. 2 their effect also exceeded that of glides

Results cont’d: Experiment 2

Gestural Incongruity x Articulator, p < .053 (marginal)
• Largest visual effects for /v/-/D/; /b/-/d/ pairs and video fricative + audio stop/glide yielded the next largest visual effect

Gestural Incongruity x Constriction, p < .0001
• Replication/extension of Exp. 1 interaction effect. /v/-/D/ showed the strongest effect by far. Video glides paired with opposite-articulator stop/fricative were next-strongest.

AVSP’05

Influences of Visible Place Versus Manner Distinctions on Perception of Audio-Visual English CV Syllables

Catherine T. Best & Daniel Lazarek

Research Question: How do visible distinctions in active articulator and constriction degree contribute to AV speech perception?

[Figure panels: bar charts, y-axis scaled 0 to 1]
• Incongruent Pair x Video Articulator (y-axis: VSI Score; video articulator: lips, tongue tip)
• Incongruent Pair x Video Constriction (y-axis: VSI Score; video constriction: fricatives, stops)
• Incongruent Pair Type (y-axis: VSI Score; pair types: Same Articulator, Different Constriction; Different Articulator, Same Constriction; Different Articulator, Different Constriction)
• Video Constriction Type (y-axis: Visual Perception Index Score; stops, fricatives, glides)
• Incongruent Pair x Video Articulator (y-axis: VSI Score; video articulator: lips, tongue tip; legend: the three incongruent pair types)
• Incongruent Pairs x Video Constriction (y-axis: VSI Score; video constriction: fricatives, stops, glides)

Method

Stimuli: anterior Cs (USA English) that differ visibly re:
• Active Articulator
  • lower lip
  • tongue tip/blade
• Constriction
  • closed (stop)
  • critical (fricative)
  • narrow (glide) (included only in Exp. 2)

Subjects: English (USA): Exp 1 (n = 14), Exp 2 (n = 12)

Task: report C heard: AV-congruent & AV-incongruent

Data: Visual Speech Index (VSI), calculated on proportion correct audio identifications
VSI = [AVcongruent - AVincongruent]
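The VSI is the drop in correct audio identification when the video is switched from congruent to incongruent. A minimal Python sketch of that arithmetic follows. The poster gives only the formula, so the trial-level 0/1 coding, the function name, and the response lists are assumptions made up for illustration and are not data from the experiments.

```python
def visual_speech_index(congruent_correct, incongruent_correct):
    """VSI = proportion correct (AV-congruent) - proportion correct (AV-incongruent).

    Each argument is a list of 0/1 scores, where 1 means the listener
    reported the audio consonant correctly on that trial (assumed coding).
    """
    p_congruent = sum(congruent_correct) / len(congruent_correct)
    p_incongruent = sum(incongruent_correct) / len(incongruent_correct)
    return p_congruent - p_incongruent


# Hypothetical responses, invented for this example only.
congruent = [1, 1, 1, 1, 1, 1, 1, 1, 0, 1]     # 9 of 10 correct
incongruent = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]   # 3 of 10 correct

vsi = visual_speech_index(congruent, incongruent)
print(round(vsi, 2))
```

On this reading, a VSI near 0 means the incongruent video barely changed what listeners reported hearing, and a VSI near 1 means the video almost always overrode the audio.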

Stimulus consonants:
                        lip     tongue
closed                  /b/     /d/
critical                /v/     /D/
narrow (Exp 2 only)     /w/     /j/
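The three incongruent pair types used in the analyses follow directly from this stimulus table. The Python sketch below derives them by comparing the articulator and constriction of the audio and video consonants. The dictionary and function names are hypothetical, and enumerating every ordered audio/video pairing is an assumption, since the poster does not state exactly which pairings were presented.

```python
from itertools import permutations

# Gestural description of each consonant, copied from the stimulus table.
# "/D/" is the poster's own symbol for the tongue-tip fricative.
GESTURES = {
    "/b/": ("lip", "closed"),
    "/d/": ("tongue", "closed"),
    "/v/": ("lip", "critical"),
    "/D/": ("tongue", "critical"),
    "/w/": ("lip", "narrow"),      # Exp 2 only
    "/j/": ("tongue", "narrow"),   # Exp 2 only
}


def incongruity_type(audio, video):
    """Label an audio/video consonant pairing by which gestural dimensions differ."""
    audio_art, audio_con = GESTURES[audio]
    video_art, video_con = GESTURES[video]
    same_articulator = audio_art == video_art
    same_constriction = audio_con == video_con
    if same_articulator and same_constriction:
        return "Congruent"
    if same_articulator:
        return "Same Articulator, Different Constriction"
    if same_constriction:
        return "Different Articulator, Same Constriction"
    return "Different Articulator, Different Constriction"


# Assumed design: every ordered (audio, video) pairing of the Exp 1 consonants.
exp1_consonants = ["/b/", "/d/", "/v/", "/D/"]
exp1_pairs = {
    (audio, video): incongruity_type(audio, video)
    for audio, video in permutations(exp1_consonants, 2)
}

for (audio, video), label in exp1_pairs.items():
    print(f"audio {audio} + video {video}: {label}")
```

Under these assumptions the four Exp 1 consonants give 12 incongruent pairings, four of each type, and the /v/-/D/ pairs fall under Different Articulator, Same Constriction.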

[Figure panel: bar chart, y-axis scaled 0 to 1]
• Video Articulator x Video Manner (y-axis: VSI Score; video tokens: lips, stop /b/; tongue tip, stop /d/; lips, fricative /v/; tongue tip, fricative /D/)