Klaus J. Kohler University of Kiel, Germany


Speech Sounds, Speech Reduction, Speech Rhythm in the Transmission of Meaning

Foundations of Communicative Phonetic Science

Klaus J. Kohler

University of Kiel, Germany

Talk at the Institute of Linguistics of the Academia Sinica (ILAS), 15 August 2011


1 Goals and principles of Phonetic Science

• We speech scientists need to reflect on what we are doing
– how does our daily routine work relate to the scientific goals and principles of our subject?
– are we asking the right questions to elucidate speech communication?
– or playing highly competitive academic games?

• This is what my talk is about
– the foundations of a communicative phonetic science
– in a discussion of the transmission of meaning in 3 time-honoured areas of phonetic study
¶ speech sounds in words
¶ their reduction in connected speech
¶ their chaining to form rhythmic patterns

• The subject matter of Phonetics
– human speech, based on
¶ the language faculty of homo sapiens
¶ the socio-cultural systems of individual languages
– dependent on a physical, acoustic carrier
¶ generated by physiological and articulatory processes
¶ for the transmission of communicative meaning from a speaker to a hearer

• theories and methodologies
– academic division between humanities and natural sciences
– 2 approaches have shaped the study of speech
¶ linguistics
¶ speech signal analysis
– good as long as converging on same ultimate goal
– has hardly happened in speech science
¶ phonology as a study field in the humanities
¶ phonetics as a study field in the sciences
– historical accident
¶ speech measurement was developed
¶ first instrumental and experimental phoneticians
° Rousselot
° his doctoral student and later director of the Phonetics Lab at Hamburg University, Panconcelli-Calzia

"Phonetik als Naturwissenschaft" (Berlin 1948)


"Phonetics is a part of physiology, like walking, running, jumping it belongs to the study of motion" (p. 8)

"so, the former wild ear-phonetician, the philologus auricularius furibundus, has become ever more rare" (p.18)

the American psychologist Scripture

"All scientifically valid statements concerning speech – like those concerning anything else – are figurative or imaginary assemblages of words that have no real meaning except for the measurement-numbers they contain. There can be no science of speech that is not based on measurement-numbers. The 'nature of speech' is a summary of these numbers … The investigator might be – and preferably should be – congenitally deaf and totally ignorant of any notions concerning sound and speech. The entire statement of the results is confined to the measurement-numbers and their harmonies." Archives Néerlandaises de Phonétique Expérimentale 12: 1936.

– this false objectivity was countered by the phonology of the Prague Circle and the American Structuralists
¶ who divided the study of speech
° into a humanities subject, phonology
° and a science subject, phonetics
– this dichotomy has been with us ever since
– it is a purely academic frame of reference
¶ no reality in the speech communication process
¶ equally uninteresting to devise interfaces between the two constructs
° as in Laboratory Phonology

• We need to start afresh
– develop the study of speech as a unitary communicative speech science
¶ which we should happily call PHONETICS
¶ in which every speech phenomenon may be studied from 4 perspectives
¶ that must converge on the central goal to understand how humans communicate meaning with speech in languages

¶ auditory assessment and linguistic categorization
¶ speech signal analysis
¶ speech perception and understanding
¶ communicative function
– in a step-wise progression
¶ from the isolated word and sentence
¶ to complex phonetic patterns in speech interaction

• This paper deals with 3 areas of phonetic study
– phonetic sounds and phonemes for differentiation of lexical meaning in word citation
– sound reduction (and elaboration) at the utterance level to index interactional meaning
– rhythm in language and speech
¶ a typology of stress and syllable timing
¶ a guiding function in speech interaction.


2 The study of segments in speech science

auditory assessment and linguistic categorization
– impressionistic sound classification according to IPA
¶ auditory assessment by analytically trained ear
¶ articulatory labels for consonants
¶ auditory classification in relation to recordings of Cardinal Vowels
° projection onto articulatory chart


Phonetic Association of China (JIPA 2011, Fang Hu)


Cardinal Vowel Chart


Phonetic Association of China (JIPA 2011, Fang Hu)

American English vowels (IPA Handbook, p. 42)

Beijing vowels (Wai-Sum Lee & Eric Zee, JIPA 33, 2003)

– transcription of sounds
– grouping of sounds to phonemes
¶ lexical differentiation
¶ concept already in broad vs narrow transcription
– typology of phoneme systems
¶ Ian Maddieson
¶ Eric Zee: "Vowel typology in Chinese", ICPhS 2007

– these linguistic studies provide basic knowledge of the pronunciation of words in languages
¶ pronouncing dictionaries
¶ devising economical alphabetic writing systems for unwritten languages
¶ dialectology
¶ but little contribution to the phonetics of every-day spontaneous speech interaction in communicative settings

speech signal analysis
– acoustic measurement of formants, usually F1/F2
– Peterson and Barney 1952
– mixture of dialects, male-female, adult-child
¶ perceptual equivalence not clear
¶ filling phonological categories with (acoustic) substance
– similar studies for other languages
¶ e.g. Hong Kong Cantonese vowels, LEE Wai-Sum, PCC2008
¶ separation of gender and age
¶ but still only vowels in isolated words
– we have limited information on the physical manifestation of vowels in speech interaction


LEE Wai-Sum, PCC2008

speech perception and understanding
– in these F1/F2 measurements the relationship between acoustic variability and the perception of the “same sound” category remains opaque
– Traunmüller, Stockholm, changed vowel perception along different dimensions, starting from natural [i] production

¶ raising the 1st formant changes vowel height
° 1-Bark steps, starting at 2.5 Bark

¶ raising the first formant and f0 in unison
° 1-Bark steps, f0 starting at 1.5 Bark
¶ changes vocal effort; keeps same vowel quality

¶ raising f0 and all formants in unison, 1-Bark steps
¶ changes speaker size, keeps same vowel quality
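
The 1-Bark steps in these stimulus series can be made concrete with a small conversion sketch. The slides do not say which Hz-to-Bark formula was used; the code below assumes Traunmüller's (1990) approximation, z = 26.81 f / (1960 + f) − 0.53, without its corrections for very low and very high frequencies. The function names are hypothetical and chosen for illustration only.

```python
def hz_to_bark(f_hz: float) -> float:
    """Convert frequency in Hz to Bark (assumed: Traunmueller 1990 approximation)."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def bark_to_hz(z: float) -> float:
    """Algebraic inverse of hz_to_bark."""
    return 1960.0 * (z + 0.53) / (26.28 - z)


def bark_steps(start_bark: float, n_steps: int, step: float = 1.0) -> list[float]:
    """Frequencies in Hz for a series that rises in equal Bark steps."""
    return [bark_to_hz(start_bark + i * step) for i in range(n_steps)]


if __name__ == "__main__":
    # F1 series starting at 2.5 Bark and rising in 1-Bark steps, as on the slide.
    for i, hz in enumerate(bark_steps(2.5, 5)):
        print(f"step {i}: {2.5 + i:.1f} Bark = {hz:.0f} Hz")
```

Equal steps in Bark correspond to increasingly large steps in Hz, which is why such series are specified on the auditory scale and not in Hz.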

• These examples show
– acoustic data have no value in themselves
¶ particularly when reduced to F1/F2 in vowels
– unless they are related to perception
– not insightful to ask what the F1/F2 values are for the vowel phonemes of a language

• Here is an example of misguided data interpretation on the basis of restricted F1/F2 measurements.

• J. Harrington, S. Palethorpe, C. I. Watson
Does the Queen speak the Queen’s English? Nature 408, 927-928 (2000).
Monophthongal vowel changes in Received Pronunciation: an acoustic analysis of the Queen’s Christmas Broadcasts. JIPA 30, 63-78 (2000).

• J. Harrington. An acoustic analysis of ‘happy tensing’ in the Queen’s Christmas broadcasts. JPhon 34, 439-457 (2006).

• Researchers like John Wells described changes in monophthongs of younger RP speakers on the basis of mainly auditory familiarity with SSB
– e.g. greater opening of a vowel towards CV4
– or closing, tensing of the final unstressed vowel in happy

• Harrington picked up this descriptive phonetic thread
– wove it into an investigation of sociophonetic change
– by a spectral F1/F2 analysis of the monophthongs

• He started from the following question/premisses:
– Does Queen Elizabeth II, the representative of the RP establishment, follow this trend in her Christmas broadcasts since 1952, when those of the 50s, the 80s and the 90s are compared?
– If so, this would indicate “a drift in the Queen’s accent towards one that is characteristic of speakers who are younger and/or lower in the social hierarchy” (Nature)
– Thus the question becomes one of social change.
– The answer to this question can be given by F1/F2 measurement independent of auditory assessment.

• The analysis rests on two highly dubious assumptions
– that change in a speaker’s accent can be measured by looking at selected segments in selected words
¶ phonetic variation in the “same” phonemes
¶ narrowed down to a sub-class of vowels
¶ excised and thus treated as context-free
– that F1 and F2 reflect the articulatory parameters low/high and front/back, respectively

• An upper-class English army officer does not sound like one because of the way he pronounces certain phonemic segments but because of his general basis of articulation, his voice quality, his prosodic patterns.

General Sir Mike Jackson

• Similarly, the Queen sounds like the young sovereign of the 50s or the older sovereign of later decades, especially because of drastic differences in pitch level.
– 1957: above 200 Hz up to 450 Hz
– 2008: between 180 Hz and 250 Hz

1957 2008

• This large pitch change has two consequences for the determination of vowel quality
– Traunmüller’s studies show that perceptual quality of vowel height is determined by the distance between f0 and F1
¶ thus acoustic F1 measurements do not represent a speaker’s vowels adequately
¶ they need a complementary auditory evaluation
– high f0 makes the analysis of F1 in high vowels very difficult and unreliable.
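
Both consequences can be illustrated numerically. The sketch below is not taken from the talk: it assumes Traunmüller's (1990) Hz-to-Bark approximation, uses a hypothetical F1 of 400 Hz, and takes 450 Hz and 180 Hz from the pitch ranges quoted for 1957 and 2008. It shows (a) that the same F1 lies at a very different tonotopic distance from f0 at the two pitch levels, and (b) that at high f0 the harmonics which sample the spectral envelope are so widely spaced that the nearest one can sit well away from the formant.

```python
def hz_to_bark(f_hz: float) -> float:
    """Hz to Bark (assumed: Traunmueller 1990 approximation)."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def f1_f0_distance_bark(f0_hz: float, f1_hz: float) -> float:
    """Tonotopic distance between F1 and f0 in Bark (negative if F1 lies below f0)."""
    return hz_to_bark(f1_hz) - hz_to_bark(f0_hz)


def nearest_harmonic(f0_hz: float, target_hz: float) -> float:
    """Frequency of the harmonic of f0 that lies closest to target_hz."""
    k = max(1, round(target_hz / f0_hz))
    return k * f0_hz


if __name__ == "__main__":
    f1 = 400.0  # hypothetical F1 value, for illustration only
    for label, f0 in (("high pitch", 450.0), ("lower pitch", 180.0)):
        dist = f1_f0_distance_bark(f0, f1)
        harm = nearest_harmonic(f0, f1)
        print(f"{label}: f0={f0:.0f} Hz, F1-f0={dist:+.2f} Bark, "
              f"nearest harmonic to F1 at {harm:.0f} Hz")
```

With identical F1, the two pitch levels give clearly different F1–f0 distances, so an F1 value in Hz taken on its own says little about perceived vowel height.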

• Consequently H.’s conclusion that the final vowel of happy is closer in the 90s than in the 50s, on account of lower F1, should not be derived from his data.

• On the contrary, auditory comparison of the vowels in the two data sets proves that the conclusion is wrong
– this vowel has not changed as a perceptual entity
– F1 measurements give the wrong picture
– the result is a statistical artefact
– problem of context-free treatment of vowels.

• It becomes evident when the context of the word happy in a Christmas broadcast is considered
– Happy Christmas
– the tongue continues to move up to complete closure for [k]
– consequently the final vowel in this context must be expected to be higher than in e.g. happy family, or phrase-final in, e.g., happy, and united family, or even sentence-final in tremendously happy
– the different contexts should thus not be conflated


happy_1958fin-1957chr.wav

happy_1997fin-2008chr.wav

happy_1958fin-1997fin.wav

happy_1957chr-2008chr.wav

• These data show two things

1) the auditory quality of the final vowel is contextually conditioned: closer before [k] – opener phrase-final

2) the 50s and 90s data are perceptually congruent.

• Further contextual conditions are to be expected for this and all the other words analysed
– the frequencies of occurrence of these various conditions will vary across the two data sets
– but F1 averages in each set are computed for all the words, irrespective of phonetic contexts
– therefore acoustic measurement without auditory control can produce differences one way or the other
– the result then becomes a statistical artefact.
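
The artefact described here can be reproduced with a toy calculation. All numbers in the sketch below are invented for illustration and are not measurements from the broadcasts: the vowel is given exactly the same F1 in each phonetic context in both decades, and only the proportions of the contexts differ between the two data sets.

```python
from statistics import mean

# Hypothetical F1 values (Hz) per context, identical in both decades.
F1_BY_CONTEXT = {"before_k": 300.0, "phrase_final": 380.0}

# Hypothetical token counts: only the mix of contexts differs between decades.
TOKEN_COUNTS = {
    "1950s": {"before_k": 2, "phrase_final": 8},
    "1990s": {"before_k": 8, "phrase_final": 2},
}


def tokens(decade: str) -> list[tuple[str, float]]:
    """Expand the counts into a list of (context, F1) tokens."""
    return [
        (context, F1_BY_CONTEXT[context])
        for context, n in TOKEN_COUNTS[decade].items()
        for _ in range(n)
    ]


def pooled_mean(decade: str) -> float:
    """Mean F1 over all tokens, ignoring phonetic context."""
    return mean(f1 for _, f1 in tokens(decade))


def mean_by_context(decade: str) -> dict[str, float]:
    """Mean F1 computed separately for each context."""
    return {
        context: mean(f1 for c, f1 in tokens(decade) if c == context)
        for context in TOKEN_COUNTS[decade]
    }


if __name__ == "__main__":
    for decade in TOKEN_COUNTS:
        print(decade, "pooled:", pooled_mean(decade),
              "by context:", mean_by_context(decade))
```

The pooled averages differ by decade although no vowel has changed within any context, which is the sense in which a context-free average can become a statistical artefact.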

• The noteworthy change in the Queen’s speech is
– not a move towards sound features of younger and more middle-class speakers of RP
– but a lowering of her pitch level
– occurring in the 60s
¶ not due to ageing
¶ but in a deliberate effort to overcome her former girlish voice and to sound more authoritative
¶ no doubt with help from advisers

• From this perspective, changes in the Queen’s speech do not reflect social changes in Britain in the 60s
– as H. suggests
– and as the press marketed his study, e.g. the Telegraph website, 21 February 2007
“Is the Queen’s English now more common?”
“How Queen’s English has grown more like ours”
– the exact opposite seems to be the case: transmitting authority to her subjects

• So, speech scientists should be more careful in their acoustic analyses and their sociophonetic deductions.

communicative function
– even when physical measures are related to linguistic categories via the listener
¶ more variability needs to be considered for the modelling of speech communication
¶ we need to transcend the isolated sound and the phoneme in word citation forms
¶ and look at sound patterns in utterances
– these patterns are determined by stylistic and communicative functions

• Francis Nolan "Overview of English Connected Speech Processes" AIPUK 31 (1996)
– orthographic: I don't suppose you could make it for eighteen hundred
– word pronunciations
– careful
– natural
– casual

• Example from American English: N. Warner’s website
– “I don’t even know what we’re gonna do.”
¶ canonical word transcription

• These phrase-phonetic patterns are regular in relation to communicative situations
– and can go even further
– Sarah Hawkins & Rachel Smith "Polysp: a polysystemic, phonetically rich approach to speech understanding", Italian Journal of Linguistics, 13 (2002)
¶ conveying the meaning of I DO NOT KNOW
¶ I don't know: 2 speakers
¶ (I) dunno: 2 speakers

¶ 2 expanded forms, different kinds of exasperation
¶ 2 forms reduced to dynamically changing vocalic resonances, rudiments of three syllables
° from more open to less open
° with increasing lip narrowing
° the most extreme form is not slurred drunken speech
° but casual speech when otherwise occupied

• These stylistic sound patterns in relation to communicative function are far from well described
– there are only very few languages for which we have some communicative realization rules
¶ English
¶ even more so for German, due to work at Kiel
– and how listeners are able to decode such variously degraded speech successfully is an open question
¶ not just a matter of surface signal perception
¶ but top-down interpretation within the communicative situation
– and information about such stylistic sound patterns in tone languages seems to be lacking altogether
– here is a great task for you for the future
– to develop communicative phonetic science in Mandarin


3 The study of rhythm in speech science

auditory assessment and linguistic categorization
– stress and syllable timing of whole languages
¶ certainly some truth in this intuitive evaluation, by the descriptive phoneticians, of rhythmic differences between e.g. French and Spanish vs English and German

Humpty Dumpty sat on a wall,
Humpty Dumpty had a great fall,
And all the king's horses,
And all the king's men,
Couldn’t put Humpty together again.


F0 patterns without and with pitch accent timing

speech signal analysis
– filling pre-established linguistic categories with substance
– restricted to duration variable
– measurement of foot and syllable durations to substantiate rhythm types of stress and syllable timing
¶ doomed to failure
¶ because no thresholds for rhythmicity ever determined theoretically outside the measured data

– new measures for rhythm categories: Grabe, Low
¶ raw and normalized Pairwise Variability Indices rPVI and nPVI of segment durations
where m = number of sections, d = duration of kth section
¶ vocalic nPVI against intervocalic rPVI in Cartesian diagram shows language clustering
¶ these clusters are related to rhythm types
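
The two indices can be computed as in the sketch below. It assumes the standard definitions published by Grabe and Low, with m sections and d_k the duration of the kth section: rPVI = [ Σ_{k=1..m-1} |d_k − d_{k+1}| ] / (m − 1), and nPVI = 100 × [ Σ_{k=1..m-1} |(d_k − d_{k+1}) / ((d_k + d_{k+1}) / 2)| ] / (m − 1). The function names and the example durations are hypothetical.

```python
from typing import Sequence


def rpvi(durations: Sequence[float]) -> float:
    """Raw PVI: mean absolute difference between successive durations."""
    m = len(durations)
    if m < 2:
        raise ValueError("need at least two durations")
    total = sum(abs(durations[k] - durations[k + 1]) for k in range(m - 1))
    return total / (m - 1)


def npvi(durations: Sequence[float]) -> float:
    """Normalized PVI: each difference is divided by the mean of the pair, times 100."""
    m = len(durations)
    if m < 2:
        raise ValueError("need at least two durations")
    total = sum(
        abs((durations[k] - durations[k + 1])
            / ((durations[k] + durations[k + 1]) / 2.0))
        for k in range(m - 1)
    )
    return 100.0 * total / (m - 1)


if __name__ == "__main__":
    # Invented vocalic interval durations in ms, for illustration only.
    alternating = [60.0, 140.0, 55.0, 150.0, 70.0]
    even = [95.0, 100.0, 105.0, 100.0, 98.0]
    print("alternating: rPVI =", round(rpvi(alternating), 1),
          "nPVI =", round(npvi(alternating), 1))
    print("even:        rPVI =", round(rpvi(even), 1),
          "nPVI =", round(npvi(even), 1))
```

Because the nPVI divides each difference by the local mean duration, it is unaffected by a uniform change of tempo, whereas the rPVI scales with it. Neither index supplies a threshold that would separate one rhythm type from another.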


° just a way of data sorting, not objective explanation

¶ return to phonological interpretation (Dauer)
° systemic phonological elements and their concatenation rules determine rhythm
° Polish, Catalan new rhythm types (Nespor)
» Polish has heavy consonant clusters but no vowel reduction
» for Catalan it is the other way round
° other Slavonic and Germanic languages have both
° Japanese, Spanish have neither

speech perception
– relationship between physical signal properties and the perception of prominence patterns by the listener
– active construction process bottom-up and top-down
– well known from psychoacoustic experiments
– rhythm is in the mind of the listener

Grouping of tone sequences by loudness, timing, f0 (Fig. adapted from Handel 1989, p. 387; panels: STIMULUS, PERCEPT)


Sequences of ba syllables: single events, trochaic and dactylic groupings by f0


Sequences of ba syllables: single events, trochaic and iambic groupings by syllabic duration, flat f0


Sequences of ba syllables: single events, trochaic and iambic groupings by syllabic energy, flat f0

What characterizes speech rhythm?

• recurring fundamental frequency patterns
• recurring syllabic duration patterns
• recurring syllabic loudness patterns
• all perceived as chunking of speech
– creating waning or waxing prominence patterns that occur with some degree of regularity over time

• the waning and waxing prominence patterns in simple articulated syllable sequences recapture what is known from psychoacoustic experiments

• rhythm in running speech far more complex
– more complex syllable structures
– even in the simplest CV language like Japanese, different vowels and consonants
– progressive increase of complexity in European languages from Romance to Germanic to Slavonic

• but always combination of the three properties for the creation of recurring prominence patterns over time

• at this point the factor ‘language’ comes in
– the parameters pitch, syllable timing and energy will be combined differently to create rhythmicity in different languages
– testing different types of languages with very different rhythmical structures, e.g.
¶ Germanic: English, German
¶ Romance: French, Spanish
¶ tone languages: Mandarin, Cantonese


– in Germanic languages, rhythmic structuring is associated with lexically stressed syllables

¶ that receive prominence peaks

¶creating waxing and waning prominence patterns over time

¶with a tendency to compress syllables in between

¶but no isochrony, flexibility according to syllable structures within margins of regularity

– this regularity occurs in most stylized form in verse metre
– in spontaneous speech regularity constantly disturbed
– but also emerges from phraseology
¶ “bow and arrow”, “Pfeil und Bogen”
¶ “Pride and Prejudice”, “Sense and Sensibility”
¶ ding dong, sing-song, ping-pong, flip-flop, wishy-washy

"Maluma": round, flowing, soft; continuous sonority, nasals, back vowels

"Takete": pointed, broken, hard; sonority broken by plosives; front vowels

“Maluma und Takete” more rhythmical than reverse order since sonority broken at end, as in “thunder and lightning”

Gestalt Psychology, Wolfgang Köhler: Psychologische Probleme 1933


– what happens in a tone language, where tonal patterns are first of all linked to syllables?

– where tonal structuring of utterances does thus not follow the rhythmic patterning by phrasal pitch

– how does a Mandarin Chinese speaker realise the English Humpty Dumpty nursery rhyme?


Chin.-Eng. Humpty Dumpty sat on a wall

H N L H N L L L L F


Chin.-Eng. and all the king’s horses

L H N H H H

– perceptual impression of tone sequences instead of melodic accent patterns
– speaks clearly against the assumptions of a tone sequence model such as AM/ToBI for a non-tone language like English
– in the realization of English melodic patterns there is a transfer of the Mandarin Chinese prosodic structures
¶ the basis is formed by the tone sequences from a repertoire of 4 + 1 tones
¶ selection is an empirical question:
° frequencies of Chinese tones and their combinations in words
° link of the neutral tone with function words
° association of high tones with focus
° sandhi rules
¶ superimposed there are phrase prosodies
– how is verse metre organized in Chinese poetry and realised in recitation?
– how are nursery rhymes structured and recited?

communicative function
– rhythm aids intelligibility: rhythmic beats guide the listener, allowing the projection of events to come
– so rhythm has an essential communicative function in transmission of meaning from speaker to listener
– this is where rhetorical proficiency comes in
– good rhetoricians, such as Martin Luther King, capture listeners by commanding all the verbal and rhythmical registers of meaning transmission
– but there are good and bad rhythmic speakers in everyday performance
– especially the pool of informants linguists and phoneticians usually dip into
¶ today’s student population
¶ subjects should not be used indiscriminately for production experiments in the study of rhythm
¶ doubtful value of production data unless initial screening of subjects

English examples of good/mediocre rhythmicity

Ex 1: good rhythmicity: IViE c-rea1-m1
clear regular rhythmical beats
salient groupings by pitch bracketing

Ex 2: mediocre rhythmicity: IViE c-rea1a-f6
no clear regular beats
groupings by pitch bracketing not salient


– it may not be a very serious public concern when academics arrive at the wrong generalizations about speech and language because they rely on the wrong data

– but it becomes of the utmost importance to the general public when announcements at airports, stations, in trains or on planes are poorly intelligible because the untrained speakers lack rhythmicity

4 Conclusion

• I have presented a paradigm of Communicative Phonetic Science for the analysis of meaning transmission by sounds and rhythm.


– It develops the study of speech as a unitary communicative speech science

¶ in which every speech phenomenon is studied from 4 perspectives

¶ that must converge on the central goal to understand how humans communicate meaning by speech in languages

¶ auditory assessment and linguistic categorization
¶ speech signal analysis
¶ speech perception and understanding
¶ communicative function
– in a step-wise progression
¶ from the isolated word and sentence
¶ to complex phonetic patterns in speech interaction

– it does not assume phonological invariance
– but establishes regularities of flexible sound patterns
– in relation to communicative situations and functions
– in the languages of the world


• It subscribes to a quotation from Albert Einstein

"It would be possible to describe everything scientifically, but it would make no sense, it would be without meaning, as if you describe a Beethoven symphony as a variation of wave pressures."

It provides an analysis framework, From Sound to Sense, for phoneticians world-wide to advance, gradually but steadily, our knowledge of human communication.