Intonational Variation in Spoken Dialogue Systems
Transcript of Intonational Variation in Spoken Dialogue Systems
![Page 1: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/1.jpg)
04/19/23
AT&T Labs Research
Intonational Variation in Spoken Dialogue Systems
Generation and Understanding
Julia Hirschberg
Charles University
March 2001
![Page 2: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/2.jpg)
Talking to a Machine… and Getting an Answer
• Today’s spoken dialogue systems make it possible to accomplish real tasks, over the phone, without talking to a person
  • Real-time speech technology enables real-time interaction
  • Speech recognition and understanding is ‘good enough’ for limited, goal-directed interactions
  • Careful dialogue design can be tailored to capabilities of component technologies
    • Limited domain
    • Judicious use of system initiative vs. mixed initiative
![Page 3: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/3.jpg)
Some Representative Spoken Dialogue Systems
[Figure: timeline of systems. Time axis: 1980+, 1990+, 1993+, 1995+, 1997+, 1999+. Other labels: System Initiative, Mixed Initiative, User, Deployed. Systems shown: Banking (ANSER); ATIS (DARPA Travel); MIT Galaxy/Jupiter; Directory Assistant (BNR); Multimodal Maps (Trains, Quickset); Customer Care (HMIHY – AT&T); Communications (Wildfire, Portico); Train Schedule (ARISE); Communicator (DARPA Travel); Brokerage (Schwab-Nuance); Air Travel (UA Info-SpeechWorks); E-Mail Access (myTalk).]
![Page 4: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/4.jpg)
But we have a long way to go…
![Page 5: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/5.jpg)
Course Overview
• Spoken Dialogue Systems today
  • Evaluating their weaknesses
  • Role of intonational variation
• Importance of corpora and conventions for annotating them
• Intonational ‘meanings’
• Prosody in Speech Generation
• Prosody in Speech Recognition/Understanding
![Page 6: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/6.jpg)
Course Overview
• Spoken Dialogue Systems today
  • Evaluating their strengths and weaknesses
  • Role of intonational variation
• Importance of corpora and conventions for annotating them
• Intonational ‘meanings’
• Prosody in Speech Generation
• Prosody in Speech Recognition/Understanding
![Page 7: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/7.jpg)
Evaluating Dialogue Systems
• PARADISE framework (Walker et al ’00)
• “Performance” of a dialogue system is affected both by what gets accomplished by the user and the dialogue agent and how it gets accomplished
[Diagram labels: Maximize Task Success; Minimize Costs; Efficiency Measures; Qualitative Measures]
![Page 8: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/8.jpg)
Task Success
• Task goals seen as Attribute-Value Matrix (AVM)
  • ELVIS e-mail retrieval task (Walker et al ‘97)
  • “Find the time and place of your meeting with Kim.”

| Attribute | Value |
| --- | --- |
| Selection Criterion | Kim or Meeting |
| Time | 10:30 a.m. |
| Place | 2D516 |

• Task success defined by match between AVM values at end of dialogue with “true” values for AVM
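
As a rough illustration of the matching idea on this slide, the sketch below scores one dialogue by the share of attributes whose final value equals the "true" value. The simple proportion, the function name `avm_match`, and the observed values are assumptions made for illustration; published PARADISE work also reports a chance-corrected agreement statistic (kappa), which this sketch does not compute.

```python
def avm_match(reference: dict, observed: dict) -> float:
    """Fraction of reference attributes whose observed value matches exactly."""
    if not reference:
        raise ValueError("reference AVM must not be empty")
    hits = sum(1 for attr, value in reference.items() if observed.get(attr) == value)
    return hits / len(reference)


# "True" values for the scenario shown on the slide.
reference_avm = {
    "Selection Criterion": "Kim or Meeting",
    "Time": "10:30 a.m.",
    "Place": "2D516",
}

# Hypothetical values reached at the end of one dialogue (Place is wrong).
observed_avm = {
    "Selection Criterion": "Kim or Meeting",
    "Time": "10:30 a.m.",
    "Place": "2D515",
}

score = avm_match(reference_avm, observed_avm)
print(f"task success (simple match): {score:.2f}")
```

A missing attribute in the observed matrix counts as a mismatch, which seems the natural reading of "match" here but is also an assumption.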
![Page 9: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/9.jpg)
Metrics
• Efficiency of the Interaction: User Turns, System Turns, Elapsed Time
• Quality of the Interaction: ASR rejections, Time Out Prompts, Help Requests, Barge-Ins, Mean Recognition Score (concept accuracy), Cancellation Requests
• User Satisfaction
• Task Success: perceived completion, information extracted
![Page 10: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/10.jpg)
Experimental Procedures
• Subjects given specified tasks
• Spoken dialogues recorded
• Cost factors, states, dialog acts automatically logged; ASR accuracy, barge-in hand-labeled
• Users specify task solution via web page
• Users complete User Satisfaction surveys
• Use multiple linear regression to model User Satisfaction as a function of Task Success and Costs; test for significant predictive factors
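
The regression step in the last bullet can be sketched with ordinary least squares. This is a minimal pure-Python illustration, not the procedure used in the cited studies: the helper names `solve` and `fit`, the two predictors, and the data are all hypothetical, and the significance testing mentioned on the slide is not covered.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit(rows, targets):
    """Least-squares weights via the normal equations (X^T X) w = X^T y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical, already normalized predictors per dialogue: [task success, cost].
rows = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
# Hypothetical survey scores, generated here from weights 0.5 and -0.25.
targets = [0.5, -0.25, 0.25, 0.75]

weights = fit(rows, targets)
print(weights)
```

With real survey data the fit would not be exact, and each coefficient would still need to be tested for significance before being reported as a predictive factor.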
![Page 11: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/11.jpg)
User Satisfaction: Sum of Many Measures
• Was Annie easy to understand in this conversation? (TTS Performance)
• In this conversation, did Annie understand what you said? (ASR Performance)
• In this conversation, was it easy to find the message you wanted? (Task Ease)
• Was the pace of interaction with Annie appropriate in this conversation? (Interaction Pace)
• In this conversation, did you know what you could say at each point of the dialog? (User Expertise)
• How often was Annie sluggish and slow to reply to you in this conversation? (System Response)
• Did Annie work the way you expected her to in this conversation? (Expected Behavior)
• From your current experience with using Annie to get your email, do you think you'd use Annie regularly to access your mail when you are away from your desk? (Future Use)
![Page 12: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/12.jpg)
Performance Functions from Three Systems
• ELVIS User Sat. = .21 * COMP + .47 * MRS - .15 * ET
• TOOT User Sat. = .35 * COMP + .45 * MRS - .14 * ET
• ANNIE User Sat. = .33 * COMP + .25 * MRS + .33 * Help
COMP: User perception of task completion (task success)
MRS: Mean recognition accuracy (cost)
ET: Elapsed time (cost)
Help: Help requests (cost)
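
To make the functions concrete, here is a small sketch that applies the ELVIS and TOOT coefficients from this slide to a few dialogues. It assumes each predictor is z-score normalized before weighting so that the coefficients are comparable; the raw measurements and the helper names `z_scores` and `predicted_satisfaction` are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Normalize raw measurements to zero mean and unit variance."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Coefficients copied from the slide (ELVIS and TOOT share the same predictors).
WEIGHTS = {
    "ELVIS": {"COMP": 0.21, "MRS": 0.47, "ET": -0.15},
    "TOOT": {"COMP": 0.35, "MRS": 0.45, "ET": -0.14},
}


def predicted_satisfaction(system, comp, mrs, et):
    """Weighted sum of normalized predictors for one dialogue."""
    w = WEIGHTS[system]
    return w["COMP"] * comp + w["MRS"] * mrs + w["ET"] * et


# Hypothetical raw measurements for three dialogues.
comp = z_scores([1, 0, 1])          # perceived task completion (yes = 1)
mrs = z_scores([0.9, 0.5, 0.7])     # mean recognition score
et = z_scores([120, 300, 180])      # elapsed time in seconds

scores = [predicted_satisfaction("ELVIS", c, m, e) for c, m, e in zip(comp, mrs, et)]
print(scores)
```

In this toy data the second dialogue (task not completed, poor recognition, longest duration) receives the lowest predicted satisfaction.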
![Page 13: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/13.jpg)
Performance Model
• Perceived task completion and mean recognition score are consistently significant predictors of User Satisfaction
• Performance model useful for system development
  • Making predictions about system modifications
  • Distinguishing ‘good’ dialogues from ‘bad’ dialogues
• But can we also tell on-line when a dialogue is ‘going wrong’?
![Page 14: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/14.jpg)
Course Overview
• Spoken Dialogue Systems today
  • Evaluating their weaknesses
  • Role of intonational variation
• Importance of corpora and conventions for annotating them
• Intonational ‘meanings’
• Prosody in Speech Generation
• Prosody in Speech Recognition/Understanding
![Page 15: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/15.jpg)
How to Predict Problems ‘On-Line’?
• Evidence of system misconceptions reflected in user responses (Krahmer et al ‘99, ‘00)
  • Responses to incorrect verifications
    • contain more words (or are empty)
    • show marked word order (especially after implicit verifications)
    • contain more disconfirmations, more repeated/corrected info
  • ‘No’ after incorrect verifications vs. other ynq’s
    • has higher boundary tone
    • wider pitch range
    • longer duration
    • longer pauses before and after
    • more additional words after it
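
The lexical cues listed above lend themselves to a simple check on each user response. The sketch below is only an illustration of that idea: the cue word list, the length threshold, and the `Turn` structure are hypothetical, and the prosodic cues (boundary tone, pitch range, duration, pauses) would need acoustic measurements that are not modeled here.

```python
from dataclasses import dataclass

DISCONFIRMATIONS = {"no", "not", "wrong"}  # hypothetical cue words


@dataclass
class Turn:
    words: list        # tokens of the user's response
    prior_slots: set   # lowercase values the user had already supplied


def problem_cues(turn: Turn, long_turn: int = 8) -> list:
    """Return the names of the lexical cues present in one user response."""
    tokens = [w.lower().strip(".,!?") for w in turn.words]
    cues = []
    if len(tokens) == 0:
        cues.append("empty")
    elif len(tokens) >= long_turn:  # threshold is an arbitrary assumption
        cues.append("long")
    if any(t in DISCONFIRMATIONS for t in tokens):
        cues.append("disconfirmation")
    if any(t in turn.prior_slots for t in tokens):
        cues.append("repeated_info")
    return cues


correction = Turn("No, I said Amsterdam".split(), {"amsterdam"})
acceptance = Turn("Yes".split(), {"amsterdam"})
print(problem_cues(correction))
print(problem_cues(acceptance))
```

A system could treat a response with several cues as a hint that its previous verification was wrong, though any such decision rule would have to be tuned on labeled dialogues.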
![Page 16: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/16.jpg)
• User information state reflected in response (Shimojima et al ’99, ‘01)
  • Echoic responses repeat prior information, as acknowledgment or request for confirmation
    S1: Then go to Keage station.
    S2: Keage.
  • Experiment:
    • Identify ‘degree of integration’ and prosodic features (boundary tone, pitch range, tempo, initial pause)
    • Perception studies to elicit ‘integration’ effect
  • Results: fast tempo, little pause and low pitch signal high integration
![Page 17: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/17.jpg)
Can Prosodic Information Help Identify Dialogue System Problems ‘On Line’?
![Page 18: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/18.jpg)
Motivation
• Prosody conveys information about:
  • The state of the interaction:
    • Is the user having trouble being understood?
    • Is the user having trouble understanding the system?
  • What the speaker is trying to convey
    • Is this a statement or a question?
  • The structure of the dialogue
    • Is the user or the system trying to start a new topic?
  • The emotions of the speaker
    • Is the speaker getting angry, frustrated?
![Page 19: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/19.jpg)
Past Research Issues and Applications
• How prosodic variation influences ‘meaning’
  • Focus or contrast
  • Given/new
• How prosodic variation is related to other linguistic components
  • Syntax
  • Semantics
• How to model prosodic variation effectively
• Applications: Text-to-Speech
![Page 20: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/20.jpg)
Current Trends
• New description schemes (e.g. ToBI)
• Corpus-based research and machine learning
• Emphasis on evaluation of algorithms and systems (NLE ‘00 special issue)
• Investigation of spontaneous speech phenomena and variation in speaking style
• Applications to CTS, ASR and SDS
![Page 21: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/21.jpg)
Course Overview
• Spoken Dialogue Systems today
  • Evaluating their weaknesses
  • Role of intonational variation
• Importance of corpora and conventions for annotating them
• Intonational ‘meanings’
• Prosody in Speech Generation
• Prosody in Speech Recognition/Understanding
![Page 22: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/22.jpg)
Corpora
• Public and semi-public databases
  • ATIS, SwitchBoard, Call Home (NIST/DARPA/LDC)
  • TRAINS/TRIPS (U. Rochester)
  • FM Radio (BU)
• Private collections
  • Acquired for speech or dialogue research (e.g. August, Gustafson & Bell ’00)
  • Meeting, call center, focus group collections
  • Accidentally collected
• The Web
  • Mud/Moo dialogues
![Page 23: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/23.jpg)
To(nes and)B(reak)I(ndices)
• Developed by prosody researchers in four meetings over 1991-94
• Goals:
  • devise common labeling scheme for Standard American English that is robust and reliable
  • promote collection of large, prosodically labeled, shareable corpora
• ToBI standards also proposed for Japanese, German, Italian, Spanish, British and Australian English, …
![Page 24: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/24.jpg)
• Minimal ToBI transcription:
  • recording of speech
  • f0 contour
  • ToBI tiers:
    • orthographic tier: words
    • break-index tier: degrees of juncture (Price et al ‘89)
    • tonal tier: pitch accents, phrase accents, boundary tones (Pierrehumbert ‘80)
    • miscellaneous tier: disfluencies, non-speech sounds, etc.
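
One way to picture the components listed above is as a record with one time-stamped label list per tier. The sketch below is a hypothetical in-memory layout, not a ToBI file format; the class names, field names, file paths, and time values are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Label:
    time: float  # seconds from the start of the recording
    text: str    # the label itself, e.g. a word, "H*", or "4"


@dataclass
class ToBITranscription:
    audio_path: str  # recording of speech
    f0_path: str     # f0 contour, stored separately
    orthographic: list = field(default_factory=list)
    break_index: list = field(default_factory=list)
    tonal: list = field(default_factory=list)
    miscellaneous: list = field(default_factory=list)

    def is_aligned(self) -> bool:
        """One simple sanity check: every word is followed by a break index."""
        return len(self.orthographic) == len(self.break_index)


utterance = ToBITranscription(
    audio_path="take_the_bus.wav",  # hypothetical file names
    f0_path="take_the_bus.f0",
    orthographic=[Label(0.30, "Take"), Label(0.42, "the"), Label(0.80, "bus")],
    break_index=[Label(0.30, "1"), Label(0.42, "1"), Label(0.80, "4")],
    tonal=[Label(0.65, "H*"), Label(0.80, "L-L%")],
)
print(utterance.is_aligned())
```

Keeping the tiers as separate lists mirrors the way each tier is labeled independently against the same time axis.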
![Page 25: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/25.jpg)
Sample ToBI Labeling
![Page 26: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/26.jpg)
• Online training material, available at: http://www.ling.ohio-state.edu/phonetics/ToBI/
• Evaluation
  • Good inter-labeler reliability for expert and naive labelers: 88% agreement on presence/absence of tonal category, 81% agreement on category label, 91% agreement on break indices to within 1 level (Silverman et al. ‘92, Pitrelli et al ‘94)
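
The three agreement figures above correspond to three different matching criteria. The sketch below shows how such percentages could be computed, assuming agreement is counted over every pair of labelers and every labeled item; the labels are invented, and the exact counting procedure of the cited studies may differ.

```python
from itertools import combinations


def pairwise_agreement(labelings, agree):
    """Share of (labeler pair, item) comparisons for which agree(a, b) holds."""
    total = hits = 0
    for first, second in combinations(labelings, 2):
        for a, b in zip(first, second):
            total += 1
            hits += bool(agree(a, b))
    if total == 0:
        raise ValueError("need at least two labelers and one item")
    return hits / total


# Hypothetical labels from three labelers for the same four words.
breaks = [[1, 1, 3, 4], [1, 2, 4, 4], [1, 1, 3, 4]]
accents = [
    ["H*", None, "L+H*", None],
    ["H*", None, "H*", None],
    ["H*", "H*", "L+H*", None],
]

within_one = pairwise_agreement(breaks, lambda a, b: abs(a - b) <= 1)
presence = pairwise_agreement(accents, lambda a, b: (a is None) == (b is None))
same_label = pairwise_agreement(accents, lambda a, b: a == b)
print(within_one, presence, same_label)
```

As on the slide, agreement on the exact category label comes out lower than agreement on mere presence or absence, because labelers can agree that a word is accented while disagreeing on the accent type.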
![Page 27: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/27.jpg)
Course Overview
• Spoken Dialogue Systems today
  • Evaluating their weaknesses
  • Role of intonational variation
• Importance of corpora and conventions for annotating them
• Intonational ‘meanings’
• Prosody in Speech Generation
• Prosody in Speech Recognition/Understanding
![Page 28: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/28.jpg)
Pitch Accent/Prominence in ToBI
• Which items are made intonationally prominent and how?
• Accent type:
  • H* simple high (declarative)
  • L* simple low (ynq)
  • L*+H scooped, late rise (uncertainty/incredulity)
  • L+H* early rise to stress (contrastive focus)
  • H+!H* fall onto stress (implied familiarity)
![Page 29: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/29.jpg)
• Downstepped accents:
  • !H*, L+!H*, L*+!H
• Degree of prominence:
  • within a phrase: HiF0
  • across phrases
![Page 30: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/30.jpg)
Functions of Pitch Accent
• Given/new information
  S: Do you need a return ticket.
  U: No, thanks, I don’t need a return.
• Contrast (narrow focus)
  U: No, thanks, I don’t need a RETURN…. (I need a time schedule, receipt,…)
• Disambiguation of discourse markers
  S: Now let me get you the train information.
  U: Okay (thanks) vs. Okay….(but I really want…)
![Page 31: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/31.jpg)
Prosodic Phrasing in ToBI
• ‘Levels’ of phrasing:
  • intermediate phrase: one or more pitch accents plus a phrase accent (H- or L-)
  • intonational phrase: 1 or more intermediate phrases + boundary tone (H% or L%)
• ToBI break-index tier
  • 0 no word boundary
  • 1 word boundary
  • 2 strong juncture with no tonal markings
  • 3 intermediate phrase boundary
  • 4 intonational phrase boundary
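
The break-index levels above imply a nesting of phrases that can be recovered mechanically from a labeled word sequence. The following sketch groups words into phrases by closing a phrase after any break index at or above a chosen level; the function name and the break labels for the example sentence are hypothetical.

```python
def split_phrases(words, breaks, level):
    """Group words into phrases, closing a phrase after any break index >= level."""
    if len(words) != len(breaks):
        raise ValueError("expected one break index per word")
    phrases, current = [], []
    for word, brk in zip(words, breaks):
        current.append(word)
        if brk >= level:
            phrases.append(current)
            current = []
    if current:  # trailing words without a closing boundary
        phrases.append(current)
    return phrases


words = "You should buy the ticket with the discount coupon".split()
breaks = [1, 1, 1, 1, 3, 1, 1, 1, 4]  # hypothetical labeling

intermediate = split_phrases(words, breaks, level=3)
intonational = split_phrases(words, breaks, level=4)
print(intermediate)
print(intonational)
```

With this labeling the utterance is one intonational phrase containing two intermediate phrases, split after "ticket".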
![Page 32: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/32.jpg)
Functions of Phrasing
• Disambiguates syntactic constructions, e.g. PP attachment:
  S: You should buy the ticket with the discount coupon.
• Disambiguates scope ambiguities, e.g. Negation:
  S: You aren’t booked through Rome because of the fare.
• Or modifier scope:
  S: This fare is restricted to retired politicians and civil servants.
![Page 33: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/33.jpg)
Contours: Accent + Phrasing
• What do intonational contours ‘mean’ (Ladd ‘80, Bolinger ‘89)?
  • Speech acts (statements, questions, requests)
    S: That’ll be credit card? (L* H- H%)
  • Propositional attitude (uncertainty, incredulity)
    S: You’d like an evening flight. (L*+H L- H%)
  • Speaker affect (anger, happiness, love)
    U: I said four SEVEN one! (L+H* L- L%)
  • “Personality”
    S: Welcome to the Sunshine Travel System.
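
The contour labels in parentheses above each combine a pitch accent, a phrase accent, and a boundary tone. As a small illustration, the sketch below splits such a label into its three parts. The regular expression and the function name are assumptions made for this example, and it handles only single-accent labels of the form shown on the slide.

```python
import re

# Pitch accent, then phrase accent (H- or L-), then boundary tone (H% or L%).
CONTOUR = re.compile(r"^(?P<accent>\S+)\s+(?P<phrase>[HL]-)\s*(?P<boundary>[HL]%)$")


def parse_contour(label):
    """Split a label such as 'L* H- H%' into its three tonal components."""
    match = CONTOUR.match(label.strip())
    if match is None:
        raise ValueError(f"not a recognised contour label: {label!r}")
    return match.group("accent"), match.group("phrase"), match.group("boundary")


for label in ["L* H- H%", "L*+H L- H%", "L+H* L- L%"]:
    accent, phrase, boundary = parse_contour(label)
    print(label, "->", accent, phrase, boundary)
```

Separating the components this way makes it easy to group utterances by, for example, final boundary tone when comparing the interpretations listed on the slide.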
![Page 34: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/34.jpg)
Pitch Range and Timing
• Level of speaker engagement
  S: Welcome to InfoTravel. How may I help you?
• Contour interpretation
  S: You can take the L*+H bus from Malpensa to Rome L-H%.
  U: Take the bus. vs. Take the bus!
• Discourse/topic structure
![Page 35: Intonational Variation in Spoken Dialogue Systems](https://reader035.fdocuments.net/reader035/viewer/2022062221/56812d17550346895d920416/html5/thumbnails/35.jpg)
Can systems make use of this information?
Can they produce it??
Can they recognize it??