Dialogue Act Tagging Discourse and Dialogue CMSC 35900-1 November 4, 2004.
Dialogue Act Tagging
Discourse and Dialogue
CMSC 35900-1
November 4, 2004
Roadmap
• Maptask overview
• Coding
  – Transactions
  – Games
  – Moves
• Assessing agreement
Maptask
• Conducted by HCRC – Edinburgh/Glasgow
• Task structure:
  – 2 participants: Giver, Follower
  – 2 slightly different maps
• Giver guides Follower to destination on own map
  – Forces interaction, ambiguities, disagreements, etc.
  – Conditions: Familiar/not; Visible/not
Dialogue Tagging
• Goal: Represent dialogue structure as generically as possible
• Three-level scheme:
  – Transactions
    • Major subtasks in participants' overall task
  – Conversational Games
    • Correspond to G&S discourse segments
  – Conversational Moves
    • Initiation and response steps
Basic Dialogue Moves
• Initiations and responses
• Cover acts observed in dialogue – generalized
Initiations:
  – Instruct: tell to carry out some action
  – Explain: give unelicited information
  – Check: ask for confirmation
  – Align: check attention
  – Query-yn
  – Query-wh
Responses:
  – Acknowledge: signal understanding & acceptance
  – Reply-y
  – Reply-n
  – Reply-wh
  – Clarify
Ready: inter-game moves
Game Coding
• Initiation:
  – Identified by first move
• Purpose – carry through to completion
  – May embed other games – Mark level
  – Mark completion/abandonment
Interrater Agreement
• How good is tagging? A tagset?
• Criterion: How accurate/consistent is it?
• Stability:
  – Is the same rater self-consistent?
• Reproducibility:
  – Do multiple annotators agree with each other?
• Accuracy:
  – How well do coders agree with some “gold standard”?
Agreement Measure
• Krippendorff’s Kappa (K)
  – Applies to classification into discrete categories
  – Corrects for chance agreement
    • K < 0: agree less than expected by chance
  – Quality intervals:
    • K >= 0.8: Very good; 0.6 < K < 0.8: Good, etc.
• Maptask: K = 0.92 on segmentation, K = 0.83 on move labels
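The chance correction above can be illustrated with a minimal sketch. It assumes the simple two-coder form, K = (P_obs - P_exp) / (1 - P_exp), with expected agreement taken from each coder's own label distribution; the slides do not say which variant was used for the Maptask figures, and the function name `kappa` is only illustrative.

```python
from collections import Counter

def kappa(labels_a, labels_b):
    """Chance-corrected agreement between two coders labelling the same items.

    Assumption: two-coder form, with chance agreement estimated from each
    coder's own label frequencies.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label sequences")
    n = len(labels_a)
    # Observed agreement: fraction of items that received the same label.
    p_obs = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if both coders labelled at random with their own rates.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_exp = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_exp == 1.0:
        return 1.0  # both coders used one and the same label throughout
    return (p_obs - p_exp) / (1 - p_exp)
```

Two coders who agree on every item get K = 1; coders who agree less often than their label frequencies predict get K < 0.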
Dialogue Act Tagging
• Other tagsets
  – DAMSL, SWBD-DAMSL, VERBMOBIL, etc.
• Many common move types
  – Vary in granularity
    • Number of moves, types
    • Assignment of multiple moves
Dialogue Act Recognition
• Goal: Identify dialogue act tag(s) from surface form
• Challenge: Surface form can be ambiguous
  – “Can you X?” – yes/no question, or info-request
    • “Flying on the 11th, at what time?” – check, statement
• Requires interpretation by hearer
  – Strategies: Plan inference, cue recognition
Plan-inference-based
• Classic AI (BDI) planning framework
  – Model Belief, Knowledge, Desire
• Formal definition with predicate calculus
  – Axiomatization of plans and actions as well
  – STRIPS-style: Preconditions, Effects, Body
  – Rules for plan inference
• Elegant, but...
  – Labor-intensive rule, KB, heuristic development
  – Effectively AI-complete
Cue-based Interpretation
• Employs sets of features to identify
  – Words and collocations: Please -> request
  – Prosody: Rising pitch -> yes/no question
  – Conversational structure: prior act
• Example: Check:
  – Syntax: tag question “, right?”
  – Syntax + prosody: Fragment with rise
  – N-gram: argmax_d P(d) P(W|d)
    • So you, sounds like, etc.
• Details later...
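The N-gram cue, argmax_d P(d) P(W|d), can be sketched with a unigram model per dialogue act. This is only an illustration: the unigram independence assumption, the add-one smoothing, and the names `train` and `classify` are choices made here, not details given in the slides.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """examples: list of (list_of_words, act_label) pairs."""
    act_counts = Counter()              # how often each act d occurs -> P(d)
    word_counts = defaultdict(Counter)  # word counts per act -> P(w|d)
    vocab = set()
    for words, act in examples:
        act_counts[act] += 1
        word_counts[act].update(words)
        vocab.update(words)
    return act_counts, word_counts, vocab

def classify(words, model):
    """Return argmax_d P(d) * prod_w P(w|d), computed in log space."""
    act_counts, word_counts, vocab = model
    total = sum(act_counts.values())
    best_act, best_score = None, -math.inf
    for act, n in act_counts.items():
        score = math.log(n / total)  # log P(d)
        # Add-one smoothing; the extra +1 reserves mass for unseen words.
        denom = sum(word_counts[act].values()) + len(vocab) + 1
        for w in words:
            score += math.log((word_counts[act][w] + 1) / denom)
        if score > best_score:
            best_act, best_score = act, score
    return best_act
```

With a handful of training utterances, cue phrases such as "so you" or a trailing ", right ?" pull an utterance toward Check, because those words are more probable under that act's model.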
Recognizing Maptask Acts
• Assume:
  – Word-level transcription
  – Segmentation into utterances
  – Ground truth DA tags
• Goal: Train classifier for DA tagging
  – Exploit:
    • Lexical and prosodic cues
    • Sequential dependencies between DAs
  – 14810 utterances, 13 classes
Features for Classification
• Acoustic-Prosodic Features:
  – Pitch, Energy, Duration, Speaking rate
    • Raw and normalized, whole utterance, last 300ms
    • 50 real-valued features
• Text Features:
  – Count of unigrams, bigrams, trigrams
    • Appear multiple times
    • 10000 features, sparse
• Features z-score normalized
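The two feature steps named above, n-gram counting and z-score normalization, can be sketched as follows. The helper names `ngram_counts` and `zscore_columns` are hypothetical, and the sketch leaves out the prosodic extraction and the frequency cutoff for n-grams.

```python
from collections import Counter
from statistics import mean, pstdev

def ngram_counts(words, max_n=3):
    """Count every unigram, bigram and trigram in one utterance."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            counts[tuple(words[i:i + n])] += 1
    return counts

def zscore_columns(rows):
    """Normalize each feature column to zero mean and unit variance.

    rows: list of equal-length lists of numbers (one row per utterance).
    A constant column has zero variance; it is mapped to all zeros.
    """
    columns = list(zip(*rows))
    mus = [mean(c) for c in columns]
    sigmas = [pstdev(c) or 1.0 for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, mus, sigmas)] for row in rows]
```

In a setup like the one described, the n-gram counts would be mapped to sparse vector positions and concatenated with the prosodic features before normalization; that mapping is omitted here.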
Classification with SVMs
• Support Vector Machines
  – Create n(n-1)/2 binary classifiers
    • Weight classes by inverse frequency
    • Learn weight vector and bias, classify by sign
  – Platt scaling to convert outputs to probabilities
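A rough sketch of the pieces listed above: enumerating the n(n-1)/2 class pairs, classifying by the sign of w·x + b, combining pairwise decisions by voting, and Platt's sigmoid. The voting rule and all weights used with it are assumptions for illustration; the slides do not say how the pairwise outputs were combined, and no SVM training is shown.

```python
import math
from itertools import combinations

def class_pairs(classes):
    """All unordered class pairs: n(n-1)/2 binary problems for n classes."""
    return list(combinations(classes, 2))

def decision(weights, bias, x):
    """Linear decision value w.x + b; its sign gives the binary decision."""
    return sum(w * v for w, v in zip(weights, x)) + bias

def one_vs_one_predict(x, pairwise):
    """pairwise: {(a, b): (weights, bias)}; a non-negative value votes for a.

    Assumption: pairwise decisions are combined by simple majority voting.
    """
    votes = {}
    for (a, b), (weights, bias) in pairwise.items():
        winner = a if decision(weights, bias, x) >= 0 else b
        votes[winner] = votes.get(winner, 0) + 1
    return max(votes, key=votes.get)

def platt_probability(score, a, b):
    """Platt scaling: P(positive | score) = 1 / (1 + exp(a * score + b)).

    a and b are fitted on held-out decision values; a is normally negative,
    so larger scores give higher probabilities.
    """
    return 1.0 / (1.0 + math.exp(a * score + b))
```

For the 13 Maptask classes, `class_pairs` yields 78 binary problems.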
Incorporating Sequential Constraints
• Some sequences of DA tags more likely:
  – E.g. P(affirmative after y-n-Q) = 0.5
  – P(affirmative after other) = 0.05
• Learn P(y_i | y_{i-1}) from corpus
  – Tag sequence probabilities
  – Platt-scaled SVM outputs are P(y|x)
• Viterbi decoding to find optimal sequence
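The decoding step can be sketched as a standard Viterbi search over tag sequences. As a simplifying assumption, the sketch multiplies the transition probability P(y_i | y_{i-1}) directly by the per-utterance posterior P(y_i | x_i); the slides do not give the exact combination, and the argument names are illustrative.

```python
import math

def viterbi(posteriors, transitions, initial):
    """Most probable tag sequence under a first-order tag model.

    posteriors:  list of dicts, posteriors[i][tag] = P(tag | x_i)
    transitions: transitions[prev][tag] = P(tag | prev)
    initial:     initial[tag] = P(tag) for the first utterance
    """
    if not posteriors:
        return []
    tags = list(initial)

    def log(p):
        return math.log(p) if p > 0 else -math.inf

    # Best log score of any path ending in each tag at the first utterance.
    best = {t: log(initial[t]) + log(posteriors[0][t]) for t in tags}
    backpointers = []
    for post in posteriors[1:]:
        new_best, pointers = {}, {}
        for t in tags:
            # Best predecessor for tag t at this position.
            prev = max(tags, key=lambda p: best[p] + log(transitions[p][t]))
            new_best[t] = best[prev] + log(transitions[prev][t]) + log(post[t])
            pointers[t] = prev
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final tag.
    path = [max(tags, key=best.get)]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]
```

With transition probabilities like those on the slide, an utterance whose classifier output slightly favours another tag can still be decoded as an affirmative reply when it follows a yes/no query.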
Results
                SVM Only   SVM+Seq
Text Only         58.1       59.1
Prosody Only      41.4       42.5
Text+Prosody      61.8       65.5
From Human to Computer
• Conversational agents
  – Systems that (try to) participate in dialogues
  – Examples: Directory assistance, travel info, weather, restaurant and navigation info
• Issues:
  – Limited understanding: ASR errors, interpretation
  – Computational costs:
    • broader coverage -> slower, less accurate
Dialogue Manager Tradeoffs
• Flexibility vs Simplicity/Predictability
  – System vs User vs Mixed Initiative
  – Order of dialogue interaction
  – Conversational “naturalness” vs Accuracy
  – Cost of model construction, generalization, learning, etc.
• Models: FST, Frame-based, HMM, BDI
• Evaluation frameworks