Speaking while monitoring addressees for understanding

Torsten Jachmann, 16.12.2013. Herbert H. Clark and Meredyth A. Krych. Seminar „Gaze as function of instructions - and vice versa“

Page 1

Speaking while monitoring addressees for understanding

Torsten Jachmann

16.12.2013

Herbert H. Clark and Meredyth A. Krych

Seminar „Gaze as function of instructions - and vice versa“

Page 2

Research Question
• Speaking and listening in dialog
  o Unilateral
    • Speakers and listeners act autonomously
    • No interaction
  o Bilateral
    • Speakers and listeners monitor their respective partner
    • Joint activity

What do speakers monitor?
How do they use that information?

Page 3

Grounding
• Level 1
  o Attend to vocalization
• Level 2
  o Identify words, phrases and sentences
• Level 3
  o Understand the meaning
• Level 4
  o Consider answering

Page 4

Grounding
A: Were you there when they erected the new signs?
B: Th… which new signs? (Level 3)
A: Little notice boards, indicating where you had to go for everything
B: No.

Bilateral account

Page 5

Monitoring
• Voices
  o Attending to the partner's utterances
• Faces
  o Gaze and facial expressions as indicators of understanding
• Workspaces
  o Region in front of the body
  o Manual gestures (but also games, etc.)

Page 6

Monitoring
• Bodies
  o Head and torso movement as indicator
• Shared Scenes
  o Scenery beyond workspace
• Signals vs. Symptoms
  o Signals are constructed to get meaning across
  o Symptoms are not intentionally created

Page 7

Least joint effort
• Opportunistic
  o Selection of the available methods that take the least effort to produce
• “Tailored”
  o Overhearers (not monitored by speaker) may misunderstand utterances

Page 8

Method
• Pairs of directors and builders
  o 76 students (34 male / 42 female)
• Instructions to build 10 simple Lego models
• 2 x 2 design (interactive)
  o 28 pairs
• Additional non-interactive condition
  o 10 pairs
• Video and audio analyses

Page 9

Interactive
• Mixed design
  o Workspace (between subjects)
    • Visible
    • Invisible
  o Faces (within subjects)
    • Visible
    • Invisible
• No restrictions on time or talk

Page 10

Non-interactive
• Only one condition
• Director records instructions
  o No time or talk constraints
  o Prototype can be examined as long as wanted before recording
• Builders listen to instructions
  o No constraints on actions
    • Start, stop, rewind

Page 11

Results
• Efficiency
• Turns
• Gestures and grounding
  o Deictic expressions
  o Gestures by addressees
  o Cross-timing of actions
  o Timing strategies
  o Visual monitoring

Page 12

Efficiency

• Visibility of workspace improves efficiency

Page 13

Efficiency
Non-interactive
• Time needed to build much longer (245 s non-interactive vs. 183 s interactive)
• Strong drop in accuracy
  o Inadequate instructions

Page 14

Turns

• Fewer SPOKEN turns of builder when workspace is visible

Page 15

Deictic expressions
• Mainly unusable when workspace hidden
  o Joint attention needed
  o Only referring to a previously mentioned situation

Page 16

Gestures by addressees
• Mostly accompanied by deictic utterances (if any)
• Explicit verdict usually only on such utterances (otherwise continuing)

Page 17

Cross-timing
• Gestural signals
  o Reflect understanding at that moment

Page 18

Cross-timing
• Overlapping signals
  o Usually not in spoken dialog
  o Start with “sufficient information”

Page 19

Cross-timing
• Projecting
  o Prediction of following actions/instructions

Page 20

Cross-timing
• Initiation time
  o Waiting for partner to be able to attend to the following utterance

Page 21

Cross-timing
• Time uptake
  o Responses have to be timed exactly to the action and situation

Page 22

Timing strategies
• Self-interruption
  o Dealing with evidence from the addressee
  o Usually not continued

Page 23

Timing strategies
• Collaborative references
  o Deictic references rely on addressees' actions

Page 24

Visual monitoring
• Mainly used when director reaches a problem
• Eye gaze as support

Page 25

Conclusion
• Grounding is fundamental
• Visible workspace enhances grounding speed
• In task-oriented dialogs faces are not important
• Compensation possible (only if any monitoring is available)

Page 26

Conclusion
• Updating common ground
• Increments are determined jointly
• Much evidence for bilateral account
  o Addressees provide statement about current understanding
  o Speakers monitor to update and change utterances

Page 27

Conclusion
• Opportunistic process
  o Offering options
  o Self-interruptions
  o Waiting
  o Instant revision
• Multi-modal process
  o Speech and gestures are combined if possible
  o Speech alone takes more time

Page 28

Remarks
• Gaze only important for certain types of tasks
• Measurement of time may be outdated (“old” study)
• No contradicting studies (to some extent common sense)

Page 29

Gaze and Turn-Taking Behavior in Casual Conversation Interactions
Kristiina Jokinen, Hirohisa Furukawa, Masafumi Nishida and Seiichi Yamamoto

Page 30

Differences

• Three-party dialogue

• No instructional task

• Stronger focus on eye gaze

Page 31

Research Question
• How well can eye gaze help in predicting turn taking?
• What is the role of eye gaze when the speaker holds the turn?
• Is the role of eye gaze as important in three-party dialogs as in two-party dialogs?

Page 32

Hypothesis
• In group discussions, eye gaze is important in turn management (especially in turn holding cases)
• The speaker is more influential than the other partners in coordinating interactions (selects the next speaker)

Page 33

Method
• Three-person conversational eye gaze corpus
  o Natural conversations
  o Balanced familiarity (50% familiar; 50% unfamiliar)
  o Balanced gender (male-only; female-only; mixed)

Page 34

Method
• 28 conversations among Japanese students in their early 20s with three participants each
• Each conversation about 10 minutes
• Eye gaze recorded for one participant

Page 35

Method
• Eye tracker fixed on table to retain naturalness

Page 36

Method

Page 37

Used data
• Estimated at the last 300 ms of an utterance if followed by a 500 ms pause
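
The selection rule on this slide can be illustrated with a minimal Python sketch. It is not the procedure from Jokinen et al.; only the 300 ms window and the 500 ms pause come from the slide. The `Utterance` record, the function name `analysis_windows`, the reading of "a 500 ms pause" as "at least 500 ms", and measuring the pause up to the next utterance by any speaker are assumptions made for illustration.

```python
from dataclasses import dataclass

WINDOW_MS = 300     # window at the end of an utterance (from the slide)
MIN_PAUSE_MS = 500  # following pause (from the slide; ">=" is an assumption)


@dataclass
class Utterance:
    """Hypothetical record for one utterance, times in milliseconds."""
    speaker: str
    start_ms: int
    end_ms: int


def analysis_windows(utterances):
    """Return (start_ms, end_ms) windows covering the last 300 ms of every
    utterance that is followed by a pause of at least 500 ms.

    The final utterance has no following utterance, so it is skipped.
    """
    ordered = sorted(utterances, key=lambda u: u.start_ms)
    windows = []
    for current, following in zip(ordered, ordered[1:]):
        pause = following.start_ms - current.end_ms
        if pause >= MIN_PAUSE_MS:
            # Clip the window for utterances shorter than 300 ms.
            start = max(current.start_ms, current.end_ms - WINDOW_MS)
            windows.append((start, current.end_ms))
    return windows


example = [
    Utterance("A", 0, 1200),     # followed by a 600 ms pause: selected
    Utterance("B", 1800, 2500),  # followed by a 200 ms pause: not selected
    Utterance("C", 2700, 4000),  # last utterance: skipped
]
print(analysis_windows(example))
```

Overlapping speech yields a negative pause in this sketch and is therefore never selected; a real corpus with three speakers would need a more careful definition of "pause".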

Page 38

Used data
• Dialog acts
• Speech features
  o Values of F0, etc.
• Eye gaze

Page 39

Results

Page 40

Conclusion
• Speaker signals whether he intends to give the turn or hold it by using eye gaze
  o Fixating the listener vs. focusing attention elsewhere
• Eye gaze in multi-participant conversations as important as in two-participant conversations

Page 41

Conclusion
• Eye gaze is used to select next speaker (seems to be correct)
• Maybe Japanese data interferes with value of speech data
  o Comparison study?
• Listeners focus on speaker, not vice versa

Page 42

Remarks
• Vague information and data presentation
  o Although various data exist, interaction of factors is not presented
  o Some conclusions rely on the aforementioned point
• Setup only takes one participant into consideration
• Much of the data was unused
  o Lacking in quality and in the way it was created

Page 43

Remarks
• Study is based on data for another study
  o Setup is not optimal
• Realistic design
  o Yet, contains biasing flaws (situation of the participants, only one eye tracker)

Page 44

Comparison
• Clark and Krych present interesting ideas, but eye gaze is only rarely handled
  o How could this be altered?
• Jokinen et al. focus on eye gaze in a (more or less) natural situation but are lacking in scientific results and setup
  o What points and ideas of this setup could be beneficial?