Spoken Dialogue Systems and Social Talk - Emer Gilmartin

SPOKEN DIALOGUE SYSTEMS AND SOCIAL TALK

Emer Gilmartin

Speech Communication Lab
Trinity College Dublin

What?
• Spoken dialogue systems attempt to create a spoken interaction with a user
• Dialogue systems, Intelligent Virtual Agents (IVAs), Embodied Conversational Agents (ECAs), Chatbots
• Dream (Turing, 1950) vs Practical Progress (Allen, 2000)
• AI – early chat – pattern matching – ELIZA
• Practical Dialogues – task to be performed – Practical Dialogue Hypothesis (Allen, 2000)

What’s out there?
• Command and Control – voice commands
• Information Retrieval – Siri
• Interactive Voice Response – IVR
• Chatbots
• Embodied Conversational Agents (ECA)
• Intelligent Virtual Agents

Casual conversation – the unmarked case

• Ordering a pizza (transactional)
  • performing a well-defined task
  • content (‘What?’) vital for success

• Chat with neighbour (interactional)
  • building/maintaining social bonds
  • social (‘How?’) very important

• ‘continuing state of incipient talk’

What is a Spoken Dialogue System?

Multimodality

• Expression and Recognition
• Audio, visual, verbal, vocal, non-verbal, facial expression, gesture, posture…
• Presence, affect, attitude…

• The Problem: Building social dialogue systems entails understanding of casual social dialogue but…

• Much linguistic theory is based on language similar to writing but highly unlike talk
  • regards spoken interaction as debased, chaotic

• SDS technology based on
  • Practical Dialogue Hypothesis (Allen, 2000)
  • Constraint introduced to make dialogue modelling tractable

• Much corpus study of spoken interaction based on Task-based Dialogue
  • Information gap activities – MapTask (HCRC), DiaPix (Lucid)
  • Meetings – AMI, ICSI
  • These are not corpora of casual or social talk

Social Talk
• Spoken interaction as social activity
  • Malinowski, Dunbar, Jakobson, Brown and Yule
• Structure and Content
  • Smalltalk at the margins (Laver)
  • Chat and chunks (Slade & Eggins)
  • Bouts – gossip, narrative
  • Bouts end with ‘idling’ (Schneider)
  • Phases – greetings, approach, centre, leavetaking (Ventola)
  • Multiparty (Slade)
• Problems:
  • much of this is theory, analysis by example
  • based on orthographic transcriptions
  • corpus-based studies on transactional dyadic interaction, phone calls…

January 15, 2016 IWSDS 2016

Genre differences in spoken interaction?
• Spoken interaction is situated
  • ‘speech-exchange systems’ (SSJ)
  • communicative activities (Allwood)

• Some low level mechanisms may follow universal patterns

• It is also possible that even basic interaction mechanisms such as turn-taking vary with the type and parameters of different interactions

• What might vary?
  • Utterance/turn characteristics
  • Distribution of pauses/gaps/overlaps
  • ‘Disfluencies’, VSUs, laughter…

• Explore different genres and use knowledge to inform design of interfaces
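As a rough illustration of how the distribution of pauses, gaps and overlaps might be measured, the sketch below sorts the silence or overlap between consecutive turns into three bins. The turn timings, the function name `classify_transitions`, and the rule of comparing only adjacent turns are assumptions made for this example; real multiparty data would need a more careful treatment of simultaneous speakers.

```python
from statistics import mean

# Hypothetical turn annotations: (speaker, start_s, end_s), sorted by start time.
turns = [
    ("A", 0.0, 2.0),
    ("B", 2.5, 4.0),  # starts 0.5 s after A ends: between-speaker gap
    ("B", 4.4, 5.0),  # same speaker resumes after 0.4 s: within-speaker pause
    ("C", 4.8, 6.0),  # starts 0.2 s before B ends: overlap
]

def classify_transitions(turns):
    """Bin the offset between each pair of adjacent turns.

    offset = next start - previous end.
    Same speaker, positive offset       -> pause
    Speaker change, offset >= 0         -> gap
    Speaker change, negative offset     -> overlap (stored as a positive duration)
    """
    pauses, gaps, overlaps = [], [], []
    for (spk1, _, end1), (spk2, start2, _) in zip(turns, turns[1:]):
        offset = start2 - end1
        if spk1 == spk2:
            if offset > 0:
                pauses.append(offset)
        elif offset >= 0:
            gaps.append(offset)
        else:
            overlaps.append(-offset)
    return {"pauses": pauses, "gaps": gaps, "overlaps": overlaps}

if __name__ == "__main__":
    result = classify_transitions(turns)
    for kind, durations in result.items():
        if durations:
            print(f"{kind}: n={len(durations)}, mean={mean(durations):.2f} s")
```

Running the same function over recordings from different genres (task-based versus casual talk, for instance) would give per-genre distributions that can then be compared.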

Anatomy of casual conversation



Figure: 10 minutes from a 5-party casual conversation showing chat (240 s to 480 s) and chunk (480 s to end) phases. Red: speech, yellow: laughter, grey: silence.


Chunk to Chunk Transition – more interaction and laughter at end of chunks


Chat/Chunk
• Significant differences in
  • Length – chat very variable, chunk ~30 s
  • Phrase-final prosody
  • Gap lengths
• Important because:
  • Need different timing modules for different phases
  • Useful to know which phase we’re in
• Current work
  • Stochastic model – simple bigram HMM can classify chat/chunk
  • Goal – online classifier – knowledge additional to ASR to inform dialogue management
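The slides do not say which observations or parameters the HMM uses, so the following is only a sketch of one possible reading: a two-state first-order ("bigram") HMM whose hidden states are chat and chunk, decoded with the Viterbi algorithm. The observation symbols (`same`, `change`, `laugh`) and every probability are invented for illustration and are not taken from the study.

```python
import math

STATES = ("chat", "chunk")

# Hypothetical, hand-picked probabilities (illustration only, not estimated from data).
START = {"chat": 0.5, "chunk": 0.5}
TRANS = {
    "chat": {"chat": 0.8, "chunk": 0.2},
    "chunk": {"chat": 0.1, "chunk": 0.9},
}
# Observation per time window: same speaker continues, speaker change, or laughter.
EMIT = {
    "chat": {"same": 0.3, "change": 0.4, "laugh": 0.3},
    "chunk": {"same": 0.8, "change": 0.1, "laugh": 0.1},
}

def viterbi(obs):
    """Return the most likely chat/chunk label sequence for a list of observations."""
    if not obs:
        return []
    # Log-probability of the best path ending in each state at the first window.
    best = {s: math.log(START[s]) + math.log(EMIT[s][obs[0]]) for s in STATES}
    backpointers = []
    for o in obs[1:]:
        prev, best, ptr = best, {}, {}
        for s in STATES:
            # Best predecessor state for s, using only the previous state (bigram).
            p = max(STATES, key=lambda q: prev[q] + math.log(TRANS[q][s]))
            ptr[s] = p
            best[s] = prev[p] + math.log(TRANS[p][s]) + math.log(EMIT[s][o])
        backpointers.append(ptr)
    # Trace back from the best final state.
    state = max(STATES, key=best.get)
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    return path[::-1]

if __name__ == "__main__":
    windows = ["change", "laugh", "change", "same", "same", "same", "same", "same"]
    print(list(zip(windows, viterbi(windows))))
```

Viterbi needs the whole sequence before it labels anything, so an online classifier of the kind named as the goal would more likely use forward filtering over the same model; that variant is not shown here.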


How do I make a spoken dialogue system?
• Virtual Human Toolkit (https://vhtoolkit.ict.usc.edu/)
• Pandorabots (www.pandorabots.com)
• CSLU Toolkit (http://www.cslu.ogi.edu/toolkit/)
• Voxeo Prophecy (http://voxeo.com/prophecy/)
• AT&T Speech Mashups (https://service.research.att.com/smm/login.jsp)

SPOKEN AND MULTIMODAL DIALOGUE APPLICATIONS IN HEALTHCARE

Emer Gilmartin

Speech Communication Lab
SCSS

ELIZA – text-based Rogerian therapist

• Weizenbaum – 1966
• http://www.masswerk.at/elizabot/eliza_test.html
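To make the pattern-matching idea concrete, here is a minimal ELIZA-style sketch: keyword patterns are tried in order, the matched fragment has its pronouns swapped, and the result is slotted into a response template. The rules, the `REFLECT` table and the function names are made up for this example and are not Weizenbaum's original script.

```python
import re

# Hypothetical pronoun swaps applied to the echoed fragment.
REFLECT = {"i": "you", "me": "you", "my": "your", "am": "are",
           "you": "I", "your": "my"}

# Hypothetical rules: (pattern, response template), tried in order.
RULES = [
    (re.compile(r"\bi need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (.+)", re.IGNORECASE), "Tell me more about your {0}."),
]
DEFAULT = "Please go on."

def reflect(fragment):
    """Swap first- and second-person words so the fragment can be echoed back."""
    return " ".join(REFLECT.get(word.lower(), word) for word in fragment.split())

def respond(utterance):
    """Return the response of the first matching rule, or a generic prompt."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(reflect(match.group(1)))
    return DEFAULT

if __name__ == "__main__":
    for line in ["I need a holiday.", "I am worried about my exams", "Hello"]:
        print(f"> {line}\n{respond(line)}")
```

There is no model of the dialogue behind this: the system only reacts to surface patterns in the last utterance, which is why it fits a non-directive therapist persona but does not extend to task-based dialogue.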

Northeastern University – Relational Agents Group
• Relational agents group
  • https://www.youtube.com/watch?v=Jx5Tsn9wFw
• Presentation
  • https://www.youtube.com/watch?v=lcb_rMhJQTI
• RAISE
  • https://www.youtube.com/watch?v=ttBMG-F1HS0

ICT – Virtual Humans
• SimSensei
  • https://www.youtube.com/watch?v=I2aBJ6LjzMw
  • https://www.youtube.com/watch?v=ejczMs6b1Q4

Simulations and Virtual Reality
• ICT – MILES – Training for Therapists
  • https://www.youtube.com/watch?v=KNGVRePdEL8

• ICT - Standard Patient Hospital

The future
• ICT narrative
  • https://www.youtube.com/watch?v=4IBoO2IKFMI
• Bickmore’s warning
  • https://www.youtube.com/watch?v=V20KEjIjFl8
• http://relationalagents.com/index.html