eNTERFACE 08 Project #1 “MultiParty Communication with a Tour Guide ECA” Final presentation, August 29th, 2008


Page 1:

eNTERFACE 08 Project #1

“MultiParty Communication with a Tour Guide ECA”

Final presentation

August 29th, 2008

Page 2:

Outline

• Project Overview

• Objectives, Issues & Work Done

• System Overview

• Configuration and Design

• Conclusion

Page 3:

Project Objectives

• Main objective: develop an ECA Tour Guide system that can interact with one or two users

• Research features:

• multiparty dialogue model and scenario between two humans and an ECA

• handling and combining input data: users’ presence and behaviors (speech, tracking)

• gaze behavior control and nonverbal model of the ECA

Page 4:

Work done: Component Functionality Overview

• We implemented components that support a scenario based on narration and interruptions

• The ECA is the narrator; users can ask context-related questions (“where”, “how”, “when”)

• Speaker, addressee and listener identification; ECA gaze model

• The ECA can ask users simple “yes/no” questions to keep their attention

• The system can detect users’ appearance and dynamically initiate or end a session

• The system can detect and handle situations in which users are paying less attention

• The system can recover from failures (e.g. SR does not recognize a user’s speech)

Page 5:

Work done...about to be done...

• Components are implemented

• System is being integrated

• Debugging and full testing are still needed

• Not supported:

• Detection of the situation when users start their own conversation

• Detection of speech collisions between users

• Smart scheduling and control of the ECA’s behaviors

Page 6:

System Configuration

[Figure: system configuration diagram. Columns: Input, Central Part, Output. Components: Speech Recognition 1, Speech Recognition 2, Okao Vision, OpenCV, NonVerbal Input Understanding, Decision Making Planner (Scenario Component), Animation Player.]

Page 7:

Speech Recognition

Page 8:

Speech Recognition

• Functionality:

• Detects users’ requests (“Where”, “How”, “When”, “Who”)

• Detects users’ wish to leave the system

• Detects answers to simple “yes/no” questions

• Detects unknown words

• Implementation:

• Keyword detection with confidence score and speech duration, implemented using the Loquendo API
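The keyword detection described on this slide could be sketched roughly as follows. This is an illustration only: it does not use the Loquendo API, and the `Hypothesis` structure, the keyword sets (in particular the leave-taking words), and both thresholds are assumptions that do not appear in the slides.

```python
from dataclasses import dataclass

# Hypothetical vocabularies: the slides list the request words,
# but the leave-taking words are assumed for this sketch.
REQUEST_KEYWORDS = {"where", "how", "when", "who"}
ANSWER_KEYWORDS = {"yes", "no"}
LEAVE_KEYWORDS = {"bye", "goodbye"}


@dataclass
class Hypothesis:
    """One recognizer result (hypothetical structure, not a Loquendo type)."""
    keyword: str
    confidence: float  # assumed range 0.0 to 1.0
    duration_s: float  # length of the detected speech in seconds


def classify(hyp: Hypothesis,
             min_confidence: float = 0.5,
             min_duration_s: float = 0.2) -> str:
    """Map a recognizer hypothesis to an event label.

    Results that are too uncertain or too short are treated as unknown,
    as are words outside the known vocabularies.
    """
    if hyp.confidence < min_confidence or hyp.duration_s < min_duration_s:
        return "unknown"
    word = hyp.keyword.lower()
    if word in REQUEST_KEYWORDS:
        return "request"
    if word in ANSWER_KEYWORDS:
        return "answer"
    if word in LEAVE_KEYWORDS:
        return "leave"
    return "unknown"
```

For example, `classify(Hypothesis("Where", 0.9, 0.6))` yields a request event, while the same word with a confidence of 0.3 would be reported as unknown so that the system can recover from the failed recognition.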

Page 9:

Nonverbal Inputs and Understanding

Page 10:

Nonverbal Inputs: Users’ appearance and face orientation

• Functionality of components:

• Detect motion and users’ appearance/disappearance

• Detect the number of users present

• Detect users’ face orientation and increased/decreased attention

• left user, right user

• Implementation:

• OpenCV (motion) & Okao Vision (face orientation, gazing)
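One possible way to turn face orientation into an "increased/decreased attention" signal is sketched below. It does not call OpenCV or Okao Vision; it assumes that some face tracker already delivers one horizontal head angle (yaw, in degrees) per frame, or `None` when no face is found. The 20 degree threshold and the half-window comparison are assumptions made for this sketch.

```python
from typing import List, Optional


def is_attending(yaw_deg: Optional[float], max_yaw_deg: float = 20.0) -> bool:
    """A user counts as attending when a face is visible and roughly frontal."""
    return yaw_deg is not None and abs(yaw_deg) <= max_yaw_deg


def _attending_fraction(samples: List[Optional[float]],
                        max_yaw_deg: float) -> float:
    """Share of samples in which the user was attending."""
    if not samples:
        return 0.0
    return sum(is_attending(y, max_yaw_deg) for y in samples) / len(samples)


def attention_trend(yaw_samples: List[Optional[float]],
                    max_yaw_deg: float = 20.0) -> str:
    """Compare the first and second half of a window of yaw samples."""
    half = len(yaw_samples) // 2
    if half == 0:
        return "steady"  # not enough data to compare
    before = _attending_fraction(yaw_samples[:half], max_yaw_deg)
    after = _attending_fraction(yaw_samples[half:], max_yaw_deg)
    if after > before:
        return "increased"
    if after < before:
        return "decreased"
    return "steady"
```

Running one such window per user (left and right) would give the decision component a separate attention signal for each participant.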

Page 11:

Decision Making Component

Page 12:

Decision Making Component - Functionalities

• Makes decisions about “when and what to do to whom”:

• Handles multimodal input events (number of users, attention, speech channels)

• Handles user interruptions while the ECA is speaking

• Handles failures from the SR component

• Generates multimodal output and controls the ECA’s gazing

• Simple rule: “First one will be served”

• The “yes”/“no” questionnaire is an exception

• No domain knowledge and no behavior scheduling
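The “First one will be served” rule and its “yes/no” exception might look roughly like the sketch below. The `SpeechEvent` structure, the event labels, and the three-second answer window are assumptions introduced for illustration; the slides do not specify them.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SpeechEvent:
    """One message from a speech channel (hypothetical structure)."""
    user: str      # "left" or "right"
    label: str     # e.g. "request", "yes", "no"
    time_s: float  # arrival time in seconds


def select_request(events: List[SpeechEvent]) -> Optional[SpeechEvent]:
    """First one will be served: the earliest request wins."""
    requests = [e for e in events if e.label == "request"]
    return min(requests, key=lambda e: e.time_s) if requests else None


def collect_answers(events: List[SpeechEvent],
                    asked_at_s: float,
                    window_s: float = 3.0) -> Dict[str, str]:
    """Exception for yes/no questions: wait for both channels.

    Keeps each user's first "yes"/"no" that arrives within the window
    after the ECA asked the question.
    """
    answers: Dict[str, str] = {}
    for e in sorted(events, key=lambda e: e.time_s):
        in_window = asked_at_s <= e.time_s <= asked_at_s + window_s
        if in_window and e.label in ("yes", "no") and e.user not in answers:
            answers[e.user] = e.label
    return answers
```

With requests, only the first speaker is served; with a “yes/no” question, both users get a chance to answer before the ECA reacts.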

Page 13:

Decision Making Component - Implementation

• The Decision Making Component uses ideas from information state theory [Larsson’00] and AIML:

• The progress of the dialogue is represented by a set of variables

• The most appropriate plans are selected and scheduled by simple inference

• Timing control is used to obtain messages from both speech channels in the case of “yes/no” questions

• The component is being developed using the MIDIKI toolkit as a reference
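A minimal sketch of the information-state idea: the dialogue is a set of variables, and a plan is chosen by checking simple rules in order. The variable names, plan names, and rule priorities are all assumptions for illustration; this is not the project's component and does not use MIDIKI.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, object]
Rule = Tuple[str, Callable[[State], bool]]

# Ordered rules (assumed priorities): the first rule whose condition
# holds determines the plan.
RULES: List[Rule] = [
    ("end_session", lambda s: s["num_users"] == 0),
    ("recover_from_sr_failure", lambda s: bool(s["sr_failed"])),
    ("answer_question", lambda s: s["pending_request"] is not None),
    ("ask_yes_no_question", lambda s: bool(s["attention_low"])),
    ("continue_narration", lambda s: True),
]


def initial_state() -> State:
    """Information state before any user has appeared."""
    return {
        "num_users": 0,
        "sr_failed": False,
        "pending_request": None,
        "attention_low": False,
    }


def update(state: State, **changes: object) -> State:
    """Return a new state with some variables changed by an input event."""
    return {**state, **changes}


def select_plan(state: State) -> str:
    """Simple inference: pick the first plan whose condition is satisfied."""
    for name, condition in RULES:
        if condition(state):
            return name
    raise RuntimeError("no applicable plan")  # unreachable with a default rule
```

For instance, once two users are present and one of them asks “where”, `select_plan(update(state, num_users=2, pending_request="where"))` selects the plan that answers the question instead of continuing the narration.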

Page 14:

Animation Player

Page 15:

Animation Player

• Functionality:

• The animation player uses scripted behaviors (GSML language) to generate speech and animation

• A model of gaze in multiparty communication is supported:

• Gaze control operates at the utterance level

• The gaze pattern follows conversational rules (who is the addressee, who is the listener)

• Implementation:

• Visage SDK (based on the MPEG-4 standard)

• 3ds Max
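Utterance-level gaze control could be sketched as choosing one gaze target per utterance from the conversational roles. The concrete rules below (look at the addressee while speaking, look at the speaker while listening, look at the group otherwise) are assumptions for illustration; the slides state only that the gaze pattern follows who is addressee and who is listener. The sketch does not involve GSML or the Visage SDK.

```python
from typing import Optional


def gaze_target(speaker: str, addressee: Optional[str]) -> str:
    """Pick whom the ECA looks at for the duration of one utterance.

    speaker:   "eca", "left" or "right"
    addressee: the participant being spoken to, or None for the whole group
    """
    if speaker != "eca":
        # The ECA is a listener (or addressee): look at the speaking user.
        return speaker
    if addressee is not None:
        # The ECA speaks to one particular user: look at that user.
        return addressee
    # The ECA narrates to everyone: no single target (assumed behavior).
    return "users"
```

Because the target is fixed once per utterance, the player would switch gaze only at utterance boundaries, which matches the utterance-level granularity named on the slide.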

Page 16:

Conclusion

• Components to support context-based two-party human-ECA communication are implemented

• System is being integrated, but not fully tested

• Component issues:

• missing face tracking and domain knowledge about users’ behaviors

• simple dialogue management and control (no smart scheduling and smart gaze control)

• Future directions: system debugging and testing, implementing tracking, improving gaze control, studying users’ behaviors and gazing, system evaluation