Towards a virtual agent using similarity-based laughter production
Jérôme Urbain, Stéphane Dupont, Thierry Dutoit, Radoslaw Niewiadomski, Catherine Pelachaud
TCTS Lab, Faculté Polytechnique de Mons
CNRS - LTCI UMR 5141, Institut TELECOM - TELECOM ParisTech
Context
• Desire to communicate with machines as we do with humans
• Lack of naturalness, expressivity, perception of emotions
• Laughter is a very important signal:
– conveys emotional content
– communicative
– ...
Objectives
• Improve the expressivity of an Embodied Conversational Agent (ECA) by enabling it to laugh:
– acoustic laughter production
– synchronisation with virtual agent expressions
– ability to instantaneously answer a laughter
Outline
• Multimodal expressions of emotions by an ECA
• Acoustic laughter production based on similarities
• Upcoming projects
Multimodal expressions of emotions
• Go beyond "static images"/linear expressions (e.g. beyond basic emotion predictions)
• Complete data obtained from theory and the literature
• Across modalities, many emotions are expressed by sequences (or combinations) of multimodal signals rather than by monomodal signals (e.g. static facial expressions)
Multimodal expressions of emotions
D. Keltner, B. N. Buswell, "Embarrassment: Its Distinct Form and Appeasement Functions"
• Difficulty: lack of relevant research and video-corpora
• New video-corpus of examples of spontaneous multimodal behaviors (based on TV broadcasts)
• Annotation in Anvil (Kipp, 2001)
• Specification of Behaviors Sets and Constraints
Multimodal expressions of emotions
Corpus of emotional displays
• Annotation of audio-visual recordings from reality shows, hidden camera recordings, Belfast Naturalistic database, EmoTV corpus.
• The observed people are non-actors in emotional situations: natural, non-stereotyped multimodal behaviour is displayed
• 20 video clips (3 to 14 seconds each): relief (2), tension (6), joy (2), sadness (2), anger (3), despair (1), fear (4)
Multimodal annotation scheme
• Annotation in Anvil v4.7.6 (Kipp, 2001) with 5 tracks:
– Emotion (inferred from the situation)
– Facial expression (FACS coding)
– Head movement
– Gaze movement
– Gesture
Multimodal Joy annotation
Video annotations with the Anvil software: multimodal display of joy from the Belfast Naturalistic Emotional Database (Cowie et al., 2003)
• Joy:
– arm movements,
– head/torso movements forwards and backwards,
– tilts and micro-tilts,
– movements to the side,
– large arm movements (like playing a drum),
– facial expressions like smile and raised eyebrows.
Multimodal Joy annotation
Humaine video-corpus
Formalization of multimodal expressions of emotions by constraints
Multimodal expressions of emotions
Emotion Behavior Sets (sets of signals of different modalities)
Constraints:
– time constraints
– ordering
– simultaneity
– probability of occurrence
→ variety of multimodal expressions of emotions
Annotation of corpus
FML: Emotion label
Multimodal expressions of emotions
<multimodal emotion="embarrassment">
  <signals>
    <signal id="1" name="head=head_down_strong" repetitivity="0" min_duration="2" ... />
    <signal id="2" name="head=head_left_strong" repetitivity="0" min_duration="5" ... />
    <signal id="3" name="gaze=look_down" repetitivity="0" min_duration="2" ... />
    <signal id="4" name="gaze=look_right_strong" repetitivity="0" min_duration="1" ... />
    <signal id="5" name="gaze=look_left_strong" repetitivity="0" min_duration="1" ... />
    <signal id="6" name="affect=smile" repetitivity="1" min_duration="2" ... />
    ...
  </signals>
  <cons>
    <con type="minus">
      <arg id="6" type="start"/>
      <arg id="2" type="start"/>
      <lessthan value="0"/>
    </con>
    <con type="minus">
      <arg id="7" type="start"/>
      <arg id="2" type="start"/>
      <lessthan value="0"/>
    </con>
    ...
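A "minus" constraint like the ones above reads "start(arg1) minus start(arg2) must be less than the threshold". A minimal sketch of how such a constraint could be checked (the function and data names are illustrative, not the actual system's code):

```python
# Minimal sketch of evaluating a "minus" time constraint between two signals,
# as in the XML above: start(arg1) - start(arg2) < threshold.
# Function and variable names are illustrative, not the real implementation.

def check_minus_constraint(starts, id_a, id_b, threshold):
    """Return True if start[id_a] - start[id_b] is below the threshold."""
    return starts[id_a] - starts[id_b] < threshold

# Candidate timing of signals (signal id -> start time in seconds)
starts = {2: 1.0, 6: 0.5}

# <con type="minus"> <arg id="6"/> <arg id="2"/> <lessthan value="0"/> </con>
# i.e. the smile (6) must start before the head-left movement (2)
print(check_minus_constraint(starts, 6, 2, 0.0))  # True: 0.5 - 1.0 < 0
```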
Constraints are derived from the corpus annotation and from the literature:
D. Keltner, B. N. Buswell, "Embarrassment: Its Distinct Form and Appeasement Functions"
Multimodal expressions of emotions
• Implementation of the algorithm:
FML (emotion label), e.g. <emotion id="e1" type="joy" start="1.0" end="14" />
→ FMLRealizer → BML (a set of behaviours) → BMLRealizer → animation
Examples (video): Embarrassment, Joy
Outline
• Multimodal expressions of emotions by an ECA
• Acoustic laughter production based on similarities
• Upcoming projects
Audio Cycle
• A prototype application for browsing musical loop libraries: AudioCycle provides the user with a graphical view where audio extracts are visualized and organized according to their similarity in terms of musical properties such as timbre, harmony, and rhythm. The user can navigate this visual representation and listen to individual audio extracts, searching for those of interest.
• Richer in features than other similar concepts we have seen.
Technologies & Architecture
[Architecture diagram: Audio / Musical Loops → Audio Analysis → Meta-Data & Features → Visualization → 3D Visual Rendering and 3D Audio Rendering, driven by User Input]
Extracted features
• Timbre: Mel-Frequency Cepstral Coefficients (MFCC)
• Harmony and melody: chromas (information about the notes played)
• Rhythm: periodicity, beats per minute (BPM)
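Loops can then be compared through a distance between their feature vectors. A minimal sketch, assuming each loop is summarized as a fixed-length vector (the toy values and the choice of cosine distance are illustrative, not necessarily what AudioCycle uses):

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity between two feature vectors (e.g. averaged MFCCs)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

# Toy feature vectors for three loops (hypothetical timbre/harmony/BPM summaries)
loop_a = [0.9, 0.1, 120.0]
loop_b = [0.8, 0.2, 118.0]
loop_c = [0.1, 0.9, 60.0]

# Loops with similar properties end up closer to each other
print(cosine_distance(loop_a, loop_b) < cosine_distance(loop_a, loop_c))  # True
```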
Visualization: Clustering
[Diagram: loops plotted around cluster centroids, with a reference loop highlighted]
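One way to obtain such clusters and centroids is plain k-means over the loop feature vectors. A minimal sketch in pure Python with farthest-first initialisation (AudioCycle's actual clustering method is not specified here; this is only illustrative):

```python
def sqdist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20):
    """Plain k-means: returns k centroids of the given feature vectors."""
    # Farthest-first initialisation: start from the first point, then
    # repeatedly add the point farthest from all current centroids.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(sqdist(p, c) for c in centroids)))
    for _ in range(iters):
        # Assign each point to its nearest centroid...
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[min(range(k), key=lambda i: sqdist(p, centroids[i]))].append(p)
        # ...then move each centroid to the mean of its cluster.
        centroids = [[sum(d) / len(cl) for d in zip(*cl)] if cl else c
                     for cl, c in zip(clusters, centroids)]
    return centroids

# Two obvious groups of 2-D "loop features"
points = [[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9]]
cents = sorted(kmeans(points, 2))
print(cents)  # roughly [[0.05, 0.05], [0.95, 0.95]]
```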
Adaptation to laughter
• First tests with Audio Cycle as it is:
– some grouping of classes (whisper-like, "retained" laughters, melodious laughters, ...)
– laughing audiences grouped together
Adaptation to laughter
• New feature set ("laughter"):
– mean pitch
– rate of voiced frames
– mean energy
– maximum amplitude
– duration
• Automatic laughter-burst segmentation
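The five features above can be computed from frame-level pitch and energy estimates. A minimal sketch, assuming pitch tracking and framing are done elsewhere (all names and the 10 ms frame period are illustrative assumptions):

```python
def laughter_features(frames, frame_period=0.01):
    """Compute the five laughter features from per-frame (pitch_hz, energy) pairs.

    frames: list of (pitch_hz, energy); pitch_hz is None for unvoiced frames.
    frame_period: assumed frame step in seconds (10 ms here, illustrative).
    """
    voiced = [p for p, _ in frames if p is not None]
    energies = [e for _, e in frames]
    return {
        "mean_pitch": sum(voiced) / len(voiced) if voiced else 0.0,
        "voiced_rate": len(voiced) / len(frames),
        "mean_energy": sum(energies) / len(energies),
        "max_amplitude": max(energies),
        "duration": len(frames) * frame_period,
    }

# Toy burst: 4 frames, 2 of them voiced
frames = [(220.0, 0.2), (None, 0.1), (260.0, 0.4), (None, 0.1)]
feats = laughter_features(frames)
print(feats["mean_pitch"], feats["voiced_rate"], feats["duration"])  # 240.0 0.5 0.04
```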
Adaptation to laughter
• First laughter synthesis using concatenation of bursts
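Naively abutting bursts can produce clicks at the joins; a short linear crossfade is one common remedy. A minimal sketch on plain sample lists (illustrative only; a proper concatenation model is listed below as future work):

```python
def concatenate_bursts(bursts, overlap=4):
    """Join audio bursts (lists of samples) with a linear crossfade of `overlap` samples."""
    out = list(bursts[0])
    for burst in bursts[1:]:
        n = min(overlap, len(out), len(burst))
        for i in range(n):
            w = (i + 1) / (n + 1)  # fade-in weight for the incoming burst
            out[len(out) - n + i] = out[len(out) - n + i] * (1 - w) + burst[i] * w
        out.extend(burst[n:])
    return out

# Two toy bursts: constant 1.0 followed by silence
a = [1.0] * 8
b = [0.0] * 8
mixed = concatenate_bursts([a, b], overlap=4)
print(len(mixed))  # 12: 8 + 8 - 4 overlapping samples
```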
Outline
• Multimodal expressions of emotions by an ECA
• Acoustic laughter production based on similarities
• Upcoming projects
Future Work
• Adapt Audio Cycle to laughter → Laughter Cycle:
– improve features
– model for concatenation of bursts
– how to combine laughters
Future Work
• Audiovisual laughing machine: eNTERFACE'09
– synchronisation between audio and ECA expressions, using emotional behavior descriptors
– automatic answer to an input laughter
Future Work
• Numediart Research Program: http://www.numediart.org
• eNTERFACE'09: http://www.infomus.org/enterface09/
• CALLAS project: http://www.callas-newmedia.eu/