Special session on Multimodal Fusion
• A survey: Fusion Engines for Multimodal Input
• 5 papers
D. Lalanne (Switzerland), L. Nigay (France), P. Palanque (France), P. Robinson (UK), J. Vanderdonckt (Belgium)
Multimodal fusion
• Multimodal fusion for
  • Perception
  • Interaction
• Focus on multimodal interaction
  • 4 papers on multimodal interaction
  • 1 paper on multimodal perception (first one)
Input Multimodal Interaction
Input Fusion Engines
• Multimodal fusion
  • Combining and interpreting data from multiple input modalities
• Usage of input modalities

                Sequential    Parallel
  Combined      Alternate     Synergistic
  Independent   Exclusive     Concurrent
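The four usages on this slide (alternate, synergistic, exclusive, concurrent) follow from two questions: are the modalities used sequentially or in parallel, and are their inputs combined or interpreted independently? A minimal sketch of that classification follows. The names `InputEvent`, `Usage`, and `classify` are illustrative assumptions, and "parallel" is approximated here as overlapping time intervals.

```python
from dataclasses import dataclass
from enum import Enum


class Usage(Enum):
    ALTERNATE = "alternate"      # sequential use, combined interpretation
    SYNERGISTIC = "synergistic"  # parallel use, combined interpretation
    EXCLUSIVE = "exclusive"      # sequential use, independent interpretation
    CONCURRENT = "concurrent"    # parallel use, independent interpretation


@dataclass(frozen=True)
class InputEvent:
    modality: str  # e.g. "speech" or "gesture" (hypothetical labels)
    start: float   # start time in seconds
    end: float     # end time in seconds


def overlaps(a: InputEvent, b: InputEvent) -> bool:
    """Assumption: two events count as parallel when their intervals overlap."""
    return a.start < b.end and b.start < a.end


def classify(a: InputEvent, b: InputEvent, combined: bool) -> Usage:
    """Map (temporal relation, combined or independent) to one of the four usages."""
    parallel = overlaps(a, b)
    if combined:
        return Usage.SYNERGISTIC if parallel else Usage.ALTERNATE
    return Usage.CONCURRENT if parallel else Usage.EXCLUSIVE
```

Whether two inputs are combined is a decision of the fusion engine itself, so the sketch takes it as a parameter rather than inferring it.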
Input Fusion Engines
• Why combined usage (sequential, parallel)?
  • Natural interaction is multimodal by nature.
  • Combining input modalities increases the bandwidth of human-computer interaction.
Fusion engines
• A very dynamic domain
• ~15 years of contributions: 1993-2008
Input Fusion engines
• Some key features
  • Multiple and temporal combinations
    • Types of data and time synchronization
  • Probabilistic inputs
    • Non-deterministic inputs
  • Robustness
    • Error handling
    • Adaptation to context
    • Context = (user, environment, platform)
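Two of the features above, time synchronization and probabilistic inputs, can be illustrated together. The sketch below fuses two recognizer outputs only if they fall within a time window, then ranks joint hypotheses from their N-best lists. All names (`Hypothesis`, `Recognition`, `fuse`) and the 1.5 s window are assumptions made for illustration. Multiplying probabilities assumes the modalities are independent, which a real engine may not assume.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Tuple


@dataclass(frozen=True)
class Hypothesis:
    value: str   # recognized content, e.g. a phrase or a pointed location
    prob: float  # recognizer confidence in [0, 1]


@dataclass(frozen=True)
class Recognition:
    modality: str
    timestamp: float               # seconds
    nbest: Tuple[Hypothesis, ...]  # N-best list from the recognizer


def fuse(a: Recognition, b: Recognition,
         window: float = 1.5) -> List[Tuple[Tuple[str, str], float]]:
    """Return joint hypotheses ranked by score, or [] if the inputs are too far apart."""
    if abs(a.timestamp - b.timestamp) > window:
        return []  # outside the temporal window: no fusion attempted
    # Assumption: modalities are independent, so the joint score is a product.
    joint = [((ha.value, hb.value), ha.prob * hb.prob)
             for ha, hb in product(a.nbest, b.nbest)]
    return sorted(joint, key=lambda pair: pair[1], reverse=True)
```

Keeping the whole ranked list, rather than only the top pair, leaves room for later error handling: if the best joint interpretation is rejected, the next one can be tried.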
Classification: Fusion engines
• 1980: R. Bolt, “Put that there”
• 1989: Cubricon
• 1995: CARE
• 1997: Quickset
• 2004: ICARE
• 2004: Petshop
• 2006: FAME
Classification: Fusion engines (theories and contributions over time)
• 1980: R. Bolt, “Put that there”
• Multiple (up to 255) input API in Windows 7, Microsoft MultiPoint SDK
• “Zoom in here”
• UX beats Usability
• A gap
| Reference | Tool / language / program | Fusion: notation | Fusion: type | Fusion: level | Fusion: input devices | Fusion: ambiguity resolution | Time representation: quantitative | Time representation: qualitative | Application types |
|---|---|---|---|---|---|---|---|---|---|
| Bolt [4] | “Put that there” system | None | None | Dialog | Speech, gesture | ? | N | ? | Map manipulation |
| Wahlster | XTRA | None | Unification | Dialog | Keyboard, mouse | | N | Y | Map manipulation |
| Neal [26] | Cubricon | Generalized Augmented Transition Network | Procedural | Dialog | Speech, mouse, keyboard | Proximity-based | N | Y | Map manipulation |
| Koons [19] | No name | Parse tree | Frame-based | Dialog | Speech, eye gaze, gesture | First solution | Y | Y | 3D world |
| Nigay [28] | Pac-Amodeus | Melting pot | Frame-based | Dialog + low level | Speech, keyboard, mouse | Context-based resolution | Y | N | Flight scheduling |
| Cohen [9] | Quickset | Feature structure | Unification | Dialog | Pen, voice | S / G & G / S & N best | Y | N | Simulation system training |
| Bellik [3] | MEDITOR | None | Frame-based | Dialog + low level | Speech, mouse | History buffer | Y | Y | Text editor |
| Martin [22] | TYCOON | Set of processes – Guided Propagation Networks | Procedural | Dialog | Speech, keyboard, mouse | Probability-based resolution | Y | Y | Editing of graphical user interfaces |
| Johnston [18] | FST | Finite state automata | Procedural | Dialog | Speech, pen | Possible (N best) | Y | Y | Corporate directory |
| Krahnstoever [20] | iMap | Stream stamped | Frame-based | Dialog | Speech, gesture | Not given | Y | N | Crisis management |
| Dumas [12] | HephaisTK | XML typed (SMUIML) | Frame-based | Dialog | Speech, mouse, Phidgets | First one | Y | Y | Meeting assistants |
| Holzapfel [17] | No name | Typed feature structure | Unification | Dialog | Speech, gesture | N best list | Y | N | Humanoid robot |
| Pfleger [33] | PATE | XML typed | Unification | Dialog | Speech, pen | N best list | Y | Y | Bathroom design tool |
| Milota [25] | No name | Multimodal parse tree | Unification | Dialog | Speech, mouse, keyboard, touchscreen | S / G & G / S | Y | N | Graphic design |
| Melichar [24] | WCI | Multimodal generic dialog node | Unification | Dialog | Speech, mouse, keyboard | First one | ? | ? | Multimedia DB |
| Sun [37] | PUMPP | Matrix | Unification | Dialog | Speech, gesture | S / G | N | Y | Traffic control |
| Bourguet [7] | Mengine | Finite state machine | Procedural | Low level | Speech, mouse | Not given | N | Y | No example |
| Latoschik [21] | No name | Temporal Augmented Transition Network | Procedural | Dialog | Speech, gesture | Fuzzy constraints | Y | Y | Virtual reality |
| Bouchet [5] [6], Mansoux [23] | ICARE (Input/Output) | Melting pot | Frame-based | Dialog + low level | Speech, Helmet visor HOTAS, tactile surface, GPS localization, magnetometer, mouse, keyboard | Context-based resolution | Y | N | Aircraft cockpit, authentication, mobile augmented reality systems (Game, Post-it), augmented surgery |
| Navarre [30] | Petshop | Petri nets | Procedural | Dialog + low level | Speech, mouse, keyboard, touchscreen | *** | Y | Y | Aircraft cockpit |
| Flippo [14] | No name | Semantic tree | Hybrid | Dialog | Speech, mouse, gaze, gesture | Feedback for missing data | Y | N | Collaborative map |
| Portillo [34] | MIMUS | Feature value structure (DTAC) | Hybrid | Dialog | Speech, mouse | Knowledgeable agent | Y | N | |
| Duarte [11] | FAME | Behavioral matrix | Hybrid | Dialog | Speech, mouse, keyboard | Not given | ? | ? | Digital talking book |
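Several engines in the table are typed "Unification" and use feature structures as notation. As a rough illustration of the general idea only, and not the algorithm of any engine listed, the sketch below unifies two feature structures represented as nested dictionaries. The function name `unify`, the use of `None` for an unfilled slot, and the example structures are all assumptions.

```python
from typing import Optional


def unify(a: dict, b: dict) -> Optional[dict]:
    """Unify two feature structures (nested dicts); return None on conflict.

    Assumption: a value of None marks a slot that is still unfilled.
    """
    result = dict(a)
    for key, b_val in b.items():
        if key not in result or result[key] is None:
            result[key] = b_val              # slot missing or unfilled: take b's value
        elif b_val is None:
            continue                         # b has nothing to add for this slot
        elif isinstance(result[key], dict) and isinstance(b_val, dict):
            sub = unify(result[key], b_val)  # unify nested structures recursively
            if sub is None:
                return None
            result[key] = sub
        elif result[key] != b_val:
            return None                      # incompatible atomic values
    return result


# Hypothetical "put that there" style inputs: speech gives the action,
# gesture fills in the object and the target location.
speech = {"action": "move", "object": None, "target": None}
gesture = {"object": {"id": "ship-3"}, "target": {"x": 120, "y": 45}}
command = unify(speech, gesture)
```

A failed unification (for example two different `action` values) returns `None`, which an engine could treat as a signal to try the next hypothesis.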
Special session: Multimodal Fusion
• Content
  • A survey
  • 5 papers
• Schedule
  • 10 min introduction and survey outlook
  • 15 min per paper + 5 min questions
  • 10 min for questions on the session
Special session: Multimodal Fusion
• H. Mendonça: Agent-based fusion
• B. Dumas: An evaluation framework to benchmark fusion engines
• L. Nigay: CARE-based fusion
• J. Ladry & P. Palanque: Petri net based formal description and execution of fusion engines
• M. Sezgin: Fusion of speech and facial expression recognition
QUESTIONS?
Fusion engines: research agenda
• Performance evaluation
  • Testbeds, metrics
  • Identification of interpretation errors
  • Formal predictive evaluation
• Adaptation to context
  • Dynamic aspect of adaptation
  • Reconfigurations
• Engineering aspects
  • Difficult to develop (toolkit from manufacturers required)
  • Fusion engine tuning (tuning is the key for interaction techniques, e.g. drag & drop)
Fusion Principles
• Notation: Petri nets based (ICOs)
• Type: Procedural only
• Level: Dialogue and low level
• Input Devices: Speech, mice, keyboard, touch screen
• Ambiguity resolution: inside models
• Time representation (Quantitative – Qualitative): Both
• Application Type: Safety Critical, Aeronautics and Space
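To give a feel for procedural, Petri net based fusion, here is a minimal sketch of a plain place/transition net in which a "fuse" transition becomes enabled only once both a speech input and a click have arrived. This is not the ICO notation and not the Petshop tool; the class `PetriNet`, the function `build_fusion_net`, and all place and transition names are hypothetical.

```python
class PetriNet:
    """Plain place/transition net with unit arc weights (an assumption of this sketch)."""

    def __init__(self, marking, transitions):
        self.marking = dict(marking)    # place name -> token count
        self.transitions = transitions  # transition name -> (input places, output places)

    def enabled(self, name):
        """A transition is enabled when every input place holds at least one token."""
        inputs, _ = self.transitions[name]
        return all(self.marking.get(place, 0) >= 1 for place in inputs)

    def fire(self, name):
        """Fire the transition if enabled; return whether it fired."""
        if not self.enabled(name):
            return False
        inputs, outputs = self.transitions[name]
        for place in inputs:
            self.marking[place] -= 1
        for place in outputs:
            self.marking[place] = self.marking.get(place, 0) + 1
        return True


def build_fusion_net():
    """Hypothetical net: speech and click arrive independently, then are fused."""
    return PetriNet(
        marking={"waiting_speech": 1, "waiting_click": 1},
        transitions={
            "speech": (["waiting_speech"], ["speech_received"]),
            "click": (["waiting_click"], ["click_received"]),
            "fuse": (["speech_received", "click_received"], ["command"]),
        },
    )
```

The two input transitions can fire in either order, so the same net covers "speech then click" and "click then speech" without extra states.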