Multimodal Information Access Using Speech and Gestures
Norbert Reithinger
norbert.reithinger@dfki.de
17.02.2004 Chennai 3
Natural Access to Digital Resources
• Speech dialog systems already ready-to-market
• Next step: multimodal interfaces
• Add modalities beyond speech
• Enable interactions using
– Gestures
– Pointing
– Haptics...
Advantages of Multimodality for the Users
• Natural use of preferred modalities
• Mutual disambiguation using multiple modalities under adverse conditions
• Environment may elicit preferred interaction modality
• Product realization, e.g. as multimodal information kiosk
• Example: SmartKom
SmartKom: Intuitive Multimodal Interaction
The SmartKom Consortium:
• Main Contractor: DFKI Saarbrücken
• Partners named on the consortium map: MediaInterface, European Media Lab, Univ. of Munich, Univ. of Stuttgart, Univ. of Erlangen
• Locations on the consortium map: Saarbrücken, Aachen, Dresden, Berkeley, Stuttgart, Munich, Erlangen, Heidelberg, Ulm
Project Budget: € 25.5 million (partly funded by the German Ministry of Education and Research - BMBF)
Project Duration: 4 years (September 1999 – September 2003)
SmartKom: The Three Scenarios
Multimodal Dialogue Backbone with an Application Layer:
• SmartKom-Public: Communication Companion that helps to keep in touch and to get information (Public: Cinema, Phone, Fax, Mail, Biometrics)
• SmartKom-Mobile: Mobile Travel Companion that helps with navigation (Mobile: Car and Pedestrian Navigation)
• SmartKom-Home: Infotainment Companion that helps to select media content (Home: Consumer Electronics, EPG)
SmartKom’s Multimodal Input and Output Devices
• Infrared Camera for Gestural Input, Tilting CCD Camera for Scanning, Video Projector
• Microphone
• Camera for Facial Analysis
• Projection Surface
• Speakers for Speech Output
• Multimodal Control of TV-Set
• Multimodal Control of VCR/DVD Player
• 3 dual Xeon 2.8 GHz processors with 1.5 GB main memory
SmartKom's SDDP Interaction Metaphor
• User: specifies goal, delegates task
• User and Personalized Interaction Agent: cooperate on problems
• Personalized Interaction Agent: asks user, presents results
• IT Services: Service 1, Service 2, ..., Service N
SDDP = Situated Delegation-oriented Dialogue Paradigm
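The delegation loop pictured on this slide can be sketched roughly as follows. This is a minimal illustration of the idea only, not SmartKom's implementation: the `Service` and `InteractionAgent` classes, the goal name `reserve_seat`, and the slot names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical names throughout: this only illustrates the SDDP idea
# (user delegates a goal, agent asks back, agent presents results).


@dataclass
class Service:
    name: str
    required: List[str]                    # slots the service needs
    run: Callable[[Dict[str, str]], str]   # performs the delegated task


@dataclass
class InteractionAgent:
    services: Dict[str, Service] = field(default_factory=dict)

    def register(self, goal: str, service: Service) -> None:
        self.services[goal] = service

    def delegate(self, goal: str, slots: Dict[str, str]) -> str:
        """Handle one user turn: either ask back or present a result."""
        service = self.services.get(goal)
        if service is None:
            return f"Sorry, I cannot help with '{goal}'."
        missing = [s for s in service.required if s not in slots]
        if missing:
            # "cooperate on problems": ask the user for what is missing
            return f"Please specify: {', '.join(missing)}"
        return service.run(slots)


agent = InteractionAgent()
agent.register(
    "reserve_seat",
    Service("cinema", ["movie", "seat"],
            lambda s: f"Reserved seat {s['seat']} for {s['movie']}."),
)

first_turn = agent.delegate("reserve_seat", {"movie": "Metropolis"})
second_turn = agent.delegate("reserve_seat",
                             {"movie": "Metropolis", "seat": "F7"})
print(first_turn)
print(second_turn)
```

The point of the sketch is that the user never addresses a service directly: the agent decides which service handles the goal and comes back to the user only when information is missing.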
SmartKom Understands Multimodal Input
"Please reserve here."
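An utterance such as "Please reserve here." is only complete once the deictic word is tied to a pointing gesture. The sketch below shows one simple way such a fusion step could work, by matching each deictic word to the gesture closest in time. It is an assumption-laden toy, not SmartKom's fusion component: the `Word` and `PointingGesture` types, the word list, and the 1.5 second window are all hypothetical.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical word list and data types, chosen only for illustration.
DEICTIC_WORDS = {"here", "there", "this", "that"}


@dataclass
class Word:
    text: str
    time: float   # seconds since start of the turn


@dataclass
class PointingGesture:
    target: str   # object the gesture recognizer resolved, e.g. a seat
    time: float


def fuse(words: List[Word], gestures: List[PointingGesture],
         max_gap: float = 1.5) -> List[str]:
    """Replace each deictic word by the target of the nearest gesture,
    provided the two are at most max_gap seconds apart."""
    resolved = []
    for word in words:
        if word.text.lower() not in DEICTIC_WORDS:
            resolved.append(word.text)
            continue
        nearest = min(gestures, key=lambda g: abs(g.time - word.time),
                      default=None)
        if nearest is not None and abs(nearest.time - word.time) <= max_gap:
            resolved.append(f"[{nearest.target}]")
        else:
            resolved.append(word.text)   # leave unresolved
    return resolved


utterance = [Word("please", 0.0), Word("reserve", 0.4), Word("here", 0.9)]
gestures = [PointingGesture("seat F7", 1.1)]
fused = fuse(utterance, gestures)
print(" ".join(fused))
```

A word that finds no gesture inside the window is left as is, so a later dialog step could ask the user what "here" refers to.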
SmartKom: 14 applications, 52 functionalities

| Scenario | Application | Functionalities | Count |
|---|---|---|---|
| Home | EPG (Electronic Programming Guide) | General program; Channel selection; Channel information; Selection based on genre; Information for one broadcast; Time-based operations; Help functions for genres | 7 |
| Home | TV | On/off; Channel selection | 2 |
| Home | VCR control | On/off; Record; Play; Pause; Wind/rewind; Programming using EPG and the calendar | 6 |
| Home | Lean-Forward/Lean-Backward | Select Lean-Backward; Deactivate Lean-Backward; Context aware presentations | 3 |
| Home | Total Home | | 18 |
| Public | Telephone | Manipulative key operations; Telephony functions; Audio handling; Address book | 4 |
| Public | Hand contour biometry | Selection of biometry type; Hand biometry; Presentation and camera control; Address book (see above) | 3 |
| Public | Voice biometry | Presentation and audio control; Voice biometry; Address book (see above); Selection of biometry type (see above) | 2 |
| Public | Signature biometry | Presentation and tablet control; Signature biometry; Address book (see above); Selection of biometry type (see above) | 2 |
| Public | Fax | Presentation and interaction; Fax handling; Address book (see above); Camera control | 3 |
| Public | E-Mail | Presentation and interaction; E-Mail handling; Address book (see above); Camera control (see above) | 2 |
| Public | Cinema | General program; Movie information; Seat reservation; Cinema location | 4 |
| Public | Total Public | | 20 |
| Mobile | Car navigation | Selection of start and goal city; Route type selection; Car route computation; Selection of parking garage; Information about parking garages | 5 |
| Mobile | Pedestrian navigation | Selection of map type; Selection of start and goal; Route computation; Selection of points of interest; Information for points of interest; Integrated car and pedestrian route planning | 6 |
| Mobile | Map manipulation | Resize; Help functions for map interactions; Change viewpoint | 3 |
| Mobile | Total Mobile | | 14 |
| | Total System | | 52 |

(Source: Reithinger et al.: SmartKom - Adaptive and Flexible Multimodal Access to Multiple Applications. In ICMI ’03)
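The headline figures on this slide (14 applications, 52 functionalities) follow from the per-application counts. A short check, using only the counts listed on the slide; the dictionary layout and variable names are just an illustrative choice:

```python
# Per-application functionality counts as listed on the slide.
FUNCTIONALITIES = {
    "Home": {
        "EPG": 7,
        "TV": 2,
        "VCR control": 6,
        "Lean-Forward/Lean-Backward": 3,
    },
    "Public": {
        "Telephone": 4,
        "Hand contour biometry": 3,
        "Voice biometry": 2,
        "Signature biometry": 2,
        "Fax": 3,
        "E-Mail": 2,
        "Cinema": 4,
    },
    "Mobile": {
        "Car navigation": 5,
        "Pedestrian navigation": 6,
        "Map manipulation": 3,
    },
}

# Sum the counts per scenario, then over the whole system.
scenario_totals = {
    scenario: sum(apps.values())
    for scenario, apps in FUNCTIONALITIES.items()
}
application_count = sum(len(apps) for apps in FUNCTIONALITIES.values())
system_total = sum(scenario_totals.values())

print(scenario_totals)
print(application_count, system_total)
```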
Generic Technologies Used
• Speech and gesture recognition
• Language and gesture understanding
• Modality fusion
• Dialog processing
• Information extraction/retrieval, e.g. from Internet sources
• Biometry
• Presentation planning
• Answer generation
• Speech synthesis
• Interactive presentation
Interactive Biometric Authentication by Hand Contour Recognition
"Please place your hand with spread fingers on the marked area."
Adaptation to Another Language: SmartKom Mobile English
SmartKom’s modular architecture encapsulates language-specific knowledge in a few language processing modules
SmartKom system overview (figure). Legend: Module was …
• Modified
• Minor modifications
• Not modified
• Not used
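The design idea on this slide, keeping language-specific knowledge in a few modules so the rest stays untouched when porting, can be illustrated with a toy sketch. Everything here is hypothetical (the `LanguageModules` type, the keyword-based "understanding", the intent and result names); it shows the encapsulation principle only, not SmartKom's actual modules.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class LanguageModules:
    """The language-specific parts: the only pieces swapped when porting."""
    understand: Callable[[str], str]   # utterance -> language-neutral intent
    generate: Callable[[str], str]     # language-neutral result -> utterance


# Toy keyword spotting stands in for real language understanding.
GERMAN = LanguageModules(
    understand=lambda u: "show_map" if "karte" in u.lower() else "unknown",
    generate=lambda r: {"map_shown": "Hier ist die Karte."}.get(
        r, "Das habe ich nicht verstanden."),
)

ENGLISH = LanguageModules(
    understand=lambda u: "show_map" if "map" in u.lower() else "unknown",
    generate=lambda r: {"map_shown": "Here is the map."}.get(
        r, "Sorry, I did not understand."),
)


def dialog_core(intent: str) -> str:
    """Language-independent processing: identical for every language."""
    return "map_shown" if intent == "show_map" else "not_understood"


def handle(utterance: str, modules: LanguageModules) -> str:
    return modules.generate(dialog_core(modules.understand(utterance)))


print(handle("Zeig mir die Karte", GERMAN))
print(handle("Show me the map", ENGLISH))
```

Porting to a new language in this sketch means supplying one more `LanguageModules` value; `dialog_core` is not modified.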
Conclusion
• Multimodal interaction enables natural access to digital resources
• Advantageous for many users
• SmartKom realizes an exemplary multimodal information kiosk
• Adaptation to different languages is relatively easy
Thank you very much for your attention!
• Please find more information at http://www.smartkom.org
• Other multimodal projects with participation of DFKI
– MIAMM (EU): Multidimensional Information Access using Multiple Modalities: http://www.miamm.org
– COMIC (EU): COnversational Multimodal Interaction with Computers: http://www.hcrc.ed.ac.uk/comic
– VirtualHuman (BMBF): Virtual agents for education: http://www.virtual-human.org