Multimodal user interfaces: Implementation Chris Vandervelpen [email protected].
Multimodal user interfaces: Implementation
Chris [email protected]
Overview
• Introduction
• VoiceXml
• X+V
• From models to X + V
• Demo: ACCESS Netfront
• Conclusions
• Questions
Introduction
• Focus on speech/direct manipulation on mobile devices
• How can we deploy a multimodal UI?
– Build our own framework using speech synthesizers/recognizers that interpret the designed models (reinventing the wheel)
– Build software that generates standardized markup from the models (use existing technologies): our starting point
VoiceXml
• Markup language for speech-only interfaces
• Telephone interfaces
• Using grammars for speech recognition
– Java Speech Grammar Format (JSGF)
– Nuance Grammar Specification Language (NGSL)
• Speech output
– Synthesis
– Prerecorded audio
• http://www.voicexml.org
VoiceXml
<vxml:form>
  <vxml:field name="departure_city">
    <vxml:grammar>
      <![CDATA[
        #JSGF V1.0;
        grammar cities;
        public <city> = brussels | antwerp | amsterdam;
      ]]>
    </vxml:grammar>
    <vxml:prompt>What departure city would you like?</vxml:prompt>
    <vxml:catch event="help nomatch noinput">
      For example, brussels, antwerp or amsterdam
    </vxml:catch>
    <vxml:filled>
      <vxml:prompt>Your departure city is <vxml:value expr="departure_city" /></vxml:prompt>
    </vxml:filled>
  </vxml:field>
  <vxml:field name="destination_city">
    ………
  </vxml:field>
</vxml:form>
VoiceXml
• Mixed-initiative forms
– Single user input for several fields
– Supports more natural language
• For example
– I want to fly from "brussels" to "amsterdam"
– Filling in departure_city and destination_city fields
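The idea of one utterance filling several fields can be illustrated outside VoiceXml. The Python sketch below is only an illustration and not how a VoiceXml interpreter works internally: the regular expressions and the function name `fill_fields` are assumptions made for this example; only the field names and the city list come from the slides.

```python
import re

# Cities taken from the grammar on the VoiceXml slide.
CITIES = ("brussels", "antwerp", "amsterdam")
_CITY = "|".join(CITIES)

# Hypothetical patterns (assumption): "from <city>" and "to <city>".
_FROM = re.compile(rf"\bfrom\s+({_CITY})\b", re.IGNORECASE)
_TO = re.compile(rf"\bto\s+({_CITY})\b", re.IGNORECASE)


def fill_fields(utterance):
    """Return the form fields that a single utterance fills.

    Fields that the utterance does not mention are left out, so a
    dialog could still prompt for them one by one afterwards.
    """
    fields = {}
    for name, pattern in (("departure_city", _FROM), ("destination_city", _TO)):
        match = pattern.search(utterance)
        if match:
            fields[name] = match.group(1).lower()
    return fields
```

Calling `fill_fields("I want to fly from brussels to amsterdam")` fills both fields at once, while an utterance that names only one city fills only the matching field.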
X + V
• X + V
– XHtml: visual channel
– VoiceXml snippets: speech channel
• Synchronization between modalities using Xml Events
• Multimodal browsers supporting X+V
– ACCESS Netfront multimodal browser (PocketPC)
– Opera
• http://www.voicexml.org/specs/multimodal/x+v/12/
X + V
<html>
  <body>
    <form>
      <input id="from" name="from" size="20"
             ev:event="inputfocus" ev:handler="#voice_city_from" />
      <input id="to" name="to" size="20"
             ev:event="inputfocus" ev:handler="#voice_city_to" />
    </form>
  </body>
</html>
X + V
<vxml:form id="voice_city">
  <vxml:field name="departure_city_field" id="voice_city_from">
    <vxml:grammar>
      <![CDATA[
        #JSGF V1.0;
        grammar cities;
        public <city> = brussels | antwerp | amsterdam;
      ]]>
    </vxml:grammar>
    <vxml:prompt>What departure city would you like?</vxml:prompt>
    <vxml:catch event="help nomatch noinput">
      For example, brussels, antwerp or amsterdam
    </vxml:catch>
    <vxml:filled>
      <vxml:assign name="document.getElementById('from').value" expr="departure_city_field" />
    </vxml:filled>
  </vxml:field>
  <vxml:field name="destination_city_field" id="voice_city_to">
    …….
  </vxml:field>
</vxml:form>
X + V
• Also usable with XForms
• VoiceXml snippets and XForms influence the same XForms instance model, which yields synchronization
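The shared-instance idea can be sketched in a few lines of Python. This is a toy stand-in under stated assumptions, not the XForms API: the class `InstanceModel`, its methods and the `source` argument are all hypothetical names chosen for this illustration. It only shows why writing to one shared store keeps both modalities in step.

```python
class InstanceModel:
    """Toy stand-in for a shared instance: one data store, many views."""

    def __init__(self):
        self._data = {}
        self._listeners = []

    def subscribe(self, listener):
        # listener(name, value, source) is called on every change.
        self._listeners.append(listener)

    def set(self, name, value, source):
        # Any modality writes here; every subscribed view is notified.
        self._data[name] = value
        for listener in self._listeners:
            listener(name, value, source)

    def get(self, name):
        return self._data.get(name)


# The visual view mirrors the model; the speech channel writes to it.
model = InstanceModel()
visual_view = {}
model.subscribe(lambda name, value, source: visual_view.update({name: value}))
model.set("departure_city", "brussels", source="voice")
```

After the spoken input is stored, `visual_view` holds the same value as the model, without either modality addressing the other directly.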
Models to X + V
• Annotate UI description for speech [Shao2003: Transcoding HTML to VoiceXML Using Annotations]
• Extend this approach to UIML and X + V
– Identify particular information structures
    • Text areas
    • Menu/List structures
    • Top-level visual region
– Define their representation in XHTML and VoiceXml
– Generate the synchronization XML eventing code
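The last two steps can be sketched for a single text-entry field. The Python below is a minimal sketch, not the generator used in this work: the function name `text_entry_to_xv` and the `voice_<id>` / `<id>_field` naming scheme are assumptions, and the grammar is omitted for brevity. It follows the pattern of the earlier X + V slides, where an XML Events attribute on the XHTML input points at a VoiceXml field.

```python
import xml.etree.ElementTree as ET

# Standard W3C namespaces for XML Events and VoiceXML.
EV_NS = "http://www.w3.org/2001/xml-events"
VXML_NS = "http://www.w3.org/2001/vxml"
ET.register_namespace("ev", EV_NS)
ET.register_namespace("vxml", VXML_NS)


def text_entry_to_xv(field_id, prompt):
    """Return (visual, voice) elements for one text-entry field.

    Hypothetical naming scheme: the voice handler is "voice_<id>"
    and the VoiceXML field variable is "<id>_field".
    """
    handler_id = f"voice_{field_id}"

    # Visual channel: XHTML input that activates the voice handler on focus.
    visual = ET.Element("input", {
        "id": field_id,
        "name": field_id,
        f"{{{EV_NS}}}event": "inputfocus",
        f"{{{EV_NS}}}handler": f"#{handler_id}",
    })

    # Speech channel: VoiceXML field that copies its result back to the input.
    voice = ET.Element(f"{{{VXML_NS}}}field",
                       {"name": f"{field_id}_field", "id": handler_id})
    ET.SubElement(voice, f"{{{VXML_NS}}}prompt").text = prompt
    filled = ET.SubElement(voice, f"{{{VXML_NS}}}filled")
    ET.SubElement(filled, f"{{{VXML_NS}}}assign", {
        "name": f"document.getElementById('{field_id}').value",
        "expr": f"{field_id}_field",
    })
    return visual, voice
```

`ET.tostring(visual, encoding="unicode")` serializes the visual element with its `ev:event` and `ev:handler` attributes; a real generator would also emit the grammar and wrap the fields in a `<vxml:form>`.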
Models to X + V
• Define a generic UIML widget vocabulary mapping for both GUI and speech [Plomp2002]
• TextEntry
– <field> (VoiceXml)
– <input type="text" /> (XHtml)
– System.Windows.Forms.TextBox
• Collection
– <form> (VoiceXml)
– <form> (XHtml)
– System.Windows.Forms.Panel
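Such a vocabulary is essentially a lookup table from abstract widget to concrete widget per target. A minimal Python sketch follows, assuming the target keys `"voicexml"`, `"xhtml"` and `"winforms"` and the helper name `map_widget`, none of which come from the slides; only the table entries do.

```python
# Abstract UIML widget -> concrete widget per target, as listed on the slide.
# The target keys are hypothetical labels chosen for this sketch.
WIDGET_VOCABULARY = {
    "TextEntry": {
        "voicexml": "<field>",
        "xhtml": '<input type="text" />',
        "winforms": "System.Windows.Forms.TextBox",
    },
    "Collection": {
        "voicexml": "<form>",
        "xhtml": "<form>",
        "winforms": "System.Windows.Forms.Panel",
    },
}


def map_widget(abstract_widget, target):
    """Look up the concrete widget for an abstract widget on one target."""
    try:
        return WIDGET_VOCABULARY[abstract_widget][target]
    except KeyError:
        raise ValueError(
            f"no mapping for {abstract_widget!r} on target {target!r}"
        ) from None
```

Keeping the mapping as data means a new target or widget is added by extending the table, without touching the generator logic.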
Demo
• ACCESS Netfront multimodal browser
• PocketPC
• Ordering pizza
• Ordering Chinese
Conclusions
• X + V
– built-in modality synchronization
– alternative to own multimodal implementation
– declarative
– transformation from UIML possible
Questions?