A Robust Architecture for Human Language Technology Systems

M. Liu, T. Stanley, J. Baca and J. Picone

Intelligent Electronic Systems, Center for Advanced Vehicular Systems

Mississippi State University

URL: http://www.cavs.msstate.edu/hse/ies/publications/conferences/ieee_secon/2006/hlt_system/




SECON 2006 Page 2 of 14

Human Language Technology

Overview of human language technologies:

● Speech recognition and synthesis

● Natural language processing

● Machine translation


Architecture for Human Language Technology Systems

Development of HLT systems:

● Monolithic fashion

● Distributed system

[Figure: a monolithic system is decomposed into functional components (speech recognition, information retrieval, natural language understanding, and others) to form a distributed system.]


Galaxy (DARPA) Communicator

Galaxy Communicator is an open source architecture for constructing dialogue systems.

It aims to provide the next generation of intelligent conversational interfaces to distributed information. The goal is to support the creation of speech-enabled interfaces that scale gracefully across modalities, from speech-only to interfaces that include graphics, maps, pointing and gesture.

How It Works

• Hub

• Servers

• System initialization
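The hub-and-server arrangement can be illustrated with a minimal sketch. It is not the Galaxy Communicator API: the `Hub` class, its `register` and `dispatch` methods, the dictionary-shaped frames and the two stand-in servers are all hypothetical names chosen for illustration. It only shows the general idea that servers register with a central hub at initialization and exchange messages through it rather than directly.

```python
from typing import Callable, Dict, List

Frame = Dict[str, object]          # a message as key/value pairs (assumed shape)
Handler = Callable[[Frame], Frame]


class Hub:
    """Central router: servers never talk to each other directly."""

    def __init__(self) -> None:
        self._servers: Dict[str, Handler] = {}
        self.log: List[str] = []   # order in which servers were called

    def register(self, name: str, handler: Handler) -> None:
        """System initialization: each server announces itself to the hub."""
        self._servers[name] = handler

    def dispatch(self, target: str, frame: Frame) -> Frame:
        """Route a frame to the named server and return its reply."""
        if target not in self._servers:
            raise KeyError(f"no server registered as {target!r}")
        self.log.append(target)
        return self._servers[target](frame)


def recognizer(frame: Frame) -> Frame:
    # Stand-in for a speech recognition server.
    return {"text": f"transcript of {frame['audio']}"}


def understanding(frame: Frame) -> Frame:
    # Stand-in for a natural language understanding server.
    return {"intent": "query", "text": frame["text"]}


hub = Hub()
hub.register("recognizer", recognizer)
hub.register("understanding", understanding)

# The hub chains the servers; neither server knows about the other.
reply = hub.dispatch("recognizer", {"audio": "utt001"})
reply = hub.dispatch("understanding", reply)
```

Because every exchange passes through one router, a new server can be added by registering it, which is the plug-and-play property such architectures aim for.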


Advantages vs. Disadvantages

• Frequent deadlocks in the communication between servers

• Need for automated recovery from server failures

• Lack of a common user interface for all the applications

• Development time

• Complex inter-process communication

• Plug-and-play approach

• Automating server startup

• The need for robustness to error and improved debugging capabilities

• A common interface to allow users to select among applications


Automated server management

● Multiple Applications

Though the Communicator process monitor provides a good interface to start and terminate servers, it requires manual monitoring.

● Process Manager module

Automatically starts and controls all server processes in the prototype system architecture.
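A minimal sketch of what automated server management might look like, using Python's standard `subprocess` module. The `ProcessManager` class below, its method names and the stand-in commands are assumptions made for illustration; the slides do not describe how the actual module is implemented. The sketch starts every configured server and restarts any that has exited, which removes the need for manual monitoring.

```python
import subprocess
import sys
from typing import Dict, List


class ProcessManager:
    """Starts server processes and restarts any that exit unexpectedly."""

    def __init__(self, commands: Dict[str, List[str]]) -> None:
        self.commands = commands
        self.procs: Dict[str, subprocess.Popen] = {}
        self.restarts: Dict[str, int] = {name: 0 for name in commands}

    def start_all(self) -> None:
        for name, cmd in self.commands.items():
            self.procs[name] = subprocess.Popen(cmd)

    def check(self) -> List[str]:
        """Restart servers that have exited; return the names restarted."""
        restarted = []
        for name, proc in self.procs.items():
            if proc.poll() is not None:          # process has terminated
                self.procs[name] = subprocess.Popen(self.commands[name])
                self.restarts[name] += 1
                restarted.append(name)
        return restarted

    def stop_all(self) -> None:
        for proc in self.procs.values():
            if proc.poll() is None:              # still running
                proc.terminate()
        for proc in self.procs.values():
            proc.wait()


# Demo with stand-in commands (hypothetical; real servers would go here).
manager = ProcessManager({
    "hub": [sys.executable, "-c", "import time; time.sleep(30)"],
    "recognizer": [sys.executable, "-c", "raise SystemExit(1)"],  # exits at once
})
manager.start_all()
manager.procs["recognizer"].wait()   # let the failing server exit
restarted = manager.check()          # restarts only the server that exited
manager.stop_all()
```

In a long-running system `check` would be called periodically, and a restart limit would keep a permanently broken server from being relaunched forever.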


Architectural Enhancements – Process Manager

[Figure: Process Manager diagram showing client-side and server-side components. Labels: Speech Analysis, Hub, Signal Detector, Data Recorder, Speech Recognition.]


Common Application Interface

• Requirement of a common interface

• Demo Selector module

• Process Manager module

A Single Screen Interface

Once the user selects an application, the Demo Selector loads and displays the user interface needed for the specific application.

[Figure: Demo Selector interface]
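The selection step can be sketched as a registry lookup. Everything named here is hypothetical: the `Application` record, the `APPLICATIONS` table, the server names and the `select` function are illustrative only, and the idea that the selector asks a process manager to start the servers before loading the interface is an assumption about how the two modules cooperate.

```python
from typing import Callable, Dict, List, NamedTuple


class Application(NamedTuple):
    servers: List[str]             # servers the application needs
    load_ui: Callable[[], str]     # builds the application's interface


# Hypothetical registry; application and server names are illustrative only.
APPLICATIONS: Dict[str, Application] = {
    "dialog": Application(["hub", "recognizer", "dialog_manager"], lambda: "dialog UI"),
    "speech_analysis": Application(["hub", "signal_detector"], lambda: "analysis UI"),
}


def select(name: str, start_servers: Callable[[List[str]], None]) -> str:
    """Start the servers for the chosen application, then load its interface."""
    app = APPLICATIONS[name]
    start_servers(app.servers)     # assumed hand-off to a process manager
    return app.load_ui()


# A list's extend method stands in for the process manager in this demo.
started: List[str] = []
ui = select("speech_analysis", started.extend)
```

Keeping the per-application details in one table means a new demo can be offered on the single screen by adding one entry.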


Architectural Enhancements – Robustness Improvements

State Machine architecture

• Servers redesign

• Each state gets a particular message.

• Trap inter-process communication errors

Handshaking

• A simple protocol

• Sends a signal

• Waits for an acknowledgement

• Communicator frame

[Figure: handshaking state machines for the Speech Analysis Client and the Signal Detector. States: Initialization, Wait_for_Audio_Ready, Audio_Ready, Audio_Ready_Ack, Data_Transfer, End_Of_Utterance. Messages: Audio_Ready, Audio_Ready_Ack, Data, Data_Ack, End_Points, End_Of_Utterance, End_Of_Utterance_Ack.]
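The combination of a state machine and acknowledged messages can be sketched for the receiving side of the exchange. The message names come from the slide, but the transition table, the reduced set of states, and the `Receiver` and `ProtocolError` names are assumptions. The point of the sketch is that each state accepts only particular messages, every accepted message is acknowledged, and anything unexpected is trapped as an error instead of leaving both sides waiting.

```python
from enum import Enum, auto
from typing import Dict, Tuple


class State(Enum):
    INITIALIZATION = auto()
    DATA_TRANSFER = auto()
    END_OF_UTTERANCE = auto()


class ProtocolError(Exception):
    """Raised when a message arrives in a state that does not expect it."""


# (current state, incoming message) -> (acknowledgement, next state)
# The transition table is an assumption built from the labels on the slide.
TRANSITIONS: Dict[Tuple[State, str], Tuple[str, State]] = {
    (State.INITIALIZATION, "Audio_Ready"):
        ("Audio_Ready_Ack", State.DATA_TRANSFER),
    (State.DATA_TRANSFER, "Data"):
        ("Data_Ack", State.DATA_TRANSFER),
    (State.DATA_TRANSFER, "End_Of_Utterance"):
        ("End_Of_Utterance_Ack", State.END_OF_UTTERANCE),
}


class Receiver:
    def __init__(self) -> None:
        self.state = State.INITIALIZATION

    def handle(self, message: str) -> str:
        """Acknowledge an expected message or trap an unexpected one."""
        key = (self.state, message)
        if key not in TRANSITIONS:
            raise ProtocolError(f"{message!r} not expected in {self.state.name}")
        ack, self.state = TRANSITIONS[key]
        return ack


receiver = Receiver()
acks = [receiver.handle(m)
        for m in ("Audio_Ready", "Data", "Data", "End_Of_Utterance")]
```

A sender built the same way would wait for each acknowledgement, ideally with a timeout, before sending its next message, so a lost message surfaces as an error rather than a deadlock.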


Evaluation

Three experiments were conducted to measure the quantitative improvements in the robustness of the system.

Experiment 1: Compare the results obtained by testing utterances from the extended pilot database on the original and the enhanced architecture.


Evaluation

Experiment 2

Scenario sample (dialog system application): Imagine you are in a big city to attend a conference. Once the conference proceedings are over for the day, you want to visit some sites of interest. You don't have a map with you and have no idea about the layout of the city. Use the system to plan your trip.

Procedure: Five users were asked to engage in 24 usage scenarios using the original and the enhanced architecture. After a 10-minute practice session to become familiar with the functionality of the system, each user performed the scenarios. The entire experiment took approximately 1 hour 30 minutes. Users were asked to cease testing if there was a system failure or if they exceeded the allotted time of 30 minutes.


Evaluation

Results:

* E stands for enhanced architecture; O stands for original architecture.


Evaluation

Experiment 3


Future work

• Conduct further experiments to obtain additional measures of the robustness improvements due to the enhanced architecture.

• Add other state-of-the-art applications to our existing HLT package.

• Enhance the Process Manager to create and manage server processes on different host machines.

