A Prototype Personal Dictation System, Adam Janin (janin@icsi.berkeley.edu)

Page 1: A Prototype Personal Dictation System Adam Janin janin@icsi.berkeley.edu.

A Prototype Personal Dictation System

Adam Janin, janin@icsi.berkeley.edu

Final Goal – A Portable Meeting Recorder

Record impromptu meetings in a natural environment.

Detect multiple speakers.

Allow correction and annotation.

Support indexing and searching.

Self-contained (using IRAM).

Intermediate Goal – A Personal Dictation System

Record a single user dictating text.

Allow correction and editing.

Hosted system:

ASR runs on workstation. GUI runs on Pilot. Communicate via wired network. Close-talking mic. Limited domain (Broadcast News).

Asides...

Why not Wizard of Oz?

Structure of correction mechanism is recognizer-specific.

Develop infrastructure.

Produce a working demo.

Informal user study, mostly with speech researchers.

Architecture

Palm Pilot

Correct transcripts

Edit transcripts

Create new text

Sun Workstation

Audio frontend

Speech recognizer

Correction server

Correcting and Editing

Correcting – informing the recognizer that it has made an error.

If recognizer has a good idea of alternatives, it may be faster to correct than to edit.

Recognizer can adapt to user and vocabulary.

Editing – changing the output.

“That’s not what I meant to say”.

Text vs. speech input.

Correction Methods: Background

Lattice contains recognizer’s best guesses.

More compact than N-best lists.

Contains word order and timing.

1). the records …
2). a rack ...
3). the wreck or …
4). a record ...
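
As a rough illustration of why a lattice is more compact than an N-best list, the sketch below stores word hypotheses as edges between time-ordered nodes and expands them into complete hypotheses. The `Edge` fields, the node numbering, the times, and the log-probability scores are all invented for illustration; they are not taken from the recognizer used in this work.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Edge:
    src: int      # node the word starts from
    dst: int      # node the word ends at
    word: str
    start: float  # start time in seconds (toy value)
    end: float    # end time in seconds (toy value)
    logp: float   # log-probability of the word (toy value)

# Toy lattice: two choices for the first word, four continuations.
TOY_LATTICE = [
    Edge(0, 1, "the",     0.0, 0.2, -0.4),
    Edge(0, 1, "a",       0.0, 0.2, -1.2),
    Edge(1, 3, "records", 0.2, 0.8, -0.5),
    Edge(1, 3, "rack",    0.2, 0.8, -1.4),
    Edge(1, 3, "record",  0.2, 0.8, -1.6),
    Edge(1, 2, "wreck",   0.2, 0.5, -1.0),
    Edge(2, 3, "or",      0.5, 0.8, -0.9),
]

def all_paths(edges, start, goal):
    """Expand the lattice into (score, words) pairs, best score first."""
    out_edges = {}
    for e in edges:
        out_edges.setdefault(e.src, []).append(e)

    results = []

    def walk(node, words, score):
        if node == goal:
            results.append((score, words))
            return
        for e in out_edges.get(node, []):
            walk(e.dst, words + [e.word], score + e.logp)

    walk(start, [], 0.0)
    return sorted(results, key=lambda r: r[0], reverse=True)

n_best = all_paths(TOY_LATTICE, start=0, goal=3)
for score, words in n_best:
    print(f"{score:6.2f}  {' '.join(words)}")
```

In this toy case seven edges expand into eight full hypotheses, and each edge still carries its word order (through the nodes) and timing.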

Correction Methods: Selecting Hypotheses

User corrects “records”.

1). the records …
2). a rack ...
3). the wreck or …
4). a record ...

System picks all words that overlap in time.

Presents in order from most likely to least.

Note: full overlap is probably not optimal.
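
The selection step can be sketched as follows: collect every lattice word whose time span overlaps the word being corrected, then list the distinct alternatives from most to least likely. The `WordHyp` type, the times, and the scores are assumptions made for this example. The `min_overlap` threshold is also hypothetical; it is included only to show one way the overlap criterion could be tightened.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class WordHyp:
    word: str
    start: float  # seconds (toy value)
    end: float    # seconds (toy value)
    logp: float   # log-probability (toy value)

def overlapping_alternatives(lattice_words, target, min_overlap=0.0):
    """Return distinct words that overlap `target` in time, most likely first."""
    best = {}
    for w in lattice_words:
        overlap = min(w.end, target.end) - max(w.start, target.start)
        if overlap > min_overlap and w.word != target.word:
            # Keep the best score seen for each distinct word.
            if w.word not in best or w.logp > best[w.word]:
                best[w.word] = w.logp
    return sorted(best, key=best.get, reverse=True)

records = WordHyp("records", 0.20, 0.80, -0.5)
words = [
    WordHyp("the",    0.00, 0.20, -0.4),
    WordHyp("a",      0.00, 0.15, -1.2),
    records,
    WordHyp("rack",   0.15, 0.60, -1.4),
    WordHyp("record", 0.15, 0.70, -1.6),
    WordHyp("wreck",  0.20, 0.55, -1.0),
    WordHyp("or",     0.55, 0.80, -0.9),
]

alts = overlapping_alternatives(words, records)
print(alts)
```

With any overlap accepted, short neighbouring words such as "or" also appear in the list, which is one reason a stricter criterion might serve the user better.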

Correction Methods: Rescoring

User corrects “records” to “record”.

1). the records …
2). a rack ...
3). the wreck or …
4). a record ...

Unexpected changes!

Select only paths with “record”.

Rescore lattice.
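
A minimal sketch of this step, assuming a toy lattice whose four paths are the four hypotheses listed above: keep only the paths that contain the corrected word, then pick the best remaining one. The edge scores and node numbers are invented, and a real recognizer would rescore the constrained lattice rather than simply re-rank enumerated paths.

```python
# Each edge is (source node, destination node, word, toy log-probability).
EDGES = [
    (0, 1, "the",     -0.4),
    (0, 2, "a",       -0.9),
    (1, 3, "records", -0.5),
    (1, 4, "wreck",   -0.8),
    (4, 3, "or",      -0.7),
    (2, 3, "rack",    -0.6),
    (2, 3, "record",  -1.4),
]

def paths(edges, node, goal):
    """Yield (score, words) for every path from `node` to `goal`."""
    if node == goal:
        yield 0.0, []
        return
    for src, dst, word, logp in edges:
        if src == node:
            for score, words in paths(edges, dst, goal):
                yield logp + score, [word] + words

def rescore(edges, start, goal, must_contain=None):
    """Best word sequence, optionally restricted to paths with one word."""
    candidates = [
        p for p in paths(edges, start, goal)
        if must_contain is None or must_contain in p[1]
    ]
    if not candidates:
        return None  # the corrected word is not in the lattice
    return max(candidates, key=lambda p: p[0])[1]

before = rescore(EDGES, 0, 3)
after = rescore(EDGES, 0, 3, must_contain="record")
print(before, "->", after)
```

In this toy lattice "record" is only reachable after "a", so correcting one word also changes "the" to "a". That is the kind of unexpected change the slide warns about.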

Editing

Allows user to add or edit text arbitrarily.

Must synchronize with correction server.

Edit vs. Correct is currently implemented modally with push buttons on-screen.

Gestural interface for correcting and editing would be preferable.

Details...

Correction allows for words not in lattice.

Tap to correct worked better than press-and-hold.

System updates text when user pauses.

Doesn’t handle punctuation, paragraphs, etc.

Correction is fast, but dictation is slow.

Future Work

“Real” user studies.

Experiment more with correction mechanisms.

Implement editing synchronization.

Implement gestures.

Move to wireless network and mic.

Add punctuation, paragraphs, etc.