Uses for Automatic Speech Recognition with Diverse English Speakers 2002 American Speech-Language-Hearing Association Annual Convention Atlanta, Georgia World Congress Center, Room: A314, Saturday, Nov 23 2002 4:30PM – 5:30PM Presenters/Authors: Kathleen Eilers Crandall, Ph.D., Paula M. Brown, Ph.D., Donna E. Gustina, and Stephen S. Campbell National Technical Institute for the Deaf Rochester Institute of Technology

Uses for Automatic Speech Recognition with Diverse English Speakers

2002 American Speech-Language-Hearing Association Annual Convention

Atlanta, Georgia World Congress Center, Room: A314, Saturday, Nov 23 2002 4:30PM – 5:30PM

Presenters/Authors: Kathleen Eilers Crandall, Ph.D., Paula M. Brown, Ph.D., Donna E. Gustina, and Stephen S. Campbell

National Technical Institute for the Deaf

Rochester Institute of Technology

Seminar – Presenters

Kathleen Eilers Crandall, Ph.D., Department of English, National Technical Institute for the Deaf, Rochester Institute of Technology

Paula M. Brown, Ph.D., CCC-SLP Department of Speech and Language, National Technical Institute for the Deaf, Rochester Institute of Technology

The Glossograph

• Fay wrote about an experimental mechanical device used to transcribe human speech, and said,

• “… it is not unreasonable to hope that some instrument will yet be contrived …”

Fay, E.A. (1883). The glossograph. American Annals of the Deaf, 28, 67-69.

Sci-Fi or Reality?

“The pen was an archaic instrument, seldom used even for signatures ... Apart from very short notes, it was usual to dictate everything into the speak-write …” (Orwell, 1949, Nineteen Eighty-Four)

Two Projects

• Teacher use of ASR:
  – English Classroom/Lab Project

• Student use of ASR:
  – Speech Project

Funded by a grant from the Parsons Foundation of California

English Classroom/Lab Project


Purpose

Investigate direct use of ASR by the classroom teacher to learn:

• Is an acceptable recognition level attained?

• Under what conditions?
  – Style of speaking
  – Communication mode
  – Language complexity

Related Work

Use of ASR by an intermediary

• Intermediary, a ‘captionist,’ re-speaks professor’s words into a computer

• Intermediary summarizes professor’s words into a computer (‘interpreted speech’)

• Intermediary may use C-print (a shorthand typing system) in combination with ASR http://cprint.rit.edu/

Related Work

Use of ASR by the primary speaker

• iCommunicator™ http://www.myicommunicator.com/product_info.html

• Liberated Learning Environment http://www.liberatedlearning.com (St. Mary’s University, Halifax, Nova Scotia)

Speech Project

Intent

• Can ASR become better than a naïve listener?

• Can ASR serve as an effective and motivating feedback system?

Speech Project

How ASR Is Used Educationally

Visual displays provide feedback regarding speech production

• Natural way of learning

• Expect feedback to reflect accuracy
  – Assume that if you don’t get the right picture, you were wrong

English Classroom/Lab Project


Teacher -- Students

• Teacher -- Speaker
  – Native speaker of American English
  – User of ASL as a second language
  – Trained the ASR equipment

• Students -- Readers
  – Young adult college students who are deaf or hard-of-hearing
  – Reading and writing skills at the lowest quartile of entering students
  – Enrolled in basic level English language reading and writing courses

English Classroom/Lab Project

Evaluation Procedures

• ASR Software:
  – Dragon NaturallySpeaking
  – IBM ViaVoice
  – Microsoft Office

• Speaking styles:
  – Spontaneous conversation
  – Dictation-like speech

• Communication modes:
  – Speaking
  – Simultaneously speaking and signing
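The slides report recognition levels as percentages but do not state how they were scored. As a minimal sketch, assuming "accuracy" means word accuracy (one minus the word error rate, counted by word-level edit distance), each condition listed above could be scored as follows. The function names are hypothetical and are not taken from the project.

```python
def word_errors(reference, hypothesis):
    """Minimum number of word substitutions, deletions, and insertions
    needed to turn the hypothesis into the reference (Levenshtein
    distance over words, case-insensitive)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # prev[j] holds the distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from ASR output
                            curr[j - 1] + 1,      # extra word in ASR output
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(reference, hypothesis):
    """Word accuracy = 1 - (errors / number of reference words).
    Assumed scoring rule; the slides do not specify one."""
    ref_len = len(reference.split())
    if ref_len == 0:
        raise ValueError("reference must contain at least one word")
    return 1.0 - word_errors(reference, hypothesis) / ref_len


if __name__ == "__main__":
    spoken = "the pen was an archaic instrument"
    recognized = "the pen was in archaic instrument"
    print(f"{word_accuracy(spoken, recognized):.1%}")
```

Under this assumed rule, a six-word utterance with one misrecognized word scores 5/6. Note that many inserted words can push the value below zero, which is one reason scoring conventions differ between studies.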

English Classroom/Lab

• Teacher station

• Control system

• Smart Board & LCD Projector

• Student stations

English Classroom/Lab Project

Accuracy Needs

• Vary by population and message predictability
  – New vs. known information
  – Fluent readers vs. language learners
  – Reading for pleasure vs. reading to master new information

• CLOZE research and prediction of missing information
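The slide does not say which cloze procedure the cited research used. A minimal sketch, assuming the classic fixed-ratio form in which every nth word is deleted and the reader restores it, shows how prediction of missing information can be measured. The function names and the default deletion interval are illustrative assumptions.

```python
def make_cloze(text, every=5, blank="_____"):
    """Replace every `every`-th word with a blank.
    Returns the cloze passage and the list of deleted words."""
    if every < 1:
        raise ValueError("every must be at least 1")
    passage, answers = [], []
    for position, word in enumerate(text.split(), start=1):
        if position % every == 0:
            answers.append(word)
            passage.append(blank)
        else:
            passage.append(word)
    return " ".join(passage), answers


def cloze_score(answers, responses):
    """Fraction of blanks restored with the exact original word
    (case-insensitive). Unanswered blanks count as wrong."""
    if not answers:
        raise ValueError("no blanks to score")
    correct = sum(a.lower() == r.lower() for a, r in zip(answers, responses))
    return correct / len(answers)


if __name__ == "__main__":
    passage, answers = make_cloze("one two three four five six", every=3)
    print(passage)
    print(cloze_score(answers, ["three", "seven"]))
```

A reader who restores most deleted words can likely tolerate a similar proportion of ASR errors in predictable text, which is the connection the slide draws between cloze results and accuracy needs.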

English Classroom/Lab Project

Results: ASR Software

[Chart: accuracy axis from 75% to 100%; software labels Dragon, ViaVoice, XP; speaking-style labels Conversation, Dictation]

English Classroom/Lab Project

Results: Communication Mode

[Chart: accuracy axis from 80% to 98%; communication-mode labels Simultaneous Communication, Speech Only; speaking-style labels Conversation, Dictation]

English Classroom/Lab Project

Results: Language Complexity

[Chart: accuracy axis from 82% to 98%; language-complexity labels < 7th Grade, > 7th Grade; speaking-style labels Conversation, Dictation]

English Classroom/Lab Project

Correcting Text

• Error correction
  – What to correct
  – When to correct
  – How to correct

Multitasking Demands

• Normal tasks for speaker/teacher
  – Formulating ideas relevant to topic
  – Attending to learning needs of students
  – Meeting lipreading and sign language needs

• Added tasks for speaker/teacher
  – Speaking to produce readable ASR text
  – Monitoring text
  – Making corrections

Speech Project


Training Sequence

• Read a paragraph

• Correct and train recognition errors

• Reread paragraph

• Correct and train recognition errors

• Create transfer paragraph or spontaneous speech

• Correct and train recognition errors

Recognition Accuracy

[Chart: accuracy axis from 0% to 100%; speaker labels M Intel, F semi-intel, F quasi-intel]

Improvement Across Sessions

[Chart: accuracy axis from 0% to 80%; session labels time 1, time 2, time 3, time 4, time 5]

Improvement Within Session

[Chart: accuracy axis from 65% to 95%; labels Reading 1, Reading 2, Reading 3, Spon Sp (spontaneous speech)]

Speech Project

Improvement Evaluated

• Improvement across sessions

• Improvement within a session
  – Improvement with speaker training
  – Improvement with ASR training

Recommendations

Discussion

Questions

Grammatical Correctness

• Is ASR accuracy affected by the grammatical correctness of the user’s speech?

• Student written responses spoken as written: accuracy 93.8%

• Student written responses spoken after correction: accuracy 94.3%

Style of Speaking

1. Style of speaking that more closely resembles dictation approaches a usable accuracy rate.

2. Lowering the language complexity does not improve accuracy.

Conditions of Use

Direct use of ASR by a language teacher -- Useful only under very controlled conditions.

• Illustrating the generation of written language

• Demonstrating the use of notes and outlines to produce written text

• Translating selected sign language utterances into English text during discussions

ASR: Classroom Use

Prepared Outline

Student’s Screen

Teacher’s Screen

Considerations

• Training
  – Critical to reach over 90% accuracy
  – Training with conversation

• Corrections
  – Familiarity with strategies
  – Dictate, Spell, Right click

• Equipment
  – Microphone headsets - design, comfort, and size
  – Demand on computer processor
  – Effect of optional settings

Language Processing

Teaching/Learning Issues:

• Does ASR promote the learning of reading and writing for Deaf and Hard-of-Hearing students?

• How do students process this information?

• Do students attend to multiple inputs?

• Can teachers attend to this many tasks effectively?

More Questions

• Who is at fault?
  – Speaker or ASR receiver?

• Acceptability of input
  – Various voices
  – Nontypical speakers

• User friendliness
  – Want immediate use

Presenters

Kathleen Eilers Crandall, Ph.D.

Department of English

National Technical Institute for the Deaf

Rochester Institute of Technology

Lyndon Baines Johnson Building - 2264

Phone: (585) 475-5111

Fax: (585) 475-6500

Email: [email protected]

Web: http://www.rit.edu/~kecncp

Paula M. Brown, Ph.D., CCC-SLP

Department of Speech and Language

National Technical Institute for the Deaf

Rochester Institute of Technology

Lyndon Baines Johnson Building - 3851

Phone: (585) 475-6593 V/TDD

Fax: (585) 475-6500

Email: [email protected]

Web: http://www.rit.edu/~462www/