1 7/22/2004
CarnegieMellon
Project LISTEN
If I Have a Hammer: Computational Linguistics in a Reading Tutor that Listens
Jack Mostow
Project LISTEN (www.cs.cmu.edu/~listen)
Carnegie Mellon University
“To a man with a hammer, everything looks like a nail.” – Mark Twain
Funding: National Science Foundation
Keynote at 42nd Annual Meeting of the Association for Computational Linguistics, Barcelona, Spain
2 7/22/2004
If I had a hammer… [Hays & Seeger]
If I had a hammer,
I’d hammer in the morning
I’d hammer in the evening,
All over this land
I’d hammer out danger,
I’d hammer out a warning,
I’d hammer out love between my brothers and my sisters,
All over this land.
3 7/22/2004
Outline
1. Project LISTEN’s Reading Tutor
2. Roles of computational linguistics in the tutor
3. So… Conclusions
4 7/22/2004
Project LISTEN’s Reading Tutor (video)
5 7/22/2004
Project LISTEN’s Reading Tutor (video)
John Rubin (2002). The Sounds of Speech (Show 3). On Reading Rockets (Public Television series commissioned by U.S. Department of Education). Washington, DC: WETA.
Available at www.cs.cmu.edu/~listen.
6 7/22/2004
Thanks to fellow LISTENers
Tutoring:
  Dr. Joseph Beck, mining tutorial data
  Prof. Albert Corbett, cognitive tutors
  Prof. Rollanda O’Connor, reading
  Prof. Kathy Ayres, stories for children
  Joe Valeri, activities and interventions
  Becky Kennedy, linguist
Listening:
  Dr. Mosur Ravishankar, recognizer
  Dr. Evandro Gouvea, acoustic training
  John Helman, transcriber
Programmers:
  Andrew Cuneo, application
  Karen Wong, Teacher Tool
Field staff:
  Dr. Roy Taylor
  Kristin Bagwell
  Julie Sleasman
Grad students:
  Hao Cen, HCI
  Cecily Heiner, MCALL
  Peter Kant, Education
  Shanna Tellerman, ETC
Plus:
  Advisory board
  Research partners: DePaul, UBC, U. Toronto
  Schools
7 7/22/2004
Computational linguistics models in an intelligent tutor
Language models predict word sequences for a task. E.g. expect ‘once upon a time…’
Domain models describe skills to learn. E.g. pronounce ‘c’ as /k/.
Production models describe student behavior. E.g. which mistakes do students make?
Student models estimate a student’s skills. E.g. which words will a student need help on?
Pedagogical models guide tutorial decisions. E.g. which types of help work best?
Theme: use data to train models automatically.
8 7/22/2004
Language model of oral reading [Mostow, Roth, Hauptmann, & Kane AAAI94]
Problem: which word sequences to expect?
Language model specifies word transition probabilities
Given sentence text (e.g. ‘Once upon a time…’):
  Expect correct reading
  But allow for deviations
  With heuristic probabilities
Result: Accepted 96% of correctly read words. Detected about half the serious mistakes.
[Figure: language-model transition diagram. Nodes: ‘once’, ‘upon’, ‘up’, ‘a’. Transition labels: PrCorrect (‘once’ → ‘upon’), PrRepeat, PrJump, PrTruncate.]
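The transition structure on this slide can be sketched in Python. This is a minimal illustration under stated assumptions, not Project LISTEN's implementation: the function name `build_transitions` and the three probability values are invented for the example (the slide says only that the probabilities were heuristic), and truncation (PrTruncate) is left out for brevity.

```python
def build_transitions(words, p_correct=0.90, p_repeat=0.06, p_jump=0.04):
    """Return {state: {next_state: probability}} for one sentence.

    State i means "word i was just read"; state len(words) is the end
    of the sentence. The probability values are illustrative guesses.
    """
    n = len(words)
    table = {}
    for i in range(n):
        # Most likely: the reader goes on to the next word.
        # Less likely: the reader repeats the current word.
        row = {i + 1: p_correct, i: p_repeat}
        # Remaining mass is spread over jumps to any other position.
        others = [j for j in range(n + 1) if j not in (i, i + 1)]
        if others:
            for j in others:
                row[j] = p_jump / len(others)
        else:
            # One-word sentence: nowhere to jump, so fold the mass back in.
            row[i + 1] += p_jump
        table[i] = row
    return table


transitions = build_transitions("once upon a time".split())
```

Each row of `transitions` sums to 1, with the correct next word given most of the probability mass, which is the "expect correct reading but allow for deviations" idea in its simplest form.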
9 7/22/2004
Using ASR errors to tune a language model [Banerjee, Mostow, Beck, & Tam ICAAI03]
Training data: 3,421 oral reading utterances
  Spoken by 50 children aged 6-10
  Recognized (imperfectly) by speech recognizer
  Transcribed by hand
Method: learn to classify language model transitions
  Reward good transitions that match transcript
  Penalize bad transitions that cause recognizer errors
  Generalize from features (kid age, text length, word type, …)
Result: reduced tracking error by 24% relative to baseline
10 7/22/2004
Domain model of pronunciation
Problem: what should students learn?
Data: pronunciation dictionary for children’s text
  ‘teach’ → /T IY CH/
Method: align spelling against pronunciation
  ‘t’ → /T/, ‘ea’ → /IY/, ‘ch’ → /CH/
How frequent is each grapheme-phoneme mapping?
  ‘t’ → /T/ occurred 622 times in 9776 mappings
  ‘z’ → /S/ occurred once (in ‘quartz’)
How consistently is each grapheme pronounced?
  ‘v’ → /V/ always
  ‘e’ → /EH/ (‘bed’), /AH/ (‘the’), /IY/ (‘be’), /IH/ (‘destroy’)
    + ‘ea’, ‘eau’, ‘ed’, ‘ee’, ‘ei’, ‘eigh’, ‘eo’, ‘er’, ‘ere’, ‘eu’, …
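The two questions on this slide (how frequent, how consistent) reduce to counting over aligned dictionary entries. The sketch below assumes the alignment has already been done; the tiny `ALIGNED` lexicon and the function names are hypothetical, and the consistency measure (share of a grapheme's occurrences taken by its most common phoneme) is one plausible choice rather than the one the project necessarily used.

```python
from collections import Counter, defaultdict

# Hypothetical hand-aligned entries: word -> (grapheme, phoneme) pairs.
ALIGNED = {
    "teach": [("t", "T"), ("ea", "IY"), ("ch", "CH")],
    "bed": [("b", "B"), ("e", "EH"), ("d", "D")],
    "be": [("b", "B"), ("e", "IY")],
    "vet": [("v", "V"), ("e", "EH"), ("t", "T")],
}


def mapping_counts(aligned):
    """Count how often each grapheme-phoneme mapping occurs."""
    counts = Counter()
    for pairs in aligned.values():
        counts.update(pairs)
    return counts


def consistency(counts):
    """For each grapheme, the share of its occurrences that use its
    most common phoneme (1.0 means always pronounced the same way)."""
    by_grapheme = defaultdict(Counter)
    for (grapheme, phoneme), n in counts.items():
        by_grapheme[grapheme][phoneme] += n
    return {g: max(c.values()) / sum(c.values()) for g, c in by_grapheme.items()}


counts = mapping_counts(ALIGNED)
scores = consistency(counts)
```

In this toy lexicon ‘v’ scores 1.0 while ‘e’ scores 2/3, mirroring the slide's contrast between a consistent and an inconsistent grapheme. On the slide's own figures, ‘t’ → /T/ accounts for 622 / 9776, or about 6.4%, of all mappings.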
11 7/22/2004
Production model of pronunciation [Fogarty, Dabbish, Steck, & Mostow AIED2001]
Problem: Which mistakes to expect?
Data: U. Colorado database of oral reading mistakes
  ‘bed’ → /B IY D/
Method: train G → P → P′ malrules for decoding
  ‘e’ → /EH/ → /IY/
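A malrule here maps a grapheme G and its correct phoneme P to the phoneme P′ the student actually produced. The slide does not describe the training procedure, so the following is only a frequency-counting sketch; the function name and the sample observations are hypothetical.

```python
from collections import Counter


def count_malrules(observations):
    """Count (grapheme, correct phoneme, produced phoneme) triples,
    keeping only those where the produced phoneme is wrong."""
    counts = Counter()
    for grapheme, correct, produced in observations:
        if correct != produced:
            counts[(grapheme, correct, produced)] += 1
    return counts


# Hypothetical aligned observations; "" stands for a dropped phoneme.
observations = [
    ("e", "EH", "IY"),  # 'bed' read as /B IY D/
    ("e", "EH", "IY"),
    ("s", "S", ""),     # final 's' dropped
    ("b", "B", "B"),    # read correctly, so not a malrule
]
malrules = count_malrules(observations)
```

Ranking the resulting counts with `malrules.most_common()` gives a list of the most frequent decoding errors.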
12 7/22/2004
Top five G → P → P′ decoding errors

Error      G     P     P′    Example
Drop ‘s’   ‘s’   /S/   //    ‘plants’
Drop ‘s’   ‘s’   /Z/   //    ‘arms’
Add ‘n’    ‘’    //    /N/   ‘ha_d’
Add ‘s’    ‘’    //    /Z/   ‘car_’
Drop ‘n’   ‘n’   /N/   //    ‘land’

Result: predicted mistakes in unseen test data
Context-sensitive rules improved accuracy.
Later work: predict real-word mistakes [Mostow, Beck, Winter, Wang, & Tobin ICSLP2002]
13 7/22/2004
Student model of help requests [Beck, Jia, Sison, & Mostow UM2003]
Problem: when will a student request help on a word?
Data: 7 months of Reading Tutor use by 87 students
  Average ~20 hours per student
  Transactions logged in detail
  Help request rate excluding common words: 0.5%–54%
Method: train classifier using word, student, history
Result: predict words that unseen students click on
14 7/22/2004
Learning curves for students’ help requests
Try to predict subset
  Grade 1-2 level
  1-6 prior encounters
Selected data
  53 students
  175,961 words
  29,278 help requests
Train predictive model
  Count help requests 5x
  Predict other kids’ data
  71% accuracy
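One reading of "count help requests 5x" is that each help request was weighted five times as heavily as a non-request during training; that interpretation is an assumption, as the slide does not spell it out. The arithmetic below, with a hypothetical helper name, shows what such a weight would do to the class balance in the selected data.

```python
def weighted_positive_share(n_total, n_positive, weight):
    """Share of total training weight carried by positive examples
    when each positive example counts `weight` times."""
    n_negative = n_total - n_positive
    weighted_positive = weight * n_positive
    return weighted_positive / (weighted_positive + n_negative)


# Figures from the slide: 175,961 words, 29,278 help requests.
raw_rate = 29_278 / 175_961                                # about 0.17
balanced = weighted_positive_share(175_961, 29_278, 5)     # about 0.50
```

Under this reading, a weight of 5 brings help requests from roughly one word in six up to almost exactly half of the training weight, so a 71% accuracy would be measured against a baseline near 50% rather than near 83%.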
15 7/22/2004
Features used
Information about the student
  Help request rate, overall reading proficiency, etc.
Information about the word
  Word length, position in sentence, etc.
Student’s history with reading word
  Percent of times accepted by Reading Tutor, time to read, etc.
Student’s prior help on this word
  Was the word helped previously? Earlier today?
How to get all this data??
16 7/22/2004
Data collection and translation
word features
17 7/22/2004
Structure of Reading Tutor database

Level                Student            Reading Tutor
Session              Login              List readers
Story Encounter      Pick stories       List stories
Sentence Encounter   Read sentence      Show one sentence at a time
Word Encounter       Read each word     Listens and helps
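The nesting of sessions, story encounters, sentence encounters, and word encounters named on this slide can be sketched as Python dataclasses. The field names (`student_id`, `accepted`, `help_requested`, and so on) and the helper `help_request_rate` are hypothetical; the slide gives only the four levels, not the actual schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WordEncounter:
    word: str
    accepted: bool = False        # hypothetical: tutor accepted the reading
    help_requested: bool = False  # hypothetical: student asked for help


@dataclass
class SentenceEncounter:
    text: str
    words: List[WordEncounter] = field(default_factory=list)


@dataclass
class StoryEncounter:
    title: str
    sentences: List[SentenceEncounter] = field(default_factory=list)


@dataclass
class Session:
    student_id: str
    stories: List[StoryEncounter] = field(default_factory=list)


def help_request_rate(session):
    """Fraction of word encounters in a session where help was requested."""
    words = [
        w
        for story in session.stories
        for sentence in story.sentences
        for w in sentence.words
    ]
    if not words:
        return 0.0
    return sum(w.help_requested for w in words) / len(words)
```

A per-student feature such as the help request rate can then be computed by walking the hierarchy from the session down to individual word encounters.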
18 7/22/2004
Project LISTEN’s Reading Tutor: A rich source of experimental data
The Reading Tutor beats independent practice… Effect sizes up to 1.3 [Mostow SSSR02, Poulsen 04]
…but how? Use embedded experiments to investigate!
2003-2004 database:
  9 schools
  > 200 computers
  > 50,000 sessions
  > 1.5M tutor responses
  > 10M words recognized
Embedded experiments
Randomized trials
19 7/22/2004
Pedagogical model of help on decoding [Mostow, Beck, & Heiner SSSR2004]
Problem: Which types of help work best?
Data: 270 students’ assisted reading in the Reading Tutor
Method: randomize choice of help and analyze its effects
Result: detected significant differences in effectiveness
20 7/22/2004
Within-subject experiment design: 270 students, 180,909 randomized trials
Outcome: success = ASR accepts word as read fluently
(How) does the type of help affect the next encounter?

1. Student is reading a story: ‘People sit down and …’
2. Student needs help on a word: student clicks ‘read.’
3. Tutor chooses what help to give: randomized choice among feasible types
4. Student continues reading: ‘… read a book.’
5. Time passes…
6. Student sees word in a later sentence: ‘I love to read stories.’
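The randomization step of this design can be sketched as follows. The help-type names are those listed on the next slide; everything else is an assumption made for illustration, in particular the `resources` dictionary used to decide which types are feasible for a given word, which stands in for whatever feasibility checks the Reading Tutor actually performs.

```python
import random

HELP_TYPES = [
    "Say In Context", "Say Word",
    "Syllabify", "Onset Rime", "Sound Out", "One Grapheme",
    "Rhymes With", "Starts Like",
    "Recue", "Show Picture", "Sound Effect",
]


def feasible_types(word, resources):
    """Help types that can be given for `word`.

    `resources` (hypothetical) maps a help type to the set of words it
    is available for; a type missing from the dict is always feasible.
    """
    return [t for t in HELP_TYPES if t not in resources or word in resources[t]]


def choose_help(word, resources, rng=random):
    """Pick one feasible help type uniformly at random."""
    return rng.choice(feasible_types(word, resources))


# Example: pictures exist only for 'book'; no sound effects at all.
resources = {"Show Picture": {"book"}, "Sound Effect": set()}
help_given = choose_help("read", resources, random.Random(0))
```

Because the choice is random within each student's own reading, every student serves as their own control, which is what makes the design within-subject.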
21 7/22/2004
180,909 word hints (average success rate 66.1%)
Whole word:
  24,841 Say In Context
  56,791 Say Word
Decomposition:
  6,280 Syllabify
  14,223 Onset Rime
  19,677 Sound Out
  22,933 One Grapheme
Analogy:
  13,165 Rhymes With
  13,671 Starts Like
Semantic:
  14,685 Recue
  2,285 Show Picture
  488 Sound Effect
Which types stood out?
  Best: Rhymes With 69.2% ± 0.4%
  Worst: Recue 55.6% ± 0.4%
Example: ‘People sit down and read a book.’
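The slide does not say how the ± values were obtained. One simple possibility, assumed here, is the binomial standard error of a proportion, which treats every trial as independent and so ignores the fact that trials are clustered within students. The function name is hypothetical.

```python
import math


def binomial_se(p, n):
    """Standard error of a success rate p observed over n independent trials."""
    return math.sqrt(p * (1 - p) / n)


# Success rates and trial counts from the slide.
rhymes_se = binomial_se(0.692, 13_165)  # Rhymes With
recue_se = binomial_se(0.556, 14_685)   # Recue
```

Both values come out at roughly 0.004, that is 0.4 percentage points, which is consistent with the ± 0.4% shown for the best and worst help types.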
22 7/22/2004
What helped which words best?
                 Same day:                      Later day:
Grade 1 words:   Say In Context, Onset Rime     Onset Rime
Grade 2 words:   Say In Context, Rhymes With    Rhymes With
Grade 3 words:   Say In Context                 Rhymes With, One Grapheme

Compare within level to control for word difficulty.
Supplying the word helped best in the short term…
But rhyming hints had longer lasting benefits.
23 7/22/2004
So…. what can your computational linguistics model in an intelligent tutor?
What problem is important to solve?
  Language models predict word sequences for a task.
  Domain models describe skills to learn.
  Production models describe student behavior.
  Student models estimate a student’s skills.
  Pedagogical models guide tutorial decisions.
  …
What data is available to train on?
What method is suitable to apply?
What result is appropriate to evaluate?
24 7/22/2004
…Well I got a hammer
Well I got a hammer,
And I got a bell,
And I got a song to sing, all over this land.
It’s the hammer of Justice,
It’s the bell of Freedom,
It’s the song about Love between my brothers and my sisters,
All over this land.
25 7/22/2004
Conclusions…
See papers & videos at www.cs.cmu.edu/~listen.
Muchas gracias, Molto grazie, Obrigado, Merci beaucoup, Danke schön, Dank U well, Spaseeba, Blagodaria,
Tak, Todah rabah, Shukra, Efcharisto, Xeh-xeh, Arigato gozaymas, Kop-kun krap, Thank you!
Questions?