Bayesian models of cross-situational word learning
Michael C. Frank, Noah Goodman, Josh Tenenbaum (MIT)
Thanks to Kathy Hirsh-Pasek and Roberta Golinkoff for valuable discussion. Also thanks to Vikash Mansinghka, Ted Gibson, tedlab, and cocosci for
comments and the Jacob Javits Foundation for funding.
Word-learning in action
The problem of word learning
words: “blue rings”
objects: rings, big bird
words: “and green rings”
objects: rings, big bird
words: “and yellow rings”
objects: rings, big bird
words: “Bigbird! Do you want to hold the rings?”
objects: big bird
In any one situation, children hear many words and see many objects
One possible solution
Apply a cross-situational strategy to learn mappings (but this is harder than it looks)
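A minimal sketch of why the strategy is harder than it looks, assuming the simplest possible learner: one that only counts how often each word co-occurs with each object. The four situations are transcribed from the example above (lowercased, punctuation dropped); the function name `cooccurrence_counts` is hypothetical.

```python
from collections import Counter
from itertools import product

# The four example situations: (words heard, objects in view).
situations = [
    (["blue", "rings"], ["rings", "big bird"]),
    (["and", "green", "rings"], ["rings", "big bird"]),
    (["and", "yellow", "rings"], ["rings", "big bird"]),
    (["bigbird", "do", "you", "want", "to", "hold", "the", "rings"], ["big bird"]),
]

def cooccurrence_counts(situations):
    """Count, for every (word, object) pair, the situations containing both."""
    counts = Counter()
    for words, objects in situations:
        # Use sets so a repeated word or object counts once per situation.
        for pair in product(set(words), set(objects)):
            counts[pair] += 1
    return counts

counts = cooccurrence_counts(situations)
print(counts[("rings", "rings")])     # word "rings" with the rings object
print(counts[("rings", "big bird")])  # word "rings" with big bird
```

On these four situations the word “rings” co-occurs with big bird 4 times but with the rings only 3 times, so raw counting alone would prefer the wrong mapping.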
Techniques for cross-situational word learning
• Deductive inference: Siskind (1996)
• Translation model: Yu, Ballard, & Aslin (2005), Yu & Ballard (in press)
Outline
• Some facts of word learning
  – Mutual exclusivity
  – Fast-mapping
  – Use of social cues
• Our model: Bayesian word-learner
• Extension: Learning social cues
• Experimental coverage
Three facts of word learning
Mutual exclusivity
By 18-24 months, children will map a novel word onto a novel referent (Markman, 1992; Mervis & Bertrand, 1994)
“Give me the dax!”
Fast mapping
Three- and four-year-olds can learn words from one situation (Carey, 1978; Markson & Bloom, 1997)
“This one is a koba!”
Use of social cues
By 18 months, children distinguish referents from one another using social cues (Hollich, Hirsh-Pasek, & Golinkoff, 2001)
“Look at the modi!”
Outline
• The facts of word learning
• Our model: Bayesian word-learner
  – Model
  – Corpus
  – Comparison models
  – Results
• Extension: Learning social cues
• Experimental coverage
Generative model
[Figure: graphical model, repeated over situations. Nodes: O, objects (observed); W, words (observed); I, things you intend to refer to (unobserved); ℓ, lexicon.]
Generative model: example
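[Figure: the generative model applied to one example situation, with nodes for words (W), objects (O), intention (I), and the lexicon (ℓ). Surviving example labels: “look pretty”, “ball”, “ball bike”.]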
Inference
Bayes’ rule
Parsimony prior on lexicons
Inference technique
• Stochastic search with simulated tempering
• Data-driven proposals drawn from the mutual information of word-object pairings
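The slide names Bayes’ rule and a parsimony prior, but the formulas themselves did not survive. One way they could be written is sketched below. This is an assumption for illustration, not necessarily the authors’ exact notation: C stands for the corpus of situations, the subscript s indexes situations, the exponential form of the prior and its weight α are hypothetical, and the likelihood simply sums over the unobserved intention I in each situation, in line with the generative model.

```latex
% Bayes' rule: posterior over lexicons \ell given the corpus C.
P(\ell \mid C) \;\propto\; P(C \mid \ell)\, P(\ell)

% Parsimony prior: lexicons with more word-object pairs are less probable
% (the exponential form and the weight \alpha are assumptions).
P(\ell) \;\propto\; e^{-\alpha \lvert \ell \rvert}

% Likelihood: product over situations s, summing out the unobserved intention I_s.
P(C \mid \ell) \;=\; \prod_{s} \sum_{I_s \subseteq O_s} P(W_s \mid I_s, \ell)\, P(I_s \mid O_s)
```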
Corpus
• 2x10 min clips from CHILDES-Rollins
• Interaction between mom and infant (~6mo)
• 2528 word tokens of 420 words in 623 sentences
• 24 objects, all toys
Model comparison
• Co-occurrence frequency
• Point-wise mutual information
\[ \mathrm{MI}(W, O) = \frac{p(W, O)}{p(W)\,p(O)} \]
• Translation model, based on IBM model 1 (Yu & Ballard, in press)
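The point-wise mutual information baseline can be sketched as follows, assuming probabilities are estimated as the fraction of situations in which a word, an object, or both appear. The function name `pmi_scores` and the toy corpus are hypothetical, and the code takes the log of the ratio, which ranks word-object pairs in the same order as the ratio itself.

```python
import math
from collections import Counter

def pmi_scores(situations):
    """Return {(word, obj): log[p(w, o) / (p(w) p(o))]} estimated per situation."""
    n = len(situations)
    word_counts, obj_counts, pair_counts = Counter(), Counter(), Counter()
    for words, objects in situations:
        ws, objs = set(words), set(objects)
        word_counts.update(ws)
        obj_counts.update(objs)
        pair_counts.update((w, o) for w in ws for o in objs)
    return {
        (w, o): math.log((c / n) / ((word_counts[w] / n) * (obj_counts[o] / n)))
        for (w, o), c in pair_counts.items()
    }

# Hypothetical toy corpus: (words heard, objects in view).
toy = [
    (["look", "ball"], ["ball", "bike"]),
    (["nice", "ball"], ["ball"]),
    (["look", "bike"], ["bike"]),
]
scores = pmi_scores(toy)
print(scores[("ball", "ball")] > scores[("ball", "bike")])  # True
```

In this toy corpus the word “ball” scores higher with the ball than with the bike, even though both objects are present when it is first heard.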
Results: model comparison
\[ \mathrm{precision} = \frac{\text{correct pairs in lexicon}}{\text{total pairs in lexicon}} \]
\[ \mathrm{recall} = \frac{\text{correct pairs in lexicon}}{\text{total correct pairs}} \]
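The two scores above can be computed directly from sets of word-object pairs. A minimal sketch, assuming a learned lexicon and a gold-standard lexicon are both available as sets; the function name `precision_recall` and the example pairs are hypothetical and are not results from the talk.

```python
def precision_recall(learned, gold):
    """Score a learned lexicon against a gold-standard set of (word, object) pairs."""
    learned, gold = set(learned), set(gold)
    correct = learned & gold  # correct pairs in the learned lexicon
    precision = len(correct) / len(learned) if learned else 0.0
    recall = len(correct) / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical lexicons for illustration only.
learned = {("ring", "ring"), ("hat", "hat"), ("look", "ball"), ("the", "bike")}
gold = {("ring", "ring"), ("hat", "hat"), ("sheep", "sheep")}

precision, recall = precision_recall(learned, gold)
print(precision, recall)
```

Here two of the four learned pairs are correct (precision 0.5), and two of the three gold pairs were found (recall 2/3).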
Results: intuitive analysis
Best lexicon found by search:
Word      Object
baby      book
bigbird   bird
bird      rattle
birdie    duck
book      book
hand      hand
hat       hat
meow      kitty
moocow    cow
oink      pig
on        ring
ring      ring
sheep     sheep
[Figure: most likely intentions]
Also: unlike baseline models, our model is extremely extensible
Outline
• The facts of word learning
• Our model: Bayesian word-learner
• Extension: Learning social cues
  – Corpus
  – Model
  – Preliminary results
• Experimental coverage
Social corpus coding
Coded social cues for each utterance: infant’s hands, eyes, mouth, and touch; mom’s
hands, eyes, and touch
How it works
        I’m looking   Mom looking
Ball    0             1
Bike    1             0
…       …             …
Bag     0             0
Each cue could be caused by base rate or by relevance
Noisy-OR process over base rate and relevance
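A minimal sketch of a noisy-OR for a single social cue, assuming the cue can be switched on either by its base rate (it happens regardless of reference) or, when the object is the intended referent, by its relevance. The function name `cue_probability` and the parameter values are hypothetical.

```python
def cue_probability(is_referent, base_rate, relevance):
    """P(cue is on for an object) under a noisy-OR of base rate and relevance."""
    # The cue stays off only if every active cause fails to trigger it.
    p_off = 1.0 - base_rate
    if is_referent:
        p_off *= 1.0 - relevance
    return 1.0 - p_off

# Hypothetical parameters for a cue such as "Mom looking".
print(cue_probability(False, base_rate=0.1, relevance=0.8))  # about 0.1
print(cue_probability(True, base_rate=0.1, relevance=0.8))   # about 0.82
```

A cue with high relevance and a low base rate is therefore strong evidence about the intended referent, while a cue with relevance near zero is on at its base rate whether or not the object is being talked about.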
Social model framework
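[Figure: graphical model extended with social cues, repeated over situations. Nodes: S, social cues; r, b, relevance and base rate of social cues; O, objects; W, words; I, things you intend to refer to; ℓ, lexicon. Unobserved variables are marked in the figure.]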
Preliminary Results
Model finds appropriate features
Social features allow finding intent in situations without referential words
Outline
• The facts of word learning
• Our model: Bayesian word-learner
• Extension: Learning social cues
• Experimental coverage
  – Mutual exclusivity
  – Fast-mapping
  – Use of social cues
Mutual exclusivity
model shows soft mutual exclusivity
Fast-mapping
model can fast-map: learn a word from a single instance
ruled out on account of “light syntax”: penalty for using a referring word in a non-referring way
Use of social cues
model can learn word meanings based on social cues alone
Conclusions
• Bayesian model of cross-situational word-learning
  – Performed best over a corpus
  – Allows parsing of sentences and interpretation of speaker’s intent
• Social model
  – Model can learn which social cues are relevant to reference
• Experimental coverage
  – Mutual exclusivity
  – Fast-mapping
  – Learning words from social cues