Phylogenetic Inference for Language
Nicholas Andrews, Jason Eisner, Mark Dredze
Department of Computer Science, CLSP, HLTCOE
Johns Hopkins University
Baltimore, Maryland 21218
April 23, 2013
Outline
1 Phylogenetic inference?
2 Generative model
3 A sampler sketch
4 Variational EM
5 Experiments
Phylogenetic inference?
Language evolution: e.g. sound change (Bouchard-Côté et al., 2007)
Phylogenetic inference?
Bibliographic entry variation:
Steven Abney, Robert E. Schapire, & Yoram Singer (1999). Boosting applied to tagging and PP attachment. Proc. EMNLP-VLC. New Brunswick, New Jersey: Association for Computational Linguistics
  → (abbreviate names) Abney, S., Schapire, R. E., & Singer, Y. (1999). Boosting applied to tagging and PP attachment. Proc. EMNLP-VLC. New Brunswick, New Jersey: Association for Computational Linguistics
    → (initials first; shorten to ACL) S. Abney, R. E. Schapire & Y. Singer (1999). Boosting applied to tagging and PP attachment. In Proc. EMNLP-VLC. New Brunswick, New Jersey. ACL.
    → (delete location, shorten venue) Abney, S., Schapire, R. E., & Singer, Y. (1999). Boosting applied to tagging and PP attachment. EMNLP.
Phylogenetic inference?
Paraphrase:
Papa ate the caviar
  → (substitute "devoured") Papa devoured the caviar
    → (active to passive) The caviar was devoured by papa
  → (add "with a spoon") Papa ate the caviar with a spoon
Phylogenetic inference?
One Entity, Many Names
Qaddafi, Muammar
Al-Gathafi, Muammar
al-Qadhafi, Muammar
Al Qathafi, Mu’ammar
Al Qathafi, Muammar
El Gaddafi, Moamar
El Kadhafi, Moammar
El Kazzafi, Moamer
El Qathafi, Mu’Ammar
[Arabic-script spellings of the same name]
(Spence et al., NAACL 2012)
Phylogenetic inference?
In each example, there are systematic changes over time:
• Sound change: assimilation, metathesis, etc.
• Bibliographic variation: typos, abbreviations, punctuation, etc.
• Paraphrase: synonyms, voice change, re-arrangements, etc.
• Name variation: nicknames, titles, initials, etc.
This talk: name variation
What’s a name phylogeny?
A phylogeny is a directed tree rooted at ♦
Khawaja Gharibnawaz Muinuddin Hasan Chisty
Khwaja Gharib Nawaz
Khwaja Muin al-Din Chishti
Ghareeb Nawaz
Khwaja Moinuddin Chishti
Khwaja gharibnawaz
Muinuddin Chishti
Figure: A cherry-picked fragment of a phylogeny learned by our model.
Objects in the model
Names are mentioned in context:
Observed?  Description     Example
yes        Name            Justin
no         Parent          x13
no         Entity          e44 (= Justin Bieber)
yes        Type            person
no         Topic           6 (= music)
yes        Document        d20
yes        Language        English
yes        Token position  100
no         Index           729
Generative model
Step 1: Sample a topic z at each position in each document, for all documents in the corpus (this is just like latent Dirichlet allocation, LDA):
z1 z2 z3 z4 z5 ...
Step 2: Sample either (1) a context word or (2) a named-entity type at each position, conditioned on the topic:
Beliebers held up infinity signs at PERSON ...
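Steps 1 and 2 can be sketched as follows. This is only an illustration under stated assumptions: the function name `sample_document`, the per-document topic probabilities, and the per-topic emission tables are hypothetical stand-ins, and the priors that LDA places on these distributions are omitted.

```python
import random

def sample_document(length, topic_probs, emit_probs, rng):
    """Sample a (topic, symbol) pair at each position of one document.

    topic_probs: probability of each topic in this document (assumed given).
    emit_probs:  one dict per topic mapping a symbol to its probability; a
                 symbol is a context word or an entity type such as "PERSON".
    """
    topics = list(range(len(topic_probs)))
    positions = []
    for _ in range(length):
        # Step 1: sample the topic at this position.
        z = rng.choices(topics, weights=topic_probs)[0]
        # Step 2: sample a context word or an entity type given the topic.
        symbols = list(emit_probs[z])
        weights = [emit_probs[z][s] for s in symbols]
        positions.append((z, rng.choices(symbols, weights=weights)[0]))
    return positions

doc = sample_document(
    6,
    topic_probs=[1.0, 0.0],
    emit_probs=[{"signs": 0.5, "PERSON": 0.5}, {"election": 1.0}],
    rng=random.Random(0),
)
```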
Generative model
Step 3: For the nth named-entity mention y, pick a parent x:
1 Pick ♦ with probability $\alpha / (n + \alpha)$ (edge ♦ → PERSON_n)
2 Pick a previous mention x with probability proportional to $\exp(\phi \cdot f(x, y))$ (edge x → PERSON_n)
Features of x and y: topic, entity type, language
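A minimal sketch of this parent choice is given below. It assumes that the probability mass left over after the root option is split among the previous mentions in proportion to exp(φ · f(x, y)); the function names, the dictionary representation of a mention, and the binary "same topic / type / language" features are hypothetical choices made for the illustration.

```python
import math

ROOT = "♦"

def shared_features(x, y):
    """Hypothetical binary features: which attributes x and y have in common."""
    return [f"same_{k}" for k in ("topic", "type", "language") if x[k] == y[k]]

def parent_distribution(n, previous, y, alpha, phi, features):
    """Return (candidates, probabilities) for the parent of mention y."""
    if not previous:
        return [ROOT], [1.0]
    p_root = alpha / (n + alpha)
    # phi . f(x, y) for binary features is the sum of the active weights.
    scores = [math.exp(sum(phi.get(f, 0.0) for f in features(x, y)))
              for x in previous]
    total = sum(scores)
    probs = [p_root] + [(1.0 - p_root) * s / total for s in scores]
    return [ROOT] + list(previous), probs

prev = [
    {"topic": 6, "type": "person", "language": "English"},
    {"topic": 2, "type": "person", "language": "English"},
]
y = {"topic": 6, "type": "person", "language": "English"}
cands, uniform = parent_distribution(2, prev, y, 1.0, {}, shared_features)
_, biased = parent_distribution(2, prev, y, 1.0, {"same_topic": 1.0}, shared_features)
```

With all weights at zero the two previous mentions are equally likely; a positive weight on `same_topic` shifts mass toward the mention that shares y's topic.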
Generative model
Step 4: Generate a name conditioned on the selected parent
1 If the parent is ♦, generate a name from scratch (♦ → Justin Bieber)
2 Otherwise:
  • copy with probability 1 − µ (Justin Bieber → Justin Bieber)
  • mutate with probability µ (Justin Bieber → J.B.)
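The copy-or-mutate choice can be sketched as below. The callbacks `new_name` and `mutate` are hypothetical placeholders for the from-scratch name model and the mutation model; the `initials` mutation is a toy used only to reproduce the slide's example.

```python
import random

def generate_name(parent_name, mu, new_name, mutate, rng):
    """Generate a mention's name given its parent's name (None stands for ♦)."""
    if parent_name is None:
        return new_name(rng)              # parent is ♦: generate from scratch
    if rng.random() < mu:
        return mutate(parent_name, rng)   # mutate with probability mu
    return parent_name                    # copy with probability 1 - mu

def initials(name, rng):
    """Toy mutation: abbreviate every word to its initial."""
    return "".join(word[0] + "." for word in name.split())

rng = random.Random(0)
fresh = generate_name(None, 0.5, lambda r: "Justin Bieber", initials, rng)
copied = generate_name("Justin Bieber", 0.0, None, initials, rng)
mutated = generate_name("Justin Bieber", 1.0, None, initials, rng)
```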
Generative model
Name variation as mutations
“Mutations” capture different types of name variation:
1. Transcription errors: Barack → barack
2. Misspellings: Barack → Barrack
3. Abbreviations: Barack Obama → Barack O.
4. Nicknames: Barack → Barry
5. Dropping words: Barack Obama → Barack
Generative model
Mutation via probabilistic finite-state transducers
The mutation model is a probabilistic finite-state transducer with four character operations: copy, substitute, delete, insert
• Character operations are conditioned on the right input character
• Latent regions of contiguous edits
• Back-off smoothing
Transducer parameters θ determine the probability of being in different regions, and of the different character operations
Generative model
Example: Mutating a name
Mr. Robert Kennedy → Mr. Bobby Kennedy
Input: M r . _ R o b e r t _ K e n n e d y $
The output is built left to right:
1. M r . _ [                              beginning of edit region
2. M r . _ [B                             1 substitution operation: (R, B)
3. M r . _ [B o b                         2 copy operations: (o, o), (b, b)
4. M r . _ [B o b                         3 deletion operations: (e, ε), (r, ε), (t, ε)
5. M r . _ [B o b b y                     2 insertion operations: (ε, b), (ε, y)
6. M r . _ [B o b b y]                    end of edit region
7. M r . _ [B o b b y] _ K e n n e d y $  remaining characters copied
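The edit sequence in this example can be replayed with a few lines of code. This is only a sketch of the string operations: each edit is written as an (input, output) pair with the empty string playing the role of ε, and the edit-region brackets, the end marker $, and all probabilities are left out. The helper names are hypothetical.

```python
EPS = ""  # stands for ε

def apply_edits(source, edits):
    """Replay (input, output) edit pairs over source and return the output string."""
    pos, out = 0, []
    for inp, outp in edits:
        if inp != EPS:
            # Copy, substitute and delete all consume one input character.
            if source[pos:pos + len(inp)] != inp:
                raise ValueError(f"edit {(inp, outp)!r} does not match input at {pos}")
            pos += len(inp)
        out.append(outp)  # deletions append the empty string
    if pos != len(source):
        raise ValueError("edits did not consume the whole input")
    return "".join(out)

def copy(text):
    """Copy operations for every character of text."""
    return [(c, c) for c in text]

edits = (
    copy("Mr. ")
    + [("R", "B")]                          # 1 substitution
    + copy("ob")                            # 2 copies
    + [("e", EPS), ("r", EPS), ("t", EPS)]  # 3 deletions
    + [(EPS, "b"), (EPS, "y")]              # 2 insertions
    + copy(" Kennedy")
)
result = apply_edits("Mr. Robert Kennedy", edits)
```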
Inference
The latent variables in the model are:
• The spanning tree over tokens p
• The token permutation i
• The topics of all named-entity and context tokens z
(The mutation model also has latent alignments.)
Inference requires marginalizing over the latent variables:
$$\Pr_{\phi,\theta}(x) = \sum_{p,i,z} \Pr_{\phi,\theta}(x, z, i, p)$$
This sum is intractable to compute, but we can sample from the posterior and replace the sum with an average over N samples:
$$\Pr_{\phi,\theta}(x) \approx \frac{1}{N} \sum_{n=1}^{N} \Pr_{\phi,\theta}(x, z_n, i_n, p_n)$$
A block sampler
Key idea: sampling (p, i, z) jointly is hard, but sampling from the conditional for each variable is easy(ier)
Procedure:
• Initialize (p, i, z).
• For n = 1 to N:
  1 Resample a permutation i given all other variables.
  2 Resample the topic vector z, similarly.
  3 Resample the phylogeny p, similarly.
  4 Output the current sample (p, i, z).
Steps 1 and 2 are Metropolis-Hastings proposals
Sampling topics
Step 1: Run belief propagation with messages M_ij directed from the leaves to the root ♦
(Figure: the tree ♦ → x, x → y, x → z, with messages M_yx and M_zx passed up to x.)
Step 2: Sample topics z from ♦ downwards proportional to the belief at each vertex, conditioned on previously sampled topics
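The two steps amount to an upward pass followed by top-down sampling on a tree. The sketch below assumes a generic pairwise model: `unary[v][t]` is a local score for vertex v taking topic t and `pair[s][t]` is the compatibility of a parent topic s with a child topic t. Both tables, and giving the root its own topic variable, are assumptions of the illustration rather than the factors of the actual model.

```python
import random

def sample_topics(children, root, unary, pair, rng):
    """Upward belief propagation, then downward sampling of one topic per vertex."""
    k = len(pair)
    inward = {}  # inward[v][t] = unary score of v times the messages from v's children

    def upward(v):
        scores = list(unary[v])
        for c in children.get(v, []):
            upward(c)
            for s in range(k):
                # Message M_{c -> v}(s): sum out the child's topic.
                scores[s] *= sum(pair[s][t] * inward[c][t] for t in range(k))
        inward[v] = scores

    upward(root)
    topics = {root: rng.choices(range(k), weights=inward[root])[0]}
    stack = [root]
    while stack:
        v = stack.pop()
        for c in children.get(v, []):
            # Belief at c, conditioned on the topic already sampled for v.
            weights = [pair[topics[v]][t] * inward[c][t] for t in range(k)]
            topics[c] = rng.choices(range(k), weights=weights)[0]
            stack.append(c)
    return topics

children = {"♦": ["x"], "x": ["y", "z"]}
unary = {"♦": [1, 1], "x": [1, 1], "y": [0, 1], "z": [1, 1]}
agree = [[1, 0], [0, 1]]  # parent and child must share a topic
sampled = sample_topics(children, "♦", unary, agree, random.Random(0))
```

In this toy setting the evidence at leaf y forces topic 1, and the agreement potential propagates that choice to every vertex.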
Sampling permutations
(a) ♦ → x and ♦ → y: compatible with both (x, y) and (y, x).
(b) ♦ → x → y: compatible with a single permutation: (x, y).
Sampling permutations
Each edge between non-root vertices yields a constraint on possible permutations:
Example
The tree ♦ → x, x → y, x → z yields two constraints: x ≺ y and x ≺ z.
Sampling uniformly from the set of permutations respecting these constraints is a simple recursive procedure:
    # unif_shuffle is assumed to return a uniformly random interleaving
    # of the given sequences, keeping the order inside each sequence.
    def unif_perm(u):
        yield u
        for x in unif_shuffle([unif_perm(c) for c in children[u]]):
            yield x
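A runnable version of this procedure is sketched below. The slide does not define the shuffle helper, so `unif_shuffle` here is one possible implementation: it picks the next element from a sequence with probability proportional to that sequence's remaining length, which makes every order-preserving interleaving equally likely. Unlike the pseudocode, the tree and the random generator are passed in as arguments and lists are returned instead of generators.

```python
import random

def unif_shuffle(seqs, rng):
    """Uniformly random interleaving that preserves the order inside each sequence."""
    seqs = [list(s) for s in seqs]
    cursors = [0] * len(seqs)
    remaining = [len(s) for s in seqs]
    out = []
    while sum(remaining) > 0:
        # Choosing in proportion to remaining length gives a uniform interleaving.
        k = rng.choices(range(len(seqs)), weights=remaining)[0]
        out.append(seqs[k][cursors[k]])
        cursors[k] += 1
        remaining[k] -= 1
    return out

def unif_perm(u, children, rng):
    """Uniform random ordering of the subtree at u with every parent before its children."""
    subtrees = [unif_perm(c, children, rng) for c in children.get(u, [])]
    return [u] + unif_shuffle(subtrees, rng)

children = {"♦": ["x", "w"], "x": ["y", "z"]}
perm = unif_perm("♦", children, random.Random(0))
```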
Sampling phylogenies
Conditioned on topics and a permutation of the tokens, sample a parent x for each mention y with probability:
$$\propto \underbrace{\Pr_{\phi}(x, y)}_{\text{affinity model}} \cdot \underbrace{\Pr_{\theta}(x.n, y.n)}_{\text{transducer model}}$$
No cycles, since the mention permutation i is known.
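Because each mention may only choose a parent that precedes it in the permutation, the parents can be sampled independently and the result is always a tree. A sketch, with hypothetical callbacks `affinity`, `transducer`, and `root_weight` standing in for the two model terms and for the weight of attaching to ♦ (which this slide does not spell out):

```python
import random

def sample_parents(order, affinity, transducer, root_weight, rng):
    """Sample a parent for each mention; order lists the mentions in generation order.

    Returns a dict child -> parent, where None stands for the root ♦.
    """
    parents = {}
    for k, y in enumerate(order):
        earlier = order[:k]  # only earlier mentions are candidates, so no cycles
        candidates = [None] + list(earlier)
        weights = [root_weight(y)] + [affinity(x, y) * transducer(x, y) for x in earlier]
        parents[y] = rng.choices(candidates, weights=weights)[0]
    return parents

order = ["m1", "m2", "m3", "m4"]
parents = sample_parents(
    order,
    affinity=lambda x, y: 1.0,
    transducer=lambda x, y: 1.0,
    root_weight=lambda y: 0.5,
    rng=random.Random(0),
)
```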
A simplified model
The sampler is still running.
We report experiments from our EMNLP 2012 paper, plus follow-up experiments, which use a simpler model:
• No context/topics: only the transducer parameters θ need to be estimated
• Type-level inference and supervision: vertices in the phylogeny represent distinct name types rather than name tokens
Inference
Inference
Input: An unaligned corpus of names (“bag-of-words”)
• The order in which the tokens were generated is unknown
• No “inputs” or “outputs” are known for the mutation model
Barack Obama, Obama, President Barack Obama, Barack, Barrack, barack obama,
Hillary Clinton, Clinton, Bill Clinton, bill, Bill, Barry, Vice President Clinton,
Billy, Hillary, will clinton, Hillary Rodham Clinton, Mitt Romney, Barack Obama Sr,
Romney, Willard M. Romney, Governor Mitt Romney, Mr. Romney, mitt, Mitt rommey,
clinton, William Clinton, barak, President Bill Clinton, President,
Barack H. Obama, Ms. Clinton
Output: A distribution over name phylogenies parametrized by transducer parameters θ
Inference
Type phylogeny vs token phylogeny
The generative model is over tokens (name mentions)
(Figure: a token-level phylogeny over the mentions Ehud Barak, President Barack Obama, Secretary of State Hillary Clinton, Barack Obama (twice), Hillary Clinton (twice), Clinton, Obama, Barak, Barack, and Barry (twice).)
But we do type-level inference for the following reasons:
1. Allows faster inference
2. Allows type-level supervision
Inference
Type phylogeny vs token phylogeny
We collapse all copy edges into a single vertex
(Figure: the collapsed phylogeny over President Barack Obama, Secretary of State Hillary Clinton, BARACK OBAMA (2), HILLARY CLINTON (2), Clinton, Obama, Barack, BARRY (2), Ehud Barak, Barak, and Barry; a number in parentheses is the token count of a collapsed vertex.)
• The first token in each collapsed vertex is a mutation, and the rest are copies
• Every edge in the phylogeny now corresponds to a mutation
• Approximation: disallow multiple tokens of the same type to be derived from mutations
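Collapsing copy edges can be sketched as follows, assuming the token-level tree is given as a list of names and a parallel list of parent indices (both hypothetical representations). A token whose name equals its parent's name is merged into the parent's vertex, so each collapsed vertex is identified by its first (mutated or newly generated) token.

```python
from collections import Counter

def collapse_copies(names, parent):
    """Collapse copy edges of a token-level phylogeny.

    names[i]:  name string of token i.
    parent[i]: index of token i's parent, or None if the parent is ♦.
    Returns (counts, edges): counts maps each representative token to the number
    of tokens merged into it; edges holds (parent_vertex, child_vertex) pairs.
    """
    def rep(i):
        # Walk up through copy edges to the first token of this vertex.
        while parent[i] is not None and names[parent[i]] == names[i]:
            i = parent[i]
        return i

    counts = Counter(rep(i) for i in range(len(names)))
    edges = {(None if parent[r] is None else rep(parent[r]), r) for r in counts}
    return counts, edges

names = ["Barack Obama", "Barack Obama", "Obama", "Barry", "Barry"]
parent = [None, 0, 1, 0, 3]
counts, edges = collapse_copies(names, parent)
```

Here the two "Barack Obama" tokens and the two "Barry" tokens each become one vertex with count 2, and every remaining edge joins vertices with different names.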
Inference
Edge weights
• New names: edges from ♦ to a name x:
δ(x | ♦) = α · p(x | ♦)
• Mutations: edges from a name x to a name y:
$$\delta(y \mid x) = \mu \cdot p(y \mid x) \cdot \frac{n_x}{n_y + 1}$$
Approximation: Edge weights are not quite edge-factored. We are making an approximation of the form
$$\mathbb{E}\Big[\prod_{y} \delta(y \mid \mathrm{pa}(y))\Big] \approx \prod_{y} \mathbb{E}\big[\delta(y \mid \mathrm{pa}(y))\big]$$
Inference
Inference via EM
Iterate until convergence:
1. E-step: Given θ, compute a distribution over name phylogenies
2. M-step: Re-estimate transducer parameters θ given marginal edge probabilities.
• This step sums over alignments for each (x, y) string pair using forward-backward
• Each (x, y) pair may be viewed as a training example weighted by the marginal probability of the edge from x to y
Inference
E-step: marginalizing over latent variables
The latent variables in the model are:
1. Name phylogeny (spanning tree) relating names as inputs and/or outputs
2. Character alignments from potential input names x to output names y
We use the Matrix-Tree theorem for directed graphs (Tutte, 1984) to efficiently evaluate marginal probabilities:
1. Partition function (sum over phylogenies)
2. Edge marginals
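Both quantities can be read off a Laplacian matrix built from the edge weights. The sketch below is a generic illustration of the directed Matrix-Tree computation, not the authors' implementation: it assumes weights `root_w[j]` for edges ♦ → j and `w[i][j]` for edges i → j, lets ♦ have any number of children, and uses exact fractions with a small Gauss-Jordan routine so that it needs no external libraries. All function names are hypothetical.

```python
from fractions import Fraction

def inverse_and_det(matrix):
    """Gauss-Jordan elimination with exact fractions; returns (inverse, determinant)."""
    n = len(matrix)
    a = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(matrix)]
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            raise ValueError("singular Laplacian: no tree has nonzero weight")
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        scale = 1 / a[col][col]
        a[col] = [v * scale for v in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0:
                factor = a[r][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [row[n:] for row in a], det

def tree_marginals(root_w, w):
    """Partition function and edge marginals over trees rooted at ♦."""
    n = len(root_w)
    lap = [[0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal: total weight of edges entering j (from ♦ or another name).
        lap[j][j] = root_w[j] + sum(w[i][j] for i in range(n) if i != j)
        for i in range(n):
            if i != j:
                lap[i][j] = -w[i][j]
    inv, z = inverse_and_det(lap)
    root_m = [root_w[j] * inv[j][j] for j in range(n)]
    edge_m = [[w[i][j] * (inv[j][j] - inv[j][i]) if i != j else Fraction(0)
               for j in range(n)] for i in range(n)]
    return z, root_m, edge_m

# Two names: ♦→0 has weight 1, ♦→1 weight 2, 0→1 weight 3, 1→0 weight 4.
z, root_m, edge_m = tree_marginals([1, 2], [[0, 3], [4, 0]])
```

For this two-name example the three possible trees have weights 1·2, 1·3, and 2·4, so the partition function is 13 and each name's incoming marginals sum to one.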
Data
• We collected a corpus of Wikipedia redirect strings used as examples of name variations
• Filtered down to a subset of 77489 people from English Wikipedia (examples on the next slide)
• The frequency of each variation is estimated using the Google crosswiki dataset (Spitkovsky and Chang, 2012):
  • Dictionary of anchor strings linking to English Wikipedia articles
  • Collected “by crawling a reasonably large approximation of the entire web”
Example Wikipedia redirects
Ho Chi Minh | Ho chi mihn | Ho-Chi Minh | Ho Chih-minh
Guy Fawkes | Guy fawkes | Guy faux | Guy foxe
Bill Gates | Lord Billy | William Gates III | William H. Gates
Billll Clinton | William J. Blythe IV | William Clinton | President Clinton
Incorporating supervision
Type-level supervision is incorporated by tagging vertices with unique IDs and enforcing that they agree from parent to child:
[Figure: legend distinguishing tagged and untagged vertices; two example parent-child pairs, Bill Gates / William Gates and Bill Gates / Bill Clinton.]
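One way to read the agreement constraint, as a minimal sketch: each vertex optionally carries an entity ID, and a parent-to-child edge is ruled out only when both ends are tagged and the tags differ. The function name `edge_allowed`, the use of `None` for untagged vertices, the example IDs, and the choice to let untagged vertices pair with anything are assumptions for illustration, not details given on the slide.

```python
from typing import Optional


def edge_allowed(parent_tag: Optional[str], child_tag: Optional[str]) -> bool:
    """Return True if a parent->child edge is consistent with the entity tags.

    Assumption: an untagged vertex (None) is compatible with any neighbour;
    two tagged vertices must carry the same ID.
    """
    if parent_tag is None or child_tag is None:
        return True
    return parent_tag == child_tag


# Hypothetical tags: E1 and E2 are made-up entity IDs.
tags = {
    "Bill Gates": "E1",
    "William Gates": "E1",
    "Bill Clinton": "E2",
    "Lord Billy": None,  # untagged
}

for parent, child in [("Bill Gates", "William Gates"), ("Bill Gates", "Bill Clinton")]:
    print(parent, "->", child, edge_allowed(tags[parent], tags[child]))
```

In a tree search, edges for which `edge_allowed` is false would simply be dropped from the candidate set (or given zero probability) before the best tree is computed.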
![Page 56: Phylogenetic Inference for Languagejason/papers/andrews+al.emnlp12...Justin Bieber copy with probability 1 Justin Bieber J.B. mutate with probability Generative model Name variation](https://reader036.fdocuments.net/reader036/viewer/2022071219/6057951714da1656f24d54a3/html5/thumbnails/56.jpg)
Experiment 1: Evaluating the transducer
Procedure:
• At train time:
1 Estimate the transducer parameters θ
• At test time:
1 For each name x in the test set, rank all other names y by the transducer probability Pr_θ(y | x)
2 Compute the mean reciprocal rank (MRR) over all names
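The test-time procedure can be sketched as follows. This is an illustration under stated assumptions: the reciprocal rank for x is taken at the first ranked name that shares x's entity, names with no same-entity partner are skipped, and `score` is a stand-in string similarity from `difflib`, not the transducer probability Pr_θ(y | x) from the talk. The helper names and the toy data are hypothetical.

```python
from difflib import SequenceMatcher


def score(y, x):
    # Stand-in for Pr_theta(y | x). This is NOT the transducer from the talk.
    return SequenceMatcher(None, x.lower(), y.lower()).ratio()


def mean_reciprocal_rank(names, entity_of, score_fn=score):
    """Average of 1/rank of the first same-entity name, over all query names."""
    total = 0.0
    counted = 0
    for x in names:
        others = [y for y in names if y != x]
        if not any(entity_of[y] == entity_of[x] for y in others):
            continue  # no correct answer exists for x, so skip it
        # Rank all other names by score, best first.
        ranked = sorted(others, key=lambda y: score_fn(y, x), reverse=True)
        rank = next(
            i for i, y in enumerate(ranked, start=1) if entity_of[y] == entity_of[x]
        )
        total += 1.0 / rank
        counted += 1
    return total / counted if counted else 0.0


# Toy data; the entity labels are made up for illustration.
entity_of = {
    "Bill Gates": "gates",
    "William H. Gates": "gates",
    "Guy Fawkes": "fawkes",
    "Guido Fawkes": "fawkes",
}
print(mean_reciprocal_rank(list(entity_of), entity_of))
```

Passing a different `score_fn(y, x)` swaps in another ranking function without changing the evaluation loop.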
![Page 57: Phylogenetic Inference for Languagejason/papers/andrews+al.emnlp12...Justin Bieber copy with probability 1 Justin Bieber J.B. mutate with probability Generative model Name variation](https://reader036.fdocuments.net/reader036/viewer/2022071219/6057951714da1656f24d54a3/html5/thumbnails/57.jpg)
Experiment 1: Evaluating the transducer
[Bar chart: MRR on the y-axis (0 to 0.90). Bars, in label order: Jaro-Winkler, Levenshtein, 10 entities, 10+unlabeled, Unsupervised, 1500 entities. Extracted bar values: 0.803, 0.763, 0.764, 0.741, 0.642, 0.611 (value-to-bar pairing not preserved).]
![Page 58: Phylogenetic Inference for Languagejason/papers/andrews+al.emnlp12...Justin Bieber copy with probability 1 Justin Bieber J.B. mutate with probability Generative model Name variation](https://reader036.fdocuments.net/reader036/viewer/2022071219/6057951714da1656f24d54a3/html5/thumbnails/58.jpg)
Experiment 2: Evaluating the phylogeny
Step 1: Estimate θ via EM on the training corpus
Step 2: Find the highest-scoring tree (6)
Input: "bag of words": William H. Gates, Lord Billy, Guy Fawkes, Bill Gates, Guido Fawkes, President Bill Clinton
Output: 1-best tree
[Figure: tree rooted at ♦ over Guy Fawkes, Guido Fawkes, President Bill Clinton, Lord Billy, William H. Gates, Bill Gates.]
(6) O(m log n) for graphs of n vertices and m edges
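To make the objective in Step 2 concrete, here is a small sketch that finds the highest-scoring tree by exhaustive search over parent assignments. It is not the O(m log n) procedure the footnote refers to, and it is only feasible for a handful of names. The per-edge log-score, the fixed root score of log 0.05, and the `difflib` similarity are hypothetical stand-ins, not the model from the talk.

```python
import math
from difflib import SequenceMatcher
from itertools import product

ROOT = "♦"


def reaches_root(parents):
    """True if following parent links from every node ends at ROOT (no cycles)."""
    for node in parents:
        seen = set()
        while node != ROOT:
            if node in seen:
                return False
            seen.add(node)
            node = parents[node]
    return True


def best_tree(names, log_score):
    """Try every parent assignment; keep the valid tree with the highest total.

    log_score(child, parent) -> float, where parent may be ROOT.
    Names are assumed to be unique.
    """
    best_total, best_parents = -math.inf, None
    candidates = [[p for p in [ROOT] + names if p != c] for c in names]
    for choice in product(*candidates):
        parents = dict(zip(names, choice))
        if not reaches_root(parents):
            continue  # contains a cycle, so it is not a tree
        total = sum(log_score(c, p) for c, p in parents.items())
        if total > best_total:
            best_total, best_parents = total, parents
    return best_parents, best_total


def toy_log_score(child, parent):
    # Hypothetical scores: a fixed cost for attaching to the root,
    # string similarity otherwise.
    if parent == ROOT:
        return math.log(0.05)
    ratio = SequenceMatcher(None, parent.lower(), child.lower()).ratio()
    return math.log(max(ratio, 1e-6))


names = ["Guy Fawkes", "Guido Fawkes", "Bill Gates", "William H. Gates"]
parents, total = best_tree(names, toy_log_score)
for child, parent in parents.items():
    print(parent, "->", child)
```

The search space grows as n^n, which is why an efficient spanning-tree algorithm over the directed graph of candidate edges is needed for real corpora.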
![Page 60: Phylogenetic Inference for Languagejason/papers/andrews+al.emnlp12...Justin Bieber copy with probability 1 Justin Bieber J.B. mutate with probability Generative model Name variation](https://reader036.fdocuments.net/reader036/viewer/2022071219/6057951714da1656f24d54a3/html5/thumbnails/60.jpg)
Experiment 2: Evaluating the phylogeny
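Step 3: Attach each name in the test corpus to its most likely parent in the 1-best tree
[Figure: 1-best tree rooted at ♦ over Guy Fawkes, Guido Fawkes, President Bill Clinton, Lord Billy, William H. Gates, Bill Gates; the test name Mr. Clinton is scored against each candidate parent.]
Root ♦: α · Pr_θ(Mr. Clinton | ♦), where α is the pseudo-count at ♦ and Pr_θ is the transducer probability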
Each non-root candidate parent y for Mr. Clinton is scored by name frequency times transducer probability:
∝ c(William H. Gates) · Pr_θ(Mr. Clinton | William H. Gates)
∝ c(Bill Gates) · Pr_θ(Mr. Clinton | Bill Gates)
∝ c(President Bill Clinton) · Pr_θ(Mr. Clinton | President Bill Clinton)
∝ c(Lord Billy) · Pr_θ(Mr. Clinton | Lord Billy)
∝ c(Guy Fawkes) · Pr_θ(Mr. Clinton | Guy Fawkes)
∝ c(Guido Fawkes) · Pr_θ(Mr. Clinton | Guido Fawkes)
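The attachment rule in Step 3 can be sketched as an argmax over candidate parents. The scoring follows the slides (α times the transducer probability at the root, name frequency times the transducer probability elsewhere); the function name, the counts, the value of α, and the probability table are made-up illustrations, since the transcript does not give them.

```python
ROOT = "♦"


def most_likely_parent(test_name, counts, alpha, pr):
    """Score every candidate parent of test_name and return the best one.

    counts: dict mapping each tree name y to its frequency c(y)
    alpha:  pseudo-count at the root
    pr:     function pr(child, parent) standing in for Pr_theta(child | parent)
    """
    scores = {ROOT: alpha * pr(test_name, ROOT)}
    for name, c in counts.items():
        scores[name] = c * pr(test_name, name)
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("all candidate parents have zero score")
    # Normalise the proportional scores into a distribution over parents.
    posterior = {parent: s / total for parent, s in scores.items()}
    best = max(posterior, key=posterior.get)
    return best, posterior


# Hypothetical numbers for illustration only.
counts = {"President Bill Clinton": 2, "Bill Gates": 5, "Guy Fawkes": 1}
toy_pr = {
    ROOT: 0.001,
    "President Bill Clinton": 0.1,
    "Bill Gates": 0.01,
    "Guy Fawkes": 0.0001,
}

best, posterior = most_likely_parent(
    "Mr. Clinton", counts, alpha=1.0, pr=lambda child, parent: toy_pr[parent]
)
print(best)
```

Because the scores are only proportional, normalising them is optional for the argmax itself; it is included here so the result can also be read as a distribution over parents.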
![Page 67: Phylogenetic Inference for Languagejason/papers/andrews+al.emnlp12...Justin Bieber copy with probability 1 Justin Bieber J.B. mutate with probability Generative model Name variation](https://reader036.fdocuments.net/reader036/viewer/2022071219/6057951714da1656f24d54a3/html5/thumbnails/67.jpg)
Experiment 2: Evaluating the phylogeny
Step 4: Calculate macro-averaged precision and recall for each test name
[Figure: the 1-best tree rooted at ♦ (Guy Fawkes, Guido Fawkes, President Bill Clinton, Lord Billy, William H. Gates, Bill Gates) with the test name Mr. Clinton attached; three successive builds mark different outcomes.]
Build 1: Precision = 2/3, Recall = 2/2
Build 2: Precision = 1/3, Recall = 1/2
Build 3: Precision = 1/1, Recall = 1/2
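A minimal sketch of the arithmetic in Step 4, assuming that each test name is scored by comparing a predicted set of names (for example, the names it ends up grouped with in the tree) against a gold set of names for the same entity. How those sets are read off the tree is not recoverable from the transcript, so the sets below are placeholders chosen only to reproduce the three fractions shown on the slides.

```python
def precision_recall(predicted, gold):
    """Set-based precision and recall for one test name."""
    overlap = len(predicted & gold)
    precision = overlap / len(predicted) if predicted else 0.0
    recall = overlap / len(gold) if gold else 0.0
    return precision, recall


def macro_average(pairs):
    """Average precision and recall over test names, each name weighted equally."""
    pairs = list(pairs)
    n = len(pairs)
    if n == 0:
        return 0.0, 0.0
    return sum(p for p, _ in pairs) / n, sum(r for _, r in pairs) / n


# Placeholder sets (a, b, c, d are anonymous names) matching the slide fractions.
gold = {"a", "b"}
cases = [
    ({"a", "b", "c"}, gold),  # precision 2/3, recall 2/2
    ({"a", "c", "d"}, gold),  # precision 1/3, recall 1/2
    ({"a"}, gold),            # precision 1/1, recall 1/2
]
per_name = [precision_recall(pred, g) for pred, g in cases]
print(per_name)
print(macro_average(per_name))
```

Macro-averaging gives every test name the same weight regardless of how many names its entity has, which is why precision and recall are computed per name first and averaged afterwards.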
![Page 70: Phylogenetic Inference for Languagejason/papers/andrews+al.emnlp12...Justin Bieber copy with probability 1 Justin Bieber J.B. mutate with probability Generative model Name variation](https://reader036.fdocuments.net/reader036/viewer/2022071219/6057951714da1656f24d54a3/html5/thumbnails/70.jpg)
Baselines
We compare to two baselines:
1 Flat tree
[Figure: two trees rooted at ♦, labeled "Flat tree: depth ≤ 2" and "Unrestricted tree".]
2 Weak transducer
• No latent edit regions
• Only 3 degrees of freedom: the weights of different edit operations
![Page 72: Phylogenetic Inference for Languagejason/papers/andrews+al.emnlp12...Justin Bieber copy with probability 1 Justin Bieber J.B. mutate with probability Generative model Name variation](https://reader036.fdocuments.net/reader036/viewer/2022071219/6057951714da1656f24d54a3/html5/thumbnails/72.jpg)
Comparison to flat tree
[Figure series: precision-recall curves (x-axis Recall, y-axis Precision, both 0.0 to 1.0) for the full model vs. the flat-tree baseline at 0%, 27%, 34%, 47%, 53%, 63%, and 100% supervision, followed by one plot overlaying all supervision levels.]
![Page 80: Phylogenetic Inference for Languagejason/papers/andrews+al.emnlp12...Justin Bieber copy with probability 1 Justin Bieber J.B. mutate with probability Generative model Name variation](https://reader036.fdocuments.net/reader036/viewer/2022071219/6057951714da1656f24d54a3/html5/thumbnails/80.jpg)
Comparison to weak transducer
[Figure series: precision-recall curves (x-axis Recall, y-axis Precision, both 0.0 to 1.0) for the full model vs. the weak-transducer baseline at 0%, 27%, 34%, 47%, 53%, 63%, and 100% supervision, followed by one plot overlaying all supervision levels.]
![Page 88: Phylogenetic Inference for Languagejason/papers/andrews+al.emnlp12...Justin Bieber copy with probability 1 Justin Bieber J.B. mutate with probability Generative model Name variation](https://reader036.fdocuments.net/reader036/viewer/2022071219/6057951714da1656f24d54a3/html5/thumbnails/88.jpg)
The End
Khawaja Gharibnawaz Muinuddin Hasan Chisty
Khwaja Gharib Nawaz
Khwaja Muin al-Din Chishti
Ghareeb Nawaz
Khwaja Moinuddin Chishti
Khwaja gharibnawaz
Muinuddin Chishti
Thanks! Questions?