Self-training with Products of Latent Variable Grammars
Zhongqiang Huang, Mary Harper, and Slav Petrov
Overview
Motivation and Prior Related Research
Experimental Setup
Results
Analysis
Conclusions
PCFG-LA Parser [Matsuzaki et al. ’05] [Petrov et al. ’06] [Petrov & Klein ’07]
[Figure labels: Sentence, Parse Tree, Derivations, Parameters]
PCFG-LA Parser
Hierarchical splitting (& merging)
Original node: NP
Split to 2: NP1 NP2
Split to 4: NP1 NP2 NP3 NP4
Split to 8: NP1 NP2 NP3 NP4 NP5 NP6 NP7 NP8
…
Increased model complexity
n-th grammar: grammar trained after n split-merge rounds
[Figure: typical learning curve]
Grammar order selection: use development set
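A minimal sketch of the two ideas on this slide, assuming binary splits with no merging (so the count is an upper bound) and a table of development-set F scores per split-merge round. The function names and the scores in the usage note are hypothetical, not taken from the slides.

```python
def num_subcategories(n_splits: int) -> int:
    """Upper bound on subcategories of one original node after n binary splits.

    Merging (not modeled here) can only reduce this number.
    """
    return 2 ** n_splits


def select_grammar_order(dev_f_by_round: dict) -> int:
    """Pick the split-merge round whose grammar scores best on the dev set.

    Ties are broken in favor of the smaller (less complex) grammar.
    """
    return max(dev_f_by_round, key=lambda n: (dev_f_by_round[n], -n))
```

For example, with made-up dev scores `{5: 89.0, 6: 90.1, 7: 89.8}`, `select_grammar_order` returns round 6, and `num_subcategories(3)` gives the 8 subcategories shown for "Split to 8".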
Max-Rule Decoding (Single Grammar) [Goodman ’98, Matsuzaki et al. ’05, Petrov & Klein ’07]
[Figure: example tree with nodes S, NP, VP]
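The slide's own formulas did not survive in the transcript. As a hedged reminder, max-rule decoding in the cited literature is usually stated along the following lines. The notation (r, P_IN, P_OUT, β, and the subcategory indices x, y, z) is an assumption made here for illustration and is not recovered from the slide.

```latex
% Posterior of an unsplit rule A -> B C spanning (i, k, j),
% obtained by summing out the latent subcategories x, y, z.
r(A \to B\,C,\ i, k, j) \;=\; \frac{1}{P(w)} \sum_{x, y, z}
    P_{\mathrm{OUT}}(A_x, i, j)\,
    \beta(A_x \to B_y\, C_z)\,
    P_{\mathrm{IN}}(B_y, i, k)\,
    P_{\mathrm{IN}}(C_z, k, j)

% The decoder returns the tree whose rules have the largest product of posteriors.
T^{*} \;=\; \arg\max_{T} \prod_{e \in T} r(e)
```

The point of the construction is that the search runs over unsplit trees, so the latent subcategories never have to be chosen explicitly.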
Variability [Petrov ’10]
Max-Rule Decoding (Multiple Grammars) [Petrov ’10]
[Figure label: Treebank]
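A minimal sketch of the scoring step for multiple grammars, assuming each grammar supplies a log posterior for every candidate rule over a span and that the product model multiplies these posteriors (adds them in log space). The dictionary layout and the function name are hypothetical, and the tree search that would consume these scores is left out.

```python
import math


def combine_rule_log_posteriors(per_grammar):
    """Add log posteriors of each (rule, span) item across grammars.

    per_grammar: list of dicts mapping an item to its log posterior under
    one grammar. An item missing from a grammar is treated as having
    probability zero (log posterior -inf), so the product is zero too.
    """
    items = set().union(*(g.keys() for g in per_grammar))
    return {
        item: sum(g.get(item, -math.inf) for g in per_grammar)
        for item in items
    }


# Toy example with two grammars scoring two competing rules over one span.
g1 = {("NP -> DT NN", 0, 1, 2): math.log(0.9), ("NP -> NN NN", 0, 1, 2): math.log(0.1)}
g2 = {("NP -> DT NN", 0, 1, 2): math.log(0.6), ("NP -> NN NN", 0, 1, 2): math.log(0.4)}
combined = combine_rule_log_posteriors([g1, g2])
best_item = max(combined, key=combined.get)
```

In this toy case the first rule wins because both grammars prefer it. A rule that only one grammar rates highly is penalized by the product.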
Product Model Results [Petrov ’10]
Motivation for Self-Training
Self-training (ST)
[Diagram labels: Hand Labeled, Train, Unlabeled Data, Label, Automatically Labeled Data, Train, Select with dev]
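The diagram above can be sketched as one round of self-training. This is a minimal illustration, assuming the final model is retrained on the hand-labeled and automatically labeled data together. `train` and `label` are hypothetical callables supplied by the caller, not functions from any parser toolkit.

```python
def self_train(hand_labeled, unlabeled, train, label):
    """One round of self-training.

    train(trees) -> model and label(model, sentence) -> tree are
    hypothetical callables; any parser with that shape would do.
    """
    hand_labeled = list(hand_labeled)
    base_model = train(hand_labeled)                          # train on hand-labeled data
    auto_labeled = [label(base_model, s) for s in unlabeled]  # label the unlabeled data
    return train(hand_labeled + auto_labeled)                 # retrain on both (assumption)


# Toy stand-ins so the sketch runs: the "model" is just the training-set size.
toy_model = self_train(
    ["t1", "t2"],
    ["s1", "s2", "s3"],
    train=len,
    label=lambda model, sentence: f"auto({sentence})",
)
```

The "Select with dev" step, which would choose among candidate grammars by development-set score, is omitted from this sketch.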
Self-Training Curve
WSJ Self-Training Results [Huang & Harper ’09]
[Figure: axis label F score]
Self-Trained Grammar Variability
[Figure: panels Self-trained Round 6 and Self-trained Round 7]
Summary
Two issues: Variability & Over-fitting
Product model: makes use of variability; over-fitting remains in individual grammars
Self-training: alleviates over-fitting; variability remains in individual grammars
Next step: combine self-training with product models
Experimental Setup
Two genres:
WSJ: Sections 2-21 for training, 22 for dev, 23 for test; 176.9K sentences per self-trained grammar
Broadcast News: WSJ + 80% of BN for training, 10% for dev, 10% for test (see paper)
Training scenarios: train 10 models with different seeds and combine using Max-Rule Decoding
Regular: treebank training with up to 7 split-merge iterations
Self-Training: three methods with up to 7 split-merge iterations
ST-Reg
[Diagram labels: Hand Labeled, Train, Unlabeled Data, Label, Automatically Labeled Data, Train (repeated), Product, Select with dev set]
Multiple Grammars?
Single automatically labeled set by round 6 product
ST-Prod
[Diagram labels: Hand Labeled, Train (repeated), Product, Unlabeled Data, Label, Automatically Labeled Data, Train (repeated), Product]
Use more data?
Single automatically labeled set by round 6 product
ST-Prod-Mult
[Diagram labels: Hand Labeled, Train (repeated), Product, Label (repeated), Product]
10 different automatically labeled sets by round 6 product
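The difference between a single automatically labeled set and several different ones can be sketched as a choice of which set each self-trained grammar sees. This is only an illustration: the function name, the `multiple` flag, and the idea that the sets arrive as a pre-built list are assumptions, and how the sets are drawn from the unlabeled data is not specified here.

```python
def assign_auto_labeled_sets(labeled_sets, n_grammars, multiple):
    """Return, for each of n_grammars, the automatically labeled set it trains on.

    multiple=False: every grammar shares the first set (single-set scenario).
    multiple=True:  grammar i gets its own set i (different-sets scenario).
    """
    if not multiple:
        return [labeled_sets[0]] * n_grammars
    if len(labeled_sets) < n_grammars:
        raise ValueError("need one automatically labeled set per grammar")
    return [labeled_sets[i] for i in range(n_grammars)]
```

With `labeled_sets = ["A", "B", "C"]` and three grammars, the single-set call yields `["A", "A", "A"]` and the multiple-set call yields `["A", "B", "C"]`. In the first case the grammars differ only through their random seeds. In the second they also differ in training data.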
A Closer Look at Regular Results
A Closer Look at Self-Training Results
Analysis of Rule Variance
We measure the average empirical variance of the log posterior probabilities of the rules among the learned grammars over a held-out set S to gauge the diversity among the grammars.
[Equation not preserved in the transcript]
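Since the slide's equation is missing, the following is only a sketch of the kind of quantity described: for each rule occurrence in the held-out set, take the variance of its log posterior across grammars, then average over occurrences. The input layout, the function name, and the use of the population variance (dividing by the number of grammars) are assumptions, and all log posteriors are assumed finite.

```python
from statistics import pvariance


def average_rule_log_posterior_variance(log_posteriors):
    """Mean, over rule occurrences, of the variance across grammars.

    log_posteriors: dict mapping a rule occurrence in the held-out set to a
    list of its log posterior probabilities, one entry per grammar.
    """
    variances = [pvariance(values) for values in log_posteriors.values()]
    return sum(variances) / len(variances)
```

A value near zero would mean the grammars assign nearly identical posteriors to the same rules. Larger values indicate more diverse grammars.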
English Test Set Results (WSJ 23)
System types: Single Parser, Reranker, Product, Parser Combination
Systems (in chart order): [Charniak ’00], [Petrov et al. ’06], [Carreras et al. ’08], [Huang & Harper ’08], This Work, [Petrov ’10], This Work, [Charniak & Johnson ’05], [Huang ’08], [McClosky et al. ’06], [Sagae & Lavie ’06], [Fossum & Knight ’09], [Zhang et al. ’09]
[Figure: scores not preserved in the transcript]
Broadcast News
Conclusions
Very high parse accuracies can be achieved by combining self-training and product models on newswire and broadcast news parsing tasks.
Two important factors:
1. Accuracy of the model used to parse the unlabeled data
2. Diversity of the individual grammars