Improved Inference for Unlexicalized Parsing
![Page 1: Improved Inference for Unlexicalized Parsing](https://reader035.fdocuments.net/reader035/viewer/2022062500/568157e7550346895dc560fb/html5/thumbnails/1.jpg)
Improved Inference for Unlexicalized Parsing
Slav Petrov and Dan Klein
![Page 2: Improved Inference for Unlexicalized Parsing](https://reader035.fdocuments.net/reader035/viewer/2022062500/568157e7550346895dc560fb/html5/thumbnails/2.jpg)
Unlexicalized Parsing
Hierarchical, adaptive refinement [Petrov et al. ‘06]:
1,140 nonterminal symbols
531,200 rewrites
1621 min parsing time
91.2 F1 score on Dev Set (1600 sentences)
[Figure: split hierarchy of the DT tag: DT; DT1 DT2; DT1 … DT4; DT1 … DT8]
![Page 3: Improved Inference for Unlexicalized Parsing](https://reader035.fdocuments.net/reader035/viewer/2022062500/568157e7550346895dc560fb/html5/thumbnails/3.jpg)
1621 min
![Page 4: Improved Inference for Unlexicalized Parsing](https://reader035.fdocuments.net/reader035/viewer/2022062500/568157e7550346895dc560fb/html5/thumbnails/4.jpg)
Coarse-to-Fine Parsing [Goodman ‘97, Charniak & Johnson ‘05]
[Diagram: Treebank → coarse grammar → Parse → Prune → refined grammar → Parse]
Coarse grammar: NP … VP
Refined grammar: NP-dog NP-cat NP-apple VP-run NP-eat …
Refined grammar: NP-17 NP-12 NP-1 VP-6 VP-31 …
![Page 5: Improved Inference for Unlexicalized Parsing](https://reader035.fdocuments.net/reader035/viewer/2022062500/568157e7550346895dc560fb/html5/thumbnails/5.jpg)
Prune?
For each chart item X[i,j], compute posterior probability and prune the item if it is < threshold.
E.g. consider the span 5 to 12:
[Figure: coarse symbols … QP NP VP … over the span, with the corresponding refined symbols beneath]
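The formula on this slide did not survive in the transcript. A sketch of the standard inside/outside form of such a posterior, assuming the notation P_IN and P_OUT for inside and outside scores and a sentence w of length n (these symbols are assumptions, not taken from the slide):

```latex
\[
P\bigl(X[i,j] \mid w\bigr)
  \;=\;
  \frac{P_{\mathrm{IN}}(X,i,j)\,\cdot\,P_{\mathrm{OUT}}(X,i,j)}
       {P_{\mathrm{IN}}(\mathrm{root},0,n)}
  \;<\; \text{threshold}
  \quad\Longrightarrow\quad \text{prune } X[i,j]
\]
```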
![Page 6: Improved Inference for Unlexicalized Parsing](https://reader035.fdocuments.net/reader035/viewer/2022062500/568157e7550346895dc560fb/html5/thumbnails/6.jpg)
1621 min
111 min (no search error)
![Page 7: Improved Inference for Unlexicalized Parsing](https://reader035.fdocuments.net/reader035/viewer/2022062500/568157e7550346895dc560fb/html5/thumbnails/7.jpg)
Multilevel Coarse-to-Fine Parsing [Charniak et al. ‘06]
Add more rounds of pre-parsing
Grammars coarser than X-bar
[Diagram: X → A, B, .. → NP … VP → refined grammar (NP-dog NP-cat NP-apple VP-run NP-eat …)]
![Page 8: Improved Inference for Unlexicalized Parsing](https://reader035.fdocuments.net/reader035/viewer/2022062500/568157e7550346895dc560fb/html5/thumbnails/8.jpg)
Hierarchical Pruning
Consider again the span 5 to 12:
coarse: … QP NP VP …
split in two: … QP1 QP2 NP1 NP2 VP1 VP2 …
split in four: … QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 …
split in eight: … … … … … … … … … … … … … … … … …
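The pruning step for one span can be sketched as follows. This is a minimal illustration, assuming the posteriors are already computed; the function name, the posterior values, and the threshold are hypothetical and not taken from the slides.

```python
def surviving_refinements(posteriors, threshold, splits):
    """Return the refined symbols allowed in the next pass for one span.

    posteriors: symbol -> posterior probability over the span
                (assumed to be precomputed by the current pass).
    splits:     symbol -> list of symbols it is split into at the next level.
    """
    allowed = []
    for symbol, posterior in posteriors.items():
        if posterior < threshold:
            continue  # pruned: none of its refinements are considered
        allowed.extend(splits[symbol])
    return allowed


# Hypothetical posteriors for the span 5 to 12 under the coarse grammar.
coarse = {"QP": 1e-7, "NP": 0.62, "VP": 0.31}
splits = {"QP": ["QP1", "QP2"], "NP": ["NP1", "NP2"], "VP": ["VP1", "VP2"]}

level_one = surviving_refinements(coarse, threshold=1e-4, splits=splits)
print(level_one)  # ['NP1', 'NP2', 'VP1', 'VP2']
```

Repeating the same step with the "split in four" and "split in eight" grammars keeps each pass restricted to refinements of symbols that survived the previous one.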
![Page 9: Improved Inference for Unlexicalized Parsing](https://reader035.fdocuments.net/reader035/viewer/2022062500/568157e7550346895dc560fb/html5/thumbnails/9.jpg)
Intermediate Grammars
Learning: X-Bar = G0 → G1 → G2 → G3 → G4 → G5 → G6 = G
[Figure: split hierarchy of the DT tag: DT; DT1 DT2; DT1 … DT4; DT1 … DT8]
![Page 10: Improved Inference for Unlexicalized Parsing](https://reader035.fdocuments.net/reader035/viewer/2022062500/568157e7550346895dc560fb/html5/thumbnails/10.jpg)
1621 min
111 min
35 min (no search error)
![Page 11: Improved Inference for Unlexicalized Parsing](https://reader035.fdocuments.net/reader035/viewer/2022062500/568157e7550346895dc560fb/html5/thumbnails/11.jpg)
State Drift (DT tag)
[Figure: words associated with the DT subcategories (some, this, That, these, the, that) before and after EM]
![Page 12: Improved Inference for Unlexicalized Parsing](https://reader035.fdocuments.net/reader035/viewer/2022062500/568157e7550346895dc560fb/html5/thumbnails/12.jpg)
Projected Grammars
Learning: X-Bar = G0 → G1 → G2 → G3 → G4 → G5 → G6 = G
Projection πi: π0(G), π1(G), π2(G), π3(G), π4(G), π5(G), G
![Page 13: Improved Inference for Unlexicalized Parsing](https://reader035.fdocuments.net/reader035/viewer/2022062500/568157e7550346895dc560fb/html5/thumbnails/13.jpg)
Estimating Projected Grammars
Nonterminals? Easy:
Nonterminals in G: NP0 NP1 VP0 VP1 S0 S1
Projection π
Nonterminals in π(G): NP VP S
![Page 14: Improved Inference for Unlexicalized Parsing](https://reader035.fdocuments.net/reader035/viewer/2022062500/568157e7550346895dc560fb/html5/thumbnails/14.jpg)
Estimating Projected Grammars
Rules?
Rules in G:
S1 → NP1 VP1 0.20
S1 → NP1 VP2 0.12
S1 → NP2 VP1 0.02
S1 → NP2 VP2 0.03
S2 → NP1 VP1 0.11
S2 → NP1 VP2 0.05
S2 → NP2 VP1 0.08
S2 → NP2 VP2 0.12
Rules in π(G):
S → NP VP ????
![Page 15: Improved Inference for Unlexicalized Parsing](https://reader035.fdocuments.net/reader035/viewer/2022062500/568157e7550346895dc560fb/html5/thumbnails/15.jpg)
Estimating Projected Grammars [Corazza & Satta ‘06]
Rules in G:
S1 → NP1 VP1 0.20
S1 → NP1 VP2 0.12
S1 → NP2 VP1 0.02
S1 → NP2 VP2 0.03
S2 → NP1 VP1 0.11
S2 → NP1 VP2 0.05
S2 → NP2 VP1 0.08
S2 → NP2 VP2 0.12
Rules in π(G):
S → NP VP 0.56
[Diagram: Treebank and the infinite tree distribution of G, each with an arrow labeled "Estimating Grammars"]
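One way to read the projection of rules is as a count-weighted average over the refined parent symbols. The sketch below assumes that reading; the helper names and the expected counts c(S1), c(S2) are invented for illustration, so the printed value is not meant to reproduce the figure on the slide.

```python
from collections import defaultdict


def base(symbol):
    """Strip the subcategory index: 'NP2' -> 'NP' (assumes a trailing numeric index)."""
    return symbol.rstrip("0123456789")


def project(rules, counts):
    """Project refined binary rules onto their unsplit symbols.

    rules:  {(parent, left, right): probability} over refined symbols.
    counts: expected count c(X) of every refined parent symbol X.
    """
    numer = defaultdict(float)
    denom = defaultdict(float)
    for symbol, c in counts.items():
        denom[base(symbol)] += c
    for (parent, left, right), prob in rules.items():
        key = (base(parent), base(left), base(right))
        numer[key] += counts[parent] * prob  # weight each rule by c(parent)
    return {rule: numer[rule] / denom[rule[0]] for rule in numer}


# Rule probabilities as listed on the slide.
rules = {
    ("S1", "NP1", "VP1"): 0.20, ("S1", "NP1", "VP2"): 0.12,
    ("S1", "NP2", "VP1"): 0.02, ("S1", "NP2", "VP2"): 0.03,
    ("S2", "NP1", "VP1"): 0.11, ("S2", "NP1", "VP2"): 0.05,
    ("S2", "NP2", "VP1"): 0.08, ("S2", "NP2", "VP2"): 0.12,
}
# Hypothetical expected counts of the refined parents (not from the slides).
counts = {"S1": 3.0, "S2": 1.0}

projected = project(rules, counts)
print(projected[("S", "NP", "VP")])
```

The weights matter: a refined parent that occurs more often in the tree distribution contributes more to the projected rule probability.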
![Page 16: Improved Inference for Unlexicalized Parsing](https://reader035.fdocuments.net/reader035/viewer/2022062500/568157e7550346895dc560fb/html5/thumbnails/16.jpg)
Calculating Expectations
Nonterminals:
ck(X): expected counts up to depth k
Converges within 25 iterations (few seconds)
Rules:
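The formulas on this slide are missing from the transcript. The sketch below uses the usual fixed-point iteration for expected symbol counts in a PCFG, with rule counts obtained as c(X) times the rule probability; the toy grammar and the function names are hypothetical, and whether the slide used exactly this update is an assumption.

```python
def expected_counts(rules, root, iterations=25):
    """Iterate c_k(X): expected number of X nodes in trees, up to depth k.

    rules: list of (parent, children, probability); children is a tuple that
           may contain terminals, which are ignored when counting.
    """
    nonterminals = {parent for parent, _, _ in rules}
    start = {x: (1.0 if x == root else 0.0) for x in nonterminals}
    counts = dict(start)
    for _ in range(iterations):
        updated = dict(start)  # the root always contributes one node
        for parent, children, prob in rules:
            for child in children:
                if child in nonterminals:
                    updated[child] += counts[parent] * prob
        counts = updated
    return counts


def rule_counts(rules, counts):
    """Expected count of each rule: c(parent) * P(rule)."""
    return {(parent, children): counts[parent] * prob
            for parent, children, prob in rules}


# Toy grammar (hypothetical): S -> A A; A -> A A with 0.25, A -> 'a' with 0.75.
toy_rules = [
    ("S", ("A", "A"), 1.0),
    ("A", ("A", "A"), 0.25),
    ("A", ("a",), 0.75),
]
counts = expected_counts(toy_rules, root="S")
print(counts)
print(rule_counts(toy_rules, counts))
```

For this toy grammar the count of A approaches 4, the solution of c(A) = 2 + 0.5 c(A), and 25 iterations already bring it within about 1e-7 of that value.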
![Page 17: Improved Inference for Unlexicalized Parsing](https://reader035.fdocuments.net/reader035/viewer/2022062500/568157e7550346895dc560fb/html5/thumbnails/17.jpg)
1621 min
111 min
35 min
15 min (no search error)
![Page 18: Improved Inference for Unlexicalized Parsing](https://reader035.fdocuments.net/reader035/viewer/2022062500/568157e7550346895dc560fb/html5/thumbnails/18.jpg)
Parsing times
Learning: X-Bar = G0 → G1 → G2 → G3 → G4 → G5 → G6 = G
Share of parsing time (in slide order): 60 %, 12 %, 7 %, 6 %, 6 %, 5 %, 4 %
![Page 19: Improved Inference for Unlexicalized Parsing](https://reader035.fdocuments.net/reader035/viewer/2022062500/568157e7550346895dc560fb/html5/thumbnails/19.jpg)
Bracket Posteriors (after G0)
![Page 20: Improved Inference for Unlexicalized Parsing](https://reader035.fdocuments.net/reader035/viewer/2022062500/568157e7550346895dc560fb/html5/thumbnails/20.jpg)
Bracket Posteriors (after G1)
![Page 21: Improved Inference for Unlexicalized Parsing](https://reader035.fdocuments.net/reader035/viewer/2022062500/568157e7550346895dc560fb/html5/thumbnails/21.jpg)
Bracket Posteriors (Movie) (Final Chart)
![Page 22: Improved Inference for Unlexicalized Parsing](https://reader035.fdocuments.net/reader035/viewer/2022062500/568157e7550346895dc560fb/html5/thumbnails/22.jpg)
Bracket Posteriors (Best Tree)
![Page 23: Improved Inference for Unlexicalized Parsing](https://reader035.fdocuments.net/reader035/viewer/2022062500/568157e7550346895dc560fb/html5/thumbnails/23.jpg)
Parse Selection
Computing most likely unsplit tree is NP-hard:
Settle for best derivation.
Rerank n-best list.
Use alternative objective function.
[Figure: Parses vs. Derivations; each derivation labels the nodes of a parse with subcategories (-1, -2)]
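The gap between the best derivation and the best parse can be illustrated with a tiny example. The probabilities and labels below are hypothetical; the point is only that an unsplit parse sums over all of its derivations, so the single most probable derivation need not belong to the most probable parse.

```python
from collections import defaultdict

# Hypothetical derivation probabilities:
# (unsplit parse, subcategory labeling) -> probability
derivations = {
    ("parse_A", "-1 -1"): 0.30,
    ("parse_B", "-1 -2"): 0.25,
    ("parse_B", "-2 -1"): 0.25,
    ("parse_B", "-2 -2"): 0.20,
}

# Best derivation: the single most probable labeled tree.
best_derivation = max(derivations, key=derivations.get)

# Best parse: sum the probabilities of all derivations of each unsplit tree.
parse_prob = defaultdict(float)
for (parse, _labels), prob in derivations.items():
    parse_prob[parse] += prob
best_parse = max(parse_prob, key=parse_prob.get)

print(best_derivation[0], best_parse)  # parse_A parse_B
```

Enumerating derivations like this is only feasible for a toy case, which is why the slide lists approximations instead.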
![Page 24: Improved Inference for Unlexicalized Parsing](https://reader035.fdocuments.net/reader035/viewer/2022062500/568157e7550346895dc560fb/html5/thumbnails/24.jpg)
Parse Risk Minimization
Expected loss according to our beliefs:
TT : true tree
TP : predicted tree
L : loss function (0/1, precision, recall, F1)
[Titov & Henderson ‘06]
Use n-best candidate list and approximate expectation with samples.
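A minimal sketch of risk minimization over an n-best list, assuming the loss is 1 − F1 over labeled brackets and that the normalized candidate probabilities stand in for the beliefs (the slide approximates the expectation with samples instead). The candidate trees, probabilities, and function names are hypothetical.

```python
def f1(predicted, gold):
    """Labeled-bracket F1 between two sets of (label, start, end) brackets."""
    matched = len(predicted & gold)
    if matched == 0:
        return 0.0
    precision = matched / len(predicted)
    recall = matched / len(gold)
    return 2 * precision * recall / (precision + recall)


def min_risk_parse(candidates):
    """candidates: list of (bracket set, probability).

    Returns the index of the candidate with the lowest expected loss, treating
    every candidate in turn as the possible true tree.
    """
    total = sum(prob for _, prob in candidates)

    def risk(i):
        predicted = candidates[i][0]
        return sum(prob / total * (1.0 - f1(predicted, truth))
                   for truth, prob in candidates)

    return min(range(len(candidates)), key=risk)


# Hypothetical 3-best list for a five-word sentence.
candidates = [
    ({("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5)}, 0.40),
    ({("S", 0, 5), ("NP", 0, 1), ("VP", 1, 5)}, 0.35),
    ({("S", 0, 5), ("NP", 0, 1), ("VP", 1, 5), ("PP", 3, 5)}, 0.25),
]
best = min_risk_parse(candidates)
print(best)  # 1
```

Here the most probable candidate (index 0) is not chosen: candidates 1 and 2 share most of their brackets, so candidate 1 has the lowest expected loss.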
![Page 25: Improved Inference for Unlexicalized Parsing](https://reader035.fdocuments.net/reader035/viewer/2022062500/568157e7550346895dc560fb/html5/thumbnails/25.jpg)
Reranking Results
| Objective | Precision | Recall | F1 | Exact |
|---|---|---|---|---|
| BEST DERIVATION | | | | |
| Viterbi Derivation | 89.6 | 89.4 | 89.5 | 37.4 |
| Exact (non-sampled) | 90.8 | 90.8 | 90.8 | 41.7 |
| Exact/F1 (oracle) | 95.3 | 94.4 | 95.0 | 63.9 |
| RERANKING | | | | |
| Precision (sampled) | 91.1 | 88.1 | 89.6 | 21.4 |
| Recall (sampled) | 88.2 | 91.3 | 89.7 | 21.5 |
| F1 (sampled) | 90.2 | 89.3 | 89.8 | 27.2 |
| Exact (sampled) | 89.5 | 89.5 | 89.5 | 25.8 |
![Page 26: Improved Inference for Unlexicalized Parsing](https://reader035.fdocuments.net/reader035/viewer/2022062500/568157e7550346895dc560fb/html5/thumbnails/26.jpg)
Dynamic Programming
Approximate posterior parse distribution [Matsuzaki et al. ‘05]
Maximize number of expected correct rules à la [Goodman ‘98]
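The objective formulas are not in the transcript. A sketch of how rule-based objectives of this kind are commonly written, assuming inside and outside scores P_IN and P_OUT, subcategory indices x, y, z, and an anchored rule e = (A → B C, i, k, j); the notation is an assumption, not taken from the slide:

```latex
\[
r(A_x \to B_y\, C_z, i, k, j)
  = P_{\mathrm{OUT}}(A_x, i, j)\; P(A_x \to B_y\, C_z)\;
    P_{\mathrm{IN}}(B_y, i, k)\; P_{\mathrm{IN}}(C_z, k, j)
\]
\[
q(e) = \frac{\sum_{x,y,z} r(A_x \to B_y\, C_z, i, k, j)}{P_{\mathrm{IN}}(\mathrm{root}, 0, n)}
\]
\[
T_{\text{max-rule-sum}} = \arg\max_{T} \sum_{e \in T} q(e),
\qquad
T_{\text{max-rule-product}} = \arg\max_{T} \prod_{e \in T} q(e)
\]
```

Since q(e) is the posterior probability of an anchored unsplit rule, the sum form maximizes the expected number of correct rules.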
![Page 27: Improved Inference for Unlexicalized Parsing](https://reader035.fdocuments.net/reader035/viewer/2022062500/568157e7550346895dc560fb/html5/thumbnails/27.jpg)
Dynamic Programming Results

| Objective | Precision | Recall | F1 | Exact |
|---|---|---|---|---|
| BEST DERIVATION | | | | |
| Viterbi Derivation | 89.6 | 89.4 | 89.5 | 37.4 |
| DYNAMIC PROGRAMMING | | | | |
| Variational | 90.7 | 90.9 | 90.8 | 41.4 |
| Max-Rule-Sum | 90.5 | 91.3 | 90.9 | 40.4 |
| Max-Rule-Product | 91.2 | 91.1 | 91.2 | 41.4 |
![Page 28: Improved Inference for Unlexicalized Parsing](https://reader035.fdocuments.net/reader035/viewer/2022062500/568157e7550346895dc560fb/html5/thumbnails/28.jpg)
Final Results (Efficiency)
Berkeley Parser: 15 min, 91.2 F-score, implemented in Java
Charniak & Johnson ‘05 Parser: 19 min, 90.7 F-score, implemented in C
![Page 29: Improved Inference for Unlexicalized Parsing](https://reader035.fdocuments.net/reader035/viewer/2022062500/568157e7550346895dc560fb/html5/thumbnails/29.jpg)
Final Results (Accuracy)
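| Language | Parser | F1 (≤ 40 words) | F1 (all) |
|---|---|---|---|
| ENG | Charniak&Johnson ‘05 (generative) | 90.1 | 89.6 |
| ENG | This Work | 90.6 | 90.1 |
| ENG | Charniak&Johnson ‘05 (reranked) | 92.0 | 91.4 |
| GER | Dubey ‘05 | 76.3 | - |
| GER | This Work | 80.8 | 80.1 |
| CHN | Chiang et al. ‘02 | 80.0 | 76.6 |
| CHN | This Work | 86.3 | 83.4 |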
![Page 30: Improved Inference for Unlexicalized Parsing](https://reader035.fdocuments.net/reader035/viewer/2022062500/568157e7550346895dc560fb/html5/thumbnails/30.jpg)
Conclusions
Hierarchical coarse-to-fine inference
Projections
Marginalization
Multi-lingual unlexicalized parsing
![Page 31: Improved Inference for Unlexicalized Parsing](https://reader035.fdocuments.net/reader035/viewer/2022062500/568157e7550346895dc560fb/html5/thumbnails/31.jpg)
Thank You!
Parser available at
http://nlp.cs.berkeley.edu