Dynamic Knowledge-Base Alignment for Coreference Resolution
Jiaping Zheng, Luke Vilnis, Sameer Singh, Jinho D. Choi, Andrew McCallum
Presented at CoNLL 2013
University of Massachusetts Amherst
Thursday, August 8, 13
Coreference Resolution
• Identify mentions that refer to the same entity.
• Useful in relation extraction, question answering, machine translation, etc.
The Chicago suburb of Arlington Heights is the first stop for George W. Bush today. The Texas governor stops in Gore’s home state of Tennessee this afternoon ...
Coreference Resolution
• Determine the entity in a reference knowledge-base for textual mentions.
… George W. Bush ...
Entity    Attr
George W. Bush    ...    ...
Entity Linking
• Provides global context for coreference resolution.
• Linking a mention to one knowledge-base entity.
- High precision, but fewer alignments.
- Ponzetto & Strube 2006.
- Ratinov & Roth 2012.
• Linking a mention to multiple knowledge-base entities.
- Higher recall, but conflates entities.
- Rahman & Ng 2011.
Dynamic Alignment
1. Compute an initial ranked list of knowledge-base entities for each named entity.
- List of Wikipedia articles.
- Obtained by querying the KB bridge (Dalton & Dietz, 2013).
2. Merge entity lists when mentions are coreferenced.
3. Re-rank the merged list.
4. Extract attributes from the top-ranked entity.
- Surface string variations from the web.
Dynamic Alignment
… about navigation charts that he had ordered from a company based in Washington …
… opened one of them to discover the absentee ballot of Steven H. Forrester of Bellevue, Wash. …
… were not meaningful because counting in Washington State has been completed …
Washington, DC | Washington State
Car Wash | The Wash
Washington State
Washington State | Washington, DC | Car Wash | The Wash
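The merge and re-rank steps can be illustrated with a small Python sketch on the Washington example. This is only an illustration under stated assumptions: the numeric scores are invented, the pairing of candidate lists with mentions is inferred from the slide, and re-ranking by summed score is a stand-in, since the slides do not say how the merged list is re-ranked. The name `merge_and_rerank` is hypothetical.

```python
from collections import defaultdict


def merge_and_rerank(*candidate_lists):
    """Merge ranked candidate lists and re-rank them by summed score.

    Summing scores is an assumption made for this sketch; the slides
    only state that the merged list is re-ranked.
    """
    totals = defaultdict(float)
    for candidates in candidate_lists:
        for entity, score in candidates:
            totals[entity] += score
    # Highest total first; ties are broken alphabetically for determinism.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical initial scores for the three mentions in the example.
initial = {
    "Washington": [("Washington, DC", 0.6), ("Washington State", 0.4)],
    "Wash.": [("Car Wash", 0.5), ("The Wash", 0.3)],
    "Washington State": [("Washington State", 0.9)],
}

# Once the three mentions are judged coreferent, their lists are merged.
merged = merge_and_rerank(*initial.values())
ranking = [entity for entity, _ in merged]
top_entity = ranking[0]  # attributes would be extracted from this entity
print(ranking)
```

With these invented scores, the unambiguous "Washington State" mention pulls that entity to the top of the shared list, so the ambiguous "Washington" and "Wash." mentions inherit it as their top-ranked entity.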
Experiments
• ACE 2004 dataset.
• Baseline.
- Pairwise classification system.
- No external knowledge sources.
- A rich set of features.
- Best link strategy.
- L2-regularized SVM using hinge-loss.
• Static linking.
• Dynamic linking.
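As a rough illustration of the best-link strategy named in the baseline, the Python sketch below links each mention to its single highest-scoring earlier mention, provided that score exceeds a threshold, and reads clusters off the resulting links. The scorer `toy_score` is a hypothetical stand-in for the SVM's pairwise score, and the threshold of 0 and the function names are assumptions of this sketch, not details from the slides.

```python
def best_link_clusters(mentions, score, threshold=0.0):
    """Cluster mentions by linking each one to its best-scoring antecedent."""
    parent = list(range(len(mentions)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for j in range(1, len(mentions)):
        best_i, best_score = None, threshold
        for i in range(j):
            s = score(mentions[i], mentions[j])
            if s > best_score:
                best_i, best_score = i, s
        if best_i is not None:
            parent[find(j)] = find(best_i)

    clusters = {}
    for idx, mention in enumerate(mentions):
        clusters.setdefault(find(idx), []).append(mention)
    return list(clusters.values())


def toy_score(a, b):
    """Hypothetical pairwise scorer: shared-token count minus 0.5."""
    shared = set(a.lower().split()) & set(b.lower().split())
    return len(shared) - 0.5


mentions = ["George W. Bush", "Gore", "Bush", "Al Gore"]
clusters = best_link_clusters(mentions, toy_score)
print(clusters)
```

Because every mention picks at most one antecedent, a single confident link is enough to join two mentions, and low-scoring pairs never force a merge.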
Experiments
[Bar chart: MUC and B-CUBE scores (axis ticks 76.5, 78, 79.5, 81, 82.5) for Culotta’07, Raghunathan’10, Bengtson & Roth’08, Stoyanov & Eisner’12, Baseline, Static linking, Dynamic linking. Bar values in extraction order: 79.3, 75.8, 80.4, 75.8, 80.8, 81.8, 77.05, 81.12, 76.18, 81.66, 78.01, 82.21.]
Experiments
[Bar chart: scores on Transcripts and Non-Transcripts (axis ticks 79.9, 80.8, 81.7, 82.6, 83.5) for Bengtson & Roth’08, Baseline, Static linking, Dynamic linking. Bar values in extraction order: 83.03, 79.77, 82.5, 80.25, 83.06, 81.13, 83.32.]
Experiments
… coreference by merging all pairs of proper noun mentions that share at least one common candidate as per KB bridge. Further, the remaining non-pronoun mentions are linked to these proper nouns if the mention string matches any of the entity titles or anchor texts.
BR 2008: A pairwise coreference model containing a rich set of features, as described and evaluated in Bengtson and Roth (2008).
Baseline: Our implementation of a pairwise model that is similar to the BR 2008 approach with the differences described in Section 2. This is our baseline system that performs coreference without the use of external knowledge.
Dynamic linking: This is our complete system as described in Section 3, in which the list of candidates associated with each mention is reranked and modified during inference.
Static linking: Identical to dynamic linking except that entity candidate lists are not merged during inference (i.e., Algorithm 1 without line 17). This approach is comparable to the fixed alignment model, as in the approaches of Ponzetto and Strube (2006) and Ratinov and Roth (2012).
4.3 Results
As in Bengtson and Roth (2008), we evaluate our system primarily using the B³ metric (Bagga and Baldwin, 1998), but also include pairwise, MUC and CEAF(m) metrics. The performance of our systems on the test data set is shown in Table 2. These results use true mentions provided in the dataset. Note that, as suggested by Ng (2010), coreference resolvers that use different mention detectors (extraction from parse tree, detector trained from gold boundaries, etc.) should not be compared against each other.
Our baseline system outperforms a system that follows the pairwise classification approach by 0.32 B³ F1 points on this data set. Incorporating Wikipedia and anchor text information from the web with a fixed alignment (static linking) further improves our performance by 0.54 B³ F1 points. Using dynamic linking, which improves the alignment during inference, achieves another 0.55 F1 point improvement, which is 1.09 F1 above our baseline and 1.41 F1 above the current best pairwise classification system (corresponding to an error reduction of 7.4%). The improvement of the dynamic linking approach over our baselines is consistent across the various evaluation metrics.
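For readers unfamiliar with the B³ metric reported above, here is a minimal Python sketch of its standard per-mention definition: each mention's precision is the fraction of its response cluster that is also in its key cluster, its recall is the fraction of its key cluster that is in its response cluster, and both are averaged over mentions. The sketch assumes the key and response clusterings cover the same mentions (as with true mentions); the function name `b_cubed` and the toy clusters are illustrative and not taken from the paper.

```python
def b_cubed(key_clusters, response_clusters):
    """Return (precision, recall, F1) under the B-cubed metric.

    Assumes every mention appears in exactly one key cluster and
    exactly one response cluster.
    """
    key_of = {m: frozenset(c) for c in key_clusters for m in c}
    resp_of = {m: frozenset(c) for c in response_clusters for m in c}
    mentions = list(key_of)

    precision = sum(
        len(key_of[m] & resp_of[m]) / len(resp_of[m]) for m in mentions
    ) / len(mentions)
    recall = sum(
        len(key_of[m] & resp_of[m]) / len(key_of[m]) for m in mentions
    ) / len(mentions)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example: the response wrongly moves mention "c" into "d"'s cluster.
key = [{"a", "b", "c"}, {"d"}]
response = [{"a", "b"}, {"c", "d"}]
p, r, f1 = b_cubed(key, response)
print(round(p, 3), round(r, 3), round(f1, 3))
```

In the toy example the precision is 0.75 and the recall is 2/3, giving an F1 of about 0.706.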
[Plot: x-axis "Top X% of Docs by Number of Mentions" (5 to 50); y-axis "Improvement over Baseline" (0.6 to 1.8); series: Dynamic Linking, Static Linking.]
Figure 2: Improvements on the top X% of documents ranked by the number of mentions.
5 Discussion
We explore our system’s performance on subsets of the ACE dataset, and on the OntoNotes dataset.
5.1 Document Length
Coreference becomes more difficult as the number of mentions is increased since the number of pairwise comparisons increases quadratically with the number of mentions. We observe this phenomenon in our dataset: the performance on the smallest third of the documents (when sorted according to number of mentions) is 8.5-10% higher than on the largest third of the documents, as per the B³ metric. However, we expect dynamic linking of entities to be more beneficial on these larger documents as our system can use the information from a larger number of mentions to improve the alignment during inference. Static linking, on the other hand, is unlikely to obtain higher improvements with the larger number of mentions in the document as the alignment is fixed.
We set up the following experiment to analyze the performance with varying numbers of mentions. We sort all the documents in the test set according to their number of mentions, and perform evaluation on the top X% of this list (where X is 10, 33, 40, 50). As the results demonstrate in Figure 2, the improvement of the static linking approach stays fairly even as X is varied. Even though the experiments suggest that the larger documents are tougher to coreference,³ dynamic linking provides higher improvements when the documents contain a larger number of mentions.
³ i.e., the absolute values are lower for these splits. The baseline system obtains 83.08, 79.29, 79.64, and 79.77 respectively for X = 10, 33, 40, 50.
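The document-length analysis can be sketched in Python as follows. The snippet shows the quadratic growth in mention pairs, n(n-1)/2, and one way to pick the top X% of documents by mention count. The document ids and counts are invented, rounding the cutoff up is an assumption, and the helper names are hypothetical; the paper does not specify these details.

```python
import math


def pair_count(n_mentions):
    """Number of unordered mention pairs a pairwise model must consider."""
    return n_mentions * (n_mentions - 1) // 2


def top_percent_by_mentions(doc_mention_counts, x):
    """Return ids of the top x% of documents, largest mention count first.

    Rounding the cutoff up (and keeping at least one document) is an
    assumption of this sketch.
    """
    ranked = sorted(
        doc_mention_counts.items(), key=lambda kv: kv[1], reverse=True
    )
    k = max(1, math.ceil(len(ranked) * x / 100))
    return [doc_id for doc_id, _ in ranked[:k]]


# Hypothetical mention counts for ten documents.
counts = {
    "d0": 12, "d1": 85, "d2": 40, "d3": 7, "d4": 150,
    "d5": 33, "d6": 61, "d7": 20, "d8": 95, "d9": 5,
}

for x in (10, 33, 40, 50):
    print(x, top_percent_by_mentions(counts, x))

# Doubling the mentions roughly quadruples the pairs: 45 versus 190.
print(pair_count(10), pair_count(20))
```

Each subset would then be scored separately for the baseline, static linking, and dynamic linking systems to obtain the improvements plotted in Figure 2.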
Conclusion
• Coreference resolution systems benefit greatly from inclusion of global context.
• Linking mentions to a knowledge base provides this context.
• Maintaining a ranked list of entities outperforms previous fixed alignment approaches.