Dynamic Knowledge-Base Alignment for Coreference Resolution

Jiaping Zheng, Luke Vilnis, Sameer Singh, Jinho D. Choi, Andrew McCallum. University of Massachusetts Amherst. Presented at CoNLL 2013, Thursday, August 8, 2013.

description

Coreference resolution systems can benefit greatly from inclusion of global context, and a number of recent approaches have demonstrated improvements when precomputing an alignment to external knowledge sources. However, since alignment itself is a challenging task and is often noisy, existing systems either align conservatively, resulting in very few links, or combine the attributes of multiple candidates, leading to a conflation of entities. Our approach instead maintains ranked lists of candidate entities that are dynamically merged and reranked during inference. Further, we incorporate a large set of surface string variations for each entity by using anchor texts from the web that link to the entity. These forms of global context enable our system to outperform a competitive baseline without a knowledge base by 1.09 B3 F1 points, and a state-of-the-art system by 0.41 points on the ACE 2004 data.

Transcript of Dynamic Knowledge-Base Alignment for Coreference Resolution

Page 1: Dynamic Knowledge-Base Alignment for Coreference Resolution

Dynamic Knowledge-Base Alignment for Coreference Resolution

Jiaping Zheng, Luke Vilnis, Sameer Singh, Jinho D. Choi, Andrew McCallum

Presented at CoNLL 2013

University of Massachusetts Amherst

Thursday, August 8, 13

Page 6: Dynamic Knowledge-Base Alignment for Coreference Resolution

Coreference Resolution

• Identify mentions that refer to the same entity.

• Useful in relation extraction, question answering, machine translation, etc.

The Chicago suburb of Arlington Heights is the first stop for George W. Bush today. The Texas governor stops in Gore’s home state of Tennessee this afternoon ...

Page 9: Dynamic Knowledge-Base Alignment for Coreference Resolution

Coreference Resolution

• Determine the entity in a reference knowledge-base for textual mentions.

… George W. Bush ...

Entity            Attr
George W. Bush    ...
...               ...

Page 17: Dynamic Knowledge-Base Alignment for Coreference Resolution

Entity Linking

• Provides global context for coreference resolution.

• Linking a mention to one knowledge-base entity.

- High precision, but fewer alignments.

- Ponzetto & Strube 2006.

- Ratinov & Roth 2012.

• Linking a mention to multiple knowledge-base entities.

- Higher recall, but conflates entities.

- Rahman & Ng 2011.

Page 25: Dynamic Knowledge-Base Alignment for Coreference Resolution

Dynamic Alignment

1. Compute initial ranked list of knowledge-base entities for named entities.

- List of Wikipedia articles.

- Querying the knowledge-base bridge (Dalton & Dietz, 2013).

2. Merge entity lists when mentions are coreferenced.

3. Re-rank the merged list.

4. Attributes are extracted from the top ranked entity.

- Surface string variations from the web.

Page 33: Dynamic Knowledge-Base Alignment for Coreference Resolution

Dynamic Alignment

… about navigation charts that he had ordered from a company based in Washington …

… opened one of them to discover the absentee ballot of Steven H. Forrester of Bellevue, Wash. …

… were not meaningful because counting in Washington State has been completed …

Candidate lists, one per mention:

- Washington, DC; Washington State

- Car Wash; The Wash

- Washington State

Merged list: Washington, DC; Washington State; Car Wash; The Wash

Re-ranked list: Washington State; Washington, DC; Car Wash; The Wash
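The merge-and-rerank step in this example can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the candidate scores are made-up illustrative numbers, the function name `merge_and_rerank` is hypothetical, and re-ranking by summing each candidate's scores across the merged mentions is an assumed stand-in for whatever reranker the system actually uses.

```python
from collections import defaultdict

# Hypothetical candidate lists: mention -> [(entity, score)], best first.
# The scores are illustrative only; they are not taken from the paper.
candidates = {
    "Washington": [("Washington, DC", 0.6), ("Washington State", 0.4)],
    "Wash.": [("Car Wash", 0.5), ("The Wash", 0.3)],
    "Washington State": [("Washington State", 0.9)],
}


def merge_and_rerank(lists):
    """Merge ranked candidate lists and re-rank by summed score (assumed rule)."""
    totals = defaultdict(float)
    for ranked in lists:
        for entity, score in ranked:
            totals[entity] += score
    # Sort by descending total score; ties are broken alphabetically.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# The three mentions are judged coreferent, so their lists are merged.
merged = merge_and_rerank(candidates.values())
top_entity = merged[0][0]  # attributes would be read from this entity
print([entity for entity, _ in merged])
# ['Washington State', 'Washington, DC', 'Car Wash', 'The Wash']
```

With these toy scores, evidence from the unambiguous mention "Washington State" lifts that entity above "Washington, DC" in the shared list, which is the behavior the slide illustrates.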

Page 38: Dynamic Knowledge-Base Alignment for Coreference Resolution

Experiments

• ACE 2004 dataset.

• Baseline.

- Pairwise classification system.

- No external knowledge sources.

- A rich set of features.

- Best link strategy.

- L2-regularized SVM using hinge-loss.

• Static linking.

• Dynamic linking.
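The best link strategy named for the baseline can be sketched as follows: each mention is linked to its highest-scoring earlier mention, provided that score clears a threshold. This is only an illustration. `toy_score` is a hypothetical token-overlap stand-in for the trained pairwise classifier (the slides describe an L2-regularized SVM over a rich feature set), and the mention list is invented for the example.

```python
def best_link_clusters(mentions, score, threshold=0.0):
    """Link each mention to its best-scoring earlier mention, if above threshold."""
    cluster_of = {}  # mention index -> cluster id
    for j in range(len(mentions)):
        best_i, best_score = None, threshold
        for i in range(j):
            s = score(mentions[i], mentions[j])
            if s > best_score:
                best_i, best_score = i, s
        # Join the antecedent's cluster, or start a new one.
        cluster_of[j] = cluster_of[best_i] if best_i is not None else j
    clusters = {}
    for idx, cid in cluster_of.items():
        clusters.setdefault(cid, []).append(mentions[idx])
    return list(clusters.values())


def toy_score(antecedent, mention):
    """Hypothetical stand-in for a trained pairwise classifier's score."""
    shared = set(antecedent.lower().split()) & set(mention.lower().split())
    return len(shared) - 0.5  # positive only when a token is shared


mentions = ["George W. Bush", "Bush", "Gore", "Tennessee"]
print(best_link_clusters(mentions, toy_score))
# [['George W. Bush', 'Bush'], ['Gore'], ['Tennessee']]
```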

Page 39: Dynamic Knowledge-Base Alignment for Coreference Resolution

Experiments

[Bar chart: MUC and B-CUBE scores for Culotta’07, Raghunathan’10, Bengtson & Roth’08, Stoyanov & Eisner’12, Baseline, Static linking, and Dynamic linking. Bar labels in extraction order: 79.3, 75.8, 80.4, 75.8, 80.8, 81.8, 77.05, 81.12, 76.18, 81.66, 78.01, 82.21.]

Page 40: Dynamic Knowledge-Base Alignment for Coreference Resolution

Experiments

[Bar chart: scores on Transcripts and Non-Transcripts for Bengtson & Roth’08, Baseline, Static linking, and Dynamic linking. Bar labels in extraction order: 83.03, 79.77, 82.5, 80.25, 83.06, 81.13, 83.32.]

Page 41: Dynamic Knowledge-Base Alignment for Coreference Resolution

Experiments

… coreference by merging all pairs of proper noun mentions that share at least one common candidate as per KB bridge. Further, the remaining non-pronoun mentions are linked to these proper nouns if the mention string matches any of the entity titles or anchor texts.

BR 2008: A pairwise coreference model containing a rich set of features, as described and evaluated in Bengtson and Roth (2008).

Baseline: Our implementation of a pairwise model that is similar to the BR 2008 approach with the differences described in Section 2. This is our baseline system that performs coreference without the use of external knowledge.

Dynamic linking: This is our complete system as described in Section 3, in which the list of candidates associated with each mention is reranked and modified during inference.

Static linking: Identical to dynamic linking except that entity candidate lists are not merged during inference (i.e., Algorithm 1 without line 17). This approach is comparable to the fixed alignment model, as in the approaches of Ponzetto and Strube (2006) and Ratinov and Roth (2012).

4.3 Results

As in Bengtson and Roth (2008), we evaluate our system primarily using the B3 metric (Bagga and Baldwin, 1998), but also include pairwise, MUC and CEAF(m) metrics. The performance of our systems on the test data set is shown in Table 2. These results use true mentions provided in the dataset. Note that as suggested by Ng (2010), coreference resolvers that use different mention detectors (extraction from parse tree, detector trained from gold boundaries, etc.) should not be compared against each other.
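For readers unfamiliar with B3, the per-mention definition of Bagga and Baldwin (1998) can be sketched as below. The sketch assumes true mentions, so the key and response clusterings cover the same mention set, and it scores a single document; how scores are aggregated over a corpus is not covered here. The function name `b_cubed` and the toy clusterings are illustrative.

```python
def b_cubed(key_clusters, response_clusters):
    """Return (precision, recall, F1) of B-cubed for one document."""
    key_of = {m: frozenset(c) for c in key_clusters for m in c}
    resp_of = {m: frozenset(c) for c in response_clusters for m in c}
    mentions = list(key_of)
    precision = recall = 0.0
    for m in mentions:
        # Mentions that share a cluster with m in both key and response.
        overlap = len(key_of[m] & resp_of[m])
        precision += overlap / len(resp_of[m])
        recall += overlap / len(key_of[m])
    precision /= len(mentions)
    recall /= len(mentions)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example: gold clusters {a, b, c} and {d}; system output {a, b} and {c, d}.
p, r, f = b_cubed([{"a", "b", "c"}, {"d"}], [{"a", "b"}, {"c", "d"}])
print(round(p, 4), round(r, 4), round(f, 4))  # 0.75 0.6667 0.7059
```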

Our baseline system outperforms a system that follows the pairwise classification approach by 0.32 B3 F1 points on this data set. Incorporating Wikipedia and anchor text information from the web with a fixed alignment (static linking) further improves our performance by 0.54 B3 F1 points. Using dynamic linking that improves the alignment during inference achieves another 0.55 F1 point improvement, which is 1.09 F1 above our baseline, and 1.41 F1 above the current best pairwise classification system (corresponding to an error reduction of 7.4%). The improvement of the dynamic linking approach over our baselines is consistent across the various evaluation metrics.

[Figure 2 plot: x-axis, Top X% of Docs by Number of Mentions; y-axis, Improvement over Baseline; series, Dynamic Linking and Static Linking.]

Figure 2: Improvements on the top X% of documents ranked by the number of mentions.

5 Discussion

We explore our system’s performance on subsets of the ACE dataset, and on the OntoNotes dataset.

5.1 Document Length

Coreference becomes more difficult as the number of mentions is increased since the number of pairwise comparisons increases quadratically with the number of mentions. We observe this phenomenon in our dataset: the performance on the smallest third of the documents (when sorted according to number of mentions) is 8.5-10% higher than on the largest third of the documents, as per the B3 metric. However, we expect dynamic linking of entities to be more beneficial on these larger documents as our system can use the information from a larger number of mentions to improve the alignment during inference. Static linking, on the other hand, is unlikely to obtain higher improvements with the larger number of mentions in the document as the alignment is fixed.
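As a quick check on the quadratic growth mentioned above, assuming every unordered pair of mentions is a candidate comparison, a document with n mentions has:

```latex
\[
\binom{n}{2} = \frac{n(n-1)}{2}
\]
```

pairs. For example, 10 mentions give 45 pairs and 20 mentions give 190, so doubling the number of mentions roughly quadruples the number of comparisons.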

We set up the following experiment to analyze the performance with varying numbers of mentions. We sort all the documents in the test set according to their number of mentions, and perform evaluation on the top X% of this list (where X is 10, 33, 40, 50). As the results demonstrate in Figure 2, the improvement of the static linking approach stays fairly even as X is varied. Even though the experiments suggest that the larger documents are tougher to coreference,³ dynamic linking provides higher improvements when the documents contain a larger number of mentions.

³ i.e., the absolute values are lower for these splits. The baseline system obtains 83.08, 79.29, 79.64, and 79.77 respectively for X = 10, 33, 40, 50.

Page 42: Dynamic Knowledge-Base Alignment for Coreference Resolution

Conclusion

• Coreference resolution systems benefit greatly from inclusion of global context.

• Linking mentions to a knowledge base provides this context.

• Maintaining a ranked list of entities outperforms previous fixed alignment approaches.