Retrieving Correct Semantic Boundaries in Dependency Structure
[Slide 1]
Retrieving Correct Semantic Boundaries in Dependency Structure
Jinho D. Choi (University of Colorado at Boulder), Martha Palmer (University of Colorado at Boulder)
The 4th Linguistic Annotation Workshop at ACL'10, July 15th, 2010
[Slide 2]
Dependency Structure for SRL
• What is dependency?
- Syntactic or semantic relation between a pair of words.
• Why dependency structure for semantic role labeling?
- Dependency relations often correlate with semantic roles.
- Simpler structure
[Diagram: "places in this city" (LOC; NMOD, PMOD arcs) and "events … year" (TMP)]
→ faster annotation → more gold-standard data
→ faster parsing → more applications
Dependency parsing (Choi) vs. phrase-structure parsing (Charniak): 0.0025 vs. 0.5 sec
[Slide 3]
Phrase vs. Dependency Structure
• Constituent vs. dependency
[Diagram: constituent tree vs. dependency tree for "The results appear in today 's news" (SBJ, LOC, NMOD, PMOD arcs)]
10/15 (66.67%) of the parsing papers at ACL'10 are on dependency parsing.
[Slide 4]
PropBank in Phrase Structure
• A corpus annotated with verbal propositions and arguments.
• Arguments are annotated on phrases.
[Diagram: ARG0 and ARGM-LOC annotated on phrases]
But there is no phrase in dependency structure.
[Slide 5]
PropBank in Dependency Structure
• Arguments are annotated on head words instead.
[Diagram: "The results appear in today 's news" with ROOT, SBJ, LOC, NMOD, PMOD arcs; ARG0 on the head word "results", ARGM-LOC on the head word "in"]
Phrase = subtree of the head word
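Since a phrase corresponds to the subtree of its head word, boundary retrieval can be sketched as collecting all descendants of the head. This is a minimal sketch, not the authors' code; the token ids and head map below are assumptions encoding the slide's example.

```python
def subtree(heads, root_id):
    """Return the sorted ids of root_id and all of its descendants."""
    ids = {root_id}
    frontier = [root_id]
    while frontier:
        node = frontier.pop()
        for child, head in heads.items():
            if head == node and child not in ids:
                ids.add(child)
                frontier.append(child)
    return sorted(ids)

# "The results appear in today 's news"
# token id -> head id (0 = root); ids are assumptions for illustration
heads = {1: 2, 2: 3, 3: 0, 4: 3, 5: 6, 6: 7, 7: 4}
tokens = {1: "The", 2: "results", 3: "appear", 4: "in",
          5: "today", 6: "'s", 7: "news"}

# ARG0 head = "results" -> phrase "The results"
print(" ".join(tokens[i] for i in subtree(heads, 2)))   # The results
# ARGM-LOC head = "in" -> phrase "in today 's news"
print(" ".join(tokens[i] for i in subtree(heads, 4)))   # in today 's news
```

The phrase for each argument is thus recovered purely from the dependency tree, with no constituent structure needed.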
[Slide 6]
PropBank in Dependency Structure
• Phrase ≠ Subtree of head word.
[Diagram: "The plant owned by Mark" with NMOD, NMOD, LGS, PMOD arcs; ARG1 on the head word "plant"]
The subtree of the head word includes the predicate.
[Slide 7]
Tasks
- Convert phrase structure (PS) to dependency structure (DS).
- Find correct head words in DS.
- Retrieve correct semantic boundaries from DS.
• Conversion
- Pennconverter, by Richard Johansson
• Used for CoNLL 2007 - 2009.
- Penn Treebank (Wall Street Journal)
• 49,208 trees were converted.
• 292,073 Propbank arguments exist.
[Slide 8]
System Overview
[Diagram: Penn Treebank + PropBank → Pennconverter → dependency trees; heuristics produce head words and then the set of chunks (phrases), which connect to an automatic SRL system]
[Slide 9]
Finding correct head words
• Get the word-set Sp of each argument in PS.
• For each word in Sp, find the word wmax with the maximum subtree in DS.
• Add wmax to the head-list Sd.
• Remove the subtree of wmax from Sp.
• Repeat the search until Sp becomes empty.
[Example: "Yields on mutual funds continued to slide" (ROOT, SBJ, NMOD, PMOD, OPRD, IM arcs); Sp = {Yields, on, mutual, funds, to, slide} → Sd = [Yields, to]]
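The search above can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation; the token ids, head map, and the tie-free greedy choice of wmax are assumptions.

```python
def subtree(heads, root_id):
    """Ids of root_id and all of its descendants in the dependency tree."""
    ids = {root_id}
    frontier = [root_id]
    while frontier:
        node = frontier.pop()
        for child, head in heads.items():
            if head == node and child not in ids:
                ids.add(child)
                frontier.append(child)
    return ids

def find_head_words(heads, sp):
    """Greedily pick the word whose subtree covers the most of Sp."""
    sd = []
    sp = set(sp)                     # copy: Sp shrinks as subtrees are removed
    while sp:
        wmax = max(sp, key=lambda w: len(subtree(heads, w) & sp))
        sd.append(wmax)
        sp -= subtree(heads, wmax)
    return sd

# "Yields on mutual funds continued to slide" (ids are assumptions)
heads = {1: 5, 2: 1, 3: 4, 4: 2, 5: 0, 6: 5, 7: 6}
tokens = {1: "Yields", 2: "on", 3: "mutual", 4: "funds",
          5: "continued", 6: "to", 7: "slide"}
sp = {1, 2, 3, 4, 6, 7}              # argument words; "continued" is the predicate
print([tokens[i] for i in find_head_words(heads, sp)])   # ['Yields', 'to']
```

Two head words suffice here because the subtree of "Yields" covers {Yields, on, mutual, funds} and the subtree of "to" covers {to, slide}.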
[Slide 10]
Retrieving correct semantic boundaries
• Retrieving the subtrees of head-words
- 100% recall, 92.51% precision, 96.11% F1-score.
- What does this mean?
• The state-of-the-art SRL system using DS performs at about 86%.
• If your application requires actual argument phrases instead of head-words, the performance becomes lower than 86%.
• Improve the precision by applying heuristics on:
- Modals, negations
- Verb chain, relative clauses
- Gerunds, past-participles
[Slide 11]
Verb Predicates whose Semantic Arguments are their Syntactic Heads
• Semantic arguments of verb predicates can be the syntactic heads of the verbs.
• General solution
- For each head word, retrieve the subtree of the head word excluding the subtree of the verb predicate.
[Diagram: "The plant owned by Mark" with NMOD, NMOD, LGS, PMOD arcs]
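The general solution above can be sketched as subtracting the predicate's subtree from the head word's subtree. This is a sketch under assumed token ids, not the paper's code.

```python
def subtree(heads, root_id):
    """Ids of root_id and all of its descendants in the dependency tree."""
    ids = {root_id}
    frontier = [root_id]
    while frontier:
        node = frontier.pop()
        for child, head in heads.items():
            if head == node and child not in ids:
                ids.add(child)
                frontier.append(child)
    return ids

def argument_span(heads, head_id, pred_id):
    """Subtree of the head word, excluding the predicate's own subtree."""
    return sorted(subtree(heads, head_id) - subtree(heads, pred_id))

# "The plant owned by Mark" -- "plant" heads the fragment (ids are assumptions)
heads = {1: 2, 2: 0, 3: 2, 4: 3, 5: 4}
tokens = {1: "The", 2: "plant", 3: "owned", 4: "by", 5: "Mark"}

# ARG1 of "owned" is headed by "plant" but must not include "owned by Mark"
print(" ".join(tokens[i] for i in argument_span(heads, 2, 3)))   # The plant
```

Without the subtraction, the subtree of "plant" would wrongly include the predicate and its LGS phrase.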
[Slide 12]
Examples
• Modals are the heads of the main verbs in DS.
• Conjunctions
• Past-participles
[Diagrams: "He may or may not read the book" (modal "may" heads "read"; COORD, CONJ arcs for "or … not"); "people who meet or exceed the expectation" (conjoined verbs in a relative clause; DEP, NMOD, OBJ arcs); "correspondence mailed about incomplete 8300s" (past participle modifying its syntactic head; NMOD, PMOD arcs)]
[Slide 13]
Evaluations
• Models
- Model I : retrieving all words in the subtrees (baseline).
- Model II : using all heuristics.
- Model III : II + excluding punctuation.
• Measurements
- Accuracy : exact match
- Precision
- Recall
- F1-score
[Slide 14]
Evaluations
• Results
- Baseline : 88.00% accuracy, 92.51% precision, 100% recall, 96.11% F1-score.
- Final model : 98.20% accuracy, 99.14% precision, 99.95% recall, 99.54% F1-score.
• Statistically significant (t = 149, p < .0001)
[Chart: accuracy, precision, recall, and F1 for Models I, II, III (y-axis 88–100%)]
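The reported F1-scores can be checked against the precision/recall pairs with the standard formula F1 = 2PR / (P + R); this is a quick arithmetic check, not code from the slides.

```python
def f1(p, r):
    """Harmonic mean of precision and recall, in percentage points."""
    return 2 * p * r / (p + r)

print(round(f1(92.51, 100.0), 2))   # baseline    -> 96.11
print(round(f1(99.14, 99.95), 2))   # final model -> 99.54
```

Both reported F1 values follow from their precision/recall pairs.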
![Page 15: Retrieving Correct Semantic Boundaries in Dependency Structure](https://reader034.fdocuments.net/reader034/viewer/2022042623/546ce6aeb4af9f702c8b5207/html5/thumbnails/15.jpg)
• Overlapping arguments
Error Analysis
15
ARG1 ARGM-LOC
inshare
LOC
burdens the region
NMODPMOD
OBJ
ARG1
inshare burdens the region
NMODPMOD
OBJLOC
[Slide 16]
Error Analysis
• PP attachment
[Diagrams: "the investors showed enthusiasm for stocks" with "for" attached to the verb (ADV) vs. to "enthusiasm" (NMOD); the attachment changes the ARG1 boundary]
[Slide 17]
Conclusion
- Find correct head words (min-set with max-coverage).
- Find correct semantic boundaries (99.54% F1-score).
- Suggest ways of reconstructing dependency structure so that it can fit better with semantic roles.
- Can be used to fix some of the inconsistencies in both Treebank and Propbank annotations.
• Future work
- Apply to different corpora.
- Find ways of automatically adding empty categories.
[Slide 18]
Acknowledgements
• Special thanks are due to Professor Joakim Nivre of Uppsala University (Sweden) for his helpful insights.
• National Science Foundation: CISE-CRI-0551615, Towards a Comprehensive Linguistic Annotation; CISE-CRI-0709167, Collaborative: A Multi-Representational and Multi-Layered Treebank for Hindi/Urdu.
• Defense Advanced Research Projects Agency (DARPA/IPTO) under the GALE program, DARPA/CMO Contract No. HR0011-06-C-0022, subcontract from BBN, Inc.