Unsupervised Learning of Narrative Event Chains
Original paper by: Nate Chambers and Dan Jurafsky
in ACL 2008

This presentation for discussion created by:
Peter Clark (Jan 2009)

Disclaimer: these slides are by Peter Clark, not the original authors, and thus represent a (possibly flawed) interpretation of the original work!
Why Scripts?
Essential for making sense of text: we typically match a narrative against expected “scripts” to make sense of what’s happening, to fill in the gaps, and to fill in goals and purpose.
On November 26, the Japanese attack fleet of 33 warships and auxiliary craft, including 6 aircraft carriers, sailed from northern Japan for the Hawaiian Islands. It followed a route that took it far to the north of the normal shipping lanes. By early morning, December 7, 1941, the ships had reached their launch position, 230 miles north of Oahu.
depart → travel → arrive
Scripts: important, even essential, for NLP. But: expensive to build by hand.
Can we learn them from text?
“John entered the restaurant. He sat down, and ordered a meal. He ate…”
enter
sit
order
eat
?
Our own (brief) attempt: look at the next events in a 1 GB corpus:
“shoot” is followed by: (“say” 121) (“be” 110) (“shoot” 103) (“wound” 58) (“kill” 30) (“die” 27) (“have” 23) (“tell” 23) (“fire” 15) (“refuse” 15) (“go” 13) (“think” 13) (“carry” 12) (“take” 12) (“come” 11) (“help” 10) (“run” 10) (“be arrested” 9) (“find” 9)
“drive” is followed by: (“drive” 364) (“be” 354) (“say” 343) (“have” 71) (“continue” 47) (“see” 40) (“take” 32) (“make” 29) (“expect” 27) (“go” 24) (“show” 22) (“try” 19) (“tell” 18) (“think” 18) (“allow” 16) (“want” 15) (“come” 13) (“look” 13) (“close” 12)
“fly” is followed by: (“fly” 362) (“say” 223) (“be” 179) (“have” 60) (“expect” 48) (“allow” 40) (“tell” 33) (“see” 30) (“go” 27) (“take” 27) (“make” 26) (“plan” 24) (“derive” 21) (“want” 19) (“schedule” 17) (“report” 16) (“declare” 15) (“give” 15) (“leave on” 15)
Some glimmers of hope, but not great…
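The naive next-event counting above can be sketched as a simple tally over per-document verb sequences (the corpus format, the tiny example data, and the helper name are hypothetical simplifications of the 1 GB experiment):

```python
from collections import Counter

def next_event_counts(docs, verb):
    """Count which verb immediately follows `verb` in each document's
    verb sequence. This baseline ignores arguments entirely, which is
    why generic verbs like "say" and "be" dominate the real counts."""
    counts = Counter()
    for verbs in docs:
        for v, nxt in zip(verbs, verbs[1:]):
            if v == verb:
                counts[nxt] += 1
    return counts.most_common()

# Hypothetical miniature corpus: one verb sequence per document.
docs = [["shoot", "wound", "say"], ["shoot", "say", "die"], ["drive", "say"]]
result = next_event_counts(docs, "shoot")
```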
Andrew Gordon (2007)
From [email protected] Thu Sep 27 09:33:04 2007
…Recently I tried to apply language modeling techniques over event sequences in a billion words of narrative text extracted from Internet weblogs, and barely exceeded chance performance on some event-ordering evaluations….
Chambers and Jurafsky Main insight:
Don’t look at all verbs, just look at those mentioning the “key player” – the protagonist – in the sequence
Capture some role relationships also: Not just “push” → “fall”, but “push X” → “X fall”
“An automatically learned Prosecution Chain. Arrows indicate the before relation.”
Approach
Stage 1: find the likelihood that one event+protagonist goes with another (or more) event+protagonist. NOTE: no ordering info.
e.g., given “X pleaded”, what other event+protagonist pairs occur with unusually high frequency?
→ “sentenced X”, “fined X”, “fired X”
Stage 2: order the set of event+protagonist
The Training Data
Articles in the GigaWord corpus. For each article:
find all pairs of events (verbs) which have a shared argument
shared argument found by OpenNLP coreference; includes transitivity (X = Y, Y = Z → X = Z)
add each pair to the database
“John entered the restaurant. The waiter came over. John sat down, and the waiter greeted him….”

events about John: {X enter, X sat, greet X}
events about the waiter: {X come, X greet}

pairs: (X enter, X sat), (X enter, greet X), (X sat, greet X), (X come, X greet)

→ database of pairs co-occurring in the article
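The pairing step can be sketched in a few lines (helper names are hypothetical; real extraction would need a parser and OpenNLP coreference to find the shared arguments in the first place):

```python
from collections import Counter
from itertools import combinations

def event_pairs(events):
    """All unordered pairs of the events sharing one protagonist in an
    article. Only the pairing step is modeled; coreference is not."""
    return list(combinations(events, 2))

# Events from the restaurant example, grouped by shared argument.
john = ["X enter", "X sat", "greet X"]
waiter = ["X come", "X greet"]

database = Counter()  # the database of co-occurring pairs
for chain in (john, waiter):
    database.update(event_pairs(chain))
```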
Stage 1
Given two events with a shared protagonist, do they occur “unusually often” in a corpus?
probability of seeing “push” and “fall” with particular coreferring arguments (“push X” & “X fall”):

= (number of times “push” and “fall” have been seen with these coreferring arguments)
  / (number of times any pair of verbs have been seen with any coreferring arguments)

more generally:

Prob(“X event1” AND “X event2”) = Number(“X event1” AND “X event2”) / Σij Number(“X eventi” AND “X eventj”)
PMI (“surprisingness”):

PMI(“X event1”, “X event2”) = log [ Prob(“X event1” AND “X event2”) / ( Prob(“X event1”) × Prob(“X event2”) ) ]

= the “surprisingness” that the args of event1 and event2 are coreferential
Can generalize: PMI: given an event (+ arg), how “unusual” is it to see another event (+ same arg)?
Generalization: given N events (+ arg), how “unusual” is it to see another event (+ same arg)?
Thus: choose as the next event the one with the highest PMI summed over the events already in the set.
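A minimal sketch of the PMI computation and the generalized next-event choice, assuming the pair database is a dict mapping unordered event pairs to counts (all names are hypothetical, and the marginals are roughly approximated from the pair counts, for illustration only):

```python
import math

def pmi(pair_counts, e1, e2):
    """Pointwise mutual information between two event+protagonist slots.
    Joint, marginals, and normalizer are all estimated from the pair
    counts -- a rough approximation, not the paper's exact estimator."""
    total = sum(pair_counts.values())
    joint = pair_counts.get((e1, e2), 0) + pair_counts.get((e2, e1), 0)
    if joint == 0:
        return float("-inf")
    def marginal(a):
        return sum(n for (x, y), n in pair_counts.items() if a in (x, y))
    return math.log((joint / total) /
                    ((marginal(e1) / total) * (marginal(e2) / total)))

def next_event(pair_counts, chain, candidates):
    """Generalization: pick the candidate with the highest PMI summed
    over all events already in the chain."""
    return max(candidates,
               key=lambda f: sum(pmi(pair_counts, e, f) for e in chain))
```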
Evaluation: Cloze test
Fill in the blank…
McCann threw two interceptions early. Toledo pulled McCann aside and told him he’d start. McCann quickly completed his first two passes.
observed events (shared protagonist): {X throw, pull X, tell X, X start, X complete}
cloze instance: {?, pull X, tell X, X start, X complete}
(note: a set, not a list)
Cloze task: predict “?”
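The cloze metric can be illustrated with a hypothetical helper that reports the 1-based rank of the held-out event in the system's ranked guess list (the guess list here is invented for illustration; the evaluation averages this rank over all test instances):

```python
def cloze_rank(ranked_guesses, held_out):
    """1-based position of the held-out event in the system's ranked
    guesses; lower is better."""
    return ranked_guesses.index(held_out) + 1

guesses = ["sentenced X", "fined X", "X throw", "fired X"]
rank = cloze_rank(guesses, "X throw")  # 3
```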
Results:
69 articles, with ≥5 protagonist+event pairs in them. The system produces ~9000 guesses at each “?”.
Learning temporal ordering

Stage 1: add labels to the corpus
Given: verb features (neighboring POS tags, neighboring auxiliaries and modals, WordNet synsets, etc.)
Assign: tense, grammatical aspect, aspectual class [Aside: couldn’t a parser assign this directly?]
Using: SVM, trained on labeled data (TimeBank corpus)
Stage 2: learn a before() classifier
Given: 2 events in a document sharing an argument
Assign: the before() relation
Using: SVM, trained on labeled data (TimeBank expanded with the transitivity rule “X before Y and Y before Z → X before Z”)
A variety of features used, including whether e1 grammatically occurs before e2 in the text
Learning temporal ordering (cont)
Stage 3: For all event pairs with shared arg in the main corpus
e.g., “push X”, “X fall”
count the number of before(e1,e2) vs. before(e2,e1) classifications, to get an overall ordering confidence
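Stage 3 can be sketched as follows, assuming the before() classifier's per-occurrence decisions are available as ordered pairs (names are hypothetical; the confidence uses the log of the count margin, as in the evaluation's Confidence formula):

```python
import math
from collections import Counter

def ordering_confidence(decisions, e1, e2):
    """Aggregate per-occurrence before() decisions for one event pair.
    `decisions` is a list of ordered pairs, e.g. ("push X", "X fall")
    meaning the classifier said push-before-fall for that occurrence.
    Returns the preferred order and log(#before(e1,e2) - #before(e2,e1))."""
    counts = Counter(decisions)
    fwd, rev = counts[(e1, e2)], counts[(e2, e1)]
    if fwd == rev:
        return None, 0.0  # no preferred order
    first, second = (e1, e2) if fwd > rev else (e2, e1)
    return (first, second), math.log(abs(fwd - rev))

decisions = [("push X", "X fall")] * 8 + [("X fall", "push X")] * 2
order, conf = ordering_confidence(decisions, "push X", "X fall")
```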
Evaluation
Test set: the same 69 documents, minus 6 which had no ordered events
Task: for each document:
a. manually label the before() relations
b. generate a random ordering
Can the system distinguish the real order from a random one?
“Coherence” ≈ sum of confidences of before() labels on all event pairs in document
Confidence(e1→e2) = log(#before(e1,e2) − #before(e2,e1))
[Results table: accuracy by # event+shared arg in doc]
Not that impressive (?)
Agglomeration and scripts
How do we get scripts?
Could take a verb+arg, e.g., “arrest X”
Then look for the most likely 2nd verb+arg, e.g., “charge X”
Then the next most likely verb+arg, given these 2, e.g., “indict X”
etc.
Then: use ordering algorithm to produce ordering
{arrest X}
↓
{arrest X, charge X}
↓
{arrest X, charge X, indict X}
↓ …
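The agglomeration loop can be sketched greedily (all names and the toy counts are hypothetical; a simple count-based score stands in for the summed-PMI criterion the slides describe):

```python
def score(pair_counts, a, b):
    """Stand-in association score; the actual criterion is summed PMI."""
    return pair_counts.get((a, b), 0) + pair_counts.get((b, a), 0)

def grow_script(pair_counts, seed, candidates, max_size=10, cutoff=0):
    """Greedy agglomeration: repeatedly add the candidate associating
    most strongly with the chain so far, stopping after max_size events
    or when the best score drops below the cutoff, whichever is first."""
    chain = [seed]
    pool = [c for c in candidates if c != seed]
    while len(chain) < max_size and pool:
        best = max(pool,
                   key=lambda f: sum(score(pair_counts, e, f) for e in chain))
        if sum(score(pair_counts, e, best) for e in chain) < cutoff:
            break
        chain.append(best)
        pool.remove(best)
    return chain

pair_counts = {("arrest X", "charge X"): 5, ("arrest X", "indict X"): 3,
               ("charge X", "indict X"): 4, ("arrest X", "X eat"): 1}
script = grow_script(pair_counts, "arrest X",
                     ["charge X", "indict X", "X eat"], max_size=3, cutoff=1)
```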
“Good” examples…
“Prosecution” (this was the initial seed; agglomeration was stopped arbitrarily after 10 events, or when a cutoff for node inclusion was reached, whichever came first).
Good examples…
“Employment”(dotted lines are incorrect “before” relations)
Nate Chambers’ suggested mode of use:
Given a set of events in a news article
Predict/fill in the missing events
→ Do we really need scripts?
Many ways of referring to the same entity…
Less common style:
John went to a restaurant. John sat down. John ate. He paid…

More common style:
Nagumo's fleet assembled in the remote anchorage of Tankan Bay in the Kurile Islands and departed in strictest secrecy for Hawaii on 26 November 1941. The ships' route crossed the North Pacific and avoided normal shipping lanes. At dawn 7 December 1941, the Japanese task force had approached undetected to a point slightly more than 200 miles north of Oahu.
Generally, there are a lot of entities doing a lot of things!
From [email protected] Tue Dec 16 12:48:58 2008
…Even with the protagonist idea, it is still difficult to name the protagonist himself as many different terms are used. Naming the other non-protagonist roles is even more sparse. I'm experiencing the same difficulties. My personal thought is that we should not aim to fill the role with one term, but a set of weighted terms. This may be a set of related nouns, or even a set of unrelated nouns with their own preference weights.
Also: many ways of describing the same event!
Different levels of detail, different viewpoints:
The planes destroyed the ships
The planes dropped bombs, which destroyed the ships
The bombs exploded, destroying the ships
The Japanese destroyed the ships

Different granularities:
Planes attacked
Two waves of planes attacked
353 dive-bombers and torpedo planes attacked
Summary
Exciting work! The simple but brilliant insight of the “protagonist”.
But it is really only a first step towards scripts:
mainly learns verb+arg co-associations in a text
temporal ordering and agglomeration are post-processing steps
quality of the learned results is still questionable
Cloze: needs >1000 guesses before hitting a mentioned, co-associated verb+arg
nice “Prosecution” script: perhaps a special case, as most verbs in the script are necessarily specific to prosecution?
fluidity of language use (multiple ways of viewing the same scene, multiple ways of referring to the same entity) is still a challenge
maybe we don’t need to reify scripts (?): fill in missing (implied) events on the fly, in a context-sensitive way