The PASCAL Recognizing Textual Entailment Challenges - RTE-1,2,3. Ido Dagan, Bar-Ilan University, ...
The PASCAL Recognizing Textual Entailment Challenges - RTE-1,2,3
Ido Dagan, Bar-Ilan University, Israel, with …
Recognizing Textual Entailment
PASCAL NOE Challenge, 2004-5
Ido Dagan, Oren Glickman (Bar-Ilan University, Israel), Bernardo Magnini (ITC-irst, Trento, Italy)
The Second PASCAL Recognising Textual Entailment Challenge
Roy Bar-Haim, Ido Dagan, Bill Dolan, Lisa Ferro, Danilo Giampiccolo, Bernardo Magnini, Idan Szpektor
Bar-Ilan, CELCT, ITC-irst, Microsoft Research, MITRE
The Third Recognising Textual Entailment Challenge
Danilo Giampiccolo (CELCT) and Bernardo Magnini (FBK-ITC)
With Ido Dagan (Bar-Ilan) and Bill Dolan (Microsoft Research); Patrick Pantel (USC-ISI), for the Resources Pool
Hoa Dang and Ellen Voorhees (NIST), for the Extended Task
RTE Motivation
• Text applications require semantic inference
• A common framework for addressing applied inference as a whole is needed, but still missing
  – Global inference is typically application dependent
  – Application-independent approaches and resources exist for some semantic sub-problems
• Textual entailment may provide such a common, application-independent semantic framework
Framework Desiderata
A framework for modeling a target level of language processing should provide:
1) A generic module for applications
   – A common underlying task and a unified interface (cf. parsing)
2) A unified paradigm for investigating sub-phenomena
Outline
• The textual entailment task – what and why?
• Evaluation dataset & methodology
• Participating systems and approaches
• Potential for machine learning
• Framework for investigating semantics
Natural Language and Meaning
[Diagram: the mapping between Meaning and Language, exhibiting Ambiguity and Variability]
Variability of Semantic Expression
Model variability as relations between text expressions:
• Equivalence: text1 ⇔ text2 (paraphrasing)
• Entailment: text1 ⇒ text2 – the general case
Dow ends up
Dow climbs 255
The Dow Jones Industrial Average closed up 255
Stock market hits a record high
Dow gains 255 points
Typical Application Inference
Overture’s acquisition by Yahoo
Yahoo bought Overture
Question: Who bought Overture?  Expected answer form: X bought Overture
• Similar for IE: X buy Y
• “Semantic” IR: t: Overture was bought …
• Summarization (multi-document) – identify redundant info
• MT evaluation (and recent ideas for MT)
• Educational applications, …
(text ⇒ hypothesized answer)
KRAQ'05 Workshop - KNOWLEDGE and REASONING for ANSWERING QUESTIONS
(IJCAI-05)
CFP:
– Reasoning aspects: information fusion; search criteria expansion models; summarization and intensional answers; reasoning under uncertainty or with incomplete knowledge
– Knowledge representation and integration: levels of knowledge involved (e.g. ontologies, domain knowledge); knowledge extraction models and techniques to optimize response accuracy
… but similar needs for other applications – can entailment provide a common empirical task?
Classical Entailment Definition
• Chierchia & McConnell-Ginet (2001): A text t entails a hypothesis h if h is true in every circumstance (possible world) in which t is true
• Strict entailment - doesn't account for some uncertainty allowed in applications
“Almost certain” Entailments
t: The technological triumph known as GPS … was incubated in the mind of Ivan Getting.
h: Ivan Getting invented the GPS.
Applied Textual Entailment
• Directional relation between two text fragments: Text (t) and Hypothesis (h):
  t entails h (t ⇒ h) if humans reading t will infer that h is most likely true
• Operational (applied) definition:
  – Human gold standard, as in NLP applications
  – Assuming common background knowledge, which is indeed expected from applications
Evaluation Dataset
Generic Dataset by Application Use
• 7 application settings in RTE-1, 4 in RTE-2/3
  – QA
  – IE
  – "Semantic" IR
  – Comparable documents / multi-doc summarization
  – MT evaluation
  – Reading comprehension
  – Paraphrase acquisition
• Most data created from actual application output
• ~800 examples in development and test sets
• 50-50% YES/NO split
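The RTE datasets were distributed as XML files of text-hypothesis pairs. A minimal loading sketch, assuming an RTE-2-style layout (`<pair id entailment task><t>…</t><h>…</h></pair>`); the attribute names varied slightly across campaigns, so treat this as illustrative:

```python
import xml.etree.ElementTree as ET

def load_rte_pairs(xml_text):
    """Parse RTE-style t/h pairs from an XML string (assumed RTE-2 layout)."""
    root = ET.fromstring(xml_text)
    pairs = []
    for p in root.iter("pair"):
        pairs.append({
            "id": p.get("id"),
            "task": p.get("task"),
            "entailment": p.get("entailment"),
            "t": p.findtext("t").strip(),
            "h": p.findtext("h").strip(),
        })
    return pairs

sample = """<entailment-corpus>
  <pair id="2" entailment="YES" task="IR">
    <t>Google files for its long awaited IPO.</t>
    <h>Google goes public.</h>
  </pair>
</entailment-corpus>"""

pairs = load_rte_pairs(sample)
```

The sample pair is taken from the examples slide; a real dataset file would contain hundreds of such pairs.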
Some Examples
1. T: Reagan attended a ceremony in Washington to commemorate the landings in Normandy.
   H: Washington is located in Normandy.
   Task: IE – Entailment: False

2. T: Google files for its long awaited IPO.
   H: Google goes public.
   Task: IR – Entailment: True

3. T: …: a shootout at the Guadalajara airport in May, 1993, that killed Cardinal Juan Jesus Posadas Ocampo and six others.
   H: Cardinal Juan Jesus Posadas Ocampo died in 1993.
   Task: QA – Entailment: True

4. T: The SPD got just 21.5% of the vote in the European Parliament elections, while the conservative opposition parties polled 44.5%.
   H: The SPD is defeated by the opposition parties.
   Task: IE – Entailment: True
Final Dataset (RTE-2)
• Average pairwise inter-judge agreement: 89.2%
  – Average Kappa 0.78 – substantial agreement
  – Better than RTE-1
• Removed 18.2% of pairs due to disagreement (3-4 judges)
• Disagreement example:
  – (t) Women are under-represented at all political levels ...
    (h) Women are poorly represented in parliament.
• Additional review removed 25.5% of pairs (too difficult / vague / redundant)
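The agreement figures above relate observed agreement (89.2%) to chance-corrected agreement (Kappa 0.78). A minimal sketch of Cohen's kappa for two annotators' YES/NO judgments; the judgment lists below are illustrative, not RTE data:

```python
def cohens_kappa(a, b):
    """Cohen's kappa: chance-corrected agreement between two annotators."""
    assert len(a) == len(b) and a
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement, computed from each annotator's label marginals
    labels = set(a) | set(b)
    expected = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    return (observed - expected) / (1 - expected)

judge1 = ["YES", "YES", "NO", "NO"]
judge2 = ["YES", "NO", "NO", "NO"]
kappa = cohens_kappa(judge1, judge2)
```

Here observed agreement is 0.75 but kappa is only 0.5, since both judges answer NO so often that much of the agreement is expected by chance.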
Final Dataset (RTE-3)
• Each pair judged by three annotators
• Pairs on which the annotators disagreed were filtered out
• Average pairwise annotator agreement: 87.8% (Kappa 0.75)
• Filtered-out pairs:
  – 19.2% due to disagreement
  – 9.4% as controversial, too difficult, or too similar to other pairs
Progress from 1 to 3
• More realistic application data:
  – RTE-1: some partly synthetic examples
  – RTE-2&3 mostly:
    • Input from common benchmarks for the different applications
    • Output from real systems
  – Test entailment potential across applications
• Text length:
  – RTE-1&2: one-two sentences
  – RTE-3: 25% full paragraphs, requiring discourse modeling/anaphora
• Improved data collection and annotation:
  – Revised and expanded guidelines
  – Most pairs triply annotated, some across organizer sites
• Linguistic pre-processing provided; RTE Resources Pool
• RTE-3 pilot task by NIST: 3-way judgments; explanations
Suggested Perspective
Re the Arthur Bernstein competition:
"… Competition, even a piano competition, is legitimate … as long as it is just an anecdotal side effect of the musical culture scene, and doesn't threaten to overtake the center stage"
Haaretz (Israeli newspaper), Culture Section, April 1st, 2005
Participating Systems
Participation
• Popular challenges, worldwide:
  – RTE-1 – 17 groups
  – RTE-2 – 23 groups
  – RTE-3 – 26 groups
• 14 Europe, 12 US
• 11 newcomers (~40 groups so far)
• 79 dev-set downloads (44 planned, 26 maybe)
• 42 test-set downloads
• Joint ACL-07/PASCAL workshop (~70 participants)
Methods and Approaches
• Estimate similarity match between t and h (coverage of h by t):
  – Lexical overlap (unigram, N-gram, subsequence)
  – Lexical substitution (WordNet, statistical)
  – Lexical-syntactic variations ("paraphrases")
  – Syntactic matching / edit distance / transformations
  – Semantic role labeling and matching
  – Global similarity parameters (e.g. negation, modality)
  – Anaphora resolution
• Probabilistic tree transformations
• Cross-pair similarity
• Detect mismatch (for non-entailment)
• Logical interpretation and inference
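The simplest of these similarity approaches, unigram lexical overlap, can be sketched as the fraction of hypothesis tokens covered by the text. The decision threshold here is illustrative, not taken from any participating system:

```python
def lexical_coverage(t, h):
    """Fraction of hypothesis tokens that also appear in the text."""
    t_tokens = set(t.lower().split())
    h_tokens = h.lower().split()
    if not h_tokens:
        return 0.0
    return sum(tok in t_tokens for tok in h_tokens) / len(h_tokens)

def predict(t, h, threshold=0.75):
    """Naive entailment decision: YES when coverage exceeds the threshold."""
    return "YES" if lexical_coverage(t, h) >= threshold else "NO"
```

On the Dow examples from the variability slide, "Dow gains 255 points" fully covers "Dow gains 255", while "Stock market hits a record high" covers none of it; this also shows why pure overlap misses true entailments that use different words.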
Dominant approach: Supervised Learning
• Features model various aspects of similarity and mismatch
• A classifier determines the relative weights of the information sources
• Train on the development set and auxiliary t-h corpora

Pipeline: (t, h) → similarity features (lexical, n-gram, syntactic, semantic, global) → feature vector → classifier → YES/NO
Parse-based Proof Systems
[Figure: a sequence of dependency parse trees illustrating an entailment proof: "It rained when John and Mary left" ⇒ "It rained when Mary left" ⇒ "Mary left"; each step applies an entailment-preserving tree transformation]
(Bar-Haim et al., RTE-3)
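A toy sketch of the kind of entailment-preserving tree rewrite such a proof system applies: going from "It rained when John and Mary left" toward "It rained when Mary left" by dropping the first conjunct of a conjoined subject. The tuple encoding of trees and the rule itself are illustrative, not the actual machinery of Bar-Haim et al.:

```python
# A dependency tree as (word, [(relation, subtree), ...]) -- toy encoding.
SENT = ("rain", [
    ("wha", ("when", [
        ("i", ("leave", [
            ("subj", ("John", [("conj", ("Mary", []))])),
        ])),
    ])),
])

def drop_first_conjunct(tree):
    """Rewrite subjects of the form 'X and Y' to just 'Y' (entailment-preserving)."""
    word, children = tree
    new_children = []
    for rel, child in children:
        _, grandchildren = child
        conjuncts = [c for r, c in grandchildren if r == "conj"]
        if rel == "subj" and conjuncts:
            # Replace the conjoined subject by its second conjunct
            new_children.append((rel, drop_first_conjunct(conjuncts[0])))
        else:
            new_children.append((rel, drop_first_conjunct(child)))
    return (word, new_children)

def words(tree):
    """Flatten a tree back to its word list (pre-order)."""
    w, children = tree
    return [w] + [x for _, c in children for x in words(c)]

reduced = drop_first_conjunct(SENT)
```

A second rule that promotes the subordinate clause to the root would complete the chain down to "Mary left".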
Resources
• WordNet, Extended WordNet, distributional similarity
  – Britain → UK
  – steal → take
• DIRT (paraphrase rules)
  – X file a lawsuit against Y → X accuse Y (world knowledge)
  – X confirm Y → X approve Y (linguistic knowledge)
• FrameNet, PropBank, VerbNet – for semantic role labeling
• Entailment pairs corpora – automatically acquired training
• No dedicated resources for entailment yet
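DIRT-style rules are binary templates with argument slots X and Y. A regex sketch of applying such rules to generate entailed variants of a sentence; the surface rule strings and example sentences are illustrative (DIRT itself matches dependency paths, not surface strings):

```python
import re

# Hypothetical surface renderings of DIRT-style templates (X, Y = slots).
RULES = [
    (r"(?P<x>\w+) filed a lawsuit against (?P<y>\w+)", r"\g<x> accused \g<y>"),
    (r"(?P<x>\w+) confirmed (?P<y>\w+)", r"\g<x> approved \g<y>"),
]

def apply_rules(sentence):
    """Return every single-rule rewrite of the sentence."""
    rewrites = []
    for lhs, rhs in RULES:
        new, n = re.subn(lhs, rhs, sentence)
        if n:
            rewrites.append(new)
    return rewrites
```

Each rewrite is a candidate entailment of the original, to be scored or verified downstream.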
Accuracy Results – RTE-1
[Bar chart: per-system accuracy on RTE-1, y-axis 0.4-0.6. Participating groups include MITRE, Bar-Ilan, UNED, Dublin, Edinburgh-Dublin, Stanford, UIUC, IRST, Edinburgh-Amsterdam, LCC and Amsterdam; bars are marked where accuracy is significantly above the 50% baseline at the 0.01 and 0.05 levels]
Results (RTE-2)
First Author (Group) – Accuracy – Average Precision
Hickl (LCC) – 75.4% – 80.8%
Tatu (LCC) – 73.8% – 71.3%
Zanzotto (Milan & Rome) – 63.9% – 64.4%
Adams (Dallas) – 62.6% – 62.8%
Bos (Rome & Leeds) – 61.6% – 66.9%
11 groups – 58.1%-60.5%
7 groups – 52.9%-55.6%

Average: 60%; Median: 59%
Results: RTE-3
Accuracy:
1. Hickl – LCC – 0.80
2. Tatu – LCC – 0.72
3. Iftene – Uni. Iasi – 0.69
4. Adams – Uni. Dallas – 0.67
5. Wang – DFKI – 0.66
Baseline (all YES) – 0.51

Two systems above 70%. Most systems (65%) scored in the 60-70% range, compared with just 30% at RTE-2.
Current Limitations
• Simple methods perform quite well, but not best
• System reports point at:
  – Lack of knowledge (syntactic transformation rules, paraphrases, lexical relations, etc.)
  – Lack of training data
• The systems that coped better with these issues seem to have performed best:
  – Hickl et al. – acquisition of large entailment corpora for training
  – Tatu et al. – large knowledge bases (linguistic and world knowledge)
Impact
• High interest in the research community
  – Papers, conference sessions and areas, PhD theses, funded projects
  – Special issue – Journal of Natural Language Engineering
  – ACL-07 tutorial
• Initial contributions to specific applications
  – QA – Harabagiu & Hickl, ACL-06; CLEF-06/07
  – RE – Romano et al., EACL-06
• RTE-4 – by NIST, with CELCT
  – Within TAC, a new semantic evaluation conference (with QA and summarization, subsuming DUC)
New Potentials for Machine Learning
Classical Approach = Interpretation
[Diagram: Language (by nature) mapped, across Variability, onto a Stipulated Meaning Representation (by scholar)]

Logical forms, word senses, semantic roles, named entity types, … – scattered tasks
Feasible/suitable framework for applied semantics?
Textual Entailment = Text Mapping
[Diagram: Language (by nature) mapped, across Variability, directly to Assumed Meaning (by humans)]
General Case – Inference
[Diagram: Interpretation maps Language to a Meaning Representation, where Inference operates; Textual Entailment operates directly at the language level]

Entailment mapping is the actual applied goal – and also a touchstone for understanding!
Interpretation becomes a possible means.
Machine Learning Perspectives
• Issues with the interpretation approach:
  – Hard to agree on target representations
  – Costly to annotate semantic representations for training
  – Has this been a barrier?
• Language-level entailment mapping refers to texts
  – Texts are semantic-theory neutral
  – Amenable to unsupervised/semi-supervised learning
• It would be interesting to explore (many do):
  – language-based representations of meaning, inference knowledge, and ontology,
  – for which learning and inference methods may be easier to develop
  – Artificial intelligence through natural language?
Major Learning Directions
• Learning entailment knowledge (!!!)
  – Learning entailment relations between words/expressions
  – Integrating with manual resources and knowledge
• Inference methods
  – Principled frameworks for probabilistic inference
    • Estimate the likelihood of deriving the hypothesis from the text
  – Fusing information levels – more than bags of features
• Relational learning is relevant for both
• How can we increase ML researchers' involvement?
Learning Entailment Knowledge
• Entailing "topical" terms from words/texts
  – E.g. medicine, law, cars, computer security, …
  – An unsupervised version of text categorization
• Learning an entailment graph for terms/expressions
  – Partial knowledge: statistical, lexical resources, Wikipedia, …
  – Estimate link likelihood in context

[Figure: a small entailment graph over acquire/v, own/v, acquisition/n, buy/v, purchase/n, with links labeled "entails", "derived", "WN-syn" and "Dist. sim", and question marks on uncertain links]
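An entailment graph of this kind is a directed graph over lexical items in which confirmed links compose transitively. A minimal reachability sketch; the edge set is illustrative, loosely based on the nodes shown on the slide:

```python
# Hypothetical entailment links between lexical items.
EDGES = {
    ("buy/v", "own/v"),
    ("acquire/v", "own/v"),
    ("purchase/n", "buy/v"),
    ("acquisition/n", "acquire/v"),
}

def entails(a, b, edges):
    """Does a entail b by composing directed entailment links (reachability)?"""
    seen, frontier = set(), [a]
    while frontier:
        node = frontier.pop()
        if node == b:
            return True
        if node in seen:
            continue
        seen.add(node)
        frontier.extend(dst for src, dst in edges if src == node)
    return False
```

Estimating the likelihood of each individual link (from distributional similarity, WordNet, etc.) is the hard learning problem; composing them is the easy graph problem.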
Meeting the knowledge challenge – by a coordinated effort?
• A vast amount of "entailment rules" is needed
• Speculation: can we have a joint community effort for knowledge acquisition?
  – Uniform representations
  – Mostly automatic acquisition (millions of rules)
  – Human Genome Project analogy
• Preliminary: RTE-3 Resources Pool at ACL Wiki (set up by Patrick Pantel)
Textual Entailment ≈ Human Reading Comprehension
• From a children's English learning book (Sela and Greenberg):
Reference Text: "…The Bermuda Triangle lies in the Atlantic Ocean, off the coast of Florida. …"
Hypothesis (True/False?): The Bermuda Triangle is near the United States
Where are we (from RTE-1)?
Cautious Optimism
1) Textual entailment provides a unified framework for applied semantics
   – Towards generic inference "engines" for applications
2) Potential for:
   – Scalable knowledge acquisition, boosted by (mostly unsupervised) learning
   – Learning-based inference methods
Thank you!
Summary: Textual Entailment as Goal
• The essence of our proposal:
  – Base applied inference on entailment "engines" and KBs
  – Formulate various semantic problems as entailment tasks
• Interpretation and "mapping" methods may compete/complement
• Open question: which inferences
  – can be represented at the language level?
  – require logical or specialized representation and inference? (temporal, spatial, mathematical, …)
Collecting QA Pairs
• Motivation: a passage containing the answer slot filler should entail the corresponding answer statement
  – E.g. for Who invented the telephone? with answer Bell, the text should entail Bell invented the telephone
• QA systems were given TREC and CLEF questions
• Hypotheses were generated by "plugging" the system's answer term into the affirmative form of the question
• Texts correspond to the candidate answer passages
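The "plugging" step can be sketched with a couple of question-to-statement patterns. The patterns below are purely illustrative; the actual generation worked over QA-system output for TREC/CLEF questions:

```python
import re

def make_hypothesis(question, answer):
    """Plug an answer term into the affirmative form of a question (toy patterns)."""
    m = re.match(r"Who (.+)\?$", question)
    if m:  # 'Who invented the telephone?' + 'Bell' -> 'Bell invented the telephone'
        return f"{answer} {m.group(1)}"
    m = re.match(r"Where is (.+)\?$", question)
    if m:  # 'Where is X?' + 'Y' -> 'X is in Y'
        return f"{m.group(1)} is in {answer}"
    raise ValueError("unsupported question form")
```

The resulting hypothesis is paired with each candidate answer passage as the text, and the pair is labeled by whether the passage actually entails it.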
Collecting IE Pairs
• Motivation: a sentence containing a target relation instance should entail an instantiated template of the relation
  – E.g.: X is located in Y
• Pairs were generated in several ways:
  – From outputs of IE systems, for ACE-2004 and MUC-4 relations
  – Manually: for ACE-2004 and MUC-4 relations, and for additional relations in the news domain
Collecting IR Pairs
• Motivation: relevant documents should entail a given "propositional" query
• Hypotheses are propositional IR queries, adapted and simplified from TREC and CLEF
  – drug legalization benefits → drug legalization has benefits
• Texts selected from documents retrieved by different search engines
Collecting SUM (MDS) Pairs
• Motivation: identifying redundant statements (particularly in multi-document summaries)
• Used web document clusters and a system summary
• Picked as hypotheses sentences having high lexical overlap with the summary
• In final pairs:
  – Texts are original sentences (usually from the summary)
  – Hypotheses:
    • Positive pairs: simplify h until entailed by t
    • Negative pairs: simplify h similarly
• In RTE-3: using Pyramid benchmark data
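Selecting high-overlap hypothesis candidates can be sketched with word-level Jaccard similarity; the threshold and the example sentences are illustrative, not from the actual collection process:

```python
def jaccard(a, b):
    """Word-level Jaccard similarity between two sentences."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def candidate_hypotheses(doc_sents, summary_sents, threshold=0.5):
    """Keep document sentences with high lexical overlap with some summary sentence."""
    return [d for d in doc_sents
            if any(jaccard(d, s) >= threshold for s in summary_sents)]

summary = ["The Dow gained 255 points on Friday"]
docs = ["The Dow gained 255 points in heavy trading", "The weather was sunny"]
candidates = candidate_hypotheses(docs, summary)
```

Surviving candidates are then manually simplified into the final positive and negative hypotheses as described above.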