Arindam Bhattacharya - CSE, IIT Bombaycs621-2011/te.pdf · Arindam Bhattacharya M.Tech, Computer...

Textual Entailment Arindam Bhattacharya M.Tech, Computer Science Indian Institute of Technology, Bombay November 9, 2011 Arindam (IITB) Textual Entailment November 9, 2011 1 / 59

Textual Entailment

Arindam Bhattacharya

M.Tech, Computer Science
Indian Institute of Technology, Bombay

November 9, 2011

Outline

1 Introduction
    Definition
    Entailment Triggers
    Role of Knowledge
    RTE Challenges
    Resources

2 General Strategy

3 Lexical Approach

4 Machine Learning Approach

5 Graphical Approach

6 Deep Semantic Approach
    Text Entailment using UNL

Definition

Classical Definition

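A text t entails a hypothesis h if h is true in every circumstance (possible world) in which t is true.

Strict entailment! It doesn’t account for real-world uncertainties.

Example:

T: Ram was born and brought up in Maharashtra.
H: Ram can speak Marathi.

(H is very likely given T, but it is not true in every possible world, so the classical definition rejects this pair.)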

Applied Definition

t entails h (t ⇒ h) if humans reading t will infer that h is most likely true.

Probabilistic Interpretation

The applied definition sounds good.

But it is not concrete or mathematical.

Probabilistic interpretation

t probabilistically entails h if:

P(h is true | t) > P(h is true)

P(h is true |t) is called Entailment Confidence.
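As a rough illustration of the inequality above, the following sketch estimates both probabilities from a toy table of observations. The data and the function name `probabilistically_entails` are invented for illustration and do not come from the lecture.

```python
from fractions import Fraction

# Hypothetical observations: (t_is_true, h_is_true) for ten imaginary people.
# t = "born and brought up in Maharashtra", h = "can speak Marathi".
observations = (
    [(True, True)] * 4 + [(True, False)] * 1
    + [(False, True)] * 1 + [(False, False)] * 4
)

def probabilistically_entails(obs):
    """Return (entails, confidence, prior) estimated from co-occurrence counts.

    Assumes at least one observation in which t is true.
    """
    n = len(obs)
    n_t = sum(1 for t, _ in obs if t)
    n_h = sum(1 for _, h in obs if h)
    n_th = sum(1 for t, h in obs if t and h)
    prior = Fraction(n_h, n)           # P(h is true)
    confidence = Fraction(n_th, n_t)   # P(h is true | t), the entailment confidence
    return confidence > prior, confidence, prior

entails, confidence, prior = probabilistically_entails(observations)
print(entails, float(confidence), float(prior))
```

With these made-up counts the entailment confidence (4/5) exceeds the prior (1/2), so t probabilistically entails h.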

Goal

Figure: Textual Entailment

Entailment Triggers

Triggers are linguistic features that affect entailment [?].

Here are some examples to show how these various factors affect entailment.

Synonymy: Very common form of entailment trigger, where a word is replaced by its synonym.

T: World War I began in 1914.
H: World War I started in 1914.

Entailment Triggers

Hypernymy/Hyponymy: A concept can be either generalized or specialized, leading to entailment.

T: Reptiles have scales.
H: Snakes have scales. (Specialization or Hyponymy)

T: Beckham plays football.
H: Beckham plays a game. (Generalization or Hypernymy)

Entailment Triggers

Co-reference: One of the main sources of textual entailment, especially in long texts containing several paragraphs!

T: Barack Obama came to India. The American President had a meeting with Manmohan Singh.

H: Barack Obama had a meeting with Manmohan Singh.

Entailment Triggers

Modality/Polarity/Factive: These play a critical role in entailment because they affect how reliable the rest of the sentence is. They are especially troublesome for lexical approaches.

Modality denotes possibility or necessity and may lead to wrong entailment. Words such as may, can, shall and must are modality triggers.

T: The government may approve the anti-corruption bill.
H: The government approved the anti-corruption bill. [Incorrect entailment]

Entailment Triggers

Polarity determines whether the asserted fact or its negation holds. Words such as not, never and deny are polarity triggers.

T: The watchman denied that he was sleeping.
H: The watchman was sleeping. [Incorrect entailment]

Factivity deals with presupposition: a factive verb states one fact while presupposing that another has occurred. Verbs such as realize and regret are factivity triggers.

T: Martha regrets eating John’s homemade cake.
H: Martha ate John’s homemade cake.

Entailment Triggers

Passivization: In some cases one of the text or hypothesis is in the active voice while the other is in the passive. The subject and object of the main verb are reversed. This can only be handled by assigning semantic roles to each entity.

T: Yahoo bought Overture.
H: Overture was bought by Yahoo.

Entailment Triggers

Dropping or Inserting Adjuncts: Adding or dropping an adjunct affects entailment depending on which of T or H is modified, and on the polarity.

T: Bob was running quickly.
H: Bob was running.

T: Carl was eating.
H: Carl was eating slowly. [Incorrect entailment]

T: Alice was not driving.
H: Alice was not driving fast.

T: Derek was not writing properly.
H: Derek was not writing. [Incorrect entailment]
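The four adjunct examples above follow one pattern, which can be sketched as a single rule. The function name and the two boolean flags are illustrative assumptions; a real system would first have to detect the adjunct and the negation.

```python
def adjunct_change_keeps_entailment(h_drops_adjunct: bool, negated: bool) -> bool:
    """True if T still entails H after the adjunct change.

    h_drops_adjunct: True if H is T minus an adjunct, False if H adds one.
    negated: True if the main verb is negated in both T and H.
    """
    # Dropping an adjunct generalizes the event; adding one specializes it.
    # Generalization is safe in positive contexts; negation flips the direction.
    return h_drops_adjunct != negated

examples = {
    "Bob":   (True,  False),  # running quickly -> running
    "Carl":  (False, False),  # eating -> eating slowly
    "Alice": (False, True),   # not driving -> not driving fast
    "Derek": (True,  True),   # not writing properly -> not writing
}
results = {name: adjunct_change_keeps_entailment(*flags)
           for name, flags in examples.items()}
print(results)
```

The rule accepts the Bob and Alice pairs and rejects the Carl and Derek pairs, matching the labels on the slide.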

Entailment Triggers

Protocols: Some common conventions, such as mentioning the birth and death years, may trigger entailment.

T: Charles de Gaulle, 1890 – 1970, French general and statesman, was the first president of the Fifth Republic.

H: Charles de Gaulle died in 1970.

Entailment Triggers

Numerals: In some cases, a certain amount of numeric calculation affects entailment.

T: 3 men and 2 women were found dead in the apartment.

H: 5 people were found dead in an apartment.

Role of Knowledge

Background knowledge is crucial in entailment, as in any AI application!

Example

T: President of Russia visited Paris.

H: President of Russia visited France.

B: Paris is situated in France.

Background knowledge B alone should not entail the hypothesis H, and the text T must contain necessary information (which may not be sufficient).

(T ∧ B) ⊨ H

but B ⊭ H
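The condition above can be checked on the Paris example with a tiny forward-chaining sketch. The single inference rule and the data layout are assumptions made for illustration only.

```python
def closure(visited_facts, located_in):
    """Apply one hypothetical rule until nothing changes:
    visited(x, place) and located_in(place, region) => visited(x, region)."""
    derived = set(visited_facts)
    changed = True
    while changed:
        changed = False
        for who, place in list(derived):
            region = located_in.get(place)
            if region is not None and (who, region) not in derived:
                derived.add((who, region))
                changed = True
    return derived

T = {("President of Russia", "Paris")}   # visited(...) facts stated by the text
B = {"Paris": "France"}                  # background knowledge: located_in
H = ("President of Russia", "France")

with_text = H in closure(T, B)           # does (T and B) entail H?
knowledge_only = H in closure(set(), B)  # does B alone entail H?
text_only = H in closure(T, {})          # does T alone entail H?
print(with_text, knowledge_only, text_only)
```

Only the combination of T and B derives H, which is the situation the slide asks for.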

Recognizing Textual Entailment Challenges

Goal

The Recognizing Textual Entailment (RTE) challenge is an attempt to promote an abstract, generic task that captures major semantic inference needs across applications.

Held every year since 2005.

RTE-1, 2 and 3 were organized by PASCAL (Pattern Analysis, Statistical Modeling and Computational Learning).

Organized by the Text Analysis Conference (TAC) since then.

Since RTE-5 (2009), the focus has shifted from T-H pair entailment recognition to real-world applications.

RTE-7 2011

Main Task: Given a corpus and a set of “candidate” sentences retrieved by Lucene from that corpus, RTE systems are required to identify all the sentences among the candidates that entail a given hypothesis.

Each topic contains two sets of documents (“A” and “B”).
The corpus is the set “A” and H is a sentence taken from “B”.

Resources used for Textual Entailment

| Resource | Type | Author | Brief Description |
|---|---|---|---|
| WordNet | Lexical DB | Princeton University | DB of nouns, verbs, adjectives and adverbs |
| VerbNet | Lexical DB | University of Colorado, Boulder | Lexicon for English verbs organized into classes extending Levin (1993) classes through refinement and addition of subclasses to achieve syntactic and semantic coherence among members of a class. |
| Roget’s Thesaurus | Thesaurus | Peter Mark Roget | Roget’s Thesaurus is a widely-used English thesaurus. The electronic edition (version 1.02) is made available by University of Chicago. |

Table: Knowledge Resources

Resources used for Textual Entailment

| Resource | Type | Author | Brief Description |
|---|---|---|---|
| DIRT Paraphrase Collection | Collection of paraphrases | University of Alberta | DIRT (Discovery of Inference Rules from Text) knowledge collection of paraphrases from over a 1GB set of newspaper text. |
| TEASE Collection | Collection of Entailment Rules | Bar-Ilan University | Output of the TEASE algorithm. Collection of several entailment templates from web resources. |

Table: Knowledge Resources

Sub-tasks

Recognizing textual entailment requires various sub-tasks such as:

Phrasal Verb Recognition

Named Entity Recognition

Semantic Role Labeling

An example illustrates the need for these tasks

Example

General Strategy

A general two-step strategy involves:

1 Representation of the information in a form that can be used by the entailment algorithm

2 Entailment Recognition Algorithm that matches the text T, along with knowledge B, against hypothesis H

Representation

Raw text T → re-representation φ(T), at one of four levels:

Lexical
Syntactic
Semantic
Logical

Figure: Various Representations

The complexity of the representation increases from the lexical level up to the logical level.

Entailment Recognition

Text T → φ(T); Hypothesis H → φ(H); Knowledge Base B → φ(B)

φ(T), φ(B), φ(H) → ⊆e ? → Y/N

Figure: General Strategy

⊆e checks whether the degree of subsumption of φ(H) by φ(T) and φ(B) is over a certain threshold e.
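One way to read the thresholded subsumption test is sketched below, assuming every representation is simply a set of lowercase words. The coverage measure, the threshold value 0.75 and the use of a bare word as background knowledge are all illustrative assumptions, not the lecture's definitions.

```python
def subsumption_degree(phi_h: set, phi_t: set, phi_b: set) -> float:
    """Fraction of hypothesis elements covered by text plus background knowledge."""
    if not phi_h:
        return 0.0
    return len(phi_h & (phi_t | phi_b)) / len(phi_h)

def entails(phi_h: set, phi_t: set, phi_b: set, e: float = 0.75) -> bool:
    """Y/N decision: is the degree of subsumption over the threshold e?"""
    return subsumption_degree(phi_h, phi_t, phi_b) > e

phi_t = {"president", "russia", "visited", "paris"}
phi_b = {"france"}   # crude stand-in for "Paris is situated in France"
phi_h = {"president", "russia", "visited", "france"}

print(entails(phi_h, phi_t, phi_b))   # all four hypothesis words are covered
print(entails(phi_h, phi_t, set()))   # without B only 3 of 4 are covered
```

Richer representations (syntactic, semantic, logical) would replace the set intersection with a more elaborate matching procedure, but the final threshold comparison keeps the same shape.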

Lexical Approaches

Shallow approaches: operate at the surface level

Carry out only basic preprocessing

Do not compute elaborate representations

Make the entailment decision based solely on lexical evidence

Preprocessing

Surface preprocessing includes:

tokenization
stemming/lemmatization
identifying the stop words

Some systems do somewhat deeper preprocessing, such as:

Phrasal verb recognition, e.g. take off, put on
Idiom processing, e.g. “A picture paints a thousand words”
Named entity recognition and normalization

Representation

Lexical approaches use one of the following representations:

Bag-of-words: Both T and H are represented as a set of words.
n-grams: Sequences of n tokens are grouped together. Bag-of-words is the extreme case of n-grams, with n=1.

Example: “The fixed routine of a bedtime story before sleeping has a relaxing effect.”

Bag-of-words: The, fixed, routine, of, a, bedtime, story, before, sleeping, has, relaxing, effect
Bigram model (n-gram with n=2): The fixed, fixed routine, routine of, of a, a bedtime, bedtime story, story before, before sleeping, sleeping has, has a, a relaxing, relaxing effect
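The two representations can be reproduced for the example sentence with a few lines of code. This is a minimal sketch: the regular-expression tokenizer is a simplifying assumption, and real systems usually use a proper tokenizer.

```python
import re

def tokenize(text):
    """Keep runs of word characters; a deliberately simple tokenizer."""
    return re.findall(r"\w+", text)

def bag_of_words(tokens):
    """Unordered set of distinct tokens."""
    return set(tokens)

def ngrams(tokens, n):
    """All contiguous sequences of n tokens, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

sentence = "The fixed routine of a bedtime story before sleeping has a relaxing effect."
tokens = tokenize(sentence)
bow = bag_of_words(tokens)
bigrams = ngrams(tokens, 2)

print(len(tokens), len(bow), len(bigrams))
print(bigrams[:3])
```

The sentence has 13 tokens; the bag-of-words has 12 entries because "a" occurs twice, and there are 12 bigrams.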

Example: LLM Algorithm

Local Lexical Matching (LLM) is a lexical approach to textual entailment that uses the bag-of-words representation.

INPUT: Text T and Hypothesis H.
OUTPUT: The matching score.
for all word in T and H do
    if word in stopWordList then
        remove word;
    end if
    if no words left in T or H then
        return 0;
    end if
end for
numberMatched = 0;
for all word WT in T do
    LemmaT = Lemmatize(WT);
    for all word WH in H do
        LemmaH = Lemmatize(WH);
        if LexicalCompare(LemmaH, LemmaT) then
            numberMatched++;
        end if
    end for
end for

Figure: LLM Algorithm
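A runnable sketch in the spirit of the LLM pseudocode is given below. It departs from the figure in two places, both of which are assumptions: each hypothesis word is counted at most once, and the count is normalized by the hypothesis length so that the score lies between 0 and 1. The stop-word list and the lowercase-only `lemmatize` are placeholders.

```python
STOP_WORDS = {"a", "an", "the", "of", "in", "was", "is", "by"}  # toy list

def lemmatize(word):
    """Placeholder: lowercases only. A real system would use a lemmatizer."""
    return word.lower()

def llm_score(text_words, hyp_words, lexical_compare=lambda h, t: h == t):
    """Share of hypothesis words that match some text word."""
    t_words = [w for w in text_words if w.lower() not in STOP_WORDS]
    h_words = [w for w in hyp_words if w.lower() not in STOP_WORDS]
    if not t_words or not h_words:
        return 0.0
    matched = 0
    for wh in h_words:
        lemma_h = lemmatize(wh)
        # Assumption: count each hypothesis word once, however many
        # text words it matches.
        if any(lexical_compare(lemma_h, lemmatize(wt)) for wt in t_words):
            matched += 1
    return matched / len(h_words)  # assumption: normalize by |H|

t = "World War I began in 1914".split()
h = "World War I started in 1914".split()
exact = llm_score(t, h)
with_synonyms = llm_score(
    t, h, lambda a, b: a == b or {a, b} == {"started", "began"}
)
print(exact, with_synonyms)
```

With exact matching, four of the five remaining hypothesis words are found in the text; supplying a comparison function that knows "began" and "started" are synonyms raises the score to 1.0.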

LexicalCompare

The LexicalCompare() procedure checks similarity with the help of WordNet.

if LemmaH == LemmaT then
    return TRUE;
end if
if HypernymDistance(WH, WT) ≤ dHyp then
    return TRUE;
end if
if MeronymDistance(WH, WT) ≤ dMer then
    return TRUE;
end if
if MemberOfDistance(WH, WT) ≤ dMem then
    return TRUE;
end if
if SynonymOf(WH, WT) then
    return TRUE;
end if
return FALSE;

Figure: Lexical Compare Procedure
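The comparison procedure can be tried out without WordNet by substituting a toy taxonomy. In the sketch below the `HYPERNYM` links, the synonym sets and the threshold `D_HYP` are made-up stand-ins, the meronym and member-of checks are omitted, and the direction of the hypernym test (the hypothesis word may be more general than the text word) is an assumption.

```python
# Toy stand-in for WordNet: child -> parent (hypernym) links and synonym sets.
HYPERNYM = {"snake": "reptile", "reptile": "animal", "football": "game"}
SYNONYMS = [{"begin", "start"}, {"buy", "purchase"}]
D_HYP = 2  # assumed maximum number of hypernym links accepted

def hypernym_distance(word, ancestor):
    """Number of hypernym links from word up to ancestor, or None if unreachable."""
    steps, current = 0, word
    while current is not None:
        if current == ancestor:
            return steps
        current = HYPERNYM.get(current)
        steps += 1
    return None

def lexical_compare(lemma_h, lemma_t):
    """True if the hypothesis lemma is matched by the text lemma."""
    if lemma_h == lemma_t:
        return True
    dist = hypernym_distance(lemma_t, lemma_h)  # is H a generalization of T?
    if dist is not None and dist <= D_HYP:
        return True
    if any(lemma_h in s and lemma_t in s for s in SYNONYMS):
        return True
    return False

print(lexical_compare("game", "football"))   # generalization
print(lexical_compare("football", "game"))   # specialization is not accepted here
print(lexical_compare("start", "begin"))     # synonyms
```

A function of this shape can be passed as the comparison step of a lexical matcher; swapping the toy dictionaries for real WordNet lookups would not change the control flow.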

Textual Entailment as a Classification Task

Figure: Text Entailment as a Classification Task

Feature Space

What could be a possible feature space? Most important decision!

Distance Features: features based on some distance between T and H.
Entailment Triggers: features that trigger entailment (or non-entailment).
Syntactic Features: the syntax of the T-H pair, modeled to exploit rewrite rules.

Distance Features

Possible Features

Number of words in common.
Longest common subsequence.
Longest common syntactic subtree.

Requires representation of T and H as

Bag-of-words or n-grams
Syntactic representation
Semantic representation

Distance Features

For example:

T: At the end of the year, all solid companies pay dividends.
H: At the end of the year, all solid insurance companies pay dividends.

For the above example, possible 〈feature, value〉 pairs could be 〈WordsInCommon, 11〉 or 〈LongestSubsequence, 8〉 (8 being the length of the longest contiguous run of shared words, “At the end of the year, all solid”).
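The feature values quoted for this pair can be recomputed with the sketch below. The tokenizer and function names are illustrative assumptions. Note that the value 8 corresponds to the longest contiguous run of shared tokens; the classic longest common subsequence, which allows gaps, is also computed for comparison.

```python
import re
from collections import Counter

def tokens(s):
    """Lowercased word tokens; punctuation is dropped."""
    return re.findall(r"\w+", s.lower())

def words_in_common(a, b):
    """Shared tokens counted with multiplicity."""
    return sum((Counter(a) & Counter(b)).values())

def longest_common_run(a, b):
    """Length of the longest contiguous token sequence shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            if x == y:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best

def longest_common_subsequence(a, b):
    """Classic LCS length (gaps allowed)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

T = tokens("At the end of the year, all solid companies pay dividends.")
H = tokens("At the end of the year, all solid insurance companies pay dividends.")
features = {
    "WordsInCommon": words_in_common(T, H),
    "LongestCommonRun": longest_common_run(T, H),
    "LongestCommonSubsequence": longest_common_subsequence(T, H),
}
print(features)
```

Because T is H with one word removed, the gap-tolerant subsequence covers all 11 tokens of T, while the contiguous run stops at the inserted word "insurance".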

Entailment Triggers [?]

Capture the presence of linguistic features that trigger entailment.

Example

T: The government may approve the anti-corruption bill.
H: The government approved the anti-corruption bill.

A 〈feature, value〉 pair could be 〈modal, 1〉
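A trigger feature of this kind could be extracted roughly as follows. The modal list extends the words named in the lecture (may, can, shall, must) with a few more, and firing the feature when a modal appears on one side only is an assumed design choice.

```python
MODALS = {"may", "might", "can", "could", "shall", "should", "must", "would"}

def trigger_features(text, hypothesis):
    """Return a feature dictionary for a T-H pair (only 'modal' is sketched)."""
    t = set(text.lower().rstrip(".").split())
    h = set(hypothesis.lower().rstrip(".").split())
    # The symmetric difference is non-empty when the two sides use
    # different modal words, e.g. a modal on one side only.
    mismatch = (t & MODALS) ^ (h & MODALS)
    return {"modal": int(bool(mismatch))}

features = trigger_features(
    "The government may approve the anti-corruption bill.",
    "The government approved the anti-corruption bill.",
)
print(features)
```

A classifier trained on such features can learn that a modal mismatch is evidence against entailment.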

Exploiting Re-write rules

The following example illustrates how rewrite rules are exploited. Consider the pair:

T: Loki was killed by Thor.

H: Loki died.

Using the syntactic pair features we can learn rules such as:

Figure: Exploiting rewrite rules

Textual Entailment as Graph Matching

Convert hypothesis and text into graphs.

Either syntactic or semantic.

Measure similarity.

Similarity score gives entailment.

Different from Classical Graph Matching!

Scoring is not symmetric.

Node similarity cannot be reduced to the label level (i.e. the token level).

Consideration of linguistically motivated graph transformations (nominalization, passivization).

Text to Graph

Generation of dependency graph using a dependency parser

Graph edges are labeled by hand made rules (e.g. subj, amod)

Applying certain enhancements

Enhancements to Dependency Graph

Collapse collocations and named entities.
Collocations: [blow] [off] → [blow off]
Named entities: [Michael] [Jackson] → [Michael Jackson]

Dependency folding, so that certain dependencies (such as modifying prepositions) become labels.

Skeleton -[in]-> cupboard.

Enhancements to Dependency Graph

Semantic Role Labeling

Arcs are labeled with PropBank-style semantic roles.
This helps to create links between words which share a deep semantic relation not evident in the surface syntax.
e.g. Pakistan got independence in [1947]_Temporal.

Co-reference Links

Using a co-reference resolution tagger, coref links are added throughout the graph.
In multi-sentence texts, these are the only links in the graph between entities in different sentences.

Entailment Model

The entailment model determines the matching cost between the graphs of T and H.

The final cost is a linear combination of the cost of matching vertices and the cost of matching edges.

Cost = α ∗ VertexCost + (1 − α) ∗ EdgeCost

Additional Checks

Certain additional checks can be applied to the system to improve its performance [?]. They are listed below.

Negation Check: Check whether there is a negation in a sentence. Example:

T: Clinton’s book is not a bestseller.
H: Clinton’s book is a bestseller.

Factive Check: Non-factive verbs (claim, think, charged, etc.), in contrast to factive verbs (know, regret, etc.), have sentential complements which do not represent true propositions.

T: Clonaid claims to have cloned 13 babies worldwide.
H: Clonaid has cloned 13 babies.

Additional Checks

Superlative Check: Superlatives invert the typical monotonicity of entailment. Example:

T: The Osaka World Trade Center is the tallest building in Western Japan.
H: The Osaka World Trade Center is the tallest building in Japan.

Antonym Check: The WordNet::Similarity measures were observed to give high similarity to antonyms. An explicit check is made for whether a match involves antonyms, and the match is rejected unless one of the vertices has a negation modifier.

An Example

T: In 1994, Amazon.com was founded by Jeff Bezos.
H: Bezos established a company.

VC = (0 + 0.4 + 0)/3 ≈ 0.13

EC = 0 (isomorphic edges)

Cost = 0.55 ∗ 0.13 + 0.45 ∗ 0 = 0.0715 (with α = 0.55)
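
The arithmetic of this example can be checked with a short script. The variable names are chosen here for illustration. The per-vertex costs, the edge cost and α are the values given above. Note that the slide rounds VC to 0.13 before combining, which gives 0.0715; carrying the unrounded average through gives roughly 0.0733.

```python
vertex_costs = [0.0, 0.4, 0.0]  # per-vertex matching costs from the example
edge_cost = 0.0                 # the edges are isomorphic
alpha = 0.55

vc_exact = sum(vertex_costs) / len(vertex_costs)
vc_rounded = round(vc_exact, 2)  # rounded to two decimals, as on the slide

cost_rounded = alpha * vc_rounded + (1 - alpha) * edge_cost
cost_exact = alpha * vc_exact + (1 - alpha) * edge_cost

print(f"VC = {vc_exact:.4f} (rounded: {vc_rounded})")
print(f"Cost with rounded VC = {cost_rounded:.4f}")
print(f"Cost with exact VC   = {cost_exact:.4f}")
```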

What is UNL?

UNL (Universal Networking Language) is a tool for representing text in terms of the semantic relations between different entities

It consists of Universal Words (UWs), Relations and Attributes

Example

Google goes public.

Figure: UNL graph with entry node go(icl>do).@present.@entry, an agt edge to Google(icl>organization) and an obj edge to public(aoj>thing, ant>private)

agt(go(icl>do, equ>travel, obj>thing).@present.@entry, Google(icl>organization))

obj(go(icl>do, equ>travel, obj>thing).@present.@entry, public(aoj>thing, ant>private))
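
A binary relation written in this textual form can be split into its label and its two arguments with a small parser. The sketch below is an assumption about how one might do this and is not part of any UNL toolkit. The helper names `split_top_level` and `parse_relation` are hypothetical. The only subtlety is that commas nested inside a Universal Word's constraint list must not be treated as argument separators.

```python
def split_top_level(s: str) -> list[str]:
    """Split on commas that are not nested inside parentheses."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(s):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append(s[start:i].strip())
            start = i + 1
    parts.append(s[start:].strip())
    return parts


def parse_relation(expr: str) -> tuple[str, str, str]:
    """Parse 'rel(source, target)' into (rel, source, target)."""
    expr = expr.strip()
    name, _, rest = expr.partition("(")
    if not rest.endswith(")"):
        raise ValueError(f"not a relation expression: {expr!r}")
    args = split_top_level(rest[:-1])
    if len(args) != 2:
        raise ValueError(f"expected two arguments, got {len(args)}")
    return name.strip(), args[0], args[1]


relation = parse_relation(
    "agt(go(icl>do, equ>travel, obj>thing).@present.@entry , Google(icl>organization))"
)
```

Here `relation` holds the label `agt`, the source node with its attributes, and the target node `Google(icl>organization)`.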

The Approach [CS626-449: Lecture 29, Prasad Pradip Joshi]

Represent both the text and the hypothesis in their UNL form and analyse the UNL expressions.

The list of atomic facts (predicates) emerging from the UNL graph of the hypothesis must be a subset (either explicitly or implicitly) of the atomic facts emerging from the UNL graph of the text.

The algorithm has two main parts:

Growth rules: extend the set of atomic facts of the text graph based on those which are already present.
Matching rules: match the atomic facts in the hypothesis against those in the text graph.

Illustration [CS626-449: Lecture 29, Prasad Pradip Joshi]

Text: Manmohan Singh along with President George Bush signed a letter in 2006.

Hypothesis: Bush signed a document.

Text representation

agt(sign@entry@past,Manmohan Singh)
cag(sign@entry@past,President)
nam(President,George Bush)
obj(sign@entry@past,letter@indef)
tim(sign@entry@past,2006)
aoj(President,George Bush)
cag(sign@entry@past,George Bush)

Hypothesis representation

agt(sign@entry@past,Bush)
obj(sign@entry@past,document@indef)
tim(sign@entry@past,2006)
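
A literal subset test on these two fact lists shows why the subset condition must also hold "implicitly". The sketch below encodes each fact as a (relation, head, argument) triple, which is an encoding chosen here for illustration, and drops the attributes for brevity.

```python
# Facts from the illustration, with attributes such as @entry and @past omitted.
text_facts = {
    ("agt", "sign", "Manmohan Singh"),
    ("cag", "sign", "President"),
    ("nam", "President", "George Bush"),
    ("obj", "sign", "letter"),
    ("tim", "sign", "2006"),
    ("aoj", "President", "George Bush"),
    ("cag", "sign", "George Bush"),
}
hypothesis_facts = {
    ("agt", "sign", "Bush"),
    ("obj", "sign", "document"),
    ("tim", "sign", "2006"),
}

# Explicit containment: every hypothesis fact appears verbatim among the text facts.
explicitly_entailed = hypothesis_facts <= text_facts
unmatched = hypothesis_facts - text_facts
```

Only the `tim` fact matches verbatim. The `agt` and `obj` facts stay in `unmatched`, so relating agt to cag, "Bush" to "George Bush", and "document" to "letter" needs something beyond exact comparison.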

Results

On the training set (200 pairs of gold-standard UNL from RTE and FRACAS), precision stands at 96.55% and recall at 95.72%.

Using the UNL enconverter (70.1% accurate) on the FRACAS pairs for the phenomena studied (100 pairs), precision is 63.04% and recall is 60.1%.

On the complete FRACAS dataset, precision is 60.1% and recall is 46%.

Growth Rule [CS626-449: Lecture 29, Prasad Pradip Joshi]

pos-mod rule:

Presence of pos(A,B) adds mod(A,B).
Navy of India → Indian Navy

plc closure:

Presence of plc(A,B) and plc(B,C) leads to the addition of plc(A,C).
Paris is capital of France. France is in Europe. → Paris is in Europe.

Introduction of words based on UNL relations and attributes:

Attributes:

@end → ‘finish’ or ‘over’

Relations:

‘plc’ → ‘located’
‘pos’ → ‘belongs to’ or ‘owned by’
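
The first two growth rules can be sketched as a fixed-point computation over a set of facts. The triple encoding and the function name `grow` are assumptions made for this illustration. The word-introduction rules are not covered.

```python
Fact = tuple[str, str, str]  # (relation, first argument, second argument)


def grow(facts: set[Fact]) -> set[Fact]:
    """Apply the pos-mod rule and the plc closure until nothing new is added."""
    grown = set(facts)
    changed = True
    while changed:
        changed = False
        new: set[Fact] = set()
        for rel, a, b in grown:
            if rel == "pos":
                # pos-mod rule: pos(A, B) adds mod(A, B)
                new.add(("mod", a, b))
            if rel == "plc":
                # plc closure: plc(A, B) and plc(B, C) add plc(A, C)
                for rel2, b2, c in grown:
                    if rel2 == "plc" and b2 == b:
                        new.add(("plc", a, c))
        if not new <= grown:
            grown |= new
            changed = True
    return grown


grown_facts = grow({
    ("pos", "Navy", "India"),
    ("plc", "Paris", "France"),
    ("plc", "France", "Europe"),
})
```

Iterating to a fixed point matters for the closure rule: a chain of three `plc` facts needs a second pass before the longest link is derived.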

Growth Rules [Maheshwari, 2009]

Figure: Growth Rules

Matching Rules [CS626-449: Lecture 29, Prasad Pradip Joshi]

Two types:

Matching the UNL relations (predicate names)

Look up whether a relation belongs to the same family as the other, e.g. agt (agent), cag (co-agent), aoj (attribute of object).

Matching the argument part

A narrowing edit of the thing pointed to by ‘aoj’
A broadening edit of the thing pointed to by ‘obj’
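
One possible reading of these two rule types is sketched below and applied to the Manmohan Singh example from the illustration. Several pieces are assumptions introduced here: the single relation family (taken from the example list above), the toy hypernym lexicon, the direction chosen for the narrowing and broadening edits, and the rule that a short name matches a full name containing it.

```python
RELATION_FAMILIES = [{"agt", "cag", "aoj"}]  # family taken from the example above
HYPERNYMS = {"letter": {"document"}}         # toy lexicon: a letter is a document


def same_family(rel_t: str, rel_h: str) -> bool:
    """Relation names match if they are equal or share a family."""
    return rel_t == rel_h or any(rel_t in f and rel_h in f for f in RELATION_FAMILIES)


def args_match(rel_h: str, arg_t: str, arg_h: str) -> bool:
    """Argument matching under the hypothesis relation rel_h."""
    if arg_t == arg_h:
        return True
    if rel_h == "obj":  # broadening: the hypothesis may use a more general term
        return arg_h in HYPERNYMS.get(arg_t, set())
    if rel_h == "aoj":  # narrowing: the hypothesis may use a more specific term
        return arg_t in HYPERNYMS.get(arg_h, set())
    # Assumption: "Bush" matches "George Bush" because it is one of its tokens.
    return arg_h in arg_t.split()


def entails(text_facts, hypothesis_facts) -> bool:
    """Every hypothesis fact must be matched by at least one text fact."""
    return all(
        any(same_family(tr, hr) and th == hh and args_match(hr, ta, ha)
            for tr, th, ta in text_facts)
        for hr, hh, ha in hypothesis_facts
    )


text_facts = {
    ("agt", "sign", "Manmohan Singh"), ("cag", "sign", "President"),
    ("nam", "President", "George Bush"), ("obj", "sign", "letter"),
    ("tim", "sign", "2006"), ("aoj", "President", "George Bush"),
    ("cag", "sign", "George Bush"),
}
hypothesis_facts = {("agt", "sign", "Bush"), ("obj", "sign", "document"), ("tim", "sign", "2006")}
result = entails(text_facts, hypothesis_facts)
```

Under these assumptions the hypothesis is accepted: `agt(sign, Bush)` matches `cag(sign, George Bush)` through the relation family, and `obj(sign, document)` matches `obj(sign, letter)` through the broadening edit.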

Universal Words

Representation

UNL:

<UW> := <integer> (<POS><WORDNETID>)

Natural Language:

<UW> := <root> [<suffix>]

Example

Concept: a piece of furniture with tableware for a meal laid out on it

UNL Representation: 104379964

NL Representation : table(icl>furniture)
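
Under the grammar above, the numeric form concatenates a part-of-speech code and a WordNet identifier. A minimal sketch of splitting it apart follows. It assumes the POS code is a single leading digit, which the slide does not state explicitly. The function name `split_uw_id` is hypothetical.

```python
def split_uw_id(uw_id: str) -> tuple[str, str]:
    """Split a numeric UW into (POS code, WordNet ID).

    Assumption: the POS code is the first digit and the rest is the WordNet ID.
    """
    if not uw_id.isdigit() or len(uw_id) < 2:
        raise ValueError(f"not a numeric UW: {uw_id!r}")
    return uw_id[0], uw_id[1:]


pos_code, wordnet_id = split_uw_id("104379964")
```

The WordNet ID is kept as a string so that its leading zero is not lost.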

Relations

labeled arcs connecting a node to another node in a UNL graph

correspond to two-place semantic predicates holding between two Universal Words

used to describe semantic dependencies between syntactic constituents

organized in a hierarchy where lower nodes subsume upper nodes

Example

Bob slept = agt(slept,Bob)

Alice died = obj(died,Alice)

John believes in Mary = aoj(believes,John)

John worked while Peter talked = coo(worked,talked)

Attributes

arcs linking a node to itself

in contrast to relations, they correspond to one-place predicates

used to represent information conveyed by natural language grammatical categories (such as tense, mood, aspect, number, etc.)

Syntax

<attribute> := @<attribute-name>

<attribute-name> := <character>+
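
The attribute syntax maps directly onto a regular expression. The sketch below separates a node into its headword and attribute names. It assumes that `<character>` means a letter, digit or underscore, and the helper name `split_node` is hypothetical. It accepts both the `.@name` form used in the Google example and the bare `@name` form used in the illustration.

```python
import re

ATTRIBUTE = re.compile(r"@(\w+)")  # '@' followed by one or more word characters


def split_node(node: str) -> tuple[str, list[str]]:
    """Return (headword, attribute names) for a UNL node string."""
    first = re.search(r"\.?@", node)  # start of the first attribute, if any
    head = node if first is None else node[:first.start()]
    return head, ATTRIBUTE.findall(node)


head, attributes = split_node("go(icl>do).@present.@entry")
```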

Pair Features

Example

T: At the end of the year, all solid companies pay dividends.
H: At the end of the year, all solid insurance companies pay dividends.

Possible feature pairs (bag of words):

Text: end_T, year_T, solid_T, ...
Hypothesis: end_H, year_H, solid_H, ...

We can learn:

T implies H when T contains “end”
T does not imply H when H contains “end”

Totally useless???
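
The bag-of-words pair features can be made concrete with a short sketch. The stop-word list and the function names are assumptions made for illustration. Each word becomes a feature tagged with the side it came from, so nothing in the feature set links a word in T to the same word in H.

```python
import re

STOPWORDS = {"at", "the", "of", "all"}  # illustrative list, an assumption


def content_words(sentence: str) -> list[str]:
    """Lower-cased words of a sentence without the stop words."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS]


def pair_features(text: str, hypothesis: str) -> set[str]:
    """Bag-of-words features, tagged _T for the text and _H for the hypothesis."""
    return ({f"{w}_T" for w in content_words(text)}
            | {f"{w}_H" for w in content_words(hypothesis)})


features = pair_features(
    "At the end of the year, all solid companies pay dividends.",
    "At the end of the year, all solid insurance companies pay dividends.",
)
```

A learner trained on such features can only pick up rules about individual words on one side, which is the weakness the slide points out.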

Effectively using Pair Feature Space [?]

Example

T: At the end of the year, all solid companies pay dividends.
H1: At the end of the year, all solid insurance companies pay dividends.
H2: At the end of the year, all solid companies pay cash dividends.

A distance feature will map <T, H1> and <T, H2> to the same point.

We need a space that considers both the content and the structure of textual entailment examples.
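
The collapse described above is easy to reproduce. The sketch below uses a word-overlap ratio as a stand-in for the distance feature. Which distance the slide has in mind is not stated, so this choice and the function names are assumptions.

```python
import re


def word_set(sentence: str) -> set[str]:
    """Lower-cased word types of a sentence."""
    return set(re.findall(r"[a-z]+", sentence.lower()))


def overlap(text: str, hypothesis: str) -> float:
    """Fraction of hypothesis word types that also occur in the text."""
    t, h = word_set(text), word_set(hypothesis)
    return len(t & h) / len(h)


T = "At the end of the year, all solid companies pay dividends."
H1 = "At the end of the year, all solid insurance companies pay dividends."
H2 = "At the end of the year, all solid companies pay cash dividends."

score_h1 = overlap(T, H1)
score_h2 = overlap(T, H2)
```

Both hypotheses add exactly one word to T, so the two scores are identical even though the added word sits in a different syntactic position. Arguably only H1 follows from T, and a single overlap number cannot capture that.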

Kernel Trick

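Syntactic pair feature space.

Cross Pair Similarity

$K(\langle T', H' \rangle, \langle T'', H'' \rangle) = K(T', T'') + K(H', H'')$

Instead of explicit features, define the distance $K(P_1, P_2)$ between two pairs $P_1 = \langle T', H' \rangle$ and $P_2 = \langle T'', H'' \rangle$ as

$K(P_1, P_2) = \max_{c \in C} \left( K_T(t(H', c), t(H'', i)) + K_T(t(T', c), t(T'', i)) \right)$

Here $K_T$ is a tree kernel, $C$ is the set of placeholder assignments, $t(S, c)$ is the syntactic tree of $S$ with placeholders renamed according to $c$, and $i$ is the identity substitution.

This makes pair features useful.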
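
The additive form of the cross-pair similarity can be sketched with a much simpler base kernel. The code below swaps the tree kernel for a count of shared word types and omits the maximisation over placeholder assignments entirely, so it illustrates only the structure K(T', T'') + K(H', H''). All names are assumptions.

```python
import re


def word_set(sentence: str) -> set[str]:
    """Lower-cased word types of a sentence."""
    return set(re.findall(r"[a-z]+", sentence.lower()))


def base_kernel(a: str, b: str) -> int:
    """Number of shared word types: a simple stand-in for a tree kernel."""
    return len(word_set(a) & word_set(b))


def pair_kernel(p1: tuple[str, str], p2: tuple[str, str]) -> int:
    """Compare two (text, hypothesis) pairs: texts with texts, hypotheses with hypotheses."""
    (t1, h1), (t2, h2) = p1, p2
    return base_kernel(t1, t2) + base_kernel(h1, h2)


T = "At the end of the year, all solid companies pay dividends."
H1 = "At the end of the year, all solid insurance companies pay dividends."
H2 = "At the end of the year, all solid companies pay cash dividends."

similarity = pair_kernel((T, H1), (T, H2))
```

The point of the design is that a learner such as an SVM only needs this pairwise similarity and never an explicit feature vector. A tree kernel in place of `base_kernel` is what lets syntactic structure enter the comparison.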
