Learning to “Read Between the Lines” using Bayesian Logic Programs
Sindhu Raghavan, Raymond Mooney, and Hyeonseo Ku
The University of Texas at Austin
July 2012
1
Information Extraction
• Information extraction (IE) systems extract factual information that occurs in text [Cowie and Lehnert, 1996; Sarawagi, 2008]
• Natural language text is typically “incomplete”
– Commonsense information is not explicitly stated
– Easily inferred facts are omitted from the text
• Human readers use commonsense knowledge and “read between the lines” to infer implicit information
• IE systems have no access to commonsense knowledge and hence cannot infer implicit information
2
Example
Natural language text
“Barack Obama is the President of the United States of America.”
Query
“Barack Obama is the citizen of what country?”
IE systems cannot answer this query since citizenship information is not explicitly stated!
3
Objective
• Infer implicit facts from explicitly stated information
– Extract explicitly stated facts using an IE system
– Learn common sense knowledge in the form of logical rules to deduce additional facts
– Employ models from statistical relational learning (SRL) that allow probabilities to be estimated using well-founded probabilistic graphical models
4
Related Work
• Learning propositional rules [Nahm and Mooney, 2000]
– Learn propositional rules from the output of an IE system on computer-related job postings
– Perform logical deduction to infer new facts
– Purely logical deduction is brittle
  • Cannot assign probabilities or confidence estimates to inferences
5
Related Work
• Learning first-order rules
– Logical deduction using probabilistic rules [Carlson et al., 2010; Doppa et al., 2010]
  • Modify existing rule learners like FOIL and FARMER to learn probabilistic rules
  • Probabilities are not computed using well-founded probabilistic graphical models
– Use Markov Logic Networks (MLNs) [Domingos and Lowd, 2009] based approaches to infer additional facts [Schoenmackers et al., 2010; Sorower et al., 2011]
  • Grounding process could result in intractably large networks for large domains
6
Related Work
• Learning for Textual Entailment [Lin and Pantel, 2001; Yates and Etzioni, 2007; Berant et al., 2011]
– Textual entailment rules have a single antecedent in the body of the rule
– Approaches from statistical relational learning have not been applied so far
– Do not use extractions from a traditional IE system to learn rules
7
Our Approach
• Use an off-the-shelf IE system to extract facts
• Learn commonsense knowledge from the extracted facts in the form of probabilistic first-order rules
• Infer additional facts based on the learned rules using Bayesian Logic Programs (BLPs) [Kersting and De Raedt, 2001]
8
System Architecture
[Figure: system architecture]
Training Documents → Information Extractor (IBM SIRE) → Extracted Facts → Inductive Logic Programming (LIME) → First-Order Logical Rules → BLP Weight Learner (version of EM) → Bayesian Logic Program (BLP)
Bayesian Logic Program (BLP) + Test Document Extractions → BLP Inference Engine → Inferences with probabilities
Training document text
“Barack Obama is the current President of USA … Obama was born on August 4, 1961, in Hawaii, USA.”
Extracted facts
nationState(USA)
Person(BarackObama)
isLedBy(USA,BarackObama)
hasBirthPlace(BarackObama,USA)
hasCitizenship(BarackObama,USA)
First-order logical rules
nationState(B) ∧ isLedBy(B,A) → hasCitizenship(A,B)
nationState(B) ∧ employs(B,A) → hasCitizenship(A,B)
Bayesian logic program
hasCitizenship(A,B) | nationState(B), isLedBy(B,A)   0.9
hasCitizenship(A,B) | nationState(B), employs(B,A)   0.6
Test document extractions
nationState(malaysian)
Person(mahathir-mohamad)
isLedBy(malaysian,mahathir-mohamad)
employs(malaysian,mahathir-mohamad)
Inference with probability
hasCitizenship(mahathir-mohamad, malaysian)   0.75
9
Bayesian Logic Programs [Kersting and De Raedt, 2001]
• Set of Bayesian clauses a | a1, a2, …, an
– Definite clauses in first-order logic, universally quantified
– Head of the clause - a
– Body of the clause - a1, a2, …, an
– Associated conditional probability table (CPT)
  • P(head | body)
• Bayesian predicates a, a1, a2, …, an have finite domains
– Combining rule like noisy-or for mapping multiple CPTs into a single CPT
• Given a set of Bayesian clauses and a query, SLD resolution is used to construct ground Bayesian networks for probabilistic inference
10
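A Bayesian clause can be pictured as a small data structure. The sketch below is only an illustration in Python: `BayesianClause` and `clause_cpt` are hypothetical names, and it assumes Boolean-valued predicates with a logical-and reading of the body, so the head can be true only when every body atom is true. The parameters 0.9 and 0.6 are the example values shown on the architecture slide.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, Tuple


@dataclass(frozen=True)
class BayesianClause:
    head: str                # e.g. "hasCitizenship(A,B)"
    body: Tuple[str, ...]    # e.g. ("nationState(B)", "isLedBy(B,A)")
    prob: float              # assumed meaning: P(head = true | every body atom is true)


def clause_cpt(clause: BayesianClause) -> Dict[Tuple[bool, ...], float]:
    """Expand one clause into a Boolean CPT: P(head = true | body truth values)."""
    cpt = {}
    for values in product([True, False], repeat=len(clause.body)):
        # Assumption: the head cannot fire unless the whole body holds.
        cpt[values] = clause.prob if all(values) else 0.0
    return cpt


rule1 = BayesianClause("hasCitizenship(A,B)", ("nationState(B)", "isLedBy(B,A)"), 0.9)
rule2 = BayesianClause("hasCitizenship(A,B)", ("nationState(B)", "employs(B,A)"), 0.6)

for rule in (rule1, rule2):
    print(rule.head, "|", ", ".join(rule.body), clause_cpt(rule))
```

Two clauses with the same head, as here, are exactly the case where a combining rule such as noisy-or is needed to merge their CPTs.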
Why BLPs?
• Pure logical deduction is brittle and results in many undifferentiated inferences
• Inference in BLPs is probabilistic, i.e. inferences are assigned probabilities
– Probabilities can be used to select only high-confidence inferences
• Efficient grounding mechanism in BLPs enables our approach to scale
11
Inductive Logic Programming (ILP) for learning first-order rules
[Figure: inputs and output of the ILP rule learner]
Target relation
hasCitizenship(X,Y)
Positive instances
hasCitizenship(BarackObama, USA)
hasCitizenship(GeorgeBush, USA)
hasCitizenship(IndiraGandhi, India)
…
Negative instances (generated using closed-world assumption)
hasCitizenship(BarackObama, India)
hasCitizenship(GeorgeBush, India)
hasCitizenship(IndiraGandhi, USA)
…
KB
hasBirthPlace(BarackObama,USA)
person(BarackObama)
nationState(USA)
nationState(India)
…
→ ILP Rule Learner →
Rules
nationState(Y) ∧ isLedBy(Y,X) → hasCitizenship(X,Y)
…
12
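The closed-world assumption treats every candidate fact that was not extracted as false. A minimal Python sketch of how negative instances for hasCitizenship(X,Y) could be generated this way follows; the function name `closed_world_negatives` and the choice to pair every person with every nation state are assumptions made for illustration, not details given on the slide.

```python
from itertools import product


def closed_world_negatives(positives, persons, nations):
    """Return every (person, nation) pair that is not a known positive instance."""
    positive_set = set(positives)
    return [pair for pair in product(persons, nations) if pair not in positive_set]


positives = [("BarackObama", "USA"), ("GeorgeBush", "USA"), ("IndiraGandhi", "India")]
persons = ["BarackObama", "GeorgeBush", "IndiraGandhi"]
nations = ["USA", "India"]

negatives = closed_world_negatives(positives, persons, nations)
for person, nation in negatives:
    print(f"hasCitizenship({person}, {nation})")
```

On the slide's three positive instances this yields the same three negative instances the slide lists.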
Inference using BLPs
Test document
“Malaysian Prime Minister Mahathir Mohamad Wednesday announced for the first time that he has appointed his deputy Abdullah Ahmad Badawi as his successor.”
Extracted facts
nationState(malaysian)
Person(mahathir-mohamad)
isLedBy(malaysian,mahathir-mohamad)
employs(malaysian,mahathir-mohamad)
Learned rules
nationState(B) ∧ isLedBy(B,A) → hasCitizenship(A,B)
nationState(B) ∧ employs(B,A) → hasCitizenship(A,B)
13
Logical Inference in BLPs
Rule 1
nationState(B) ∧ isLedBy(B,A) → hasCitizenship(A,B)
Ground instance
nationState(malaysian) ∧ isLedBy(malaysian,mahathir-mohamad) → hasCitizenship(mahathir-mohamad, malaysian)
14
Logical Inference in BLPs
Rule 2
nationState(B) ∧ employs(B,A) → hasCitizenship(A,B)
Ground instance
nationState(malaysian) ∧ employs(malaysian,mahathir-mohamad) → hasCitizenship(mahathir-mohamad, malaysian)
15
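The two grounding steps can be reproduced with a few lines of Python. This is a rough sketch, not the BLP engine itself: it matches rule bodies against the extracted facts by simple forward matching rather than the SLD resolution used in BLPs, and the helper names (`is_variable`, `match`, `ground_rule`) as well as the convention that variables are single uppercase letters are assumptions for illustration.

```python
def is_variable(term):
    # Assumed convention: variables are single uppercase letters such as "A" or "B".
    return len(term) == 1 and term.isupper()


def match(pattern, fact, binding):
    """Return the binding extended so that pattern equals fact, or None on a clash."""
    if pattern[0] != fact[0] or len(pattern) != len(fact):
        return None
    extended = dict(binding)
    for term, constant in zip(pattern[1:], fact[1:]):
        if is_variable(term):
            if extended.setdefault(term, constant) != constant:
                return None
        elif term != constant:
            return None
    return extended


def ground_rule(body, head, facts):
    """Return every ground head atom whose body is fully matched by the facts."""
    bindings = [{}]
    for atom in body:
        next_bindings = []
        for binding in bindings:
            for fact in facts:
                extended = match(atom, fact, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return {(head[0],) + tuple(b.get(t, t) for t in head[1:]) for b in bindings}


facts = [
    ("nationState", "malaysian"),
    ("Person", "mahathir-mohamad"),
    ("isLedBy", "malaysian", "mahathir-mohamad"),
    ("employs", "malaysian", "mahathir-mohamad"),
]
head = ("hasCitizenship", "A", "B")
rule1_body = [("nationState", "B"), ("isLedBy", "B", "A")]
rule2_body = [("nationState", "B"), ("employs", "B", "A")]

derived1 = ground_rule(rule1_body, head, facts)
derived2 = ground_rule(rule2_body, head, facts)
print(derived1, derived2)
```

Both rules derive the same ground atom, which is why the probabilistic step has to combine evidence from two ground rules with a shared head.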
Probabilistic inference in BLPs
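[Figure: ground Bayesian network for the query hasCitizenship(mahathir-mohamad, malaysian). Nodes: nationState(malaysian), isLedBy(malaysian, mahathir-mohamad), employs(malaysian, mahathir-mohamad), dummy1, dummy2, and hasCitizenship(mahathir-mohamad, malaysian). The CPTs of dummy1 and dummy2 are logical-and; the CPT of hasCitizenship(mahathir-mohamad, malaysian) is noisy-or.]
Marginal probability of hasCitizenship(mahathir-mohamad, malaysian) = ??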
16
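For a network this small, the marginal can be computed by brute-force enumeration. The Python sketch below assumes Boolean nodes, a deterministic logical-and for dummy1 and dummy2, and a noisy-or at the query node; the function name `marginal_has_citizenship` and all numeric values (priors of 1.0 for observed facts, noisy-or parameters 0.8 and 0.5) are hypothetical and are not taken from the slides. A real BLP engine would run a general Bayesian network inference algorithm instead.

```python
from itertools import product


def marginal_has_citizenship(p_nation, p_led, p_employs, w1, w2):
    """P(hasCitizenship = true), summing over all truth values of the parent facts."""
    total = 0.0
    for nation, led, employs in product([True, False], repeat=3):
        weight = (
            (p_nation if nation else 1.0 - p_nation)
            * (p_led if led else 1.0 - p_led)
            * (p_employs if employs else 1.0 - p_employs)
        )
        dummy1 = nation and led        # logical-and node for rule 1
        dummy2 = nation and employs    # logical-and node for rule 2
        # Noisy-or: the head stays false only if every active cause fails.
        p_head_false = 1.0
        if dummy1:
            p_head_false *= 1.0 - w1
        if dummy2:
            p_head_false *= 1.0 - w2
        total += weight * (1.0 - p_head_false)
    return total


# All three facts observed as true (prior 1.0), hypothetical parameters 0.8 and 0.5.
both_rules = marginal_has_citizenship(1.0, 1.0, 1.0, w1=0.8, w2=0.5)
# Same setting, but the employs fact is absent, so only rule 1 fires.
rule1_only = marginal_has_citizenship(1.0, 1.0, 0.0, w1=0.8, w2=0.5)
print(round(both_rules, 4), round(rule1_only, 4))
```

With both rules firing the result is 1 - (1 - 0.8)(1 - 0.5), which is higher than either parameter alone: each additional supporting rule raises the confidence of the inference.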
Sample rules learned
governmentOrganization(A) ∧ employs(A,B) → hasMember(A,B)
eventLocation(A,B) ∧ bombing(A) → thingPhysicallyDamage(A,B)
isLedBy(A,B) → hasMemberPerson(A,B)
17
Experimental Evaluation
• Data
– DARPA’s intelligence community (IC) data set from the Machine Reading Project (MRP)
– Consists of news articles on politics, terrorism, and other international events
– 10,000 documents in total
• Perform 10-fold cross validation
18
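As a reminder of what 10-fold cross validation means for a 10,000-document corpus, here is a small Python sketch. The function name `k_fold_splits` and the round-robin assignment of documents to folds are assumptions; the slides do not say how the folds were formed.

```python
def k_fold_splits(doc_ids, k=10):
    """Yield (train, test) lists; each document appears in exactly one test fold."""
    folds = [doc_ids[i::k] for i in range(k)]  # round-robin assignment (assumed)
    for i in range(k):
        test = folds[i]
        train = [doc for j, fold in enumerate(folds) if j != i for doc in fold]
        yield train, test


doc_ids = list(range(10_000))
splits = list(k_fold_splits(doc_ids, k=10))
for train, test in splits:
    print(len(train), len(test))
```

Each of the ten runs trains on 9,000 documents and tests on the remaining 1,000.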
Experimental Evaluation
• Learning first-order rules using LIME [McCreath and Sharma, 1998]
– Learn rules for 13 target relations
– Learn rules using both positive and negative instances and using only positive instances
– Include all unique rules learned from different models
• Learning BLP parameters
– Learn noisy-or parameters using Expectation Maximization (EM)
– Set priors to maximum likelihood estimates
19
Experimental Evaluation
• Performance evaluation
– Manually evaluated inferred facts from 40 documents, randomly selected from each test set
– Compute two precision scores
  • Unadjusted (UA) – does not account for extractor’s mistakes
  • Adjusted (AD) – accounts for extractor’s mistakes
– Rank inferences using marginal probabilities and evaluate top-n
20
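The two precision scores and the top-n ranking can be sketched in Python as follows. This is an assumed reading of the metric: adjusted precision is taken to drop inferences that rest on an extractor mistake from the count, which is consistent with the backup slide reporting the same numerator and a smaller denominator for AD. The function name `precision_at_n` and the toy data are hypothetical.

```python
def precision_at_n(inferences, n, adjusted=False):
    """inferences: list of (marginal_probability, is_correct, from_extraction_error)."""
    top = sorted(inferences, key=lambda item: item[0], reverse=True)[:n]
    if adjusted:
        # Assumption: inferences built on wrong extractions are not counted at all.
        top = [item for item in top if not item[2]]
    if not top:
        return 0.0
    return sum(1 for item in top if item[1]) / len(top)


# Toy judgments, invented for illustration only.
inferences = [
    (0.95, True, False),
    (0.90, False, True),   # wrong because the extractor made a mistake
    (0.80, True, False),
    (0.40, False, False),
]

ua = precision_at_n(inferences, n=4)
ad = precision_at_n(inferences, n=4, adjusted=True)
print(f"UA = {ua:.2f}, AD = {ad:.2f}")
```

Under this reading AD is never lower than UA, since it only removes inferences that could not have been judged correct.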
Experimental Evaluation
• Systems compared
– BLP Learned Weights
  • Noisy-or parameters learned using online EM
– BLP Manual Weights
  • Noisy-or parameters set to 0.9
– Logical Deduction
– MLN Learned Weights
  • Learn weights using generative online weight learner
– MLN Manual Weights
  • Assign a weight of 10 to all rules and MLE priors to all predicates
21
![Page 22: Learning to “Read Between the Lines” using Bayesian Logic Programs Sindhu Raghavan, Raymond Mooney, and Hyeonseo Ku The University of Texas at Austin July.](https://reader035.fdocuments.net/reader035/viewer/2022070401/56649f1b5503460f94c31a9f/html5/thumbnails/22.jpg)
Unadjusted Precision
22
![Page 23: Learning to “Read Between the Lines” using Bayesian Logic Programs Sindhu Raghavan, Raymond Mooney, and Hyeonseo Ku The University of Texas at Austin July.](https://reader035.fdocuments.net/reader035/viewer/2022070401/56649f1b5503460f94c31a9f/html5/thumbnails/23.jpg)
Adjusted Precision
23
Future Work
• Improve the performance of weight learning for BLPs and MLNs
– Learn parameters on larger data sets
• Improve performance of MLNs
– Use open-world assumption for learning
– Add constraints required to prevent inference of facts like employs(a,a)
– Specialize types that do not have strictly defined types
• Develop an online rule learner that can learn rules from uncertain training data
24
Conclusions
• Efficient learning of probabilistic first-order rules that represent common sense knowledge using extractions from an IE system
• Inference of implicitly stated facts with high precision using BLPs
• Superior performance of BLPs over purely logical deduction and MLNs
25
Questions??
26
Back Up
27
Results for Logical Deduction
            UA                  AD
Precision   29.73 (443/1490)    35.24 (443/1257)
28
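The percentages follow directly from the counts in the table. The reading of the denominators in the last line is an assumption: the adjusted score appears to leave 233 inferences out of the count, presumably those affected by extractor mistakes.

```latex
\[
\text{UA} = \frac{443}{1490} \approx 0.2973 = 29.73\%,
\qquad
\text{AD} = \frac{443}{1257} \approx 0.3524 = 35.24\%
\]
\[
1490 - 1257 = 233 \ \text{inferences excluded from the adjusted count}
\]
```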
Experimental Evaluation
• Learning BLP parameters
– Use logical-and model to combine evidence from the conjuncts in the body of the clause
– Use noisy-or model to combine evidence from several ground rules that have the same head
– Learn noisy-or parameters using Expectation Maximization (EM)
– Set priors to maximum likelihood estimates
29