Coupling Semi-Supervised Learning of Categories and Relations by Andrew Carlson, Justin Betteridge,...

Coupling Semi-Supervised Learning of Categories and Relations

by Andrew Carlson, Justin Betteridge, Estevam R. Hruschka Jr. and Tom M. Mitchell

School of Computer Science, Carnegie Mellon University

presented by Thomas Packer

Bootstrapped Information Extraction

• Semi-Supervised:
– Seed knowledge (predicate instances & patterns)
– Pattern learners (use learned instances)
– Instance learners (use learned patterns)

• Feedback Loop:
– Rel1(X, Y)
– Sent1(X, Y), Rel0(X, Y) → Pat1
– Pat1: Sent2(A, B) → Rel1(A, B)
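The feedback loop can be sketched as a toy bootstrapper. This is only an illustration, not the paper's learner: the function name `bootstrap`, the `(x, context, y)` sentence triples, and the `min_count` threshold are all assumptions made for the example.

```python
from collections import Counter

def bootstrap(sentences, seeds, rounds=2, min_count=1):
    """Toy bootstrapping loop (hypothetical, for illustration only).

    sentences: list of (x, context, y) triples, where context is the text
               between the two arguments (a stand-in for Sent(X, Y)).
    seeds:     set of (x, y) pairs known to hold for the relation (Rel0).
    """
    instances = set(seeds)
    patterns = set()
    for _ in range(rounds):
        # Pattern learner: contexts that co-occur with known instances.
        counts = Counter(ctx for x, ctx, y in sentences if (x, y) in instances)
        patterns |= {ctx for ctx, n in counts.items() if n >= min_count}
        # Instance learner: argument pairs matched by any learned pattern.
        instances |= {(x, y) for x, ctx, y in sentences if ctx in patterns}
    return instances, patterns

sentences = [
    ("Steve Ballmer", "is the CEO of", "Microsoft"),
    ("Tim Cook", "is the CEO of", "Apple"),
    ("Tim Cook", "visited", "Apple"),
    ("Alice", "visited", "Paris"),
]
instances, patterns = bootstrap(sentences, {("Steve Ballmer", "Microsoft")})
```

With one round, the learner picks up only the pattern "is the CEO of" and the new pair (Tim Cook, Apple). With two rounds, the newly learned pair also promotes the pattern "visited", which then admits (Alice, Paris): an unconstrained loop amplifies its own errors.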

Challenges and Previous Solutions

• Semantic drift: Feedback loop amplifies error and ambiguities.

• Semi-Supervised learning often suffers from being under-constrained.

• Multiple mutually-exclusive predicate learning: Positive examples of one predicate are also negative examples of others.

• Category and predicate learning: Arguments must be of certain types.

Does More Look Harder?

Approach

• Simultaneous bootstrapped training of multiple categories and multiple relations.

• Growing related knowledge provides constraints to guide continued learning.

• Ontology Constraints:
– Mutually exclusive predicates imply negative instances and patterns.
– Hypernyms imply positive instances.
– Relation argument type constraints imply positive category and negative relation instances.

Mutual Exclusion Constraint

• “city” and “scientist” categories are mutually exclusive.

• If “Boston” is an instance of “city”, then it is also a negative instance of “scientist”.

• If “mayor of arg1” is a pattern for “city”, then it is also a negative pattern for “scientist”.
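A minimal sketch of how this constraint might be enforced, assuming positives and negatives are kept as per-category sets. The function `add_positive` and the `mutex` table are hypothetical names for this example, not the paper's implementation.

```python
def add_positive(category, item, positives, negatives, mutex):
    """Promote item as a positive for category, and record it as a
    negative for every category declared mutually exclusive with it.
    Returns False if the item was already ruled out for this category."""
    if item in negatives.get(category, set()):
        return False  # blocked by a competing category
    positives.setdefault(category, set()).add(item)
    for other in mutex.get(category, ()):
        negatives.setdefault(other, set()).add(item)
    return True

# Assumed ontology fragment: "city" and "scientist" exclude each other.
mutex = {"city": {"scientist"}, "scientist": {"city"}}
positives, negatives = {}, {}

accepted_city = add_positive("city", "Boston", positives, negatives, mutex)
accepted_scientist = add_positive("scientist", "Boston", positives, negatives, mutex)
```

The same bookkeeping applies to patterns: passing "mayor of arg1" instead of "Boston" would mark it as a negative pattern for "scientist".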

Hypernym Constraints

• “athlete” is a hyponym of “person”.

• If “John McEnroe” is a positive instance of “athlete”, then it is also a positive instance of “person”.
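As a rough sketch, hypernym propagation can be written as a walk up the category hierarchy. The `hypernym_of` mapping (one parent per category, no cycles) and both function names are assumptions made for this example.

```python
def with_hypernyms(category, hypernym_of):
    """Return the category followed by all of its ancestors.
    Assumes hypernym_of is acyclic and maps each category to one parent."""
    chain = [category]
    while chain[-1] in hypernym_of:
        chain.append(hypernym_of[chain[-1]])
    return chain

def add_instance(category, item, positives, hypernym_of):
    """Record item as a positive instance of category and of every hypernym."""
    for cat in with_hypernyms(category, hypernym_of):
        positives.setdefault(cat, set()).add(item)

hypernym_of = {"athlete": "person"}  # assumed ontology fragment
positives = {}
add_instance("athlete", "John McEnroe", positives, hypernym_of)
```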

Type Checking Constraints

• The “ceoOf()” relation must have arguments of type “person” and “company”.

• If “bicycle” is not a “person” then “ceoOf(bicycle, Microsoft)” is a negative instance of “ceoOf()”.

• If “ceoOf(Steve Ballmer, Microsoft)” is true, then “Steve Ballmer” is a positive instance of “person”. “Microsoft” handled similarly.
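The two directions of type checking can be sketched together. The `ARG_TYPES` table and the function `promote_relation` are hypothetical; the sketch assumes category negatives are already known, for example from mutual exclusion.

```python
# Assumed signature table: relation name -> required argument categories.
ARG_TYPES = {"ceoOf": ("person", "company")}

def promote_relation(relation, args, positives, negatives):
    """Accept or reject a candidate relation instance by argument type.

    If any argument is a known negative of its required category, the
    candidate becomes a negative instance of the relation. Otherwise it is
    accepted and each argument becomes a positive instance of its category.
    """
    types = ARG_TYPES[relation]
    if any(arg in negatives.get(t, set()) for arg, t in zip(args, types)):
        negatives.setdefault(relation, set()).add(args)
        return False
    positives.setdefault(relation, set()).add(args)
    for arg, t in zip(args, types):
        positives.setdefault(t, set()).add(arg)
    return True

positives = {}
negatives = {"person": {"bicycle"}}  # "bicycle" is known not to be a person

rejected = promote_relation("ceoOf", ("bicycle", "Microsoft"), positives, negatives)
accepted = promote_relation("ceoOf", ("Steve Ballmer", "Microsoft"), positives, negatives)
```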

Coupled Bootstrap Learner

Knowledge Constraints Make Extraction Easier

Conclusion

• The paper clearly shows improvements from adding constraints.

• The approach could probably benefit from:
– adding probabilistic reasoning
– a larger corpus
– higher thresholds
– more contrastive categories
– other techniques discussed in this class

Questions
