Lecture 12: Information Extraction

Transcript of Lecture 12: Information Extraction

Page 1: Lecture 12: Information Extraction

Lecture 12: Information Extraction

Page 2: Lecture 12: Information Extraction

This Lecture

‣ How do we represent information for information extraction?

‣ Open Information Extraction

‣ Relation extraction

‣ Slot filling

‣ Semantic role labeling / abstract meaning representation

Page 3: Lecture 12: Information Extraction

Representing Information

Page 9: Lecture 12: Information Extraction

Semantic Representations

Example credit: Asad Sayeed

‣ “World” is a set of entities and predicates

person: Brutus, Caesar, Obama, Bush, …

stab: (Brutus, Caesar)

president: Obama, Bush, …

‣ Statements are logical expressions that evaluate to true or false

Brutus stabs Caesar: stab(Brutus, Caesar) => true

Caesar was stabbed: ∃x stab(x, Caesar) => true
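
The world model above can be sketched in a few lines of Python. This is a minimal illustration, not part of the lecture: the names WORLD, ENTITIES, holds, and exists are hypothetical, and each predicate is assumed to be stored as the set of argument tuples for which it is true.

```python
# Toy "world": each predicate maps to the set of argument tuples for which it holds.
WORLD = {
    "person": {("Brutus",), ("Caesar",), ("Obama",), ("Bush",)},
    "stab": {("Brutus", "Caesar")},
    "president": {("Obama",), ("Bush",)},
}
ENTITIES = {"Brutus", "Caesar", "Obama", "Bush"}


def holds(predicate, *args):
    """Evaluate an atomic statement such as stab(Brutus, Caesar)."""
    return tuple(args) in WORLD.get(predicate, set())


def exists(condition):
    """Evaluate an existential statement: is there an entity x with condition(x)?"""
    return any(condition(x) for x in ENTITIES)


# "Brutus stabs Caesar"
brutus_stabs_caesar = holds("stab", "Brutus", "Caesar")
# "Caesar was stabbed": the stabber is left unspecified, so quantify over it
caesar_was_stabbed = exists(lambda x: holds("stab", x, "Caesar"))
```

Both statements evaluate to true in this world, while holds("stab", "Caesar", "Brutus") is false because argument order matters.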

Page 18: Lecture 12: Information Extraction

Neo-Davidsonian Events

Example credit: Asad Sayeed

Brutus stabbed Caesar with a knife at the theater on the Ides of March

∃e stabs(e, Brutus, Caesar) ∧ with(e, knife) ∧ location(e, theater) ∧ time(e, Ides of March)

‣ Lets us describe events as having properties

‣ Unified representation of events and entities:

some clever driver in America

∃x driver(x) ∧ clever(x) ∧ location(x, America)
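
One practical consequence of treating modifiers as separate conjuncts on the event variable is that dropping a conjunct yields a weaker statement that still follows. A minimal sketch, assuming conjuncts are stored as tuples and that both formulas use the same variable name (so no unification is needed); the names full, core, with_gun, and entails are illustrative only.

```python
# Each conjunct is a tuple (predicate, arg1, arg2, ...); "e" names the event variable.
full = {
    ("stabs", "e", "Brutus", "Caesar"),
    ("with", "e", "knife"),
    ("location", "e", "theater"),
    ("time", "e", "Ides of March"),
}


def entails(premise, hypothesis):
    """A conjunction entails any conjunction built from a subset of its conjuncts."""
    return hypothesis <= premise


# "Brutus stabbed Caesar" keeps only the core conjunct.
core = {("stabs", "e", "Brutus", "Caesar")}
# "Brutus stabbed Caesar with a gun" adds a conjunct the premise does not contain.
with_gun = core | {("with", "e", "gun")}
```

Here entails(full, core) holds, while entails(full, with_gun) and entails(core, full) do not.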

Page 26: Lecture 12: Information Extraction

Real Text

Barack Obama signed the Affordable Care Act on Tuesday. He gave a speech later that afternoon on how the act would help the American people. Several prominent Republicans were quick to denounce the new law.

Callouts on the text: which Tuesday? who? which afternoon? ???

∃e sign(e, Barack Obama) ∧ patient(e, ACA) ∧ time(e, Tuesday)

‣ Need to impute missing information, resolve coreference, etc.

‣ Still unclear how to represent some things precisely or how that information could be leveraged (several prominent Republicans)

Page 31: Lecture 12: Information Extraction

Other Challenges

Bob and Alice were friends until he moved away to attend college

∃e1 ∃e2 friends(e1, Bob, Alice) ∧ moved(e2, Bob) ∧ end_of(e1, e2)

‣ How to represent temporal information?

Bob and Alice were friends until around the time he moved away to attend college

‣ Representing truly open-domain information is very complicated! We don’t have a formal representation that can capture everything

Page 37: Lecture 12: Information Extraction

(At least) Three Solutions

‣ Entity-relation-entity triples: focus on entities and their relations (note that prominent events can still be entities)

(Lady Gaga, singerOf, Bad Romance)

‣ Crafted annotations to capture some subset of phenomena: predicate-argument structures (semantic role labeling), time (temporal relations), …

‣ Slot filling: specific ontology, populate information in a predefined way

(Earthquake: magnitude=8.0, epicenter=central Italy, …)

Page 41: Lecture 12: Information Extraction

Open IE

‣ Entity-relation-entity triples aren’t necessarily grounded in an ontology

‣ Extract strings and let a downstream system figure it out

Barack Obama signed the Affordable Care Act on Tuesday. He gave a speech later that afternoon on how the act would help the American people. Several prominent Republicans were quick to denounce the new law.

(Barack Obama, signed, the Affordable Care Act)

(Several prominent Republicans, denounce, the new law)

Page 42: Lecture 12: Information Extraction

IE: The Big Picture

‣ How do we represent information? What do we extract?

‣ Slot fillers

‣ Entity-relation-entity triples (fixed ontology or open)

‣ Semantic roles

‣ Abstract meaning representation

Page 43: Lecture 12: Information Extraction

Semantic Role Labeling / Abstract Meaning Representation

Page 48: Lecture 12: Information Extraction

Semantic Role Labeling

Figure from He et al. (2017)

‣ Identify predicate, disambiguate it, identify that predicate’s arguments

‣ Verb roles from PropBank (Palmer et al., 2005)

[Figure: PropBank roles for the verb “quicken”]

Page 52: Lecture 12: Information Extraction

Semantic Role Labeling

Figure from He et al. (2017)

‣ Identify predicates (love) using a classifier (not shown)

‣ Identify ARG0, ARG1, etc. as a tagging task with a BiLSTM conditioned on love

‣ Other systems incorporate syntax, joint predicate-argument finding
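
Framing argument identification as tagging means the tagger emits one label per token, and argument spans are read off afterwards. The sketch below assumes a BIO tag scheme (B-ARG0, I-ARG1, O, …), which is a common choice but is not stated on the slide; the sentence and the function name bio_to_spans are made up for illustration, and the tagger itself (the BiLSTM) is not shown.

```python
def bio_to_spans(tokens, tags):
    """Convert per-token BIO tags into (label, text) argument spans."""
    spans = []
    label, start = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes a trailing span
        inside = tag.startswith("I-") and label == tag[2:]
        if label is not None and not inside:
            # The current span ends just before token i.
            spans.append((label, " ".join(tokens[start:i])))
            label, start = None, None
        if tag.startswith("B-") or (tag.startswith("I-") and label is None):
            # Start a new span (a stray I- tag is treated like a B- tag).
            label, start = tag[2:], i
    return spans


# Hypothetical tagger output for the predicate "loves".
tokens = ["She", "loves", "the", "old", "house"]
tags = ["B-ARG0", "B-V", "B-ARG1", "I-ARG1", "I-ARG1"]
spans = bio_to_spans(tokens, tags)
```

For this input the spans are ARG0 = "She", V = "loves", and ARG1 = "the old house".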

Page 56: Lecture 12: Information Extraction

SRL for QA

Shen and Lapata (2007)

‣ Question and several answer candidates

Q: Who discovered prions?

AC1: In 1997, Stanley B. Prusiner, a scientist in the United States, discovered prions…

AC2: Prions were researched by…

‣ Score by matching expected answer phrase (EAP) against answer candidate (AC)

Page 60: Lecture 12: Information Extraction

Abstract Meaning Representation

Banarescu et al. (2014)

‣ Graph-structured annotation

The boy wants to go

‣ Superset of SRL: full sentence analyses, contains coreference and multi-word expressions as well

‣ F1 scores in the 60s: hard!

‣ So comprehensive that it’s hard to predict, but still doesn’t handle tense or some other things…
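
To make "graph-structured" concrete, the AMR for "The boy wants to go" can be written as a set of labeled edges. The sketch assumes the commonly published analysis of this sentence, with frame labels want-01 and go-01; the variable names and the helper incoming are illustrative only.

```python
# AMR for "The boy wants to go", in PENMAN notation (assumed analysis):
#   (w / want-01 :ARG0 (b / boy) :ARG1 (g / go-01 :ARG0 b))
instances = {"w": "want-01", "b": "boy", "g": "go-01"}
edges = [
    ("w", "ARG0", "b"),  # the boy is the wanter
    ("w", "ARG1", "g"),  # the thing wanted is the going event
    ("g", "ARG0", "b"),  # the same boy is the goer
]


def incoming(node):
    """Return (source, role) pairs for every edge pointing at node."""
    return [(src, role) for src, role, tgt in edges if tgt == node]


# Nodes with more than one incoming edge are what make this a graph, not a tree.
reentrant = [n for n in instances if len(incoming(n)) > 1]
```

The only reentrant node is b: the boy is an argument of both want-01 and go-01, which is how AMR encodes the within-sentence coreference mentioned above.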

Page 64: Lecture 12: Information Extraction

Summarization with AMR

Liu et al. (2015)

‣ Merge AMRs across multiple sentences

‣ Summarization = subgraph extraction

‣ No real systems actually work this way (more when we talk about summarization)

Page 65: Lecture 12: Information Extraction

Slot Filling

Page 69: Lecture 12: Information Extraction

Slot Filling

Freitag and McCallum (2000)

‣ Most conservative, narrow form of IE

Indian Express: A massive earthquake of magnitude 7.3 struck Iraq on Sunday, 103 kms (64 miles) southeast of the city of As-Sulaymaniyah, the US Geological Survey said, reports Reuters. US Geological Survey initially said the quake was of a magnitude 7.2, before revising it to 7.3.

Labeled slots: magnitude, time, epicenter

Speaker: Alan Clark

“Gender Roles in the Holy Roman Empire”

Allagher Center Main Auditorium

This talk will discuss…

Labeled slots: speaker, title, location

‣ Old work: HMMs, later CRFs trained per role
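
The idea of a fixed template with one extractor per slot can be illustrated with hand-written patterns. This is deliberately not the HMM/CRF approach named above: regular expressions are swapped in only to keep the sketch short, and SLOT_PATTERNS, fill_slots, and the choice of what counts as the epicenter span are all assumptions.

```python
import re

text = ("A massive earthquake of magnitude 7.3 struck Iraq on Sunday, "
        "103 kms (64 miles) southeast of the city of As-Sulaymaniyah")

# Hypothetical hand-written patterns, one per slot of an "Earthquake" template.
SLOT_PATTERNS = {
    "magnitude": re.compile(r"magnitude (\d+(?:\.\d+)?)"),
    "time": re.compile(r"on (Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday)"),
    "epicenter": re.compile(r"(\d+ kms .*? of the city of [A-Z][\w-]+)"),
}


def fill_slots(document):
    """Fill each slot with the first match of its pattern, or None if absent."""
    template = {}
    for slot, pattern in SLOT_PATTERNS.items():
        match = pattern.search(document)
        template[slot] = match.group(1) if match else None
    return template


earthquake = fill_slots(text)
```

A learned sequence model per slot plays the same role as each pattern here, but generalizes beyond the exact phrasings a pattern author anticipated.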

Page 70: Lecture 12: Information Extraction

Slot Filling: MUC

Haghighi and Klein (2010)

‣ Key aspect: need to combine information across multiple mentions of an entity using coreference

Page 71: Lecture 12: Information Extraction

Slot Filling: Forums

Portnoff et al. (2017), Durrett et al. (2017)

‣ Extract product occurrences in cybercrime forums, but not everything that looks like a product is a product

[Figure callout: Not a product in this context]

Page 72: Lecture 12: Information Extraction

Relation Extraction

Page 79: Lecture 12: Information Extraction

Relation Extraction

ACE (2003-2005)

‣ Extract entity-relation-entity triples from a fixed inventory

During the war in Iraq, American journalists were sometimes caught in the line of fire

Labeled relations: Located_In, Nationality

‣ Use NER-like system to identify entity spans, classify relations between entity pairs with a classifier

‣ Systems can be feature-based or neural, look at surface words, syntactic features (dependency paths), semantic roles

‣ Problem: limited data for scaling to big ontologies

Page 85: Lecture 12: Information Extraction

Hearst Patterns

Hearst (1992)

‣ Syntactic patterns especially for finding hypernym-hyponym pairs (“is a” relations)

Y is a X: Berlin is a city

X such as [list]: cities such as Berlin, Paris, and London.

other X including Y: other cities including Berlin

‣ Totally unsupervised way of harvesting world knowledge for tasks like parsing and coreference (Bansal and Klein, 2011-2012)
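
The three patterns above can be approximated with regular expressions over raw strings. This is a simplified sketch: it treats each X and Y as a single word rather than a parsed noun phrase, so it is a stand-in for the syntactic patterns, and the function name hearst_pairs is made up.

```python
import re


def hearst_pairs(sentence):
    """Return (hyponym, hypernym) pairs found by three surface patterns."""
    pairs = []
    # "X such as A, B, and C": every list item is a hyponym of X
    m = re.search(r"(\w+) such as ([^.]+)", sentence)
    if m:
        items = re.split(r",\s*(?:and\s+)?|\s+and\s+", m.group(2))
        pairs += [(item.strip(), m.group(1)) for item in items if item.strip()]
    # "other X including Y"
    m = re.search(r"other (\w+) including (\w+)", sentence)
    if m:
        pairs.append((m.group(2), m.group(1)))
    # "Y is a X"
    m = re.search(r"(\w+) is an? (\w+)", sentence)
    if m:
        pairs.append((m.group(1), m.group(2)))
    return pairs


such_as = hearst_pairs("cities such as Berlin, Paris, and London.")
is_a = hearst_pairs("Berlin is a city")
including = hearst_pairs("other cities including Berlin")
```

Note that the hypernym comes back in whatever surface form the text used ("cities" vs. "city"), so a real pipeline would still need lemmatization before merging pairs.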

Page 90: Lecture 12: Information Extraction

Distant Supervision

Mintz et al. (2009)

‣ Lots of relations in our knowledge base already (e.g., 23,000 film-director relations); use these to bootstrap more training data

‣ If two entities in a relation appear in the same sentence, assume the sentence expresses the relation

[Steven Spielberg]’s film [Saving Private Ryan] is loosely based on the brothers’ story (labeled Director)

Allison co-produced the Academy Award-winning [Saving Private Ryan], directed by [Steven Spielberg] (labeled Director)

Page 91: Lecture 12: Information Extraction

Distant Supervision

Mintz et al. (2009)

‣ Learn decently accurate classifiers for ~100 Freebase relations

‣ Could be used to crawl the web and expand our knowledge base
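
The labeling heuristic can be sketched as a loop over sentences and knowledge-base pairs. This is a toy version under stated assumptions: entity mentions are found by exact substring match rather than by an NER system, the names KB and distant_labels are hypothetical, and the third sentence is invented to show a non-match.

```python
# Toy knowledge base: (entity1, entity2) -> relation
KB = {("Steven Spielberg", "Saving Private Ryan"): "Director"}

sentences = [
    "Steven Spielberg's film Saving Private Ryan is loosely based on the brothers' story",
    "Allison co-produced the Academy Award-winning Saving Private Ryan, directed by Steven Spielberg",
    "Steven Spielberg was born in Cincinnati",  # invented: mentions only one entity of the pair
]


def distant_labels(kb, corpus):
    """Label every sentence mentioning both entities of a KB pair with that pair's relation."""
    examples = []
    for sentence in corpus:
        for (e1, e2), relation in kb.items():
            if e1 in sentence and e2 in sentence:
                examples.append((sentence, e1, e2, relation))
    return examples


training_data = distant_labels(KB, sentences)
```

The first two sentences become Director training examples. The first one only loosely implies directing, which illustrates the label noise this assumption introduces.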

Page 92: Lecture 12: Information Extraction

Open IE

Page 93: Lecture 12: Information Extraction

Open Information Extraction

‣ Typically no fixed relation inventory

‣ “Open”ness: want to be able to extract all kinds of information from open-domain text

‣ Acquire commonsense knowledge just from “reading” about it, but need to process lots of text (“machine reading”)

Page 97: Lecture 12: Information Extraction

TextRunner

Banko et al. (2007)

‣ Extract positive examples of (e, r, e) triples via parsing and heuristics

Barack Obama, 44th president of the United States, was born on August 4, 1961 in Honolulu

=> Barack_Obama, was born in, Honolulu

‣ Train a Naive Bayes classifier to filter triples from raw text: uses features on POS tags, lexical features, stopwords, etc.

‣ 80x faster than running a parser (which was slow in 2007…)

‣ Use multiple instances of extractions to assign probability to a relation
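
One simple way to turn repeated extractions into a probability is a noisy-or combination: a triple is wrong only if every individual extraction of it is wrong. The slides do not say which model TextRunner uses, so noisy-or is an assumed stand-in here, and the per-extraction confidences are invented.

```python
from collections import defaultdict


def combine_noisy_or(extractions):
    """extractions: iterable of (triple, confidence). Returns triple -> combined probability."""
    prob_all_wrong = defaultdict(lambda: 1.0)
    for triple, confidence in extractions:
        # Assume extractions are independent: multiply the chances that each is wrong.
        prob_all_wrong[triple] *= (1.0 - confidence)
    return {triple: 1.0 - p for triple, p in prob_all_wrong.items()}


born_in = ("Barack_Obama", "was born in", "Honolulu")
lived_in = ("Barack_Obama", "lived in", "Honolulu")  # hypothetical second triple
combined = combine_noisy_or([(born_in, 0.5), (born_in, 0.5), (lived_in, 0.5)])
```

A triple seen twice at confidence 0.5 ends up at 0.75, while one seen once stays at 0.5. The independence assumption is optimistic when many web pages copy the same sentence.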

Page 102: Lecture 12: Information Extraction

Exploiting Redundancy

Banko et al. (2007)

‣ 9M web pages / 133M sentences

‣ 2.2 tuples extracted per sentence, filter based on probabilities

‣ Concrete: definitely true. Abstract: possibly true but underspecified

‣ Hard to evaluate: can assess precision of extracted facts, but how do we know recall?

Page 104: Lecture 12: Information Extraction

ReVerb

Fader et al. (2011)

‣ More constraints: open relations have to begin with verb, end with preposition, be contiguous (e.g., was born on)

‣ Extract more meaningful relations, particularly with light verbs

Page 106: Lecture 12: Information Extraction

ReVerb

Fader et al. (2011)

‣ For each verb, identify the longest sequence of words following the verb that satisfy a POS regex (V .* P) and which satisfy heuristic lexical constraints on specificity

‣ Find the nearest arguments on either side of the relation

‣ Annotators labeled relations in 500 documents to assess recall
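
The two extraction steps above can be sketched by encoding POS tags as one-letter codes and running a regex over the code string. This is a rough illustration under stated assumptions: the tagset (V verb, P preposition, N noun, D determiner, A adjective/adverb) is a toy one, input is assumed to be already tagged, arguments are taken to be the adjacent noun runs, and the lexical specificity constraints are omitted entirely.

```python
import re

RELATION = re.compile(r"V.*P|V")        # the slide's V.*P, falling back to a bare verb
ARGUMENT_RIGHT = re.compile(r"[DAN]*N")  # a determiner/adjective/noun run ending in a noun


def extract(tagged):
    """tagged: list of (word, code). Returns (arg1, relation, arg2) triples."""
    words = [w for w, _ in tagged]
    codes = "".join(c for _, c in tagged)
    triples = []
    pos = 0
    while pos < len(codes):
        m = RELATION.match(codes, pos)  # greedy .* gives the longest match from this verb
        if not m:
            pos += 1
            continue
        left = re.search(r"[DAN]*N$", codes[:m.start()])  # noun run ending right before it
        right = ARGUMENT_RIGHT.match(codes, m.end())      # noun run starting right after it
        if left and right:
            triples.append((
                " ".join(words[left.start():left.end()]),
                " ".join(words[m.start():m.end()]),
                " ".join(words[right.start():right.end()]),
            ))
        pos = m.end()
    return triples


light_verb = [("Faust", "N"), ("made", "V"), ("a", "D"), ("deal", "N"),
              ("with", "P"), ("the", "D"), ("devil", "N")]
born = [("Obama", "N"), ("was", "V"), ("born", "V"), ("in", "P"), ("Honolulu", "N")]
triples = extract(light_verb) + extract(born)
```

The light-verb sentence yields "made a deal with" as the relation instead of the uninformative "made". Without the lexical constraints, though, the greedy match would also swallow long, overly specific phrases in sentences with several prepositions.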

Page 107: Lecture 12: Information Extraction

QA from Open IE

Choi et al. (2015)

Page 108: Lecture 12: Information Extraction

Takeaways

‣ Relation extraction: can collect data with distant supervision, use this to expand knowledge bases

‣ Slot filling: tied to a specific ontology, but gives fine-grained information

‣ Open IE: extracts lots of things, but hard to know how good or useful they are

‣ Can combine with standard question answering

‣ Add new facts to knowledge bases

‣ SRL/AMR: handle a bunch of phenomena, but more or less like syntax++ in terms of what they represent

‣ Many, many applications and techniques