Leveraging Procedural Knowledge for Task-Oriented Search Zi Yang, Eric Nyberg Language Technologies Institute School of Computer Science Carnegie Mellon University {ziy, ehn}@cs.cmu.edu


Page 2: Outline

• Background
• Problem Definition
• Proposed Approach
• Experiment
• Conclusion

Page 3: Entity-centric vs. Task-oriented Search

• Entity-centric search
  – Seek an attribute, feature, related entity, action, etc.
• Task-oriented search
  – Solution seeking and decision support.

Example task: organize a conference. Subtasks shown in the diagram:
• choose a hotel
• compare banquet options
• recruit volunteers
• contact the publisher
• consider the number and size of conference rooms
• arrange meal catering and menu plan
• check for discounted rate

How do searchers accomplish tasks using interactive search?
• Decompose the task into required subtasks manually
• Formulate queries manually

Page 4: How do Search Engines Assist Searchers?

• Query suggestion as an example

Entity-centric search (existing solutions)
  – Suggest attribute, feature, related entity, action, etc.
  – Descriptive knowledge: knowledge of attributes and features
  – Descriptive knowledge base

Task-oriented search (problem studied in this work)
  – Suggest required subtasks, actions, solutions, etc.
  – Procedural knowledge: knowledge exercised in the accomplishment of a task, i.e. how to do things
  – Procedural knowledge base

Page 5: Think Reversely!

• Can we learn procedural knowledge from users’ search activities and/or query suggestions, and build a PKB automatically?

Task-oriented search (problem also studied in this work)
  – Suggest required subtasks, actions, solutions, etc.
  – Procedural knowledge: knowledge exercised in the accomplishment of a task, i.e. how to do things
  – Procedural knowledge base: automatically built PKB

Page 6: Related Work

• Search intent & task-oriented search
  – Complex search task assistant from query log [Hassan et al. 2012, 2014]
  – Task-oriented questions and how-to Web queries [Weber 2012]
  – IMine, Subtask Mining @ NTCIR [Liu 2014]
• Procedural knowledge acquisition
  – Ontologies proposed for structured representation of procedural knowledge [Fukazawa 2010, Pareti 2014]
  – Extraction based on structural information [Jung 2010], definition of rules or templates [Addis 2009]
  – Terminology: goal vs. target vs. purpose, instruction vs. action sequence, step vs. action, etc.

Page 7: Outline

• Background
• Problem Definition
  – Terminology
  – Problem 1: Search Task Suggestion (STS)
  – Problem 2: Automatic Procedural Knowledge Base Construction (APKBC)
  – STS and APKBC
• Proposed Approach
• Experiment
• Conclusion

Page 8: Terminology

Procedural knowledge graph/base (PKB)

Example articles: How to Clean a Birdbath; How to Fix a Leaky Faucet

• A task, with
  – a short and concise summary
  – a detailed explanation
• Is-achieved-by relation between a parent task and a list of subtasks
  – Numbered “Steps”
  – Bulleted substeps
  – Outgoing free links

Page 9: Problem 1: Search Task Suggestion (STS)

• When users turn to search engines for information seeking and problem solving, how can existing procedural knowledge be leveraged to suggest sub search tasks (i.e. queries)?

Search Task Suggestion: Given a procedural knowledge graph G and a task-oriented search q, we aim to
1(a) identify the task t from T that the user intends to accomplish
1(b) retrieve a list of n subtasks s1, …, sn
1(c) suggest the corresponding sub search tasks p1, …, pk

Diagram: the task-oriented search side holds the search task q and the search tasks p1, …, pk; the procedural knowledge base side holds the task t and the tasks s1, …, sn.

Page 10: Problem 2: Automatic Procedural Knowledge Base Construction (APKBC)

• Users still face ad hoc situations (tasks) that are not covered by an existing PKB, but other searchers may have interacted with search engines to attempt a solution.
• Can we construct a PKB using search queries and relevant documents returned from search engines?

Automatic Procedural Knowledge Base Construction: Given a task t, we aim to
2(a) identify a search task q
2(b) collect k related search tasks p1, …, pk
2(c) identify n (≤ k) search tasks to generate n tasks s1, …, sn, with text descriptions, that can be performed to accomplish the task t

Diagram: the task-oriented search side holds the search task q and the search tasks p1, …, pk; the procedural knowledge base side holds the task t and the tasks s1, …, sn.

Page 11: Outline

• Background
• Problem Definition
• Proposed Approach
  – Basic Idea
  – Three-way Parallel Corpus Construction
  – Feature Definition and Model Construction
• Experiment
• Conclusion

Page 12: Queryable Phrase/Task Description Extraction: Basic Idea

• Joint learning from available artifacts

Existing PKBs
• Can indicate how to accomplish tasks
• Are not optimized for interactive search

Existing search log
• Can reveal how to formulate queries
• Cannot cover how to search for procedural knowledge

Existing Web documents
• Can exemplify how to describe tasks
• Do not focus on procedure

Can we take advantage of all the artifacts and have them learn from each other?

Diagram labels: Query phrase extraction; Three-way parallel corpus construction; Task description extraction

Page 13: Three-way Parallel Corpus Construction

• Parallel corpus := a set of matching triples ⟨a query q, a task t, a textual context c⟩
• Example: Grow Taller, http://www.wikihow.com/Grow-Taller
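The triple definition above can be pictured as a small data structure. The following is a minimal sketch, not the authors' code: the class name `Triple` and its field names are hypothetical, and the example values are taken from the Grow Taller slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One entry of the three-way parallel corpus: <query q, task t, context c>."""
    query: str    # q: a search query
    task: str     # t: a task description from the PKB
    context: str  # c: text retrieved from the Web

# The corpus is a set of matching triples; frozen dataclasses are hashable.
corpus = {
    Triple(
        query="grow taller",
        task="Grow Taller",
        context="The growth hormone (HGH) is produced naturally in the "
                "pituitary gland during deep or slow wave sleep.",
    )
}
```

Using a frozen dataclass keeps triples immutable and lets the corpus be a plain Python set, which matches the "set of matching triples" wording on the slide.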

Page 14: Three-way Parallel Corpus Construction (cont’d)

• Step 1: Extracting seed triples from search query log
  – Scan through the entire search query log to find each query q that matches the description of task t.
  – Extract the textual content from the top relevant documents to retrieve the context c.

Task descriptions in PKBs (Grow Taller)
• If you’re from a tall family and you’re not growing by your mid-teens, or if your height hasn’t changed much from before puberty or during puberty, then it’s a good idea to see a doctor…
• The human growth hormone (HGH) is produced naturally in our bodies, especially during deep or slow wave sleep. Getting good, sound sleep will encourage the production of HGH, which is created in the pituitary gland.
• …There are tons of “grow taller” exercises on the Internet, which claim to help you grow…

Contexts retrieved from the Web
• …If you’re from a tall family and you’re not growing by your mid-teens, or if your height hasn’t changed much from before puberty to during puberty, then it’s a good idea to see a doctor.
• The growth hormone (HGH) is produced naturally in the pituitary gland during deep or slow wave sleep.

Search queries in a session: grow taller

Exact matching is used in the experiment.
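Step 1 relies on exact matching between log queries and task summaries. The sketch below shows one way this could work, assuming the normalization mentioned in the data preparation slides (lowercasing and dropping non-alphanumeric characters). The function names are hypothetical, and replacing punctuation with a space rather than deleting it is an assumption.

```python
import re


def normalize(text: str) -> str:
    """Lowercase, replace non-alphanumeric characters with spaces, collapse whitespace."""
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return " ".join(text.split())


def match_queries_to_tasks(queries, task_summaries):
    """Return (query, summary) pairs whose normalized forms are identical."""
    by_norm = {}
    for summary in task_summaries:
        by_norm.setdefault(normalize(summary), []).append(summary)
    return [(q, s) for q in queries for s in by_norm.get(normalize(q), [])]
```

Indexing the summaries by their normalized form makes the scan over a large query log a sequence of dictionary lookups instead of pairwise comparisons.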

Page 15: Three-way Parallel Corpus Construction (cont’d)

• Step 2 (optional): Manually creating search tasks for tasks in the PKB
  – Use the summary of the task t to form a search query q and issue it to the search engine to extract context c.
  – Exclude this triple due to “artificiality”!


Search queries in a session: grow taller

Page 16: Three-way Parallel Corpus Construction (cont’d)

• Step 3: Collecting related queries
  – Combine the user-issued queries from the same session (from Step 1) and the list of queries suggested by the search engine (from Steps 1 and 2).


Search queries in a session: grow taller; human growth hormone; grow taller exercises
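The log-based half of Step 3 can be sketched as follows. It assumes the session rule stated in the data preparation slides (queries issued by the same user within 30 minutes after the matching query). The record layout `(user_id, timestamp, query)` and the function name are hypothetical, and queries suggested by a search engine would have to be merged in separately.

```python
from datetime import datetime, timedelta


def related_queries(log, seed_query, window=timedelta(minutes=30)):
    """Collect queries issued by the same user within `window` after the seed query.

    `log` is a list of (user_id, timestamp, query) tuples (hypothetical layout).
    """
    seeds = [(user, ts) for user, ts, query in log if query == seed_query]
    related = []
    for user, ts, query in log:
        if query == seed_query or query in related:
            continue
        # Keep the query if it follows any seed occurrence by the same user.
        if any(u == user and timedelta(0) < ts - s <= window for u, s in seeds):
            related.append(query)
    return related
```

A fixed time window is a simple session heuristic; a later slide notes that low-quality related queries may stem from an over-simplified session detection method.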

Page 17: Three-way Parallel Corpus Construction (cont’d)

• Step 4: Expanding parallel corpus
  – For each related query p, find the subtask among s1, …, sn that contains p in its summary or explanation, and retrieve its context d.
  – Discard unmatched related queries or task descriptions.


Search queries in a session: grow taller; human growth hormone; grow taller exercises

Exact matching is used in the experiment.

Page 18: Three-way Parallel Corpus Construction (cont’d)

• Step 5: Annotating BIO
  – Find the contiguous sequence of words from the task t (context c) that is most relevant to the query q (task t’s summary or explanation).


Search queries in a session: grow taller; human growth hormone; grow taller exercises

BIO labels shown on the Grow Taller example: BQ IQ; BQ IQ IQ; BTE ITE …

Exact matching is used for annotating the task in the experiment.

For the context, the experiment selected the sentences that contain all the tokens in the task summary and 70%+ of the tokens in the task explanation, and annotated the minimal span that contains those overlapping tokens.
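BIO annotation by exact matching can be illustrated with a short sketch. It assumes pre-tokenized input and case-insensitive matching, neither of which is specified on the slide. The function name `bio_annotate` is hypothetical, and the label names follow the slide's BQ/IQ/O scheme.

```python
def bio_annotate(tokens, phrase_tokens, prefix="Q"):
    """Label the first exact occurrence of `phrase_tokens` inside `tokens`.

    Returns one label per token: B<prefix> for the first matched token,
    I<prefix> for the remaining matched tokens, and O everywhere else.
    """
    labels = ["O"] * len(tokens)
    n = len(phrase_tokens)
    if n == 0:
        return labels
    lowered = [t.lower() for t in tokens]
    target = [t.lower() for t in phrase_tokens]
    for start in range(len(tokens) - n + 1):
        if lowered[start:start + n] == target:
            labels[start] = "B" + prefix
            for i in range(start + 1, start + n):
                labels[i] = "I" + prefix
            break
    return labels
```

Passing `prefix="TS"` or `prefix="TE"` produces the summary and explanation label sets (BTS/ITS, BTE/ITE) with the same routine.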

Page 19: Feature Definition

• Feature list for both context and task

Category | Description/Motivation | Count
Location (LOC) | Appears in the task summary and explanation. “Skimmable information that readers can quickly understand” should be provided in the title and the beginning sentence of each step. | 2
Part of speech (POS) | Both the article title and the first sentence in each step begin with a verb in bare infinitive form. | 36
Parse (PAR) | Basic Stanford dependency types | 50
Parse (PAR) | Named entity, noun phrase, verb phrase. Identify the task facets (subsidiary resources or constraints, etc.) | 3
Word | Surface, stem, TF-IDF score | 3
Context | Surface, stem, TF-IDF score, POS tags of previous/next word | 78

Page 20: Model Construction

• Word sequence labeling for query construction, task summary and explanation construction

Problem: word sequence labeling problems (all three)

Model
  – Query construction: $M_Q$
  – Task summary construction: $M_{TS}$
  – Task explanation construction: $M_{TE}$

Features: the same feature set, except that location is only used for query

Training set
  – Query construction: features $X_t$, labels $Y_t$ extracted from task description
  – Task summary and explanation construction: features $X_c$, labels $Y_c$ extracted from context

Prediction objective
  – Query construction: $y_t^* = \arg\max_{y_t \in \{\text{BQ}, \text{IQ}, \text{O}\}^{|t|}} p(y_t \mid x_t; M_Q)$
  – Task summary construction: $y_c^* = \arg\max_{y_c \in \{\text{BTS}, \text{ITS}, \text{O}\}^{|c|}} p(y_c \mid x_c; M_{TS})$
  – Task explanation construction: $y_c^* = \arg\max_{y_c \in \{\text{BTE}, \text{ITE}, \text{O}\}^{|c|}} p(y_c \mid x_c; M_{TE})$

Output
  – Query construction: $y_t^*$ = O … O BQ IQ IQ O … O
  – Task summary and explanation construction: $y_c^*$ = O … O BTS ITS ITS O … O BTE ITE ITE O … O
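Once a model has produced a label sequence such as O … O BQ IQ IQ O … O, the labeled phrase has to be read back out of the tokens. The following sketch shows that decoding step only. The function name `extract_spans` is hypothetical, and the labeling model itself is out of scope here.

```python
def extract_spans(tokens, labels, prefix="Q"):
    """Return the phrases marked B<prefix> I<prefix> ... in a BIO label sequence."""
    begin, inside = "B" + prefix, "I" + prefix
    spans, current = [], []
    for token, label in zip(tokens, labels):
        if label == begin:
            # A new span starts; close any span that is still open.
            if current:
                spans.append(" ".join(current))
            current = [token]
        elif label == inside and current:
            current.append(token)
        else:
            # Any other label (O or another label family) closes the span.
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans
```

Because the summary and explanation labels share one sequence in the context output, calling the function once with `prefix="TS"` and once with `prefix="TE"` separates the two.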

Page 21: STS and APKBC

Diagram linking task-oriented search (search task q; search tasks p1, …, pk) and the procedural knowledge base (task t; tasks s1, …, sn):

STS
1(a) identify task
1(b) retrieve subtasks
1(c) suggest and create sub search task

APKBC
2(a) identify search task
2(b) collect related search tasks
2(c) identify and generate subtasks

Callouts on the diagram:
• Exact matching or retrieval based method
• Need a search intent model to retrieve task-oriented search tasks (future work)
• Refer to PKB to retrieve related subtasks
• Generate queryable phrases/task descriptions using an algorithm that learns how searchers formulate queries/editors describe procedural knowledge

Page 22: Outline

• Background
• Problem Definition
• Proposed Approach
• Experiment
  – Data Preparation
  – Experiment Settings
  – Search Task Suggestion Result
  – Procedural Knowledge Base Construction Result
• Conclusion

Page 23: Data Preparation

• English wikiHow data dump
• AOL search query log
• Queries suggested by search engines
• Context extracted from search engines

Page 24: Experiment Settings

• Sequence labeling vs. end-to-end evaluation

Gold standard
  – Sequence labeling evaluation: automatically labeled parallel corpus
  – End-to-end evaluation: manual judgment

Test set
  – Sequence labeling evaluation: 10-fold cross validation
  – End-to-end evaluation: 50 randomly sampled triples

Evaluation methods
  – Sequence labeling evaluation: Precision, Recall, F-1, averaged on all test instances (macro-averaged) and on each task then across all tasks (micro-averaged); F-1 based ROUGE-2 and ROUGE-S4
  – End-to-end evaluation: macro-averaged and micro-averaged Precision@8, MAP

Baseline methods
  – Sequence labeling evaluation: CRF (proposed), HMM (surface), LR, SVM, feature ablation
  – End-to-end evaluation: Google, Bing, wikiHow

Feature extractors, learners
  – Stanford CoreNLP: sentence, token, stem, POS, dependency parse, chunk, named entity
  – MALLET: CRF, HMM
  – LibLinear: LR, SVM
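The end-to-end metrics named above, Precision@8 and MAP, can be sketched with their textbook definitions. The slides do not spell out the exact formulas, so two choices here are assumptions: missing ranks count as misses in Precision@k, and average precision is normalized by the number of relevant items in the ranked list. Function names are hypothetical.

```python
def precision_at_k(relevance, k=8):
    """Fraction of the top-k suggestions judged relevant (missing ranks count as misses)."""
    return sum(relevance[:k]) / k


def average_precision(relevance):
    """Mean of the precision values at each rank where a relevant item appears."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_average_precision(runs):
    """Average of average_precision over a non-empty list of ranked judgment lists."""
    return sum(average_precision(r) for r in runs) / len(runs)
```

Each `relevance` list holds 0/1 manual judgments in rank order, one list per test task.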

Page 25: Search Task Suggestion Result

• Query construction result
  – The proposed CRF-based approach outperforms other classifiers*, esp. independent classifiers (max. SVM).
  – Also outperforms each feature category** (max. W/WORD), and LOU study (ns) (max. W/OPOS).

[Figure: bar chart of Macro F1, Micro F1, ROUGE-2, and ROUGE-S4 (y-axis 0 to 0.8) for CRF, HMM, SVM, LR, TFIDF, W/POS, W/PAR, W/LOC, W/WORD, W/OPOS, W/OPAR, W/OLOC, W/OWORD, LOCAL, and CONTEXT. Data labels shown: .7471, .6930, .8112, .8087; .6855, .6612, .7922, .7892; .6803, .6175, .7713, .7657; .7466, .6870, .8113, .8082.]

Page 26: Search Task Suggestion Result (cont’d)

• End-to-end example
  – Slim down
  – Play red alert 2

Task: slim down

PROPOSED | GOOGLE | BING
weight loss | slim down diet | the slim down club
heavy food | 7 day slim down | how to slim down fast
junk food | weight loss | slim down challenge
keep up the mood | slim down thighs | how to slim down legs

Task: play red alert 2

PROPOSED | GOOGLE | BING
build a barracks | red alert 2 complete (iso) original 2 disc | play red alert 2 game
build a war factory | play red alert 2 free | play ra2 online
radar chould | play red alert 2 online free | red alert 2 download
build a power plant/tesla reactor | play red alert 3 | free red alert 3

Page 27: Search Task Suggestion Result (cont’d)

• End-to-end evaluation
  – The proposed approach is tailored for task-oriented search.
  – Current general-purpose commercial search engines are designed for entity-centric search.
  – Current search engines tend to suggest queries by appending keywords such as product, image, logo, online, free, etc.

[Figure: bar chart of Macro P, Micro P, and MAP (y-axis 0 to 0.5) for PROPOSED, GOOGLE, BING, and LOG. Data labels shown: .4457, .4457, .3361; .0972, .0973, .0553; .0333, .0313, .0120; .0676, .0612, .0549.]

Page 28: Automatic Procedural Knowledge Base Construction Result

• Task summary generation result
  – All scores are lower than in the query construction task.
  – CRF outperforms other classifiers* (max. SVM), each feature category (ns) (max. W/POS), and LOU study (ns) (max. W/OWORD).

[Figure: bar chart of Macro F1, Micro F1, ROUGE-2, and ROUGE-S4 (y-axis 0 to 0.4) for CRF, HMM, SVM, LR, TFIDF, W/POS, W/PAR, W/WORD, W/OPOS, W/OPAR, W/OWORD, LOCAL, and CONTEXT. Data labels shown: .4207, .3455, .4463, .4392; .1175, .1119, .2425, .2301; .3556, .3153, .3822, .3788; .4129, .3198, .4170, .4118.]

Page 29: Automatic Procedural Knowledge Base Construction Result (cont’d)

• Task explanation generation result
  – CRF outperforms other classifiers* (max. HMM, implying the importance of surface forms and sequence labeling nature).
  – Also outperforms each feature category (ns) (max. W/WORD).
  – LOU study shows W/OPAR performs the best in terms of ROUGE.

[Figure: bar chart of Macro F1, Micro F1, ROUGE-2, and ROUGE-S4 (y-axis 0 to 0.4) for CRF, HMM, SVM, LR, TFIDF, W/POS, W/PAR, W/WORD, W/OPOS, W/OPAR, W/OWORD, LOCAL, and CONTEXT. Data labels shown: .3853, .3577, .3698, .3686; .0000, .0050, .2450, .2324; .3639, .3176, .3489, .3472; .3718, .3468, .3804, .3793.]

Page 30: Automatic Procedural Knowledge Base Construction Result (cont’d)

• End-to-end example
  – Search engine would suggest “sign up for airbnb coupon” for “sign up for airbnb”, which implies an important resource for the task.

Task: sign up for airbnb
• Airbnb is no longer running the $50 OFF $200 promo but you can still save $25 OFF Your First Airbnb Stay of $75 or more by copying and pasting this link into your browser…

Task: make blueberry banana bread
• Please don’t use regular whole wheat in this recipe – the loaf will turn out very dense
• Add the wet ingredients – the egg mixture to the flour mixture and stir with a rubber spatula until just combined
• If you’re in need of a quick, easy and delicious way to use up the ripe babanas in your house… definitely

Task: become a cell phone dealer
• However, the cell phone provider may place restrictions on the manner in which you can use its company name, phone brands and images
• Visit the state’s business licensing agency’s website and your city’s occupational/business licensing department’s website to determine if you need a license for your prepaid cell phone business

Page 31: Automatic Procedural Knowledge Base Construction Result (cont’d)

• End-to-end evaluation
  – The automatic approach performs worse than manual curation in building a new PKB from scratch.
  – But it still discovers relevant subtasks that are not covered in the current PKB, delivering fresh information that a manual process can hardly add and update instantly.

[Figure: bar chart of Macro P, Micro P, and MAP (y-axis 0 to 1.0) for Proposed Summary Generation, Proposed Explanation Generation, and wikiHow. Data labels shown: .0997, .0995, .0527; .2046, .2041, .1331; .9677, .9515, .9404.]

Page 32: Outline

• Background
• Problem Definition
• Proposed Approach
• Experiment
• Conclusion

Page 33: Conclusion

• Investigated two problems
  – Search task suggestion using procedural knowledge
  – Automatic procedural knowledge base construction from search activities
• Proposed to create a three-way parallel corpus of queries, query contexts, and task descriptions.
• Applied CRF-based sequence labeling models for query construction and task description generation.
• Future work
  – User study
  – Joint ranking
  – APKBC using a natural language generation approach

Page 34: Thanks! Questions?

Code & Resources
http://github.com/ziy/pkb

Related Workshop Talk
Answering Task-Oriented Questions from the Web, WebQA Workshop, Thursday 11am

Contact
Zi Yang, Language Technologies Institute, School of Computer Science, Carnegie Mellon University, ziy@cs.cmu.edu

Acknowledgement
Travel is sponsored by SIGIR Student Travel Grant!

Page 35: Parallel Corpus Construction Result

• Related query to subtask mapping
  – Identified 1,182 query-task pairs using exact matching.
• Task to context mapping
  – Selected the sentences that contain all the tokens in the task summary and 70%+ of the tokens in the task explanation.
  – Annotated the minimal span that contains those overlapping tokens.
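The task-to-context mapping rule can be sketched as below. This is a simplified reading under stated assumptions: tokens are compared case-insensitively as distinct types, and the annotated span is taken from the first to the last overlapping token position, which covers all overlapping tokens but is not guaranteed to be the shortest possible window. The function name and signature are hypothetical.

```python
def select_and_span(sentence_tokens, summary_tokens, explanation_tokens, threshold=0.7):
    """Return (start, end) token indices of the covering span, or None if rejected.

    A sentence qualifies if it contains every summary token and at least
    `threshold` of the distinct explanation tokens.
    """
    sent = [t.lower() for t in sentence_tokens]
    sent_set = set(sent)
    summary = {t.lower() for t in summary_tokens}
    explanation = {t.lower() for t in explanation_tokens}
    if not summary <= sent_set:
        return None
    if explanation and len(explanation & sent_set) / len(explanation) < threshold:
        return None
    overlap = (summary | explanation) & sent_set
    positions = [i for i, tok in enumerate(sent) if tok in overlap]
    if not positions:
        return None
    return positions[0], positions[-1]
```

The returned indices are inclusive, so they can be turned directly into B/I labels for the tokens from `start` through `end`.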

Page 36: How Do Search Engines and Users Respond to Task-Oriented Queries?

• The number (and percentage) of suggested queries (or queries issued in the same session) that are mentioned within the description of some subtask.
  – “New Words”: e.g. slim down -> slim down diet
  – Low quality may be due to an over-simplified session detection method

[Figure: two bar charts for Google, Bing, and Log, each grouped by Full phrase and New words. Left panel: Averaged number (y-axis 0 to 0.8). Right panel: Percentage (%) (y-axis 0 to 10).]

Page 37: Search Task Suggestion

Given a task-oriented search task represented by query q:

(a) Identify task
  – Retrieve a list of candidate tasks from the PKB that mention the query q in either the summary or explanation.
  – Select the task t that maximizes the likelihood of each candidate occurrence, i.e. $p(y_t = \text{BQ IQ} \ldots \text{IQ} \mid x_t; M_Q)$.
(b) Retrieve subtasks
  – Retrieve the first-level subtasks s1, …, sn of task t.
(c) Suggest and create sub search task
  – Extract query candidates for each subtask si using $M_Q$ again.
  – Rank by $p(y_{s_i} = \text{BQ IQ} \ldots \text{IQ} \mid x_{s_i}; M_Q)$.
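The three steps above can be wired together as in the following sketch over a toy in-memory PKB. It is an illustration, not the released implementation: the dictionaries, the function name, and the two callables `score` and `extract_queries` are hypothetical stand-ins for the trained model $M_Q$.

```python
from typing import Callable, Dict, List, Tuple


def suggest_search_tasks(
    query: str,
    descriptions: Dict[str, str],
    subtasks: Dict[str, List[str]],
    score: Callable[[str, str], float],
    extract_queries: Callable[[str], List[str]],
) -> List[Tuple[str, float]]:
    """Three-step search task suggestion over a toy PKB.

    descriptions:    task id -> summary plus explanation text
    subtasks:        task id -> ids of first-level subtasks
    score:           stand-in for p(y = BQ IQ ... IQ | x; M_Q)
    extract_queries: stand-in for query candidate extraction with M_Q
    """
    # (a) Identify the task: candidates mention the query; keep the best-scoring one.
    candidates = [t for t, text in descriptions.items() if query.lower() in text.lower()]
    if not candidates:
        return []
    task = max(candidates, key=lambda t: score(query, descriptions[t]))
    # (b) Retrieve first-level subtasks, then (c) extract and rank query candidates.
    suggestions = []
    for sub in subtasks.get(task, []):
        text = descriptions[sub]
        for phrase in extract_queries(text):
            suggestions.append((phrase, score(phrase, text)))
    return sorted(suggestions, key=lambda item: item[1], reverse=True)
```

Injecting the scoring and extraction functions keeps the control flow testable without a trained sequence labeler.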

Page 38: Automatic Procedural Knowledge Base Construction

Given a task t:

(a) Identify search task
  – Apply $M_Q$ to extract a task-oriented search query q.
(b) Collect related search tasks
  – Identify the queries pi related to q in both search logs and suggested queries.
(c) Identify and generate subtasks
  – Extract relevant document snippets for each related query pi from search engines.
  – Apply $M_{TS}$ and $M_{TE}$ to extract the task summary and explanation.

Callouts:
• Search engines are able to correctly suggest related tasks to the user, rather than related entities or attributes.
• Search logs reveal how a specific user works to accomplish a task.

Page 39: Data Preparation

• English wikiHow data dump
  – Used a modified version of the WikiTeam tool.
  – Obtained 149,975 articles that are non-redirect, in namespace “0”, non-stub, with “Introduction” and “Steps”.
  – Created a PKB of 1,488,587 tasks and 1,439,217 relations.
• AOL search query log
  – 21M (10M unique) queries in total.
  – After lowercasing and removing non-alphanumeric characters, 639 unique queries match 619 task summaries (ignoring whitespace and punctuation marks).
  – Identified 33,548 related query candidates by collecting the queries that were issued by the same user within 30 minutes after issuing each matching query.

Page 40: Data Preparation (cont’d)

• Queries suggested by search engines
  – Randomly sampled 1,000 non-primitive tasks from the PKB that do not appear in the query log.
  – Collected 9,906 related queries suggested by Google (avg. 6.11, max. 8) and 9,715 related queries suggested by Bing (avg. 5.99, max. 13) for the 1,639 queries.
• Context extracted from search engines
  – Extracted URLs from Google’s first search result page, excluded the wikihow.com domain (for generalizability), the google.com domain, and URLs that have no subpaths (navigational search results), and downloaded 7,440 context documents.
  – Used Boilerpipe to extract 7,437 documents as contexts, and an additional 3,512 documents for end-to-end evaluation.
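The URL filtering rules for context documents can be sketched with the Python standard library. The function name `keep_url` is hypothetical, and treating subdomains such as `www.wikihow.com` as part of the excluded domain is an assumption.

```python
from urllib.parse import urlparse

EXCLUDED_DOMAINS = ("wikihow.com", "google.com")


def keep_url(url: str) -> bool:
    """Return True if the URL should be kept as a source of context documents."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    # Drop the excluded domains and any of their subdomains.
    if any(host == d or host.endswith("." + d) for d in EXCLUDED_DOMAINS):
        return False
    # A URL without a subpath is treated as a navigational search result.
    return parsed.path.strip("/") != ""
```

For example, `keep_url("http://www.wikihow.com/Grow-Taller")` is rejected because of its domain, while a bare homepage URL is rejected because it has no subpath.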

Page 41: Search Task Suggestion Result (cont’d)

• 5 most contributing non-word features
  – Query phrases are more likely extracted from the summary part of a description due to its clarity and conciseness.
  – Singular nouns and verbs are indicators to begin a query.
  – Verb phrase is used to decide whether to continue a query.

Rank | O → BQ | BQ → IQ | IQ → IQ
1 | POS:NNP | POS-1:VB | LOC:sum
2 | LOC:sum | LOC:sum | POS-1:IN
3 | DEP:ccomp | POS-1:VBP | VP
4 | POS:VB | POS-1:NNP | DEP:dobj
5 | DEP:nsubjpass | POS-1:NN | POS+1:JJ

Page 42: Automatic Procedural Knowledge Base Construction Result (cont’d)

• 5 most contributing non-word features
  – Nouns and verbs are crucial for constructing task descriptions.
  – Verbs are preferred over nouns to begin the summary.
  – To begin an explanation, the model prefers the “begin” of a sentence and a dependency label of nsubj.
  – Verb phrases are also important.

Summary

Rank | O → BTS | BTS → ITS | ITS → ITS
1 | POS:VB | POS-1:VB | POS-1:VBP
2 | POS:VBP | POS-1:VBP | POS:NNP
3 | POS:NN | POS-1:NNP | POS-1:IN
4 | DEP:appos | POS-1:NN | DEP:xcomp
5 | POS:NNP | DEP:case | POS:JJR

Explanation

Rank | O → BTE | BTE → ITE | ITE → ITE
1 | Begin | VP | POS-1:NN
2 | POS:VBG | POS-1:NN | VP
3 | POS:NN | POS-1:DT | POS-1:NNS
4 | DEP:compound | NP | POS-1:,
5 | DEP:nsubj | POS-1:VB | POS-1:NNP