Question Answering: Overview
Ling573: Systems & Applications
March 31, 2011
Quick Schedule Notes
- No Treehouse this week!
- CS Seminar: Retrieval from Microblogs (Metzler), April 8, 3:30pm; CSE 609
Roadmap
- Dimensions of the problem
- A (very) brief history
- Architecture of a QA system
- QA and resources
- Evaluation
- Challenges
- Logistics check-in
Dimensions of QA
- Basic structure:
  - Question analysis
  - Answer search
  - Answer selection and presentation
- Rich problem domain: tasks vary on
  - Applications
  - Users
  - Question types
  - Answer types
  - Evaluation
  - Presentation
Applications
- Applications vary by answer source:
  - Structured: e.g., database fields
  - Semi-structured: e.g., database with comments
  - Free text:
    - Web
    - Fixed document collection (typical TREC QA)
    - Book or encyclopedia
    - Specific passage/article (reading comprehension)
- Media and modality: within- or cross-language; video/images/speech
Users
- Novice:
  - Needs to understand the capabilities/limitations of the system
- Expert:
  - Assumed familiar with the system's capabilities
  - Wants efficient information access
  - May be willing to set up a profile
Question Types
- Factual vs. opinion vs. summary
- Factual questions:
  - Yes/no; wh-questions
  - Vary dramatically in difficulty:
    - Factoid, list
    - Definitions
    - Why/how...
    - Open-ended: 'What happened?'
  - Affected by form:
    - 'Who was the first president?' vs. 'Name the first president'
Answers
- Like tests!
- Form: short answer, long answer, narrative
- Processing: extractive vs. generated vs. synthetic
- In the limit -> summarization: 'What is the book about?'
Evaluation & Presentation
- What makes an answer good?
  - Bare answer
  - Longer answer with justification
- Implementation vs. usability:
  - QA interfaces are still rudimentary
  - Ideally should be interactive, support refinement, dialogic
(Very) Brief History
- Earliest systems: NL queries to databases (60s-70s)
  - BASEBALL, LUNAR
  - Linguistically sophisticated: syntax, semantics, quantification, ...
  - Restricted domain!
- Spoken dialogue systems (Turing!, 70s-current)
  - SHRDLU (blocks world), MIT's Jupiter, lots more
- Reading comprehension (~2000)
- Information retrieval (TREC); information extraction (MUC)
General Architecture
Basic Strategy
- Given a document collection and a query, execute the following steps:
  - Question processing
  - Document collection processing
  - Passage retrieval
  - Answer processing and presentation
  - Evaluation
- Systems vary in detailed structure and complexity
AskMSR: Shallow Processing for QA
Deep Processing Technique for QA
- LCC (Moldovan, Harabagiu, et al.)
Query Formulation
- Convert the question into a form suitable for IR
- Strategy depends on the document collection
- Web (or similar large collection): 'stop structure' removal
  - Delete function words, q-words, even low-content verbs
- Corporate sites (or similar smaller collection): query expansion
  - Can't count on document diversity to recover word variation
  - Add morphological variants, WordNet as thesaurus
  - Reformulate as declarative, rule-based:
    - 'Where is X located' -> 'X is located in'
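The rule-based declarative reformulation above can be sketched as a small regex rewrite table. This is a minimal illustration, not any particular system's rule set; the specific patterns and templates are assumptions.

```python
import re

# Hypothetical rewrite rules: each maps a question pattern to a
# declarative template likely to appear verbatim in answer text.
REWRITE_RULES = [
    (re.compile(r"^where is (.+) located\??$", re.I), r"\1 is located in"),
    (re.compile(r"^who (?:was|is) (.+?)\??$", re.I), r"\1 was"),
    (re.compile(r"^when (?:was|did) (.+?)\??$", re.I), r"\1 in"),
]

def reformulate(question):
    """Return declarative search strings derived from the question."""
    rewrites = []
    for pattern, template in REWRITE_RULES:
        m = pattern.match(question.strip())
        if m:
            rewrites.append(m.expand(template))
    # Fall back to the raw question if no rule fires.
    return rewrites or [question]

print(reformulate("Where is the Louvre located?"))
# -> ['the Louvre is located in']
```

Searching for the rewritten string exploits redundancy: on a large collection, some document will state the answer in exactly this declarative form.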
Question Classification
- Answer type recognition:
  - Who -> Person
  - What Canadian city -> City
  - What is surf music -> Definition
- Identifies the type of entity (e.g., named entity) or form (biography, definition) to return as the answer
- Build an ontology of answer types (by hand)
- Train classifiers to recognize them, using:
  - POS, NE, words
  - Synsets, hyper-/hyponyms
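As a toy stand-in for the trained classifier, answer-type recognition can be approximated with wh-word and head-noun cues. The type labels and cue-word sets below are illustrative assumptions, not the hand-built ontology the slides describe.

```python
# Cue words suggesting a CITY-typed answer (illustrative, not exhaustive).
CITY_WORDS = {"city", "town", "capital"}

def answer_type(question):
    """Guess the expected answer type from surface cues."""
    tokens = question.lower().rstrip("?").split()
    if not tokens:
        return "UNKNOWN"
    if tokens[0] == "who":
        return "PERSON"
    if tokens[0] == "when":
        return "DATE"
    if tokens[0] == "what":
        # 'What is X' with a bare nominal tends to ask for a definition.
        if len(tokens) >= 2 and tokens[1] in {"is", "are"}:
            return "DEFINITION"
        # Otherwise look for a typed head noun: 'What Canadian city ...'
        if any(t in CITY_WORDS for t in tokens):
            return "CITY"
    return "UNKNOWN"

print(answer_type("Who wrote Hamlet"))                   # PERSON
print(answer_type("What Canadian city hosted Expo 67"))  # CITY
print(answer_type("What is surf music"))                 # DEFINITION
```

A real system replaces these hand rules with a classifier trained on the features listed above (POS tags, named entities, words, WordNet synsets).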
Passage Retrieval
- Why not just perform general information retrieval?
  - Documents are too big and non-specific for answers
- Identify shorter, focused spans (e.g., sentences)
  - Filter for correct type: answer type classification
  - Rank passages based on a trained classifier
    - Features: question keywords, named entities, longest overlapping sequence, shortest keyword-covering span, n-gram overlap between question and passage
- For web search, use result snippets
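Two of the overlap features above (keyword overlap and n-gram overlap) are enough for a crude passage ranker. This is a sketch, not the trained classifier the slides call for; the feature weights are arbitrary assumptions, not learned values.

```python
def ngrams(tokens, n):
    """Set of word n-grams in a token sequence."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def passage_score(question, passage, weights=(1.0, 2.0)):
    """Score a passage by unigram and bigram overlap with the question."""
    q = question.lower().split()
    p = passage.lower().split()
    uni = len(set(q) & set(p))          # shared keywords
    bi = len(ngrams(q, 2) & ngrams(p, 2))  # shared word bigrams
    return weights[0] * uni + weights[1] * bi

def rank_passages(question, passages):
    """Return passages sorted best-first by overlap score."""
    return sorted(passages, key=lambda p: passage_score(question, p),
                  reverse=True)

q = "when was the telephone invented"
docs = ["Bell also worked on the phonograph.",
        "The telephone was invented in 1876."]
print(rank_passages(q, docs)[0])
```

A trained ranker would combine these with the other listed features (named entities, span lengths) and learn the weights from labeled question-passage pairs.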
Answer Processing
- Find the specific answer in the passage
- Pattern extraction-based:
  - Include answer types, regular expressions
- Similar to relation extraction: learn the relation between the answer type and an aspect of the question
  - E.g., date-of-birth/person name; term/definition
- Can use a bootstrap strategy for contexts, like Yarowsky:
  - <NAME> (<BD>-<DD>) or <NAME> was born on <BD>
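The two date-of-birth templates above can be instantiated as surface regexes. The exact patterns below are illustrative assumptions; a bootstrapped system would induce many such contexts from seed (name, date) pairs rather than writing them by hand.

```python
import re

# Surface patterns instantiating the slides' templates.
NAME = r"(?P<name>[A-Z][\w.]*(?: [A-Z][\w.]*)*)"
BIRTH_PATTERNS = [
    # '<NAME> (<BD>-<DD>)', e.g. 'Mozart (1756-1791)'
    re.compile(NAME + r" \((?P<birth>\d{4})-\d{4}\)"),
    # '<NAME> was born on/in <BD>'
    re.compile(NAME + r" was born (?:on|in) (?P<birth>[\w ,]+\d{4})"),
]

def extract_birth(passage):
    """Return (name, birth date) pairs matched in the passage."""
    hits = []
    for pat in BIRTH_PATTERNS:
        for m in pat.finditer(passage):
            hits.append((m.group("name"), m.group("birth")))
    return hits

print(extract_birth(
    "Wolfgang Amadeus Mozart (1756-1791) wrote 41 symphonies."))
# -> [('Wolfgang Amadeus Mozart', '1756')]
```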
Resources
- System development requires resources
  - Especially true of data-driven machine learning
- QA resources: sets of questions with answers for development/test
  - Specifically manually constructed / manually annotated
  - 'Found data': trivia games!!!, FAQs, answer sites, etc.; multiple-choice tests (IP???)
  - Partial data: web logs -- queries and click-throughs
Information Resources
- Proxies for world knowledge:
  - WordNet: synonymy; IS-A hierarchy
  - Wikipedia
  - The web itself...
- Term management: acronym lists, gazetteers, ...
Software Resources
- General: machine learning tools
- Passage/document retrieval:
  - Information retrieval engine: Lucene, Indri/Lemur, MG
  - Sentence breaking, etc.
- Query processing:
  - Named entity extraction
  - Synonymy expansion
  - Parsing?
- Answer extraction: NER, IE (patterns)
Evaluation
- Candidate criteria:
  - Relevance
  - Correctness
  - Conciseness: no extra information
  - Completeness: penalize partial answers
  - Coherence: easily readable
  - Justification
- Tension among criteria
Evaluation
- Consistency/repeatability: are answers scored reliably?
- Automation: can answers be scored automatically?
  - Required for machine learning tune/test
  - Short-answer keys; Litkowski's patterns
Evaluation
- Classical: return a ranked list of answer candidates
  - Idea: correct answer higher in the list => higher score
- Measure: Mean Reciprocal Rank (MRR)
  - For each question, take the reciprocal of the rank of the first correct answer
    - E.g., first correct answer at rank 4 => 1/4; none correct => 0
  - Average over all questions
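The MRR computation just described is a one-liner:

```python
def mean_reciprocal_rank(ranks):
    """MRR over a question set.

    ranks: for each question, the 1-based rank of the first correct
    answer in the returned list, or None if no candidate was correct
    (contributing 0 to the sum).
    """
    return sum(1.0 / r for r in ranks if r) / len(ranks)

# Three questions: correct at rank 1, correct at rank 4, none correct.
print(mean_reciprocal_rank([1, 4, None]))  # (1 + 1/4 + 0) / 3
```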
Dimensions of TREC QA
- Applications: open-domain free-text search; fixed collections (news, blogs)
- Users: novice
- Question types: factoid -> list, relation, etc.
- Answer types: predominantly extractive, short answer in context
- Evaluation: official: human; proxy: patterns
- Presentation: one interactive track