Question Answering Tutorial

John M. Prager
IBM T.J. Watson Research Center

Tutorial Overview
Ground Rules
Part I - Anatomy of QA
  A Brief History of QA
  Terminology
  The Essence of Text-based QA
  Basic Structure of a QA System
  NE Recognition and Answer Types
  Answer Extraction
Part II - Specific Approaches
  By Genre
  By System
Part III - Issues and Advanced Topics
  Evaluation
  No Answer
  Question Difficulty
  Dimensions of QA
  Relationship questions
  Decomposition/Recursive QA
  Constraint-based QA
  Cross-Language QA
References

Ground Rules
Breaks
Questions
Topics
  Focus on English Text
  TREC & AQUAINT & beyond
  General Principles
  Tricks-of-the-Trade
  State-of-the-Art Methodologies
  My own System vs. My own Research
Caution

Caution
Nothing in this Tutorial is true universally

Part I - Anatomy of QA

A Brief History of QA
Terminology
The Essence of Text-based QA
Basic Structure of a QA System
NE Recognition and Answer Types
Answer Extraction

A Brief History of QA
NLP front-ends to Expert Systems
  SHRDLU (Winograd, 1972)
    User manipulated, and asked questions about, blocks world
    First real demo of combination of syntax, semantics, and reasoning
NLP front-ends to Databases
  LUNAR (Woods, 1973)
    User asked questions about moon rocks
    Used ATNs and procedural semantics
  LIFER/LADDER (Hendrix et al. 1977)
    User asked questions about U.S. Navy ships
    Used semantic grammar; domain information built into grammar
NLP + logic
  CHAT-80 (Warren & Pereira, 1982)
    NLP query system in Prolog, about world geography
    Definite Clause Grammars
Modern Era of QA
  MURAX (Kupiec, 2001)
    NLP front-end to Encyclopaedia
  NLP + hand-coded annotations to sources
    AskJeeves (www.ask.com)
    START (Katz, 1997)
      Started with text, extended to multimedia
  IR + NLP
    TREC-8 (1999) (Voorhees & Tice, 2000)
  Today: all of the above

Some factoid questions from TREC8-9
9: How far is Yaroslavl from Moscow?
15: When was London's Docklands Light Railway constructed?
22: When did the Jurassic Period end?
29: What is the brightest star visible from Earth?
30: What are the Valdez Principles?
73: Where is the Taj Mahal?
134: Where is it planned to berth the merchant ship, Lane Victory, which Merchant Marine veterans are converting into a floating museum?
197: What did Richard Feynman say upon hearing he would receive the Nobel Prize in Physics?
198: How did Socrates die?
199: How tall is the Matterhorn?
200: How tall is the replica of the Matterhorn at Disneyland?
227: Where does dew come from?
269: Who was Picasso?
298: What is California's state tree?

Terminology
Question Type
Answer Type
Question Focus
Question Topic
Candidate Passage
Candidate Answer
Authority File/List

Terminology - Question Type
Question Type: an idiomatic categorization of questions for purposes of distinguishing between different processing strategies and/or answer formats
E.g. TREC2003
  FACTOID: How far is it from Earth to Mars?
  LIST: List the names of chewing gums
  DEFINITION: Who is Vlad the Impaler?
Other possibilities:
  RELATIONSHIP: What is the connection between Valentina Tereshkova and Sally Ride?
  SUPERLATIVE: What is the largest city on Earth?
  YES-NO: Is Saddam Hussein alive?
  OPINION: What do most Americans think of gun control?
  CAUSE&EFFECT: Why did Iraq invade Kuwait?

Terminology - Answer Type
Answer Type: the class of object (or rhetorical type of sentence) sought by the question.
E.g.
  PERSON (from "Who ...")
  PLACE (from "Where ...")
  DATE (from "When ...")
  NUMBER (from "How many ...")
but also
  EXPLANATION (from "Why ...")
  METHOD (from "How ...")
Answer types are usually tied intimately to the classes recognized by the system's Named Entity Recognizer.

Terminology - Question Focus
Question Focus: the property or entity that is being sought by the question.
E.g.
  In what state is the Grand Canyon?
  What is the population of Bulgaria?
  What colour is a pomegranate?

Terminology - Question Topic
Question Topic: the object (person, place, ...) or event that the question is about. The question might well be about a property of the topic, which will be the question focus.
E.g. What is the height of Mt. Everest?
  "height" is the focus
  "Mt. Everest" is the topic

Terminology - Candidate Passage
Candidate Passage: a text passage (anything from a single sentence to a whole document) retrieved by a search engine in response to a question.
Depending on the query and kind of index used, there may or may not be a guarantee that a candidate passage has any candidate answers.
Candidate passages will usually have associated scores from the search engine.

Terminology - Candidate Answer
Candidate Answer: in the context of a question, a small quantity of text (anything from a single word to a sentence or bigger, but usually a noun phrase) that is of the same type as the Answer Type.
In some systems, the type match may be approximate, if there is the concept of confusability.
Candidate answers are found in candidate passages
E.g.
  50
  Queen Elizabeth II
  September 8, 2003
  by baking a mixture of flour and water

Terminology - Authority List
Authority List (or File): a collection of instances of a class of interest, used to test a term for class membership.
Instances should be derived from an authoritative source and be as close to complete as possible.
Ideally, the class is small, easily enumerated, and its members have a limited number of lexical forms.
Good:
  Days of week
  Planets
  Elements
Good statistically, but difficult to get 100% recall:
  Animals
  Plants
  Colours
Problematic:
  People
  Organizations
Impossible:
  All numeric quantities
  Explanations and other clausal quantities

Essence of Text-based QA (Single source answers)
Need to find a passage that answers the question.
  Find a candidate passage (search)
  Check that semantics of passage and question match
  Extract the answer

Essence of Text-based QA: Search
For a very small corpus, can consider every passage as a candidate, but this is not interesting
Need to perform a search to locate good passages.
  If search is too broad, have not achieved that much, and are faced with lots of noise
  If search is too narrow, will miss good passages
Two broad possibilities:
  Optimize search
  Use iteration

Essence of Text-based QA: Match
Need to test whether semantics of passage match semantics of question
  Count question words present in passage
  Score based on proximity
  Score based on syntactic relationships
  Prove match

Essence of Text-based QA: Answer Extraction
Find candidate answers of same type as the answer type sought in question.
Has implications for size of type hierarchy
Where/when/whether to consider subsumption
Consider later

Basic Structure of a QA-System
See for example Abney et al., 2000; Clarke et al., 2001; Harabagiu et al.; Hovy et al., 2001; Prager et al. 2000
[Figure: pipeline with components Question Analysis, Search (over Corpus or Web) and Answer Extraction; labelled flows are Question, Query, Documents/passages, Answer Type and Answer.]

Essence of Text-based QA: High-Level View of Recall
Have three broad locations in the system where expansion takes place, for purposes of matching passages
Where is the right trade-off?
Question Analysis
  Expand individual terms to synonyms (hypernyms, hyponyms, related terms)
  Reformulate question
In Search Engine
  Generally avoided for reasons of computational expense
At indexing time
  Stemming/lemmatization

Essence of Text-based QA: High-Level View of Precision
Have three broad locations in the system where narrowing/filtering/matching takes place
Where is the right trade-off?
Question Analysis
  Include all question terms in query
  Use IDF-style weighting to indicate preferences
Search Engine
  Possibly store POS information for polysemous terms
Answer Extraction
  Reward (penalize) passages/answers that (don't) pass test
  Particularly attractive for temporal modification

Answer Types and Modifiers: "Name 5 French Cities"
Most likely there is no type for French Cities
So will look for CITY
  include French/France in bag of words, and hope for the best
  include French/France in bag of words, retrieve documents, and look for evidence (deep parsing, logic)
  use high-precision Language Identification on results
If you have a list of French cities, could either
  Filter results by list
  Use Answer-Based QA (see later)
Use longitude/latitude information of cities and countries

Answer Types and Modifiers: "Name a female figure skater"
Most likely there is no type for female figure skater
Most likely there is no type for figure skater
Look for PERSON, with query terms {figure, skater}
What to do about "female"? Two approaches.
  Include "female" in the bag-of-words.
    Relies on logic that if "femaleness" is an interesting property, it might well be mentioned in answer passages.
    Does not apply to, say, "singer".
  Leave out "female" but test candidate answers for gender.
    Needs either an authority file or a heuristic test.
    Test may not be definitive.

Named Entity Recognition
BBN's IdentiFinder (Bikel et al. 1999)
  Hidden Markov Model
Sheffield GATE (http://www.gate.ac.uk/)
  Development Environment for IE and other NLP activities
IBM's Textract/Resporator (Byrd & Ravin, 1999; Wacholder et al. 1997; Prager et al. 2000)
  FSMs and Authority Files
+ others

The inventory of semantic classes recognized by NER is closely related to the set of answer types the system can handle

Named Entity Recognition

Probabilistic Labelling (IBM)
In Textract, a Proper name can be one of the following
  PERSON
  PLACE
  ORGANIZATION
  MISC_ENTITY (e.g. names of Laws, Treaties, Reports, ...)
However, NER needs another class (UNAME) for any proper name it can't identify.
In a large corpus, many entities end up being UNAMEs.
If, for example, a "Where" question seeks a PLACE, and similarly for the others above, then is being classified as UNAME a death sentence?
  How will a UNAME ever be searched for?

Probabilistic Labelling (IBM)
When entity is ambiguous or plain unknown, use a set of disjoint special labels in NER, instead of UNAME
Assumes NER is able to rule out some possibilities, at least sometimes.
Annotate with all remaining possibilities
Use these labels as part of answer type
E.g.
  UNP could be a PERSON
  UNL could be a PLACE
  UNO could be an ORGANIZATION
  UNE could be a MISC_ENTITY
So
  {UNP UNL} could be a PERSON or a PLACE
  This would be a good label for Beverly Hills

Probabilistic Labelling (IBM)
So "Who" questions that would normally generate {PERSON} as answer type, now generate {PERSON UNP}
Question: "Who is David Beckham married to?"
Answer Passage: "David Beckham, the soccer star engaged to marry Posh Spice, is being blamed for England's World Cup defeat."
"Posh Spice" gets annotated with {UNP UNO}
Match occurs, answer found. Crowd erupts!
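
The matching step in the Posh Spice example can be sketched as a set intersection between the answer type labels and the labels annotated on an entity. This is only an illustrative sketch: the `ANSWER_TYPES` table and the `is_candidate` helper are hypothetical names, not part of Textract, and the "where" entry is an assumed analogue of the "who" case.

```python
# Hypothetical answer-type table; label names follow the slides.
ANSWER_TYPES = {
    "who": {"PERSON", "UNP"},
    "where": {"PLACE", "UNL"},  # assumed by analogy with the "who" case
}

def is_candidate(question_word, entity_labels):
    """An entity is a candidate answer if it shares a label with the answer type."""
    return bool(ANSWER_TYPES[question_word.lower()] & set(entity_labels))

# "Posh Spice" is annotated {UNP UNO}: it matches a "Who" question via UNP.
print(is_candidate("Who", {"UNP", "UNO"}))
print(is_candidate("Where", {"UNP", "UNO"}))
```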

Issues with NER
Coreference
  Should referring terms (definite noun phrases, pronouns) be labelled the same way as the referent terms?
Nested Noun Phrases (and other structures of interest)
  What granularity?
  Partly depends on whether multiple annotations are allowed
Subsumption and Ambiguity
  What label(s) to choose?
  Probabilistic labelling

How to Annotate?
"Baker will leave Jerusalem on Saturday and stop in Madrid on the way home to talk to Spanish Prime Minister Felipe Gonzales."
What about: "The U.S. ambassador to Spain, Ed Romero"?

Answer Extraction
Also called Answer Selection/Pinpointing
Given a question and candidate passages, the process of selecting and ranking candidate answers.
Usually, candidate answers are those terms in the passages which have the same answer type as that generated from the question
Ranking the candidate answers depends on assessing how well the passage context relates to the question
3 Approaches:
  Heuristic features
  Shallow parse fragments
  Logical proof

Answer Extraction using Features
Heuristic feature sets (Prager et al. 2003+). See also (Radev et al. 2000)
Calculate feature values for each candidate answer (CA), and then calculate a linear combination using weights learned from training data.
Ranking criteria:
  Good global context: the global context of a candidate answer evaluates the relevance to the question of the passage from which the candidate answer is extracted.
  Good local context: the local context of a candidate answer assesses the likelihood that the answer fills in the gap in the question.
  Right semantic type: the semantic type of a candidate answer should either be the same as or a subtype of the answer type identified by the question analysis component.
  Redundancy: the degree of redundancy for a candidate answer increases as more instances of the answer occur in retrieved passages.

Answer Extraction using Features (cont.)
Features for Global Context
  KeywordsInPassage: the ratio of keywords present in a passage to the total number of keywords issued to the search engine.
  NPMatch: the number of words in noun phrases shared by both the question and the passage.
  SEScore: the ratio of the search engine score for a passage to the maximum achievable score.
  FirstPassage: a Boolean value which is true for the highest ranked passage returned by the search engine, and false for all other passages.
Features for Local Context
  AvgDistance: the average distance between the candidate answer and keywords that occurred in the passage.
  NotInQuery: the number of words in the candidate answers that are not query keywords.
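
A few of these features, and the linear combination that ranks candidates, might be computed along the following lines. This is a minimal sketch under stated assumptions: passages are pre-tokenized lists of lowercase words, no stemming is applied, and the weights are made-up placeholders rather than values learned from training data.

```python
def keywords_in_passage(keywords, passage_tokens):
    """Ratio of query keywords present in the passage to all query keywords."""
    return sum(1 for k in keywords if k in passage_tokens) / len(keywords)

def avg_distance(candidate_index, keywords, passage_tokens):
    """Average token distance between the candidate and keyword occurrences."""
    positions = [i for i, tok in enumerate(passage_tokens) if tok in keywords]
    if not positions:
        return float("inf")
    return sum(abs(i - candidate_index) for i in positions) / len(positions)

def not_in_query(candidate_tokens, keywords):
    """Number of candidate words that are not query keywords."""
    return sum(1 for t in candidate_tokens if t not in keywords)

def combine(features, weights):
    """Linear combination of feature values."""
    return sum(weights[name] * value for name, value in features.items())

keywords = {"amtrak", "began", "operations"}
passage = "in 1971 amtrak began operations".split()
features = {
    "KeywordsInPassage": keywords_in_passage(keywords, passage),
    "AvgDistance": avg_distance(1, keywords, passage),  # candidate "1971" at index 1
    "NotInQuery": not_in_query(["1971"], keywords),
}
# Placeholder weights (assumption); a real system would learn these.
weights = {"KeywordsInPassage": 1.0, "AvgDistance": -0.1, "NotInQuery": 0.5}
print(features, combine(features, weights))
```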

Answer Extraction using Relationships
Computing Ranking Scores
  Linguistic knowledge to compute passage & candidate answer scores
Perform syntactic processing on question and candidate passages
Extract predicate-argument & modification relationships from parse
  Question: "Who wrote the Declaration of Independence?"
    Relationships: [X, write], [write, Declaration of Independence]
  Answer Text: "Jefferson wrote the Declaration of Independence."
    Relationships: [Jefferson, write], [write, Declaration of Independence]
Compute scores based on number of question relationship matches
  Passage score: consider all instantiated relationships
  Candidate answer scores: consider relationships with variable

Answer Extraction using Relationships (cont.)
Example: When did Amtrak begin operations?
Question relationships
  [Amtrak, begin], [begin, operation], [X, begin]
Compute passage scores: passages and relationships
  "In 1971, Amtrak began operations, ..."
    [Amtrak, begin], [begin, operation], [1971, begin]
  "'Today, things are looking better,' said Claytor, expressing optimism about getting the additional federal funds in future years that will allow Amtrak to begin expanding its operations."
    [Amtrak, begin], [begin, expand], [expand, operation], [today, look]
  "Airfone, which began operations in 1984, has installed air-to-ground phones. Airfone also operates Railfone, a public phone service on Amtrak trains."
    [Airfone, begin], [begin, operation], [1984, operation], [Amtrak, train]
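
The relationship counting in the Amtrak example can be sketched as follows. This is an illustrative simplification, not the system's implementation: relationships are plain 2-tuples, `X` marks the variable, and the helper names are hypothetical. Binding the variable without an answer type check over-generates (the third passage yields "Airfone"), which is why the type test is still needed.

```python
VAR = "X"

def passage_score(question_rels, passage_rels):
    """Count fully instantiated question relationships found in the passage."""
    return sum(1 for rel in question_rels if VAR not in rel and rel in passage_rels)

def bindings(question_rels, passage_rels):
    """Terms that can fill the variable in some question relationship."""
    question_terms = {t for rel in question_rels for t in rel if t != VAR}
    found = []
    for q in question_rels:
        if VAR not in q:
            continue
        for p in passage_rels:
            pairs = list(zip(q, p))
            if all(qt == VAR or qt == pt for qt, pt in pairs):
                filler = next(pt for qt, pt in pairs if qt == VAR)
                if filler not in question_terms:
                    found.append(filler)
    return found

question = [("Amtrak", "begin"), ("begin", "operation"), (VAR, "begin")]
p1 = [("Amtrak", "begin"), ("begin", "operation"), ("1971", "begin")]
p2 = [("Amtrak", "begin"), ("begin", "expand"), ("expand", "operation"), ("today", "look")]
p3 = [("Airfone", "begin"), ("begin", "operation"), ("1984", "operation"), ("Amtrak", "train")]

for rels in (p1, p2, p3):
    print(passage_score(question, rels), bindings(question, rels))
```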

Answer Extraction using Logic
Logical Proof
  Convert question to a goal
  Convert passage to set of logical forms representing individual assertions
  Add predicates representing subsumption rules, real-world knowledge
  Prove the goal
See section on LCC later

Question Answering Tutorial Part II

Part II - Specific Approaches

By Genre
  Statistical QA
  Pattern-based QA
  Web-based QA
  Answer-based QA (TREC only)
By System
  SMU
  LCC
  USC-ISI
  Insight
  Microsoft
  IBM Statistical
  IBM Rule-based

Approaches by Genre
By Genre
  Statistical QA
  Pattern-based QA
  Web-based QA
  Answer-based QA (TREC only)
    Web-based QA
    Database-based QA
Considerations
  Effectiveness by question-type
  Precision and recall
  Expandability to other domains
  Ease of adaptation to CL-QA

Statistical QA
Use statistical distributions to model likelihoods of answer type and answer
E.g. IBM (Ittycheriah, 2001); see later section

Pattern-based QA
For a given question type, identify the typical syntactic constructions used in text to express answers to such questions
Typically very high precision, but a lot of work to get decent recall

Web-Based QA
Exhaustive string transformations
  Brill et al. 2002
Learning
  Radev et al. 2001

Answer-Based QA
Problem: Sometimes it is very easy to find an answer to a question using resource A, but the task demands that you find it in resource B.
Solution: First find the answer in resource A, then locate the same answer, along with original question terms, in resource B.
Artificial problem, but real for TREC participants.

Answer-Based QA
Web-Based solution:
"When a QA system looks for answers within a relatively small textual collection, the chance of finding strings/sentences that closely match the question string is small. However, when a QA system looks for strings/sentences that closely match the question string on the web, the chance of finding correct answer is much higher." (Hermjakob et al. 2002)
Why this is true:
  The Web is much larger than the TREC Corpus (3,000 : 1)
  TREC questions are generated from Web logs, and the style of language (and subjects of interest) in these logs is more similar to the Web content than to newswire collections.

Answer-Based QA
Database/Knowledge-base/Ontology solution:
When question syntax is simple and reliably recognizable, can express as a logical form
Logical form represents entire semantics of question, and can be used to access structured resource:
  WordNet
  On-line dictionaries
  Tables of facts & figures
  Knowledge-bases such as Cyc
Having found answer
  construct a query with original question terms + answer
  Retrieve passages
  Tell Answer Extraction the answer it is looking for

Approaches of Specific Systems
SMU Falcon
LCC
USC-ISI
Insight
Microsoft
IBM
Note: Some of the slides and/or examples in these sections are taken from papers or presentations from the respective system authors

SMU Falcon
Harabagiu et al. 2000

SMU Falcon
From question, dependency structure called question semantic form is created
Query is Boolean conjunction of terms
From answer passages that contain at least one instance of answer type, generate answer semantic form
3 processing loops:
  Loop 1: triggered when too few or too many passages are retrieved from search engine
  Loop 2: triggered when question semantic form and answer semantic form cannot be unified
  Loop 3: triggered when unable to perform abductive proof of answer correctness

SMU Falcon
Loops provide opportunities to perform alternations
  Loop 1: morphological expansions and nominalizations
  Loop 2: lexical alternations (synonyms, direct hypernyms and hyponyms)
  Loop 3: paraphrases
Evaluation (Pasca & Harabagiu, 2001). Increase in accuracy in 50-byte task in TREC9
  Loop 1: 40%
  Loop 2: 52%
  Loop 3: 8%
  Combined: 76%

LCC
Moldovan & Rus, 2001
Uses Logic Prover for answer justification
  Question logical form
  Candidate answers in logical form
  XWN glosses
  Linguistic axioms
  Lexical chains
Inference engine attempts to verify answer by negating question and proving a contradiction
If proof fails, predicates in question are gradually relaxed until proof succeeds or associated proof score is below a threshold.

LCC: Lexical Chains
Q:1518 What year did Marco Polo travel to Asia?
Answer: Marco polo divulged the truth after returning in 1292 from his travels, which included several months on Sumatra
Lexical Chains:
  (1) travel_to:v#1 -> GLOSS -> travel:v#1 -> RGLOSS -> travel:n#1
  (2) travel_to#1 -> GLOSS -> travel:v#1 -> HYPONYM -> return:v#1
  (3) Sumatra:n#1 -> ISPART -> Indonesia:n#1 -> ISPART -> Southeast_Asia:n#1 -> ISPART -> Asia:n#1

Q:1570 What is the legal age to vote in Argentina?
Answer: Voting is mandatory for all Argentines aged over 18.
Lexical Chains:
  (1) legal:a#1 -> GLOSS -> rule:n#1 -> RGLOSS -> mandatory:a#1
  (2) age:n#1 -> RGLOSS -> aged:a#3
  (3) Argentine:a#1 -> GLOSS -> Argentina:n#1

LCC: Logic Prover
Question
  Which company created the Internet Browser Mosaic?
  QLF: (_organization_AT(x2) ) & company_NN(x2) & create_VB(e1,x2,x6) & Internet_NN(x3) & browser_NN(x4) & Mosaic_NN(x5) & nn_NNC(x6,x3,x4,x5)
Answer passage
  ... Mosaic , developed by the National Center for Supercomputing Applications ( NCSA ) at the University of Illinois at Urbana - Champaign ...
  ALF: ... Mosaic_NN(x2) & develop_VB(e2,x2,x31) & by_IN(e2,x8) & National_NN(x3) & Center_NN(x4) & for_NN(x5) & Supercomputing_NN(x6) & application_NN(x7) & nn_NNC(x8,x3,x4,x5,x6,x7) & NCSA_NN(x9) & at_IN(e2,x15) & University_NN(x10) & of_NN(x11) & Illinois_NN(x12) & at_NN(x13) & Urbana_NN(x14) & nn_NNC(x15,x10,x11,x12,x13,x14) & Champaign_NN(x16) ...
Lexical Chains
  develop -> make and make -> create
  exists x2 x3 x4 all e2 x1 x7 (develop_vb(e2,x7,x1) <-> make_vb(e2,x7,x1) & something_nn(x1) & new_jj(x1) & such_jj(x1) & product_nn(x2) & or_cc(x4,x1,x3) & mental_jj(x3) & artistic_jj(x3) & creation_nn(x3)).
  all e1 x1 x2 (make_vb(e1,x1,x2) <-> create_vb(e1,x1,x2) & manufacture_vb(e1,x1,x2) & man-made_jj(x2) & product_nn(x2)).
Linguistic axioms
  all x0 (mosaic_nn(x0) -> internet_nn(x0) & browser_nn(x0))

USC-ISI
TextMap system
  Ravichandran and Hovy, 2002
  Hermjakob et al. 2003
Use of Surface Text Patterns
When was X born ->
  Mozart was born in 1756
  Gandhi (1869-1948)
Can be captured in expressions
  <NAME> was born in <BIRTHDATE>
  <NAME> (<BIRTHDATE> -
These patterns can be learned

USC-ISI TextMap
Use bootstrapping to learn patterns.
For an identified question type ("When was X born?"), start with known answers for some values of X
  Mozart 1756
  Gandhi 1869
  Newton 1642
Issue Web search engine queries (e.g. "+Mozart +1756")
Collect top 1000 documents
Filter, tokenize, smooth etc.
Use suffix tree constructor to find best substrings, e.g.
  Mozart (1756-1791)
Filter
  Mozart (1756-
Replace query strings with e.g. <NAME> and <ANSWER>
Determine precision of each pattern
  Find documents with just question term (Mozart)
  Apply patterns and calculate precision

USC-ISI TextMap
Finding Answers
  Determine Question type
  Perform IR Query
  Do sentence segmentation and smoothing
  Replace question term by question tag, i.e. replace Mozart with <NAME>
  Search for instances of patterns associated with question type
  Select words matching <ANSWER>
  Assign scores according to precision of pattern
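
Applying learned surface patterns to find answers might look like the sketch below. It makes several assumptions: the two patterns and their precision values are invented for illustration, the `<NAME>` and `<ANSWER>` tag names are used as placeholders, the answer is taken to be a four-digit year, and the question term is substituted into the pattern rather than tagging the text (an equivalent shortcut for a single name).

```python
import re

# Hypothetical patterns with made-up precision scores for "When was X born?".
PATTERNS = [
    ("<NAME> (<ANSWER>-", 0.85),
    ("<NAME> was born in <ANSWER>", 0.60),
]

def find_answers(name, sentences):
    """Return (answer, score) pairs, scored by the best matching pattern's precision."""
    scored = {}
    for template, precision in PATTERNS:
        # Escape the literal pattern text, then substitute the two tags.
        regex = re.escape(template)
        regex = regex.replace(re.escape("<NAME>"), re.escape(name))
        regex = regex.replace(re.escape("<ANSWER>"), r"(\d{4})")
        for sentence in sentences:
            for match in re.finditer(regex, sentence):
                answer = match.group(1)
                scored[answer] = max(scored.get(answer, 0.0), precision)
    return sorted(scored.items(), key=lambda item: -item[1])

sentences = [
    "Mozart (1756-1791) was a prolific composer.",
    "Mozart was born in 1756 in Salzburg.",
]
print(find_answers("Mozart", sentences))
```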

Insight
Soubbotin, 2002. Soubbotin & Soubbotin, 2003.
Performed very well in TREC10/11
Comprehensive and systematic use of "Indicative patterns"
E.g.
  cap word; paren; 4 digits; dash; 4 digits; paren
matches
  Mozart (1756-1791)
The patterns are broader than named entities
"Semantics in syntax"
Patterns have intrinsic scores (reliability), independent of question

Insight
Patterns with more sophisticated internal structure are more indicative of answer
2/3 of their correct entries in TREC10 were answered by patterns
E.g.
  a == {countries}
  b == {official posts}
  w == {proper names (first and last)}
  e == {titles or honorifics}
Patterns for "Who is the President (Prime Minister) of given country?"
  abeww
  ewwdb,a
  b,aeww
Definition questions: (A is primary query term, X is answer)

For: Moulin Rouge, a cabaret

For: naturally occurring gas called methane

For: Michigan's state flower is the apple blossom

Insight
Emphasis on shallow techniques, lack of NLP
Look in vicinity of text string potentially matching pattern for "zeroing", e.g. for occupational roles:
  Former
  Elect
  Deputy
Negation
Comments:
  Relies on redundancy of large corpus
  Works for factoid question types of TREC-QA; not clear how it extends
  Not clear how they match questions to patterns
  Named entities within patterns have to be recognized

Microsoft
Data-Intensive QA. Brill et al. 2002
"Overcoming the surface string mismatch between the question formulation and the string containing the answer"
Approach based on the assumption/intuition that someone on the Web has answered the question in the same way it was asked.
Want to avoid dealing with:
  Lexical, syntactic, semantic relationships (between Q & A)
  Anaphora resolution
  Synonymy
  Alternate syntax
  Indirect answers
Take advantage of redundancy on Web, then project to TREC corpus (Answer-based QA)

Microsoft AskMSR
Formulate multiple queries; each rewrite has an intrinsic score. E.g. for "What is relative humidity?"
  [+is relative humidity, LEFT, 5]
  [relative +is humidity, RIGHT, 5]
  [relative humidity +is, RIGHT, 5]
  [relative humidity, NULL, 2]
  [relative AND humidity, NULL, 1]
Get top 100 documents from Google
Extract n-grams from document summaries
Score n-grams by summing the scores of the rewrites they came from
Use tiling to merge n-grams
Search for supporting documents in TREC corpus
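
The n-gram scoring step can be sketched as below. This is an assumption-laden illustration rather than the AskMSR implementation: summaries are whitespace-tokenized, each n-gram is credited once per summary, and the snippets themselves are invented. Tiling and the LEFT/RIGHT position constraints are left out.

```python
from collections import defaultdict

def score_ngrams(results, max_n=3):
    """results: list of (rewrite_score, [summary, ...]) pairs.

    Each n-gram is credited with the score of the rewrite that retrieved
    the summary it appears in, once per summary.
    """
    scores = defaultdict(int)
    for rewrite_score, summaries in results:
        for summary in summaries:
            tokens = summary.lower().split()
            ngrams = set()
            for n in range(1, max_n + 1):
                for i in range(len(tokens) - n + 1):
                    ngrams.add(" ".join(tokens[i:i + n]))
            for ngram in ngrams:
                scores[ngram] += rewrite_score
    return scores

# Invented summaries: one from a score-5 rewrite, one from a score-2 rewrite.
results = [
    (5, ["relative humidity is the ratio"]),
    (2, ["the ratio of vapour"]),
]
scores = score_ngrams(results)
print(scores["the ratio"], scores["relative"])
```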

Microsoft AskMSR
Question is: "What is the rainiest place on Earth"
Answer from Web is: "Mount Waialeale"
Passage in TREC corpus is: "In misty Seattle, Wash., last year, 32 inches of rain fell. Hong Kong gets about 80 inches a year, and even Pago Pago, noted for its prodigious showers, gets only about 196 inches annually. (The titleholder, according to the National Geographic Society, is Mount Waialeale in Hawaii, where about 460 inches of rain falls each year.)"
Very difficult to imagine getting this passage by other means

IBM Statistical QA (Ittycheriah, 2001)
ATM predicts, from the question and a proposed answer, the answer type they both satisfy
Given a question, an answer, and the predicted answer type, ASM seeks to model the correctness of this configuration.
Distributions are modelled using a maximum entropy formulation
Training data = human judgments
  For ATM, 13K questions annotated with 31 categories
  For ASM, ~ 5K questions from TREC plus trivia
$p(c \mid q,a) = \sum_e p(c,e \mid q,a) = \sum_e p(c \mid e,q,a)\, p(e \mid q,a)$
  q = question
  a = answer
  c = "correctness"
  e = answer type
  p(e|q,a) is the answer type model (ATM)
  p(c|e,q,a) is the answer selection model (ASM)
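
As a worked instance of the decomposition, the sum over answer types can be computed directly once both models have produced probabilities. The numbers below are invented for illustration, and the function name is hypothetical; the real ATM and ASM are maximum entropy models, not lookup tables.

```python
def p_correct(atm, asm):
    """Marginalize over answer types e: sum of p(c|e,q,a) * p(e|q,a).

    atm: {answer_type: p(e|q,a)}      (answer type model output)
    asm: {answer_type: p(c=1|e,q,a)}  (answer selection model output)
    """
    return sum(atm[e] * asm[e] for e in atm)

# Invented probabilities for one (question, answer) pair.
atm = {"PERSON": 0.7, "ORGANIZATION": 0.3}
asm = {"PERSON": 0.9, "ORGANIZATION": 0.2}
print(p_correct(atm, asm))  # 0.7*0.9 + 0.3*0.2
```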

IBM Statistical QA (Ittycheriah)
Question Analysis (by ATM)
  Selects one out of 31 categories
Search
  Question expanded by Local Context Analysis
  Top 1000 documents retrieved
Passage Extraction: Top 100 passages that:
  Maximize question word match
  Have desired answer type
  Minimize dispersion of question words
  Have similar syntactic structure to question
Answer Extraction:
  Candidate answers ranked using ASM

IBM Rule-based
Predictive Annotation (Prager 2000, Prager 2003)
Want to make sure passages retrieved by search engine have at least one candidate answer
Recognize that candidate answer is of correct answer type, which corresponds to a label (or several) generated by Named Entity Recognizer
Annotate entire corpus and index semantic labels along with text
Identify answer types in questions and include corresponding labels in queries

IBM PIQUANT
Predictive Annotation
E.g.: Question is "Who invented baseball?"
"Who" can map to PERSON$ or ORGANIZATION$
Suppose we assume only people invent things (it doesn't really matter).
So "Who invented baseball?" -> {PERSON$ invent baseball}
Consider text "... but its conclusion was based largely on the recollections of a man named Abner Graves, an elderly mining engineer, who reported that baseball had been "invented" by Doubleday between 1839 and 1841."

IBM PIQUANT
Predictive Annotation
Previous example
  "Who invented baseball?" -> {PERSON$ invent baseball}
However, same structure is equally effective at answering
  "What sport did Doubleday invent?" -> {SPORT$ invent Doubleday}

IBM Rule-Based
Handling Subsumption & Disjunction
If an entity is of a type which has a parent type, then how is annotation done?
If a proposed answer type has a parent type, then what answer type should be used?
If an entity is ambiguous then what should the annotation be?
If the answer type is ambiguous, then what should be used?

Guidelines:If an entity is of a type which has a parent type, then how is annotation done?If a proposed answer type has a parent type, then what answer type should be used?If an entity is ambiguous then what should the annotation be?If the answer type is ambiguous, then what should be used?

Subsumption & Disjunction
Consider New York City: both a CITY and a PLACE
  To answer "Where did John Lennon die?", it needs to be a PLACE
  To answer "In what city is the Empire State Building?", it needs to be a CITY.
Do NOT want to do subsumption calculation in search engine
Two scenarios
  1. Expand Answer Type and use most specific entity annotation
    1A { (CITY PLACE) John_Lennon die} matches CITY
    1B {CITY Empire_State_Building} matches CITY
  Or
  2. Use most specific Answer Type and multiple annotations of NYC
    2A {PLACE John_Lennon die} matches (CITY PLACE)
    2B {CITY Empire_State_Building} matches (CITY PLACE)
Case 2 preferred for simplicity, because disjunction in #1 should contain all hyponyms of PLACE, while disjunction in #2 should contain all hypernyms of CITY
Choice #2 suggests can use disjunction in answer type to represent ambiguity:
  "Who invented the laser" -> {(PERSON ORGANIZATION) invent laser}
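
Scenario 2 can be sketched as annotating each entity with its most specific label plus all hypernyms at indexing time, so that query-time matching is a plain label lookup. The `PARENT` table is a hypothetical fragment of a type hierarchy, and both function names are illustrative only.

```python
# Hypothetical fragment of a type hierarchy: child label -> parent label.
PARENT = {"CITY": "PLACE", "COUNTRY": "PLACE"}

def annotations(most_specific_label):
    """Labels indexed for an entity: its own label plus all hypernyms."""
    labels = [most_specific_label]
    while labels[-1] in PARENT:
        labels.append(PARENT[labels[-1]])
    return set(labels)

def matches(answer_type, entity_label):
    """answer_type is a disjunction (set) of labels taken from the query."""
    return bool(set(answer_type) & annotations(entity_label))

print(annotations("CITY"))                               # New York City's annotations
print(matches({"PLACE"}, "CITY"))                        # case 2A
print(matches({"CITY"}, "CITY"))                         # case 2B
print(matches({"PERSON", "ORGANIZATION"}, "ORGANIZATION"))  # ambiguous answer type
```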

Clausal classes
Any structure that can be recognized in text can be annotated.
  Quotations
  Explanations
  Methods
  Opinions
Any semantic class label used in annotation can be indexed, and hence used as a target of search:
  What did Karl Marx say about religion?
  Why is the sky blue?
  How do you make bread?
  What does Arnold Schwarzenegger think about global warming?

Named Entity Recognition

IBM
Predictive Annotation: Improving Precision at no cost to Recall
E.g.: Question is "Where is Belize?"
"Where" can map to (CONTINENT$, WORLDREGION$, COUNTRY$, STATE$, CITY$, CAPITAL$, LAKE$, RIVER$, ...).
But we know Belize is a country.
So "Where is Belize?" -> {(CONTINENT$ WORLDREGION$) Belize}
  Belize occurs 1068 times in TREC corpus
  Belize and PLACE$ co-occur in only 537 sentences
  Belize and CONTINENT$ or WORLDREGION$ co-occur in only 128 sentences

Virtual Annotation (Prager 2001)
Use WordNet to find all candidate answers (hypernyms)
Use corpus co-occurrence statistics to select "best" ones
Rather like approach to WSD by Mihalcea and Moldovan (1999)

Parentage of nematode

Parentage of meerkat

Natural Categories
"Basic Objects in Natural Categories", Rosch et al. (1976)
According to psychological testing, these are categorization levels of intermediate specificity that people tend to use in unconstrained settings.

What is this?

What can we conclude?
There are descriptive terms that people are drawn to use naturally.
We can expect to find instances of these in text, in the right contexts.
These terms will serve as good answers.

Virtual Annotation (cont.)
Find all parents of query term in WordNet
Look for co-occurrences of query term and parent in text corpus
Expect to find snippets such as: "... meerkats and other Y ..."
Many different phrasings are possible, so we just look for proximity, rather than parse.
Scoring:
  Count co-occurrences of each parent with search term, and divide by level number (only levels >= 1), generating Level-Adapted Count (LAC).
  Exclude very highest levels (too general).
  Select parent with highest LAC plus any others with LAC within 20%.
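
The scoring rule can be sketched as follows. The co-occurrence counts and WordNet levels below are invented purely to exercise the arithmetic, and the `max_level` cut-off is an assumed way of expressing "exclude very highest levels"; the slides do not say where that cut-off lies.

```python
def select_parents(cooccurrences, max_level, tolerance=0.2):
    """cooccurrences: {parent: (level, count)}, level 1 = immediate parent.

    Computes the Level-Adapted Count (count / level), drops parents above
    max_level, and keeps the best parent plus any within `tolerance` of it.
    """
    lac = {
        parent: count / level
        for parent, (level, count) in cooccurrences.items()
        if 1 <= level <= max_level
    }
    if not lac:
        return []
    best = max(lac.values())
    keep = [p for p, value in lac.items() if value >= best * (1 - tolerance)]
    return sorted(keep, key=lambda p: -lac[p])

# Invented levels and counts, for illustration only.
counts = {
    "worm": (1, 12),
    "invertebrate": (2, 20),
    "animal": (3, 9),
    "entity": (8, 400),
}
print(select_parents(counts, max_level=5))
```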

Parentage of nematode

Parentage of meerkat

Sample Answer PassagesWhat is a nematode? -> Such genes have been found in nematode worms but not yet in higher animals.

What is a meerkat? -> South African golfer Butch Kruger had a good round going in the central Orange Free State trials, until a mongoose-like animal grabbed his ball with its mouth and dropped down its hole. Kruger wrote on his card: "Meerkat."

Use Answer-based QA to locate answers

Use of Cyc as Sanity Checker
Cyc: Large Knowledge-base and Inference engine (Lenat 1995)
A post-hoc process for
  Rejecting "insane" answers
    How much does a grey wolf weigh? 300 tons
  Boosting confidence for "sane" answers
Sanity checker invoked with
  Predicate, e.g. "weight"
  Focus, e.g. "grey wolf"
  Candidate value, e.g. "300 tons"
Sanity checker returns
  "Sane": within + or - 10% of value in Cyc
  "Insane": outside of the reasonable range
    Plan to use distributions instead of ranges
  "Don't know"
Confidence score highly boosted when answer is "sane"

Cyc Sanity Checking Example
Trec11 Q: "What is the population of Maryland?"
Without sanity checking
  PIQUANT's top answer: "50,000"
  Justification: "Maryland's population is 50,000 and growing rapidly."
  Passage discusses an exotic species, "nutria", not humans
With sanity checking
  Cyc knows the population of Maryland is 5,296,486
  It rejects the top "insane" answers
  PIQUANT's new top answer: "5.1 million" with very high confidence
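
The three-way outcome of the sanity checker can be sketched with a simple tolerance test. This is a simplification under stated assumptions: the known value is passed in directly instead of being looked up in Cyc, and anything outside the 10% band is treated as "insane", whereas the slides describe a separate reasonable range.

```python
def sanity_check(candidate, known_value, tolerance=0.10):
    """Classify a candidate numeric answer against a known value.

    known_value stands in for a knowledge-base lookup (assumption);
    None means the knowledge base has no value for this predicate and focus.
    """
    if known_value is None:
        return "dont_know"
    if abs(candidate - known_value) <= tolerance * abs(known_value):
        return "sane"
    return "insane"

MARYLAND_POPULATION = 5_296_486  # value attributed to Cyc in the example above
print(sanity_check(50_000, MARYLAND_POPULATION))     # the nutria passage
print(sanity_check(5_100_000, MARYLAND_POPULATION))  # "5.1 million"
print(sanity_check(300, None))
```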

Question Answering Tutorial Part III

Part III - Issues, Advanced Topics
Evaluation
No Answer
Question Difficulty
Future of QA/Hot topics
  Dimensions of QA
  Relationship questions
  Decomposition / Recursive QA
  Constraint-based QA
  Cross-Language QA

Evaluation
Relatively straightforward for "factoid" questions.
TREC-8 (1999) & TREC-9 (2000)
  50-byte and 250-byte tasks
  Systems returned top 5 answers
  Mean Reciprocal Rank
    1 point if top answer is correct, else
    0.5 point if second answer is correct, else
    ...
    0.2 point if fifth answer is correct, else
    0
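
Mean Reciprocal Rank over a set of questions can be computed as in this short sketch. The input format (the rank of the first correct answer per question, or `None` if none of the top 5 is correct) is an assumption made for illustration.

```python
def mean_reciprocal_rank(first_correct_ranks, cutoff=5):
    """first_correct_ranks: per question, the 1-based rank of the first
    correct answer, or None if no returned answer was correct."""
    total = 0.0
    for rank in first_correct_ranks:
        if rank is not None and rank <= cutoff:
            total += 1.0 / rank
    return total / len(first_correct_ranks)

# Four questions: correct at rank 1, rank 2, not at all, rank 5.
print(mean_reciprocal_rank([1, 2, None, 5]))  # (1 + 0.5 + 0 + 0.2) / 4
```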

Evaluation
For each question, a set of "correct" answers
Correctness testing is easy to automate with pattern files, but patterns are subjective
Patterns don't/can't test for justification

Evaluation
TREC-10 (2001)
  Dropped 250-byte task
  Introduced NIL ("No Answer") questions
TREC-11 (2002)
  Instead of top 5 answers, systems returned top 1
  Answer must be "exact"
  Definition questions ("What/who is X?") dropped
  Results returned sorted in order of system's confidence
  Scored by Confidence Weighted Score (= Average Precision)
TREC-12 (2003)
  Definition questions re-introduced, but answers assumed to be a collection of "nuggets"
  List questions introduced, answers must be exact
  Definition and List questions evaluated by F-measure biased to favour recall
  Factoid questions evaluated by fraction correct

Confidence-Weighted Score (Average Precision)
$$\mathrm{CWS} = \frac{1}{N}\sum_{i=1}^{N}\frac{\text{number correct in first } i \text{ ranks}}{i}$$
= average of N different precision measures
Score_1 participates in every term
Score_2 participates in all but the first, ...
Score_N participates in just the last term
Much more weight given to early terms in sum
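
A minimal sketch of the Confidence-Weighted Score, assuming the input is one correctness flag per question, ordered from the system's most confident answer to its least confident. The function name is an assumption.

```python
def confidence_weighted_score(correct_flags):
    """Average, over ranks i = 1..N, of the precision of the first i answers."""
    n = len(correct_flags)
    if n == 0:
        return 0.0
    total = 0.0
    correct_so_far = 0
    for i, flag in enumerate(correct_flags, start=1):
        correct_so_far += 1 if flag else 0
        total += correct_so_far / i
    return total / n
```

With two questions, one correct, the score is 0.75 if the correct answer is ranked first and 0.25 if it is ranked second, which illustrates the extra weight given to early ranks.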

Contribution by Rank Position
For N questions, if contribution of correct answer in position k is $c_k$:
$$c_k = c_{k+1} + \frac{1}{kN}, \qquad c_{N+1} = 0$$
[Plot: contribution by rank position, N = 500]

Average Precision
[Plot: N = 500]
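
The per-rank contributions can be computed directly from the recurrence, working backwards from c_{N+1} = 0. This is a sketch; the function name is an assumption.

```python
def rank_contributions(n):
    """Return [c_1, ..., c_n] with c_k = c_{k+1} + 1/(k*n) and c_{n+1} = 0."""
    c = [0.0] * (n + 2)  # index n+1 holds the boundary value 0
    for k in range(n, 0, -1):
        c[k] = c[k + 1] + 1.0 / (k * n)
    return c[1:n + 1]
```

For N = 500, a correct answer at rank 1 contributes roughly 0.0136 to the score, while one at rank 500 contributes 1/250000, so the top of the ranking dominates.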

Evaluation Issues
What is really meant by "exact answer"?
What if there is a mistake in the question?
  Suppose question is "Who said X?", where X is a famous saying with a mistake in it.
  Maybe the answer is NIL
What granularity is required?
  Where is Chicago?
  What is acetaminophen?
Difficult to answer without model of user.

Questions with No Answer
Subtle difference between:
  1. This question has no answer (within the available resources),
  2. This question has no answer (at all), and
  3. I don't know the answer
TREC-QA tests #1 (NIL questions), but systems typically answer as if #3
Strategies used:
  When allowed top 5 answers (with confidences)
    Always put NIL in position X (X in {2,3,4,5})
    If some criterion succeeds, put NIL in position X (X in {1,2,3,4,5})
    Determine some threshold T, and insert NIL at corresponding position in confidence ranking (1-5, or not)
  When single answer
    Determine some threshold T, and insert NIL if answer confidence < T
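
The two threshold strategies can be sketched as below. The function names are assumptions, and giving the inserted NIL a confidence equal to T is just one possible choice.

```python
NIL = "NIL"


def insert_nil_top5(ranked, threshold):
    """ranked: (answer, confidence) pairs, highest confidence first.
    Insert NIL just before the first answer whose confidence < threshold,
    then keep the top 5. If no answer falls below the threshold, NIL is
    not inserted at all."""
    out = []
    placed = False
    for answer, conf in ranked:
        if not placed and conf < threshold:
            out.append((NIL, threshold))  # assumed confidence for NIL
            placed = True
        out.append((answer, conf))
    return out[:5]


def single_answer_or_nil(answer, conf, threshold):
    """Single-answer setting: return NIL if the answer's confidence < threshold."""
    return NIL if conf < threshold else answer
```

Note that NIL can land below rank 5 and be truncated away, which matches the "(1-5, or not)" case.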

NIL and CWS
When Confidence-Weighted Score is used, what should the NIL strategy be?
If an answer has low confidence and is replaced by NIL, then what is its new confidence?
Study strategy used by IBM in TREC11 (Chu-Carroll et al. 2003)

No-Answer Confidence-Based Calculation

Use TREC10 Data to determine strategy and thresholds
Observe that lowest-confidence questions are more often No-Answer than correct
Examine TREC10 distribution to determine cut-off threshold. Convert all questions below this to NIL.
  Improves average confidence of block.
Move converted block to rank with same average precision.
Confidences based on:
  Grammatical Relationships
  Semantic Relationships
  Redundancy

TREC10 Distribution
Answers sorted by confidence, highest first. Key: x = Correct, . = Incorrect, - = NIL

                                                    NIL  CORRECT  OUT OF
xxxxxxxxxxxxxx.xx.xxxxxxxxxxxxxx.x..xx.xx             0       35      41
xxxxxx-x.-x.xxxxxxxx..x-xxxxxxxxxx.xxxxx.x.xxx.-xx    4       38      50
xx.....x.-xx.....xx....x.xx.x..xxx.xx...xx.x..xx.x    1       22      50
.-...x.xx-..x..x.xx....xx.x...xx.....x..xxx....xx.    2       18      50
........x....x..xxxx...x...xx....xxxxx--......xxx.    2       17      50
..x.xxx...-x-...xx.....x...xx--.xx-....xx..x..x...    5       16      50
..x.x.-......x....x.x-.x.xx...-x-x-x-...-..x-x.x.x    8       15      50
x..-x.....x.x.....-..........-...-..x.-....-..x...    6        6      50
.x--......xx....-.-..x.-....-.-..x...........--...    9        5      50
-.-.-..--...-x.xx....-.-x......-.....-..-...-.x.-.   13        5      50

Changing all answers in the block formed by the last two rows to NIL gains 22 - 10 = 12 correct. Note confidence of leading element = C.

After conversion, the last two rows become:
..xx............x.x....x....x.x..............xx...  all        9      50
x.x.x..xx...x........x.x.......x.....x..x...x...x.  all       13      50

Calculate precision of block: P = 22/100.
Calculate point with same local precision P. Note confidence K.

NIL Placement in TREC11 Answers
[Diagram: the 500 TREC11 answers, sorted by confidence, but correctness unknown]
Find point with confidence C. (Block is of size 147)
Make all answers in block NIL, and add K-C to each confidence.
Find point with confidence K. Insert block at this point.
Subtract C from all confidences to the right.

NIL Placement in TREC11 Answers - Impact
29 out of 46 NIL answers located: recall of .63
9 previously-correct answers lost
Total of 20 correct questions gained: 20/500 = 4%
Minimal (< 0.5%) improvement in final AP score
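
One reading of the NIL placement steps can be sketched as follows. It assumes the thresholds C < K have already been derived from training data, treats "the block" as every answer with confidence below C, and uses a hypothetical function name; the actual system may differ in detail.

```python
def relocate_nil_block(ranked, c, k):
    """ranked: (answer, confidence) pairs; c < k are confidence thresholds.
    Answers below c become NIL and gain k - c in confidence, so the block
    moves up to just under k. Answers between c and k lose c, so they drop
    below the block. Answers at or above k are unchanged."""
    out = []
    for answer, conf in ranked:
        if conf < c:
            out.append(("NIL", conf + (k - c)))
        elif conf < k:
            out.append((answer, conf - c))
        else:
            out.append((answer, conf))
    return sorted(out, key=lambda ac: ac[1], reverse=True)
```

After the shift, sorting by confidence yields the high-confidence answers first, then the NIL block, then the answers that originally sat between C and K.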

Question Complexity
"Simple" questions are not a solved problem:
  Complex questions can be decomposed into simpler components.
  If simpler questions cannot be handled successfully, there's no hope for more complex ones.
Areas not explored (intentionally) by TREC to date:
  spelling errors
  grammatical errors
  syntactic precision, e.g. significance of articles
  "not", "only", "just"

Question Complexity
When was Queen Victoria born?
  "King George III's only granddaughter to survive infancy was born in 1819"
  "Victoria was the only daughter of Edward, Duke of Kent"
  "George III's fourth son Edward became Duke of Kent"
Should the Fed raise interest rates?
  "All of the current leading economic indicators point in the direction of the Federal Reserve Bank raising interest rates at next week's meeting." Alan Greenspan, Fed chairman.
What is the meaning of life?
  "42." (The Hitchhiker's Guide to the Galaxy)

Question Complexity
Not a function of question alone, but rather the pair {question, corpus}
In general, it is a function of the question and the resources to answer it, which include text corpora, databases, knowledge bases, ontologies and processing modules
Complexity Impedance Match

Future of QA
By fixing resources, can make factoid QA more difficult by intentionally exploiting requirements for advanced NLP and/or reasoning
Questions that require more than one resource / document for an answer
  E.g. What is the relationship between A and B?
Question decomposition
Cross-language QA
How to advance the field

Dimensions of QA
Answer Topology
  Characteristics of correct answer set
Language
  Vocabulary & Syntax
Question as a problem
  Enumeration, arithmetic, inference
User Model
  Who's asking the question
Opinions, hypotheses, predictions, beliefs

Answer Set Topology
No Answer, one, many
When are two different answers the same?
  Natural variation
    Size of an elephant
  Estimation
    Populations
  Variation over time
    Populations, Prime Ministers
Choose correct presentation format
  Lists, charts, graphs, dialogues

Language
The biggest current roadblock to Question Answering is arguably Natural Language:
  Anaphora
  Definite Noun Phrases
  Synonyms
  Subsumption
  Metonyms
  Paraphrases
  Negation & other such qualification
  Nonce words
  Idioms
  Figures of speech
  Poetic & other stylistic variations

Negation (1)
Q: Who invented the electric guitar?
A: While Mr. Fender did not invent the electric guitar, he did revolutionize and perfect it.
Note: Not all instances of "not" will invalidate a passage.

Questions as Word Problems
What is the largest city in England?
Text Match
  Find text that says "London is the largest city in England" (or paraphrase).
"Superlative" Search
  Find a table of English cities and their populations, and sort.
  Find a list of the 10 largest cities in the world, and see which are in England.
    Uses logic: if L > all objects in set R then L > all objects in set E ⊂ R.
  Find the population of as many individual English cities as possible, and choose the largest.
Heuristics
  London is the capital of England. (Not guaranteed to imply it is the largest city, but quite likely.)
Complex Inference
  E.g. "Birmingham is England's second-largest city"; "Paris is larger than Birmingham"; "London is larger than Paris"; "London is in England".
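
The complex-inference case can be sketched with a transitive closure over "larger than" facts. The function names and fact encoding are assumptions; the facts in the usage note are the ones quoted on the slide.

```python
from itertools import product


def transitive_closure(pairs):
    """Close a set of (bigger, smaller) pairs under transitivity."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow inside the loop.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def largest_city(larger_than, in_region, second_largest):
    """The largest city in a region is the unique city in that region known
    to be larger than the region's second-largest city."""
    closure = transitive_closure(larger_than)
    bigger = [city for city in in_region if (city, second_largest) in closure]
    return bigger[0] if len(bigger) == 1 else None
```

Given "Paris is larger than Birmingham" and "London is larger than Paris", with London and Birmingham in England and Birmingham second-largest, the function returns London. It returns None when the facts do not single out one city.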

Negation (2)
Name a US state where cars are manufactured.
versus
Name a US state where cars are not manufactured.
Certain kinds of negative events or instances are rarely asserted explicitly in text, but must be deduced by other means

Other Adverbial Modifiers (Only, Just etc.)
Name an astronaut who nearly made it to the moon
To answer such questions satisfactorily, need to know the different ways in which events can fail to happen. In this case there are several.

Need for User Model
Where is Chicago?
  What is meant?
    The city: what granularity is required?
    The rock group
    The play/movie
    The sports team (which one?)
What is mold?
Can hardly choose the right answer without knowing who is asking the question, and why.

Not all "What is" Questions are definitional
From a Web log:
  Subclass or instance
    What is a powerful adhesive?
  Distinction from co-members of class
    What is a star fruit?
  Value or more common synonym
    What is a nanometer?
    What is rubella?
  Subclass/instance with property
    What is a yellow spotted lizard?
  Ambiguous: definition or instance
    What is an antacid?

Attention to Details
Tenses
  Who is the Prime Minister of Japan?
Number
  What are the largest snakes in the world?
Articles
  What is mold?
  Where is the Taj Mahal?

Opinions, Hypotheses, Predictions and Beliefs
What does X think about Y?
Will X happen?
  "X will happen," says Dr. A
  Prof. B believes that X will happen.
  X will happen (asserted by article writer)
  e.g. Is global warming real?
How many countries did the Pope visit in 1990?
  "the Pope's planned visit to Argentina"

What is appropriate for QA?
How much emphasis should be placed on:
  Retrieval
  Built-in knowledge
  Computation
  Estimation
  Inference
Sample questions
  What is one plus one?
  How many $2 pencils can I buy for $10?
  How many genders are there?
  How many legs does a person have?
  How many books are there in a local library?
  What was the dilemma facing Hamlet?

Relationship Questions
An exercise in the ARDA AQUAINT program.
  What has been the relationship between Osama bin Laden and Sudan?
  What does Soviet Cosmonaut Valentina Tereshkova (Vladinrouna) and U.S. Astronaut Sally Ride have in common?
  What is the connection between actor and comedian Chris Rock and former Washington, D.C. mayor Marion Barry?
Two approaches (Cycorp and IBM)

Cycorp Approach
Single Strategy
  Use original question terms as IR query
  Break top retrieved documents into sentences
  Generate Bayesian network with words as nodes from Sentence x Word matrix
  Select ancestor terms to augment query
    E.g. What is the connection between actor and comedian Chris Rock and former Washington, D.C. mayor Marion Barry?
    Augmentation terms = {drug, arrested}
  Iterate, but where new network has sentences as nodes
  Output sentences that are neighbours of augmented query

IBM Approach
Multi-part Strategy, including:
Extending pattern-based agent
  What is the relationship between X and Y? -> locate syntactic contexts with X and Y:
    conjunction
    subject-verb-object
    objects of prepositions
New profile-based agent
  Local Context Analysis on documents containing either X or Y
  Form vector of terms, normalize, intersect, sort
  What do Valentina Tereshkova and Sally Ride have in common? ->
    Space
    First
    Woman
    Collins (the first woman to ever fly the space shuttle)
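
The "form vector, normalize, intersect, sort" step can be sketched as below. This is not Local Context Analysis: plain term frequency is swapped in for simplicity, the function names are assumptions, and taking the minimum weight is one possible way to intersect.

```python
import math
from collections import Counter


def profile(docs):
    """Unit-length term-frequency vector over whitespace tokens."""
    counts = Counter(word for doc in docs for word in doc.lower().split())
    norm = math.sqrt(sum(v * v for v in counts.values()))
    return {term: v / norm for term, v in counts.items()} if norm else {}


def common_terms(docs_x, docs_y, stopwords=frozenset()):
    """Terms present in both profiles, sorted by the smaller of their two
    weights (descending), with ties broken alphabetically."""
    px, py = profile(docs_x), profile(docs_y)
    shared = {t: min(px[t], py[t]) for t in px.keys() & py.keys() if t not in stopwords}
    return sorted(shared, key=lambda t: (-shared[t], t))
```

Calling `common_terms(docs_about_x, docs_about_y)` on the documents retrieved for each entity returns the shared terms, strongest first; those terms are the candidate description of what X and Y have in common.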

Decomposition/Recursive QA
"Who/What is X?" questions require a profile of the subject: QA-by-Dossier
Can generate auxiliary questions based on type of question focus.
  When/where was X born?
  When/where/how did X die?
  What occupation did X have?
Can generate follow-up questions based on earlier answers
  What did X win?
  What did X write?
  What did X discover?

Constraint-based QA
QA-by-Dossier-with-Constraints
  Variation of QA-by-Dossier
  Ask auxiliary questions that constrain the answer to the original question.
  Prager et al. (submitted)

When did Leonardo paint the Mona Lisa?

Constraints
Capitalize on existence of natural relationships between events/situations that can be used as constraints
  E.g. A person's achievements occurred during his/her lifetime.
Develop constraints for a person and an achievement event:
  $\mathrm{date(born)} + 10 \le \mathrm{date(event)} \le \mathrm{date(died)}$

Auxiliary Questions
When was Leonardo born?
When did Leonardo die?

Rank  Score  Answer
1     .99    1519
2     .98    1989
3     .96    1452
4     .60    1988
5     .60    1990

Dossier-with-Constraints Process
[Diagram boxes: Original Question; Auxiliary Questions; Constraints; Constraint Satisfaction + Confidence Combination]

Dossier for Leonardo
Born = 1452
Died = 1519
Painted Mona Lisa = 1503
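
The constraint-satisfaction step can be sketched by enumerating candidate combinations and keeping the consistent one with the highest combined confidence. Everything below is illustrative: the candidate lists and their scores are made up, the constraint is one assumed reading (the event falls at least 10 years after birth and no later than death), and multiplying confidences is just one possible combination rule.

```python
from itertools import product

# Hypothetical (answer, confidence) candidates per question; scores are invented.
CANDIDATES = {
    "born": [(1452, 0.7), (1519, 0.3)],
    "died": [(1519, 0.9), (1989, 0.5)],
    "event": [(2000, 0.6), (1988, 0.4), (1503, 0.3)],
}


def consistent(born, died, event, min_age=10):
    """Assumed constraint: born + min_age <= event <= died."""
    return born + min_age <= event <= died


def best_dossier(candidates):
    """Return the consistent (born, died, event) triple with the highest
    product of confidences, or None if no triple is consistent."""
    best, best_score = None, 0.0
    for (b, sb), (d, sd), (e, se) in product(
        candidates["born"], candidates["died"], candidates["event"]
    ):
        if consistent(b, d, e):
            score = sb * sd * se
            if score > best_score:
                best, best_score = {"born": b, "died": d, "event": e}, score
    return best
```

With these made-up candidates, the top-ranked event answer (2000) is ruled out by the lifetime constraint, and the selected dossier is born 1452, died 1519, event 1503.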

Cross-Language QA
Probably easiest approach is to translate question to language of collection, and perform monolingual QA
All considerations that apply to CL-IR apply to CL-QA, and then some:
  Named Entity Recognition
  Parsers
  Ontologies

Cross-Language QA
Jung and Lee, 2002.
User Query -> NLP -> SQL -> Relational Database
Morphological Analysis and Linguistic Resources are language dependent.
Generate Lexico-Semantic patterns

Cross-Language QA
TREC
  CLIR for several years
CLEF (Cross-Language Evaluation Forum) http://clef.iei.pi.cnr.it:2002/
  CLIR activities for several years
  CL-QA in 2003 http://clef-qa.itc.it/

References
Abney, S., Collins, M. and Singhal, A. "Answer Extraction". In Proceedings ANLP 2000.
E. Brill, J. Lin, M. Banko, S. Dumais and A. Ng, "Data-Intensive Question Answering", in Proceedings of the 10th Text Retrieval Conference (TREC-2001), NIST, Gaithersburg, MD, 2002.
D. Bikel, R. Schwartz, R. Weischedel, "An Algorithm that Learns What's in a Name," Machine Learning, 1999.
Byrd, R. and Ravin, Y. "Identifying and Extracting Relations in Text". In Proceedings of NLDB 99, Klagenfurt, Austria, 1999.
Jennifer Chu-Carroll, John Prager, Christopher Welty, Krzysztof Czuba and David Ferrucci. "A Multi-Strategy and Multi-Source Approach to Question Answering", Proceedings of TREC2002, Gaithersburg, MD, 2003.
Clarke, C.L.A., Cormack, G.V., Kisman, D.I.E. and Lynam, T.R. "Question answering by passage selection (Multitext experiments for TREC-9)" in Proceedings of the 9th Text Retrieval Conference, pp. 673-683, NIST, Gaithersburg, MD, 2001.
Sanda Harabagiu, Dan Moldovan, Marius Pasca, Rada Mihalcea, Mihai Surdeanu, Razvan Bunescu, Roxana Girju, Vasile Rus and Paul Morarescu, "FALCON: Boosting Knowledge for Answer Engines", in Proceedings of the 9th Text Retrieval Conference, pp. 479-488, NIST, Gaithersburg, MD, 2001.
Sanda Harabagiu, Dan Moldovan, Marius Pasca, Rada Mihalcea, Mihai Surdeanu, Razvan Bunescu, Roxana Girju, Vasile Rus and Paul Morarescu, "The Role of Lexico-Semantic Feedback in Open-Domain Textual Question-Answering", in Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics (ACL-2001), July 2001, Toulouse, France, pp. 274-281.
Gary G. Hendrix, Earl D. Sacerdoti, Daniel Sagalowicz, Jonathan Slocum: "Developing a Natural Language Interface to Complex Data". VLDB 1977: 292.
Hovy, E., Gerber, L., Hermjakob, U., Junk, M., and Lin, C-Y. "Question answering in Webclopedia" in Proceedings of the 9th Text Retrieval Conference, pp. 655-664, NIST, Gaithersburg, MD, 2001.
Ulf Hermjakob, Abdessamad Echihabi and Daniel Marcu, "Natural Language Based Reformulation Resource and Web Exploitation for Question Answering", Proceedings of TREC2002, Gaithersburg, MD, 2003.
Hanmin Jung, Gary Geunbae Lee, "Multilingual Question Answering with High Portability on Relational Databases", Workshop on Multilingual Summarization and Question Answering, COLING 2002.
Boris Katz. "Annotating the World Wide Web using natural language". Proceedings RIAO 1997.
Kupiec, J. "Murax: A robust linguistic approach for question answering using an on-line encyclopedia". Proceedings 16th SIGIR, Pittsburgh, PA, 1993.
Lenat, D. B. 1995. "Cyc: A Large-Scale Investment in Knowledge Infrastructure." Communications of the ACM 38, no. 11.
Mihalcea, R. and Moldovan, D. "A Method for Word Sense Disambiguation of Unrestricted Text". Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics (ACL-99), pp. 152-158, College Park, MD, 1999.
Miller, G. "WordNet: A Lexical Database for English", Communications of the ACM 38(11), pp. 39-41, 1995.
Dan I. Moldovan and Vasile Rus, "Logic Form Transformation of WordNet and its Applicability to Question Answering", Proceedings of the ACL 2001 Conference, July 2001, Toulouse, France.
Marius Pasca and Sanda Harabagiu, "High Performance Question/Answering", in Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR-2001), September 2001, New Orleans, LA, pp. 366-374.
John M. Prager, Jennifer Chu-Carroll and Krzysztof Czuba, "A Multi-Strategy, Multi-Question Approach to Question Answering", submitted for publication.
Prager, J.M., Chu-Carroll, J., Brown, E.W. and Czuba, K. "Question Answering by Predictive Annotation", in Advances in Open-Domain Question-Answering, Strzalkowski, T. and Harabagiu, S. Eds., Kluwer Academic Publishers, to appear 2003?.
Prager, J.M., Radev, D.R. and Czuba, K. "Answering What-Is Questions by Virtual Annotation". Proceedings of Human Language Technologies Conference, San Diego, CA, March 2001.
Prager, J.M., Brown, E.W., Coden, A. and Radev, D. "Question-Answering by Predictive Annotation". Proceedings of SIGIR 2000, pp. 184-191, Athens, Greece.
Radev, D.R., Qi, H., Zheng, Z., Blair-Goldensohn, S., Zhang, Z., Fan, W. & Prager, J.M. "Mining the Web for Answers to Natural Language Questions", Proceedings of CIKM, Atlanta, GA, 2001.
Radev, D.R., Prager, J.M. and Samn, V. "Ranking Suspected Answers to Natural Language Questions using Predictive Annotation". Proceedings of ANLP 2000, pp. 150-157, Seattle, WA.
Deepak Ravichandran and Eduard Hovy, "Learning Surface Text Patterns for a Question Answering System". Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), Philadelphia, July 2002, pp. 41-47.
Rosch, E. et al. "Basic Objects in Natural Categories", Cognitive Psychology 8, pp. 382-439, 1976.
Soubbotin, M. "Patterns of Potential Answer Expressions as Clues to the Right Answers" in Proceedings of the 10th Text Retrieval Conference, pp. 293-302, NIST, Gaithersburg, MD, 2002.
Soubbotin, M. and Soubbotin, S. "Use of Patterns for Detection of Answer Strings: A Systematic Approach" in Proceedings of the 11th Text Retrieval Conference, pp. 325-331, NIST, Gaithersburg, MD, 2003.
Ellen M. Voorhees and Dawn Tice. 2000. "Building a question answering test collection". In 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 200-207, Athens, August.
N. Wacholder, Y. Ravin and M. Choi. "Disambiguation of Proper Names in Text", Proceedings of ANLP97. Washington, DC, April 1997.
Warren, David H.D., & Fernando C.N. Pereira (1982) "An efficient easily adaptable system for interpreting natural language queries," Computational Linguistics, 8:3-4, 110-122.
Terry Winograd. 1972. "Procedures as a representation for data in a computer program for understanding natural language". Cognitive Psychology, 3(1).
