Answering Questions by Computer


  • Answering Questions by Computer

  • Terminology: Question Type
    Question Type: an idiomatic categorization of questions for purposes of distinguishing between different processing strategies and/or answer formats.
    E.g. TREC 2003:
    FACTOID: "How far is it from Earth to Mars?"
    LIST: "List the names of chewing gums."
    DEFINITION: "Who is Vlad the Impaler?"
    Other possibilities:
    RELATIONSHIP: "What is the connection between Valentina Tereshkova and Sally Ride?"
    SUPERLATIVE: "What is the largest city on Earth?"
    YES-NO: "Is Saddam Hussein alive?"
    OPINION: "What do most Americans think of gun control?"
    CAUSE & EFFECT: "Why did Iraq invade Kuwait?"

  • Terminology: Answer Type
    Answer Type: the class of object (or rhetorical type of sentence) sought by the question.
    E.g. PERSON (from "Who"), PLACE (from "Where"), DATE (from "When"), NUMBER (from "How many"),
    but also EXPLANATION (from "Why") and METHOD (from "How").
    Answer types are usually tied intimately to the classes recognized by the system's Named Entity Recognizer.

  • Terminology: Question Focus
    Question Focus: the property or entity that is being sought by the question.
    E.g. "In what state is the Grand Canyon?", "What is the population of Bulgaria?", "What colour is a pomegranate?"

  • Terminology: Question Topic
    Question Topic: the object (person, place, ...) or event that the question is about. The question might well be about a property of the topic, which will be the question focus.
    E.g. "What is the height of Mt. Everest?"
    height is the focus; Mt. Everest is the topic.

  • Terminology: Candidate Passage
    Candidate Passage: a text passage (anything from a single sentence to a whole document) retrieved by a search engine in response to a question.
    Depending on the query and kind of index used, there may or may not be a guarantee that a candidate passage has any candidate answers.
    Candidate passages will usually have associated scores from the search engine.

  • Terminology: Candidate Answer
    Candidate Answer: in the context of a question, a small quantity of text (anything from a single word to a sentence or bigger, but usually a noun phrase) that is of the same type as the Answer Type.
    In some systems, the type match may be approximate, if there is the concept of confusability.
    Candidate answers are found in candidate passages.
    E.g. "50", "Queen Elizabeth II", "September 8, 2003", "by baking a mixture of flour and water"

  • Terminology: Authority List
    Authority List (or File): a collection of instances of a class of interest, used to test a term for class membership.
    Instances should be derived from an authoritative source and be as close to complete as possible.
    Ideally, the class is small, easily enumerated, and its members have a limited number of lexical forms.
    Good: days of the week, planets, elements.
    Good statistically, but difficult to get 100% recall: animals, plants, colours.
    Problematic: people, organizations.
    Impossible: all numeric quantities; explanations and other clausal quantities.

  • Essence of Text-based QA
    Need to find a passage that answers the question:
    Find a candidate passage (search)
    Check that the semantics of passage and question match
    Extract the answer
    (Single-source answers)

  • Ranking Candidate Answers
    Q066: "Name the first private citizen to fly in space."
    Answer type: PERSON
    Text passage: "Among them was Christa McAuliffe, the first private citizen to fly in space. Karen Allen, best known for her starring role in Raiders of the Lost Ark, plays McAuliffe. Brian Kerwin is featured as shuttle pilot Mike Smith..."

  • Answer Extraction
    Also called Answer Selection or Pinpointing.
    Given a question and candidate passages, the process of selecting and ranking candidate answers.
    Usually, candidate answers are those terms in the passages which have the same answer type as that generated from the question.
    Ranking the candidate answers depends on assessing how well the passage context relates to the question.
    3 approaches: heuristic features, shallow parse fragments, logical proof.

  • Features for Answer Ranking (SIGIR '01)
    Number of question terms matched in the answer passage
    Number of question terms matched in the same phrase as the candidate answer
    Number of question terms matched in the same sentence as the candidate answer
    Flag set to 1 if the candidate answer is followed by a punctuation sign
    Number of question terms matched, separated from the candidate answer by at most three words and one comma
    Number of terms occurring in the same order in the answer passage as in the question
    Average distance from candidate answer to question term matches
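
A toy sketch, not from the slides, of how a few of these features might be computed for one candidate answer; the token-span representation and the feature names are illustrative assumptions.

```python
def ranking_features(question_terms, passage_tokens, cand_start, cand_end):
    """Toy versions of a few of the features listed above.

    question_terms: set of lower-cased question words
    passage_tokens: list of tokens in the candidate passage
    cand_start, cand_end: token span [cand_start, cand_end) of the candidate answer
    """
    matches = [i for i, tok in enumerate(passage_tokens)
               if tok.lower() in question_terms]

    # flag: is the candidate answer followed by a punctuation sign?
    followed_by_punct = int(cand_end < len(passage_tokens)
                            and passage_tokens[cand_end] in ".,;:!?")

    # question terms within three words of the candidate answer
    near = sum(1 for i in matches
               if 0 < i - (cand_end - 1) <= 3 or 0 < cand_start - i <= 3)

    # average distance from the candidate answer to the matched question terms
    centre = (cand_start + cand_end - 1) / 2.0
    avg_dist = (sum(abs(i - centre) for i in matches) / len(matches)
                if matches else 0.0)

    return {
        "terms_matched_in_passage": len(matches),
        "followed_by_punctuation": followed_by_punct,
        "terms_within_three_words": near,
        "average_distance": avg_dist,
    }
```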

  • Heuristics for Answer Ranking in the Lasso System
    Same_Word_Sequence_score: number of words from the question that are recognized in the same sequence in the passage.
    Punctuation_sign_score: a flag set to 1 if the candidate answer is followed by a punctuation sign.
    Comma_3_word_score: the number of question words that follow the candidate, if the candidate is followed by a comma.
    Same_parse_subtree_score: number of question words found in the parse sub-tree of the answer.
    Same_sentence_score: number of question words found in the answer's sentence.
    Distance_score: adds the distance (measured in number of words) between the answer candidate and the other keywords in the window.

  • Heuristics for Answer Ranking in the Lasso System (continued)

  • Evaluation
    Evaluation of this kind of system is usually based on some kind of TREC-like metric.
    In Q/A the most frequent metric is Mean Reciprocal Rank (MRR):
    You're allowed to return N answers. Your score is based on 1/rank of the first right answer, averaged over all the questions you answer.
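
Mean reciprocal rank is easy to state in code; the sketch below assumes a judging function that says whether a returned answer is correct, which is of course the hard part in practice.

```python
def mean_reciprocal_rank(runs, is_correct, max_answers=5):
    """runs: list of answer lists, one (ranked) list per question.
    is_correct(question_index, answer) -> bool, a stand-in for the human judge.
    Only the first correct answer contributes, with score 1/rank."""
    total = 0.0
    for qid, answers in enumerate(runs):
        for rank, answer in enumerate(answers[:max_answers], start=1):
            if is_correct(qid, answer):
                total += 1.0 / rank
                break
    return total / len(runs) if runs else 0.0
```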

  • Answer Types and Modifiers
    "Name 5 French cities."
    Most likely there is no type for FRENCH CITY, so we will look for CITY and either:
    include French/France in the bag of words, and hope for the best;
    include French/France in the bag of words, retrieve documents, and look for evidence (deep parsing, logic);
    use high-precision language identification on the results.
    If you have a list of French cities, you could either:
    filter results by the list;
    use Answer-Based QA (see later);
    use longitude/latitude information of cities and countries.

  • Answer Types and Modifiers
    "Name a female figure skater."
    Most likely there is no type for FEMALE FIGURE SKATER, or even for FIGURE SKATER.
    Look for PERSON, with query terms {figure, skater}.
    What to do about "female"? Two approaches:
    Include "female" in the bag of words. Relies on the logic that if femaleness is an interesting property, it might well be mentioned in answer passages. Does not apply to, say, "singer".
    Leave out "female" but test candidate answers for gender. Needs either an authority file or a heuristic test; the test may not be definitive.

  • Part II - Specific Approaches

    By Genre:
    Statistical QA
    Pattern-based QA
    Web-based QA
    Answer-based QA (TREC only)
    By System:
    SMU
    LCC
    USC-ISI
    Insight
    Microsoft
    IBM Statistical
    IBM Rule-based

  • Statistical QA
    Use statistical distributions to model likelihoods of answer type and answer.
    E.g. IBM (Ittycheriah, 2001); see later section.

  • Pattern-based QA
    For a given question type, identify the typical syntactic constructions used in text to express answers to such questions.
    Typically very high precision, but a lot of work to get decent recall.

  • Web-Based QA
    Exhaustive string transformations: Brill et al. 2002
    Learning: Radev et al. 2001

  • Answer-Based QA
    Problem: sometimes it is very easy to find an answer to a question using resource A, but the task demands that you find it in resource B.
    Solution: first find the answer in resource A, then locate the same answer, along with original question terms, in resource B.
    An artificial problem, but real for TREC participants.

  • Answer-Based QA: Web-Based Solution

    "When a QA system looks for answers within a relatively small textual collection, the chance of finding strings/sentences that closely match the question string is small. However, when a QA system looks for strings/sentences that closely match the question string on the web, the chance of finding correct answer is much higher." (Hermjakob et al. 2002)
    Why this is true:
    The Web is much larger than the TREC corpus (3,000 : 1).
    TREC questions are generated from Web logs, and the style of language (and subjects of interest) in these logs are more similar to the Web content than to newswire collections.

  • Answer-Based QA: Database/Knowledge-base/Ontology Solution
    When question syntax is simple and reliably recognizable, it can be expressed as a logical form.
    The logical form represents the entire semantics of the question, and can be used to access structured resources:
    WordNet, on-line dictionaries, tables of facts & figures, knowledge-bases such as Cyc.
    Having found the answer:
    construct a query with the original question terms + the answer;
    retrieve passages;
    tell Answer Extraction the answer it is looking for.

  • Approaches of Specific Systems
    SMU Falcon, LCC, USC-ISI, Insight, Microsoft, IBM.
    Note: some of the slides and/or examples in these sections are taken from papers or presentations by the respective system authors.

  • SMU Falcon (Harabagiu et al. 2000)

  • SMU Falcon
    From the question, a dependency structure called the question semantic form is created.
    The query is a Boolean conjunction of terms.
    From answer passages that contain at least one instance of the answer type, generate the answer semantic form.
    3 processing loops:
    Loop 1: triggered when too few or too many passages are retrieved from the search engine.
    Loop 2: triggered when the question semantic form and answer semantic form cannot be unified.
    Loop 3: triggered when unable to perform an abductive proof of answer correctness.

  • SMU Falcon
    The loops provide opportunities to perform alternations:
    Loop 1: morphological expansions and nominalizations
    Loop 2: lexical alternations (synonyms, direct hypernyms and hyponyms)
    Loop 3: paraphrases
    Evaluation (Pasca & Harabagiu, 2001), increase in accuracy on the 50-byte task in TREC-9:
    Loop 1: 40%; Loop 2: 52%; Loop 3: 8%; Combined: 76%

  • LCC (Moldovan & Rus, 2001)
    Uses a Logic Prover for answer justification:
    question logical form;
    candidate answers in logical form;
    XWN (eXtended WordNet) glosses;
    linguistic axioms;
    lexical chains.
    The inference engine attempts to verify the answer by negating the question and proving a contradiction.
    If the proof fails, predicates in the question are gradually relaxed until the proof succeeds or the associated proof score is below a threshold.

  • LCC: Lexical Chains
    Q1518: "What year did Marco Polo travel to Asia?"
    Answer: "Marco Polo divulged the truth after returning in 1292 from his travels, which included several months on Sumatra."
    Lexical chains:
    (1) travel_to:v#1 -> GLOSS -> travel:v#1 -> RGLOSS -> travel:n#1
    (2) travel_to:v#1 -> GLOSS -> travel:v#1 -> HYPONYM -> return:v#1
    (3) Sumatra:n#1 -> ISPART -> Indonesia:n#1 -> ISPART -> Southeast_Asia:n#1 -> ISPART -> Asia:n#1

    Q1570: "What is the legal age to vote in Argentina?"
    Answer: "Voting is mandatory for all Argentines aged over 18."
    Lexical chains:
    (1) legal:a#1 -> GLOSS -> rule:n#1 -> RGLOSS -> mandatory:a#1
    (2) age:n#1 -> RGLOSS -> aged:a#3
    (3) Argentine:a#1 -> GLOSS -> Argentina:n#1

  • LCC: Logic Prover
    Question: "Which company created the Internet browser Mosaic?"
    QLF: _organization_AT(x2) & company_NN(x2) & create_VB(e1,x2,x6) & Internet_NN(x3) & browser_NN(x4) & Mosaic_NN(x5) & nn_NNC(x6,x3,x4,x5)
    Answer passage: "... Mosaic, developed by the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign ..."
    ALF: ... Mosaic_NN(x2) & develop_VB(e2,x2,x31) & by_IN(e2,x8) & National_NN(x3) & Center_NN(x4) & for_NN(x5) & Supercomputing_NN(x6) & application_NN(x7) & nn_NNC(x8,x3,x4,x5,x6,x7) & NCSA_NN(x9) & at_IN(e2,x15) & University_NN(x10) & of_NN(x11) & Illinois_NN(x12) & at_NN(x13) & Urbana_NN(x14) & nn_NNC(x15,x10,x11,x12,x13,x14) & Champaign_NN(x16) ...
    Lexical chains (develop -> make, and make -> create):
    exists x2 x3 x4 all e2 x1 x7 (develop_vb(e2,x7,x1) -> make_vb(e2,x7,x1) & something_nn(x1) & new_jj(x1) & such_jj(x1) & product_nn(x2) & or_cc(x4,x1,x3) & mental_jj(x3) & artistic_jj(x3) & creation_nn(x3))
    all e1 x1 x2 (make_vb(e1,x1,x2) -> create_vb(e1,x1,x2) & manufacture_vb(e1,x1,x2) & man-made_jj(x2) & product_nn(x2))
    Linguistic axiom: all x0 (mosaic_nn(x0) -> internet_nn(x0) & browser_nn(x0))

  • USC-ISI
    TextMap system (Ravichandran and Hovy, 2002; Hermjakob et al. 2003)
    Use of surface text patterns.
    "When was X born?" Typical answers:
    "Mozart was born in 1756."
    "Gandhi (1869-1948)"
    These can be captured in expressions such as "<NAME> was born in <BIRTHDATE>" and "<NAME> ( <BIRTHDATE> -".
    These patterns can be learned.

  • USC-ISI TextMap
    Use bootstrapping to learn patterns. For an identified question type ("When was X born?"), start with known answers for some values of X:
    Mozart 1756; Gandhi 1869; Newton 1642
    Issue Web search engine queries (e.g. "+Mozart +1756")
    Collect the top 1000 documents
    Filter, tokenize, smooth, etc.
    Use a suffix tree constructor to find the best substrings, e.g. "Mozart (1756-1791)"
    Filter: "Mozart (1756-"
    Replace the query strings with placeholders, e.g. <NAME> and <ANSWER>

    Determine the precision of each pattern:
    Find documents with just the question term ("Mozart")
    Apply the patterns and calculate precision
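
A rough sketch of this precision step, assuming patterns are written as regular expressions with a <Q> placeholder for the question term and a named ANSWER group; the notation is illustrative, not the TextMap implementation.

```python
import re

def pattern_precision(pattern, sentences, question_term, known_answer):
    """Apply one learned pattern to sentences retrieved with *only* the
    question term, and return correct matches / total matches."""
    regex = re.compile(pattern.replace("<Q>", re.escape(question_term)))
    total = correct = 0
    for sentence in sentences:
        for m in regex.finditer(sentence):
            total += 1
            if m.group("ANSWER") == known_answer:
                correct += 1
    return correct / total if total else 0.0

# e.g. pattern_precision(r"<Q> \((?P<ANSWER>\d{4})-", docs, "Mozart", "1756")
```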

  • USC-ISI TextMap: Finding Answers
    Determine the question type
    Perform an IR query
    Do sentence segmentation and smoothing
    Replace the question term by the question tag, i.e. replace "Mozart" with <NAME>
    Search for instances of patterns associated with the question type
    Select words matching <ANSWER>
    Assign scores according to the precision of the pattern

  • Insight (Soubbotin, 2002; Soubbotin & Soubbotin, 2003)
    Performed very well in TREC-10/11.
    Comprehensive and systematic use of "indicative patterns".
    E.g. "cap word; paren; 4 digits; dash; 4 digits; paren" matches "Mozart (1756-1791)".
    The patterns are broader than named entities ("semantics in syntax").
    Patterns have intrinsic scores (reliability), independent of the question.

  • Insight
    Patterns with more sophisticated internal structure are more indicative of an answer.
    2/3 of their correct entries in TREC-10 were answered by patterns.
    E.g. element types:
    a = {countries}
    b = {official posts}
    w = {proper names (first and last)}
    e = {titles or honorifics}
    Patterns for "Who is the President (Prime Minister) of a given country?" are sequences of these elements: abewwewwdb, ab, aeww
    Definition questions (A is the primary query term, X is the answer):

    For: "Moulin Rouge, a cabaret"

    For: "naturally occurring gas called methane"

    For: "Michigan's state flower is the apple blossom"

  • Insight
    Emphasis on shallow techniques; lack of NLP.
    Look in the vicinity of a text string potentially matching a pattern for "zeroing" terms, e.g. for occupational roles: Former, Elect, Deputy, negation.
    Comments:
    Relies on the redundancy of a large corpus.
    Works for the factoid question types of TREC-QA; not clear how it extends.
    Not clear how they match questions to patterns.
    Named entities within patterns have to be recognized.

  • Microsoft: Data-Intensive QA (Brill et al. 2002)
    Overcoming the surface string mismatch between the question formulation and the string containing the answer.
    Approach based on the assumption/intuition that someone on the Web has answered the question in the same way it was asked.
    Want to avoid dealing with:
    lexical, syntactic, and semantic relationships (between Q & A);
    anaphora resolution;
    synonymy;
    alternate syntax;
    indirect answers.
    Take advantage of redundancy on the Web, then project to the TREC corpus (Answer-based QA).

  • Microsoft AskMSR
    Formulate multiple queries; each rewrite has an intrinsic score. E.g. for "What is relative humidity?":
    ["+is relative humidity", LEFT, 5]
    ["relative +is humidity", RIGHT, 5]
    ["relative humidity +is", RIGHT, 5]
    ["relative humidity", NULL, 2]
    ["relative AND humidity", NULL, 1]
    Get the top 100 documents from Google.
    Extract n-grams from the document summaries.
    Score each n-gram by summing the scores of the rewrites it came from.
    Use tiling to merge n-grams.
    Search for supporting documents in the TREC corpus.
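
A minimal sketch of the rewrite step for a "What is X?" question; the weights and the LEFT/RIGHT hints mirror the example above, while the whitespace splitting and the function name are illustrative assumptions rather than the AskMSR code.

```python
def rewrite_what_is(question):
    """Generate (query, answer_side, weight) rewrites for a "What is X?" question."""
    terms = question.rstrip("?").split()[2:]        # e.g. ["relative", "humidity"]
    rewrites = []
    for i in range(len(terms) + 1):                 # move "+is" to every position
        query = " ".join(terms[:i] + ["+is"] + terms[i:])
        side = "LEFT" if i == 0 else "RIGHT"
        rewrites.append((query, side, 5))
    rewrites.append((" ".join(terms), None, 2))     # exact phrase back-off
    rewrites.append((" AND ".join(terms), None, 1)) # bag-of-words back-off
    return rewrites

# rewrite_what_is("What is relative humidity?") reproduces the five rewrites above.
```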

  • Microsoft AskMSR
    Question: "What is the rainiest place on Earth?"
    Answer from the Web: Mount Waialeale
    Passage in the TREC corpus: "In misty Seattle, Wash., last year, 32 inches of rain fell. Hong Kong gets about 80 inches a year, and even Pago Pago, noted for its prodigious showers, gets only about 196 inches annually. (The titleholder, according to the National Geographic Society, is Mount Waialeale in Hawaii, where about 460 inches of rain falls each year.)"
    Very difficult to imagine getting this passage by other means.

  • IBM Statistical QA (Ittycheriah, 2001)
    The Answer Type Model (ATM) predicts, from the question and a proposed answer, the answer type they both satisfy.
    Given a question, an answer, and the predicted answer type, the Answer Selection Model (ASM) models the correctness of this configuration.
    Distributions are modelled using a maximum entropy formulation.
    Training data = human judgments: for the ATM, 13K questions annotated with 31 categories; for the ASM, ~5K questions from TREC plus trivia.
    p(c|q,a) = Σ_e p(c,e|q,a) = Σ_e p(c|e,q,a) · p(e|q,a)
    where q = question, a = answer, c = correctness, e = answer type;
    p(e|q,a) is the answer type model (ATM) and p(c|e,q,a) is the answer selection model (ASM).
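
The decomposition above marginalizes over the hidden answer type e; a sketch with the two maximum-entropy models abstracted as callables (their training is what the paper is about).

```python
def p_correct(question, answer, answer_types, atm, asm):
    """p(c|q,a) = sum over e of p(c|e,q,a) * p(e|q,a).

    atm(e, question, answer) -> p(e|q,a)   (answer type model)
    asm(e, question, answer) -> p(c|e,q,a) (answer selection model)
    Both are stand-ins for the trained maximum-entropy models."""
    return sum(asm(e, question, answer) * atm(e, question, answer)
               for e in answer_types)
```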

  • IBM Statistical QA (Ittycheriah)
    Question analysis (by the ATM): selects one of 31 categories.
    Search: question expanded by Local Context Analysis; top 1000 documents retrieved.
    Passage extraction: top 100 passages that
    maximize question word match,
    have the desired answer type,
    minimize dispersion of question words,
    have similar syntactic structure to the question.
    Answer extraction: candidate answers ranked using the ASM.

  • IBM Rule-based: Predictive Annotation (Prager 2000, Prager 2003)

    Want to make sure passages retrieved by the search engine have at least one candidate answer.
    Recognize that a candidate answer of the correct answer type corresponds to a label (or several) generated by the Named Entity Recognizer.
    Annotate the entire corpus and index the semantic labels along with the text.
    Identify answer types in questions and include the corresponding labels in queries.

  • IBM PIQUANT: Predictive Annotation
    E.g. the question is "Who invented baseball?"
    "Who" can map to PERSON$ or ORGANIZATION$.
    Suppose we assume only people invent things (it doesn't really matter).

    So "Who invented baseball?" -> {PERSON$ invent baseball}

    Consider the text: "... but its conclusion was based largely on the recollections of a man named Abner Graves, an elderly mining engineer, who reported that baseball had been 'invented' by Doubleday between 1839 and 1841."

  • IBM PIQUANT: Predictive Annotation
    Previous example: "Who invented baseball?" -> {PERSON$ invent baseball}
    However, the same structure is equally effective at answering "What sport did Doubleday invent?" -> {SPORT$ invent Doubleday}

  • IBM Rule-Based: Handling Subsumption & Disjunction
    If an entity is of a type which has a parent type, then how is annotation done?
    If a proposed answer type has a parent type, then what answer type should be used?
    If an entity is ambiguous, then what should the annotation be?
    If the answer type is ambiguous, then what should be used?

  • Subsumption & Disjunction
    Consider New York City: both a CITY and a PLACE.
    To answer "Where did John Lennon die?", it needs to be a PLACE.
    To answer "In what city is the Empire State Building?", it needs to be a CITY.
    We do NOT want to do the subsumption calculation in the search engine. Two scenarios:
    1. Expand the answer type and use the most specific entity annotation:
       1A. {(CITY PLACE) John_Lennon die} matches CITY
       1B. {CITY Empire_State_Building} matches CITY
    2. Use the most specific answer type and multiple annotations of NYC:
       2A. {PLACE John_Lennon die} matches (CITY PLACE)
       2B. {CITY Empire_State_Building} matches (CITY PLACE)
    Case 2 is preferred for simplicity, because the disjunction in #1 should contain all hyponyms of PLACE, while the disjunction in #2 should contain all hypernyms of CITY.
    Choice #2 suggests we can use a disjunction in the answer type to represent ambiguity:
    "Who invented the laser?" -> {(PERSON ORGANIZATION) invent laser}
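
A minimal sketch of scenario 2, annotating each entity with its specific type plus all hypernyms so that the query keeps a single, specific answer type; the tiny hypernym table and the function names are illustrative assumptions, not PIQUANT's actual interface.

```python
# Illustrative fragment of a type hierarchy (most-specific type -> hypernyms).
HYPERNYMS = {"CAPITAL": ["CITY", "PLACE"], "CITY": ["PLACE"], "PLACE": []}

def annotation_labels(specific_type):
    """Scenario 2: index an entity under its specific type and all its hypernyms."""
    return {specific_type, *HYPERNYMS.get(specific_type, [])}

def matches(answer_type, specific_type):
    """A candidate matches if the query's answer type is among its indexed labels."""
    return answer_type in annotation_labels(specific_type)

# New York City is annotated as CITY (and hence also PLACE):
# matches("PLACE", "CITY") -> True   ("Where did John Lennon die?")
# matches("CITY", "CITY")  -> True   ("In what city is the Empire State Building?")
```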

  • Clausal classes
    Any structure that can be recognized in text can be annotated: quotations, explanations, methods, opinions.
    Any semantic class label used in annotation can be indexed, and hence used as a target of search:
    "What did Karl Marx say about religion?"
    "Why is the sky blue?"
    "How do you make bread?"
    "What does Arnold Schwarzenegger think about global warming?"

  • Named Entity Recognition

  • IBM Predictive Annotation: Improving Precision at No Cost to Recall
    E.g. the question is "Where is Belize?"
    "Where" can map to (CONTINENT$, WORLDREGION$, COUNTRY$, STATE$, CITY$, CAPITAL$, LAKE$, RIVER$, ...). But we know Belize is a country.
    So "Where is Belize?" -> {(CONTINENT$ WORLDREGION$) Belize}
    "Belize" occurs 1068 times in the TREC corpus;
    "Belize" and PLACE$ co-occur in only 537 sentences;
    "Belize" and CONTINENT$ or WORLDREGION$ co-occur in only 128 sentences.

  • Virtual Annotation (Prager 2001)
    Use WordNet to find all candidate answers (hypernyms).
    Use corpus co-occurrence statistics to select the best ones.
    Rather like the approach to WSD by Mihalcea and Moldovan (1999).

  • Parentage of nematode

  • Parentage of meerkat

  • Natural Categories
    "Basic Objects in Natural Categories", Rosch et al. (1976).
    According to psychological testing, these are categorization levels of intermediate specificity that people tend to use in unconstrained settings.

  • What is this?

  • What can we conclude?
    There are descriptive terms that people are drawn to use naturally.
    We can expect to find instances of these in text, in the right contexts.
    These terms will serve as good answers.

  • Virtual Annotation (cont.)
    Find all parents of the query term in WordNet.
    Look for co-occurrences of the query term and parent in a text corpus.
    Expect to find snippets such as "meerkats and other Y". Many different phrasings are possible, so we just look for proximity rather than parse.
    Scoring:
    Count co-occurrences of each parent with the search term, and divide by the level number (only levels >= 1), generating the Level-Adapted Count (LAC).
    Exclude the very highest levels (too general).
    Select the parent with the highest LAC, plus any others with a LAC within 20%.
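
A sketch of the LAC scoring just described; the two dictionaries stand in for WordNet hypernym levels and corpus co-occurrence counts, and the cut-off for "very general" levels is an illustrative parameter.

```python
def virtual_annotation(cooccurrences, parent_levels, max_level=6):
    """cooccurrences: {parent: co-occurrence count with the query term}
    parent_levels: {parent: WordNet hypernym level above the query term}
    Returns the best parents by Level-Adapted Count (count / level)."""
    lac = {}
    for parent, count in cooccurrences.items():
        level = parent_levels.get(parent, 0)
        if 1 <= level <= max_level:          # skip level 0 and very general ancestors
            lac[parent] = count / level
    if not lac:
        return []
    best = max(lac.values())
    # keep the best parent plus any others within 20% of it
    return sorted((p for p, score in lac.items() if score >= 0.8 * best),
                  key=lambda p: -lac[p])
```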

  • Parentage of nematode

  • Parentage of meerkat

  • Sample Answer Passages
    "What is a nematode?" -> "Such genes have been found in nematode worms but not yet in higher animals."

    What is a meerkat? -> South African golfer Butch Kruger had a good round going in the central Orange Free State trials, until a mongoose-like animal grabbed his ball with its mouth and dropped down its hole. Kruger wrote on his card: "Meerkat."

    Use Answer-based QA to locate answers

  • Use of Cyc as Sanity Checker
    Cyc: large knowledge-base and inference engine (Lenat 1995).
    A post-hoc process for:
    rejecting insane answers ("How much does a grey wolf weigh?" "300 tons");
    boosting confidence for sane answers.
    The sanity checker is invoked with a predicate (e.g. weight), a focus (e.g. grey wolf), and a candidate value (e.g. 300 tons).
    The sanity checker returns:
    Sane: within +/- 10% of the value in Cyc;
    Insane: outside of the reasonable range (plan to use distributions instead of ranges);
    Don't know.
    The confidence score is highly boosted when the answer is sane.
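
A toy version of the post-hoc check, with a plain dictionary standing in for the Cyc lookup; the +/- 10% band comes from the slide, everything else is an assumption.

```python
def sanity_check(predicate, focus, candidate_value, kb):
    """Return "sane", "insane", or "dont-know" for a numeric candidate answer.

    kb maps (predicate, focus) -> known value in the same units (a Cyc stand-in)."""
    known = kb.get((predicate, focus))
    if known is None:
        return "dont-know"
    return "sane" if abs(candidate_value - known) <= 0.10 * known else "insane"

# e.g. sanity_check("population", "Maryland", 50_000,
#                   {("population", "Maryland"): 5_296_486})  # -> "insane"
```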

  • Cyc Sanity Checking Example
    TREC-11 Q: "What is the population of Maryland?"
    Without sanity checking:
    PIQUANT's top answer: 50,000.
    Justification: "Maryland's population is 50,000 and growing rapidly."
    The passage discusses an exotic species, nutria, not humans.
    With sanity checking:
    Cyc knows the population of Maryland is 5,296,486.
    It rejects the top insane answers.
    PIQUANT's new top answer: 5.1 million, with very high confidence.

  • AskMSR
    Process the question by:
    forming a search engine query from the original question;
    detecting the answer type.
    Get some results.
    Extract answers of the right type, based on how often they occur.

  • AskMSR

  • Step 1: Rewrite the questions
    Intuition: the user's question is often syntactically quite close to sentences that contain the answer.

    "Where is the Louvre Museum located?" -> "The Louvre Museum is located in Paris."
    "Who created the character of Scrooge?" -> "Charles Dickens created the character of Scrooge."

  • Query rewriting
    Classify the question into one of seven categories:

    Who is/was/are/were ...?
    When is/did/will/are/were ...?
    Where is/are/were ...?

    a. Hand-crafted category-specific transformation rules.
    E.g. for "Where" questions, move "is" to all possible locations and look to the right of the query terms for the answer:

    "Where is the Louvre Museum located?" ->
    "is the Louvre Museum located"
    "the is Louvre Museum located"
    "the Louvre is Museum located"
    "the Louvre Museum is located"
    "the Louvre Museum located is"

  • Step 2: Query the search engine
    Send all rewrites to a Web search engine.
    Retrieve the top N answers (100-200).
    For speed, rely just on the search engine's snippets, not the full text of the actual documents.

  • Step 3: Gathering N-Grams
    Enumerate all N-grams (N = 1, 2, 3) in all retrieved snippets.
    Weight of an n-gram: occurrence count, each occurrence weighted by the reliability (weight) of the rewrite rule that fetched the document.
    Example: "Who created the character of Scrooge?"
    Dickens 117, Christmas Carol 78, Charles Dickens 75, Disney 72, Carl Banks 54, A Christmas 41, Christmas Carol 45, Uncle 31
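
A sketch of this step, assuming the snippets are already paired with the weight of the rewrite that retrieved them; tokenization is naive whitespace splitting.

```python
from collections import Counter

def collect_ngrams(weighted_snippets, max_n=3):
    """weighted_snippets: list of (snippet_text, rewrite_weight) pairs.
    Returns a Counter mapping n-gram -> summed weight of the rewrites it came from."""
    scores = Counter()
    for text, weight in weighted_snippets:
        tokens = text.split()
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                scores[" ".join(tokens[i:i + n])] += weight
    return scores
```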

  • Step 4: Filtering N-Grams
    Each question type is associated with one or more "data-type filters" = regular expressions for answer types.
    Boost the score of n-grams that match the expected answer type.
    Lower the score of n-grams that don't match.
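
A sketch of the filtering step; the regular expressions and the boost/penalty factors are illustrative placeholders for the per-question-type data-type filters.

```python
import re

# Illustrative data-type filters; a real system has one or more per question type.
TYPE_FILTERS = {
    "when-year": re.compile(r"\b(1\d{3}|20\d{2})\b"),              # four-digit year
    "how-many":  re.compile(r"\b\d+(\.\d+)?\b"),                   # a number
    "who":       re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)*\b"),  # name-like string
}

def filter_ngrams(scores, question_type, boost=2.0, penalty=0.5):
    """Boost n-grams matching the expected answer type, demote the rest."""
    pattern = TYPE_FILTERS[question_type]
    return {ng: s * (boost if pattern.search(ng) else penalty)
            for ng, s in scores.items()}
```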

  • Step 5: Tiling the Answers
    "Dickens" (score 20), "Charles Dickens" (score 15), "Mr Charles" (score 10)
    -> merged (old n-grams discarded): "Mr Charles Dickens", score 45
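
A greedy sketch of the tiling step: merge any two candidates whose word sequences overlap (or where one contains the other), add their scores, and repeat; the real AskMSR scoring is more careful, so treat this only as an illustration of the idea.

```python
def _merge(a, b):
    """Merge b onto the end of a if some suffix of a equals a prefix of b."""
    aw, bw = a.split(), b.split()
    for k in range(min(len(aw), len(bw)), 0, -1):
        if aw[-k:] == bw[:k]:
            return " ".join(aw + bw[k:])
    return None

def tile(ngram_scores):
    """ngram_scores: dict n-gram -> score. Returns a list of (answer, score)."""
    answers = sorted(ngram_scores.items(), key=lambda kv: -kv[1])
    changed = True
    while changed:
        changed = False
        for i in range(len(answers)):
            for j in range(len(answers)):
                if i == j:
                    continue
                merged = _merge(answers[i][0], answers[j][0])
                if merged is not None:
                    score = answers[i][1] + answers[j][1]
                    answers = [kv for k, kv in enumerate(answers) if k not in (i, j)]
                    answers.append((merged, score))
                    answers.sort(key=lambda kv: -kv[1])
                    changed = True
                    break
            if changed:
                break
    return answers

# tile({"Dickens": 20, "Charles Dickens": 15, "Mr Charles": 10})
#   -> [("Mr Charles Dickens", 45)]
```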

  • Results
    Standard TREC contest test-bed (TREC 2001): 1M documents, 900 questions.
    The technique does OK, not great (it would have placed in the top 9 of ~30 participants).
    But with access to the Web, they do much better: they would have come in second on TREC 2001.

  • Issues
    In many scenarios (e.g., monitoring an individual's email) we only have a small set of documents.
    Works best (or only) for Trivial Pursuit-style fact-based questions.
    Limited/brittle repertoire of:
    question categories;
    answer data types/filters;
    query rewriting rules.

  • ISI: Surface Patterns Approach
    Use of characteristic phrases.
    "When was <NAME> born?"
    Typical answers:
    "Mozart was born in 1756."
    "Gandhi (1869-1948)..."
    Suggests that phrases like "<NAME> was born in <ANSWER>" and "<NAME> ( <ANSWER> -", used as regular expressions, can help locate the correct answer.

  • Use Pattern Learning
    Example:
    "The great composer Mozart (1756-1791) achieved fame at a young age"
    "Mozart (1756-1791) was a genius"
    "The whole world would always be indebted to the great music of Mozart (1756-1791)"
    The longest matching substring for all 3 sentences is "Mozart (1756-1791)".
    A suffix tree would extract "Mozart (1756-1791)" as an output, with a score of 3.
    Reminiscent of IE pattern learning.
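
The suffix-tree step can be illustrated with a brute-force stand-in that counts, for every substring, how many of the example sentences contain it (a real implementation would use a suffix tree for efficiency); the thresholds are arbitrary.

```python
from collections import Counter

def best_shared_substrings(sentences, min_len=5, top_k=3):
    """Return substrings shared by the most sentences, longest first on ties."""
    counts = Counter()
    for s in sentences:
        subs = {s[i:j] for i in range(len(s))
                for j in range(i + min_len, len(s) + 1)}
        for sub in subs:             # count each substring once per sentence
            counts[sub] += 1
    return sorted(counts, key=lambda sub: (-counts[sub], -len(sub)))[:top_k]

sentences = [
    "The great composer Mozart (1756-1791) achieved fame at a young age",
    "Mozart (1756-1791) was a genius",
    "The whole world would always be indebted to the great music of Mozart (1756-1791)",
]
# best_shared_substrings(sentences)[0] -> "Mozart (1756-1791)" (shared by all 3)
```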

  • Pattern Learning (cont.)
    Repeat with different examples of the same question type: Gandhi 1869, Newton 1642, etc.
    Some patterns learned for BIRTHDATE:
    a. born in <ANSWER> , <NAME>
    b. <NAME> was born on <ANSWER> ,
    c. <NAME> ( <ANSWER> -
    d. <NAME> ( <ANSWER> - )

  • Experiments
    6 different question types from the Webclopedia QA Typology (Hovy et al., 2002a):
    BIRTHDATE, LOCATION, INVENTOR, DISCOVERER, DEFINITION, WHY-FAMOUS

  • Experiments: Pattern Precision
    BIRTHDATE:
    1.0  <NAME> ( <ANSWER> - )
    0.85 <NAME> was born on <ANSWER> ,
    0.6  <NAME> was born in <ANSWER>
    0.59 <NAME> was born <ANSWER>
    0.53 <ANSWER> <NAME> was born
    0.50 - <NAME> ( <ANSWER>
    0.36 <NAME> ( <ANSWER> -
    INVENTOR:
    1.0  <ANSWER> invents <NAME>
    1.0  the <NAME> was invented by <ANSWER>
    1.0  <ANSWER> invented the <NAME> in

  • Experiments (cont.)
    DISCOVERER:
    1.0  when <ANSWER> discovered <NAME>
    1.0  <ANSWER>'s discovery of <NAME>
    0.9  <NAME> was discovered by <ANSWER> in
    DEFINITION:
    1.0  <NAME> and related <ANSWER>
    1.0  form of <ANSWER> , <NAME>
    0.94 as <NAME> , <ANSWER> and

  • Experiments (cont.)
    WHY-FAMOUS:
    1.0  <ANSWER> <NAME> called
    1.0  laureate <ANSWER> <NAME>
    0.71 <NAME> is the <ANSWER> of
    LOCATION:
    1.0  <ANSWER>'s <NAME>
    1.0  regional : <ANSWER> : <NAME>
    0.92 near <NAME> in <ANSWER>
    Depending on question type, the system gets high MRR (0.6-0.9), with higher results from use of the Web than from the TREC QA collection.

  • Shortcomings & Extensions
    Need for POS and/or semantic types.
    "Where are the Rocky Mountains?"
    "Denver's new airport, topped with white fiberglass cones in imitation of the Rocky Mountains in the background, continues to lie empty..."
    An NE tagger and/or ontology could enable the system to determine that "background" is not a location.

  • Shortcomings... (cont.)
    Long-distance dependencies.
    "Where is London?"
    "London, which has one of the most busiest airports in the world, lies on the banks of the river Thames."
    This would require a pattern like: <QUESTION>, (<any_word>)*, lies on <ANSWER>
    The abundance and variety of Web data helps the system to find an instance of its patterns without losing answers to long-distance dependencies.

  • Shortcomings... (cont.)
    The system currently has only one anchor word.
    Doesn't work for question types requiring multiple words from the question to be in the answer:
    "In which county does the city of Long Beach lie?"
    "Long Beach is situated in Los Angeles County"
    required pattern: <QUESTION> is situated in <ANSWER>
    Does not use case:
    "What is a micron?"
    "...a spokesman for Micron, a maker of semiconductors, said SIMMs are..."
    If "Micron" had been capitalized in the question, this would be a perfect answer.

  • QA Typology from ISI (USC)
    Typology of typical question forms: 94 nodes (47 leaf nodes).
    Analyzed 17,384 questions (from answers.com).

  • Question Answering Example
    "How hot does the inside of an active volcano get?"
    get(TEMPERATURE, inside(volcano(active)))
    "lava fragments belched out of the mountain were as hot as 300 degrees Fahrenheit"
    fragments(lava, TEMPERATURE(degrees(300)), belched(out, mountain))
    volcano ISA mountain
    lava ISPARTOF volcano; lava inside volcano
    fragments of lava HAVEPROPERTIESOF lava
    The needed semantic information is in WordNet definitions, and was successfully translated into a form that was used for rough proofs.

  • References
    Michele Banko, Eric Brill, Susan Dumais, Jimmy Lin. "AskMSR: Question Answering Using the Worldwide Web." In Proceedings of the 2002 AAAI Symposium on Mining Answers from Text and Knowledge Bases, March 2002. http://www.ai.mit.edu/people/jimmylin/publications/Banko-etal-AAAI02.pdf
    Susan Dumais, Michele Banko, Eric Brill, Jimmy Lin, Andrew Ng. "Web Question Answering: Is More Always Better?" SIGIR 2002. http://research.microsoft.com/~sdumais/SIGIR2002-QA-Submit-Conf.pdf
    D. Ravichandran and E. H. Hovy. "Learning Surface Text Patterns for a Question Answering System." ACL 2002.

  • Harder Questions
    Factoid question answering is really pretty silly.
    A more interesting task is one where the answers are fluid and depend on the fusion of material from disparate texts over time:
    "Who is Condoleezza Rice?"
    "Who is Mahmoud Abbas?"
    "Why was Arafat flown to Paris?"