Open Domain Question Answering
Lide Wu
Dept. of Computer Science
Fudan University
Shanghai 200433
China
Outline
• What is open domain question answering (ODQA)
• The state of the art of ODQA
• The future of ODQA
• ODQA as a grand challenge in CS/AI/IT
• Summary
What’s QA?
• Free text corpus
• Question: When did Hawaii become a state?
• Answer: August 21, 1959
When did Hawaii become a state?
• AnswerBus Question Answering System - When did Hawaii become a ... Type in your question in English, French, Spanish, German, Italian or Portuguese. Question: When did Hawaii become a state? ... www.answerbus.com/cgi-bin/ answer.cgi?When%2Bdid%2BHawaii%2Bbecome%2Ba%2Bstate%3F - 4k - Cached - Similar pages
• uncategorized threads in About Hawaii ... How did Hawaii become a state? What is the history of Hawaii?? ... When and why did Hawaii become a state (cause and effect); Safe to live by Mauna Loa? ... www.greenspun.com/bboard/ q-and-a-one-category.tcl?topic=About%20Hawaii&category=uncategorized - 5k - Cached - Similar pages
• Is Hawaii Really a State of the Union? ... Become a state, or remain a territory? Why was the option of independence not on the ballot? Did Hawaii not have the option to become an independent country in ... www.hawaii-nation.org/statehood.html - 14k - Cached - Similar pages
• Hawaii Flag Printout - EnchantedLearning.com ... __________________________________________. 3. When did Hawaii become a state of the USA? _______________. Copyright ©2000-2003 EnchantedLearning.com. www.enchantedlearning.com/usa/ flags/hawaii/hawaiiflag.shtml - 3k - Cached - Similar pages
• PaleoZoo's Prehistoric Hawaii! ... became extinct after rats and mongooses arrived in Hawaii. ... let Nature decide when a species should become extinct. They decided to save the nene, and they did. ... www.geobop.com/paleozoo/World/NA/US/HI/ - 42k - Cached - Similar pages
When did Hawaii become a state?
• HAWAII SUPREME COURT DROPS GAY MARRIAGE CASE || Human Rights ... It did not bar future cases that seek the benefits, protections and responsibilities that come ... Their ads claimed that Hawaii would become the "homosexual ... www.hrc.org/newsreleases/1999/991210.asp - 16k - 30 Jun 2003 - Cached - Similar pages
• Maui Trivia by MAUI CHEETAH ... ~ Ans: Front Street in Lahaina ***** submitted by: THonings; When did hawaii become a state? ~~ Ans: 1959 ***** submitted ... www.mauigateway.com/~rw/trivia1.htm - 12k - Cached - Similar pages
• State Bird of Hawaii Unmasked as Canadian ... it should be no surprise that Canada geese did it some ... But in their adopted tropical habitat of Hawaii, the birds "evolved to become more independent of ... news.nationalgeographic.com/news/2002/ 02/0206_020206_canadiangeese.html - 38k - Cached - Similar pages
• [PDF] BEFORE ARBITRATOR TAMOTSU TANAKA STATE OF HAWAII In the Matter of ... File Format: PDF/Adobe Acrobat - View as HTML ... training to insure that qualified employees become available. ... did not contravene the provisions of the Collective ... DATED: Honolulu, Hawaii, December 10, 1998. ... www.state.hi.us/hrd/121098.pdf - Similar pages
Comparison to Search Engines
• More natural interface
Natural language question vs Keywords
• More compact answer
Exact answers vs Relevant documents
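The General Solution of QA
• Question Analysis Model: turns the question into a query set and an answer type/patterns
• Search Engine Model: takes the query set and returns potential segments
• Answer Extraction Model: applies the answer type/patterns to the potential segments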
Question Analysis
• Input: Question (When did Hawaii become a state?)
• Output: Answer type/patterns (Date)
  Queries (a group of keywords: Hawaii, state, became…)
• Methods: POS tagging
  Named entity tagging
  BNP chunking
  Syntactic parsing
  Semantic tagging
  …..
Question Analysis
• Input: Question (When did Hawaii become a state?)
• Output:
  Answer type: Date
  Patterns: “Hawaii became a state in….”
  “In … Hawaii became a state.”
  ………….
  Queries (a group of keywords):
  “When did Hawaii become a state”
  “Hawaii became a state in….”
  Hawaii, state, became
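The question analysis step above can be sketched in a few lines of Python. This is only an illustration, not the method used by any particular system: the cue-word table `ANSWER_TYPES`, the `STOPWORDS` set and the function names are all assumptions introduced here, and a real system would rely on the tagging and parsing methods listed on the previous slide.

```python
import re

# Hypothetical cue-word table (an assumption for illustration only).
ANSWER_TYPES = {
    "when": "Date",
    "where": "Location",
    "who": "Person",
    "how many": "Number",
}

# Hypothetical stopword list used to keep only content words.
STOPWORDS = {"when", "where", "who", "what", "how", "many",
             "did", "do", "does", "is", "a", "an", "the"}


def answer_type(question):
    """Guess the expected answer type from the leading question words."""
    q = question.lower()
    # Try longer cues first so "how many" is tested before shorter cues.
    for cue in sorted(ANSWER_TYPES, key=len, reverse=True):
        if q.startswith(cue):
            return ANSWER_TYPES[cue]
    return "Other"


def keywords(question):
    """Return the content words of the question, in order."""
    tokens = re.findall(r"[A-Za-z0-9]+", question)
    return [t for t in tokens if t.lower() not in STOPWORDS]


def analyze(question):
    """Produce the answer type and a small query set for the search step."""
    return {
        "answer_type": answer_type(question),
        "queries": [question.rstrip("?").strip(), " ".join(keywords(question))],
    }


result = analyze("When did Hawaii become a state?")
```

For the running example, `result["answer_type"]` is `"Date"` and the keyword query contains Hawaii, become and state. Generating surface patterns such as “Hawaii became a state in….” would additionally need morphological and syntactic processing, which this sketch leaves out.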
Search
• Input: Queries (e.g. “Hawaii became a state in”), i.e. groups of keywords or phrases
• Output: Text segments (snippets) relevant to the answer, such as the ones returned by Google
• Methods: Search engines for passages
Answer Extraction
• Input: Question answer type/patterns from question analysis
  Snippets returned by search engines
• Output: Answers
• Methods: POS tagging
  Named entity tagging
  BNP chunking
  Syntactic parsing
  Semantic tagging
  Co-reference resolution
  Logic proving/matching
  ………….
Answer Extraction
Question: When did Hawaii become a state?
Answer type: Date
Patterns from question analysis:
“Hawaii became a state ….”
“In … Hawaii became a state.”
………….
Snippets returned by search engines:
“…Hawaii became the 50th state on Aug. 21, 1959…”
“…Hawaii joined the States in 1959……”
………………
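As a rough illustration of how an answer type of Date can drive extraction from snippets, the sketch below scans each snippet with a regular expression and keeps the most specific candidate. The regular expression, the function names and the "longest candidate wins" rule are assumptions made for this example; they stand in for the named entity tagging and pattern matching methods listed earlier and are not taken from any actual system.

```python
import re

# Abbreviated or full English month name, optionally followed by a period.
MONTH = r"(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*\.?"

# Either a full date such as "Aug. 21, 1959" or a bare year such as "1959".
DATE_RE = re.compile(
    rf"{MONTH}\s*\d{{1,2}},\s*\d{{4}}|\b(?:1[0-9]|20)\d{{2}}\b"
)


def extract_dates(snippet):
    """Return every date-like string found in one snippet."""
    return DATE_RE.findall(snippet)


def best_answer(snippets):
    """Pick the most specific (longest) date candidate, or None if there is none."""
    candidates = [d for s in snippets for d in extract_dates(s)]
    if not candidates:
        return None  # no candidate of the expected type was found
    return max(candidates, key=len)


snippets = [
    "...Hawaii became the 50th state on Aug. 21, 1959...",
    "...Hawaii joined the States in 1959...",
]
answer = best_answer(snippets)
```

Here `answer` is the full date from the first snippet, since it is longer than the bare year from the second. Returning `None` when nothing matches is one simple way to signal that no answer was found.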
Key techniques
CL:
• Part-of-speech tagging
• NE tagging
• Semantic tagging
• BNP chunking
• Reference resolution
• Syntactic parsing
IR: Search engine
AI:
• Pattern matching
• Logic proving
Machine Learning
Key Knowledge
Dictionaries
• WordNet
• HowNet
• FrameNet
World Knowledge
• Encyclopedia
• Web
The State of the Art: Introduction to the TREC QA Task
• http://trec.nist.gov
• Organized by NIST
• Sponsors: NIST, DARPA, and ARDA
• Started in 1999
• Has the most participants among the TREC tasks
TREC-QA 2002 participants (35)
• Alicante Univ.
• BBN
• CMU-Javelin
• Chinese Academy of Sciences
• CL Research
• Columbia Univ.-Illouz
• Fudan University
• IBM T.J. Watson Res. Ctr.-Ittycheriah
• IBM T.J. Watson Res. Ctr.-Prager
• InsightSoft-M
• ITC-irst
• Language Computer Corporation
• LIMSI
• MIT
• National Univ. of Singapore-Lee
• National Univ. of Singapore-Hui
• NTT Communication Science Labs
• POSTECH
• Syracuse University
• The MITRE Corp.
• Tokyo Univ. of Science
• Univ. of Amsterdam-Monz
• Université d’Angers
• Univ. of Avignon
• Univ. of Illinois at Urbana/Champaign
• Univ. of Iowa
• Univ. of Limerick
• Univ. of Michigan
• Univ. of Montreal
• Univ. of Pisa
• Univ. of Sheffield
• Univ. of Southern California/ISI
• Univ. of Waterloo
• Univ. of York
Document set
• The document set is the set of documents on
the AQUAINT disk set.
• 3GB
• News
Evaluation
• 500 questions (e.g. When did Hawaii become a state?)
• For each question, the answer is judged as one of:
– Incorrect (W): the answer-string does not contain a correct answer or the answer is not responsive;
– Unsupported (U): the answer-string contains a correct answer but the document returned does not support that answer;
– Non-exact (X): the answer-string contains a correct answer and the document supports that answer, but the string contains more than just the answer (or is missing bits of the answer);
– Correct (R): the answer-string consists of exactly a correct answer and that answer is supported by the document returned.
• Only correct answers score
Score
$$S = \frac{1}{500} \sum_{i=1}^{500} \frac{\#\ \text{of correct answers up to the } i\text{th question}}{i}$$
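Read as an average over questions of (number of correct answers up to the i-th question) divided by i, the score can be computed as in the sketch below. It assumes, as in the TREC 2002 confidence-weighted scoring, that the questions are ordered from the system's most confident answer to its least confident one; the function name and the boolean input format are choices made for this example.

```python
def confidence_weighted_score(judgments):
    """Average, over ranks i, of (correct answers among the first i) / i.

    judgments: list of booleans, one per question, ordered from the most
    to the least confident answer; True means the answer was judged
    Correct (R), False covers W, U and X.
    """
    if not judgments:
        return 0.0
    correct_so_far = 0
    total = 0.0
    for i, is_correct in enumerate(judgments, start=1):
        if is_correct:
            correct_so_far += 1
        total += correct_so_far / i
    return total / len(judgments)


# Four questions instead of 500, to keep the arithmetic easy to follow:
# (1/1 + 1/2 + 2/3 + 3/4) / 4
example = confidence_weighted_score([True, False, True, True])
```

Because early ranks contribute to every later term, a system is rewarded for placing the answers it is most sure about first.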
Top 15 Groups (2002)
TREC-QA2003 participants (25)
• Alicante Univ.
• BBN
• CMU-Javelin
• Chinese Academy of Sciences
• CL Research
• Fudan University
• IBM T.J. Watson Res. Ctr.-Ittycheriah
• IBM T.J. Watson Res. Ctr.-Prager
• ITC-irst
• Language Computer Corporation
• Lexiclone Inc
• LIMSI
• MIT
• National Univ. of Singapore
• NTT Communication Science Labs
• New Mexico State Univ.
• The MITRE Corp.
• Univ. of Amsterdam-Monz
• Univ. of Iowa
• Univ. of Limerick
• UPC&UdG
• Univ. of Pisa
• Univ. of Sheffield
• Univ. of Southern California/ISI
• Univ. of Waterloo
• Univ. of Wales Bangor
TREC 2004: Question Set
• A series of questions for each of a set of targets
• Number of targets: 50-100
• Each series will contain:
– Several factoid questions
– 0-2 list questions
– A question called “other”
Example question
<target id="1" text="AmeriCorps">
  <qa>
    <q id="1.1" type="FACTOID"> When was AmeriCorps founded? </q>
  </qa>
  <qa>
    <q id="1.2" type="FACTOID"> How many volunteers work for it? </q>
  </qa>
  <qa>
    <q id="1.3" type="LIST"> What activities are its volunteers involved in? </q>
  </qa>
  <qa>
    <q id="1.4" type="OTHER"> Other </q>
  </qa>
</target>
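A question series in this format can be read with Python's standard `xml.etree.ElementTree` module. The sketch below is one possible way to do it; the function name `parse_series` and the dictionary layout it returns are assumptions made for this example, not part of the TREC tooling.

```python
import xml.etree.ElementTree as ET

# The example series from the slide, embedded so the sketch is self-contained.
SERIES = """<target id="1" text="AmeriCorps">
  <qa><q id="1.1" type="FACTOID"> When was AmeriCorps founded? </q></qa>
  <qa><q id="1.2" type="FACTOID"> How many volunteers work for it? </q></qa>
  <qa><q id="1.3" type="LIST"> What activities are its volunteers involved in? </q></qa>
  <qa><q id="1.4" type="OTHER"> Other </q></qa>
</target>"""


def parse_series(xml_text):
    """Return the target text and a list of its questions."""
    target = ET.fromstring(xml_text)
    questions = [
        {
            "id": q.get("id"),
            "type": q.get("type"),
            "text": (q.text or "").strip(),
        }
        for q in target.iter("q")
    ]
    return target.get("text"), questions


target_text, questions = parse_series(SERIES)
for q in questions:
    print(q["id"], q["type"], q["text"])
```

Keeping the target text next to each question matters because later questions in a series refer back to it ("How many volunteers work for it?").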
Question Set
• Targets:
– Suggested by mining Microsoft and AOL web search logs
• The assessors created the questions before they did any searching of the document set to find answers to the questions.
The Future of ODQA: A Roadmap (adapted from the NIST vision paper)
Variation of questions
The simplest questions
• Factual questions: What is Hawaii’s state flower?
• Void questions: The answer is no longer guaranteed to be present in the text collection, and systems are expected to report the absence of an answer.
• List questions: The answer is scattered across two or more documents.
• Context questions: A group of related questions “within a context”.
List Questions
The answer is scattered across two or more documents.
Which countries in South America did the Pope visit, and when?
Answer:
• Argentina – 1987 [Document Source 1]
• Colombia – 1986 [Document Source 2]
• Brazil – 1982, 1991 [Document Source 3]
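Since a list answer is assembled from several documents, a system needs some way to merge per-document extractions into one list while remembering which document supports each item. The sketch below shows one simple way to do that. The `(item, year, source)` tuple format and the function name are assumptions for this example, and the data are the answers shown on the slide.

```python
from collections import defaultdict


def merge_list_answers(extractions):
    """Group (item, year, source) tuples by item, dropping duplicates."""
    merged = defaultdict(lambda: {"years": set(), "sources": set()})
    for item, year, source in extractions:
        merged[item]["years"].add(year)
        merged[item]["sources"].add(source)
    # Convert the sets to sorted lists so the result is stable and printable.
    return {
        item: {"years": sorted(v["years"]), "sources": sorted(v["sources"])}
        for item, v in merged.items()
    }


extractions = [
    ("Argentina", 1987, "Document Source 1"),
    ("Colombia", 1986, "Document Source 2"),
    ("Brazil", 1982, "Document Source 3"),
    ("Brazil", 1991, "Document Source 3"),
    ("Brazil", 1991, "Document Source 3"),  # duplicate extraction, merged away
]
answer = merge_list_answers(extractions)
```

Keeping the supporting sources with each item fits the evaluation rule that an answer counts as correct only when the returned document supports it.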
Context Questions
A group of related questions “within a context”
• Context: Topic 168
 - Title: Financing AMTRAK
 - Description: The role of the Federal Government in financing the operation of the National Railroad Transportation Corporation (AMTRAK).
• (Q1) Why can AMTRAK not be considered economically viable?
• (Q2) Should it be privatized?
• (Q3) How much larger are the government subsidies to AMTRAK compared to those given to air transportation?
Definition/Template Question
• There are templates for this kind of question
• Example: Who is XXX?
The template consists of:
– The address, phone number, fax number, email address, website, ….
– The education history
– The working experience
– The contributions
– ………
Question with ambiguity
The answer will comprise an explanation of possible ambiguities and a justification of why the answer is right.
Examples
Where is the Taj Mahal?
Answer: If you are interested in the Indian landmark, it is in Agra, India. If instead you want to find the location of the casino, it is in Atlantic City, NJ, U.S.A. There are also several restaurants named Taj Mahal. A full list is rendered by the following hypertable. If you click on the location, you may find the address.
• The Taj Mahal Indian Cuisine, Mountain View, CA
• The Taj Mahal Restaurant, Dallas, TX
• Taj Mahal, Las Vegas, NV
• Taj Mahal, Springfield, VA
Examples
How did Socrates die?
Answer:
He drank poisoned wine.
Anyone drinking or eating something that is poisoned is likely to die.
Summaries as answer
• More complex questions will require the answers to be summaries of the textual information contained in one or several documents.
• The summarization is driven by the question and draws on one or multiple documents.
• Moreover, the summary will be presented in a coherent manner using text generation capabilities.
Examples
• Context-based summary-generating questions:
  What is the financial situation of AMTRAK?
• Stand-alone summary-generating questions:
  How safe are commercial flights?
• Example-based summary-generating questions:
  What other companies are operated with Government aid?
Expert-Level Questions
Questions asked by experts require:
• Collecting sufficient structured and unstructured information for different domains.
• Mining domain knowledge and mastering the relationships between all activities, situations and facts within a specific domain.
• Reasoning by analogy, comparing and discovering new relations.
Examples
• (Q1) What are the opinions of the Danes on the Euro?
• (Q2) Why are so many people buying four-wheel-drive cars lately?
• (Q3) How likely is it that the Fed will raise the interest rates at their next meeting?
A General Approach
• Accept complex “Questions” in a form natural to the analyst
• Translate the “Complex Question” into multiple queries appropriate to the various data sets to be searched
• Find relevant information in distributed, multimedia, multilingual, multi-agency data sources
• Analyze, fuse and summarize information into a coherent “Answer”
• Provide the (proposed) “Answer” to the analyst in the form they want
• Provide multimedia visualization and navigation tools
ODQA as a grand challenge
What makes a good long-range research goal or a grand challenge? -- Jim Gray
• Understandable. The goal should be simple to state.
• Challenging. It should not be obvious how to achieve the goal.
• Useful. If the goal is achieved, the results should be clearly useful to many people.
• Testable. Solutions to the goal should have a simple test so that one can measure progress and tell when the goal is achieved.
• Incremental. It is very desirable that the goal has intermediate milestones so that progress can be measured along the way.
QA as a grand challenge
A more demanding task is to take a corpus like the Internet or the Computer Science journals, or Encyclopedia Britannica, and be able to answer summarization questions about it as well as a human expert in that field
-- Jim Gray, Journal of the ACM, Jan. 2003 (J. ACM’s 50th anniversary)
QA as a grand challenge
Read a Chapter in a Book and Answer the Questions at the End of the Chapter. Reading and understanding books is a quintessentially human activity. It is the process by which much knowledge transfer occurs from generation to generation.
-- Raj Reddy, Journal of the ACM, Jan. 2003
QA as a grand challenge
• Build a large knowledge base by reading text, reducing knowledge engineering effort by one order of magnitude
• The intent here is to “educate” a knowledge base in the same way that we receive most of our education
-- Edward A. Feigenbaum, Journal of the ACM, Jan. 2003
QA as a grand challenge
Because questions can be devised to query any aspect of text comprehension, the ability to answer questions is the strongest possible demonstration of understanding.
-- Wendy Lehnert
• So ODQA is AI-complete in some sense
Conclusion
• Open Domain Question Answering is a grand challenge in CS/AI/IT
• It is understandable, challenging, useful, testable, and incremental.
Thanks