From Question-Answering to Information-Seeking Dialogs
Jerry R. Hobbs
USC/ISI
Marina del Rey, CA
with Douglas Appelt, David Israel, Peter Jarvis, David Martin, Mark Stickel, and Richard Waldinger of SRI
Chris Culy
SRI International
Menlo Park, CA
12/04/02 Chris Culy, SRI, and Jerry Hobbs, USC/ISI, PIs
Decomposing Questions
Could Mohammed Atta have met with an Iraqi official between 1998 and 2001?
Logical Form: meet(a,b,t) & 1998 ≤ t ≤ 2001
Question Decomposition via Logical Rules (SNARK):
at(a,x1,t) & at(b,x2,t) & near(x1,x2) & official(b,Iraq)
go(a,x1,t), go(b,x2,t)
Resources Attached to Reasoning Process: IE Engine, Geographical Reasoning, Temporal Reasoning
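The decomposition on this slide can be illustrated with a minimal sketch, assuming a toy fact table in place of the attached resources. The names `AT_FACTS`, `NEAR`, `OFFICIAL_OF`, and `could_have_met` are hypothetical, and the individuals, places, and years are placeholders rather than real data.

```python
from itertools import product

# Hypothetical toy facts: (person, place, year). Placeholder values only.
AT_FACTS = [
    ("a", "x1", 1999),
    ("b", "x2", 1999),
    ("b", "x3", 2003),
]
NEAR = {("x1", "x2"), ("x2", "x1")}  # symmetric near(x1, x2) pairs
OFFICIAL_OF = {"b": "Iraq"}          # official(b, Iraq)


def near(p, q):
    """A place is near itself and near any place listed in NEAR."""
    return p == q or (p, q) in NEAR


def could_have_met(a, country, start, end):
    """Return witnesses (b, x1, x2, t) for the decomposed subgoals:
    at(a,x1,t) & at(b,x2,t) & near(x1,x2) & official(b,country) & start <= t <= end."""
    witnesses = []
    for (p1, x1, t1), (p2, x2, t2) in product(AT_FACTS, repeat=2):
        if (p1 == a and p2 != a and t1 == t2 and start <= t1 <= end
                and near(x1, x2) and OFFICIAL_OF.get(p2) == country):
            witnesses.append((p2, x1, x2, t1))
    return witnesses


print(could_have_met("a", "Iraq", 1998, 2001))
```

An empty result means only that the toy facts do not support a meeting; it says nothing about whether one took place.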
The Problem
Inference in large knowledge bases is required for competent question-answering
Many rich but heterogeneous knowledge bases exist today
How do we make use of them in a single system?
Outline
Three Resources:
1. The Semantic Web: Teknowledge’s search engine ASCS
2. An Information Extraction Engine: SRI’s TextPro
3. An Ontology of Time: DAML-Time
DAML Search Engine
Teknowledge has developed ASCS:
[Query form: pred: capital, arg1: ?x, arg2: Indonesia, each with a namespace field]
Searches entire (soon to be exponentially growing) Semantic Web
Also conjunctive queries: population of capital of Indonesia
Problem: you have to know logic and RDF to use it.
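A conjunctive query such as "population of capital of Indonesia" can be sketched as pattern matching over triples. This is a minimal illustration, not the ASCS interface: `TRIPLES`, `match`, and `query` are hypothetical names, and the population value is a placeholder, not a real figure.

```python
# Hypothetical toy triples (predicate, arg1, arg2), read as pred(arg1, arg2)
# to mirror capital(?x, Indonesia). The population value is a placeholder.
TRIPLES = [
    ("capital", "Jakarta", "Indonesia"),
    ("capital", "Canberra", "Australia"),
    ("population", "Jakarta", 1000),
]


def is_var(term):
    return isinstance(term, str) and term.startswith("?")


def match(pattern, triple, bindings):
    """Unify one pattern with one triple; return extended bindings or None."""
    new = dict(bindings)
    for p, v in zip(pattern, triple):
        if is_var(p):
            if new.setdefault(p, v) != v:
                return None  # variable already bound to a different value
        elif p != v:
            return None
    return new


def query(patterns, bindings=None):
    """Yield every set of bindings that satisfies all patterns together."""
    bindings = bindings or {}
    if not patterns:
        yield bindings
        return
    for triple in TRIPLES:
        extended = match(patterns[0], triple, bindings)
        if extended is not None:
            yield from query(patterns[1:], extended)


answers = list(query([("capital", "?c", "Indonesia"), ("population", "?c", "?n")]))
print(answers)
```

The shared variable `?c` is what makes the query conjunctive: the second pattern is only tried with the capital found by the first.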
DAML Search Engine as AQUAINT Web Resource
[Query form: pred: capital, arg1: ?x, arg2: Indonesia, each with a namespace field]
Searches entire (soon to be exponentially growing) Semantic Web
AQUAINT System --> capital(?x,Indonesia) (procedural attachment in SNARK)
Solution: You only have to know English to use it. Makes the entire Semantic Web accessible to AQUAINT users.
Also: Can use it for subqueries.
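Procedural attachment can be pictured as a registry that maps a predicate to an external procedure, which the reasoner calls instead of searching its axioms. The sketch below is an assumption-laden illustration, not SNARK's actual mechanism: `ATTACHMENTS`, `attach`, `solve`, and the lookup stub are all hypothetical, and the stub returns a canned answer instead of calling a search service.

```python
ATTACHMENTS = {}


def attach(predicate):
    """Register a Python function as the procedure that answers a predicate."""
    def register(fn):
        ATTACHMENTS[predicate] = fn
        return fn
    return register


@attach("capital")
def capital_lookup_stub(x, country):
    # Stand-in for a call out to a search service; no real call is made.
    table = {"Indonesia": "Jakarta"}
    answer = table.get(country)
    if answer is None:
        return []
    if x is not None and x != answer:
        return []  # the goal named a specific city and it does not match
    return [(answer, country)]


def solve(predicate, *args):
    """Answer a goal by calling its attached procedure; None marks a variable."""
    procedure = ATTACHMENTS.get(predicate)
    if procedure is None:
        raise LookupError(f"no procedure attached to {predicate!r}")
    return procedure(*args)


print(solve("capital", None, "Indonesia"))
```

Because the attachment is keyed by predicate, the same call works whether the goal comes from a user question or from a subquery generated during a proof.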
Namespace Problem
Where to find the right predicates?
In QUARK (decreasing precision):
  Subtheories linking predicates to namespaces
  Subtheories linking topics to namespaces
In DAML/ASCS (decreasing precision):
  EQUIVALENT statements
  Standardized ontologies
  Use WordNet and SUMO to expand query
  Any namespace
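One way to read the DAML/ASCS list is as a fallback chain: try the most precise source of predicates first and relax only when it yields nothing. The sketch below assumes that reading; the lookup tables, the namespace prefixes (`ns1:`, `std:`, `ns2:`), and `find_predicates` are hypothetical.

```python
# Hypothetical lookup tables, ordered from most to least precise.
EQUIVALENTS = {"capital": ["ns1:capitalCity"]}
STANDARD_ONTOLOGY = {"population": ["std:population"]}
SYNONYMS = {"head": ["ns2:leader"]}


def any_namespace(word):
    # Last resort: accept the bare word in any namespace (wildcard prefix).
    return ["*:" + word]


STRATEGIES = [
    ("equivalent statements", lambda w: EQUIVALENTS.get(w, [])),
    ("standardized ontologies", lambda w: STANDARD_ONTOLOGY.get(w, [])),
    ("synonym expansion", lambda w: SYNONYMS.get(w, [])),
    ("any namespace", any_namespace),
]


def find_predicates(word):
    """Return (strategy name, candidates) from the first strategy that yields any."""
    for name, strategy in STRATEGIES:
        candidates = strategy(word)
        if candidates:
            return name, candidates
    return None, []


print(find_predicates("capital"))
print(find_predicates("mayor"))
```

Returning the strategy name alongside the candidates lets a caller weigh an answer by how precise the match was.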
Information Extraction Engine as a Resource
Document retrieval for pre-processing
TextPro: Top-of-the-line information extraction engine; recognizes subject-verb-object and coreference relations
Analyze NL query with GEMINI and SNARK
Bottom out in a pattern for TextPro to seek
Keyword search on very large corpus
TextPro runs over documents retrieved
Linking SNARK with TextPro
TextSearch(EntType(?x), Terms(p), Terms(c), WSeq)
& Analyze(WSeq, p(?x,c))
--> p(?x,c)
TextSearch: Call to TextPro
EntType(?x): Type of questioned constituent
Terms(p), Terms(c): Synonyms and hypernyms of word associated with p or c
WSeq: Answer, an ordered sequence of annotated strings of words
Analyze: Match pieces of annotated answer strings with pieces of query
p(?x,c): Subquery generated by SNARK during analysis of query
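The rule above can be sketched as two functions chained together. This is a minimal illustration under stated assumptions: `TERMS`, `CORPUS`, `text_search`, `analyze`, and `prove` are hypothetical stand-ins, and the retrieval step is a plain substring test rather than anything TextPro does.

```python
# Hypothetical term expansions standing in for Terms(p) and Terms(c).
TERMS = {"CEO": {"ceo", "heads", "chief"}, "IBM": {"ibm", "big blue"}}

# Toy "annotated strings": (entity type, entity text, sentence).
CORPUS = [
    ("Person", "Samuel Palmisano", "Samuel Palmisano heads IBM."),
    ("Organization", "SRI", "SRI is in Menlo Park."),
]


def text_search(ent_type, p_terms, c_terms):
    """Stand-in for the TextPro call: return WSeq, the matching annotated strings."""
    wseq = []
    for etype, entity, sentence in CORPUS:
        low = sentence.lower()
        if (etype == ent_type
                and any(t in low for t in p_terms)
                and any(t in low for t in c_terms)):
            wseq.append((entity, sentence))
    return wseq


def analyze(wseq, p, c):
    """Turn each retrieved string into a ground fact p(x, c)."""
    return [(p, entity, c) for entity, _ in wseq]


def prove(p, c, ent_type):
    """TextSearch(...) & Analyze(...) --> p(?x, c)."""
    return analyze(text_search(ent_type, TERMS[p], TERMS[c]), p, c)


print(prove("CEO", "IBM", "Person"))
```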
Three Modes of Operation for TextPro
1. Search for predefined patterns and relations (ACE-style) and translate relations into SNARK's logic
   Where does the CEO of IBM live? (ACE Role and AT Relations)
2. Search for subject-verb-object relations in processed text that match the predicate-argument structure of SNARK's logical expression
   "Samuel Palmisano is CEO of IBM."
3. Search for passage with highest density of relevant words and entity of right type for answer
   "Samuel Palmisano .... CEO .... IBM."
Use coreference links to get most informative answer
First Mode
TextSearch(Person, Terms(CEO), Terms(IBM), WSeq)
& Analyze(WSeq, Role(?x,Management,IBM,CEO))
--> CEO(?x,IBM)
Entity1: {Samuel Palmisano, Palmisano, head, he}
Entity2: {IBM, International Business Machines, they}
Relation: Role(Entity1,Entity2, Management,CEO)
<relation TYPE="Role" SUBTYPE="Management">
  <rel_entity_arg ID="Entity1" ARGNUM="1"/>
  <rel_entity_arg ID="Entity2" ARGNUM="2"/>
  <rel_attribute ATTR="POSITION">CEO</rel_attribute>
</relation>
Analyze --> CEO(Samuel Palmisano,IBM)
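The translation from relation markup to a logical fact might look like the sketch below. It assumes a well-formed version of the markup (quoted attributes, straight quotes) and takes the first mention in each coreference set as the canonical name; `relation_to_fact` and `ENTITIES` are hypothetical names, not part of TextPro or SNARK.

```python
import xml.etree.ElementTree as ET

# Well-formed version of the relation markup, with quoted attributes.
RELATION_XML = """
<relation TYPE="Role" SUBTYPE="Management">
  <rel_entity_arg ID="Entity1" ARGNUM="1"/>
  <rel_entity_arg ID="Entity2" ARGNUM="2"/>
  <rel_attribute ATTR="POSITION">CEO</rel_attribute>
</relation>
"""

# Coreference sets from the slide; the first mention is assumed canonical.
ENTITIES = {
    "Entity1": ["Samuel Palmisano", "Palmisano", "head", "he"],
    "Entity2": ["IBM", "International Business Machines", "they"],
}


def relation_to_fact(xml_text, entities):
    """Translate a Role relation into a (position, person, organization) fact."""
    rel = ET.fromstring(xml_text.strip())
    # Map argument number to entity ID, e.g. {"1": "Entity1", "2": "Entity2"}.
    args = {a.get("ARGNUM"): a.get("ID") for a in rel.findall("rel_entity_arg")}
    position = rel.find("rel_attribute[@ATTR='POSITION']").text
    return (position, entities[args["1"]][0], entities[args["2"]][0])


print(relation_to_fact(RELATION_XML, ENTITIES))
```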
Three Modes of Operation for TextPro
1. Search for predefined patterns (MUC-style) and translate template into SNARK's logic
   Where does the CEO of IBM live?
2. Search for subject-verb-object relations in processed text that match the predicate-argument structure of SNARK's logical expression
   "Samuel Palmisano heads IBM."
3. Search for passage with highest density of relevant words and entity of right type for answer
   "Samuel Palmisano .... CEO .... IBM."
Use coreference links to get most informative answer
Second Mode
TextSearch(Person, Terms(CEO), Terms(IBM), WSeq)
& Analyze(WSeq, CEO(?x,IBM))
--> CEO(?x,IBM)
"<subj> Samuel Palmisano </subj> <verb> heads </verb> <obj> IBM </obj>"
Analyze --> CEO(Samuel Palmisano,IBM)
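Matching a subject-verb-object annotation against a predicate can be sketched with a regular expression and a verb table. The tag names follow the slide; `VERB_TO_PREDICATE`, `SVO`, and `analyze_svo` are hypothetical, and the verb table is an assumed stand-in for the synonym knowledge the system would use.

```python
import re

# Hypothetical mapping from verbs to the predicate they express.
VERB_TO_PREDICATE = {"heads": "CEO", "leads": "CEO"}

SVO = re.compile(
    r"<subj>\s*(?P<subj>.*?)\s*</subj>\s*"
    r"<verb>\s*(?P<verb>.*?)\s*</verb>\s*"
    r"<obj>\s*(?P<obj>.*?)\s*</obj>"
)


def analyze_svo(tagged, predicate, obj):
    """Return predicate(subject, obj) if the tagged clause expresses it, else None."""
    m = SVO.search(tagged)
    if m is None:
        return None
    if VERB_TO_PREDICATE.get(m["verb"].lower()) != predicate or m["obj"] != obj:
        return None
    return (predicate, m["subj"], obj)


clause = "<subj> Samuel Palmisano </subj> <verb> heads </verb> <obj> IBM </obj>"
print(analyze_svo(clause, "CEO", "IBM"))
```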
Three Modes of Operation for TextPro
1. Search for predefined patterns (MUC-style) and translate template into SNARK's logic
   Where does the CEO of IBM live?
2. Search for subject-verb-object relations in processed text that match the predicate-argument structure of SNARK's logical expression
   "Samuel Palmisano is CEO of IBM."
3. Search for passage with highest density of relevant words and entity of right type for answer
   "Samuel Palmisano .... CEO .... IBM."
Use coreference links to get most informative answer
Third Mode
TextSearch(Person, Terms(CEO), Terms(IBM), WSeq)
& Analyze(WSeq, CEO(?x,IBM))
--> CEO(?x,IBM)
"<person> He </person> has recently been rumored to have been appointed Lou Gerstner's successor as <CEOword> CEO </CEOword> of the major computer maker nicknamed <co> Big Blue </co>"
coref: "<person> Samuel Palmisano </person> ...."
Analyze --> CEO(Samuel Palmisano,IBM)
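The third mode combines two steps: pick the passage with the highest density of relevant words, then follow a coreference link to the most informative mention. The sketch below assumes a very simple density measure (query terms found per word); `QUERY_TERMS`, `PASSAGES`, `COREF`, `density`, and `best_answer` are hypothetical, and the second passage is an invented filler sentence.

```python
# Hypothetical term set for the query CEO(?x, IBM).
QUERY_TERMS = {"ceo", "big blue", "ibm"}

# Toy passages, each with one tagged person mention.
PASSAGES = [
    {"person": "He",
     "text": "He has recently been rumored to have been appointed "
             "Lou Gerstner's successor as CEO of the major computer "
             "maker nicknamed Big Blue"},
    {"person": "Samuel Palmisano",
     "text": "Samuel Palmisano spoke on Tuesday."},  # invented filler
]

# Hypothetical coreference links from a mention to the most informative one.
COREF = {"He": "Samuel Palmisano"}


def density(text):
    """Relevant query terms found per word of the passage."""
    low = text.lower()
    hits = sum(1 for term in QUERY_TERMS if term in low)
    return hits / len(text.split())


def best_answer(passages):
    """Pick the densest passage, then resolve its person mention via coreference."""
    best = max(passages, key=lambda p: density(p["text"]))
    return COREF.get(best["person"], best["person"])


print(best_answer(PASSAGES))
```

Without the coreference step the densest passage would yield only "He", which is why the slide singles out coreference links.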
Challenges for IE
Cross-document identification of individuals
  Document 1: Osama bin Laden
  Document 2: bin Laden
  Document 3: Usama bin Laden
Do entities with the same or similar names represent the same individual?
Metonymy
  Text: Beijing approved the UN resolution on Iraq.
  Query involves “China”, not “Beijing”
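Both challenges can be illustrated with small lookup-based heuristics. This is a sketch under stated assumptions, not a solution: `VARIANTS`, `CAPITAL_FOR_COUNTRY`, `name_key`, `may_corefer`, and `resolve_metonymy` are hypothetical, and the name test only proposes candidate matches, leaving open whether two names really denote the same individual.

```python
# Hypothetical spelling-variant and metonymy tables.
VARIANTS = {"usama": "osama"}
CAPITAL_FOR_COUNTRY = {"Beijing": "China"}


def name_key(name):
    """Normalize a name to lowercased tokens with spelling variants unified."""
    return tuple(VARIANTS.get(tok, tok) for tok in name.lower().split())


def may_corefer(name1, name2):
    """True if one normalized name is a suffix of the other (e.g. surname only)."""
    short, long_ = sorted((name_key(name1), name_key(name2)), key=len)
    return long_[len(long_) - len(short):] == short


def resolve_metonymy(term):
    """Map a capital used for its government to the country; else return term."""
    return CAPITAL_FOR_COUNTRY.get(term, term)


print(may_corefer("Osama bin Laden", "bin Laden"))
print(may_corefer("Usama bin Laden", "Osama bin Laden"))
print(resolve_metonymy("Beijing"))
```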
Temporal Reasoning: Structure
Topology of Time: start, end, before, between
Measures of Duration: for an hour, ...
Clock and Calendar: 3:45pm, Wednesday, June 12
Temporal Aggregates: every other Wednesday
Deictic Time: last year, ...
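Several of these layers can be sketched with Python's standard `datetime` types. The code below is an illustration only and is not the DAML-Time ontology: `Interval`, `before`, `inside`, and `every_other` are hypothetical names, and the year 2002 for "Wednesday, June 12" is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Interval:
    start: datetime
    end: datetime

    def duration(self) -> timedelta:
        """Measure of duration for the interval."""
        return self.end - self.start


def before(i: Interval, j: Interval) -> bool:
    """Topology: i ends no later than j starts."""
    return i.end <= j.start


def inside(t: datetime, i: Interval) -> bool:
    """Topology: instant t lies within interval i (closed at both ends)."""
    return i.start <= t <= i.end


def every_other(first: datetime, count: int):
    """Temporal aggregate: 'every other <weekday>' as instants two weeks apart."""
    return [first + timedelta(weeks=2 * k) for k in range(count)]


# Clock and calendar: 3:45pm, Wednesday, June 12 (year 2002 assumed).
meeting = datetime(2002, 6, 12, 15, 45)
window = Interval(datetime(1998, 1, 1), datetime(2001, 12, 31))

print(meeting.strftime("%A"))
print(inside(datetime(1999, 6, 1), window))
print(every_other(meeting, 3))
```

The `window` interval corresponds to a constraint like 1998 ≤ t ≤ 2001 in a question about when a meeting could have happened.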
Temporal Reasoning: Goals
Develop temporal ontology (DAML)
Reason about time in SNARK (AQUAINT, DAML)
Link with Temporal Annotation Language TimeML (AQUAINT)
Answer questions with temporal component (AQUAINT)
Nearly complete
In progress
Convergence
DAML Annotation of Temporal Information on Web (DAML-Time)
Annotation of Temporal Information in Text (TimeML)
Most information on Web is in text
The two annotation schemes should be intertranslatable
TimeML Annotation Scheme (An Abstract View)
[Diagram: example items "2001", "6 mos", "Sept 11", "warning"; labels: clock & calendar intervals & instants, intervals, inclusion, before, durations, instantaneous events]
TimeML Example
The top commander of a Cambodian resistance force said Thursday he has sent a team to recover the remains of a British mine removal expert kidnapped and presumed killed by Khmer Rouge guerrillas two years ago.
[Diagram: temporal graph over the events and times resist, command, said, sent, recover, remove, kidnap, presumed, killed, remain, Thursday, now, 2 years]
Vision for Time
Manual DAML temporal annotation of web resources
Manual temporal annotation of large NL corpus
Programs for automatic temporal annotation of NL text
Automatic DAML temporal annotation of web resources
Spatial and Geographical Reasoning: Structure
Topology of Space: Is Albania a part of Europe?
Dimensionality: How long/big is Chile?
Measures: How large is North Korea?
Orientation and Shape: What direction is Monterey from SF?
Latitude and Longitude: Alexandria Digital Library Gazetteer
Political Divisions: CIA World Fact Book, ...
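An orientation question such as "What direction is Monterey from SF?" reduces to a bearing computation once latitude and longitude are available. The sketch below uses the standard initial great-circle bearing formula; the coordinates are approximate stand-ins for values a gazetteer would supply, and `PLACES`, `initial_bearing`, and `compass_point` are hypothetical names.

```python
import math

# Approximate coordinates in degrees; stand-ins for gazetteer values.
PLACES = {
    "San Francisco": (37.77, -122.42),
    "Monterey": (36.60, -121.89),
}


def initial_bearing(origin, target):
    """Initial great-circle bearing from origin to target, in degrees from north."""
    lat1, lon1 = map(math.radians, PLACES[origin])
    lat2, lon2 = map(math.radians, PLACES[target])
    dlon = lon2 - lon1
    y = math.sin(dlon) * math.cos(lat2)
    x = (math.cos(lat1) * math.sin(lat2)
         - math.sin(lat1) * math.cos(lat2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360


def compass_point(bearing):
    """Name the nearest of eight compass points for a bearing in degrees."""
    points = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]
    return points[round(bearing / 45) % 8]


bearing = initial_bearing("San Francisco", "Monterey")
print(round(bearing), compass_point(bearing))
```

With these approximate coordinates the bearing comes out a little east of due south.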
Spatial and Geographical Reasoning: Goals
Develop spatial and geographical ontology (DAML)
Reason about space and geography in SNARK (AQUAINT, DAML)
Attach spatial and geographical resources (AQUAINT)
Answer questions with spatial component (AQUAINT)
Some capability now
Status and Future Directions
Basic architecture essentially complete
Good sampling of web and other resources has been incorporated
Focus on bulking up knowledge base relevant to domain (nonproliferation)
Focus on dialogue structure