Ontology-enhanced retrieval (and Ontology-enhanced applications)
Deborah L. McGuinness
Associate Director and Senior Research Scientist
Knowledge Systems Laboratory
Stanford University
Stanford, CA 94305
650-723-9770 [email protected]
(FindUR, CLASSIC, PROSE work supported by AT&T Labs Research, Florham Park, NJ; OntoBuilder work supported by VerticalNet; Chimaera, Ontolingua, JTP supported by DARPA)
One Conceptual Search
Input is in a natural query language (forms, English, ER diagram …)
Query may be transformed (behind the scenes) into a precise query language with defined semantics
Information is at least semi-structured with DL-like markup and also “exists” in more natural formats and is interoperable
Answers returned that are not just the explicit answer to question (but also the implicit answer to question)
Answers return the portion of the content that is of use (not an entire page of content)
Answers may be summarized, abstracted, pruned
“Answers” may be services that can take action
Interface is interactive and helps users reformulate “unsuccessful” queries
Customizable, extensible, …
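The distinction between explicit and implicit answers can be illustrated with a toy class hierarchy. The sketch below is not taken from the talk: the class names, the `SUBCLASS_OF` and `ASSERTED_CLASS` tables, and the `instances_of` helper are all hypothetical, and a real description-logic system would do far more than walk a single-parent hierarchy.

```python
# Hypothetical toy ontology: each class maps to its direct superclass.
SUBCLASS_OF = {
    "laser_printer": "printer",
    "inkjet_printer": "printer",
    "printer": "office_equipment",
}

# Hypothetical facts: each item is asserted to belong to exactly one class.
ASSERTED_CLASS = {
    "item-1": "laser_printer",
    "item-2": "inkjet_printer",
    "item-3": "printer",
    "item-4": "stapler",
}


def ancestors(cls):
    """Return every superclass of cls by walking the hierarchy upward."""
    found = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        found.append(cls)
    return found


def instances_of(query_class):
    """Split answers into explicit (asserted) and implicit (inferred) ones."""
    explicit, implicit = [], []
    for item, cls in sorted(ASSERTED_CLASS.items()):
        if cls == query_class:
            explicit.append(item)
        elif query_class in ancestors(cls):
            implicit.append(item)
    return explicit, implicit
```

Under these assumptions, asking for `printer` returns the one item asserted to be a printer as the explicit answer, and the laser and inkjet printers as implicit answers that a plain keyword match on the asserted class would miss.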
Today: Rich Information Source for Human Manipulation/Interpretation
[Diagram: Human, Human]
“I know what was input”
Global documents and terms indexed and available for search
Search engine interfaces
Entire documents retrieved according to relevance (instead of answers)
Human input, review, assimilation, integration, action, etc.
Special purpose interfaces required for user friendly applications
The web knows what was input but does little interpretation, manipulation, integration, and action
Information Discovery… but not much more
Human intensive (requiring input reformulation and interpretation)
Display intensive (requiring filtering)
Not interoperable
Not agent-operational
Not adaptive
Limited context
Limited service
Analogous to a new assistant who is thorough yet lacks common sense, context, and adaptability
Future: Rich Information Source for Agent Manipulation/Interpretation
[Diagram: Human, Agent, Agent]
“I know what was meant”
Understand term meaning and user background
Interoperable (can translate between applications)
Programmable (thus agent operational)
Explainable (thus maintains context and can adapt)
Capable of filtering (thus limiting display and human intervention requirements)
Capable of executing services
One Approach… start simple from embedded bases
Recognize the vast amount of information in textual forms…
Enhance “standard” information retrieval by adding some semantics
Use background ontology to do query expansion
Exploit ontology to add some structure to IR search
Move to parametric search
Move to include inference (in e-commerce setting, moving towards interoperable solutions and configuration)
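Query expansion from a background ontology can be sketched in a few lines. This is a minimal illustration, not the FindUR implementation: the `ONTOLOGY` table, its `narrower` and `synonyms` keys, and the generic `AND`/`OR` output syntax are assumptions made for the example and do not correspond to any particular search engine's query language.

```python
# Hypothetical background ontology: term -> narrower terms and synonyms.
ONTOLOGY = {
    "vehicle": {"narrower": ["car", "truck"], "synonyms": []},
    "car": {"narrower": ["sedan"], "synonyms": ["automobile"]},
}


def expand_term(term, ontology):
    """Return the term plus its synonyms and all transitively narrower terms."""
    expanded = []
    stack = [term]
    while stack:
        current = stack.pop()
        if current in expanded:
            continue  # already visited; also guards against cycles
        expanded.append(current)
        entry = ontology.get(current, {})
        stack.extend(entry.get("synonyms", []))
        stack.extend(entry.get("narrower", []))
    return expanded


def expand_query(query, ontology):
    """Rewrite a keyword query as one OR-group per keyword, joined by AND."""
    groups = []
    for word in query.lower().split():
        alternatives = sorted(expand_term(word, ontology))
        groups.append("(" + " OR ".join(alternatives) + ")")
    return " AND ".join(groups)
```

With this toy ontology, `expand_query("car rental", ONTOLOGY)` produces `(automobile OR car OR sedan) AND (rental)`: words the ontology knows are widened, and unknown words pass through unchanged.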
FindUR Challenges/Benefits
Retrieve documents otherwise missed (recall)
More appropriately organize documents according to relevance (useful for large number of retrievals)
Browsing support (navigation, highlighting)
Simple User Query building and refinement
Full Query Logging and Trace
Facilitate use of advanced search functions without requiring knowledge of a search language
Automatically search the right knowledge sources according to information about the context of the query
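One way to picture the first two benefits together is a ranking in which documents found only through expanded terms are still retrieved, but documents matching the user's own terms sort first. The sketch below is an assumption-laden illustration rather than FindUR's actual scoring: the `rank_documents` helper, the whitespace tokenization, and the 2-to-1 weighting are all hypothetical.

```python
def rank_documents(documents, original_terms, expanded_terms):
    """Order matching documents so hits on the user's own terms rank highest.

    documents maps a document id to its text. Documents with no hit on any
    term are left out of the result.
    """
    original = set(original_terms)
    expansion_only = set(expanded_terms) - original
    scored = []
    for doc_id, text in documents.items():
        words = set(text.lower().split())
        original_hits = len(words & original)
        expansion_hits = len(words & expansion_only)
        if original_hits or expansion_hits:
            # Assumed weighting: a hit on an original term counts double.
            score = 2 * original_hits + expansion_hits
            scored.append((-score, doc_id))
    # Sort by descending score, breaking ties by document id.
    return [doc_id for _, doc_id in sorted(scored)]
```

A document that mentions only "automobile" would be missed by a plain search for "car"; here it is retrieved, which improves recall, while documents that use the user's own wording still appear ahead of it.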
FindUR Architecture
[Architecture diagram; recoverable labels follow]
Layers: Content to Search; Search and Representation Technology; User Interface
Components: Search Engine; Verity Topic Sets; Content (Web Pages, Documents, Databases); Content Classification; Domain Knowledge; Classic Collaborative Topic Building Tool; Query Input; Search Parameters; Results (std. format); Results (domain spec.); Verity SearchScript, Javascript, HTML, CGI
P-CHIP
Research Site
Technical Memorandum
Calendars (Summit 2005, Research)
Yellow Pages (Directory Westfield)
Newspapers (Leader)
AT&T Solutions
Worldnet Customer Care
OntologyBuilder
Configuration
http://www.research.att.com/sw/tools/classic/tm/ijcai-95-with-scenario.html
Ontology Creation and Maintenance Environment Needs
Semi-automatic generation input
Diagnostics/Explanation (Chimaera, CLASSIC, …)
Merging and Difference (Chimaera, Prompt, Ontolingua, …)
Translators/Dumping (Ontolingua, …)
Distributed Multi-User Collaboration (OntologyBuilder, …)
Versioning (OntologyBuilder, …)
Scalability, Reliability, Performance, Availability (Shoe, OntologyBuilder, …)
Security (viewing, updates, abstraction, authoritative sources, …)
Ontology Library systems (Ontolingua, …)
Business needs: internationalization, compatibility with standards (XML, …)
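The merging, difference, and diagnostics needs can be made concrete with a deliberately small model. The sketch below assumes an ontology is just a mapping from each term to its set of parent terms; the `diff_ontologies` and `merge_ontologies` helpers are hypothetical and are not how Chimaera, Prompt, or Ontolingua work internally.

```python
def diff_ontologies(old, new):
    """Compare two versions that map each term to its set of parent terms."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(t for t in set(old) & set(new) if old[t] != new[t])
    return {"added": added, "removed": removed, "changed": changed}


def merge_ontologies(first, second):
    """Union two ontologies and report terms whose parents disagree."""
    merged, conflicts = {}, []
    for term in sorted(set(first) | set(second)):
        parents_a = first.get(term, set())
        parents_b = second.get(term, set())
        merged[term] = parents_a | parents_b
        # Diagnostic: the term exists in both sources with different parents.
        if term in first and term in second and parents_a != parents_b:
            conflicts.append(term)
    return merged, conflicts
```

Even in this reduced form, the merge step has to surface disagreements for a human to review rather than silently resolving them, which is why diagnostics and explanation sit next to merging in the list of needs.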
Conclusion
With background ontologies and the appropriate environments, we can move from simple ontology-enhanced applications to the next generation web
Pointers
FindUR: www.research.att.com/~dlm/findur
OntoBuilder/OntoServer: http://www.ksl.stanford.edu/people/dlm/papers/ontologyBuilderVerticalNet-abstract.html
Deborah McGuinness: www.ksl.stanford.edu/people/dlm
CLASSIC: www.research.att.com/sw/tools/classic
Chimaera: www.ksl.stanford.edu/software/chimaera/