Overview


Page 1: Overview

© 2003 ontoprise GmbH

Ontology-Based Query and Answering in Chemistry: OntoNova @ Project Halo

Jürgen Angele - Ontoprise, Karlsruhe, Germany

Page 2: Overview


Overview

Project

• Scenario / Participants

Technical Approach

• Performance strategies

• Metareasoning / Justifications

Encoding / Knowledge Base

• Encoding Method

• Architecture of the KB

Challenge

• Results

• Question encoding, fidelity

• Failures, brittleness

Page 3: Overview


Background

• A multistage project towards the Development of a Digital Aristotle

• Funded by Vulcan Inc. Seattle

• Phase 1 successfully closed in 2003

• Phase 2 since January 2004

Functions (Halo 1)

• Capturing an extensive set of chemical knowledge

• System passed the "Advanced Placement Test"

• Query is answered and answer is explained

Vulcan Inc: OntoBroker passes Advanced Placement Test

Page 4: Overview


Stage 1: Halo 1

• task: development of a query answering system containing the knowledge of about 80 pages of a chemistry book, in 4 months

• evaluation:
  - sequestration of the system
  - 160 novel questions (AP exam)
  - encoding of the questions
  - chemistry professors graded answers and explanations

• participants: SRI, Cyc Corp, Ontoprise


Page 6: Overview


Implementation: F-Logic

• Knowledge encoding in F-Logic

• Query encoding in F-Logic

• Evaluation in a batch-run by Ontobroker

Page 7: Overview

High Performance Inferencing

[Figure: chart of run times (y-axis "times", 0 to 400000, unit not given) for test cases 1 to 14, comparing OB 3.6 (compiled), XSB, Win Prolog, and SWI Prolog.]

Page 8: Overview

Metareasoning & Answer Justification

[Diagram: two Inference Server instances, each consisting of an inference kernel, an internal database, connectors, built-ins, and compilers for F-Logic, Prolog, RDF, and further languages. The two servers are labelled "knowledge base" and "explanation knowledge base".]

Answer = A

Page 9: Overview


Metareasoning & Answer Justification

Explanations

the products of this reaction are PbI2 and Na because PbI2 precipitates out of the solution

an ionic molecule consisting of cation Pb and anion I is not known to be soluble and is thus guessed to be insoluble

Page 10: Overview


Metareasoning & Answer Justification

Reasoning for generating explanations

• integrating additional knowledge into explanations

• generating abstractions

• avoiding redundancies

• considering context and user profile
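The points above can be illustrated with a minimal Python sketch. It is not Ontobroker's actual mechanism: the proof log format, the rule names, and the templates are all hypothetical. It only shows the idea of turning logged inference steps into readable text while skipping low-level steps (abstraction) and suppressing repeated statements (avoiding redundancies).

```python
# Hypothetical explanation templates, keyed by the name of the rule that fired.
TEMPLATES = {
    "precipitation": (
        "the products of this reaction are {products} "
        "because {solid} precipitates out of the solution"
    ),
    "not_known_soluble": (
        "an ionic molecule consisting of cation {cation} and anion {anion} "
        "is not known to be soluble"
    ),
}


def explain(proof_log):
    """Turn a list of (rule_name, bindings) steps into explanation lines."""
    seen = set()
    lines = []
    for rule_name, bindings in proof_log:
        template = TEMPLATES.get(rule_name)
        if template is None:
            # Abstraction: steps without a template are too low-level to show.
            continue
        text = template.format(**bindings)
        if text in seen:
            # Avoid redundancies: each statement is reported once.
            continue
        seen.add(text)
        lines.append(text)
    return lines


# Hypothetical log of the steps used to derive an answer.
proof_log = [
    ("precipitation", {"products": "PbI2 and Na", "solid": "PbI2"}),
    ("arithmetic", {}),
    ("not_known_soluble", {"cation": "Pb", "anion": "I"}),
    ("not_known_soluble", {"cation": "Pb", "anion": "I"}),
]

explanation = explain(proof_log)
for line in explanation:
    print(line)
```

Keeping the templates separate from the chemistry rules mirrors the split between a knowledge base and an explanation knowledge base shown on the earlier slide.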


Page 12: Overview

Encoding: Modelling procedure

[Timeline diagram, 2002/2003, with the following steps:]

• Model ground operations

• Verify ground operations by example questions from Brown et al.

• 40 test cases

• Verify by syllabus questions

• Add explanation rules

• Refinement of modeling

• Refinement of explanations

• Testing

• Dry run

• Challenge run

Page 13: Overview


Question encoding

A 0.3 M solution of acetic acid has a pH of 2.63. The ionization constant of this acid is

a) 1.8 × 10^-5 b) 7.0 × 10^-4 c) 1.1 × 10^-6 d) 7.8 × 10^-3 e) 1.9 × 10^-6
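As a worked check of this question, here is a minimal Python sketch of the underlying calculation. It is not the F-Logic encoding used in the project; the function name is hypothetical, and it assumes a monoprotic weak acid whose only relevant equilibrium is HA <-> H+ + A-.

```python
def ka_from_ph(concentration_molar: float, ph: float) -> float:
    """Ionization constant of a monoprotic weak acid HA from its pH.

    Assumes [H+] = [A-] = 10**(-pH) and [HA] = c - [H+].
    """
    h = 10.0 ** (-ph)
    return h * h / (concentration_molar - h)


ka = ka_from_ph(0.3, 2.63)
print(f"Ka = {ka:.2e}")
```

The result is close to 1.8 × 10^-5, which corresponds to the first alternative.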

Page 14: Overview


Question encoding – multiple choice strategy

Input facts:

m1:Mixture[hasComponents->>{"HCl","Ba(OH)2"}].
m2:Mixture[hasComponents->>{"HCl","CaCO3"}].
m3:Mixture[hasComponents->>{"HCl","CuSO4"}].
m4:Mixture[hasComponents->>{"HCl","Na3PO4"}].
m5:Mixture[hasComponents->>{"HCl","NaCl"}].

Definition of alternatives:

answer("A") <- exists P P:GaseousReaction[fromMixture->>m1].
answer("B") <- exists P P:GaseousReaction[fromMixture->>m2].
answer("C") <- exists P P:GaseousReaction[fromMixture->>m3].
answer("D") <- exists P P:GaseousReaction[fromMixture->>m4].
answer("E") <- exists P P:GaseousReaction[fromMixture->>m5].

Ask for alternative:

FORALL X <- answer(X).
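The same multiple-choice strategy can be sketched in Python: state each alternative as a fact, define one test per alternative, then ask which alternatives hold. The predicate below is a hypothetical toy stand-in for the knowledge base (it only knows that an acid mixed with a carbonate releases a gas); the real system derives GaseousReaction from its chemistry rules.

```python
# Toy stand-in for the KB; both sets are illustrative assumptions.
ACIDS = {"HCl", "H2SO4"}
CARBONATES = {"CaCO3", "Na2CO3"}


def has_gaseous_reaction(components):
    """True if the mixture contains an acid and a carbonate (releases CO2)."""
    comps = set(components)
    return bool(comps & ACIDS) and bool(comps & CARBONATES)


# Input facts: one mixture per answer alternative.
mixtures = {
    "A": ["HCl", "Ba(OH)2"],
    "B": ["HCl", "CaCO3"],
    "C": ["HCl", "CuSO4"],
    "D": ["HCl", "Na3PO4"],
    "E": ["HCl", "NaCl"],
}


def answers(alternatives, predicate):
    """Return the labels of all alternatives for which the predicate holds."""
    return [label for label, comps in alternatives.items() if predicate(comps)]


print(answers(mixtures, has_gaseous_reaction))  # prints ['B']
```

Because every alternative is tested independently, the query can return zero, one, or several labels, which makes inconsistencies in the knowledge base visible.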

Page 15: Overview


Question encoding – Detailed Answer Section

MC48

1.0 L of a buffer formed by mixing 0.25 moles of ammonia solution with 0.25 moles of ammonium nitrate has a pH of (For ammonia, Kb = 1.8 × 10^-5)

m1:BufferSolution[hasComponents->>{ammonia,ammonium_nitrate}; hasMole@(ammonium_nitrate)->0.25; hasMole@(ammonia)->0.25; hasVolume->1.0].

FORALL Ph <- m1[hasPHValue->Ph].
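For reference, the value this query should return can be worked out by hand with the Henderson-Hasselbalch relation for a base buffer. This is a sketch of the textbook calculation, not of the knowledge base's rules, and it assumes 25 °C so that pH + pOH = 14.00.

```latex
\begin{align*}
\mathrm{pOH} &= \mathrm{p}K_b + \log_{10}\frac{[\mathrm{NH_4^+}]}{[\mathrm{NH_3}]}
  = -\log_{10}\!\left(1.8\times10^{-5}\right) + \log_{10}\frac{0.25}{0.25}
  \approx 4.74 \\
\mathrm{pH} &= 14.00 - \mathrm{pOH} \approx 9.26
\end{align*}
```

Since both components are present in equal amounts, the logarithmic term vanishes and the pOH equals pKb.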

Page 16: Overview


Encoding - Basic Chemical Operations

• Classify compound as ionic (aequous.flo, utils.flo)

• Balance chemical equation (balancing.flo)

• Determine solubility (acidbase.flo)

• Determine equilibrium expression

• Rank strength of metal ions as Lewis acids

• Determine acid/base conjugate

• Determine strengths of acids/bases

• Determine products of reaction and type of reaction

• Calculate pH (ph.flo)

• Determine position of equilibrium

• Describing substances

• Naming

• ...

Page 17: Overview

Architecture of KB

[Diagram: layered knowledge base. An ontology with instances (basic facts like elements, ...) provides ontological access to the basic chemical operations (acid order, calculate pH value, equilibrium, solubility, balancing reactions, ...), which are implemented using predicates like MPhKa and rules.]

Page 18: Overview


Summary Architecture KB

• chemical operations: independent knowledge chunks
  - collaborative development of KB
  - reduce complexity
  - reduce testing effort

• OO wrapper: ontological access
  - eases access
  - closer to NL


Page 20: Overview

Results

[Figure: Challenge Answer Scores. Bar chart of scores (%), axis from 0 to 60, given by the graders SME1, SME2, and SME3 to Cycorp, Ontoprise, and SRI.]

Page 21: Overview

Results

[Figure: Challenge Justification Scores. Bar chart of scores (%), axis from 0 to 45, given by the graders SME1, SME2, and SME3 to Cycorp, Ontoprise, and SRI.]

Page 22: Overview


Performance

Team End-To-End Challenge Run Times

Team        Sequestered   Improved
Cycorp      > 12 hours    > 27 hours
Ontoprise   2 hours       9 minutes
SRI         5 hours       38 minutes

Page 23: Overview


Brittleness Classification

(MOD) Knowledge Modeling

(IMP) Knowledge Implementation/Modeling Language

(INF) Inference and Reasoning

(KFL) Knowledge Formation and Learning

(SCL) Scalability

(MGT) Knowledge Management

(QMN) Query Management

(ANJ) Answer Justification

(QMT) Quality Metrics

(MTA) Meta Capabilities

Page 24: Overview


Question encoding - fidelity

MC12

When methane, CH4, gas reacts with oxygen, the following changes occur

burn("CH4").

reacts with oxygen = burn

Page 25: Overview


Brittleness: unexpected question type

DA18

Ascorbic acid, H2C6H6O6, is a diprotic acid with a Ka1 value of 8.9 × 10^-5. The pH of a 0.125 M solution of ascorbic acid is 2.48 and the concentration of C6H6O6^2- is 1.6 × 10^-12 M.

Determine the value of Ka2.

The basic operation was modeled to determine the pH value given the Ka value, and not vice versa.
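To make the missing inverse direction concrete, the following Python sketch works DA18 backwards from the measured quantities to Ka2. It is a hand calculation under stated assumptions, not part of the knowledge base: the function name is hypothetical, the second dissociation is assumed not to change [H+] noticeably, and [HA-] is obtained from Ka1 and the mass balance C = [H2A] + [HA-].

```python
def ka2_from_measurements(total_conc, ph, ka1, dianion_conc):
    """Second ionization constant of a diprotic acid H2A.

    Ka2 = [H+][A2-] / [HA-], with [HA-] = Ka1 * C / ([H+] + Ka1)
    derived from Ka1 = [H+][HA-] / [H2A] and C = [H2A] + [HA-].
    """
    h = 10.0 ** (-ph)
    ha_minus = ka1 * total_conc / (h + ka1)
    return h * dianion_conc / ha_minus


ka2 = ka2_from_measurements(
    total_conc=0.125, ph=2.48, ka1=8.9e-5, dianion_conc=1.6e-12
)
print(f"Ka2 = {ka2:.1e}")
```

The result is about 1.6 × 10^-12, essentially the given concentration of the doubly deprotonated ion, because [HA-] and [H+] are nearly equal in such a solution.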

Page 26: Overview


Results from Halo 1

• controlled experiment

• brittleness classification

• bottleneck: knowledge acquisition!

10,000 $ per page (5,000 $ per page OP)

Page 27: Overview


Next Steps: Halo 2

Development of tools for domain experts to capture knowledge and thus reduce the knowledge acquisition bottleneck

Page 28: Overview


Next Steps

Scenario: Term acquisition


Page 30: Overview

[Screenshot: term acquisition dialog for the term "weak acid"]

From your choice of answers it looks as if "weak acid" is a concept.

Hints are available about where to add the term to the ontology.

• Do you have typical examples for "weak acid" in this context?

• Can you give specializations for "weak acid" in this context?

• Does "weak acid" refer to a set of elements?

(Buttons: "Give me the hints", "Finish")

Page 31: Overview


Next Steps

Scenario: Creating rules

Page 32: Overview

Formula Editor

[Screenshot: Formula Editor with the panes "Rule NL" and "Formula", showing the formula variables X, HX, Ka, and H.]

Page 33: Overview

Formula Editor

[Screenshot: the graph pane shows the concepts WeakAcid, Salt, and BufferSolution with the relations hasAcid, hasSalt, isSaltOf, and hasH-value. The "Rule NL" pane shows the rule being built up step by step:]

If BufferSolution hasAcid WeakAcid

If BufferSolution hasAcid WeakAcid and BufferSolution hasSalt Salt

If BufferSolution hasAcid WeakAcid and BufferSolution hasSalt Salt and Salt isSaltOf the WeakAcid

Page 34: Overview

Formula Editor

[Screenshot: the rule is extended step by step in the "Rule NL" pane. Variable bindings: Ka = hasKa, X = hasMoleSalt, HX = hasMoleAcid, H = hasHvalue. The completed rule reads:]

If BufferSolution hasAcid WeakAcid and BufferSolution hasSalt Salt and Salt isSaltOf the WeakAcid and the Attribute moleSalt of BufferSolution has value [X] and the Attribute moleAcid of BufferSolution has value [HX] and the Attribute hasKa of WeakAcid has value [Ka] then the Attribute hasHValue of BufferSolution is computed to [H] according to the formula [H]=Ka*HX/X.
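The rule assembled in the Formula Editor, [H] = Ka*HX/X, can be sketched as ordinary code. The Python below is only an illustration of the rule's structure (a precondition over three linked objects, then a formula over their attributes); the class and field names are hypothetical renderings of the concepts on the slide, and the example values are assumptions.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class WeakAcid:
    name: str
    ka: float  # corresponds to hasKa


@dataclass
class Salt:
    name: str
    salt_of: WeakAcid  # corresponds to isSaltOf


@dataclass
class BufferSolution:
    acid: WeakAcid     # hasAcid
    salt: Salt         # hasSalt
    mole_acid: float   # HX
    mole_salt: float   # X


def h_value(buffer: BufferSolution) -> Optional[float]:
    """Apply the rule: if the salt is the salt of the buffer's weak acid,
    compute [H] = Ka * HX / X; otherwise the rule does not fire."""
    if buffer.salt.salt_of is not buffer.acid:
        return None
    return buffer.acid.ka * buffer.mole_acid / buffer.mole_salt


# Assumed example: acetic acid / sodium acetate, 0.10 mol each.
acetic = WeakAcid("acetic acid", ka=1.8e-5)
acetate = Salt("sodium acetate", salt_of=acetic)
h = h_value(BufferSolution(acetic, acetate, mole_acid=0.10, mole_salt=0.10))
print(f"[H] = {h:.1e}, pH = {-math.log10(h):.2f}")
```

With equal amounts of acid and salt the hydrogen ion concentration equals Ka, which is a convenient sanity check for a rule entered by a domain expert.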

Page 35: Overview


Next Steps

Scenario: Diagram acquisition

Page 36: Overview


Knowledge Formulation: Diagrams

Example 4.1: A Traffic Light at Rest

A traffic light weighing 100 N hangs from a vertical cable tied to two other cables that are fastened to a support, as in Figure 4.11a. The upper cables make angles of 37.0˚ and 53.0˚ with the horizontal. Find the tension in each of the three cables.

The user decides that the diagram is an integral part of the text, and selects it.

Input diagram

The system then opens a new perspective on the diagram and text.

Page 37: Overview


Knowledge Formulation: Diagrams

Based on the text accompanying the diagram, a set of potential glyphs is shown to the user.

A traffic light weighing 100 N hangs from a vertical cable tied to two other cables that are fastened to a support, as in Figure 4.11a. The upper cables make angles of 37.0˚ and 53.0˚ with the horizontal.

Page 38: Overview


Representing the Initial Scenario

• Glyphs are then overlaid on the existing diagram (as for Question Formulation).

• As glyphs are added, they may also be linked to the descriptive text.

• Dimension values (e.g., weight) and labels can be added by right-clicking on the glyphs and selecting items from a popup menu.

A traffic light weighing 100 N hangs from a vertical cable tied to two other cables that are fastened to a support, as in Figure 4.11a. The upper cables make angles of 37.0˚ and 53.0˚ with the horizontal. Find the tension in each of the three cables.

Set mass

Set weight
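For orientation, this is the calculation that the captured diagram knowledge would eventually have to support. The Python sketch below solves the static equilibrium of the knot where the three cables meet; it is a plain textbook solution with a hypothetical function name, not output of the Halo-2 tools.

```python
import math


def cable_tensions(weight, angle1_deg, angle2_deg):
    """Tensions (T1, T2, T3) for a weight hanging from two angled cables.

    Equilibrium of the knot:
      horizontal: T1 * cos(a1) = T2 * cos(a2)
      vertical:   T1 * sin(a1) + T2 * sin(a2) = W
    The vertical cable carries the full weight: T3 = W.
    """
    a1 = math.radians(angle1_deg)
    a2 = math.radians(angle2_deg)
    denom = math.sin(a1 + a2)
    t1 = weight * math.cos(a2) / denom
    t2 = weight * math.cos(a1) / denom
    return t1, t2, weight


t1, t2, t3 = cable_tensions(100.0, 37.0, 53.0)
print(f"T1 = {t1:.1f} N, T2 = {t2:.1f} N, T3 = {t3:.1f} N")
```

The cable at 37.0˚ carries about 60 N and the cable at 53.0˚ about 80 N, while the vertical cable carries the full 100 N.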

Page 39: Overview


2003 2004 2005 2006 2007

RESEARCH COMMERCE

Halo-2 will boost the SemanticWeb

SemanticWeb Pilot Applications

First W3C Standards

Increasing SemanticWeb Resources

Slow maturity process of SemanticWeb applications

Boosted SemanticWeb applications

Scientific SemanticWeb based on Halo-2 technology

Distributed knowledge

Impact of the Digital Aristotle

KillerApps: SemanticWeb
- Editor (from "Frontpage" DarkMatterStudio)
- Browser (from "IE" DarkMatterQueryInterface)

Page 40: Overview


The Team

• CMU: natural language understanding

• U Brighton: intelligent querying with natural language

• Georgia Tech: understanding diagrams and pictures

• DFKI: usability & intelligent interfaces

• Ontoprise: reasoning, integration, semantic web

Page 41: Overview


Technology: Technology Leader (Gartner Group, Forrester Research)

Vision: SemanticWeb

Founded: 1999 (spin-off of Univ. Karlsruhe)

Team: 30 employees

Context: "Semantic Europe" (~ 100 R&D)
- AIFB Karlsruhe
- FZI, Karlsruhe
- DERI Galway, Ireland
- DERI Innsbruck, Austria

Page 42: Overview


Thank you!

Prof. Dr. Jürgen Angele

+49 (0)721 509 809 0