March 2000 Gio XIT 1 Increasing the Precision of Semantic Interoperation Gio Wiederhold Stanford...

21
March 2000 Gio XIT 1 Increasing the Precision of Semantic Interoperation Gio Wiederhold Stanford University March 2000 report: www-db.stanford.edu/pub/gio/1999/miti.htm Stanford Computer Forum Supported by AFOSR- New World Vistas Program annink, Shrish Agarwal, Prasenjit Mitra, Stefan Dec
  • date post

    22-Dec-2015
  • Category

    Documents

  • view

    219
  • download

    0

Transcript of March 2000 Gio XIT 1 Increasing the Precision of Semantic Interoperation Gio Wiederhold Stanford...

March 2000 Gio XIT 1

Increasing the Precision ofSemantic Interoperation

Gio Wiederhold

Stanford UniversityMarch 2000

report: www-db.stanford.edu/pub/gio/1999/miti.htm

Stanford Computer Forum

Supported by AFOSR- New World Vistas Program

Jan Jannink, Shrish Agarwal, Prasenjit Mitra, Stefan Decker.

March 2000 Gio XIT 2

Heterogeneity among Domains

If interoperation involves distinct

domains mismatch ensues• Autonomy conflicts with consistency,

– Local Needs have Priority,– Outside uses are a Byproduct

Heterogeneity must be addressed• Platform and Operating Systems • Representation and Access Conventions • Naming and Ontology

March 2000 Gio XIT 3

Semantic Mismatches

Information comes from many autonomous sources• Differing viewpoints (by source)

– differing terms for similar items { lorry, truck }

– same terms for dissimilar items trunk(luggage, car)

– differing coverage vehicles (DMV, AIA)

– differing granularity trucks (shipper, manuf.)

– different scope student museum fee, Stanford

• Hinders use of information from disjoint sources – missed linkages loss of information, opportunities– irrelevant linkages overload on user or application

program

• Poor precision when merged

ok for web browsing , poor for business

March 2000 Gio XIT 4

Solutions

Specify and standardize terminology usage: ontology• Globally all interacting sources

– wonderful for users and their programs– long time to achieve, 2 sources (UAL, BA), 3 (+ trucks), 4, … all ? – costly maintenance, since all sources evolve – who has the authority to dictate conformance

• Domain-specific XML DTD assumption– Small, focused, cooperating groups– high quality, some examples - genomics, arthritis, shakespeare plays

– allows sharable, formal tools – ongoing, local maintenance affecting users - annual updates

– poor interoperation, users still face inter-domain mismatches

• solves only part of the problem

March 2000 Gio XIT 5

Domains and Consistency .

• a domain will contain many objects• the object configuration is consistent• within a domain all terms are consistent &• relationships among objects are consistent

• context is implicit

No committee is needed to forge compromises * within a domain

Compromises hide valuable details

Domain Ontology

March 2000 Gio XIT 6

Objective Scalable Knowledge Composition

Provide for Maintainable Ontologies

• devolve maintenance onto many domain-specific experts / authorities

• provide an algebra to compute composed ontologies that are limited to their articulation terms

• enable interpretation within the source contexts

SKC

March 2000 Gio XIT 7

Sample Operation: INTERSECTION

Source Domain 1:Owned and maintained by Store

Result contains shared terms,useful for purchasing

Source Domain 2:Owned and maintainedby Factory

Articulation

March 2000 Gio XIT 8

Tools to create articulations

Graph matcherforArticulation- creatingExpert

Vehicle ontology

Transport ontology

Suggestionsfor articulations

March 2000 Gio XIT 9

continue from initial pointAlso suggest similar terms for further articulation:

• by spelling similarity,• by graph position• by term match repository

Expert response:1. Okay2. False3. Irrelevant to this articulation

All results are recorded

Okay ’s are converted into articulation rules

March 2000 Gio XIT 10

Candidate Match RepositoryTerm linkages automatically extracted from 1912 Webster’s dictionary *

* free, other sources . being processed.

Based on processing headwords definitions using algebra primitives

Notice presence of 2 domains: chemistry, transport

March 2000 Gio XIT 11

Using the Match Repository

March 2000 Gio XIT 12

Using the Match Repository

March 2000 Gio XIT 13

An Ontology Algebra

A knowledge-based algebra for ontologies

The Articulation Ontology (AO) consists of matching rules that link domain ontologies

Intersection create a subset ontology keep sharable entries

Union create a joint ontology merge entries

Difference create a distinct ontology remove shared entries

March 2000 Gio XIT 14

INTERSECTION support

Store Ontology

Articulation ontology

Matching rules that use terms from the 2 source domains

Factory Ontology

Terms usefulfor purchasing

March 2000 Gio XIT 15

Other Basic Operations

typically priorintersections

UNION: mergingentire ontologies

DIFFERENCE: materialfully under local control

Arti-culation ontology

March 2000 Gio XIT 16

Features of an algebra

Operations can be composed

Operations can be rearranged

Alternate arrangements can be evaluated

Optimization is enabled

The record of past operations can be

kept and reused when sources change

March 2000 Gio XIT 17

Knowledge CompositionArticulationknowledgefor U

U

U

(A B)U

(B C)U

(C E)

Knowledge resource

B

Knowledge resource

A

Knowledge resource

C Knowledge

resourceD

U

(C D)

U

(B C)

Articulationknowledge

Composed knowledge forapplications using A,B,C,E

Knowledge resource

E

U

(C E)

Legend:

U : unionU: intersection

Articulationknowledgefor (A B)

U

March 2000 Gio XIT 18

Primitive Operations

Unary• Summarize -- abstract • Glossarize - list terms

• Filter - reduce instances

• Extract - move into context

Binary • Match - data corrobaration

• Difference - distance measure

• Intersect - use of articulation

• Union - search broadening

Constructors• create object• create setConnectors• match object• match setEditors• insert value• edit value• move value• delete valueConverters• object - value• object indirection• reference indirection

Model and Instance

March 2000 Gio XIT 19

Exploiting the result .

Processing & query evaluation is best performed withinSource Domains & by their engines

Result has linksto source

Avoid n2 problem of interpretermapping [Swartout HPKB year 1]

March 2000 Gio XIT 20

Domain Specialization .• Knowledge Acquisition (20% effort) &• Knowledge Maintenance (80% effort *)

to be performed• Domain specialists• Professional organizations• Field teams

of modest size

Empowermentautomouslymaintainable

* based on experience with software

March 2000 Gio XIT 21

Summary

To sustain the trend 1. The value of the results has to keep increasing

precision, relevance not volume2. Value is provided by experts,

encoded as models of diverse resources, customersProblems to be addressed mismatches quality temporal extensions maintenance

} Clear models

Thanks to Jan Jannink, Shrish Agarwal, Prasenjit Mitra, Stefan Decker.