OWL-based Semantic Conflicts Detection and Resolution for Data Interoperability

Changqing Li, Tok Wang Ling

Department of Computer Science, School of Computing

National University of Singapore

Outline

Introduction

Preliminary and motivation

OWL-based Semantic Conflicts Detection and Resolution

Conclusion

Q & A

Introduction

Data interoperability and integration is a long-standing challenge to the database research community.

An ontology provides shared knowledge among different data sources.

It clarifies the semantics of information.

It provides a way to solve the interoperability problem in database integration.

Introduction (Cont.)

OWL is being promoted as a standard web ontology language.

In the future, a considerable number of ontologies will be created based on OWL.

Therefore, automatically detecting semantic conflicts based on OWL will speed up the achievement of semantic interoperability and greatly reduce the manual work needed to detect semantic conflicts.

Ontology Definition

An ontology defines the basic terms and relations comprising the vocabulary of a topic area, as well as the rules for combining terms and relations to define extensions to the vocabulary [1].

1. Robert Neches, Richard Fikes, Timothy W. Finin, Thomas R. Gruber, Ramesh Patil, Ted E. Senator, William R. Swartout: Enabling Technology for Knowledge Sharing. AI Magazine 12(3): pp36-56 (1991)

Ontology Language

SHOE

RDF

RDFS

DAML+OIL

OWL

SHOE

The Simple HTML Ontology Extensions (SHOE) [2] extend HTML with machine-readable knowledge annotations.

2. Sean Luke and Jeff Heflin: SHOE Specification 1.01. http://www.cs.umd.edu/projects/plus/SHOE/spec.html

RDF

Resource Description Framework (RDF) [3] is a W3C recommendation for the Semantic Web [4].

It defines a simple model to describe relationships among resources in terms of properties and values.

SVO form (Subject-Verb-Object): Resource-Property-Value

3. Ora Lassila and Ralph R. Swick: Resource description framework (RDF). http://www.w3c.org/TR/WD-rdf-syntax

4. The SemanticWeb Homepage. http://www.semanticweb.org

RDF (Cont.)

<ResourceA>
  <propertyA>
    <ResourceB>
      <propertyB>
        <ResourceC>
          <propertyC>ValueC</propertyC>
        </ResourceC>
      </propertyB>
    </ResourceB>
  </propertyA>
</ResourceA>

Value of propertyB: the nested ResourceC element

Value of propertyA: the nested ResourceB element

RDFS

RDF Schema (RDFS) [5] is the primitive description language of RDF.

It provides some basic primitives:

subClassOf
subPropertyOf
…

5. Dan Brickley and R.V. Guha. Resource Description Framework (RDF) Schema Specification 1.0, W3C Candidate Recommendation 27 March 2000. http://www.w3.org/TR/rdf-schema/

DAML+OIL

DARPA Agent Markup Language (DAML) [6]

Facilitates machine understanding of semantic concepts and relationships.

Ontology Inference Layer (OIL) [7]

Extends RDFS with additional language primitives not yet present in RDFS.

DAML+OIL [8] is the successor of RDFS.

It is a combination of DAML and OIL.

More semantically rich primitives are defined.

6. The DARPA Agent Markup Language Homepage. http://daml.semanticweb.org/

7. The Ontology Inference Layer OIL Homepage. http://www.ontoknowledge.org/oil/TR/oil.long.html

8. DAML+OIL Definition. http://www.daml.org/2001/03/daml+oil

OWL

DAML+OIL is evolving into OWL (Web Ontology Language) [9].

OWL is almost the same as DAML+OIL

Some primitives of DAML+OIL are renamed in OWL for easier understanding.

e.g., “sameClassAs” is changed to “equivalentClass” …

9. Frank van Harmelen, Jim Hendler, Ian Horrocks, Deborah L. McGuinness, Peter F. Patel-Schneider and Lynn Andrea Stein. OWL Web Ontology Language Reference. http://www.w3.org/TR/owl-ref/

Primitives of OWL

“owl” before “:” is the namespace prefix.

owl:equivalentClass
owl:equivalentProperty
owl:sameIndividualAs
owl:disjointWith
owl:differentFrom
…

Our Extension of OWL (EOWL)

We extend OWL with the following primitives:

eowl:orderingProperty
eowl:overlap
eowl:properSubClassOf
eowl:properSubPropertyOf
…

OWL-based Semantic Conflicts Cases

A. Name conflicts
B. Order sensitive conflicts
C. Scaling conflicts
D. Whole and part conflicts
E. Partial similarity conflicts
F. Swap conflicts

A. Name conflicts

Example A. Consider two distributed data warehouses:

one is used to analyze the United States market (country, state, city and district),

and the other is used to analyze the China market (country, province, city and county).

Based on the context, “Province” is defined as equivalent to “State” using the OWL primitive “owl:equivalentClass”.

To resolve this conflict, one name needs to be changed to the referenced name.

A. Name conflicts (Cont.)

<owl:Class rdf:ID="Province">
  <rdfs:label>Province</rdfs:label>
  <owl:equivalentClass rdf:resource="#State"/>
</owl:Class>

Fig. A. Detection of synonym conflicts

“owl:equivalentClass” is the indicator used to detect synonym conflicts.

Change the name to “State”, since that is the name referenced in the ontology definition.

A. Name conflicts (Cont.)

Case A. Synonyms. The OWL primitives “owl:equivalentClass”, “owl:equivalentProperty” and “owl:sameIndividualAs” are indicators to detect this case.

Conflict Resolution Rule A. If synonym conflicts are detected, different attribute names with the same semantics need to be translated to the same name (referenced name) for smooth data interoperability.
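
Case A and Rule A could be sketched in Python roughly as follows. This is only an illustration, not the authors' implementation: it assumes the fragment from Fig. A is wrapped in an rdf:RDF root that declares the standard RDF, RDFS and OWL namespaces, and the helper names find_synonyms and rename_attributes, as well as the sample attribute list, are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"

# Fig. A wrapped in an rdf:RDF root (the wrapper is an assumption).
ONTOLOGY = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:owl="{OWL}"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <owl:Class rdf:ID="Province">
    <rdfs:label>Province</rdfs:label>
    <owl:equivalentClass rdf:resource="#State"/>
  </owl:Class>
</rdf:RDF>"""


def find_synonyms(ontology_xml):
    """Map each class name to the referenced name it is declared equivalent to."""
    root = ET.fromstring(ontology_xml)
    synonyms = {}
    for cls in root.iter(f"{{{OWL}}}Class"):
        name = cls.get(f"{{{RDF}}}ID")
        for eq in cls.findall(f"{{{OWL}}}equivalentClass"):
            # owl:equivalentClass is the indicator of a synonym conflict.
            synonyms[name] = eq.get(f"{{{RDF}}}resource").lstrip("#")
    return synonyms


def rename_attributes(attributes, synonyms):
    """Rule A: translate synonymous names to the referenced name."""
    return [synonyms.get(attr, attr) for attr in attributes]


synonyms = find_synonyms(ONTOLOGY)
china_schema = ["Country", "Province", "City", "County"]
resolved = rename_attributes(china_schema, synonyms)
```

Here resolved carries “State” in place of “Province”, while the other names are left unchanged because no equivalence is declared for them.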

B. Order sensitive conflicts

Example B. Consider the highest three scores of a course.

The highest three scores of course A are listed as “90, 95, 100” in ascending order.

The highest three scores of course B are listed as “98, 95, 93” in descending order.

“highestThreeScores” is defined as an “eowl:orderingProperty” in the ontology.

The sequences of the highest three scores for courses A and B should both be adjusted to ascending order or both to descending order.

By default, adjust to the sequence of the first one, e.g. the sequence of course A.

B. Order sensitive conflicts (Cont.)

Fig. B. Detection of order sensitive conflicts

<eowl:orderingProperty rdf:ID="highestThreeScores">
  <rdfs:label>highest three scores of a course</rdfs:label>
  <rdfs:domain rdf:resource="#Course"/>
  <rdfs:range rdf:resource="xsd#integer"/>
</eowl:orderingProperty>

We can further define ascending or descending order for more precise semantics.

B. Order sensitive conflicts (Cont.)

Case B. Order sensitive. The EOWL primitive “eowl:orderingProperty” and the RDF primitive “rdf:Seq” are indicators to detect this case.

Conflict Resolution Rule B. If order sensitive conflicts are detected, we need to adjust the member sequence according to the same criterion for smooth data interoperability, the sequence of the first one by default.
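
One possible reading of Rule B, using the scores from Example B, is sketched below in Python. The function names detect_order and align_order are hypothetical, and the sketch assumes the values of an ordering property are plain numbers that can simply be re-sorted.

```python
def detect_order(values):
    """Return 'asc', 'desc', or None when the sequence is not monotonic."""
    pairs = list(zip(values, values[1:]))
    if all(a <= b for a, b in pairs):
        return "asc"
    if all(a >= b for a, b in pairs):
        return "desc"
    return None


def align_order(reference, other):
    """Rule B: reorder `other` to follow the ordering used by `reference`."""
    order = detect_order(reference)
    if order is None:
        raise ValueError("reference sequence has no consistent order")
    return sorted(other, reverse=(order == "desc"))


course_a = [90, 95, 100]   # ascending, taken as the default sequence
course_b = [98, 95, 93]    # descending
aligned_b = align_order(course_a, course_b)
```

Taking course A as the reference follows the slide's default of adjusting to the sequence of the first source.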

C. Scaling conflicts

Example C. Consider two database schemas:

Product(ID, Price)
Product(ID, Price)

One price may refer to US dollars, while the other may refer to Singapore dollars.

Figure 4 shows some concepts about a currency ontology; “price” is defined

Translate the prices to refer to the same currency unit, by default the unit of the first one.

C. Scaling conflicts (Cont.)

Fig. C. Detection of scaling conflicts

<owl:DatatypeProperty rdf:ID="price">
  <rdfs:domain rdf:resource="#Product"/>
  <rdfs:range rdf:parseType="Resource">
    <rdf:value/>
    <currency:CurrencyUnit/>
  </rdfs:range>
</owl:DatatypeProperty>

C. Scaling conflicts (Cont.)

Case C. Semantic conflicts may exist if the value of a data type property comprises both value and unit (Scaling). The RDF primitive “rdf:parseType="Resource"” and the OWL primitive “owl:DatatypeProperty” are indicators for this case.

Conflict Resolution Rule C. If scaling conflicts are detected, the value should be translated to refer to the same unit for smooth data interoperability. The first unit by default.
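
Rule C might look like the following Python sketch for the two Product schemas of Example C. The exchange rate is a made-up placeholder for illustration only, and the names RATES_TO_USD, convert and unify_units, along with the sample rows, are assumptions that do not come from the slides.

```python
from decimal import Decimal

# Hypothetical conversion factors into USD, for illustration only.
RATES_TO_USD = {"USD": Decimal("1"), "SGD": Decimal("0.75")}


def convert(amount, unit, target_unit, rates=RATES_TO_USD):
    """Convert an amount from `unit` to `target_unit` via USD."""
    in_usd = Decimal(str(amount)) * rates[unit]
    return in_usd / rates[target_unit]


def unify_units(first, second):
    """Rule C: translate the prices of `second` into the unit of `first`."""
    target = first["unit"]
    return [
        {"ID": row["ID"], "Price": convert(row["Price"], second["unit"], target)}
        for row in second["rows"]
    ]


source_1 = {"unit": "USD", "rows": [{"ID": 1, "Price": Decimal("10")}]}
source_2 = {"unit": "SGD", "rows": [{"ID": 2, "Price": Decimal("20")}]}
unified = unify_units(source_1, source_2)
```

Decimal is used instead of float so that currency amounts are not distorted by binary rounding.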

D. Whole and part conflicts

Example D. Consider the schemas:

Person(ID, name)
Person(ID, surname, givenName)

“surname” and “givenName” are both defined as proper sub-properties of “name” using “eowl:properSubPropertyOf”.

“eowl:properSubClassOf” has clearer semantics than “rdfs:subClassOf” because “rdfs:subClassOf” is ambiguous between two meanings: “eowl:properSubClassOf” and “owl:equivalentClass”.

Divide the whole attribute “name” into the part attributes “surname” and “givenName”,

or combine the part attributes “surname” and “givenName” in the correct sequence to form the whole attribute “name”.

D. Whole and part conflicts (Cont.)

Fig. D1. Detection of whole and part conflicts

<rdf:Property rdf:ID="surname">
  <eowl:properSubPropertyOf rdf:resource="#name"/>
</rdf:Property>

Fig. D2. Detection of whole and part conflicts

<rdf:Property rdf:ID="givenName">
  <eowl:properSubPropertyOf rdf:resource="#name"/>
</rdf:Property>

D. Whole and part conflicts (Cont.)

Case D. Semantic conflicts may exist if one concept is completely contained in another concept (Whole and part). The EOWL primitives “eowl:properSubClassOf” and “eowl:properSubPropertyOf” are indicators to detect this case.

Conflict Resolution Rule D. If whole and part conflicts are detected, the whole attributes should be divided into part attributes or the part attributes should be combined together to whole attributes for smooth data interoperability.
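
As a rough Python illustration of Rule D for the Person schemas in Example D, the sketch below divides “name” into its parts and combines them again. It assumes, purely for illustration, that a name is written as the surname followed by a space and the given name; the slides do not fix a sequence or separator. The function names and the sample person are hypothetical.

```python
def split_name(name):
    """Divide the whole attribute into parts (assumed form: 'surname givenName')."""
    surname, _, given_name = name.partition(" ")
    return {"surname": surname, "givenName": given_name}


def combine_name(surname, given_name):
    """Combine the part attributes, in the assumed sequence, into the whole."""
    return f"{surname} {given_name}".strip()


person_whole = {"ID": 1, "name": "Tan Mei Ling"}          # hypothetical record
parts = split_name(person_whole["name"])
person_parts = {"ID": person_whole["ID"], **parts}
rebuilt = combine_name(person_parts["surname"], person_parts["givenName"])
```

Only the first space is treated as the boundary, so a multi-word given name stays intact; real data would need a convention agreed per source.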

E. Partial similarity conflicts

Example E. Integrating ResearchAssistant and GraduateStudent.

The relationship between research assistant and graduate student is overlap, because some research assistants are also graduate students,

but not all research assistants are graduate students,

and not all graduate students are research assistants.

After integration, there should be three schemas:

RNotG: Research Assistant but not Graduate Student
GNotR: Graduate Student but not Research Assistant
RAndG: both Research Assistant and Graduate Student

E. Partial similarity conflicts (Cont.)

Fig. E. Detection of partial similarity conflicts

<owl:Class rdf:ID="ResearchAssistant">
  <eowl:overlap rdf:resource="#GraduateStudent"/>
</owl:Class>

E. Partial similarity conflicts (Cont.)

Case E. Semantic conflicts may exist if two concepts overlap (Partial similarity). The EOWL primitive “eowl:overlap” is the indicator to detect this case.

Conflict Resolution Rule E. If partial similarity conflicts are detected, the overlap part should be separated before integration.
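
Rule E can be illustrated with plain set operations, as in this Python sketch. It assumes that members of both classes are identified by comparable keys; the person identifiers below are made up, and separate_overlap is a hypothetical helper name.

```python
def separate_overlap(research_assistants, graduate_students):
    """Rule E: split two overlapping classes into three disjoint schemas."""
    ra, gs = set(research_assistants), set(graduate_students)
    return {
        "RNotG": ra - gs,   # research assistants who are not graduate students
        "GNotR": gs - ra,   # graduate students who are not research assistants
        "RAndG": ra & gs,   # members of both classes
    }


research_assistants = ["p1", "p2", "p3"]   # hypothetical identifiers
graduate_students = ["p2", "p3", "p4"]
schemas = separate_overlap(research_assistants, graduate_students)
```

The three resulting sets are pairwise disjoint and together cover every member of either source.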

F. Swap conflicts

Example F. Continued from Example A.

In China, a county is contained in a city (the city has the larger area).

In the US, a city is contained in a county (the county has the larger area).

The domain (“County”) of the property “region:containedIn” in the China ontology is the range of the same property in the US ontology.

The range (“City”) of the property “region:containedIn” in the China ontology is the domain of the same property in the US ontology.

We can add “China.” or “US.” before “City” and “County” for smooth data interoperability.

F. Swap conflicts (Cont.)

Fig. F1. Detection of swap conflicts (the relationship between city and county in the China ontology)

<owl:Class rdf:ID="County">
  <region:containedIn rdf:resource="#City"/>
</owl:Class>

Fig. F2. Detection of swap conflicts (the relationship between city and county in the US ontology)

<owl:Class rdf:ID="City">
  <region:containedIn rdf:resource="#County"/>
</owl:Class>

F. Swap conflicts (Cont.)

Case F. Semantic conflicts may exist if the domain of a property in the first ontology is the range of the same property in the second ontology, and the range of the property in the first ontology is the domain of the same property in the second ontology (Swap).

Conflict Resolution Rule F. If swap conflicts are detected, context restrictions (see Example F) should be added to the schema for smooth data interoperability.
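
The swap check of Case F and the context prefix of Rule F could be sketched in Python as below. The sketch assumes each ontology has already been reduced to a mapping from property name to a set of (domain, range) pairs; that representation and the helper names find_swaps and add_context are hypothetical.

```python
def find_swaps(first, second):
    """Case F: find properties whose domain and range are swapped across ontologies."""
    swaps = []
    for prop, pairs in first.items():
        for domain, range_ in sorted(pairs):
            if (range_, domain) in second.get(prop, set()):
                swaps.append((prop, domain, range_))
    return swaps


def add_context(ontology, context, names):
    """Rule F: prefix the conflicting class names with a context such as 'China.'."""
    return {
        prop: {
            tuple(f"{context}.{c}" if c in names else c for c in pair)
            for pair in pairs
        }
        for prop, pairs in ontology.items()
    }


china = {"region:containedIn": {("County", "City")}}   # Fig. F1
us = {"region:containedIn": {("City", "County")}}      # Fig. F2

swaps = find_swaps(china, us)
conflicting = {name for _, d, r in swaps for name in (d, r)}
china_resolved = add_context(china, "China", conflicting)
us_resolved = add_context(us, "US", conflicting)
```

After the prefixes are added, running find_swaps on the two resolved ontologies reports no swap, since “China.City” and “US.City” are now distinct names.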

Conclusion

We extend OWL with several primitives that have clearer semantics.

We summarize several cases based on OWL in which semantic conflicts are easily encountered.

The conflict resolution rules for each case are presented.

In the future, OWL will be frequently used to build ontologies, and this paper provides a computer-aided approach to detect and resolve semantic conflicts for smooth data interoperability.

References

1. Robert Neches, Richard Fikes, Timothy W. Finin, Thomas R. Gruber, Ramesh Patil, Ted E. Senator, William R. Swartout: Enabling Technology for Knowledge Sharing. AI Magazine 12(3): pp36-56 (1991)

2. Sean Luke and Jeff Heflin: SHOE Specification 1.01. http://www.cs.umd.edu/projects/plus/SHOE/spec.html

3. Ora Lassila and Ralph R. Swick: Resource description framework (RDF). http://www.w3c.org/TR/WD-rdf-syntax

4. The SemanticWeb Homepage. http://www.semanticweb.org

5. Dan Brickley and R.V. Guha. Resource Description Framework (RDF) Schema Specification 1.0, W3C Candidate Recommendation 27 March 2000. http://www.w3.org/TR/rdf-schema/

6. The DARPA Agent Markup Language Homepage. http://daml.semanticweb.org/

7. The Ontology Inference Layer OIL Homepage. http://www.ontoknowledge.org/oil/TR/oil.long.html

8. DAML+OIL Definition. http://www.daml.org/2001/03/daml+oil

9. Frank van Harmelen, Jim Hendler, Ian Horrocks, Deborah L. McGuinness, Peter F. Patel-Schneider and Lynn Andrea Stein. OWL Web Ontology Language Reference. http://www.w3.org/TR/owl-ref/
