TWC Experience in ontology engineering with the Global Change Information System Xiaogang (Marshall) Ma Tetherless World Constellation Rensselaer Polytechnic Institute Presentation for the ESIP Semantic Web Cluster, 4/22/2014

Page 1: TWC Experience in ontology engineering with the Global Change Information System

Xiaogang (Marshall) Ma Tetherless World Constellation

Rensselaer Polytechnic Institute

Presentation for the ESIP Semantic Web Cluster, 4/22/2014

Page 2: Acknowledgements

• Project:
– Global Change Information System: Information Model and Semantic Application Prototypes, funded by NSF through UCAR

• Collaborators:
– Peter Fox (PI, TWC/RPI)
– Curt Tilmes (Co-PI, NASA/USGCRP)
– Xiaogang (Marshall) Ma (Project lead, TWC/RPI)
– Jin Guang Zheng (TWC/RPI)
– Justin Goldstein (USGCRP/UCAR)
– Stephan Zednik (TWC/RPI)
– Linyun Fu (TWC/RPI)
– Brian Duggan (USGCRP/UCAR)
– Steve Aulenbach (USGCRP/UCAR)
– Patrick West (TWC/RPI)

Page 3: Contents

1. Ontologies in computer science
2. The GCIS Ontology
3. Experience from ontology engineering practice
4. Additional operations and tools to refine an ontology

Page 4: 1. Ontologies in computer science

• An ontology spectrum

Figure: an ontology spectrum. Italic text explains typical features of concepts and relationships in each ontology type (from Ma 2011, adapted from Borgo et al., 2005; McGuinness, 2003; Obrst, 2003; Uschold and Gruninger, 2004; Welty, 2002)

Page 5: A few examples following that spectrum

• Catalog/Glossary
– Neuendorf, K.K.E., Mehl, J.J.P., Jackson, J.A., 2005. Glossary of Geology, 5th edition. American Geological Institute: Alexandria, VA, USA, 800 pp. See latest version at: http://www.agiweb.org/pubs/glossary/
• Taxonomy
– BGS Rock Classification Scheme, see: https://www.bgs.ac.uk/bgsrcs/
• Thesaurus
– AQSIQ, 1988. GB/T 9649-1988 The Terminology Classification Codes of Geology and Mineral Resources. General Administration of Quality Supervision, Inspection and Quarantine of P.R. China (AQSIQ). Standards Press of China, Beijing, China. 1937 pp. (In CN&EN)
• Conceptual Schema
– NADM Steering Committee, 2004. NADM Conceptual Model 1.0—A conceptual model for geologic map information: U.S. Geological Survey Open-File Report 2004-1334, North American Geologic Map Data Model (NADM) Steering Committee, Reston, VA, USA, 58 pp. See: http://pubs.usgs.gov/of/2004/1334
• Ontologies encoded in RDF format
– Semantic Web for Earth and Environmental Terminology (SWEET). See: http://sweet.jpl.nasa.gov/

Page 6: Another dimension of ontologies

• Top-level ontologies describe very general concepts like space, time, matter, object, event, action, etc., which are independent of a particular problem or domain
• Domain ontologies and task ontologies describe, respectively, the vocabulary related to a generic domain (e.g., medicine) or a generic task or activity (e.g., diagnosing)
• Application ontologies describe concepts depending both on a particular domain and task, which are often specializations of both the related ontologies

Figure: top-level ontology; domain ontology and task ontology; application ontology, linked by "specialization of" arrows. Ontologies according to their level of dependence on a particular task or point of view (Guarino, 1997)

Page 7: A few examples following that dimension

• Top-level ontology
– DOLCE: Descriptive Ontology for Linguistic and Cognitive Engineering, see: http://www.loa.istc.cnr.it/old/DOLCE.html
• Domain ontologies and task ontologies
– PROV-O: The W3C PROV Ontology (for representing and interchanging provenance information), see: http://www.w3.org/TR/prov-o/
– BIBO: The Bibliographic Ontology, see: http://bibliontology.com/
– ORG: The Organization Ontology, see: http://www.w3.org/TR/vocab-org/
– DCAT: The Data Catalog Vocabulary, see: http://www.w3.org/TR/vocab-dcat/
• Application ontology
– GCIS: The GCIS Ontology, see: http://tw.rpi.edu/web/project/gcis-imsap/GCISOntology

Page 8: A few methods for ontology engineering

• Ontology Design Patterns
– Widely used are Content Ontology Design Patterns: small ontologies that mediate between use cases and ontology design solutions (Gangemi and Presutti, 2009)
• Agile Methods for Software Engineering
– Adaptive planning; evolutionary development; a time-boxed iteration; and rapid and flexible response to change (Cohen et al., 2004)
• Use case-driven iterative approach
– Use cases for identifying questions, resources & methods; small team & mixed skills; a context for collaboration between computer scientists & domain scientists; review & iteration; rapid prototype (Fox and McGuinness, 2008)

Page 9: The use case-driven iterative approach

More details at: http://tw.rpi.edu/web/doc/TWC_SemanticWebMethodology

Page 10: 2. The GCIS Ontology

• Global Change Information System (GCIS)
– An information system under development through the United States Global Change Research Program (USGCRP) that establishes data interfaces and interoperable repositories of climate and global change data which can be easily and efficiently accessed, integrated with other data sets, maintained over time and expanded as needed into the future
• GCIS Ontology
– An application ontology designed for representing and capturing provenance information in GCIS
– Currently focusing on the third National Climate Assessment draft report (draft NCA3)
– More information: http://tw.rpi.edu/web/project/gcis-imsap/GCISOntology

Page 11: Ontology reuse: improve interoperability

• PROV-O: W3C Provenance Ontology
• DCTerms: Dublin Core Metadata Terms
• DCType: Dublin Core Types
• FOAF: Friend Of A Friend Vocabulary
• BIBO: Bibliographic Ontology
• ORG: Organization Ontology
• SKOS: Simple Knowledge Organization System
• OWL: Web Ontology Language
• RDF: Resource Description Framework
• RDFS: RDF Schema
• XSD: XML Schema

Page 12: PROV-O, DCTerms, DCType, FOAF, BIBO, ORG, SKOS, OWL, RDF, RDFS, XSD

@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix dctype: <http://purl.org/dc/dcmitype/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix bibo: <http://purl.org/ontology/bibo/> .
@prefix org: <http://www.w3.org/ns/org#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

Page 13: Ontology engineering: use case analysis

The first use case

• Title: Visit data center website of dataset used to generate a report figure
• Actor and system: a reader of the draft NCA3 on the GCIS website
• Flow of interactions: A reader wishes to identify the source of the data used to produce a particular figure in the draft NCA3. A reference to the paper in which the image contained in this figure was originally published appears in the figure caption. Clicking that reference displays a page of metadata information about the paper, including links to the datasets used in that paper. Pursuing each of those links presents a page of metadata information about the dataset, including a link back to the agency/data center web page describing the dataset in more detail and making the actual data available for order or download.
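The chain of links in this use case (figure, paper, dataset, data center page) can be sketched as a handful of RDF triples. This is only an illustrative sketch, not the GCIS data model: the `ex:` namespace, the resource names and the data center URL are hypothetical, and the choice of prov:wasDerivedFrom and foaf:page to express the links is an assumption made for this example.

```turtle
@prefix prov:    <http://www.w3.org/ns/prov#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix foaf:    <http://xmlns.com/foaf/0.1/> .
# Hypothetical namespace, used only for this illustration
@prefix ex:      <http://example.org/nca3/> .

# A figure in the report, traced back to the paper its image came from
ex:figure-1 a prov:Entity ;
    dcterms:title "A figure in the draft NCA3" ;
    prov:wasDerivedFrom ex:paper-1 .

# The paper, traced back to a dataset used in it
ex:paper-1 a prov:Entity ;
    dcterms:title "Paper in which the image was originally published" ;
    prov:wasDerivedFrom ex:dataset-1 .

# The dataset, with a link to the (hypothetical) data center page
ex:dataset-1 a prov:Entity ;
    dcterms:title "Dataset used in the paper" ;
    foaf:page <http://example.org/data-center/dataset-1> .
```

Following prov:wasDerivedFrom from the figure leads to the paper and then to the dataset, which mirrors the clicks described in the flow of interactions.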

Page 14: Use case analysis: Concept map

• Concept map
– Graphical tool for organizing and representing knowledge (Novak and Cañas, 2008)
– Often used as the first step in information models that are precursors to ontology engineering (Starr and de Oliveira, 2013)

The IHMC CmapTools is widely used for use case analysis in Semantic Web applications, see: http://cmap.ihmc.us/

Page 15: An intuitive concept map of the 1st use case

Page 16: Classes and properties recognized from the use case

An intuitive concept map of the use case

Page 17: Classes and properties recognized from the use case

An intuitive concept map of the use case

From an intuitive model to an ontology:
(1) A defined class or property should be meaningful and robust enough to meet the requirements of various use cases
(2) An ontology can be extended by adding classes and properties recognized from new use cases through the iterative approach

Page 18: The second use case

• Title: Identify roles of people in the generation of a chapter in the draft NCA3
• Actor and system: a viewer of the GCIS website
• Flow of interactions: A viewer sees that Chapter 6 (Agriculture) in the draft NCA3 was written by a group of authors mentioned in a list. On the title page of that chapter the viewer can see the role of each author, e.g., convening lead author, lead author or contributing author, in the generation of this report chapter.
• We decided to use the PROV-O ontology to describe this use case

Page 19: The three Starting Point classes in the PROV-O ontology and the properties that relate them

Source: http://www.w3.org/TR/prov-o/

Page 20: Mapping the use case into PROV-O

Figure: the use case mapped onto the three PROV-O Starting Point classes.
• Chapter 6 in NCA3 isA Entity
• Writing of Chapter 6 in NCA3 isA Activity
• Author of Chapter 6 isA Agent
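A minimal Turtle sketch of such a mapping, using only the PROV-O Starting Point terms. The `ex:` namespace and the instance names are hypothetical and stand in for whatever identifiers GCIS actually uses.

```turtle
@prefix prov: <http://www.w3.org/ns/prov#> .
# Hypothetical namespace, used only for this illustration
@prefix ex:   <http://example.org/nca3/> .

# The chapter is an entity that was generated by a writing activity
ex:chapter-6 a prov:Entity ;
    prov:wasGeneratedBy ex:writing-of-chapter-6 ;
    prov:wasAttributedTo ex:author-a .

# The writing activity is associated with the agent who carried it out
ex:writing-of-chapter-6 a prov:Activity ;
    prov:wasAssociatedWith ex:author-a .

# An author of the chapter is an agent
ex:author-a a prov:Agent .
```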

Page 21: Roles of agents in an activity in PROV-O

Source: http://www.w3.org/TR/prov-o/

Page 22: Mapping roles of chapter authors into PROV-O

Figure: the chapter authors and their roles mapped onto PROV-O.
• Writing of Chapter 6 in NCA3 isA Activity
• Author of Chapter 6 isA Agent
• Convening lead author, Lead author and Contributing author isA Role

Page 23: Roles of people in the activity ‘Writing of Chapter 6’

Here only three of the eight authors of this chapter are shown. Each author had a specific role for this chapter.
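In PROV-O, the role an agent played in an activity is usually attached through a qualified association (prov:qualifiedAssociation, prov:Association, prov:agent, prov:hadRole). The sketch below shows that pattern for three authors. The `ex:` namespace, the author identifiers and the role resources are hypothetical placeholders, not the identifiers used in GCIS.

```turtle
@prefix prov: <http://www.w3.org/ns/prov#> .
# Hypothetical namespace, used only for this illustration
@prefix ex:   <http://example.org/nca3/> .

# The three roles named in the use case
ex:convening-lead-author a prov:Role .
ex:lead-author           a prov:Role .
ex:contributing-author   a prov:Role .

ex:author-a a prov:Agent .
ex:author-b a prov:Agent .
ex:author-c a prov:Agent .

# Each qualified association ties one agent to the role it played
# in this particular activity
ex:writing-of-chapter-6 a prov:Activity ;
    prov:wasAssociatedWith ex:author-a, ex:author-b, ex:author-c ;
    prov:qualifiedAssociation [
        a prov:Association ;
        prov:agent ex:author-a ;
        prov:hadRole ex:convening-lead-author
    ] , [
        a prov:Association ;
        prov:agent ex:author-b ;
        prov:hadRole ex:lead-author
    ] , [
        a prov:Association ;
        prov:agent ex:author-c ;
        prov:hadRole ex:contributing-author
    ] .
```

Because the role hangs on the association rather than on the agent, the same person can hold a different role in the writing of another chapter.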

Page 24: Re-using existing ontologies for the GCIS ontology

Through such mappings we can use reasoners that are suitable for the PROV-O ontology, and thus retrieve provenance graphs from the established GCIS.

Page 25:

• We carried out further use case analyses to build the GCIS ontology

Page 26: 3. Experience from ontology engineering practice

Informal message:

Sometimes, a method is not a method at all.

Page 27: 3. Experience from ontology engineering practice

• For human: A modeling approach
– Transform the knowledge in our brains into a list of concepts and their inter-relationships
– Level of detail: application needs & interoperability
  • think about the ontology spectrum and the dimension of ontologies
• For machine: An encoding approach
– Record the model in a format that can be used by computers in a specific context
  • CSV, UML, XML, RDF/XML, Turtle, N3, etc.

Page 28:

• For human: concept map helps
– Such as those in preceding slides
• For machine: AVOID ontology hijacking
– We should not modify classes/properties that are defined in external ontologies (e.g., those in PROV-O, BIBO, FOAF, ORG, etc.)
• For machine: domain and range of properties
– Be careful about this when reusing properties from external ontologies

Page 29: For machine: avoid ontology hijacking

• For example, we can make such assertions in the GCIS ontology:
gcis:Agent rdfs:subClassOf prov:Agent
gcis:Agent rdfs:subClassOf foaf:Agent

• And we should avoid such assertions in the GCIS ontology:
prov:Agent rdfs:subClassOf foaf:Agent
prov:Agent owl:equivalentClass foaf:Agent

Page 30: For machine: domain and range of properties

• For example, to use prov:wasGeneratedBy between an instance of gcis:Report and an instance of gcis:ReportGeneration
• We should assert that gcis:Report is a subclass of prov:Entity and gcis:ReportGeneration is a subclass of prov:Activity

:wasGeneratedBy a owl:ObjectProperty ;
    rdfs:domain :Entity ;
    rdfs:range :Activity ;
    rdfs:isDefinedBy <http://www.w3.org/ns/prov-o#> ;
    rdfs:subPropertyOf :wasInfluencedBy ;
    …
    :inverse "generated" ;
    :qualifiedForm :Generation, :qualifiedGeneration .

Definition of :wasGeneratedBy in the W3C PROV Ontology
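The two subclass assertions can be written in Turtle as sketched below. The `gcis:` namespace URI shown here is a placeholder, since the slides do not give the real one, and the `ex:` instances are hypothetical.

```turtle
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
# Placeholder URI: the real GCIS namespace is not given in these slides
@prefix gcis: <http://example.org/gcis#> .
# Hypothetical namespace for example instances
@prefix ex:   <http://example.org/nca3/> .

# Align the local classes with the domain and range of prov:wasGeneratedBy
gcis:Report           rdfs:subClassOf prov:Entity .
gcis:ReportGeneration rdfs:subClassOf prov:Activity .

# With the alignment in place, this use of prov:wasGeneratedBy
# agrees with the property's declared domain and range
ex:draft-nca3 a gcis:Report ;
    prov:wasGeneratedBy ex:draft-nca3-generation .

ex:draft-nca3-generation a gcis:ReportGeneration .
```

An RDFS reasoner would infer from the domain and range alone that ex:draft-nca3 is a prov:Entity and ex:draft-nca3-generation is a prov:Activity. Stating the subclass relations explicitly makes that inference an intended part of the model rather than a side effect.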

Page 31:

• After rounds of use case analysis, we had a concept map for the GCIS ontology:
– http://cmapspublic3.ihmc.us/rid=1MCJMLST0-1G0CSWH-2YH4/GCIS_Ontology_v1_2.cmap
• And an RDF file synchronized with the concept map, serialized in Turtle format (.ttl):
– http://escience.rpi.edu/ontology/GCIS-IMSAP/2/GCISOntology_v_1_2.ttl

For more information about the Turtle format, see: http://www.w3.org/TeamSubmission/turtle/

Page 32: 4. Additional operations and tools to refine an ontology

• For machine: ontology syntax check
• For human: ontology documentation
• Namespace prefix: brand your ontology

Page 33: For machine: ontology syntax check

• There are many online tools that help check the syntax of an RDF file:
– Such as the RDF Validator and Converter, see: http://www.rdfabout.com/demo/validator/

Page 34: For human: ontology documentation

• There are several online tools that help generate an ontology document for humans to read
– Such as the Live OWL Documentation Environment, see: http://www.essepuntato.it/lode

See a list of similar tools at: http://tw.rpi.edu/web/project/SeSF/WorkingGroup/OntologyDocumentation

Page 35: Namespace prefix: brand your ontology

• For the GCIS ontology we use gcis as the namespace prefix
– One can register a namespace prefix and look up existing ones at: http://prefix.cc/

Page 36: Final output of the GCIS ontology

• Ontology documentation
– http://escience.rpi.edu/ontology/GCIS-IMSAP/2/GCISOntology_v_1_2.htm
• Concept map
– http://cmapspublic3.ihmc.us/rid=1MCJMLST0-1G0CSWH-2YH4/GCIS_Ontology_v1_2.cmap
• Ontology RDF serialized in Turtle format
– http://escience.rpi.edu/ontology/GCIS-IMSAP/2/GCISOntology_v_1_2.ttl

Page 37: See also

• Ma, X., Fox, P., Tilmes, C., Jacobs, K., Waple, A., 2014. Capturing and presenting provenance of global change information. Nature Climate Change. In Press.
• Tilmes, C., Fox, P., Ma, X., McGuinness, D., Privette, A.P., Smith, A., Waple, A., Zednik, S., Zheng, J., 2013. Provenance representation for the National Climate Assessment in the Global Change Information System. IEEE Transactions on Geoscience and Remote Sensing 51 (11), 5160–5168.
• Ma, X., Fox, P., 2013. Recent progress on geologic time ontologies and considerations for future works. Earth Science Informatics 6 (1), 31–46.

Page 38: Thank you!

[email protected]