The Rosetta Model - Poster · Microsoft PowerPoint - The Rosetta Model - Poster.ppt Author: tking...

Page 1: The Rosetta Model - Poster · Microsoft PowerPoint - The Rosetta Model - Poster.ppt Author: tking Created Date: 12/10/2007 10:03:28 AM ...

The Rosetta Model
Can the Different Physical Science Data Models be Reconciled?

Todd A King (1), Deborah L McGuinness (2,3), Raymond J Walker (1), Peter Fox (4), D Aaron Roberts (5), Christopher Harvey (6)

(1) Institute of Geophysics and Planetary Physics/UCLA, (2) Rensselaer Polytechnic Institute, (3) Stanford University, (4) UCAR, (5) NASA/NSSDC, (6) Centre de Données de la Physique des Plasmas (CDPP)

Overview

The Rosetta Stone illustrates how different languages can be used to express the same concepts. The same is true of data models. We analyzed many of the existing physical science data models to determine where they have elements in common and where they differ. Each of the data models has been derived in a different way: some are based on formal ontologies, others on informal ontologies, and others on relational schemas. We focused on the final "published" data model and did not make any judgments regarding the portability of the expression of the data model.

Our Model Classification

Models are designed for a purpose, which can range from sharing generic items to providing detailed descriptions for a particular domain. We have identified four top-level purposes for data models:

Resource Discovery: Aids in locating resources for a particular use or area of study.

Resource Sharing: Provides enough detail to retrieve and utilize the resource, by human or machine (ideally both).

Archiving: Provides sufficient detail to support long-term preservation.

Content Classification: Useful in identifying a category or type for all or part of a resource.

As seen in: Ontology Mapping and Alignment, Natasha Noy, Stanford University

What role can ontologies play in defining a Rosetta Model?

Answer: A critical role. One difficulty in comparing models is that they are expressed in different ways. To achieve model-to-model mapping, there needs to be a common reference model, or interlingua. If this model is described as an ontology, it can serve as the "upper ontology". All other models can be described as individual ontologies, preserving all idiosyncrasies of each model. Mappings can then take place at the upper level: information is transferred by mapping entities at the upper level and then moving details into the target model.
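The mapping step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two source models (`model_a`, `model_b`) and their terms are hypothetical, and only the upper-level entity names (Observatory, Instrument, Person) are taken from the draft Rosetta Model on this poster. The poster does not publish actual model-to-model mappings.

```python
# Hypothetical mappings from each model's own terms to upper-level entities.
# Only the upper-level names come from the poster; the rest is made up.
TO_UPPER = {
    "model_a": {"Spacecraft": "Observatory", "Sensor": "Instrument", "Investigator": "Person"},
    "model_b": {"Platform": "Observatory", "Experiment": "Instrument", "Contact": "Person"},
}


def translate(term, source, target):
    """Translate a term from source to target by way of the upper level."""
    upper = TO_UPPER[source].get(term)
    if upper is None:
        return None  # the term has no upper-level equivalent
    for target_term, entity in TO_UPPER[target].items():
        if entity == upper:
            return target_term
    return None  # the target model has no term for this entity


def translate_record(record, source, target):
    """Rename the keys of a record; keep untranslatable details separately."""
    mapped, unmapped = {}, {}
    for term, value in record.items():
        target_term = translate(term, source, target)
        if target_term is None:
            unmapped[term] = value
        else:
            mapped[target_term] = value
    return mapped, unmapped


if __name__ == "__main__":
    print(translate_record({"Spacecraft": "Cluster", "Orbit": "polar"}, "model_a", "model_b"))
```

Terms without an upper-level counterpart are returned separately rather than dropped, which mirrors the idea that each model keeps its own idiosyncrasies while only the shared entities are mapped.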

Is it possible to have a Rosetta Model?

Answer: Yes. Our analysis shows that at the top level most of the models are very similar. While the terminology may differ, there are common entities. At lower levels in the models there are differences in granularity, coverage, and points of view. These common entities can form the basis for a Rosetta Model.

Can they be reconciled?

Answer: Yes. Our analysis shows that at the top level most of the models are very similar. While the terminology may differ, there are common entities. At lower levels in the models there are differences in granularity, coverage, and points of view. These common entities can form the basis for a Rosetta Model.

Dublin Core

Originally designed for information resources (documents) and has been expanded to include data, images, movies, and other types of resources.

Terms: 27 terms (15 core, 12 element types).
Purpose: Resource Discovery (published works).
Specification: Narrative

SPASE

A data model designed for the Solar and Space Physics communities to unify the data environment to facilitate finding, retrieving, formatting, and obtaining basic information about data essential for research.

Terms: 340 terms (10 resource types, 35 entities (containers), 30 enumerations, 55 attributes, 265 enumeration values)
Purpose: Resource Discovery, Resource Sharing, Content Classification
Specification: Narrative, XML Schema and XMI (UML)

SWEET

A common semantic framework for various Earth science initiatives. There are 17 ontologies, including biosphere, human_activities, process, substance, data_center, material_thing, property, sunrealm, data, numerics, sensor, time, earthrealm, phenomena, space, and units.

Terms: 3,940 terms (17 ontologies)
Purpose: Content Classification
Specification: OWL

VSTO

Virtual Solar Terrestrial Observatory. Originally designed as a set of ontologies for organizing and integrating information spanning upper atmospheric terrestrial physics to solar physics. Fundamental classes include instrument, observatory, data, and services. Its upper level has been reused in other science areas including volcanology and plate tectonics.

Terms: 407 terms (one ontology with 35 top-level classes)
Purpose: Resource Discovery, Resource Sharing, Content Classification.
Specification: OWL

Rosetta

Participants
• Mission
• Observatory
• Instrument
• Detector
• Person
• Reference
• Target

Product
• Sample (Physical)
• Data Structure (Digital)
  • Catalog (record collection)
  • Table (row, column)
  • Image (x, y, z)
  • Movie (x, y, z, t)
  • n-Array
• Documents

Resource
• Repository
• Registry
• Web Link
• Service

Collection
• Dataset
• Event
• Campaign

Annotation
• Notes
• Terms
• Associations
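As a rough sketch, the draft entity list above could be held in a small Python structure so that an entity can be looked up by category. The names `ROSETTA_DRAFT`, `DATA_STRUCTURES`, and `category_of` are hypothetical, and treating Catalog, Table, Image, Movie, and n-Array as kinds of Data Structure (Digital) is an assumption read from the poster layout, not something the poster states.

```python
# Draft Rosetta Model categories and entities, as listed on the poster.
ROSETTA_DRAFT = {
    "Participants": ["Mission", "Observatory", "Instrument", "Detector",
                     "Person", "Reference", "Target"],
    "Product": ["Sample (Physical)", "Data Structure (Digital)", "Documents"],
    "Resource": ["Repository", "Registry", "Web Link", "Service"],
    "Collection": ["Dataset", "Event", "Campaign"],
    "Annotation": ["Notes", "Terms", "Associations"],
}

# Assumption: these are sub-types of "Data Structure (Digital)".
DATA_STRUCTURES = ["Catalog", "Table", "Image", "Movie", "n-Array"]


def category_of(entity):
    """Return the top-level category of an entity, or None if unknown."""
    for category, entities in ROSETTA_DRAFT.items():
        if entity in entities:
            return category
    if entity in DATA_STRUCTURES:
        return "Product"
    return None


if __name__ == "__main__":
    for name in ("Instrument", "Table", "Campaign"):
        print(name, "->", category_of(name))
```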

CAA

Cluster Active Archive. Designed to support the archiving and distribution of high-quality calibrated data products from ESA's Cluster mission, using an approach general enough to be applicable to other environments. It has a Mission, Observatory, Instrument hierarchy. The recovered data and metadata are adequate for API use.

Terms: 480 terms (198 classes and 282 enumeration items)
Purpose: Resource Discovery, Resource Sharing, Archiving, Content Classification.
Specification: Narrative and XML Schema

PDS3

A data set nomenclature designed to be consistent across discipline boundaries and standards for labeling data files. Its intent is to archive planetary science data and supporting information to enable effective use and interpretation.

Terms: 14,458 terms (1,643 elements, 81 objects, and 12,734 standard values)
Purpose: Archiving
Specification: Narrative and ODL + PDS vocabulary

IVOA

A set of standards to "facilitate the international coordination" of the "utilization of astronomical archives as an integrated and interoperating virtual observatory."

Terms: 63 terms (6 categories, 57 terms)
Purpose: Resource Discovery, Content Classification
Specification: Narrative and XML Schema
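The purposes and term counts reported for the seven models above can be gathered into one lookup table to see which models overlap in purpose. The sketch below copies only the values stated on the poster; the names `PURPOSES`, `TERM_COUNTS`, `models_for`, and `shared_purposes` are hypothetical. It assumes that the SPASE entry lists three purposes and the IVOA entry two, since the separating commas are missing in the extracted text.

```python
# Purposes per model, as stated in the model descriptions on the poster.
PURPOSES = {
    "Dublin Core": {"Resource Discovery"},
    "SPASE": {"Resource Discovery", "Resource Sharing", "Content Classification"},
    "SWEET": {"Content Classification"},
    "VSTO": {"Resource Discovery", "Resource Sharing", "Content Classification"},
    "CAA": {"Resource Discovery", "Resource Sharing", "Archiving",
            "Content Classification"},
    "PDS3": {"Archiving"},
    "IVOA": {"Resource Discovery", "Content Classification"},
}

# Total term counts per model, as stated on the poster.
TERM_COUNTS = {
    "Dublin Core": 27, "SPASE": 340, "SWEET": 3940, "VSTO": 407,
    "CAA": 480, "PDS3": 14458, "IVOA": 63,
}


def models_for(purpose):
    """List the models that declare the given purpose, in alphabetical order."""
    return sorted(model for model, purposes in PURPOSES.items() if purpose in purposes)


def shared_purposes(model_a, model_b):
    """Return the set of purposes two models have in common."""
    return PURPOSES[model_a] & PURPOSES[model_b]


if __name__ == "__main__":
    print(models_for("Archiving"))
    print(shared_purposes("SPASE", "CAA"))
    print(max(TERM_COUNTS, key=TERM_COUNTS.get))
```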

Summary

A draft of the Rosetta Model is presented which can serve as an interlingua. This model will be expressed both as an ontology and a schema so that "translations" can occur. Work on identifying model-to-model mapping (equivalences) is underway.