SSSW 2012 - Reusing XML Schemas' Information as a Foundation for Designing Domain Ontologies

Thomas Bosch (M.Sc.) [email protected] | http://boschthomas.blogspot.com

Problem
• Traditionally, ontology engineers work in close collaboration with domain experts to design domain ontologies (DOs), which requires a lot of time and effort
• DOs as well as XSDs describe domain data models
• In many cases, XSDs are already defined and can therefore be reused to design DOs

Hypothesis
The effort and time needed to deliver high-quality DOs using the proposed approach are much less than those needed to create DOs completely manually

Main Research Question
How can the time-consuming process of designing DOs be accelerated based on already available XSDs?

XSD and OWL follow different modeling goals, the mapping transports only the XSDs' information, and generated ontologies (GOs) do not conform to the highest quality requirements of DOs. GOs are therefore not immediately useful: domain experts and ontology engineers enrich GOs with additional domain-specific semantic information in the form of DOs.

Benefits
• The process of designing DOs from scratch is sped up significantly
• All XSDs' information (terminology, syntactic structure of XML docs) is reused in GOs
• GOs' RDF representations can be published in the LOD cloud and linked to other RDF datasets
• All XML data conforming to XSDs can be imported automatically as DOs' instances
• GOs and DOs can be maintained quickly
• Technical and content-related weaknesses of data models can be detected

Novelty of Approach
• Based on the XSD meta-model
• Does not extract semantics out of XSDs
• Transformation on the terminological and assertional knowledge levels
• Automatic transformation of XSDs and XML docs
• GOs with the greater expressive power of OWL instead of RDFS

Limitations
• Prerequisite: XSDs
• Unsuitable use cases (e.g. when XSDs do not represent the domain knowledge correctly or when XSDs are technically not well designed)

Map XSDs to GOs
• <xs:element name="VariableName" ... />
  VariableName ⊑ Element
• <xs:element name="VariableName" ... />
  VariableName ⊑ name_Element_String.{'VariableName'}
• <xs:attribute ref="lang"/>
  Lang-Reference ⊑ ref_Attribute_Attribute.Lang
• <xs:element name="VariableName" type="NameType"/>
  VariableName ⊑ type_Element_Type.NameType
• <xs:extension><xs:attribute name="translated"/><xs:attribute name="translatable"/></xs:extension>
  Extension1 ⊑ contains_Extension_Attribute.(Translated ⊔ Translatable)
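The naming pattern in the examples above can be illustrated with a small Python sketch. It is not the author's implementation: the function names (class_name, axioms_for, map_schema), the per-kind counter used for anonymous components such as Extension1, and the generalisation of the five examples to every component are assumptions made for illustration. The axioms are emitted as plain strings in exactly the shape shown on the poster, without adding any constructor the poster does not show.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# The poster's fragments, wrapped in a schema element only so that they parse.
# This is not meant to be a valid schema.
POSTER_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="VariableName" type="NameType"/>
  <xs:attribute ref="lang"/>
  <xs:extension>
    <xs:attribute name="translated"/>
    <xs:attribute name="translatable"/>
  </xs:extension>
</xs:schema>"""


def class_name(value):
    """Capitalise the first letter, e.g. 'lang' -> 'Lang'."""
    return value[:1].upper() + value[1:]


def axioms_for(node, counters):
    """Return the axiom strings for one XSD component (hypothetical helper)."""
    kind = class_name(node.tag[len(XS):])  # 'element' -> 'Element'
    attrs = node.attrib

    # Subject class: named, reference, or anonymous (numbered per kind).
    if "name" in attrs:
        subject = class_name(attrs["name"])
    elif "ref" in attrs:
        subject = class_name(attrs["ref"]) + "-Reference"
    else:
        counters[kind] = counters.get(kind, 0) + 1
        subject = kind + str(counters[kind])

    axioms = [subject + " ⊑ " + kind]
    if "name" in attrs:
        axioms.append(
            subject + " ⊑ name_" + kind + "_String.{'" + attrs["name"] + "'}"
        )
    if "ref" in attrs:
        axioms.append(
            subject + " ⊑ ref_" + kind + "_" + kind + "." + class_name(attrs["ref"])
        )
    if "type" in attrs:
        axioms.append(
            subject + " ⊑ type_" + kind + "_Type." + class_name(attrs["type"])
        )

    # Named attribute children become a union, as in the Extension1 example.
    children = [
        class_name(child.attrib["name"])
        for child in node
        if child.tag == XS + "attribute" and "name" in child.attrib
    ]
    if children:
        union = " ⊔ ".join(children)
        axioms.append(
            subject + " ⊑ contains_" + kind + "_Attribute.(" + union + ")"
        )
    return axioms


def map_schema(xsd_text):
    """Walk all components below the schema root and collect their axioms."""
    root = ET.fromstring(xsd_text)
    counters = {}
    axioms = []
    for node in root.iter():
        if node.tag == XS + "schema":
            continue
        axioms.extend(axioms_for(node, counters))
    return axioms


if __name__ == "__main__":
    for axiom in map_schema(POSTER_XSD):
        print(axiom)
```

Running the script prints one axiom per line. Because the sketch applies the same pattern to every component, it also emits axioms the poster does not list, for example for the two attributes nested in the extension.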

Use Cases
• To prove the approach's generality: any XSDs and corresponding XML docs can be converted to GOs and their RDF representations, as all components of the XSD meta-model are covered
• Generic test cases: derived from the XSD meta-model
• Domain-specific use cases: Data Documentation Initiative (DDI) ontology; projects: MISSY, da|ra, LOD pilot project, SOFISwiki

Evaluation
• To verify the hypothesis
• User study to compare the traditional manual approach with the proposed approach (define measurement methods)
• Derive DOs of multiple and differing domains

Proposed Approach
Derive DOs using SWRL rules