SEMTECH2011E - OnTheDay - TQ NIEM Ontologies and Vocabularies - (v5-ARH-sFINAL)
© Copyright 2011 TopQuadrant Inc 1
NIEM Ontologies and Vocabularies
Transforming NIEM to RDF/OWL and Querying NIEM-compliant Instance Data
using SPARQL and SPIN
Ralph Hodgson, CTO, TopQuadrant
Gokhan Soydan, Semantic Solution Developer, TopQuadrant
SemTech 2011 East, Thursday, December 1, 2011, 3:00 PM - 3:50 PM
Level: Technical – Intermediate
Location: Auditorium
What is there to talk about, is there enough time?
Business and Technical Motivations
Approaches to model-based Information Exchange using controlled vocabularies
Expressing NIEM as OWL Models and Vocabularies
The Power of RDF/OWL and SPARQL
Next Possibilities
Reusable Message Building Blocks; Composable Message Schemas; Controllable Vocabularies; Linked Data; Information Insight
UML; XML Schema; UN/CEFACT CCTS; OWL; XML Schemas; OWL and Turtle/JSON-LD
XSD to OWL Transformation; U.S. DOJ Logical Entity Exchange Specification 3.1 (LEXS); XML Instance Messages to RDF Conversion
SPARQL inferencing over LEXS Messages; Demonstration
NIEM as LOD
First, let’s remind ourselves why information is exchanged
Sculpture by M. Chava Evans (Baltimore, MD) Sculpture, Studio 33, Torpedo Factory, Alexandria, VA
technical motivations …
Sculptures in the National Gallery, East Building, Washington DC, Nov 25, 2011
XML OWL
UML?
OWL as a specification language for information
models and controlled vocabularies
But life in the XML Ecology isn’t easy
From hierarchies to graphs
From graphs to hierarchies
More at http://topquadrantblog.blogspot.com/2011/09/living-in-xml-and-owl-world.html
Some breakthroughs: Co-existence of OWL and XSD/XML
1 Make OWL Schemas from NIEM and LEXS XSD Schemas
TopBraid Transformers: Convert XSD to RDF/OWL
2 Use the OWL Schemas to make RDF from LEXS XML Messages
Semantic XML: Convert XML to RDF/OWL
ReportingHub Semantic Processing
SPARQL Rules (SPIN): Convert XML to RDF/OWL
SPARQL Rules (SPIN): Convert RDF/OWL to XML
SPARQL Web Pages (SWP): Convert RDF/OWL to HTML
SPARQL Web Pages (SWP): Convert HTML to PDF
Generating XML Schemas and Controlled Vocabularies from OWL Models
GRDDL XSLT Generator
XSLT Processor
Going from XML to OWL
ref: XML SchemaPlus – http://www.xspl.us
Different Reasons to “Connect the Dots”
1) 360 Degrees View: more about the same thing
2) Transitive Connections: what is linked to a thing of interest
3) Information Discovery: find things that share common attributes or relationships
Personal Motivations: August 1, 2009 – “Data Independence Day”
www.oegov.org
Current practices for “Living in the XML Ecology” raise many challenges:
1. Vocabulary Alignment
2. Governance of “core” models
3. Extensibility and tailoring of models to local needs
4. Resilience to change
Some ways XML Message Schemas have been, or are being, made using UML (1 of 5)
1 The Weather Data Model
ref: WXXM 1.1 Primer, 1.1 10 February 2010, https://wiki.ucar.edu/display/NNEWD/WXXM
Take Away: No URIs; no inherent aggregation properties; special programs; complex queries
Some ways XML Message Schemas have been, or are being, made using UML (2 of 5)
2 CIM Models in the SmartGrid
ref: EPRI CIM and 61850 Harmonization 2009 Project Report, Nov 17, 2009, http://cimug.ucaiug.org/Meetings/Charlotte2009/Presentations/CIM%20and%2061850%20Harmonization%20102909.pdf
Some ways XML Message Schemas have been, or are being, made using UML (3 of 5)
3 Harmonizing Spatial Data – NEN 3610:2011 and GML
ref: http://www.nen.nl/web/Normshop/Norm/NEN-36102011-nl.htm
UML model: UML Application Schema (XMI)
Encoding Rules / Guidelines
Configuration (XML)
ShapeChange (Java program, Servlet)
GML Application Schema (XML Schema)
Take Away: No URIs; no inherent aggregation properties; special programs; complex queries
Some ways XML Message Schemas have been, or are being, made using UML (4 of 5)
4 UN/CEFACT Standards for Message Exchange
Source: 16th UN/CEFACT PLENARY http://www.unece.org/fileadmin/DAM/cefact/cf_plenary/plenary10/UNCEFACT%2016TH%20PLENARY_full_rev5.ppt
Take Away: No URIs; no inherent aggregation properties; special programs; complex queries
Some ways XML Message Schemas have been, or are being, made using UML (5 of 5)
5 NIEM Information Exchange Package Documentation
source: “Where have all the Standards Gone?”, Bruce Kelling (Moderator), http://www.ncja.org/Content/NavigationMenu/EducationEvents/2009NationalForum/AllSpeakers.Standards.ppt
Take Away: No URIs; complexity; recommended practices; required practices
TopQuadrant has faced the “OWL co-existence with UML and XML” challenges on a number of projects
SmartGrid Semantic Harmonization and Interoperability
NASA Telemetry and Command, Simulation and Data Architecture Models and Vocabularies
The Netherlands MoJ Ontology-Driven Metadata Workbench Message Builder
EPIM Reporting Hub for the Norwegian Oil and Gas Fields
The Netherlands MoJ Ontology-Driven Metadata Workbench Message Builder
Business Needs
• Accurate and rapid Information Sharing between Organizations
• Agility in response to Legislation Changes
• Data Quality is guaranteed
• Reduced Costs of Message Schema Development
Technical Benefits
• Direct and flexible Reuse of Data Components
• Full Automation of XML Schema creation
• Semantic Consistency is preserved and confirmed
• Linked Data / traceability
• Version Management
ref: http://www.enterprisedatajournal.com/article/netherlands-ministry-justice-metadata-workbench-composing-xml-message-schemas-owl-models.htm
The Netherlands MoJ Ontology-Driven Approach to Message Design using UN/CEFACT
Solution: Ontology-Based Metadata Workbench. Transform Domain Models into a UN/CEFACT CCTS-compliant representation and allow Business Analysts to assemble business documents for electronic messages from Component Parts.
Rich Ontologies
CCTS Ontologies
Core Component Overlay
Creation of XML Message Schemas
Contexts
Domains
Business Document Ontologies
CCTS MetaModel
CCTS Document
SPIN Transformation rules
CCTS XML SchemaPlus
CCTS XML Schema
XSP MetaModel
XSLT Script
Business Component
Overlay
“Rich” Ontologies are expressive models of domains. These include LKIF and detailed situations of law and legal document and procedures.
CCTS-Compliant XML Schemas are generated from the XSP Document
CCTS Document Editor XSP Generation XSD Generation
Users create CCTS documents from BIEs and Core Components
Projects
Acronyms
BIE: Business Information Entity
CCTS: UN/CEFACT Core Component Technical Specifications
LKIF: Legal Knowledge Interchange Format
SPIN: SPARQL Inferencing Notation
XSLT: XSL Transformations (XSLT) Version 2.0
XSP: XML SchemaPlus
NASA Constellation Program
CxP 70160 ANX10 Infrastructure Specification
CxP 70160 ANX11 Application Programming Interface Specification
CxP 70160 ANX14 Policy and Security Model
Constellation Program Data Architecture and Interoperability through the use of OWL Ontologies with strategies for co-existence with XML and other data formats.
ReportingHub Vision
Need: “Reporting to authorities and partners on the NCS in a cost efficient and secure manner”
Outcome: “Improved Information Integration and Exchange”
“Faster and better decisions”
Enablers:
“A Field Specific Asset Model based on the Common Asset Model – ISO 15926, PCA RDL and NPD Facts”
“SPARQL as a way to query the data in a triple store and reason about data using appropriate inference engine(s)”
“Web Services for hiding the complexity of SPARQL Queries”
“Machine driven creation of new data relationships without restructuring the data model”
1500 named users, and 100 concurrent users
SPARQL Rules (SPIN): Convert XML to RDF/OWL
SPARQL Web Pages (SWP): Convert RDF/OWL to PDF
SPARQL Web Pages (SWP): Convert HTML to PDF
The NIEM/LEXS Experiment
From NIEM/LEXS XSD Schemas and Instance Data
To OWL Models and RDF Triples
NIEM/LEXS RDF/OWL Stack
VAEM, VOAG, VOID, DC
LEXS Rules
LEXS Instances
DTYPE
NIEM Vocabs and Datatypes
NIEM Ontologies
LEXS Ontology
What is NIEM?
“National Information Exchange Model, NIEM, is an interagency initiative to provide the foundation and building blocks for national-level interoperable information sharing and data exchange.
The NIEM project was formally announced at the Global Justice XML Data Model (Global JXDM) Executive Briefing on February 28, 2005.
It was initiated as, and continues to be, a joint venture between the U.S. Department of Homeland Security (DHS) and DOJ with outreach to other departments and agencies.
The base technology for NIEM is derived from the Global JXDM.”
source: http://it.ojp.gov/default.aspx?area=implementationAssistance&page=1017&standard=486
What are IEP and IEPD? Options for implementing information exchanges
source: US DoJ Implementation Guidance for NIEM-Conformant Exchanges, http://www.hsdl.org/?view&did=487388
An Information Exchange Package (IEP) is an XML representation of the information shared for a specific business purpose.
An Information Exchange Package Documentation (IEPD) is a collection of artifacts (describing the purpose, structure and content of IEPs) that governs an information exchange.
What we will show you today
• Generation of OWL Models from XML Schemas
• Auto-conversion of LEXS-based XML messages to RDF
• An experiment with fake (generated) Incidents data to show how multiple messages can be aggregated
• Some SPARQL Queries and SPIN rules at work
XSD/XML to OWL Rules (1 of 2)
# | XSD/XML Construct | OWL Construct
1 | xsd:simpleType | owl:Datatype
2 | xsd:simpleType with xsd:enumeration | Becomes an owl:Class as a subclass of ‘EnumeratedValue’. Instances are created for every enumerated value. An instance of ‘Enumeration’, referring to all the instances, is created as well as the owl:oneOf union over the instances.
3 | xsd:complexType over xsd:complexContent | owl:Class
4 | xsd:complexType over xsd:simpleContent | owl:Class
5 | xsd:element (global) with complex type | owl:Class and subclass of the class generated from the referenced complex type
6 | xsd:element (global) with simple type | owl:Datatype
7 | xsd:element (local to a type) | owl:DatatypeProperty or owl:ObjectProperty depending on the element type. OWL Restrictions are built for the occurrence.
XSD/XML to OWL Rules (2 of 2)
# | XSD/XML Construct | OWL Construct
8 | xsd:group | owl:Class and sub-class of ‘A_AbstractElementGroup’
9 | xsd:attributeGroup | owl:Class and sub-class of ‘A_AbstractAttributeGroup’
10 | Anonymous Complex Type | As for Complex Type except a URI is constructed from the parent element and the nested element reference. Also, the class is defined as a subclass of ‘A_Anon’.
11 | Anonymous Simple Type | As for Simple Type except a URI is constructed from the parent element and the nested element reference.
12 | xsd:default on an attribute | Uses ‘dtype:defaultValue’ to attach a value to the OWL restriction representing the associated property.
13 | Substitution Groups | Subclass statements are generated for the members. Instance files resolve their types by consulting the OWL model at import-time.
14 | Annotation attributes on elements | OWL Annotation properties are created and placed directly on the relevant class.
15 | Annotations using xsd:annotation | Become, based on user selection, dc:description, rdfs:comment and/or skos:definition OWL annotations.
16 | xsi:type on an XML element | Overrides the schema type with the specified type.
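To make a few of the rules above concrete, here is a minimal Python sketch of rules 1, 2, 3 and 7 applied to a tiny hand-written schema. It is not the TopBraid importer: the `ex:` prefix, the sample schema, and the function name `xsd_to_triples` are assumptions made for illustration, and the output is a plain list of (subject, predicate, object) tuples rather than a real RDF graph.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema, invented for this sketch (not taken from NIEM).
SAMPLE_XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="HairColorCode">
    <xs:restriction base="xs:token">
      <xs:enumeration value="GRY"/>
      <xs:enumeration value="BLK"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:simpleType name="ShortText">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>
  <xs:complexType name="PersonType">
    <xs:sequence>
      <xs:element name="HairColor" type="HairColorCode"/>
      <xs:element name="Nickname" type="ShortText"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
"""


def xsd_to_triples(xsd_text, prefix="ex"):
    """Apply simplified versions of rules 1, 2, 3 and 7 to a schema."""
    root = ET.fromstring(xsd_text)
    triples = []
    class_names = set()

    for simple_type in root.findall(XS + "simpleType"):
        name = simple_type.get("name")
        values = [e.get("value") for e in simple_type.iter(XS + "enumeration")]
        if values:
            # Rule 2: enumerated simple type -> class plus one instance per value.
            class_names.add(name)
            triples.append((f"{prefix}:{name}", "rdf:type", "owl:Class"))
            triples.append((f"{prefix}:{name}", "rdfs:subClassOf", f"{prefix}:EnumeratedValue"))
            for value in values:
                triples.append((f"{prefix}:{name}_{value}", "rdf:type", f"{prefix}:{name}"))
        else:
            # Rule 1: plain simple type -> datatype (term as written in the table).
            triples.append((f"{prefix}:{name}", "rdf:type", "owl:Datatype"))

    complex_types = root.findall(XS + "complexType")
    class_names.update(ct.get("name") for ct in complex_types)
    for complex_type in complex_types:
        # Rule 3: complex type -> class.
        triples.append((f"{prefix}:{complex_type.get('name')}", "rdf:type", "owl:Class"))
        for element in complex_type.iter(XS + "element"):
            # Rule 7: local element -> object or datatype property, by element type.
            is_class = element.get("type") in class_names
            kind = "owl:ObjectProperty" if is_class else "owl:DatatypeProperty"
            triples.append((f"{prefix}:{element.get('name')}", "rdf:type", kind))
    return triples
```

Calling `xsd_to_triples(SAMPLE_XSD)` yields a class for `HairColorCode` with one instance per code, a datatype for `ShortText`, and one property per local element. Restrictions, anonymous types and substitution groups (rules 7 to 13) are deliberately left out.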
DEMO of XSD to OWL and XML to OWL Transformations
Metrics on the NIEM OWL Model
SELECT ?class ?restrictionCount
WHERE {
  ?class a owl:Class .
  BIND (smf:countResults("SELECT DISTINCT ?property WHERE { ?class rdfs:subClassOf ?restriction . ?restriction a owl:Restriction . ?restriction owl:onProperty ?property }") AS ?restrictionCount)
}
NIEM Person (Proto) OWL Model
Note: to address the reusability required in the MoJ work, NIEM ‘Person’ was re-factored into individual ‘Details’ classes.
Refactoring of NIEM Person into an OWL Model with reusable Concepts (person:Details)
Depending on the context of use, concepts describing different details about a person can be selected for the UBL Business Documents and Messages.
Refactoring of the NIEM Person into an OWL Model with reusable Concepts (person:AppearanceDetails)
A person’s ‘Appearance Details’ will be needed for criminal investigations.
NIEM JXDM Complex Type Example
DOJ Logical Entity Exchange Specification (LEXS)
What is the DOJ LEXS?
“LEXS provides a flexible, NIEM-based framework used for the creation of NIEM-conformant IEPDs for information sharing, both for publishing information and for system-to-system federated searches.”
source: http://it.ojp.gov/default.aspx?area=implementationAssistance&page=1017&standard=486
LEXS is a family of NIEM-conformant IEPDs that define flexible structures to support a variety of applications.
Any application that participates in OneDOJ, is a part of LEISP, or supports law enforcement information sharing must participate in LEXS exchanges.
If additional structures beyond the base LEXS are required, LEXS should be extended by using NIEM (Option 2).
Conversion of LEXS from XML Schema to OWL using the TopBraid XSD to OWL Importer
XML Schemas OWL Models
Using SPARQL to count the properties on the LEXS/NIEM OWL Models
SELECT ?class (COUNT(DISTINCT ?p) AS ?properties)
WHERE {
  ?class a owl:Class .
  OPTIONAL {
    ?class rdfs:subClassOf ?r .
    ?r a owl:Restriction .
    ?r owl:onProperty ?p .
  }
}
GROUP BY ?class
ORDER BY DESC(?properties)
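The aggregation this query performs can be sketched in plain Python over a handful of triples. The sample data, the `ex:` names and the function `count_restricted_properties` are assumptions for illustration only; a real run would execute the SPARQL above against the generated OWL models.

```python
from collections import defaultdict

# Hypothetical triples, invented for this sketch.
TRIPLES = [
    ("ex:Person", "rdf:type", "owl:Class"),
    ("ex:Person", "rdfs:subClassOf", "_:r1"),
    ("_:r1", "rdf:type", "owl:Restriction"),
    ("_:r1", "owl:onProperty", "ex:name"),
    ("ex:Person", "rdfs:subClassOf", "_:r2"),
    ("_:r2", "rdf:type", "owl:Restriction"),
    ("_:r2", "owl:onProperty", "ex:birthDate"),
    ("ex:Person", "rdfs:subClassOf", "_:r3"),
    ("_:r3", "rdf:type", "owl:Restriction"),
    ("_:r3", "owl:onProperty", "ex:name"),  # same property again, counted once
    ("ex:Vehicle", "rdf:type", "owl:Class"),
]


def count_restricted_properties(triples):
    """Count distinct restricted properties per class, highest count first."""
    facts = set(triples)
    classes = [s for s, p, o in triples if p == "rdf:type" and o == "owl:Class"]
    properties = defaultdict(set)
    for s, p, o in triples:
        if p == "rdfs:subClassOf" and (o, "rdf:type", "owl:Restriction") in facts:
            for s2, p2, o2 in triples:
                if s2 == o and p2 == "owl:onProperty":
                    properties[s].add(o2)
    # Classes without restrictions keep a count of 0, like the OPTIONAL block.
    counts = [(c, len(properties[c])) for c in classes]
    return sorted(counts, key=lambda item: item[1], reverse=True)
```

Using a set per class mirrors `COUNT(DISTINCT ?p)`, and listing every class mirrors the `OPTIONAL` clause, so classes with no restrictions still appear in the result.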
‘digest:EntityAssociationType’ really stands out with 194 Properties
Is this a refactoring opportunity?
Some NIEM Controlled Vocabularies: FBI
FBI Code Lists Example: Hair Color
OWL Model, OWL Instances
Grey Hair in Turtle Syntax
fbi:HAICST_GRY
    a fbi:HAICodeSimpleType ;
    rdfs:label "GRY"^^xsd:string ;
    dtype:order "5"^^xsd:nonNegativeInteger ;
    dtype:value "GRY"^^xsd:token ;
    skos:definition "Gray or Partially Gray"^^xsd:string ;
    skos:prefLabel "GRY"^^xsd:string .
The digest:EntityActivity Class
OWL Class with properties
Inheritance
Association
A ‘digest:EntityActivity’ is both a ‘digest:Entity’ and a ‘digest:EntityActivityType’
Multiple Inheritance is common
Note that the ‘proto-OWL’ ontology respects the XML Schema’s use of wrapped data types. An optimization can unfold these to direct data types.
A digest:EntityActivity Instance
<lexsdigest:EntityActivity>
  <lexsdigest:Metadata s:id="MIncident1">
    <nc:ReportedDate><nc:Date>1997-03-12</nc:Date></nc:ReportedDate>
  </lexsdigest:Metadata>
  <nc:Activity s:id="Incident1" s:metadata="MIncident1">
    <nc:ActivityIdentification><nc:IdentificationID>000000000003</nc:IdentificationID></nc:ActivityIdentification>
    <nc:ActivityCategoryText>Incident</nc:ActivityCategoryText>
    <nc:ActivityDate><nc:DateTime>1997-03-12T00:01:00.0Z</nc:DateTime></nc:ActivityDate>
    <nc:ActivityDescriptionText>On 3/12/1997 at 12:01 a.m., Mr. Donald R. Duck (Witness 1) saw a white male break the glass of his neighbor's (Jacob Joe) front door. Mr. Duck placed a 911 call on his cell phone to report the incident. Within minutes, police arrive at the residence (1 NW Brockway Avenue) to find the subject ransacking the house. Detective Bond was the responding and arresting officer. The subject was taken to the Santa Fe Police Department and placed under arrest. An arrest report was filed on 3/12/1997.</nc:ActivityDescriptionText>
  </nc:Activity>
</lexsdigest:EntityActivity>
Class
Instance
Transforming LEXS Instance Data to RDF
Automatic Conversion from LEXS XML to RDF
Semantic XML: Convert XML to RDF/OWL
burglary-incident-w-arrest-basic-lexs.xml to burglary-incident-w-arrest-basic-lexs (RDF)
TopBraid’s Semantic XML Engine uses sxml:tag annotations on the auto-generated NIEM/LEXS OWL Ontologies to control the transformations.
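As a rough illustration of the idea of turning an XML tree into parent/child triples, the Python sketch below walks an XML document and emits `composite:child` links for nested elements and attributes. It is not the Semantic XML engine: the blank-node naming scheme, the `ex:` predicates, the sample XML and the function name are all assumptions, and no OWL model is consulted.

```python
import xml.etree.ElementTree as ET
from itertools import count

# Hypothetical, simplified message fragment invented for this sketch.
SAMPLE_XML = '<Activity id="Incident1"><CategoryText>Incident</CategoryText></Activity>'


def xml_to_composite_triples(xml_text):
    """Return triples linking each element to its attributes and child elements."""
    ids = count(1)
    triples = []

    def visit(element):
        node = f"_:n{next(ids)}"
        triples.append((node, "ex:element", element.tag))
        for name, value in element.attrib.items():
            # Attributes become child nodes of the element that carries them.
            attribute = f"_:n{next(ids)}"
            triples.append((node, "composite:child", attribute))
            triples.append((attribute, "ex:attribute", name))
            triples.append((attribute, "ex:value", value))
        text = (element.text or "").strip()
        if text:
            triples.append((node, "ex:text", text))
        for child in element:
            # Nested elements are visited first, then linked to their parent.
            triples.append((node, "composite:child", visit(child)))
        return node

    visit(ET.fromstring(xml_text))
    return triples
```

For the sample fragment, the root element gets two `composite:child` links: one to its `id` attribute and one to the nested `CategoryText` element.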
Useful QA Check on the “Semantic XML” (1) Triples
QA Check
SELECT *
WHERE {
  ?subject composite:child ?object .
  FILTER NOT EXISTS { ?object a sxml:Comment }
  FILTER NOT EXISTS { ?object a ?type .
                      ?type sxml:element "xi:include" }
}
“0” is good!
(1) Semantic XML is a composite pattern model:
?anElement composite:child ?anotherElement
?anElement composite:child ?anAttribute
Example SPARQL Query for the sample Burglary Incident
Find all people involved in an incident for which we have full names, SSNs, FBI IDs, fingerprints and the state of jurisdiction
SELECT ?s ?fn1 ?ssn1 ?fbiID1v ?fpID1v ?fpj1v
WHERE {
  ?s rdf:type digest:EntityPerson .
  ?s digest:personRef ?p1 .
  ?p1 core:personNameRef ?pnR1 .
  ?pnR1 core:personFullNameRef ?pfnR1 .
  ?pfnR1 dtype:value ?fn1 .
  ?p1 core:personSSNIdentificationRef ?pSSNR1 .
  ?pSSNR1 core:identificationIDRef ?pSSN1 .
  ?pSSN1 dtype:value ?ssn1 .
  ?p1 digest:personAugmentationRef ?p1a .
  ?p1a jxdm:personFBIIdentificationRef ?fbiID1 .
  ?fbiID1 core:identificationIDRef ?fbicID1 .
  ?fbicID1 dtype:value ?fbiID1v .
  ?p1a jxdm:personStateFingerprintIdentificationRef ?fp1 .
  ?fp1 core:identificationIDRef ?fpcID1 .
  ?fpcID1 dtype:value ?fpID1v .
  ?fp1 core:identificationJurisdictionRef ?fpj1 .
  ?fpj1 dtype:value ?fpj1v .
}
Using Magic Properties and Property Chains to simplify the SPARQL
SELECT ?s ?fn1 ?ssn1 ?fbiID1v ?fpID1v ?fpj1v
WHERE {
  ?s rdf:type digest:EntityPerson .
  ?s digest:personRef ?p1 .
  ?p1 lexs:getFullName ?fn1 .
  ?p1 lexs:getSSN ?ssn1 .
  ?p1 digest:personAugmentationRef ?p1a .
  ?p1a lexs:getFBI-ID ?fbiID1v .
  ?p1a lexs:getFingerprintID ?fpID1v .
  ?p1a lexs:getFingerprintIDState ?fpj1v .
}
Each Magic Property below is defined by a query whose body is a Property Chain:
lexs:getSSN
SELECT ?ssn WHERE { ?arg1 (core:personSSNIdentificationRef / core:identificationIDRef / dtype:value) ?ssn . }
lexs:getFullName
SELECT ?name WHERE { ?arg1 (core:personNameRef / core:personFullNameRef / dtype:value) ?name . }
lexs:getFBI-ID
SELECT ?id WHERE { ?arg1 (jxdm:personFBIIdentificationRef / core:identificationIDRef / dtype:value) ?id . }
lexs:getFingerprintID
SELECT ?id WHERE { ?arg1 (jxdm:personStateFingerprintIdentificationRef / core:identificationIDRef / dtype:value) ?id . }
lexs:getFingerprintStateID
SELECT ?id WHERE { ?arg1 (jxdm:personStateFingerprintIdentificationRef / core:identificationJurisdictionRef / dtype:value) ?id . }
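What a property chain does can be sketched in a few lines of Python: start at one node and follow a fixed sequence of predicates. The triples, the `ex:` node names and the helper names `follow_chain` and `magic` are assumptions for illustration; the predicate names are copied from the queries on this slide, and the SPIN machinery itself is not reproduced.

```python
# Hypothetical instance data, invented for this sketch.
TRIPLES = [
    ("ex:person1", "core:personSSNIdentificationRef", "ex:ssnId1"),
    ("ex:ssnId1", "core:identificationIDRef", "ex:id1"),
    ("ex:id1", "dtype:value", "000-00-0000"),
    ("ex:person1", "core:personNameRef", "ex:name1"),
    ("ex:name1", "core:personFullNameRef", "ex:fullName1"),
    ("ex:fullName1", "dtype:value", "Donald R. Duck"),
]

# Each magic property name maps to the chain of predicates it stands for.
MAGIC_PROPERTIES = {
    "lexs:getSSN": [
        "core:personSSNIdentificationRef",
        "core:identificationIDRef",
        "dtype:value",
    ],
    "lexs:getFullName": [
        "core:personNameRef",
        "core:personFullNameRef",
        "dtype:value",
    ],
}


def follow_chain(triples, start, chain):
    """Return every value reachable from `start` by following `chain` in order."""
    frontier = {start}
    for predicate in chain:
        frontier = {o for s, p, o in triples if s in frontier and p == predicate}
    return frontier


def magic(triples, subject, name):
    """Evaluate a named chain for one subject, like a magic property lookup."""
    return follow_chain(triples, subject, MAGIC_PROPERTIES[name])
```

The design point is the same as on the slide: the long path is written once under a short name, and every query that needs the value uses the name instead of repeating three hops.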
To demonstrate interesting queries over the LEXS model we needed more data
Only one example file was available
Because we cannot use real data, we built a random cloner of the single instance file using fake data
Random values were chosen from enumerated values
Random witnesses, victims and suspects were taken from a database of fake people
Random dates were generated
The resultant dataset can have any number of incidents
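The cloning idea described above can be sketched in Python as follows. This is not the cloner used for the talk: the seed triples, the `ex:` predicates, the placeholder names and the functions `clone_incident` and `generate` are all assumptions made for illustration.

```python
import random

# Hypothetical seed graph and fake-name pool, invented for this sketch.
SEED = [
    ("seed:Incident", "rdf:type", "ex:Incident"),
    ("seed:Incident", "ex:witness", "seed:Witness"),
    ("seed:Witness", "ex:fullName", "Donald R. Duck"),
    ("seed:Incident", "ex:date", "1997-03-12"),
]
FAKE_NAMES = ["Ann Example", "Bob Sample", "Cy Placeholder"]


def clone_incident(seed, index, rng):
    """Copy the seed graph under a new namespace with randomized values."""

    def rename(term):
        # Only seed resources are renamed; literals pass through unchanged.
        if term.startswith("seed:"):
            return f"inc{index}:" + term[len("seed:"):]
        return term

    clone = []
    for s, p, o in seed:
        if p == "ex:fullName":
            o = rng.choice(FAKE_NAMES)  # swap in a fake person
        elif p == "ex:date":
            o = f"{rng.randint(1990, 2011)}-{rng.randint(1, 12):02d}-{rng.randint(1, 28):02d}"
        clone.append((rename(s), p, rename(o)))
    return clone


def generate(seed, n, rng_seed=0):
    """Produce n randomized clones of the seed graph."""
    rng = random.Random(rng_seed)
    return [clone_incident(seed, i, rng) for i in range(1, n + 1)]
```

Each clone keeps the shape of the seed graph, so every query written against the single example file also runs against the generated dataset.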
Generating Random Instance Data using the RDF Instance Graph “Seed”
burglary-incident-w-arrest-basic-lexs (RDF)
Automatic Cloner Using Deep Random Graph Copier
1000 Graph Clones
1 Seed Graph
Where to find fake people?
http://www.fakenamegenerator.com/order.php
Using TopBraid, a CSV file of up to 10,000 names was converted to RDF/OWL triples
This was done using SPINMap
RDF/OWL Instances
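For readers without TopBraid, the row-to-instance step can be approximated in Python. This sketch does not use SPINMap: the column names, the `fake:` prefix and the function `csv_to_triples` are assumptions for illustration, and the real export from the name generator may use different headers.

```python
import csv
import io

# Hypothetical CSV content, invented for this sketch.
SAMPLE_CSV = """GivenName,Surname,Gender
Ann,Example,female
Bob,Sample,male
"""


def csv_to_triples(csv_text, base="fake:person"):
    """Turn each CSV row into one typed instance with one triple per column."""
    triples = []
    rows = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(rows, start=1):
        subject = f"{base}{row_number}"
        triples.append((subject, "rdf:type", "fake:Person"))
        for column, value in row.items():
            triples.append((subject, f"fake:{column}", value))
    return triples
```

A second mapping step, as the next slide describes, would then translate these generic instances into the NIEM/LEXS person structure.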
SPINMap was used to transform the Fake People to NIEM/LEXS People
RDF/OWL Instances
SPARQLMotion Script for the generation of Random Incidents using Fake People
Initialize script variables to set the count of random incidents and the base URIs of the other graphs
For each random incident graph, this controls the generation of fake instances
Clones the ‘seed’ graph to make each new incident graph
For each type of person (witness, victim, etc.), a random fake name is picked from the ‘Fake Names’ Graph
On completion, the new graph is exported.
Generation of Random Incidents using Fake People and Randomized Values
Incident 1 Incident n
4 Witnesses 2 Witnesses
2 Arrestees 1 Arrestee
Victim
Victim Victim
Victim
Dispatcher
Dispatcher
Operator
Operator
Officer Officer
Incident 1
Incident 10
Using SPIN to classify Person Instances
Using SPIN to transform Person data type properties to direct attributes
Witness Class
Victim Class
Operator Class
Officer Class
Arrestee Class
Male Class
Female Class
Dispatcher Class
Person Class
SPIN Rule on Person Class
Sub-class Relationships
A Query over the Incidents Data (1 of 2)
Find all people who have been both a witness and an arrestee across all incidents
Masato M. Sai was arrested in incident 4, but he was also a witness, which seems suspicious. Especially considering he was an officer in incident 10 and a dispatcher in incident 1.
What’s interesting about Bartholomeus is that he was the dispatcher and got arrested for incident 10! So what’s going on here? Did he conspire with Masato?
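The shape of this cross-incident query can be sketched in Python: collect every role a person has held in any incident, then keep the people who hold all the required roles. The facts, the placeholder person names and the function `people_with_roles` are assumptions for illustration and do not reproduce the generated dataset.

```python
from collections import defaultdict

# Hypothetical (incident, role, person) facts, invented for this sketch.
ROLE_FACTS = [
    ("Incident 1", "Dispatcher", "Person A"),
    ("Incident 2", "Witness", "Person A"),
    ("Incident 4", "Arrestee", "Person A"),
    ("Incident 4", "Witness", "Person B"),
    ("Incident 10", "Arrestee", "Person C"),
    ("Incident 10", "Dispatcher", "Person C"),
]


def people_with_roles(facts, required_roles):
    """Return people who held every required role in at least one incident."""
    roles_by_person = defaultdict(set)
    for incident, role, person in facts:
        roles_by_person[person].add(role)
    required = set(required_roles)
    return sorted(p for p, roles in roles_by_person.items() if required <= roles)
```

Because the roles are pooled across incidents, the match does not require both roles to occur in the same incident, which is what lets aggregated messages reveal connections that no single message shows.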
A Query over the Incidents Data (2 of 2)
It is not surprising to confuse the police with suspects if you see this going on
So why Integrate Data using RDF/OWL?
“Ontology-Driven Data Refineries”
“Frictionless” Data
Possible Next Steps
• Enhance XML Schema to OWL transformations to produce more canonical OWL Models (direct properties)
• Form a community of interest?
• Publish NIEM and LEXS OWL Models and SKOS Vocabularies?
• Demonstrate data integration for LEXS-extended or non-LEXS-based IEPDs using OWL Neutral Models and SKOS vocabularies
• DoJ IEPD Clearinghouse lists over 200 custom IEPDs – and this is growing
• Provide tooling for generating custom IEPDs using RDF/OWL ontologies with composable message components
Concluding Remarks
On balance, in the limited time we had, the presentation attempted to show:
1. automatic generation of OWL Models and Vocabularies from XML Schemas
2. automatic generation of RDF/OWL Graphs from XML-compliant messages
3. OWL as an expressive specification language for information models and vocabularies
4. SPARQL as a powerful way of exploring both data and models and doing transformations
NIEM References
Main Site http://www.ise.gov/national-information-exchange-model-niem
Clearing House http://it.ojp.gov/framesets/iepd-clearinghouse-noClose.htm
DoJ http://www.it.ojp.gov/default.aspx?area=implementationAssistance&page=1017&standard=520
HSDL http://www.hsdl.org/?view&did=487388
Other http://www.ibm.com/developerworks/library/x-NIEM4/
Thank You
Ralph Hodgson E-mail: [email protected] Twitter: @topquadrant, @ralphtq, @oegovnews
Some of our books