Transcript of Oracle Spatial and Graph: RDF Semantic Graph Feature (download.oracle.com/.../xldb2013_rdf_graph_training.pdf)

Page 1:

Oracle Spatial and Graph: RDF Semantic Graph Feature

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.

Page 2:

"THE FOLLOWING IS INTENDED TO OUTLINE OUR GENERAL PRODUCT DIRECTION. IT IS INTENDED FOR INFORMATION PURPOSES ONLY, AND MAY NOT BE INCORPORATED INTO ANY CONTRACT. IT IS NOT A COMMITMENT TO DELIVER ANY MATERIAL, CODE, OR FUNCTIONALITY, AND SHOULD NOT BE RELIED UPON IN MAKING PURCHASING DECISIONS. THE DEVELOPMENT, RELEASE, AND TIMING OF ANY FEATURES OR FUNCTIONALITY DESCRIBED FOR ORACLE'S PRODUCTS REMAINS AT THE SOLE DISCRETION OF ORACLE."

Page 3:

Program Agenda

• Part 1: Overview of Graph
• Part 2: SPARQL and GeoSPARQL
• Part 3: Semantics (RDF, RDFS, OWL)
• Part 4: RDF View on Relational Data (RDB2RDF)
• Part 5: Key Features of Oracle Spatial and Graph
• Part 6: Performance and Scalability
• Part 7: Tools Demonstration
• Part 8: SQL-based Graph Analytics
• Summary

Page 4:

Part 1: Overview of Graph

Page 5:

Overview of Graph

• What is a graph?
  – A set of vertices and edges (and optionally attributes)
  – A graph is simply linked data
• Why do we care?
  – Graphs are everywhere
    • Road networks, power grids, biological networks
    • Social networks/Social Web (Facebook, LinkedIn, Twitter, Baidu, Google+, ...)
    • Knowledge graphs (RDF, OWL)
  – Graphs are intuitive and flexible
    • Easy to navigate, easy to form a path, natural to visualize

[Figure: example graph with vertices A, B, C, D, E, F]

Page 6:

Various Kinds of Graphs

• There are many kinds of graphs
  – Simple graph, Weighted graph, Vertex-labeled graph, Edge-labeled graph, Directed graph (digraph), Undirected graph, Hypergraph, ...
• Different application scenarios
  – Link-node graphs representing physical/logical networks used in transportation, utilities and telco (Oracle Spatial and Graph Network Data Model (NDM))
  – RDF Semantic Graphs modeling data as triples for social network, linked data and other semantic applications (Oracle Spatial and Graph)
  – Property Graphs allowing the association of K/V pairs (attributes) with vertices/edges for social network analytics (investigating)

Page 7:

RDF Semantic Graph

• Resource Description Framework
  – URIs are used to identify
    • Resources, entities, relationships, concepts
• Creates Subject-Property-Object “triples”
• Data identification is a must for integration
• URIs are globally unique
• Properties of subjects are triples
• RDF Graph defines semantics
• Standards defined by W3C & OGC
  – RDF, RDFS, OWL, SKOS
  – SPARQL, RDFa, RDB2RDF, GeoSPARQL
• Implementations
  – Oracle, IBM, Cray, Bigdata®
  – Franz, Ontotext, OpenLink, Jena, Sesame, ...

Page 8:

Property Graph

• A set of vertices (or nodes)
  – each vertex has a unique identifier (not globally unique).
  – each vertex has a set of in/out edges.
  – each vertex has a collection of key-value properties.
• A set of edges
  – each edge has a unique identifier (not globally unique).
  – each edge has a head/tail vertex.
  – each edge has a label denoting type of relationship between two vertices.
  – each edge has a collection of key-value properties.
• Blueprints Java APIs - open source, no standards
• Implementations
  • Neo4j, InfiniteGraph, Dex, Sail, MongoDB, ...
• A property graph can be modeled as an RDF Graph

(Oracle Spatial and Graph NDM is an example of a property graph optimized for consistent/homogeneous properties for edges)

https://github.com/tinkerpop/blueprints/wiki/Property-Graph-Model

Page 9:

Introduction to RDF Semantic Graph, a feature of Oracle Spatial and Graph

Page 10:

Semantic Technology Stack

• Core Technologies
  • URI: Uniform Resource Identifier
  • RDF: Resource Description Framework
  • RDFS: RDF Schema
  • OWL: Web Ontology Language

[Figure: Semantic Web layer cake, http://www.w3.org/2007/03/layerCake.svg]

Page 11:

What is RDF

• A graph data model for web resources and their relationships
• The graph can be serialized into
  - RDF/XML, N3, N-TRIPLE, ...
• Construction unit: Triple (or assertion, or fact)
  <http://foobar> <:produces> <:mp3>
• Quads (named graphs) add context, provenance, identification, etc. to assertions
  <http://foobar> <:produces> <:mp3> <:ProductGraph>

[Figure: example RDF graph. Each edge reads Subject (vertex/node), Predicate (property/edge), Object (vertex/node).
 Vertices: http://www.foobar.com, “CA”, http://www.foobar.com/products/mp3, http://www.oracle.com, http://www.oracle.com/products/RDF
 Edges: http://.../locatedIn, http://.../produce, http://.../customerOf, http://.../uses]
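The triple and quad notions on this slide can be sketched in a few lines of Python. This is only an illustration: the tuples stand in for a real triplestore, the shortened predicate names (`:produce`, `:locatedIn`, `:customerOf`) and the edge directions are my reading of the figure, and `objects` is a hypothetical helper, not part of any Oracle API.

```python
from collections import defaultdict

# A triple is (subject, predicate, object). Values are illustrative only.
triples = {
    ("http://www.foobar.com", ":locatedIn", "CA"),
    ("http://www.foobar.com", ":produce", "http://www.foobar.com/products/mp3"),
    ("http://www.foobar.com", ":customerOf", "http://www.oracle.com"),
    ("http://www.oracle.com", ":produce", "http://www.oracle.com/products/RDF"),
}


def objects(triples, subject, predicate):
    """Return every object o such that (subject, predicate, o) is asserted."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


# A quad is a triple plus the name of the graph that holds it.
quads = [(s, p, o, ":ProductGraph") for s, p, o in triples if p == ":produce"]

# Group triples by named graph, which is how quads add context to assertions.
by_graph = defaultdict(set)
for s, p, o, g in quads:
    by_graph[g].add((s, p, o))

foobar_products = objects(triples, "http://www.foobar.com", ":produce")
print(foobar_products)
print(len(by_graph[":ProductGraph"]))
```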

Page 12:

Basic Elements of RDF

• Instances
  E.g. :John, :MovieXYZ, :PurchaseOrder432
• Classes
  • Class represents a group/category/categorization of instances
  E.g. :John rdf:type :Student
• Properties
  • Linking data together
  E.g. :John :brother :Mary,
       :John :hasAge “33”^^xsd:integer.

Page 13:

An RDF Graph Linking Several Data Sources

Page 14:

Triples Are Easy. But Why?

• Graph modeling is flexible
  • Adding/removing a new edge or vertex is simple
  • Adding an edge is like adding a new column to a table but easier to do
• Standards-based graph representation
  • RDF was defined by W3C. Allows interoperability
• Computers can understand the semantics of RDF graphs (triples)
  • Same URI means same resource

[Figure: two RDF graphs in which the vertex http://www.oracle.com appears twice.
 Vertices: http://www.foobar.com, “CA”, http://www.foobar.com/products/mp3, http://www.oracle.com, http://www.oracle.com/products/RDF
 Edges: :locatedIn, :produce, :customerOf]

Page 15:

Triples Are Easy. But Why? (2)

• Computers can understand the semantics of RDF graphs (triples)
  • Same URI means same resource

[Figure: the same graph with a single http://www.oracle.com vertex.
 Vertices: http://www.foobar.com, “CA”, http://www.foobar.com/products/mp3, http://www.oracle.com, http://www.oracle.com/products/RDF
 Edges: :locatedIn, :produce, :customerOf]

Page 16:

Triples Are Easy. But Why? (3)

• Computers can understand the semantics of RDF graphs (triples)
  • Discover hidden relationships, or detect inconsistency via logical inference

[Figure: the same graph with an added rdf:type edge.
 Vertices: http://www.foobar.com, “CA”, http://www.foobar.com/products/mp3, http://www.oracle.com, http://www.oracle.com/products/RDF
 Edges: :locatedIn, :produce, :customerOf, rdf:type]

:customerOf rdfs:range :ServiceProvider
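A rough sketch of how the `rdfs:range` statement on this slide yields a hidden `rdf:type` edge. It assumes that http://www.foobar.com is the subject of `:customerOf` and http://www.oracle.com its object, which is my reading of the figure, and `infer_range_types` is a hypothetical helper that applies only this one RDFS rule.

```python
def infer_range_types(triples):
    """RDFS range rule: (p rdfs:range C) and (s p o) entail (o rdf:type C)."""
    ranges = [(s, o) for s, p, o in triples if p == "rdfs:range"]
    inferred = set()
    for prop, cls in ranges:
        for s, p, o in triples:
            if p == prop:
                inferred.add((o, "rdf:type", cls))
    # Report only triples that were not already asserted.
    return inferred - set(triples)


asserted = {
    # Assumed direction of the :customerOf edge in the figure.
    ("http://www.foobar.com", ":customerOf", "http://www.oracle.com"),
    (":customerOf", "rdfs:range", ":ServiceProvider"),
}

hidden = infer_range_types(asserted)
print(hidden)
```

A production reasoner would apply many such rules repeatedly until no new triples appear; this sketch runs a single rule once.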

Page 17:

Triples Are Easy. But Why? (4)

• Computers can understand the semantics of RDF graphs (triples)
  • Discover hidden relationships, or detect inconsistency via logical inference

[Figure: the same graph with two :hasOracleCSI edges pointing to “CID1234” and “CID987”, flagged with “!”.
 Vertices: http://www.foobar.com, “CA”, http://www.foobar.com/products/mp3, http://www.oracle.com
 Edges: :locatedIn, :produce, :customerOf, rdf:type, :hasOracleCSI]

:hasOracleCSI rdf:type owl:FunctionalProperty
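The inconsistency flagged on this slide can be illustrated as follows. A functional property allows at most one value per subject, so two distinct literals for `:hasOracleCSI` conflict. The subject (http://www.foobar.com) is an assumption, since the transcript does not preserve which vertex carries the two edges, and `functional_violations` is a hypothetical helper rather than an OWL reasoner.

```python
from collections import defaultdict


def functional_violations(triples):
    """Find (subject, property) pairs with more than one value for a functional property."""
    functional = {
        s for s, p, o in triples
        if p == "rdf:type" and o == "owl:FunctionalProperty"
    }
    values = defaultdict(set)
    for s, p, o in triples:
        if p in functional:
            values[(s, p)].add(o)
    return {key: sorted(vals) for key, vals in values.items() if len(vals) > 1}


triples = {
    (":hasOracleCSI", "rdf:type", "owl:FunctionalProperty"),
    # Assumed subject; the two literal values come from the slide.
    ("http://www.foobar.com", ":hasOracleCSI", "CID1234"),
    ("http://www.foobar.com", ":hasOracleCSI", "CID987"),
}

violations = functional_violations(triples)
print(violations)
```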

Page 18:

Many Graphs and Vocabularies on the Web

• DBPedia
• SIOC
• NCI
• SNOMED
• FOAF
• Geonames
• Wordnet
• Drug Bank
• ACM
• Daily Med
• Linked CT
• Eurostat
• Semantic XBRL
• US Census
• YAGO
• Cyc/Open Cyc
• PubMed
• Freebase
• CIA World Fact Book
• DBLP
• UniProt
• UniParc
• CiteSeer
• KEGG
• Data.gov.uk
• Music Brainz Data
• Semantic Tweet
• CO2 Emission
• Gene Ontology
• UniRef
• Smart Link
• Reactome
• Diseasome

And so much more!

Page 19:

RDF Graph Can Enrich Your Business Applications

• Flexible graph modeling adds agility
• Resource identification adds precision
• Integrate full breadth of enterprise content (structured, spatial, email, documents, web services)
• Reconcile differences in data semantics so that they can all “talk” and interoperate
• Resolve semantic discrepancies across databases, applications
• Create consolidated “single” views across business applications
• Model and implement common Business Processes

Page 20:

Graph Technology As an Evolution

Oracle Spatial and Graph works well with these important enterprise technologies
• Relational, XML, Spatial, Text, Security, Clustering, Compression, Data Guard, ...
  – Oracle’s RDF/OWL support is native to the Database
• Web Services, SOA, BPMN, Hadoop (MapReduce), ...
  – Support of popular Java APIs and standards-compliant Web Service endpoint
• Advanced Analytics
  – Support integration with OBIEE, Oracle Data Mining, Oracle R Enterprise
• A rich set of third party tools including
  – Ontology editing, knowledge management, Complete DL reasoners
  – Graph/network visualization
  – NLP, text processing
  – ...

Page 21:

RDF Graph Use Cases

Semantic Metadata Layer
• Unified content metadata for federated resources
• Validate semantic and structural consistency

Text Mining & Entity Analytics
• Find related content & relations by navigating connected entities
• “Reason” across entities

Social Media Analysis
• Analyze social relations using curated metadata
  - Blogs, wikis, tweets, video
  - Calendars, IM, voice

Page 22:

Industries Have Already Adopted the Concept
Sample of Oracle Spatial and Graph customers

Industries
• Life Sciences
• Finance
• Media
• Networks & Communications
• Defense & Intelligence
• Public Sector

[Customer logos: Thomson Reuters, Hutchison 3G Austria]

Page 23: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Allied Nation Intelligence ServiceOracle Spatial and Graph: Social Analysis

Objectives

� Profile suspects through telephone, email

and social network communications

� Produce “data products” for analysts

Solution

� Standards-based tools: W3C RDF & SPARQL

� Semantic tagging for 600 TB / 10b triples graph

� Top-secret , compartmented security for data

� New discovery on ~100 million triples / month

Benefits

23 Copyright © 2013, Oracle and/or its affiliates. All rights reserved.

Solution

� RDF Graph modeling of the social network:

people, groups and places of interest

� Inferencing & graph analytics discover

relationships among individuals & meaning

of pseudonyms, aliases, codes, terminology

� New discovery on ~100 million triples / month

� Find & label “same-as” relationships

Page 24:

Cisco WebEx Social
Oracle Spatial and Graph for Enterprise Collaboration

Objectives
• Social connectivity and collaboration through semantic enablement
• Connect knowledge silos

Solution
• Persistent unified graph metadata model
• Concepts tagged with unique meaning
• Find related content & groups by navigating connected entities, recommendations

Benefits
• Unifies metadata model - forum, blog, wiki, etc.
• Tagging media documents, pictures, blogs, etc. to user-defined and/or enterprise vocabularies.
• Validates tag semantic/structural consistency

Page 25:

Eli Lilly and Company
Oracle Spatial and Graph: RDF Graph Metadata Repository

Objectives
• Unified vocabulary for scientific investigation
• Easier, more complete investigations

Solution
• Integrate patient records, chemical structures, biological sequences & pathways, images, scientific papers, ...
• View related data as a graph
• Traverse graphs to discover relationships, search for a term, or browse ontologies

“[This technology...] provides improved insight into our business by bringing together related information from diverse data sources,”
J. Phil Brooks, Information Consultant, Eli Lilly and Company

Page 26:

Oracle Spatial and Graph Partners: Integrated Tools and Solution Providers

[Partner logos grouped by category: Ontology Engineering & Visualization; Open Source Frameworks; Standards; Reasoners; NLP Entity Extractors; Applications & Tools; SI / Consulting. Logo text preserved in the transcript: Sesame, Joseki]

Page 27:

Part 2: SPARQL and GeoSPARQL

Page 28:

What is SPARQL?

• SPARQL Protocol and RDF Query Language
  – W3C standard for querying and manipulating RDF content
  – Queries/updates and corresponding results are communicated via HTTP with a SPARQL endpoint
  – A SPARQL endpoint implements the SPARQL protocol and serves RDF data from an RDF triplestore or RDF view

Page 29:

What is SPARQL?

Components of SPARQL 1.1
• Query Language
• Update
• Protocol
• Service Description
• Query Results JSON Format
• Query Results CSV and TSV Format
• Query Results XML Format
• Federated Query
• Entailment Regimes
• Graph Store HTTP Protocol

Page 30:

What is SPARQL?

Components of SPARQL 1.1: Query Language
• A comprehensive query language for RDF
• Many useful constructs: optional patterns, aggregates, subqueries, negation, property paths, extensive function library, etc.

Page 31:

What is SPARQL?

Components of SPARQL 1.1: Update
• A comprehensive language for manipulating RDF graphs
• Allows you to create, update and remove RDF graphs

Page 32:

What is SPARQL?

Components of SPARQL 1.1: Protocol
• Defines a protocol for sending queries or updates to a SPARQL endpoint and returning the results via HTTP
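As a minimal sketch of the protocol's query-via-GET form, the snippet below builds (but does not send) an HTTP request carrying a SPARQL query in the `query` parameter. The endpoint URL is hypothetical and `build_query_request` is an illustrative helper; only Python's standard library is used.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request

ENDPOINT = "http://example.org/sparql"  # hypothetical endpoint URL


def build_query_request(endpoint, sparql):
    """Build a GET request with the query URL-encoded in the 'query' parameter."""
    url = endpoint + "?" + urlencode({"query": sparql})
    # Ask for the SPARQL Query Results JSON Format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


query = "SELECT ?s WHERE { ?s ?p ?o } LIMIT 5"
request = build_query_request(ENDPOINT, query)

print(request.get_method())
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; that step is left out here because the endpoint does not exist.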

Page 33:

What is SPARQL?

Components of SPARQL 1.1: Service Description
• Defines a mechanism and RDF vocabulary for describing the features supported by a SPARQL endpoint

Page 34:

What is SPARQL?

Components of SPARQL 1.1: Query Results JSON, CSV and TSV, and XML Formats
• Alternative formats used to serialize and exchange answers to SPARQL queries
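To make the JSON results format concrete, here is a small sketch that flattens a results document into plain rows. The response text is hand-written for illustration (it was not returned by any endpoint), and `rows` is a hypothetical helper.

```python
import json

# Hand-written example in the SPARQL 1.1 Query Results JSON layout.
response_text = """
{
  "head": {"vars": ["n", "b"]},
  "results": {"bindings": [
    {"n": {"type": "literal", "value": "John Adams"},
     "b": {"type": "literal", "value": "1767-07-11",
           "datatype": "http://www.w3.org/2001/XMLSchema#date"}},
    {"n": {"type": "literal", "value": "Samuel Adams"}}
  ]}
}
"""


def rows(text):
    """Return one dict per solution; variables left unbound map to None."""
    doc = json.loads(text)
    names = doc["head"]["vars"]
    return [
        {name: (binding[name]["value"] if name in binding else None) for name in names}
        for binding in doc["results"]["bindings"]
    ]


parsed = rows(response_text)
print(parsed)
```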

Page 35:

What is SPARQL?

Components of SPARQL 1.1: Federated Query
• SPARQL extension for executing queries distributed over different SPARQL endpoints

Page 36:

What is SPARQL?

Components of SPARQL 1.1: Entailment Regimes
• Extends SPARQL so that logically entailed RDF triples (hidden edges in RDF Graphs) are matched in addition to directly asserted RDF triples

Page 37:

What is SPARQL?

Components of SPARQL 1.1: Graph Store HTTP Protocol
• Simple alternative to SPARQL 1.1 Update that describes HTTP operations for managing a collection of RDF graphs outside of a SPARQL 1.1 graph store

Page 38:

SPARQL 1.1 Features by Example

Page 39:

SPARQL Graph Pattern
Basic unit of SPARQL queries

[Figure: RDF data graph and the query graph pattern matched against it]

Data:
  govtrack:A000041 rdf:type govtrack:Politician
  govtrack:A000041 foaf:name “John Adams”
  govtrack:A000041 foaf:gender “male”
  govtrack:A000041 vcard:BDAY “1767-07-11”^^xsd:date
  govtrack:A000045 rdf:type govtrack:Politician
  govtrack:A000045 foaf:name “Samuel Adams”
  govtrack:A000045 foaf:gender “male”
  govtrack:A000045 vcard:BDAY “1722-09-27”^^xsd:date

Pattern:
  ?p rdf:type ?t
  ?p foaf:name ?n
  ?p vcard:BDAY ?b
  ?p foaf:gender ?g

Result 1: {?t=govtrack:Politician, ?p=govtrack:A000041, ?n=“John Adams”, ?g=“male”, ?b=“1767-07-11”^^xsd:date}

Result 2: {?t=govtrack:Politician, ?p=govtrack:A000045, ?n=“Samuel Adams”, ?g=“male”, ?b=“1722-09-27”^^xsd:date}

Page 40:

SPARQL Graph Pattern
Basic unit of SPARQL queries

How do we express this with SPARQL?

[Figure: graph pattern ?p rdf:type ?t, ?p foaf:name ?n, ?p vcard:BDAY ?b, ?p foaf:gender ?g]

  PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
  PREFIX foaf: <http://xmlns.com/foaf/0.1/>
  PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>

  SELECT ?t ?n ?b ?g
  WHERE
  { ?p rdf:type ?t .
    ?p foaf:name ?n .
    ?p vcard:BDAY ?b .
    ?p foaf:gender ?g }

The group of triple patterns inside the braces is a Basic Graph Pattern (BGP).
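How a basic graph pattern produces solutions can be sketched with a naive matcher over the example data from the previous slide. This is an illustration of the matching idea only, not how a SPARQL engine is implemented; `match_bgp` is a hypothetical helper, and literals are simplified to plain strings without datatypes.

```python
def match_bgp(patterns, triples):
    """Return every variable binding under which all patterns occur in triples."""
    solutions = [{}]
    for pattern in patterns:
        extended = []
        for binding in solutions:
            for triple in triples:
                candidate = dict(binding)
                for term, value in zip(pattern, triple):
                    if term.startswith("?"):
                        # Bind the variable, or check it against its earlier value.
                        if candidate.setdefault(term, value) != value:
                            break
                    elif term != value:
                        break
                else:
                    extended.append(candidate)
        solutions = extended
    return solutions


triples = [
    ("govtrack:A000041", "rdf:type", "govtrack:Politician"),
    ("govtrack:A000041", "foaf:name", "John Adams"),
    ("govtrack:A000041", "vcard:BDAY", "1767-07-11"),
    ("govtrack:A000041", "foaf:gender", "male"),
    ("govtrack:A000045", "rdf:type", "govtrack:Politician"),
    ("govtrack:A000045", "foaf:name", "Samuel Adams"),
    ("govtrack:A000045", "vcard:BDAY", "1722-09-27"),
    ("govtrack:A000045", "foaf:gender", "male"),
]

bgp = [
    ("?p", "rdf:type", "?t"),
    ("?p", "foaf:name", "?n"),
    ("?p", "vcard:BDAY", "?b"),
    ("?p", "foaf:gender", "?g"),
]

solutions = match_bgp(bgp, triples)
for solution in solutions:
    print(solution)
```

Because `?p` is shared by all four patterns, each solution describes exactly one politician.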

Page 41:

SPARQL SELECT Modifiers

Find all distinct family names for senators

  PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
  PREFIX foaf: <http://xmlns.com/foaf/0.1/>
  PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>
  PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

  SELECT DISTINCT ?f
  WHERE
  { ?p vcard:N ?vn .
    ?vn vcard:Family ?f .
    ?p foaf:title "Sen." .
  }

Page 42:

SPARQL FILTER: Restricting Solutions

Find all people with family name Adams born before Independence Day

  PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
  PREFIX foaf: <http://xmlns.com/foaf/0.1/>
  PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>
  PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

  SELECT ?n ?b ?g
  WHERE
  { ?p vcard:N ?vn .
    ?vn vcard:Family ?f .
    ?p foaf:name ?n .
    ?p vcard:BDAY ?b .
    ?p foaf:gender ?g
    FILTER ( ?f = "Adams" &&
             ?b < "1776-07-04"^^xsd:date )}
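The effect of this FILTER can be mimicked on a list of already-matched solutions. In the sketch below the first two rows reuse names and birth dates from the earlier example data, with the family name assumed to be Adams; the last two rows are hypothetical and exist only to show rows being rejected.

```python
from datetime import date

# Solutions of the graph pattern before filtering.
solutions = [
    {"f": "Adams", "n": "John Adams", "b": date(1767, 7, 11), "g": "male"},
    {"f": "Adams", "n": "Samuel Adams", "b": date(1722, 9, 27), "g": "male"},
    {"f": "Adams", "n": "Example Adams", "b": date(1800, 1, 1), "g": "female"},  # hypothetical
    {"f": "Smith", "n": "Example Smith", "b": date(1750, 1, 1), "g": "male"},  # hypothetical
]

cutoff = date(1776, 7, 4)

# FILTER ( ?f = "Adams" && ?b < "1776-07-04"^^xsd:date )
kept = [s for s in solutions if s["f"] == "Adams" and s["b"] < cutoff]
names = [s["n"] for s in kept]
print(names)
```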

Page 43:

SPARQL 1.1 Built-in Functions
Extensive library of functions to use

• Basic: arithmetic, comparisons, boolean connectors
• RDF-related: isLiteral(), isURI(), isBlank(), datatype(), lang(), BOUND(), ...
• String Functions: SUBSTR(), STRSTARTS(), STRENDS(), REGEX(), ...
• Numerics: abs(), floor(), ceil(), ...
• Dates and Times: now(), year(), month(), day(), ...
• Miscellaneous: IN(), NOT IN(), IF(), COALESCE(), ...
• Constructors: xsd:int(), xsd:decimal(), xsd:dateTime(), ...
• ... plus user-defined

Page 44:

SPARQL UNION: Disjunction

Find vcard given name or foaf name for all Clintons

  SELECT *
  WHERE
  { ?p vcard:N ?vn .
    ?p vcard:BDAY ?b .
    ?vn vcard:Family ?f .
    { ?p foaf:name ?n }
    UNION
    { ?vn vcard:Given ?n }
    FILTER ( ?f = "Clinton" ) }

Page 45:

SPARQL OPTIONAL: Best Effort Match

Find all people with family name Smith and optionally their title and homepage

  SELECT ?t ?n ?b ?h
  WHERE
  { ?p vcard:N ?vn .
    ?vn vcard:Family ?f .
    ?p foaf:name ?n .
    ?p vcard:BDAY ?b .
    OPTIONAL { ?p foaf:title ?t }
    OPTIONAL { ?p foaf:homepage ?h }
    FILTER ( ?f = "Smith" )}
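OPTIONAL behaves like a left join: a solution is kept even when the optional part finds nothing. A minimal sketch of that behavior, with entirely hypothetical people, titles and homepages and an illustrative `left_join` helper:

```python
def left_join(solutions, optional_rows, key):
    """Extend each solution with matching optional rows, or keep it unchanged."""
    out = []
    for sol in solutions:
        matches = [row for row in optional_rows if row[key] == sol[key]]
        if matches:
            out.extend({**sol, **row} for row in matches)
        else:
            out.append(dict(sol))
    return out


# Hypothetical data: one person has a title, the other has a homepage.
people = [
    {"p": "ex:smith1", "n": "Example Smith One"},
    {"p": "ex:smith2", "n": "Example Smith Two"},
]
titles = [{"p": "ex:smith1", "t": "Rep."}]
homepages = [{"p": "ex:smith2", "h": "http://example.org/smith2"}]

# Two OPTIONAL clauses correspond to two successive left joins.
result = left_join(left_join(people, titles, "p"), homepages, "p")
for row in result:
    print(row)
```

Both people appear in the output; the missing title or homepage is simply left unbound.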

Page 46:

SPARQL OPTIONAL: Best Effort Match

Find all representatives and optionally their homepage if it is not a standard www.house.gov address

  SELECT ?n ?b ?h
  WHERE
  { ?p foaf:name ?n .
    ?p vcard:BDAY ?b .
    ?p foaf:title "Rep."
    OPTIONAL {?p foaf:homepage ?h
              FILTER (!STRSTARTS(STR(?h),"http://www.house.gov"))
    }}

Page 47:

SPARQL 1.1 Negation: MINUS

Find all Kennedys that do not have a homepage

  SELECT ?n ?b
  WHERE
  { ?p vcard:N ?vn .
    ?vn vcard:Family "Kennedy" .
    ?p foaf:name ?n .
    ?p vcard:BDAY ?b .
    MINUS { ?p foaf:homepage ?h }
  }

SPARQL 1.1 Negation: NOT EXISTS / EXISTS

Find all Kennedys that do not have a homepage

SELECT ?n ?b
WHERE
{ ?p vcard:N ?vn .
  ?vn vcard:Family "Kennedy" .
  ?p foaf:name ?n .
  ?p vcard:BDAY ?b .
  FILTER ( NOT EXISTS { ?p foaf:homepage ?h } )
}

SPARQL Solution Modifiers: ORDER BY

Order all representatives by ascending family name and descending homepage

SELECT ?gn ?fn ?h
WHERE
{ ?p vcard:N ?v .
  ?v vcard:Given ?gn .
  ?v vcard:Family ?fn .
  ?p foaf:title "Rep." .
  ?p foaf:homepage ?h
}
ORDER BY ASC(?fn) DESC(STR(?h))

SPARQL Solution Modifiers: LIMIT / OFFSET

Get information about representatives 11 through 20 in alphabetical order

SELECT ?gn ?fn ?h
WHERE
{ ?p vcard:N ?v .
  ?v vcard:Given ?gn .
  ?v vcard:Family ?fn .
  ?p foaf:title "Rep." .
  ?p foaf:homepage ?h
}
ORDER BY ASC(?fn)
LIMIT 10
OFFSET 10
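The LIMIT / OFFSET pair above acts like a slice over the ordered solution sequence: OFFSET 10 skips the first ten rows and LIMIT 10 keeps the next ten, which gives rows 11 through 20. Here is a small Python sketch of that arithmetic; the `page` helper and the made-up names are illustrative assumptions, not part of the original deck.

```python
def page(solutions, limit, offset):
    """Model ORDER BY + OFFSET + LIMIT: sort, skip `offset` rows, keep `limit`."""
    ordered = sorted(solutions)  # stands in for ORDER BY ASC(?fn)
    return ordered[offset:offset + limit]

# Hypothetical data: 25 made-up family names, Name01 .. Name25, in reverse order.
names = [f"Name{i:02d}" for i in range(25, 0, -1)]
rows = page(names, limit=10, offset=10)
print(rows[0], rows[-1])  # first and last row of the second page
```

Without ORDER BY the result order is not guaranteed, so paging with LIMIT and OFFSET is only predictable on an ordered result.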

SPARQL 1.1 SELECT Expressions

Generate a full title “<title> <FI> <family name>” for each member of congress

SELECT (CONCAT(?t," ",SUBSTR(?gn,1,1),". ",?fn) AS ?fullTitle)
WHERE
{ ?p vcard:N ?v .
  ?v vcard:Given ?gn .
  ?v vcard:Family ?fn .
  ?p foaf:title ?t .
}

SPARQL 1.1 Grouping and Aggregation

Find all distinct pairs of family name and title

SELECT ?fn ?t
WHERE
{ ?p vcard:N ?v .
  ?v vcard:Family ?fn .
  ?p foaf:title ?t .
}
GROUP BY ?fn ?t

SPARQL 1.1 Grouping and Aggregation

Find the top 10 bill sponsors

SELECT ?n (COUNT(*) AS ?cnt)
WHERE
{ ?s foaf:name ?n .
  ?b bill:sponsor ?s .
}
GROUP BY ?n
ORDER BY DESC(?cnt)
LIMIT 10

Available Aggregates: COUNT(), SUM(), MIN(), MAX(), AVG(), GROUP_CONCAT(), SAMPLE()

SPARQL 1.1 Grouping and Aggregation

Find members of congress who have sponsored very few bills

SELECT ?n (COUNT(*) AS ?cnt)
WHERE
{ ?s foaf:name ?n .
  ?b bill:sponsor ?s .
}
GROUP BY ?n
HAVING (COUNT(?b) < 5)
ORDER BY ?cnt ?n
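The GROUP BY / HAVING / ORDER BY pipeline above can be pictured as counting rows per group, dropping groups that fail the HAVING test, and sorting what is left. The following is a rough Python model; the sponsorship pairs and the `few_bills` helper are hypothetical.

```python
from collections import Counter

def few_bills(sponsorships, max_bills):
    """Model GROUP BY ?n / HAVING (COUNT(?b) < max_bills) / ORDER BY ?cnt ?n."""
    counts = Counter(name for name, _bill in sponsorships)
    kept = [(name, cnt) for name, cnt in counts.items() if cnt < max_bills]
    return sorted(kept, key=lambda row: (row[1], row[0]))

# Hypothetical (sponsor name, bill id) pairs; "Dee" sponsors six bills.
data = [("Ann", "b1"), ("Bob", "b2"), ("Bob", "b3"), ("Cy", "b4")]
data += [("Dee", f"d{i}") for i in range(6)]
result = few_bills(data, max_bills=5)
```

As in the SPARQL query, someone with no sponsored bills never forms a group, because the pattern `?b bill:sponsor ?s` must match at least once.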

SPARQL 1.1 Subqueries

Find information about politicians who sponsored more than 150 bills

SELECT DISTINCT ?n ?o ?p ?st ?cnt
WHERE
{ ?s foaf:name ?n .
  ?s pol:hasRole ?r .
  ?r pol:party ?p .
  ?r pol:forOffice ?o .
  ?o pol:represents ?st
  { SELECT ?s (COUNT(?b) AS ?cnt)
    WHERE { ?b bill:sponsor ?s }
    GROUP BY ?s
    HAVING (COUNT(?b) > 150)
  }
}
ORDER BY DESC(?cnt) ASC(?n)

SPARQL 1.1 Value Assignment: BIND

Find information about political offices of John McCain

SELECT ?n ?o ?p ?st
WHERE
{ BIND (CONCAT("John"," ","McCain") AS ?n)
  ?s foaf:name ?n .
  ?s pol:hasRole ?r .
  ?r pol:party ?p .
  ?r pol:forOffice ?o .
  ?o pol:represents ?st
}

SPARQL 1.1 Inline Data: VALUES

Find information for all Browns that are Republican and Smiths that are Democrat

SELECT ?gn ?fn ?o ?p ?st
WHERE
{ ?s vcard:N ?n .
  ?n vcard:Family ?fn .
  ?n vcard:Given ?gn .
  ?s pol:hasRole ?r .
  ?r pol:party ?p .
  ?r pol:forOffice ?o .
  ?o pol:represents ?st
  VALUES (?fn ?p) { ("Brown" "Republican")
                    ("Smith" "Democrat") }
}

SPARQL 1.1 Federated Query

What information about NH senators can be found from DBPedia?

SELECT ?n ?p ?o
WHERE
{ ?person foaf:name ?n .
  ?person pol:hasRole ?role .
  ?role pol:forOffice ?office .
  ?office pol:represents geo:nh
  SERVICE <http://dbpedia.org/sparql> {
    ?x foaf:name ?n .
    ?x ?p ?o }
}

SPARQL ASK Queries

Is there a Republican Kennedy?

ASK
WHERE
{ ?person vcard:N ?n .
  ?n vcard:Family "Kennedy" .
  ?person pol:hasRole ?role .
  ?role pol:party "Republican" .
}

SPARQL Construct Queries

Build a graph of names and political parties for all senators

CONSTRUCT { ?person foaf:name ?n .
            ?person pol:memberOf ?party }
WHERE
{ ?person foaf:name ?n .
  ?person pol:hasRole ?role .
  ?role pol:party ?party .
  ?person foaf:title "Sen." .
}

SPARQL Describe Queries

Describe all politicians with family name “Paul”

DESCRIBE ?person
WHERE
{ ?person vcard:N ?n .
  ?n vcard:Family "Paul" .
}

SPARQL 1.1 Property Paths

Enhanced path searching in SPARQL

• Uses regular expression style syntax to express path patterns over RDF properties
• Allows syntactic shortcuts for fixed length paths
• Allows searching arbitrary length paths
  – Uses connectivity semantics instead of path counting semantics
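Connectivity semantics means an arbitrary-length path such as `p+` only asks whether a node is reachable, so a node is reported once no matter how many distinct paths lead to it. Here is a minimal Python sketch of that idea as a breadth-first search; the class names and the `one_or_more` helper are made up for illustration.

```python
from collections import deque

def one_or_more(edges, start):
    """Nodes reachable from `start` via one or more edges (like `p+`).

    Each reachable node is reported once, however many paths lead to it.
    """
    seen = set()
    queue = deque(edges.get(start, ()))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        queue.extend(edges.get(node, ()))
    return seen

# Hypothetical rdfs:subClassOf edges; Senator reaches Person by two routes.
sub_class_of = {
    "Senator": ["Politician", "Legislator"],
    "Politician": ["Person"],
    "Legislator": ["Person"],
}
print(sorted(one_or_more(sub_class_of, "Senator")))
```

Under path counting semantics "Person" would be counted twice here; the set-based search returns it once.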

Property Path Constructs

Syntax Form                               Matches
iri                                       An IRI (path of length 1)
^elt                                      Reverse path (object to subject)
elt1 / elt2                               Sequence path of elt1 followed by elt2
elt1 | elt2                               Alternative path of elt1 or elt2
elt*                                      Path composed of zero or more repetitions of elt
elt+                                      Path composed of one or more repetitions of elt
elt?                                      Path composed of zero or one repetition of elt
!iri or !(iri_1|iri_2|…|iri_n)            A path of length 1 that is not one of iri_i
!^iri or !(^iri_1|^iri_2|…|^iri_n)        A path of length 1 that is not one of iri_i as reverse paths
!(iri_1|…|iri_j|^iri_(j+1)|…|^iri_n)      A path of length 1 that is not one of iri_i in the indicated direction
(elt)                                     Grouping used to control precedence

iri is an IRI
elt is a path element, which may itself be composed of other path constructs

SPARQL 1.1 Property Path

Find each politician’s classification and the district he or she represents

SELECT ?n ?dist ?sc
WHERE
{ ?person foaf:name ?n .
  ?person pol:hasRole/pol:forOffice/pol:represents ?dist .
  ?person rdf:type ?t .
  ?t rdfs:subClassOf+ ?sc .
}
LIMIT 100

Property Path Limitations

• For arbitrary length paths
  – Cannot return actual path itself
  – Cannot get length of found path
  – Difficult to place conditions on nodes along the path
• Length-limited path search is hard to express for longer lengths
• No shortest path function

SPARQL Named Graphs

The concept of an RDF Dataset

• An RDF Dataset is a collection of RDF graphs
  – Contains one default graph, which does not have a name
  – Contains zero or more named graphs, where each graph is identified by an IRI
• A SPARQL query is executed against an RDF Dataset
• FROM and FROM NAMED keywords are used to construct the RDF Dataset for a query
• The GRAPH keyword is used to control the active graph for different parts of a query

Constructing the RDF Dataset

Contents of RDF Triplestore

Graph Name    Triples
--            {t1,t2,t3}
<urn:g1>      {t4,t5}
<urn:g2>      {t6,t7}
<urn:g3>      {t8,t9}
<urn:g4>      {t10,t11}

SPARQL query with RDF Dataset specification

SELECT *
FROM <urn:g1>
FROM <urn:g3>
FROM NAMED <urn:g2>
FROM NAMED <urn:g3>
FROM NAMED <urn:g4>
WHERE { … }

Default Graph

{ t4, t5, t8, t9 }

Named Graphs

{ (<urn:g2>, { t6, t7 }),
  (<urn:g3>, { t8, t9 }),
  (<urn:g4>, { t10, t11 }) }
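The dataset construction on this slide can be sketched in a few lines: graphs named in FROM are merged into the query's default graph, and graphs named in FROM NAMED become its named graphs. The Python below is an illustrative model only. Triples are represented by their labels, `None` stands for the store's unnamed graph, and `build_dataset` is a hypothetical helper.

```python
def build_dataset(store, from_graphs, from_named):
    """Return (default_graph, named_graphs) for a query's dataset clauses."""
    default = set()
    for name in from_graphs:          # FROM: merge into the default graph
        default |= store[name]
    named = {name: set(store[name]) for name in from_named}  # FROM NAMED
    return default, named

# Triplestore contents from the slide; None is the unnamed graph.
store = {
    None: {"t1", "t2", "t3"},
    "urn:g1": {"t4", "t5"},
    "urn:g2": {"t6", "t7"},
    "urn:g3": {"t8", "t9"},
    "urn:g4": {"t10", "t11"},
}
default, named = build_dataset(
    store,
    from_graphs=["urn:g1", "urn:g3"],
    from_named=["urn:g2", "urn:g3", "urn:g4"],
)
```

In this model t1, t2 and t3 are not part of the query's dataset, since the store's unnamed graph is not listed in any FROM clause, which matches the default graph shown on the slide.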

Using the GRAPH Keyword

SPARQL query with RDF Dataset specification

SELECT *
FROM <urn:g1>
FROM <urn:g3>
FROM NAMED <urn:g2>
FROM NAMED <urn:g3>
FROM NAMED <urn:g4>
WHERE
{ BGP1
  GRAPH ?g { BGP2 }
  GRAPH <urn:g4> { BGP3 }
  GRAPH <urn:g1> { BGP4 }
}

Active Graph (BGP1): { <urn:g1> UNION <urn:g3> }
Active Graph (BGP2): { <urn:g2>, <urn:g3>, <urn:g4> }
Active Graph (BGP3): { <urn:g4> }
Active Graph (BGP4): { } (<urn:g1> is not one of the FROM NAMED graphs)

Within a GRAPH clause:
- BGP is executed against each active graph separately (e.g. BGP2 against g2, g3, g4).
- Subgraph match must occur within a single graph.
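The active-graph rules for GRAPH can be summarised as a lookup against the named graphs of the dataset. Here is a small Python sketch under that reading; `active_graphs` is a hypothetical helper, and `None` models a graph variable such as `?g`.

```python
def active_graphs(named, graph_term):
    """Names of the graphs a GRAPH clause evaluates its pattern against.

    `graph_term` is an IRI string, or None to model a variable such as ?g.
    """
    if graph_term is None:                       # GRAPH ?g { ... }
        return sorted(named)
    return [graph_term] if graph_term in named else []  # GRAPH <iri> { ... }

named = {"urn:g2", "urn:g3", "urn:g4"}  # from the FROM NAMED clauses
bgp2 = active_graphs(named, None)
bgp3 = active_graphs(named, "urn:g4")
bgp4 = active_graphs(named, "urn:g1")   # g1 is only part of the default graph
```

The pattern is matched against each returned graph on its own, so a match never combines triples from two different named graphs.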

SPARQL Named Graph Query

Find the number of bills sponsored by each politician in the 110th and 111th congress

SELECT ?n ?g (count(?b) as ?bcnt)
FROM usgov:people
FROM NAMED usgov:bills_110
FROM NAMED usgov:bills_111
WHERE
{ ?s foaf:name ?n
  GRAPH ?g { ?b bill:sponsor ?s }
}
GROUP BY ?n ?g
ORDER BY ?n ?g

SPARQL 1.1 Update

Capabilities of SPARQL Update

• Insert triples into an RDF Graph
• Delete triples from an RDF Graph
• Load an RDF Graph
• Clear an RDF Graph
• Create a new RDF Graph
• Drop an RDF Graph
• Copy, move or add the content of one RDF Graph to another
• Perform a group of update operations as a single action

SPARQL 1.1 Update: INSERT DATA

Insert simple triple data

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX my: <http://www.mydomain.com/>
INSERT DATA
{ my:person1 foaf:name "John Smith" .
  my:person1 foaf:knows my:person1 }

Insert simple named graph data

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX my: <http://www.mydomain.com/>
INSERT DATA
{ GRAPH my:g1 { my:person1 foaf:name "John Smith" .
                my:person1 foaf:knows my:person1 } }

SPARQL 1.1 Update: DELETE DATA

Delete triple data

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX my: <http://www.mydomain.com/>
DELETE DATA
{ my:person1 foaf:name "John Smith" .
  my:person1 foaf:knows my:person1 }

Delete named graph data

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX my: <http://www.mydomain.com/>
DELETE DATA
{ GRAPH my:g1 { my:person1 foaf:name "John Smith" .
                my:person1 foaf:knows my:person1 } }

SPARQL 1.1 Update: DELETE INSERT

Delete all foaf:worksFor “Oracle USA, Inc.” triples

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX my: <http://www.mydomain.com/>
DELETE { ?p foaf:worksFor "Oracle USA, Inc." }
WHERE { ?p foaf:worksFor "Oracle USA, Inc." }

Insert foaf:worksFor “Oracle America” triples for all “Oracle USA” employees

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX my: <http://www.mydomain.com/>
INSERT { ?p foaf:worksFor "Oracle America" }
WHERE { ?p foaf:worksFor "Oracle USA" }

SPARQL 1.1 Update: DELETE INSERT

Replace all foaf:worksFor “Oracle USA” triples with foaf:worksFor “Oracle America” triples

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX my: <http://www.mydomain.com/>
DELETE { ?p foaf:worksFor "Oracle USA" }
INSERT { ?p foaf:worksFor "Oracle America" }
WHERE { ?p foaf:worksFor "Oracle USA" }
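In a combined DELETE / INSERT, the WHERE pattern is evaluated once against the graph as it stood before the update, and the resulting bindings drive both templates, with deletions applied before insertions. Here is a minimal Python model over a set of triples; the subjects, the extra "Other Corp" triple and the `delete_insert` helper are assumptions made for illustration.

```python
def delete_insert(graph, old_employer, new_employer, predicate="foaf:worksFor"):
    """Model DELETE {...} INSERT {...} WHERE {...} on a set of triples."""
    # WHERE is evaluated once, against the graph as it was before the update.
    matches = [s for (s, p, o) in graph if p == predicate and o == old_employer]
    deletions = {(s, predicate, old_employer) for s in matches}
    insertions = {(s, predicate, new_employer) for s in matches}
    return (graph - deletions) | insertions   # deletes apply before inserts

g = {
    ("my:p1", "foaf:worksFor", "Oracle USA"),
    ("my:p2", "foaf:worksFor", "Other Corp"),  # hypothetical non-matching triple
}
updated = delete_insert(g, "Oracle USA", "Oracle America")
```

Running the DELETE and the INSERT as two separate requests would not give the same result, because after the delete there would be no “Oracle USA” triples left for the second WHERE to match.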

SPARQL 1.1 Update: LOAD

Load graph1 into my:g1

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX my: <http://www.mydomain.com/>
LOAD <http://www.graphs.com/graph1> INTO GRAPH my:g1

Load graph1 into default graph

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX my: <http://www.mydomain.com/>
LOAD <http://www.graphs.com/graph1>

SPARQL 1.1 Update: CLEAR

Clear my:g1

PREFIX my: <http://www.mydomain.com/>
CLEAR GRAPH my:g1

Clear all named graphs

PREFIX my: <http://www.mydomain.com/>
CLEAR NAMED

Clear default graph

PREFIX my: <http://www.mydomain.com/>
CLEAR DEFAULT

Clear all named graphs and default graph

PREFIX my: <http://www.mydomain.com/>
CLEAR ALL

SPARQL 1.1 Update: CREATE

Create graph my:g1

PREFIX my: <http://www.mydomain.com/>
CREATE GRAPH my:g1

SPARQL 1.1 Update: DROP

Drop my:g1

PREFIX my: <http://www.mydomain.com/>
DROP GRAPH my:g1

Drop all named graphs

PREFIX my: <http://www.mydomain.com/>
DROP NAMED

Drop default graph

PREFIX my: <http://www.mydomain.com/>
DROP DEFAULT

Drop all named graphs and default graph

PREFIX my: <http://www.mydomain.com/>
DROP ALL

SPARQL 1.1 Update: COPY

Replace contents of graph g2 with contents of graph g1

PREFIX my: <http://www.mydomain.com/>
COPY GRAPH my:g1 TO GRAPH my:g2

Replace contents of graph g2 with contents of default graph

PREFIX my: <http://www.mydomain.com/>
COPY DEFAULT TO GRAPH my:g2

SPARQL 1.1 Update: MOVE

Replace contents of graph g2 with contents of graph g1 and drop graph g1

PREFIX my: <http://www.mydomain.com/>
MOVE GRAPH my:g1 TO GRAPH my:g2

Replace contents of default graph with contents of graph g2 and drop graph g2

PREFIX my: <http://www.mydomain.com/>
MOVE my:g2 TO DEFAULT

SPARQL 1.1 Update: ADD

Append contents of graph g1 to graph g2

PREFIX my: <http://www.mydomain.com/>
ADD GRAPH my:g1 TO GRAPH my:g2

Append contents of graph g2 to default graph

PREFIX my: <http://www.mydomain.com/>
ADD my:g2 TO DEFAULT
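COPY, MOVE and ADD differ only in what happens to the destination's old triples and to the source graph. The following is a compact Python sketch of the three operations over a dictionary of graphs; the graph names, the triple labels and the helper functions are hypothetical, and error handling for missing graphs is left out.

```python
def copy_graph(store, src, dst):
    """COPY: dst ends up with exactly the triples of src; src is unchanged."""
    if src != dst:
        store[dst] = set(store.get(src, set()))

def move_graph(store, src, dst):
    """MOVE: like COPY, then the source graph is removed."""
    if src != dst:
        store[dst] = store.pop(src, set())

def add_graph(store, src, dst):
    """ADD: dst gains the triples of src; existing dst triples are kept."""
    store.setdefault(dst, set()).update(store.get(src, set()))

store = {"g1": {"a", "b"}, "g2": {"c"}}   # hypothetical graph names and triples
add_graph(store, "g1", "g2")     # g2 now holds a, b and c
after_add = set(store["g2"])
copy_graph(store, "g1", "g2")    # g2 now holds only a and b
after_copy = set(store["g2"])
move_graph(store, "g1", "g2")    # g2 holds a and b, g1 is gone
```

Only ADD preserves what was already in the destination; COPY and MOVE both overwrite it.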

Summary

Components of SPARQL 1.1

• Query Language
• Update
• Protocol
• Service Description
• Query Results JSON Format
• Query Results CSV and TSV Format
• Query Results XML Format
• Federated Query
• Entailment Regimes
• Graph Store HTTP Protocol

Open Geospatial Consortium (OGC) GeoSPARQL Specification

Linked Geo Data

• Many Linked Open Data (LOD) datasets have geospatial components
• Barriers to integration
  – Vendor-specific geometry support
  – Different vocabularies
    • W3C Basic Geo, GML XMLLiteral, Vendor-specific
  – Different spatial reference systems
    • WGS84 Lat-Long, British National Grid

Semantic GIS

• GIS applications with semantically complex thematic aspects
  – Logical reasoning to classify features
    • Land cover type, suitable farm land, etc.
  – Complex Geometries
    • Polygons and Multi-Polygons with 1000s of points
  – Complex Spatial Operations
    • Union, Intersection, Buffers, etc.

Example query: Find parcels with an area of at least 3 sq. miles that touch a local feeder road and are inside an area of suitable farm land.

Requirements for GeoSPARQL

• Provide a common target for implementers & users
  – Representation and query
• Work within SPARQL’s extensibility framework
• Simple enough for general users
  – Keep the common case simple (WGS 84 point data)
• Capable enough for GIS professionals
  – Multiple SRSs, complex geometries, complex operators
• Don’t re-invent the wheel!
  – ISO 19107 – Spatial Schema
  – ISO 13249 – SQL/MM
  – Simple Features
  – Well Known Text (WKT)
  – GML
  – KML
  – GeoJSON

Standardization Timeline

1. Form SWG (June 2010)
2. Release candidate standard (May 2011)
3. OAB vote on candidate standard (June 2011)
4. 30-day public comment period (July 2011)
5. Process comments and update document (Feb. 2012)
6. TC/PC vote (March 2012)
7. Standard Published (June 2012)

From SPARQL to GeoSPARQL

SPARQL Query

RDF Data

:res1 rdf:type :House .
:res1 :baths "2.5"^^xsd:decimal .
:res1 :bedrooms "3"^^xsd:decimal .

:res2 rdf:type :Condo .
:res2 :baths "2"^^xsd:decimal .
:res2 :bedrooms "2"^^xsd:decimal .

:res3 rdf:type :House .
:res3 :baths "1.5"^^xsd:decimal .
:res3 :bedrooms "3"^^xsd:decimal .

SPARQL Query

SELECT ?r ?ba ?br
WHERE { ?r rdf:type :House .
        ?r :baths ?ba .
        ?r :bedrooms ?br }

Result Bindings

?r    | ?ba   | ?br
===================
:res1 | "2.5" | "3"
:res3 | "1.5" | "3"

SPARQL Query

RDF Data

:res1 rdf:type :House .
:res1 :baths "2.5"^^xsd:decimal .
:res1 :bedrooms "3"^^xsd:decimal .

:res2 rdf:type :Condo .
:res2 :baths "2"^^xsd:decimal .
:res2 :bedrooms "2"^^xsd:decimal .

:res3 rdf:type :House .
:res3 :baths "1.5"^^xsd:decimal .
:res3 :bedrooms "3"^^xsd:decimal .

SPARQL Query

SELECT ?r ?ba ?br
WHERE { ?r rdf:type :House .
        ?r :baths ?ba .
        ?r :bedrooms ?br
        FILTER (?ba > 2) }

Result Bindings

?r    | ?ba   | ?br
===================
:res1 | "2.5" | "3"
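The effect of the FILTER can be checked by hand against the three residences on the slide. The following is a small Python sketch of that check, using `Decimal` as a stand-in for xsd:decimal; the tuple layout and the helper name are illustrative choices, not part of the deck.

```python
from decimal import Decimal

# The three residences from the slide: (resource, type, baths, bedrooms).
residences = [
    (":res1", ":House", Decimal("2.5"), Decimal("3")),
    (":res2", ":Condo", Decimal("2"), Decimal("2")),
    (":res3", ":House", Decimal("1.5"), Decimal("3")),
]

def houses_with_more_baths_than(rows, threshold):
    """Match ?r rdf:type :House, then apply FILTER (?ba > threshold)."""
    return [(r, ba, br) for (r, kind, ba, br) in rows
            if kind == ":House" and ba > threshold]

bindings = houses_with_more_baths_than(residences, 2)
```

Only `:res1` passes both tests: `:res2` is not a house, and `:res3` has 1.5 baths, which fails the comparison.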

Spatial SPARQL Query

This is what GeoSPARQL standardizes: Vocabulary & Datatypes (used in the data) and Extension Functions (used in the query).

Spatial RDF Data

:res1 rdf:type :House .
:res1 :baths "2.5"^^xsd:decimal .
:res1 :bedrooms "3"^^xsd:decimal .
:res1 ogc:hasGeometry :geom1 .
:geom1 ogc:asWKT "POINT(-122.25 37.46)"^^ogc:wktLiteral .

:res3 rdf:type :House .
:res3 :baths "1.5"^^xsd:decimal .
:res3 :bedrooms "3"^^xsd:decimal .
:res3 ogc:hasGeometry :geom3 .
:geom3 ogc:asWKT "POINT(-122.24 37.47)"^^ogc:wktLiteral .

GeoSPARQL Query

Find houses within a search polygon

SELECT ?r ?ba ?br
WHERE { ?r rdf:type :House .
        ?r :baths ?ba .
        ?r :bedrooms ?br .
        ?r ogc:hasGeometry ?g .
        ?g ogc:asWKT ?wkt
        FILTER(ogcf:sfWithin(?wkt, "POLYGON(…)"^^ogc:wktLiteral)) }

GeoSPARQL Vocabulary: Basic Classes and Relations

ogc:SpatialObject
  ogc:Feature (same as ISO GFI_Feature)
  ogc:Geometry (same as ISO GM_Object)

Relations from ogc:Feature to ogc:Geometry
  ogc:hasGeometry (0 .. *)
  ogc:hasDefaultGeometry (0 .. 1)

Properties of ogc:Geometry
  metadata
    ogc:dimension : xsd:int
    ogc:coordinateDimension : xsd:int
    ogc:spatialDimension : xsd:int
    ogc:isEmpty : xsd:boolean
    ogc:isSimple : xsd:boolean
  serializations (geometry encoded as a Literal)
    ogc:asWKT : ogc:wktLiteral
    ogc:asGML : ogc:gmlLiteral
    …

Details of ogc:wktLiteral

All RDFS Literals of type ogc:wktLiteral shall consist of an optional IRI identifying the spatial reference system followed by Simple Features Well Known Text (WKT) describing a geometric value [ISO 19125-1].

"<http://www.opengis.net/def/crs/OGC/1.3/CRS84> POINT(-122.4192 37.7793)"^^ogc:wktLiteral

European Petroleum Survey Group (EPSG) maintains a set of CRS identifiers.

WGS84 longitude-latitude is the default CRS, so the IRI can be omitted:

"POINT(-122.4192 37.7793)"^^ogc:wktLiteral

Topological Relations between ogc:SpatialObject

[Figure: diagrams of two regions A and B illustrating each relation]

ogc:sfEquals, ogc:sfTouches, ogc:sfOverlaps, ogc:sfContains
ogc:sfWithin, ogc:sfDisjoint, ogc:sfIntersects, ogc:sfCrosses

• Assumes Simple Features Relation Family
• Also supports Egenhofer and RCC8

Example Data

Meta Information

:City rdfs:subClassOf ogc:Feature .
:Park rdfs:subClassOf ogc:Feature .
:exactGeometry rdfs:subPropertyOf ogc:hasGeometry .

Non-spatial Properties

:SanFrancisco rdf:type :City .
:UnionSquarePark rdf:type :Park .
:UnionSquarePark :commissioned "1847-01-01"^^xsd:date .

Spatial Properties

:UnionSquarePark :exactGeometry :geo1 .
:geo1 ogc:asWKT "Polygon((…))"^^ogc:wktLiteral .
:SanFrancisco :exactGeometry :geo2 .
:geo2 ogc:asWKT "Polygon((…))"^^ogc:wktLiteral .
:UnionSquarePark ogc:sfWithin :SanFrancisco .

Page 101: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

GeoSPARQL Query Functions

– ogcf:distance(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral, units: xsd:anyURI): xsd:double

– ogcf:buffer(geom: ogc:wktLiteral, radius: xsd:double, units: xsd:anyURI): ogc:wktLiteral

– ogcf:convexHull(geom: ogc:wktLiteral): ogc:wktLiteral

Page 102: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

GeoSPARQL Query Functions

– ogcf:intersection(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): ogc:wktLiteral

– ogcf:union(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): ogc:wktLiteral

Page 103: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

GeoSPARQL Query Functions

– ogcf:difference(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): ogc:wktLiteral

– ogcf:symDifference(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): ogc:wktLiteral

Page 104: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

GeoSPARQL Query Functions

– ogcf:envelope(geom: ogc:wktLiteral): ogc:wktLiteral

– ogcf:boundary(geom1: ogc:wktLiteral): ogc:wktLiteral

– ogcf:getSRID(geom: ogc:wktLiteral): xsd:anyURI

Page 105: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

GeoSPARQL Topological Query Functions

– ogcf:sfEquals(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): xsd:boolean

– ogcf:sfDisjoint(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): xsd:boolean

– ogcf:sfIntersects(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): xsd:boolean

– ogcf:sfTouches(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): xsd:boolean

– ogcf:sfCrosses(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): xsd:boolean


– ogcf:sfWithin(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): xsd:boolean

– ogcf:sfContains(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): xsd:boolean

– ogcf:sfOverlaps(geom1: ogc:wktLiteral, geom2: ogc:wktLiteral): xsd:boolean

Assumes Simple Features Relation Family

Page 106: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Example Query

Find all land parcels that are within the intersection of :City1 and :District1

PREFIX : <http://my.com/appSchema#>
PREFIX ogc: <http://www.opengis.net/ont/geosparql#>
PREFIX ogcf: <http://www.opengis.net/def/geosparql/functions/>
PREFIX epsg: <http://www.opengis.net/def/crs/EPSG/0/>

SELECT ?parcel
WHERE { ?parcel rdf:type :Residential .
        ?parcel :exactGeometry ?pGeo .
        ?pGeo ogc:asWKT ?pWKT .

        :District1 :exactGeometry ?dGeo .
        ?dGeo ogc:asWKT ?dWKT .

        :City1 :extent ?cGeo .
        ?cGeo ogc:asWKT ?cWKT .
        FILTER(ogcf:sfWithin(?pWKT, ogcf:intersection(?dWKT, ?cWKT))) }
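To make the FILTER logic concrete, here is a minimal Python sketch of the same two steps: intersect the district and city geometries, then keep the parcels that lie within the result. It is a deliberate simplification that assumes every geometry is an axis-aligned rectangle given as (xmin, ymin, xmax, ymax); real ogcf:intersection and ogcf:sfWithin operate on arbitrary geometries with Simple Features semantics. All names and sample coordinates are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)

def intersection(a: Box, b: Box) -> Optional[Box]:
    """Rectangle shared by a and b, or None when they do not overlap."""
    xmin, ymin = max(a[0], b[0]), max(a[1], b[1])
    xmax, ymax = min(a[2], b[2]), min(a[3], b[3])
    if xmin > xmax or ymin > ymax:
        return None
    return (xmin, ymin, xmax, ymax)

def within(inner: Box, outer: Optional[Box]) -> bool:
    """True when inner lies entirely inside outer (bounds compared inclusively)."""
    if outer is None:
        return False
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])

def parcels_in_overlap(parcels: Dict[str, Box], district: Box, city: Box) -> List[str]:
    """Names of parcels within the intersection of district and city."""
    overlap = intersection(district, city)
    return sorted(name for name, box in parcels.items() if within(box, overlap))
```

The intersection is computed once and reused for every parcel, which mirrors how the nested function call in the FILTER depends only on ?dWKT and ?cWKT.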

Page 107: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Summary

• GeoSPARQL defines:

– Basic vocabulary, Query functions, Entailment component

• Based on existing OGC/ISO standards

– WKT, GML, Simple Features, ISO 19107

• Uses SPARQL’s built-in extensibility framework

• Modular specification

– Allows flexibility in implementations

– Easy to extend

Page 108: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Part 3: Semantics


Page 109: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

RDF Schema (RDFS)

Derives implicit relationships using inference

• Core language constructs

• rdfs:subClassOf

:A rdfs:subClassOf :B → an instance of :A is also an instance of :B

• rdfs:subPropertyOf (property transfer)

:p1 rdfs:subPropertyOf :p2, :a :p1 :b → :a :p2 :b

:firstAuthor rdfs:subPropertyOf :Author

skos:prefLabel rdfs:subPropertyOf rdfs:label

• rdfs:domain and rdfs:range (specify how a property can be used)

:p1 rdfs:domain :D, :a :p1 :b → :a rdf:type :D

:p2 rdfs:range :R, :a :p2 :b → :b rdf:type :R

E.g. :performSurgeryOn rdfs:domain :Surgeon

:performSurgeryOn rdfs:range :Patient

• rdfs:label, seeAlso, isDefinedBy, …

:Jack rdfs:seeAlso http://…/Jack_Blog
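The subPropertyOf, domain, and range rules above can be sketched in a few lines of Python. This is only an illustration over an in-memory set of (subject, predicate, object) tuples, not how any particular triple store implements them; the function name `rdfs_infer` and the individuals :Ann, :Bob, and :paper1 used below are hypothetical.

```python
SUBPROP = "rdfs:subPropertyOf"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"
TYPE = "rdf:type"

def rdfs_infer(triples):
    """Apply the subPropertyOf, domain, and range rules until no new triple appears."""
    closed = set(triples)
    while True:
        new = set()
        for (s, p, o) in closed:
            # Look for schema statements about the predicate p of this triple.
            for (prop, relation, target) in closed:
                if prop != p:
                    continue
                if relation == SUBPROP:
                    new.add((s, target, o))      # :a :p1 :b -> :a :p2 :b
                elif relation == DOMAIN:
                    new.add((s, TYPE, target))   # subject gets the domain class
                elif relation == RANGE:
                    new.add((o, TYPE, target))   # object gets the range class
        if new <= closed:
            return closed
        closed |= new
```

For example, from `:Ann :performSurgeryOn :Bob` plus the domain and range statements above, the sketch derives `:Ann rdf:type :Surgeon` and `:Bob rdf:type :Patient`.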

Page 110: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Web Ontology Language (OWL)

Derives implicit relationships using inference

• More expressive compared to RDFS

• Property related constructs

• owl:inverseOf

E.g. :write owl:inverseOf :authoredBy

• owl:SymmetricProperty

:relatedTo rdf:type owl:SymmetricProperty

foaf:knows is not defined as a symmetric property!

• owl:TransitiveProperty

:partOf rdf:type owl:TransitiveProperty .

skos:broaderTransitive rdf:type owl:TransitiveProperty

• owl:equivalentProperty

• owl:FunctionalProperty

:hasBirthMother rdf:type owl:FunctionalProperty

• owl:InverseFunctionalProperty

foaf:mbox rdf:type owl:InverseFunctionalProperty

• Instances (owl:sameAs, owl:differentFrom)

Page 111: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Web Ontology Language (OWL)

• Class related constructs

• owl:equivalentClass

• owl:disjointWith

:Boys owl:disjointWith :Girls

• owl:complementOf

:Boys owl:complementOf :Non_Boys

• owl:unionOf, owl:intersectionOf, owl:oneOf

• owl:Restriction is used to define a class whose members have certain restrictions w.r.t. a property

• owl:someValuesFrom

• owl:allValuesFrom

• owl:hasValue

Page 112: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Web Ontology Language (OWL)

owl:someValuesFrom

• :ApprovedPurchaseOrder owl:equivalentClass
[ a owl:Restriction ;
  owl:onProperty :approvedBy ;
  owl:someValuesFrom :Manager ]

:PO1 :approvedBy :managerXyz

:managerXyz rdf:type :Manager

→ :PO1 rdf:type :ApprovedPurchaseOrder

Page 113: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Web Ontology Language (OWL)

owl:allValuesFrom

• :Vegetarian rdfs:subClassOf
[ a owl:Restriction ;
  owl:onProperty :eats ;
  owl:allValuesFrom :VegetarianFood ]

:Jen rdf:type :Vegetarian .

:Jen :eats :Marzipan .

→ :Marzipan rdf:type :VegetarianFood .

We should not use :eats rdfs:range :VegetarianFood, because a range applies to every use of :eats and would classify whatever anyone eats as :VegetarianFood.

Page 114: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Web Ontology Language (OWL)

owl:hasValue

• :HighPriorityItem owl:equivalentClass
[ a owl:Restriction ;
  owl:onProperty :hasPriority ;
  owl:hasValue :High ]

:Item1 rdf:type :HighPriorityItem .

→ :Item1 :hasPriority :High

Page 115: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Web Ontology Language (OWL)

• Class related constructs

• Cardinality restrictions constrain the number of distinct individuals that can be associated with a class instance via a particular property

• owl:minCardinality

• owl:maxCardinality

• owl:cardinality

E.g. To express that a basketball game has at least 2 players

:BasketBallGame rdfs:subClassOf
[ a owl:Restriction ;
  owl:onProperty :hasPlayer ;
  owl:minCardinality 2 ]

• Others

• DatatypeProperty, AnnotationProperty, OntologyProperty, …

Page 116: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

OWL 2

• More language constructs, better expressivity

• Property Chains, Keys, Punning, etc.

• http://www.w3.org/TR/owl2-new-features/

• OWL 2 Profiles


• OWL 2 RL

• OWL 2 EL

• OWL 2 QL

Page 117: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

OWL 2 RL

• Syntactic subset of OWL 2

– W3C standard profile

– Inspired by DLP, pD*

– Has more than 70 entailment rules

– Reasoning/conjunctive query answering is PTIME w.r.t. data/taxonomy complexity

– Defines a standard set of rules for implementation

• Oracle Spatial and Graph provides full support for the OWL 2 RL/RDF ruleset

Page 118: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Example OWL2 RL Entailment Rules

• OWL2RL has 70+ entailment rules.

– E.g. rule:

T(?p, owl:propertyChainAxiom, ?x)
LIST[?x, ?p1, ..., ?pn]
T(?u1, ?p1, ?u2)
T(?u2, ?p2, ?u3)
...
T(?un, ?pn, ?un+1) → T(?u1, ?p, ?un+1)

T(?p, rdf:type, owl:FunctionalProperty)
T(?x, ?p, ?y1)
T(?x, ?p, ?y2) → T(?y1, owl:sameAs, ?y2)

• These rules have efficient implementations in Oracle Spatial and Graph
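The two rules can be read operationally. The Python sketch below applies each of them once to an in-memory set of triples. It is an illustration only and says nothing about how Oracle implements the rules. The property chain is passed in directly as a list rather than read from an RDF list, reflexive owl:sameAs results are skipped, and the function names and the :hasUncle example are hypothetical.

```python
def functional_sameas(triples):
    """Functional-property rule: two values of a functional property are owl:sameAs."""
    functional = {s for (s, p, o) in triples
                  if p == "rdf:type" and o == "owl:FunctionalProperty"}
    inferred = set()
    for (x, p, y1) in triples:
        if p not in functional:
            continue
        for (x2, p2, y2) in triples:
            if x2 == x and p2 == p and y1 != y2:
                inferred.add((y1, "owl:sameAs", y2))
    return inferred

def chain_infer(triples, prop, chain):
    """Property-chain rule: follow chain p1..pn and emit (u1, prop, un+1).

    chain must be a non-empty list of predicates.
    """
    by_pred = {}
    for (s, p, o) in triples:
        by_pred.setdefault(p, []).append((s, o))
    # pairs holds (start node, node reached after the steps taken so far)
    pairs = set(by_pred.get(chain[0], []))
    for pred in chain[1:]:
        step = by_pred.get(pred, [])
        pairs = {(start, o) for (start, mid) in pairs for (s, o) in step if s == mid}
    return {(start, prop, end) for (start, end) in pairs}
```

For instance, with the chain [:hasParent, :hasBrother] for :hasUncle, the triples `:a :hasParent :b` and `:b :hasBrother :c` yield `:a :hasUncle :c`.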

Page 119: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

OWL 2 EL

• SNOMED-CT is a major application of OWL 2 EL

– Suitable for applications employing ontologies that define very large numbers of classes and/or properties

– One of the largest commercial biomedical ontologies

• Example rule for EL+ inference

?A rdfs:subClassOf ?A1
...
?A rdfs:subClassOf ?An
T(?C, owl:intersectionOf, ?x)
LIST[?x, ?A1, ..., ?An] → ?A rdfs:subClassOf ?C

• Oracle Spatial and Graph provides full support for OWL 2 EL

Page 120: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Part 4: RDF View on Relational Data (RDB2RDF)


Page 121: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

A Look at Some System Architectures

Typical Traditional Application Architecture

[Diagram: Application 1, Application 2, and Application 3 on a Mid-Tier Server send SQL to a Database Server holding the HR, Inventory, and Sales databases, each with its own schema]

• Mid-tier server and database server

• Applications communicate via SQL/JDBC with RDBMS backend

• Multiple traditional relational schemas

• Issues

– Inflexible schema

– Limited semantics

– Limited interoperability

Page 122: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

A Look at Some System Architectures

Typical Semantic Application Architecture

[Diagram: Application 1, Application 2, and Application 3 on a Mid-Tier Server send SPARQL to a Triplestore holding the HR, Sales, and Inventory graphs together with Shared Ontologies]

• Common ontologies used to integrate datasets

• Applications communicate via SPARQL / HTTP with native triplestore backend

• Issues

– Radical change for customer

– Need to ETL to RDF

– RDF/OWL may not be necessary for all of the data

Page 123: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Relational Data to RDF (RDB2RDF)

W3C Specification

The mission of the RDB2RDF Working Group, part of the Semantic Web Activity, is to standardize languages for mapping relational data and relational database schemas into RDF and OWL*.

The two languages are:

• Direct Mapping

– Automatically generates a mapping based on an input relational schema

• R2RML

– Language for expressing customized mappings

* http://www.w3.org/2001/sw/rdb2rdf/
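To give a feel for what "automatically generates a mapping" means, the sketch below turns one table row into triples in the spirit of the W3C Direct Mapping: the row IRI is built from the table name and primary key, the table becomes the class, and each column becomes a predicate. It is a simplified illustration that omits foreign-key reference triples, percent-encoding, and datatypes. The base IRI and the function names are hypothetical.

```python
def to_literal(value):
    """Render numbers bare and everything else as a quoted string (datatypes omitted)."""
    if isinstance(value, (int, float)):
        return str(value)
    return '"' + str(value) + '"'

def direct_map_row(base, table, pk_cols, row):
    """Return (subject, predicate, object) triples for one row of a table."""
    key = ";".join(f"{col}={row[col]}" for col in pk_cols)
    subject = f"<{base}{table}/{key}>"
    triples = [(subject, "rdf:type", f"<{base}{table}>")]
    for col, value in row.items():
        if value is None:
            continue  # a NULL column produces no triple
        triples.append((subject, f"<{base}{table}#{col}>", to_literal(value)))
    return triples
```

Calling `direct_map_row("http://example.com/base/", "DEPT", ["DNO"], {"DNO": 100, "DNAME": "Sales", "LOC": "NYC"})` produces one rdf:type triple and one triple per column, with no mapping document written by hand. R2RML exists for the cases where these generated IRIs and predicates are not the vocabulary you want.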

Page 124: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

A Look at Some System Architectures

How about RDB2RDF?

[Diagram: Application 1, Application 2, and Application 3 on a Mid-Tier Server use SQL and SPARQL with Shared Ontologies; in the Database Server, RDB2RDF exposes an Inventory Graph and a Sales Graph over the HR, Inventory, and Sales databases and their schemas]

• Use virtual RDF data

• Benefits

– Existing relational data stays in place and corresponding applications do not need to change

– Use of virtual mapping eliminates synchronization issues

– Common vocabulary helps with data integration issues

Page 125: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Using R2RML: Overall Flow

[Diagram components: R2RML Map Author; R2RML Document (Map: Classes and Predicates → DB Objects); Source Relational Database; R2RML Processor; Schema: Classes and Predicates; Query Writer; SPARQL Query; SPARQL to SQL Translator]

Page 126: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

R2RML Basics: Mapping

[Diagram: rows of Logical Table A and Logical Table B, linked by logical constraints, each mapped to a subject, a class, and predicates]

• TriplesMap: Row of a Logical Table (Table / View / SQL query) → triples

• SubjectMap: Primary key value of a Row → subject (+ static class)

• PredicateMap: Names of columns and constraints → predicates (incl. rdf:type)

• ObjectMap: Values in columns or foreign keys → objects (+ dynamic class)

Page 127: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Vocabulary: R2RML Classes and Relations

[Diagram: R2RML class diagram annotated with relation names: rr:logicalTable, rr:subjectMap (rr:subject), rr:predicateObjectMap, rr:predicateMap (rr:predicate), rr:objectMap (rr:object), rr:graphMap (rr:graph), rr:parentTriplesMap, rr:joinCondition]

Source (annotated with relation names): R2RML: RDB to RDF Mapping Language, W3C Candidate Recommendation 23 February 2012, http://www.w3.org/TR/2012/CR-r2rml-20120223/

Page 128: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Mapping EMP and DEPT Tables to RDF

EMP (constraints: ENO pkey; DNO ref. DEPT)

ENO   ENAME   EXPERTISE   DNO
1     John    DB          100

DEPT (constraints: DNO pkey)

DNO   DNAME   LOC
100   Sales   NYC

Subject: http://x.com/Dept/{DNO}

class: x:Department

Predicate-object pairs:

<http://x.com/Dept/100>
rdf:type x:Department ;
<../Dept/Deptno> 100 ;
<../Dept/DeptName> "Sales" ;
<../Dept/Location> "NYC"
.

Page 129: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Mapping EMP and DEPT Tables to RDF

EMP (constraints: ENO pkey; DNO ref. DEPT)

ENO   ENAME   EXPERTISE   DNO
1     John    DB          100

DEPT (constraints: DNO pkey)

DNO   DNAME   LOC
100   Sales   NYC

Subject: http://x.com/Emp/{ENO}

class: x:Employee

Predicate-object pairs:

<http://x.com/Emp/1>
rdf:type x:Employee ;
<../Emp/Empno> 1 ;
<../Emp/EmpName> "John" ;
<../Emp/Expertise> "DB" ;
<../Emp/DeptNum> 100 ;
<../Emp/Department> <http://x.com/Dept/100>
.

Page 130: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Schema for Generated RDF data

• Classes

– x:Department

– x:Employee

• Properties

– <../Dept/Deptno>

– <../Dept/DeptName>

– <../Dept/Location>

– <../Emp/Empno>

– <../Emp/EmpName>

– <../Emp/Expertise>

– <../Emp/DeptNum>

– <../Emp/Department>

Page 131: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

R2RML Mapping for EMP (w/ multi-ObjectMap)

EMP columns: Empno NUMBER, Ename Varchar, W_phone NUMBER, C_phone Varchar, H_phone Varchar, W_addr Varchar, H_addr Varchar, DeptNo NUMBER

Tmap → <#EmpTM>
  rr:logicalTable [
    rr:tableName "EMP"
  ].

Smap → []
  rr:template "http://ex.org/E/{EMPNO}" ;
  rr:class ex:Employee .

POmap → []
  rr:predicate em:phone ;
  rr:objectMap
    [ rr:column "W_PHONE" ]
    , [ rr:column "C_PHONE" ]
    , [ rr:column "H_PHONE" ] .

POmap → []
  rr:predicate em:dept ;
  rr:objectMap [ rr:parentTriplesMap <#DeptTM> ;
    rr:joinCondition [ rr:child "DEPTNO" ; rr:parent "DEPTNO" ]].

RefObjectMap: the rr:objectMap with rr:parentTriplesMap in the last POmap

Page 132: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

R2RML Mapping: using “R2RML Views”

<#DeptTableView> rr:sqlQuery """
SELECT DEPTNO, DNAME, LOC
, (SELECT COUNT(*) FROM EMP WHERE EMP.DEPTNO=DEPT.DEPTNO) AS STAFF
FROM DEPT """.

<#TriplesMap1>
rr:logicalTable <#DeptTableView>;
rr:subjectMap [ rr:template "http://data.example.com/department/{DEPTNO}";
rr:class ex:Department; ];
rr:predicateObjectMap [ rr:predicate ex:name; rr:objectMap [ rr:column "DNAME" ]; ];
rr:predicateObjectMap [ rr:predicate ex:location; rr:objectMap [ rr:column "LOC" ]; ];
rr:predicateObjectMap [ rr:predicate ex:staff; rr:objectMap [ rr:column "STAFF" ]; ].

Page 133: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

R2RML Mapping: Translating type codes to IRIs

R2RML Mapping

<#TriplesMap1>
rr:logicalTable [ rr:sqlQuery """
SELECT EMP.*, (CASE JOB WHEN 'CLERK' THEN 'general-office'
WHEN 'NIGHTGUARD' THEN 'security'
WHEN 'ENGINEER' THEN 'engineering'
END) ROLE
FROM EMP
""" ];
rr:subjectMap [ rr:template "http://data.example.com/employee/{EMPNO}"; ];
rr:predicateObjectMap [
rr:predicate ex:role;
rr:objectMap [ rr:template "http://data.example.com/roles/{ROLE}" ]; ].

Generated RDF triple

<http://data.example.com/employee/7369> ex:role <http://data.example.com/roles/general-office> .
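The following Python sketch mimics what an R2RML processor does with this mapping for a single row: compute the ROLE column as the CASE expression would, then fill the subject and object templates. It is an illustration under assumptions, not a processor. It assumes that employee 7369 has JOB = 'CLERK', skips the percent-encoding that R2RML applies to template values, and uses hypothetical function names.

```python
import re

# Same translation as the CASE expression in the rr:sqlQuery above.
ROLE_BY_JOB = {
    "CLERK": "general-office",
    "NIGHTGUARD": "security",
    "ENGINEER": "engineering",
}

def expand(template, row):
    """Replace each {COLUMN} placeholder with the row's value for that column."""
    return re.sub(r"\{(\w+)\}", lambda m: str(row[m.group(1)]), template)

def role_triple(row):
    """Return the ex:role triple for one EMP row, or None when ROLE is NULL."""
    row = dict(row, ROLE=ROLE_BY_JOB.get(row["JOB"]))
    if row["ROLE"] is None:
        return None  # CASE without ELSE yields NULL, so no triple is generated
    subject = expand("http://data.example.com/employee/{EMPNO}", row)
    obj = expand("http://data.example.com/roles/{ROLE}", row)
    return f"<{subject}> ex:role <{obj}> ."
```

A job code outside the three listed values yields no ex:role triple at all, which is worth keeping in mind when the source table contains codes the CASE expression does not cover.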

Page 134: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Part 5: Key Features of RDF Semantic Graph, a feature of Oracle Spatial and Graph

Page 135: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Oracle Spatial and Graph Option

RDF Semantic Graph Feature:

• Native RDF Database

• Named graph support

• Supports SPARQL 1.1, SPARQL/SQL, GeoSPARQL

• Jena, Sesame, & Joseki Web Services

• W3C standards: RDFS, OWL 2 RL, OWL 2 EL, SKOS, RDF, RDB2RDF, SPARQL 1.1, RDFa

• Scales with hardware – petabytes of triples

• Works with OBIEE, Oracle BPM, Advanced Analytics

• Exploits: Exadata, RAC, Parallelism, Label Security

Page 136: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Oracle Spatial and Graph RDF Triple Store

Load / Storage

• Native RDF graph data store

• Manages billions of triples

• Optimized storage architecture

Query

• SPARQL-Jena/Joseki, Sesame

• SQL/graph query, B-tree indexing

• Ontology assisted SQL query

Reasoning

• RDFS, OWL2 RL, EL, SKOS

• User-defined rules

• Incremental, parallel reasoning

• User-defined inferencing

• Plug-in architecture

Analytics

• Semantic indexing framework

• Integration with OBIEE, Oracle R Enterprise, Oracle Data Mining

Leverages Oracle Manageability:

• RAC & Exadata scalability

• Compression & partitioning

• SQL*Loader direct path load

• Parallel load, inference, query

• High Availability

• Triple-level label security

• Ladder based inference

• Choice of SPARQL, SQL, or Java

• Native inference engine

• Enterprise Manager

Page 137: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

New functions in Oracle Spatial and Graph

• Open Geospatial Consortium (OGC) GeoSPARQL

• Native SPARQL 1.1 query support

– 40+ new query functions/operators: IF, COALESCE, STRBEFORE, REPLACE, ABS, …

– Aggregates: COUNT, SUM, MIN, MAX, AVG, GROUP_CONCAT, SAMPLE

– Subqueries

– Value Assignment: BIND, GROUP BY Expressions, SELECT Expressions

– Negation: NOT EXISTS, MINUS

– Improved Path Searching with Property Paths

Page 138: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

New functions in Oracle Spatial and Graph

• RDF views on relational tables (through RDB2RDF)

– RDF views can be created on a set of relational tables and/or views

– SPARQL queries access data from both a relational and RDF store

– Support RDF view creation using

• Direct Mapping: simple and straightforward to use

• R2RML Mapping: customizations allowed

• Inference

– Native OWL 2 EL inference support

– User defined inferencing

– Ladder Based Inference

– Performance optimization for user defined rules

– Integration with TrOWL, an external OWL 2 reasoner

* http://trowl.eu/

Page 139: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Jena Adapter for Oracle Database


Page 140: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Jena Adapter for Oracle Database

• Requires Apache Jena 2.7.2, ARQ 2.9.2, Joseki 3.4.4, Oracle Database release 11.2.0.3 or higher

• SPARQL 1.1 compliance

• Named Graph (quads) support: DatasetGraphOracleSem

• N-QUADS, TriG data format

• Updated StatusListener interface

• Named graph queries through Joseki web service endpoint

• SPARQL Update through Joseki web service endpoint

• JSON output

Page 141: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Jena Adapter for Oracle Database (2)

• An efficient approach to better support OntModel APIs: OracleGraphWrapperForOntModel

• Support for reserved SQL (PL/SQL) keywords

select ?date { :event :happenedOn ?date }

• Named graph based local inference: Attachment

- public void setUseLocalInference(boolean useLocalInference)

- public boolean getUseLocalInference()

- public void setDefGraphForLocalInference(String defaultGraphName)

- public String getDefGraphForLocalInference()

- public String getInferenceOption()

- public void setInferenceOption(String inferenceOption)

Page 142: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Jena Adapter for Oracle Database (3)

• Analytical functions for RDF data: SemNetworkAnalyst

• Integrating Oracle Spatial and Graph network data model (NDM) with RDF Semantic Graph feature

• Provides functions including shortest path, within cost, partitioning, …

• Extensible architecture

Page 143: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Oracle Spatial and Graph Native Inference Engine


Page 144: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Core Inference Features in Oracle Database

• Oracle provides native inference in the database for

• RDFS, RDFS++

• OWLPRIME, OWL2RL, OWL2EL, SKOS

• User-defined rules


• Inference done using forward chaining

• Triples inferred and stored ahead of query time

• Removes on-the-fly reasoning and results in fast query times

• Proof generation

• Shows one deduction path
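Forward chaining with proof generation can be sketched as follows. The Python function below applies two RDFS rules (rdfs9 for class membership, rdfs11 for subclass transitivity) until no new triple appears, and records one deduction path for each inferred triple. It is a toy in-memory illustration, not Oracle's engine; the function name and the sample individuals in the usage note are hypothetical.

```python
def materialize(asserted):
    """Forward-chain rdfs9 and rdfs11; return {inferred triple: (rule, premises)}."""
    known = set(asserted)
    proofs = {}
    changed = True
    while changed:
        changed = False
        for (s, p, o) in list(known):
            for (s2, p2, o2) in list(known):
                # Both rules join a triple's object with the subject of a subClassOf triple.
                if p2 != "rdfs:subClassOf" or s2 != o:
                    continue
                if p == "rdf:type":
                    new, rule = (s, "rdf:type", o2), "rdfs9"
                elif p == "rdfs:subClassOf":
                    new, rule = (s, "rdfs:subClassOf", o2), "rdfs11"
                else:
                    continue
                if new not in known:
                    known.add(new)
                    # Keep the first derivation found as the triple's proof.
                    proofs[new] = (rule, [(s, p, o), (s2, p2, o2)])
                    changed = True
    return proofs
```

Given `:Ann rdf:type :Surgeon`, `:Surgeon rdfs:subClassOf :Doctor`, and `:Doctor rdfs:subClassOf :Person`, the result holds three inferred triples, each with a rule name and the two premises that produced it. Only new triples are returned, so the asserted data and the inferred data can be stored separately and queried together.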

Page 145: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Native Inference Engine in Oracle: APIs

SEM_APIS.CREATE_ENTAILMENT(

• entailment_name

• sem_models('GraphTBox', 'GraphABox', …),

• sem_rulebases('OWL2RL'),

• passes,

• inf_components,

• Options, …

)

Use “PROOF=T” to generate inference proof

Typical Usage:

• First load RDF/OWL data

• Call create_entailment to generate inferred graph

• Query both original graph and inferred data

Inferred graph contains only new triples! Saves time & resources

SEM_APIS.VALIDATE_ENTAILMENT(

• sem_models('GraphTBox', 'GraphABox', …),

• sem_rulebases('OWL2RL'),

• Criteria,

• Max_conflicts,

• Options

)

Typical Usage:

• First load RDF/OWL data

• Call create_entailment to generate inferred graph

• Call validate_entailment to find inconsistencies

Java API: performInference, deleteInference, setInferenceOption, analyze methods in GraphOracleSem, DatasetGraphOracleSem (Jena Adapter)

Page 146: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Native Inference Engine in Oracle

• Leverage SQL and relational technologies (partitioning, compression)
• Parallel execution

e.g. RDFS9 rule implemented in SQL:

  select distinct T2.SID, ID(rdf:type), T1.OID
  from <IVIEW> T1, <IVIEW> T2
  where T1.PID=ID(rdfs:subClassOf)
    and T2.PID=ID(rdf:type)
    and T1.SID=T2.OID
    and not exists (
      select 1 from <IVIEW> m
      where m.SID=T2.SID
        and m.PID=ID(rdf:type)
        and m.OID=T1.OID)

- Implementing an Inference Engine for RDFS/OWL Constructs, ICDE 2008
- Optimizing Enterprise-scale OWL 2 RL Reasoning in a Relational Database System, ISWC 2010
- Advancing the Enterprise-class OWL Inference Engine in Oracle Database, ORE 2012
- Making the Most of your Triple Store: Query Answering in OWL 2 Using an RL Reasoner, WWW 2013
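The RDFS9 rule above can be tried outside Oracle. What follows is a minimal sketch, assuming a single SQLite table `iview` standing in for `<IVIEW>` and made-up integer IDs in place of `ID(rdf:type)` and `ID(rdfs:subClassOf)`; none of these names come from Oracle's implementation.

```python
import sqlite3

# Hypothetical dictionary-encoded IDs; the slide writes these as ID(rdf:type) etc.
RDF_TYPE, SUBCLASS_OF = 1, 2
CAT, MAMMAL, TOM = 10, 11, 20

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE iview (sid INTEGER, pid INTEGER, oid INTEGER)")
conn.executemany(
    "INSERT INTO iview VALUES (?, ?, ?)",
    [
        (CAT, SUBCLASS_OF, MAMMAL),  # :Cat rdfs:subClassOf :Mammal
        (TOM, RDF_TYPE, CAT),        # :Tom rdf:type :Cat
    ],
)

# Same shape as the slide's query: join subclass edges with type edges and
# keep only triples that are not already present.
RDFS9 = """
SELECT DISTINCT t2.sid, :rdf_type, t1.oid
FROM iview t1, iview t2
WHERE t1.pid = :subclass_of
  AND t2.pid = :rdf_type
  AND t1.sid = t2.oid
  AND NOT EXISTS (
        SELECT 1 FROM iview m
        WHERE m.sid = t2.sid AND m.pid = :rdf_type AND m.oid = t1.oid)
"""


def apply_rdfs9(connection):
    """Run one pass of the rule, store the new triples, and return them."""
    params = {"rdf_type": RDF_TYPE, "subclass_of": SUBCLASS_OF}
    new_triples = connection.execute(RDFS9, params).fetchall()
    connection.executemany("INSERT INTO iview VALUES (?, ?, ?)", new_triples)
    return new_triples


first_pass = apply_rdfs9(conn)   # derives :Tom rdf:type :Mammal
second_pass = apply_rdfs9(conn)  # nothing new: the NOT EXISTS guard filters it
```

The empty second pass is the fixed point at which a forward-chaining loop over this rule would stop.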

Page 147: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Native Parallel Inference Engine in Oracle (3)

• Transitive closure calculation
– E.g. ?a owl:sameAs ?b . ?b owl:sameAs ?c → ?a owl:sameAs ?c
– Partition (distance d) based iterative approach
– P1 has :a => :b, :b => :c (asserted)
– Join P1 with P1 to produce P2
– P2 has :a => :c (assign distance 2)
– Join P1 with P2 to produce P3
– …

[Figure: owl:sameAs edges :a → :b (d=1), :b → :c (d=1), and the derived edge :a → :c (d=2)]

- Implementing an Inference Engine for RDFS/OWL Constructs, ICDE 2008
- On the Computation of the Transitive Closure of Relational Operators, VLDB 1986
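The distance-partitioned iteration can be sketched in a few lines. This is an illustrative in-memory version, not Oracle's implementation: it treats edges as directed pairs, ignores the symmetry of owl:sameAs, and the function name `transitive_closure` is made up for the example.

```python
def transitive_closure(edges):
    """Return {(src, dst): distance} using distance-partitioned iteration."""
    p1 = set(edges)                      # partition P1: asserted edges, d = 1
    distance = {pair: 1 for pair in p1}
    frontier = p1                        # the most recently produced partition
    d = 1
    while frontier:
        d += 1
        # Join P1 with P(d-1): (a, b) in P1 and (b, c) in P(d-1) gives (a, c).
        candidates = {
            (a, c) for (a, b) in p1 for (b2, c) in frontier if b == b2
        }
        # Keep only pairs not seen at a shorter distance.
        frontier = {pair for pair in candidates if pair not in distance}
        for pair in frontier:
            distance[pair] = d
    return distance


closure = transitive_closure([(":a", ":b"), (":b", ":c")])
```

Because each round joins only against the newest partition, pairs already in the closure are not re-derived from older partitions.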

Page 148: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Native Parallel Inference Engine in Oracle (4)

• Incremental maintenance when there is a small addition Delta to a big graph G

?x :hasParent ?y . ?y :hasBrother ?z → ?x :hasUncle ?z
(Graph G self-join Graph G)

[Figure: graph G with nodes family:John, family:Jack, family:Mary, ns:Jill, ns:Jedi and edges :hasParent, :hasBrother, :hasUncle, :relatedTo]

How to efficiently handle a new edge :Mary :hasBrother :Tony?

Page 149: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Native Parallel Inference Engine in Oracle (4)

• Incremental maintenance when there is a small addition Delta to a big graph G

?x :hasParent ?y . ?y :hasBrother ?z → ?x :hasUncle ?z
(Graph G self-join Graph G)

[Figure: graph G with nodes family:John, family:Jack, family:Mary, ns:Jill, ns:Jedi and edges :hasParent, :hasBrother, :hasUncle, :relatedTo]

Adding a new edge :Mary :hasBrother :Tony

[Figure: the same graph with the added node family:Tony, the new :hasBrother edge, and a new inferred :hasUncle edge]

Page 150: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Native Parallel Inference Engine in Oracle (4)

• Incremental maintenance when there is a small addition Delta to a big graph G

?x :hasParent ?y . ?y :hasBrother ?z → ?x :hasUncle ?z
(Graph G self-join Graph G)

– To update the closure when Graph G becomes Graph G', we can perform 3 efficient small joins rather than 1 big join.

Adding a new edge :Mary :hasBrother :Tony

[Figure: graph G before and after adding family:Tony, with the new :hasBrother edge and the new inferred :hasUncle edge]

Approach               ?x :hasParent ?y   ?y :hasBrother ?z
I: A single big join   Graph G'           Graph G'
II: 3 small joins      Delta              Graph G
                       Graph G            Delta
                       Delta              Delta

- Optimizing Enterprise-scale OWL 2 RL Reasoning in a Relational Database System, ISWC 2010
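A small sketch makes the "3 small joins" claim concrete. The edge directions in `graph` (John's parent is Mary, Mary's brother is Jack) are assumptions, since the slide's figure is not recoverable from the transcript; `uncle_join` is a hypothetical helper, not an Oracle API.

```python
def uncle_join(parent_side, brother_side):
    """Join ?x :hasParent ?y with ?y :hasBrother ?z to derive ?x :hasUncle ?z."""
    brothers = {}
    for s, p, o in brother_side:
        if p == ":hasBrother":
            brothers.setdefault(s, set()).add(o)
    return {
        (x, ":hasUncle", z)
        for x, p, y in parent_side
        if p == ":hasParent"
        for z in brothers.get(y, ())
    }


# Assumed contents of graph G (illustrative only).
graph = {
    ("family:John", ":hasParent", "family:Mary"),
    ("family:Mary", ":hasBrother", "family:Jack"),
}
# The small addition from the slide.
delta = {("family:Mary", ":hasBrother", "family:Tony")}

closure_before = uncle_join(graph, graph)

# Approach I: one big join over G' = G + Delta.
big_join = uncle_join(graph | delta, graph | delta)

# Approach II: three small joins, each involving Delta on at least one side.
small_joins = (
    uncle_join(delta, graph)
    | uncle_join(graph, delta)
    | uncle_join(delta, delta)
)
closure_after = closure_before | small_joins
```

Both approaches reach the same closure; the second never re-joins G with G, which is the expensive part when G is large and Delta is small.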

Page 151: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Extending Semantics Supported by Native OWL Inference Engine

• Option 1: Add user-defined rules
– Oracle supports, from release 10g, user-defined rules:

  Antecedents:  ?z :parentOf ?x .
                ?z :parentOf ?y .
                ?x owl:differentFrom ?y .
  Consequents:  ?x :siblingOf ?y

• Option 2: Leverage external DL reasoners

• Option 3: User-defined inferencing in Oracle Database Release 12c
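To see what the sibling rule does, and why the owl:differentFrom antecedent matters, here is a rough Python sketch. It only mimics the rule's logic on an in-memory set of triples; the function name and sample data are invented for illustration and say nothing about how Oracle stores or evaluates rulebases.

```python
def infer_siblings(triples):
    """Apply: ?z :parentOf ?x . ?z :parentOf ?y . ?x owl:differentFrom ?y
    => ?x :siblingOf ?y"""
    parent_of = [(s, o) for s, p, o in triples if p == ":parentOf"]
    different = {(s, o) for s, p, o in triples if p == "owl:differentFrom"}
    return {
        (x, ":siblingOf", y)
        for z1, x in parent_of
        for z2, y in parent_of
        if z1 == z2 and (x, y) in different
    }


# Hypothetical sample data.
facts = {
    (":Ann", ":parentOf", ":Bob"),
    (":Ann", ":parentOf", ":Cara"),
    (":Bob", "owl:differentFrom", ":Cara"),
}
siblings = infer_siblings(facts)
```

Without the owl:differentFrom condition, the two :parentOf patterns could both bind to the same child, and the rule would conclude that :Bob is his own sibling.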

Page 152: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Extensible Architecture for External Reasoners

[Figure: external in-memory OWL DL reasoners (TrOWL/REL) connect through the Jena APIs and the Jena Adapter to Oracle's native inference engine for OWL 2 RL, EL & user-defined rules; label: Materialized Inference]

Page 153: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Enabling Advanced Inference Capabilities

• Parallel inference option
EXECUTE sem_apis.create_entailment('M_IDX', sem_models('M'),
  sem_rulebases('OWLPRIME'), null, null, 'DOP=x');
– Where 'x' is the degree of parallelism (DOP)

• Incremental inference option
EXECUTE sem_apis.create_entailment('M_IDX', sem_models('M'),
  sem_rulebases('OWLPRIME'), null, null, 'INC=T');

• Enabling owl:sameAs option to limit duplicates
EXECUTE sem_apis.create_entailment('M_IDX', sem_models('M'),
  sem_rulebases('OWLPRIME'), null, null, 'OPT_SAMEAS=T');

• Compact data structures
EXECUTE sem_apis.create_entailment('M_IDX', sem_models('M'),
  sem_rulebases('OWLPRIME'), null, null, 'RAW8=T');

• OWL2RL/SKOS inference
EXECUTE sem_apis.create_entailment('M_IDX', sem_models('M'), sem_rulebases(x), null, null…);
– x in ('OWL2RL', 'SKOSCORE')

Page 154: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Querying RDF Semantic Graph


Page 155: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

SPARQL Query Architecture

[Figure: SPARQL query architecture. Access paths: Java (Jena API via Jena Adapter; Sesame API via Sesame Adapter), HTTP (standard SPARQL endpoint, enhanced with query management control), and SQL (SEM_MATCH), all on top of the SPARQL-to-SQL core logic]

Page 156: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

SPARQL vs. SQL

SPARQL NULL-accepting JOIN vs. SQL NULL-rejecting JOIN

Query pattern:
{ ?p :name ?n
  OPTIONAL { ?p :email ?e }
  OPTIONAL { ?p :mbox ?e } }

Left input:
?p     ?n              ?e
<p1>   "Jon Stewart"   "[email protected]"
<p2>   "John Smith"    NULL

Right input:
?p     ?e
<p1>   "[email protected]"
<p2>   "[email protected]"

Join condition: ?p = ?p AND ?e = ?e

SPARQL result:
?p     ?n              ?e
<p1>   "Jon Stewart"   "[email protected]"
<p2>   "John Smith"    "[email protected]"

SQL result:
?p     ?n              ?e
<p1>   "Jon Stewart"   "[email protected]"
<p2>   "John Smith"    NULL
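The difference between the two join behaviours can be reproduced with a short sketch. Rows are plain dicts with `None` for unbound or NULL; the e-mail strings are placeholders on example.org because the originals are redacted in this transcript, and the helper names are invented for the example.

```python
def sparql_compatible(left, right):
    """SPARQL: shared variables must agree only where both sides are bound."""
    return all(
        left[v] is None or right[v] is None or left[v] == right[v]
        for v in left.keys() & right.keys()
    )


def sql_equal(left, right):
    """SQL: a comparison with NULL is never true, so NULL never joins."""
    return all(
        left[v] is not None and right[v] is not None and left[v] == right[v]
        for v in left.keys() & right.keys()
    )


def left_join(left_rows, right_rows, matches):
    """Left outer join: keep each left row, extended by every matching right row."""
    out = []
    for l in left_rows:
        merged = [
            {**l, **{k: v for k, v in r.items() if v is not None}}
            for r in right_rows
            if matches(l, r)
        ]
        out.extend(merged or [l])
    return out


# State after { ?p :name ?n OPTIONAL { ?p :email ?e } } (placeholder addresses).
after_email = [
    {"p": "p1", "n": "Jon Stewart", "e": "jon@example.org"},
    {"p": "p2", "n": "John Smith", "e": None},
]
# Solutions for { ?p :mbox ?e }.
mbox = [
    {"p": "p1", "e": "jon@example.org"},
    {"p": "p2", "e": "john@example.org"},
]

sparql_rows = left_join(after_email, mbox, sparql_compatible)
sql_rows = left_join(after_email, mbox, sql_equal)
```

Under the SPARQL rule the unbound ?e for p2 is filled in from :mbox, while the SQL equality test leaves it NULL, matching the two result tables on the slide.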

Page 157: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

SPARQL vs. SQL

• SPARQL weak typing vs. SQL strong typing
– FILTER (?a + ?b > 10) … doesn't complain if ?a, ?b are not numbers.
– Also, when ?a and/or ?b are not numbers, FILTER(?a + ?b > 10) and FILTER(!(?a + ?b > 10)) are both FALSE
– FILTER(?a != ?b), etc. are complicated

• No Boolean type in SQL

• SPARQL UNION
– SPARQL UNION is actually SQL UNION ALL
– SPARQL graph patterns DO NOT have to be UNION compatible
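The FILTER behaviour follows from SPARQL's error handling: a type error is neither true nor false, negating it is still an error, and FILTER keeps a solution only when the expression is true. A minimal sketch of that three-valued logic, with all names invented for illustration:

```python
ERROR = object()  # stands in for a SPARQL type error


def is_number(x):
    return isinstance(x, (int, float)) and not isinstance(x, bool)


def add(a, b):
    """?a + ?b: an error unless both operands are numeric."""
    return a + b if is_number(a) and is_number(b) else ERROR


def greater_than(a, b):
    if a is ERROR or b is ERROR:
        return ERROR
    return a > b


def negate(v):
    """!expr: an error stays an error."""
    return ERROR if v is ERROR else (not v)


def filter_keeps(v):
    """FILTER keeps a solution only when the expression is true."""
    return v is True


numeric = greater_than(add(7, 5), 10)
non_numeric = greater_than(add("x", 5), 10)

kept = (filter_keeps(numeric), filter_keeps(negate(numeric)))
dropped = (filter_keeps(non_numeric), filter_keeps(negate(non_numeric)))
```

A SQL translation has to reproduce this, which is one reason a direct mapping of FILTER expressions onto SQL predicates is not straightforward.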

Page 158: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

SEM_MATCH: SPARQL in SQL


Page 159: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

SEM_MATCH: Adding SPARQL to SQL

• Extends SQL with SPARQL constructs
– Graph Patterns, OPTIONAL, UNION
– Dataset Constructs
– FILTER, including SPARQL built-ins
– Solution Modifiers

• Benefits:
– Integrates graph data with existing enterprise data
– JOINs with other object-relational data
– Allows SQL constructs/functions
– DDL Statements: create tables/views

Page 160: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

SEM_MATCH: Adding SPARQL to SQL

SELECT n1, n2
FROM TABLE(SEM_MATCH(
 'PREFIX foaf: <http://...>
  SELECT ?n1 ?n2
  FROM <http://g1>
  WHERE {
    ?p foaf:name ?n1
    OPTIONAL { ?p foaf:knows ?f .
               ?f foaf:name ?n2 }
    FILTER (REGEX(?n1, "^A")) }
  ORDER BY ?n1 ?n2',
  SEM_MODELS('M1'), …));

Page 161: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

SEM_MATCH: Adding SPARQL to SQL

SELECT n1, n2
FROM TABLE(SEM_MATCH(
 'PREFIX foaf: <http://...>
  SELECT ?n1 ?n2
  FROM <http://g1>
  WHERE {
    ?p foaf:name ?n1
    OPTIONAL { ?p foaf:knows ?f .
               ?f foaf:name ?n2 }
    FILTER (REGEX(?n1, "^A")) }
  ORDER BY ?n1 ?n2',
  SEM_MODELS('M1'), …));

SQL Table Function

Result:
n1      n2
Alex    Jerry
Alex    Tom
Alice   Bill
Alice   Jill
Alice   John

Page 162: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

SEM_MATCH: Adding SPARQL to SQL

SELECT n1, n2
FROM TABLE(SEM_MATCH(
 'PREFIX foaf: <http://...>
  SELECT ?n1 ?n2
  FROM <http://g1>
  WHERE {
    ?p foaf:name ?n1
    OPTIONAL { ?p foaf:knows ?f .
               ?f foaf:name ?n2 }
    FILTER (REGEX(?n1, "^A")) }
  ORDER BY ?n1 ?n2',
  SEM_MODELS('M1'), …));

Rewritable SQL Table Function: the SEM_MATCH call is rewritten into a subquery such as

( SELECT v1.value AS n1, v2.value AS n2
  FROM VALUES v1, VALUES v2,
       TRIPLES t1, TRIPLES t2, …
  WHERE t1.obj_id = v1.value_id
    AND t1.pred_id = 1234
    AND …
)

Get 1 unified SQL query
- Query optimizer sees 1 query
- Get all the performance of Oracle SQL Engine: optimizer, compression, indexes, parallelism, etc.
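The rewrite idea, a triple pattern becoming a join between an ID-based triples table and a values (dictionary) table, can be imitated with SQLite. The table names `rdf_values` and `triples`, their columns, and the sample rows are assumptions made for this sketch; Oracle's internal schema differs and is not shown on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE rdf_values (value_id INTEGER PRIMARY KEY, value TEXT);
    CREATE TABLE triples (subj_id INTEGER, pred_id INTEGER, obj_id INTEGER);
    INSERT INTO rdf_values VALUES
        (1, 'p1'), (2, 'foaf:name'), (3, 'Alice'), (4, 'p2'), (5, 'Bob');
    INSERT INTO triples VALUES (1, 2, 3), (4, 2, 5);
    """
)

# Hand translation of:
#   SELECT ?n1 WHERE { ?p foaf:name ?n1 FILTER (REGEX(?n1, "^A")) } ORDER BY ?n1
# The predicate constant is resolved to its ID, the object ID is joined back to
# its lexical value, and the regex prefix test becomes a plain SQL condition.
rows = conn.execute(
    """
    SELECT v1.value AS n1
    FROM triples t1
    JOIN rdf_values v1 ON t1.obj_id = v1.value_id
    WHERE t1.pred_id = (SELECT value_id FROM rdf_values WHERE value = 'foaf:name')
      AND substr(v1.value, 1, 1) = 'A'
    ORDER BY n1
    """
).fetchall()
```

Because the result is ordinary SQL, it can be combined with other tables in the same statement and planned by the optimizer as a single query.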

Page 163: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

SEM_MATCH Table Function Arguments

SEM_MATCH(
  query,
  models,
  rulebases,
  options
);

• query: e.g. 'SELECT ?a WHERE { ?a foaf:name ?b }'
• models: container(s) for asserted quads; basic unit of access control
• rulebases: built-in (e.g. OWL2RL) and user-defined rulebases
• models + rulebases: entailed triples
• options: e.g. 'ALLOW_DUP=T STRICT_TERM_COMP=F'

Page 164: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

GovTrack RDF Data

RDF/OWL data about activities of US Congress
• Political Party Membership
• Voting Records
• Bill Sponsorship
• Committee Membership
• Offices and Terms

GovTrack in Oracle
• Semantic Models: GOV_TBOX, GOV_PEOPLE, GOV_BILLS_110, GOV_BILLS_111, GOV_VOTES_110, GOV_VOTES_111, GOV_DISTRICTS (US Census)
• Rulebases: OWL2RL
• Entailments: GOV_TRACK_OWL (inference)
• Virtual Models:
– GOV_ASSERT_VM: asserted data only (4.3M triples)
– GOV_ALL_VM: asserted + inferred (4.7M triples)

Page 165: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Live Demo: SPARQL SELECT Modifiers

Find all distinct family names for senators

SELECT f$rdfterm
FROM TABLE(SEM_MATCH(
 'PREFIX foaf: <http://xmlns.com/foaf/0.1/>
  PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>
  SELECT DISTINCT ?f
  WHERE
  { ?p vcard:N ?vn .
    ?vn vcard:Family ?f .
    ?p foaf:title "Sen." .
  }',
  sem_models('gov_all_vm'), null, null, null, null, ' ALLOW_DUP=T PLUS_RDFT=T '));

Page 166: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Live Demo: SPARQL FILTER

Find information about people named Adams who were born before the American Revolution

select n$rdfterm, b$rdfterm, g$rdfterm
from table(sem_match(
 'PREFIX foaf: <http://xmlns.com/foaf/0.1/>
  PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>
  PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
  SELECT ?n ?b ?g
  WHERE
  { ?p vcard:N ?vn .
    ?vn vcard:Family ?f .
    ?p foaf:name ?n .
    ?p vcard:BDAY ?b .
    ?p foaf:gender ?g
    FILTER ( ?f = "Adams" &&
             xsd:dateTime(?b) < xsd:dateTime("1776-07-04") )
  }',
  sem_models('gov_all_vm'), null, null, null, null, ' ALLOW_DUP=T PLUS_RDFT=T '));

Page 167: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Live Demo: SPARQL MINUS

Find information about Kennedys who don't have a homepage

select n$rdfterm, b$rdfterm
from table(sem_match(
 'PREFIX foaf: <http://xmlns.com/foaf/0.1/>
  PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>
  PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
  SELECT ?n ?b
  WHERE
  { ?p vcard:N ?vn .
    ?vn vcard:Family "Kennedy" .
    ?p foaf:name ?n .
    ?p vcard:BDAY ?b .
    MINUS { ?p foaf:homepage ?h }
  }',
  sem_models('gov_all_vm'), null, null, null, null, ' ALLOW_DUP=T PLUS_RDFT=T '));

Page 168: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Live Demo: Aggregation

Who sponsored the most bills?

select n$rdfterm, cnt$rdfterm
from table(sem_match(
 'PREFIX foaf: <http://xmlns.com/foaf/0.1/>
  PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>
  PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
  PREFIX bill: <http://www.rdfabout.com/rdf/schema/usbill/>
  SELECT ?n (COUNT(*) AS ?cnt)
  WHERE
  { ?s foaf:name ?n .
    ?b bill:sponsor ?s .
  }
  GROUP BY ?n
  ORDER BY DESC(?cnt)
  LIMIT 10',
  sem_models('gov_all_vm'), null, null, null, null, ' ALLOW_DUP=T PLUS_RDFT=T '));

Page 169: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Live Demo: Named Graph Query

Find number of bills sponsored in 110th and 111th congress

select n$rdfterm, g$rdfterm, bcnt$rdfterm
from table(sem_match(
 'PREFIX foaf: <http://xmlns.com/foaf/0.1/>
  PREFIX bill: <http://www.rdfabout.com/rdf/schema/usbill/>
  SELECT ?n ?g (count(?b) as ?bcnt)
  FROM usgov:people
  FROM NAMED usgov:bills_110
  FROM NAMED usgov:bills_111
  WHERE
  { ?s foaf:name ?n
    GRAPH ?g { ?b bill:sponsor ?s }
  }
  GROUP BY ?n ?g
  ORDER BY ?n ?g',
  sem_models('gov_all_vm'), null, null, null, null, ' ALLOW_DUP=T PLUS_RDFT=T '));

Page 170: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Live Demo: Inference

[Figure: GovTrack bill types]

Page 171: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Live Demo: Entailment

Find bills sponsored by Barack Obama and their types

select title$rdfterm, dt$rdfterm, btype$rdfterm
from table(sem_match(
 'PREFIX foaf: <http://xmlns.com/foaf/0.1/>
  PREFIX bill: <http://www.rdfabout.com/rdf/schema/usbill/>
  PREFIX dc: <http://purl.org/dc/elements/1.1/>
  SELECT ?title ?dt ?btype
  WHERE
  { ?s foaf:name "Barack Obama" .
    ?b bill:sponsor ?s .
    ?b dc:title ?title .
    ?b rdf:type ?btype .
    ?b bill:introduced ?dt
    FILTER(xsd:dateTime("2007-03-28") <= xsd:dateTime(?dt) &&
           xsd:dateTime(?dt) < xsd:dateTime("2007-04-01") ) }',
  sem_models('gov_all_vm'), null, null, null, null, ' ALLOW_DUP=T PLUS_RDFT=T '));

Page 172: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Live Demo: Entailment with Property Path

Find bills sponsored by Barack Obama and their types

select title$rdfterm, dt$rdfterm, btype$rdfterm, sc$rdfterm
from table(sem_match(
 'PREFIX foaf: <http://xmlns.com/foaf/0.1/>
  PREFIX bill: <http://www.rdfabout.com/rdf/schema/usbill/>
  PREFIX dc: <http://purl.org/dc/elements/1.1/>
  SELECT ?title ?dt ?btype ?sc
  WHERE
  { ?s foaf:name "Barack Obama" .
    ?b bill:sponsor ?s .
    ?b dc:title ?title .
    ?b rdf:type ?btype .
    ?btype rdfs:subClassOf+ ?sc .
    ?b bill:introduced ?dt
    FILTER(xsd:dateTime("2007-03-28") <= xsd:dateTime(?dt) &&
           xsd:dateTime(?dt) < xsd:dateTime("2007-04-01") ) }',
  sem_models('gov_assert_vm'), null, null, null, null, ' ALLOW_DUP=T PLUS_RDFT=T '));

Page 173: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Full Text Indexing with Oracle Text

• Filters graph patterns based on text search string
• Indexes all RDF Terms
– URIs, Literals, Language Tags, etc.
• Provide SPARQL extension function
– orardf:textContains(?var, "Oracle text search string")
– Search String
    • Group Operators: AND, OR, NOT, NEAR, …
    • Term Operators: stem($), soundex(!), wildcard(%)

SQL> exec sem_apis.add_datatype_index('http://xmlns.oracle.com/rdf/text');

Page 174: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Live Demo: Full Text Search

Find information about bills related to children and taxes

select n$rdfterm, title$rdfterm, dt$rdfterm
from table(sem_match(
 'PREFIX foaf: <http://xmlns.com/foaf/0.1/>
  PREFIX bill: <http://www.rdfabout.com/rdf/schema/usbill/>
  PREFIX dc: <http://purl.org/dc/elements/1.1/>
  SELECT ?n ?title ?dt
  WHERE
  { ?b bill:sponsor ?s .
    ?s foaf:name ?n .
    ?b dc:title ?title .
    ?b bill:introduced ?dt
    FILTER (orardf:textContains(?title, "$children AND $taxes"))
  }',
  sem_models('gov_all_vm'), null, null, null, null, ' ALLOW_DUP=T PLUS_RDFT=T '));

Page 175: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

GeoSPARQL with Oracle Spatial and Graph

• Support geometries encoded as ogc:wktLiterals

:semTech2011 ogc:asWKT "POINT(-122.4192 37.7793)"^^ogc:wktLiteral .

• Provide a library of spatial functions

SELECT ?s
WHERE { ?s ogc:asWKT ?geom
  FILTER(ogc:distance(?geom, "POINT(-122.4192 37.7793)"^^ogc:wktLiteral, uom:KM) <= 10) }

Page 176: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

OGC wktLiteral Datatype

• Optional leading Spatial Reference System URI followed by OGC WKT geometry string.
  <http://xmlns.oracle.com/rdf/geo/srid/{srid}>

• WGS 84 Longitude, Latitude is the default SRS (assumed if SRS URI is absent)

SRS: WGS84 Longitude, Latitude
"POINT(-122.4192 37.7793)"^^ogc:wktLiteral

SRS: NAD27 Longitude, Latitude
"<http://xmlns.oracle.com/rdf/geo/srid/8260> POINT(-122.4181 37.7793)"^^ogc:wktLiteral
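The lexical form described here, an optional SRS URI followed by the WKT string, is easy to split apart. The helper below is a hypothetical sketch for illustration: it returns `None` when the SRS URI is absent, which per the slide means the default WGS 84 longitude/latitude system, and it does not validate the WKT itself.

```python
import re

SRS_PREFIX = "http://xmlns.oracle.com/rdf/geo/srid/"

# Optional "<...srid/{srid}>" prefix, then the WKT geometry text.
_WKT_LITERAL = re.compile(
    r"^\s*(?:<" + re.escape(SRS_PREFIX) + r"(\d+)>\s*)?(.+?)\s*$",
    re.DOTALL,
)


def parse_wkt_literal(lexical):
    """Split a wktLiteral lexical form into (srid or None, wkt_string)."""
    match = _WKT_LITERAL.match(lexical)
    if match is None:
        raise ValueError("empty wktLiteral")
    srid = int(match.group(1)) if match.group(1) else None
    return srid, match.group(2)


default_srs = parse_wkt_literal("POINT(-122.4192 37.7793)")
nad27 = parse_wkt_literal(
    "<http://xmlns.oracle.com/rdf/geo/srid/8260> POINT(-122.4181 37.7793)"
)
```

Keeping the SRID separate from the geometry text is what lets an index or query layer apply coordinate transformations before comparing geometries.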

Page 177: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

OGC wktLiteral Datatype

• Prepare for spatial querying by creating a spatial index for the ogc:wktLiteral datatype

SQL> exec sem_apis.add_datatype_index('http://xmlns.oracle.com/rdf/geo/WKTLiteral',
       options=>'TOLERANCE=1.0 SRID=8307 DIMENSIONS=((LONGITUDE,-180,180)(LATITUDE,-90,90))');

Page 178: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

What Types of Spatial Data are Supported?

• Spatial Reference Systems
– Built-in support for 1000s of SRS
– Plus you can define your own
– Coordinate system transformations applied transparently during indexing and query

• Geometry Types
– Support OGC Simple Features geometry types
    • Point, Line, Polygon
    • Multi-Point, Multi-Line, Multi-Polygon
    • Geometry Collection
– Up to 500,000 vertices per Geometry

Page 179: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Spatial Function Library

Standard OGC functions

• Topological Relations
– ogcf:relate, ogcf:sfContains, ogcf:sfCrosses, ogcf:sfDisjoint, ogcf:sfEquals, ogcf:sfIntersects, ogcf:sfOverlaps, ogcf:sfTouches, ogcf:sfWithin

• Distance-based Operations
– ogcf:distance, ogcf:buffer

• Geometry Operations
– ogcf:boundary, ogcf:convexHull, ogcf:envelope, ogcf:getSRID,

• Geometry-Geometry Operations
– ogcf:difference, ogcf:intersection, ogcf:symDifference, ogcf:union

Page 180: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Spatial Function Library

Oracle Extensions

• Topological Relations
– orageo:relate

• Distance-based Operations
– orageo:distance, orageo:withinDistance, orageo:buffer, orageo:nearestNeighbor

• Geometry Operations
– orageo:area, orageo:length
– orageo:centroid, orageo:mbr, orageo:convexHull

• Geometry-Geometry Operations
– orageo:intersection, orageo:union, orageo:difference, orageo:xor

Page 181: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

GovTrack Spatial Demo

• Congressional District Polygons (435)
– Complex Geometries
– Average over 1000 vertices per geometry

Workflow: load .shp file from US Census into Oracle Spatial → generate triples using sdo_util.to_wktgeometry() → load into Oracle semantic model

Page 182: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Live Demo: Spatial Query

Find the representative and district for Nashua, NH

select name$rdfterm, cdist$rdfterm
from table(sem_match(
 'PREFIX foaf: <http://xmlns.com/foaf/0.1/>
  PREFIX pol: <http://www.rdfabout.com/rdf/schema/politico/>
  PREFIX ogc: <http://www.opengis.net/ont/geosparql#>
  PREFIX ogcf: <http://www.opengis.net/def/function/geosparql/>
  SELECT ?name ?cdist
  WHERE
  { ?person foaf:name ?name .
    ?person pol:hasRole/pol:forOffice/pol:represents ?cdist .
    ?cdist ogc:asWKT ?cgeom
    FILTER (ogcf:sfContains(?cgeom,
            "POINT(-71.46444 42.7575)"^^ogc:wktLiteral)) } ',
  sem_models('gov_all_vm'), null, null, null, null, ' ALLOW_DUP=T PLUS_RDFT=T '));

Page 183: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Live Demo: Spatial Query

Find the 10 nearest congressional districts to Nashua, NH ordered by distance

select name$rdfterm, cdist$rdfterm
from table(sem_match(
 'PREFIX foaf: <http://xmlns.com/foaf/0.1/>
  PREFIX pol: <http://www.rdfabout.com/rdf/schema/politico/>
  PREFIX ogc: <http://www.opengis.net/ont/geosparql#>
  PREFIX ogcf: <http://www.opengis.net/def/function/geosparql/>
  PREFIX uom: <http://xmlns.oracle.com/rdf/geo/uom/>
  SELECT ?name ?cdist
  WHERE
  { ?person foaf:name ?name .
    ?person pol:hasRole/pol:forOffice/pol:represents ?cdist .
    ?cdist ogc:asWKT ?cgeom
    FILTER (orageo:nearestNeighbor(?cgeom, "POINT(-71.46444 42.7575)"^^ogc:wktLiteral,
            "sdo_num_res=10")) }
  ORDER BY ASC(ogcf:distance(?cgeom,
           "POINT(-71.46444 42.7575)"^^ogc:wktLiteral, uom:KM))',
  sem_models('gov_all_vm'), null, null, null, null, ' ALLOW_DUP=T PLUS_RDFT=T '))
order by sem$rownum;

Page 184: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Live Demo: Spatial Query

Find the 10 nearest congressional districts ordered by distance to center point

select name$rdfterm, cdist$rdfterm
from table(sem_match(
 'PREFIX foaf: <http://xmlns.com/foaf/0.1/>
  PREFIX pol: <http://www.rdfabout.com/rdf/schema/politico/>
  PREFIX ogc: <http://www.opengis.net/ont/geosparql#>
  PREFIX ogcf: <http://www.opengis.net/def/function/geosparql/>
  PREFIX uom: <http://xmlns.oracle.com/rdf/geo/uom/>
  SELECT ?name ?cdist
  WHERE
  { ?person foaf:name ?name .
    ?person pol:hasRole/pol:forOffice/pol:represents ?cdist .
    ?cdist ogc:asWKT ?cgeom
    FILTER (orageo:nearestNeighbor(?cgeom, "POINT(-71.46444 42.7575)"^^ogc:wktLiteral,
            "sdo_num_res=10")) }
  ORDER BY ASC(ogcf:distance(orageo:centroid(?cgeom),
           "POINT(-71.46444 42.7575)"^^ogc:wktLiteral, uom:KM))',
  sem_models('gov_all_vm'), null, null, null, null, ' ALLOW_DUP=T PLUS_RDFT=T '))
order by sem$rownum;

Page 185: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

RDB2RDF: Supporting RDF Views of Relational Data with Oracle Spatial and Graph


Page 186: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

RDB2RDF Architecture

[Figure: RDB2RDF architecture. Labels: Mid-Tier Server (Application 1, Application 2, Application 3; SQL, SPARQL/SQL; Shared Ontologies); Database Server (HR Database, Inventory Database, Sales Database; HR Schema, Inventory Schema, Sales Schema; RDB2RDF; Inventory Graph, Sales Graph)]

• Use virtual RDF data

• Benefits
– Existing relational data stays in place and corresponding applications do not need to change
– Use of virtual mapping eliminates synchronization issues
– Common vocabulary helps with data integration issues

Page 187: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

RDB2RDF Support in Oracle Spatial and Graph

• Supports both Direct Mapping and R2RML (RDB to RDF Mapping Language)
• API for creating, dropping and exporting (materializing) RDF views
– sem_apis.create_rdfview_model
– sem_apis.drop_rdfview_model
– sem_apis.export_rdfview_model
• Query like native RDF data

Page 188: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Relational Data: HR Schema

EMPLOYEES (
  EMPLOYEE_ID      NOT NULL  NUMBER(6)
  FIRST_NAME                 VARCHAR2(20)
  LAST_NAME        NOT NULL  VARCHAR2(25)
  EMAIL            NOT NULL  VARCHAR2(20)
  PHONE_NUMBER               VARCHAR2(20)
  HIRE_DATE        NOT NULL  DATE
  JOB_ID           NOT NULL  VARCHAR2(10)
  SALARY                     NUMBER(8,2)
  COMMISSION_PCT             NUMBER(2,2)
  MANAGER_ID                 NUMBER(6)
  DEPARTMENT_ID              NUMBER(4)
)

JOBS (
  JOB_ID           NOT NULL  VARCHAR2(10)
  JOB_TITLE        NOT NULL  VARCHAR2(35)
  MIN_SALARY                 NUMBER(6)
  MAX_SALARY                 NUMBER(6)
)

JOB_HISTORY (
  EMPLOYEE_ID      NOT NULL  NUMBER(6)
  START_DATE       NOT NULL  DATE
  END_DATE         NOT NULL  DATE
  JOB_ID           NOT NULL  VARCHAR2(10)
  DEPARTMENT_ID              NUMBER(4)
)

DEPARTMENTS (
  DEPARTMENT_ID    NOT NULL  NUMBER(4)
  DEPARTMENT_NAME  NOT NULL  VARCHAR2(30)
  MANAGER_ID                 NUMBER(6)
  LOCATION_ID                NUMBER(4)
)

LOCATIONS (
  LOCATION_ID      NOT NULL  NUMBER(4)
  STREET_ADDRESS             VARCHAR2(40)
  POSTAL_CODE                VARCHAR2(12)
  CITY             NOT NULL  VARCHAR2(30)
  STATE_PROVINCE             VARCHAR2(25)
  COUNTRY                    VARCHAR2(30)
  GEO_LOCATION               SDO_GEOMETRY
)

Page 189: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Relational Data: E-Commerce

PRODUCT_INFORMATION (
  PRODUCT_ID            NOT NULL  NUMBER(6)
  PRODUCT_NAME                    VARCHAR2(50)
  PRODUCT_DESC                    VARCHAR2(2000)
  CATEGORY_ID                     NUMBER(2)
  WEIGHT_CLASS                    NUMBER(1)
  WARRANTY_PERIOD                 INTERVAL YEAR(2) TO MONTH
  SUPPLIER_ID                     NUMBER(6)
  PRODUCT_STATUS                  VARCHAR2(20)
  LIST_PRICE                      NUMBER(8,2)
  MIN_PRICE                       NUMBER(8,2)
  CATALOG_URL                     VARCHAR2(50)
)

PRODUCT_CATEGORIES (
  CATEGORY_ID           NOT NULL  NUMBER(6)
  CATEGORY_NAME                   VARCHAR2(50)
  CATEGORY_DESCRIPTION            VARCHAR2(2000)
)

INVENTORIES (
  PRODUCT_ID            NOT NULL  NUMBER(6)
  WAREHOUSE_ID          NOT NULL  NUMBER(3)
  QUANTITY_ON_HAND      NOT NULL  NUMBER(8)
)

WAREHOUSES (
  WAREHOUSE_ID          NOT NULL  NUMBER(3)
  WAREHOUSE_SPEC                  SYS.XMLTYPE
  WAREHOUSE_NAME                  VARCHAR2(35)
  LOCATION_ID                     NUMBER(4)
)

CUSTOMERS (
  CUSTOMER_ID           NOT NULL  NUMBER(6)
  CUST_FIRST_NAME       NOT NULL  VARCHAR2(20)
  CUST_LAST_NAME        NOT NULL  VARCHAR2(20)
  CUST_LOCATION_ID                NUMBER(4)
  PHONE_NUMBERS                   PHONE_LIST_TYP
  CUST_EMAIL                      VARCHAR2(30)
)

REVIEWS (
  REVIEW_ID             NOT NULL  NUMBER(6)
  CUSTOMER_ID                     NUMBER(6)
  PRODUCT_ID                      NUMBER(6)
  RATING                          NUMBER(1)
  REVIEW_TEXT                     VARCHAR2(4000)
)

Page 190: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Creating the RDF View

BEGIN
  sem_apis.create_rdfview_model(
    'r2r_model_full',
    SYS.ODCIVarchar2List('DEPARTMENTS','EMPLOYEES','JOBS','JOB_HISTORY','LOCATIONS',
      'PRODUCT_INFORMATION','WAREHOUSES','INVENTORIES','CUSTOMERS','REVIEWS','PRODUCT_CATEGORIES'),
    'http://mydb/',
    options => ' SCALAR_COLUMNS_ONLY=T');
END;
/

Page 191: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Live Demo: RDB2RDF

What classes do we have?

select o$rdfterm
from table(sem_match(
'SELECT DISTINCT ?o
 WHERE { ?s rdf:type ?o }'
,sem_models('r2r_model_full'),null,null,null,null,' PLUS_RDFT=T '));

Page 192: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Live Demo: RDB2RDF

What properties do we have?

select p$rdfterm
from table(sem_match(
'SELECT DISTINCT ?p
 WHERE { ?s ?p ?o }'
,sem_models('r2r_model_full'),null,null,null,null,' PLUS_RDFT=T '));

Page 193: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Live Demo: RDB2RDF

Describe Employee 1

select p$rdfterm, o$rdfterm
from table(sem_match(
'SELECT ?p ?o
 WHERE { <http://mydb/EMPLOYEES/EMPLOYEE_ID=1> ?p ?o }'
,sem_models('r2r_model_full'),null,null,null,null,' PLUS_RDFT=T '));

Page 194: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Live Demo: RDB2RDF

Find all employees and their job title

select fname$rdfterm, lname$rdfterm, t$rdfterm
from table(sem_match(
'SELECT ?fname ?lname ?t
 WHERE
 { ?e <http://mydb/EMPLOYEES#FIRST_NAME> ?fname .
   ?e <http://mydb/EMPLOYEES#LAST_NAME> ?lname .
   ?e <http://mydb/EMPLOYEES#ref-JOB_ID> ?j .
   ?j <http://mydb/JOBS#JOB_TITLE> ?t }'
,sem_models('r2r_model_full'),null,null,null,null,' PLUS_RDFT=T '));
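The IRIs in the RDB2RDF demo queries follow a regular pattern: a row appears as base + TABLE/KEY=value, a column as base + TABLE#COLUMN, and a foreign-key reference as base + TABLE#ref-COLUMN. The Python sketch below only reproduces that naming pattern so the query IRIs are easier to read. The helper names are hypothetical, single-column keys are assumed, and any percent-encoding of special characters is ignored; it is not the database's own implementation.

```python
# Illustrative only: rebuilds the IRI shapes seen in the demo queries.
# BASE is the prefix passed to create_rdfview_model in the slides.
BASE = "http://mydb/"


def row_iri(table, key_column, key_value, base=BASE):
    """IRI for one row identified by a single-column key."""
    return f"{base}{table}/{key_column}={key_value}"


def column_iri(table, column, base=BASE):
    """IRI of the property generated for a scalar column."""
    return f"{base}{table}#{column}"


def ref_iri(table, fk_column, base=BASE):
    """IRI of the property generated for a foreign-key column."""
    return f"{base}{table}#ref-{fk_column}"


employee_1 = row_iri("EMPLOYEES", "EMPLOYEE_ID", 1)
first_name = column_iri("EMPLOYEES", "FIRST_NAME")
job_ref = ref_iri("EMPLOYEES", "JOB_ID")
```

With these helpers, `employee_1` is the subject used in the "Describe Employee 1" query, and `job_ref` is the property that joins EMPLOYEES to JOBS in the job-title query.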

Page 195: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Semantic Indexing: Extracting Semantic Data from Unstructured Text with Oracle Spatial and Graph

Page 196: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Semantic Indexing - Overview

CREATE INDEX ArticleIndex
ON Newsfeed (Article)
INDEXTYPE IS SemContext
PARAMETERS ('my_policy')
LOCAL PARALLEL 4

Auto maintained like a B-tree index

Newsfeed table (SemContext index on Article column)

Rowid r1: docId 1, Source CNN
  Article: Indiana authorities filed felony charges and a court issued an arrest warrant for a financial manager who apparently tried to fake his death by crashing his airplane in a Florida swamp. Marcus Schrenker, 38 …
Rowid r2: docId 2, Source NW
  Article: Major dealers and investors …

Content type:
• Text
• File (path)
• URL

Triples table with rowid references

Subject    Property   Object         graph
p:Marcus   rdf:type   rc:Person      <…/r1>
p:Marcus   :hasName   "Marcus"^^…    <…/r1>
p:Marcus   :hasAge    "38"^^xsd:..   <…/r1>
…          …          …              …

Analytical Queries On Graph Data

SELECT Sem_Contains_Select(1)
FROM Newsfeed
WHERE Sem_Contains (Article,
'{?x rdf:type rc:Person .
?x :hasAge ?age .
FILTER(?age >= 35)}',1)=1
AND Source = 'CNN'

Page 197: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Semantic Indexing - Key Components

• Extensible Information Extractor
– Programmable API to plug 3rd-party extractors into the database.
• SemContext Indextype
– A custom indexing scheme that interacts with the extractor to manage the metadata extracted from the documents efficiently and facilitates semantic search via SQL queries.
• SEM_CONTAINS Operator
– To identify documents of interest based on their extracted metadata, using standard SQL queries.
• SEM_CONTAINS_SELECT Ancillary Operator
– To return additional information (SPARQL Query Results XML) about the documents identified using the SEM_CONTAINS operator.

Page 198: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Semantic Indexing - Key Concepts

• Policy
– Base Policy: <policy_name, extractor_type>
– Dependent Policy: <policy_name, base_policy_name, ontology>
• Association between indexes and policies
– Multiple policies may be associated with an index
– Metadata extracted from each base policy is stored separately
• Sem_Contains invocation is restricted to one policy
– Policy to be used can be specified by user
• Inference
– Document-centric
– Corpus-centric

Page 199: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Product Descr.: “Harry Potter” series book 2

• [from amazon.com]

The Dursleys were so mean and hideous that summer that all Harry Potter wanted was to get back to the Hogwarts School for Witchcraft and Wizardry. But just as he's packing his bags, Harry receives a warning from a strange, impish creature named Dobby who says that if Harry Potter returns to Hogwarts, disaster will strike.

And strike it does. For in Harry's second year at Hogwarts, fresh torments and horrors arise, including an outrageously stuck-up new professor, Gilderoy Lockheart, a spirit named Moaning Myrtle who haunts the girls' bathroom, and the unwanted attentions of Ron Weasley's younger sister, Ginny.

But each of these seem minor annoyances when the real trouble begins, and someone--or something--starts turning Hogwarts students to stone. Could it be Draco Malfoy, a more poisonous rival than ever? Could it possibly be Hagrid, whose mysterious past is finally told? Or could it be the one everyone at Hogwarts most suspects...Harry Potter himself?

Page 200: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

News report: “Solving a Riddle of Primes”

• [from nytimes.com: Solving a Riddle of Primes - Published: May 20, 2013]

Three and five are prime numbers — that is, they are divisible only by 1 and by themselves. So are 5 and 7. And 11

and 13. And for each of these pairs of prime numbers, the difference is 2.

…

The proof has been elusive.

But last month, a paper … arrived “out of the blue” at the journal Annals of Mathematics, said Peter Sarnak, a
professor of mathematics at Princeton University and the Institute for Advanced Study and a former editor at the
journal, which plans to publish it. The paper, by Yitang Zhang of the University of New Hampshire, … does show an

infinite number of prime pairs whose separation is less than a finite upper limit — 70 million, for now. (Dr. Zhang used


70 million in his proof — basically an arbitrary large number where his equations work.)

…

Dr. Zhang’s proof takes advantage of a 2005 paper by Daniel Goldston of San Jose State University, Janos Pintz of the Alfred Renyi Institute of Mathematics in Budapest and Cem Yildirim of Bogazici University in Istanbul, which had shown there would always be pairs of primes closer than the average distance between two primes.

…

Dr. Zhang also used techniques developed in the 1980s by Henryk Iwaniec of Rutgers, Enrico Bombieri of the Institute for Advanced Study and John B. Friedlander of the University of Toronto, adding his own ingenuity to tie everything together in a way others had been unable to.

…

Page 201: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Live Demo: Semantic Indexing

Find information about professionals and their position name if they are a professor

select docid, sem_contains_select(1) sparqlrslt
from doctable
where sem_contains (doc,
'SELECT ?nm ?orgnm ?posnm
 WHERE { ?s rdf:type ctype:PersonCareer .
         ?s c:careertype "professional" .
         ?s c:person ?o .
         ?o c:name ?nm .
         ?s c:organization ?org .
         ?org c:name ?orgnm .
         OPTIONAL { ?s c:position ?pos .
                    ?pos c:name ?posnm .
                    FILTER (regex(?posnm,"^professor")) }
 }',
sem_aliases(sem_alias('ctype','http://s.opencalais.com/1/type/em/r/')
,sem_alias('c','http://s.opencalais.com/1/pred/')), 1) = 1
order by docid;

Page 202: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Part 6: Performance and Scalability


Page 203: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Graphs Are Big and Are Getting Bigger

• Social Scale*
– 1 billion vertices, 100 billion edges
• Web Scale*
– 50 billion vertices, 1 trillion edges
• Brain Scale*
– 100 billion vertices, 100 trillion edges

* An NSA Big Graph Experiment

http://www.pdl.cmu.edu/SDI/2013/slides/big_graph_nsa_rd_2013_56002v1.pdf

Page 204: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

A Big RDF Graph Linking Many Data Sources


Page 205: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Graphs at Scale

Properties of Graph Problems

1. Data Driven
2. Unstructured
3. Poor Locality
4. High IO to Compute ratio

Page 206: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Graphs at Scale

Properties of Graph Problems

1. Data Driven
   A. Dictated by Node and Link structure
   B. Structure not known a priori
2. Unstructured
3. Poor Locality
4. High IO to Compute ratio

Page 207: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Graphs at Scale

Properties of Graph Problems

1. Data Driven
2. Unstructured
   A. Irregular Structure of Graphs
   B. Difficult to parallelize
   C. Difficult to partition
3. Poor Locality
4. High IO to Compute ratio

Page 208: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Graphs at Scale

Properties of Graph Problems

1. Data Driven
2. Unstructured
3. Poor Locality
   A. Represent relationships between entities
   B. Esp. for graphs derived from analysis
4. High IO to Compute ratio

Page 209: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Parallel Architectures and Computing Models

1. Distributed Memory Architectures
2. Partitioned Global Address Space
3. Shared memory
4. Massively Multi-threaded

Page 210: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Parallel Graph Queries and Hardware

1. Task Granularity – where to introduce parallelism?
   A. IO parallelism – DBWR, partitioning, parallel hints
   B. CPU parallelism – multi-threading, single thread multi core
2. Memory Contention
   A. Shared memory in a Global Address Space
   B. Not an issue in Oracle
3. Load Balancing
   A. How to load balance multi-threaded systems
   B. Less pronounced on shared memory
4. Shared Graphs
   A. Focus on throughput
   B. IO parallelism

Page 211: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Setup for Performance

• Use a balanced hardware system for databases and mid-tier servers
– A single, huge physical disk for everything is not recommended.
  • Multiple hard disks tied together through ASM is a good practice
– A virtual machine for multiple databases and applications is not recommended
– Make sure throughput of hardware components matches up

Component                 Hardware spec   Sustained throughput
CPU core                  -               100 - 200 MB/s
1/2 Gbit HBA              1/2 Gbit/s      100/200 MB/s
16 port switch            8 * 2 Gbit/s    1,200 MB/s
Fiber channel             2 Gbit/s        200 MB/s
Disk controller           2 Gbit/s        200 MB/s
GigE NIC (interconnect)   2 Gbit/s        80 MB/s*
Disk (spindle)                            30 - 50 MB/s
MEM                                       2k-7k MB/s
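To make "throughput of hardware components matches up" concrete, the sketch below does the back-of-the-envelope arithmetic with the sustained-throughput figures from the table. The function names and the 8-core example are illustrative assumptions, not part of the original deck; real sizing would also depend on workload and I/O pattern.

```python
import math

# Sustained throughput figures (MB/s) taken from the table on this slide.
DISK_SPINDLE_MBPS = (30, 50)      # low and high end of the stated range
DISK_CONTROLLER_MBPS = 200
CPU_CORE_MBPS = (100, 200)        # low and high end of the stated range


def spindles_needed(controller_mbps, spindle_mbps):
    """Spindles required so the disks, not the controller, stop being the limit."""
    return math.ceil(controller_mbps / spindle_mbps)


def controllers_needed(cores, core_mbps, controller_mbps):
    """Controllers required to feed `cores` CPU cores at `core_mbps` each."""
    return math.ceil(cores * core_mbps / controller_mbps)


# One 200 MB/s controller needs between 4 and 7 spindles to be kept busy.
spindles_fast = spindles_needed(DISK_CONTROLLER_MBPS, DISK_SPINDLE_MBPS[1])
spindles_slow = spindles_needed(DISK_CONTROLLER_MBPS, DISK_SPINDLE_MBPS[0])

# Hypothetical 8-core server, each core consuming 100 MB/s.
controllers_8_cores = controllers_needed(8, CPU_CORE_MBPS[0], DISK_CONTROLLER_MBPS)
```

Under these assumptions a single spindle at 30 - 50 MB/s cannot feed even one CPU core at 100 - 200 MB/s, which is the reasoning behind the advice against a single large disk.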

Page 212: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

1. Plan on Going to Disk
2. Running Out of Memory
3. Being Smart About It
4. Have IB already in place
5. Locate data close to compute nodes
6. Realizing It is Part of a Larger EcoSystem

Page 213: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Oracle Engineered Systems

[Diagram: Oracle engineered systems shown along a scale labeled Purpose Built and General Purpose: Exadata, Exalogic, SPARC SuperCluster]

Page 214: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Oracle Exadata Database Machine

• Designed to be the best platform for Oracle Database
• Integrates and optimizes servers, storage, network
– Using high-volume servers in a scale-out architecture
– Flash-optimized
• Special software brings database intelligence to storage, flash, and networking
– To deliver extreme performance and data compression
• Deployments are fast, reliable, and supportable
• Optimal for all database workloads
– OLTP, Data Warehousing, and Database Clouds

Page 215: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Exadata Storage Software Unique Features

• Exadata Smart Scans
– 10X or greater reduction in data sent to database servers
• Exadata Storage Indexes
– Eliminate unnecessary I/Os
• Hybrid Columnar Compression
– Efficient compression increases effective storage capacity and increases user data scan bandwidths by a factor of up to 10X
– Very efficient on triples and quads
• Exadata Smart Flash Cache
– Breaks random I/O bottleneck by increasing IOPS by up to 20X
– Doubles user data scan bandwidths
• I/O Resource Manager (IORM)
– Enables storage grid by prioritizing I/Os to ensure predictable performance
• Quality of Service (QoS)
– Actively meet and maintain SLAs
– Memory Guard to protect existing current transactions from memory-based failures

Page 216: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Components - Exadata, Exalogic, and SuperCluster

                                        Exadata X3-8        Exalogic X3-2        SuperCluster
Compute Grid
- Compute Servers                       2 Sun Server X2-8   30 Sun Server X3-2   4 SPARC T4-4
- Compute Cores                         160 Xeon            480 Xeon             128 SPARC T4
- Compute Server Memory                 4 TB                7.5 TB               4 TB
- Operating Systems                     Linux               Linux / Solaris      Solaris
- Oracle Virtual Machine Partitioning   -                   Yes                  Yes
- Exalogic Elastic Cloud Software       -                   Yes                  Yes
Storage Grid
- ZFS 7320 NAS Appliance                -                   Yes (60TB)           Yes (60TB)
- Exadata Storage Servers               14 (up to 504TB)    -                    6 (up to 216TB)
Networking
- Ethernet                              1 GbE / 10GbE       10GbE                10GbE
- InfiniBand Fabric                     Yes                 Yes                  Yes
- Fibre Channel SAN Connectivity        -                   -                    Yes (optional)

Page 217: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Configuration for Performance


Page 218: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Configure OS and Network

• Network configuration is important to data integration performance
– Network MTU (TCP, InfiniBand)
– net.core.rmem_max, wmem_max
• Linux OS kernel parameters
– shmmax,
– shmall,
– aio-max-nr,
– sem, …

Page 219: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Configure Database

• Database parameters
– SGA, PGA, filesystemio_options,
– db_cache_size, auto DOP, …
• Calibrate I/O performance
‒ DBMS_RESOURCE_MANAGER.CALIBRATE_IO
• Gather statistics
• Run a typical workload on a typical data set
‒ Check AWR report to see top waits
‒ Check SQL Monitor report to find bottlenecks in SQL executions

Page 220: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Configure Mid-Tier Server

• Understand bottleneck
‒ Use tools, jstack/top for example, to identify top threads
• Set a proper JVM heap size
‒ Pay close attention to GC activities and memory related settings
• Try -XX:+UseParallelGC, -XX:+UseConcMarkSweepGC, -XX:NewRatio, -XX:SurvivorRatio, etc.
• For Java clients using JDBC (through Jena Adapter)
‒ Network MTU, Oracle SQL*Net parameters including SDU, TDU, SEND_BUF_SIZE, RECV_BUF_SIZE, …
‒ Linux kernel parameters: net.core.rmem_max, wmem_max, net.ipv4.tcp_rmem, tcp_wmem, …

Page 221: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Performance and Scalability

• Scales to 100s of billions of triples (petabytes) and more
– Scales linearly with Oracle database and hardware
– No limitations as with other in-memory approaches
• Fast loading of triples
– Incremental and bulk loading
• Parallelism is exploited
– Load, Query, Inference
• Fast access by persisting asserted and inferred triples

Page 222: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Oracle Spatial and Graph - LUBM 200K on 3-Node RAC

Sun Server X2-4 Load, Inference and Query Performance

• The LUBM 200K Graph has 48+ billion triples (edges)
– Original graph has 26.6 billion unique triples (quads)
– Inference produced another 21.4 billion triples
• Data Loading Performance
– Triples Loaded and Indexed Per Second (TLIPS): 273K
• Inference Performance
– Triples Inferred and Indexed Per Second (TIIPS): 327K
• SPARQL Query Performance
– Query Results Per Second (QRPS): 459K

Setup:
Hardware: Sun Server X2-4, 3-node RAC
- Each node configured with 1TB RAM, 4 CPU 2.4GHz 10-Core (Intel E7-4870)
- Storage: Dual Node 7420, both heads configured as: Sun ZFS Storage 7420 4 CPU 2.00GHz 8-Core (Intel E7-4820) 256G Memory 4x SSD SATA2 512G (READZ) 2x SATA 500G 10K. Four disk trays with 20 x 900GB disks @10Krpm, 4x SSD 73GB (WRITEZ)
Software: Oracle Database 11.2.0.3.0, SGA_TARGET=750G and PGA_AGGREGATE_TARGET=200G
Note: Only one node in this RAC was used for performance test. Test performed in April 2013.

Page 223: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Oracle Spatial and Graph - LUBM 200K on 3-Node RAC

Sun Server X2-4 Load Performance

Data Set: LUBM200K

Load step                  Quads Loaded                           Time             Degree of Parallelism
Load into Staging Table    27.4 billion quads (with duplicates)   2 hrs 6 min.     DOP = 66
Load into the RDF graph    26.6 billion quads (unique quads)      22 hrs 23 min.   DOP = 80

• Data loading included de-duplication and building of two indexes on the quads. A significant portion (11 hrs 18 minutes) of the total load time was spent in building the two indexes.
• Loading from the 198 compressed N-Quad formatted files was done by defining an External Table (with gunzip preprocessor) on those files and then using sem_apis.LOAD_INTO_STAGING_TABLE
• Load flags => parse mbv_method=shadow parallel=80 parallel_create_index DEL_BATCH_DUPS=USE_INSERT
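As a rough reading aid for the load figures on this slide, the sketch below works out how much of the RDF graph load went into index building and how many staged quads were duplicates. It assumes the 11 hrs 18 min of index building is counted inside the 22 hrs 23 min graph load; the slide says "total load time" without specifying which total, so treat the percentage as an estimate. Variable names are illustrative.

```python
def to_minutes(hours, minutes):
    """Convert an 'H hrs M min' duration to minutes."""
    return hours * 60 + minutes


staging_load = to_minutes(2, 6)     # load into staging table
graph_load = to_minutes(22, 23)     # load into the RDF graph
index_build = to_minutes(11, 18)    # time spent building the two indexes

# Assumption: index build time is part of the RDF graph load step.
index_share_pct = round(100 * index_build / graph_load, 1)

# Quad counts in millions, as reported on the slide.
staged_millions = 27_400
unique_millions = 26_600
duplicate_millions = staged_millions - unique_millions
duplicate_share_pct = round(100 * duplicate_millions / staged_millions, 1)
```

On these numbers, index building accounts for about half of the graph load, and de-duplication removed about 800 million quads, a little under 3% of the staged input.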

Setup:

Hardware: Sun Server X2-4, 3-node RAC

- Each node configured with 1TB RAM, 4 CPU 2.4GHz 10-Core (Intel E7-4870)

- Storage: Dual Node 7420, both heads configured as: Sun ZFS Storage 7420 4 CPU 2.00GHz 8-Core (Intel E7-4820)

256G Memory 4x SSD SATA2 512G (READZ) 2x SATA 500G 10K. Four disk trays with 20 x 900GB disks @10Krpm, 4x SSD 73GB (WRITEZ)

Software: Oracle Database 11.2.0.3.0, SGA_TARGET=750G and PGA_AGGREGATE_TARGET=200G

Note: Only one node in this RAC was used for performance test. Test performed in April 2013.

Page 224: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Oracle Spatial and Graph - LUBM 200K on 3-Node RAC

Sun Server X2-4 Inference Performance

Data Set (# quads)   Quads Inferred   Time             Degree of Parallelism
LUBM 200K (27.4B)    21.4 billion     17 hrs 56 min.   DOP = 80

Inference Semantics: OWLPrime + the following components: INTERSECT, INTERSECTSCOH, SVFH, THINGH, THINGSAM, UNION

Inference Options: RAW8=T, Dynamic Sampling level 1

Inference included building 2 indexes on the inferred triples that took a little over 5 hrs.

Setup:

Hardware: Sun Server X2-4, 3-node RAC

- Each node configured with 1TB RAM, 4 CPU 2.4GHz 10-Core (Intel E7-4870)

- Storage: Dual Node 7420, both heads configured as: Sun ZFS Storage 7420 4 CPU 2.00GHz 8-Core (Intel E7-4820)

256G Memory 4x SSD SATA2 512G (READZ) 2x SATA 500G 10K. Four disk trays with 20 x 900GB disks @10Krpm, 4x SSD 73GB (WRITEZ)

Software: Oracle Database 11.2.0.3.0, SGA_TARGET=850G and PGA_AGGREGATE_TARGET=150G

Note: Only one node in this RAC was used for performance test. Test performed in April 2013.

Page 225: Oracle Spatial and Graphdownload.oracle.com/.../xldb2013_rdf_graph_training.pdf• CIA World Fact Book • DBLP • UniProt • UniParc • CiteSeer • KEGG • Data.gov.uk • Music

Oracle Spatial and Graph - LUBM 200K on 3-Node RAC

Sun Server X2-4 Query Performance

Ontology: LUBM 200K – 48B quads
- 26.6 billion asserted quads (unique)
- 21.4 billion inferred quads
Inference: OWLPrime & new inference components

LUBM Benchmark Queries

Query        Q1     Q2       Q3     Q4       Q5     Q6        Q7
# answers    4      494.5M   6      34       719    2.067B    67
Time (sec)   0.01   1160     0.01   609.22   0.04   1105.07   712.48

Query        Q8        Q9        Q10    Q11    Q12   Q13      Q14
# answers    7790      53.86M    4      224    15    926088   1.568B
Time (sec)   1228.95   3139.28   0.01   0.01   1.2   208.88   946.01

DOP = 40, Dynamic sampling level = 6. 4.18 billion answers generated in 2.53 hrs on a single node.
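The summary line (4.18 billion answers in 2.53 hrs) and the 459K Query Results Per Second figure quoted earlier can be cross-checked from the per-query rows. The sketch below sums the answer counts and times for Q1 to Q14 as printed on the slide; the counts given as 494.5M, 2.067B, 53.86M and 1.568B are already rounded, so the totals are approximate.

```python
# Per-query results for Q1..Q14, copied from the slide.
answers = [
    4, 494.5e6, 6, 34, 719, 2.067e9, 67,            # Q1..Q7
    7790, 53.86e6, 4, 224, 15, 926088, 1.568e9,     # Q8..Q14
]
seconds = [
    0.01, 1160, 0.01, 609.22, 0.04, 1105.07, 712.48,     # Q1..Q7
    1228.95, 3139.28, 0.01, 0.01, 1.2, 208.88, 946.01,   # Q8..Q14
]

total_answers = sum(answers)            # about 4.18 billion
total_seconds = sum(seconds)            # about 9111 seconds
total_hours = total_seconds / 3600      # about 2.53 hours

# Query Results Per Second over the whole benchmark run.
qrps = total_answers / total_seconds
qrps_thousands = round(qrps / 1000)
```

Summed this way, the run comes to roughly 459 thousand results per second, consistent with the QRPS figure on the summary slide. Almost all of the elapsed time is spent in the handful of queries that return millions or billions of rows.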

Setup:

Hardware: Sun Server X2-4, 3-node RAC

- Each node configured with 1TB RAM, 4 CPU 2.4GHz 10-Core (Intel E7-4870)

- Storage: Dual Node 7420, both heads configured as: Sun ZFS Storage 7420, 4 CPU 2.00GHz 8-Core (Intel E7-4820), 256G Memory, 4x SSD SATA2 512G (READZ), 2x SATA 500G 10K. Four disk trays with 20 x 900GB disks @10Krpm, 4x SSD 73GB (WRITEZ)

Software: Oracle Database 11.2.0.3.0, SGA_TARGET=850G and PGA_AGGREGATE_TARGET=150G

Note: Only one node in this RAC was used for the performance test. Test performed in April 2013.
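
The reported totals for the LUBM 200K query run can be cross-checked against the per-query figures on this slide. The following is a minimal Python sketch of that arithmetic; the numbers are transcribed from the slide, while the variable names and the derived answers-per-second figure are illustrative and not part of the original material.

```python
# Per-query results for LUBM 200K (Q1..Q14), transcribed from the slide.
answers = [4, 494.5e6, 6, 34, 719, 2.067e9, 67,
           7790, 53.86e6, 4, 224, 15, 926088, 1.568e9]
seconds = [0.01, 1160, 0.01, 609.22, 0.04, 1105.07, 712.48,
           1228.95, 3139.28, 0.01, 0.01, 1.2, 208.88, 946.01]

total_answers = sum(answers)
total_seconds = sum(seconds)
total_hours = total_seconds / 3600

# Derived figure (not stated on the slide): average answers per second.
throughput = total_answers / total_seconds

print(f"{total_answers / 1e9:.2f} billion answers")  # slide reports 4.18 Billion
print(f"{total_hours:.2f} hours")                    # slide reports 2.53 hrs
print(f"{throughput:,.0f} answers/sec")
```

Summing the rounded per-query values reproduces both reported totals, which suggests that the 2.53 hrs figure is the sum of the fourteen individual query times.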

Parallel Execution Performance on M8000

• LUBM 25K local inference on Sun M8000

• 6.1B+ quads (3.4B asserted, 2.7B inferred)

[Chart: Time (in minutes) vs. Parallel Degree (DOP); data labels 345, 214, 180, 150, 145, 121]

Oracle’s Parallel Execution is completely transparent!

• Across CPUs/cores on a single node

• Across multiple nodes in a cluster

Inference Performance on Exadata V2

Data Set (# triples)   Triples Inferred   Time          Degree of Parallelism
LUBM 100K (13B)        5B                 1h, 58’ (1)   DOP = 32
LUBM 25K (3.3B)        2.7B               4h, 7’ (2)    DOP = 32
LUBM 8K (1.1B)         869M               46’ (2)       DOP = 64

(1) Preliminary result: 1 round of OWLPrime (OWL Horst semantics)

(2) Inference: OWLPrime + components: INTERSECT, INTERSECTSCOH, SVFH, THINGH, THINGSAM, UNION

Setup:

Hardware: Full Rack Sun Oracle Database Machine X2-2 (8 nodes, 72GB RAM per node), and Exadata Storage Server

Storage required: LUBM8K: 330GB or LUBM25K 1TB + 110GB temp table space

Software: Oracle Database 11.2.0.1.0 + Patch 9819833: SEMANTIC TECHNOLOGIES 11G R2 FIX BUNDLE 2

Each node: SGA_TARGET=32G and PGA_AGGREGATE_TARGET=31G

Query Performance on Exadata V2

Ontology: LUBM 25K (3.3 billion triples & 2.7 billion inferred)

Inference: OWLPrime & new inference components

LUBM Benchmark Queries

Query       Q1    Q2      Q3    Q4    Q5    Q6     Q7
# answers   4     2528    6     34    719   260M   67
Complete?   Y     Y       Y     Y     Y     Y      Y
Time (sec)  0.01  20.65   0.01  0.01  0.02  23.07  4.99

Query       Q8    Q9      Q10   Q11   Q12   Q13    Q14
# answers   7790  6.8M    4     224   15    0.11M  197M
Complete?   Y     Y       Y     Y     Y     Y      Y
Time (sec)  0.48  203.06  0.01  0.02  0.02  2.40   19.45

Auto DOP used. 465,849,803 answers generated for LUBM 25K in 274.2 sec.


Part 7: Demonstration


Managing and Mining RDF Graph Data with Oracle Spatial and Graph

Tools: Ontology Editing and Engineering

Integration with Protégé

Protégé 4.1

http://protege.stanford.edu/

• Open source ontology editor

• Java APIs (Jena Adapter) allow easy integration of Protégé with Oracle Spatial and Graph

Tools: Ontology Editing and Engineering

Integration with TopQuadrant Tools

• Commercial tool

• Java APIs (Jena Adapter) allow easy integration of TopQuadrant TopBraid Suite with Oracle Spatial and Graph

Tools: Navigation and Visualization of Graphs

Integration with Cytoscape

http://www.cytoscape.org/

• Open source visualizer

• Visualizes RDF and OWL stored in Oracle Database

• Enables Fish Eye views by building summary-detail graphs

• Performant with large RDF data sets

Tools: Navigation and Visualization of Graphs

Integration with Visualization

Tom Sawyer Perspectives

• Commercial tool

• Standards-based SPARQL web service endpoint allows easy integration with Tom Sawyer Perspectives

Tools: Reporting RDF in Business Intelligence

Integration with Oracle Business Intelligence EE

• Standards-based SPARQL web service endpoint and SPARQL Gateway feed XML (transformed from SPARQL XML response) to BI

Tools: Statistical Graph Analytics

Oracle R Enterprise feature of Oracle Advanced Analytics

Integration with Oracle R Enterprise

• Open source language

• Statistical computing and charting for graph data

• Produces publication quality plots

• Highly extensible with open source R packages

Tools: Discovery & Predictive Analysis

Oracle Data Mining feature of Oracle Advanced Analytics

Problem classifications and sample problems:

• Anomaly Detection: Given demographic data about a set of customers, identify customer purchasing behavior that is significantly different from the norm

• Association Rules: Find the items that tend to be purchased together and specify their relationship – market basket analysis

• Clustering: Segment demographic data into clusters and rank the probability that an individual will belong to a given cluster

• Feature Extraction: Given demographic data about a set of customers, group the attributes into general characteristics of the customers

Using Oracle Data Mining with RDF Graph Data

• Make the semantic data available to a data mining tool in an appropriate format

– Turn a semantic data store into yet another data source for the DM tool

(In the statement below, the outer query is SQL and the graph pattern passed to sem_match is SPARQL.)

create view N_COUNTRY_BD_RATE as
select name
     , to_number(brate) as brate
     , to_number(drate) as drate
     , to_number(popu) as population
     , to_number(mig) as net_migration_rate
     , to_number(imr) as infant_mortality_rate
     , to_number(leab) as life_expectancy
  from table(sem_match('{
    ?subject <http://www.cia.gov/cia/publications/factbook#Birth_rate> ?brate .
    ?subject <http://www.cia.gov/cia/publications/factbook#Death_rate> ?drate .
    ?subject <http://www.cia.gov/cia/publications/factbook#Name> ?name .
    ?subject <http://www.cia.gov/cia/publications/factbook#Population> ?popu .
    ?subject <http://www.cia.gov/cia/publications/factbook#Net_migration_rate> ?mig .
    ?subject <http://www.cia.gov/cia/publications/factbook#Infant_mortality_rate> ?imr .
    ?subject <http://www.cia.gov/cia/publications/factbook#Life_expectancy_at_birth> ?leab .
  }' … ))

Using Oracle Data Mining with RDF Graph Data

• Tie it all together

– Turn a semantic data store into yet another data source for DM

– Follow the conventional DM process:

• Data preparation, build/evaluate model, deployment

• This is one example of what you may get: [figure]

• Some mining results can be saved back as RDF into Oracle Database

Anomaly Detection output in SQL, converted into RDF:

:AbnormalCase1 :hasSubject :Dominica .
:AbnormalCase1 :probability "0.54" .
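
The conversion step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slide does not show the shape of the anomaly detection output, so the (subject, probability) row format and the function name anomalies_to_triples are hypothetical, and prefix declarations are omitted as they are on the slide.

```python
from typing import Iterable, List, Tuple


def anomalies_to_triples(rows: Iterable[Tuple[str, float]]) -> List[str]:
    """Turn (subject, probability) rows into Turtle-style triples.

    The row shape is an assumption; the slide only shows the resulting RDF.
    """
    triples = []
    for i, (subject, probability) in enumerate(rows, start=1):
        case = f":AbnormalCase{i}"
        triples.append(f"{case} :hasSubject :{subject} .")
        triples.append(f'{case} :probability "{probability}" .')
    return triples


# The single case shown on the slide.
for triple in anomalies_to_triples([("Dominica", 0.54)]):
    print(triple)
```

The generated triples could then be loaded like any other RDF data, so that mining results become queryable alongside the source graph.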

Oracle Enterprise Manager

Understand exactly what is going on

• Configuration

• Storage

• Security

• Performance

– Real time monitor

– CPU

– Memory

– I/O

– Sessions

– Activity

– Workload

– …

• …


Part 8: SQL-Based Graph Analytics


Why Use SQL for Graph Analytics?

• A standard language that runs on every platform

• Modern RDBMS provides many cool features

– Cost based optimizer

– Parallel execution

– Partition pruning

– Hierarchical queries, Match_Recognize

– Smart scans, Table Compression, Index Compression, IMDB

– Temporal query

– Transparent query rewrite (OLS/VPD)

– No need to worry about memory management

– No need to worry about workload management in a cluster

• Even NoSQL folks want to have a SQL interface

Shortest Path

• Why is shortest path (SP) important?

– SP is a good measure of how closely related things are

– SP is a foundation for many other algorithms and metrics

• Distance, diameter, radius, eccentricity

• Closeness Centrality

– In G, if John is closer to others than Mary, then John is more important because John can spread information quicker.

• Betweenness Centrality

– In G, if John lies on more shortest paths between other nodes than Mary, then John is more important because John can better control information flow

• Many existing algorithms

– Dijkstra, Bellman Ford, BFS, A*, Johnson, …

• Proposed solution: SQL-Based Dijkstra (SBD)

http://en.wikipedia.org/wiki/Centrality
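
To make the link between shortest paths and closeness centrality concrete, here is a minimal in-memory Python sketch. It is a textbook Dijkstra with a priority queue, not the SQL-Based Dijkstra (SBD) proposed on the slide, whose implementation is not shown here. The toy graph, the function names, and the particular closeness formula (reachable nodes divided by the sum of distances to them) are assumptions chosen for illustration.

```python
import heapq
from typing import Dict, List, Tuple

Graph = Dict[str, List[Tuple[str, int]]]  # node -> [(neighbor, weight)]


def dijkstra(graph: Graph, source: str) -> Dict[str, int]:
    """Return the shortest-path distance from source to each reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for neighbor, weight in graph.get(node, []):
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist


def closeness(graph: Graph, node: str) -> float:
    """Closeness: number of reachable nodes / sum of distances to them."""
    dist = dijkstra(graph, node)
    total = sum(d for n, d in dist.items() if n != node)
    return (len(dist) - 1) / total if total > 0 else 0.0


# Hypothetical toy graph; each undirected edge is listed in both directions.
toy = {
    "John": [("Mary", 1), ("Ann", 1), ("Bob", 1)],
    "Mary": [("John", 1)],
    "Ann": [("John", 1), ("Bob", 2)],
    "Bob": [("John", 1), ("Ann", 2)],
}

print(dijkstra(toy, "Mary"))
print(closeness(toy, "John"), closeness(toy, "Mary"))
```

In this toy graph John is one hop from everyone while Mary reaches the others only through John, so John's closeness score is higher, matching the intuition in the bullet above.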

Performance of Finding Shortest Path in SQL

• Twitter Graph

Nodes 41,652,230

Edges 1.4+ Billion

Max Out Degree 2,997,469

Max In Degree 770,155

Weights: integers in [1, 100]

• Performance on Desktop. Hardware: 8GB RAM Desktop, 3-SATA disk, quad core

• Performance on Exadata. Hardware: Exadata, 128 cores, 128G SGA, 64G PGA (shared with 3 other databases)

                             Desktop                     Exadata
Random Tests                 Run1     Run2    Run3       Run1   Run2   Run3
 1: v766      v12345         2.84s    1.95s   1.91s      1.11s  1.06s  1.15s
 2: v766      v667023        2.41s    1.51s   1.44s      0.86s  0.85s  0.73s
 3: v12       v667023        4.55s    4.46s   4.54s      6.61s  3.84s  3.82s
 4: v667023   v12            20.09s   1.55s   1.12s      3.91s  0.88s  0.84s
 5: v20       v13220032      8.62s    7.36s   7.42s      6.52s  5.85s  5.84s
 6: v32309592 v14293310      153.73s  3.93s   2.78s      2.12s  2.12s  2.42s
 7: v702013   v37083737      14.99s   1.85s   1.62s      1.0s   1.02s  1.08s
 8: v12041332 v37083590      1.55s    1.50s   1.03s      0.92s  0.92s  1.0s
 9: v57640358 v19945621      11.91s   0.77s   0.65s      0.56s  0.54s  0.53s
10: v19945621 v57640358      315.59s  93.44s  5.39s      4.05s  3.59s  3.76s

Test performed in Dec 2012 ~ Feb 2013

Page Ranking

• Page Ranking (PR) definition (from Google’s two founders)

PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))

where PR(A) is the Page Rank of page (node or vertex) A, T1 ... Tn are the pages that link to A, d is a damping factor, and C(Ti) is the out degree of page Ti

– PR is a variant of Eigenvector centrality, which measures the influence of a node in a network

– Intuitions:

• A vertex (page) is more important if there are more incoming edges

• A vertex is more important if there are in-edges from more important vertexes

• Contribution of a vertex is diluted if out-degree is high

– Proposed Solution: iteratively update Page Ranking of vertexes using SQL and parallel execution

[Figure: example graph with vertex A and vertexes T1, T2, T3, T4]
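
The iterative update of the PR formula can be sketched as follows. This is a small in-memory Python illustration, not the SQL and parallel execution approach proposed on the slide. The damping factor 0.85 is a commonly used value that the slide does not specify, and the toy graph and function name are hypothetical. The sketch also records the average absolute change per iteration, the same quantity used to judge convergence.

```python
from typing import Dict, List, Tuple


def pagerank(out_links: Dict[str, List[str]], d: float = 0.85,
             iterations: int = 50) -> Tuple[Dict[str, float], List[float]]:
    """Iterate PR(A) = (1-d) + d * sum(PR(T)/C(T)) over pages T linking to A.

    Returns the final ranks and the average |PR(v) - PRprev(v)| per iteration.
    Every link target must also appear as a key of out_links.
    """
    pr = {v: 1.0 for v in out_links}
    changes = []
    for _ in range(iterations):
        new_pr = {v: 1.0 - d for v in out_links}
        for t, targets in out_links.items():
            for a in targets:
                # Contribution of t is diluted by its out-degree C(t).
                new_pr[a] += d * pr[t] / len(targets)
        changes.append(sum(abs(new_pr[v] - pr[v]) for v in pr) / len(pr))
        pr = new_pr
    return pr, changes


# Hypothetical toy graph: T1 and T2 link to A, and A links back to T1.
toy = {"A": ["T1"], "T1": ["A"], "T2": ["A"]}
ranks, changes = pagerank(toy)
print(ranks)
print(changes[0], changes[-1])
```

A has two in-edges and ends up with the highest rank, while T2 has none and stays at the floor value 1-d. The average change shrinks from one iteration to the next, which is why a fixed, modest number of iterations is often enough in practice.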

Performance of Page Ranking in SQL

• Twitter Graph

Nodes 41,652,230

Edges 1.4+ Billion

Max Out Degree 2,997,469

Max In Degree 770,155

Weights: not relevant

• Performance evaluation

• 50 iterations of PR calculation took less than 39 minutes

• DOP = 128

• After 10 iterations, the average PR change is less than 0.003

• Hardware: Exadata, 128 cores, 128G SGA, 64G PGA (shared with 3 other databases)

[Chart: Avg | PR(v) – PRprev(v) | (y-axis, 0 to 0.5) vs. Iterations (x-axis, 1 to 49)]

Test performed in Dec 2012 ~ Feb 2013

Summary

• Graph is a flexible, intuitive data model

• W3C standards-based RDF Semantic Graph

• Tools that work with Graph data

• SQL-Based Graph Analytics

• Oracle provides an efficient, secure, and scalable platform for

– Managing graph data

– Performing graph analytics
