New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical...

61
Foundations of RDF Databases Claudio Gutierrez Department of Computer Science Universidad de Chile UNS - Bahía Blanca - 2008

Transcript of New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical...

Page 1: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

Foundations of RDF Databases

Claudio Gutierrez

Department of Computer ScienceUniversidad de Chile

UNS - Bahía Blanca - 2008

Page 2: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

• Renzo Angles• Marcelo Arenas• Carlos Hurtado• Sergio Muñoz• Jorge Pérez

Joint Work With

C. Gutierrez – Foundations of RDF Databases

Page 3: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

C. Gutierrez – Foundations of RDF Databases

To the memory ofAlberto Mendelzon,

database theoreticianand Web enthusiast

Inspired by…

Page 4: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

1. RDF and Databases2. RDF and Database models3. RDF Query Language– Requirements and Domains– Manifold Views

4. SPARQL– Syntax and Semantics– Complexity– Expressive Power

C. Gutierrez – Foundations of RDF Databases

Agenda

Page 5: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

1. RDF and Databases2. RDF and Database models3. RDF Query Language– Requirements and Domains– Manifold Views

4. SPARQL– Syntax and Semantics– Complexity– Expressive Power

C. Gutierrez – Foundations of RDF Databases

Agenda

Page 6: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

More like a “Computer Science” than a“Web Science” talk

C. Gutierrez – Foundations of RDF Databases

Disclaimers

A particular view on the subject

Not a survey!

Apologies to Web Scientists

Page 7: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

“ The Semantic Web is therepresentation of data on the WorldWide Web. It is a collaborative effortled by W3C with participation from alarge number of researchers andindustrial partners. It is based on theResource Description Framework(RDF)”

The base of the Semantic Web is RDF

http://www.w3.org/2001/sw/

C. Gutierrez – Foundations of RDF Databases

Page 8: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

RDF Recommendation (1999)

Language for representing

information about

resources in the Web

Particularly metadata

about Web resources

Automation of processing:“RDF is intended for situations in

which this information needs to be

processed by applications, rather

than only displayed to people”

C. Gutierrez – Foundations of RDF Databases

Page 9: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

Layers of the Semantic Web

C. Gutierrez – Foundations of RDF Databases

Page 10: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

A Data Processing perspective

r

12 4

36

5q

sw

tu p

ba

h

cf

Unicode URI

Proof

Trust

Digi

tal S

igna

ture

XML + NS + xmlschema( Text + Links )

(entities + relations )

(Concepts + knowledge)

RDF + rdfschema

Logic + Ontology vocabulary

C. Gutierrez – Foundations of RDF Databases

Page 11: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

• Manage huge volumes of data with logical precision• Separate modeling from implementation levels

The Database Approach

RDF

+ DB

RDF

C. Gutierrez – Foundations of RDF Databases

Page 12: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

• Manage huge volumes of data with logical precision• Separate modeling from implementation levels

The Database Approach

RDF

+ DB

RDF

C. Gutierrez – Foundations of RDF Databases

As opposed to AI: DB primary concern isscalability. Then expressive power

Page 13: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

• Manage huge volumes of data with logical precision• Separate modeling from implementation levels

The Database Approach

RDF

+ DB

RDF

C. Gutierrez – Foundations of RDF Databases

As opposed to AI: DB primary concern isscalability. Then expressive power

As opposed to IR: DB primary concern isprecision. Then scalability (recall).

Page 14: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

Native Data Store

APIs

Data Structure: RDF Graphs

RDBMS

MySQLMSQL

Oracle DB2

Postgres

Files

Applications Services

Query language

SPARQL SeRQL RDQL RQL …..

RDF Database Technology

C. Gutierrez – Foundations of RDF Databases

Page 15: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

APIs

Data Structure: RDF Graphs

RDBMS

MySQLMSQL

Oracle DB2

Postgres

Files

Applications Services

Query language

SPARQL SeRQL RDQL RQL …..

Native Data Store

This Course: Database Modeling Level

C. Gutierrez – Foundations of RDF Databases

Page 16: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

This Course: Database Modeling Level

C. Gutierrez – Foundations of RDF Databases

Hence leaving out:• Visualization, APIs, Services, etc.• Indexing, storing, transactions, etc.

Page 17: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

This Course: Database Modeling Level

C. Gutierrez – Foundations of RDF Databases

Hence leaving out:• Visualization, APIs, Services, etc.• Indexing, storing, transactions, etc.

But also leaving out: Updating / Constraints / Temporality /Optimization / Aggregation / Flexibility /etc. / etc.

Page 18: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

1. RDF and Databases2. RDF and Database models3. RDF Query Language

– Requirements and Domains– Manifold Views

4. SPARQL– Syntax and Semantics– Complexity– Expressive Power

C. Gutierrez – Foundations of RDF Databases

Agenda

Page 19: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

Database Models: Coddʼs definition

C. Gutierrez – Foundations of RDF Databases

Data structures

Query Language

Integrity constraints

Page 20: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

Database Models: Coddʼs definition

C. Gutierrez – Foundations of RDF Databases

Data structures

Query Language

Page 21: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

Evolution of Database Models

C. Gutierrez – Foundations of RDF Databases

RDF

Page 22: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

Evolution of Database Models

C. Gutierrez – Foundations of RDF Databases

RDF

Page 23: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

Blank NodesRDFSVocabulary

Graph (Triple) structure

RDF Data Structure: three main blocks

subClasstype

domain

range

classproperty

subProperty

?X ?Y

?Z∃

C. Gutierrez – Foundations of RDF Databases

Page 24: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

Blank Nodes

?X ?Y

?Z∃

Vocabulary

subClass

type

domain

range

classproperty

subProperty

Graph (Triple) structure

RDF Data Structure: the core

C. Gutierrez – Foundations of RDF Databases

Page 25: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

Blank Nodes

?X ?Y

?Z∃

Vocabulary

subClass

type

domain

range

classproperty

subProperty

Graph (Triple) structure

RDF Data Structure: the core

C. Gutierrez – Foundations of RDF Databases

Triple structure:set of statements

Page 26: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

Blank Nodes

?X ?Y

?Z∃

Vocabulary

subClass

type

domain

range

classproperty

subProperty

Graph (Triple) structure

RDF Data Structure: the core

Graph structure:linked network ofstatements.

C. Gutierrez – Foundations of RDF Databases

Triple structure:set of statements

Page 27: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

RDF Data Structure: Relational Tables (Triple) view

Subject Predicate Object• Triples as tuples• Set of triples as Tables

C. Gutierrez – Foundations of RDF Databases

Page 28: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

RDF Data Structure: Relational Tables (Triple) view

Subject Predicate Object• Triples as tuples• Tables of triples

Advantages:+ Well studied and

well understood+ Reuse relational

technologies

C. Gutierrez – Foundations of RDF Databases

Page 29: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

RDF Data Structure: Relational Tables (Triple) view

Subject Predicate Object• Triples as tuples• Tables of triples

Advantages:+ Well studied and

well understood+ Reuse relational

technologies

Problems (Questions):

- Why yet another syntax for

the relational model?

- Was this the intended

objective of RDF?

- Expressive power limitations

C. Gutierrez – Foundations of RDF Databases

Page 30: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

Graph Database Models:• Data and/or schema are represented by graphs• Query language able to capture main graph operations

and properties• Studied by DB community, but still not well understood

RDF Data Structure: Graph Database Model view

C. Gutierrez – Foundations of RDF Databases

Page 31: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

Graph Database Models:• Data and/or schema are represented by graphs• Query language able to capture main graph operations

and properties• Studied by DB community, but still not well understood

RDF Data Structure: Graph Database Model view

C. Gutierrez – Foundations of RDF Databases

Golde

nage

Page 32: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

PathDegree ofa Node

AdjacentEdges

Neighborhoods

PROPERTYDiameterDistanceFixed-

length path

Language

QueryGraph

F-G

Lorel

GraphDB

Gram

GraphLog

G+

G

RDF Data Structure: Graph query languages

C. Gutierrez – Foundations of RDF Databases

Page 33: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

PathDegree ofa Node

AdjacentEdges

Neighborhoods

PROPERTYDiameterDistanceFixed-

length path

Language

QueryGraph

F-G

Lorel

GraphDB

Gram

GraphLog

G+

G

RDF Data Structure: Graph query languages

C. Gutierrez – Foundations of RDF Databases

Green light for graph features!

Page 34: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

Vocabulary

subClass

type

domain

range

classproperty

subProperty

Blank Nodes

Graph (Triple) structure

?X ?Y?Z

RDF Data Structure: Triple structure + Blank nodes

C. Gutierrez – Foundations of RDF Databases

Page 35: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

Vocabulary

subClass

type

domain

range

classproperty

subProperty

Blank Nodes

Graph (Triple) structure

?X ?Y?Z

Complexity / Semantics issues:• Deciding entailment becomes

NP-complete.• Deciding core is DP-complete• Semantics of querying not

simple

RDF Data Structure: Triple structure + Blank nodes

C. Gutierrez – Foundations of RDF Databases

Page 36: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

RDF Data Structure: Ground fragment

Blank NodesVocabulary

Graph (Triple) structure

subClass

type

domain

range

classproperty

subProperty

?X ?Y

?Z∃

C. Gutierrez – Foundations of RDF Databases

Page 37: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

RDF Data Structure: Ground fragment

Blank NodesVocabulary

Graph (Triple) structure

subClass

type

domain

range

classproperty

subProperty

?X ?Y

?Z∃

C. Gutierrez – Foundations of RDF Databases

Good News: Blank nodescan be treated orthogonallyto ground fragment.

Page 38: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

RDF Data Structure: Ground fragment

C. Gutierrez – Foundations of RDF Databases

More good news:• Vocabulary can be reduced to { type, domain, range, subClassOf, subPropertyOf }

Page 39: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

RDF Data Structure: Ground fragment

C. Gutierrez – Foundations of RDF Databases

More good news:• Vocabulary can be reduced to { type, domain, range, subClassOf, subPropertyOf }• Complex semantic rules and axioms can be avoided

Page 40: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

RDF Data Structure: Ground fragment

C. Gutierrez – Foundations of RDF Databases

More good news:• Vocabulary can be reduced to { type, domain, range, subClassOf, subPropertyOf }• Complex semantic rules and axioms can be avoided• Structural (internal) constraints of the language can be

separated from user-features. e.g. (Class, type, Resource)

Page 41: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

RDF Data Structure: Ground fragment

C. Gutierrez – Foundations of RDF Databases

More good news:• Vocabulary can be reduced to { type, domain, range, subClassOf, subPropertyOf }• Complex semantic rules and axioms can be avoided• Structural (internal) constraints of the language can be

separated from user-features. e.g. (Class, type, Resource)• Features which do not add expressive power can be

avoided, e.g. reflexivity of subClass and subProperty.

Page 42: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

RDF Data Structure: A minimal fragment

Blank Nodes

?X ?Y

?Z∃

subClass

type

domain

range

subProperty

Vocabulary

Graph (Triple) structure

{subClass, subProperty, type, domain, range}

C. Gutierrez – Foundations of RDF Databases

Page 43: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

RDF Data Structure: A minimal fragment

Blank Nodes

?X ?Y

?Z∃

subClass

type

domain

range

subProperty

Vocabulary

Graph (Triple) structure

{subClass, subProperty, type, domain, range}

C. Gutierrez – Foundations of RDF Databases

Theorem: Simple proof system sound and

complete for the semantics of RDF in this

fragment. That is:

G |= F under RDF semantics iff

G |= F under mRDF semantics

Page 44: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

RDF Data Structure: A minimal fragment

Blank Nodes

?X ?Y

?Z∃

subClass

type

domain

range

subProperty

Vocabulary

Graph (Triple) structure

{subClass, subProperty, type, domain, range}

C. Gutierrez – Foundations of RDF Databases

Theorem: Simple proof system sound and

complete for the semantics of RDF in this

fragment. That is:

G |= F under RDF semantics iff

G |= F under mRDF semantics

Theorem: Let G be a restricted graph in thefragment, and t a ground tuple. Deciding ifG |= t can be done in time O(G x log(G))

Page 45: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

1. RDF and Databases2. RDF and Database models3. RDF Query Language

– Requirements and Domains– Manifold Views

4. SPARQL– Syntax and Semantics– Complexity– Expressive Power

C. Gutierrez – Foundations of RDF Databases

Agenda

Page 46: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

RDF Query language: Social Networks domain

+ Find triads by typeRanking

+ Selective counting of neighbors+ Operations between attributes+ Change relation direction based onattributes

Diffusion

+ Extract the subnetwork induced by cliquesof size n+ Build a hierarchy of cliques

Cohesive Subgroups

+ Extract subnetwork by timeFrienship+ Two-mode network to one-mode networkAffiliations

+ Group multiple binary relationsCenter and Periphery+ Extract egonetwork of an actor+ Remove relations between groups

Brokers and Bridges

+ Discretize an attributePrestige

+ Loop removalGenealogies andCitations

+ Extract a subnetwork based on attributes+ Group actors based on attributes+ Selective grouping of actors based onattributes

Attributes andRelations

+ Directed to undirected binary relations+ Remove relations

Looking for SocialStructure

Use Case (local)Chapter title

+ Connected components+ Clustering+ Bicomponents andbrockers

Connected components

+ Detect cohesivesubgroups+ Egonetworks+ Input Domain

Groups(k-neighbors, k-core,n-cliques, k-plex, etc.)

+ GeodesicsPaths and CyclesUse Case (Global)Subgraph family

C. Gutierrez – Foundations of RDF Databases

Page 47: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

RDF Query Language: Biology domain

Frequent subgraph recognitionTo find biopathways graph motifs

Subsumption testingEnzyme taxonomies

Frequent subgraph recognitionTo find biopathways graph motifs

Subgraph isomorphismChemical info retrieval

Subgraph homomorphismKinaze enzyme

Transitive ClosureFind all products ultimately derived from aparticular reaction

Least common ancestorObserve multiple products are co-regulated

Shortest path queriesIdentify “important” paths from nutrients tochemical outputs

Graph compositionTo construct pathways from individualreactions

Node matchingChemical structure associated with a node

Graph intersection, union, differenceFind the difference in metabolismsbetween two microbes

Majority graph queryTo combine multiple protein interactiongraphs

Graph compositionTo connect pathways, metabolism of co-existing organisms

Graph QueryUse Case

C. Gutierrez – Foundations of RDF Databases

Page 48: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

RDF Query Language: Web domain

C. Gutierrez – Foundations of RDF Databases

DistanceWhat is the Erdös distance between authos X and author Y?

PathsAre suspects A and B related?

AdjacencyAll relatives of degree one of Alice

PathsWhat is the influence of article D?

Degree of a nodeWhat is/are the most cited paper/s?Graph QueryUse Case

Page 49: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

Minimalist design:– Tags + Bundles (classes)– No inheritance, no intersection, etc.– Renaming

TagsA tag is simply a word you use to describe a bookmark. Unlike folders, youmake up tags when you need them and you can use as many as you like.

RDF Query Language: Tagging domain

C. Gutierrez – Foundations of RDF Databases

Page 50: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

• SQL: Great for finding data from tabular representations, can get complexwhen many tables are involved in a given query

RDF Query Language: Standardizationʼs view

C. Gutierrez – Foundations of RDF Databases

SQL, XQuery and SPARQL: What's Wrong with this Picture?Jim Melton (Oracle; XML Query Working Group, XML Coord. Group)

Sixth annual W3C Technical Plenary (March 2006)

Page 51: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

• SQL: Great for finding data from tabular representations, can get complexwhen many tables are involved in a given query

• XQuery: Great for finding data in tree representations,can get complex when many relationships have to betraversed

RDF Query Language: Standardizationʼs view

C. Gutierrez – Foundations of RDF Databases

SQL, XQuery and SPARQL: What's Wrong with this Picture?Jim Melton (Oracle; XML Query Working Group, XML Coord. Group)

Sixth annual W3C Technical Plenary (March 2006)

Page 52: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

• SQL: Great for finding data from tabular representations,can get complex when many tables are involved in agiven query

• XQuery: Great for finding data in tree representations,can get complex when many relationships have to betraversed

• SPARQL: Good pattern matching paradigm, especiallywhen relationships have to be used to answer a query

Standardizationʼs view (Jim Melton, Oracle, 2006)

C. Gutierrez – Foundations of RDF Databases

SQL, XQuery and SPARQL: What's Wrong with this Picture?Jim Melton (Oracle; XML Query Working Group, XML Coord. Group)

Sixth annual W3C Technical Plenary (March 2006)

Page 53: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

• SQL: Great for finding data from tabular representations,can get complex when many tables are involved in agiven query

• XQuery: Great for finding data in tree representations,can get complex when many relationships have to betraversed

• SPARQL: Good pattern matching paradigm, especiallywhen relationships have to be used to answer a query

Standardizationʼs view (Jim Melton, Oracle, 2006)

C. Gutierrez – Foundations of RDF Databases

SQL, XQuery and SPARQL: What's Wrong with this Picture?Jim Melton (Oracle; XML Query Working Group, XML Coord. Group)

Sixth annual W3C Technical Plenary (March 2006)

SPARQL = Sympathy Queen?

Page 54: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

• RDF is the first level of a logical tower• Emphasis in logic features of RDF model• Keep an eye in extensions to more expressive logics• Bad news: complexity issues

RDF Query Language: Logicianʼs view

C. Gutierrez – Foundations of RDF Databases

Page 55: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

• How do we answer the most common queries?• How do we cope with APIs and store developments?• Design usually influenced by current programming and

system tools.• Not always concerned with scalability and long term.

RDF Query Language: Developerʼs view

RDF

C. Gutierrez – Foundations of RDF Databases

Page 56: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

RDF as a graph data model?

?

RDF Query Language: Database theoreticianʼs view

RDF as a relational model?

Graphs

Relations

C. Gutierrez – Foundations of RDF Databases

Page 57: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

RDF Query Language: Database theoreticianʼs view

Local Global

C. Gutierrez – Foundations of RDF Databases

Theorem (Gaifman). A property of graphs is expressible by a closedfirst order formula iff it is equivalent to a combination of properties ofthe form

where v1,…,vs denote vertices and d(x,y) denotes distance

v2> 2r

rv1

v3

v4

v5

Page 58: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

RDF Query Language: Database theoreticianʼs view

Local Global

C. Gutierrez – Foundations of RDF Databases

Theorem (Gaifman). A property of graphs is expressible by a closedfirst order formula iff it is equivalent to a combination of properties ofthe form

where v1,…,vs denote vertices and d(x,y) denotes distance

v2> 2r

rv1

v3

v4

v5

Want Local (relational) or global (graph) queries?

Page 59: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

SPARQL (W3C Recommendation, 2008)– Relational view of querying– RDF = triples + blanks– Pattern matching

W3C Working Groupʼs view

C. Gutierrez – Foundations of RDF Databases

Page 60: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

SPARQL (W3C Recommendation, 2008)– Relational view of querying– RDF = triples + blanks– Pattern matching

W3C Working Groupʼs view

C. Gutierrez – Foundations of RDF Databases - ESWC 2008

Good News:there is a standard!

Page 61: New Foundations of RDF Databases · 2008. 11. 28. · • Manage huge volumes of data with logical precision • Separate modeling from implementation levels The Database Approach

SELECT ASKCONSTRUCT DESCRIBEQuery Form

DatasetClause

Where Clause(Graph Pattern)

Triplepattern

FROM

FROM NAMED

Dataset

FILTEROPTIONAL

ANDUNION

X Y Z

X Y

TRUE - FALSE

SPARQL Query (General Structure)

C. Gutierrez – Foundations of RDF Databases - ESWC 2008