Logics for Data and Knowledge Representation SPARQL Protocol and RDF Query Language (SPARQL) Feroz Farazi

Page 1: Logics for Data and Knowledge Representation SPARQL Protocol and RDF Query Language (SPARQL) Feroz Farazi.

Logics for Data and Knowledge Representation

SPARQL Protocol and RDF Query Language (SPARQL)

Feroz Farazi

Page 2:

SPARQL

A language for expressing queries that retrieve information from diverse data represented in RDF [SPARQL Spec.]

A query language with the capability to match graph patterns [SPARQL Spec.]

SPARQL queries often contain
  a basic graph pattern: a set of (subject, predicate, object) triple patterns
  triple patterns in which RDF terms may be replaced by variables

Result of the query
  the pattern matches a subgraph of the RDF data graph, and the result is built from these matches

Page 3:

Terminologies

RDF Terms:
  Let I be the set of all IRIs, L the set of all RDF literals and B the set of all blank nodes in an RDF graph. The set of all RDF Terms within the graph is
  T = I ∪ L ∪ B

RDF Dataset:
  D = {G, (I1, G1), (I2, G2), …, (Ik, Gk)},
  where G is the default graph and
  (Ii, Gi) are named graphs, i = 1 to k, k ≥ 0
  An RDF dataset always contains a default graph, which does not have a name
  It contains zero or more named graphs
  Each named graph is identified by an IRI

Page 4:

Terminologies

Query Variable:
  A query variable v ∈ V, where V is an infinite set of variables and V ∩ T = ∅

Triple Pattern:
  A triple pattern P ∈ (T ∪ V) × (I ∪ V) × (T ∪ V)

Solution Mapping:
  A solution mapping is a partial function M: V → T,
  where V is the set of query variables and
  T is the set of all RDF Terms

Solution Sequence:
  A list of solutions, which might be unordered. The number of solutions might be zero, one or more.
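
The definitions above can be sketched in Python. This is a minimal illustration, not how a SPARQL engine is implemented: RDF terms are plain strings, variables are strings starting with "?", and the helper names (is_var, match_triple, match_bgp) and the sample graph are assumptions made for this example. Blank nodes in the data are treated as ordinary terms.

```python
def is_var(term):
    """A query variable is written here as a string starting with '?'."""
    return isinstance(term, str) and term.startswith("?")

def match_triple(pattern, triple, mapping):
    """Try to extend a solution mapping so that the pattern matches the triple."""
    result = dict(mapping)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # Bind the variable, or check it against an earlier binding.
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None  # a constant term in the pattern does not match
    return result

def match_bgp(patterns, graph):
    """Evaluate a basic graph pattern; returns a list of solution mappings."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for mapping in solutions
            for triple in graph
            if (extended := match_triple(pattern, triple, mapping)) is not None
        ]
    return solutions

# Sample graph (illustrative data).
graph = [
    ("_:a", ":name", '"Tim Berners-Lee"'),
    ("_:a", ":homepage", "<http://www.w3.org/People/Berners-Lee/>"),
    ("_:b", ":name", '"Fausto Giunchiglia"'),
    ("_:b", ":homepage", "<http://disi.unitn.it/~fausto/>"),
]
bgp = [("?x", ":name", "?name"), ("?x", ":homepage", "?homepage")]
solutions = match_bgp(bgp, graph)
```

Each element of solutions is one solution mapping (a partial function from variables to terms, here a dict), and the list as a whole plays the role of the solution sequence.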

Page 5:

Terminologies

Solution Sequence Modifier:
  (i) Order By (ii) Projection (iii) Distinct (iv) Reduced (v) Offset (vi) Limit

Others:
  IRI (Internationalized Resource Identifier), lexical form, language tag (e.g., en, it), datatype IRI (e.g., xsd:boolean), literal, plain literal and typed literal

IRIs and URIs
  URIs use a subset of the ASCII character set
  IRIs can include Unicode characters (Universal Character Set)

ASK: tests whether a query pattern has a solution. It replies with yes/no.

Page 6:

Query

Dataset:
  :paper1 :title "Semantic Matching" .

Query Expression:
  SELECT ?title
  WHERE { :paper1 :title ?title . }

Query Result:
  title
  "Semantic Matching"

Dataset:
  :paper1 :creator "Fausto Giunchiglia" .

Query Expression:
  SELECT ?author
  WHERE { :paper1 :creator ?author . }

Query Result:
  author
  "Fausto Giunchiglia"

The SELECT query form returns the RDF Terms bound to the variables

Page 7:

Multiple Matches

Dataset:
  _:a :name "Tim Berners-Lee" .
  _:a :homepage <http://www.w3.org/People/Berners-Lee/> .
  _:b :name "Fausto Giunchiglia" .
  _:b :homepage <http://disi.unitn.it/~fausto/> .

Query Expression:
  SELECT ?name ?homepage
  WHERE { ?x :name ?name .
          ?x :homepage ?homepage }

Query Result:
  name                  homepage
  Tim Berners-Lee       <http://www.w3.org/People/Berners-Lee/>
  Fausto Giunchiglia    <http://disi.unitn.it/~fausto/>

Page 8:

RDF Literals Matching

Dataset:
  :x :name "Tim Berners-Lee"@en .
  :y :name "Fausto Giunchiglia"@en .

Query Expression 1:
  SELECT ?u
  WHERE { ?u :name "Tim Berners-Lee" }

Query Result:
  u
  (empty)

This query has 0 solutions because, without the language tag, the literal in the query does not match the literal in the dataset

Query Expression 2:
  SELECT ?u
  WHERE { ?u :name "Tim Berners-Lee"@en }

Query Result:
  u
  :x

This query has 1 solution because, with the language tag included, the literal matches and ?u is bound to :x

Page 9:

Building RDF Graphs

CONSTRUCT: this query form returns an RDF graph

Dataset:
  _:a :creator "Tim Berners-Lee" .
  _:b :creator "Fausto Giunchiglia" .

Query Expression:
  CONSTRUCT { ?x :name ?name }
  WHERE { ?x :creator ?name }

Query Result:
  _:c :name "Tim Berners-Lee" .
  _:d :name "Fausto Giunchiglia" .

In this dataset, :creator stands for the Dublin Core (dc) creator metadata
In the query, :name stands for the FOAF name metadata
We built a graph with the FOAF name attribute, which was not available in the source dataset
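
The CONSTRUCT step can be sketched in Python as template instantiation. This is an illustration under assumptions: the function name construct is hypothetical, the template holds a single triple, and the solutions are written out by hand as the matches of WHERE { ?x :creator ?name }. Unlike the result above, the sketch keeps the original blank node labels instead of issuing fresh ones.

```python
def construct(template, solutions):
    """Instantiate a one-triple template once per solution mapping."""
    graph = set()
    for mapping in solutions:
        # Replace each variable by its bound term; constants stay as they are.
        triple = tuple(mapping.get(term, term) for term in template)
        # Skip triples that still contain an unbound variable.
        if not any(term.startswith("?") for term in triple):
            graph.add(triple)
    return graph

# Hand-written solutions of WHERE { ?x :creator ?name } over the dataset above.
solutions = [
    {"?x": "_:a", "?name": '"Tim Berners-Lee"'},
    {"?x": "_:b", "?name": '"Fausto Giunchiglia"'},
]
result = construct(("?x", ":name", "?name"), solutions)
```

The resulting set holds two :name triples, none of which appears in the source dataset.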

Page 10:

RDF Term Restrictions

FILTER: solutions are restricted to those in which the RDF Terms satisfy the filter expression

Dataset:
  _:a :creator "Tim Berners-Lee" .
  _:a :age 53 .
  _:b :creator "Fausto Giunchiglia" .
  _:b :age 54 .

Query Expression:
  SELECT ?author
  WHERE { ?x :creator ?author .
          FILTER regex(?author, "Tim") }

Query Result:
  author
  "Tim Berners-Lee"

The above query can be made case insensitive by adding the "i" flag to the filter as follows:
  FILTER regex(?author, "tim", "i")

Query Expression:
  SELECT ?author ?age
  WHERE { ?x :creator ?author .
          ?x :age ?age FILTER (?age > 53) }

Query Result:
  author                  age
  "Fausto Giunchiglia"    54
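
The two filters can be mimicked in Python on hand-written solutions. This is a rough analogue only: the helper sparql_regex is a hypothetical name, and Python's re module stands in for the regular expression dialect used by SPARQL, which differs in some details.

```python
import re

# Hand-written solutions for the dataset above (variable name -> value).
solutions = [
    {"author": "Tim Berners-Lee", "age": 53},
    {"author": "Fausto Giunchiglia", "age": 54},
]

def sparql_regex(text, pattern, flags=""):
    """Rough analogue of regex(): unanchored search, optional 'i' flag."""
    re_flags = re.IGNORECASE if "i" in flags else 0
    return re.search(pattern, text, re_flags) is not None

# FILTER regex(?author, "tim", "i")
by_name = [s["author"] for s in solutions if sparql_regex(s["author"], "tim", "i")]

# FILTER (?age > 53)
by_age = [(s["author"], s["age"]) for s in solutions if s["age"] > 53]
```

Without the "i" flag, the lowercase pattern "tim" matches neither author, which is why the flag is needed in the case-insensitive variant.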

Page 11:

Querying Optional Pattern

OPTIONAL: allows variables to be bound to RDF Terms and included in the solution when matching data is available, without discarding the solution when it is not

Dataset:
  _:a :creator "Tim Berners-Lee" .
  _:a :age 53 .
  _:a :homepage <http://www.w3.org/People/Berners-Lee/> .
  _:b :creator "Fausto Giunchiglia" .
  _:b :age 54 .

Query Expression:
  SELECT ?author ?homepage
  WHERE { ?x :creator ?author .
          OPTIONAL { ?x :homepage ?homepage } }

Query Result:
  author                  homepage
  "Tim Berners-Lee"       <http://www.w3.org/People/Berners-Lee/>
  "Fausto Giunchiglia"

It is a left-associative operator
Why do we need it? Not all entities have the same set of attributes
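
OPTIONAL behaves like a left join over solution mappings. The sketch below assumes hand-written solutions for the required and the optional pattern; the function names compatible and left_join are illustrative, not part of any library.

```python
def compatible(m1, m2):
    """Two mappings are compatible if they agree on every shared variable."""
    return all(m2[var] == term for var, term in m1.items() if var in m2)

def left_join(required, optional):
    """Keep every required solution; extend it when an optional one is compatible."""
    results = []
    for m1 in required:
        extensions = [{**m1, **m2} for m2 in optional if compatible(m1, m2)]
        # With no compatible optional solution, the required one survives as is.
        results.extend(extensions or [m1])
    return results

# Solutions of { ?x :creator ?author } over the dataset above.
required = [
    {"?x": "_:a", "?author": '"Tim Berners-Lee"'},
    {"?x": "_:b", "?author": '"Fausto Giunchiglia"'},
]
# Solutions of { ?x :homepage ?homepage }: only _:a has a homepage.
optional = [{"?x": "_:a", "?homepage": "<http://www.w3.org/People/Berners-Lee/>"}]

rows = left_join(required, optional)
```

The second row keeps ?author but leaves ?homepage unbound, matching the empty cell in the result table.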

Page 12:

ORDER BY Clause

ORDER BY: a facility to order a solution sequence

Dataset:
  _:a :creator "Tim Berners-Lee" .
  _:a :age 53 .
  _:b :creator "Fausto Giunchiglia" .
  _:b :age 54 .

Query Expression:
  SELECT ?author
  WHERE { ?x :creator ?author .
          ?x :age ?age }
  ORDER BY ?author DESC(?age)

Query Result:
  author
  "Fausto Giunchiglia"
  "Tim Berners-Lee"

Page 13:

DISTINCT and REDUCED Modifiers

DISTINCT: removes duplicates from a solution sequence

Dataset:
  _:b :creator "Fausto Giunchiglia" .
  _:b :age 54 .
  _:c :creator "Fausto Giunchiglia" .
  _:c :age 54 .

Query Expression:
  SELECT DISTINCT ?creator
  WHERE { ?x :creator ?creator }

Query Result:
  creator
  "Fausto Giunchiglia"

REDUCED: permits, but does not require, duplicates to be removed

Query Expression:
  SELECT REDUCED ?creator
  WHERE { ?x :creator ?creator }

Each solution appears at least once and no more often than it would without removing duplicates

Page 14:

OFFSET and LIMIT Clauses

OFFSET: shows the elements of the solution sequence starting after a specified number of solutions. If the number is zero, it has no effect.

Dataset:
  _:b :creator "Fausto Giunchiglia" .
  _:b :age 54 .
  _:c :creator "Tim Berners-Lee" .
  _:c :age 53 .

Query Expression:
  SELECT ?author
  WHERE { ?x :creator ?author }
  ORDER BY ?author
  OFFSET 1

Query Result:
  author
  "Tim Berners-Lee"

LIMIT: puts an upper bound on the number of solutions returned

Query Expression:
  SELECT ?author
  WHERE { ?x :creator ?author }
  ORDER BY ?author
  LIMIT 1 OFFSET 1

Query Result:
  author
  "Tim Berners-Lee"
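
ORDER BY, OFFSET and LIMIT can be pictured as sorting followed by slicing. The sketch below is an assumption-laden simplification: apply_modifiers is a hypothetical helper, the values are plain Python strings, and Python's string ordering stands in for SPARQL's ordering rules.

```python
# Values bound to ?author in the dataset above, in no particular order.
authors = ["Tim Berners-Lee", "Fausto Giunchiglia"]

def apply_modifiers(values, offset=0, limit=None):
    """Sort ascending (ORDER BY), skip `offset` items, then keep at most `limit`."""
    ordered = sorted(values)
    end = None if limit is None else offset + limit
    return ordered[offset:end]

offset_only = apply_modifiers(authors, offset=1)                 # OFFSET 1
limit_and_offset = apply_modifiers(authors, offset=1, limit=1)   # LIMIT 1 OFFSET 1
limit_only = apply_modifiers(authors, limit=1)                   # LIMIT 1
```

The offset is applied before the limit regardless of the order in which the two clauses are written, so LIMIT 1 OFFSET 1 returns the second solution rather than an empty result.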

Page 15:

Relational vs RDF queries

Relational queries consist of (among others):
  Relational algebra of joins
  Foreign key references

RDF queries consist of (among others):
  (Logical) statements in triple form
  Unification variables, which are used to connect graph patterns

A relational query:
  Produces a new database table that is a combination of two or more input tables (partially or completely)

An RDF query:
  Produces a subset of the input RDF graph
  Simplifies some issues of table-based queries; for example, no subquery construct is needed

[D. Allemang and J. Hendler, 2008]

Page 16:

Turtle

Turtle is a terse RDF triple language
Turtle is a textual syntax for RDF that facilitates writing RDF graphs
  in a compact and natural text form
  with abbreviations for common usage patterns and datatypes
  compatible with the triple pattern syntax of SPARQL (and N-Triples)

Triple:
  a sequence of (subject, predicate, object) terms
  separated by whitespace
  terminated by '.' after each triple
  e.g., <http://www.w3.org/.../Weaving/> <http://purl.org/dc/elements/1.1/creator> <http://www.w3.org/People/Berners-Lee> .

List of predicates and objects:
  Triples with the same subject can be written without repeating the common part
  Reference to the subject of the previous triple is indicated by the use of a semicolon
  e.g., <http://www.w3.org/.../Weaving> <http://purl.org/dc/elements/1.1/creator> <http://www.w3.org/People/Berners-Lee> ;
        <http://purl.org/dc/elements/1.1/title> "Weaving the Web" .

Page 17:

Turtle

List of objects:
  Triples with the same subject and predicate can be written without repeating the common part
  Reference to the subject and predicate of the previous triple is indicated by the use of a comma
  e.g., <http://www.w3.org/.../Weaving> <http://purl.org/dc/elements/1.1/creator> "Tim Berners-Lee", "TBL", "Tim BL" .

SPARQL differs from Turtle:
  SPARQL permits RDF Literals as the subject of RDF triples
  SPARQL permits variables (?name or $name) in any part of a triple pattern
  prefix and base declarations
    Turtle allows prefix and base declarations anywhere outside of a triple
    In SPARQL, they are only allowed in the Prologue (at the start of the SPARQL query)
  case sensitivity
    SPARQL uses case-insensitive keywords, except for 'a', where 'a' means the IRI http://www.w3.org/1999/02/22-rdf-syntax-ns#type
    Turtle's prefix and base declarations are case sensitive
    'true' and 'false' are case insensitive in SPARQL and case sensitive in Turtle
    TrUe is not a valid boolean value in Turtle

Page 18:

SPARQL: Federated Query

A federated SPARQL query can be used to express queries across diverse data sources
  if the data is stored natively as RDF, or
  if the data is viewed as RDF via middleware

It is an opportunity for data consumers to get data distributed across the Web

Federated Query is used for executing queries distributed over different SPARQL endpoints

The SERVICE keyword
  allows a query author to direct a portion of a query to a particular SPARQL endpoint
  supports SPARQL queries that merge data distributed across the Web

Page 19:

SPARQL: Federated Query

An example query to a remote SPARQL endpoint

Consider a query to find the names of the people we know

Data about the names of various people is available at the http://people.example.org/sparql endpoint:

DATASET 1:
  @prefix foaf: <http://xmlns.com/foaf/0.1/> .
  @prefix : <http://example.org/> .
  :people1 foaf:name "Tim BL" .
  :people2 foaf:name "James" .

and one wants to combine it with a local FOAF file http://example.org/myfoaf.rdf that contains the single triple:

DATASET 2:
  <http://example.org/myfoaf/I> <http://xmlns.com/foaf/0.1/knows> <http://example.org/people2> .

QUERY:
  PREFIX foaf: <http://xmlns.com/foaf/0.1/>
  SELECT ?name
  FROM <http://example.org/myfoaf.rdf>
  WHERE
  {
    <http://example.org/myfoaf/I> foaf:knows ?person .
    SERVICE <http://people.example.org/sparql>
    { ?person foaf:name ?name . }
  }

This query, on the datasets provided above, has one solution:

RESULT:
  name
  ----------
  James

Page 20:

SPARQL: Federated Query

An example query with OPTIONAL to two remote SPARQL endpoints

Consider that we want to query people and optionally obtain their interests and the names of people they know

Data in the default graph at the remote SPARQL endpoints:

At http://people.example.org/sparql
DATASET 1:
  @prefix foaf: <http://xmlns.com/foaf/0.1/> .
  @prefix : <http://example.org/> .
  :people1 foaf:name "Tim BL" .
  :people2 foaf:name "James" .
  :people3 foaf:name "Jerome" .
  :people3 foaf:interest <http://www.w3.org/2001/sw/rdb2rdf/> .

At http://people2.example.org/sparql
DATASET 2:
  @prefix foaf: <http://xmlns.com/foaf/0.1/> .
  @prefix : <http://example.org/> .
  :people1 foaf:knows :people21 .
  :people21 foaf:name "Chris" .
  :people3 foaf:knows :people22 .
  :people22 foaf:name "Frank" .

QUERY:
  PREFIX foaf: <http://xmlns.com/foaf/0.1/>
  SELECT ?person ?interest ?known
  WHERE
  {
    SERVICE <http://people.example.org/sparql>
    { ?person foaf:name ?name .
      OPTIONAL { ?person foaf:interest ?interest .
                 SERVICE <http://people2.example.org/sparql>
                 { ?person foaf:knows ?known . } }
    }
  }

This query, on the datasets provided above, has three solutions:

RESULT:
  person      interest           known
  ---------------------------------------------------------
  :people1
  :people2
  :people3    <http…rdb2rdf/>    <http…people22>

Page 21:

SPARQL Update

Graph Store: an editable repository of RDF graphs managed by a single service

Update service: a service (often referred to by the informal term SPARQL endpoint) that accepts and processes update requests

Similarly to an RDF dataset, a Graph Store contains one (unnamed) slot holding a default graph and zero or more named slots holding named graphs

Formal definition: Graph Store
  The Graph Store can be viewed as a mutable RDF Dataset.
  GS = {DG, (iri_1, G_1), ..., (iri_n, G_n)}
  where
    the default graph DG is the RDF graph associated with the unnamed slot
    n ≥ 0 and, for each 1 ≤ i ≤ n, G_i is an RDF graph associated with the named slot identified by the IRI iri_i
    all IRIs are distinct, i.e., i ≠ j implies iri_i ≠ iri_j
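
The Graph Store definition maps naturally onto a small mutable data structure. The class below is a sketch only: the class and method names are hypothetical, triples are tuples of strings, and creating a named graph on first insert is a design choice of this example rather than a statement about any particular store.

```python
class GraphStore:
    """Mutable RDF dataset: one default graph plus IRI-named graphs."""

    def __init__(self):
        self.default_graph = set()  # DG, the graph in the unnamed slot
        self.named_graphs = {}      # iri_i -> G_i; dict keys keep IRIs distinct

    def insert(self, triple, graph_iri=None):
        """Add a triple; in this sketch a missing named graph is created."""
        if graph_iri is None:
            self.default_graph.add(triple)
        else:
            self.named_graphs.setdefault(graph_iri, set()).add(triple)

    def delete(self, triple, graph_iri=None):
        """Remove a triple if present; unknown graphs are left untouched."""
        if graph_iri is None:
            self.default_graph.discard(triple)
        elif graph_iri in self.named_graphs:
            self.named_graphs[graph_iri].discard(triple)

    def drop(self, graph_iri):
        """Remove a whole named graph from the store."""
        self.named_graphs.pop(graph_iri, None)


store = GraphStore()
store.insert(("_:a", ":creator", '"Tim Berners-Lee"'))  # goes to the default graph
store.insert(("_:b", ":age", "54"), graph_iri="http://example.org/g1")
graphs_before_drop = sorted(store.named_graphs)
store.drop("http://example.org/g1")
```

Operations that omit the graph IRI act on the default graph, while named graphs can come and go during the lifetime of the store.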

Page 22:

SPARQL Update

Update operations can specify the named graph(s) to be edited. If no named graph is mentioned, the operation is performed on the default graph

The unnamed or default graph may refer to a separate graph, a graph describing the named graphs, a representation of a union of other graphs, and so on

Unlike an RDF Dataset, named graphs can be added to or deleted from a Graph Store

A Graph Store can keep local copies of RDF graphs defined elsewhere on the Web and modify those copies independently of the original graph

SPARQL Update supports two categories of update operations on a Graph Store
  Graph Update - addition and removal of triples from some graphs
  Graph Management - creation and deletion of graphs, plus shortcuts for graph update operations (to add, move, and copy graphs) used in managing graphs

In the case where there is one unnamed graph and no named graphs, SPARQL Update can be used as a graph update language (as opposed to a Graph Store update language)

Page 23:

References

SPARQL Spec. (2008). W3C Recommendation.
SPARQL 1.1 Federated Query (2013). W3C Recommendation.
SPARQL 1.1 Update (2013). W3C Recommendation.
Turtle Terse RDF Triple Language (2013). W3C Recommendation.
D. Allemang and J. Hendler. Semantic Web for the Working Ontologist: Modeling in RDF, RDFS and OWL. Morgan Kaufmann Elsevier, Amsterdam, NL, 2008.