SWT Lecture Session 4 - SW architectures and SPARQL

Post on 26-Jan-2015

+

SW Software architecture and SPARQL

Mariano Rodriguez-Muro, Free University of Bozen-Bolzano

+Disclaimer

License: This work is licensed under a Creative Commons Attribution-Share Alike 3.0 License (http://creativecommons.org/licenses/by-sa/3.0/)

Material for these slides has been taken from the W3C pages for SPARQL and from the Jena and Sesame documentation

+Summary

Semantic web idea, and overview of SWT

RDF data model

Jena intro

SPARQL protocol

+Reading material

Semantic Web Programming Part III, Chapter 8

FUSEKI tutorial

+

SW Software architectures

+Architecture of SW applications

Local access: the RDF graph is stored locally and is accessible through an API

Mixed access: manual (hard-coded), federated queries, traversal

Remote access: the RDF graph is owned by a third party and exposed through a SPARQL endpoint or a web service

+

Local Access: Triple stores and APIs

+Local access

Data is managed locally by means of a triple store (e.g., Jena, Sesame)

Data may be:
  RDF (e.g., local copies of Linked Data)
  Legacy data transformed into RDF (more on this shortly)

+Local access(triple stores)

Possible triple stores: Jena TDB, SDB, Sesame, 4Store, Virtuoso, OWLIM, AllegroGraph, …

3rd party data:
  From dumps: http://wiki.dbpedia.org/, http://pro.europeana.eu/datasets
  Crawling (e.g., LDSpider)
  Legacy transformation

+Local access (legacy sources)

3rd party tools to transform CSV, XLS, etc.

XSLT to transform XML

Mapping based (R2RML, D2RQ):
  3rd party tools to transform RDBMS into RDF dumps
  3rd party tools to expose RDBMS as virtual RDF

All of these will be covered in the course

+Local access (local queries)

All triple stores offer SPARQL execution, accessible through:
  Console tools (mysql and psql style)
  Their own API

bin/sparql --data=data-mydata.rdf --query=my-sparql-query.rq

+SPARQL with Jena in Java

Key API objects: Query, QueryFactory, QueryExecutionFactory, QueryExecution

QueryExecution methods:
  execAsk() > boolean
  execConstruct() > Model
  execDescribe() > Model
  execSelect() > ResultSet

String queryString = "PREFIX owl: <http://www.w3.org/2002/07/owl#> SELECT * WHERE { ?x owl:sameAs ?y }";
Query query = QueryFactory.create(queryString);
// tdb is the Model (or Dataset) to query, presumably one backed by a TDB store
QueryExecution qe = QueryExecutionFactory.create(query, tdb);
ResultSet results = qe.execSelect();

+SPARQL with Jena in Java

ResultSet: results from a SELECT query in a table-like manner. Each row corresponds to a set of bindings that fulfil the conditions of the query. Access to the results is by variable name.
  getResultVars() > List<String>
  hasNext() > boolean
  next() > QuerySolution

+SPARQL with Jena in Java

QuerySolution: a single answer from a SELECT query
  varNames() > Iterator<String>
  contains(varname) > boolean
  get(varname) > RDFNode
  getResource(varname) > Resource
  getLiteral(varname) > Literal

+SPARQL with Jena in Java

Tools for ResultSet:
  ResultSetFormatter: asRDF, asText, asXMLString, asJSON, …
  Parameterized SPARQL query

-----------------------------------------------------
| uri                                               |
=====================================================
| <http://www.opentox.org/api/1.1#NumericFeature>   |
| <http://www.opentox.org/api/1.1#NominalFeature>   |
| <http://www.opentox.org/api/1.1#StringFeature>    |
| <http://www.opentox.org/api/1.1#Feature>          |
| <http://www.w3.org/2002/07/owl#Nothing>           |
| <http://www.opentox.org/api/1.1#Identifier>       |
| <http://www.opentox.org/api/1.1#ChemicalName>     |
| <http://www.opentox.org/api/1.1#IUPACName>        |
| <http://www.opentox.org/api/1.1#InChI>            |
| <http://www.opentox.org/api/1.1#MolecularFormula> |
| <http://www.opentox.org/api/1.1#CASRN>            |
| <http://www.opentox.org/api/1.1#SMILES>           |
-----------------------------------------------------

+

Remote Access: SPARQL Protocol

+Remote access (SPARQL Protocol)

Means to access query processors

Compatible with RDF

Abstract specification

Bindings with the following protocols: HTTP, SOAP (WSDL)

+SPARQL Protocol

Main elements:

Operation: one operation, ‘query’

In message: one SPARQL query, zero or more datasets

Out message:
  SPARQL results document (for SELECT or ASK)
  An RDF graph serialized in RDF/XML (for DESCRIBE and CONSTRUCT)
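The rule for the out message can be written down as a small lookup. The following is only a sketch: the class and method names are hypothetical, and the two media types are the standard ones for SPARQL XML results and RDF/XML, taken here as the defaults a server would return when the client asks for nothing else.

```java
public class SparqlResponseType {
    /** The four SPARQL query forms. */
    public enum QueryForm { SELECT, ASK, CONSTRUCT, DESCRIBE }

    /**
     * Returns the media type of the out message for a query form:
     * a SPARQL results document for SELECT/ASK, an RDF/XML graph
     * for CONSTRUCT/DESCRIBE.
     */
    public static String defaultMediaType(QueryForm form) {
        switch (form) {
            case SELECT:
            case ASK:
                return "application/sparql-results+xml";
            case CONSTRUCT:
            case DESCRIBE:
                return "application/rdf+xml";
            default:
                throw new IllegalArgumentException("Unknown query form: " + form);
        }
    }

    public static void main(String[] args) {
        // Print the out-message type for every query form.
        for (QueryForm form : QueryForm.values()) {
            System.out.println(form + " -> " + defaultMediaType(form));
        }
    }
}
```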

+SPARQL Protocol (HTTP bindings)

Uses HTTP GET and POST messages

The query is sent URL-encoded (percent-encoded) in the message

Returns document in requested format (or the default)

PREFIX dc: <http://purl.org/dc/elements/1.1/> SELECT ?book ?who WHERE { ?book dc:creator ?who }
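To see what "encoded" means for the query above, here is a minimal sketch using only the Java standard library (Java 10 or later). The class name, method name and the endpoint URL http://www.example/sparql/ are assumptions for illustration; java.net.URLEncoder produces form encoding, so spaces become "+".

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class SparqlGetUrl {
    /** Builds the URL of an HTTP GET request carrying the query in the "query" parameter. */
    public static String buildQueryUrl(String endpoint, String sparql) {
        // Form-encode the query so that spaces, '?', '<', '{' etc. are safe in a URL.
        String encoded = URLEncoder.encode(sparql, StandardCharsets.UTF_8);
        return endpoint + "?query=" + encoded;
    }

    public static void main(String[] args) {
        String sparql = "PREFIX dc: <http://purl.org/dc/elements/1.1/> "
                + "SELECT ?book ?who WHERE { ?book dc:creator ?who }";
        // Hypothetical endpoint; replace with a real SPARQL endpoint.
        System.out.println(buildQueryUrl("http://www.example/sparql/", sparql));
    }
}
```

The resulting string is what stands behind the EncodedQuery placeholder in the requests on the following slides.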

+SPARQL Protocol (HTTP bindings)

Request

GET /sparql/?query=EncodedQuery HTTP/1.1
Host: www.example
User-agent: my-sparql-client/0.1

Response (uses the XML format for SPARQL results)

HTTP/1.1 200 OK
Date: Fri, 06 May 2005 20:55:12 GMT
Server: Apache/1.3.29 (Unix) PHP/4.3.4 DAV/1.0.3
Connection: close
Content-Type: application/sparql-results+xml

<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head>
    <variable name="book"/>
    <variable name="who"/>
  </head>
  <results distinct="false" ordered="false">
    <result>
      <binding name="book"><uri>http://www.example/book/book5</uri></binding>
      <binding name="who"><bnode>r29392923r2922</bnode></binding>
    </result>
    ...
</sparql>
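A client that does not use a SPARQL library has to read this XML itself. The sketch below does so with the JDK's DOM parser; the class and method names are hypothetical, and for simplicity it keeps only the text of each binding and drops the distinction between uri, bnode and literal.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class SparqlXmlResults {
    static final String NS = "http://www.w3.org/2005/sparql-results#";

    /** Returns one map (variable name to value text) per result element. */
    public static List<Map<String, String>> parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed to match on the results namespace
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<Map<String, String>> rows = new ArrayList<>();
            NodeList results = doc.getElementsByTagNameNS(NS, "result");
            for (int i = 0; i < results.getLength(); i++) {
                Element result = (Element) results.item(i);
                Map<String, String> row = new LinkedHashMap<>();
                NodeList bindings = result.getElementsByTagNameNS(NS, "binding");
                for (int j = 0; j < bindings.getLength(); j++) {
                    Element binding = (Element) bindings.item(j);
                    // The text content is the URI, blank node label or literal value.
                    row.put(binding.getAttribute("name"), binding.getTextContent().trim());
                }
                rows.add(row);
            }
            return rows;
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a valid SPARQL XML results document", e);
        }
    }
}
```

With a library such as Jena this step is not needed, since the library parses the response into its own result objects.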

+SPARQL Protocol (HTTP bindings)

Request (specifying the default graph)

GET /sparql/?query=EncodedQuery&default-graph-uri=http://www.other.example/books HTTP/1.1
Host: www.other.example
User-agent: my-sparql-client/0.1

Response (the query runs against the dataset identified by the URI http://www.other.example/books)

HTTP/1.1 200 OK
Date: Fri, 06 May 2005 20:55:12 GMT
Server: Apache/1.3.29 (Unix) PHP/4.3.4 DAV/1.0.3
Connection: close
Content-Type: application/sparql-results+xml

<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head>
    <variable name="book"/>
    <variable name="who"/>
  </head>
  <results distinct="false" ordered="false">
    <result>
      <binding name="book"><uri>http://www.example/book/book5</uri></binding>
      <binding name="who"><bnode>r29392923r2922</bnode></binding>
    </result>
    ...
</sparql>

+SPARQL Protocol (HTTP bindings)

Request with content negotiation

GET /sparql/?query=EncodedQuery&default-graph-uri=http://www.example/jose-foaf.rdf HTTP/1.1
Host: www.example
User-agent: sparql-client/0.1
Accept: text/turtle, application/rdf+xml

Response (using the format specified by the client)

HTTP/1.1 200 OK
Date: Fri, 06 May 2005 20:55:11 GMT
Server: Apache/1.3.29 (Unix)
Connection: close
Content-Type: text/turtle

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix foaf: <http://xmlns.com/foaf/0.1/>.
@prefix myfoaf: <http://www.example/jose/foaf.rdf#>.

myfoaf:jose foaf:name "Jose Jimeñez";
    foaf:depiction <http://www.example/jose/jose.jpg>;
    foaf:nick "Jo";
...
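A request like the one above can be assembled with java.net.http (Java 11 or later). This is a sketch under assumptions: the class and method names are hypothetical, and the DESCRIBE query in main is made up for illustration, since the slide only shows the EncodedQuery placeholder. The request is built but not sent.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

public class SparqlContentNegotiation {
    /** Builds a GET request with a default graph and an Accept header. */
    public static HttpRequest buildRequest(String endpoint, String sparql,
                                           String defaultGraphUri, String accept) {
        String url = endpoint
                + "?query=" + URLEncoder.encode(sparql, StandardCharsets.UTF_8)
                + "&default-graph-uri=" + URLEncoder.encode(defaultGraphUri, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(url))
                .header("Accept", accept) // formats the client is willing to receive
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest(
                "http://www.example/sparql/",
                "DESCRIBE <http://www.example/jose/foaf.rdf#jose>", // hypothetical query
                "http://www.example/jose-foaf.rdf",
                "text/turtle, application/rdf+xml");
        System.out.println(request.method() + " " + request.uri());
        System.out.println("Accept: " + request.headers().firstValue("Accept").orElse(""));
    }
}
```

Sending it would be a matter of passing the request to an HttpClient; the server then picks one of the accepted formats and reports it in Content-Type.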

+SPARQL Protocol (SOAP bindings)

A protocol for accessing web services

Based on HTTP and XML

Messages are passed through envelopes

SPARQL requests and responses are embedded in the envelopes

+SPARQL Protocol (SOAP bindings)

Request (note content type, encoding of the query)

POST /services/sparql-query HTTP/1.1
Content-Type: application/soap+xml
Accept: application/soap+xml, multipart/related, text/*
User-Agent: Axis/1.2.1
Host: www.example
SOAPAction: ""
Content-Length: 438

<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope xmlns:soapenv="http://www.w3.org/2003/05/soap-envelope/"
                  xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <soapenv:Body>
    <query-request xmlns="http://www.w3.org/2005/09/sparql-protocol-types/#">
      <query>SELECT ?z {?x ?y ?z . FILTER regex(?z, 'Harry')}</query>
    </query-request>
  </soapenv:Body>
</soapenv:Envelope>
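In the SOAP binding the query travels as XML text, so characters such as "<" and "&" have to be escaped rather than URL-encoded. A minimal sketch of building the envelope follows; the class and method names are hypothetical, and the namespaces are copied from the request above.

```java
public class SparqlSoapRequest {
    /** Escapes the characters that are not allowed in XML element content. */
    static String escapeXml(String text) {
        return text.replace("&", "&amp;") // must come first
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    /** Wraps a SPARQL query in a SOAP envelope with a query-request body. */
    public static String buildEnvelope(String sparql) {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
             + "<soapenv:Envelope xmlns:soapenv=\"http://www.w3.org/2003/05/soap-envelope/\">\n"
             + "  <soapenv:Body>\n"
             + "    <query-request xmlns=\"http://www.w3.org/2005/09/sparql-protocol-types/#\">\n"
             + "      <query>" + escapeXml(sparql) + "</query>\n"
             + "    </query-request>\n"
             + "  </soapenv:Body>\n"
             + "</soapenv:Envelope>\n";
    }

    public static void main(String[] args) {
        System.out.println(buildEnvelope("SELECT ?z {?x ?y ?z . FILTER regex(?z, 'Harry')}"));
    }
}
```

The envelope is then sent as the body of an HTTP POST with Content-Type application/soap+xml.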

+SPARQL Protocol (SOAP bindings)

Response is a SOAP message embedding SPARQL/XML

HTTP/1.1 200 OK
Content-Type: application/soap+xml

<?xml version="1.0" encoding="utf-8"?>
<soapenv:Envelope xmlns:soapenv="http://www.w3.org/2003/05/soap-envelope/"
                  xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <soapenv:Body>
    <query-result xmlns="http://www.w3.org/2005/09/sparql-protocol-types/#">
      <ns1:sparql xmlns:ns1="http://www.w3.org/2005/sparql-results#">
        <ns1:head>
          <ns1:variable name="z"/>
        </ns1:head>
        <ns1:results distinct="false" ordered="false">
          <ns1:result>
            <ns1:binding name="z">
              <ns1:literal>Harry Potter and the Chamber of Secrets</ns1:literal>
            </ns1:binding>
          </ns1:result>
          ...
        </ns1:results>
      </ns1:sparql>
    </query-result>
  </soapenv:Body>
</soapenv:Envelope>

+Remote access (SPARQL-endpoints)

A SPARQL processor that is accessible through the SPARQL protocol

Many open endpoints: http://www.w3.org/wiki/SparqlEndpoints

Often accessible through query forms, e.g.:
  DBpedia: http://dbpedia.org/sparql
  BBC programme information: http://lod.openlinksw.com/sparql/

+Remote access (SPARQL-endpoints)

Implementing the SPARQL protocol on your own, or

Using a library that supports the SPARQL protocol:
  Python: RDFLib
  PHP: librdf, RAP
  JavaScript
  Java: Jena, Sesame, etc.

+Remote access (SPARQL-endpoints with Jena)

In Jena:

The library hides all details about the protocol. You can use the normal API calls and objects to work with the results.

String location = "http://dbpedia.org/sparql";
String query = "PREFIX …. SELECT …. ";
QueryExecution x = QueryExecutionFactory.sparqlService(location, query);
ResultSet results = x.execSelect();
ResultSetFormatter.out(System.out, results);

+

Remote Access: Creating SPARQL endpoints

+Creating a SPARQL end-point

Most triple stores include an HTTP implementation of the SPARQL protocol

Setup depends on the system

Jena's way is by means of Joseki (now Fuseki)

+Setting up Joseki

Package comes with:
  Server JARs
  Scripts to manage the server and data

Once the system is running, the control panel can be found at: http://localhost:3030/

+Running a Fuseki Server

fuseki-server --mem /DatasetPathName
  Create an empty, in-memory dataset.

fuseki-server --file=FILE /DatasetPathName
  Create an empty, in-memory dataset and load FILE into it.

fuseki-server --loc=DB /DatasetPathName
  Use an existing TDB database, or create one if it doesn't exist.

fuseki-server --config=ConfigFile
  Construct one or more endpoints based on the configuration description.

+Server URI scheme

http://*host*/dataset/query the SPARQL query endpoint.

http://*host*/dataset/update the SPARQL Update language endpoint.

http://*host*/dataset/data the SPARQL Graph Store Protocol endpoint.

http://*host*/dataset/upload the file upload endpoint.

Default port 3030
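The URI scheme is regular enough to generate the four service URLs from a host, a port and a dataset name. A small sketch, with a hypothetical class and method name, and "ds" used as an example dataset name:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FusekiEndpoints {
    /** Returns the service URLs of one dataset, keyed by service name. */
    public static Map<String, String> endpoints(String host, int port, String dataset) {
        String base = "http://" + host + ":" + port + "/" + dataset + "/";
        Map<String, String> urls = new LinkedHashMap<>();
        urls.put("query", base + "query");   // SPARQL query
        urls.put("update", base + "update"); // SPARQL Update
        urls.put("data", base + "data");     // SPARQL Graph Store Protocol
        urls.put("upload", base + "upload"); // file upload
        return urls;
    }

    public static void main(String[] args) {
        // 3030 is the default port named on the slide.
        endpoints("localhost", 3030, "ds")
                .forEach((service, url) -> System.out.println(service + ": " + url));
    }
}
```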

+Script Control

Load data:
s-put http://localhost:3030/ds/data default books.ttl

Get it back:
s-get http://localhost:3030/ds/data default

Query it with SPARQL using the .../query endpoint:
s-query --service http://localhost:3030/ds/query 'SELECT * {?s ?p ?o}'

Update it with SPARQL using the .../update endpoint:
s-update --service http://localhost:3030/ds/update 'CLEAR DEFAULT'
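The same update can be issued from Java without the scripts. The sketch below builds, but does not send, a POST request that carries the update text directly in the body with the application/sparql-update content type, one of the request styles defined by the SPARQL 1.1 Protocol. It is not a description of what s-update does internally, and the class and method names are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class FusekiUpdateRequest {
    /** Builds a POST request whose body is the SPARQL Update string. */
    public static HttpRequest buildUpdate(String updateEndpoint, String update) {
        return HttpRequest.newBuilder(URI.create(updateEndpoint))
                .header("Content-Type", "application/sparql-update")
                .POST(HttpRequest.BodyPublishers.ofString(update))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildUpdate("http://localhost:3030/ds/update", "CLEAR DEFAULT");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To execute it against a running server, pass the request to java.net.http.HttpClient.send.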

+Summary

Covered:
  SPARQL through APIs
  SPARQL endpoints
  Creating and managing endpoints
  HTTP protocol

Later:
  Transforming data into RDF
  Virtual RDF (R2RML, D2RQ)

+See also

http://www.w3.org/TR/rdf-sparql-protocol/

http://jena.apache.org/documentation/serving_data/

Not discussed here: serving LOD with dereferenceable URIs and the SPARQL protocol

+

Sesame