Bigdive 2014 - RDF, principles and case studies

RDF principles and case studies Diego Valerio Camarda regesta.exe www.regesta.com [email protected] dvcama @ github&twitter


Transcript of Bigdive 2014 - RDF, principles and case studies

Page 1: Bigdive 2014 - RDF, principles and case studies

RDF

principles and case studies

Diego Valerio Camarda regesta.exe

www.regesta.com

[email protected]

dvcama @ github&twitter

Page 2: Bigdive 2014 - RDF, principles and case studies

a brief introduction to linked open data

Page 3: Bigdive 2014 - RDF, principles and case studies

why things instead of documents

Today's Web

[figure: a group of HTML pages]

Page 4: Bigdive 2014 - RDF, principles and case studies

why things instead of documents

Today's Web: at least 1.85 billion indexed documents; some say 1 trillion online documents

[figure: a group of HTML pages]

Page 5: Bigdive 2014 - RDF, principles and case studies

why things instead of documents

Today's Web: at least 1.85 billion indexed documents; some say 1 trillion online documents. In fact, the best HTML parser is still the HUMAN BRAIN

[figure: a group of HTML pages]

Page 6: Bigdive 2014 - RDF, principles and case studies

why things instead of documents

Today's Web is not the Web that Tim Berners-Lee proposed in 1989

Page 9: Bigdive 2014 - RDF, principles and case studies

what about URIs and RDF: a new way to publish data on the web

ids are ambiguous and suck!

1. Use URIs as names for things
2. Use HTTP URIs so that people can look up those names
3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)
4. Include links to other URIs so that they can discover more things

linked data principles, Tim Berners-Lee, July 27, 2006

Page 10: Bigdive 2014 - RDF, principles and case studies

HTTP://yourdomain.com/something

what about URIs and RDF: turning web pages into “real” data

ids are ambiguous and suck!

Page 12: Bigdive 2014 - RDF, principles and case studies

[…] the little animal was referred to as: “il tasso del tasso del Tasso” (Italian for “the badger of the yew tree of Tasso”)

Achille Campanile

It’s time for machines (to parse pages)

il tasso (the badger): http://it.dbpedia.org/resource/Meles_meles
del tasso (of the yew tree): http://it.dbpedia.org/resource/Taxus
del Tasso (of Torquato Tasso): http://it.dbpedia.org/resource/Torquato_Tasso
Achille Campanile (author of the sentence): http://it.dbpedia.org/resource/Achille_Campanile

Page 14: Bigdive 2014 - RDF, principles and case studies

A new way to design databases RDF

(aka ’define knowledge’)

Page 15: Bigdive 2014 - RDF, principles and case studies

Go Triples, go! the standard (old) approach

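ID_P  COGNOME  NOME   REF_ID_SOCIETA  GENERE
1     Camarda  Diego  1               maschio
2     …        …      …               …

ID_SOCIETA  DENOMINAZIONE    SITO
1           Regesta.exe srl  www.regesta.com

(Column names are Italian: COGNOME = family name, NOME = first name, REF_ID_SOCIETA = company reference, GENERE = gender, maschio = male, DENOMINAZIONE = company name, SITO = website.)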

Page 16: Bigdive 2014 - RDF, principles and case studies

Go Triples, go! the new (cool) approach

<http://www.regesta.com/diego>

Subject

Page 17: Bigdive 2014 - RDF, principles and case studies

Go Triples, go! the new (cool) approach

<http://www.regesta.com/diego> <http://xmlns.com/foaf/0.1/familyName>

Subject Predicate

Page 18: Bigdive 2014 - RDF, principles and case studies

Go Triples, go! the new (cool) approach

<http://www.regesta.com/diego> <http://xmlns.com/foaf/0.1/familyName> "Camarda" .

Subject Predicate Object

Page 19: Bigdive 2014 - RDF, principles and case studies

Go Triples, go! the new (cool) approach

<http://www.regesta.com/diego> <http://xmlns.com/foaf/0.1/familyName> "Camarda" .
<http://www.regesta.com/diego> <http://xmlns.com/foaf/0.1/firstName> "Diego" .
<http://www.regesta.com/diego> <http://xmlns.com/foaf/0.1/gender> "male" .

Page 20: Bigdive 2014 - RDF, principles and case studies

Go Triples, go! the new (cool) approach

<http://www.regesta.com/diego>
    <http://xmlns.com/foaf/0.1/familyName> "Camarda" ;
    <http://xmlns.com/foaf/0.1/firstName> "Diego" ;
    <http://xmlns.com/foaf/0.1/gender> "male" .
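The step from one full sentence per triple to the `;` shorthand can be sketched in a few lines of Python. This is only an illustration using the standard library: triples are modelled as plain tuples of already-formatted terms, and `to_predicate_lists` is a hypothetical helper name, not part of any RDF toolkit.

```python
from collections import defaultdict

def to_predicate_lists(triples):
    """Group (subject, predicate, object) terms by subject and join them with ';'."""
    grouped = defaultdict(list)
    for s, p, o in triples:
        grouped[s].append((p, o))
    blocks = []
    for s, pairs in grouped.items():
        body = " ;\n    ".join(f"{p} {o}" for p, o in pairs)
        blocks.append(f"{s}\n    {body} .")
    return "\n\n".join(blocks)

DIEGO = "<http://www.regesta.com/diego>"
FOAF = "http://xmlns.com/foaf/0.1/"

# The three statements from the slide, as tuples of formatted terms.
triples = [
    (DIEGO, f"<{FOAF}familyName>", '"Camarda"'),
    (DIEGO, f"<{FOAF}firstName>", '"Diego"'),
    (DIEGO, f"<{FOAF}gender>", '"male"'),
]

turtle = to_predicate_lists(triples)
print(turtle)
```

The subject is written once, every predicate/object pair but the last ends with `;`, and the final pair ends with `.`.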

Page 21: Bigdive 2014 - RDF, principles and case studies

Go Triples, go! ok, but what is a “diego”?

Page 22: Bigdive 2014 - RDF, principles and case studies

Go Triples, go! it’s a person!

<http://www.regesta.com/diego> a <http://xmlns.com/foaf/0.1/Person> .

Page 23: Bigdive 2014 - RDF, principles and case studies

Go Triples, go! adding a Class

<http://www.regesta.com/diego>
    <http://xmlns.com/foaf/0.1/familyName> "Camarda" ;
    <http://xmlns.com/foaf/0.1/firstName> "Diego" ;
    <http://xmlns.com/foaf/0.1/gender> "male" .

<http://www.regesta.com/diego> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .

Page 24: Bigdive 2014 - RDF, principles and case studies

Go Triples, go! building a graph

<http://www.regesta.com/diego>
    <http://xmlns.com/foaf/0.1/familyName> "Camarda" ;
    <http://xmlns.com/foaf/0.1/firstName> "Diego" ;
    <http://xmlns.com/foaf/0.1/gender> "male" ;
    <http://www.w3.org/1999/...#type> <http://xmlns.com/foaf/0.1/Person> .

<http://www.regesta.com/diego> <http://www.w3.org/ns/org#memberOf> <http://www.regesta.com/about> .

Page 25: Bigdive 2014 - RDF, principles and case studies

Go Triples, go! building a graph

<http://www.regesta.com/diego>
    <http://xmlns.com/foaf/0.1/familyName> "Camarda" ;
    <http://xmlns.com/foaf/0.1/firstName> "Diego" ;
    <http://xmlns.com/foaf/0.1/gender> "male" ;
    <http://www.w3.org/1999/...#type> <http://xmlns.com/foaf/0.1/Person> ;
    <http://www.w3.org/ns/org#memberOf> <http://www.regesta.com/about> .

<http://www.regesta.com/about> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/org#Organization> .

Page 26: Bigdive 2014 - RDF, principles and case studies

Go Triples, go! building a graph

<http://www.regesta.com/diego>
    <http://xmlns.com/foaf/0.1/familyName> "Camarda" ;
    <http://xmlns.com/foaf/0.1/firstName> "Diego" ;
    <http://xmlns.com/foaf/0.1/gender> "male" ;
    <http://www.w3.org/1999/...#type> <http://xmlns.com/foaf/0.1/Person> ;
    <http://www.w3.org/ns/org#memberOf> <http://www.regesta.com/about> .

<http://www.regesta.com/about>
    <http://www.w3.org/1999/...#type> <http://www.w3.org/ns/org#Organization> .

Page 27: Bigdive 2014 - RDF, principles and case studies

Go Triples, go! building a graph

<http://www.regesta.com/diego>
    <http://xmlns.com/foaf/0.1/familyName> "Camarda" ;
    <http://xmlns.com/foaf/0.1/firstName> "Diego" ;
    <http://xmlns.com/foaf/0.1/gender> "male" ;
    <http://www.w3.org/1999/...#type> <http://xmlns.com/foaf/0.1/Person> ;
    <http://www.w3.org/ns/org#memberOf> <http://www.regesta.com/about> .

<http://www.regesta.com/about>
    <http://www.w3.org/1999/...#type> <http://www.w3.org/ns/org#Organization> ;
    <http://www.w3.org/2004/02/skos/core#prefLabel> "Regesta.exe srl" ;
    <http://xmlns.com/foaf/0.1/homepage> <http://www.regesta.com> .

Page 28: Bigdive 2014 - RDF, principles and case studies

Go Triples, go! Objects could be Subjects

[graph nodes: diego]

Page 29: Bigdive 2014 - RDF, principles and case studies

Go Triples, go! considering diego and regesta

[graph nodes: diego, regesta]

Page 30: Bigdive 2014 - RDF, principles and case studies

Go Triples, go! <diego> <memberOf> <regesta>

[graph nodes: diego, regesta]

Page 31: Bigdive 2014 - RDF, principles and case studies

Go Triples, go! but, <regesta> <locatedIn> <rome>

[graph nodes: diego, regesta, rome]

Page 32: Bigdive 2014 - RDF, principles and case studies

Go Triples, go! <diego> <placeOfBirth> <rome>

[graph nodes: diego, regesta, rome]

Page 33: Bigdive 2014 - RDF, principles and case studies

Go Triples, go! <rome> <parentADM> <italy>

[graph nodes: diego, regesta, rome, italy]

Page 34: Bigdive 2014 - RDF, principles and case studies

Go Triples, go! <silvia> <placeOfBirth> <italy>

[graph nodes: diego, regesta, silvia, rome, italy]

Page 35: Bigdive 2014 - RDF, principles and case studies

Go Triples, go! <silvia> <…> <…>

[graph nodes: diego, regesta, silvia, rome, italy]

Page 36: Bigdive 2014 - RDF, principles and case studies

Go Triples, go! <…> <…> <…> = a knowledge graph!

[graph nodes: diego, regesta, silvia, rome, italy]

Page 37: Bigdive 2014 - RDF, principles and case studies

A lot of sentences to achieve (descriptive) freedom

<http://www.regesta.com/diego> <http://xmlns.com/foaf/0.1/familyName> "Camarda" .
<http://www.regesta.com/diego> <http://xmlns.com/foaf/0.1/firstName> "Diego" .
<http://www.regesta.com/diego> <http://xmlns.com/foaf/0.1/gender> "male" .
<http://www.regesta.com/diego> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .
<http://www.regesta.com/diego> <http://www.w3.org/ns/org#memberOf> <http://www.regesta.com> .
<http://www.regesta.com/silvia> <http://xmlns.com/foaf/0.1/familyName> "Mazzini" .
<http://www.regesta.com/silvia> <http://xmlns.com/foaf/0.1/firstName> "Silvia" .
<http://www.regesta.com/silvia> <http://xmlns.com/foaf/0.1/gender> "female" .
<http://www.regesta.com/silvia> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .
<http://www.regesta.com/silvia> <http://www.w3.org/ns/org#memberOf> <http://www.regesta.com> .
<http://www.regesta.com> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/org#Organization> .
<http://www.regesta.com> <http://www.w3.org/2004/02/skos/core#prefLabel> "Regesta.exe srl" .
<http://www.regesta.com/silvia> <http://xmlns.com/foaf/0.1/knows> <http://www.regesta.com/diego> .

<…> <…> <…>.

<noBeer> <makeGoCrazy> <homer> .
<noTv> <makeGoCrazy> <homer> .
<noBeer> <makeGoCrazy> <homer> .
<noTv> <makeGoCrazy> <homer> .
<noBeer> …

Page 38: Bigdive 2014 - RDF, principles and case studies

Standards for semantic web

Page 39: Bigdive 2014 - RDF, principles and case studies

RDF: http://www.w3.org/standards/techs/rdf
SPARQL: http://www.w3.org/standards/techs/sparql
ONTOLOGIES: http://www.w3.org/standards/semanticweb/ontology

Did you study HTML? Good! It's time for a new standard

Page 40: Bigdive 2014 - RDF, principles and case studies

The Resource Description Framework is a general-purpose language for representing information in the Web.

It's time for a new standard RDF

Page 41: Bigdive 2014 - RDF, principles and case studies

The SPARQL Protocol and RDF Query Language is a query language and protocol for RDF.

It's time for a new standard SPARQL

Page 42: Bigdive 2014 - RDF, principles and case studies

On the Semantic Web, vocabularies define the concepts and relationships (also referred to as “terms”) used to describe and represent an area of concern.

It's time for a new standard Ontologies

Page 43: Bigdive 2014 - RDF, principles and case studies

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

foaf:firstName
dc:title
rdfs:label

Pre:fixes (ontologies) just a few words
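A prefix is only an abbreviation: the prefixed name is glued back onto its namespace to obtain the full URI. A minimal Python sketch of that expansion, using the three prefixes from the slide (`expand` is a hypothetical helper name):

```python
# Namespace table copied from the PREFIX declarations on the slide.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}

def expand(prefixed_name, prefixes=PREFIXES):
    """Turn 'foaf:firstName' into the full URI in angle brackets."""
    prefix, local = prefixed_name.split(":", 1)
    return f"<{prefixes[prefix]}{local}>"

for name in ("foaf:firstName", "dc:title", "rdfs:label"):
    print(name, "=", expand(name))
```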

Page 44: Bigdive 2014 - RDF, principles and case studies

Browsing the web of data

Page 45: Bigdive 2014 - RDF, principles and case studies

Resource Description Framework

› SPARQL endpoint
› dereferenceable URIs
› content negotiation
› standard ports, like 80 (HTTP)
› JSONP support

MUST!

Page 46: Bigdive 2014 - RDF, principles and case studies

Resource Description Framework

› SPARQL endpoint
› dereferenceable URIs
› content negotiation
› standard ports, like 80 (HTTP)
› JSONP support
› up-to-date
› the endpoint URL is easy to deduce from resources
› the resources are described by dc:title or rdfs:label
› the endpoint hosts a page for humans
› the resources and the endpoint are on the same domain

SHOULD! (please do it, for me)

Page 47: Bigdive 2014 - RDF, principles and case studies

One single API a world to explore

Page 48: Bigdive 2014 - RDF, principles and case studies

One single API interlinking

<a href="…">click here</a>

owl:sameAs rdfs:seeAlso

Page 49: Bigdive 2014 - RDF, principles and case studies

SELECT * {?minnesota ?banana ?sun}

SPARQL a must know query language

Page 50: Bigdive 2014 - RDF, principles and case studies

SPARQL group graph pattern

[figure: the knowledge graph (diego, regesta, silvia, rome, italy), shown twice]

Page 51: Bigdive 2014 - RDF, principles and case studies

SPARQL group graph pattern

[figure: the nodes diego, regesta and rome grouped together, apart from silvia and italy]

Page 52: Bigdive 2014 - RDF, principles and case studies

SELECT ?person { ?person <placeOfBirth> ?place ; <memberOf> ?company . ?company <locatedIn> ?place . }

SPARQL group graph pattern

<diego>
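How a group graph pattern selects `<diego>` can be sketched in plain Python: each triple pattern is matched against every triple, and variables (names starting with `?`) must keep the same value across patterns. This is a naive illustration, not how a triplestore works internally. The helper names `match` and `solve` are hypothetical, and the `silvia memberOf regesta` triple is an assumption, since the graph slides leave her other triples as `<…>`.

```python
def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`; return None on conflict."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable: bind it, or check it against its earlier value.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new

def solve(patterns, graph):
    """Return every variable binding that satisfies all triple patterns at once."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in graph
            if (extended := match(pattern, triple, binding)) is not None
        ]
    return bindings

graph = {
    ("diego", "memberOf", "regesta"),
    ("diego", "placeOfBirth", "rome"),
    ("regesta", "locatedIn", "rome"),
    ("rome", "parentADM", "italy"),
    ("silvia", "placeOfBirth", "italy"),
    ("silvia", "memberOf", "regesta"),  # assumption, not shown on the graph slides
}

# Same three patterns as the SELECT query on the slide.
patterns = [
    ("?person", "placeOfBirth", "?place"),
    ("?person", "memberOf", "?company"),
    ("?company", "locatedIn", "?place"),
]

people = sorted({b["?person"] for b in solve(patterns, graph)})
print(people)
```

Silvia is dropped because her place of birth (italy) is not the place where her company is located (rome), so `?place` cannot take one value for both patterns.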

Page 53: Bigdive 2014 - RDF, principles and case studies

SELECT ?person ?prop ?obj { ?person <placeOfBirth> ?place ; <memberOf> ?company ; ?prop ?obj . ?company <locatedIn> ?place . }

SPARQL group graph pattern

(turn the page)

Page 54: Bigdive 2014 - RDF, principles and case studies

person   prop             obj
<diego>  rdf:type         foaf:Person
<diego>  foaf:firstName   "Diego"
<diego>  foaf:familyName  "Camarda"
<diego>  foaf:gender      "male"
<diego>  org:memberOf     <regesta>

SPARQL group graph pattern

Page 55: Bigdive 2014 - RDF, principles and case studies

DESCRIBE <diego>

SPARQL describe

(turn the page)

Page 56: Bigdive 2014 - RDF, principles and case studies

<diego> rdf:type foaf:Person .
<diego> foaf:firstName "Diego" .
<diego> foaf:familyName "Camarda" .
<diego> foaf:gender "male" .
<diego> org:memberOf <regesta> .
<silvia> foaf:knows <diego> .

SPARQL describe

Page 57: Bigdive 2014 - RDF, principles and case studies

CONSTRUCT {<diego> foaf:donaldDuck ?c} WHERE{<diego> ?b ?c. }

SPARQL construct

(turn the page)

Page 58: Bigdive 2014 - RDF, principles and case studies

<diego> foaf:donaldDuck foaf:Person .
<diego> foaf:donaldDuck "Diego" .
<diego> foaf:donaldDuck "Camarda" .
<diego> foaf:donaldDuck "male" .
<diego> foaf:donaldDuck <regesta> .

SPARQL construct
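CONSTRUCT builds new triples from a template, one per solution of the WHERE pattern. A rough Python equivalent of the query above, with triples as tuples (`construct` is a hypothetical helper and the graph is the one returned by the DESCRIBE example):

```python
def construct(graph, subject, new_predicate):
    """Emit one new triple per object of `subject`, all under `new_predicate`."""
    return [(s, new_predicate, o) for s, _p, o in graph if s == subject]

graph = [
    ("<diego>", "rdf:type", "foaf:Person"),
    ("<diego>", "foaf:firstName", '"Diego"'),
    ("<diego>", "foaf:familyName", '"Camarda"'),
    ("<diego>", "foaf:gender", '"male"'),
    ("<diego>", "org:memberOf", "<regesta>"),
    ("<silvia>", "foaf:knows", "<diego>"),
]

built = construct(graph, "<diego>", "foaf:donaldDuck")
for triple in built:
    print(*triple, ".")
```

The triple about `<silvia>` is left out because `<diego>` is its object, not its subject, so it does not match `<diego> ?b ?c`.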

Page 59: Bigdive 2014 - RDF, principles and case studies

DISTINCT, COUNT
GRAPH, PREFIX
isBlank, isIRI, isLiteral, isNumeric
FILTER, REGEX, STR
FILTER NOT EXISTS, MINUS
ORDER BY, OFFSET, LIMIT
for other stuff: http://www.w3.org/TR/sparql11-query/

SPARQL minimum requirements

Page 60: Bigdive 2014 - RDF, principles and case studies

Please start negotiating content right now!

Client: Hi dude, I accept: text/html,application/xhtml+xml
Server: Great! I’ll serve you a web page (HTML page)

Client: Hi dude, I accept: application/rdf+xml
Server: Great… 303, redirect! (RDF data)

Client: Hi dude, I accept: pizza/margherita
Server: mmm… sorry (406 error)
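The three dialogues boil down to a decision on the Accept header. A simplified server-side sketch in Python follows. It ignores quality values and wildcards, and the function name, the media-type sets, and the returned notes are illustrative assumptions rather than any server's real behaviour.

```python
HTML_TYPES = {"text/html", "application/xhtml+xml"}
RDF_TYPES = {"application/rdf+xml", "text/turtle"}  # assumed subset of RDF formats

def negotiate(accept_header):
    """Return (status code, note) for a resource URI, given the Accept header."""
    for part in accept_header.split(","):
        # Drop parameters such as ';q=0.9' and normalise the media type.
        media_type = part.split(";")[0].strip().lower()
        if media_type in HTML_TYPES:
            return 200, "HTML page"
        if media_type in RDF_TYPES:
            return 303, "redirect to the RDF data"
    return 406, "Not Acceptable"

for header in ("text/html,application/xhtml+xml",
               "application/rdf+xml",
               "pizza/margherita"):
    print(header, "->", negotiate(header))
```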

Page 61: Bigdive 2014 - RDF, principles and case studies

Please start negotiating content right now!

application/rdf+xml
application/xml
text/plain
text/turtle
application/x-turtle
application/trix
application/x-trig
text/n3
text/rdf+n3
application/x-binary-rdf
text/x-nquads
application/ld+json
application/rdf+json
application/xhtml+xml
text/xml
application/json
application/rdf+n3
application/sparql-results+xml
application/sparql-results+json

Page 62: Bigdive 2014 - RDF, principles and case studies

curl -L -H "Accept: application/rdf+xml" http://dati.camera.it/ocd/governo.rdf/g102
curl -L -H "Accept: text/n3" http://dati.camera.it/ocd/governo.rdf/g102

Please start negotiating content using CURL…
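The same request can be prepared from Python's standard library. The sketch below only builds the request object and does not contact the server; `rdf_request` is a hypothetical helper name.

```python
from urllib.request import Request

URL = "http://dati.camera.it/ocd/governo.rdf/g102"

def rdf_request(url, media_type):
    """Prepare a GET request that asks for a specific RDF serialization."""
    return Request(url, headers={"Accept": media_type})

req = rdf_request(URL, "application/rdf+xml")
print(req.get_method(), req.full_url, req.get_header("Accept"))

# To actually send it, pass the request to urllib.request.urlopen(req);
# its default handlers follow redirects, much like curl -L.
```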

Page 63: Bigdive 2014 - RDF, principles and case studies

Java: Sesame / Jena
Python: RDFLib
Ruby: RDF.rb
Node.js: sparql-client

or, as I do, a simple HTTP GET + parsing the result as JSON or XML

Please start negotiating content …or a framework!
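The "simple HTTP GET + parsing" route might look like this in Python with only the standard library. The endpoint URL is the one used in the case studies, the query is one of the generic ones, and `SAMPLE` is a made-up response in the standard SPARQL JSON results layout, included so the parsing step runs without a network call. `query_url` and `rows` are hypothetical helper names.

```python
import json
from urllib.parse import urlencode

def query_url(endpoint, query):
    """Build the GET URL for a SPARQL query (the protocol's `query` parameter)."""
    return endpoint + "?" + urlencode({"query": query})

def rows(results_json):
    """Flatten a SPARQL JSON result set into a list of {variable: value} dicts."""
    data = json.loads(results_json)
    names = data["head"]["vars"]
    return [
        {name: b[name]["value"] for name in names if name in b}
        for b in data["results"]["bindings"]
    ]

url = query_url("http://dati.camera.it/sparql", "select distinct ?o where {?s a ?o}")
print(url)

# Hypothetical response body, for illustration only.
SAMPLE = """{"head": {"vars": ["o"]},
 "results": {"bindings": [
   {"o": {"type": "uri", "value": "http://xmlns.com/foaf/0.1/Person"}}]}}"""
print(rows(SAMPLE))
```

When sending the request, an `Accept: application/sparql-results+json` header asks the endpoint for this JSON layout.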

Page 64: Bigdive 2014 - RDF, principles and case studies

RDF data storing and deploying

Page 65: Bigdive 2014 - RDF, principles and case studies

RDF will probably transform data into big data

usually: 1 record = 15 triples

e.g. Chamber of deputies: 2,949,771 votes = 64,948,856 triples

It’s slow, so keep calm
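As a quick check of the Chamber of deputies figures, the ratio of triples to votes comes out at about 22 triples per vote, somewhat above the usual 15 per record:

```python
votes = 2_949_771      # records (votes), from the slide
triples = 64_948_856   # triples generated from those votes, from the slide

triples_per_vote = triples / votes
print(f"{triples_per_vote:.1f} triples per vote")
```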

Page 66: Bigdive 2014 - RDF, principles and case studies

Virtuoso
Sesame
Fuseki (Jena)
Owlim / Bigdata (Sesame)
AllegroGraph
D2R server
ARC2
…

Triplestores I just need a SPARQL endpoint

I just really need http://yourdomain/sparql

Page 67: Bigdive 2014 - RDF, principles and case studies

Case studies

Page 68: Bigdive 2014 - RDF, principles and case studies

select distinct ?o where {?s a ?o}

select ?o (count(distinct ?s) as ?n) where {?s a ?o} group by ?o

select (count(?s) as ?n) where {?s ?p ?o}

select (count(?s) as ?n) ?class where {?s ?p ?o; a ?class} group by ?class

select distinct ?p where {?s a <http://classe>; ?p ?o}

select ?p (count(?p) as ?n) where {?s a <http://classe>; ?p ?o} group by ?p

select ?s where {?s a <http://classe>}

select ?p ?o where {<http://URI> ?p ?o}

select ?p ?o ?p1 ?o2 where {<http://URI> ?p ?o. OPTIONAL{?o ?p1 ?o2. FILTER(isBlank(?o))}}

select distinct ?s ?title where {?s a <http://classe>; dc:title ?title. FILTER(REGEX(?title, "parola", "i"))} LIMIT 100

SPARQL magic a query for all seasons

Page 69: Bigdive 2014 - RDF, principles and case studies

Case studies Chamber of deputies

Page 70: Bigdive 2014 - RDF, principles and case studies

http://dati.camera.it/sparql

http://storia.camera.it

From SPARQL to html

Page 71: Bigdive 2014 - RDF, principles and case studies

Case studies Central State Archive

Page 72: Bigdive 2014 - RDF, principles and case studies

http://acs.beniculturali.it/sparql

http://labs.regesta.com/reloadProject

From SPARQL to html

Page 73: Bigdive 2014 - RDF, principles and case studies

Useful links

Page 74: Bigdive 2014 - RDF, principles and case studies

W3C standards: http://www.w3.org/standards/semanticweb/
OKFN endpoints status (and list): http://sparqles.okfn.org
LodLive (a SPARQL navigator): http://en.lodlive.it
A very good intro to RDF: https://github.com/JoshData/rdfabout/blob/gh-pages/intro-to-rdf.md
Tim Berners-Lee’s “Linked Data – 5 stars ranking”: http://www.w3.org/DesignIssues/LinkedData.html
My github page: http://github.com/dvcama
My email: mailto:[email protected]