Hello Open World - Semtech 2009
Copyright 2008 Digital Enterprise Research Institute. All rights reserved.
Digital Enterprise Research Institute www.deri.ie
Hello Open Data World!
The Web of Data for the Pragmatic Developer
Alexandre Passant & Giovanni Tummarello
SEMTECH, San Jose, California 2009
Outline
Content
Motivation
What is the Web of Data?
Creating Structured Data
Discovery, Accessing & Querying
Mash-ups & Advanced Topics
NB: Tutorial adapted from the WWW2009 “Hello Open
World” presentation by Alexandre Passant and Michael
Hausenblas
Speakers introduction
Alexandre Passant
Postdoctoral researcher, DERI Galway
Social Software and Semantic Web
http://apassant.net
Giovanni Tummarello
Research Fellow, DERI Galway
Web of Data Search and Mashups
http://g1o.net
DERI Galway
Enabling Networked Knowledge
Social Semantic Information Spaces
Semantic Reality
Approximately 130 people
Companion website
http://helloopenworld.net
What is this tutorial about
What you will learn
Web of Data principles
Architecture principles for applications on the Web of Data
Finding and creating structured data
Using vocabularies and lightweight inference
Querying Web data with SPARQL and with Sindice
User interfaces and mash-ups
What you will not learn
Ontology mapping and alignment
Advanced rules languages
Complex SPARQL querying
What you should be able to do after this
tutorial
Explain the Web of Data to your CTO / Students /
Advisor / Grandmother
Spread the values of the Web of Data
Join the Web of Data
Effectively enriching your existing pages and applications with
annotations that will help your data to be found and integrated
Leverage the Web of Data for your apps
Finding data “out there” and reusing it
Using open-source and xAMP technologies
Creating, consuming and mashing-up RDF data
Outline
Content
Motivation
What is the Web of Data?
Creating Structured Data
Discovery, Accessing & Querying
Mash-ups & Advanced Topics
Motivation
[Figure: Yahoo Rich Snippets]
More and more data is available on the Web
Structured data, in RDF, microformats, etc.
Reusing data adds value to your application !
Until now, developers have worked against proprietary
APIs (such as those from Flickr, Google, etc.)
Loss of time for developers
The Web of Data …
Provides a uniform data model (RDF)
Provides a uniform API for accessing data (RDF/SPARQL)
Provides common semantics for this data (RDFS/OWL)
Enables serendipitous usage of data
Outline
Content
Motivation
What is the Web of Data?
Creating Structured Data
Discovery, Accessing & Querying
Mash-ups & Advanced Topics
What is the Web of Data?
Web of information structured via standards and made
available on the Web
Microformats and GRDDL
RDF using various serializations: RDF/XML, RDFa, etc.
RDF
RDF: Resource Description Framework
Per the RDF abstract syntax, a data model: a directed,
labeled graph based on URIs
RDF is not XML !
RDF/XML is only one of multiple ways to serialize RDF data
(N3, RDFa …)
Triple: (subject predicate object)
<http://sw-app.org/#i> <http://xmlns.com/foaf/0.1/knows> <http://apassant.net/alex> .
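To make the triple model concrete, here is a minimal Python sketch that holds the triple from this slide as a plain tuple and prints it in N-Triples form. The `Triple` type and the `to_ntriples` helper are illustrative names, not part of any RDF library, and the sketch only handles URI objects (no literals or blank nodes).

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement; all three positions are URIs in this sketch."""
    subject: str
    predicate: str
    obj: str


def to_ntriples(triple: Triple) -> str:
    """Serialize a URI-only triple as a single N-Triples line."""
    return f"<{triple.subject}> <{triple.predicate}> <{triple.obj}> ."


# The example triple from the slide.
knows = Triple(
    "http://sw-app.org/#i",
    "http://xmlns.com/foaf/0.1/knows",
    "http://apassant.net/alex",
)
line = to_ntriples(knows)
print(line)
```

The same tuple could just as well be written out as RDF/XML or RDFa: the data model stays the same, only the serialization changes.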
RDF
@prefix dcterms: <http://purl.org/dc/terms/> .
<http://deri.ie/teaching/tutorials/lod-intro>
dcterms:title "Tutorial on Linked Data - A Practical Introduction" ;
dcterms:creator <http://sw-app.org/mic.xhtml#i> ;
dcterms:subject <http://dbpedia.org/resource/Linked_Data> .
“Semantic Data” on the Web
Web of Data == the Semantic Web?
Not really, it’s the “Web” facing part of it
It’s part of it, a kind of subset
In contrast to the full-fledged Semantic Web vision, the Web
of Data is more about publishing raw data in interoperable
formats than about logical inference and reasoning on it
What is the Web of Data?
Web of data via “Linked Data”
Linked data principles, by Tim Berners-Lee, ca. 2006
Use URIs to identify things (anything, not just documents)
– “To benefit from and increase the value of the World Wide Web,
agents should provide URIs as identifiers for resources”
– http://www.w3.org/TR/webarch/
Use HTTP URIs – globally unique names, distributed ownership
– allows people to look up things
Provide useful information in RDF – when someone looks up a
URI
Include RDF links to other URIs – to enable discovery of related
information
Example http://dbpedia.org/resource/Dublin
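Looking up a Linked Data URI amounts to an HTTP GET with an Accept header that asks for RDF. The following Python sketch uses only the standard library; the helper names are illustrative, and which representation a given server actually returns for `application/rdf+xml` is an assumption to check per site.

```python
import urllib.request


def build_rdf_request(uri: str) -> urllib.request.Request:
    """Prepare an HTTP GET that asks for RDF/XML rather than HTML."""
    return urllib.request.Request(uri, headers={"Accept": "application/rdf+xml"})


def dereference(uri: str, timeout: float = 10.0) -> bytes:
    """Look up the URI and return the body the server sends back.

    Redirects (for example from a resource URI to a data document)
    are followed automatically by urllib.
    """
    with urllib.request.urlopen(build_rdf_request(uri), timeout=timeout) as response:
        return response.read()


# Only the request is built here; call dereference(...) to go to the network.
request = build_rdf_request("http://dbpedia.org/resource/Dublin")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```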
Linked data growth
[Figure: 2007 to 2008]
[Figure: 2008 to 2009]
Needed: shared vocabularies (Ontologies)
Ontologies provide common semantics for the Web of
Data
“An ontology is a specification of a conceptualization.”
Main languages are RDFS and OWL
This tutorial will mainly focus on RDFS
OWL allows advanced axioms (constraints, unions …)
Classes and properties
:Person a rdfs:Class
:father a rdf:Property
:father rdfs:domain :Person
:father rdfs:range :Person
Ontologies
Hierarchies in ontologies
Are needed to define narrower / broader concepts
:LivingThing > :Person
Can be applied to both classes and properties
:Person rdfs:subClassOf :LivingThing
:father rdfs:subPropertyOf :familyRelation
Inference engines can take advantage of it to create new
facts
Can be used when querying information
Retrieve all :LivingThing instances with a :familyRelation
– Will get :Person and :father !
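A small Python sketch of how an inference engine could derive these new facts. It applies the RDFS subclass rule (rdfs9) and subproperty rule (rdfs7) to a set of triples until nothing new appears. Triples are plain tuples of prefixed names, and the instance data (`:alice`, `:bob`) is made up for illustration; real engines are considerably more elaborate.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
SUBPROP = "rdfs:subPropertyOf"


def infer(triples):
    """Apply the subclass and subproperty rules until a fixpoint is reached."""
    graph = set(triples)
    while True:
        new = set()
        for (s, p, o) in graph:
            for (s2, p2, o2) in graph:
                # :C subClassOf :D  and  x a :C   =>  x a :D
                if p == SUBCLASS and p2 == RDF_TYPE and o2 == s:
                    new.add((s2, RDF_TYPE, o))
                # :p subPropertyOf :q  and  x :p y  =>  x :q y
                if p == SUBPROP and p2 == s:
                    new.add((s2, o, o2))
        if new <= graph:
            return graph
        graph |= new


data = {
    (":Person", SUBCLASS, ":LivingThing"),
    (":father", SUBPROP, ":familyRelation"),
    (":alice", RDF_TYPE, ":Person"),
    (":alice", ":father", ":bob"),
}
closure = infer(data)
for triple in sorted(closure - data):
    print(triple)
```

A query for :LivingThing instances with a :familyRelation now matches :alice, although neither fact was stated explicitly.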
Notable ontologies
Social networks and social data
FOAF – Friend Of A Friend
SIOC – Semantically-Interlinked Online Communities
Software development
DOAP – Description Of A Project
BAETLE - Bug And Enhancement Tracking LanguagE
Comprehensive / Top-level
Yago (From Wikipedia)
OpenCYC
Taxonomies
SKOS – Simple Knowledge Organisation System
Zooming in: FOAF Ontology
A model to describe people and social networks
http://foaf-project.org
Concepts
Person, OnlineAccount, Document, etc.
Properties
name, homepage, holdsAccount, knows, etc.
FOAF in use
Google Social Graph API
http://code.google.com/intl/fr/apis/socialgraph/
Uses FOAF information already there on the Web to find
your contacts
http://socialgraph-resources.googlecode.com/svn/trunk/samples/findcontacts.html
E.g.: http://apassant.net
– http://socialgraph-resources.googlecode.com/svn/trunk/samples/findcontacts.html?q=http%3A%2F%2Fapassant.net
– Contacts found in various FOAF files that link to myself and to my
profile
Zooming in: SIOC Ontology
Describe Web communities and their social interactions
Who’s writing what, who’s answering who, etc.
A simple model to ensure easy integration into existing
applications
Lightweight: one core ontology, 4 modules
Plug-ins / core features for several CMSs
Drupal, WordPress, etc.
Enables interoperability between social applications
http://sioc-project.org
The SIOC ontology
The main classes and properties are:
[Figure: main SIOC classes and properties]
Combining FOAF + SIOC
Adoption of SIOC
Which ontologies to use ?
SearchMonkey Vocabularies
http://developer.yahoo.com/searchmonkey/smguide/profile_vocab.html
Which ontologies to use ?
How to Publish Linked Data on the Web
http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/
Extending ontologies ?
What if existing ontologies are not enough for your
needs ?
Create a new one
… or extend existing ones !
Ontologies can be extended in a decentralized way
E.g. you can create a subproperty of foaf:knows,
“attendedTutorialWith”, in your own ontology
Publish it on your own
Or use http://open.vocab.org
Attention: Domain and range
Domain and range of properties are descriptive, not
prescriptive
Example if we say :father rdfs:domain :Person
– Not only pre-defined Persons can be fathers
– But every father is a Person !
Consequence 1: One triple is enough to convey several
pieces of information
Consequence 2: DON’T use foaf:homepage for a shoe
For details
Based on RDF semantics (Rule rdfs2)
http://www.w3.org/TR/rdf-mt/
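Rule rdfs2 can be sketched in a few lines of Python: whenever a property with a declared domain is used, the subject of that statement is inferred to be an instance of the domain class. Triples are plain tuples of prefixed names, and the individuals `:bob` and `:carol` are made up for illustration.

```python
def apply_domain_rule(triples):
    """Rule rdfs2: if p has domain C and (s p o) holds, then s is a C."""
    domains = {(s, o) for (s, p, o) in triples if p == "rdfs:domain"}
    inferred = set()
    for (prop, cls) in domains:
        for (s, p, o) in triples:
            if p == prop:
                # Nothing is rejected here: using the property is what
                # makes the subject a member of the domain class.
                inferred.add((s, "rdf:type", cls))
    return inferred


data = {
    (":father", "rdfs:domain", ":Person"),
    (":bob", ":father", ":carol"),
}
types = apply_domain_rule(data)
print(types)
```

Note that nobody declared :bob to be a :Person beforehand. This is why a property applied to the wrong kind of thing silently produces a wrong type statement instead of an error.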
Linking Open Data Project
Community project with W3C support started in early
2007 [LOD]: http://linkeddata.org
Idea: take existing (open) data sets and make them
available on the Web in RDF
Interlink them with other data sets
Expand the network effect of Linked Data !
Raw Data Now !
Tim Berners-Lee TED talk
http://www.ted.com/index.php/talks/tim_berners_lee_on_the_next_web.html
Kudos to Tom Heath and Richard Cyganiak; the material in
this section is heavily based on their work.
Linking Open Data Project
[Figure: Linking Open Data datasets, May 2007]
[Figure: Linking Open Data datasets, Feb 2009]
Notable datasets of the LOD cloud
Linking Open Data
Community project started in 2007 - http://linkeddata.org
DBpedia – http://dbpedia.org
Wikipedia in RDF: “more than 2.6 million things, including at
least 213,000 persons, 328,000 places, 57,000 music albums,
36,000 films, 20,000 companies”
Geonames – http://geonames.org
“over eight million geographical names”: coordinates, etc.
Freebase - http://rdf.freebase.com/
“5203825 Topics 14110006 Named Entities”
E.g. http://rdf.freebase.com/rdf/en.blade_runner
Linking Open Data Project
[Figure: DBpedia]
[Figure: Geonames]
Querying DBpedia
Programmatically (via SPARQL, see later)
Via User Interface
http://wikipedia.aksw.org
Tools and Applications
Linking Open Data homepage [LOD] has
Browsing with Tabulator, VisiNav, Sig.ma, DBpedia Mobile,
iLOD, etc.
Searching with Sindice, SWSE, Falcons, etc.
Mashups, e.g. Revyu, BBC Music, DERI Pipes
See further
http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData/Applications
Tools and Applications
[Figure: DBpedia Mobile]
Applications integration
[Figure: BBC music beta]
Typical architectures of applications
for the Web of Data (1/2)
from: Heitmann, B., et al., “Towards a reference architecture for Semantic Web
applications,” Proceedings of the 1st Int. Web Science Conference, 2009
Typical architectures of applications
for the Web of Data (2/2)
Data Interface: Abstraction layer regarding implementation,
number and distribution of persistence layers.
Persistence Layer: Persistent storage of data and run time state.
User Interface: Human accessible interface for using application
and viewing data. (“read-only”)
Annotation User Interface: Edit, create, import or export data.
Integration Service: Merge Structure, Syntax or Semantics of data
from multiple heterogeneous sources.
Search Engine: Search on content or semantic features.
Crawler: Retrieval of remote data for integration service.
Outline
Content
Motivation
What is the Web of Data?
Creating Structured Data
Discovery, Accessing & Querying
Mash-ups & Advanced Topics
Creating Structured Data
Overview of different methods:
Create RDF/XML manually (using your favourite text-editor or
Web-based interfaces)
Create XHTML+RDFa documents and use GRDDL
transformation
– For both human and machines !
Use exporters / wrappers for existing services
Use applications that natively expose RDF data
Provide mappings from RDBMS to RDF data
Hands-on !
We will go through several of them to create interlinked RDF
data from various sources of structured data
Getting a FOAF profile
Or how to give yourself a URI
Be part of the Web of Data
Create your FOAF file
http://www.ldodds.com/foaf/foaf-a-matic (requires hosting -
provided during the tutorial)
http://foafbuilder.qdos.com/builder/ (requires OpenID)
I already have a homepage, what about duplication of
information ?
Use RDFa to embed RDF annotations in your homepage !
More on this topic in a few slides
Extend your FOAF profile
The foaf:knows property aims to represent social
connections between people
:alex foaf:knows :michael
Going further with the relationship vocabulary
http://vocab.org/relationship/: colleagueOf, hasMet …
Add some people from the workshop, validate, and
upload to the workshop repository
http://www.w3.org/RDF/Validator/
http://helloopenworld.net/semtech09/data
You finally got a URI !
– http://helloopenworld.net/semtech09/data/apassant.rdf#me
Defining personal interests
Instead of modeling interests as plain-text strings, use
URIs to describe them !
Allows interlinking of various resources for advanced query
purposes: “find all people that like movies directed by Tarantino”
And link them to you using foaf:topic_interest
:me foaf:topic_interest :movie
But … where to get these URIs ?
The Linking Open Data cloud !
– Provides URIs for millions of concepts, esp. thanks to DBpedia
Sindice can be used to find URIs for a given concept
– http://sindice.com
RDFa and GRDDL
GRDDL is a mechanism to transform any kind of XML to
RDF
XHTML+RDFa is XML, hence GRDDL can extract it
Simply embeds RDFa annotations in your HTML code
Indexed by Yahoo! SearchMonkey and Google
Done via XSLT, available at http://www.w3.org/2008/07/rdfa-xslt
RDFa and GRDDL
The GRDDL Primer at http://www.w3.org/TR/grddl-primer/#scheduling
shows the overall processing of XHTML+RDFa:
[Figure: processing of XHTML+RDFa, from the GRDDL Primer]
RDFa and GRDDL
http://sdow2009.semanticweb.org
Browse source to check RDFa annotations
Header contains prefixes and links to the GRDDL transformation
Webpage can be translated to native RDF/XML using an RDFa
distiller - http://www.w3.org/2007/08/pyRdfa/
Add RDFa to your homepage
Choose the right DTD
http://www.w3.org/TR/rdfa-syntax/DTD/xhtml-rdfa-1.dtd
Add prefix definitions in the header
Depending on the ontologies you will use
Add the appropriate profile
E.g. http://ns.inria.fr/grddl/rdfa/
Add additional markup
E.g. rel, about, typeof
Example
http://helloopenworld.net/semtech2009/files/profile.html
Wrappers for existing sources
Creating and maintaining a FOAF file by hand can be a
time-consuming task
How can we automatically get RDF data from existing sources ?
What about Web 2.0 services in which we already give
lots of personal information ?
Most of them provide APIs to get structured information (JSON,
XML …) about the user profiles, content, etc.
API to RDF wrappers can easily be implemented
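As a rough illustration of how small such a wrapper can be, the Python sketch below turns a JSON profile into FOAF in Turtle. The JSON field names (`id`, `name`, `homepage`, `friends`) and the `http://example.org/people/` URI scheme are hypothetical and do not correspond to any real service's API. `json.dumps` is used as a shortcut for quoting the name literal.

```python
import json


def profile_to_turtle(profile_json: str, base_uri: str) -> str:
    """Convert a (hypothetical) JSON user profile into FOAF Turtle."""
    profile = json.loads(profile_json)
    me = f"<{base_uri}{profile['id']}#me>"
    lines = [
        "@prefix foaf: <http://xmlns.com/foaf/0.1/> .",
        "",
        f"{me} a foaf:Person ;",
        f"    foaf:name {json.dumps(profile['name'])} ;",
        f"    foaf:homepage <{profile['homepage']}> .",
    ]
    # One foaf:knows statement per contact, using the same URI scheme.
    for friend_id in profile.get("friends", []):
        lines.append(f"{me} foaf:knows <{base_uri}{friend_id}#me> .")
    return "\n".join(lines)


sample = json.dumps({
    "id": "alex",
    "name": "Alexandre Passant",
    "homepage": "http://apassant.net",
    "friends": ["gio"],
})
turtle = profile_to_turtle(sample, "http://example.org/people/")
print(turtle)
```

A real wrapper would fetch the JSON from the service's API and would also have to decide how to mint stable URIs for users, which is the part that matters most for interlinking.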
Wrappers for Web 2.0 services
Facebook wrapper
Generates a FOAF file from your Facebook profile
http://www.dcs.shef.ac.uk/~mrowe/foafgenerator.html
Flickr wrapper
Generates FOAF + SIOC + links to geographical information
(using geonames.org)
http://apassant.net/home/2007/12/flickrdf
RDFification services
Translates many structured sources into RDF
URIBurner
– http://linkeddata.uriburner.com/
– Open Source, C++ , Based on Virtuoso
Any23
– Sindice sponsored
– Open Source, Java based
Swignition
– http://buzzword.org.uk/swignition/
– Perl based
Triplr
– Purely syntactic, fast
– http://triplr.org
Interlinking identities
The previous exporters create different URIs
A need to unify your online identity on the Web of Data
owl:sameAs
Used to state that two resources with different URIs are about the same entity
http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/
owl:InverseFunctionalProperty
foaf:mbox, foaf:openid, etc.
“Inverse Functional” properties can be used to identify uniqueness for a foaf:Person
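The two mechanisms combine naturally: if two URIs share a value for an inverse functional property, they can be linked with owl:sameAs. The Python sketch below shows the idea on plain tuples. The example URIs and mailbox values are made up, and the set of properties treated as inverse functional is hard-coded rather than read from the ontology.

```python
from collections import defaultdict

# Properties treated as inverse functional in this sketch.
IFPS = {"foaf:mbox", "foaf:openid"}


def same_as_links(triples):
    """Link subjects that share a value for an inverse functional property."""
    by_value = defaultdict(set)
    for (s, p, o) in triples:
        if p in IFPS:
            by_value[(p, o)].add(s)
    links = set()
    for subjects in by_value.values():
        ordered = sorted(subjects)
        # Link every other URI to the first one in sort order.
        for other in ordered[1:]:
            links.add((ordered[0], "owl:sameAs", other))
    return links


data = {
    ("http://example.org/flickr/alex", "foaf:mbox", "mailto:alex@example.org"),
    ("http://example.org/facebook/alex", "foaf:mbox", "mailto:alex@example.org"),
    ("http://example.org/flickr/gio", "foaf:mbox", "mailto:gio@example.org"),
}
links = same_as_links(data)
print(links)
```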
Interlinking identities and networks
Native export of RDF data
CMS can expose RDF data natively using dedicated
plug-ins
SIOC Export for Drupal: http://drupal.org/project/SIOC
Provide RDF export of each blog post
– http://apassant.net/blog/2009/03/07/call-suggested-features-sparql-working-group
– http://apassant.net/sioc/node/235
Using RDF autodiscovery feature in the HTML header
– So that RDF can be discovered when browsing HTML
– Semantic Radar: http://sioc-project.org/firefox
RDFa to be included in Drupal 7 core ! – http://groups.drupal.org/node/16597
– 100,000s of RDFa-powered websites
Overview: SIOC for vBulletin
Relational to RDF Mapping
Relational data (RDB) is structured data and can be
mapped to RDF in a straightforward way
Main issues:
Closed-world vs. open-world modeling
Assigning URIs for entities (records)
Mapping language expressivity
For a state-of-the-art see
http://www.w3.org/2005/Incubator/rdb2rdf/RDB2RDF_SurveyReport.pdf
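A naive direct mapping, sketched in Python with an in-memory SQLite database, shows where the issues above appear: each row gets a URI built from the table name and primary key, each column becomes a property, and NULL values produce no triple at all (in an open world, a missing value is unknown, not false). The table, the `http://example.org/` base URI and the vocabulary namespace are assumptions for illustration; tools such as D2RQ use a dedicated mapping language instead.

```python
import sqlite3


def table_to_triples(connection, table, key, base_uri, vocab):
    """Map each row to a subject URI and each non-key column to a property."""
    # The table name is interpolated directly, so it must come from trusted code.
    cursor = connection.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    triples = []
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"<{base_uri}{table}/{record[key]}>"
        for column, value in record.items():
            if column == key or value is None:
                continue  # NULL: emit nothing rather than an empty value
            triples.append((subject, f"<{vocab}{column}>", f'"{value}"'))
    return triples


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
connection.execute("INSERT INTO person VALUES (1, 'Alex', 'Galway')")
connection.execute("INSERT INTO person VALUES (2, 'Giovanni', NULL)")

triples = table_to_triples(
    connection, "person", "id", "http://example.org/", "http://example.org/vocab#"
)
for triple in triples:
    print(" ".join(triple), ".")
```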
Relational to RDF Mapping
Standardization
W3C RDB2RDF Incubator Group 2008/2009
Upcoming W3C RDB2RDF Working Group
Current solutions (see state-of-the-art)
D2RQ
– http://www4.wiwiss.fu-berlin.de/bizer/d2rq/
– DBLP in RDF: http://dblp.l3s.de/d2r/
OpenLink’s Virtuoso
– http://www.openlinksw.com/virtuoso/
Triplify
– http://triplify.org
Outline
Content
Motivation
What is the Web of Data?
Creating Structured Data
Discovery, Accessing & Querying
Mash-ups & Advanced Topics
Discovery of RDF data
Discovery is the process of starting with a URI and learning
more about the resources that can be accessed or
described through it
… without using a search engine
Discovering RDF data
Simple case: Dereference
“Follow-Your-Nose” approach
Semantic Sitemaps
to find SPARQL endpoints or data dumps
http://sw.deri.org/2007/07/sitemapextension/
Links from within the RDF
voiD, vocabulary of interlinked datasets
– Describes what a dataset is about
– Provides quantitative data on interlinking (statistics)
– Conveys licensing, provenance and access information
– http://semanticweb.org/wiki/VoiD
rdfs:seeAlso, owl:imports or simply dereference other URIs
Semantic Web Sitemaps
Easy to create metadata from your existing database
(D2RQ, microformats etc)
But you need to tell the world about it!
More is needed to make your data useful (e.g. linking to OTHER
URIs if your entities are not something completely “yours”)
Need to make the world know your data is there.
Semantic Web Sitemaps can help
Large quantities of linked data: how to
expose?
The fact that the data is HTTP retrievable in small bits
makes it crawlable.
But data producers are very scared of this:
Millions of hits for each refresh
And clearly something better must be possible
Most data producers do in fact already provide full dumps of the
base data
Or SPARQL endpoints
Extending Sitemaps to expose data
Sitemaps:
Originally by Google, immediately adopted by all (Yahoo, MSN, etc.)
Expose the “deep web” by providing a list of pages “to be
crawled”
Written in XML, linked directly from robots.txt
Example:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2005-01-01</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
The Semantic Sitemap Extension
Example first:
<urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd"
  xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
  xmlns:sc="http://sw.deri.org/2007/07/sitemapextension/scschema.xsd">
  <sc:dataset>
    <sc:datasetLabel>Product Catalog for Example.org</sc:datasetLabel>
    <sc:dataDumpLocation>http://example.org/cataloguedump.rdf</sc:dataDumpLocation>
    <sc:linkedDataPrefix>http://example.org/products/</sc:linkedDataPrefix>
    <changefreq>monthly</changefreq>
  </sc:dataset>
</urlset>
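For a consumer, reading such a sitemap takes a few lines with a standard XML parser. The Python sketch below parses the example above with `xml.etree.ElementTree` and lists, for each dataset, its label, dump locations and linked data prefix. The function name and the returned dictionary layout are illustrative choices. The namespace URIs are copied from the example.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "sc": "http://sw.deri.org/2007/07/sitemapextension/scschema.xsd",
}

SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:sc="http://sw.deri.org/2007/07/sitemapextension/scschema.xsd">
  <sc:dataset>
    <sc:datasetLabel>Product Catalog for Example.org</sc:datasetLabel>
    <sc:dataDumpLocation>http://example.org/cataloguedump.rdf</sc:dataDumpLocation>
    <sc:linkedDataPrefix>http://example.org/products/</sc:linkedDataPrefix>
    <changefreq>monthly</changefreq>
  </sc:dataset>
</urlset>"""


def list_datasets(sitemap_xml: str):
    """Return label, dump locations and URI prefix for each sc:dataset."""
    root = ET.fromstring(sitemap_xml)
    datasets = []
    for dataset in root.findall("sc:dataset", NAMESPACES):
        datasets.append({
            "label": dataset.findtext(
                "sc:datasetLabel", default="", namespaces=NAMESPACES).strip(),
            "dumps": [
                (element.text or "").strip()
                for element in dataset.findall("sc:dataDumpLocation", NAMESPACES)
            ],
            "prefix": dataset.findtext(
                "sc:linkedDataPrefix", default="", namespaces=NAMESPACES).strip(),
        })
    return datasets


datasets = list_datasets(SITEMAP)
print(datasets)
```

A crawler that is handed a URI starting with the listed prefix can then fetch the dump once instead of dereferencing every product URI.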
How it is meant to be used
As a crawler
If you are given a URL for an RDF site check for the sitemap
If a dump is available, download that instead
As a client
If you have a dump, and want an update
Check the sitemap, to locate it in case it has changed position
Or to locate a SPARQL endpoint
Or Search!
Sindice Search Engine
http://sindice.com
Look up RDF by keywords and by property/value
descriptions
Simple queries, but executed fast.
Fast indexing (20 to 60m) of newly “pinged” information
Sindice can be thought of as a “spider in the middle” for
application-to-application semantic communication via published data.
Searching the Web of Data
E.g. search for people who claim to know X
Lod.openlinksoftware.com
Query service over all the major “linked datasets”
Advanced queries (SPARQL, see later)
No pings
In general, best-effort results due to timeouts (on non-simple
queries)
How to manipulate and query RDF?
Querying Web data at runtime
Needs to load RDF data in memory, can be quite slow
Storing information in RDF-stores and using SPARQL
Involves data replication and need to sync between data from the
Web and data in your RDF-store
– Sindice API to help syncing
Lots of RDF-stores available on the market
– Sesame
– Jena
– Openlink Virtuoso
– Allegrograph
– OpenAnzo
– Mulgara, etc
SPARQL
SPARQL Protocol and RDF Query Language
“The SQL of the Semantic Web”
Both a protocol and a query language
– RDF data can be queried via REST
Four different query forms
SELECT, CONSTRUCT, ASK, DESCRIBE
We will mainly focus on the first one
SPARQL is based on a graph-matching approach
Retrieve statements that match some patterns in one (or more)
RDF graph(s): independent of serialization
W3C SPARQL WG currently working on new features
http://www.w3.org/2009/01/sparql-charter
SPARQL SELECT
SELECT all people and their name
http://helloopenworld.net/semtech2009/files/select1.sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person ?name
WHERE {
  ?person a foaf:Person ;
          foaf:name ?name .
}
SPARQL CONSTRUCT
Construct an RDF graph from other ones
Can be seen as the XSLT of the Semantic Web
http://helloopenworld.net/semtech2009/files/construct1.sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX semtech09: <http://ex.org/semtech09/>
CONSTRUCT { ?person a semtech09:attendee . }
WHERE { ?person a foaf:Person . }
SPARQL DESCRIBE
Get information about a given resource
DESCRIBE is implementation specific and can return
different results depending on the triple-store used
http://helloopenworld.net/semtech2009/files/desc1.sparql
DESCRIBE <http://helloopenworld.net/semtech2009/data/apassant.rdf#me>
SPARQL ASK
Check if a particular pattern matches the RDF graph
Is Alex a foaf:Person ?
http://helloopenworld.net/semtech2009/files/ask1.sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
ASK {
  <http://helloopenworld.net/semtech2009/data/apassant.rdf#me> a foaf:Person .
}
SPARQL Protocol
REST-compliant protocol for SPARQL queries
http://helloopenworld.net/semtech2009/store1/?query=PREFIX+foaf%3A+%3Chttp%3A%2F%2Fxmlns.com%2Ffoaf%2F0.1%2F%3E%0D%0ASELECT+%3Fperson+%3Fname%0D%0AWHERE+%0D%0A+%3Fperson+a+foaf%3APerson+%3B%0D%0A+++foaf%3Aname+%3Fname+.%0D%0A%0D%0A&output=htmltab
Easy for remote SPARQL querying
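Such a URL does not need to be assembled by hand. The Python sketch below builds it with the standard library: the query text goes into the `query` parameter, as the SPARQL Protocol specifies for GET requests, and any extra parameters are passed through. The `output=htmltab` parameter is taken from the URL above and is specific to that endpoint, not part of the protocol; the helper name is illustrative.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

QUERY = """PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person ?name
WHERE {
  ?person a foaf:Person ;
          foaf:name ?name .
}"""


def sparql_get_url(endpoint: str, query: str, **extra: str) -> str:
    """Build a SPARQL Protocol GET URL for the given endpoint and query."""
    params = {"query": query, **extra}
    return f"{endpoint}?{urlencode(params)}"


url = sparql_get_url(
    "http://helloopenworld.net/semtech2009/store1/", QUERY, output="htmltab"
)
print(url)

# Decoding the URL again gives back the original query text.
decoded = parse_qs(urlsplit(url).query)
print(decoded["query"][0] == QUERY)
```

The resulting URL can be fetched with any HTTP client, which is what makes remote querying straightforward.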
SPARCool.net
http://sparcool.net
Run SPARQL queries on any URI that follows the Linked
Data principles
http://sparcool.net/j/dbp:abstract;l=en/http://dbpedia.org/resource/Semantic_Web
Various formats for results (HTML, JSON, etc.)
Embed results in your documents using JSONP
Setup a SPARQL endpoint
Various open-source triple stores available
Virtuoso, Sesame, Joseki …
Based on various back-ends (MySQL, dedicated FS …)
We will focus on xAMP solutions with ARC2
Lightweight RDF framework for PHP - http://arc.semsol.org
RDF Store based on MySQL
Only a few lines of code to set-up a repository
– http://helloopenworld.net/semtech2009/store1/index.phps
Using SPARQL+ to LOAD / UPDATE / DELETE RDF data
– SPARQL being read-only
Used in various of our projects
– SMOB, LODr, etc
Loading RDF data
SPARQL is a read-only language
SPARQL+ allows adding / modifying / deleting RDF data
LOAD <URI> [INTO <URI>]
Will load the RDF data from <URI> into the store before going
into SPARQL querying
LOAD your FOAF files in the RDF store
http://helloopenworld.net/semtech2009/store1/
E.g. LOAD
<http://helloopenworld.net/semtech2009/data/apassant.rdf>
– NB: To be done in POST mode
Lightweight inference
ARC2 does not provide an RDFS inference engine
But triggers can be used to write one
http://apassant.net/blog/2008/10/01/lightweight-subpropertyof-subclassof-inference-arc2
Rule rdfs7: inference on subproperties
http://www.w3.org/TR/rdf-mt/#RDFSRules
Can be done with SPARQL CONSTRUCT and ARC2 triggers
http://helloopenworld.net/semtech2009/store2/index.phps
http://helloopenworld.net/semtech2009/files/ARC2_SubPropertyInferenceTrigger.phps
Lightweight inference
LOAD your profile in the inference-enabled store
http://helloopenworld.net/semtech2009/store2
Try the following query in both stores
Only the second one deals with subProperty inference
http://helloopenworld.net/semtech/2009/files/select2.sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person
WHERE {
  <YOUR_URI> foaf:knows ?person .
}
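A possible way to prepare the same query for both stores from a script, sketched in Python. The "query" parameter is the one defined by the SPARQL protocol. The person URI is a hypothetical placeholder, and query_url is an illustrative helper that only builds the URLs.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

QUERY_TEMPLATE = """PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person
WHERE {{
  <{uri}> foaf:knows ?person .
}}"""

STORES = [
    "http://helloopenworld.net/semtech2009/store1/",  # no inference
    "http://helloopenworld.net/semtech2009/store2",   # subProperty inference
]


def query_url(endpoint, person_uri):
    """Return a GET URL that submits the foaf:knows query to an endpoint."""
    query = QUERY_TEMPLATE.format(uri=person_uri)
    return endpoint + "?" + urlencode({"query": query})


# Hypothetical URI; replace it with the URI from your own FOAF file.
MY_URI = "http://example.org/me#i"
urls = [query_url(store, MY_URI) for store in STORES]
for url in urls:
    print(url)
```

Fetching both URLs and comparing the result sets shows which acquaintances are only found through inference.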
Complex triggers
LOAD the RDF file corresponding to each of the user's interests
Must be done to avoid the issues of distributed SPARQL querying
Trigger designed using SPARQL SELECT + SPARUL LOAD
For each loaded file, check for foaf:topic_interest values and load the resources they point to into the store
http://helloopenworld.net/semtech2009/files/ARC2_InterestLoadTrigger.phps
http://helloopenworld.net/semtech2009/store3
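The logic of such a trigger can be sketched as follows, in Python rather than PHP and over in-memory triples. This is an illustration of the idea, not the code of the ARC2 trigger linked above. The profile URI and the helper name interest_load_statements are hypothetical.

```python
FOAF_TOPIC_INTEREST = "http://xmlns.com/foaf/0.1/topic_interest"


def interest_load_statements(loaded_triples, already_loaded=()):
    """Return one LOAD statement per not-yet-loaded foaf:topic_interest value."""
    seen = set(already_loaded)
    statements = []
    for subject, predicate, obj in loaded_triples:
        if predicate == FOAF_TOPIC_INTEREST and obj not in seen:
            seen.add(obj)  # avoid loading the same resource twice
            statements.append(f"LOAD <{obj}>")
    return statements


# Triples as they might come out of a freshly loaded FOAF file (hypothetical).
profile = [
    ("http://example.org/me#i", FOAF_TOPIC_INTEREST,
     "http://dbpedia.org/resource/Reservoir_Dogs"),
    ("http://example.org/me#i", FOAF_TOPIC_INTEREST,
     "http://dbpedia.org/resource/Semantic_Web"),
    ("http://example.org/me#i", "http://xmlns.com/foaf/0.1/name", "Me"),
]
statements = interest_load_statements(profile)
for statement in statements:
    print(statement)
```

Once the descriptions of the interests sit in the same store as the profiles, a single local query can join both.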
SPARQL SELECT w/ Triggers
Advanced querying capabilities
http://helloopenworld.net/semtech2009/files/select3.sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?person
WHERE {
  ?person foaf:topic_interest [
    dbo:director <http://dbpedia.org/resource/Quentin_Tarantino>
  ] .
}
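What this query computes can be illustrated with a small in-memory join in Python: first find everything directed by the given director, then find the people interested in one of those things. The example.org people and film are hypothetical sample data.

```python
FOAF_TOPIC_INTEREST = "http://xmlns.com/foaf/0.1/topic_interest"
DBO_DIRECTOR = "http://dbpedia.org/ontology/director"
TARANTINO = "http://dbpedia.org/resource/Quentin_Tarantino"


def people_interested_in_films_by(triples, director):
    """Two-step join mirroring the pattern in the query."""
    # Step 1: resources whose dbo:director is the given director.
    films = {s for s, p, o in triples if p == DBO_DIRECTOR and o == director}
    # Step 2: people whose foaf:topic_interest is one of those resources.
    return sorted(
        {s for s, p, o in triples if p == FOAF_TOPIC_INTEREST and o in films}
    )


store = [
    ("http://example.org/alice", FOAF_TOPIC_INTEREST,
     "http://dbpedia.org/resource/Reservoir_Dogs"),
    ("http://dbpedia.org/resource/Reservoir_Dogs", DBO_DIRECTOR, TARANTINO),
    ("http://example.org/bob", FOAF_TOPIC_INTEREST,
     "http://example.org/SomeOtherFilm"),
    ("http://example.org/SomeOtherFilm", DBO_DIRECTOR,
     "http://example.org/SomeoneElse"),
]
fans = people_interested_in_films_by(store, TARANTINO)
print(fans)
```

The join only works because the interest descriptions were loaded into the same store as the profiles.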
Mash-ups & Advanced Topics
Applying and extending what we have learned
Displaying and rendering the Web of Data
End-user Interaction, UI
Read/write Web of Data
Writing Applications!
Exhibit faceted browsing
Exhibit
JavaScript library for faceted browsing
http://www.simile-widgets.org/exhibit/
Can be used directly on top of a SPARQL endpoint
Thanks to SPARQL CONSTRUCT and the Babel translation service
http://helloopenworld.net/semtech2009/files/construct2.sparql
http://helloopenworld.net/semtech2009/exhibit
http://microplanet.sioc-project.org/
– With geolocation services
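As an alternative illustration (not the CONSTRUCT + Babel route used in the demo), the sketch below converts SPARQL SELECT results in the standard SPARQL JSON results format into the {"items": [...]} structure that Exhibit reads. The variable names, the item type and the sample data are hypothetical.

```python
import json


def bindings_to_exhibit(sparql_json, label_var, item_type="Person"):
    """Turn SPARQL JSON result bindings into an Exhibit-style item list."""
    items = []
    for row in sparql_json["results"]["bindings"]:
        # Keep every bound variable as a plain property of the item.
        item = {var: cell["value"] for var, cell in row.items()}
        item["label"] = row[label_var]["value"]
        item["type"] = item_type
        items.append(item)
    return {"items": items}


sample = {
    "head": {"vars": ["name", "interest"]},
    "results": {"bindings": [
        {"name": {"type": "literal", "value": "Alice"},
         "interest": {"type": "uri",
                      "value": "http://dbpedia.org/resource/Semantic_Web"}},
    ]},
}
exhibit_data = bindings_to_exhibit(sample, label_var="name")
print(json.dumps(exhibit_data, indent=2))
```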
Ubiquity command
Ubiquity
Mozilla Firefox command line for the Web
http://ubiquity.mozilla.com/
Find people who like a given topic while browsing Wikipedia
(1) Query DBpedia to find the related URI
(2) Query the RDF store to identify related people
Total: 2 SPARQL queries!
http://helloopenworld.net/semtech2009/ubiquity
http://en.wikipedia.org/wiki/Reservoir_Dogs
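The two queries behind such a command might look like the sketch below, written in Python for readability (the actual Ubiquity command is JavaScript). It assumes that DBpedia links a resource to its Wikipedia page with foaf:page; both function names are hypothetical.

```python
def dbpedia_lookup_query(wikipedia_url):
    """Query (1): find the DBpedia resource for the page being browsed."""
    # Assumption: DBpedia states <resource> foaf:page <wikipedia page>.
    return (
        "PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n"
        f"SELECT ?resource WHERE {{ ?resource foaf:page <{wikipedia_url}> . }}"
    )


def interested_people_query(resource_uri):
    """Query (2): find people in the RDF store interested in that resource."""
    return (
        "PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n"
        f"SELECT ?person WHERE {{ ?person foaf:topic_interest <{resource_uri}> . }}"
    )


q1 = dbpedia_lookup_query("http://en.wikipedia.org/wiki/Reservoir_Dogs")
# The URI below stands in for the answer that query (1) would return.
q2 = interested_people_query("http://dbpedia.org/resource/Reservoir_Dogs")
print(q1)
print(q2)
```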
Mash-up tools
DERI Pipes, a tool for semantic mashups
http://pipes.deri.org
Sig.ma Mashup Maker
The Web of Data comes together
In an interactive mashup maker
Mashups can be embedded or queried programmatically
Geolocation mash-ups
Since data is interlinked, it’s easy to combine it
Can mix personal and public data, e.g. in organisations
Data directories
http://doapstore.org
A directory of software projects described with DOAP
Completely RDF-based
Ongoing work
Enabling write access on the Web of Data
“Pushback”
http://esw.w3.org/topic/PushBackDataToLegacySources
More demos at: http://ld2sd.deri.org/pushback/
Check out the code at
http://code.google.com/p/pushback/ and contribute!
Using RDFa to automate web forms
RDForms http://rdfs.org/ns/rdforms
Conclusion
Web of Data is a reality
Tools and technologies exist to
Create and describe data on the Web
Store and access data on the Web
Discover and query data on the Web
Build and expand your applications with data on the Web
Challenges
Technical issues such as scalability and usability
Social issues (trust, privacy, etc.)
Economic issues (building a critical mass)
Events
LDOW Series
http://events.linkeddata.org/
ESWC and ISWC
Major venues for academic research on the Semantic Web
SFSW: Scripting For the Semantic Web Workshop
– http://www.semanticscripting.org/
Triplification challenge
Applications enabling existing systems to become part of the Web of Data
http://triplify.org/Challenge/2009
Deadline June 30th, 2009
Feedback
Did you learn something during the tutorial?
Do you think you can now explain the Web of Data and build applications?
Which topics that you expected were not covered?
Feel free to discuss or contact us
#swig on irc.freenode.net