Semantic Web 101
Canberra Semantic Web meetup
David Ratcliffe & Armin Haller
CSIRO ICT Centre, Canberra
23rd April 2012
INFORMATION ENGINEERING LABORATORY
What is the Semantic Web?
“The Semantic Web is a web of data, in some ways like a global database” [1]
“The first step is putting data on the Web in a form that machines can naturally understand, or converting it to that form. This creates what I call a Semantic Web - a Web of data that can be processed directly or indirectly by machines” [2]
[1] http://www.w3.org/DesignIssues/Semantic.html
[2] Tim Berners-Lee, Weaving the Web. Harper, San Francisco, 1999.
CSIRO. Semantic Web 101 – Canberra Semantic Web meetup
Fundamentals: Semantic Web “Layer Cake”
Linked Open Data, RDFa, μ-Formats (microformats), Vocabularies, GRDDL, Triplestores, Apps!
URIs, RDF, RDFS, SPARQL, OWL
Uniform Resource Identifiers (URIs)
String of characters identifying a resource (which can be… anything!)
Syntax (absolute):
  <scheme> : <path> [? <query>] [# <fragment>]
Examples:
  ICT Centre website: http://www.ict.csiro.au/
  The actual ICT Centre itself: http://www.ict.csiro.au/resource#id
URIs are more general than URLs (and less so than IRIs!)
URIs include URNs (e.g., urn:oid:2.16.840)
Used extensively in the Linked Open Data web!
Many recommendations about URIs to come…
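The URI syntax on this slide can be explored with Python's standard library. This is a minimal sketch: `uri_parts` is a hypothetical helper name, and `urlsplit` reports the part after `//` separately as the authority, which the slide folds into `<path>`.

```python
from urllib.parse import urlsplit


def uri_parts(uri):
    """Split a URI into the components named in the syntax line above."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,  # host part; empty for URNs
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


# The two URI examples from the slide: an HTTP URI and a URN.
for example in ("http://www.ict.csiro.au/resource#id", "urn:oid:2.16.840"):
    print(example, "->", uri_parts(example))
```

Both are valid URIs, but only the first is also a URL that a browser can fetch.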
Resource Description Framework (RDF)
Data model - captures knowledge about resources (as URIs!)
Relates resources with predicates (named relationships)
  Australian War Memorial located_in Canberra
The ‘triple’ data model: subject, predicate, object
All knowledge can be encoded using triples like this in RDF
  Canberra located_in ACT
  (…‘subjects’ can be ‘objects’ – links resources!)
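The triple model can be sketched with plain tuples. The short names below stand in for full URIs, and `Triple` and `follow` are hypothetical names chosen for illustration. The point is that the object of one triple can be the subject of the next, which is what links resources together.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# The two example statements from the slide.
GRAPH = [
    Triple("Australian_War_Memorial", "located_in", "Canberra"),
    Triple("Canberra", "located_in", "ACT"),
]


def follow(graph, start, predicate):
    """Walk one predicate from resource to resource, collecting every stop.

    Only the first matching triple is followed at each step; the walk
    stops when nothing matches or a resource repeats.
    """
    path = []
    current = start
    while True:
        nxt = [t.obj for t in graph
               if t.subject == current and t.predicate == predicate]
        if not nxt or nxt[0] in path:
            return path
        current = nxt[0]
        path.append(current)


print(follow(GRAPH, "Australian_War_Memorial", "located_in"))
```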
Resource Description Framework (RDF)
XML Serialization
Australian War Memorial located_in Canberra
Australian War Memorial unveiled 1941 (integer data type)
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:obo="http://purl.obolibrary.org/obo/OBO_REL#"
           xmlns:dbpprop="http://dbpedia.org/property/">
    <rdf:Description rdf:about="http://dbpedia.org/resource/Australian_War_Memorial">
      <obo:_located_in>
        <rdf:Description rdf:about="http://dbpedia.org/resource/Canberra"/>
      </obo:_located_in>
      <dbpprop:unveiled rdf:datatype="http://www.w3.org/2001/XMLSchema#int">1941</dbpprop:unveiled>
    </rdf:Description>
  </rdf:RDF>
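To see that the serialization really encodes the two triples, it can be read back with a generic XML parser. This is only a sketch: it handles just the nesting pattern used on this slide (with the rdf:Description element properly closed) and is not a general RDF/XML parser. `triples_from_rdfxml` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RDF_XML = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:obo="http://purl.obolibrary.org/obo/OBO_REL#"
         xmlns:dbpprop="http://dbpedia.org/property/">
  <rdf:Description rdf:about="http://dbpedia.org/resource/Australian_War_Memorial">
    <obo:_located_in>
      <rdf:Description rdf:about="http://dbpedia.org/resource/Canberra"/>
    </obo:_located_in>
    <dbpprop:unveiled rdf:datatype="http://www.w3.org/2001/XMLSchema#int">1941</dbpprop:unveiled>
  </rdf:Description>
</rdf:RDF>
"""


def triples_from_rdfxml(text):
    """Extract (subject, predicate, object) tuples from the simple layout above."""
    root = ET.fromstring(text)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree tags look like '{namespace}local'; join them into one URI.
            predicate = prop.tag[1:].replace("}", "", 1)
            nested = prop.find(f"{{{RDF_NS}}}Description")
            if nested is not None:
                obj = nested.get(f"{{{RDF_NS}}}about")  # object is a resource
            else:
                obj = (prop.text or "").strip()  # object is a literal
            triples.append((subject, predicate, obj))
    return triples


for triple in triples_from_rdfxml(RDF_XML):
    print(triple)
```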
RDF Graph
Collection of triples referencing common resource(s) = a graph
  Australian War Memorial located_in Canberra
  Canberra located_in ACT
  Australian War Memorial unveiled 1941
[Figure: graph with the nodes Australian_War_Memorial, Canberra and ACT joined by _located_in edges, plus the literal {1941}]
RDF ‘Triplestore’ – Database for triples
Triples are stored as graphs
Triplestores not only store triples, but also let you retrieve them in interesting ways (e.g., via SPARQL queries)
[Figure: the same RDF graph as on the previous slide]
                   | Relational Database                                        | RDF Triplestore
Data model         | Relational data (tables)                                   | RDF graphs
Data instances     | Records in tables                                          | RDF triples
Query support      | SQL                                                        | SPARQL
Indexing mechanism | Optimized for evaluating queries as relational expressions | Optimized for evaluating queries as graph patterns
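What "indexing for graph patterns" might look like can be sketched in a few lines. `TinyTripleStore` is a hypothetical toy, not how any particular triplestore is implemented: it keeps one index per triple position so that a pattern with some positions fixed is answered by intersecting index entries rather than scanning every triple.

```python
from collections import defaultdict


class TinyTripleStore:
    """Toy in-memory store with one index per triple position."""

    def __init__(self):
        self.triples = set()
        self.by_subject = defaultdict(set)
        self.by_predicate = defaultdict(set)
        self.by_object = defaultdict(set)

    def add(self, s, p, o):
        triple = (s, p, o)
        self.triples.add(triple)
        self.by_subject[s].add(triple)
        self.by_predicate[p].add(triple)
        self.by_object[o].add(triple)

    def match(self, s=None, p=None, o=None):
        """Return all triples matching the pattern; None acts as a wildcard."""
        candidate_sets = []
        if s is not None:
            candidate_sets.append(self.by_subject.get(s, set()))
        if p is not None:
            candidate_sets.append(self.by_predicate.get(p, set()))
        if o is not None:
            candidate_sets.append(self.by_object.get(o, set()))
        if not candidate_sets:
            return set(self.triples)
        return set.intersection(*candidate_sets)


store = TinyTripleStore()
store.add("Australian_War_Memorial", "located_in", "Canberra")
store.add("Canberra", "located_in", "ACT")
store.add("Australian_War_Memorial", "unveiled", 1941)

print(store.match(p="located_in"))
print(store.match(s="Australian_War_Memorial", p="unveiled"))
```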
SPARQL Protocol and RDF Query Language (SPARQL)
Language for querying (and constructing new) RDF graphs
A W3C Recommendation (SPARQL 1.0, January 2008; a SPARQL 1.1 Working Draft was released in January 2012)
Syntax similar to SQL…
SPARQL – Anatomy (simple SELECT)
  PREFIX dbpedia: <http://dbpedia.org/resource/>
  PREFIX obo: <http://purl.obolibrary.org/obo/OBO_REL#>
  SELECT ?l
  WHERE {
    dbpedia:Australian_War_Memorial obo:_located_in ?l .
  }
PREFIX: Define a prefix for a URI to make SPARQL more readable
SELECT: Variables (columns) in the result (here, ?l)
WHERE: The graph pattern to look for involving the selected variables… “Where [?l] is the Australian_War_Memorial _located_in?”
SPARQL: Querying an RDF graph
e.g. “Where is the Australian_War_Memorial _located_in?”
  PREFIX dbpedia: <http://dbpedia.org/resource/>
  PREFIX obo: <http://purl.obolibrary.org/obo/OBO_REL#>
  SELECT ?l
  WHERE {
    dbpedia:Australian_War_Memorial obo:_located_in ?l .
  }
Result:
  ?l
  http://dbpedia.org/resource/Canberra
[Figure: the RDF graph with nodes Australian_War_Memorial, Canberra, ACT and the literal {1941}; the query pattern matches the _located_in edge from Australian_War_Memorial to Canberra]
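How the WHERE clause produces that result can be sketched as pattern matching with variable binding. This is a simplified illustration of matching a single triple pattern, not a SPARQL engine. `match_pattern` is a hypothetical name, and the URI used for ACT is an assumption (the slides only say "ACT").

```python
AWM = "http://dbpedia.org/resource/Australian_War_Memorial"
CANBERRA = "http://dbpedia.org/resource/Canberra"
ACT = "http://dbpedia.org/resource/Australian_Capital_Territory"  # assumed URI
LOCATED_IN = "http://purl.obolibrary.org/obo/OBO_REL#_located_in"
UNVEILED = "http://dbpedia.org/property/unveiled"

GRAPH = [
    (AWM, LOCATED_IN, CANBERRA),
    (CANBERRA, LOCATED_IN, ACT),
    (AWM, UNVEILED, "1941"),
]


def match_pattern(graph, pattern):
    """Yield one dict of variable bindings per triple that fits the pattern.

    Pattern terms starting with '?' are variables; all others must match exactly.
    """
    for triple in graph:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice must bind to the same value both times.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield binding


# Equivalent of: SELECT ?l WHERE { dbpedia:Australian_War_Memorial obo:_located_in ?l . }
rows = list(match_pattern(GRAPH, (AWM, LOCATED_IN, "?l")))
print(rows)
```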
RDF Schema (RDFS)
Describes the structure of triples in particular graphs
Assigns semantics to RDF triples (…just as E-R does for relational data!)
Used in describing vocabularies and ontologies over RDF
RDFS: Structure + Semantics over RDF
Class (set) of instances (set members):
E.g., class ‘City’ – the set of all cities
[Figure: the class City with Canberra as an instance]
RDFS: Structure + Semantics over RDF
Subclass (subset) of a class (set) of instances (set members):
Class ‘CapitalCity’ (subclass of ‘City’) – defined as the set of all cities that are the capital of a federated state
Class ‘StateCapital’ (subclass of ‘City’) – defined as the set of all cities that are capitals of a state within a federated state
[Figure: City containing the subclasses CapitalCity and StateCapital, with Canberra as an instance; the overlap is labelled “Cities that are CapitalCity and StateCapital”]
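The subset reading of subclasses can be sketched as a small inference step: every instance of a subclass is also an instance of its superclasses. The function names are hypothetical, and typing Canberra as a CapitalCity is an assumption made for illustration.

```python
# (subclass, superclass) pairs taken from the slide.
SUBCLASS_OF = {
    ("CapitalCity", "City"),
    ("StateCapital", "City"),
}

# Asserted class memberships (assumed for illustration).
ASSERTED_TYPES = {"Canberra": {"CapitalCity"}}


def superclasses(cls, subclass_of):
    """All classes reachable from cls by following subclass links."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for sub, sup in subclass_of:
            if sub == current and sup not in seen:
                seen.add(sup)
                stack.append(sup)
    return seen


def inferred_types(instance, asserted_types, subclass_of):
    """Asserted types plus everything implied by the subclass hierarchy."""
    types = set(asserted_types.get(instance, ()))
    for cls in list(types):
        types |= superclasses(cls, subclass_of)
    return types


print(inferred_types("Canberra", ASSERTED_TYPES, SUBCLASS_OF))
```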
RDFS: Structure + Semantics over RDF
Properties (set × set) of instances (set members) over classes:
  observes: Sensor → Observation
[Figure: _located_in relating City (with subclasses StateCapital and CapitalCity) to Federated State; instances shown: Canberra, ACT]
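A property declared over two classes lets a reasoner infer class membership from its use: in RDFS, domain and range are used for inference rather than for validation. A minimal sketch follows; the instance names `thermometer_1` and `reading_42` are hypothetical.

```python
# Property declaration from the slide: observes: Sensor -> Observation
PROPERTY_SCHEMA = {
    "observes": {"domain": "Sensor", "range": "Observation"},
}


def infer_types(triples, schema):
    """Infer class membership of subjects and objects from domain and range."""
    types = {}
    for s, p, o in triples:
        declaration = schema.get(p)
        if declaration is None:
            continue  # nothing is known about this property
        types.setdefault(s, set()).add(declaration["domain"])
        types.setdefault(o, set()).add(declaration["range"])
    return types


DATA = [("thermometer_1", "observes", "reading_42")]
print(infer_types(DATA, PROPERTY_SCHEMA))
```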
Web Ontology Language (OWL)
More (+ expressive) structure and semantics…
  FederalStateCapital = CapitalCity ∩ StateCapital
[Figure: _located_in relating City (with subclasses StateCapital and CapitalCity) to Federated State]
That’s enough…
... for putting data openly on the Web …
and linking it to other data …
a.k.a. Linked Open Data
Building a web of data
Semantic Web done right
Combination of openness with data + open standards
Uses RDF with de-referenceable URIs → not only to identify data, but actually pointing to data about the entity
What is this all useful for?
[Figure: the words “Linked”, “Open” and “Data”]
Motivation
Linking data paves the way for all kinds of improvements in: search, filtering, automation…
Real Business Value
“I found that RDFa was a much more stable concept – based on the use of long established vocabularies (also known as ontologies) that have existed for years. … Within just a couple of months, we began to see an increase in our organic search results … by 30% over historical rates. We also saw an increase in our click-through rate.” [Jay Myers, Lead Development Engineer, Best Buy]
“2012: The Year of the Semantic Web” [Steve Hamby, CTO, Orbis Technologies, Inc.]
Enormous amount of linked (RDF) data already available
• Many vocabularies to get your data described and linked into this world
• Estimated tens (to hundreds) of billions of triple statements
• Now (relatively) easier for people to join in!
LOD ‘Cloud’ Growth: September 2010
[Richard Cyganiak, Anja Jentzsch, Linking Open Data cloud diagram, 2010]
LOD ‘Cloud’ Growth: September 2011
[Richard Cyganiak, Anja Jentzsch, Linking Open Data cloud diagram, 2011]
The Four Linked Open Data Principles
Tim Berners-Lee outlined 4 principles for publishing Linked Open Data:
1. Use URIs for names of things
2. Use HTTP URIs so that people can look up those names
3. When someone looks up an HTTP URI, provide useful information (using the standards, like RDF)
4. Include links to other URIs so that a user can discover more things.
Source: http://www.w3.org/DesignIssues/LinkedData.html
Use URIs for names of things – Example:
Large US retailer
Loads of different products for sale!
Identify things to reference with a URI. Data include:
• Product categories (computers, video games, televisions, digital cameras, mp3 players, mobile phones, appliances, etc…)
• Product data: price, model #, ID, stock levels, rating, reviews, images, etc…
• Store data: names, locations, phone #, hours, ratings/reviews, events, etc…
Source: http://www.readwriteweb.com/archives/how_best_buy_is_using_the_semantic_web.php
Choose URIs for things in your data
Which URI to use for the location Carbondale, Illinois, USA?
  http://www.ci.carbondale.il.us/
  http://en.wikipedia.org/wiki/Carbondale,_Illinois
Problem: these are webpages about Carbondale, IL!
Instead, we want some identifier for the town Carbondale, IL.
DBpedia has an RDF Resource URI based on info from Wikipedia:
  http://dbpedia.org/resource/Carbondale,_Illinois
Linked Open Data Principle 2: Use HTTP URIs so that people can look up those names
Use de-referenceable HTTP URIs
DBpedia is a major Linked Open Data contributor!
So, what happens when we put http://dbpedia.org/resource/Carbondale,_Illinois into a browser?
The dbpedia.org server responds with an HTTP redirect:
  303 See Other
  Location: http://dbpedia.org/page/Carbondale,_Illinois
Content negotiation
The browser (client) performs content negotiation with the server: the Accept header in the HTTP request specified HTML, e.g.:
  Accept: text/*, text/html, text/html;level=1, */*
The client could have asked for RDF instead!
  Accept: application/rdf+xml
… or other syntaxes too (N3, Turtle, etc.):
  Accept: application/rdf+xml, text/rdf+n3, …
Resulting HTML page: http://dbpedia.org/page/Carbondale,_Illinois
Content negotiation
So, what happens when we ask for http://dbpedia.org/resource/Carbondale,_Illinois including:
  Accept: application/rdf+xml
...?
The dbpedia.org server responds with another, different redirect:
  303 See Other
  Location: http://dbpedia.org/data/Carbondale,_Illinois
The server may provide the triple data in RDF/XML format...
Resulting RDF data: http://dbpedia.org/data/Carbondale,_Illinois
Different URIs for different content!
Three different URIs for different data about the same thing:
1. An abstract identifier (RDF resource) for the ‘thing’ in question
2. An HTML page about that ‘thing’
3. Some RDF data about that ‘thing’
Example: Carbondale, IL on dbpedia.org:
1. Abstract identifier: http://dbpedia.org/resource/Carbondale,_Illinois
2. HTML page: http://dbpedia.org/page/Carbondale,_Illinois
3. RDF data: http://dbpedia.org/data/Carbondale,_Illinois
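The redirect behaviour described on these slides can be sketched from the server's side. This is an illustration of the idea only, not DBpedia's actual server logic: `negotiate_redirect` is a hypothetical function, the list of RDF media types is an assumption, and quality values (`q=`) in the Accept header are ignored for brevity.

```python
RESOURCE_PREFIX = "http://dbpedia.org/resource/"

# Media types treated as "the client wants RDF" (assumed list).
RDF_MEDIA_TYPES = ("application/rdf+xml", "text/rdf+n3")


def negotiate_redirect(resource_uri, accept_header):
    """Return (status, location) for a request to an abstract resource URI."""
    if not resource_uri.startswith(RESOURCE_PREFIX):
        raise ValueError("not a resource URI: " + resource_uri)
    name = resource_uri[len(RESOURCE_PREFIX):]
    # Drop parameters such as ';level=1' and compare bare media types.
    requested = [item.split(";")[0].strip() for item in accept_header.split(",")]
    wants_rdf = any(media_type in RDF_MEDIA_TYPES for media_type in requested)
    target = "data" if wants_rdf else "page"
    return 303, f"http://dbpedia.org/{target}/{name}"


uri = "http://dbpedia.org/resource/Carbondale,_Illinois"
print(negotiate_redirect(uri, "text/*, text/html, text/html;level=1, */*"))
print(negotiate_redirect(uri, "application/rdf+xml"))
```

One abstract identifier therefore leads to two different documents, depending on what the client asks for.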
Make sure your URIs are cool
Broken links are very uncool
URIs that don’t change are cool.
Should strive to maintain URIs! Even if content moves, the server should still respond appropriately.
Linked Open Data Principle 3: When someone looks up an HTTP URI, provide useful information (using the standards, like RDF)
Providing RDF on the Web
RDF data usually resides in an RDF database (triplestore)
…how do we ‘put it out’ on the web?
SPARQL endpoint (SPARQL query over HTTP)
• Direct connection to triplestore over HTTP
Publish RDF data in files on your web server
• Might not even need a triplestore…
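"SPARQL query over HTTP" means, in its simplest form, a GET request that carries the query text in a `query` parameter. The sketch below only builds such a request URL and sends nothing over the network. The endpoint address `http://dbpedia.org/sparql` is an assumption, and `sparql_get_url` is a hypothetical helper name.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ENDPOINT = "http://dbpedia.org/sparql"  # assumed endpoint address

QUERY = """PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX obo: <http://purl.obolibrary.org/obo/OBO_REL#>
SELECT ?l
WHERE {
  dbpedia:Australian_War_Memorial obo:_located_in ?l .
}"""


def sparql_get_url(endpoint, query):
    """Build the URL for a SPARQL query sent as an HTTP GET request."""
    return f"{endpoint}?{urlencode({'query': query})}"


url = sparql_get_url(ENDPOINT, QUERY)
print(url)

# Decoding the URL again recovers the original query text unchanged.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == QUERY)
```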
Exposing RDF on the web
Separate HTML content and RDF data?
Problem: Web data (HTML) and RDF data are separate.
[Figure: a web page (HTML) alongside separate RDF data (XML)]
Maintenance problem
• Both need to be managed separately
• RDF content and web content have much overlap (redundancy): duplication of content and effort = data integrity issues
• RDF/XML is difficult to author = extra overhead
Verification problem
• How to reconcile differences as content changes?
Visibility problem
• Easy to ignore the RDF stuff! (out of sight, out of mind)
Exposing RDF on the web
Embed RDF into your web content using RDFa
‘Embed’ RDF into HTML instead!
[Figure: the RDF data (XML) placed inside the web page (HTML)]
RDF – in attributes (RDFa)
Extra (RDFa) markup is ignored by web browsers.
Source: Peter Mika (Yahoo!), RDFa, 2011
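The idea of "RDF in attributes" can be sketched by pulling triples back out of a page. This is a deliberately simplified extractor, not a conformant RDFa processor: it only understands the `about`, `property` and `content` attributes, uses full URIs instead of prefixes, and does not track element nesting. The product page and the `example.org` URIs are hypothetical.

```python
from html.parser import HTMLParser

PAGE = """
<div about="http://example.org/products/42">
  <span property="http://purl.org/dc/terms/title">Example Camera</span>
  <span property="http://example.org/vocab/price" content="199.00">$199</span>
</div>
"""


class SimpleRDFaExtractor(HTMLParser):
    """Collect (subject, property, value) triples from RDFa-style attributes."""

    def __init__(self):
        super().__init__()
        self.subject = None   # most recent 'about' value
        self._pending = None  # property still waiting for its text value
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "about" in attributes:
            self.subject = attributes["about"]
        if "property" in attributes:
            if "content" in attributes:
                # An explicit machine-readable value overrides the visible text.
                self.triples.append(
                    (self.subject, attributes["property"], attributes["content"]))
            else:
                self._pending = attributes["property"]

    def handle_data(self, data):
        if self._pending is not None and data.strip():
            self.triples.append((self.subject, self._pending, data.strip()))
            self._pending = None


extractor = SimpleRDFaExtractor()
extractor.feed(PAGE)
for triple in extractor.triples:
    print(triple)
```

A browser renders the same page as ordinary text and ignores the extra attributes.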
RDFa is not the only solution…
…but is “one of the… what I call a gateway drug… to the Semantic Web.” [Jay Myers, Lead Development Engineer, Best Buy]
Linked Open Data Principle 4: Include links to other URIs so that a user can discover more things.
Link to existing vocabularies to describe data!
For describing classes (categories) and properties (relationships), try to re-use existing vocabularies
Easier to interoperate if we’re talking the same language!
Many vocabularies/ontologies out there:
• schema.org is a great place to start looking!
• Vocabs for products (GoodRelations), people (FOAF), social media (SIOC), places, events, businesses, e-commerce, music, etc., you name it…
If nothing relevant, you can:
• Hack existing vocabularies, then publish your dirty hacks on http://open.vocab.org/
• Create your own, but make sure you…
  - Publish it!
  - Reconcile (map) it to other vocabularies, if you can.
Link your data!
Linking similar entities (individuals/concepts) from different datasets (“entity resolution”)
Several ways, different semantics:
• owl:equivalentClass – strong assertion!
• owl:sameAs – strong assertion!
• rdfs:seeAlso – weak (loose) relation
• skos:closeMatch – weak (loose) relation
• skos:exactMatch – stronger than close match
• skos:related – weak semantic relation
Example:
  http://dbpedia.org/resource/Canberra owl:sameAs http://rdf.freebase.com/rdf/en.canberra
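A link like the Canberra example is itself just one more triple, with owl:sameAs as its predicate. The sketch below writes it in the line-based N-Triples syntax; `ntriple` and `same_as_links` are hypothetical helper names, and only URI-to-URI statements are handled (no literals).

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def ntriple(subject, predicate, obj):
    """Format one URI-to-URI statement as an N-Triples line."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def same_as_links(pairs):
    """Turn (uri_a, uri_b) pairs into owl:sameAs statements."""
    return [ntriple(a, OWL_SAME_AS, b) for a, b in pairs]


LINKS = same_as_links([
    ("http://dbpedia.org/resource/Canberra",
     "http://rdf.freebase.com/rdf/en.canberra"),
])
print("\n".join(LINKS))
```

Because owl:sameAs is a strong assertion, it is best generated only for pairs that have been checked to denote the same thing.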
Thank you!
ICT Centre
David Ratcliffe
PhD Student / Software Engineer
Phone: +61 2 6216 [email protected]
ICT Centre
Dr. Armin Haller
Office Manager, Australian W3C Office
Phone: +61 2 6216 [email protected]
ICT CENTRE, INFORMATION ENGINEERING LABORATORY