Semantic Web 101

Post on 27-Jan-2015


An extremely brief introduction to some Semantic Web technologies: URIs, XML, RDF, RDFS and OWL, SPARQL, and an introduction to Linked Open Data (LOD), principles and real-world applications. This presentation was given as a primer to talks about the Semantic Web at a Canberra Semantic Web Meetup Group meeting entitled "Semantic Web In Use" at the Australian War Memorial, BAE Systems Theatre on Monday 23rd April, 2012. http://www.meetup.com/Canberra-Semantic-Web-Meetup-Group/events/59032632/

Transcript of Semantic Web 101

David Ratcliffe & Armin Haller
CSIRO ICT Centre, Canberra
23rd April 2012

Semantic Web 101
Canberra Semantic Web meetup

INFORMATION ENGINEERING LABORATORY

What is the Semantic Web?

“The Semantic Web is a web of data, in some ways like a global database” [1]

“The first step is putting data on the Web in a form that machines can naturally understand, or converting it to that form. This creates what I call a Semantic Web - a Web of data that can be processed directly or indirectly by machines” [2]

1. http://www.w3.org/DesignIssues/Semantic.html

2. Tim Berners-Lee, Weaving the Web. Harper, San Francisco, 1999.

CSIRO. Semantic Web 101 – Canberra Semantic Web meetup

FundamentalzZzZz…

Fundamentals: Semantic Web “Layer Cake”

Linked Open Data, RDFa, microformats, vocabularies, GRDDL, triplestores, web applications!

URIs, XML, RDF, SPARQL, RDFS, OWL

Uniform Resource Identifiers (URIs)

A string of characters identifying a resource (which can be… anything!)

Syntax (absolute):

<scheme> : <path> [? <query>] [# <fragment>]

Examples:

ICT Centre website: http://www.ict.csiro.au/

The actual ICT Centre itself: http://www.ict.csiro.au/resource#id

URIs are more general than URLs (and less so than IRIs!)
URIs include URNs (e.g., urn:oid:2.16.840)

Used extensively in the Linked Open Data web!
Many recommendations about URIs to come…
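As a rough illustration of the URI syntax on this slide, the following Python sketch splits a URI into its parts with the standard library's `urllib.parse.urlsplit`. The helper name `describe_uri` is made up for this example and is not part of the talk.

```python
from urllib.parse import urlsplit

def describe_uri(uri):
    """Split a URI into the parts named in the syntax line of the slide."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,  # host portion, present for http URIs
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }

# The slide's identifier for the ICT Centre itself.
print(describe_uri("http://www.ict.csiro.au/resource#id"))
# A URN has a scheme and a path, but no authority.
print(describe_uri("urn:oid:2.16.840"))
```

The fragment (`id`) is what distinguishes the identifier for the Centre itself from the address of its website.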

Resource Description Framework (RDF)

Data model - captures knowledge about resources (as URIs!)

Relates resources with predicates (named relationships):
Australian War Memorial located_in Canberra

The ‘triple’ data model: subject, predicate, object

All knowledge can be encoded using triples like this in RDF

Canberra is located_in ACT
(…‘subjects’ can be ‘objects’ – links resources!)
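The triple model can be sketched in a few lines of Python. This is only an illustration: the short names stand in for full URIs, and the helpers `objects_of` and `follow` are hypothetical names chosen for this example.

```python
# Each triple is a (subject, predicate, object) tuple.
# The short names stand in for full URIs to keep the example readable.
triples = [
    ("Australian_War_Memorial", "located_in", "Canberra"),
    ("Canberra", "located_in", "ACT"),
]

def objects_of(subject, predicate, graph):
    """Return every object linked from `subject` by `predicate`."""
    return [o for (s, p, o) in graph if s == subject and p == predicate]

def follow(start, predicate, graph):
    """Walk a chain of triples: the object of one becomes the subject of the next.

    Only the first matching object is followed at each step.
    """
    chain = [start]
    current = start
    while True:
        found = objects_of(current, predicate, graph)
        if not found or found[0] in chain:
            return chain
        current = found[0]
        chain.append(current)

print(follow("Australian_War_Memorial", "located_in", triples))
```

Because Canberra is the object of the first triple and the subject of the second, following `located_in` links the memorial to the ACT.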

Resource Description Framework (RDF)

XML Serialization

Australian War Memorial located_in Canberra

Australian War Memorial unveiled 1941 (integer data type)

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:obo="http://purl.obolibrary.org/obo/OBO_REL#"
    xmlns:dbpprop="http://dbpedia.org/property/">
  <rdf:Description rdf:about="http://dbpedia.org/resource/Australian_War_Memorial">
    <obo:_located_in>
      <rdf:Description rdf:about="http://dbpedia.org/resource/Canberra"/>
    </obo:_located_in>
    <dbpprop:unveiled rdf:datatype="http://www.w3.org/2001/XMLSchema#int">1941</dbpprop:unveiled>
  </rdf:Description>
</rdf:RDF>
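To show how the RDF/XML on this slide maps back to triples, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. It is deliberately naive: it only understands the two shapes used on the slide (a nested `rdf:Description` and a plain literal) and is not a general RDF/XML parser. The function name `extract_triples` is an assumption of this example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DESCRIPTION = "{" + RDF_NS + "}Description"  # ElementTree's "{namespace}local" form
ABOUT = "{" + RDF_NS + "}about"

RDF_XML = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:obo="http://purl.obolibrary.org/obo/OBO_REL#"
    xmlns:dbpprop="http://dbpedia.org/property/">
  <rdf:Description rdf:about="http://dbpedia.org/resource/Australian_War_Memorial">
    <obo:_located_in>
      <rdf:Description rdf:about="http://dbpedia.org/resource/Canberra"/>
    </obo:_located_in>
    <dbpprop:unveiled rdf:datatype="http://www.w3.org/2001/XMLSchema#int">1941</dbpprop:unveiled>
  </rdf:Description>
</rdf:RDF>"""

def extract_triples(rdf_xml):
    """Naive reader for the shapes used above; not a full RDF/XML parser."""
    triples = []
    root = ET.fromstring(rdf_xml)
    for node in root.findall(DESCRIPTION):
        subject = node.get(ABOUT)
        for prop in node:
            # "{namespace}local" becomes the predicate URI "namespace" + "local".
            predicate = prop.tag[1:].replace("}", "", 1)
            nested = prop.find(DESCRIPTION)
            if nested is not None:
                obj = nested.get(ABOUT)          # object is another resource
            else:
                obj = (prop.text or "").strip()  # object is a literal
            triples.append((subject, predicate, obj))
    return triples

for triple in extract_triples(RDF_XML):
    print(triple)
```

In practice a dedicated RDF library would be used instead; the point here is only that each property element under a description yields one triple.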

RDF Graph

Collection of triples referencing common resource(s) = a graph

Australian War Memorial located_in Canberra

Canberra located_in ACT

Australian War Memorial unveiled 1941

[Figure: RDF graph of the three triples. Nodes: Australian_War_Memorial, Canberra, ACT and the literal {1941}; the edges between resources are labelled _located_in.]

RDF ‘Triplestore’ – Database for triples

Basically a database for triples, stored as graphs

Triplestores not only store triples, but give you ways of getting them out in interesting ways… (e.g., via queries – SPARQL)

Relational Database vs. RDF Triplestore

Data model: relational data (tables) vs. RDF graphs

Data instances: records in tables vs. RDF triples

Query support: SQL vs. SPARQL

Indexing mechanism: optimized for evaluating queries as relational expressions vs. optimized for evaluating queries as graph patterns
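To make the "indexing for graph patterns" idea more concrete, here is a toy in-memory store in Python. It is a sketch under simplifying assumptions, not how any particular triplestore is implemented: the class name `TinyTripleStore`, its two indexes and the short resource names are all invented for illustration.

```python
from collections import defaultdict

class TinyTripleStore:
    """In-memory store that indexes triples by subject and by predicate."""

    def __init__(self):
        self.triples = set()
        self.by_subject = defaultdict(set)
        self.by_predicate = defaultdict(set)

    def add(self, s, p, o):
        triple = (s, p, o)
        self.triples.add(triple)
        self.by_subject[s].add(triple)
        self.by_predicate[p].add(triple)

    def match(self, s=None, p=None, o=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        # Start from the smallest index that applies, then filter the rest.
        if s is not None:
            candidates = self.by_subject.get(s, set())
        elif p is not None:
            candidates = self.by_predicate.get(p, set())
        else:
            candidates = self.triples
        return sorted(
            t for t in candidates
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        )

store = TinyTripleStore()
store.add("Australian_War_Memorial", "located_in", "Canberra")
store.add("Canberra", "located_in", "ACT")
store.add("Australian_War_Memorial", "unveiled", "1941")
print(store.match(p="located_in"))
```

A pattern with a fixed subject or predicate is answered from an index rather than by scanning every triple, which is the rough analogue of a relational database choosing an index for a WHERE clause.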

SPARQL Protocol and RDF Query Language (SPARQL)

• Language for querying (and constructing new) RDF graphs
• A W3C Recommendation (SPARQL 1.0 since January 2008; version 1.1 Working Drafts released in January 2012)
• Syntax similar to SQL…

SPARQL – Anatomy (simple SELECT)

PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX obo: <http://purl.obolibrary.org/obo/OBO_REL#>
SELECT ?l
WHERE {
  dbpedia:Australian_War_Memorial obo:_located_in ?l .
}

PREFIX: defines a prefix for a URI to make SPARQL more readable

SELECT: variables (columns) in the result (here, ?l)

WHERE: the graph pattern to look for involving the selected variables… “Where [?l] is the Australian_War_Memorial _located_in?”

SPARQL: Querying an RDF graph

e.g. “Where is the Australian_War_Memorial _located_in?”

PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX obo: <http://purl.obolibrary.org/obo/OBO_REL#>
SELECT ?l
WHERE {
  dbpedia:Australian_War_Memorial obo:_located_in ?l .
}

Result:

?l
http://dbpedia.org/resource/Canberra

[Figure: the RDF graph with nodes Australian_War_Memorial, Canberra, ACT and the literal {1941}; the _located_in edge from Australian_War_Memorial to Canberra is the one matched by the query.]
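What the WHERE clause does can be sketched as matching triple patterns against the graph. The Python below is a simplified illustration of basic graph pattern matching, not a SPARQL engine: prefixed names are treated as plain strings, `dbpedia:ACT` is a placeholder name following the slide, and the functions `match_pattern` and `select` are hypothetical.

```python
GRAPH = [
    ("dbpedia:Australian_War_Memorial", "obo:_located_in", "dbpedia:Canberra"),
    ("dbpedia:Canberra", "obo:_located_in", "dbpedia:ACT"),
    ("dbpedia:Australian_War_Memorial", "dbpprop:unveiled", "1941"),
]

def is_var(term):
    """Variables are written with a leading '?', as in SPARQL."""
    return term.startswith("?")

def match_pattern(pattern, triple, binding):
    """Try to extend `binding` so that `pattern` equals `triple`; None on failure."""
    result = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            # Bind the variable, or check it against its existing value.
            if result.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return result

def select(variables, patterns, graph):
    """Evaluate a list of triple patterns and project the chosen variables."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in graph:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return [tuple(b[v] for v in variables) for b in bindings]

# The query from the slide: SELECT ?l WHERE { AWM _located_in ?l . }
print(select(["?l"], [("dbpedia:Australian_War_Memorial", "obo:_located_in", "?l")], GRAPH))
```

Adding a second pattern such as `("?l", "obo:_located_in", "?r")` reuses the binding for `?l`, which is how a query walks from the memorial to Canberra and on to the ACT.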

RDF Schema (RDFS)

• Describes how triples are structured into particular graphs
• Assigns semantics to RDF triples (…just as E-R does for relational data!)
• Used in describing vocabularies and ontologies over RDF

RDFS: Structure + Semantics over RDF

Class (set) of instances (set members): e.g., class ‘City’, the set of all cities

[Figure: the class City with the instance Canberra as a member]

RDFS: Structure + Semantics over RDF

Subclass (subset) of a class (set) of instances (set members):

• Class ‘CapitalCity’ (subclass of ‘City’): defined as the set of all cities that are the capital of a federated state
• Class ‘StateCapital’ (subclass of ‘City’): defined as the set of all cities that are capitals of a state within a federated state

Cities that are CapitalCity and StateCapital: Canberra

[Figure: the sets City, CapitalCity and StateCapital, with the instance Canberra]

RDFS: Structure + Semantics over RDF

Properties (set × set) of instances (set members) over classes:

observes: Sensor → Observation

[Figure: the property _located_in drawn from the class City (with subclasses CapitalCity and StateCapital) to the class Federated State, and from the instance Canberra to the instance ACT]
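The class, subclass and property ideas from the RDFS slides can be sketched as simple inference rules. The Python below is an assumption-laden illustration: each class has at most one direct superclass, the dictionaries and function names are invented for this example, and the domain and range of `located_in` follow the slide's diagram (City to Federated State).

```python
# Direct subclass links (rdfs:subClassOf), one superclass per class for simplicity.
SUBCLASS_OF = {
    "CapitalCity": "City",
    "StateCapital": "City",
}

# Asserted class membership (rdf:type).
TYPES = {"Canberra": {"CapitalCity", "StateCapital"}}

# Property signature (rdfs:domain, rdfs:range), following the slide's diagram.
DOMAIN_RANGE = {"located_in": ("City", "FederatedState")}

def superclasses(cls, subclass_of):
    """All classes reachable from `cls` by following subclass links."""
    found = []
    while cls in subclass_of and subclass_of[cls] not in found:
        cls = subclass_of[cls]
        found.append(cls)
    return found

def inferred_types(instance, types, subclass_of):
    """Asserted types plus every superclass of those types."""
    result = set(types.get(instance, ()))
    for cls in list(result):
        result.update(superclasses(cls, subclass_of))
    return result

def types_from_property(triple, domain_range):
    """A triple's subject belongs to the property's domain, its object to the range."""
    s, p, o = triple
    domain, rng = domain_range[p]
    return {s: domain, o: rng}

print(inferred_types("Canberra", TYPES, SUBCLASS_OF))
print(types_from_property(("Canberra", "located_in", "ACT"), DOMAIN_RANGE))
```

Nothing states directly that Canberra is a City; it follows from the subclass links, and independently from the domain of `located_in`.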

Web Ontology Language (OWL)

More (+ expressive) structure and semantics…

FederalStateCapital = CapitalCity ∩ StateCapital

[Figure: the class City (with subclasses CapitalCity and StateCapital) linked to the class Federated State by _located_in]
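Read as sets, the OWL class definition on this slide is an intersection. A tiny Python sketch, with made-up membership lists chosen only so the example has something to intersect:

```python
# Illustrative members only; these sets are assumptions, not data from the talk.
CAPITAL_CITY = {"Canberra", "Washington_DC"}
STATE_CAPITAL = {"Canberra", "Sydney", "Melbourne"}

def intersection_class(*classes):
    """Members of a class defined as the intersection of other classes."""
    members = set(classes[0])
    for cls in classes[1:]:
        members &= cls
    return members

FEDERAL_STATE_CAPITAL = intersection_class(CAPITAL_CITY, STATE_CAPITAL)
print(FEDERAL_STATE_CAPITAL)
```

RDFS can say that one class is a subclass of another, but defining a class in terms of an intersection like this needs the extra expressiveness of OWL.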

That’s enough…

What is this all useful for?

... for putting data openly on the Web …

and linking it to other data …

a.k.a. Linked Open Data
• Building a web of data
• Semantic Web done right
• Combination of openness with data + open standards
• Uses RDF with de-referenceable URIs → not only to identify data, but actually pointing to data about the entity

Motivation

Linking data paves the way for all kinds of improvements in: search, filtering, automation…

Real Business Value

“I found that RDFa was a much more stable concept – based on the use of long established vocabularies (also known as ontologies) that have existed for years. …Within just a couple of months, we began to see an increase in our organic search results … by 30% over historical rates. We also saw an increase in our click-through rate.” [Jay Myers, Lead Development Engineer, Best Buy]

“2012: The Year of the Semantic Web” [Steve Hamby, CTO, Orbis Technologies, Inc.]

Enormous amount of linked (RDF) data already available
• Many vocabularies to get your data described and linked into this world
• Estimated tens (to hundreds) of billions of triple statements
• Now (relatively) easier for people to join in!

LOD ‘Cloud’ Growth: September 2010

[Richard Cyganiak, Anja Jentzsch, Linking Open Data cloud diagram, 2010]


LOD ‘Cloud’ Growth: September 2011

[Richard Cyganiak, Anja Jentzsch, Linking Open Data cloud diagram, 2011]


The Four Linked Open Data Principles

Tim Berners-Lee outlined 4 principles for publishing Linked Open Data:

1. Use URIs for names of things
2. Use HTTP URIs so that people can look up those names
3. When someone looks up an HTTP URI, provide useful information (using the standards, like RDF)
4. Include links to other URIs so that a user can discover more things.

Source: http://www.w3.org/DesignIssues/LinkedData.html

Use URIs for names of things – Example:

Large US retailer
Loads of different products for sale!

Data include:
• Product categories (computers, videogames, televisions, digital cameras, mp3 players, mobile phones, appliances, etc…)
• Product data: price, model #, ID, stock levels, rating, reviews, images, etc…
• Store data: names, locations, phone #, hours, ratings/reviews, events, etc…

Identify things to reference with a URI

Source: http://www.readwriteweb.com/archives/how_best_buy_is_using_the_semantic_web.php

Choose URIs for things in your data

Which URI to use for the location Carbondale, Illinois, USA?

Problem:
http://www.ci.carbondale.il.us/
http://en.wikipedia.org/wiki/Carbondale,_Illinois

These are webpages about Carbondale, IL!

Instead, we want some identifier for the town Carbondale, IL.

DBpedia has an RDF Resource URI based on info from Wikipedia:

http://dbpedia.org/resource/Carbondale,_Illinois

The Four Linked Open Data Principles

2. Use HTTP URIs so that people can look up those names

Use de-referenceable HTTP URIs

DBpedia is a major Linked Open Data contributor!

So, what happens when we put http://dbpedia.org/resource/Carbondale,_Illinois into a browser?

The dbpedia.org server responds with an HTTP redirect:

303 See Other
Location: http://dbpedia.org/page/Carbondale,_Illinois

Content negotiation

The browser (client) performs content negotiation with the server: the Accept header in the HTTP request specified HTML, e.g.:

Accept: text/*, text/html, text/html;level=1, */*

The client could have asked for RDF instead!

Accept: application/rdf+xml

… or other syntaxes too (N3, Turtle, etc.):

Accept: application/rdf+xml, text/rdf+n3, …

[Screenshot: the HTML page at http://dbpedia.org/page/Carbondale,_Illinois]

Content negotiation

So, what happens when we ask for http://dbpedia.org/resource/Carbondale,_Illinois including:

Accept: application/rdf+xml

...?

The dbpedia.org server responds with another, different redirect:

303 See Other
Location: http://dbpedia.org/data/Carbondale,_Illinois

The server may provide the triple data in RDF/XML format...

[Screenshot: the RDF data at http://dbpedia.org/data/Carbondale,_Illinois]

Different URIs for different content!

Three different URIs for different data about the same thing:
1. An abstract identifier (RDF resource) for the ‘thing’ in question
2. An HTML page about that ‘thing’
3. Some RDF data about that ‘thing’

Example: Carbondale, IL on dbpedia.org:
1. Abstract identifier: http://dbpedia.org/resource/Carbondale,_Illinois
2. HTML page: http://dbpedia.org/page/Carbondale,_Illinois
3. RDF data: http://dbpedia.org/data/Carbondale,_Illinois
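The redirect behaviour described on these slides can be modelled in a few lines of Python. This is a simplified sketch of the idea, not DBpedia's actual server logic: the function `negotiate` ignores q-values and only distinguishes RDF/XML from everything else, and `rdf_request` merely builds a request object (using the standard library's `urllib.request.Request`) without sending it.

```python
from urllib.request import Request

def negotiate(resource_uri, accept_header):
    """Pick a 303 redirect target for a /resource/ URI from the Accept header.

    Simplified model: q-values are ignored and only RDF/XML is recognised.
    """
    name = resource_uri.rsplit("/resource/", 1)[1]
    media_types = [part.split(";")[0].strip() for part in accept_header.split(",")]
    if "application/rdf+xml" in media_types:
        location = "http://dbpedia.org/data/" + name
    else:
        location = "http://dbpedia.org/page/" + name
    return 303, location

def rdf_request(uri):
    """Client side: a request that asks for RDF/XML instead of HTML (not sent here)."""
    return Request(uri, headers={"Accept": "application/rdf+xml"})

URI = "http://dbpedia.org/resource/Carbondale,_Illinois"
print(negotiate(URI, "text/*, text/html, text/html;level=1, */*"))
print(negotiate(URI, rdf_request(URI).get_header("Accept")))
```

The same abstract identifier leads a browser to the `/page/` document and an RDF-aware client to the `/data/` document.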

Make sure your URIs are cool

• Broken links are very uncool
• URIs that don’t change are cool.
• Should strive to maintain URIs! Even if content moves, the server should still respond appropriately.

The Four Linked Open Data Principles

3. When someone looks up an HTTP URI, provide useful information (using the standards, like RDF)

Providing RDF on the Web

RDF data usually resides in an RDF database (triplestore)

…how do we ‘put them out’ on the web?

SPARQL endpoint (SPARQL query over HTTP)
• Direct connection to triplestore over HTTP

Publish RDF data in files on your web server
• Might not even need a triplestore…
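A SPARQL query over HTTP is, in its simplest form, a GET request with the query text in a `query` parameter. The Python sketch below only builds (and reads back) such a URL with the standard library; it does not contact any server. The endpoint address `http://dbpedia.org/sparql` is an assumption of this example, and the helper names are hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

ENDPOINT = "http://dbpedia.org/sparql"  # assumed endpoint address, for illustration

QUERY = """PREFIX dbpedia: <http://dbpedia.org/resource/>
PREFIX obo: <http://purl.obolibrary.org/obo/OBO_REL#>
SELECT ?l
WHERE {
  dbpedia:Australian_War_Memorial obo:_located_in ?l .
}"""

def sparql_get_url(endpoint, query):
    """Build the URL for a SPARQL query sent as an HTTP GET request."""
    return endpoint + "?" + urlencode({"query": query})

def query_from_url(url):
    """Recover the query text from such a URL (what the endpoint would read)."""
    return parse_qs(urlsplit(url).query)["query"][0]

url = sparql_get_url(ENDPOINT, QUERY)
print(url)
```

Percent-encoding the query keeps characters such as `?`, `#` and `<` from being mistaken for parts of the URL itself.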

Exposing RDF on the web

Separate HTML content and RDF data?

Problem: Web data (HTML) and RDF data are separate.

[Figure: a web page (HTML) and its RDF data (XML) as two separate documents]

Maintenance problem
• Both need to be managed separately
• RDF content and web content have much overlap (redundancy): duplication of content, effort = data integrity issues
• RDF/XML difficult to author = extra overhead

Verification problem
• How to reconcile differences as content changes?

Visibility problem
• Easy to ignore the RDF stuff! (out of sight, out of mind)

Exposing RDF on the web

Embed RDF into your web content using RDFa

‘Embed’ RDF into HTML instead!

[Figure: the RDF data (XML) embedded inside the web page (HTML)]

RDF – in attributes (RDFa)

Extra (RDFa) markup is ignored by web browsers.

Source: Peter Mika (Yahoo!), RDFa, 2011

RDFa is not the only solution…

…but is “one of the… what I call a gateway drug… to the Semantic Web.” Jay Myers, Lead Development Engineer for Best Buy
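To give a feel for "RDF in attributes", the Python sketch below reads a triple out of a small HTML fragment using the standard library's `html.parser`. Both the HTML snippet and the class `NaiveRDFaReader` are invented for illustration; the reader handles only the `about` and `property` attributes and is nowhere near a conforming RDFa processor.

```python
from html.parser import HTMLParser

# Hypothetical page fragment: ordinary HTML with two extra RDFa attributes.
HTML_WITH_RDFA = """
<div about="http://dbpedia.org/resource/Australian_War_Memorial">
  The memorial was unveiled in
  <span property="http://dbpedia.org/property/unveiled">1941</span>.
</div>
"""

class NaiveRDFaReader(HTMLParser):
    """Collects (subject, property, text) for elements with a `property` attribute."""

    def __init__(self):
        super().__init__()
        self.subject = None   # value of the most recent `about` attribute
        self.pending = None   # property whose text content is still to be read
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "about" in attrs:
            self.subject = attrs["about"]
        if "property" in attrs:
            self.pending = attrs["property"]

    def handle_data(self, data):
        # The first non-empty text after a `property` tag is taken as the literal.
        if self.pending and data.strip():
            self.triples.append((self.subject, self.pending, data.strip()))
            self.pending = None

reader = NaiveRDFaReader()
reader.feed(HTML_WITH_RDFA)
print(reader.triples)
```

A browser renders the sentence and ignores the extra attributes, while an RDFa-aware client can recover the statement from the same markup, so the page and its data no longer need to be maintained separately.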

The Four Linked Open Data Principles

4. Include links to other URIs so that a user can discover more things.

Link to existing vocabularies to describe data!

For describing classes (categories) and properties (relationships), try to re-use existing vocabularies
• Easier to interoperate if we’re talking the same language!

Many vocabularies/ontologies out there:
• schema.org is a great place to start looking!
• Vocabs for products (GoodRelations), people (FOAF), social media (SIOC), places, events, businesses, e-commerce, music, etc., you name it…

If nothing relevant, you can:
• Hack existing vocabularies, then publish your dirty hacks on http://open.vocab.org/
• Create your own, but make sure you…
  • Publish it!
  • Reconcile (map) it to other vocabularies, if you can.

Link your data!

Linking similar entities (individuals/concepts) from different datasets (“entity resolution”)

Several ways, different semantics:
• owl:equivalentClass – strong assertion!
• owl:sameAs – strong assertion!
• rdfs:seeAlso – weak (loose) relation
• skos:closeMatch – weak (loose) relation
• skos:exactMatch – stronger than close match
• skos:related – weak semantic relation

http://dbpedia.org/resource/Canberra owl:sameAs http://rdf.freebase.com/rdf/en.canberra
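One way to see why owl:sameAs is a strong assertion: a consumer may treat the two URIs as one resource and merge everything said about either. The Python sketch below does this with a small union-find. Only the sameAs link comes from the slide; the `located_in` and `label` triples, the short predicate names and the function names are made up for illustration.

```python
SAME_AS = "owl:sameAs"
DBPEDIA = "http://dbpedia.org/resource/Canberra"
FREEBASE = "http://rdf.freebase.com/rdf/en.canberra"

triples = [
    (DBPEDIA, SAME_AS, FREEBASE),        # the link from the slide
    (DBPEDIA, "located_in", "ACT"),      # illustrative statement
    (FREEBASE, "label", "Canberra"),     # illustrative statement
]

def find(parent, x):
    """Follow parent links to the representative of x's equivalence class."""
    while parent.get(x, x) != x:
        x = parent[x]
    return x

def merge_same_as(graph):
    """Rewrite all triples onto one representative URI per sameAs group."""
    parent = {}
    for s, p, o in graph:
        if p == SAME_AS:
            root_s, root_o = find(parent, s), find(parent, o)
            if root_s != root_o:
                parent[root_o] = root_s
    merged = set()
    for s, p, o in graph:
        if p != SAME_AS:
            merged.add((find(parent, s), p, find(parent, o)))
    return merged

for triple in sorted(merge_same_as(triples)):
    print(triple)
```

After merging, statements made in either dataset describe the same node, which is exactly why sameAs should not be used for resources that are merely similar; the weaker SKOS relations do not license this kind of merge.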

Thank you!

ICT Centre
David Ratcliffe
PhD Student / Software Engineer
Phone: +61 2 6216 7001
David.Ratcliffe@csiro.au

ICT Centre
Dr. Armin Haller
Office Manager, Australian W3C Office
Phone: +61 2 6216 7149
Armin@w3.org

ICT CENTRE, INFORMATION ENGINEERING LABORATORY