Linked data: Four rules and five stars for the Amsterdam Museum

Linked Data Victor de Boer Slide stolen from Christophe Gueret

description

Slides used for a guest lecture about Linked Data for the course "Knowledge and Media" at the VU Amsterdam (Nov 2011). The talk takes the practical example of converting Amsterdam Museum data to Five-star Linked Open Data.

Transcript of Linked data: Four rules and five stars for the Amsterdam Museum

Page 1: Linked data: Four rules and five stars for the Amsterdam Museum

Linked Data

Victor de Boer

Slide stolen from Christophe Gueret

Page 2: Linked data: Four rules and five stars for the Amsterdam Museum

Why Linked Data?

Page 3: Linked data: Four rules and five stars for the Amsterdam Museum

Why linked data (1/2)

Slide stolen from Christophe Gueret

Page 4: Linked data: Four rules and five stars for the Amsterdam Museum

Why linked data (2/2)

Slide stolen from Christophe Gueret

Page 5: Linked data: Four rules and five stars for the Amsterdam Museum

“Sharable, spreadable and nerd-friendly”

-- Charlotte S H Jensen, kulturweb

Page 6: Linked data: Four rules and five stars for the Amsterdam Museum
Page 7: Linked data: Four rules and five stars for the Amsterdam Museum

Four rules of Linked Data

1. Use URIs as names for things (Resources)

2. Use HTTP URIs so that people can look up those names. (Dereferencing)

3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)

4. Include links to other URIs, so that they can discover more things.

http://www.w3.org/DesignIssues/LinkedData.html

Page 8: Linked data: Four rules and five stars for the Amsterdam Museum

Linked Open Data five star system

★ Available on the web (whatever format), but with an open license

★★ Available as machine-readable structured data (e.g. Excel instead of an image scan of a table)

★★★ As (2), plus a non-proprietary format (e.g. CSV instead of Excel)

★★★★ All the above, plus: use open standards from W3C (RDF and SPARQL) to identify things, so that people can point at your stuff

★★★★★ All the above, plus: link your data to other people’s data to provide context

http://www.w3.org/DesignIssues/LinkedData.html
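The star levels are cumulative: each star only counts if all the earlier ones hold. A minimal Python sketch of that rule, assuming a hypothetical `Dataset` record whose field names are invented here for illustration:

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of a published dataset (illustrative field names)."""
    open_license: bool      # on the web, under an open license
    machine_readable: bool  # structured data, e.g. a spreadsheet rather than a scan
    open_format: bool       # non-proprietary format, e.g. CSV
    uses_rdf_uris: bool     # W3C standards (RDF, SPARQL) used to identify things
    links_out: bool         # links to other people's data


def star_rating(ds: Dataset) -> int:
    """Count stars cumulatively: stop at the first level that is not met."""
    levels = [ds.open_license, ds.machine_readable, ds.open_format,
              ds.uses_rdf_uris, ds.links_out]
    stars = 0
    for ok in levels:
        if not ok:
            break
        stars += 1
    return stars


csv_dump = Dataset(True, True, True, False, False)
print(star_rating(csv_dump))  # 3
```

An openly licensed CSV dump stops at three stars; the remaining two need RDF identifiers and outgoing links.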

Page 9: Linked data: Four rules and five stars for the Amsterdam Museum

Linked Data Cloud Diagram

Page 10: Linked data: Four rules and five stars for the Amsterdam Museum

May 2007

Page 11: Linked data: Four rules and five stars for the Amsterdam Museum

Oct 2007

Page 12: Linked data: Four rules and five stars for the Amsterdam Museum
Page 13: Linked data: Four rules and five stars for the Amsterdam Museum
Page 14: Linked data: Four rules and five stars for the Amsterdam Museum

“Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”

Page 15: Linked data: Four rules and five stars for the Amsterdam Museum

“Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”

Page 16: Linked data: Four rules and five stars for the Amsterdam Museum

Amsterdam Museum as Linked Open Data

Page 17: Linked data: Four rules and five stars for the Amsterdam Museum

Use case on how to transform “raw” XML data into 5-star Linked Open Data

Page 18: Linked data: Four rules and five stars for the Amsterdam Museum

Europeana

• “Europeana enables people to explore the digital resources of Europe's museums, libraries, archives and audio-visual collections.”

www.europeana.eu

From portal… to data aggregator.

Page 19: Linked data: Four rules and five stars for the Amsterdam Museum

Amsterdam Museum

• Formerly Amsterdam Historic Museum
  – “The rich collection of works of art, objects and archaeological finds brings to life the fortunes of Amsterdammers of days gone by and today.”

• In March 2010 published their whole collection online
  – 70,000 objects
  – CC license

• We converted their data to RDF

Page 20: Linked data: Four rules and five stars for the Amsterdam Museum

AM metadata

• Adlib database XML API

• Object metadata
  – 73,000 objects, 256MB
  – Nested XML

<record priref="10541">
  <acquisition.date>1997</acquisition.date>
  <dimension>
    <dimension.type>hoogte</dimension.type>
    <dimension.unit>cm</dimension.unit>
    <dimension.value>6</dimension.value>
  </dimension>
  …
</record>

• Concept Thesaurus
  – 27,000, 9MB
  – Different types (geo, motif, event)

<record priref="28024">
  <term>Kalverstraat 124</term>
  <broader_term>Kalverstraat</broader_term>
  <term.type>GEOKEYW </term.type>
</record>

• Person ‘Thesaurus’
  – 67,000 persons, 10MB
  – Consolidated from object metadata fields
  – Creators, annotators, reproduction creators, institutions

<record priref="6">
  <biography>boekverkoper en uitgever van cartografie</biography>
  <birth.date.start>1659</birth.date.start>
  <death.date.start>1733</death.date.start>
  <name>Aa, Pieter van der</name>
  <nationality>Nederlands</nationality>
  <use>Aa, Pieter van der (I)</use>
</record>
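Records like these can be read with a standard XML parser. A minimal Python sketch using the person record from this slide; the helper name `parse_record` is made up for illustration, and the flat dictionary is a simplification that would lose nested or repeated elements such as `<dimension>`:

```python
import xml.etree.ElementTree as ET

# The person record from the slide, with straight quotes so that it parses.
RECORD_XML = """
<record priref="6">
  <biography>boekverkoper en uitgever van cartografie</biography>
  <birth.date.start>1659</birth.date.start>
  <death.date.start>1733</death.date.start>
  <name>Aa, Pieter van der</name>
  <nationality>Nederlands</nationality>
  <use>Aa, Pieter van der (I)</use>
</record>
"""


def parse_record(xml_text):
    """Return (priref, {element name: text}) for a flat Adlib-style record."""
    record = ET.fromstring(xml_text.strip())
    fields = {child.tag: (child.text or "").strip() for child in record}
    return record.get("priref"), fields


priref, fields = parse_record(RECORD_XML)
print(priref, fields["name"], fields["birth.date.start"])
```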

Page 21: Linked data: Four rules and five stars for the Amsterdam Museum
Page 22: Linked data: Four rules and five stars for the Amsterdam Museum

Back to the four rules of Linked Data

1. Use URIs as names for things

2. Use HTTP URIs so that people can look up those names.

3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)

4. Include links to other URIs, so that they can discover more things.

http://www.w3.org/DesignIssues/LinkedData.html

Page 23: Linked data: Four rules and five stars for the Amsterdam Museum

How to make cool URIs

• Use HTTP://
• Use a namespace you control
• Unique, stable and persistent

• Don’t use:
  – Author name, subject, status, access, file name extension, software mechanism
  – C://MyDisk/awesome/VdeBoer/latest/cgi_bin/rembrandt.html

Page 24: Linked data: Four rules and five stars for the Amsterdam Museum

Amsterdam Museum URIs

• PURL basename: http://purl.org/collections/nl/am/

• Objects: Use “prirefs”, prefixed by “proxy-”
  – http://purl.org/collections/nl/am/proxy-63432

• Concepts & Persons: Use “prirefs”, prefixed by “p-” or “t-”
  – http://purl.org/collections/nl/am/p-201

• Properties (schema): Use XML element name
  – http://purl.org/collections/nl/am/acquisition.date

PS: am:p-1234 is a shorthand for http://purl.org/collections/nl/am/p-1234
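The URI scheme is mechanical enough to sketch in a few lines of Python. The function names are invented for illustration, and mapping “p-” to persons and “t-” to thesaurus concepts is an assumption read off the examples, not something the slide states outright:

```python
BASE = "http://purl.org/collections/nl/am/"

# Assumed mapping of record kinds to the prefixes named on the slide.
PREFIXES = {"object": "proxy-", "person": "p-", "concept": "t-"}
NAMESPACES = {"am": BASE}


def mint_uri(kind, priref):
    """Build a stable URI from the record kind and its priref."""
    return BASE + PREFIXES[kind] + priref.strip()


def property_uri(element_name):
    """Schema properties reuse the XML element name."""
    return BASE + element_name


def expand(curie):
    """Expand a shorthand such as am:p-1234 to the full URI."""
    prefix, local = curie.split(":", 1)
    return NAMESPACES[prefix] + local


print(mint_uri("object", "63432"))
print(property_uri("acquisition.date"))
print(expand("am:p-1234"))
```

Nothing in these URIs depends on author, file extension or server software, which is what keeps them stable.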

Page 25: Linked data: Four rules and five stars for the Amsterdam Museum

Again, the rules of Linked Data

1. Use URIs as names for things

2. Use HTTP URIs so that people can look up those names.

3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)

4. Include links to other URIs, so that they can discover more things.

http://www.w3.org/DesignIssues/LinkedData.html

Page 26: Linked data: Four rules and five stars for the Amsterdam Museum

RDF reminder

Triples: Subject Predicate Object

am:Rembrandt am:hasBirthdate “1606”

am:Rembrandt foaf:knows am:PiterLastman

am:PiterLastman am:wasBornIn geonames:Amsterdam

Graph: the same triples drawn as a graph, with am:Rembrandt, “1606”, am:PiterLastman and geonames:Amsterdam as nodes and am:hasBirthdate, foaf:knows and am:wasBornIn as labelled edges.
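To make the triple-versus-graph view concrete, here is a small Python sketch that stores the two link triples from this slide as plain tuples and follows them like edges. The `match` helper is a toy written for this example, not part of any RDF library:

```python
# Each triple is a (subject, predicate, object) tuple, using the slide's shorthand names.
GRAPH = [
    ("am:Rembrandt", "foaf:knows", "am:PiterLastman"),
    ("am:PiterLastman", "am:wasBornIn", "geonames:Amsterdam"),
]


def match(graph, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [t for t in graph
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


# Follow two edges: whom does Rembrandt know, and where were they born?
birthplaces = [place
               for _, _, person in match(GRAPH, s="am:Rembrandt", p="foaf:knows")
               for _, _, place in match(GRAPH, s=person, p="am:wasBornIn")]
print(birthplaces)  # ['geonames:Amsterdam']
```

Because the object of one triple is the subject of the next, a list of triples is already a graph.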

Page 27: Linked data: Four rules and five stars for the Amsterdam Museum

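RDF conversion

<record priref="19319 ">
  <date>1651</date>
  <maker>Rembrandt (1606-1669)</maker>
  <object.type>etsplaat</object.type>
  …
</record>

First graph: the record as a blank node with literal values

_:bn1 a am:Record
_:bn1 priref “19319 ”
_:bn1 date “1651”
_:bn1 maker “Rembrandt (1606-1669)”
_:bn1 object.type “etsplaat”

Second graph: the record with a URI, linked to person and concept resources

am:proxy-19319 a am:Record
am:proxy-19319 am:priref “19319 ”
am:proxy-19319 am:date “1651”
am:proxy-19319 am:maker am:p-1234
am:proxy-19319 am:object.type am:etsplaat

am:p-1234 a am:Person
am:p-1234 am:priref “1234”
am:p-1234 am:birthdate “1606”
am:p-1234 rda:name “Rembrandt”

am:etsplaat a skos:Concept
am:etsplaat skos:prefLabel “etsplaat”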
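The conversion on this slide can be sketched in Python as follows. This is an illustration under assumptions, not the conversion tool that was actually used: `PERSONS` and `CONCEPTS` are hypothetical lookup tables standing in for the person and concept thesauri, and the tuples do not distinguish literals from URIs the way a real RDF library would.

```python
import xml.etree.ElementTree as ET

AM = "http://purl.org/collections/nl/am/"

# The (abbreviated) record from the slide.
RECORD_XML = """<record priref="19319">
  <date>1651</date>
  <maker>Rembrandt (1606-1669)</maker>
  <object.type>etsplaat</object.type>
</record>"""

# Hypothetical lookup tables: literal value -> resource URI.
PERSONS = {"Rembrandt (1606-1669)": AM + "p-1234"}
CONCEPTS = {"etsplaat": AM + "etsplaat"}


def record_to_triples(xml_text):
    """Turn one flat record into (subject, predicate, object) tuples."""
    record = ET.fromstring(xml_text)
    priref = record.get("priref").strip()
    subject = AM + "proxy-" + priref           # objects get the "proxy-" prefix
    triples = [(subject, AM + "priref", priref)]
    for child in record:
        value = (child.text or "").strip()
        # Swap a literal for a resource URI when a thesaurus entry is known.
        if child.tag == "maker":
            value = PERSONS.get(value, value)
        elif child.tag == "object.type":
            value = CONCEPTS.get(value, value)
        triples.append((subject, AM + child.tag, value))  # property = element name
    return triples


for triple in record_to_triples(RECORD_XML):
    print(triple)
```

Values without a thesaurus entry simply stay literals, so the original metadata is kept intact.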

Page 28: Linked data: Four rules and five stars for the Amsterdam Museum

Architecture

[Diagram: ClioPatria at http://semanticweb.cs.vu.nl/, reached through the Purl.org redirect. Components shown: RDF(s) storage, Prolog, Logic, HTTP server, SPARQL, Web interface. Clients shown: SPARQL-app, Browser.]

Page 29: Linked data: Four rules and five stars for the Amsterdam Museum

How to access the data

• PURL 303 redirect to VU semantic layer
  – http://purl.org/collections/nl/am/proxy-63432 redirects to http://semanticweb.cs.vu.nl/europeana/browse/list_resource?r=http://purl.org/collections/nl/am/proxy-63432

• At our server: content negotiation
  – HTTP request text/html:
    • Local condensed view
    • Local full view
  – HTTP request application/rdf+xml:
    • rdf/xml “describe”

• SPARQL endpoint
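Content negotiation comes down to the Accept header of the HTTP request. A minimal Python sketch with the standard library; `build_request` and `fetch` are names made up here, and whether the servers still answer as they did in 2011 is not something this sketch assumes:

```python
import urllib.request


def build_request(uri, media_type):
    """Prepare a GET request that asks for one representation of the resource."""
    return urllib.request.Request(uri, headers={"Accept": media_type})


def fetch(uri, media_type, timeout=10):
    """Send the request; urlopen follows the PURL redirect automatically."""
    with urllib.request.urlopen(build_request(uri, media_type), timeout=timeout) as resp:
        return resp.geturl(), resp.headers.get("Content-Type"), resp.read()


uri = "http://purl.org/collections/nl/am/proxy-63432"
req = build_request(uri, "application/rdf+xml")
print(req.full_url, req.get_header("Accept"))
# fetch(uri, "text/html") would ask for the browsable view instead (needs network access).
```

The same URI serves both people and machines; only the Accept header differs.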

Page 30: Linked data: Four rules and five stars for the Amsterdam Museum

text/html

Page 31: Linked data: Four rules and five stars for the Amsterdam Museum

text/html

Page 32: Linked data: Four rules and five stars for the Amsterdam Museum

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ore: <http://www.openarchives.org/ore/terms/> .
@prefix ens: <http://www.europeana.eu/schemas/edm/> .
@prefix ahm: <http://purl.org/collections/nl/am/> .

ahm:proxy-66970
    a ore:Proxy ;
    ahm:title "Zegelstempel Felix Meritis"@nl ;
    ahm:material ahm:t-12463 , ahm:t-5447 ;
    ahm:objectCategory ahm:t-5504 ;
    ahm:objectName ahm:t-13817 , ahm:t-8489 ;
    ahm:objectNumber "KA 7653.1" ;
    ahm:priref "66970" .

ahm:proxy-66972
    a ore:Proxy ;
    ahm:acquisitionDate "0000" ;
    ahm:title "Zegelstempel mogelijk van familiewapen"@nl .

application/rdf+xml

Page 33: Linked data: Four rules and five stars for the Amsterdam Museum

http://semanticweb.cs.vu.nl/europeana/user/query

SPARQL
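As an illustration of what one might send to a SPARQL endpoint, the Python sketch below builds a GET request URL using the standard `query` parameter of the SPARQL protocol. The endpoint address is a placeholder (the URL on this slide is the interactive query form), and the query only uses the `ore:Proxy` type and `ahm:title` property seen in the data on the earlier slide:

```python
from urllib.parse import urlencode

# Placeholder endpoint; substitute the real SPARQL endpoint of the server.
ENDPOINT = "http://example.org/sparql"

QUERY = """
PREFIX ore: <http://www.openarchives.org/ore/terms/>
PREFIX ahm: <http://purl.org/collections/nl/am/>
SELECT ?object ?title WHERE {
  ?object a ore:Proxy ;
          ahm:title ?title .
} LIMIT 10
"""


def sparql_get_url(endpoint, query):
    """Encode the query into the 'query' parameter of a GET request URL."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = sparql_get_url(ENDPOINT, QUERY)
print(url)
```

The resulting URL can be fetched with any HTTP client, typically with an Accept header naming the desired result format.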

Page 34: Linked data: Four rules and five stars for the Amsterdam Museum

Again, the rules of Linked Data

1. Use URIs as names for things

2. Use HTTP URIs so that people can look up those names.

3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)

4. Include links to other URIs, so that they can discover more things.

http://www.w3.org/DesignIssues/LinkedData.html

Page 35: Linked data: Four rules and five stars for the Amsterdam Museum

Link to other sources

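am:proxy-19319 a am:Record
am:proxy-19319 am:priref “19319 ”
am:proxy-19319 am:date “1651”
am:proxy-19319 am:maker am:p-1234

am:p-1234 a am:Person
am:p-1234 am:priref “1234”
am:p-1234 am:birthdate “1606”
am:p-1234 rda:name “Rembrandt”

Viaf:RebrandtvanRijn a Viaf:Person
Viaf:RebrandtvanRijn Viaf:nationality “Dutch”
Viaf:RebrandtvanRijn rdfs:label “Rembrandt Harmensz. Van Rijn”

am:p-1234 owl:sameAs (?) Viaf:RebrandtvanRijn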

Page 36: Linked data: Four rules and five stars for the Amsterdam Museum

Amalgame alignment platform

• Semi-automatic matching
  – Simple automatic techniques, chained together by hand

• 3500+ links put in RDF
  – 143 places linked to GeoNames
  – 1076 persons linked to ULAN (VIAF)
  – 34 persons linked to DBpedia
  – 2498 concepts linked to AATNed
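One example of a “simple automatic technique” is exact matching on normalised labels. The Python sketch below is a generic illustration and not Amalgame's actual implementation; the sample URIs and labels are made up, and `skos:exactMatch` is chosen here only as a label for candidate links:

```python
def normalise(label):
    """Lower-case and collapse whitespace so trivial differences do not block a match."""
    return " ".join(label.lower().split())


def exact_label_matches(source, target):
    """Link source URIs to target URIs whose normalised labels are identical."""
    index = {}
    for uri, label in target.items():
        index.setdefault(normalise(label), []).append(uri)
    links = []
    for uri, label in source.items():
        candidates = index.get(normalise(label), [])
        if len(candidates) == 1:  # keep only unambiguous matches
            links.append((uri, "skos:exactMatch", candidates[0]))
    return links


# Made-up sample data: local thesaurus terms and an external vocabulary.
local = {"am:t-1": "Amsterdam ", "am:t-2": "Kalverstraat"}
external = {"geonames:Amsterdam": "amsterdam", "geonames:Haarlem": "Haarlem"}
print(exact_label_matches(local, external))
# [('am:t-1', 'skos:exactMatch', 'geonames:Amsterdam')]
```

Terms that stay unmatched or ambiguous are the ones a second technique, or a human, would pick up next in the chain.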

Page 37: Linked data: Four rules and five stars for the Amsterdam Museum

CKAN Data Hub

http://thedatahub.org/dataset/amsterdam-museum-as-edm-lod

Page 38: Linked data: Four rules and five stars for the Amsterdam Museum

Four rules and Five stars

1. Use URIs as names for things

2. Use HTTP URIs so that people can look up those names.

3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)

4. Include links to other URIs, so that they can discover more things.

Page 39: Linked data: Four rules and five stars for the Amsterdam Museum

And now applications!…right??

Page 40: Linked data: Four rules and five stars for the Amsterdam Museum

Developers still do this…

…although more and more of this is happening

Page 41: Linked data: Four rules and five stars for the Amsterdam Museum
Page 42: Linked data: Four rules and five stars for the Amsterdam Museum
Page 43: Linked data: Four rules and five stars for the Amsterdam Museum

Some issues with L(O)D

• Extra burden on the data provider
• Nerd-only (aka “SPARQL is hard”)
• How do we build user-friendly systems?
  – Ranking, user-friendly information presentation
• Scalability (how do you query a huge graph?)
• Licenses
• Is Open always a good idea?
  – Context?

Page 44: Linked data: Four rules and five stars for the Amsterdam Museum

end

Page 45: Linked data: Four rules and five stars for the Amsterdam Museum

EDM

Page 46: Linked data: Four rules and five stars for the Amsterdam Museum

What kind of RDF?

• Europeana Data Model (EDM)
  – Keep original metadata intact
  – Use sem web (LD) principles: RDF

• Re-use of standard models
  – Dublin Core for metadata representation
    • creator, date, title etc.
  – SKOS for vocabularies
    • preferredLabel, hasBroader, etc.

Page 47: Linked data: Four rules and five stars for the Amsterdam Museum

EDM example

[Diagram labels:]

• Proxy: object metadata
• Aggregation: provenance + web views/images
• Physical object: no metadata