LOD Exemplar - LOD Museum -

Post on 19-Aug-2015


Hideaki Takeda, Fumi Kato / National Institute of Informatics

LOD Application Exemplar - A case study: LODAC Museum

Hideaki Takeda, Fumi Kato

National Institute of Informatics
takeda@nii.ac.jp



Aim of this talk

• How to plan, design, and implement LOD?
• Learn from the case


LODAC Project

• Open Social Semantic Web Platform for Academic Resources
  – Providing platforms for Linked Open Data
  – Practicing data accumulation and publishing
• Areas of interest
  – Museum information
  – Geographical information, especially geographical names
  – Local information
  – Taxonomic information on species
  – …



Linked Open Data Initiative


• Non-profit organization
  – (Under application for approval)
• Academia + IT people + local people
• Aim: facilitate LOD activities among local



Museum data as LOD

• The current state of museum information in Japan (nearly 6,000 museums)
  – Distributed
    • Self maintained
    • Isolated
  – Opaque
    • Self designed
    • Messy
• Aggregating and associating museum information
  – LODAC-Museum


LODAC Museum – Main work

• Gathering of data
  – Thesaurus, museum collections, etc.
• Standardization of data
  – Representing data from different sources in a unique form
• Integration of data
  – Identifying data
  – Associating the same data
• Consuming of data


LODAC Museum Architecture

[Figure: LODAC Museum architecture with four stages: gathering of data, standardization of data, integration of data, and consuming of data]


Gathering data

• No museums publish data as LOD!
• We use data published as Web pages
  – Scrape and translate data
  – License is not clear
    • It is a serious problem
    • We need permission from every site in principle
    • We got permission from some data publishers, not all


Dataset

Type | No. | Data source
Art work (lodac:Work) | ca. 80,000 | Catalog of the collections of 3 National Art Museum (25,180), National Museum of Western Art (4,373), Tokushima Pref. Art Museum (18,482) … over 100 museums; Database for National Treasure & Important Cultural Property of National Designated (915); The Japanese Art Thesaurus (266)
Specimen (lodac:Speciment) | ca. 1,690,000 | (100+ Museum collections) Science Net (National Science Museum)
Person (foaf:Person) | ca. 8,800 | The Japanese Art Thesaurus
Facilities (incl. Museum) | ca. 200,000 | The Japanese Art Thesaurus; Cultural Heritage Online; GIS data National and Regional Planning Bureau


Extracting collection data from museum websites



[Figure: collection pages and the Property / Value tables extracted from them]
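The extraction step can be sketched as follows. This is a minimal illustration only, assuming a collection page lists each item as a two-column HTML table of property and value; the sample markup, the sample values, and the names `PropertyValueParser` and `extract_record` are hypothetical. Real museum sites differ, and each one is likely to need its own extractor.

```python
from html.parser import HTMLParser


class PropertyValueParser(HTMLParser):
    """Collect (property, value) pairs from two-column <td> table rows."""

    def __init__(self):
        super().__init__()
        self.pairs = {}
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            # Keep only rows with exactly two data cells; header rows use <th>.
            if len(self._row) == 2 and self._row[0]:
                self.pairs[self._row[0]] = self._row[1]
            self._row = None


def extract_record(html):
    """Return the property/value pairs found in the given HTML text."""
    parser = PropertyValueParser()
    parser.feed(html)
    parser.close()
    return parser.pairs


# Hypothetical page fragment, not taken from any real museum site.
SAMPLE_PAGE = """
<table>
  <tr><th>Property</th><th>Value</th></tr>
  <tr><td>Title</td><td>Autumn Landscape</td></tr>
  <tr><td>Creator</td><td>Example Artist</td></tr>
  <tr><td>Material</td><td>Color on silk</td></tr>
</table>
"""

record = extract_record(SAMPLE_PAGE)
```

The result is a plain dictionary per item, which keeps the site-specific scraping separate from the later standardization step.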



Standardization of data

Re-organized common metadata.

[Figure: Raw Data → Re-organized Metadata]

Current organizing policies
・ Use existing metadata
・ Define own metadata.





Prefix | Metadata Name
dc11 | Dublin Core 1.1
dc | DCMI Terms
skos | Simple Knowledge Organization System
rdfs | Resource Description Framework Schema
foaf | Friend of a Friend
rda2 | Resource Description and Access
lodac | LODAC Project


Metadata schema for works

lodac:Work | Property
Genre | lodac:genre
Type of cultural assets | lodac:culturalAssets
Creator | dc:creator / dc11:creator
Nationality | crm:P7_took_place_at
Title | dc:title / skos:prefLabel
Title Pronunciation (yomi) | dc:title @ja-hrkt / skos:altLabel
Title in English | dc:title @en / skos:altLabel
Inscription | crm:P62I_is_depicted_by
Seal | crm:P65_shows_visual_item
No. of parts | crm:P57_has_number_of_parts
Collection | dc:isPartOf
Created year | dc:created
Estimated starting year | lodac:estimatedStartYear
Material | dc:medium / crm:P45_consists_of
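A schema like this can be applied to scraped records with a simple lookup table. The sketch below is an assumption about how such a mapping might be coded, not the project's actual implementation: it uses a subset of the properties listed above, and the function name `to_triples`, the subject URI, and the sample record are hypothetical.

```python
# Field label -> schema property (a subset of the works schema listed above).
FIELD_TO_PROPERTY = {
    "Genre": "lodac:genre",
    "Creator": "dc:creator",
    "Title": "dc:title",
    "Collection": "dc:isPartOf",
    "Created year": "dc:created",
    "Material": "dc:medium",
}


def to_triples(subject, record):
    """Return (triples, unmapped): triples for known fields, labels of unknown ones."""
    triples, unmapped = [], []
    for field, value in record.items():
        prop = FIELD_TO_PROPERTY.get(field)
        if prop is None:
            unmapped.append(field)  # candidate for a new, self-defined property
        elif value:                 # skip fields that are present but empty
            triples.append((subject, prop, value))
    return triples, unmapped


# Hypothetical scraped record and subject URI.
sample = {
    "Title": "Autumn Landscape",
    "Creator": "Example Artist",
    "Size": "50 x 70 cm",
    "Material": "",
}
triples, unmapped = to_triples("http://example.org/ref/1", sample)
```

Reporting the unmapped fields is one way to support the policy of using existing metadata first and defining own metadata only where nothing fits.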


Integrating Data

• How to integrate data from different sources – sharing of responsibility
  • Each source is responsible for its data
    – Identifying IDs for data and managing data with the IDs
  • LODAC is only responsible for integration
    – Assigning original IDs and associating other IDs to them

[Figure: an ID-resource (creator's information) linked by dc:references to Ref-resources (creator's reference)]
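The division of responsibility can be sketched in a few lines. This is an illustrative model only: the class `Integrator`, the `http://example.org/` URIs, and the use of a (type, name) tuple as the identifying key are assumptions made for the example. The one detail taken from the slides is that an ID-resource points to its source records with dc:references.

```python
import itertools


class Integrator:
    """Assign integration-side IDs and link them to per-source reference records."""

    def __init__(self, id_base="http://example.org/id/"):
        self._id_base = id_base
        self._counter = itertools.count(1)
        self._id_by_key = {}   # identifying key -> ID-resource URI
        self.references = {}   # ID-resource URI -> list of Ref-resource URIs

    def register(self, key, ref_uri):
        """Attach a source record (ref_uri) to the entity identified by key."""
        id_uri = self._id_by_key.get(key)
        if id_uri is None:
            # First time this entity is seen: mint a new ID-resource.
            id_uri = f"{self._id_base}{next(self._counter)}"
            self._id_by_key[key] = id_uri
            self.references[id_uri] = []
        if ref_uri not in self.references[id_uri]:
            self.references[id_uri].append(ref_uri)
        return id_uri

    def triples(self):
        """Yield (ID-resource, dc:references, Ref-resource) triples."""
        for id_uri, refs in self.references.items():
            for ref in refs:
                yield (id_uri, "dc:references", ref)


integrator = Integrator()
id_a = integrator.register(("person", "下村観山"), "http://example.org/source-a/artist/12")
id_b = integrator.register(("person", "下村観山"), "http://example.org/source-b/creator/7")
links = list(integrator.triples())
```

The source records stay untouched under their own URIs; the integrator only holds the minimum needed to identify an entity and the links to each source.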


Integrating Data

[Figure: integrated data (e.g. a Work, holding the minimum data to identify entities) linked by dc:references to the raw data for entities in Data from Source A and Data from Source B]


Integration of Person Data

• Matching of Creators
  – Base: List of Artists from Thesaurus of Japanese Art
  – Target: Creators of collection in museums + DBpedia
  – Method: String match of names
  – Results: Links from artist nodes to work nodes are added

[Figure: LODAC data for a creator, with basic information for the creator and links to works]
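Name matching of this kind might look like the sketch below. The slides say only "string match of names"; the Unicode normalization and whitespace removal shown here are assumptions added so that spacing variants of the same name compare equal. The function names and the non-matching sample work are hypothetical.

```python
import unicodedata


def normalize_name(name):
    """NFKC-normalize and drop all whitespace so spacing variants compare equal."""
    return "".join(unicodedata.normalize("NFKC", name).split())


def match_creators(artists, works):
    """Link works to artists whose normalized names are identical.

    artists: {artist_id: name}; works: iterable of (work_id, creator_name).
    Returns {artist_id: [work_id, ...]} for artists with at least one match.
    """
    index = {}
    for artist_id, name in artists.items():
        index.setdefault(normalize_name(name), []).append(artist_id)

    links = {}
    for work_id, creator_name in works:
        for artist_id in index.get(normalize_name(creator_name), []):
            links.setdefault(artist_id, []).append(work_id)
    return links


artists = {"http://lod.ac/id/359": "下村観山"}
works = [
    ("http://lod.ac/id/20029", "下村 観山"),        # ASCII space inside the name
    ("http://lod.ac/id/20128", "下村\u3000観山"),   # full-width space inside the name
    ("http://example.org/id/other", "横山大観"),    # different artist, no match
]
links = match_creators(artists, works)
```

Exact matching after normalization cannot tell apart different people who share a name, so results of this kind usually deserve a manual check.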




Integrating Data

Integrate Item | Source | Amount of Data | Integration
Facilities | A. Japanese Art Thesaurus | 648 | 77
 | B. Cultural Heritage Online | 915 |
Title of important cultural properties | A. Japanese Art Thesaurus (Art work) | 3,800 | 74
 | B. DB for National Treasure (Art work) | 10,115 |
Creator information and Work Title | A. Japanese Art Thesaurus (Creator) | 1,332 | 15,020
 | B. All of art work (Work title string) | 61,861 |
Creator name | A. Japanese Art Thesaurus (Creator) | 1,332 | 615
 | B. All of art work title (using creator name) | 61,861 |


Publishing data as RDF

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/"
         xmlns:lodac="http://lod.ac/ns/lodac#"
         xmlns:dc="http://purl.org/dc/terms/">
  <foaf:Person rdf:about="http://lod.ac/id/359">
    <lodac:creates rdf:resource="http://lod.ac/id/20029"/>
    <lodac:creates rdf:resource="http://lod.ac/id/20128"/>
    <lodac:creates rdf:resource="http://lod.ac/id/20755"/>
    <lodac:creates rdf:resource="http://lod.ac/id/24768"/>
    <lodac:creates rdf:resource="http://lod.ac/id/26732"/>
    <dc:references rdf:resource="http://ja.dbpedia.org/resource/下村観山"/>
    <dc:references rdf:resource="http://lod.ac/ref/359"/>
    <rdfs:label xml:lang="ja">下村観山</rdfs:label>
    <skos:prefLabel xml:lang="ja">下村観山</skos:prefLabel>
    <foaf:name xml:lang="ja">下村観山</foaf:name>
  </foaf:Person>
</rdf:RDF>

• ID-resource URI (own address)
• Ref-resource URI: http://lod.ac/ref/359
• External link: DBpedia Japanese
• Links to her/his work URIs
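A consumer could read such a description with the standard library alone, as in the sketch below. It assumes the flat element layout shown on the slide and uses a shortened copy of the example as input; the function name `describe_person` is hypothetical. RDF/XML allows many equivalent serializations, so a dedicated RDF parser would be the safer choice for arbitrary input.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "lodac": "http://lod.ac/ns/lodac#",
    "dc": "http://purl.org/dc/terms/",
}
RDF_ABOUT = "{%s}about" % NS["rdf"]
RDF_RESOURCE = "{%s}resource" % NS["rdf"]

# Shortened copy of the example description.
DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/"
         xmlns:lodac="http://lod.ac/ns/lodac#"
         xmlns:dc="http://purl.org/dc/terms/">
  <foaf:Person rdf:about="http://lod.ac/id/359">
    <lodac:creates rdf:resource="http://lod.ac/id/20029"/>
    <lodac:creates rdf:resource="http://lod.ac/id/20128"/>
    <dc:references rdf:resource="http://lod.ac/ref/359"/>
    <foaf:name xml:lang="ja">下村観山</foaf:name>
  </foaf:Person>
</rdf:RDF>
"""


def describe_person(xml_text):
    """Extract the ID, name, work links, and references of the first foaf:Person."""
    root = ET.fromstring(xml_text)
    person = root.find("foaf:Person", NS)
    return {
        "id": person.get(RDF_ABOUT),
        "name": person.findtext("foaf:name", namespaces=NS),
        "works": [e.get(RDF_RESOURCE) for e in person.findall("lodac:creates", NS)],
        "references": [e.get(RDF_RESOURCE) for e in person.findall("dc:references", NS)],
    }


info = describe_person(DOC)
```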


LODAC Applications

• Photo BURARI Pro
• Yokohama Art Spot
• Go2Museum
• http://lod.ac/apps



Photo BURARI Pro

Photo App with SPARQL



• SPARQL Endpoints
  – DBpedia
  – Linked Geo Data
  – LODAC
• Other data source
  – Sinsai.info
• Using JSON Result
  – JSON Framework for Objective C

Photo BURARI Pro (C) ATR-Promotions, Inc.


An example in Objective C

// Placeholder endpoint: the slide does not show the endpoint URL.
NSString *endpoint = @"http://example.org/sparql?query=";

// NW_lat, NW_long, SE_lat, SE_long stand for the corners of the bounding box.
NSString *sparql =
    @"PREFIX dct: <http://purl.org/dc/terms/> "
    @"PREFIX omgeo: <http://www.ontotext.com/owlim/geo#> "
    @"PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> "
    @"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> "
    @"SELECT distinct ?link ?title ?lat ?long WHERE { "
    @"?link dct:references ?ref. "
    @"?ref rdfs:label ?title. "
    @"?ref geo:lat ?lat. "
    @"?ref geo:long ?long. "
    @"?ref omgeo:within(NW_lat NW_long SE_lat SE_long). "
    @"} LIMIT 30";

// Percent-escape the query so it can be passed as a URL parameter.
NSString *query = (NSString *)CFURLCreateStringByAddingPercentEscapes(
    kCFAllocatorDefault, (CFStringRef)sparql, NULL,
    CFSTR(";,/?:@=+$#"), kCFStringEncodingUTF8);

// Ask the endpoint for SPARQL results in JSON.
NSURL *url = [NSURL URLWithString:[endpoint stringByAppendingString:query]];
NSMutableURLRequest *req = [NSMutableURLRequest requestWithURL:url];
[req setValue:@"application/sparql-results+json" forHTTPHeaderField:@"Accept"];

// Synchronous request for brevity; result holds the JSON response text.
NSURLResponse *resp;
NSError *err;
NSData *data = [NSURLConnection sendSynchronousRequest:req returningResponse:&resp error:&err];
NSString *result = [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding];
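For comparison, the same request can be built in Python with the standard library. This is a sketch under assumptions: `ENDPOINT` is a placeholder URL, `build_request` is a hypothetical helper, and the coordinates in the usage line are arbitrary sample values. The request is only constructed, not sent.

```python
import urllib.parse
import urllib.request

ENDPOINT = "http://example.org/sparql"  # placeholder endpoint (assumption)

QUERY_TEMPLATE = """
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX omgeo: <http://www.ontotext.com/owlim/geo#>
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT ?link ?title ?lat ?long WHERE {
  ?link dct:references ?ref .
  ?ref rdfs:label ?title ;
       geo:lat ?lat ;
       geo:long ?long ;
       omgeo:within(%(nw_lat)f %(nw_long)f %(se_lat)f %(se_long)f) .
} LIMIT 30
"""


def build_request(nw_lat, nw_long, se_lat, se_long):
    """Build a GET request that asks the endpoint for SPARQL JSON results."""
    # %f formatting accepts only numbers, so arbitrary text cannot leak into the query.
    query = QUERY_TEMPLATE % {
        "nw_lat": nw_lat, "nw_long": nw_long,
        "se_lat": se_lat, "se_long": se_long,
    }
    url = ENDPOINT + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


request = build_request(35.5, 139.6, 35.4, 139.7)
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```

`urlencode` takes care of the percent-escaping that the Objective C version does by hand.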


Yokohama Art Spot

– Application using museum and local data
– Data related to art in Yokohama
  • Collections
  • Events
  • Q&A

LODAC Museum × Yokohama Art LOD × PinQA


System Architecture


[Figure: Yokohama Art Spot on top of PinQA, Yokohama Art LOD, and LODAC Museum, with the datasets connected through Artist and Institution]

‣ Python + SPARQLWrapper
‣ Geolocation


PREFIX ical: <http://www.w3.org/2002/12/cal/icaltzd#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX event: <http://lod.ac/ns/event#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX dc: <http://purl.org/dc/terms/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX lodacid: <http://lod.ac/id/>
PREFIX omgeo: <http://www.ontotext.com/owlim/geo#>
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX vcard: <http://www.w3.org/2006/vcard/ns#>

SELECT distinct ?event ?lat ?long ?title ?location_name ?location ?fee ?dtstart ?dtend
WHERE {
  ?event a event:Event ;
    rdfs:label ?title ;
    event:fee ?fee ;
    ical:location ?location ;
    ical:dtstart ?dtstart ;
    ical:dtend ?dtend .
  ?location rdfs:label ?location_name ;
    dc:references ?locRef .
  ?locRef omgeo:within(%(NE_lat)s %(NE_long)s %(SW_lat)s %(SW_long)s) ;
    vcard:postal-code ?postalcode ;
    geo:lat ?lat ;
    geo:long ?long .
  FILTER ((?dtstart > "%(dtstart)s"^^xsd:dateTime && ?dtstart < "%(dtend)s"^^xsd:dateTime) ||
          (?dtend > "%(dtstart)s"^^xsd:dateTime && ?dtend < "%(dtend)s"^^xsd:dateTime) ||
          (?dtstart < "%(dtstart)s"^^xsd:dateTime && ?dtend > "%(dtend)s"^^xsd:dateTime))
}
ORDER BY (omgeo:distance(?lat, ?long, %(C_lat)s, %(C_long)s))
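The `%(name)s` placeholders in this query are filled from a dictionary with Python's `%` string formatting. The sketch below shows one way the parameter dictionary might be computed from the user's position and a date window. The helper `event_query_params`, the 111.32 km-per-degree approximation, the search radius, and the sample coordinates are assumptions for illustration, and only a short fragment of the query is filled in.

```python
import math
from datetime import datetime, timedelta

KM_PER_DEGREE_LAT = 111.32  # rough average; adequate for a city-scale search box


def event_query_params(c_lat, c_long, radius_km, start, days):
    """Build the placeholder values: a box around the center and a date window."""
    d_lat = radius_km / KM_PER_DEGREE_LAT
    # A degree of longitude shrinks with the cosine of the latitude.
    d_long = radius_km / (KM_PER_DEGREE_LAT * math.cos(math.radians(c_lat)))
    end = start + timedelta(days=days)
    return {
        "C_lat": c_lat, "C_long": c_long,
        "NE_lat": c_lat + d_lat, "NE_long": c_long + d_long,
        "SW_lat": c_lat - d_lat, "SW_long": c_long - d_long,
        "dtstart": start.isoformat(), "dtend": end.isoformat(),
    }


# Fragment of the query above, enough to show the substitution.
FRAGMENT = (
    "?locRef omgeo:within(%(NE_lat)s %(NE_long)s %(SW_lat)s %(SW_long)s) . "
    'FILTER (?dtstart < "%(dtend)s"^^xsd:dateTime)'
)

params = event_query_params(35.45, 139.63, 2.0, datetime(2012, 12, 1), 7)
filled = FRAGMENT % params
```

Because the values are substituted as plain text, anything that comes from user input should be validated (for example converted to `float` or `datetime`) before it reaches the template.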


PREFIX ical: <http://www.w3.org/2002/12/cal/icaltzd#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX event: <http://lod.ac/ns/event#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX dc: <http://purl.org/dc/terms/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX lodacid: <http://lod.ac/id/>
PREFIX dc11: <http://purl.org/dc/elements/1.1/>

SELECT *
WHERE {
  ?link a event:Event ;
    rdfs:label ?title ;
    event:fee ?fee ;
    ical:categories ?cat ;
    ical:location %(museum_id)s ;
    ical:dtstart ?dtstart ;
    ical:dtend ?dtend .
  ?cat dc11:title ?category .
  OPTIONAL {
    ?link event:Credit ?crd .
    ?crd dc11:description ?credit .
  }
}


PREFIX dc: <http://purl.org/dc/terms/>
PREFIX dc11: <http://purl.org/dc/elements/1.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX lodac: <http://lod.ac/ns/lodac#>
PREFIX lodacid: <http://lod.ac/id/>

SELECT ?link ?title ?creator ?created ?genre ?material ?size
WHERE {
  %(museum_id)s lodac:isProviderOf ?link .
  ?link rdfs:label ?title ;
    dc:references ?workRef .
  ?workRef lodac:genre %(genre)s ;
    dc11:creator ?creator ;
    dc:medium ?material ;
    dc:extent ?size .
  OPTIONAL { ?workRef dc:created ?created . }
}
LIMIT 100
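The results of such a query come back in the standard SPARQL JSON results format, where variables left unbound by an OPTIONAL block are simply absent from a binding. A minimal sketch of flattening that structure follows; the function name and the sample response are hypothetical, but the `head` / `results` / `bindings` layout is the one defined by the W3C format.

```python
import json


def rows_from_sparql_json(text):
    """Flatten SPARQL JSON results into dicts; unbound variables become None."""
    doc = json.loads(text)
    variables = doc["head"]["vars"]
    rows = []
    for binding in doc["results"]["bindings"]:
        rows.append(
            {v: binding[v]["value"] if v in binding else None for v in variables}
        )
    return rows


# Hypothetical response: the second work has no dc:created value.
SAMPLE_RESPONSE = """
{
  "head": {"vars": ["link", "title", "created"]},
  "results": {"bindings": [
    {"link": {"type": "uri", "value": "http://example.org/id/1"},
     "title": {"type": "literal", "value": "Autumn Landscape"},
     "created": {"type": "literal", "value": "1900"}},
    {"link": {"type": "uri", "value": "http://example.org/id/2"},
     "title": {"type": "literal", "value": "Spring Landscape"}}
  ]}
}
"""

rows = rows_from_sparql_json(SAMPLE_RESPONSE)
```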


[Figure: screenshots on iPhone and Android]


Museum data from various web sites












Twitter: @go2museum

• “Today’s museum”
• Recommendation based on lat & long of tweets
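A location-based recommendation of this kind could be as simple as ranking museums by great-circle distance from the tweet. The sketch below is an assumption about one possible approach, not a description of how @go2museum works; the museum names and all coordinates are made up for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/long points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def recommend(tweet_lat, tweet_long, museums, limit=3):
    """Return the `limit` museums closest to the tweet location, nearest first."""
    ranked = sorted(
        museums,
        key=lambda m: haversine_km(tweet_lat, tweet_long, m["lat"], m["long"]),
    )
    return ranked[:limit]


# Hypothetical museums with made-up coordinates.
MUSEUMS = [
    {"name": "Museum C", "lat": 34.685, "long": 135.843},
    {"name": "Museum A", "lat": 35.457, "long": 139.630},
    {"name": "Museum B", "lat": 35.715, "long": 139.776},
]

nearest = recommend(35.45, 139.64, MUSEUMS, limit=2)
```

With a SPARQL store that supports geo extensions, the same ranking could instead be pushed into the query, as the `omgeo:distance` ordering in the event query does.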



• A life cycle of data is described
  – Scraping, standardizing, integrating, and publishing
• Important issues
  – Recognizing data
  – Designing schema
    • Good for data
    • Good for RDF Store and SPARQL
  – Developing applications
    • More people can be involved
    • Next cycle of data


• Please submit papers
• Meet at Nara