Data translation with SPARQL 1.1

Post on 03-Jul-2015

WWW 2012 Tutorial

Schema Mapping with SPARQL 1.1

Andreas Schultz (Freie Universität Berlin)

Outline

Why Do We Want to Integrate Data?

Schema Mapping

Translating RDF Data with SPARQL 1.1

Mapping Patterns

Motivation

The Web of Data is heterogeneous

Many different and overlapping ways to represent information

Distribution of the most widely used vocabularies

Data is represented...

Using terms from a wide range of vocabularies

Using diverging structures

With values of different (data type) formats

Fine grained vs. coarse grained

Using different measuring units

Naming Differences

SELECT ?longTrack ?runtime WHERE {
  { ?longTrack mo:duration ?runtime }
  UNION
  { ?longTrack dbpedia-owl:runtime ?runtime }
  FILTER (?runtime > 300)
}

Structural Differences

dbpedia:Three_Little_Birds dbpedia-owl:musicalArtist dbpedia:Bob_Marley_&_The_Wailers .

dbpedia:Bob_Marley_&_The_Wailers foaf:made dbpedia:Three_Little_Birds .

Value Level Differences

"2012-04-15" vs. "2012-04-15"^^xsd:date

"John Doe" vs. "John Doe"@en

20 in Celsius vs. 68 in Fahrenheit

"Doe, John" vs. "John Doe"

Schema Mapping

For data translation:

A mapping specifies how data under a source representation is translated to a target representation, that is, the representation that you or your application expects.

Ways to Express Executable Mappings for RDF Data

Ontology constructs: OWL, RDFS (rdfs:subClassOf, rdfs:subPropertyOf)

Rules: SWRL, RIF

Query languages: SPARQL 1.1

Data Translation with SPARQL 1.1

Nearly at W3C Recommendation status

Many scalable SPARQL engine implementations out there

Data translation with SPARQL CONSTRUCT

Huge improvements over version 1.0 regarding data translation

How to Use SPARQL Construct Mappings Ideally

SELECT ?longTrack ?runtime
FROM {
  CONSTRUCT { ?subj target:relevantProperty ?obj }
  WHERE { ?subj sour:relProperty ?obj }
}
WHERE {
  ?subj target:relevantProperty ?obj .
  …
}

How to Use SPARQL Construct Mappings in Practice

A CONSTRUCT query nested in the FROM clause, as in the ideal form above, is not valid SPARQL 1.1, so the mapping and the query over the target data have to be run as separate steps.

Possibilities

Execute all SPARQL Construct queries on the source data set(s) to generate local versions of the target data set(s)

Optionally merge multiple target data sets into one

Reference these files in the FROM clause

Possibility we won't cover: write the results directly into an RDF store and query the store.

1. Transform Data Sets with ARQ

query --query=constructQuery.qry --data=sourceDataset.nt > targetDataset.ttl

# Execute more queries

2. Use Target Data Set Files in Query

SELECT ?longTrack ?runtime
FROM <targetDataset.ttl>
FROM …
WHERE {
  ?subj target:relevantProperty ?obj .
  …
}

Pattern based approach

Presentation of common mapping patterns

Ordered from common to not so common

Learn how to tackle these mapping patterns with SPARQL 1.1

Simple Renaming Mapping Patterns

Rename Class / Property

Substitute the class or property URI

src:inst a mo:MusicArtist

src:inst a dbpedia-owl:MusicalArtist

Rename Class

SPARQL Mapping:

CONSTRUCT {

?s a mo:MusicArtist

} WHERE {

?s a dbpedia-owl:MusicalArtist

}

Rename Property

SPARQL Mapping:

CONSTRUCT {

?s rdfs:label ?o

} WHERE {

?s freebase:type.object.name ?o

}

Structural Mapping Patterns

Rename Class based on Property Existence

Rename class based on the existence of a property relation.

dbpedia:William_Shakespeare a fb:people.deceased_person .

dbpedia:William_Shakespeare a dbpedia-owl:Person ; dbpedia-owl:deathDate "1616-04-23"^^xsd:date .

Rename Class based on Property

SPARQL Mapping:

CONSTRUCT {

?s a fb:people.deceased_person

} WHERE {

?s a dbpedia-owl:Person ; dbpedia-owl:deathDate ?dd .

}

Rename Class based on Value

Instances of the source class become instances of the target class if they have a specific property value.

gw-p:Kurt_Joachim_Lauk_euParliament_1840_P a fb:government.politician .

gw-p:Kurt_Joachim_Lauk_euParliament_1840_P a gw:Person ; gw:profession "politician"^^xsd:string .

Rename Class based on Value

SPARQL Mapping:

CONSTRUCT {

?s a fb:government.politician

} WHERE {

?s a gw:Person ; gw:profession "politician"^^xsd:string .

}

Reverse Property

The target property represents the reverse relationship regarding the source property.

dbpedia:Joey_Castillo mo:member_of dbpedia:Queens_of_the_Stone_Age .

dbpedia:Queens_of_the_Stone_Age dbpedia-owl:currentMember dbpedia:Joey_Castillo .

Reverse Property

SPARQL Mapping:

CONSTRUCT {

?s mo:member_of ?o

} WHERE {

?o dbpedia-owl:currentMember ?s

}

Resourcesify

Represent an attribute by a newly created resource that then carries the attribute value.

dbpedia:The_Usual_Suspects po:version _:new .
_:new po:duration 6360.0 .

dbpedia:The_Usual_Suspects dbpedia-owl:runtime 6360.0 .

Resourcesify

SPARQL Mapping:

CONSTRUCT {

?s po:version _:newversion .
_:newversion po:duration ?runtime .

} WHERE {

?s dbpedia-owl:runtime ?runtime .

}
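To make the role of the blank node concrete, here is a minimal Python sketch of what such a mapping produces, assuming triples are modelled as plain tuples. A blank node in a CONSTRUCT template is instantiated freshly for every query solution, so each matching source triple gets its own intermediate resource. The function name `resourcesify` and the `_:bN` labels are hypothetical and only serve the illustration.

```python
from itertools import count


def resourcesify(triples, source_prop, link_prop, value_prop):
    """Replace each (s, source_prop, value) triple by a fresh intermediate node.

    Mimics a CONSTRUCT template that contains a blank node: one new node
    is minted per matching source triple, i.e. per query solution.
    """
    fresh = count(1)  # counter used to label the illustrative blank nodes
    result = []
    for s, p, o in triples:
        if p != source_prop:
            continue  # only matching patterns contribute to the output
        node = f"_:b{next(fresh)}"
        result.append((s, link_prop, node))
        result.append((node, value_prop, o))
    return result


source = [("dbpedia:The_Usual_Suspects", "dbpedia-owl:runtime", 6360.0)]
target = resourcesify(source, "dbpedia-owl:runtime", "po:version", "po:duration")
for triple in target:
    print(triple)
```

Two films with a runtime would end up with two distinct intermediate nodes, which is exactly the behaviour that makes the pattern safe to apply to a whole data set.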

Deresourcesify

Inverse pattern to the Resourcesify pattern.

dbpedia:John_F._Kennedy_International_Airport lgdp:owner "New York City" .

dbpedia:John_F._Kennedy_International_Airport dbpedia-owl:city dbpedia:New_York_City .
dbpedia:New_York_City rdfs:label "New York City" .

Deresourcesify

SPARQL Mapping:

CONSTRUCT {

?s lgdp:owner ?cityLabel .

} WHERE {

?s dbpedia-owl:city ?city . ?city rdfs:label ?cityLabel .

}

Value Transformation based Mapping Patterns

SPARQL 1.1 functions

SPARQL 1.1 offers functions covering:

RDF terms (str, lang, IRI, STRDT etc.)

Strings (SUBSTR, UCASE etc.)

Numerics (abs, round etc.)

Dates and Times (now, year, month etc.)

Hash Functions (SHA1, MD5 etc.)

XPath Constructor Functions (xsd:float etc.)

Transform Value 1:1

Transform the (lexical) value of a property.

dbpedia:The_Shining_(film) movie:runtime 142 .

dbpedia:The_Shining_(film) dbpedia-owl:runtime 8520 .

Transform Value 1:1

SPARQL Mapping:

CONSTRUCT {

?s movie:runtime ?runtimeInMinutes .

} WHERE {

?s dbpedia-owl:runtime ?runtime . BIND(?runtime / 60 As ?runtimeInMinutes)

}

BIND( Expression As ?newVariable )

Transform Literal to URI

Transform a literal value into a URI.

dbpedia:Von_Willebrand_disease diseasome:omim <http://bio2rdf.org/omim:193400> .

dbpedia:Von_Willebrand_disease dbpedia-owl:omim 193400 .

Also 1:1 transformation pattern!

Transform Literal to URI

SPARQL Mapping:

CONSTRUCT {

?s diseasome:omim ?omimuri .

} WHERE {

?s dbpedia-owl:omim ?omim .
BIND(IRI(concat("http://bio2rdf.org/omim:", str(?omim))) As ?omimuri)

}
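Both 1:1 patterns apply a pure function to a single bound value, which is what BIND expresses. As a rough illustration outside SPARQL, the two transformations can be sketched in Python as follows; the function names are hypothetical, and the values are the ones from the examples above.

```python
def seconds_to_minutes(runtime_seconds):
    """Counterpart of BIND(?runtime / 60 AS ?runtimeInMinutes)."""
    return runtime_seconds / 60


def omim_to_uri(omim_id, base="http://bio2rdf.org/omim:"):
    """Counterpart of BIND(IRI(CONCAT(base, STR(?omim))) AS ?omimuri)."""
    # str() plays the role of STR(): take the lexical form before concatenating
    return base + str(omim_id)


print(seconds_to_minutes(8520))
print(omim_to_uri(193400))
```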

Construct Literals

SPARQL 1.1 offers several functions to construct literals:

STR: returns the lexical form → plain literal

STRDT: construct data type literal

STRLANG: construct literal with language tag

Cast to another Datatype

fb:en.clint_eastwood dbpedia-owl:birthDate "1930-05-31"^^xsd:date .

fb:en.clint_eastwood fb:people.person.date_of_birth "1930-05-31T00:00:00"^^xsd:dateTime .

Cast to another Datatype

SPARQL Mapping:

CONSTRUCT {

?s dbpedia-owl:birthDate ?date

} WHERE {

?s fb:people.person.date_of_birth ?dateTime . BIND(xsd:date(?dateTime) As ?date)

}
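What the cast does to the lexical form can be sketched in a few lines of Python. This is only an approximation under simplifying assumptions: it ignores time zones, and the helper name `datetime_to_date` is hypothetical.

```python
from datetime import datetime


def datetime_to_date(lexical):
    """Turn an xsd:dateTime lexical form into an xsd:date lexical form."""
    # Parse the full timestamp, then keep only the calendar date part.
    return datetime.fromisoformat(lexical).date().isoformat()


print(datetime_to_date("1930-05-31T00:00:00"))
```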

Transform Value N:1

dbpedia:William_Shakespeare foaf:name "Shakespeare, William" .

dbpedia:William_Shakespeare foaf:givenName "William" ; foaf:surname "Shakespeare" .

Transform multiple values from different properties to a single value.

Transform Value N:1

SPARQL Mapping:

CONSTRUCT {

?s foaf:name ?name

} WHERE {

?s foaf:givenName ?givenName ;
   foaf:surname ?surname .
BIND(CONCAT(?surname, ", ", ?givenName) As ?name)

}
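The N:1 pattern first joins several property values of the same subject and only then combines them. A minimal Python sketch of that two-step logic is given below. It assumes triples are tuples and that each subject has at most one value per property (SPARQL would instead yield one result per combination of values); `merge_names` is a hypothetical helper name.

```python
from collections import defaultdict


def merge_names(triples):
    """Build "Surname, GivenName" for subjects that have both parts."""
    props = defaultdict(dict)
    for s, p, o in triples:
        props[s][p] = o  # assumption: at most one value per property
    result = []
    for s, values in props.items():
        # Like the graph pattern, require both properties to be present.
        if "foaf:givenName" in values and "foaf:surname" in values:
            name = f'{values["foaf:surname"]}, {values["foaf:givenName"]}'
            result.append((s, "foaf:name", name))
    return result


source = [
    ("dbpedia:William_Shakespeare", "foaf:givenName", "William"),
    ("dbpedia:William_Shakespeare", "foaf:surname", "Shakespeare"),
]
print(merge_names(source))
```

A subject that lacks one of the two properties produces no output, mirroring how an unmatched graph pattern yields no solution.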

Aggregation based Mapping Pattern

Aggregation

fb:en.berlin_u-bahn dbpedia-owl:numberOfLines 2 .

fb:en.berlin_u-bahn fb:metropolitan_transit.transit_system.transit_lines fb:en.u1 ; fb:metropolitan_transit.transit_system.transit_lines fb:en.u3 .

Aggregate multiple values / occurrences into one value.

Aggregation

SPARQL Mapping:

CONSTRUCT {

?s dbpedia-owl:numberOfLines ?nrOfLines

} WHERE {

  {
    SELECT ?s (COUNT(?l) AS ?nrOfLines) {
      ?s fb:metropolitan_transit.transit_system.transit_lines ?l .
    }
    GROUP BY ?s
  }
}
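The subquery groups the matching triples by subject and counts them. The same grouping can be sketched in Python with a counter, assuming triples are tuples; the function name `count_lines` is hypothetical and the data is the example from above.

```python
from collections import Counter

TRANSIT_LINES = "fb:metropolitan_transit.transit_system.transit_lines"


def count_lines(triples, prop=TRANSIT_LINES):
    """Emit one numberOfLines triple per subject, like COUNT with GROUP BY ?s."""
    counts = Counter(s for s, p, _ in triples if p == prop)
    return [(s, "dbpedia-owl:numberOfLines", n) for s, n in counts.items()]


source = [
    ("fb:en.berlin_u-bahn", TRANSIT_LINES, "fb:en.u1"),
    ("fb:en.berlin_u-bahn", TRANSIT_LINES, "fb:en.u3"),
]
print(count_lines(source))
```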

Data Cleaning

Data Cleaning

fb:en.jimi_hendrix dbpedia-owl:birthDate "1942-11-27"^^xsd:date .

fb:en.jimi_hendrix fb:people.person.date_of_birth "1942-11-27" .

fb:en.josh_wink fb:people.person.date_of_birth "1970" .

Integrating data cleaning into a mapping.

Data Cleaning

SPARQL Mapping:

Construct {

?s dbpedia-owl:birthDate ?birthDate

} Where {

?s fb:people.person.date_of_birth ?bd .
FILTER regex(?bd, "^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$")
BIND ((xsd:date(?bd)) As ?birthDate)

}
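The cleaning step keeps only values that have the full year-month-day shape and converts those to dates. A minimal Python sketch of the same check is shown below. The helper name `clean_birth_date` is hypothetical, and `fullmatch` is used so that the whole string has to have the date shape.

```python
import re
from datetime import date

DATE_SHAPE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def clean_birth_date(value):
    """Return a date for well-formed values, None for everything else."""
    if not DATE_SHAPE.fullmatch(value):
        return None  # e.g. "1970": year only, so the value is dropped
    try:
        return date.fromisoformat(value)
    except ValueError:
        return None  # right shape, but not a real calendar date


for raw in ["1942-11-27", "1970"]:
    print(raw, "->", clean_birth_date(raw))
```

Values that fail the check are simply left out of the target data, just as the mapping emits no triple for them.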

Questions?