Improve Efficiency of Mapping Data Between XML and RDF with XSPARQL
Copyright 2011 Digital Enterprise Research Institute. All rights reserved.
Digital Enterprise Research Institute www.deri.ie
Int. Conf. on Web Reasoning and Rule Systems, August 28, 2011
Improve Efficiency of Mapping Data between XML and RDF with XSPARQL
Stefan Bischof, Nuno Lopes, and Axel Polleres
XSPARQL: Bridging the gap of XML and RDF
[Figure: diagram with the labels XML, XQuery, SPARQL, RDF and XSPARQL, plus an XML and FOAF example]
Problem: Evaluating Nested Graph Patterns
for $p $name from <persons.rdf>
where { $p a foaf:Person .
        $p foaf:name $name . }
return
  <person>
    <name>{ $name }</name>
    { for $fname from <persons.rdf>
      where { $p foaf:knows $friend . $friend foaf:name $fname . }
      return <friend>{ $fname }</friend> }
  </person>
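Why the nested pattern is expensive can be shown with a small Python model. This is only a sketch under stated assumptions: `CountingEngine`, its two methods and the toy triples are hypothetical stand-ins for a SPARQL engine and for persons.rdf, not part of XSPARQL. It counts one call for the outer pattern and one further call per person for the inner pattern.

```python
# Toy RDF data as (subject, predicate, object) triples; all values are invented.
TRIPLES = [
    ("alice", "rdf:type", "foaf:Person"),
    ("alice", "foaf:name", "Alice"),
    ("bob", "rdf:type", "foaf:Person"),
    ("bob", "foaf:name", "Bob"),
    ("carol", "rdf:type", "foaf:Person"),
    ("carol", "foaf:name", "Carol"),
    ("alice", "foaf:knows", "bob"),
    ("alice", "foaf:knows", "carol"),
    ("bob", "foaf:knows", "carol"),
]


class CountingEngine:
    """Hypothetical stand-in for a SPARQL engine that counts its calls."""

    def __init__(self, triples):
        self.triples = triples
        self.calls = 0

    def _names(self):
        # Map each subject to its foaf:name.
        return {s: o for s, p, o in self.triples if p == "foaf:name"}

    def persons(self):
        """Outer pattern: { $p a foaf:Person . $p foaf:name $name }"""
        self.calls += 1
        names = self._names()
        return [(s, names[s]) for s, p, o in self.triples
                if p == "rdf:type" and o == "foaf:Person"]

    def friend_names(self, person):
        """Inner pattern with $p fixed: { $p foaf:knows $friend . $friend foaf:name $fname }"""
        self.calls += 1
        names = self._names()
        return [names[o] for s, p, o in self.triples
                if s == person and p == "foaf:knows"]


def naive_nested(engine):
    """Evaluate the inner pattern once per binding of the outer pattern."""
    result = []
    for person, name in engine.persons():
        result.append((name, engine.friend_names(person)))
    return result


engine = CountingEngine(TRIPLES)
rows = naive_nested(engine)
print(rows)
print("engine calls:", engine.calls)  # 1 outer call + 1 call per person
```

With three persons the model makes four engine calls; the count grows linearly with the number of outer bindings.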
One Approach: Nested Loop Join in XQuery
friendlist :=
  for $fname from <persons.rdf>
  where { $p1 foaf:knows $friend .
          $friend foaf:name $fname . }
for $p $name from <persons.rdf>
where { $p a foaf:Person .
        $p foaf:name $name . }
return
  <person>
    <name>{ $name }</name>
    for $friend in friendlist
    where $p = $friend/$p1
    return <friend>{ $fname }</friend>
  </person>
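The same idea can be modelled in Python: fetch both result sets once, then join them in memory with a nested loop. This is a sketch only; the function name `nested_loop_join` and the sample rows are assumptions made for illustration, and the two input lists stand for the results of the two graph patterns above.

```python
def nested_loop_join(persons, friend_rows):
    """Join in memory: for every person, scan all friend rows.

    persons:     list of (p, name) from the outer graph pattern
    friend_rows: list of (p1, fname) from the friend-list graph pattern
    """
    result = []
    for p, name in persons:
        # Inner loop: keep the rows whose p1 equals the current person.
        friends = [fname for p1, fname in friend_rows if p1 == p]
        result.append((name, friends))
    return result


# Invented sample bindings, each list obtained with a single engine call.
persons = [("alice", "Alice"), ("bob", "Bob"), ("carol", "Carol")]
friend_rows = [("alice", "Bob"), ("alice", "Carol"), ("bob", "Carol")]

joined = nested_loop_join(persons, friend_rows)
print(joined)
```

The number of engine calls is now fixed at two, but the in-memory join still compares every person with every friend row.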
Evaluation Results
[Figure: time (sec, axis ticks 1 to 1000) against dataset size (MB, axis ticks 1 to 100) for the naive XSPARQL implementation, Nested Loop WHERE Clause, Nested Loop XPath, Sort-Merge, Merge Graph Patterns and Named Graph. Annotation: scales with number of saved SPARQL calls]
Future Work/My PhD Proposal
• Formalise the integrated language XSPARQL
– Formalism combining XQuery (functional) with SPARQL (rel. algebra)
• Optimise XSPARQL using this formal model
– Currently only manual optimisations
– Useful for any approach manipulating both XML and RDF data
• RDFS + OWL reasoning
– Add different kinds of reasoning to the formal model
– SPARQL 1.1 entailment regimes
XSPARQL: Bridging the gap of XML and RDF
‣ Language to map data between XML and RDF
‣ Combines the strengths of XQuery and SPARQL query languages
‣ Provides XQuery’s function library to SPARQL
‣ Provides SPARQL’s graph pattern matching facility to XQuery
Prototype: Rewrite XSPARQL to XQuery
‣ Uses standard XQuery and SPARQL engines
‣ Try the prototype: http://xsparql.deri.org/demo
Problem: Evaluating Nested Graph Patterns
‣ Loops with nested graph patterns result in a large number of interactions between XQuery and SPARQL engines
‣ Prototype evaluates such joins naively as nested loop join
‣ Prototype is unable to exploit high similarity of the SPARQL calls
Proposed Optimisations
‣ Minimise communication overhead for problematic queries
‣ Reduce the number of interactions between XQuery and SPARQL
‣ Perform only a static number of SPARQL calls by moving the join
‣ Move join to pure XQuery
– Nested loop join using an XQuery WHERE clause or XPath
– Tail recursive implementation of sort-merge join
‣ Move join to SPARQL
– Join by merging SPARQL graph patterns
– Join using named graph injection in triple store
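The sort-merge variant listed above can be sketched in Python. Note the assumptions: the poster describes a tail recursive XQuery implementation, whereas this sketch uses an iterative loop, which is the usual form in Python; the function name, the sample rows, and the requirement that each person key occurs only once on the left side are all illustrative choices.

```python
def sort_merge_join(persons, friend_rows):
    """Sort both inputs on the join key, then merge them in one pass.

    persons:     list of (p, name), with each p assumed to occur once
    friend_rows: list of (p1, fname)
    """
    left = sorted(persons)       # ordered by p
    right = sorted(friend_rows)  # ordered by p1
    result = []
    j = 0
    for p, name in left:
        # Skip friend rows whose key sorts before the current person.
        while j < len(right) and right[j][0] < p:
            j += 1
        # Collect the run of friend rows that match the current person.
        friends = []
        while j < len(right) and right[j][0] == p:
            friends.append(right[j][1])
            j += 1
        result.append((name, friends))
    return result


# Invented, deliberately unsorted sample bindings.
persons = [("bob", "Bob"), ("alice", "Alice"), ("carol", "Carol")]
friend_rows = [("bob", "Carol"), ("alice", "Carol"), ("alice", "Bob")]

merged = sort_merge_join(persons, friend_rows)
print(merged)
```

After sorting, each input is scanned only once, so the merge step avoids the quadratic comparison count of the nested loop.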
Evaluation: Optimisations on several data sizes
‣ XMark benchmarks for XQuery adapted to the XSPARQL use case
‣ Optimisations are applicable for the 3 slowest out of 20 queries
Results: XSPARQL can be faster
‣ The optimisations always performed better than standard XSPARQL
‣ SPARQL join optimisations were the fastest (when applicable)
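Moving the join into SPARQL by merging graph patterns can be pictured as follows. This is a hedged sketch, not the prototype's actual rewriting: the query text in `MERGED_QUERY` is an illustrative guess at what a merged pattern could look like and is never executed here, and `group_rows` with its sample rows is a hypothetical helper showing how the flat result of one call could be regrouped per person.

```python
from itertools import groupby

# Illustrative only: one possible merged graph pattern (assumption, not executed).
MERGED_QUERY = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?p ?name ?fname
WHERE {
  ?p a foaf:Person ; foaf:name ?name .
  OPTIONAL { ?p foaf:knows ?friend . ?friend foaf:name ?fname }
}
ORDER BY ?p
"""


def group_rows(rows):
    """Regroup flat (p, name, fname) rows, already ordered by p, per person.

    fname is None when a person has no friend (the OPTIONAL part did not match).
    """
    result = []
    for (_, name), group in groupby(rows, key=lambda r: (r[0], r[1])):
        friends = [fname for _, _, fname in group if fname is not None]
        result.append((name, friends))
    return result


# Invented rows, shaped like the result of a single call with the merged pattern.
flat_rows = [
    ("alice", "Alice", "Bob"),
    ("alice", "Alice", "Carol"),
    ("bob", "Bob", "Carol"),
    ("carol", "Carol", None),
]

grouped = group_rows(flat_rows)
print(grouped)
```

In this model the engine does the join itself and is called once; the remaining work is a single grouping pass over the ordered rows.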
Future Work: More Optimisations and Features
‣ Query also relational databases
‣ Create a concise formalisation of XSPARQL
‣ Exploit properties of XSPARQL fragments for optimisation
‣ Support SPARQL 1.1 and SPARQL 1.1 Entailment Regimes
More information http://xsparql.deri.org/
Acknowledgements
This work has been funded by Science Foundation Ireland, Grant No. SFI/08/CI/I1380 (Lion-2) and by an IRCSET scholarship.
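[Figure: prototype architecture with the components XSPARQL query, XSPARQL rewriter, XQuery query, XQuery engine, SPARQL engine, XML data and RDF data; the output is XML or RDF]
[Figure: diagram with the labels XML, XQuery, SPARQL, RDF and XSPARQL]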
Conclusion: Maintainable and Efficient Mapping
‣ Performance of standard XSPARQL is drastically reduced for queries containing nested graph patterns
‣ Performance of such queries improves with different optimisations
‣ XSPARQL can provide better performance than ad-hoc setups for mapping data between XML and RDF
[Figure: time (sec) against dataset size (MB) for the naive XSPARQL implementation, Nested Loop WHERE Clause, Nested Loop XPath, Sort-Merge, Merge Graph Patterns and Named Graph. Annotation: scales with number of saved SPARQL calls]
Questions about XSPARQL, syntax, semantics, implementation, prototype, optimisation, performance, RDF/XML …
… visit us at our poster in the afternoon!
Thanks for your attention!