Pal gov.tutorial2.session5 1.rdf_jarrar

Page 1: Pal gov.tutorial2.session5 1.rdf_jarrar

PalGov © 2011

The Palestinian eGovernment Academy (أكاديمية الحكومة الإلكترونية الفلسطينية)

www.egovacademy.ps

Tutorial II: Data Integration and Open Information Systems

RDF: Resource Description Framework

Session 5.1

Prof. Mustafa Jarrar

Sina Institute, University of Birzeit

[email protected]

www.jarrar.info

Reviewed by

Prof. Marco Ronchetti, Trento University, Italy

Page 2: Pal gov.tutorial2.session5 1.rdf_jarrar

About

This tutorial is part of the PalGov project, funded by the TEMPUS IV program of the Commission of the European Communities, grant agreement 511159-TEMPUS-1-2010-1-PS-TEMPUS-JPHES. The project website: www.egovacademy.ps

Project Consortium:

University of Trento, Italy
University of Namur, Belgium
Vrije Universiteit Brussel, Belgium
TrueTrust, UK
Birzeit University, Palestine (Coordinator)
Palestine Polytechnic University, Palestine
Palestine Technical University, Palestine
Université de Savoie, France
Ministry of Local Government, Palestine
Ministry of Telecom and IT, Palestine
Ministry of Interior, Palestine

Coordinator:

Dr. Mustafa Jarrar
Birzeit University, P.O. Box 14, Birzeit, Palestine
Telfax: +972 2 2982935 [email protected]

Page 3: Pal gov.tutorial2.session5 1.rdf_jarrar

© Copyright Notes

Everyone is encouraged to use this material, or part of it, but should properly cite the project (logo and website) and the author of that part.

No part of this tutorial may be reproduced or modified in any form or by any means without prior written permission from the project, which holds the full copyright on the material.

Attribution-NonCommercial-ShareAlike (CC-BY-NC-SA)

This license lets others remix, tweak, and build upon your work non-commercially, as long as they credit you and license their new creations under the identical terms.

Page 4: Pal gov.tutorial2.session5 1.rdf_jarrar

Tutorial Map

Topic | Hours
Session 1: XML Basics and Namespaces | 3
Session 2: XML DTDs | 3
Session 3: XML Schemas | 3
Session 4: Lab-XML Schemas | 3
Session 5: RDF and RDFs | 3
Session 6: Lab-RDF and RDFs | 3
Session 7: OWL (Web Ontology Language) | 3
Session 8: Lab-OWL | 3
Session 9: Lab-RDF Stores: Challenges and Solutions | 3
Session 10: Lab-SPARQL | 3
Session 11: Lab-Oracle Semantic Technology | 3
Session 12_1: The Problem of Data Integration | 1.5
Session 12_2: Architectural Solutions for the Integration Issues | 1.5
Session 13_1: Data Schema Integration | 1
Session 13_2: GAV and LAV Integration | 1
Session 13_3: Data Integration and Fusion using RDF | 1
Session 14: Lab-Data Integration and Fusion using RDF | 3
Session 15_1: Data Web and Linked Data | 1.5
Session 15_2: RDFa | 1.5
Session 16: Lab-RDFa | 3

Intended Learning Objectives

A: Knowledge and Understanding
2a1: Describe tree and graph data models.
2a2: Understand the notation of XML, RDF, RDFS, and OWL.
2a3: Demonstrate knowledge about querying techniques for these data models, such as SPARQL and XPath.
2a4: Explain the concepts of identity management and Linked Data.
2a5: Demonstrate knowledge about integration and fusion of heterogeneous data.

B: Intellectual Skills
2b1: Represent data using tree and graph data models (XML and RDF).
2b2: Describe data semantics using RDFS and OWL.
2b3: Manage and query data represented in RDF, XML, OWL.
2b4: Integrate and fuse heterogeneous data.

C: Professional and Practical Skills
2c1: Use Oracle Semantic Technology and/or Virtuoso to store and query RDF stores.

D: General and Transferable Skills
2d1: Working in a team.
2d2: Presenting and defending ideas.
2d3: Use of creativity and innovation in problem solving.
2d4: Develop communication skills and logical reasoning abilities.

Page 5: Pal gov.tutorial2.session5 1.rdf_jarrar

Reading

RDF Primer (http://www.w3.org/TR/rdf-primer/)

Please look only at:

Chapter 2: Making Statements About Resources
Chapter 3: An XML Syntax for RDF: RDF/XML
Chapter 5: Defining RDF Vocabularies: RDF Schema

Page 6: Pal gov.tutorial2.session5 1.rdf_jarrar

What is XML

• Provides a common syntax for marking up documents.
• Easy to exchange between computers (web).
• Data model: an XML document is an ordered labeled tree (or a collection of trees).
• W3C standard since 1998.

<bookInfo>
  <title>Orientalism</title>
  <author>
    <persName>
      <title>Prof.</title>
      <foreName>Edward</foreName>
      <surName>Said</surName>
      <roleName>University Professor
        <placeName>Columbia University</placeName>
      </roleName>
    </persName>
  </author>
</bookInfo>

[Figure: the document drawn as a tree rooted at bookInfo, with nodes title, author, persName, title, foreName, surName, roleName, and placeName.]

Page 7: Pal gov.tutorial2.session5 1.rdf_jarrar

XML Example

<address>
  <name>University of Birzeit</name>
  <street>Almarj 435</street>
  <town>Birzeit</town>
</address>

Page 8: Pal gov.tutorial2.session5 1.rdf_jarrar

Address Example: XML to XML

XML Markup 1:

<address>
  <name>University of Birzeit</name>
  <street>Almarj 435</street>
  <town>Birzeit</town>
</address>

XML Markup 2:

<address>
  <name>University of Birzeit</name>
  <place>
    <street>Almarj 435</street>
    <town>Birzeit</town>
  </place>
</address>

XML stylesheets are also usable to transform XML representations.
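The slide mentions XML stylesheets as the usual tool for this kind of transformation. As a rough illustration of the same idea, the following sketch uses Python's standard xml.etree.ElementTree instead of a stylesheet; the function name nest_place is a hypothetical helper chosen for this example, not part of the tutorial. It turns the flat address markup into the nested one by wrapping street and town in a place element.

```python
import xml.etree.ElementTree as ET

# Flat address markup, as on the slide.
FLAT_ADDRESS = """<address>
  <name>University of Birzeit</name>
  <street>Almarj 435</street>
  <town>Birzeit</town>
</address>"""


def nest_place(address: ET.Element) -> ET.Element:
    """Return a new <address> with <street> and <town> wrapped in <place>.

    Hypothetical helper for illustration only.
    """
    result = ET.Element("address")
    place = ET.Element("place")
    for child in address:
        copy = ET.Element(child.tag)
        copy.text = child.text
        if child.tag in ("street", "town"):
            place.append(copy)      # moved under <place>
        else:
            result.append(copy)     # kept directly under <address>
    result.append(place)
    return result


nested_address = nest_place(ET.fromstring(FLAT_ADDRESS))
# Prints the nested markup on a single line.
print(ET.tostring(nested_address, encoding="unicode"))
```

The data is the same before and after; only the tree shape changes, which is why two parties must agree on one markup (or on a transformation) before they can exchange such documents.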

Page 9: Pal gov.tutorial2.session5 1.rdf_jarrar

Why XML is Not Enough

• It provides syntax, but not semantics, which are important when exchanging/representing data over the Web.
• Not primitive: the same data can be represented in many ways, which is a problem when exchanging/representing data over the Web.

<aaaa>
  <bbbb>University of Birzeit</bbbb>
  <cccc>Almarj 435</cccc>
  <dddd>Birzeit</dddd>
</aaaa>

<address>
  <name>University of Birzeit</name>
  <street>Almarj 435</street>
  <town>Birzeit</town>
</address>

<address name="University of Birzeit">
  <street>Almarj 435</street>
  <town>Birzeit</town>
</address>
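To make the "many representations" problem concrete, here is a minimal sketch, assuming the three markups of this slide are stored in Python strings (the constant and function names are made up for the example). It shows that code written for one representation silently finds nothing in the others, even though all three carry the same data.

```python
import xml.etree.ElementTree as ET

# The three representations of the same address (names are illustrative).
AS_ELEMENTS = ("<address><name>University of Birzeit</name>"
               "<street>Almarj 435</street><town>Birzeit</town></address>")
AS_ATTRIBUTE = ('<address name="University of Birzeit">'
                "<street>Almarj 435</street><town>Birzeit</town></address>")
OPAQUE_TAGS = ("<aaaa><bbbb>University of Birzeit</bbbb>"
               "<cccc>Almarj 435</cccc><dddd>Birzeit</dddd></aaaa>")


def name_from_child(document: str):
    """Read the name assuming it is stored in a <name> child element."""
    return ET.fromstring(document).findtext("name")


def name_from_attribute(document: str):
    """Read the name assuming it is stored in a name="..." attribute."""
    return ET.fromstring(document).get("name")


print(name_from_child(AS_ELEMENTS))       # University of Birzeit
print(name_from_child(AS_ATTRIBUTE))      # None
print(name_from_attribute(AS_ATTRIBUTE))  # University of Birzeit
print(name_from_child(OPAQUE_TAGS))       # None
```

All three documents are well-formed XML, so a parser accepts each of them; nothing in the syntax tells a program that bbbb, name, and the name attribute mean the same thing.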

Page 10: Pal gov.tutorial2.session5 1.rdf_jarrar

What is RDF?

• RDF stands for Resource Description Framework.
• It is used for describing resources on the web.
• It uses URIs to identify web resources and is commonly written in XML.
• It is not a language but a framework.
• You can see it as a way of writing XML that makes it meaningful and more primitive.
• You can also see it as independent of XML: RDF data might never occur in XML form.
• W3C standard since 1999.

Page 11: Pal gov.tutorial2.session5 1.rdf_jarrar

Designed to be read by Computers

• RDF was designed to provide a common way to describe information so it can be read (and understood) by computer applications.
• RDF descriptions are not designed to be displayed on the web.

Page 12: Pal gov.tutorial2.session5 1.rdf_jarrar

Makes use of URIs

• RDF uses a URI (Uniform Resource Identifier) to identify a web resource, and properties to describe the resource.
• Unlike URLs, URIs are not limited to identifying things that have a network location.
• A URI reference (URIref) is a URI, together with an optional fragment identifier at the end, for example: http://www.example.org/index.html#section2
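The split between the URI and its optional fragment identifier can be seen with Python's standard urllib.parse.urldefrag. This is only a small illustration; the variable names are arbitrary.

```python
from urllib.parse import urldefrag

# The URIref from the slide: a URI plus the fragment identifier "section2".
uriref = "http://www.example.org/index.html#section2"
uri, fragment = urldefrag(uriref)
print(uri)       # http://www.example.org/index.html
print(fragment)  # section2

# A URIref without a fragment identifier yields an empty fragment.
plain_uri, no_fragment = urldefrag("http://www.example.org/index.html")
print(repr(no_fragment))  # ''
```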

Page 13: Pal gov.tutorial2.session5 1.rdf_jarrar

Example

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:a="http://www.example.com/"
         xmlns:w="http://en.wikipedia.org/wiki/">
  <rdf:Description rdf:about="http://www.amazon.com/Orientalism-Edward-W-Said/dp/039474067X">
    <a:Title>Orientalism</a:Title>
    <a:Price>11$</a:Price>
    <a:Author>Edward_Said</a:Author>
  </rdf:Description>
</rdf:RDF>

[Figure: the description drawn as a graph. The resource http://www.amazon.com/Orientalism-Edward-W-Said/dp/039474067X is linked to the literals Orientalism, 11$, and Edward Said; the legible edge label is :Author.]

Page 14: Pal gov.tutorial2.session5 1.rdf_jarrar

Example

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:a="http://www.example.com/"
         xmlns:w="http://en.wikipedia.org/wiki/">
  <rdf:Description rdf:about="http://www.amazon.com/Orientalism-Edward-W-Said/dp/039474067X">
    <a:Title>Orientalism</a:Title>
    <a:Price>11$</a:Price>
    <a:Author>
      <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Edward_Said">
        <a:BirthCity>Jerusalem</a:BirthCity>
        <a:BornAt>1/11/1935</a:BornAt>
        <a:DiedAt>25/9/2003</a:DiedAt>
      </rdf:Description>
    </a:Author>
  </rdf:Description>
</rdf:RDF>

[Figure: the description drawn as a graph. The resource http://www.amazon.com/Orientalism-Edward-W-Said/dp/039474067X is linked to the literals Orientalism and 11$, and through a:Author to the resource http://en.wikipedia.org/wiki/Edward_Said, which is linked to the literals Edward Saïd, Jerusalem (a:BirthCity), 1/11/1935, and 25/9/2003. The figure also carries the label <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>.]

Page 15: Pal gov.tutorial2.session5 1.rdf_jarrar

Example

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:a="http://www.example.com#"
         xmlns:w="http://en.wikipedia.org/wiki/">
  <rdf:Description rdf:about="http://www.amazon.com/Orientalism-Edward-W-Said/dp/039474067X">
    <a:Title>Orientalism</a:Title>
    <a:Price>11$</a:Price>
    <a:Author>
      <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Edward_Said">
        <a:BornAt>1/11/1935</a:BornAt>
        <a:DiedAt>25/9/2003</a:DiedAt>
        <a:BirthCity>
          <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Jerusalem">
            <a:Population>760800</a:Population>
            <a:CapitalOf>Palestine</a:CapitalOf>
          </rdf:Description>
        </a:BirthCity>
      </rdf:Description>
    </a:Author>
  </rdf:Description>
</rdf:RDF>

[Figure: the description drawn as a graph. The resource http://www.amazon.com/Orientalism-Edward-W-Said/dp/039474067X is linked to the literals Orientalism and 11$, and through :Author to http://en.wikipedia.org/wiki/Edward_Said, which is linked to the literals Edward Saïd, 1/11/1935, and 25/9/2003, and through :BirthCity to http://en.wikipedia.org/wiki/Jerusalem, which is linked to the literals 760800 and Palestine.]
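The nesting in this example corresponds to a chain of statements in the graph. The following sketch, which is only an illustration and not a full RDF/XML parser, walks a shortened copy of the example with Python's standard xml.etree.ElementTree and lists the resulting (subject, predicate, object) triples. It assumes every resource is written as rdf:Description with a full URI in rdf:about; the helper names expand and triples are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DESCRIPTION = "{" + RDF_NS + "}Description"
ABOUT = "{" + RDF_NS + "}about"

# Shortened version of the slide's example (assumption: full URIs in rdf:about).
DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:a="http://www.example.com#">
  <rdf:Description rdf:about="http://www.amazon.com/Orientalism-Edward-W-Said/dp/039474067X">
    <a:Title>Orientalism</a:Title>
    <a:Author>
      <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Edward_Said">
        <a:BirthCity>
          <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Jerusalem">
            <a:Population>760800</a:Population>
          </rdf:Description>
        </a:BirthCity>
      </rdf:Description>
    </a:Author>
  </rdf:Description>
</rdf:RDF>"""


def expand(tag: str) -> str:
    """Turn ElementTree's '{namespace}local' tag into the full property URI."""
    namespace, local = tag[1:].split("}")
    return namespace + local


def triples(description: ET.Element):
    """Yield the triples of one rdf:Description, recursing into nested ones."""
    subject = description.get(ABOUT)
    for prop in description:
        predicate = expand(prop.tag)
        nested = prop.find(DESCRIPTION)
        if nested is None:
            # Plain text content: the object is a literal.
            yield (subject, predicate, (prop.text or "").strip())
        else:
            # Nested description: the object is that resource's URI.
            yield (subject, predicate, nested.get(ABOUT))
            yield from triples(nested)


root = ET.fromstring(DOC)
graph = [t for d in root.findall(DESCRIPTION) for t in triples(d)]
for triple in graph:
    print(triple)
```

Each nested rdf:Description becomes the object of the enclosing property and the subject of its own statements, which is how the tree-shaped XML encodes a graph.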

Page 16: Pal gov.tutorial2.session5 1.rdf_jarrar

Another Example

Page 17: Pal gov.tutorial2.session5 1.rdf_jarrar

Directed Graph Representation

Page 18: Pal gov.tutorial2.session5 1.rdf_jarrar

Simple Example

<?xml version="1.0"?>
<RDF>
  <Description about="http://www.w3schools.com/default.asp">
    <author>Jan Egil Refsnes</author>
    <created>November 1, 1999</created>
    <modified>February 1, 2004</modified>
  </Description>
</RDF>

Page 19: Pal gov.tutorial2.session5 1.rdf_jarrar

Example Explained

• In the example:
– the URI "http://www.w3schools.com/default.asp" is used to identify a web page,
– the property "author" describes the author of the page,
– the property value is "Jan Egil Refsnes",
– the property "created" tells when the page was created, and the property "modified" when it was last modified.

Page 20: Pal gov.tutorial2.session5 1.rdf_jarrar

Subject, Predicate & Object

• RDF terminology also uses the words subject, predicate, and object.
• The resource http://www.w3schools.com/default.asp is the subject.
• The property "author" is the predicate.
• The value "Jan Egil Refsnes" is the object.

Page 21: Pal gov.tutorial2.session5 1.rdf_jarrar

RDF Important Concepts

• Data is represented in RDF as a directed labeled graph.
• An RDF graph is a set of triples of the form <Subject, Predicate, Object>.
• Each Subject and each Predicate must be a URI; that is, it has to be a unique identifier, not a literal. An Object can be either a URI or a literal.
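These rules can be written down as a small data structure. The sketch below is one possible way to do it, not a standard API: the Triple class, the make_triple function, and the rough "has a scheme" URI check are all assumptions of this example, and the predicate URI http://www.example.com/author is made up because the simplified w3schools example declares no namespace.

```python
from typing import NamedTuple
from urllib.parse import urlparse


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str
    obj_is_literal: bool


def looks_like_uri(value: str) -> bool:
    """Rough check (assumption of this sketch): a URI has a scheme such as 'http'."""
    return bool(urlparse(value).scheme)


def make_triple(subject: str, predicate: str, obj: str,
                obj_is_literal: bool = False) -> Triple:
    """Build a triple, enforcing the rules stated on the slide."""
    if not looks_like_uri(subject):
        raise ValueError(f"subject must be a URI: {subject!r}")
    if not looks_like_uri(predicate):
        raise ValueError(f"predicate must be a URI: {predicate!r}")
    if not obj_is_literal and not looks_like_uri(obj):
        raise ValueError(f"non-literal object must be a URI: {obj!r}")
    return Triple(subject, predicate, obj, obj_is_literal)


page = "http://www.w3schools.com/default.asp"
author = "http://www.example.com/author"  # hypothetical property URI

# An RDF graph is a set of triples, so adding the same statement twice
# leaves the graph unchanged.
graph = set()
graph.add(make_triple(page, author, "Jan Egil Refsnes", obj_is_literal=True))
graph.add(make_triple(page, author, "Jan Egil Refsnes", obj_is_literal=True))
print(len(graph))  # 1
```

Trying to use a literal as a subject, for example make_triple("Jan Egil Refsnes", author, page), raises ValueError in this sketch.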

Page 22: Pal gov.tutorial2.session5 1.rdf_jarrar

Example 2

<?xml version="1.0"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:cd="http://www.recshop.fake/cd">
  <rdf:Description
    rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
    <cd:artist>Bob Dylan</cd:artist>
    <cd:country>USA</cd:country>
    <cd:company>Columbia</cd:company>
    <cd:price>10.90</cd:price>
    <cd:year>1985</cd:year>
  </rdf:Description>
. . . . . . . . . . . .

Page 23: Pal gov.tutorial2.session5 1.rdf_jarrar

Example 2 (cont)

  <rdf:Description
    rdf:about="http://www.recshop.fake/cd/Hide your heart">
    <cd:artist>Bonnie Tyler</cd:artist>
    <cd:country>UK</cd:country>
    <cd:company>CBS Records</cd:company>
    <cd:price>9.90</cd:price>
    <cd:year>1988</cd:year>
  </rdf:Description>
</rdf:RDF>

Page 24: Pal gov.tutorial2.session5 1.rdf_jarrar

List of Triples (Table Representation)

Subject | Predicate | Object
http://www.recshop.fake/cd/Empire Burlesque | http://www.recshop.fake/cdartist | "Bob Dylan"
http://www.recshop.fake/cd/Empire Burlesque | http://www.recshop.fake/cdcountry | "USA"
http://www.recshop.fake/cd/Empire Burlesque | http://www.recshop.fake/cdcompany | "Columbia"
http://www.recshop.fake/cd/Empire Burlesque | http://www.recshop.fake/cdprice | "10.90"
http://www.recshop.fake/cd/Empire Burlesque | http://www.recshop.fake/cdyear | "1985"

Page 25: Pal gov.tutorial2.session5 1.rdf_jarrar

Details of the Example

• The first line in the XML file is the XML declaration, which gives the XML version.
• The rdf:RDF element indicates that the content is RDF.
• The xmlns:rdf declaration specifies that tags with the rdf: prefix are from the namespace "http://www.w3.org/1999/02/22-rdf-syntax-ns#".
• The xmlns:cd declaration specifies that tags with the cd: prefix are from the namespace "http://www.recshop.fake/cd".
• The rdf:Description element contains the description of the resource identified by its rdf:about attribute.
• The cd:artist element describes a property of the resource, and so do cd:country, cd:company, and the other cd: elements.
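A prefixed name such as cd:artist stands for a full URI, obtained by appending the local name to the namespace URI bound to the prefix. The sketch below shows that expansion; the function name expand_qname and the NAMESPACES table are illustrative choices, with the two namespace URIs taken from the example.

```python
# Prefix-to-namespace bindings from the example's xmlns declarations.
NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "cd": "http://www.recshop.fake/cd",
}


def expand_qname(qname: str, namespaces=NAMESPACES) -> str:
    """Expand 'prefix:local' to namespace URI + local name (hypothetical helper)."""
    prefix, local = qname.split(":", 1)
    return namespaces[prefix] + local


print(expand_qname("cd:artist"))  # http://www.recshop.fake/cdartist
print(expand_qname("rdf:type"))   # http://www.w3.org/1999/02/22-rdf-syntax-ns#type
```

Because the cd namespace URI in this example does not end in "/" or "#", the expansion yields http://www.recshop.fake/cdartist with no separator, which is the predicate that appears in the table of triples.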

Page 26: Pal gov.tutorial2.session5 1.rdf_jarrar

Main RDF Properties, Elements, and Attributes

Main RDF Properties

• rdf:subject    The subject of a (reified) RDF statement
• rdf:predicate  The predicate of a (reified) RDF statement
• rdf:object     The object of a (reified) RDF statement
• rdf:type       States that the resource is an instance of a class

Main RDF Elements

• rdf:RDF          The root element of an RDF/XML document
• rdf:Description  Container for the description of a resource

Main RDF Attributes

• rdf:about     Identifies the resource being described
• rdf:resource  Identifies the resource that is the value of a property
• rdf:datatype  Gives the datatype of a property's literal value

Page 27: Pal gov.tutorial2.session5 1.rdf_jarrar

RDF Validator

• Check the correctness of an RDF document: http://www.w3.org/RDF/Validator/
• The result shows the subject, predicate, and object of each statement in the document, and a graph of the model.

Page 28: Pal gov.tutorial2.session5 1.rdf_jarrar

XML into RDF (Example)

<XML>
  <Article ID="B1" Year="2000">
    <Author ID="A1">
      <Name>Tom</Name>
      <Affiliation ID="UoM">
        <Name>University of Malta</Name>
        <Country ID="mt">
          <Name>Malta</Name>
          <Capital>Valletta</Capital>
        </Country>
      </Affiliation>
    </Author>
  </Article>
  <Article ID="B2" Year="2005">
    <Author href="#A1"/>
    <Author ID="A2">
      <Name>Bob</Name>
      <Affiliation ID="UoC">
        <Name>University of Cyprus</Name>
        <Country ID="cy">
          <Name>Cyprus</Name>
          <Capital>Nicosia</Capital>
        </Country>
      </Affiliation>
    </Author>
  </Article>
  <Article ID="B3" Year="2007">
    <Author ID="A3">
      <email>ps@uoc</email>
      <Affiliation href="#UoC"/>
    </Author>
  </Article>
  <Article ID="B4" Year="2008">
    <Publisher ID="ACM"/>
  </Article>
</XML>

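[Figure: the XML document above (panel "XML") redrawn as an RDF graph (panel "RDF"). Resource nodes: Root, B1, B2, B3, B4, A1, A2, A3, UoM, UoC, mt, cy, ACM. Literal values: 2000, 2005, 2007, 2008, "Tom", "Bob", "University of Malta", "University of Cyprus", "Malta", "Valletta", "Cyprus", "Nicosia", and the email address of A3. Edge labels: Article, Author, Affiliation, Country, Employs.]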

Page 29: Pal gov.tutorial2.session5 1.rdf_jarrar

Database into RDF (Example)

Person
ID | Name | Affiliation
A1 | Tom Lara | UoM
A2 | Bob Hacker | UoC

Article
ID | Title | Year
B1 | Data Web | 2007
B2 | Semantic Web | 2005

Author
Article | Person
B1 | A1
B1 | A2
B2 | A2

Country
ID | Name
mt | Malta
cy | Cyprus

Affiliation
ID | Country
UoM | mt
UoC | cy

RDF
S | P | O
:B1 | rdf:type | :Article
:B1 | :Title | "Data Web"
:B1 | :Year | 2007
:B2 | rdf:type | :Article
:B2 | :Title | "Semantic Web"
:B2 | :Year | 2005
:B1 | :Author | :A1
:B1 | :Author | :A2
:B2 | :Author | :A2
:A1 | rdf:type | :Person
:A1 | :Name | "Tom Lara"
:A1 | :Affiliation | :UoM
:A2 | rdf:type | :Person
:A2 | :Name | "Bob Hacker"
:A2 | :Affiliation | :UoC
:UoM | rdf:type | :University
:UoM | :Country | :mt
:mt | rdf:type | :Country
:mt | :Name | "Malta"
:UoC | rdf:type | :University
:UoC | :Country | :cy
:cy | rdf:type | :Country
:cy | :Name | "Cyprus"
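The mapping from tables to triples follows a simple pattern: each row becomes a resource, the table gives its type, each remaining column becomes a property, and a link table becomes one triple per row. The sketch below applies that pattern to the Person, Article, and Author tables; it is an illustration under these assumptions, and the function name rows_to_triples and the reference_columns parameter are made up for the example. The author links follow the Author table.

```python
# Rows of the Person, Article, and Author tables from the slide.
PERSON = [
    {"ID": "A1", "Name": "Tom Lara", "Affiliation": "UoM"},
    {"ID": "A2", "Name": "Bob Hacker", "Affiliation": "UoC"},
]
ARTICLE = [
    {"ID": "B1", "Title": "Data Web", "Year": 2007},
    {"ID": "B2", "Title": "Semantic Web", "Year": 2005},
]
AUTHOR = [("B1", "A1"), ("B1", "A2"), ("B2", "A2")]  # (Article, Person)


def rows_to_triples(rows, class_name, reference_columns=()):
    """Yield triples for each row: one rdf:type triple plus one per column.

    Columns listed in reference_columns hold keys of other rows, so their
    values become resources (':UoM') instead of literals.
    """
    for row in rows:
        subject = ":" + row["ID"]
        yield (subject, "rdf:type", ":" + class_name)
        for column, value in row.items():
            if column == "ID":
                continue
            obj = ":" + value if column in reference_columns else value
            yield (subject, ":" + column, obj)


triples = list(rows_to_triples(PERSON, "Person", reference_columns={"Affiliation"}))
triples += rows_to_triples(ARTICLE, "Article")
# A link table has no resource of its own: each row is a single triple.
triples += [(":" + article, ":Author", ":" + person) for article, person in AUTHOR]

for triple in triples:
    print(triple)
```

The same function could be applied to the Country and Affiliation tables; the point of the design is that foreign keys turn into edges between resources, while ordinary column values stay literals.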