GetLOD - Linked Open Data and Spatial Data Infrastructures W3C Linked Open Data LOD2014 Roma, 20-21 February 2014 Stefano Pezzi, Massimo Zotti, Giovanni Ciardi, Massimo Fustini


Page 1

GetLOD - Linked Open Data and Spatial Data Infrastructures

W3C Linked Open Data LOD2014

Roma, 20-21 February 2014

Stefano Pezzi, Massimo Zotti, Giovanni Ciardi, Massimo Fustini

Page 2

Agenda

• Context

• Geoportal & OpenData Portal

• SDI management

• Towards Linked Open geoData

• GetLOD: Open GeoData Solution

Page 3

Context

• Local, interoperable Geo-Information (GI) is crucial for a growing number of added-value services that private companies build on top of “open government data”.

• Local governments are currently playing an emerging role as authoritative sources of high-quality certified data, both for interlinking external information and for smart city applications.

• In Europe, the main drivers for interoperable and open data are the INSPIRE and Public Sector Information Directives and Open Data strategies at various levels.

Page 4

Context

• Geographical datasets are usually provided “quick-and-dirty”, as simple, flat, predefined files with heterogeneous data models, semantics and content.

• Four critical issues:

– Local data has to be published on different infrastructures;

– SDI and LOD infrastructures are not interoperable;

– Two parallel workflows bring a risk of additional workload and of data quality problems;

– GI lacks persistent URIs, so information cannot easily be linked at record level.

Page 5

GetLOD: Open GeoData Solution

GetLOD is an open and reusable solution for publishing geographic data on the Web as Linked Open Data, in the standard RDF/XML format.

GetLOD thus ensures that geospatial data and its related metadata are published on the Web as open and linkable data, starting from traditional cartographic web services.

Page 6

Geoportal & OpenData Portal

• The Geoportal is an important part of the Open Data policy of the Emilia-Romagna Region.

• Strongly integrated with the regional Open Data portal, the Geoportal supplies (geo)data to the portal dati.emilia-romagna.it.

Page 7

ER Geoportal

• The ER Geoportal supports the dissemination, distribution and use of geospatial data, information and geographical services, both for the public and for the staff of local and national government.

• It complies with the latest regional, national (AGiD) and international (INSPIRE, CEN, ISO, OGC) interoperability standards.

Page 8

ER Geoportal

Home Page of the ER Geoportal

Page 9

ER Geoportal

The data catalog

Page 10

ER Geoportal

Example of ISO Metadata

Page 11

Regional SDI management

Moka is a suite for organizing the Geographic Information System and developing applications that provide GIS services to citizens, professionals and businesses.

Regione Emilia-Romagna organizes its SDI with Moka Catalog and builds GIS applications with Moka CMS (Content Management System).

Page 12

SDI and LOD will interoperate through Moka

In Regione Emilia-Romagna, SDI and LOD infrastructures will interoperate through Moka.

• Moka Content Management System organizes the SDI and builds GIS applications (web, desktop, smartphone apps).

• Moka Catalog organizes the whole SDI.

• Moka builds open data using “GetLOD” services.

Page 13

SDI and OpenData will interoperate through Moka

How Moka (CMS GIS) helps users create OpenData through GetLOD:

• In Moka Catalog, the user selects the geodata to be published as Open Data.

• Moka Catalog invokes GetLOD services to create the Open Data.

• The Open Data are catalogued in the Moka repository.

• From Moka, users can manage updates to the Open Data.

Page 14

SDI and OpenData will interoperate through Moka

In Regione Emilia-Romagna, Moka will:

Organize SDI

• GeoData

• RDBMS and tables

• Web services

• Metadata (RNDT, Metadata RER)

• OpenData

• Functions and applications

Create applications with data and OpenData

• Web applications

• Desktop applications

• Apps for smartphone

Create OpenData

• Uses GetLOD services

Catalog OpenData

Page 15

Towards Linked Open geo-Data

Data in isolation have little value.

The value of data increases when different datasets, produced and published independently by different parties, can be freely combined by third parties.

Generating datasets in RDF format (Linked Data) increases the value of the data by allowing connections among the datasets themselves and with external datasets.

Page 16

Towards Linked Open geo-Data

• Free data is not enough! To offer a really useful service to citizens, institutions and companies, you need to aggregate and process data and offer them as services.

• Creating an "ontology network" of the Geoportal data makes it possible to move from one conceptual dataset to another.

• Ontologies are considered one of the pillars of the Semantic Web: a number of ongoing initiatives in EU Member States and EU projects (such as InGeoCloudS, GeoKnow and SmartOpenData) are creating RDF vocabularies based on the INSPIRE data models.

Page 17

Towards Linked Open geo-Data

Figure: integration at the data level; applications built on top of the explicit conceptual model.

Page 18

GetLOD: Open GeoData Solution

The focus of GetLOD is on the governance of Linked Open Data from authoritative sources: data about addresses and buildings derive from municipal registers (e.g. building permits), are provided by more than 200 municipalities and 9 provinces, and are gathered by the Region in the DBTR (Regional Topographic DB).

Page 19

GetLOD: Open GeoData Solution

Open and reusable solution

It is integrated with the Spatial Data Infrastructure through the WFS and CSW standards defined by the Open Geospatial Consortium (OGC).

It publishes geographic open data both as RDF (Linked Open Data) and, as a side effect, in other non-linkable interchange formats (Shapefile and GML).

Page 20

GetLOD: Open GeoData Solution

Figure: architecture diagram. Labels recovered from the slide: GeoRepository; GI Middleware; GI Data & Metadata; OGC server (OGC WFS); MD server (OGC CSW); MD 19115; LOD Back-end; Java API; connectors; mapping file; GetLOD; RDF dump; TripleStore; LOD Front-end; Download; Triple server; www; MD catalog; Open Data catalog; cataloguing; Search API.

Page 21

GetLOD: Open GeoData Solution

GetLOD is essentially a batch RDFizer that extracts data from OGC services and transforms them into RDF/XML.

It is a Java application that can execute scheduled transformation jobs.

A mapping file between GML elements and ontology concepts controls the transformation.

The core transformer is based on Apache Velocity.

Data as well as metadata are transformed into RDF graphs.
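The mapping-driven transformation described on this slide could look roughly like the sketch below. It is only an illustration in Python, not GetLOD's Java code: `string.Template` stands in for Apache Velocity, and the mapping table, the `ex:` ontology namespace, the GML element names, the sample feature and the base URI are all hypothetical.

```python
import xml.etree.ElementTree as ET
from string import Template
from xml.sax.saxutils import escape, quoteattr

GML_NS = "http://www.opengis.net/gml"

# Hypothetical mapping: GML property (local element name) -> ontology predicate.
MAPPING = {"nome": "ex:name", "uso": "ex:use"}

# Template playing the role a Velocity template would play (assumption).
RDF_TEMPLATE = Template(
    '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"\n'
    '         xmlns:ex="http://example.org/ontology#">\n'
    '  <rdf:Description rdf:about=$about>\n'
    '$properties\n'
    '  </rdf:Description>\n'
    '</rdf:RDF>\n'
)

# Hypothetical GML feature, as a WFS might return it.
SAMPLE_FEATURE = (
    '<app:Edificio xmlns:app="http://example.org/app" '
    'xmlns:gml="http://www.opengis.net/gml" gml:id="ED_001">'
    '<app:nome>Palazzo Comunale</app:nome>'
    '<app:uso>public</app:uso>'
    '<app:interno>not mapped</app:interno>'
    '</app:Edificio>'
)


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def feature_to_rdfxml(feature_xml, base_uri):
    """Turn one GML feature into RDF/XML, keeping only mapped properties."""
    feature = ET.fromstring(feature_xml)
    feature_id = feature.attrib[f"{{{GML_NS}}}id"]
    properties = []
    for child in feature:
        predicate = MAPPING.get(local_name(child.tag))
        if predicate and child.text:
            value = escape(child.text.strip())
            properties.append(f"    <{predicate}>{value}</{predicate}>")
    return RDF_TEMPLATE.substitute(
        about=quoteattr(base_uri + feature_id),
        properties="\n".join(properties),
    )
```

Calling `feature_to_rdfxml(SAMPLE_FEATURE, "http://example.org/id/")` yields one `rdf:Description` per feature; properties that are absent from the mapping are simply skipped, so the mapping table alone decides what ends up in the graph.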

Page 22

GetLOD: Open GeoData Solution

GetLOD has a plug-in architecture for the output destination of the extracted data, so you can:

• Create a dump file;

• Transfer the file and index it on the ER custom open data portal;

• Load the data into standard (CKAN) open data portals using APIs;

• Load the data into a SPARQL endpoint;

• ...
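A plug-in architecture of this kind could be sketched as below. This is an assumption about the general shape, written in Python for brevity; the class and function names are hypothetical and do not come from GetLOD. Only the dump-file destination is implemented, and an in-memory class stands in for remote destinations such as a CKAN portal or a SPARQL endpoint.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class OutputPlugin(ABC):
    """Hypothetical contract every output destination implements."""

    @abstractmethod
    def publish(self, dataset_name: str, rdf_xml: str) -> str:
        """Deliver the RDF and return a reference to where it went."""


class DumpFilePlugin(OutputPlugin):
    """Writes the RDF/XML to a dump file in a given directory."""

    def __init__(self, directory):
        self.directory = Path(directory)

    def publish(self, dataset_name, rdf_xml):
        path = self.directory / f"{dataset_name}.rdf"
        path.write_text(rdf_xml, encoding="utf-8")
        return str(path)


class InMemoryPlugin(OutputPlugin):
    """Stand-in for a remote destination (portal API, SPARQL endpoint)."""

    def __init__(self):
        self.store = {}

    def publish(self, dataset_name, rdf_xml):
        self.store[dataset_name] = rdf_xml
        return f"memory:{dataset_name}"


def run_job(dataset_name, rdf_xml, plugins):
    """Send one transformation result to every configured destination."""
    return [plugin.publish(dataset_name, rdf_xml) for plugin in plugins]
```

The point of the design is that the transformation job does not know where its output goes: adding a new destination means adding one class, not touching the job.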

Page 23

GetLOD: Open GeoData Solution

The ontologies used by GetLOD have been derived directly from the conceptual model of the Topographic DB or, more precisely, from the dissemination model of the DBTR.

We did not start from scratch, asking ourselves “what is a building?”. As a result, the mapping of concepts was fairly direct.

Nevertheless, existing ontologies have been reused where possible, especially for geometry.

Particular attention has been paid to how geometry is actually used in LOD data; we reasoned about it and drew some conclusions…

Page 24

GetLOD: Open GeoData Solution

Geometry

1. LOD data are mostly used in mash-up apps, which are likely to use common map APIs

WGS84 instead of the official regional SRS

Page 25

GetLOD: Open GeoData Solution

Geometry

2. If XML is verbose, RDF is really prolix

In a LOD context, location is usually more important than shape

No complex geometry, only simple, derived geometry: centroids for buildings, bounding boxes for Administrative Units
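Deriving these simplified geometries is straightforward. The sketch below is a hypothetical Python illustration, not GetLOD code: it computes the area-weighted centroid of a polygon ring with the shoelace formula, and a bounding box as the minimum and maximum coordinates. It treats coordinates as planar, which is only an approximation for longitude/latitude values.

```python
def bounding_box(ring):
    """Return (min_x, min_y, max_x, max_y) for a list of (x, y) points."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return (min(xs), min(ys), max(xs), max(ys))


def centroid(ring):
    """Area-weighted centroid of a simple polygon ring (shoelace formula)."""
    points = list(ring)
    if points[0] == points[-1]:
        points = points[:-1]  # drop the closing point of a closed ring
    twice_area = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:] + points[:1]):
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0:
        raise ValueError("degenerate ring: zero area")
    # Area A = twice_area / 2, so dividing by 6A is dividing by 3 * twice_area.
    return (cx / (3 * twice_area), cy / (3 * twice_area))
```

For a 4 by 2 rectangle with one corner at the origin, `centroid` returns the point (2, 1) and `bounding_box` returns (0, 0, 4, 2), so a building collapses to two numbers and an administrative unit to four.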

Page 26

GetLOD: Open GeoData Solution

Geometry

3. OGC services are still there, but let’s use them only when we need them.

Link to WFS GetFeature for the full geometry

If an app needs to draw the shape of a particular building, the RDF carries the GetFeatureByID query as the value of a specific ontology predicate.
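As an illustration, such a link could be built as in the Python sketch below. It assumes a WFS 2.0.0 service and the standard `GetFeatureById` stored query; the slide does not say which WFS version or request form GetLOD uses. The endpoint, the predicate URI `ex:fullGeometry` and the resource URIs are hypothetical.

```python
from urllib.parse import urlencode

# Identifier of the GetFeatureById stored query defined by WFS 2.0.
STORED_QUERY_ID = "urn:ogc:def:query:OGC-WFS::GetFeatureById"

# Hypothetical predicate; the real ontology predicate is not named on the slide.
EX_FULL_GEOMETRY = "http://example.org/ontology#fullGeometry"


def get_feature_by_id_url(wfs_endpoint, feature_id):
    """Build a WFS 2.0.0 KVP request that returns one feature by its id."""
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "storedquery_id": STORED_QUERY_ID,
        "id": feature_id,
    }
    return f"{wfs_endpoint}?{urlencode(params)}"


def full_geometry_triple(subject_uri, wfs_endpoint, feature_id):
    """Return an N-Triples line linking a resource to its full geometry."""
    url = get_feature_by_id_url(wfs_endpoint, feature_id)
    return f"<{subject_uri}> <{EX_FULL_GEOMETRY}> <{url}> ."
```

A client that only needs the location never touches the WFS; one that needs the full shape dereferences the link and receives GML for that single feature.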

Page 27

GetLOD: Open GeoData Solution

Geometry

4. Standards are important, but does anyone already use them?

Use “OGC GeoSPARQL”, but also “WGS84”

Redundancy is not a problem in LOD
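In practice this means stating the same location twice, once per vocabulary. The Python sketch below is a hypothetical illustration of that redundancy: it emits N-Triples using the W3C Basic Geo (WGS84) `lat`/`long` properties and the GeoSPARQL `hasGeometry`/`asWKT` properties. The `/geometry` URI pattern, the literal datatypes chosen for `lat`/`long` and the six-decimal formatting are assumptions, not GetLOD's actual output.

```python
WGS84 = "http://www.w3.org/2003/01/geo/wgs84_pos#"
GEOSPARQL = "http://www.opengis.net/ont/geosparql#"
XSD = "http://www.w3.org/2001/XMLSchema#"


def location_triples(subject_uri, lon, lat):
    """Describe one point with both the WGS84 and GeoSPARQL vocabularies."""
    geometry_uri = subject_uri + "/geometry"  # hypothetical URI pattern
    lon_text = f"{lon:.6f}"
    lat_text = f"{lat:.6f}"
    return [
        # Simple form, understood by many mash-up tools.
        f'<{subject_uri}> <{WGS84}lat> "{lat_text}"^^<{XSD}decimal> .',
        f'<{subject_uri}> <{WGS84}long> "{lon_text}"^^<{XSD}decimal> .',
        # Standard form; WKT without an explicit CRS is longitude first.
        f"<{subject_uri}> <{GEOSPARQL}hasGeometry> <{geometry_uri}> .",
        f'<{geometry_uri}> <{GEOSPARQL}asWKT> '
        f'"POINT({lon_text} {lat_text})"^^<{GEOSPARQL}wktLiteral> .',
    ]
```

The two descriptions cost a few extra triples per object, and each consumer reads the one it understands.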

Page 28

GetLOD: Open GeoData Solution

To extract spatial LOD from an SDI, some basic principles must be adopted in the SDI data model; fortunately, the DBTR was already almost compliant:

1. A unique and persistent identifier for every geographic object

2. Historical management and a well-defined object life cycle

But some things could be better:

1. UUIDs are not URI friendly

2. Codelists vs dictionaries

Page 29

GetLOD: Open GeoData Solution

Not all geographical objects are noteworthy: it makes no sense to convert a contour line or a land cover polygon to LOD.

Only spatial objects that can be thought of as individuals, which can evolve over time (change and eventually die) and can be referenced by other objects in the same or other datasets, can be properly converted to LOD.

A lifeless object does not really die; that is why you should define its life cycle, that is, the events that terminate its individual identity.

Page 30

GetLOD: Open GeoData Solution

Interlinking

The data that GetLOD extracts do not have interlinks for the moment.

Interlinks are important, but since we are talking about datasets coming from authoritative sources, interlinks that lead to general datasets like Geonames do not add particular value.

Interlinks should also be created from other PA (public administration) datasets towards these reference data.

Page 31

GetLOD: Open GeoData Solution

LOD Life Cycle

1. Identification & dataset selection

2. Cleaning up

3. Analysis & modeling

4. Enrichment

5. External linking

6. Validation

7. Release

Source: “Linee Guida per l’Interoperabilità Semantica attraverso i Linked Open Data” (Agenzia per l'Italia Digitale)

GetLOD: a solution that implements the entire LOD Life Cycle

Page 32

GetLOD: Open GeoData Solution

Generate OpenData

• ShapeFile

• GML

• XML metadata ISO 19115 (RNDT compliant)

• XML describing OpenData

Generate Linked OpenData

• RDF for Data

• RDF for Metadata ISO 19115 (RNDT compliant)

• XML describing Linked Open Data

Page 33

GetLOD: Open GeoData Solution

Page 34

GetLOD: Open GeoData Solution

Page 35

GetLOD: Demo

GetLOD RDF Browser integrates:

• GeoER-API

• RDF for Administrative boundaries, Buildings, Road Toponyms, Civic Numbers

• Events from E-R Culture (http://dati.emilia-romagna.it/dato/item/37-37-eventi-e-r-cultura.html?goback=.gmp_3816349.gde_3816349_member_203780094)

• SPARQL query endpoint

Page 36

GetLOD: Evolution in 2014

In 2014 we will focus on:

• Interoperability: in Regione Emilia-Romagna, SDI and LOD infrastructures will interoperate through Moka

• Interlinking: comparing entities from different datasets available as LOD and calculating similarities based on textual, geographical and temporal distance in order to match them

• Natural browsing: integrating the existing map viewer with navigation and browsing of Linked Open Data
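The interlinking idea could be prototyped along the lines of the Python sketch below. The slide only names the three kinds of distance, so everything else here is an assumption: the entity fields, the weights, the decay scales and the use of `difflib` for text similarity are hypothetical choices, not the planned GetLOD method.

```python
import math
from datetime import date
from difflib import SequenceMatcher


def text_similarity(a, b):
    """Similarity in [0, 1] between two labels, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    h = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def match_score(e1, e2, weights=(0.5, 0.3, 0.2), km_scale=1.0, day_scale=365.0):
    """Weighted score in [0, 1]; higher means more likely the same entity."""
    textual = text_similarity(e1["name"], e2["name"])
    km = haversine_km(e1["lon"], e1["lat"], e2["lon"], e2["lat"])
    days = abs((e1["date"] - e2["date"]).days)
    # Distances are turned into similarities with an exponential decay.
    geographical = math.exp(-km / km_scale)
    temporal = math.exp(-days / day_scale)
    w_text, w_geo, w_time = weights
    return w_text * textual + w_geo * geographical + w_time * temporal
```

Candidate pairs whose score exceeds a chosen threshold would then be proposed as links; both the threshold and the weights would need tuning against known matches.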

Page 37

GetLOD: Evolution in 2014