SDMX and Global Standardisation. Marco Pellegrino, Eurostat, Statistical Office of the European Union, marco.pellegrino@ec.europa.eu. SDMX Global Conference, Bangkok, 28-30 September 2015.


Page 1: SDMX and Global Standardisation

Marco Pellegrino

Eurostat, Statistical Office of the European Union

marco.pellegrino@ec.europa.eu

SDMX Global Conference, Bangkok, 28-30 September 2015

Page 2: Outline

• Evolution of SDMX

• Standards integration
  - Examples

• Opportunities and challenges
  - All good standards change

Page 3: SDMX provides

• A model to describe statistical data and metadata

• A standard for automated communication from machine to machine

• A technology supporting standardised IT tools

• A common language for statistics
  - Statisticians agree to use a common description for data and metadata
  - The data exchange process is then driven by this common description
  - Data descriptions are made available for everybody who wants to understand and reuse the data
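As a minimal sketch of how a shared description can drive exchange, the Python below lets a sender and a receiver derive and parse the same series key from one agreed, ordered list of dimensions. The `DataStructure` class, both functions and the dimension names are assumptions made for illustration and are not part of any SDMX library; the dot-separated key is only loosely modelled on the series keys used in SDMX web-service queries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataStructure:
    """Toy stand-in for a shared data description (hypothetical, not an SDMX API)."""
    dimensions: tuple[str, ...]  # dimension identifiers in the agreed order


def to_series_key(structure: DataStructure, values: dict[str, str]) -> str:
    """Join the dimension values in the agreed order, separated by dots."""
    missing = [d for d in structure.dimensions if d not in values]
    if missing:
        raise KeyError(f"missing dimensions: {missing}")
    return ".".join(values[d] for d in structure.dimensions)


def from_series_key(structure: DataStructure, key: str) -> dict[str, str]:
    """Split a key back into named dimension values using the same description."""
    parts = key.split(".")
    if len(parts) != len(structure.dimensions):
        raise ValueError(f"expected {len(structure.dimensions)} parts, got {len(parts)}")
    return dict(zip(structure.dimensions, parts))


if __name__ == "__main__":
    # Illustrative dimension names and codes only.
    structure = DataStructure(("FREQ", "REF_AREA", "INDICATOR"))
    key = to_series_key(structure, {"REF_AREA": "TH", "FREQ": "A", "INDICATOR": "POP"})
    print(key)
    print(from_series_key(structure, key))
```

Because both sides rely on the same `DataStructure`, neither needs to hard-code the position of a dimension; changing the description changes how both sides read the key.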

Page 4: Broadening the scope of SDMX

• The same information is needed for exchange between different steps in a statistical production process.

• The use of SDMX throughout the process, in combination with a metadata registry (central storage of definitions, classifications, etc.), makes it more efficient and coherent to implement changes, e.g. in definitions.

• Metadata-driven systems
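The idea of a metadata-driven system can be sketched in a few lines of Python: two process steps look up the same code list in one central place, so a change to a definition is made once and both steps pick it up. `MetadataRegistry`, `collect`, `disseminate` and the code list identifier `CL_AREA` are hypothetical names chosen for this sketch; a real SDMX registry is a web service with a much richer interface.

```python
class MetadataRegistry:
    """In-memory stand-in for a central metadata registry (hypothetical)."""

    def __init__(self) -> None:
        self._codelists: dict[str, dict[str, str]] = {}

    def register(self, codelist_id: str, codes: dict[str, str]) -> None:
        """Store or replace a code list (code -> label)."""
        self._codelists[codelist_id] = dict(codes)

    def is_valid(self, codelist_id: str, code: str) -> bool:
        return code in self._codelists.get(codelist_id, {})

    def label(self, codelist_id: str, code: str) -> str:
        return self._codelists[codelist_id][code]


def collect(registry: MetadataRegistry, record: dict[str, str]) -> dict[str, str]:
    """Collection step: accept a record only if its area code is registered."""
    if not registry.is_valid("CL_AREA", record["REF_AREA"]):
        raise ValueError(f"unknown area code: {record['REF_AREA']}")
    return record


def disseminate(registry: MetadataRegistry, record: dict[str, str]) -> str:
    """Dissemination step: render the record using the registered label."""
    return f"{registry.label('CL_AREA', record['REF_AREA'])}: {record['OBS_VALUE']}"


if __name__ == "__main__":
    registry = MetadataRegistry()
    registry.register("CL_AREA", {"TH": "Thailand"})
    # Illustrative record; the value is a placeholder, not real data.
    print(disseminate(registry, collect(registry, {"REF_AREA": "TH", "OBS_VALUE": "42"})))
    # One change in the registry is picked up by both steps.
    registry.register("CL_AREA", {"TH": "Thailand", "LU": "Luxembourg"})
    print(disseminate(registry, collect(registry, {"REF_AREA": "LU", "OBS_VALUE": "7"})))
```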

Page 5: Broadening the scope of SDMX

Standard metadata layer for the description and use of data and metadata throughout the process

Page 6: GSBPM and SDMX: towards a more complete picture

Page 7: SDMX and standards integration

• SDMX promotes an incremental movement towards a data and metadata sharing model with the production of comparable and accurate statistics.

• The increasing use of SDMX:
  a) improves the quality of the statistical process
  b) enables simplified exchange and dissemination processes, improving timeliness and accessibility

• Statistical integration goes hand-in-hand with technical integration and standardisation.

Page 8: Building bridges

…not walls

Page 9: Building bridges

Page 10: Building bridges: SDMX and Linked Open Data

• Based on RDF (Resource Description Framework), a family of specifications published by W3C allowing for machine-actionable, semantically rich linking of things found on the Web.

• Main RDF vocabulary for statistical data: the Data Cube Vocabulary, a simplified version of the SDMX model covering data structures

https://open-data.europa.eu/en/linked-data

Page 11: [Diagram labels: SDMX Data Structure Definition; RDF Data Cube Vocabulary; SDMX Data Set, structured by; dimensionality]
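To make the link between an SDMX observation and the RDF Data Cube Vocabulary more concrete, here is a minimal Python sketch that renders one observation as Turtle text. The `qb:`, `sdmx-dimension:` and `sdmx-measure:` namespaces are the ones published with the Data Cube Vocabulary; the `ex:` namespace, the function name and the identifiers passed in are assumptions for illustration, and writing the reference period as a plain string literal is a simplification.

```python
QB = "http://purl.org/linked-data/cube#"
SDMX_DIM = "http://purl.org/linked-data/sdmx/2009/dimension#"
SDMX_MEASURE = "http://purl.org/linked-data/sdmx/2009/measure#"
EX = "http://example.org/stats/"  # hypothetical base URI for this sketch


def observation_to_turtle(dataset_id: str, obs_id: str, ref_area: str,
                          ref_period: str, value: float) -> str:
    """Render a single observation as a qb:Observation in Turtle syntax."""
    lines = [
        f"@prefix qb: <{QB}> .",
        f"@prefix sdmx-dimension: <{SDMX_DIM}> .",
        f"@prefix sdmx-measure: <{SDMX_MEASURE}> .",
        f"@prefix ex: <{EX}> .",
        "",
        f"ex:{obs_id} a qb:Observation ;",
        f"    qb:dataSet ex:{dataset_id} ;",
        f"    sdmx-dimension:refArea ex:{ref_area} ;",
        # Simplification: the period is emitted as a plain literal.
        f'    sdmx-dimension:refPeriod "{ref_period}" ;',
        f"    sdmx-measure:obsValue {value} .",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative identifiers and value only.
    print(observation_to_turtle("demoDataset", "obs1", "TH", "2014", 12.5))
```

Each SDMX dimension becomes a property on the observation, which is why the Data Cube Vocabulary can be described as a simplified version of the SDMX model for data structures.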

Page 12: SDMX and RDF: Scenario

[Diagram: Statistical Dissemination System; Triple Store (DataCube); RDF Service; SPARQL. Two routes are shown:]

• Either: SDMX-ML File, passed through an SDMX-ML File to RDF Transformer

• Or: DataCube Writer, using SDMX Component Architecture
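The "SDMX-ML File to RDF Transformer" route can be sketched as follows. The input is a simplified, namespace-free fragment shaped like series with observation children, not a complete SDMX-ML message, and the output is a plain list of (subject, predicate, object) tuples rather than a real triple store. The base URI, the `ex:` predicates and the function name are assumptions for this sketch.

```python
import xml.etree.ElementTree as ET

# Simplified fragment for illustration; a real SDMX-ML message carries
# a header, XML namespaces and a reference to its data structure.
SAMPLE = """
<DataSet>
  <Series FREQ="A" REF_AREA="TH">
    <Obs TIME_PERIOD="2013" OBS_VALUE="1.5"/>
    <Obs TIME_PERIOD="2014" OBS_VALUE="2.5"/>
  </Series>
</DataSet>
"""


def sdmx_to_triples(xml_text: str,
                    base: str = "http://example.org/obs/") -> list[tuple[str, str, str]]:
    """Turn each observation into one typed subject with one triple per field."""
    root = ET.fromstring(xml_text.strip())
    triples: list[tuple[str, str, str]] = []
    for series in root.iter("Series"):
        series_dims = dict(series.attrib)
        for obs in series.iter("Obs"):
            # Observation identity: series dimensions plus the time period.
            key = ".".join(list(series_dims.values()) + [obs.attrib["TIME_PERIOD"]])
            subject = base + key
            triples.append((subject, "rdf:type", "qb:Observation"))
            for name, value in {**series_dims, **obs.attrib}.items():
                triples.append((subject, f"ex:{name}", value))
    return triples


if __name__ == "__main__":
    for triple in sdmx_to_triples(SAMPLE):
        print(triple)
```

The resulting triples could then be loaded into a triple store and queried with SPARQL, which is the right-hand side of the scenario.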

Page 13: Building bridges: Data validation

“Technical”: covered by SDMX today
- Format check (SDMX-ML)
- Codes exist (SDMX DSD)
- Codes used correctly (Dataflow & Constraint)

“Statistical Domain”: not yet covered by SDMX (VTL)
- Value check
- Time series
- Revisions
- Validation expressions
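The two "technical" checks on codes can be illustrated with a short Python sketch: first that every code exists in its code list, then that the combination of codes is permitted by a constraint. The function name, the dictionary-based code lists and the set of allowed combinations are assumptions made for illustration, not the SDMX constraint model itself.

```python
from typing import Optional


def validate_record(record: dict[str, str],
                    codelists: dict[str, set[str]],
                    allowed_keys: Optional[set[tuple[str, ...]]] = None) -> list[str]:
    """Return a list of error messages; an empty list means the record passed."""
    errors: list[str] = []
    # "Codes exist": every dimension must carry a code from its code list.
    for dimension, codes in codelists.items():
        code = record.get(dimension)
        if code is None:
            errors.append(f"{dimension}: missing")
        elif code not in codes:
            errors.append(f"{dimension}: unknown code {code!r}")
    # "Codes used correctly": the combination must be allowed by the constraint.
    if not errors and allowed_keys is not None:
        key = tuple(record[d] for d in codelists)
        if key not in allowed_keys:
            errors.append(f"combination {key} not allowed by constraint")
    return errors


if __name__ == "__main__":
    # Illustrative code lists and constraint only.
    codelists = {"FREQ": {"A", "Q"}, "REF_AREA": {"TH", "LU"}}
    allowed = {("A", "TH"), ("Q", "LU")}
    print(validate_record({"FREQ": "A", "REF_AREA": "TH"}, codelists, allowed))
    print(validate_record({"FREQ": "M", "REF_AREA": "TH"}, codelists, allowed))
    print(validate_record({"FREQ": "A", "REF_AREA": "LU"}, codelists, allowed))
```

Both checks need only structural metadata, which is why they can be run without any knowledge of the statistical domain.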

Page 14: VTL: Validation and Transformation Language

Standard language for defining validation and transformation rules

• Validation (now)

• Transformation (partially now, to be enriched at a later stage)

Main goals

• Define and preserve validation and transformation rules

• Exchange and share rules

• Apply rules in industrialized processes

• Apply to several standards (e.g. SDMX, DDI, GSIM) thanks to a generic information model
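The goal of defining rules once and applying them in an automated process can be illustrated in plain Python. This is not VTL syntax: the `Rule` class, the rule identifiers, the field names and the 10% revision threshold are all assumptions chosen for this sketch. It only shows the principle of keeping rules as named, shareable objects that are separate from the data they check.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    """A named validation rule (hypothetical structure, not VTL)."""
    rule_id: str
    description: str
    check: Callable[[dict], bool]  # returns True when the observation passes


def apply_rules(rules: list[Rule], observations: list[dict]) -> list[tuple[int, str]]:
    """Return (observation index, rule id) for every failed check."""
    failures = []
    for index, obs in enumerate(observations):
        for rule in rules:
            if not rule.check(obs):
                failures.append((index, rule.rule_id))
    return failures


RULES = [
    # Value check.
    Rule("R1", "value must be non-negative", lambda o: o["value"] >= 0),
    # Revision check against the previously reported value.
    Rule("R2", "revision must stay within 10% of the previous value",
         lambda o: o["previous"] == 0
         or abs(o["value"] - o["previous"]) <= 0.1 * abs(o["previous"])),
]

if __name__ == "__main__":
    # Illustrative observations only.
    data = [
        {"value": 100, "previous": 95},
        {"value": -1, "previous": -1},
        {"value": 150, "previous": 100},
    ]
    print(apply_rules(RULES, data))
```

A standard rule language aims to make such rules exchangeable between organisations and tools, rather than tied to one implementation as they are here.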

Page 15: Building bridges: SDMX and DDI

DDI Lifecycle can provide a very detailed set of metadata, covering:

• Surveys and processing of microdata

• Structure of data files, including hierarchical files and complex relationships

• Archiving of data files and their metadata

• Tabulation and processing of data into tables

• Link between microdata variables and resulting aggregates

SDMX can provide:

• Metadata describing the structure of dimensional data

• Stand-alone metadata sets (“reference metadata”)

• Formats for dimensional data

• A model of data reporting and dissemination

• Standard registry interfaces, providing a catalogue of resources

• Guidelines for deploying standard web services

• A way of describing statistical processes

Page 16: SDMX and DDI: similarities and differences

• Both standards use a similar model for identifiable, versionable and maintainable artefacts

• Both standards use “schemes”, as packages for lists of items, and XML “schemas”

• Both standards are designed to support reuse

• DDI has much more detailed metadata at the level of the study domain, and provides more complete descriptions of the processing of data

• SDMX provides more architectural components to support registration, reporting/collecting and exchange, and has a solid information model


Page 17: [slide without recoverable text]

Page 18: Conceptual model and implementation standards

• Conceptual model: GSIM

• Implementation standards: SDMX, DDI, geospatial standards, other relevant standards

Page 19: Opportunities and challenges

• SDMX is interacting well with other standards (GSIM, DDI, RDF Linked Open Data, JSON) and this “complementarity” opens up new perspectives for the innovation of statistical processes.

• Common data validation and processing procedures are required (from structural validation to content).

• Better metadata-driven statistical production systems, with the use of standards throughout the processes in combination with a metadata registry.

• Better maintenance and developments of SDMX (e.g. support to use cases, new functions, more formats, etc.) using the wealth of its Information Model.


Page 20: All good standards change

• Version 1.0: September 2004

• Version 2.0: November 2005

• Version 2.1: April 2011

[Diagram labels: GESMES/TS; SDMX-EDI; SDMX-ML; SDMX Registry]

• Too much change may discourage adoption

But…

• not giving users the functionalities they want would also discourage adoption

Page 21: SDMX and Global Standardisation

Thanks for your attention!

marco.pellegrino@ec.europa.eu

« If you are not sure where you are going, you will finish someplace else »