
3rd Annual European DDI Users Group Meeting, 5-6 December 2011

The Ongoing Work for a Technical Vocabulary of DDI and SDMX Terms

Marco Pellegrino, Eurostat


• Background
• Work Products
• Inputs to the Joint Vocabulary
• The Challenge
• Current Status
• Looking Forward


Background

• At the EDDI 2010 conference, an informal dialogue was held between SDMX, the DDI Alliance and interested members of the community

• Four further meetings and several telephone conferences have taken place since then

• No formal membership: secretariat provided by UN/ECE (more than 40 people on the mailing list)

• Goal of this work: to help the standards bodies coordinate to better serve their users


Background (Continued)

• The dialogue covers several areas of work; the differing terminology of the SDMX and DDI communities was identified as one of the problems

• A joint SDMX-DDI Vocabulary is being created to help address this issue

• All relevant documents and information for the SDMX-DDI Dialogue can be found at http://www1.unece.org/stat/platform/display/metis/SDMX+DDI+Dialogue+-+Overview+Page


Work Products

• So far, a small number of work products have been identified:
– Joint SDMX-DDI Vocabulary
– Business Case for using SDMX and DDI
– A proposed coordinated approach for using the standards in an interoperable way (register data use case)

• Other documents are envisaged:
– DDI, SDMX and the GSBPM to support statistical quality improvements
– Detailed examples

• Each of the work products is being created by a small team of volunteers from the SDMX and DDI communities


Work Products (Continued)

• The team working on the initial drafting of the Joint SDMX-DDI Vocabulary includes:
– Marco Pellegrino (Eurostat)
– Arofan Gregory (Open Data Foundation)
– Chris Nelson (Metadata Technology)
– Mary Vardigan (DDI Alliance)
– Joachim Wackerow (GESIS/DDI Alliance)

• We anticipate many more participants as we get further along in the process, especially in a review capacity


The terminology challenge

Definitions and descriptions are often insufficient to support a correct use of a standard

Names are often not definitive for concepts

Standardization must focus on definitions rather than names


ISO/IEC 11179 Part 4: Rules and Guidelines for the Formulation of Data Definitions

The purpose of a data element definition is to define a data element with words or phrases that describe, explain, or make definite and clear its meaning

Good definitions promote the standardization and reuse of data elements, leading to data sharing and integration of information systems


Data Definition Rules

• A data definition shall be:
– Unique
– Singular
– A statement of what the concept is, not of what it is not
– A descriptive phrase or sentence
– Limited to commonly understood abbreviations
– Without embedded definitions


Data Definition Guidelines

• State the essential meaning of the concept
• Be precise and unambiguous
• Be concise
• Be able to stand alone
• Be expressed without embedding rationale, functional usage, domain information or procedural information
• Avoid circular reasoning
• Use consistent terminology and structure for related definitions


Inputs to the Joint Vocabulary

• The SDMX Secretariat has been working to develop a comprehensive SDMX Vocabulary for use within that community
– SDMX Metadata Common Vocabulary developed as part of the “Content-Oriented Guidelines” (2009)
– SDMX Technical Vocabulary based largely on the SDMX Information Model, with other inputs

• An early draft of a DDI Vocabulary was developed by the DDI Alliance for input into this process


The Challenge

• Question: What is a Category Scheme?

• Answer: That really depends (on which standard you are using…)

• This is a simple example of how the same term is used to refer to two completely different types of metadata!

• There are other, similar differences of terminology which could produce confusion.
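The ambiguity described above can be sketched in a few lines of Python. This is only an illustration: the `USAGE` table and the functions `standards_using` and `is_ambiguous` are hypothetical names, and the usage notes are informal paraphrases written for this sketch, not definitions taken from either standard or from the joint vocabulary.

```python
from typing import Dict, List, Tuple

# A term is only meaningful together with the standard it comes from,
# so the lookup key is the pair (standard, term), not the term alone.
# The notes below are informal paraphrases (an assumption of this sketch).
USAGE: Dict[Tuple[str, str], str] = {
    ("SDMX", "Category Scheme"):
        "scheme of categories used to classify other objects, such as data flows",
    ("DDI", "Category Scheme"):
        "set of categories, such as the labels attached to code values",
}


def standards_using(term: str) -> List[str]:
    """Return the standards in which the term occurs, in alphabetical order."""
    return sorted(standard for (standard, t) in USAGE if t == term)


def is_ambiguous(term: str) -> bool:
    """A term is ambiguous when it carries different notes in different standards."""
    notes = {USAGE[(standard, term)] for standard in standards_using(term)}
    return len(notes) > 1
```

Keying on the pair rather than on the term alone is the design point: a lookup by name only would silently return one of the two meanings.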

SDMX: is everything well described?

[Diagram of SDMX objects: Data or Metadata Structure Definition; Category Scheme; Category; Data or Metadata Flow; Data Provider; Provision Agreement; Data Set or Metadata Set; Content Constraint; Structure and Item Scheme Maps; Registered Data Source or Metadata Source; Attachment Constraint; Categorisation]


[Diagram of DDI objects: Study, Survey Instruments, Questions, Concepts, Universes; relation labels: using, made up of, measures, about]


DDI: everything clear?

• Category Scheme
• Code Scheme
• Concept Scheme
• Control Construct Scheme
• GeographicStructureScheme
• GeographicLocationScheme
• InterviewerInstructionScheme
• Question Scheme
• NCubeScheme
• Organization Scheme
• Physical Structure Scheme
• Record Layout Scheme
• Universe Scheme
• Variable Scheme

• Dataset
• Dcelements
• DDI profile
• Conceptual component
• Study unit
• Group
• Resource package
• Instance
• Coverage
• …


Technical Vocabulary: expected benefits

Support a common understanding of the agreed technical standards by providing a single authoritative list of the technical terms used in the standards, together with a description of each term and, if needed, some context explanations

Facilitate a comparison with other standards and a mapping of concepts with minimum need to determine “semantic equivalence”

Improve visibility for existing definitions (building on existing sources and avoiding a proliferation of “standard” terminologies)

Improve accessibility to a set of standard definitions through a single address


Vocabulary Structure

Term (mandatory)

Definition (mandatory)

Definition source (mandatory)

Context (in SDMX and DDI)

Links to related terms within the glossary (optional)

URL to more detailed information (optional)

Several outputs (doc, html, xml)
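The entry structure listed above could be modelled roughly as follows. This is a minimal sketch, not the schema actually used for the vocabulary: the class name `VocabularyEntry`, its field names, the XML element names and the `to_xml` method are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional
import xml.etree.ElementTree as ET


@dataclass
class VocabularyEntry:
    # Mandatory parts of an entry.
    term: str
    definition: str
    definition_source: str
    # Context notes keyed by standard name, e.g. "SDMX" or "DDI".
    context: Dict[str, str] = field(default_factory=dict)
    # Optional parts of an entry.
    related_terms: List[str] = field(default_factory=list)
    url: Optional[str] = None

    def __post_init__(self) -> None:
        # Reject entries whose mandatory fields are empty or whitespace only.
        for name in ("term", "definition", "definition_source"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} is mandatory and must not be empty")

    def to_xml(self) -> ET.Element:
        # Element names are hypothetical; one possible XML output of an entry.
        entry = ET.Element("entry")
        ET.SubElement(entry, "term").text = self.term
        ET.SubElement(entry, "definition").text = self.definition
        ET.SubElement(entry, "definitionSource").text = self.definition_source
        for standard, note in self.context.items():
            ctx = ET.SubElement(entry, "context", {"standard": standard})
            ctx.text = note
        for related in self.related_terms:
            ET.SubElement(entry, "relatedTerm").text = related
        if self.url is not None:
            ET.SubElement(entry, "url").text = self.url
        return entry
```

An entry built this way can be serialised with `ET.tostring(entry.to_xml(), encoding="unicode")`; the doc and html outputs would be separate renderings of the same record.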


Current Status

• The terms in the SDMX Vocabulary are now being evaluated (TWG) so that an appropriate subset can be mapped to DDI

• The first draft will not be comprehensive
– It will only address the main objects in each standard, and those which have very strong similarities between the two standards

• The initial set of DDI terms, plus their relationship to SDMX objects, has been drafted


The Initial Draft DDI-SDMX Vocabulary (example)


Looking Forward

• We expect to have the initial draft ready for consideration by the larger group by March 2012

• Hopefully, this document can be finalized and then expanded:
– We expect it to be a living document as the SDMX-DDI dialogue proceeds
– It will be published as a contribution to the integrated use of DDI and SDMX

Generic Process Example

[Diagram: Survey/Register → Raw Data Set → (anonymization, cleaning, recoding, etc.) → Micro-Data Set/Public Use Files → (tabulation, processing, case selection, etc.) → Aggregate Data Set (Lower level) → (aggregation, harmonization) → Aggregate Data Set (Higher Level) → Indicators; the diagram also carries the labels DDI and SDMX]


Business case: a key issue in the DDI-SDMX dialogue

Thank you

[email protected]