Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.


Page 1: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Introduction to DDI 3.0

Sanda Ionescu ICPSR

CESSDA Expert Seminar, September 2007

Page 2: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI Version 3.0

• Radically different.

• More complex…

(…but certainly doable!)

• Brings important benefits.

Page 3: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Workshop Schedule

14:30 – 15:10 Overview (40)
15:10 – 15:35 Structure and Technical Mechanisms (25)
15:35 – 15:45 Break (10)
15:45 – 16:10 Study Unit – Modules Content (25)
16:10 – 16:30 Variable Markup Example (20)
16:30 – 16:40 Break (10)
16:40 – 17:10 Grouping – Modules Content and Examples (30)
17:10 – 17:30 Getting Started (20)

Page 4: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0

Overview

Page 5: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI Background – Development History

• 1995 – A grant-funded project initiated and organized by ICPSR proposes to create a new standard for documenting social science data, to replace OSIRIS tagged codebooks.

• First drafts used SGML, then converted to Web-friendly XML.

• 2000 – DDI Version 1.0 published as a mainly document- and codebook-centric standard.

Page 6: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI Background – Development History

• 2003 – DDI Version 2.0 published with extended scope:
– Aggregate data coverage (based on matrix structure)
– Additional geographic representation to assist geographic search systems and GIS users

• Versions 1.0 through 2.1 (latest published) are backwards compatible, and based on the same structure.

Page 7: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI Background – Development History

• February 2003 – Formation of the DDI Alliance, a self-sustaining membership organization whose members have a voice in the development of the DDI specification.

http://www.ddialliance.org/

Page 8: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI Background – Development History

Version 3.0:

• 2004-2006: Planning and Development

• November 2006: Internal Review

• February 2007: Public Review

• July 2007: Candidate Draft Release

http://www.ddialliance.org/ddi3/index.html

Page 9: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Benefits of using DDI as an XML-based standard

• Interoperability:
– Enables seamless exchange and reuse by other systems.
• Repurposing:
– Provides a core document from which different types of output can be generated.
• Value-added documentation:
– Tagging carries “intelligence” in the document by describing content.
• Enhanced Data Discovery:
– Increases precision and granularity of searches.
• Support for Data Analysis:
– Variable descriptions are accepted as input by online analysis systems.
• Multiple presentation formats:
– ASCII text, PDF, HTML, RTF.
• Preservation-friendly:
– Non-proprietary format.

Page 10: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Why DDI 3.0?

DDI 3.0 presents new features in response to:
• Perceived needs of:
- Data users
- Data producers
- Data archivists/librarians
• Developments in documenting and archiving data
• Advances in XML technology

Page 11: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 and the Data Life Cycle Model

DDI Versions 1/2 were codebook-centric:

• Closely followed the structure of traditional print codebooks.

• Captured data documentation at a single, “frozen” point in time – archiving.

Page 12: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 and the Data Life Cycle Model

Version 3.0 is Life Cycle oriented:
- Designed to cover all stages in the life cycle of a data collection: pre-production, production, post-production, secondary use

Page 13: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Life Cycle Coverage in DDI 3.0

Planning for the Study: Proposal / Design

Study Purpose / Outline
Concepts
Study Population
Author(s)
Funding Sources

Version 3.1:
Survey / Sample Design
Pre-testing

Page 14: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Life Cycle Coverage in DDI 3.0

Proposal becomes reality…

Data Collection methodology: sampling, time, etc.
Instrument characteristics
Questionnaire
Data cleaning, weighting, coding, etc.

Page 15: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Life Cycle Coverage in DDI 3.0

Publishing the data…

Intellectual content: Variables, Categories, Codes.

Physical representation: Data format, Record structure, Statistics.

Page 16: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Life Cycle Coverage in DDI 3.0

Archiving / (Re)Distributing the data collection…

Processing checks
Holdings, availability and access conditions

Page 17: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Life Cycle Coverage in DDI 3.0

DDI becomes “visible” to the outside world…

DDI Instance:
- Pulls together all life cycle stages
- Acquires its own identity as an object
- Becomes a tool for data discovery and analysis

Page 18: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Life Cycle Coverage in DDI 3.0

Secondary use of data – new conceptual framework…

New DDI Instance:
- New Purpose
- New Logical Product
- New Physical Description of Data

Page 19: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 and the Data Life Cycle Model

Advantages of Life Cycle orientation:

• Allows capture and preservation of metadata generated by different agents at different points in time.

• Facilitates tracking changes and updates in both data and documentation.

Page 20: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 and the Data Life Cycle Model

Advantages of Life Cycle orientation:

• Enables investigators, data collectors and producers to document their work directly in DDI, thus increasing the metadata’s visibility and usability.

• Benefits data users, who need information from the full data life cycle for optimal discovery, evaluation, interpretation, and re-use of data resources.

Page 21: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

New / Extended Functionalities in DDI 3.0: Questionnaire

Versions 1/2:
- No instrument coverage.
- Question text only as part of variable description.
- No documentation for question flow / conditions.

Version 3.0:
- Full description of instrument as a separate entity.
- Documents specific use of questions: flow, conditions, loops.
- Compatible with Computer Assisted Interviewing software.

Page 22: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

New / Extended Functionalities in DDI 3.0: Complex Data

Versions 1/2:
- Inadequate representation of complex / hierarchical data

Version 3.0:
- Detailed documentation for complex / hierarchical data

Logical structure of records:
- Record Types and Relationships
- Relevant variables: key-link, case identification, record type locator

Physical layout of records:
- Single “hierarchical” file for all records, multiple rectangular files, relational database, etc.

Page 23: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

New / Extended Functionalities in DDI 3.0: Aggregate Data

Versions 1/2:
- Initially designed for microdata only
- Aggregate data section added in V 2.1 to support limited representation (Census-type data, delimited files)

Version 3.0:
- Adds support for tabular, spreadsheet-type representation of aggregate data
- Aggregate data transport option: cell content may be included inline with the data item description

Page 24: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

New / Extended Functionalities in DDI 3.0: Data Transport

Versions 1/2:
- None

Version 3.0:
- In-line inclusion enabled for both aggregate data and microdata

Page 25: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

New / Extended Functionalities in DDI 3.0: Longitudinal / Time Series / Cross-national Data Comparability

Versions 1/2:
- None

Version 3.0:
- Grouping structure documents studies related on one or several dimensions (time, geography, language, etc.) as well as their comparability

Page 26: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

New / Extended Functionalities in DDI 3.0: Increased Multilingual Support

Versions 1/2:
- Limited <anytag xml:lang="">

Version 3.0:
- Support for multiple language use and translations <InternationalStringType xml:lang="" translated="" translatable="">

<Variable>
  <Label xml:lang="ger" translated="false" translatable="true">Geburtsjahr</Label>
  <Label xml:lang="eng" translated="true">Year of Birth</Label>
</Variable>
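A minimal sketch of how an application might pick a label by language from markup like the slide's example. The element names are taken from the slide; real DDI 3.0 instances are namespaced, so the un-namespaced lookup and the helper name label_for are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# The xml:lang attribute belongs to the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Simplified, un-namespaced copy of the slide's example (an assumption).
SNIPPET = """
<Variable>
  <Label xml:lang="ger" translated="false" translatable="true">Geburtsjahr</Label>
  <Label xml:lang="eng" translated="true">Year of Birth</Label>
</Variable>
"""


def label_for(variable, lang):
    """Return the text of the first Label in the given language, or None."""
    for label in variable.findall("Label"):
        if label.get(XML_LANG) == lang:
            return label.text.strip()
    return None


variable = ET.fromstring(SNIPPET)
print(label_for(variable, "eng"))
```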

Page 27: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 Specification: Schema-based

Versions 1/2:
- DTD-based

Version 3.0:
- Schema-based:

Data typing supports machine actionability

Use of namespaces supports:
- Modularity
- Extensibility and reuse
- Alignment with / use of other standards

Page 28: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 Specification: Machine-actionable

Versions 1/2:
- Machine-readable

Version 3.0:
- Machine-actionable:

1. Data typing: increased use of controlled vocabularies and standard codes

2. Larger set of required elements

Predictable content = a more consistent base for programming

Page 29: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0: Modular Structure

Versions 1/2:
- Single file, hierarchical design

Version 3.0:
- Modular design:
- Facilitates reuse
- Facilitates versioning and maintenance
- Supports life cycle model
- Allows flexibility in organizing the DDI Instance
- Supports grouping and comparing studies
- Supports creation of metadata registries

Page 30: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0: Alignment with other metadata standards

Versions 1/2:
- MARC, Dublin Core (bibliographic standards)

Version 3.0:
- MARC, DC, but also…
- SDMX (Statistical Data and Metadata Exchange)
- ISO 11179 (Metadata Registries)
- FGDC (Digital Geospatial Metadata)
- ISO 19115 (Geographic Information Metadata)

Page 31: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 1/2 or DDI 3.0?

• DDI 3.0 will not supersede DDI 2.1.

• Both versions will:
– coexist
– continue to be maintained
– be used according to specific needs.

• Not all DDI 1/2 markup will have to be migrated to Version 3.0.

Page 32: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0

Structure and Mechanisms

Page 33: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 – Modular Structure

Building blocks of DDI 3.0:

» Modules

» Schemes

Page 34: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 – Modular Structure

Modules:
• Document different aspects of a study, or group of studies, following the data through their life cycle (Conceptual Components, Data Collection, Logical Product, Physical Instance, etc.)

Schemes:
• Include collections of sibling “objects” that are traditionally components of a variable description: Concepts, Universes, Questions, Variable Labels and Names, Categories, Codes.

Page 35: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 – Modular Structure

Modules:
• Can live independently (have their own schemas) or connected to one another within a hierarchical structure.

Schemes:
• Can live semi-independently (need a higher-level wrapper as they do not have their own schemas) or in-line within a Study Unit or Group module.

Page 36: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 – Modular Structure

DDI 3.0 model = a multi-branched hierarchy

Module level:

[Diagram: DDI Instance at the top; Resource Package, Group, and Study Unit below it. Lower-level boxes: Subgroup, Study Unit, Conceptual Components, Data Collection, Archive, Organizations, (Sub)group.]

Page 37: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 – Modular Structure

DDI 3.0 model = a multi-branched hierarchy

Within modules:

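[Diagram: Data Collection branches into Question Scheme, Processing, and Methodology. Question Scheme contains Question Items; Processing contains Weighting and Coding; Methodology contains Sampling and Time Method.]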

Page 38: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 – Modular Structure

Relationships are established through:

• In-line inclusion (Relational order is explicit)

• Referencing: Internal, External (Relational order is implicit)

Page 39: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 – Structural mechanisms

Enable modular design and help actualize its benefits.

• Inheritance

• Referencing

• Identification

Page 40: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0: Inheritance

• Inheritance is based on the hierarchical structure of the model.

• In DDI 3.0 a number of elements are reused at different levels of the hierarchy.

• When the same element is present at multiple levels, lower levels inherit content from the upper levels, and only need to specify differences (=local overrides).

Page 41: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 Inheritance – Example

• Instance: Coverage: Spatial: 50 US states

-Study Unit A – no Spatial Coverage defined

= will be inherited from Instance

-Study Unit B – Coverage: Spatial: 48 coterminous states

= supersedes definition in Instance
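The inheritance rule in this example can be sketched in a few lines. This is an illustration only: the dictionaries, the key name spatial_coverage, and the function effective_coverage are hypothetical and are not DDI 3.0 element names or part of any DDI tool.

```python
def effective_coverage(instance, study_unit):
    """Return the study unit's own spatial coverage if it defines one,
    otherwise fall back to the value defined on the containing instance."""
    local = study_unit.get("spatial_coverage")
    return local if local is not None else instance["spatial_coverage"]


# Values copied from the slide; the dict layout is an assumption.
instance = {"spatial_coverage": "50 US states"}
study_unit_a = {"name": "Study Unit A"}  # nothing defined locally
study_unit_b = {"name": "Study Unit B",
                "spatial_coverage": "48 coterminous states"}  # local override

print(effective_coverage(instance, study_unit_a))
print(effective_coverage(instance, study_unit_b))
```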

Page 42: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0: Referencing

• DDI 3.0 modular structure is dependent upon creating relationships by reference.

• Referencing implies bringing up the content of a DDI object within, or in association with, another object, by specifying its Unique Identifier.

• Identifiers are the key links between DDI objects.

Page 43: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0: Referencing – Example

Data Collection Module:
  Question Scheme:
    Question: ID: “Q1”
    Text: “How many days in the past week did you watch the national network news on TV?”

Conceptual Components Module:
  Concept Scheme:
    Concept: ID: “C1”
    Description: “Exposure to national TV news”

Logical Product Module:
  Variable Scheme:
    Variable: ID: “V1”
    Name: V043014
    Label: Days past week watch natl news on TV
    Question Reference: ID: “Q1”
    Concept Reference: ID: “C1”
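A rough sketch of what resolving these references could look like in code. The IDs and texts come from the slide; the dictionary layout and the function resolve_variable are assumptions for illustration, not a DDI API.

```python
# Each scheme is modelled as a lookup table keyed by object ID (an assumption).
question_scheme = {
    "Q1": "How many days in the past week did you watch "
          "the national network news on TV?",
}
concept_scheme = {"C1": "Exposure to national TV news"}

variable = {
    "id": "V1",
    "name": "V043014",
    "label": "Days past week watch natl news on TV",
    "question_ref": "Q1",  # points into the Question Scheme
    "concept_ref": "C1",   # points into the Concept Scheme
}


def resolve_variable(var, questions, concepts):
    """Follow the variable's references and return a self-contained view."""
    return {
        "name": var["name"],
        "label": var["label"],
        "question_text": questions[var["question_ref"]],
        "concept": concepts[var["concept_ref"]],
    }


resolved = resolve_variable(variable, question_scheme, concept_scheme)
print(resolved["concept"])
```

The question and concept are stored once and only pointed to, so several variables could share them without repeating the text.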

Page 44: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0: Referencing – Example

Page 45: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0: Identification

Consistency in building and using identifiers is needed for:

– Proper functioning of reference systems, enabling a smooth exchange and reuse of existing metadata.

– Machine-actionability of DDI instances, allowing them to serve as a basis for running programs and processes.

Page 46: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0: Identification

Element types used in the Identification system:

• All elements
• Identifiable
• Versionable
• Maintainable

Page 47: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0: Identification – Element Types

Non-identified elements:

– Require context, which is provided by containing parents.
  Example: codes within code schemes
– Are not reusable.
  Example: variable and category statistics

Page 48: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0: Identification – Element Types

Identifiables

– Carry their own ID
– May be referenced / reused
– Cannot be versioned or maintained, except as part of a complex parent element

(Example: Variable – a change implies a new version of the entire scheme).

Page 49: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0: Identification – Element Types

Versionables

– Carry their own ID
– Carry their own Version: content changes are important to note

(Example: Concept – may be independently versioned within a scheme).

Page 50: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0: Identification – Element Types

Maintainables

– Are higher level DDI objects
– Are both identifiable and versionable
– Can also be published and maintained as separate entities

(Example: all modules, schemes, comparison maps)

Page 51: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0: Identification Structure

• Maintainable elements:
– URN and / or ID + Identifying Agency + Versioning Information:
  Version
  Version Date
  Version Responsibility
  Version Rationale

• Versionable elements:
– URN and / or ID + Versioning Information

• Identifiable elements:
– URN and / or ID
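The three element types nest: each adds identification fields to the one before. One way to picture this is as a small class hierarchy. The class and field names are illustrative assumptions, not the DDI 3.0 schema types, and the sketch makes the ID mandatory although the slide allows a URN in its place.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Identifiable:
    """Carries an ID (and optionally a URN); can be referenced."""
    id: str
    urn: Optional[str] = None


@dataclass
class Versionable(Identifiable):
    """Adds versioning information to an identifiable element."""
    version: Optional[str] = None
    version_date: Optional[str] = None
    version_responsibility: Optional[str] = None
    version_rationale: Optional[str] = None


@dataclass
class Maintainable(Versionable):
    """Adds the identifying agency; can be published on its own."""
    agency: Optional[str] = None


scheme = Maintainable(id="VarScheme_A", agency="ICPSR", version="1.0")
variable = Identifiable(id="Var_1")
print(isinstance(scheme, Versionable), isinstance(variable, Versionable))
```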

Page 52: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0: Identification Structure

Non-specified Identification information is inherited from the levels above.

Example 1: Inheritance is assumed…

Maintainable: Variable Scheme:
  ID: VarScheme_A
  Identifying Agency: ICPSR
  Version: 1.0

Identifiable: Variable:
  ID: Var_1
  [Identifying Agency]
  [Version]

Page 53: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0: Identification Structure

Non-specified Identification information is inherited from the levels above.

Example 1: Inheritance is assumed…

Maintainable: Variable Scheme:
  ID: VarScheme_A
  Identifying Agency: ICPSR
  Version: 1.0

Identifiable: Variable:
  ID: V1
  [Identifying Agency]
  [Version]

Example 2: Inheritance is applied by default

Maintainable: Logical Product
  ID: LogicalProd_Y
  Identifying Agency: ICPSR
  Version: 1.0

Maintainable: Variable Scheme:
  ID: VarScheme_A
  Identifying Agency: [ ]
  Version: [ ]

Page 54: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0: Identification Structure: IDs

Uniqueness of Identifiers is necessary for both internal and external referencing:

1) All IDs MUST be unique within a maintainable

2) All maintainables MUST have unique IDs across an Agency

Page 55: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0: Identification Structure: Creating unique Identifiers

A DDI Instance may include multiple maintainables at different hierarchical levels:

Instance (maintainable) – unique ID within Identifying Agency
  Study Unit (maintainable) – unique ID within Identifying Agency
    Logical Product (maintainable) – unique ID within Identifying Agency
      Variable Scheme (maintainable) – unique ID within Identifying Agency

Page 56: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0: Identification Structure: Creating Unique Identifiers

Markup:

Instance_A (unique at ICPSR)
  StudyUnit_1
    LogicalProduct_1
      VariableScheme_1
        Variable_1

Instance_B (unique at ICPSR)
  StudyUnit_1
    LogicalProduct_1
      VariableScheme_1
        Variable_1

Post-markup: Variable ID:
Instance_AStudyUnit_1LogicalProduct_1VariableScheme_1Variable_1
Instance_BStudyUnit_1LogicalProduct_1VariableScheme_1Variable_1
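The point of this slide is that the same local IDs can be reused in different instances, as long as the full path through the containing objects is unique. A minimal sketch of that idea follows. The slide does not show how the parts are joined after markup, so the sketch keeps them as a tuple rather than guessing a separator; the function name qualified_id is hypothetical.

```python
def qualified_id(*path):
    """Qualify a local ID with the IDs of all its containers,
    outermost first. Returned as a tuple; no separator is assumed."""
    return tuple(path)


var_in_a = qualified_id("Instance_A", "StudyUnit_1", "LogicalProduct_1",
                        "VariableScheme_1", "Variable_1")
var_in_b = qualified_id("Instance_B", "StudyUnit_1", "LogicalProduct_1",
                        "VariableScheme_1", "Variable_1")

# The local IDs are identical, but the qualified IDs differ.
print(var_in_a[-1] == var_in_b[-1])
print(var_in_a == var_in_b)
```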

Page 57: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0: Identification Structure: URNs

• Have a fixed structure and MUST include object ID, Identifying Agency, and Version.

• For versionable and identifiable elements, the containing maintainable is specified.

• Take precedence when both a URN and the Identification sequence are used for the same object.

• May be constructed post-markup from the Identification sequence.

Page 58: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0: Identification: URN Structure

Examples:

• Maintainables:
urn:ddi:3.0:StudyUnit:ddialliance.org:StudyUnit_ID:1.0

• Versionables:
urn:ddi:3.0:ConceptScheme:ddialliance.org:ConceptScheme_ID:1.0:Concept:Concept_ID:2.1

• Identifiables:
urn:ddi:3.0:VariableScheme:ddialliance.org:VariableScheme_ID:1.0:Variable:Variable_ID

Parts labeled on the slide: Object name, Identifying Agency, Object ID, Object Version
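As a sketch only, the pattern visible in these three examples can be reproduced with simple string assembly. This mirrors the slide's examples and is not a statement of the normative DDI URN syntax; the function names and parameters are hypothetical.

```python
def maintainable_urn(object_name, agency, object_id, version, ddi_version="3.0"):
    """Build a URN for a maintainable, following the slide's example layout."""
    return f"urn:ddi:{ddi_version}:{object_name}:{agency}:{object_id}:{version}"


def child_urn(parent_urn, object_name, object_id, version=None):
    """Append a contained element to its maintainable's URN.
    Versionables pass a version; identifiables leave it out."""
    parts = [parent_urn, object_name, object_id]
    if version is not None:
        parts.append(version)
    return ":".join(parts)


scheme = maintainable_urn("ConceptScheme", "ddialliance.org",
                          "ConceptScheme_ID", "1.0")
print(child_urn(scheme, "Concept", "Concept_ID", "2.1"))
```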

Page 59: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0: Referencing

Reference structure:

• URN, and/or:
• [Referenced object’s] ID + Identifying Agency + Version + [Containing] Module ID + [Containing] Scheme ID

Page 60: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0: Reuse of Information

Mechanisms for REUSE: Referencing, Inheritance

Reuse of Information:

1. Facilitates development of documentation throughout the study life cycle
2. Promotes interoperability and standardization across organizations
3. Saves markup time and effort
4. Reduces the risk of human entry error
5. Provides a basic level of implicit comparability

Page 61: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 Modules

Content, Markup Examples

Page 62: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI Version 3.0 Modules -- Structural Overview --

[Diagram: DDI Instance at the top. Second level: Study Unit, Group, Resource Package. Third level: Study Unit, Subgroup, Study Unit, (Sub)Group. Module boxes: Concepts, Data Coll., Logical Pr., etc.]

Page 63: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Other “specialized” DDI 3.0 modules

• Aggregate Data:
– NCube Logical Product
– Inline NCube Record Layout
– NCube Record Layout
– Tabular NCube Record Layout

• Inline Microdata:
– Dataset

• User-specific Markup Templates:
– DDI Profile

Page 64: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI Version 3.0 Modules -- Structural Overview --

[Diagram: DDI Instance contains Study Unit and Group.
Study Unit: Conceptual Component, Data Collection, Logical Product, Physical Data Product, Physical Instance, Archive, Organizations.
Group: Conceptual Component, Data Collection, Logical Product, Archive, Study Unit, Group, Comparative.]

Page 65: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0

Modules used to mark up a simple study

Page 66: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 modules for documenting a single, survey-type study

[Diagram: the structural overview of DDI 3.0 modules. DDI Instance contains Study Unit and Group.
Study Unit: Conceptual Component, Data Collection, Logical Product, Physical Data Product, Physical Instance, Archive, Organizations.
Group: Conceptual Component, Data Collection, Logical Product, Archive, Study Unit, Group, Comparative.]

Page 67: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 modules for documenting a single, survey-type study

• [Reusable]
• [XHTML]
• Instance
– Study Unit
  • Conceptual Component
  • Data Collection
  • Logical Product
  • Physical Data Product
  • Physical Instance
  • Archive
    – Organizations

Page 69: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI Instance -- wrapper for all modules --

• Identification
– URN
– Identification Sequence
– Name
• Citation … (+ optional DC Elements)
• Coverage
– Topical
– Spatial
– Temporal
• Group (module) – repeatable
• Resource Package (module) – repeatable
• Study Unit (module) – repeatable
• Other Material(s)
• Note(s)
• Translation Information

Page 70: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Coverage in DDI 3.0

Study: American National Election Study (ANES), 2004

• Topical Coverage:
– Subject:
  • Historical and Contemporary Electoral Processes
– Keyword:
  • Electoral campaigns
  • Political attitudes
  • Political participation
• Spatial Coverage:
– Description: United States
– Top level: nation
– Lowest level: congressional district
• Temporal Coverage:
– Date: 2004

Page 72: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Study Unit -- documents a single “study” --

• Identification, Other Material(s), Note(s)
• Citation
• Abstract
• Universe Reference
• Funding Information
• Purpose
• Coverage
• Analysis Unit
• Embargo
• Conceptual Component (module)
• Data Collection (module)
• Logical Product (module)
• Physical Data Product (module)
• Physical Instance (module)
• Archive (module)
– Organizations (module)

Page 74: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Conceptual Component -- lists concepts and universes --

• Identification, Other Material(s), Notes
• Coverage
• Concept Scheme … or Reference to External Scheme
– Vocabulary – describes vocabulary used
– Concept
  • Label
  • Description
  • Similar Concept
    – Difference
– Concept Group
  • Concept Reference (nestable)
• Universe Scheme … or Reference to External Scheme
– Universe
  • Human Readable
  • Machine Readable
  • Subuniverse
    – Subuniverse

Page 76: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Data Collection

• Identification, Other Material(s), Note(s)
• Coverage
• Methodology
– Time Method
– Sampling
• Collection Event
– Data Collector
– Data Source
– Collection Date(s)
– Mode of data collection
• Question Scheme – lists actual questions
• Instrument – documents question flow, conditions
• Processing Event
– Control and cleaning operations
– Weighting
– Data Appraisal Information
– Coding

Page 78: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Logical Product -- documents intellectual content of data --

• Identification, Other Material(s), Note(s)
• Coverage
• Category Scheme … or Reference to external category scheme
– Category
  • Label
  • Derivation (if applicable)
  • Definition
• Code Scheme … or Reference to external code scheme
– Category Scheme Reference
– Hierarchy Type
– Level (in the hierarchy)
– Code
  • Category Reference
  • Value
  • Code (nestable)
• Variable Scheme … or Reference to external variable scheme
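In this outline, codes do not carry labels themselves: each Code holds a Value and a Category Reference, and the label lives in the Category Scheme. A small sketch of that separation follows. The category IDs, the code values, and the function decode are invented for illustration and do not come from any real study.

```python
# Hypothetical category scheme: category ID -> label.
category_scheme = {"CAT_YES": "Yes", "CAT_NO": "No"}

# Hypothetical code scheme: each code pairs a stored value
# with a reference to a category.
code_scheme = [
    {"value": "1", "category_ref": "CAT_YES"},
    {"value": "5", "category_ref": "CAT_NO"},
]


def decode(value, codes, categories):
    """Return the category label for a stored code value, or None."""
    for code in codes:
        if code["value"] == value:
            return categories[code["category_ref"]]
    return None


print(decode("5", code_scheme, category_scheme))
```

Because labels are kept apart from values, two code schemes with different numeric codings could point at the same categories.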

Page 79: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Logical Product – Variable Scheme: Variable

• Variable … or Reference to an externally documented variable
– Identification
  • Name
– Label
– Definition
– Universe Reference
– Concept Reference
– Question Reference
– Embargo Reference
– Response Unit
– Analysis Unit
– Representation
  • Imputation
  • Derivation
  • Coding Instructions
  • Value Representation:
    » Text
    » Date / Time
    » Numeric
    » Code

Page 80: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Logical Product – Variable Scheme: Variable Group

• Variable Group:
– Type
– Label
– Definition
– Universe Reference
– Concept Reference
– Variable Reference (lists variables in the group)
– Variable Group Reference (allows nesting of groups)

• Variable Group Reference (use for externally documented Variable Group)

Page 82: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Physical Data Product
-- Describes Physical Layout of Data --

• Identification, Other Material(s), Note(s)
• Logical Product Reference
• Gross Record Structure:
  – Records Per Case
  – Variable Quantity
  – Logical Record Reference
  – Physical Record Reference
• Related Logical Records
• Record Layout:
  – Data Item
    – Variable Reference
    – Physical Location
      – Value Location
        » StartPosition
        » Width
• Dataset (module)

Page 84: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Physical Instance
-- Documents a specific data file --

• Identification, Other Material(s), Note(s)
• Citation
• Coverage
• Physical Data Product Reference
• Data File Identification
  – Location
  – URI
• Gross File Structure
  – Creation Software
  – Case Quantity
  – Overall Record Count
• Statistics
  – Logical Product Reference
  – Variable Statistics
    • Variable Reference
    • Total Responses
    • Summary Statistics
    • Category Statistics
      » Value
      » Statistic
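
To make the Variable Statistics outline more concrete, the following Python sketch derives a total response count and per-category value/statistic pairs from a list of raw responses. The function name, the dictionary keys, and the response data are invented for illustration, and the statistic shown is a simple frequency count.

```python
from collections import Counter
from typing import Dict, List


def variable_statistics(variable: str, responses: List[str]) -> Dict[str, object]:
    """Summarize one variable as total responses plus value/statistic pairs."""
    counts = Counter(responses)
    return {
        "variable_reference": variable,
        "total_responses": len(responses),
        # One (value, frequency) pair per category, sorted by code value.
        "category_statistics": sorted(counts.items()),
    }


# Made-up responses for a variable coded 0-7, with 8 and 9 as missing codes.
stats = variable_statistics("V043015", ["0", "7", "3", "7", "8", "9", "7"])
print(stats["total_responses"])
print(stats["category_statistics"])
```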

Page 86: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Archive

• Identification, Other Material(s), Note(s)
• Archive Specific
  – Item
    • Location
    • Call Number
    • URI
    • Format
    • Media
    • Availability Status
  – Access
    • Confidentiality Statement
    • Access Permission
    • Restrictions
    • Citation Requirement
    • Deposit Requirement
    • Access Conditions
    • Disclaimer
    • Contact
  – Funding Information
• Life Cycle Information
  – Event
    • Type
    • Date
    • Agency
    • Description
• Organizations (module)

Page 88: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Organizations

• Identification
• Organization
  – URL
  – Individual
• Individual
  – Organization
  – Title
  – Language
• Role
  – Entity Reference
  – Organization Reference
  – Individual Reference
  – Description
  – Period
• Relation
  – Organization Reference
  – Individual Reference
  – Description
  – Period

• Name
• Description
• Location
• Telephone
• E-mail
• Relation

Page 89: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 Markup Example

A Survey Variable

Page 90: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Version 2.1 vs. Version 3.0 Example: A survey variable

ASCII codebook:

Page 91: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Version 2.1 vs. Version 3.0 Example: A survey variable in Version 2.1

Data Description:Variable

Page 92: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Version 2.1 vs. Version 3.0 Example: A survey variable in Version 2.1

name=“V043015”

Page 93: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Version 2.1 vs. Version 3.0 Example: A survey variable in Version 3.0

Logical Product: Variable Scheme
Data Collection: Question Scheme
Logical Product: Code Scheme
Logical Product: Category Scheme
Conceptual Component: Concept Scheme, Universe Scheme
Physical Instance: Statistics

Page 94: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Version 2.1 vs. Version 3.0 Example: A survey variable in Version 3.0

Logical Product: Variable Scheme: ID
  Variable: ID
Data Collection: Question Scheme: ID
  Question: ID
Logical Product: Code Scheme: ID
  Code
Logical Product: Category Scheme: ID
  Category: ID
Physical Instance: Statistics:
  Variable Statistic
  Category Statistics
Conceptual Component:
  Concept Scheme: Concept: ID
  Universe Scheme: (Sub)Universe: ID

Page 95: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 Markup: A Survey Variable
Concept

Concept: Attention to Presidential Campaign on National TV

Conceptual Component: Concept Scheme: Concept

Page 97: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 Markup: A Survey Variable
Universe

Conceptual Component: Universe Scheme: (Sub)Universe

(A7: How many days in the PAST WEEK did you watch the NATIONAL network news on TV? 0-7; 8=DK; 9=RF)

Page 99: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 Markup: A Survey Variable
Question ID, Question Text

Data Collection: Question Scheme: Question Item

Page 100: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 Markup: A Survey Variable
Question ID, Question Text

Other Response Domains:

Page 101: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 Markup: A Survey Variable
Variable name, label, type of physical representation

Logical Product: Variable Scheme: Variable

Page 102: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 Markup: A Survey Variable
Variable name, label, type of physical representation

Other types of Representation:

Page 103: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 Markup: A Survey Variable
Category labels, missing data information

Logical Product: Category Scheme: Category

Page 104: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 Markup: A Survey Variable
Category labels, missing data information

missing=“true”

Page 105: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 Markup: A Survey Variable
Category Values

Logical Product: Code Scheme: Code

Page 107: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 Markup: A Survey Variable
Statistics

Physical Instance: Statistics: Variable Statistics: Category Statistic

Page 109: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 Markup: A Survey Variable Logical Product Module

Page 110: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 Markup
Modules used in a full variable description

• Concept
• Universe
• Question
• Values
• Value Labels
• Variable name
• Variable label
• Statistics
• Location: Physical Data Product

Page 111: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 Modular Approach
Advantages

• Modules and schemes can be independently maintained.

• Pieces of information can be reused without being repeated.
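
The reuse idea can be sketched in a few lines of Python. Plain dictionaries stand in for a category scheme, a code scheme, and a variable scheme. All IDs, the second variable `V043016`, and the label wordings are hypothetical. Both variables point at one code scheme by reference, so a label corrected once in the category scheme is seen by both.

```python
# Hypothetical IDs throughout; plain dicts stand in for DDI schemes.
categories = {"CAT_DK": "Don't know", "CAT_RF": "Refused"}  # category ID -> label
code_schemes = {"CS_1": {"8": "CAT_DK", "9": "CAT_RF"}}     # code value -> category ID
variables = {"V043015": "CS_1", "V043016": "CS_1"}          # variable -> code scheme ID


def label_for(variable: str, value: str) -> str:
    """Follow the references variable -> code scheme -> category -> label."""
    scheme_id = variables[variable]
    category_id = code_schemes[scheme_id][value]
    return categories[category_id]


before = label_for("V043016", "9")
# Maintain the category scheme on its own: one edit, no variable is touched.
categories["CAT_RF"] = "Refused to answer"
after = [label_for(name, "9") for name in variables]
print(before, after)
```

Had each variable carried its own copy of the labels, the same correction would need to be repeated once per variable.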

Page 112: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 Modular Approach: Reusing information

Page 113: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Variable Markup in Version 2
-- carries redundant information --

Page 114: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Variable Markup in Version 3.0 Modular Approach: Reusing Information

Page 115: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0

Grouping

Page 116: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0: Groups

• Entirely new feature in DDI 3.0.

• Designed to document and compare related studies.

Page 119: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Group
-- documents “families” of studies --

• Identification, Other Material(s), Note(s)
• Citation
• Abstract
• Universe
• Funding Information
• Purpose
• Coverage
• Universe Reference
• Conceptual Component (module)
• Data Collection (module)
• Logical Product (module)
• Archive (module)
  – Organizations (module)
• Study Unit (module)
• Group (module)
• Comparative (module)

Page 120: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 Grouping Attributes

• A set of mandatory attributes indicates the nature of the relationships among group members.
• Group parameters:
  – Time
  – Instrument
  – Panel (population of respondents)
  – Geography
  – Datasets
  – Language

Page 121: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 Grouping Attributes Example

Page 122: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0: Types of Groups

• Groups of studies may be:
  – Formal (“by design”):
    • Designed to be compared (longitudinal, time-series, or cross-national studies)
    • Documented and compared through use of Inheritance
  – Informal (“ad-hoc”):
    • Decision to group and compare is taken post-production, or “after the fact”.
    • Comparability documented in the Comparative module

Page 123: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Formal Groups: Inheritance

Example 1: Time-series: Same questions repeated over time, same resulting variables.

Group (Studies A-C)
  Temporal Coverage_G1: 1991-1993
  Data Collection: Question Scheme
  Logical Product: Variable Scheme

  Study A
    Temporal Coverage: 1991 (Replace Ref: G_1)
    Physical Data Product
    Physical Instance: Statistics

  Study B
    Temporal Coverage: 1992 (Replace Ref: G_1)
    Physical Data Product
    Physical Instance: Statistics

  Study C
    Temporal Coverage: 1993 (Replace Ref: G_1)
    Physical Data Product
    Physical Instance

Page 124: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Formal Groups: Inheritance
Attributes “Add”, “Replace”, “Delete”

• In a complex grouping structure inheritance paths may become quite intricate.
• ID attributes ADD, REPLACE and DELETE are introduced to resolve potential inheritance ambiguities:
  – ADD = [empty] -> flags element as a new addition.
  – REPLACE = “ReferenceType” -> referenced element is being replaced at the lower level (“local override”).
  – DELETE = “ReferenceType” -> referenced element is being deleted at the lower level.
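
The effect of the three attributes can be imitated with a small Python sketch: a study starts from the metadata it inherits from its group and then applies its local additions, replacements, and deletions. The `Override` class, the `resolve` function, and all IDs other than `G_1` are hypothetical. Real DDI 3.0 instances express this in XML attributes, not in code like this.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Override:
    """One lower-level change; action is "add", "replace" or "delete"."""
    action: str
    item_id: Optional[str] = None    # ID of the new local item (add, replace)
    value: Optional[str] = None      # content of the new local item
    target_id: Optional[str] = None  # inherited item being replaced or deleted


def resolve(inherited: Dict[str, str], overrides: List[Override]) -> Dict[str, str]:
    """Return a study's effective metadata: inherited items plus local changes."""
    effective = dict(inherited)  # copy, so the group level is never modified
    for change in overrides:
        if change.action == "add":
            effective[change.item_id] = change.value
        elif change.action == "replace":
            del effective[change.target_id]           # drop the inherited item
            effective[change.item_id] = change.value  # use the local one instead
        elif change.action == "delete":
            del effective[change.target_id]
        else:
            raise ValueError(f"unknown action: {change.action}")
    return effective


group = {"G_1": "Temporal Coverage: 1991-1993", "Q1": "Core question 1"}
study_a = resolve(group, [
    Override("replace", item_id="A_1", value="Temporal Coverage: 1991", target_id="G_1"),
    Override("add", item_id="Q51A", value="Health status question"),
])
print(study_a)
```

Because each change names the inherited element it affects, a reader (or a program) does not have to guess whether a lower-level element is new or an override.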

Page 125: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Formal Groups: Inheritance

Example 2: Time-series: Same core questions repeated over time, different topical modules added to each iteration.

Group (Studies A-C)
  Data Collection: Core Questions (Q1-Q50)
  Logical Product: Core Variables (V1-V50)

  Study A: Topical Module “Health Status”
    Data Collection: ADD: Questions (Q51A-Q80A)
    Logical Product: ADD: Variables (V51A-V80A)

  Study B: Topical Module “Gun Control”
    Data Collection: ADD: Questions (Q51B-Q80B)
    Logical Product: ADD: Variables (V51B-V80B)

  etc…

Page 126: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Formal Groups: Inheritance

Example 3: Any group by design: some questions are not asked in some iterations.

Group (Studies A-E)
  Data Collection: All Questions (Q1-Q100)
  Logical Product: All Variables (V1-V100)

  Study A

  Study B
    Data Collection: DELETE: Question Q55
    Logical Product: DELETE: Variable V55

  Group (Studies C-E)
    Data Collection: DELETE: Questions Q60-Q69
    Logical Product: DELETE: Variables V60-V69

    Study C
    Study D
    Study E
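
Example 3 can be traced with a short Python sketch that walks from a study up through its parent groups and collects every deletion on the way. The `levels` table and the function names are hypothetical, and the sketch assumes the group for Studies C-E is nested inside the group for Studies A-E. The question ranges are the ones given in the example.

```python
from typing import Dict, Optional, Set, Tuple


def question_ids(first: int, last: int) -> Set[str]:
    """Build the ID set Q<first>..Q<last>, inclusive."""
    return {f"Q{n}" for n in range(first, last + 1)}


# Each level maps to (parent level or None, question IDs deleted at that level).
levels: Dict[str, Tuple[Optional[str], Set[str]]] = {
    "Group A-E": (None, set()),
    "Study A": ("Group A-E", set()),
    "Study B": ("Group A-E", {"Q55"}),
    "Group C-E": ("Group A-E", question_ids(60, 69)),
    "Study C": ("Group C-E", set()),
}
top_level_questions = question_ids(1, 100)


def effective_questions(name: Optional[str]) -> Set[str]:
    """Questions that apply at a level after all deletions along its path."""
    deleted: Set[str] = set()
    while name is not None:
        parent, local_deletes = levels[name]
        deleted |= local_deletes
        name = parent
    return top_level_questions - deleted


for study in ("Study A", "Study B", "Study C"):
    print(study, len(effective_questions(study)))
```

Study C never mentions Q60-Q69 itself: it loses them because the deletion is recorded once on its parent group.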

Page 127: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Formal Groups: Inheritance

Example 4 (SOEP, Germany): Longitudinal: Same variables, with different name each year.

(No name)

ADD: Name only

Page 128: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Formal Groups: Inheritance

Example 5 (SOEP, Germany): Longitudinal: In 2002 variable “Income” changes currency from DM to Euro: change in question wording.

(No question)

ADD: question only

Page 129: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Formal Groups: Inheritance

Example 5 (SOEP, Germany) continued: These variables also change names every year…

Page 130: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Formal Groups: Inheritance

Example 5 (SOEP, Germany) – the final picture: information is inherited down the hierarchy.

Page 131: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Inheritance in Formal Groups

• Simplification of DDI Instances: common metadata is only entered once.

• More efficient means of documentation: for new additions, only differences need to be specified.

• Relational information embedded in the inheritance structure: comparison becomes machine-actionable.

Page 133: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Comparative
-- documents comparability in ad-hoc groups --

• Identification, Note(s)
• Comparison Description (human-readable)
• Concept Map
  – Source Scheme Reference
  – Target Scheme Reference
  – Item Map
    • Source Item
    • Target Item
    • Map Type
    • Difference
• Variable Map
• Question Map
• Category Map
• Code Map
• Universe Map
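
The shape of a comparison map can be mirrored in a small Python data structure: a map names a source scheme and a target scheme and then lists item-to-item correspondences. Everything in the sketch (class names, study and variable names, and the map type labels "identical" and "similar") is invented for illustration and is not taken from the DDI schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class ItemMap:
    source_item: str
    target_item: str
    map_type: str         # hypothetical label, not a DDI enumeration
    difference: str = ""  # human-readable note on how the items differ


@dataclass
class SchemeMap:
    source_scheme: str
    target_scheme: str
    item_maps: List[ItemMap] = field(default_factory=list)

    def targets_of(self, source_item: str) -> List[str]:
        """List the target items recorded as counterparts of a source item."""
        return [m.target_item for m in self.item_maps if m.source_item == source_item]


variable_map = SchemeMap(
    source_scheme="StudyX_Variables",
    target_scheme="StudyY_Variables",
    item_maps=[
        ItemMap("X_AGE", "Y_AGE", "identical"),
        ItemMap("X_INCOME", "Y_INC", "similar", "Different currency"),
    ],
)
print(variable_map.targets_of("X_INCOME"))
```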

Page 134: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 Using the Comparative Module

Instructions on how to use the Comparative Module and build comparison maps:

“DDI 3.0 User Guide”, pp. 45-49. http://www.ddialliance.org/DDI/ddi3

Page 135: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Producing DDI 3.0 markup

Getting started

Page 136: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0: Tools projects

DDI Toolkit:

• Core library for developing open source tools

• Version 1/2 <-> Version 3.0 converters
• DDI 3.0 URN resolution tool
• DDI 3.0 validation tool
• Version 3.0 stylesheets with display and editing layers
• Grouping tool
• Concept management tool
• Registry applications

Page 137: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Producing DDI 3.0 markup
-- Getting started --

Software to assist in document creation:

• DeXtris:
  – XML browser
  – Converts DDI 1/2 to DDI 3.0

http://www.opendatafoundation.org/tools/dextris

Page 138: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0 Tools: Using DeXtris

Page 147: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Producing DDI 3.0 markup
-- Getting started --

Software to assist in document creation:

• SPSS system to DDI 3.0 converter (see description and link on the DDI 3.0 Proof of Concept page):

http://www.ddialliance.org/DDI/ddi3/proof.html

Page 148: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Producing DDI 3.0 markup
-- Getting started --

XML editors

oXygen:

• Create new DDI instance

• Edit/update DDI instance

• Validate DDI instance

• View schemas

Page 149: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI 3.0: Viewing Schemas in oXygen

Page 151: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Producing DDI 3.0 markup
-- Getting started --

Other tools to assist in producing DDI 3.0 markup:

• DDI “core” template
• Version 3.0 documentation:
  – Module descriptions
  – Field level documentation
  – DDI Help Center

http://www.ddialliance.org/ddi3/index.html

Page 152: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

Producing DDI 3.0 markup
-- Using multiple modules --

Resource:

“Getting Started with DDI 3.0”

http://www.ddialliance.org/DDI/ddi3/getting-started.html

Page 153: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI Version 3.0
Displaying Markup

Stylesheets:

• Basic:
  – Web presentation in XHTML
• Enhanced:
  – Adds graphics for presenting frequencies
  – Automated calculation of valid percentages

http://www.ddialliance.org/DDI/ddi3/proof.html
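
For readers unfamiliar with the term, a valid percentage is a category's share of the responses once missing codes are set aside. The Python sketch below shows that arithmetic under the assumption that codes 8 and 9 are the missing ones, as in the example variable. The frequencies are made up, the function is hypothetical, and this is not how the stylesheet itself is implemented.

```python
from typing import Dict, Set


def valid_percentages(frequencies: Dict[str, int], missing: Set[str]) -> Dict[str, float]:
    """Percentage of each non-missing code among all non-missing responses."""
    valid = {code: n for code, n in frequencies.items() if code not in missing}
    valid_total = sum(valid.values())
    if valid_total == 0:
        return {}  # nothing valid to report, and avoids dividing by zero
    return {code: 100 * n / valid_total for code, n in valid.items()}


# Made-up frequency table; 8 and 9 are treated as missing data codes.
frequencies = {"0": 10, "3": 20, "7": 50, "8": 15, "9": 5}
percentages = valid_percentages(frequencies, missing={"8", "9"})
print(percentages)
```

Here 80 of the 100 responses are valid, so code 7 with 50 responses accounts for 62.5 percent of valid responses, not 50 percent.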

Page 154: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

DDI Version 3.0
Questions? Comments?

• Sanda Ionescu: [email protected]

• DDI Users Listserv:

[email protected]

http://www.ddialliance.org/codebook/listserv.html

Page 155: Introduction to DDI 3.0 Sanda Ionescu ICPSR CESSDA Expert Seminar, September 2007.

The End