DDI 3.0 Conceptual Model Chris Nelson
  • date post

    20-Dec-2015
  • Category

    Documents


Page 1: DDI 3.0 Conceptual Model Chris Nelson. Why Have a Model Non syntactic representation of the business domain Useful for identifying common constructs –Identification,

DDI 3.0

Conceptual Model Chris Nelson

Page 2:

Why Have a Model

• Non syntactic representation of the business domain

• Useful for identifying common constructs
  – Identification, versioning etc.
  – “patterns”

• A good basis for designing syntactic representation (e.g. XML) schemas, databases, and processing systems
  – Industry tools support this process (e.g. EMF)

Page 3:

DDI 2.0

[Diagram: DDI 2.0 components: Variable Scheme, Physical Data Product, Physical Instance, Archive, Study Unit, Category Scheme, Questions]

• Driven by the need to archive data

• Developed as an XML DTD

• No formal conceptual model

• No re-use of artifacts

Page 4:

DDI 3.0 design goals – life-cycle model

[Diagram: life-cycle stages (the statistical production process, archiving, (secondary) use of data), with the coverage of DDI 2.0 and DDI 3.0 marked]

Page 5:

DDI 2.0

[Diagram: DDI 2.0 components: Variable Scheme, Physical Data Product, Physical Instance, Archive, Study Unit, Category Scheme, Questions]

Page 6:

DDI 3.0

[Diagram: DDI 3.0 components: Identification, Item Schemes, Item Scheme Associations, Component Schemes, Organisations; DDI Base; Group; Variable Scheme; NCube Record Layout; Physical Data Product; Physical Instance; Archive; Study Unit; Data/Metadata Resource; Metadata Report; Structural Metadata; Data/Metadata Management; Category Scheme; Concept Scheme; Data & Metadata Structure; Question Bank; Instrument]

Page 7:

DDI 3.0

[Diagram: DDI 3.0 components: Identification, Item Schemes, Item Scheme Associations, Component Schemes, Organisations; DDI Base; Group; Variable Scheme; NCube Record Layout; Physical Data Product; Physical Instance; Archive; Study Unit; Data/Metadata Resource; Metadata Report; Structural Metadata; Data/Metadata Management; Category Scheme; Concept Scheme; Data & Metadata Structure; Question Bank; Instrument. Layout variants shown: In Line NCube Record Layout, NCube Table Layout, NCube Layout]

Page 8:

Instrumentation module

A module in DDI 3.0 to describe survey instruments in a system independent way.

and others

To be used to drive data capturing systems or to pick up the output from these systems.

• Important metadata is entered at this stage and should be carried forward to the end data product.

• Information about question flow, cues presented to the respondents etc. is important for the interpretation of the data.

• Often complex relationships between questions and variables.

[Diagram: one question (Q) linked to three variables (V)]

Page 9:

UML Constructs as used in the DDI Conceptual Model

Page 10:

Classes and Associations (1)

[Diagram: association between Variable (from VariableStandard) and ConceptItem (from ConceptStandard), role +conceptualSemantic, cardinality 0..* at both ends]

Cardinalities:

0..*   zero or more
0..1   zero or one
1..*   one or more
1      one
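As a minimal sketch of how such an association might be carried into code, the 0..* to 0..* link between Variable and ConceptItem can be held as a list on each side. Python is used purely for illustration; the class names and the conceptualSemantic role come from the diagram, while `add_concept`, the `variables` back-reference and the sample names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)  # identity comparison avoids recursing through the two-way links
class ConceptItem:
    name: str
    # Hypothetical back-reference: zero or more Variables (0..*)
    variables: List["Variable"] = field(default_factory=list)


@dataclass(eq=False)
class Variable:
    name: str
    # Role name from the diagram: zero or more ConceptItems (0..*)
    conceptual_semantic: List[ConceptItem] = field(default_factory=list)

    def add_concept(self, concept: ConceptItem) -> None:
        """Link both ends so the association stays consistent."""
        if concept not in self.conceptual_semantic:
            self.conceptual_semantic.append(concept)
            concept.variables.append(self)


# Invented sample data
age = ConceptItem("Age")
v1 = Variable("AGE_YEARS")
v2 = Variable("AGE_GROUP")
v1.add_concept(age)
v2.add_concept(age)
```

Both lists may be empty, which is what distinguishes 0..* from 1..*; a 1..* end would need an extra check that the list is not empty.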

Page 11:

Classes and Associations (2) - Aggregates

[Diagram: CategoryScheme aggregates CategoryItem; cardinalities shown: 1..* and 0..1. Two notations are illustrated: aggregate by reference and aggregate by value]

CategoryItem is subordinate to and “belongs to” CategoryScheme.

In the model diagrams in this presentation no distinction is made between aggregate by reference and aggregate by value. All aggregates are shown with an open diamond.
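A rough sketch of the aggregate in code, assuming the cardinalities read as "a scheme has one or more items" (1..*) and "an item belongs to at most one scheme" (0..1); which end carries which cardinality is an assumption, and `add` and `is_valid` are hypothetical helpers.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class CategoryItem:
    label: str
    # Assumed 0..1 end: an item belongs to at most one scheme
    scheme: Optional["CategoryScheme"] = None


@dataclass(eq=False)
class CategoryScheme:
    name: str
    items: List[CategoryItem] = field(default_factory=list)

    def add(self, item: CategoryItem) -> None:
        """Make the item subordinate to this scheme."""
        if item.scheme is not None:
            raise ValueError(f"{item.label!r} already belongs to a scheme")
        item.scheme = self
        self.items.append(item)

    def is_valid(self) -> bool:
        # Assumed 1..* end: a scheme needs at least one item
        return len(self.items) >= 1


# Invented sample data
sex = CategoryScheme("Sex")
male = CategoryItem("Male")
sex.add(male)
sex.add(CategoryItem("Female"))
```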

Page 12:

Classes and Associations (3) - Unidirectional

[Diagram: NCubeLogicalProduct, DataAttribute and Variable; DataAttribute has a unidirectional association +takesSemanticFrom to Variable (cardinality 1)]

Variable is navigable from DataAttribute but not vice versa.
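In code, a unidirectional association usually means that only one side holds a reference. A small sketch, with invented sample names:

```python
from dataclasses import dataclass


@dataclass
class Variable:
    name: str  # deliberately holds no reference back to DataAttribute


@dataclass
class DataAttribute:
    name: str
    takes_semantic_from: Variable  # exactly one Variable (cardinality 1)


# Invented sample data
status = DataAttribute("OBS_STATUS", takes_semantic_from=Variable("STATUS"))
```

From `status` the Variable can be reached directly; finding the DataAttributes that use a given Variable would require a search, since Variable has no knowledge of them.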

Page 13:

Sub Classes - Inheritance

[Diagram: DimensionVariable is a subclass of Variable; Variable has an association +documentation (0..*) to MetadataReport (from Metadata)]

DimensionVariable inherits from Variable (i.e. it is a “specialisation” of Variable). Therefore DimensionVariable can have an association to MetadataReport. However, any associations from DimensionVariable are specific to DimensionVariable and are not applicable to Variable.
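The same rule can be sketched with ordinary class inheritance. The `position` attribute below is hypothetical and stands in for anything that is specific to DimensionVariable.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MetadataReport:
    title: str


@dataclass
class Variable:
    name: str
    # +documentation: zero or more reports (0..*)
    documentation: List[MetadataReport] = field(default_factory=list)


@dataclass
class DimensionVariable(Variable):
    # Hypothetical attribute that exists on the subclass only
    position: int = 0


# Invented sample data
dim = DimensionVariable("COUNTRY", position=1)
dim.documentation.append(MetadataReport("Coding instructions"))
plain = Variable("INCOME")
```

`dim` gets `documentation` through inheritance, while `plain` has no `position`.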

Page 14:

Abstract Classes

[Diagram: abstract class IdentifiableArtefact (id : String, uri : String, urn : String) with subclass Instrument (from DataCollection), and associations +name (0..1) and +description (0..1) to InternationalString (from DDI_Base)]

An abstract class is drawn because it is a useful way of grouping classes and avoids drawing a complex diagram with many association lines. It is not foreseen that the class serves any other purpose (i.e. it is always implemented as one of its subclasses).

Here Instrument inherits the attributes id, uri and urn.

Instrument can have a multilingual name and description.
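A sketch of the abstract class using Python's `abc` module. The attribute names follow the diagram; modelling InternationalString as a dictionary from language code to text, and the `artefact_type` hook that makes the base class non-instantiable, are assumptions made for this example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Dict, Optional

# Assumed shape: language code -> text
InternationalString = Dict[str, str]


@dataclass
class IdentifiableArtefact(ABC):
    id: str
    uri: Optional[str] = None
    urn: Optional[str] = None
    name: InternationalString = field(default_factory=dict)
    description: InternationalString = field(default_factory=dict)

    @abstractmethod
    def artefact_type(self) -> str:
        """Hypothetical hook; its presence prevents direct instantiation."""


@dataclass
class Instrument(IdentifiableArtefact):
    def artefact_type(self) -> str:
        return "Instrument"


# Invented sample data
inst = Instrument(id="instr-1",
                  name={"en": "Household survey", "fr": "Enquête ménages"})
```

Calling `IdentifiableArtefact(id="x")` raises `TypeError`, which mirrors the statement that the abstract class is always implemented as one of its subclasses.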

Page 15:

Instrument - Simplified Class Diagram

[Class diagram: DataCollection, Instrument, SoftwarePackage, ControlConstruct, Sequence, IfThenElse (+thenCondition 1, +elseCondition 0..1), Loop, QuestionConstruct, QuestionItem, ComputationItem, StatementItem, DisplayText, MetadataReport (from Metadata, +documentation). Attributes shown: type : String, description : StructuredString. The MetadataReport contains metadata attribute values for: Methodology, CollectionEvent, Note, Universe, OtherMaterial]

Page 16:

Question Bank - Simplified Class Diagram (not in the schemas)

[Class diagram: QuestionBank, QuestionItem, MultipleQuestionItem (+sub question, 1..*), QuestionText, ResponseDomain, MetadataReport (from Metadata, +documentation), Variable (from VariableStandard), ConceptItem (from ConceptStandard, +conceptualSemantic). Attribute shown: sequence : Integer. The MetadataReport holds metadata for: Question Intent, Visual Aid, Response Unit, Analysis Unit]

Here the multiple question item can (must) have at least one sub item, which can be another multiple question item or a single question. At the bottom of the tree there will be only single questions.
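The rule that a multiple question item must have at least one sub item, and that only single questions sit at the bottom of the tree, can be sketched as a small recursive structure. The class names follow the diagram; the `leaves` helper and the sample questions are invented for illustration, and the two classes are kept unrelated here for simplicity.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Union


@dataclass
class QuestionItem:
    text: str


@dataclass
class MultipleQuestionItem:
    text: str
    sub_questions: List[Union["MultipleQuestionItem", QuestionItem]] = field(
        default_factory=list
    )

    def __post_init__(self) -> None:
        # 1..*: at least one sub item is required
        if not self.sub_questions:
            raise ValueError("a multiple question item needs at least one sub item")


def leaves(item: Union[MultipleQuestionItem, QuestionItem]) -> Iterator[QuestionItem]:
    """Yield the single questions at the bottom of the tree, in order."""
    if isinstance(item, QuestionItem):
        yield item
    else:
        for sub in item.sub_questions:
            yield from leaves(sub)


# Invented sample data
grid = MultipleQuestionItem("Household", [
    QuestionItem("How many adults?"),
    MultipleQuestionItem("Children", [
        QuestionItem("How many under 5?"),
        QuestionItem("How many aged 5 to 17?"),
    ]),
])
single_texts = [q.text for q in leaves(grid)]
```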

Page 17:

Variable - Simplified Class Diagram

[Class diagram: VariableScheme, Variable, DerivedVariable, VariableConcatenation, Representation, VariableRepresentation, ConceptItem (+conceptualSemantic), QuestionItem, MetadataReport (from Metadata, +documentation). Other roles shown: +usedBy, +valueStoredIn. One MetadataReport holds metadata for: Definition, Universe, Embargo, ResponseUnit, AnalysisUnit. A second holds metadata for: Description, (Formal) Derivation Rules]

Page 18:

Identifying potentially comparative data

• The grouping mechanism can be used to mark up families of studies that from the outset have been designed to be comparable.

• ...or families of studies that have been made comparable through a harmonization process.

• However, none of these mechanisms reaches beyond the limit of the DDI 3.0 wrapper that binds the family of studies together.

• One of the biggest challenges for DDI 3.0 has been to define a way to describe relationships between variables across DDI wrappers, collections and servers.

• Use case: “Give me more variables like this”, in other words the ability to identify potentially comparative variables across studies, collections, archives and locations.

Page 19:

Identifying potentially comparative data

• There is a mechanism in the existing DDI that, to a certain degree, will allow you to do this: the ability to assign concepts from external vocabularies to variables.

[Diagram: variables V1, V2, V3 of Study 1 and V4, V5, V6 of Study 2 linked to concepts (C1; C11, C12, C13) in an external vocabulary. Class view: Variable 0..* to 0..* ConceptItem, role +conceptualSemantic]
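The “give me more variables like this” use case can be sketched as a lookup through shared concepts. The study, variable and concept labels are taken from the diagram, but which concept is assigned to which variable is invented for illustration, and `build_index` and `more_like` are hypothetical helpers.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

VariableRef = Tuple[str, str]  # (study, variable)

# Invented assignments of external-vocabulary concepts to variables
assignments: Dict[VariableRef, Set[str]] = {
    ("Study 1", "V1"): {"C11"},
    ("Study 1", "V2"): {"C12"},
    ("Study 1", "V3"): {"C13"},
    ("Study 2", "V4"): {"C11"},
    ("Study 2", "V5"): {"C13"},
    ("Study 2", "V6"): {"C12", "C13"},
}


def build_index(assigned: Dict[VariableRef, Set[str]]) -> Dict[str, List[VariableRef]]:
    """Invert the mapping: concept -> variables that carry it."""
    index: Dict[str, List[VariableRef]] = defaultdict(list)
    for variable, concepts in assigned.items():
        for concept in concepts:
            index[concept].append(variable)
    return index


def more_like(variable: VariableRef,
              assigned: Dict[VariableRef, Set[str]]) -> List[VariableRef]:
    """Return the other variables that share at least one concept."""
    index = build_index(assigned)
    found: Set[VariableRef] = set()
    for concept in assigned[variable]:
        found.update(index[concept])
    found.discard(variable)
    return sorted(found)


similar_to_v3 = more_like(("Study 1", "V3"), assignments)
```

This only finds candidates that share an identical concept; it says nothing about how comparable the variables really are.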

Page 20:

Identifying potentially comparative data

• In DDI 3.0 there will be a more elaborate solution to the same problem: a specification of an external registry-like question bank or classification database that will allow you to register concepts, questions and variables.

• The specification can be used to set up local question banks or question banks that are global to many organizations.

• The specification will also support statements about differences between registered variables.

[Diagram: variables V1, V2, V3 of Study 1 and V4, V5, V6 of Study 2 linked to items I1, I2, I3, I4 in an external registry, with a “Diff” statement]

• The registry can be seen as an extension to a standard DDI document.

• ...but the specification might also include the interfaces to allow this to be set up and run as a proper registry on the Web.

Page 21:

Registries

• Contain metadata that allows users/applications to find things

• The objects themselves do not need to be in the registry
  – But must be accessible over the internet (preferably accessible by standardised queries and retrievable in a standardised format)
  – E.g. questions in a question bank, category schemes, variables

• Registries can have repositories to store local content

• Registry standards exist and registry products are available
  – But they need to be customised to support the domain (e.g. customised software that understands the DDI model and syntax implementation)

• If objects can be identified in a globally unique way, then they can be accessed and shared
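A toy sketch of the registry idea: the registry keeps only descriptive metadata and a location, keyed by a globally unique identifier, while the object itself stays wherever it is published. The class names, fields, URNs and URLs below are all hypothetical and do not represent any existing registry standard or product.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class RegistryEntry:
    urn: str                        # globally unique identifier
    object_type: str                # e.g. "question", "variable"
    location: str                   # where the object itself can be retrieved
    keywords: Tuple[str, ...] = ()  # metadata used for finding things


class Registry:
    def __init__(self) -> None:
        self._entries: Dict[str, RegistryEntry] = {}

    def register(self, entry: RegistryEntry) -> None:
        if entry.urn in self._entries:
            raise ValueError(f"duplicate identifier: {entry.urn}")
        self._entries[entry.urn] = entry

    def find(self, object_type: str, keyword: str) -> List[RegistryEntry]:
        """Return entries of the given type that carry the keyword."""
        return [e for e in self._entries.values()
                if e.object_type == object_type and keyword in e.keywords]


# Invented sample data
registry = Registry()
registry.register(RegistryEntry(
    "urn:example:agency-a:question:q1", "question",
    "https://example.org/agency-a/q1.xml", ("age",)))
registry.register(RegistryEntry(
    "urn:example:agency-b:variable:v7", "variable",
    "https://example.org/agency-b/v7.xml", ("age", "income")))
hits = registry.find("variable", "age")
```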

Page 22:

Data Analysis

[Diagram: Data & Metadata Structure, Physical Data Product, NCube Record Layout]

Page 23:

Cube Structure - Simplified

[Class diagram: NCubeLogicalProduct, NCubeStructure, Dimension, Measure, DataAttribute, CoordinateGroup, AttachableArtefact (shown with Measure, Dimension and CoordinateGroup), Constraint, Label, DimensionVariable (from VariableStandard), Variable (from VariableStandard), MetadataReport (from Metadata, +documentation). Roles shown: +takesSemanticFrom (1), +attachesTo (1). Some associations are marked {ordered}. One MetadataReport holds metadata for: Universe, Definition. Another holds metadata for: Universe, Definition, Imputation, ResponseUnit, AnalysisUnit, Purpose. Note on the diagram: “can also link to a Variable”]

Page 24:

Data Structure

[Diagram: DataAttribute, Dimension, Measure, MeasureItem, AttributeItem, ItemValue, ReferencedValue, EmbeddedDataValue, CubeCoordinate, Variable, CoordinateVariable (+cubeCoordinateVariable, 1), shown against NCube Record Layout, NCube Logical Product and Physical Instance]

Page 25:

Cube Data – Contains or Points to Data

[Class diagram:
  EmbeddedDataValue (value : String)
  ReferencedValue
  ValueLocation (startPosition : Integer, width : Integer, decimalPosition : Integer, decimalSeparator : String, groupingSeparator : String, delimiter : String, variableNamesSpecified : Boolean, explicitDataType : String)
  ColumnPosition (columnPosition : Integer)
  CubeCoordinate (number : Integer)
  CoordinateVariable (+cubeCoordinateVariable, 1), Variable (from VariableStandard)
  ItemValue {TabularNCube}, MeasureItem, AttributeItem (+attribute, 0..*)
  Dimension, DataAttribute and Measure (from LogicalProduct), each the target of a +valueFor (1) role
  Note on the diagram: Link to the Cube Structure Definition]
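The “contains or points to” distinction can be sketched as two kinds of item value: one that carries the value itself and one that says where to read it in a fixed-width record. Only `value`, `startPosition` and `width` come from the diagram. Linking ReferencedValue directly to a ValueLocation, treating startPosition as 1-based, and the `resolve` helper are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class EmbeddedDataValue:
    value: str  # the value is held in the metadata itself


@dataclass
class ValueLocation:
    start_position: int  # assumed to be 1-based
    width: int


@dataclass
class ReferencedValue:
    location: ValueLocation  # assumed link; the value lives in the data file


def resolve(item_value: Union[EmbeddedDataValue, ReferencedValue], record: str) -> str:
    """Return the value, reading it from the record when it is only referenced."""
    if isinstance(item_value, EmbeddedDataValue):
        return item_value.value
    start = item_value.location.start_position - 1
    return record[start:start + item_value.location.width].strip()


# Invented fixed-width record: country (2), year (4), count (6, right-aligned)
record = "FR2005  1234"
referenced = resolve(ReferencedValue(ValueLocation(start_position=7, width=6)), record)
embedded = resolve(EmbeddedDataValue("42"), record)
```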

Page 26:

DDI 3.0 Metadata

• Metadata constructs that are fairly generic and can be attached at various places in the hierarchy.

• Examples:
  – Coding instructions
  – Description of time and geography
  – Citation/Abstract
  – Methodology etc.

• The DDI model contains a metamodel for metadata structures:
  – Identifies the object types to which metadata can be attached
  – Specifies the category/concept schemes that contain the list of valid identifiers for the object types
  – Specifies the metadata reports that can be made (e.g. coding instructions, citation) in terms of
    • Attributes
    • Value domain (e.g. format) of the attributes
    • Reporting hierarchy of the attribute
  – Identifies to which object types the metadata report can be associated
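A simplified sketch of what such a metamodel allows: a report structure declares its attributes, their hierarchy and the object types it may be attached to, and a concrete report can then be checked against it. The class names MetadataAttribute and ReportStructure and the isMandatory flag appear in the model diagrams; the `validate` function and the sample attribute names (Title, Creator, Affiliation) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class MetadataAttribute:
    concept: str
    is_mandatory: bool = False
    # Reporting hierarchy: child attributes
    children: List["MetadataAttribute"] = field(default_factory=list)


@dataclass
class ReportStructure:
    name: str
    attaches_to: Set[str]  # object types the report may be attached to
    attributes: List[MetadataAttribute]


def validate(structure: ReportStructure, target_type: str,
             values: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the report conforms."""
    problems: List[str] = []
    if target_type not in structure.attaches_to:
        problems.append(f"{structure.name} cannot be attached to {target_type}")

    def walk(attrs: List[MetadataAttribute]) -> None:
        for attr in attrs:
            if attr.is_mandatory and attr.concept not in values:
                problems.append(f"missing mandatory attribute: {attr.concept}")
            walk(attr.children)

    walk(structure.attributes)
    return problems


# Invented sample structure
citation = ReportStructure(
    "Citation",
    attaches_to={"StudyUnit", "Group"},
    attributes=[
        MetadataAttribute("Title", is_mandatory=True),
        MetadataAttribute("Creator", children=[MetadataAttribute("Affiliation")]),
    ],
)
ok = validate(citation, "StudyUnit", {"Title": "Labour Force Survey"})
bad = validate(citation, "Variable", {})
```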

Page 27:

[Diagram: Metadata Structure Definition and related constructs: Metadata Report, Metadata Attributes, Object Identifier, Identifier Components, Item Scheme, Concept Scheme, Concept, Target Object Type, Value domain (Format and Permitted Value List)]

Annotations on the diagram:

– uses defined concepts
– defines the object types to which metadata can be “attached”
– specifies to which object types the report can be “attached”
– identifies the value domain of the component
– concept defined in
– takes semantic and context from
– identifies target object type of the component
– can have hierarchy
– identifies target object type of the identifier
– Specifies components for each Object (“key”

Page 28:

Metadata Structure

[Class diagram: MetadataStructureDefinition, ReportStructure, MetadataAttribute (isMandatory : Boolean; +parent 1, +child 0..*, sub-structure), AttachmentStatus (isMandatory : Boolean), IdentifierGroup, IdentifierComponent, IdentifiableObject, IdentifiableObjectType (from DDI_Base), ObjectTypeScheme (from DDI_Base), ItemScheme (from DDI_Base), DocumentableArtefact, Type (from DDI_Base), Representation (from DDI_Base), ConceptItem (from ConceptStandard). Roles shown: +takesSemanticFrom, +targetClass (1), +valueDomain (0..1), +attachesTo (1..*), +structureFor (1), +localType (0..1), +localRepresentation (0..1), +specifies, +uses]

e.g. Citation, Coding Instructions, Universe, Abstract

Note on the diagram: this can be any object in the DDI model, including a specific DocumentableArtefact, thus allowing a report to be attached to, or referenced from, another report.

Page 29:

Metadata Set – Contains Metadata Reports

[Class diagram: MetadataSet, MetadataReport, MetadataAttributeValue, AttachmentKey (+objectIdentifier, 1..*), IdentifierComponentValue, MaintainableArtefact (from DDI_Base), IdentifiableArtefact (from DDI_Base), ReportStructure, MetadataAttribute, IdentifierComponent, IdentifiableObject. Note on the diagram: shows the link to the metadata structure definition]

Page 30:

Modularity and grouping as a way to handle comparative data

[Diagram: a Group holds Questions, Study design and Variables; below it are the French, German, UK, Spanish and Italian studies, each with Extensions and Local overrides and, for four of the five, Translation]
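One way to picture the layering is a lookup that consults the study's own layers before falling back to the group. The sketch below uses `collections.ChainMap` from the Python standard library. The precedence order (extensions, then local overrides, then translation, then group) and all of the question texts are assumptions made for illustration, not something the slide specifies.

```python
from collections import ChainMap

# Shared definitions held at group level (invented content)
group_questions = {
    "Q1": "How old are you?",
    "Q2": "What is your occupation?",
}

# Layers of one study (invented content)
french_translation = {
    "Q1": "Quel âge avez-vous ?",
    "Q2": "Quelle est votre profession ?",
}
french_overrides = {"Q2": "Quelle est votre profession principale ?"}
french_extensions = {"Q3": "Dans quel département habitez-vous ?"}

# Earlier maps win: extensions, local overrides, translation, then the group
french_study = ChainMap(french_extensions, french_overrides,
                        french_translation, group_questions)

# A study with no translation layer falls straight back to the group
uk_overrides = {"Q2": "What is your main occupation?"}
uk_study = ChainMap({}, uk_overrides, group_questions)
```

With this arrangement, a question defined once in the group is visible to every study unless a study layer replaces it.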

Page 31:

Modularity and grouping as a way to handle multiple tables/cubes

[Diagram: a Group holds Study description, Variables and Category Schemes; nCube1 to nCube5 each have their own Table description]

Page 32:

Group: Logical Combination of Artifacts

[Class diagram: Group (+parent 1, +child 0..*, subGroup), GroupType (time : TimeType, instrument : InstrumentType, panel : PanelType, geography : GeographyType, dataset : DataSetType), Archive, StudyUnit (from StudyUnit), Study (from StudyUnit), DataCollection (from DataCollection), QuestionItem (from DataCollection), ConceptualComponentSet (from ConceptStandard), ConceptScheme (from ConceptStandard), CategoryScheme (from CategoryScheme), Variable (from VariableStanda...), LogicalProduct, ComparisonStandards, MetadataReport (from Metadata). One MetadataReport holds metadata for: Citation, Purpose, Abstract, Universe, OtherMaterial, Note. Another holds metadata for: Item, Details, Access, OtherMaterial, Note]

Page 33:

Thank You