IMS 2.1.3 Proof of Concept for Data Capture using Metadata Bryan Fitzpatrick Rapanea Consulting Limited June 2014

IMS 2.1.3
Proof of Concept for Data Capture using Metadata

Bryan Fitzpatrick
Rapanea Consulting Limited

June 2014

Data Capture using Metadata

• Aim is to demonstrate designing and running a survey questionnaire based entirely on metadata

• Aim is to use DDI metadata
  • design the questions
  • organise the questions into a questionnaire
  • present the questionnaire
  • capture and save the responses
• all based entirely on the metadata

Using DDI Metadata for Questionnaires

• DDI has metadata for Questions
  • a simple question goes in a Question Item
    – What is your age in years?
  • a complex question goes in a Multiple Question Item
    – Did you do paid work last week?
      » Full Time or Part Time?
      » How many hours?
    o A Multiple Question Item can contain Question Items or other Multiple Question Items
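The nesting of Question Items inside Multiple Question Items can be pictured with a small data model. This is only a sketch in Python, not DDI's XML schema; the class and field names (`QuestionItem`, `MultipleQuestionItem`, `children`, `flatten`) are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class QuestionItem:
    """A simple question (hypothetical stand-in for a DDI Question Item)."""
    text: str


@dataclass
class MultipleQuestionItem:
    """A complex question grouping Question Items or other Multiple Question Items."""
    text: str
    children: List[Union["QuestionItem", "MultipleQuestionItem"]] = field(default_factory=list)


def flatten(item):
    """Return every question text under an item, in depth-first order."""
    if isinstance(item, QuestionItem):
        return [item.text]
    texts = [item.text]
    for child in item.children:
        texts.extend(flatten(child))
    return texts


age = QuestionItem("What is your age in years?")
paid_work = MultipleQuestionItem(
    "Did you do paid work last week?",
    [QuestionItem("Full Time or Part Time?"), QuestionItem("How many hours?")],
)
print(flatten(paid_work))
```

Because `children` accepts either type, a Multiple Question Item can hold further Multiple Question Items to any depth.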

Using DDI Metadata for Questionnaires

• Questions can link to one or more Concepts
  • to indicate what the question is seeking to cover
    o Age, Sex, Country, Income, Occupation, ...
    o perhaps to qualify what is being covered
      – e.g. Non-farm income, Tertiary qualifications

Using DDI Metadata for Questionnaires

• Questions have:
  • Name
    – just a multi-lingual name, not used in questionnaires
  • Text
    – the question that is asked
    – can be conditional, multi-lingual, formatted
      » can even have mixed language
  • Question Intent
    – some elaboration about what is being sought
      » multi-lingual, formatted
• POC just uses simple unformatted multi-lingual Text

Using DDI Metadata for Questionnaires

• Questions have Response Domains
  • what sort of answer is expected or valid
    o Numeric Domain
      – can specify integer or decimal, valid formats and ranges, etc.
    o Text Domain
      – can specify format, length
    o Category Domain
      – valid list of multi-lingual values
        » not really very much use
    o Code Domain
      – valid list of multi-lingual values with codes
        » a classification

Using DDI Metadata for Questionnaires

• Questions have Response Domains
  • what sort of answer is expected or valid
    o Date-Time Domain
      – can specify formats
    o Geographic Domain
      – e.g. coordinates, other units
    o Structured Mixed Response Domain
      – a combination of all of the above
• all domain types can have labels and descriptions
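A Response Domain is essentially a rule for what counts as a valid answer. The sketch below shows that idea for three of the domain types in Python. It is not DDI's schema: the class names, the `is_valid` method, and the sample codes and labels are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class NumericDomain:
    """Numeric answers with an optional valid range (illustrative only)."""
    integer_only: bool = True
    minimum: Optional[float] = None
    maximum: Optional[float] = None

    def is_valid(self, raw: str) -> bool:
        try:
            value = int(raw) if self.integer_only else float(raw)
        except ValueError:
            return False
        if self.minimum is not None and value < self.minimum:
            return False
        if self.maximum is not None and value > self.maximum:
            return False
        return True


@dataclass
class TextDomain:
    """Free text with an optional maximum length."""
    max_length: Optional[int] = None

    def is_valid(self, raw: str) -> bool:
        return self.max_length is None or len(raw) <= self.max_length


@dataclass
class CodeDomain:
    """Valid answers are the codes of a classification: code -> {language: label}."""
    codes: Dict[str, Dict[str, str]]

    def is_valid(self, raw: str) -> bool:
        return raw in self.codes


# Sample domains; the ranges, codes and labels are made up for this example.
age_domain = NumericDomain(integer_only=True, minimum=0, maximum=120)
sex_domain = CodeDomain({"1": {"en": "Male", "fr": "Homme"},
                         "2": {"en": "Female", "fr": "Femme"}})

print(age_domain.is_valid("42"), age_domain.is_valid("4.5"), sex_domain.is_valid("3"))
```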

Using DDI Metadata for Questionnaires

• Questions do not go directly into a questionnaire
  o DDI calls a questionnaire an Instrument

• questions constitute a library available for use
  o a “Question Bank”

• questions are selected and assembled into an Instrument

• the assembling of questions is done with Control Constructs

• an Instrument identifies a single Control Construct that builds the questionnaire

Control Constructs

• Control Constructs are the critical component in building a questionnaire

• they select the questions

• they control the flow of the questions
  – branching and looping

• they insert non-question text
  – “Now I want to ask you about other people in the household”

• they can compute values

• they link to Interviewer Instructions
  o structured DDI Interviewer Instructions
  o unstructured external interviewer instructions material

Control Constructs

• Several types of Control Constructs

• Question Construct
  – selects a Question Item or Multiple Question Item

• Sequence
  – selects a sequence of other control constructs of any type

• If-Then-Else
  – defines an If condition with optional ElseIf clauses (multiple) and optional Else clause
    » each condition selects a single Control Construct to include

Control Constructs

• Several types of Control Constructs

• Loop, Repeat-Until, Repeat-While
  – e.g. to loop over people in a household

• Statement Item
  – inserts non-question multi-lingual text (conditional, formatted)

• Computation Item
  – a calculation in some language that is assigned to a Variable
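One way to see how Control Constructs select questions and control flow is to treat them as a small tree that a program walks. The Python sketch below covers only Question, Sequence and If-Then-Else constructs (no ElseIf clauses, loops, statements or computations). The class names, the `run` function and the question identifiers are hypothetical; the conditions are plain Python callables rather than DDI command code.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class QuestionConstruct:
    question_id: str


@dataclass
class Sequence:
    members: List[object]


@dataclass
class IfThenElse:
    condition: Callable[[Dict[str, str]], bool]  # evaluated against the answers so far
    then_construct: object
    else_construct: Optional[object] = None


def run(construct, ask: Callable[[str], str], answers: Dict[str, str]) -> Dict[str, str]:
    """Walk the construct tree, asking each selected question in order."""
    if isinstance(construct, QuestionConstruct):
        answers[construct.question_id] = ask(construct.question_id)
    elif isinstance(construct, Sequence):
        for member in construct.members:
            run(member, ask, answers)
    elif isinstance(construct, IfThenElse):
        if construct.condition(answers):
            branch = construct.then_construct
        else:
            branch = construct.else_construct
        if branch is not None:
            run(branch, ask, answers)
    else:
        raise TypeError(f"unsupported construct: {construct!r}")
    return answers


# Hypothetical flow: the follow-up questions are asked only after a "yes".
flow = Sequence([
    QuestionConstruct("paid_work"),
    IfThenElse(
        condition=lambda a: a["paid_work"] == "yes",
        then_construct=Sequence([QuestionConstruct("full_or_part"),
                                 QuestionConstruct("hours")]),
    ),
])

# Scripted answers stand in for a respondent.
scripted = {"paid_work": "no", "full_or_part": "full", "hours": "38"}
print(run(flow, scripted.__getitem__, {}))
```

The `ask` callable is where a real system would present the question; here a dictionary lookup plays that role so the flow can be exercised without a user interface.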

Instrument

• Identifies a single Control Construct to assemble the questionnaire

o probably a Sequence construct

• Instruments can have a Type
  o a single value taken from some Controlled Vocabulary
    – a user-managed list of valid values
      » e.g. Paper, Internet, CATI, ...

• Instruments can have multiple Software specifications
  o basically just identifying “software” used with instrument
    – not a great deal of use

Instrument

• Instruments do not have any place for useful layout metadata
  • just the type of the layout
  • a fairly serious limitation

• We need quite a lot of information to do the layout
  • how to represent lists
    o tick boxes, list boxes, combo boxes, radio buttons
  • how to show flow logic
  • which questions to show at once, which to separate
  • can the respondent backtrack

• We need additional Layout Metadata
  • I have designed some

Interviewer Instructions

• A formal DDI metadata type

• Organised, structured instructions
  • formatted multi-lingual text
    o may be conditional

• May link to external, non-DDI material
  • e.g. PDF, Word documents

• Not used in this Proof of Concept

Classifications

• DDI holds Classifications as linked Code Schemes and Category Schemes

• a Category Scheme is a list of Categories
  o flat list of multi-lingual names and descriptions
  o e.g. Country names, Occupation names, etc.

• a Code Scheme selects Categories from Category Schemes, assigns a Code (not multi-lingual), and may specify a hierarchy
  o a Code Scheme may select Categories from multiple Category Schemes
  o multiple Code Schemes may select the same Categories
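The split between Categories (multi-lingual names) and Codes (language-neutral values that point at Categories) can be sketched as follows. This is an illustrative Python model, not DDI's XML structure; the class names, the `label` helper and the sample schemes are assumptions. The numeric values 250 and 276 are the ISO 3166-1 numeric codes for France and Germany, while "EU" is just a made-up parent code for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Category:
    names: Dict[str, str]  # language -> name


@dataclass
class Code:
    value: str                    # the code itself is not multi-lingual
    category: Category            # a reference to a Category, not a copy
    parent: Optional[str] = None  # code value of the parent, for hierarchies


# Category Scheme: a flat list of multi-lingual Categories (sample data).
countries = {
    "europe": Category({"en": "Europe", "fr": "Europe"}),
    "france": Category({"en": "France", "fr": "France"}),
    "germany": Category({"en": "Germany", "fr": "Allemagne"}),
}

# Two Code Schemes selecting the same Categories with different codes.
alpha_scheme: List[Code] = [
    Code("EU", countries["europe"]),
    Code("FR", countries["france"], parent="EU"),
    Code("DE", countries["germany"], parent="EU"),
]
numeric_scheme: List[Code] = [
    Code("250", countries["france"]),
    Code("276", countries["germany"]),
]


def label(scheme: List[Code], value: str, language: str) -> str:
    """Look up the label of a code in the requested language."""
    for code in scheme:
        if code.value == value:
            return code.category.names[language]
    raise KeyError(value)


print(label(alpha_scheme, "DE", "fr"), label(numeric_scheme, "276", "en"))
```

Adding a language means adding one name per Category; every Code Scheme that selects that Category picks it up without change.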

Code Schemes and Category Schemes

• Used for
  • Classifications
    – a Classification is a Code Scheme
  • Controlled Vocabularies
    – lists of standardised terms
      » defined by DDI, an organisation, a local area

Variables

• A Variable is a container that will hold a data value
  • has a Name and Description (both multi-lingual)
  • can be linked to a single Concept
    – to indicate what the data represents
  • can be linked to multiple Questions
    – to indicate where the data might come from
  • can have a Representation
    – Code, Date/Time, Numeric, Text
      » with constraints on values
  • can identify a Response Unit and an Analysis Unit
    – a population that it can apply to

Logical Record

• A Logical Record consists of a sequence of Variables
  o groups data values for a purpose
    – data from a questionnaire goes into one or more Logical Records
  o Logical Records can be linked
    – e.g. Households and Persons
  o Logical Records are independent of any storage or stored format

Record Layouts and Physical Structures

• Map a Logical Record to a physical record and an actual stored file format

• Can support a very wide range of structures and storage formats
  • CSV, Binary file, XML, database
  • multiple record types, linkages of many kinds

• POC does not actually use this
  • Simple CSV file maps directly from Logical Record

Physical Instance

• Holds information about actual data sets produced
  • links to Physical Structures, Record Layouts, and Logical Records
  • provides central management of data from a collection

• POC uses Physical Instance to manage data
  o POC 2.3.3 builds on this POC to show how to use SDMX and DDI metadata together
    – produces tables from SDMX DSD using data collected with DDI
      » uses the Physical Instance information to find the datasets

What does the POC do?

• Collects survey data based entirely on metadata
  • builds or imports all the metadata
  • assembles a survey instrument (questionnaire)
  • presents the questionnaire in a Windows Form
  • collects data into Logical Records
  • saves the data in CSV files

• POC 2.3.3 builds on this POC
  • duplicates Concepts and Classifications to SDMX
    o tightly-coupled set of metadata
  • uses SDMX DSD to produce tables from the collected data

How does the POC do This?

• Basically POC system is a metadata creator/editor

• Build, import Concepts
  o build in UI, import from CSV, SDMX V2.0

• Import Classifications
  o import CSV, SDMX V2.0
  o did not implement build in UI

• Construct Questions in UI
  o multi-lingual text with links to Concepts
  o POC almost supports Multiple Question Items
    – by the end of the week

How does the POC do This?

• Build Variables
  o links to Concepts and Questions

• Build Control Constructs
  o Question constructs
    » link to a question
  o Sequence constructs
    » define a sequence of other Control Constructs
  o If-Then-Else constructs
    » allows conditional questionnaire flow
  o POC does not support Loop, Repeat-Until, Repeat-While, Computation Item, and Statement Item constructs

How does the POC do This?

• Define Instruments
  o link to a single Control Construct
    – probably to a Sequence construct
  o has an instrument type
    – Windows Form, Web, Paper, CATI, ...
    – POC only supports Windows Form

• Define Logical Records
  o collection of Variables

• Map Logical Record to Instrument
  o map Variables to Questions
    – uses Concept links and question links if present
    – allows user override in UI
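The Variable-to-Question mapping could work along the following lines. This Python sketch is an assumption about one plausible approach, not the POC's actual code: it tries a user override first, then an explicit question link, then a shared Concept. The precedence order, the class names and the identifiers are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Question:
    id: str
    concepts: List[str] = field(default_factory=list)


@dataclass
class Variable:
    name: str
    concept: Optional[str] = None
    questions: List[str] = field(default_factory=list)  # explicit question links


def map_variables(variables, questions, overrides=None) -> Dict[str, Optional[str]]:
    """Return variable name -> question id (None when nothing matches)."""
    overrides = overrides or {}
    in_instrument = {q.id for q in questions}
    mapping: Dict[str, Optional[str]] = {}
    for var in variables:
        if var.name in overrides:                       # 1. user override
            mapping[var.name] = overrides[var.name]
            continue
        linked = [q for q in var.questions if q in in_instrument]
        if linked:                                      # 2. explicit question link
            mapping[var.name] = linked[0]
            continue
        by_concept = [q.id for q in questions if var.concept in q.concepts]
        mapping[var.name] = by_concept[0] if by_concept else None  # 3. shared Concept
    return mapping


questions = [Question("Q_AGE", ["Age"]), Question("Q_SEX", ["Sex"]),
             Question("Q_CTRY", ["Country"])]
variables = [Variable("age", concept="Age"),
             Variable("sex", questions=["Q_SEX"]),
             Variable("income", concept="Income")]

print(map_variables(variables, questions))
```

A Variable that maps to `None` is one the user would need to resolve by hand, which is where the override in the UI comes in.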

How does the POC do This?

• Present the Instrument
  o Render the instrument in a Windows Form
    – using some layout metadata I made up
  o Execute the Control Constructs to select questions and manage question flow
  o present questions in list box, combo box, radio buttons, text box
    – depending on Response Domain and some Layout Metadata
      » not DDI metadata, my design
  o capture the responses into a Logical Record
    – based on the Logical Record – Instrument mapping
  o present questions in language of choice
    » limited choice
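Choosing a control from the Response Domain plus some layout hints might look like the sketch below. The presentation does not describe the Layout Metadata itself, so the keys used here (`widget`, `radio_max_options`), the default threshold and the function name are purely hypothetical.

```python
from typing import Optional


def choose_widget(domain_type: str, option_count: int = 0,
                  layout: Optional[dict] = None) -> str:
    """Pick a form control from the Response Domain type plus optional layout hints."""
    layout = layout or {}
    if "widget" in layout:
        # An explicit layout instruction overrides the default choice.
        return layout["widget"]
    if domain_type == "code":
        # Short code lists read well as radio buttons, long ones as a combo box.
        limit = layout.get("radio_max_options", 5)
        return "radio_buttons" if option_count <= limit else "combo_box"
    # Numeric, text and other domains fall back to a plain text box.
    return "text_box"


print(choose_widget("code", option_count=2))
print(choose_widget("code", option_count=200))
print(choose_widget("code", option_count=4, layout={"widget": "list_box"}))
print(choose_widget("numeric"))
```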

How does the POC do This?

• Retrieve Logical Record set
  o run interview multiple times to get a set of logical records
  o results displayed on screen in tabular form
  o mapping as defined in Response Domains and Logical Record to Instrument map

• Save the Logical Records in data file
  o CSV file
  o no actual Record Layout and Physical Structure metadata
    – simplest Record Layout and Physical Structure metadata is almost empty anyway
      » it is just saving the Logical Records with Variables separated by commas
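Saving Logical Records as comma-separated Variables is simple enough to sketch with Python's standard `csv` module. The function name, the sample Variables and values, and the choice to write a header row are assumptions; the presentation does not say whether the POC's files carry a header.

```python
import csv
import io


def write_logical_records(records, variable_names, out):
    """Write one CSV row per Logical Record, with Variables in a fixed order."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(variable_names)  # header row (an assumption, see above)
    for record in records:
        # Variables with no captured response are written as empty fields.
        writer.writerow([record.get(name, "") for name in variable_names])


variables = ["country", "sex", "age"]
records = [
    {"country": "FR", "sex": "2", "age": "34"},
    {"country": "DE", "sex": "1"},  # age was not answered
]

buffer = io.StringIO()
write_logical_records(records, variables, buffer)
print(buffer.getvalue())
```

Passing an open file instead of the `io.StringIO` buffer would write the same content to disk.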

How does the POC do This?

• Save details of CSV data sets in Physical Instance metadata
  • so the data can be found for subsequent operations
    o like producing tables in POC 2.3.3

Metadata is in a DDI Instance file

[Diagram. Labels for the DDI Instance file, in the order extracted: Group; DDI Instance; Study Unit; Concepts; Code Schemes and Category Schemes; Variables; Control Constructs; Logical Records; Physical Instance; Instruments; Questions. Labels for the Layout Metadata file: Layout Metadata.]

DDI is fairly complex

• But so is designing a Questionnaire!

• Constructing a questionnaire involves constructing a lot of metadata
  • but DDI is fairly logical
  • you need to think about what you want in the questionnaire and how you want it to flow
    o but you need to do this if you are designing a questionnaire manually
  • once you design a questionnaire you get re-use advantages
    o easy to modify, add languages
    o easy to adapt to some other purposes
    o easy to reuse useful questions and constructs
    o easy to capture the data

Let us look at my test questionnaire

• Simple questions about internet access
  • based on Eurostat ICT survey
  • includes two If-Then-Else constructs to manage flow
  • we will look at the structure first
    o a good way to plan your own questionnaire
    o a good way to see how the DDI metadata works

ICT question flow

1 Country
2 Sex
3 Age
A1 Do you have access to a computer at home?
A2 Do you have access to the internet at home?
B1 When did you last use a computer?
  If (within last 3 months) B2 How often on average?
C1 When did you last use the internet?
  If (within last 3 months) C2 How often on average?

ICT question flow

Seq C 1
  1 Country ------- QC 1
  2 Sex ------- QC 2
  3 Age ------- QC 3
  A1 Do you have access to a computer at home? ------- QC 4
  A2 Do you have access to the internet at home? ------- QC 5
  B1 When did you last use a computer? ------- QC 6
  If C 1: If (within last 3 months)
    B2 How often on average? ------- QC 7
  C1 When did you last use the internet? ------- QC 8
  If C 2: If (within last 3 months)
    C2 How often on average? ------- QC 9

QC – Question Construct
If C – If-Then-Else Construct
Seq C – Sequence Construct
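The test questionnaire's structure (one Sequence Construct, nine Question Constructs and two If-Then-Else Constructs) can also be written down as plain nested data. The Python sketch below does that, with questions ordered by Question Construct number, and works out which questions a respondent would see. The tuple encoding, the function name and the answer "more than a year ago" are assumptions for illustration; only "within last 3 months" comes from the slides.

```python
# Node shapes (hypothetical encoding):
#   ("QC", question_code)
#   ("Seq", [nodes])
#   ("If", source_question, required_answer, then_node)
ICT_FLOW = ("Seq", [
    ("QC", "1"), ("QC", "2"), ("QC", "3"),
    ("QC", "A1"), ("QC", "A2"),
    ("QC", "B1"), ("If", "B1", "within last 3 months", ("QC", "B2")),
    ("QC", "C1"), ("If", "C1", "within last 3 months", ("QC", "C2")),
])


def questions_asked(node, answers):
    """List the question codes presented for a given set of answers."""
    kind = node[0]
    if kind == "QC":
        return [node[1]]
    if kind == "Seq":
        asked = []
        for member in node[1]:
            asked.extend(questions_asked(member, answers))
        return asked
    if kind == "If":
        _, source, required, then_node = node
        if answers.get(source) == required:
            return questions_asked(then_node, answers)
        return []
    raise ValueError(f"unknown node kind: {kind}")


answers = {"B1": "within last 3 months", "C1": "more than a year ago"}
print(questions_asked(ICT_FLOW, answers))
```

With these answers the respondent is asked B2 but skips C2, which matches the two conditional branches in the flow.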

Let us have a look at the POC

• Demo

What does the POC show?

• It is realistic to use the DDI metadata to design and present a survey questionnaire
  • The POC depended entirely on the metadata
    o absolutely no knowledge about the survey built into the system

• Designing a survey questionnaire in DDI is fairly complex
  • So is designing one manually

• There are some clear advantages
  • modification and reuse is easy
  • multi-lingual presentation is easy

What does the POC show?

• Really need some form of Layout Metadata
  • DDI Instrument and Control Construct metadata gives no guidance on layout
  • POC case was very simple
    o but still needed some layout information
    o realistic survey questionnaire needs considerable layout information
  • needs more thought to design this metadata

What does the POC show?

• POC used a Windows Form questionnaire
  • no practical use for production survey
  • but other questionnaire formats should be easy
    o hope to have paper example (in Word) by end of week
    o script for a Web Form questionnaire is straightforward
      – but Web Form system still needs to process Control Constructs
    o script for CATI is easy
    o script for Blaise should be easy

• easy to have questionnaire available in multiple formats

Thank you

• Questions?

• Bryan Fitzpatrick

Rapanea Consulting [email protected]

Ph +44-7789-886536