Incentivising the uptake of reusable metadata in the survey production process
ESRA15
Reykjavik
July 2015
Louise Corti
Collections Development and
Producer Support
Why worry about metadata?
• No universal language used to document questions and variables
• Too many bespoke systems and vocabularies around
• Massive waste of human resource in the survey data lifecycle
• Interoperability saves money
• Why don’t we all use the Data Documentation Initiative (DDI)?
Show how to exploit metadata for surveys
• Challenge – to get established survey operations to recognise the benefits of reusable metadata
• The Midlife in the United States (MIDUS) study is quite unique!
• Help funders, owners and producers ‘see the light’
• For this we need to show something very cool
• Some good experimental stuff happening
Benefits of publishing rich survey metadata
• Survey documentation systems
• Question banks
• Survey data exploration systems
• Nesstar
• SDA
• Bespoke visualisation systems
The reality
• Hard to match up Question and Variable information
• Too much manual data entry involved in publishing
• Must do better
• Gain rich reusable metadata from the survey design and production process
Survey production lifecycle
• Beset with manual processes
• Legacy systems
• Reluctance to change or adapt systems
• Hard to embrace new ways – disruptive, expensive
Typical process – worst case scenario
• Manual questionnaire entry (Word/Excel/database)
• Export in Word format
• Deliver to survey agency
• Manual transfer to IBM Data Collection
• Export SPSS and PDF/Word questionnaire
Meeting outcomes
• Great turnout and knowledge exchange!
• Quick turnaround of principles into a ‘campaign’ document and a published ‘Questionnaire profile’
• Some very positive responses – shared problem
• Be an advocate!
Increasing use of XML for survey design and publishing
Such as:
• Social science data archive published survey metadata (DDI 2.5)
• Essex panel studies - bespoke XML Questionnaire Specification Language for survey design
• UK LifeStudy – survey design instrument – XML
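A minimal sketch of why an XML questionnaire specification helps: once questions are held as structured markup, a question-to-variable mapping can be derived automatically instead of being retyped. The element and attribute names below (`questionnaire`, `question`, `variable`, `text`) are hypothetical and do not follow DDI or any of the schemas named above.

```python
import xml.etree.ElementTree as ET

# Hypothetical questionnaire markup; NOT DDI, purely illustrative.
QUESTIONNAIRE_XML = """
<questionnaire id="demo_wave1">
  <question id="Q1" variable="sex">
    <text>What is your sex?</text>
  </question>
  <question id="Q2" variable="jobhrs">
    <text>How many hours do you usually work per week?</text>
  </question>
</questionnaire>
"""


def question_variable_map(xml_text):
    """Return {variable name: (question id, question text)}."""
    root = ET.fromstring(xml_text.strip())
    mapping = {}
    for question in root.iter("question"):
        text = question.findtext("text", default="").strip()
        mapping[question.get("variable")] = (question.get("id"), text)
    return mapping


if __name__ == "__main__":
    for var, (qid, text) in question_variable_map(QUESTIONNAIRE_XML).items():
        print(f"{var}: {qid} - {text}")
```

The same parsed structure could feed a readable questionnaire, a codebook and a question bank, which is the reuse argument these slides make.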
Discussing DDI implementation today
• CLOSER cohorts portal using DDI 3.2 Questionnaire Profile
• DASISH DDI 3.2 use
• Blaise – import by Michigan Questionnaire Documentation System (MQDS) DDI 3
• IBM Data Collection DDI experiments
Short brochure for sharable survey products
• Work closely with data owners and producers
• Existing information on data sharing complex
• What is really expected!
• Transferable information
• Not a bible
Sticks?
• Specifying data documentation requirements in the commissioning tender for fieldwork
• Mapping between questions and data outputs
• Improved readable questionnaire for end users
CLOSER project
• Funded variable/question discovery service
• Long-running birth cohorts & longitudinal studies
• Drivers for project
• Harmonisation (biomedical, socio-economic)
• Capacity building
• Data Linkage
• Impact
• Discovery
• Encourage use of existing data resources
• Tools for enhancing survey metadata
Incentives for CLOSER PIs?
• Large award to get prestigious cohort studies on board £££
• Reduce burden - enhancement work done centrally
• Survey data managers:
  • happy to be part of a peer group
  • found it rewarding to go back and look at data
  • liked a shared controlled vocabulary
  • received training
  • found variable-to-questionnaire mappings useful
  • liked the visibility of their study in the search platform
Forward looking survey design
• Think upfront about reusability of questionnaire metadata
• New studies – new opportunities
• Legacy work to get old messy survey design metadata into a new environment – may be worth investing in
• Can make harmonisation work so much easier – XML schemas allow formal linkages of variables across time, equivalence, differences etc.
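To illustrate what formal linkages between variables buy you, here is a small sketch that stores links between variables across waves and follows the "equivalent" links to find every comparable variable. The study, wave and variable names, and the two relation labels, are invented for illustration and are not taken from CLOSER or from DDI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariableRef:
    study: str
    wave: int
    name: str


# Each link is (variable, variable, relation); all values are made up.
LINKS = [
    (VariableRef("cohort_a", 1, "smoke"),
     VariableRef("cohort_a", 2, "smoker"), "equivalent"),
    (VariableRef("cohort_a", 2, "smoker"),
     VariableRef("cohort_a", 3, "smokes_now"), "equivalent"),
    (VariableRef("cohort_a", 1, "income"),
     VariableRef("cohort_a", 2, "income_band"), "differs"),
]


def equivalents(start, links):
    """Variables reachable from start via 'equivalent' links, in either direction."""
    seen = {start}
    frontier = [start]
    while frontier:
        current = frontier.pop()
        for a, b, relation in links:
            if relation != "equivalent":
                continue
            for src, dst in ((a, b), (b, a)):
                if src == current and dst not in seen:
                    seen.add(dst)
                    frontier.append(dst)
    return sorted(seen - {start}, key=lambda v: (v.study, v.wave, v.name))
```

Calling `equivalents(VariableRef("cohort_a", 1, "smoke"), LINKS)` returns the wave 2 and wave 3 smoking variables, while the income variable has no equivalents because its only link is marked as differing.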
Data publishers
• Survey owners/producers - documentation online
• Question banks
• Journals - supporting data with sufficient metadata
• Use DDI 3.2 Questionnaire Profile, not bespoke schemas
Self-deposit expectations?
• Peer review of data by data centres for all data published – includes quality of metadata
• Journals – no unified standard for data description or documentation
• Start with minimal metadata expectations:
  • data collection description
  • provenance
  • data description: file and variable names, labels
  • relationships between tables/files
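The minimal expectations above could be checked automatically at deposit time. The sketch below is one possible reading of that list; the record layout and field names (`collection_description`, `provenance`, `variables`, `file_relationships`) are assumptions made for illustration, not a published deposit schema.

```python
def missing_metadata(record):
    """Return a list of human-readable problems; empty means the minimum is met."""
    problems = []

    # Free-text descriptions that must be present and non-empty.
    for field in ("collection_description", "provenance"):
        if not record.get(field):
            problems.append(f"missing {field}")

    # Data description: every variable needs a name and a label.
    variables = record.get("variables") or []
    if not variables:
        problems.append("missing variables")
    for var in variables:
        if not var.get("name"):
            problems.append("variable without a name")
        elif not var.get("label"):
            problems.append(f"variable '{var['name']}' has no label")

    # Relationships must be declared, even if the list is empty
    # (a single-file deposit has none).
    if "file_relationships" not in record:
        problems.append("missing file_relationships")

    return problems
```

A deposit form or ingest script could show the returned list to the depositor, which keeps the bar low while still making gaps visible.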
Some tips on incentivising
• Speak a common language
• On DDI, don’t drown in detail; use existing profiles
• Start with the lowest common denominator. Baby steps
• Show value – shiny interfaces and examples!
• Provide supporting tools where possible e.g. metadata entry
• Integrate into everyday workflows and research tools
CONTACT
UK Data Service
University of Essex
Wivenhoe Park
Colchester
Essex CO4 3SQ
T +44 (0)1206 872145