Alive and kicking! Keeping data re-usable in the European Values Study
IASSIST Cologne, May 2013
Evelyn Brislinger, Astrid Recker
GESIS, Data Archive for the Social Sciences
Overview
Data and information flow in the EVS project
Principles and workflows for managing data and
documentation in survey projects
GESIS Data Archive
Basis
Interplay between Principal Investigators (PI) and Data Archive
Agreement on submission of data and information packages
Goals
Ease access to data for a broad user community
Provide metadata for discovery, understanding, and good use of data
Preserve data and metadata for re-use and replications
Holdings
Studies, study series, and complex survey programs such as ISSP, Eurobarometer,
ALLBUS, European Values Study (EVS), or election studies
Data and information created in a survey project
Total stock of data and documentation created
Data and documentation submitted to an archive
Further information necessary for the project(?)
Selection processes
Management solutions for structuring data and information
Example: European Values Study (EVS)
Cross-national, longitudinal research program
4 waves at 9-year intervals: 1981/1990/1999/2008
49 countries, 125 national surveys
National surveys
Longitudinal data File 1981-2008 (LdF)
Integrated Values Surveys EVS/WVS (IVS)
Harmonization and integration process
Number of files
Size of files
Atlas of European Values
www.europeanvaluesstudy.eu/evs/evsatlas.html
Collaboration of actors involved (EVS 2008)
National team: data created, processed, documented
Central team: data standardized, harmonized, integrated
Data Archive: data checked, documented, preserved, released
Secondary users: data re-used, analyses replicated, results reported
Principal Investigators
Users: analyze and evaluate outcomes
Questions: check trend questions and original questions (ZACAT-Online Study Catalogue)
Data: analyze data, report errors, monitor error reporting (GESIS Data Catalogue)
Publications: replicate analysis of other projects (EVS Repository)
... and detect peculiarities in questions or problems in data
Peculiarities in question text spotted?
EVS 2008 data lifecycle: Project Design, Questionnaire Design, Questionnaire Translation, Data Collection, Data Documentation, Data Processing
Check question and translation: master/field questionnaire, methodological questionnaire, report ‘Translation History’
Check source of question: trend question from EVS and WVS, questions borrowed from other surveys
Identify consequences for: countries sharing/adopting affected language, languages belonging to a family, further languages used in a country
Data error detected?
Standardization and harmonization process: check comparability of surveys,
questions, and variables; cumulate data and document each step
Integrated Values Surveys EVS/WVS
Longitudinal data File 1981-2008
Wave 2008
National data
Original data file
Wave 1999
…..
National data
…..
Retrace data processing steps across surveys: check data, syntax
files, and documentation; update data and highlight problems for the next wave
Error detected
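Retracing an error across surveys presupposes that each file records which files it was built from. The following is a minimal sketch of that idea, not the EVS workflow itself: the file identifiers and the `DERIVED_FROM` map are hypothetical and only mirror the chain shown on this slide (national data, wave file, longitudinal file, integrated file).

```python
# Hypothetical derivation map: each file lists the files it was built from.
# Identifiers are illustrative and are not actual EVS file names.
DERIVED_FROM = {
    "ivs_evs_wvs": ["ldf_1981_2008"],
    "ldf_1981_2008": ["wave_2008", "wave_1999"],
    "wave_2008": ["national_2008"],
    "wave_1999": ["national_1999"],
    "national_2008": ["original_2008"],
    "national_1999": [],
    "original_2008": [],
}


def upstream(file_id, derived_from):
    """Return every file that file_id was built from, nearest sources first."""
    seen = []
    queue = list(derived_from.get(file_id, []))
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue
        seen.append(current)
        queue.extend(derived_from.get(current, []))
    return seen


def downstream(file_id, derived_from):
    """Return every file that directly or indirectly includes file_id."""
    return sorted(
        other for other in derived_from
        if file_id in upstream(other, derived_from)
    )


# Files to check when an error shows up in the longitudinal file:
to_check = upstream("ldf_1981_2008", DERIVED_FROM)
# Files to update when an error is fixed in the 2008 national data:
to_update = downstream("national_2008", DERIVED_FROM)
```

Walking upstream gives the files and syntax to inspect; walking downstream gives the files that need an update once the source is corrected.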
Data and information created: experiences from EVS project
Designated communities: Principal Investigator/Project, Secondary user
Data and information packages: Project package, Archive package
Selection processes: within project, between project and archive
Diagram labels: Total stock, Project, Archive
Communicating with the future: Activity on two levels
Macro level: Defining workflows, file and information paths on which necessary information is passed on
Micro level: Organizing information so that it is re-usable (RDM, metadata, systematic file structures)
Begin by identifying principles for structuring and documenting files in the project (Research Data Management)
Select which information is relevant to whom?
A tidy house, a tidy mind!
Reference, don’t duplicate, files whenever possible
Identify and capture “kinship relations”: classes, itineraries
Capture process knowledge: minutes, protocols
Make changes traceable: versioning, document revisions & annotations
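The "reference, don't duplicate" principle can be checked mechanically. The sketch below is one possible approach and not a tool used in the EVS project: it groups files by a SHA-256 checksum of their content so that identical copies stored in several folders become visible. The function names are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Group files under root by content; return only groups of two or more."""
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Each returned group is a candidate for keeping one master copy and replacing the others with a reference to it.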
The magic wand (and not the one used by the sorcerer’s apprentice)
Follow principles of good research data management (RDM)
Use metadata to document process and content information
Use standards wherever possible (e.g. DDI, Dublin Core, ISO codes, file naming conventions, etc.)
Figure: metadata model for a document
Entities: Document, Resource, Collection, Actor, Name, Version, Format, Language (English), Rights, Date created, Date modified
Relations: hasDate, hasModifier, creates, modifies, hasAccessRights, isA, hasVersion, hasCreator, hasLanguage, hasIdentifier, isPartOf, hasFormat, hasRole
Dublin Core terms: dc:creator, dc:created, dc:modified, dc:identifier, dc:format, dc:provenance, dc:description, dc:language, dc:accessRights, dc:collection, …
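To make the metadata model more concrete, here is a minimal sketch of a flat record keyed by the `dc:` labels from the slide. It is an illustration only: which terms are treated as mandatory, the helper names, and all example values are assumptions, and the labels are copied from the slide rather than checked against the official Dublin Core namespaces.

```python
from datetime import date

# Term labels copied from the slide; treating these five as mandatory
# is an assumption made for this sketch.
REQUIRED_TERMS = (
    "dc:identifier",
    "dc:creator",
    "dc:created",
    "dc:format",
    "dc:language",
)


def make_record(identifier, creator, created, file_format, language, **optional):
    """Build a flat metadata record keyed by the slide's dc: labels."""
    record = {
        "dc:identifier": identifier,
        "dc:creator": creator,
        "dc:created": created.isoformat(),
        "dc:format": file_format,
        "dc:language": language,
    }
    # Optional terms are passed without the prefix, e.g. collection="...".
    for term, value in optional.items():
        record[f"dc:{term}"] = value
    return record


def missing_terms(record):
    """Return the required terms that are absent or empty in a record."""
    return [term for term in REQUIRED_TERMS if not record.get(term)]


# Hypothetical example values, including an ISO 639-1 language code.
record = make_record(
    "doc-001",
    "National team",
    date(2008, 5, 1),
    "application/pdf",
    "en",
    collection="EVS 2008",
)
```

A check such as `missing_terms(record)` can run whenever a file enters the project repository, so gaps are noticed while the people who can fill them are still available.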
Managing information flows in a collaborative, long-term project
Which paths does information (data, documentation, other
contextual material) take from producers to users?
Two models helped us clarify processes and paths, as well as
identify helpful terminology and concepts
– Project life cycle
– Open Archival Information System (OAIS) reference model
(CCSDS 2012)
CCSDS (2012). Reference Model for an Open Archival Information System (OAIS). Recommended Practice.
http://public.ccsds.org/publications/archive/650x0m2.pdf
Project life cycle: Data flow during creation of a survey
Project Repository: Ingest, Data processing and enhancement, Data Management, Temporary Storage, Access (project-internal use, PIs)
Life cycle stages: Project Design, Questionnaire Design, Questionnaire Translation, Data Collection, Data Documentation, Data Processing, Data Dissemination
Project and Data Archive as distributed system
Project Repository (content provider): Ingest, Data processing and enhancement, Data Management, Temporary Storage, Access (project-internal use, PIs)
Data Archive (preservation service provider): Ingest, Data Management, Archival Storage (long-term), Preservation Planning, Administration, Access
Actors: Principal Investigators, Secondary Users (future)
Information packages shown: PIP (multiple), SIP, AIP, DIP
Guidelines
Life cycle stages: Project Design, Questionnaire Design, Questionnaire Translation, Data Collection, Data Documentation, Data Processing, Data Dissemination
PIP = Project Information Package, SIP = Submission Information Package,
AIP = Archival Information Package, DIP = Dissemination Information Package
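The package flow between project and archive can be sketched as a simple sequence of package kinds. This is a deliberately simplified illustration, not the OAIS model or the GESIS implementation: it assumes one package leads to exactly one successor and that a later package only ever holds a selection of the earlier package's files. The `Package` class, the `advance` function, and the file names are hypothetical.

```python
from dataclasses import dataclass, field

# Package kinds from the slide legend, in the order they arise.
FLOW = ["PIP", "SIP", "AIP", "DIP"]


@dataclass
class Package:
    kind: str
    files: list
    history: list = field(default_factory=list)


def advance(package, selected_files=None):
    """Derive the next package kind, optionally keeping a selection of files."""
    position = FLOW.index(package.kind)
    if position == len(FLOW) - 1:
        raise ValueError("a DIP is the end of the flow")
    files = list(package.files if selected_files is None else selected_files)
    # Simplifying assumption: later packages never add files.
    unknown = set(files) - set(package.files)
    if unknown:
        raise ValueError(f"files not in source package: {sorted(unknown)}")
    return Package(FLOW[position + 1], files, package.history + [package.kind])


# Hypothetical project package; the selection mimics what is agreed
# between project and archive for submission.
pip = Package("PIP", ["data.sav", "syntax.sps", "minutes.txt"])
sip = advance(pip, ["data.sav", "syntax.sps"])
aip = advance(sip)
dip = advance(aip)
```

Keeping the `history` on each package records which selection steps a file set has already passed through, which is the kind of process knowledge the slides argue should travel with the data.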
Staying Alive! Where we are going from here
Developing a guideline for projects
– structuring and annotating information on the micro level
– issues to discuss with an Archive (preservation service provider)
Testing our model
– implementing our ideas in smaller projects with the aim of making the results available to other projects
Thank you for your attention!
Evelyn Brislinger | Astrid Recker
GESIS – Leibniz Institute for the Social Sciences, Data Archive
www.gesis.org