ESDS: Using and archiving research data

Laurence Horton

Economic and Social Data Service, UK Data Archive

03 November 2008

ESDS

• national data archiving and dissemination service, running from 1 Jan. 2003

• www.esds.ac.uk

ESDS holdings

Data for research and teaching purposes, used in all sectors and across many different disciplines

• official agencies - mainly central government

• individual academics - research grants

• market research agencies

• public records/historical sources

• links to UK census data

• qualitative and quantitative

• international statistical time series

• access to international data via links with other data archives worldwide

• history data service in-house (HDS)

• 5,000+ datasets in the collection

• 250+ new datasets are added each year

• 6,500+ orders for data per year

• 60,000+ datasets distributed worldwide p.a.

ESDS structure

• ESDS Management

– central help desk service; coherent and flexible collections development policy; central registration service; links to other ESRC resources

• ESDS Access and Preservation

– collections development strategy; ingest activities, including data and documentation processing; metadata creation; data dissemination services; long-term preservation

• Specialist data services

– ESDS Government

– ESDS International

– ESDS Longitudinal

– ESDS Qualidata

• dedicated web sites

• data and documentation enhancements

• tailored user support

• outreach and training

Kinds of data ESDS deal with

• quantitative

– micro data are the coded numerical responses to surveys with a separate record for each individual respondent

– macro data are aggregate figures, for example country-level economic indicators

– data formats include SPSS, Stata and tab-delimited formats

• qualitative

– data include in-depth interviews, diaries, anthropological field notes and the complete answers to survey questions

– data formats include Excel, Word and RTF

• multimedia

– a small number of datasets may include image files, such as photographs, and audio files

• non-digital material

– paper media could include photographs, reports, questionnaires and transcriptions

– analogue audio or audio-visual recordings
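
The difference between micro and macro data can be illustrated with a minimal sketch. The records, the `region` and `employed` fields and the `aggregate_rate` helper below are all hypothetical and are not drawn from any ESDS dataset; the point is only that macro figures are what remains once respondent-level records are collapsed into one number per group.

```python
from collections import defaultdict

# Hypothetical micro data: one coded record per individual respondent.
micro = [
    {"region": "North", "employed": 1},
    {"region": "North", "employed": 0},
    {"region": "South", "employed": 1},
    {"region": "South", "employed": 1},
]


def aggregate_rate(records, group_key, value_key):
    """Collapse respondent-level records into one aggregate figure per group."""
    totals = defaultdict(lambda: [0, 0])  # group -> [sum of values, record count]
    for rec in records:
        totals[rec[group_key]][0] += rec[value_key]
        totals[rec[group_key]][1] += 1
    return {group: total / count for group, (total, count) in totals.items()}


# Macro data: a single aggregate figure (here, an employment rate) per region.
macro = aggregate_rate(micro, "region", "employed")
print(macro)
```

Once data are aggregated in this way the individual records cannot be recovered, which is why micro data support a wider range of analyses than published macro series.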

ESDS Government data

• General Household Survey

• Continuous Household Survey (NI)

• Labour Force Survey/NI LFS

• Health Survey for England/Wales/Scotland

• Family Expenditure Survey/NI FES

• British/Scottish Crime Survey

• Family Resources Survey

• Expenditure and Food Survey

• ONS Omnibus Survey

• Survey of English Housing

• British Social Attitudes/Scottish Social Attitudes/Young People’s Social Attitudes/NI Life & Times

• National Travel Survey

• Time Use Survey

• Vital Statistics for England and Wales

[Figure: Percentage of women aged 18-49 cohabiting, 1979 to 2000; vertical axis 0 to 30. Source: General Household Survey]

ESDS Longitudinal Data

• main studies, primarily UK Research Council funded:

– British Household Panel Survey (BHPS)

– British Birth Cohort studies: National Child Development Study (NCDS); British Cohort Study 1970 (BCS70); Millennium Cohort Study (MCS)

– English Longitudinal Study of Ageing (ELSA)

– Longitudinal Study of Young People in England (LSYPE)

– possible forthcoming Medical Research Council population study datasets – 1946 Birth Cohort

British Birth Cohort Studies

• impact of childhood conditions on later life and understanding children and families in the UK

• National Child Development Study follows a cohort born in a single week in 1958 - data collected at birth & ages 7, 11, 16, 23, 33, 42 (7 Up TV series)

• 1970 British Cohort Study follows a cohort born in a single week in 1970 - data collected around birth & ages 5, 10, 16, 26, 29 and most recently at age 34

• Millennium Cohort Study focuses on children born in 2000/2001 - first sweep at 9 months, second sweep at 3 years

• wide range of social, economic, health, medical and psychological issues

Longitudinal data

• longitudinal surveys involve repeated surveys of the same individuals at different points in time

• allow researchers to analyse change at the individual level

• more complex to analyse
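
A small sketch may help show what "change at the individual level" adds. The three-person, two-wave panel below is invented for illustration, as are the helper names `employment_rate` and `individual_transitions`; no real survey variables are implied. The aggregate rate is identical in both waves, yet two of the three people changed status, which only repeated measurement of the same individuals can reveal.

```python
# Hypothetical panel: the same three people interviewed in two waves.
panel = [
    {"person": "A", "wave": 1, "employed": True},
    {"person": "B", "wave": 1, "employed": True},
    {"person": "C", "wave": 1, "employed": False},
    {"person": "A", "wave": 2, "employed": False},
    {"person": "B", "wave": 2, "employed": True},
    {"person": "C", "wave": 2, "employed": True},
]


def employment_rate(records, wave):
    """Cross-sectional view: share employed among respondents in one wave."""
    rows = [r for r in records if r["wave"] == wave]
    return sum(r["employed"] for r in rows) / len(rows)


def individual_transitions(records, first, second):
    """Longitudinal view: each person's status in both waves, linked by person id."""
    status = {(r["person"], r["wave"]): r["employed"] for r in records}
    people = sorted({r["person"] for r in records})
    return {
        p: (status[(p, first)], status[(p, second)])
        for p in people
        if (p, first) in status and (p, second) in status
    }


rates = (employment_rate(panel, 1), employment_rate(panel, 2))
changed = {
    person: pair
    for person, pair in individual_transitions(panel, 1, 2).items()
    if pair[0] != pair[1]
}
print(rates, changed)
```

The linking step is also where the extra complexity comes from: real panels have to cope with people who drop out or miss a wave, which this sketch handles only by skipping anyone not observed in both waves.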

ESDS International data portfolio

• regularly updated macro-economic time series datasets from selected major international statistical databanks that collectively chart over 50 years of global economic, industrial and political change:

– the International Monetary Fund

– the OECD

– the United Nations

– the World Bank

– Eurostat

– the International Labour Organisation

– the UK Office for National Statistics

• access to micro data surveys

– Eurobarometers, Latinobarometers

– International Social Survey Programme

– other social data via other national data archives

access for UK HE/FE only

International data themes

Databanks cover:

• economic performance and development

• trade, industry and markets

• employment

• demography, migration and health

• governance

• human development

• social expenditure

• education

• science and technology

• land use and the environment

International survey data

• ESDS International at the UK Data Archive (UKDA) can help users to locate and acquire data from other archives within Europe and worldwide, using a series of reciprocal agreements with the individual institutions.

• Datasets include:

– Eurobarometer

– International Social Survey Programme

– World Values Survey

ESDS Qualidata

• diverse data types: in-depth interviews; semi-structured interviews; focus groups; oral histories; mixed methods data; open-ended survey questions; case notes/records of meetings; diaries/research diaries

• data from Economic and Social Research Council (ESRC) individual and programme research grant awards

• data from ‘classic’ social science studies

• other funders/sources

Classic sociology datasets

• Peter Townsend – Poverty, old age and Katherine Buildings

• Paul Thompson – oral history and Edwardians

• Mildred Blaxter’s ‘Mothers and Daughters’

• Ray Pahl – Hertfordshire Villages studies

• National Social Policy and Social Change Archive

Finding data

• Catalogue of holdings

– Describes study, methods and data collection

– Records all study related publications

– Lists variables for SPSS datasets

– Can download user guide free

– Link to web download of dataset

Accessing data

DOWNLOAD TO LOCAL MACHINE

• You first need to register using Athens or UK Federation.

• You agree to an End User Licence

• You specify a project for which you'd like to use data

• You download data, selecting your desired format (SPSS, Stata, ASCII, RTF, etc.)

• You get an idea of file size

Accessing data online

• online data analysis, including

– Simple data analysis, visualisation, downloading and subsetting via Nesstar

– ESDS Qualidata Online – interview transcripts

– ESDS Government Vital Statistics online

– International macro data via Beyond 20/20 and visualisation interface

– Census data

Cross-tab

Instantly chart it
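
For readers unfamiliar with the term, a cross-tab simply counts respondents in each combination of two variables. The sketch below is not Nesstar code and does not use its interface; the records, the variable names `sex` and `cohabiting`, and the `crosstab` helper are hypothetical, chosen only to show the idea on a handful of rows.

```python
from collections import Counter

# Hypothetical survey records; the variable names are illustrative only.
respondents = [
    {"sex": "female", "cohabiting": "yes"},
    {"sex": "female", "cohabiting": "no"},
    {"sex": "male", "cohabiting": "no"},
    {"sex": "male", "cohabiting": "no"},
    {"sex": "female", "cohabiting": "yes"},
]


def crosstab(records, row_var, col_var):
    """Count respondents in each (row category, column category) cell."""
    return Counter((r[row_var], r[col_var]) for r in records)


table = crosstab(respondents, "sex", "cohabiting")

# Print the table one row category at a time; empty cells count as 0.
for row in ("female", "male"):
    print(row, {col: table[(row, col)] for col in ("yes", "no")})
```

An online tool does the same counting on the server, typically adding weights and percentages, and then charts the resulting cells.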

ESDS Qualidata Online

Creation of digital multimedia resources that integrate existing primary and secondary materials:

• catalogues of interview summaries

• full electronic interview transcripts

• thematic browsing of interview transcripts

• collections of digital sound clips

• contextual photos

• background information and press reviews on the original studies

• details of publications based upon secondary studies of the collections

Help for users

Each specialist service provides:

• help desk and web site

• dedicated survey pages

• JISCmail list

• regularly updated web-based FAQs

• programme of training courses and publicity events

• news bulletins and articles

• resources (links to other sites)

• teaching datasets and/or exemplars

• enhanced documentation, e.g.

– dataset and software guides

– statistical guides (SPSS, Stata, weighting)

– variables consistent over time on specific surveys (ESDS Government)

– thematic guides

The Census Portal

• the Census Portal provides one-stop registration and support for access to:

– Census Dissemination Unit from MIMAS – aggregate tables/Casweb

– Census Geography Data Unit (UKBORDERS) from EDINA – boundaries data

– Census Interaction Data Service (Universities of Leeds and St Andrews) - flow data

– Samples of Anonymised Records from CCSR – micro data

– CHCC - Historical Census Collection from AHDS History

History Data Service

• particularly strong in 19th and 20th century economic and social history

• census data (1881 100% sample; 1851 2% sample; lots of local census returns)

• Great Britain Historical Database online

• taxation materials

• large-scale datasets of Welsh and Irish historical statistics

• electoral data (poll books for local areas)

• criminal court records (e.g. a collection of datasets on violent crimes 1600-1900)

• agricultural statistics (prices, output)

• surveys of Scottish witchcraft

• state finance data

• economic indicators/industrial production data

Secondary analysis potential

• descriptive material

• comparative research, restudy or follow-up study

• re-analysis/secondary analysis

• research design and methodological advancement

• replication of published statistics

• teaching and learning

Secondary analysis potential

• description

• comparative research, restudy or follow-up study

• augment data you collect, e.g. expand sample size

• re-analysis or secondary analysis

• verification

• research design and methodological advancement

• teaching and learning

Re-using qualitative data

• Archived qualitative data are a rich and unique, yet too often unexploited, source of research material.

• They offer information that can be re-analysed, reworked, and compared with contemporary data.

• In time, too, archived research materials can prove to be a significant part of our cultural heritage and become resources for historical as well as contemporary research.

• What then are the methodological, ethical and theoretical considerations relating to the secondary analysis of qualitative data?

Culture of re-use

• well-established tradition in social science of reanalysing quantitative data

• no logical intellectual reason why this should not be so for qualitative data

• however, among qualitative researchers no similar research culture

• lack of discussion in the literature of the issues involved, including the benefits and limitations of such approaches

• more now published, but more needed …!

Data and Methods

• studies often encompass a diversity of methods and tools rather than a single one

• types of data collected vary with the aims of the study and the nature of the sample

• samples are most often small, but may rise to 500 or more informants

• as we have seen, data include interviews, group discussions, fieldwork diaries and observation notes, personal documents, photographs, etc.

• created in a variety of formats: digital, paper (typed and hand-written), audio, video and photographic

Description

• describing the contemporary and historical attributes, attitudes and behaviour of individuals, societies, groups or organisations

• data created now will in time become a unique historical resource

• providing alternative sources (the people’s voice etc.) to the public record that will be deposited in archives

Comparative research, replication or restudy

• of original research

• to compare with other data sources

• to provide comparison over time or between social groups or regions etc.

• to follow up original sample

• verification - substantiating results, although we have yet to see any evidence of re-use for this purpose (might be useful in a teaching context though)

Re-analysis

• secondary analysis

• asking new questions of the data and making interpretations different from those of the original researcher

• approaching the data in ways that weren't originally addressed, such as using data for investigating different themes or topics of study

• the more in-depth the material, the more possible this becomes

Research design and methodological advancement

• designing a new study or developing a methodology or research tool by studying sampling methods, data collection and fieldwork strategies and topic guides

• although researchers often publish a section on methods used, researchers' own fieldwork diaries can offer much insight into the history and development of the research

• encourage researchers to reflect on the researcher’s own experiences

‘Difficulties’ in re-using data

• practice of secondary analysis of qualitative data is not a commonplace research activity

• Major barriers cited:

– problem of the implicit nature of qualitative data collection and analysis – context and reflexivity

– lack of time to get fully acquainted with research materials created by someone else

– Constraints of informed consent

– insecurity about the exposure of one’s own research practice; IPR or threat of misinterpretation

– Lack of publicly available research data

Prerequisites for undertaking secondary analysis

• Having a rich and diverse stock of quality data sources, without excessive restriction

• Having access to original sources where possible, e.g. tape recordings or full transcriptions

• Having access to data contextualising material e.g. online catalogues, lists, methodology etc.

• Have a solid foundation in “primary” data analysis – range of qualitative research methods

• Possess rudimentary skills in computer assisted qualitative data analysis software (not essential, but useful)

• Have adequate time to engage in the project

Contact

www.esds.ac.uk

help@esds.ac.uk