Managing your research data

Post on 15-Jul-2015

Jenny Mitcham (Digital Archivist)

Lindsey Myers (Research Support Librarian)

Spring Term 2017

Information Services

Overview

• What is research data?

• What is research data management?

• Why manage research data?

• Data management planning (this is key)

• How to manage research data (best practice)

• Preserving and sharing research data

• Help and advice

What is research data?

What is data?

All the information you use as an integral part of your research

This is my research data

No documentation

Obsolete media

Data not accessible

Cryptic names

Poorly organised

Missing information about software

Backwards compatibility not assured

Your research data

Group discussion

• What data will you produce during the course of your project?

• How will you collect or create the data? What methods/standards will you use for data creation?

• If pre-existing data is being used, where will it come from? How will it be used?

• How much data do you expect to generate?

Leave some time to record information about your research data (Q1a and Q1c) on the DMP template provided.

What is research data management?

Research data management is …

A general term covering how you organize, structure, store, and care for the information used or generated during a research project

Research data management is …

How you look after information on a day-to-day basis over the lifetime of a project

What happens to data in the longer term - what you do with it after the project concludes

Good research practice:

• organising your data

• storing and backing up your data

• choosing the right file formats

• creating documentation for your data

Research data management is …

How you look after information on a day-to-day basis over the lifetime of a project

What happens to data in the longer term - what you do with it after the project concludes

• What data do you need to keep (and share)?

• What data must not be kept (and shared)?

• Where are you going to archive your (selected) data for long-term storage/access?

Why manage your research data?

Carrots and sticks

Carrots - the benefits

Sticks - requirements

• Work efficiently and with minimum hassle over the lifetime of the project

• Save time and avoid problems in the future

• Make it easy to share your data

Carrots and sticks

Carrots - the benefits

Sticks - requirements

• University of York Research Data Management Policy

www.york.ac.uk/rdm-policy

• Funding body requirements

University requires …

Good management of research data over the lifetime of your project

Selected research data to be preserved (for a minimum of 10 years) and shared at the end of your project

Research data must be:

• accurate, complete, authentic and reliable

• identifiable, retrievable and available when needed

• kept safe and secure, avoiding data loss

• kept in a manner that is compliant with legal and ethical obligations, and (if applicable) funder requirements

• disposed of securely.

University requires …

Good management of research data over the lifetime of your project

Selected research data to be preserved (for a minimum of 10 years) and shared at the end of your project

Sharing of research data:

a. of long-term value

b. underpinning published results

where there are no legal, ethical or commercial constraints that would prohibit sharing.

Funder requirements

Research Councils UK

“Publicly funded research data are a public good, produced in the public interest, which should be made openly available with as few restrictions as possible in a timely and responsible manner that does not harm intellectual property.

Data with acknowledged long term value should be preserved and remain accessible and usable for future research.”

RCUK Common Principles on Data Policy www.rcuk.ac.uk/research/datapolicy

Mechanisms for sharing research data

Deposit your selected data with an external service

Transfer your selected data to the University Research Data York service

• a funder data archive/repository

• a subject data archive/repository

• a publisher data archive/repository

www.re3data.org to identify a suitable data archive or repository for your data

+ Record the dataset in PURE

www.york.ac.uk/library/info-for/researchers/data/guidance/pure-datasets

Mechanisms for sharing research data

Deposit your selected data with an external service

Transfer your selected data to the University Research Data York service

• We’ll need some descriptive metadata (PURE)

• We will store and manage access to your data for a minimum of 10 years

– a CC-BY licence is applied to open data

– data with restricted access.

Depositing your data

www.york.ac.uk/library/info-for/researchers/data/sharing/#tab-4

Data management planning

Plan ahead to succeed in data management

Data management plans

Create a Data Management Plan (DMP)

Tools:

• DMPonline

• York DMP template

A formal document which outlines all aspects of your data management, i.e. what you will do with data during and after your research project ends.

To include:

• Description of the data

• Data collection methods

• Ethics and IPR

• Plans for data sharing

• Strategy for long-term preservation.

Required by most funders

AHRC “A Technical Plan should be no more than four pages long and provided for all applications where digital outputs or digital technologies are an essential part to the planned research outcomes.”

“All applications seeking research grant funding from BBSRC must submit a data management plan.”

“ESRC applicants who plan to generate data from their research must submit a data management plan…”

www.dcc.ac.uk/resources/data-management-plans/funders-requirements

Tools

DMPonline

https://dmponline.dcc.ac.uk

An online tool, created by the Digital Curation Centre, which is designed to help you create personalised data management plans according to the requirements stipulated by the major UK funders.

York DMP template for postgraduate research projects www.york.ac.uk/library/info-for/researchers/data/planning

Data management plans

‘In preparing for battle, I have always found that plans are useless but planning is indispensable.’

Dwight D. Eisenhower

Day-to-day data management

Organising your data (good file management)

‘What a mess’ by .pst, via Flickr: www.flickr.com/photos/psteichen/3915657914

Can you find what you need, when you need it?

In practice

Specific techniques for organising your research data, including developing plans for:

• folder structures - where to put data so you won’t lose it

• file (and folder) naming - what to call data so you know what it is

• version control - keeping track of data

Good file management practices are required to enable you to identify, locate and use your research data files now and into the future.

How not to do it...

Can you spot any problems and issues with how these files and directories are named?

This would be better...

• Use directories to help categorise data

• Make use of the folder hierarchy
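One way to make the folder hierarchy consistent is to create it from a script at the start of a project. The sketch below is only an illustration: the category names in `PROJECT_LAYOUT` and the function `create_project_folders` are hypothetical and should be adapted to your own data.

```python
import tempfile
from pathlib import Path

# Hypothetical layout: top-level categories and their subfolders.
# These names are an assumption, not a University standard.
PROJECT_LAYOUT = {
    "data": ["raw", "processed"],
    "documentation": [],
    "analysis": ["scripts", "outputs"],
}


def create_project_folders(root: Path, layout: dict[str, list[str]]) -> list[Path]:
    """Create the folder hierarchy under root and return the folders in creation order."""
    created = []
    for top_level, subfolders in layout.items():
        top_path = root / top_level
        # parents=True also creates the project root if it does not exist yet.
        top_path.mkdir(parents=True, exist_ok=True)
        created.append(top_path)
        for name in subfolders:
            sub_path = top_path / name
            sub_path.mkdir(exist_ok=True)
            created.append(sub_path)
    return created


if __name__ == "__main__":
    # Demonstrate in a temporary directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        for folder in create_project_folders(Path(tmp) / "my_project", PROJECT_LAYOUT):
            print(folder.relative_to(tmp))
```

Keeping raw and processed data in separate folders is a common choice because it makes it harder to overwrite the original files by accident.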

Version control can get messy

Version control guidance

File and folder names should be...

• Concise and meaningful

• Descriptive

• Consistent

Also think about...

• How you want your files to order

• Version control

• Using standard alphanumeric characters – avoid punctuation, spaces etc.

• Lower/upper/proper case
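The naming points above can be applied mechanically. The following is a minimal sketch of one possible convention, not a prescribed one: the function name `make_file_name` and the pattern `date_description_vNN.ext` are assumptions, and it treats hyphens and underscores as the only permitted separators.

```python
import re
from datetime import date


def make_file_name(description: str, when: date, version: int, extension: str) -> str:
    """Build a name such as 2017-02-14_interview-notes_v02.txt."""
    # Lower-case everything, then replace each run of characters that is not
    # a-z or 0-9 (spaces, punctuation, accented letters) with a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    if not slug:
        raise ValueError("description must contain at least one letter or digit")
    ext = extension.lower().lstrip(".")
    # The ISO 8601 date comes first so that names sort chronologically;
    # the version is zero-padded so that v02 sorts before v10.
    return f"{when.isoformat()}_{slug}_v{version:02d}.{ext}"


if __name__ == "__main__":
    print(make_file_name("Interview notes (final!)", date(2017, 2, 14), 2, ".TXT"))
```

Putting the date and a zero-padded version number in the name addresses both file ordering and version control without relying on labels such as "final" or "new".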

Happiness is:

“knowing what your file is before you double click on it”

Day-to-day data management

Storing your data (keeping your data safe)

http://blogs.ch.cam.ac.uk/pmr/2011/08/01/why-you-need-a-data-management-plan

Storage don'ts

Don’t

• keep your data just on your working machine (laptop or a desktop) – it’s the perfect way to lose your data easily and permanently

• upload personal/sensitive data to services the University does not have a contract with, e.g. Dropbox - doing so risks breaching the Data Protection Act

My Project

Storage do’s

The University offers a range of facilities to securely store your data, helping it live a long and useful life.

The University recommends:

University filestore/s (individual or shared)

• required for guaranteed UK storage

• required for “must be kept on site”

Note: You can access it and work on your files off-campus via the VPN

www.york.ac.uk/it-services/filestore

My Project

Storage do’s

The University offers a range of facilities to securely store your data, helping it live a long and useful life.

The University recommends:

University (york.ac.uk) Google Drive

• does not guarantee UK location, but does comply with UK & EU data protection legislation

• Google Apps legal information

www.york.ac.uk/it-services/google/policy/terms

www.york.ac.uk/it-services/services/drive

My Project

Storage do’s

University filestore has the advantage of being automatically/regularly backed up

If you are not using it … backing up should be an automatic part of your everyday research practice.
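If you do manage your own backups, it helps to script them so they are repeatable and verifiable. The sketch below is an assumption about how this could be done with the Python standard library: `back_up` and `sha256_of` are hypothetical helper names, and the code copies files and then compares SHA-256 checksums to confirm that each copy matches its original.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source: Path, destination: Path) -> list[Path]:
    """Copy every file under source into destination.

    Returns the source files whose copy failed checksum verification
    (an empty list means every copy matches its original).
    """
    mismatches = []
    # sorted() lists the source files before any copying starts.
    for src_file in sorted(source.rglob("*")):
        if not src_file.is_file():
            continue
        dst_file = destination / src_file.relative_to(source)
        dst_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dst_file)  # copy2 also preserves timestamps
        if sha256_of(src_file) != sha256_of(dst_file):
            mismatches.append(src_file)
    return mismatches
```

The destination should be on a different device or service from the source, and outside the source folder; a copy on the same laptop does not protect against loss of the laptop.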

Imagine if a fire or similar disaster happened here.

How much would it cost you?

Mountbatten Building, So’ton Uni.

Storage do’s

Protecting confidential information

The University requires that any device (laptops, tablets, phones, your email) that holds sensitive or confidential information is encrypted.

IT Services offer guidance and support to help you protect your confidential data, whether it's files that you need to share securely or a device that requires encryption.

www.york.ac.uk/it-services/security/encryption

Familiarise yourself with the University's

• Information Security Policy

• Data Protection Policy

• Records Management Policy

Information Policy & You http://bit.ly/1lKWnwr

Storing physical data guidance

A data management horror story

Highlights the things that can go wrong if you don’t manage and share your data well.

Part 1: I can’t find the data … where is it stored?

Video by NYU Health Sciences Libraries Part 1: Request for data https://youtu.be/RVZbk3GEVSw

Storing your data

Individual task. Refer back to the research data you listed for Q1a. and start to record how you will keep your data safe on your DMP.

DMP template

Q2a. Where will you store your data?

• Is it digital or physical (e.g. print) data?

• Is it sensitive or confidential data?

• Are you collaborating/working with others?

• Are you working off campus at all?

Q2b. How will you back up your (digital) data?

You have … 5 minutes (and take a break)

Day-to-day data management

Choosing the right file formats

File formats: during your research

• During your research you need a file format:

– That you can work with/collect/create/analyse etc

• Can depend on hardware and software

– That fits with your research methodology and workflows

File formats: after your research

• After your research is complete you need a file format:

– That is easier to share and re-use

– That may last longer into the future

• Open specifications

• Widely used formats

• Uncompressed

• ASCII formats

• Exchange formats

– Does the software you are using have an option to export into a more suitable format for sharing or long term re-use?
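Where your data lives in a script or a tool that can be scripted, exporting to an open, plain-text format can be a few lines of code. The sketch below assumes tabular data held as a list of records and writes it as CSV, a widely used format; `export_to_csv` and the column names in the example are hypothetical. The file is written as UTF-8, which is identical to ASCII for plain ASCII content.

```python
import csv
from pathlib import Path


def export_to_csv(rows: list[dict], path: Path) -> None:
    """Write a list of records to a plain-text, UTF-8 encoded CSV file."""
    if not rows:
        raise ValueError("nothing to export")
    # Take the column order from the first record.
    fieldnames = list(rows[0].keys())
    # newline="" lets the csv module control line endings itself.
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


if __name__ == "__main__":
    # Hypothetical sample records; the unit is part of the column name.
    samples = [
        {"site": "A", "depth_cm": 12},
        {"site": "B", "depth_cm": 7},
    ]
    export_to_csv(samples, Path("samples.csv"))
```

Putting units in the column names, as in `depth_cm`, means the exported file carries some of its own documentation.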

A data management horror story … continued

Part 2: File formats … what software do I need?

Video by NYU Health Sciences Libraries

Part 2: File formats https://youtu.be/RtSv0gSbCP8

Your file formats

Individual task. Refer back to the research data you listed for Q1a. on your DMP and consider:

DMP template

Q1b.

• What file formats will you use to collect, create and analyse your data?

• What file formats will you use to share your data and keep it for the longer term?

You have … 5 minutes

Day-to-day data management

Documentation & metadata (describing your data)

Documentation and metadata

• Documentation is the contextual information required to make data intelligible and aid interpretation

• A users’ guide to your data

• Metadata is similar, but usually more structured

– Can conform to set standards

– Sometimes machine readable
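As an illustration of structured, machine-readable metadata, the sketch below stores a small metadata record as a JSON file alongside the data file it describes. The helper `write_metadata`, the `.metadata.json` suffix and the field names are all assumptions made for this example; they do not follow any particular metadata standard, and a repository may require its own schema.

```python
import json
from pathlib import Path


def write_metadata(data_file: Path, metadata: dict) -> Path:
    """Write metadata as JSON next to data_file and return the new file's path."""
    # survey.csv -> survey.csv.metadata.json, so the pairing is obvious.
    sidecar = data_file.with_name(data_file.name + ".metadata.json")
    sidecar.write_text(
        json.dumps(metadata, indent=2, sort_keys=True),
        encoding="utf-8",
    )
    return sidecar


if __name__ == "__main__":
    # Hypothetical record; choose fields that suit your discipline.
    record = {
        "title": "Example survey responses",
        "creator": "A. Researcher",
        "created": "2017-02-14",
        "description": "Anonymised responses to a pilot questionnaire.",
    }
    print(write_metadata(Path("survey.csv"), record))
```

Because the record is plain JSON, it can be read by people in a text editor and by software without any special tools.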

Why do we need documentation and metadata?

• So you can understand it

• So other people can understand it

• So your findings are verifiable

• So others understand your methodology

• So others can repeat your methods

• To make it reusable

Will someone else understand your data if it isn’t documented?

"The single most useful thing you can do to ensure the long-term preservation of your data is to plan for it to be re-used. Imagining it being reused by someone else who has never met you and who never will meet you, will cause you to approach the creation and design of your data in a new light. ..... In short, always plan for re-use"

Professor Julian D. Richards, Director, Archaeology Data Service, University of York

A data management horror story … continued

Part 3: I can now access the data … but what do the column names mean?

Video by NYU Health Sciences Libraries

Part 3: Documentation https://youtu.be/-MIH8PkuUo4

Documentation

In your group, look at the sample data sheet. Imagine you have just downloaded this dataset from an archive. Discuss:

• What contextual or explanatory information is missing?

- anything odd about the data that needs clarifying?

• What additional documentation would you like to see supplied?

- about specific items of information recorded here

- about the data collection as a whole

You have … 5 minutes, then feedback

Documentation – what to include

• Who created it, when and why

• Description of the item

• Methodology and methods

• Units of measurement

• Definitions of jargon, acronyms and code

• References to related data
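The items listed under "Documentation – what to include" can also be used as a simple checklist. The sketch below is one hypothetical way to do that: the keys in `REQUIRED_ITEMS` are names invented here to stand for the listed items, and `missing_documentation` reports which of them a documentation record has left blank.

```python
# Hypothetical keys standing for the items on the slide:
# who created it, when, why, description, methodology and methods,
# units of measurement, definitions, and references to related data.
REQUIRED_ITEMS = [
    "creator",
    "created",
    "purpose",
    "description",
    "methodology",
    "units",
    "definitions",
    "related_data",
]


def missing_documentation(record: dict) -> list[str]:
    """Return the checklist items that are absent or blank in record."""
    missing = []
    for item in REQUIRED_ITEMS:
        value = record.get(item)
        # Treat None, empty strings and whitespace-only strings as missing.
        if value is None or not str(value).strip():
            missing.append(item)
    return missing


if __name__ == "__main__":
    draft = {"creator": "A. Researcher", "created": "2017-02-14", "units": " "}
    print(missing_documentation(draft))
```

A check like this only confirms that something has been written for each item; whether the documentation is actually intelligible still needs a human reader.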

www.texample.net

Your documentation

Individual task. Refer back to the research data you listed for Q1a. on your DMP and start to record:

DMP template

Q2g.

• What documentation will you need to produce to enable yourself and others working in your discipline to understand it in the future?

• What format will it be in?

You have … 5 minutes

What happens at the end of the project?

Keeping selected data

Keeping selected data

Why not keep everything?

Data appraisal for your project

Just because you can preserve, doesn’t always mean you should.

• costs of preserving data (time, technology, space, maintenance)

• risks in keeping things

e.g. Freedom of Information Act (FOI) requests

Appraisal is a concept familiar to archivists: “the process of evaluating records to determine which are to be retained as archives, which are to be kept for specified periods and which are to be destroyed”

Ellis, J. (ed.) (1993) Keeping Archives. 2nd ed. Melbourne: Australian Society of Archivists, p.461.

PrePARe checklist

http://find.jorum.ac.uk/resources/10949/17171

Before your project ends …

• What data should you keep (and share)?

University Policy, funder and publisher requirements

• What data must not be kept (and shared)?

For ethical, legal or commercial reasons

www.york.ac.uk/library/info-for/researchers/data/sharing

What happens at the end of the project?

Sharing selected data

Why share data?

Validation of research is important

http://inkouper.blogspot.co.uk/2015/09/valuable-lessons-from-sharing-and-non.html

Data sharing – concerns

• Ethical concerns

– Confidential or sensitive data

• Legal concerns

– Third party data

• Professional concerns

– Intended publication

– Commercial issues (e.g. patent protection)

• Redact or embargo if there is good reason

• Planning ahead can reduce difficulties

Mechanisms for sharing

Remember…

Two options are available to you.

Deposit your selected data of long-term value:

• with external services, i.e. a funder/subject/publisher repository

• with University services (contact us).

Sharing your data

Individual task. Refer back to the research data you listed for Q1a. and start to record:

DMP template

Q4a.

• What data should or shouldn’t be shared openly and why?

• Is there anything you need to do to enable you to share your data?

Q4c.

• How do you intend to share your data?

You have … 5 minutes

Further information and resources

RDM resources

RDM web pages: www.york.ac.uk/rdm

Research Data MANTRA: http://datalib.edina.ac.uk

RET courses: www.skillsforge.york.ac.uk

• Data Protection

• Integrity and Ethics

• Know your (Copy)Rights: protecting your own work and re-using other people’s

We are here to help

Research Support Team lib-research-support@york.ac.uk

IT Support Office itsupport@york.ac.uk

What you need to do

Before your project

Plan your data management

1. Write a funder data management plan OR write a (York) DMP

2. Pull together all eligible costs

During your project

Look after your live data

3. Update the DMP

4. Organise the data

5. Store the data

6. Describe the data

7. Decide what data to keep

End of your project

Start sharing and depositing data to meet University/funder requirements

8. Deposit (and share*) the data

9. Obtain a DOI

10. Include a data access statement in publications

* Decide if data can be shared openly or if access to the data needs to be restricted.

A final word from you?

Questions?

Comments

• things you’ve learnt

• what you need to investigate further

This is just the beginning …