Managing your research data

Managing your research data Jenny Mitcham (Digital Archivist) Lindsey Myers (Research Support Librarian) Spring Term 2017 Information Services

Transcript of Managing your research data

Page 1: Managing your research data

Managing your research data

Jenny Mitcham (Digital Archivist)

Lindsey Myers (Research Support Librarian)

Spring Term 2017

Information Services

Page 2: Managing your research data

Overview

• What is research data?

• What is research data management?

• Why manage research data?

• Data management planning (this is key)

• How to manage research data (best practice)

• Preserving and sharing research data

• Help and advice

Page 3: Managing your research data

What is research data?

Page 4: Managing your research data

What is data?

All the information you use as an integral part of your research

Page 5: Managing your research data

This is my research data

No documentation

Obsolete media

Data not accessible

Page 6: Managing your research data

Cryptic names

Poorly organised

Missing information about software

Backwards compatibility not assured

Page 7: Managing your research data

Your research data

Group discussion

• What data will you produce during the course of your project?

• How will you collect or create the data? What methods/standards will you use for data creation?

• If pre-existing data is being used, where will it come from? How will it be used?

• How much data do you expect to generate?

Leave some time to record information about your research data (Q1a and Q1c) on the DMP template provided.

Page 8: Managing your research data

What is research data management?

Page 9: Managing your research data

Research data management is …

A general term covering how you organize, structure, store, and care for the information used or generated during a research project

Page 10: Managing your research data

Research data management is …

How you look after information on a day-to-day basis over the lifetime of a project

What happens to data in the longer term: what you do with it after the project concludes

Good research practice:

• organising your data

• storing and backing up your data

• choosing the right file formats

• creating documentation for your data

Page 11: Managing your research data

Research data management is …

How you look after information on a day-to-day basis over the lifetime of a project

What happens to data in the longer term: what you do with it after the project concludes

• What data do you need to keep (and share)?

• What data must not be kept (and shared)?

• Where are you going to archive your (selected) data for long-term storage/access?

Page 12: Managing your research data

Why manage your research data?

Page 13: Managing your research data

Carrots and sticks

Carrots - the benefits

Sticks - requirements

• Work efficiently and with minimum hassle over the lifetime of the project

• Save time and avoid problems in the future

• Make it easy to share your data

Page 14: Managing your research data

Carrots and sticks

Carrots - the benefits

Sticks - requirements

• University of York Research Data Management Policy

www.york.ac.uk/rdm-policy

• Funding body requirements

Page 15: Managing your research data

University requires …

Good management of research data over the lifetime of your project

Selected research data to be preserved (for a minimum of 10 years) and shared at the end of your project

Research data must be:

• accurate, complete, authentic and reliable

• identifiable, retrievable and available when needed

• kept safe and secure, avoiding data loss

• kept in a manner that is compliant with legal and ethical obligations, and (if applicable) funder requirements

• disposed of securely.

Page 16: Managing your research data

University requires …

Good management of research data over the lifetime of your project

Selected research data to be preserved (for a minimum of 10 years) and shared at the end of your project

Sharing of research data:

a. of long-term value

b. underpinning published results

where there are no legal, ethical or commercial constraints that would prohibit sharing.

Page 17: Managing your research data

Funder requirements

Research Councils UK

“Publicly funded research data are a public good, produced in the public interest, which should be made openly available with as few restrictions as possible in a timely and responsible manner that does not harm intellectual property.

Data with acknowledged long term value should be preserved and remain accessible and usable for future research.”

RCUK Common Principles on Data Policy www.rcuk.ac.uk/research/datapolicy

Page 18: Managing your research data

Mechanisms for sharing research data

Deposit your selected data with an external service

Transfer your selected data to the University Research Data York service

• a funder data archive/repository

• a subject data archive/repository

• a publisher data archive/repository

Use www.re3data.org to identify a suitable data archive or repository for your data

+ Record the dataset in PURE

www.york.ac.uk/library/info-for/researchers/data/guidance/pure-datasets

Page 19: Managing your research data

Mechanisms for sharing research data

Deposit your selected data with an external service

Transfer your selected data to the University Research Data York service

• We’ll need some descriptive metadata (PURE)

• We will store and manage access to your data for a minimum of 10 years

– a CC-BY licence is applied to open data

– data with restricted access.

Depositing your data

www.york.ac.uk/library/info-for/researchers/data/sharing/#tab-4

Page 21: Managing your research data

Data management planning

Plan ahead to succeed in data management

Page 22: Managing your research data

Data management plans

Create a Data Management Plan (DMP)

Tools:

• DMPonline

• York DMP template

A formal document which outlines all aspects of your data management, i.e. what you will do with your data during your research project and after it ends.

To include:

• Description of the data

• Data collection methods

• Ethics and IPR

• Plans for data sharing

• Strategy for long-term preservation.

Page 23: Managing your research data

Data management plans

Create a Data Management Plan (DMP)

Tools:

• DMPonline

• York DMP template

Required by most funders

AHRC “A Technical Plan should be no more than four pages long and provided for all applications where digital outputs or digital technologies are an essential part to the planned research outcomes.”

“All applications seeking research grant funding from BBSRC must submit a data management plan.”

“ESRC applicants who plan to generate data from their research must submit a data management plan…”

www.dcc.ac.uk/resources/data-management-plans/funders-requirements

Page 24: Managing your research data

Data management plans

Create a Data Management Plan (DMP)

Tools

DMPonline

https://dmponline.dcc.ac.uk

An online tool, created by the Digital Curation Centre, which is designed to help you create personalised data management plans according to the requirements stipulated by the major UK funders.

York DMP template for postgraduate research projects www.york.ac.uk/library/info-for/researchers/data/planning

Page 25: Managing your research data

Data management plans

‘In preparing for battle, I have always found that plans are useless but planning is indispensable.’

Dwight D. Eisenhower

Page 26: Managing your research data

Day-to-day data management

Organising your data (good file management)

Page 27: Managing your research data

‘What a mess’ by .pst, via Flickr: www.flickr.com/photos/psteichen/3915657914

Can you find what you need, when you need it?

Page 28: Managing your research data

In practice

Specific techniques for organising your research data, including developing plans for:

• folder structures - where to put data so you won’t lose it

• file (and folder) naming - what to call data so you know what it is

• version control - keeping track of data

Good file management practices are required to enable you to identify, locate and use your research data files now and into the future.

Page 29: Managing your research data

How not to do it...

Can you spot any problems and issues with how these files and directories are named?

Page 30: Managing your research data

This would be better...

• Use directories to help categorise data

• Make use of the folder hierarchy
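One way to apply these two points is to create the same folder hierarchy at the start of every project. The sketch below is illustrative only: the folder names in PROJECT_LAYOUT and the function create_project_folders are assumptions made for this example, not a University standard.

```python
from pathlib import Path
import tempfile

# Hypothetical layout: these folder names are illustrative only.
PROJECT_LAYOUT = [
    "data/raw",
    "data/processed",
    "documentation",
    "analysis",
    "outputs",
]


def create_project_folders(root: Path, layout=PROJECT_LAYOUT) -> list[Path]:
    """Create each folder under root (parents included) and return the paths."""
    created = []
    for relative in layout:
        folder = root / relative
        folder.mkdir(parents=True, exist_ok=True)  # safe to re-run
        created.append(folder)
    return created


if __name__ == "__main__":
    # Demonstrate in a throwaway directory so nothing real is touched.
    with tempfile.TemporaryDirectory() as tmp:
        for folder in create_project_folders(Path(tmp) / "my_project"):
            print(folder.relative_to(tmp))
```

Keeping raw and processed data in separate folders is a common convention; adjust the list to whatever categories suit your own project.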

Page 31: Managing your research data

Version control can get messy

Page 32: Managing your research data

Version control guidance

Page 33: Managing your research data

File and folder names should be...

• Concise and meaningful

• Descriptive

• Consistent

Also think about...

• How you want your files to order

• Version control

• Using standard alphanumeric characters – avoid punctuation, spaces etc.

• Lower/upper/proper case

Happiness is:

“knowing what your file is before you double click on it”
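The naming advice on this slide can be turned into a small helper that builds names the same way every time. This is a minimal sketch under stated assumptions: the date_project_description_version pattern, the use of hyphens and underscores as the only separators, and the function names are all choices made for this example, not a prescribed scheme.

```python
import re
from datetime import date


def clean_part(text: str) -> str:
    """Lower-case a label and replace runs of non-alphanumerics with a hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_file_name(project: str, description: str, created: date,
                    version: int, extension: str) -> str:
    """Join date, project, description and version into one sortable name."""
    stamp = created.strftime("%Y%m%d")  # year first, so names sort by date
    parts = [stamp, clean_part(project), clean_part(description),
             f"v{version:02d}"]         # zero-padded so v02 sorts before v10
    return "_".join(parts) + "." + extension.lower().lstrip(".")


def is_well_formed(name: str) -> bool:
    """Check a name against the pattern produced by build_file_name."""
    pattern = r"\d{8}_[a-z0-9-]+_[a-z0-9-]+_v\d{2,}\.[a-z0-9]+"
    return re.fullmatch(pattern, name) is not None


if __name__ == "__main__":
    name = build_file_name("Soil Survey", "pH readings (site A)",
                           date(2017, 2, 14), 3, "CSV")
    print(name)  # 20170214_soil-survey_ph-readings-site-a_v03.csv
    print(is_well_formed("Final version (2) NEW.doc"))  # False
```

Putting the date first and zero-padding the version number means an ordinary alphabetical file listing also shows the files in chronological and version order.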

Page 34: Managing your research data

Day-to-day data management

Storing your data (keeping your data safe)

Page 35: Managing your research data

http://blogs.ch.cam.ac.uk/pmr/2011/08/01/why-you-need-a-data-management-plan

Page 36: Managing your research data

Storage don'ts

Don’t

• keep your data just on your working machine (laptop or a desktop) – it’s the perfect way to lose your data easily and permanently

• upload personal/sensitive data to services the University does not have a contract with, e.g. Dropbox, as this would breach the Data Protection Act

My Project

Page 37: Managing your research data

Storage do’s

The University offers a range of facilities to securely store your data, helping it live a long and useful life.

The University recommends:

University filestore/s (individual or shared)

• required for guaranteed UK storage

• required for “must be kept on site”

Note: You can access it and work on your files off-campus via the VPN

www.york.ac.uk/it-services/filestore

My Project

Page 38: Managing your research data

Storage do’s

The University offers a range of facilities to securely store your data, helping it live a long and useful life.

The University recommends:

University (york.ac.uk) Google Drive

• does not guarantee UK location, but does comply with UK & EU data protection legislation

• Google Apps legal information

www.york.ac.uk/it-services/google/policy/terms

www.york.ac.uk/it-services/services/drive

My Project

Page 39: Managing your research data

Storage do’s

University filestore has the advantage of being automatically/regularly backed up

If you are not using it … backing up should be an automatic part of your everyday research practice.

Imagine if a fire or similar disaster happened here.

How much would it cost you?

Mountbatten Building, So’ton Uni.
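If you are not using the University filestore, a routine backup can be scripted so that it happens the same way each time. The sketch below is an assumption-laden illustration, not a University tool: the function names are made up for this example, it copies every file to a second location, and it compares checksums to confirm that each copy matches the original.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source_dir: Path, backup_dir: Path) -> dict[str, str]:
    """Copy every file under source_dir to backup_dir and verify each copy.

    Returns a mapping of relative path -> checksum for the verified copies.
    backup_dir is assumed to sit outside source_dir.
    """
    verified = {}
    for source in sorted(source_dir.rglob("*")):
        if not source.is_file():
            continue  # folders are created as needed below
        relative = source.relative_to(source_dir)
        target = backup_dir / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copy2 also keeps timestamps
        checksum = sha256_of(source)
        if sha256_of(target) != checksum:
            raise IOError(f"Backup copy of {relative} does not match the original")
        verified[relative.as_posix()] = checksum
    return verified
```

A backup only protects against a fire or similar disaster if backup_dir is on different hardware in a different place; the script checks the copy, not the location.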

Page 40: Managing your research data

Storage do’s

Protecting confidential information

The University requires that any device (laptops, tablets, phones, your email) that holds sensitive or confidential information is encrypted.

IT Services offer guidance and support to help you protect your confidential data, whether it's files that you need to share securely or a device that requires encryption.

www.york.ac.uk/it-services/security/encryption

Familiarise yourself with the University's

• Information Security Policy

• Data Protection Policy

• Records Management Policy

Information Policy & You http://bit.ly/1lKWnwr

Page 41: Managing your research data

Storing physical data guidance

Page 42: Managing your research data

A data management horror story

Highlights the things that can go wrong if you don’t manage and share your data well.

Part 1: I can’t find the data … where is it stored?

Video by NYU Health Sciences Libraries

Part 1: Request for data https://youtu.be/RVZbk3GEVSw

Page 43: Managing your research data

Storing your data

Individual task. Refer back to the research data you listed for Q1a. and start to record how you will keep your data safe on your DMP.

DMP template

Q2a. Where will you store your data?

• Is it digital or physical (e.g. print) data?

• Is it sensitive or confidential data?

• Are you collaborating/working with others?

• Are you working off campus at all?

Q2b. How will you back-up your (digital) data?

You have … 5 minutes (and take a break)

Page 44: Managing your research data

Day-to-day data management

Choosing the right file formats

Page 45: Managing your research data

File formats: during your research

• During your research you need a file format:

– That you can work with/collect/create/analyse etc

• Can depend on hardware and software

– That fits with your research methodology and workflows

Page 46: Managing your research data

File formats: after your research

• After your research is complete you need a file format:

– That is easier to share and re-use

– That may last longer into the future

• Open specifications

• Widely used formats

• Uncompressed

• ASCII formats

• Exchange formats

– Does the software you are using have an option to export into a more suitable format for sharing or long term re-use?
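Where your software has no export option, a short script can often write the data out in an open, plain-text format. The sketch below assumes the data is already available as a list of records (one dictionary per row); the function name export_to_csv and the sample column names are hypothetical. It uses CSV with UTF-8 encoding as one example of a widely used, uncompressed, openly specified format.

```python
import csv
from pathlib import Path


def export_to_csv(records: list[dict], destination: Path) -> list[str]:
    """Write records to a UTF-8 CSV file and return the column order used."""
    columns = list(records[0].keys())  # header taken from the first record
    with destination.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=columns)
        writer.writeheader()
        writer.writerows(records)
    return columns
```

CSV stores every value as text, so units, data types and coding schemes need to be recorded separately in your documentation.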

Page 47: Managing your research data

A data management horror story … continued

Part 2: File formats … what software do I need?

Video by NYU Health Sciences Libraries

Part 2: File formats https://youtu.be/RtSv0gSbCP8

Page 48: Managing your research data

Your file formats

Individual task. Refer back to the research data you listed for Q1a. on your DMP and consider:

DMP template

Q1b.

• What file formats will you use to collect, create and analyse your data?

• What file formats will you use to share your data and keep it for the longer term?

You have … 5 minutes

Page 49: Managing your research data

Day-to-day data management

Documentation & metadata (describing your data)

Page 50: Managing your research data

Documentation and metadata

• Documentation is the contextual information required to make data intelligible and aid interpretation

• A users’ guide to your data

• Metadata is similar, but usually more structured

– Can conform to set standards

– Sometimes machine readable
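To make "structured" and "machine readable" concrete, here is a minimal sketch of a metadata record saved as JSON. The field names in REQUIRED_FIELDS are assumptions chosen for this example and do not follow any particular metadata standard; a repository will normally tell you which standard and fields it expects.

```python
import json

# Hypothetical minimum set of fields, chosen for illustration only.
REQUIRED_FIELDS = ("title", "creator", "date_created", "description", "licence")


def make_metadata_record(**fields: str) -> str:
    """Return a JSON metadata record, refusing to build an incomplete one."""
    missing = [name for name in REQUIRED_FIELDS if not fields.get(name)]
    if missing:
        raise ValueError("Missing metadata fields: " + ", ".join(missing))
    return json.dumps(fields, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(make_metadata_record(
        title="Soil pH readings",
        creator="A. Researcher",
        date_created="2017-02-14",
        description="Example record for illustration",
        licence="CC-BY",
    ))
```

Because the record has fixed field names, software can check it for completeness, which is harder to do with free-text documentation.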

Page 51: Managing your research data

Why do we need documentation and metadata?

• So you can understand it

• So other people can understand it

• So your findings are verifiable

• So others understand your methodology

• So others can repeat your methods

• To make it reusable

Page 52: Managing your research data

Will someone else understand your data if it isn’t documented?

"The single most useful thing you can do to ensure the long-term preservation of your data is to plan for it to be re-used. Imagining it being reused by someone else who has never met you and who never will meet you, will cause you to approach the creation and design of your data in a new light. ..... In short, always plan for re-use"

Professor Julian D. Richards, Director, Archaeology Data Service, University of York

Page 53: Managing your research data

A data management horror story … continued

Part 3: I can now access the data … but what do the column names mean?

Video by NYU Health Sciences Libraries

Part 3: Documentation https://youtu.be/-MIH8PkuUo4

Page 54: Managing your research data

Documentation

In your group, look at the sample data sheet. Imagine you have just downloaded this dataset from an archive. Discuss:

• What contextual or explanatory information is missing?

– anything odd about the data that needs clarifying?

• What additional documentation would you like to see supplied?

– about specific items of information recorded here

– about the data collection as a whole

You have … 5 minutes, then feedback

Page 55: Managing your research data

Documentation – what to include

• Who created it, when and why

• Description of the item

• Methodology and methods

• Units of measurement

• Definitions of jargon, acronyms and code

• References to related data

www.texample.net
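One practical form of documentation is a data dictionary: a description and a unit of measurement for every column in a data file. The sketch below checks that no column has been left undocumented. The function name, the dictionary layout and the sample column names are all hypothetical.

```python
def undocumented_columns(columns: list[str],
                         data_dictionary: dict[str, dict]) -> list[str]:
    """Return the columns that lack a description or a unit."""
    problems = []
    for column in columns:
        entry = data_dictionary.get(column, {})
        if not entry.get("description") or not entry.get("unit"):
            problems.append(column)
    return problems


if __name__ == "__main__":
    # Hypothetical data dictionary for a three-column file.
    dictionary = {
        "site_id": {"description": "Sampling site code", "unit": "n/a"},
        "ph": {"description": "Soil acidity", "unit": "pH"},
    }
    print(undocumented_columns(["site_id", "ph", "temp_c"], dictionary))
    # ['temp_c']
```

Running a check like this before depositing data helps a future user avoid having to guess what a column name means.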

Page 56: Managing your research data

Your documentation

Individual task. Refer back to the research data you listed for Q1a. on your DMP and start to record:

DMP template

Q2g.

• What documentation will you need to produce to enable yourself and others working in your discipline to understand it in the future?

• What format will it be in?

You have … 5 minutes

Page 57: Managing your research data

What happens at the end of the project?

Keeping selected data

Page 58: Managing your research data

Keeping selected data

Why not keep everything?

Data appraisal for your project

Just because you can preserve, doesn’t always mean you should.

• costs of preserving data (time, technology, space, maintenance)

• risks in keeping things

e.g. Freedom of Information Act (FOI) requests

Page 59: Managing your research data

Keeping selected data

Why not keep everything?

Data appraisal for your project

Appraisal is a concept familiar to archivists:

“the process of evaluating records to determine which are to be retained as archives, which are to be kept for specified periods and which are to be destroyed”

Ellis, J. (ed.) (1993) Keeping Archives. 2nd ed. Melbourne: Australian Society of Archivists, p.461.

Page 60: Managing your research data

Keeping selected data

Why not keep everything?

Data appraisal for your project

PrePARe checklist

http://find.jorum.ac.uk/resources/10949/17171

Page 61: Managing your research data

Keeping selected data

Why not keep everything?

Data appraisal for your project

Before your project ends …

• What data should you keep (and share)?

University Policy, funder and publisher requirements

• What data must not be kept (and shared)?

For ethical, legal or commercial reasons

www.york.ac.uk/library/info-for/researchers/data/sharing

Page 62: Managing your research data

What happens at the end of the project?

Sharing selected data

Page 63: Managing your research data

Why share data?

Page 64: Managing your research data

Validation of research is important

http://inkouper.blogspot.co.uk/2015/09/valuable-lessons-from-sharing-and-non.html

Page 65: Managing your research data

Data sharing – concerns

• Ethical concerns

– Confidential or sensitive data

• Legal concerns

– Third party data

• Professional concerns

– Intended publication

– Commercial issues (e.g. patent protection)

Page 66: Managing your research data

Data sharing – concerns

• Redact or embargo if there is good reason

• Planning ahead can reduce difficulties

Page 67: Managing your research data

Mechanisms for sharing

Remember…

Two options are available to you.

Deposit your selected data of long-term value:

• with external services, i.e. a funder/subject/publisher repository

• with University services (contact us).

Page 68: Managing your research data

Sharing your data

Individual task. Refer back to the research data you listed for Q1a. and start to record:

DMP template

Q4a.

• What data should or shouldn’t be shared openly and why?

• Is there anything you need to do to enable you to share your data?

Q4c.

• How do you intend to share your data?

You have … 5 minutes

Page 69: Managing your research data

Further information and resources

Page 70: Managing your research data

RDM resources

RDM web pages

Research Data MANTRA

RET courses

www.york.ac.uk/rdm

Page 71: Managing your research data

RDM resources

RDM web pages

Research Data MANTRA

RET courses

http://datalib.edina.ac.uk

Page 72: Managing your research data

RDM resources

RDM web pages

Research Data MANTRA

RET courses

www.skillsforge.york.ac.uk

• Data Protection

• Integrity and Ethics

• Know your (Copy)Rights: protecting your own work and re-using other people’s

Page 73: Managing your research data

We are here to help

Research Support Team [email protected]

IT Support [email protected]

Page 74: Managing your research data

What you need to do

Before your project

Plan your data management

1. Write a funder data management plan OR write a (York) DMP

2. Pull together all eligible costs

During your project

Look after your live data

3. Update the DMP

4. Organise the data

5. Store the data

6. Describe the data

7. Decide what data to keep

End of your project

Start sharing and depositing data to meet University/funder requirements

8. Deposit (and share*) the data

9. Obtain a DOI

10. Include a data access statement in publications

* Decide if data can be shared openly or if access to the data needs to be restricted.

Page 75: Managing your research data

A final word from you?

Questions?

Comments

• things you’ve learnt

• what you need to investigate further

Page 76: Managing your research data

This is just the beginning …