Managing your research data
Jenny Mitcham (Digital Archivist)
Lindsey Myers (Research Support Librarian)
Spring Term 2017
Information Services
Overview
• What is research data?
• What is research data management?
• Why manage research data?
• Data management planning (this is key)
• How to manage research data (best practice)
• Preserving and sharing research data
• Help and advice
What is research data?
What is data?
All the information you use as an integral part of your research
This is my research data
No documentation
Obsolete media
Data not accessible
Cryptic names
Poorly organised
Missing information about software
Backwards compatibility not assured
Your research data
Group discussion
• What data will you produce during the course of your project?
• How will you collect or create the data? What methods/standards will you use for data creation?
• If pre-existing data is being used, where will it come from? How will it be used?
• How much data do you expect to generate?
Leave some time to record information about your research data (Q1a and Q1c) on the DMP template provided.
What is research data management?
Research data management is …
A general term covering how you organize, structure, store, and care for the information used or generated during a research project
Research data management is …
How you look after information on a day-to-day basis over the lifetime of a project
Good research practice:
• organising your data
• storing and backing up your data
• choosing the right file formats
• creating documentation for your data
What happens to data in the longer term - what you do with it after the project concludes
• What data do you need to keep (and share)?
• What data must not be kept (and shared)?
• Where are you going to archive your (selected) data for long-term storage/access?
Why manage your research data?
Carrots and sticks
Carrots - the benefits
• Work efficiently and with minimum hassle over the lifetime of the project
• Save time and avoid problems in the future
• Make it easy to share your data
Sticks - requirements
• University of York Research Data Management Policy
www.york.ac.uk/rdm-policy
• Funding body requirements
University requires …
Good management of research data over the lifetime of your project
Selected research data to be preserved (for a minimum of 10 years) and shared at the end of your project
Research data must be:
• accurate, complete, authentic and reliable
• identifiable, retrievable and available when needed
• kept safe and secure, avoiding data loss
• kept in a manner that is compliant with legal and ethical obligations, and (if applicable) funder requirements
• disposed of securely.
Sharing of research data:
a. of long-term value
b. underpinning published results
where there are no legal, ethical or commercial constraints that would prohibit sharing.
Funder requirements
Research Councils UK
“Publicly funded research data are a public good, produced in the public interest, which should be made openly available with as few restrictions as possible in a timely and responsible manner that does not harm intellectual property.
Data with acknowledged long term value should be preserved and remain accessible and usable for future research.”
RCUK Common Principles on Data Policy www.rcuk.ac.uk/research/datapolicy
Mechanisms for sharing research data
Deposit your selected data with an external service
Transfer your selected data to the University Research Data York service
• a funder data archive/repository
• a subject data archive/repository
• a publisher data archive/repository
Use www.re3data.org to identify a suitable data archive or repository for your data
+ Record the dataset in PURE: www.york.ac.uk/library/info-for/researchers/data/guidance/pure-datasets
Transfer your selected data to the University Research Data York service
• We’ll need some descriptive metadata (PURE)
• We will store and manage access to your data for a minimum of 10 years
– a CC-BY licence is applied to open data
– data with restricted access.
Depositing your data
www.york.ac.uk/library/info-for/researchers/data/sharing/#tab-4
Phone, tablet or laptop out please!
Go to kahoot.it
Data management planning
Plan ahead to succeed in data management
Data management plans
Create a Data Management Plan (DMP)
Tools:
• DMPonline
• York DMP template
A formal document which outlines all aspects of your data management, i.e. what you will do with your data during your research project and after it ends.
To include:
• Description of the data
• Data collection methods
• Ethics and IPR
• Plans for data sharing
• Strategy for long-term preservation.
Required by most funders
AHRC “A Technical Plan should be no more than four pages long and provided for all applications where digital outputs or digital technologies are an essential part to the planned research outcomes.”
“All applications seeking research grant funding from BBSRC must submit a data management plan.”
“ESRC applicants who plan to generate data from their research must submit a data management plan…”
www.dcc.ac.uk/resources/data-management-plans/funders-requirements
DMPonline
https://dmponline.dcc.ac.uk
An online tool, created by the Digital Curation Centre, which is designed to help you create personalised data management plans according to the requirements stipulated by the major UK funders.
York DMP template for postgraduate research projects www.york.ac.uk/library/info-for/researchers/data/planning
Data management plans
‘In preparing for battle, I have always found that plans are useless but planning is indispensable.’
Dwight D. Eisenhower
Day-to-day data management
Organising your data (good file management)
‘What a mess’ by .pst, via Flickr: www.flickr.com/photos/psteichen/3915657914
Can you find what you need, when you need it?
In practice
Specific techniques for organising your research data, including developing plans for:
• folder structures - where to put data so you won’t lose it
• file (and folder) naming - what to call data so you know what it is
• version control - keeping track of data
Good file management practices are required to enable you to identify, locate and use your research data files now and into the future.
How not to do it...
Can you spot any problems and issues with how these files and directories are named?
This would be better...
• Use directories to help categorise data
• Make use of the folder hierarchy
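One way to make the folder hierarchy consistent from the start is to create it with a short script. The sketch below is illustrative only: the folder names in PROJECT_LAYOUT and the function name create_project_folders are assumptions, not a University standard, so adapt them to your own project.

```python
from pathlib import Path

# Hypothetical layout; change the folder names to suit your own project.
PROJECT_LAYOUT = [
    "data/raw",
    "data/processed",
    "documentation",
    "outputs",
]


def create_project_folders(root, layout=PROJECT_LAYOUT):
    """Create each folder in `layout` beneath `root` and return the paths."""
    root = Path(root)
    created = []
    for relative in layout:
        folder = root / relative
        # parents=True builds intermediate folders;
        # exist_ok=True makes it safe to run the script again.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

For example, create_project_folders("my_project") would create my_project/data/raw, my_project/data/processed and so on. Keeping raw and processed data in separate folders is a common convention rather than a requirement.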
Version control can get messy
Version control guidance
File and folder names should be...
• Concise and meaningful
• Descriptive
• Consistent
Also think about...
• How you want your files to order
• Version control
• Using standard alphanumeric characters – avoid punctuation, spaces etc.
• Lower/upper/proper case
Happiness is:
“knowing what your file is before you double click on it”
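The naming guidance above can be sketched as a small helper that builds file names in one consistent pattern. The pattern used here (date, description, version) and the function name make_file_name are assumptions chosen for illustration; your project may prefer a different order or different elements.

```python
import re
from datetime import date


def make_file_name(description, created, version, extension):
    """Build a name such as 2017-02-14_interview-transcript_v02.txt."""
    # Lower-case the description and replace every run of characters other
    # than a-z and 0-9 (spaces, punctuation, accented letters) with a hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    if not slug:
        raise ValueError("description must contain a letter or digit")
    # ISO 8601 dates (YYYY-MM-DD) sort chronologically in a file listing;
    # zero-padded version numbers keep v02 ahead of v10.
    suffix = extension.lower().lstrip(".")
    return f"{created.isoformat()}_{slug}_v{version:02d}.{suffix}"


print(make_file_name("Interview transcript (final!)", date(2017, 2, 14), 2, "TXT"))
# prints 2017-02-14_interview-transcript-final_v02.txt
```

Putting the date first means files order by date; put the description first if you would rather they order by subject.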
Day-to-day data management
Storing your data (keeping your data safe)
http://blogs.ch.cam.ac.uk/pmr/2011/08/01/why-you-need-a-data-management-plan
Storage don'ts
Don’t
• keep your data just on your working machine (a laptop or desktop) – it’s the perfect way to lose your data easily and permanently
• upload personal/sensitive data to services the University does not have a contract with, e.g. Dropbox - this would breach the Data Protection Act
Storage do’s
The University offers a range of facilities to securely store your data, helping it live a long and useful life.
The University recommends:
University filestore/s (individual or shared)
• required where storage in the UK must be guaranteed
• required for data that “must be kept on site”
Note: You can access it and work on your files off-campus via the VPN
www.york.ac.uk/it-services/filestore
University (york.ac.uk) Google Drive
• does not guarantee UK location, but does comply with UK & EU data protection legislation
• Google Apps legal information: www.york.ac.uk/it-services/google/policy/terms
www.york.ac.uk/it-services/services/drive
University filestore has the advantage of being automatically/regularly backed up
If you are not using it … backing up should be an automatic part of your everyday research practice.
Imagine if a fire or similar disaster happened here.
How much would it cost you?
Mountbatten Building, So’ton Uni.
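If you are not using the University filestore, a scripted backup is one way to make backing up routine. The following is a minimal sketch, not a University-provided tool: the function names back_up and sha256_of are hypothetical, and it assumes the backup location is on a different device from your working copy and is not inside the folder being backed up.

```python
import hashlib
import shutil
from datetime import datetime
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source, backup_root):
    """Copy `source` into a new timestamped folder under `backup_root`."""
    source = Path(source)
    stamp = datetime.now().strftime("%Y%m%dT%H%M%S")
    target = Path(backup_root) / f"{source.name}_{stamp}"
    # copytree refuses to overwrite an existing folder, so an earlier
    # backup is never silently replaced.
    shutil.copytree(source, target)
    # Compare checksums to confirm every file was copied intact.
    for original in source.rglob("*"):
        if original.is_file():
            copy = target / original.relative_to(source)
            if sha256_of(original) != sha256_of(copy):
                raise OSError(f"backup copy differs from original: {copy}")
    return target
```

Each run produces a separate dated copy, so older backups will need clearing out from time to time. For sensitive or confidential data, the backup location must meet the same requirements as the original storage.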
Storage do’s
Protecting confidential information
The University requires that any device (laptops, tablets, phones, your email) that holds sensitive or confidential information is encrypted.
IT Services offer guidance and support to help you protect your confidential data, whether it's files that you need to share securely or a device that requires encryption.
www.york.ac.uk/it-services/security/encryption
Familiarise yourself with the University's
• Information Security Policy
• Data Protection Policy
• Records Management Policy
Information Policy & You http://bit.ly/1lKWnwr
Storing physical data guidance
A data management horror story
Highlights the things that can go wrong if you don’t manage and share your data well.
Part 1: I can’t find the data … where is it stored?
Video by NYU Health Sciences Libraries Part 1: Request for data https://youtu.be/RVZbk3GEVSw
Storing your data
Individual task. Refer back to the research data you listed for Q1a. and start to record how you will keep your data safe on your DMP.
DMP template
Q2a. Where will you store your data?
• Is it digital or physical (e.g. print) data?
• Is it sensitive or confidential data?
• Are you collaborating/working with others?
• Are you working off campus at all?
Q2b. How will you back up your (digital) data?
You have … 5 minutes (and take a break)
Day-to-day data management
Choosing the right file formats
File formats: during your research
• During your research you need a file format:
– That you can work with/collect/create/analyse etc
• Can depend on hardware and software
– That fits with your research methodology and workflows
File formats: after your research
• After your research is complete you need a file format:
– That is easier to share and re-use
– That may last longer into the future
• Open specifications
• Widely used formats
• Uncompressed
• ASCII formats
• Exchange formats
– Does the software you are using have an option to export into a more suitable format for sharing or long term re-use?
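Where your software has no export option, tabular data can often be written out to an open, plain-text format with a few lines of code. The sketch below is an assumption-laden example rather than a recommended workflow: it supposes your data is already available as a list of records, and the function name export_to_csv and the sample column names are hypothetical.

```python
import csv


def export_to_csv(records, path):
    """Write a list of dictionaries to a UTF-8 CSV file with a header row."""
    if not records:
        raise ValueError("no records to export")
    # Column names are taken from the first record.
    fieldnames = list(records[0].keys())
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(records)
    return path
```

For example, export_to_csv([{"site": "York", "depth_cm": 12}], "samples.csv") would write a two-line file that any spreadsheet or statistics package can open. CSV keeps the values but not formulas, formatting or multiple sheets, so record anything that is lost in your documentation.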
A data management horror story … continued
Part 2: File formats … what software do I need?
Video by NYU Health Sciences Libraries
Part 2: File formats https://youtu.be/RtSv0gSbCP8
Your file formats
Individual task. Refer back to the research data you listed for Q1a. on your DMP and consider:
DMP template
Q1b.
• What file formats will you use to collect, create and analyse your data?
• What file formats will you use to share your data and keep it for the longer term?
You have … 5 minutes
Day-to-day data management
Documentation & metadata (describing your data)
Documentation and metadata
• Documentation is the contextual information required to make data intelligible and aid interpretation
• A users’ guide to your data
• Metadata is similar, but usually more structured
– Can conform to set standards
– Sometimes machine readable
Why do we need documentation and metadata?
• So you can understand it
• So other people can understand it
• So your findings are verifiable
• So others understand your methodology
• So others can repeat your methods
• To make it reusable
Will someone else understand your data if it isn’t documented?
"The single most useful thing you can do to ensure the long-term preservation of your data is to plan for it to be re-used. Imagining it being reused by someone else who has never met you and who never will meet you, will cause you to approach the creation and design of your data in a new light. ..... In short, always plan for re-use"
Professor Julian D. Richards, Director, Archaeology Data Service, University of York
A data management horror story … continued
Part 3: I can now access the data … but what do the column names mean?
Video by NYU Health Sciences Libraries
Part 3: Documentation https://youtu.be/-MIH8PkuUo4
Documentation
In your group, look at the sample data sheet. Imagine you have just downloaded this dataset from an archive. Discuss:
• What contextual or explanatory information is missing?
– anything odd about the data that needs clarifying?
• What additional documentation would you like to see supplied?
– about specific items of information recorded here
– about the data collection as a whole
You have … 5 minutes, then feedback
Documentation – what to include
• Who created it, when and why
• Description of the item
• Methodology and methods
• Units of measurement
• Definitions of jargon, acronyms and code
• References to related data
www.texample.net
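The items to include in documentation can also be captured in a structured, machine-readable file kept alongside the data. This is a sketch under stated assumptions: the field names in REQUIRED_FIELDS are a hypothetical mapping of the points listed on this slide, not a formal metadata standard, and the function names are invented for illustration.

```python
import json

# Hypothetical field names based on the "what to include" list;
# a discipline-specific metadata standard may define its own.
REQUIRED_FIELDS = [
    "creator",
    "date_created",
    "purpose",
    "description",
    "methodology",
    "units",
    "definitions",
    "related_data",
]


def missing_fields(record):
    """Return the required fields that are absent or empty in `record`."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


def write_documentation(record, path):
    """Save `record` as a JSON file, refusing if any field is missing."""
    missing = missing_fields(record)
    if missing:
        raise ValueError("documentation is missing: " + ", ".join(missing))
    with open(path, "w", encoding="utf-8") as handle:
        # ensure_ascii=False keeps accented characters readable in the file.
        json.dump(record, handle, indent=2, ensure_ascii=False)
    return path
```

Checking for missing fields before saving is a design choice intended to prompt you to fill gaps while you still remember the answers. A plain-text README covering the same points serves the same purpose for human readers.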
Your documentation
Individual task. Refer back to the research data you listed for Q1a. on your DMP and start to record:
DMP template
Q2g.
• What documentation will you need to produce to enable yourself and others working in your discipline to understand it in the future?
• What format will it be in?
You have … 5 minutes
What happens at the end of the project?
Keeping selected data
Keeping selected data
Why not keep everything?
Data appraisal for your project
Just because you can preserve, doesn’t always mean you should.
• costs of preserving data (time, technology, space, maintenance)
• risks in keeping things
e.g. Freedom of Information Act (FOI) requests
Appraisal is a concept familiar to archivists:
“the process of evaluating records to determine which are to be retained as archives, which are to be kept for specified periods and which are to be destroyed”
Ellis, J. (ed.) (1993) Keeping Archives. 2nd ed. Melbourne: Australian Society of Archivists, p.461.
PrePARe checklist: http://find.jorum.ac.uk/resources/10949/17171
Before your project ends …
• What data should you keep (and share)?
– University Policy, funder and publisher requirements
• What data must not be kept (and shared)?
– for ethical, legal or commercial reasons
www.york.ac.uk/library/info-for/researchers/data/sharing
What happens at the end of the project?
Sharing selected data
Why share data?
Validation of research is important
http://inkouper.blogspot.co.uk/2015/09/valuable-lessons-from-sharing-and-non.html
Data sharing – concerns
• Ethical concerns
– Confidential or sensitive data
• Legal concerns
– Third party data
• Professional concerns
– Intended publication
– Commercial issues (e.g. patent protection)
• Redact or embargo if there is good reason
• Planning ahead can reduce difficulties
Data sharing – concerns
Mechanisms for sharing
Remember…
Two options are available to you.
Deposit your selected data of long-term value:
• with external services, i.e. a funder/subject/publisher repository
• with University services (contact us).
Sharing your data
Individual task. Refer back to the research data you listed for Q1a. and start to record:
DMP template
Q4a.
• What data should or shouldn’t be shared openly and why?
• Is there anything you need to do to enable you to share your data?
Q4c.
• How do you intend to share your data?
You have … 5 minutes
Further information and resources
RDM resources
• RDM web pages: www.york.ac.uk/rdm
• Research Data MANTRA: http://datalib.edina.ac.uk
• RET courses: www.skillsforge.york.ac.uk
– Data Protection
– Integrity and Ethics
– Know your (Copy)Rights: protecting your own work and re-using other people’s
What you need to do
Before your project
Plan your data management
1. Write a funder data management plan OR write a (York) DMP
2. Pull together all eligible costs
During your project
Look after your live data
3. Update the DMP
4. Organise the data
5. Store the data
6. Describe the data
7. Decide what data to keep
End of your project
Start sharing and depositing data to meet University/funder requirements
8. Deposit (and share*) the data
9. Obtain a DOI
10. Include a data access statement in publications
* Decide if data can be shared openly or if access to the data needs to be restricted.
A final word from you?
Questions?
Comments
• things you’ve learnt
• what you need to investigate further
This is just the beginning …