Introduction to Research Data Management: activities, roles and requirements

… because good research needs good data. Introduction to Research Data Management: activities, roles and requirements. 11th DCC Regional Roadshow, London, 22 May 2012. Michael Day, Digital Curation Centre, UKOLN, University of Bath, [email protected]. This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 2.5 UK: Scotland License. To view a copy of this license, (a) visit http://creativecommons.org/licenses/by-nc-sa/2.5/scotland/ ; or, (b) send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.

description

Slides from a presentation given at the 11th Digital Curation Centre Data Management Roadshow, Imperial College London, London, UK, 22 May 2012

Transcript of Introduction to Research Data Management: activities, roles and requirements

Page 1: Introduction to Research Data Management: activities, roles and requirements

… because good research needs good data

Introduction to Research Data Management: activities, roles and requirements

11th DCC Regional Roadshow, London, 22 May 2012

Michael Day
Digital Curation Centre
UKOLN, University of Bath
[email protected]

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 2.5 UK: Scotland License. To view a copy of this license, (a) visit http://creativecommons.org/licenses/by-nc-sa/2.5/scotland/ ; or, (b) send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.

Given the audience, I’ll reflect on two pieces of DCC work: the DAF tool, which has been used primarily by service providers or intermediaries to investigate what’s happening in terms of data management at the coalface and to explore service gaps to see what support researchers need; and research funders’ policies, specifically in terms of data management and sharing plan requirements, as these are directly relevant to researchers.

Page 2: Introduction to Research Data Management: activities, roles and requirements

Outline

• The researcher perspective
  • Codes of Practice
  • Research funding bodies
• The institutional perspective
  • Activities, roles and requirements

Page 3: Introduction to Research Data Management: activities, roles and requirements

The researcher perspective

• Managing and sharing data is simply part of good research:
  • Adhering to disciplinary and/or institutional codes of practice and policies
  • Has been practiced since the advent of modern science, but not always consistently; data intensive research makes it even more critical
• Meeting the specific requirements of funding bodies
• Reputational risks if data management is not handled properly

Page 4: Introduction to Research Data Management: activities, roles and requirements

Research codes of practice (1)

• UK Research Integrity Office Code of Practice for Research (2009)
  • Data management planning is an essential part of research design
  • Organisations should have in place procedures, resources (including physical space) and administrative support to assist researchers in the accurate and efficient collection of data and its storage in a secure and accessible form [3.12.5]

Page 5: Introduction to Research Data Management: activities, roles and requirements

Research codes of practice (2)

• RCUK Code of Conduct on the Governance of Good Research Conduct (2011)
  • Primary data and research evidence [should be made] accessible to others for reasonable periods after the completion of the research: data should normally be preserved and accessible for 10 yrs (in some cases 20 yrs or longer)
  • Responsibility for proper management and preservation of data and primary materials is shared between the researcher and the research organisation [although deposit within national collections is endorsed]

Page 6: Introduction to Research Data Management: activities, roles and requirements

Research funding bodies

• UK Research Councils
  • Help fund some data archives, e.g.:
    • Archaeology Data Service, European Bioinformatics Institute, the NERC data centres, UK Data Archive
  • Support for JISC (and DCC)
  • RCUK Common Principles on Data Policy
    • Recognises that data are a critical output of the research process

http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx

Page 7: Introduction to Research Data Management: activities, roles and requirements

RCUK Principles (in a nutshell)

• Publicly funded research data should be made openly available
• Data with acknowledged long-term value should be preserved and remain accessible and usable for future research
• Sufficient metadata should be recorded to enable other researchers to find and understand the research to enable re-use; published results should always include information on how to access the supporting data
• Recognition that there may be legal, ethical and commercial constraints
• Recognition that researchers may need privileged use of data for a limited period
• All users of research data should acknowledge their sources
• Appropriate to use public funds to support MRD

Page 8: Introduction to Research Data Management: activities, roles and requirements

EPSRC expectations

• Roadmap approved May 2012; compliance by May 2015
  • Appropriate metadata (including unique IDs) to be made freely available on the Internet within 12 months of data generation
  • Data not generated in digital format should be stored in a manner to facilitate it being shared
  • Data should be securely preserved for a minimum of 10 years after privileged access expires or the last date access was requested by a third party
  • Adequate resources from existing funding streams
  • EPSRC will monitor progress and compliance, and reserves the right to impose appropriate sanctions
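
The two time limits on the EPSRC slide (metadata within 12 months of data generation; preservation for at least 10 years) can be turned into dates with a little arithmetic. The Python sketch below is only an illustration, not an official EPSRC calculation. It assumes that "or" on the slide means "whichever date is later", and the function names (add_years, metadata_deadline, earliest_disposal_date) are hypothetical names chosen for this example.

```python
from datetime import date
from typing import Optional

RETENTION_YEARS = 10  # minimum preservation period stated on the slide
METADATA_YEARS = 1    # "within 12 months of data generation"


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap target year.
        return d.replace(year=d.year + years, day=28)


def metadata_deadline(data_generated: date) -> date:
    """Latest date by which metadata should be freely available online."""
    return add_years(data_generated, METADATA_YEARS)


def earliest_disposal_date(privileged_access_ends: date,
                           last_third_party_request: Optional[date] = None) -> date:
    """Earliest date on which the 10-year minimum would have elapsed.

    Assumption: the clock starts from whichever is later, the end of
    privileged access or the most recent third-party access request.
    """
    anchor = privileged_access_ends
    if last_third_party_request is not None and last_third_party_request > anchor:
        anchor = last_third_party_request
    return add_years(anchor, RETENTION_YEARS)
```

Under this reading, each new third-party request pushes the earliest disposal date forward, so frequently requested data would in practice be kept indefinitely.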

Page 9: Introduction to Research Data Management: activities, roles and requirements

Implications for researchers

• Increasing number of research councils and funding bodies with data management and sharing requirements
• Potential loss of research income if these mandates are not met
• Need to determine the costs associated with short and longer-term management and curation and to request funds as part of grant
• Responsibility for infrastructure shifting more to HEIs and less to centralised data archives, but institutional infrastructures and services are still emerging
• Need guidance - some good external support
• But also need more local support; often fragmented (need to draw upon existing channels within your institution wherever possible)

Page 10: Introduction to Research Data Management: activities, roles and requirements

Institutional drivers

• Safeguarding research integrity
• Increasing number of FOI requests for data
• Adhering to existing codes of research practice and ethics
• Developing new institution-wide strategies, policies and services for data storage and management
• Increased institutional focus on research management (e.g., in response to REF)
• Benchmarking – self-assessing infrastructure and planning for improvement
• More demands but fewer resources to work with

Page 11: Introduction to Research Data Management: activities, roles and requirements

Activities, roles, requirements (1)

• Requirements gathering
  • Identifying researchers’ data requirements
  • Developing a shared understanding of what needs to be done (e.g., identifying where data exist, its form and scale, any existing retention requirements)
  • Identifying good practice within the institution (and the opposite)
  • Methods: surveys, focus groups, case studies, joint R&D projects, assessment tools (e.g. DAF)

Page 12: Introduction to Research Data Management: activities, roles and requirements

Activities, roles, requirements (2)

• Identifying motivations and benefits
  • For researchers, support services, the institution
• Identifying risks
  • Data loss (institution, research group, individual)
  • Increased costs (lack of planning, service inefficiency, data loss)
  • Legal compliance (research funder, H&S, ethics, FoI)
  • Reputation (institution, unit, individual)
• Identifying costs
  • Keeping Research Data Safe (KRDS) toolkit

Page 13: Introduction to Research Data Management: activities, roles and requirements

Activities, roles, requirements (3)

• Assessing institutional preparedness
  • Identifying institutional stakeholders, existing data support services, gaps
  • Benchmarking and planning for the future
  • Skills audit
  • CARDIO tool
• Policy development
  • Policies – approval by senior management is just the start; policies need to be embedded in research practice and responsive to changing requirements
• Data management planning
  • DMP online, DCC How-to Develop a Data Management Plan guide

Page 14: Introduction to Research Data Management: activities, roles and requirements

Activities, roles, requirements (4)

• Implementation and service development
  • Integrating where possible with existing services, e.g. IR, CRIS, VRE, HPC, cloud services, social media, etc.
  • Appraisal, deciding what needs to be kept and for how long
  • Storage choices – no one-size-fits-all solution, e.g. Bristol’s BluePeta petascale storage facility, Bath’s X-Drive approach, cloud approaches
  • Data documentation and metadata – layered approaches: top-level discovery (core metadata, collection/experiment-level?), role of standards like DCMI, CERIF, DDI, etc.
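
The "layered" metadata idea on this slide could be pictured roughly as follows. This is a minimal Python sketch under stated assumptions, not a DCC or DCMI schema: the class name DiscoveryRecord, the helper missing_core_fields and the choice of required fields are all hypothetical. The field names are borrowed from Dublin Core element names (identifier, title, creator, date, rights), and the second layer is left as a free-form dictionary because the slide leaves collection- or experiment-level detail open.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DiscoveryRecord:
    """Top layer: a small core record used only for discovery."""
    identifier: str
    title: str
    creator: List[str]
    date: str
    rights: str = ""
    # Second layer: collection- or experiment-level detail; schema left open.
    detail: Dict[str, str] = field(default_factory=dict)


def missing_core_fields(record: DiscoveryRecord) -> List[str]:
    """List the core discovery fields that are empty in this record."""
    required = ("identifier", "title", "creator", "date")
    return [name for name in required if not getattr(record, name)]
```

A catalogue could accept any record for which missing_core_fields returns an empty list, while leaving the richer second layer to discipline-specific standards.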

Page 15: Introduction to Research Data Management: activities, roles and requirements

Activities, roles, requirements (5)

• Data issues:
  • Appraisal: selection criteria, retention periods (who decides?)
    • DCC How to appraise and select research data for curation guide
  • Documentation: metadata, schema, semantics
  • Formats: proprietary formats, community standards, etc.
  • Provenance and authenticity
  • Citation (assignment of persistent IDs?)
  • Access (embargo policies?)
  • Licensing
    • DCC How to license research data guide

Page 16: Introduction to Research Data Management: activities, roles and requirements

Thank you. Any questions?

Michael Day
Digital Curation Centre
UKOLN, University of Bath
[email protected]