Michael Day JIBS-RLUK event July 2012


Introduction to Research Data Management by Michael Day (UKOLN). Presentation at Demystifying Research Data: don’t be scared, be prepared, a joint JIBS/RLUK event held on Tuesday 17 July 2012 at the Brunei Gallery, SOAS (School of Oriental and African Studies), London.

Transcript of Michael Day JIBS-RLUK event July 2012

Page 1: Michael Day JIBS-RLUK event July 2012

… because good research needs good data

Demystifying Research Data, JIBS/RLUK event, SOAS, London, 17 July 2012

Funded by:

Introduction to Research Data Management: activities, roles and requirements

Michael Day

Digital Curation Centre

UKOLN, University of Bath

[email protected]

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 2.5 UK: Scotland License. To view a copy of this license, (a) visit http://creativecommons.org/licenses/by-nc-sa/2.5/scotland/ ; or (b) send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.

Page 2: Michael Day JIBS-RLUK event July 2012

Outline

• Introduction
• The researcher perspective
  • Codes of Practice
  • Research funding bodies
• The institutional perspective
• Research lifecycles
  • Some lifecycle models
• The role of the library
• Activities, roles and requirements

Page 3: Michael Day JIBS-RLUK event July 2012

Why manage research data?

• Enable reuse
• Research integrity
• Research impact
  • Linking data and publication
  • Making data citable
• Regulatory requirements
• Controlling costs
• Maximising value

Page 4: Michael Day JIBS-RLUK event July 2012

Who are the main actors?

• Researchers - as creators and users
• Other data creators
• Other data (re)users
• Funding bodies
• Data Centres
• Computer science research
• Libraries
• Research support/grant offices
• Archivists/records managers

Page 5: Michael Day JIBS-RLUK event July 2012

What is required?

• Technical infrastructure
  • Storage (many options)
  • Tools
  • Discovery
  • Research Intelligence (RIM)
• Policy & commitment
• Human infrastructure
  • Researcher skills
  • Support services
  • Training

Page 6: Michael Day JIBS-RLUK event July 2012

Potential national-level actions

• Building dataset discovery
• Collecting data policies
• Liaise with other national & international actors
• Support uptake of cloud-based tools
• Exploit pool of data plans
• Collecting stories on data re-use
• Supporting effective citation, referencing, etc.
• Sharing good practice

Page 7: Michael Day JIBS-RLUK event July 2012

The researcher perspective

• Managing and sharing data is simply part of good research:
  • Adhering to disciplinary and/or institutional codes of practice and policies
  • Has been practised since the advent of modern science, but not always consistently; data-intensive research makes it even more critical
• Meeting the specific requirements of funding bodies
• Reputational risks if data management is not handled properly

Page 8: Michael Day JIBS-RLUK event July 2012

Research codes of practice (1)

• UK Research Integrity Office Code of Practice for Research (2009)

Data management planning is an essential part of research design

Organisations should have in place procedures, resources (including physical space) and administrative support to assist researchers in the accurate and efficient collection of data and its storage in a secure and accessible form [3.12.5]

Page 9: Michael Day JIBS-RLUK event July 2012

Research codes of practice (2)

• RCUK Code of Conduct on the Governance of Good Research Conduct (2011)

Primary data and research evidence [should be made] accessible to others for reasonable periods after the completion of the research: data should normally be preserved and accessible for 10 yrs (in some cases 20 yrs or longer)

Responsibility for proper management and preservation of data and primary materials is shared between the researcher and the research organisation [although deposit within national collections is endorsed]

Page 10: Michael Day JIBS-RLUK event July 2012

Research funding bodies

• UK Research Councils
  • Help fund some data archives, e.g.:
    • Archaeology Data Service, European Bioinformatics Institute, the NERC data centres, UK Data Archive
  • Support for JISC (and DCC)
  • RCUK Common Principles on Data Policy
    • Recognises that data are a critical output of the research process

http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx

Page 11: Michael Day JIBS-RLUK event July 2012

RCUK Principles (in a nutshell)

• Publicly funded research data should be made openly available
• Data with acknowledged long-term value should be preserved and remain accessible and usable for future research
• Sufficient metadata should be recorded to enable other researchers to find and understand the research to enable re-use; published results should always include information on how to access the supporting data
• Recognition that there may be legal, ethical and commercial constraints
• Recognition that researchers may need privileged use of data for a limited period
• All users of research data should acknowledge their sources
• Appropriate to use public funds to support the management of research data (MRD)

Page 12: Michael Day JIBS-RLUK event July 2012

EPSRC expectations

• Roadmap approved May 2012; compliance by May 2015

Appropriate metadata (including unique IDs) to be made freely available on the Internet within 12 months of data generation

Data not generated in digital format should be stored in a manner to facilitate it being shared

Data should be securely preserved for a minimum of 10 years after privileged access expires or the last date access was requested by a third party

Adequate resources from existing funding streams

EPSRC will monitor progress and compliance, and reserves the right to impose appropriate sanctions
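
The 10-year retention expectation above can be read as a simple date calculation. The following is a minimal sketch in Python, not an official EPSRC rule set: the function names, the use of whole calendar years, and the reading that the clock starts from whichever of the two dates is later are all assumptions made for illustration.

```python
from datetime import date
from typing import Optional

RETENTION_YEARS = 10  # minimum period stated on the slide


def add_years(d: date, years: int) -> date:
    """Shift a date by whole calendar years (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def retention_end(privileged_access_expires: date,
                  last_third_party_request: Optional[date] = None) -> date:
    """Earliest date on which the minimum retention period has elapsed.

    Assumption: if a third party has requested access after the privileged
    access period expired, the period is counted from that later request.
    """
    start = privileged_access_expires
    if last_third_party_request is not None and last_third_party_request > start:
        start = last_third_party_request
    return add_years(start, RETENTION_YEARS)


# Hypothetical dates, for illustration only.
print(retention_end(date(2013, 5, 1)))                     # 2023-05-01
print(retention_end(date(2013, 5, 1), date(2016, 2, 29)))  # 2026-02-28
```

Each new access request pushes the end date back, so in practice the date would need to be recalculated whenever a request is logged.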

Page 13: Michael Day JIBS-RLUK event July 2012

Implications for researchers

• Increasing number of research councils and funding bodies with data management and sharing requirements
• Potential loss of research income if these mandates are not met
• Need to determine the costs associated with short and longer-term management and curation and to request funds as part of grant
• Responsibility for infrastructure shifting more to HEIs and less to centralised data archives, but institutional infrastructures and services are still emerging
• Need guidance - some good external support
• But also need more local support; often fragmented (need to draw upon existing channels within your institution wherever possible)

Page 14: Michael Day JIBS-RLUK event July 2012

Institutional drivers

• Safeguarding research integrity
• Increasing number of FOI requests for data
• Adhering to existing codes of research practice and ethics
• Developing new institution-wide strategies, policies and services for data storage and management
• Increased institutional focus on research management (e.g., in response to REF)
• Benchmarking – self-assessing infrastructure and planning for improvement
• More demands but fewer resources to work with

Page 15: Michael Day JIBS-RLUK event July 2012

Institutional actors

• Researchers
  • Both as creators and users of data
  • PIs (e.g., have specific roles with respect to grants)
  • Computer scientists (informaticians, data scientists)
• Administration
  • Research support office (e.g., grants support, research information management)
  • Records managers, archivists, FOI office
• Central services
  • Computing services
  • Libraries (e.g., institutional repository)

Page 16: Michael Day JIBS-RLUK event July 2012

Research data lifecycles

Page 17: Michael Day JIBS-RLUK event July 2012

Page 18: Michael Day JIBS-RLUK event July 2012

(e)-Research Life Cycle view of Data Curation?

[Diagram. Stage labels, connected by arrows labelled "Data processing":]

• Formulate hypothesis / ideas, test, experiment, observe: data creation, collection & capture
• Adding value: data linking, annotation, visualisation, simulation
• (New) knowledge extraction: data mining, modelling, analysis, synthesis
• Scholarly communications: data disclosure, publication, citation, discovery, re-use
• Data management storage & validation: description, deposit, self-archiving, preservation, certification

[Other labels in the diagram: e-Infrastructure; Open access; Collaboration]

This work is licensed under a Creative Commons Attribution-ShareAlike 2.0 License • Liz Lyon, December 2005

Page 19: Michael Day JIBS-RLUK event July 2012

E-Science Curation Report - 2003

• E-science discipline
• Appropriate for current focus
• Takes integrated look at higher education data curation problems
• Granularity on curation activities?

Page 20: Michael Day JIBS-RLUK event July 2012

Open Archival Information System

Page 21: Michael Day JIBS-RLUK event July 2012

RDM at Oxford

Page 22: Michael Day JIBS-RLUK event July 2012

http://blogs.bath.ac.uk/research360/

Research360@Bath

• New institutional data scientist role

• Addresses EPSRC expectations (published)

• Doctoral Training Centre hubs

• Faculty-Industry focus

• Faculty cascade model

• Multi-team approach

Page 23: Michael Day JIBS-RLUK event July 2012

Page 24: Michael Day JIBS-RLUK event July 2012

Some library roles (in the lifecycle)

• Leadership – coordinate action
• Audit – who has what, where does it go?
• Advice on access – data, wherever it is
• Preservation (long-term access requirements)
• Citability
• Data/publication linking
• Promoting data in teaching
• Identifying skill gaps / CPD requirements

Page 25: Michael Day JIBS-RLUK event July 2012

Re-skilling for research (RLUK, 2012)

• Mary Auckland identified 9 key areas with skill gaps for subject librarians:
  • Ability to advise on preserving research outputs
  • Knowledge to advise on data management and curation, including ingest, discovery, access, dissemination, preservation, and portability
  • Knowledge to support researchers in complying with the various mandates of funders, including open access requirements
  • Knowledge to advise on potential data manipulation tools used in the discipline/subject
  • Knowledge to advise on data mining
  • Knowledge to advocate, and advise on, the use of metadata
  • Ability to advise on the preservation of project records, e.g. correspondence
  • Knowledge of sources of research funding to assist researchers to identify potential funders
  • Skills to develop metadata schema, and advise on discipline/subject standards and practices, for individual research projects

Page 26: Michael Day JIBS-RLUK event July 2012

http://www.dcc.ac.uk/

Understanding data requirements

Page 27: Michael Day JIBS-RLUK event July 2012

Data management planning

Page 28: Michael Day JIBS-RLUK event July 2012

Data registries

• Findable, citable data has value
  • Important to link publications to data (and vice versa)
  • Increases citations – of data & publication
  • Increases reuse (hence value)
  • But effects exist even without publication
  • All benefit – researcher; institution; publisher
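
To make "citable" concrete, a registry entry can be turned into a citation string. The sketch below is illustrative only: the function `format_data_citation` is hypothetical, the pattern (creator, year, title, publisher, identifier) is loosely modelled on common data citation guidance, and the sample values, including the identifier, are invented.

```python
def format_data_citation(creator: str, year: int, title: str,
                         publisher: str, identifier: str) -> str:
    """Build a one-line dataset citation from registry fields.

    The punctuation used here is an assumption, not a prescribed standard.
    """
    return f"{creator} ({year}): {title}. {publisher}. {identifier}"


# Hypothetical dataset; the DOI below does not refer to a real deposit.
citation = format_data_citation(
    creator="Researcher, A.",
    year=2012,
    title="Example survey dataset",
    publisher="Example University",
    identifier="doi:10.1234/example",
)
print(citation)
# Researcher, A. (2012): Example survey dataset. Example University. doi:10.1234/example
```

Including a persistent identifier in the string is what lets a publication point back to the data, and the data record point to the publication.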

Page 29: Michael Day JIBS-RLUK event July 2012

Tools to track impact http://total-impact.org/

Page 30: Michael Day JIBS-RLUK event July 2012

Activities, roles, requirements (1)

• Requirements gathering
  • Identifying researchers’ data requirements
  • Developing a shared understanding of what needs to be done (e.g., identifying where data exist, its form and scale, any existing retention requirements)
  • Identifying good practice within the institution (and the opposite)
  • Methods: surveys, focus groups, case studies, joint R&D projects, assessment tools (e.g. DAF)

Page 31: Michael Day JIBS-RLUK event July 2012

Activities, roles, requirements (2)

• Identifying motivations and benefits
  • For researchers, support services, the institution
• Identifying risks
  • Data loss (institution, research group, individual)
  • Increased costs (lack of planning, service inefficiency, data loss)
  • Legal compliance (research funder, H&S, ethics, FoI)
  • Reputation (institution, unit, individual)
• Identifying costs
  • Keeping Research Data Safe (KRDS) toolkit

Page 32: Michael Day JIBS-RLUK event July 2012

Activities, roles, requirements (3)

• Assessing institutional preparedness
  • Identifying institutional stakeholders, existing data support services, gaps
  • Benchmarking and planning for the future
  • Skills audit
  • CARDIO tool
• Policy development
  • Policies – approval by senior management is just the start; policies need to be embedded in research practice and responsive to changing requirements
• Data management planning
  • DMP online, DCC How-to Develop a Data Management Plan guide

Page 33: Michael Day JIBS-RLUK event July 2012

Activities, roles, requirements (4)

• Implementation and service development
  • Integrating where possible with existing services, e.g. IR, CRIS, VRE, HPC, cloud services, social media, etc.
  • Appraisal, deciding what needs to be kept and for how long
  • Storage choices – no one-size-fits-all solution, e.g. Bristol’s BluePeta petascale storage facility, Bath’s X-Drive approach, cloud approaches
  • Data documentation and metadata – layered approaches: top-level discovery (core metadata, collection/experiment-level?), role of standards like DCMI, CERIF, DDI, etc.
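
As an illustration of the "top-level discovery" layer, a core record might hold only a handful of fields. This is a minimal sketch, not a recommended schema: the field names borrow Dublin Core element names (title, creator, identifier, date, publisher, rights, description), while the class `DiscoveryRecord`, the choice of which fields count as core, and the sample values are assumptions made for this example.

```python
from dataclasses import dataclass, asdict
from typing import Dict, List

# Which fields are treated as "core" is an assumption made for this sketch.
REQUIRED_FIELDS = ("title", "creator", "identifier", "date", "publisher")


@dataclass
class DiscoveryRecord:
    """Collection-level metadata using Dublin Core-style element names."""
    title: str = ""
    creator: str = ""
    identifier: str = ""
    date: str = ""
    publisher: str = ""
    rights: str = ""
    description: str = ""

    def missing_core_fields(self) -> List[str]:
        """Return the names of core fields that are empty or blank."""
        values: Dict[str, str] = asdict(self)
        return [name for name in REQUIRED_FIELDS if not values[name].strip()]


# Hypothetical record with two core fields still to be filled in.
record = DiscoveryRecord(title="Example survey dataset",
                         creator="A. Researcher",
                         date="2012")
print(record.missing_core_fields())  # ['identifier', 'publisher']
```

Richer, discipline-specific documentation (for example DDI for social science data) would sit in a further layer beneath a core record of this kind.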

Page 34: Michael Day JIBS-RLUK event July 2012

Activities, roles, requirements (5)

• Data issues:
  • Appraisal: selection criteria, retention periods (who decides?)
    • DCC How to appraise and select research data for curation guide
  • Documentation: metadata, schema, semantics
  • Formats: proprietary formats, community standards, etc.
  • Provenance and authenticity
  • Citation (assignment of persistent IDs?)
  • Access (embargo policies?)
  • Licensing
    • DCC How to license research data guide

Page 35: Michael Day JIBS-RLUK event July 2012

Things to do …

• Create policy – collaborate with others
  • Growing number of policies being published (EPSRC, Wellcome Trust)
• Build on existing digital services
  • Examples: storage, data registry
• Learn about audit tools (DCC & others)
• Learn about data & sources
• Re-skill subject librarians
• Bridge between publishers & researchers

Page 36: Michael Day JIBS-RLUK event July 2012

What data to keep

DCC resources

http://www.dcc.ac.uk/resources

Page 37: Michael Day JIBS-RLUK event July 2012

Thank you. Any questions?

Michael Day

Digital Curation Centre

UKOLN, University of Bath

[email protected]

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 2.5 UK: Scotland License. To view a copy of this license, (a) visit http://creativecommons.org/licenses/by-nc-sa/2.5/scotland/ ; or (b) send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.