Research Data Management at University of Hertfordshire...

Research Data Management at University of Hertfordshire Liz Nolan, Bill Worthington, 11 June 2013, ARMA 2013 Conference, Nottingham

Aims of today’s session

•  To look at one institution’s experiences in developing practices and procedures in the management of research data

•  To review our Data Management ‘journey’ - what we have learned and done so far

•  To consider what more needs to be done

•  Open discussion: what is best practice?

Research at the University of Hertfordshire (UH)

•  Post-1992 university

•  58th in RAE2008 (+35 places)

•  Centralised Research Grants Office

•  Research from 10 Schools, belonging to 3 Research Institutes

•  300-350 bids per year; 100-125 awards

•  ~500 active researchers (130 dedicated research staff)

•  £8-10m research income

•  25% RCUK, 38% EU, 2% Charities, 35% Other

Drivers for change

2010:

•  Increased requests to the Research Grants Office for help with Data Sharing Plans, Technical Appendices and Data Management Plans

•  Demand for more storage space

•  Research Information Network event

First steps

•  RGO plea to Information Hertfordshire for help

•  Clear procedure agreed for assistance with Data Management Plans

•  Now use DCC DMPonline: https://dmponline.dcc.ac.uk/

Data Management Policies

•  Review of UH Data Management Policy

•  New appendix - University Guide to Research Data Management

•  http://sitem.herts.ac.uk/secreg/upr/IM12.htm

Bids to JISCMRD

•  JISC Managing Research Data Programme 2011-13

•  Bid written by IH and RGO, submitted in July 2011

•  First project: ‘Service Oriented Toolkit for Research Data Management’

•  Second, smaller bid Spring 2012, also successful

•  Second project: Research Data Management Training in Physics and Astronomy

•  Pro Vice-Chancellor (Research) chairs the Steering Group

•  ~£225k + ~£60k from JISC, matched by £300k UH investment

Why RDM?

essentially: to get better value from research

- look after working data better and more efficiently

- publish and re-use data

Why RDM? National Policy Context

•  Research data generated by publicly-funded research is seen as a public good and should be available for verification and re-use {4}

•  All UK Research Councils require their grant holders to manage and retain their research data for re-use, unless there are specific and valid reasons not to do so {5}

•  Example: By 2015, EPSRC require all data which underpins publications arising from their funded work to be made publicly available.

{4} http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx

{5} http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies

Why RDM? Personal benefit

•  Appropriate data management planning will be necessary to attract funding

•  RDM best practice protects against data loss and damage, and against personal/corporate reputational damage

•  RDM best practice will save you time, inevitable inconvenience and money

•  Some of the hidden costs of RDM may be transferred to central services

•  Published data will attract citations in their own right, and credit in assessment exercises

Two tasks for the UH RDM team

•  Advocacy

•  Making life easier

Where to start? Audit

How much data, how kept, how shared, how organised, and what level of awareness of RDM?

•  ~600 staff invited, 12% return

•  scaling the responses led to an estimate of 2 PB (1 PB = 1 million GB)

•  20x more than central resources

•  80-90% in the hands of well-resourced STEM research groups

•  remaining 200-400 TB held by non-technical researchers

•  most data held on workstations and laptops, and local ad hoc storage

•  significant use of insecure media, mostly USB sticks

•  significant use of unregulated, ‘free’ cloud services, particularly Dropbox

•  data hoarding, collaboration only between trusted peers, possessive culture in many areas

Pockets of good RDM practice, mostly good intentions, but much risk

Audits across the sector agree
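The audit figures above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the numbers quoted on the slide, assumes decimal units (1 PB = 1,000 TB), and the function names are hypothetical. It does not reproduce how the audit team actually scaled the survey responses, which the slides do not describe.

```python
# Figures quoted on the audit slide; helper names are illustrative only.
INVITED_STAFF = 600        # "~600 staff invited"
RETURN_RATE_PCT = 12       # "12% return"
ESTIMATE_TB = 2 * 1000     # 2 PB, assuming 1 PB = 1,000 TB = 1,000,000 GB
CENTRAL_RATIO = 20         # estimate is "20x more than central resources"
STEM_SHARE_PCT = (80, 90)  # "80-90%" held by STEM research groups


def respondents(invited: int, return_rate_pct: int) -> int:
    """Approximate number of survey returns."""
    return invited * return_rate_pct // 100


def implied_central_tb(estimate_tb: int, ratio: int) -> int:
    """Central storage implied by 'estimate is ratio times central resources'."""
    return estimate_tb // ratio


def non_stem_range_tb(estimate_tb: int, stem_share_pct: tuple) -> tuple:
    """Range of data (TB) left outside the STEM groups."""
    low_pct, high_pct = stem_share_pct
    return (estimate_tb * (100 - high_pct) // 100,
            estimate_tb * (100 - low_pct) // 100)


if __name__ == "__main__":
    print(respondents(INVITED_STAFF, RETURN_RATE_PCT))         # 72
    print(implied_central_tb(ESTIMATE_TB, CENTRAL_RATIO))      # 100
    print(non_stem_range_tb(ESTIMATE_TB, STEM_SHARE_PCT))      # (200, 400)
```

On these assumptions the slide is internally consistent: about 72 returns, roughly 100 TB of central storage, and 200-400 TB outside the STEM groups.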

Where to start? Gap analysis

•  lack of recognition of data as a career asset

•  lack of awareness about university services for data storage, backup and sharing

•  lack of trust in central services

•  need for more flexible facilities for collaborative sharing of working data

•  need for facilities for long term data preservation and re-use

•  need for training and advice

Help was needed for the whole project lifecycle, from data management planning, to safekeeping and collaborative working with data, to curation and arrangements for data re-use.

Data Management Planning

Most major funding bodies expect a Data Management Plan (DMP); some require it

•  UH stipulate use of https://dmponline.dcc.ac.uk

•  Shortest path to a robust, well-argued DMP, packaged as a PDF

•  Researchers: “too many questions, too difficult, not relevant”

•  UH DMPonline template produced to fill in many of the blanks

•  RGO + 3 x 0.5 FTE RDM champions deployed to research institutes

Training

Second JISCMRD project to develop a short course in RDM

•  project won by addressing a gap in the market: physical sciences

•  modular materials, starting with the generic, moving to subject specific

•  developed ‘in situ’ by delivery and feedback on existing early career CPD and postgraduate training programmes

•  integrated with new RDM microsite

•  materials, speakers' notes, activities and presentation planning matrix will be deposited with jorum.ac.uk

Ongoing engagement

Two-way street: there is learning and knowledge transfer at each encounter

•  lunchtime seminars, workshops, research group meetings, research management meetings, individual consultations, hijacked conversations

•  parts of: me, two technical consultants, two project officers, two RDM champions, two research support librarians, the CTO, the CIO

•  amounting to 4 to 5 FTE in the latter half of the projects

•  gradually, by helping people and being persistent, the messages get through and we see new demands for assistance

Interventions

We have worked at all levels:

•  with individual researchers and small problems

•  with our own service providers and local systems

•  with other Universities

•  with JANET

We have been good at bothering people, but always with an offer of help or a constructive demand.

Practical measures: encryption demystified

Problem: Researcher in health needed to deliver data to a funder in a ‘secured package’

•  solved using TrueCrypt (open source, cross-platform)

•  indicative of widespread difficulty in using desktop encryption

•  could also mitigate risk of inappropriate access to lost USB sticks

Outcome: guide and regular workshop, 60+ people trained so far

Making better use of what we have

Networked Storage is unloved and underused

•  Researchers think it is not enough, too slow, too difficult to use

•  Mostly this is down to poor documentation and training

Document Management System not used at all by researchers

•  Similar to MS SharePoint, appropriate where high standard of retention and reporting is necessary, or versioning of data is needed

•  Hitherto used for conducting University business, not offered to research

Outcome: new ‘request research storage’ offer – with workflow to decide which resource to use and assistance in gaining access for external collaborators

Infrastructure strategy

Develop a hybrid cloud: our datacentres + elastic commercial services

•  move data to where it needs to be in terms of speed and performance

•  keep working data nearby, much of the rest can be offloaded to the cloud

•  demote infrequently used data to lower performance storage

•  share in the cloud

•  backup to the cloud

•  data archive in the cloud

Outcome: second tier of storage and active file management (next year), backup as a service, repository will use cloud-based tape archive
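As a rough illustration of what "demote infrequently used data to lower performance storage" could mean in practice, the sketch below lists files that have not been accessed for a given number of days. It is not the UH active file management system: the function name and the 180-day threshold are assumptions made for the example, and a real policy would also weigh size, ownership and retention rules.

```python
import os
import time
from pathlib import Path
from typing import Iterator, Optional

# Hypothetical policy threshold; the slides do not specify one.
DEMOTE_AFTER_DAYS = 180


def demotion_candidates(root: Path,
                        max_idle_days: int = DEMOTE_AFTER_DAYS,
                        now: Optional[float] = None) -> Iterator[Path]:
    """Yield files under root whose last access is older than max_idle_days."""
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * 86400  # seconds in a day
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            # st_atime is the last access time recorded by the filesystem.
            if path.stat().st_atime < cutoff:
                yield path


if __name__ == "__main__":
    for candidate in demotion_candidates(Path(".")):
        print(candidate)
```

Access times are only as reliable as the filesystem's settings (some volumes are mounted so that they are rarely updated), so a list like this is a starting point for a conversation with the data owner rather than a trigger for automatic moves.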

Lobby for new national services

JANET brokerage: make deals with global vendors on behalf of HE

•  brokerage focused on Infrastructure as a Service

•  JISCMRD voiced need for applications and vertical solution deals too

•  Storage, Backup, Repository – as a service

For example: Dropbox for teams, within our governance

Outcome: unknown as yet, but Microsoft, Google, Amazon and Dropbox are all engaged

Cost of robust data management

•  HE datacentres: estimates of between £400 and £800 per TB per year

•  Amazon EC2, RackSpace cloud files £800 - £1000 /TB/yr

•  Amazon Glacier, Arkivum A-stor, archival storage, £300 /TB/yr

•  HE datacentres <1% failure rate, cloud datacentres: virtually nil failure rate{7}

•  4x2TB hard disc array, two year warranty ~ £600 but very high failure rate

•  on desk costs escalate from ~£200 /TB/yr, to £1700/TB/yr for malfunction, to >£4000/TB/yr in the event of data loss (e.g. quarter person year of effort)

NOTE: RCUK will pay for robust working data management via grants, and data accepted in their own archives, but Institutions are expected to pay for long term preservation and access to the rest

{7} http://datapool.soton.ac.uk/2013/03/21/cost-benefit-analysis-experience-of-southampton-research-data-producers/
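To make the per-terabyte rates above easier to compare for a given project, the following sketch multiplies them out for a chosen data volume. The rates are those quoted on the slide; the option labels and the function name are illustrative, and the "data loss" figure is treated as a lower bound because the slide gives it as ">£4000".

```python
# Annual cost ranges in GBP per TB per year, as quoted on the slide.
# Option labels are illustrative; single figures are stored as (x, x).
COST_PER_TB_YEAR = {
    "HE datacentre": (400, 800),
    "cloud storage": (800, 1000),       # "Amazon EC2, RackSpace cloud files"
    "archival storage": (300, 300),     # "Amazon Glacier, Arkivum A-stor"
    "on desk, no problems": (200, 200),
    "on desk, malfunction": (1700, 1700),
    "on desk, data loss": (4000, 4000),  # slide says ">£4000": a lower bound
}


def annual_cost(volume_tb: int, option: str) -> tuple:
    """Return the (low, high) annual cost in GBP for storing volume_tb."""
    low, high = COST_PER_TB_YEAR[option]
    return (volume_tb * low, volume_tb * high)


if __name__ == "__main__":
    volume_tb = 10  # hypothetical project size
    for option in COST_PER_TB_YEAR:
        low, high = annual_cost(volume_tb, option)
        print(f"{option}: £{low:,} to £{high:,} per year")
```

For a hypothetical 10 TB project this gives £4,000 to £8,000 per year in an HE datacentre, against £2,000 on the desk when nothing goes wrong and at least £40,000 in a year when data is lost.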

Where are we @ 21 months

•  not as far along as we imagined or hoped

•  but way ahead of where we were

•  RDM microsite, advice, case studies, training materials - public soon

•  embedded in early career development

•  engaged in some way with ~200 researchers, >1/3 of our research actives

•  nearly overrun with requests for storage – the word is out

•  better infrastructure and new services on the way

•  most of the knowledge retained for life on our own

Lessons

•  RDM is complicated and has many unanswered questions

•  doing it right is expensive, doing it wrong or not at all will cost more

•  the technical landscape is not yet mature

•  technology, however, is the lesser barrier

•  cultural change is the more difficult hurdle to leap

•  researchers can be helped if you get amongst them

•  policy and theoretical benefit won’t work

•  data publication needs tangible reward on a par with traditional publication

RDM journey

Timeline: 2011 to 2014

At the start we were a source of new pain ...

then came our interventions

... right now we have momentum, and have won the odd heart and quite a few minds

but REF = institutional preoccupation, RDM distraction

space to continue to build on the JISCMRD legacy with more: infrastructure, training, advice (herts.ac.uk/rdm)

so that after the REF, when the problem really comes into focus, we will be well placed to meet it

JISCMRD projects @UH - Joining up the organisation

Research infrastructure is dispersed across UH - RDM is the new glue

RDM team

•  Senior Management: PVC Research Chair of Steering Group, CIO, PVCR on Research Committee

•  CIO, IT providers: Case for £nnnk capital spend

•  Principal Investigators: Directly assisted

•  Research Grants Office: Workflow, Seminars, Conferences

•  Research Leaders: RDM Champions recruitment, presentations

•  CPD and PG Training team: Workshops and training

•  EPSRC RDM roadmap, Open Access WG

more info

•  UH RDM projects blog: www.herts.ac.uk/research-data-toolkit

•  UH RDM microsite: www.herts.ac.uk/rdm (coming soon)

•  JISCMRD programme site: http://bit.ly/195tyST

•  Digital Curation Centre: www.dcc.ac.uk/resources

•  JISCMail lists: research-dataman@, jiscmrd@ (www.jiscmail.ac.uk)

•  twitter: #jiscmrd

contact

Liz Nolan, Manager, Research Grants Office [email protected]

Bill Worthington, RDM Projects Manager [email protected]

David Ford, Chief Technology Officer [email protected]