Research Data Management at University of Hertfordshire...
Research Data Management at University of Hertfordshire Liz Nolan, Bill Worthington, 11 June 2013, ARMA 2013 Conference, Nottingham
Aims of today’s session
• To look at one institution’s experiences in developing practices and procedures in the management of research data
• To review our Data Management ‘journey’ - what we have learned and done so far
• To consider what more needs to be done
• Open discussion on ‘what is best practice’?
Research at the University of Hertfordshire (UH)
• Post 1992 University
• 58th in RAE2008 (+35 places)
• Centralised Research Grants Office
• Research of 10 Schools belonging to 3 Research Institutes
• 300-350 bids per year; 100 - 125 awards
• ~500 active researchers (130 dedicated research staff)
• £8 -10m research income
• 25% RCUK, 38% EU, 2% Charities, 35% Other
Drivers for change
2010:
• Increased requests to Research Grants Office for help with Data Sharing Plans, Technical Appendices, Data Management Plans
• Demand for more storage space
• Research Information Network Event
First steps
• RGO plea to Information Hertfordshire for help
• Clear procedure agreed for assistance with Data Management Plans
• Now use DCC DMPonline: https://dmponline.dcc.ac.uk/
Data Management Policies
• Review of UH Data Management Policy
• New appendix - University Guide to Research Data Management
• http://sitem.herts.ac.uk/secreg/upr/IM12.htm
Bids to JISCMRD
• JISC Managing Research Data Programme 2011-13
• Bid written by IH and RGO, submitted in July 2011
• First project: ‘Service Oriented Toolkit for Research Data Management’
• Second, smaller bid Spring 2012, also successful
• Second project: Research Data Management Training in Physics and Astronomy
• Pro Vice-Chancellor (Research) Chair of Steering Group
• ~£225k + ~£60k from JISC, matched by £300k UH investment
Why RDM?
essentially: to get better value from research
- look after working data better and more efficiently
- publish and re-use data
Why RDM? National Policy Context
Research data generated by publicly-funded research is seen as a public good and should be available for verification and re-use {4}
All UK Research Councils require their grant holders to manage and retain their research data for re-use, unless there are specific and valid reasons not to do so {5}
Example: By 2015, EPSRC require all data which underpins publications arising from their funded work to be made publicly available.
{4} http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx
{5} http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies
Why RDM? Personal benefit
• Appropriate data management planning will be necessary to attract funding
• RDM best practice protects against data loss and damage and personal/corporate reputational damage
• RDM best practice will save you time, inevitable inconvenience and money
• Some of the hidden costs of RDM may be transferred to central services
• Published data will attract citations in their own right, and credit in assessment exercises
Two tasks for the UH RDM team
• Advocacy
• Making life easier
Where to start? Audit
How much data, how kept, how shared, how organised and level of awareness re: RDM?
• ~600 staff invited, 12% return
• scaling the responses led to an estimate of 2 PB (1 PB = 1 million GB)
• 20 x more than central resources
• 80-90% in the hands of well resourced STEM research groups
• remaining 200 - 400 TB held by non technical researchers
• most data held on workstations and laptops, and local ad-hoc storage
• significant use of insecure media, mostly USB sticks
• significant use of unregulated, ‘free’ cloud services, particularly Dropbox
• data hoarding, collaboration only between trusted peers, possessive culture in many areas
• pockets of good practice RDM, mostly good intentions, but much risk
audits across the sector agree
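The audit figures can be cross-checked with a little arithmetic. The sketch below uses only the numbers quoted on the slide and decimal units; the "implied central storage" value is an inference from the "20 x" statement, not a figure given in the talk.

```python
# Figures quoted on the audit slide; decimal units (1 PB = 1,000 TB = 1,000,000 GB).
TB_PER_PB = 1_000
GB_PER_TB = 1_000

estimated_total_pb = 2        # scaled estimate from the survey responses
stem_share = (0.80, 0.90)     # share held by well resourced STEM groups
central_ratio = 20            # "20 x more than central resources"

total_tb = estimated_total_pb * TB_PER_PB
total_gb = total_tb * GB_PER_TB

# Data held outside the STEM groups: the remaining 10% to 20% of the total.
remaining_tb = tuple(
    round(total_tb * (1 - share)) for share in reversed(stem_share)
)

# Inference only: central storage size if the estimate is 20 times larger.
implied_central_tb = total_tb / central_ratio

print(f"Total: {total_tb} TB ({total_gb:,} GB)")
print(f"Held by non technical researchers: {remaining_tb[0]} - {remaining_tb[1]} TB")
print(f"Implied central storage: {implied_central_tb:.0f} TB")
```

The remaining 10% to 20% works out at 200 to 400 TB, which matches the range quoted on the slide.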
Where to start? Gap analysis
• lack of recognition of data as a career asset;
• lack of awareness about university services for data storage, backup and sharing;
• lack of trust in central services;
• need for more flexible facilities for collaborative sharing of working data;
• need for facilities for long term data preservation and re-use;
• need for training and advice
Help was needed for the whole project lifecycle, from data management planning, to safekeeping and collaborative working with data, to curation and arrangements for data re-use.
Data Management Planning
Most major funding bodies expect a Data Management Plan (DMP) - some require it
• UH stipulate use of https://dmponline.dcc.ac.uk
• Shortest path to a robust, well argued DMP, packaged as a PDF
• Researchers: “too many questions, too difficult, not relevant”
• UH dmponline template produced to fill in many of the blanks
• RGO + 3 x 0.5 fte RDM champions deployed to research institutes
Training
Second JISCMRD project to develop a short course in RDM
• project won by addressing a gap in the market: physical sciences
• modular materials, starting with the generic, moving to subject specific
• developed ‘in situ’ by delivery and feedback on existing early career CPD and postgraduate training programmes
• integrated with new RDM micro site
• materials, speakers' notes, activities and presentation planning matrix will be deposited with jorum.ac.uk
Ongoing engagement
Two way street: there is learning and knowledge transfer at each encounter
• lunchtime seminars, workshops, research group meetings, research management meetings, individual consultations, hijacked conversations
• parts of: me, two technical consultants, two project officers, two RDM champions, two research support librarians, the CTO, the CIO
• amounting to 4 to 5 FTE in the latter half of the projects
• gradually, by helping people and being persistent, the messages get through and we see new demands for assistance
Interventions
We have worked at all levels:
• with individual researchers and small problems
• with our own service providers and local systems
• with other Universities
• with JANET
We have been good at bothering people but always with an offer of help or a constructive demand.
Practical measures: encryption demystified
Problem: Researcher in health needed to deliver data to a funder in a ‘secured package’
• solved using TrueCrypt, open source, cross platform
• indicative of widespread difficulty in using desktop encryption
• could also mitigate risk of inappropriate access to lost USB sticks
Outcome: guide and regular workshop, 60+ people trained so far
Making better use of what we have
Networked Storage is unloved and underused
• Researchers think it is not enough, too slow, too difficult to use
• Mostly this is down to poor documentation and training
Document Management System not used at all by researchers
• Similar to MS SharePoint, appropriate where high standard of retention and reporting is necessary, or versioning of data is needed
• Hitherto used for conducting University business, not offered to research
Outcome: new ‘request research storage’ offer – with workflow to decide which resource to use and assistance in gaining access for external collaborators
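The routing step in such a workflow might look something like the sketch below. It is an illustration only: the slides do not describe the actual UH workflow, so the `StorageRequest` fields and the `route_request` function are hypothetical names. The rule itself follows the slide, which says the document management system is appropriate where a high standard of retention and reporting, or versioning, is needed.

```python
from dataclasses import dataclass


@dataclass
class StorageRequest:
    """Hypothetical request form; field names are assumptions for illustration."""
    needs_retention_reporting: bool
    needs_versioning: bool
    has_external_collaborators: bool


def route_request(req: StorageRequest) -> list[str]:
    """Return the actions for a research storage request.

    Rule taken from the slide: use the document management system when
    retention/reporting or versioning is required, otherwise networked storage.
    """
    if req.needs_retention_reporting or req.needs_versioning:
        resource = "document management system"
    else:
        resource = "networked storage"

    actions = [f"allocate {resource}"]
    if req.has_external_collaborators:
        # The offer includes assistance in gaining access for external collaborators.
        actions.append("arrange access for external collaborators")
    return actions


if __name__ == "__main__":
    example = StorageRequest(
        needs_retention_reporting=False,
        needs_versioning=True,
        has_external_collaborators=True,
    )
    for action in route_request(example):
        print(action)
```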
Infrastructure strategy
Develop a hybrid-cloud: our datacentres + elastic commercial services
• move data to where it needs to be in terms of speed and performance
• keep working data nearby, much of the rest can be offloaded to the cloud
• demote infrequently used data to lower performance storage
• share in the cloud
• backup to the cloud
• data archive in the cloud
Outcome: second tier of storage and active file management (next year), backup as a service, repository will use cloud based tape archive
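The idea of demoting infrequently used data can be sketched as a simple age-based rule. This is a minimal illustration, not the active file management product UH planned to deploy; the 90 day and 365 day thresholds and the `choose_tier` function are assumptions made for the example, while the tier names echo the slide.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds; the slides do not give any.
SECOND_TIER_AFTER = timedelta(days=90)
ARCHIVE_AFTER = timedelta(days=365)


def choose_tier(last_accessed: datetime, now: datetime) -> str:
    """Pick a storage tier from how long a file has been idle."""
    idle = now - last_accessed
    if idle >= ARCHIVE_AFTER:
        return "cloud archive"      # data archive in the cloud
    if idle >= SECOND_TIER_AFTER:
        return "second tier"        # lower performance storage
    return "working storage"        # keep working data nearby


if __name__ == "__main__":
    now = datetime(2013, 6, 11)
    for accessed in (datetime(2013, 6, 1), datetime(2013, 1, 1), datetime(2012, 1, 1)):
        print(accessed.date(), "->", choose_tier(accessed, now))
```

A real policy would probably also weigh file size, project status and funder retention requirements, which this sketch ignores.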
Lobby for new national services
JANET brokerage: make deals with global vendors on behalf of HE
• brokerage focused on Infrastructure as a Service
• JISCMRD voiced need for applications and vertical solution deals too
• Storage, Backup, Repository – as a service
For example: Dropbox for teams, within our governance
Outcome: ? Unknown as yet, but Microsoft, Google, Amazon, and Dropbox are all engaged
Cost of robust data management
• HE datacentres: estimates of between £400 - £800 per TB per year
• Amazon EC2, RackSpace cloud files £800 - £1000 /TB/yr
• Amazon Glacier, Arkivum A-stor, archival storage, £300 /TB/yr
• HE datacentres <1% failure rate, cloud datacentres: virtually nil failure rate{7}
• 4x2TB hard disc array, two year warranty ~ £600 but very high failure rate
• on desk costs escalate from ~£200 /TB/yr, to £1700/TB/yr for malfunction, to >£4000/TB/yr in the event of data loss (e.g. quarter person year of effort)
NOTE: RCUK will pay for robust working data management via grants, and data accepted in their own archives, but Institutions are expected to pay for long term preservation and access to the rest
{7} http://datapool.soton.ac.uk/2013/03/21/cost-benefit-analysis-experience-of-southampton-research-data-producers/
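To see what these rates mean for a single project, the per-TB figures can be multiplied out. The sketch below uses only the ranges quoted on the slide; the 5 TB dataset size is a hypothetical example, and the data loss figure is a lower bound because the slide gives it as ">£4000".

```python
# Cost ranges in GBP per TB per year, as quoted on the slide (low, high).
COSTS_GBP_PER_TB_YEAR = {
    "HE datacentre": (400, 800),
    "Cloud storage": (800, 1000),
    "Archival storage": (300, 300),
    "On desk, no problems": (200, 200),
    "On desk, after malfunction": (1700, 1700),
    "On desk, after data loss": (4000, 4000),  # slide says ">4000": lower bound
}


def annual_cost(option: str, terabytes: float) -> tuple[float, float]:
    """Return the (low, high) annual cost in GBP for the given option."""
    low, high = COSTS_GBP_PER_TB_YEAR[option]
    return low * terabytes, high * terabytes


if __name__ == "__main__":
    dataset_tb = 5  # hypothetical project dataset size
    for option in COSTS_GBP_PER_TB_YEAR:
        low, high = annual_cost(option, dataset_tb)
        if low == high:
            print(f"{option}: £{low:,.0f} per year")
        else:
            print(f"{option}: £{low:,.0f} - £{high:,.0f} per year")
```

On these figures, desk-side storage is the cheapest option only while nothing goes wrong.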
Where are we @21 months
• not as far along as we imagined or hoped
• but way ahead of where we were
• rdm microsite, advice, case studies, training materials - public soon
• embedded in early career development
• engaged in some way with ~200 researchers, >1/3 of our research actives
• nearly overrun with requests for storage – the word is out
• better infrastructure and new services on the way
• most of the knowledge retained for life on our own
Lessons
• RDM is complicated and has many unanswered questions
• doing it right is expensive, doing it wrong or not at all will cost more
• the technical landscape is not yet mature
• technology, however, is the lesser barrier
• cultural change is the more difficult hurdle to leap
• researchers can be helped if you get amongst them
• policy and theoretical benefit won’t work
• data publication needs tangible reward on a par with traditional publication
RDM journey
Timeline: 2011, 2012, 2013, 2014
• At the start we were a source of new pain ….
• then came our interventions
• …. right now we have momentum and have won the odd heart and quite a few minds
• but REF = institutional preoccupation, rdm distraction
• space to continue to build on JISCMRD legacy with more: infrastructure, training, advice (herts.ac.uk/rdm)
• so that after the REF, when the problem really comes into focus, we will be well placed to meet it
JISCMRD projects @UH - Joining up the organisation
Research infrastructure is dispersed across UH - RDM is the new glue
RDM team, working with:
• Senior Management: PVC Research Chair of Steering Group, CIO, PVCR on Research Committee
• CIO, IT providers: Case for £nnnk capital spend
• Principal Investigators: Directly assisted
• Research Grants Office: Workflow, Seminars, Conferences
• Research Leaders: RDM Champions recruitment, presentations
• CPD and PG Training team: Workshops and training
• EPSRC: RDM roadmap, Open Access WG
more info
UH RDM projects blog: www.herts.ac.uk/research-data-toolkit
UH RDM microsite: www.herts.ac.uk/rdm (coming soon)
JISCMRD programme site: http://bit.ly/195tyST
Digital Curation Centre: www.dcc.ac.uk/resources
research-dataman@ jiscmrd@ www.jiscmail.ac.uk
twitter: #jiscmrd
contact
Liz Nolan, Manager, Research Grants Office [email protected]
Bill Worthington, RDM Projects Manager [email protected]
David Ford, Chief Technology Officer