Developing a Data Management Plan

Martin Donnelly, Digital Curation Centre, University of Edinburgh
AgreenSkills Annual Seminar, Paris, 15 February 2017


Contents

1. About the DCC
2. My involvement with DMPs
3. DMP: what and why?
4. Who’s involved in the DMP process?
5. DMP specifics in H2020
6. Useful resources
7. A few things to note/remember
8. Contacts and opportunity for questions


The Digital Curation Centre (DCC)

• The UK’s national centre of expertise in digital preservation and data management, est. 2004
• Principal audience is the UK higher education sector, but we increasingly work further afield (continental Europe, North America, South Africa, Asia…)
• Provide guidance, training, tools (e.g. DMPonline) and other services on all aspects of research data management and Open Science
• Now offering tailored consultancy/training
• Organise national and international events and webinars (International Digital Curation Conference, Research Data Management Forum)


Background

• Checklist for a Data Management Plan (v1, 2009)
  • A generic list of issues that a DMP could or should cover, derived from UK funder requirements
• DMPonline (2010-present)
  • A wizard-style, Web-based tool to help researchers and other related professionals to produce and maintain DMPs according to funder or institutional policies
• Book Chapter (2011)
  • “Data Management Plans and Planning” in Pryor G (ed.) Managing Research Data (New York, Facet)
• DMPTool (2011-present)
  • Helped bring the US DMPTool consortium together, and provided advice as they were starting up
• EC Reviews (2016-17)
  • In summer 2016, I was one of two expert reviewers for first iteration Horizon 2020 data management plans, and I’m doing it again for the next batch in February/March 2017


Recap: what is RDM?

“the active management and appraisal of data over the lifecycle of scholarly and scientific interest”

What sorts of activities?

- Planning and describing data-related work before it takes place

- Documenting your data so that others can find and understand it

- Storing it safely during the project

- Depositing it in a trusted archive at the end of the project

- Linking publications to the datasets that underpin them

The benefits of Openness

• SPEED: The research process becomes faster

• EFFICIENCY: Data collection can be funded once, and used many times for a variety of purposes

• ACCESSIBILITY: Interested third parties can (where appropriate) access and build upon publicly-funded research resources with minimal barriers to access

• IMPACT and LONGEVITY: Open publications and data receive more citations, over longer periods

• TRANSPARENCY and QUALITY: The evidence that underpins research can be made open for anyone to scrutinise and to use in attempts to replicate findings. This leads to a more robust scholarly record

Data Management Plans and Planning

• Data management planning (DMP) underpins and pulls together different strands of RDM activities, often across multiple project partners

• DMP is the process of planning, describing and communicating activities carried out during the research lifecycle in order to…
  • Keep sensitive data safe
  • Maximise data’s reuse potential
  • Support longer-term preservation

• A data management plan is usually a short document detailing specifics of the data that will be created during a research project, together with information on how it can be accessed and utilised

• Research funders often ask for DMPs to be submitted alongside grant applications and/or developed over the course of the research project. (HEIs are increasingly asking their researchers to do this too…)

Benefits of data management planning

• It is intuitive that planned activities stand a better chance of meeting their goals than unplanned ones. The process of planning is also a process of communication, increasingly important in interdisciplinary/multi-partner research. Collaboration will be more harmonious if project partners (in industry, other universities, other countries…) are on the same page
• In terms of data security, if there are good reasons not to publish/share data, in whole or in part, you will be on more solid ground if you flag these up early in the process
• DMP also provides an ideal opportunity to engender good practice with regard to (e.g.) file formats, metadata standards, storage and risk management practices, leading to greater longevity of data, and improved quality standards…

Limits of data management planning

What can a plan not do? It can’t do the work for you.

The map is not the territory (Korzybski)
or
Chalk’s no shears (Scottish saying)

It is important to remember that the human challenges in data management are often more difficult to meet than the technological ones.

So communication is vital, especially in international, multi-partner research!

What does a data management plan look like?

• It is usually a couple of pages outlining:
  ✓ how data will be captured/created
  ✓ how it will be documented
  ✓ who will be able to access it
  ✓ where it will be stored
  ✓ how it will be backed up, and
  ✓ whether (and how) it will be shared and preserved long-term
  ✓ etc

• DMPs are often submitted as part of funding applications – and requirements vary from funder to funder – but they are useful whenever researchers are creating (or reusing) data, especially where the research involves multiple partners, countries, etc…
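The topics listed on this slide can be treated as a simple checklist while a plan is being drafted. The following is a minimal sketch only: the topic keys, the `DraftPlan` class and the sample answers are hypothetical names invented for illustration, and they are not taken from DMPonline or from any funder template.

```python
from dataclasses import dataclass, field

# Topics taken from the slide above; the dictionary keys are hypothetical labels.
DMP_TOPICS = {
    "capture": "How will data be captured or created?",
    "documentation": "How will it be documented?",
    "access": "Who will be able to access it?",
    "storage": "Where will it be stored?",
    "backup": "How will it be backed up?",
    "sharing_preservation": "Will it be shared and preserved long-term, and how?",
}


@dataclass
class DraftPlan:
    """A draft DMP: free-text answers keyed by topic (illustrative only)."""

    project: str
    answers: dict = field(default_factory=dict)

    def missing_topics(self):
        """Return the topic keys that do not yet have a non-empty answer."""
        return [key for key in DMP_TOPICS if not self.answers.get(key, "").strip()]


# Example usage with made-up answers.
plan = DraftPlan(
    project="Example project",
    answers={"storage": "University network drive", "backup": "Nightly, off-site"},
)
for key in plan.missing_topics():
    print(f"TODO {key}: {DMP_TOPICS[key]}")
```

A list like this only shows which questions are still open. Real funder templates ask more detailed questions, and requirements vary from funder to funder.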


Roles and responsibilities

Like RDM in general, data management planning is a hybrid activity, involving multiple stakeholder groups…
• The principal investigator (usually ultimately responsible for data)
• Research assistants (may be more involved in day-to-day data management)
• The institution’s funding office (may have a compliance role)
• Library/IT/Legal (The library may issue PIDs, or liaise with an external service who do this, e.g. DataCite.)
• Partners based in other institutions
• Commercial partners
• Etc

Other stakeholders in the modern research process include governments, public services, and the general public (who fund lots of research via their taxes)

Caveat!

• It’s not necessary – or even desirable – for every researcher (or research administrator, or librarian, or IT person…) to become an expert in every aspect of data management
• Useful expertise may already exist within the research office, library, IT, departmental support staff, legal services etc, as well as academic colleagues well versed in data management
• The trick is to harness this and to make it appear seamless. Communication and coordination (or at least the appearance of…) is increasingly important


European policy

• Currently in the midst of an extended pilot for Horizon 2020. Other projects can participate voluntarily, and opting in has been more popular than opting out

• Applies as a minimum to research data underlying publications, plus any other data as decided by the project

• Participants must:
  • Write a DMP as a project deliverable
  • Deposit data in a repository
  • Make it possible for others to access, mine, exploit and reuse the data
  • Share information on the tools needed

…unless there are compelling reasons not to do so. And these reasons should be recorded… in the DMP.

• Approach: “As open as possible, as closed as necessary”

Horizon 2020 – extended pilot (i)

As part of making research data findable, accessible, interoperable and re-usable (FAIR), a DMP should include information on:
• the handling of research data during and after the end of the project
• what data will be collected, processed and/or generated
• which methodology and standards will be applied
• whether data will be shared/made open access and
• how data will be curated and preserved (including after the end of the project)

http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf

Horizon 2020 – extended pilot (ii)

• Once a project has had its funding approved and has started, you must submit a first version of your DMP (as a deliverable) within the first 6 months of the project

• The Commission provides a DMP template, the use of which is recommended but voluntary

• The DMP needs to be updated over the course of the project whenever significant changes arise, such as (but not limited to):
  • new data
  • changes in consortium policies (e.g. new innovation potential, decision to file for a patent)
  • changes in consortium composition and external factors (e.g. new consortium members joining or old members leaving)

• The DMP should be updated as a minimum in time with the periodic evaluation/assessment of the project. If there are no other periodic reviews foreseen within the grant agreement, then such an update needs to be made in time for the final review at the latest. Furthermore, the consortium can define a timetable for review in the DMP itself

http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
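The timing rules on this slide can be sketched as a small date calculation. This is an illustration under stated assumptions, not an official calculator: it reads "within the first 6 months" as six calendar months from the project start date, and the function names and example dates are hypothetical. The grant agreement remains the authority on actual deliverable dates.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    If the target month is shorter, the day is clamped to its last day.
    """
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def first_dmp_deadline(project_start: date) -> date:
    """Latest date for the first DMP version (assumption: 6 calendar months)."""
    return add_months(project_start, 6)


def dmp_checkpoints(project_start: date, review_dates: list) -> list:
    """First-version deadline followed by the periodic review dates, in order."""
    return [first_dmp_deadline(project_start)] + sorted(review_dates)


# Example usage with made-up dates.
start = date(2017, 2, 15)
reviews = [date(2019, 2, 14), date(2018, 2, 14)]
for checkpoint in dmp_checkpoints(start, reviews):
    print(checkpoint.isoformat())
```

Updates triggered by significant changes (new data, changed consortium policies or membership) cannot be scheduled this way. They have to be made as the changes arise.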


DCC resources

• Guidance, e.g. “How-To Develop a Data Management and Sharing Plan”

• DCC Checklist for a Data Management Plan: http://www.dcc.ac.uk/resources/data-management-plans/checklist

• DMPonline tool: https://dmponline.dcc.ac.uk/

• Links to all DCC DMP resources via http://www.dcc.ac.uk/resources/data-management-plans

DMPonline: overview

• Helps researchers write DMPs
• Provides funder questions and guidance
• Includes a template DMP for Horizon 2020
• Provides help from universities
• Examples and suggested answers
• Free to use
• Mature (v1 launched April 2010)
• Code is Open Source (on GitHub)

https://dmponline.dcc.ac.uk

Registration

Sign up with your email address, organisation and password

Select ‘other organisation’ if yours is not listed

Creating a plan

Select funder (if any)

Select organisation for additional questions and guidance

Select other sources of guidance

Plan details: summary

Summary of the sections and questions in your DMP

Answering questions

Notes who has answered the question and when

Progress bar updates how many questions remain

Sharing plans

Allow colleagues read-only or read-write access, or let them become co-owners

Co-writing DMPs

Sections are locked for editing when they’re being worked on by colleagues

Exporting DMPs

Can export as plain text, docx, PDF, html...

Institutions can customise the tool by…

• Adding templates
• Adding custom guidance
• Providing example or suggested answers
• Monitoring usage within their organisation
• Offering non-English language versions

www.dcc.ac.uk/news/customising-dmponline-admin-interface-launches

More information

Customising DMPonline: www.dcc.ac.uk/news/customising-dmponline-admin-interface-launches

http://www.screenr.com/PJHN

Get the code, amend it, run a local instance, flag issues, request features...
https://github.com/DigitalCurationCentre/DMPonline_v4

And finally, some sample plans

• There are lots of data management plans available on the Web. The DCC provides links to a number of sample DMPs via http://www.dcc.ac.uk/resources/data-management-plans/guidance-examples

• The US National Endowment for the Humanities (NEH) recently released over 100 of its DMPs. These are available via: http://www.neh.gov/divisions/odh/grant-news/data-management-plans-successful-grant-applications-2011-2014-now-available


Nota bene!

• DMP is above all a communication activity, between the data collectors and their contemporaries (project partners and funders) and with future data re-users…

• Remember that there is no magic bullet, and no one-size-fits-all solution!

• Much of the benefit of data management planning lies in the process of planning, above and beyond the plans produced at the end of the process

• A DMP should be a living document. Research seldom goes entirely according to plan, and plans should be updated to reflect the reality of the research, not the other way around!


Thank you: any questions?

• For more information about the DCC:
  • Website: www.dcc.ac.uk
  • Director: Kevin Ashley ([email protected])
  • General enquiries: Alex Delipalta ([email protected])
  • Twitter: @digitalcuration

• My contact details:
  • Email: [email protected]
  • Twitter: @mkdDCC
  • Slideshare: http://www.slideshare.net/martindonnelly

This work is licensed under the Creative Commons Attribution 2.5 UK: Scotland License.