Page 1:

Software Sustainability Institute

www.software.ac.uk

How does software licensing fit into the RCUK expectations on research data?

http://dx.doi.org/10.6084/m9.figshare.1540765

14 September 2015, Cambridge

Neil Chue Hong (@npch), Software Sustainability Institute

ORCID: 0000-0002-8876-7606 | [email protected]

Slides licensed under CC-BY where indicated:

Supported by Project funding from

Page 2:

Purpose of this session

• Summarise the EPSRC research data management and open access policies, and how they relate to RCUK policy

• Explain how these are interpreted in Cambridge University

• Briefly cover other funders and REF

• Provide links to further information

Page 3:

Science as an Open Enterprise

• “Publishing data in a reusable form to support findings must be mandatory”

• Report of a working group chaired by Sir Geoffrey Boulton

Page 4:

RCUK committed to openness

Open Access

• “As bodies charged with investing public money in research, the Research Councils take very seriously their responsibilities in making the outputs from this research publicly available – not just to other researchers, but also to potential users in business, charitable and public sectors, and to the general tax-paying public.” http://www.rcuk.ac.uk/research/openaccess/

Research Data

• “Publicly funded research data are a public good, produced in the public interest, which should be made openly available with as few restrictions as possible in a timely and responsible manner that does not harm intellectual property.” http://www.rcuk.ac.uk/research/datapolicy/

Page 5:

This isn’t something new

• Open Access publications
    - 2005: RCUK statement published
    - 2012: Finch group roadmap published
    - 1st April 2013: RCUK Policy on access to research publications came into force (includes EPSRC)
• Research Data policy
    - May 2011: EPSRC policy framework on research data published
    - May 2012: Deadline for institutional roadmaps
    - 1st May 2015: Full compliance with expectations came into force
• Other funders have similar policies
    - RCUK, Wellcome Trust
    - Horizon 2020 pilot action on open access to research data
        • Participating projects will be required to develop a Data Management Plan

Page 6:

Models of Open Access Publication

• “Green” OA – Deposit copy of publication in OA Archive
    - Use the final draft author manuscript (preprint)
        • Can include referee modifications
        • Generally can’t use final version after copyediting, layout, and proof correction
    - Most publishers (68%) support this method
    - Can check if allowed with the SHERPA/RoMEO tool (which is integrated into PURE)
    - University’s preferred model (zero cost)
• “Gold” OA – Publish in Open Access Journal
    - Publish in Open Access journal
    - Will have an Article Processing Charge (APC) – these vary
        • Funding for APCs cannot be requested on EPSRC grants
        • RCUK provides block funding to University to aid transition
        • Some journals have waiver schemes

Page 7:

What’s included in OA policy?

• Policy applies only to the publication of peer-reviewed research articles (including review articles not commissioned by publishers) and conference proceedings that acknowledge funding from the UK’s Research Councils
    - Includes if there are other funders / commercial partners / overseas
    - Doesn’t (currently) cover monographs, books, critical editions, volumes and catalogues, or forms of non-peer-reviewed material
• Embargo period enforced by journal cannot be > 6 months (for green OA) (REF Panels A + B cannot be > 12 months)
• You must acknowledge funding sources in publication using standard format:
    - This work was supported by the Engineering and Physical Sciences Research Council [grant number EP/H00000/1]
• You must include statement on how underlying data and models can be accessed
    - Which leads to…

Page 8:

EPSRC Policy Framework on Research Data

• Scope: Research data is defined as recorded factual material commonly retained by and accepted in the scientific community as necessary to validate research findings; although the majority of such data is created in digital format, all research data is included irrespective of the format in which it is created.

• Seven core principles:
    1. Public-funded, public good
    2. Legal, ethical and commercial constraints considered
    3. Acknowledge properly to encourage sharing
    4. Embargo period allowed
    5. Use a data management plan; preserve data with value
    6. Sufficient metadata for access and reuse
    7. Public funds can be used to preserve and manage data

• https://www.epsrc.ac.uk/about/standards/researchdata/principles/

Page 9:

What does this mean in practice?

• PIs of EPSRC-funded projects are responsible for
    - Determining what data are important
    - Publishing metadata describing data + access within 12 months of generation
    - Depositing the data with sufficient metadata (publication may be restricted)
    - Ensuring publications state how supporting research data can be accessed
• HEIs are responsible for
    - Ensuring infrastructure and resource is in place to support research data management and access through entire lifecycle
    - Developing policies to clarify responsibility for activities
    - Ensuring secure preservation of research data for 10 years from deposit / last access
• Not all data has to be preserved, nor for all time
• EPSRC will take light-touch approach to monitor compliance initially
    - “Dipstick” checking of papers after summer break
    - Investigation of complaints against EPSRC-funded institutions

Page 10:

What should you be doing?

• During proposal writing stage
    - Discuss what data might be important, and how to identify it
    - Create a data management plan using DMPonline tool
• During project
    - Ensure research data is hosted on an appropriate backed-up platform
    - Ensure software is hosted in an appropriate code repository
    - Create metadata describing generated data and software, along with access restrictions / conditions, and record as a dataset within 12 months
    - Deposit releases of data/software in a suitable repository which provides DOIs, e.g. DataShare, Zenodo, FigShare, Dryad, institutional repository
        • If data access is restricted, metadata should give reasons and conditions
• When writing a paper
    - Ensure data and software required to substantiate paper is referenced
        • ‘Creator (Publication Year): Title. Publisher. DOI’
    - Include brief access and funding statements
    - Upload pre-print of paper to institutional research asset register
    - Upload accepted paper to Cambridge Open Access service
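
As an illustration of the ‘Creator (Publication Year): Title. Publisher. DOI’ pattern and the brief access statement, here is a minimal Python sketch. The `Deposit` class, both function names, the wording of the access statement, and every value in the example are hypothetical; rendering the DOI as an `https://doi.org/` link is also an assumption rather than something the slides prescribe.

```python
from dataclasses import dataclass


@dataclass
class Deposit:
    """Metadata for a deposited dataset or software release."""
    creators: list[str]
    year: int
    title: str
    publisher: str
    doi: str


def format_citation(d: Deposit) -> str:
    """Render 'Creator (Publication Year): Title. Publisher. DOI'."""
    creators = "; ".join(d.creators)
    return f"{creators} ({d.year}): {d.title}. {d.publisher}. https://doi.org/{d.doi}"


def access_statement(d: Deposit) -> str:
    """One-sentence data access statement for the paper (wording is illustrative)."""
    return f"Data and software supporting this paper are available at https://doi.org/{d.doi}."


# Hypothetical deposit: every value below is a placeholder, not a real record.
example = Deposit(
    creators=["Researcher, A."],
    year=2015,
    title="Example simulation outputs",
    publisher="Zenodo",
    doi="10.0000/example.12345",
)

print(format_citation(example))
print(access_statement(example))
```

Keeping the metadata in one record means the same values can feed the citation, the access statement, and the dataset record in the institutional system.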

Page 11:

Related policies

• Other research councils have similar guidelines
    - AHRC: ADS or other repository within 3 months of project completion. Data accessible for three years.
    - BBSRC: DMP required. Release data at earliest opportunity, accessible for 10 years. Publications deposited in UK PubMed Central.
    - ESRC: DMP required. Data deposit within 3 months of end of award.
    - MRC: DMP required. Release data at earliest opportunity, exclusive access window allowed, accessible for 10 years. Publications deposited in UK PubMed Central.
    - NERC: Outline DMP required. Deposit in NERC data centre as soon as possible, embargo allowed.
    - STFC: Data management plans as part of funding proposal. Data deposit within 6 months of related publication, embargo allowed, accessible for minimum 10 years.
• REF 2020 policy for open access will impose further conditions from 1st April 2016
    - Applies to journal articles and conference proceedings with an International Standard Serial Number
    - Accepted author manuscript (post-peer review) must be deposited as soon as publication is accepted (and within 3 months of acceptance)
        • So you need to enter into research asset register when accepted, not when published

http://www.hefce.ac.uk/media/hefce/content/pubs/2014/201407/HEFCE2014_07.pdf

Page 12:

What about software?

• Depends on research being carried out
• Deciding factor is whether the software is necessary to validate research findings
• Even if you don’t need to preserve software, it’s good practice to make it available to enable access, validation, and reuse of your results
• And sometimes it’s easier to store the code than the results
• Does not prevent commercialisation and exploitation of software, but you should make a case for why you are not open-sourcing code

Page 13:

Expectations around software

• Research organisations are not expected to assume responsibility for software not produced within their own organisation
    - Prudent to take reasonable steps to assure the continued availability of the software you use
        • Preserving copy of open-source software
        • Use commercial software where a multi-year support agreement is available
        • Use open-data formats
• Not all software must be shared
    - If there are ethical, legal or commercial reasons

Page 14:

Analysing research data using third party software

• Amy has recorded the measurements from her long-running chemistry experiment in an electronic lab notebook, and used the R and MATLAB software packages to analyse her results and produce graphs which are included in her published paper.

• Since R and MATLAB are both commonly used software packages, Amy is not required to preserve the software as long as the metadata describing her research data is sufficient, and her paper explains the techniques she used. It may be useful for Amy to deposit the R/MATLAB scripts that she used to analyse her results in a repository and link to this in her paper, because this will let others reuse her data and methods more easily and it is not an onerous task to complete.

Page 15:

Building scripts to support a workflow

• Brian has written a script which converts data from one format to another to allow him to interface two separate codes which use different input and output formats. This script is used in research work, which results in some publications.

• Brian is not expected to make the script available, as long as he has made the data that underpins the research work available and he has provided the metadata that describes it, including the formats. Even so, it benefits both Brian and other researchers if he simply makes the script available under an open licence, particularly if the amount of code is small and there is no expectation that Brian will support the script after release.

Page 16:

Creating new software as part of a research project

• Colin has written a piece of software which implements a new algorithm for calculating a statistical index on a pre-existing dataset, and has published this algorithm in a paper along with results benchmarking it against other implementations of the statistical index.

• As the paper both describes the algorithm and compares it to other work, it is important that Colin deposits the software and makes it accessible. Others will also need access to the pre-existing dataset to validate the results in the paper; ideally the dataset will have a DOI and be openly accessible under a Creative Commons Attribution licence.

Page 17:

Dealing with commercially confidential objects

• Diya is undertaking research which simulates the airflow over a vehicle chassis, and has created an improved version of a commercial software model provided by an industry partner. She has then published a paper with the permission of the industry partner which broadly describes the revised model and presents the results of applying this model to a test dataset.

• In the case of ‘commercially confidential’ research data (in this case the airflow model), where a business organisation has a legitimate interest, it is not expected that the improved version produced by Diya would be made openly available. However, it would be reasonable to investigate making the revised model available subject to a suitable, legally enforceable, non-disclosure agreement to enable other researchers to verify the results published in the paper.

Page 18:

Exploiting software with commercial potential

• In the course of her EPSRC-funded research Erin has written some code which she believes has real commercial potential in its own right. She has written up the work and wishes to publish, but the results can only be validated by the code and Erin does not wish to jeopardise its commercial potential by disclosing it.

• Erin should seek the advice of her University’s commercialisation support office because under EPSRC’s standard grant conditions the university owns, and has the responsibility for exploiting, the intellectual property arising from EPSRC research grants. Because it is acceptable for there to be a delay in publication while arrangements are made to protect valuable IP, if the support office agrees with Erin they should ensure that suitable protection is put in place before the paper is published. It is important that the code is available to anyone who wishes to validate Erin’s research after it is published.

Page 19:

Faced with enormous amounts of generated data

• Feng is working on a large theoretical physics experiment which uses a piece of software to generate simulated data for an event. Each event data set consists of a very large amount of data, but a scientifically equivalent data set can be recreated as long as the initial parameters are identical.

• In some cases, it may not be possible or cost effective to preserve research data. For example, in the case of simulated data or outputs of models, it may be more effective to preserve the means to recreate the data by preserving the generating code and environment, rather than preserving the data themselves. Provided that the ability to validate published research findings is not fundamentally compromised, a deliberate decision to dispose of research data at an appropriate time is acceptable in these cases.
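
A minimal Python sketch of preserving the means to recreate simulated data rather than the data themselves. The toy `simulate_event` function stands in for the real generating code, and the parameter names, seed, and `code_version` label are all hypothetical. Only a small record of parameters, code version, and checksum is kept; the data are regenerated from it on demand.

```python
import hashlib
import json
import random


def simulate_event(params: dict) -> list[float]:
    """Toy stand-in for a simulation code: deterministic given its parameters."""
    rng = random.Random(params["seed"])
    return [rng.gauss(params["mean"], params["sigma"]) for _ in range(params["n_samples"])]


def checksum(data: list[float]) -> str:
    """SHA-256 of the data, used to confirm that a regenerated set matches."""
    return hashlib.sha256(json.dumps(data).encode("utf-8")).hexdigest()


# Hypothetical parameters and version label; none of these come from the slides.
params = {"seed": 42, "mean": 0.0, "sigma": 1.0, "n_samples": 1000}
original = simulate_event(params)

# The small record that is preserved in place of the large data set.
record = json.dumps({"code_version": "v1.0.0", "params": params, "sha256": checksum(original)})

# Later: rebuild the data from the record alone and verify it.
restored = json.loads(record)
regenerated = simulate_event(restored["params"])
print(checksum(regenerated) == restored["sha256"])
```

A checksum tests for bit-identical output, which is stricter than the "scientifically equivalent" standard in the example. Across different platforms or library versions a tolerance-based comparison may be more appropriate, which is one reason to preserve the environment as well as the code.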

Page 20:

Summary for PIs

• Open Access
    - Publish in journals / proceedings which support self-archiving (Green OA) or are open access themselves (Gold OA)
    - Create research output record when publication is accepted
    - Update record when publication is published
• Research Data
    - Store your data securely
    - Create dataset record in research outcomes system within 12 months
    - Published papers should contain statement describing how data can be accessed

Page 21:

Summary for PIs

• Software
    - Review what software you expect to use and produce
        • A software management plan may help
    - Record what software versions and parameters you use
        • In some cases, this may mean data storage is not required
    - Develop code in a repository
    - Open source should be de facto choice
        • Unless there are legal, ethical or commercial concerns
        • Enables validation of published results, in conjunction with appropriate documentation and access to data
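
One way to record software versions and parameters alongside a run, sketched in Python with the standard library only. The function name `capture_environment`, the package list, and the parameter values are hypothetical; a real project would list the packages it actually depends on.

```python
import json
import platform
import sys
from importlib import metadata


def capture_environment(packages: list[str], parameters: dict) -> dict:
    """Collect interpreter, OS, package versions and run parameters in one record."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": versions,
        "parameters": parameters,
    }


# Hypothetical package list and parameters, for illustration only.
record = capture_environment(["numpy", "scipy"], {"tolerance": 1e-6, "iterations": 500})
print(json.dumps(record, indent=2, sort_keys=True))
```

Saving this record next to each set of results, and committing it with the code, gives a reader what they need to rerun the analysis with the same versions and settings.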

Page 22:

Further guidance

• Guidance from Digital Curation Centre on Research Data:
    - Summary of major funders’ data policies http://www.dcc.ac.uk/resources/policy-and-legal/funders-data-policies
    - Developing Data Management Plans http://www.dcc.ac.uk/resources/how-guides/develop-data-plan and https://dmponline.dcc.ac.uk/
    - Licensing data http://www.dcc.ac.uk/resources/how-guides/license-research-data
• Guidance on role of software in EPSRC Research Data Policy from the Software Sustainability Institute: http://www.software.ac.uk/resources/guides/epsrc-research-data-policy-and-software

Page 23:

Further information

• RCUK FAQ on Open Access http://www.rcuk.ac.uk/RCUK-prod/assets/documents/documents/OpenaccessFAQs.pdf

• Acknowledging funders in publications http://www.rin.ac.uk/our-work/research-funding-policy-and-guidance/acknowledgement-funders-journal-articles

• University of Cambridge OA policy framework http://osc.cam.ac.uk/oa-policy-landscape/cambridge-open-access-policy-framework

• University of Cambridge Open Access Service https://www.openaccess.cam.ac.uk/

Page 24:

Fellows 2016 programme

• If you are into research software then you should apply

• bit.ly/ssi-fellows

• Closes 1.10.15