Crediting informatics and data folks in life science teams

Transcript of Crediting informatics and data folks in life science teams

Page 1: Crediting informatics and data folks in life science teams

The People Behind Research Software

crediting from the informatics, technical point of view

Professor Carole Goble, University of Manchester, UK

Software Sustainability Institute UK ELIXIR, ISBE, FAIRDOM

Views are my own

Science Europe LEGS Committee: Career Pathways in Multidisciplinary Research: How to Assess the Contributions of Single Authors in Large Teams, 1-2 Dec 2015, Brussels.

Page 2: Crediting informatics and data folks in life science teams

Team Science: Ego-System
• Experimental scientists
• Theoretical scientists
• Modellers
• Social scientists
• Computer scientists
• Computational Scientists
• Scientific informaticians
• Specialist Tool developers
• Research Software Engineers
• Data engineers and curators
• Service & resource providers
• Infrastructure developers
• System Administrators

Many software, services and public data resources are team based collaborations

Page 3: Crediting informatics and data folks in life science teams

Service vs Science in Projects
teams within teams

Biologists

Software frameworks, Tools, Infrastructure

Data platforms, Public data archives

Bioinformaticians, Comp Biologists

Local data curators

Page 4: Crediting informatics and data folks in life science teams

Informatics contribution to team
Reputation, Recognition, Productivity, Respect

Contribution to the informatics
– Technical publications in their own right
– Software publications: citation proxies
• Fossilise snapshot of authors as contributors
– Specific code and curation tracking
– Usage metrics (downloads, reuse)
– Comp Sci - Conferences matter
– IMPACT

Page 5: Crediting informatics and data folks in life science teams

Compound, collaborative, living nature of data and software

Page 6: Crediting informatics and data folks in life science teams

Acknowledgement by research teams
– “We are not the janitors” It’s not “free”.
– The Craftsmen of Science
– Careers, credibility and sustainability
– Recognised career role of Research Software Engineer and BioCurator
– Recognition of professionalism, software and data quality.
– Reward for LABOUR.

Informatics contribution to team
Reputation, Recognition, Productivity, Respect

Page 7: Crediting informatics and data folks in life science teams

*Survey of researchers from 15 UK Russell Group universities conducted by SSI between August and October 2014; 406 respondents covering a representative range of funders, disciplines and seniority.

92% Researchers use software*

53% Researchers develop software*

Page 8: Crediting informatics and data folks in life science teams
Page 9: Crediting informatics and data folks in life science teams

Credit: Biologists, Bioinformaticians

Cite: Local tool providers, Public data set providers

Page 10: Crediting informatics and data folks in life science teams

Service vs Science
Background vs Foreground

Data [and software] in foreground most likely cited.

Same data [and software] viewed as background not explicitly cited, though equally essential.

Wynholds et al. (2012) Data, data use, and scientific inquiry: two case studies of data practices. DOI: 10.1145/2232817.2232822

25% of publications that used the public ArrayExpress Archive cited it*

The invisibility of software, esp. software that is widely used, infrastructural, components or cross-discipline

*Rung and Brazma, Reuse of public genome-wide gene expression data, Nature Reviews Genetics 2012

Page 11: Crediting informatics and data folks in life science teams

What is a Team? Credit drift

Immediate team

Background team

“Foreground” informatics

Authorship, Authorship?

Cited? Acknowledged

Cited? Mentioned

Ignored

“Background” informatics

Cited

Page 12: Crediting informatics and data folks in life science teams

The Currency of Recognition

Person Career

Peers, Funders, Institutions, Public

Resource Sustainability

credibility

Page 13: Crediting informatics and data folks in life science teams

Software mentions in the biology literature (90 articles)

Howison and Bullard 2015 The visibility of software in the scientific literature: how do scientists mention software and how effective are those mentions? J Assoc for Info Science and Technology DOI: 10.1002/asi.23538

37% citations formal
87% software could be found

informal mentions very common -> poor at providing crediting information

18% software author offered preferred citation -> 32% who cited it ignored it

24% journals had a citation policy
Legal License attribution obligations ignored

Page 14: Crediting informatics and data folks in life science teams

Team reciprocity rules

Download and Go. No.

Jam for Everyone.

Page 15: Crediting informatics and data folks in life science teams

sciencecodemanifesto.org

Page 16: Crediting informatics and data folks in life science teams

1. Software and Data Research Objects into the Publishing Workflow

informal mentions replaced by formal

Page 17: Crediting informatics and data folks in life science teams

http://ivory.idyll.org/blog/2015-authorship-on-software-papers.html

Page 18: Crediting informatics and data folks in life science teams

3. Credit networks & credit currency

• Research Object-specific credit models
– Software, data, models….
– Credit based on use: downloads, reusability, reuse, FAIR

• Contribution: Credit distribution, propagation, dividends
– Transitive credit maps (Katz and Smith)*, CRediT**

• Use: Credit trajectories: tracing, tracking, mining
– Recovery from literature, identifier and provenance infrastructure, standards, data/software level metrics services (DataCite), repositories, machine readable and processable metadata.

*http://arxiv.org/pdf/1407.5117v3.pdf

**http://casrai.org/CRediT

http://depsy.org/
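
The transitive credit maps named on this slide can be illustrated with a minimal Python sketch. It assumes a simplified reading of the idea: every product carries a credit map whose fractional weights sum to 1, contributors are either people or other products, and credit given to a product is passed on to that product's own contributors. The product names, people and weights below are invented for illustration and are not taken from the talk or from the Katz and Smith paper.

```python
from collections import defaultdict

# Hypothetical credit maps: each product lists its contributors (people or
# other products) with fractional weights that sum to 1.
CREDIT_MAPS = {
    "paper": {"biologist": 0.6, "bioinformatician": 0.2, "analysis_tool": 0.2},
    "analysis_tool": {"software_engineer": 0.75, "core_library": 0.25},
    "core_library": {"library_maintainer": 1.0},
}


def transitive_credit(product, maps, weight=1.0, totals=None, path=()):
    """Pass `weight` units of credit for `product` down to individual people."""
    if totals is None:
        totals = defaultdict(float)
    if product in path:
        raise ValueError(f"cyclic credit map at {product!r}")
    for contributor, share in maps[product].items():
        if contributor in maps:
            # The contributor is itself a product: forward its share onwards.
            transitive_credit(contributor, maps, weight * share,
                              totals, path + (product,))
        else:
            # The contributor is a person: the credit stops here.
            totals[contributor] += weight * share
    return dict(totals)


credit = transitive_credit("paper", CREDIT_MAPS)
for person, share in sorted(credit.items(), key=lambda kv: -kv[1]):
    print(f"{person}: {share:.2f}")
```

In this toy example the tool's engineer and the library's maintainer receive a share of the paper's credit even though neither is an author, which is the kind of "background" contribution the talk argues goes unrecognised.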

Page 19: Crediting informatics and data folks in life science teams

2. Stop conflating credit with Authorship

Contribution Roles

Usage

Liz Allen: CRediT

Page 20: Crediting informatics and data folks in life science teams

4. Research units and credit models that reflect software

Not Publish. Release paradigm. Portfolio paradigm.

Jennifer Schopf, Treating Data Like Software: A Case for Production Quality Data, JCDL 2012

Evolving, Multi-stewarded, Multi-authored, Multi-platform

Reproducible, Executable papers

Connected, Body of work

Compound, Aggregated

Page 21: Crediting informatics and data folks in life science teams

https://dx.doi.org/10.1111/febs.13237

https://doi.org/10.15490/seek.1.investigation.56 Living

Snapshot

http://www.fair-dom.org

Page 22: Crediting informatics and data folks in life science teams

An “evolving manuscript” would begin with a pre-publication, pre-peer review “beta 0.9” version of an article, followed by the approved published article itself, [ … ] “version 1.0”.

Subsequently, scientists would update this paper with details of further work as the area of research develops. Versions 2.0 and 3.0 might allow for the “accretion of confirmation [and] reputation”.

Ottoline Leyser […] assessment criteria in science revolve around the individual. “People have stopped thinking about the scientific enterprise”.

http://www.timeshighereducation.co.uk/news/evolving-manuscripts-the-future-of-scientific-communication/2020200.article

Page 23: Crediting informatics and data folks in life science teams

Ramps vs Revolutions

Technical ramps
• Machinery, tools, platforms, repositories

Process ramps
• Research processes and Publisher workflows

Social ramps
• Rules and policies
• Adoption by stakeholders
– interventions & automations
• Recognition by stakeholders

Credit is like love not money

Citations and across discipline boundaries.

Within discipline more like dividends.

All research products and all scholarly labour are equally valued

(except by institutional promotion, funding review and REF committees)

Public software and data resources are not free.

Stewardship costs and needs crediting

Publishers adapt to “Publications” that are dynamic Research Objects

(still need to snapshot)

Page 24: Crediting informatics and data folks in life science teams

http://www.software.ac.uk/software-credit

Page 25: Crediting informatics and data folks in life science teams

https://www.force11.org/group/software-citation-working-group

Page 26: Crediting informatics and data folks in life science teams

Links
• FAIRDOM
– http://www.fair-dom.org
• SEEK Platform
– http://www.seek4science.org
• Research Objects
– http://www.researchobject.org
• Software Sustainability Institute
– http://www.software.ac.uk
• Software Carpentry
– http://www.software-carpentry.org
• Force11
– http://www.force11.org