Shaping the practice and measuring the impact of Open Science Gentner Day 2014 Patricia Herterich Humboldt-Universität zu Berlin & CERN

Shaping the practice and measuring the impact of Open Science

Gentner Day 2014

Patricia Herterich

Humboldt-Universität zu Berlin & CERN

What is Open Science?

Open Source

Open Access

Open Data & Code

Open Science

Benefits of Open Science

For scientific communities:
• Reproducibility of research results
• Leveraging web-based tools to facilitate scientific collaboration

For society:
• Public availability & reusability of scientific data
• Public accessibility & transparency of scientific communication

Towards Open Science in HEP

Open Source

Open Access

Open Data & Code

Open Science

Next big challenge on the way to Open Science

Open Data - shaping the practices

• Hypothesis:
– Data publication is painful and time consuming

“The challenge is that the research community is still hesitant when it comes to sharing material. While researchers are busy with research and publishing, sharing research data is often not on their agenda, especially because data preservation and sharing are not considered relevant to career promotion and research assessment.” [Libby Bishop, Veerle Van den Eynden]

Open Data - shaping the practices

• Hypothesis:
– Shared data is not easy to discover

“Although data sharing is well advanced, I have encountered problems with discovering data. […] Some of my research could have been improved or accelerated by better data discoverability.” [Carolin Liefke]

Open Data - shaping the practices

• Hypothesis:
– Researchers need incentives for data sharing such as citation metrics

“If all researchers could see that they are cited and attributed for their data publication and that their sharing is considered in their promotion committees this would make an important incentive” [Heather Piwowar]

My research question

If researchers are given:
– Easy tools to publish data & code,
– Search tools to discover data & code,
– Metrics for re-use of data & code,

will it have an impact on Open Science, in particular in High-Energy Physics, both for the experimental and theory community?

A virtuous circle of Open Data…

• research data publication
• discovery of / access to research data
• research data (re-)use
• metrics / impact of research data

Open Research Data in HEP

• First steps towards data preservation, re-use and open access

Example of tools and services

• CERN Open Data Portal

The CERN Open Data Portal - Content

• CMS Primary Datasets (half the data collected in 2010, AOD files incl. all the information needed for analysis)

• CMS Derived Datasets
• CMS Tools for analysis
• ALICE reconstructed data
• ALICE analysis tools
• Coming: ATLAS and LHCb masterclasses

A virtuous circle of Open Data?

• research data publication
• discovery of / access to research data
• research data (re-)use
• metrics / impact of research data

Measuring the impact - qualitative

• Usability testing & user interviews to:
– Understand the needs of users
– Improve the usability of planned and developed services and tools

Measuring the impact - quantitative

• Impact indicators for the developed services and tools to see if and how they are used and can be improved such as
– Number of dataset/code submissions
– Number of citations to datasets/code

• Integration of more and more sources
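The two indicators named above (submissions and citations) could be tallied along the following lines. This is only a minimal sketch: the `Record` schema, its field names, and the sample values are hypothetical placeholders for illustration, not data or an interface from any of the services mentioned in this talk.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One published dataset or code package (hypothetical schema)."""
    identifier: str  # placeholder identifier, e.g. where a DOI would go
    kind: str        # "dataset" or "code"
    year: int        # year of submission
    citations: int   # citations counted so far


def submissions_per_year(records):
    """Number of submissions, keyed by (year, kind)."""
    return Counter((r.year, r.kind) for r in records)


def citations_per_kind(records):
    """Total citations, keyed by kind ("dataset" or "code")."""
    totals = Counter()
    for r in records:
        totals[r.kind] += r.citations
    return totals


# Made-up sample records, for illustration only.
records = [
    Record("example-1", "dataset", 2013, 4),
    Record("example-2", "dataset", 2014, 1),
    Record("example-3", "code", 2014, 2),
]

print(submissions_per_year(records))
print(citations_per_kind(records))
```

Keeping the raw records and deriving the counts from them, rather than storing only totals, makes it easier to add further sources later without changing the indicator functions.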

Sources

• Schäfer, Angela et al. (2011). Ten Tales of Drivers & Barriers in Data Sharing. ZENODO. doi:10.5281/zenodo.8308
• Gezelter, Dan (2009). What, exactly, is Open Science?. The OpenScience Project. http://www.openscience.org/blog/?p=269
• Saracevic, T. (2000). Digital Library Evaluation: Toward an Evolution of Concepts. Library Trends, 49(3), 350-369
• Corrall, S. & Brewerton, A. (1999). The new professional's handbook : your guide to information services management. London: Library Association Pub.
• Markless, S. & Streatfield, D. (2013). Evaluating the impact of your library. London: Facet.
• http://opendata.cern.ch/ and https://github.com/cernopendata/opendata.cern.ch
• Data policies:
– CMS: https://cms-docdb.cern.ch/cgi-bin/PublicDocDB/RetrieveFile?docid=6032
– ATLAS: https://twiki.cern.ch/twiki/pub/AtlasPublic/AtlasPolicyDocuments/A78_ATLAS_Data_Access_Policy.pdf
– LHCb: https://cds.cern.ch/record/1543410/files/LHCb-PUB-2013-003.pdf

Sources

• Data shown:
– ATLAS Collaboration (2013). Data from Figure 7 from: Measurements of Higgs boson production and couplings in diboson final states with the ATLAS detector at the LHC. HepData. http://doi.org/10.7484/INSPIREHEP.DATA.26B4.TY5F
– ATLAS Collaboration (2013). Data from Figure 7 from: Measurements of Higgs boson production and couplings in diboson final states with the ATLAS detector at the LHC. HepData. http://doi.org/10.7484/INSPIREHEP.DATA.RF5P.6M3K
– ATLAS Collaboration (2013). Data from Figure 7 from: Measurements of Higgs boson production and couplings in diboson final states with the ATLAS detector at the LHC. HepData. http://doi.org/10.7484/INSPIREHEP.DATA.A78C.HK44
– Dumont, B., Fuks, B., Wymant, C. (2014). MadAnalysis 5 implementation of CMS-SUS-13-011: search for stops in the single lepton final state at 8 TeV. http://doi.org/10.7484/INSPIREHEP.DATA.LR5T.2RR3