Incentives for Open Science:
Attribution, Recognition,
Collaboration
Sünje Dallmeier-Tiessen (CERN)
@European Research Council, Sept 2014
Open Science
Royal Society, 1665 and 2012
Why we care about Open Access
and Open Data
Is it really happening?
Datasets are…
• Not shared or lost
• Difficult to discover and access
• Difficult to understand > context missing
Nature, 2009
Current state
• Advent of data repositories, data journals and data article types
• But no effective way to link between datasets, software and articles
• No widely used method to cite datasets
• No way to attribute a dataset (and any other scholarly material) to a
researcher unambiguously
How do I find the data used in the paper I am
reading? Has it been processed? If so - how?
This dataset is great: has the author shared more?
Why should I bother sharing my data? No one will
see it.
How persistent identifiers help
• Persistent, trustworthy link
• Who wrote/shared this article/dataset?
• Which dataset is supplementary to this article?
• Who cited this article, dataset…?
• Unique identification globally
• Used by publishers, data centers, libraries,
funders
• “Openness” allows third-party services to build on them
• Discovery services
• Metrics
Digital Object Identifiers (DOI names) offer a solution
Most widely used identifier for scientific articles
Researchers, authors, publishers know how to use them
Put datasets on the same playing field as articles
Dataset
Yancheva et al (2007). Analyses
on sediment of Lake Maar.
PANGAEA.
doi:10.1594/PANGAEA.587840
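A DOI name like the one in this citation becomes a clickable, persistent link once it is prefixed with the public resolver at https://doi.org/. The following is a minimal Python sketch of that step, not something taken from the slides: the function name doi_to_url and the list of accepted prefixes are assumptions chosen for illustration.

```python
from urllib.parse import quote

# Public DOI resolver; every registered DOI name resolves under this prefix.
DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Turn a DOI name (optionally written with a 'doi:' prefix) into a resolver URL.

    Hypothetical helper for illustration only.
    """
    name = doi.strip()
    # Strip common ways of writing a DOI so that only the bare name remains.
    for prefix in ("doi:", "https://doi.org/", "http://dx.doi.org/"):
        if name.lower().startswith(prefix):
            name = name[len(prefix):]
            break
    # DOI names start with the directory indicator "10." and contain a "/"
    # separating the registrant prefix from the suffix.
    if not name.startswith("10.") or "/" not in name:
        raise ValueError(f"not a DOI name: {doi!r}")
    # Percent-encode characters that are unsafe in a URL path, keeping "/".
    return DOI_RESOLVER + quote(name, safe="/")


print(doi_to_url("doi:10.1594/PANGAEA.587840"))
```

Because the resolver redirects to wherever the dataset currently lives, the citation keeps working even if the repository changes its own URLs.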
URLs are not persistent
(e.g. Wren JD: URL decay in MEDLINE: a 4-year follow-up study. Bioinformatics. 2008 Jun 1;24(11):1381-5).
DOI names for citations
Slides by courtesy of Dr. Jan Brase, DataCite
E. coli outbreak in Germany:
sequence data immediately published with a DOI
so that others could use the results
Force11: Data Citation Principles
Author, Publication Year, Dataset Title, Data Repository,
Version, Unique Identifier
- should include a persistent method for identification that
is machine actionable and globally unique
- should facilitate identification of, access to, and
verification of the specific data that support a claim.
www.force11.org
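The citation elements listed above can be held in a small record and rendered as a reference string. This is a rough Python sketch under stated assumptions: the class name DataCitation, the field names, and the "Author (Year). Title. Repository. Version. Identifier" layout are illustrative choices, not a format prescribed by the principles. The example values are the dataset citation shown earlier in these slides, which carries no version.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataCitation:
    """Hypothetical record holding the elements of a data citation."""
    author: str
    year: int
    title: str
    repository: str
    identifier: str            # persistent, globally unique, e.g. a DOI name
    version: Optional[str] = None

    def format(self) -> str:
        # Assumed layout; the principles name the elements, not their order
        # or punctuation.
        parts = [f"{self.author} ({self.year})", self.title, self.repository]
        if self.version:
            parts.append(f"Version {self.version}")
        parts.append(self.identifier)
        return ". ".join(parts)


example = DataCitation(
    author="Yancheva et al",
    year=2007,
    title="Analyses on sediment of Lake Maar",
    repository="PANGAEA",
    identifier="doi:10.1594/PANGAEA.587840",
)
print(example.format())
```

Keeping the identifier as its own field, rather than buried in free text, is what makes the citation machine actionable.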
ACROSS DISCIPLINES Some examples
Geoscience
www.pangaea.de
Archaeology
www.adc.ac.uk
Data citation including a DOI
www.adc.ac.uk
AN EXAMPLE STORY High-Energy Physics
A collaborative discovery: Higgs
CERN, 2013
Research data on our community platform
www.inspirehep.net
Counting reuse: citations to data
www.inspirehep.net
Tracking reuse: where is my impact?
www.inspirehep.net
Referenced Data
arXiv:1311.1113
Who gets the credit for sharing data?
CERN, 2013
Who gets the credit for sharing data?
Cranmer, 2014
Using author IDs for attributing credit
Excerpt from publication list on
Facilitates easy reporting:
“one click” to all your research output
We can list your data here, regardless of whether
it sits in a repository or a data journal
Identifiers for authors
• They help distinguish individual scientists
• Who is who
• Who wrote, shared, published data, software, papers…
• Similar names, changing names, name variations
• High mobility in a global research world – cannot rely on
affiliation, email address etc.
Trustworthy author identifiers:
• Unique on a global scale
• Interoperable, open
• Across disciplinary boundaries
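ORCID iDs, the author identifier named later in these slides, are one example of such an identifier. Each iD is 16 characters long and ends in a check character computed with ISO 7064 MOD 11-2, so a service can catch most typing errors before attributing a dataset to the wrong person. The Python sketch below illustrates that check. The function names are hypothetical, and 0000-0002-1825-0097 is the sample iD used in ORCID's own documentation, not an identifier from this talk.

```python
def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    # A check value of 10 is written as the letter X.
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the iD has 16 characters and a matching check character.

    This only validates the checksum; it does not confirm that the iD has
    actually been registered to anyone.
    """
    compact = orcid.replace("-", "").upper()
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_digit(compact[:15]) == compact[15]


print(is_valid_orcid("0000-0002-1825-0097"))
```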
Persistent Identifiers for Open Access and
Open Data
To make Open Science “a reality”, we need the tools
to incentivize it:
Make data a “first class citizen”
• As a citable object
• Attributable to the right person
What’s next
• Enhance existing (community) services to allow
seamless workflows that incentivize Open
Science and Open Data
• Enable “integrators” to easily implement such
services
• Provide better support to researchers, i.e. how to
cite data, how to get and use an ORCID
• Make “data products” count: data articles,
datasets in a repository, analysis code,
documentation, data reviews, data citations, …