Momentum of Open Research Data: now in 5-d!
Heather Piwowar @researchremix Postdoc with NESCent and Dryad, at Duke and UBC
SFU Research Data Repository Project Launch
October 2012
Momentum of open research data:
now in 5-D!
some photos NC, SA
http://www.metmuseum.org/toah/ho/09/euwf/ho_24.45.1.htm
http://www.flickr.com/photos/jsmjr/62443357/
http://www.flickr.com/photos/camilleharrington/3587294608/
http://www.flickr.com/photos/rkuhnau/3318245976/
http://www.flickr.com/photos/conformpdx/1796399674/
http://www.flickr.com/photos/rkuhnau/3317418699/
http://www.flickr.com/photos/zemlinki/261617721/
http://www.flickr.com/photos/tracenmatt/3020786491/
http://www.flickr.com/photos/the-o/2078239333/
http://www.flickr.com/photos/ryanr/142455033/
http://www.flickr.com/photos/75166820@N00/5318468/
MOMENTUM
5 dimensions
- repositories
- research
- policies
- tools
- environment
Discipline repository
Datatype repository
Journal repository
Institutional repository
...
Institutional repository: https://circle.ubc.ca/
Discipline repository: http://datadryad.org/
Datatype repository: http://www.ncbi.nlm.nih.gov/genbank/ (example: http://www.ncbi.nlm.nih.gov/nuccore/192496?report=genbank )
Journal supplementary information: http://www.nature.com/nature/journal/v429/n6990/suppinfo/nature02564.html
Lab website: http://www.bx.psu.edu/~ross/dataset/DatasetHome.html
"Data paper": http://www.biomedcentral.com/bmcresnotes/
Catch-all data repository: http://figshare.com/
What’s best? It depends. We don’t know.
It depends.
http://www.flickr.com/photos/jo-h/2688026447/
- repositories
- research
- policies
- tools
- environment
Citation boost
Gleditsch et al. 2003. Posting Your Data: Will You Be Scooped or Will You Be Famous?, International Studies Perspectives 4(1): 89–97.
Piwowar et al. 2007. Sharing detailed research data is associated with increased citation rate. PLoS ONE.
Ioannidis et al. 2009. Repeatability of published microarray gene expression analyses. Nature Genetics 41: 149–155.
Pienta et al. 2010. NSR Social Science Secondary Use. Michigan IR.
Henneken et al. 2011. Linking to Data – Effect on Citation Rates in Astronomy. ESO.
Sears 2011. Data Sharing Effect on Article Citation rate in Paleoceanography. AGU.
~70% in multivariate analysis
Amount shared and withheld
[Figure: proportion of articles with datasets found in GEO or ArrayExpress (y-axis, 0.05 to 0.35) by year article published (x-axis, 2000 to 2009)]
Proportion of articles with shared datasets, by year
Across time
19%
Piwowar and Chapman. Journal of Informetrics 2010
[Figure: odds ratios (axis 0.25 to 4.00) from a multivariate nonlinear regression with interactions. Factors shown: OA journal & previous GEO-AE sharing; amount of NIH funding; journal impact factor and policy; higher ed in USA; cancer & humans]
Amount of reuse
Type of reuse
Cost/benefit
Traditional research funding: $400k = 16 papers
At Dryad cost levels, at similar levels of reuse to GEO, $400k would facilitate 1000 reuse papers.
A stellar scientific ROI is in easy reach.
2) more impact per funding dollar
Piwowar, Vision, Whitlock (2011) Data archiving is a good investment. Nature 473, 285
http://researchremix.wordpress.com/2011/05/19/nature-letter/
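The cost/benefit comparison above reduces to simple arithmetic. A minimal sketch in Python, using only the two figures quoted on the slide ($400k for 16 papers, and $400k for 1000 reuse papers). The names `cost_per_paper`, `BUDGET_USD`, and the rest are illustrative, and the per-paper costs and the ratio are derived here, not stated in the source.

```python
# Figures quoted on the slide (Piwowar, Vision, Whitlock 2011).
BUDGET_USD = 400_000
TRADITIONAL_PAPERS = 16   # papers from $400k of traditional research funding
REUSE_PAPERS = 1000       # reuse papers facilitated by $400k at Dryad cost levels


def cost_per_paper(budget_usd: float, papers: int) -> float:
    """Return the average funding spent per resulting paper."""
    if papers <= 0:
        raise ValueError("papers must be positive")
    return budget_usd / papers


# Derived values (assumption: a plain average, ignoring any other costs).
traditional_cost = cost_per_paper(BUDGET_USD, TRADITIONAL_PAPERS)
archiving_cost = cost_per_paper(BUDGET_USD, REUSE_PAPERS)
papers_ratio = REUSE_PAPERS / TRADITIONAL_PAPERS

print(f"Traditional funding: ${traditional_cost:,.0f} per paper")
print(f"Data archiving:      ${archiving_cost:,.0f} per reuse paper")
print(f"Papers per dollar:   {papers_ratio:.1f}x")
```

On these numbers, the same budget works out to $25,000 per paper under traditional funding versus $400 per reuse paper, a 62.5-fold difference in papers per dollar.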
- repositories
- research
- policies
- tools
- environment
Journal requirements
“An inherent principle of publication is that others should be able to replicate and build upon the authors' published claims. Therefore, a condition of publication in a Nature journal is that authors are required to make materials, data and associated protocols available in a publicly accessible database …”
http://www.nature.com/authors/editorial_policies/availability.html
http://www.nature.com/nature/journal/v453/n7197/index.html
journal data sharing policy
JDAP: << Journal >> requires, as a condition for publication, that data supporting the results in the paper should be archived in an appropriate public archive, such as << list of approved archives here >>. Data are important products of the scientific enterprise, and they should be preserved and usable for decades in the future. Authors may elect to have the data publicly available at time of publication, or, if the technology of the archive allows, may opt to embargo access to the data for a period up to a year after publication. Exceptions may be granted at the discretion of the editor, especially for sensitive information such as human subject data or the location of endangered species.
High-impact journals tend to have a strong data-sharing policy
Articles published in journals with a strong data-sharing policy are more likely to have publicly available datasets
NSF data management requirement
NSF biosketch
[Chart: response distribution, 0% to 50%, on a scale from "Strongly disagree" through "Neutral" to "Strongly agree"]
Pleased others can build on my work more easily
[Chart: same axes]
Pleased I will get more citations
Do not publicize
[Chart: same axes]
Pleased it will be valued by my funder
[Chart: same axes]
Pleased it will be valued by my promotion or tenure committee
http://www.nsf.gov/pubs/policydocs/pappguide/nsf08_1/gpg_2.jsp
NSF Biosketch starting January:
Publications to Products
- repositories
- research
- policies
- tools
- environment
DataUp
DMP Tool
RunMyCode
http://www.flickr.com/photos/pixscapes/4331070047
In 2009, 116 articles cited ORNL DAAC data.
Finding these articles took 70-80 hours
across at least 12 resources, all chosen from a deep understanding of this specific research domain;
then the full text of every hit was manually reviewed.
Valerie Enriquez interview with James Kidder
http://openwetware.org/wiki/DataONE:Notebook/Reuse_of_repository_data
http://www.flickr.com/photos/quinnanya/2055471833
altmetrics.org/tools
ImpactStory
altmetric.com
PLoS article-level metrics
Reader Meter
Science Card
CC-BY-NC by maniacyak on flickr: http://www.flickr.com/photos/maniacyak/3432589472
impact flavour
http://dx.doi.org/10.5061/dryad.18
http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/3131/utilization
ImpactStory.org
- repositories
- research
- policies
- tools
- environment
Open Access
Reproducibility
Big Data
- repositories
- research
- policies
- tools
- environment
GET EXCITED and MAKE THINGS
Open up your data while you are doing it :)
http://www.flickr.com/photos/myklroventine/892446624/
Thank you!
Todd Vision: PI of Dryad
Jason Priem: cofounder of ImpactStory
Also: Mike Whitlock, Jonathan Carlson, Estephanie Sta Maria
The open science online community and those who release their articles, datasets and photos openly.
blog: ResearchRemix.wordpress.com
@researchremix