Data Repositories and Science Gateways for Open Science
Presenter: Roberto Barbera – UNICT and INFN
EGI Community Forum Bari – 11 November 2015
2
Outline
Introductory concepts, definitions and driving considerations
A viable approach to Open Science
Summary and conclusions
3
The Scientific Method
• Examples of IR:
  • Classical Mechanics
  • Newton’s Gravitation Theory
• Examples of DR:
  • General Relativity
  • Standard Model of Particle Physics
G. Galilei
4
The Pillars of the Scientific Method
• Repeatability
  • The closeness of agreement between independent results obtained with the same method on identical test material, under the same conditions (same operator, same apparatus, same laboratory and after short intervals of time)
  • Affected by random errors
• Reproducibility
  • The closeness of agreement between independent results obtained with the same method on identical test material but under different conditions (different operators, different apparatus, different laboratories and/or after different intervals of time)
  • Affected by systematic errors
Is science really reproducible?
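The two definitions above can be illustrated with a small simulation. This is a minimal sketch, not part of the original talk: the "labs", their biases and the noise level are made-up numbers, and the two spread estimates are simplified (a pooled within-lab standard deviation for repeatability, the standard deviation of all results across labs for reproducibility).

```python
import random
import statistics


def measure(true_value, lab_bias, noise_sd, n, rng):
    """Simulate n results: true value + fixed lab offset + random noise."""
    return [true_value + lab_bias + rng.gauss(0.0, noise_sd) for _ in range(n)]


def repeatability_sd(runs_by_lab):
    """Pooled within-lab spread: same conditions, random errors only."""
    variances = [statistics.variance(runs) for runs in runs_by_lab]
    return (sum(variances) / len(variances)) ** 0.5


def reproducibility_sd(runs_by_lab):
    """Spread of all results across labs: different conditions,
    so lab-to-lab systematic offsets are included as well."""
    pooled = [x for runs in runs_by_lab for x in runs]
    return statistics.stdev(pooled)


rng = random.Random(42)                 # fixed seed so the run is repeatable
lab_biases = [-0.5, 0.0, 0.4, 0.9]      # hypothetical systematic offsets per lab
runs_by_lab = [measure(10.0, bias, 0.1, 50, rng) for bias in lab_biases]

r = repeatability_sd(runs_by_lab)
R = reproducibility_sd(runs_by_lab)
print(f"repeatability sd:   {r:.3f}")
print(f"reproducibility sd: {R:.3f}")
```

With these assumed numbers the across-lab spread comes out several times larger than the within-lab spread, which is the point of the slide: agreement inside one laboratory says little about agreement between laboratories.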
5
Challenges in irreproducible research (http://www.nature.com/nature/focus/reproducibility/index.html)
6
The “reproducibility crisis”
Out of 18 microarray papers, results from 10 could not be reproduced
1. Ioannidis et al., 2009. Repeatability of published microarray gene expression analyses. Nature Genetics 41: 14
2. Science publishing: The trouble with retractions http://www.nature.com/news/2011/111005/full/478026a.html
3. Bjorn Brembs: Open Access and the looming crisis in science https://theconversation.com/open-access-and-the-looming-crisis-in-science-14950
7
Repeatability and Reproducibility are not everything
8
How e-Infrastructures support the (e-)Scientific Method
Diagram labels: Data Infrastructures; Open Access Doc. Repos.; Data Repos.; Semantic-web enrichment of linked data; Data preservation; HTC/HPC Clusters, Grids, Clouds
Challenge: «walk» across the knowledge path both ways
Open Science
10
An INFN approach to Open Science: the “grand” view
Digital Repository of Research Products (pilot: www.openaccessrepository.it)
Diagram labels: arXiv, CNR S&T DL, CINECA, VQR, INFN Multimedia, ORCID, INFN Gray Lit., SCOAP3
SINGLE - MANDATORY - DEPOSIT
SCIENCE PRODUCTS REPRODUCIBILITY
The INFN Open Access Repository (www.openaccessrepository.it)
papers
data
Automatic ingestion in place from:
federated authentication
12
Alternative reputation systems: possibility to add researcher IDs
13
Examples of document and data resources
Data stored on:
14
Example of software resources: the ALICE Virtual Research Environment
15
Example of research “package”
16
The OAR Knowledge Workflow
17
The OAR Knowledge Workflow: ALEPH data search & discovery
18
1. From the OAR it is possible to select an “analysis” as easily as any other resource in the archive
The OAR Knowledge Workflow: ALEPH “packages” inspection
2. Clicking on RUN PAGE, the researcher can either reproduce or extend that particular analysis using a Catania Science Gateway
19
The OAR Knowledge Workflow: ALEPH data analysis (1/2)
The Science Gateway collects from the OAR the metadata associated with the dataset(s) needed to run that particular analysis, and lets the user browse them
20
The OAR Knowledge Workflow: ALEPH data analysis (2/2)
Data are retrieved from
Using the JSAGA adaptor for all OCCI-compliant cloud middleware, the Science Gateway starts a dedicated VM already configured with all the experiment software
Both the CHAIN-REDS Cloud Testbed and the EGI Federated Cloud can be used as e-Infrastructures
Jobs run both on and
21
Remember: repeatability and reproducibility are not everything
Reusability and «extensibility» matter!
22
1. From within the CHAIN-REDS Science Gateway, entitled researchers can start VMs already configured to re-use/extend ALICE data analyses
2. The VMs are deployed both on the CHAIN-REDS Cloud Testbed and on the EGI Federated Cloud using the features of the EGI AppDB
Reusability of ALICE data with the CHAIN-REDS Science Gateway (1/3)
23
Reusability of ALICE data with the CHAIN-REDS Science Gateway (2/3)
1. The VM is available for a customizable amount of time, during which the user has full access to the dataset(s), analysis algorithm(s) and source code(s) of the experiment
2. The user can access the VM using different protocols (e.g., SSH, VNC); clicking on the SSH or VNC icons the user can directly access the VM instantiated on the cloud from within the Science Gateway
24
Reusability of ALICE data with the CHAIN-REDS Science Gateway (3/3)
New stable analyses (and their results), generated by running the VM, may be registered in the OAR (with DOIs) to further extend the analysis catalogue shared within the Virtual Research Community
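The idea of extending the analysis catalogue can be sketched as a small data model. Everything below is hypothetical and only meant as an illustration: the class names, the fields, the DOIs (under a made-up "10.0000" prefix) and the VM image name are assumptions, not the actual OAR schema or API. The sketch shows how a new analysis that records which earlier analysis it extends lets a reader walk back along the chain of results.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class AnalysisPackage:
    """Hypothetical record for a research 'package' (not the real OAR schema)."""
    doi: str
    title: str
    dataset_dois: Tuple[str, ...]
    vm_image: str
    author_orcids: Tuple[str, ...]
    extends: Optional[str] = None   # DOI of the analysis this one builds on


class Catalogue:
    """In-memory stand-in for the shared analysis catalogue."""

    def __init__(self) -> None:
        self._records: Dict[str, AnalysisPackage] = {}

    def register(self, package: AnalysisPackage) -> None:
        if package.doi in self._records:
            raise ValueError(f"DOI already registered: {package.doi}")
        if package.extends is not None and package.extends not in self._records:
            raise KeyError(f"unknown parent analysis: {package.extends}")
        self._records[package.doi] = package

    def lineage(self, doi: str) -> List[str]:
        """Return the DOIs from the given analysis back to the original one."""
        chain: List[str] = []
        current: Optional[str] = doi
        while current is not None:
            package = self._records[current]
            chain.append(package.doi)
            current = package.extends
        return chain


catalogue = Catalogue()
original = AnalysisPackage(
    doi="10.0000/hypothetical-analysis-1",          # made-up DOI
    title="Original analysis",
    dataset_dois=("10.0000/hypothetical-dataset-1",),
    vm_image="experiment-vm-image",                 # made-up image name
    author_orcids=("0000-0002-1825-0097",),         # ORCID's documented sample iD
)
extended = AnalysisPackage(
    doi="10.0000/hypothetical-analysis-2",
    title="Extended analysis",
    dataset_dois=original.dataset_dois,
    vm_image=original.vm_image,
    author_orcids=original.author_orcids,
    extends=original.doi,
)
catalogue.register(original)
catalogue.register(extended)
print(catalogue.lineage(extended.doi))
```

Storing the parent DOI on each record is one simple design choice for making reuse traceable; a real repository would more likely express such links through its metadata schema.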
25
“Whose science is this?”
How to provide authorship to research products?
26
ORCID (www.orcid.org – becoming a “de facto” standard)
More than 1.74 million ORCID IDs so far
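A repository that lets researchers add an ORCID iD can check it locally before storing it, because the last character of an ORCID iD is a check digit computed with the ISO 7064 MOD 11-2 algorithm. The sketch below is illustrative and not taken from the talk; the function names are ours, and the test iD is the fictitious sample used in ORCID's own documentation.

```python
def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the format (16 characters once hyphens are dropped) and the check digit."""
    compact = orcid.replace("-", "").upper()
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_digit(compact[:15]) == compact[15]


print(is_valid_orcid("0000-0002-1825-0097"))   # sample iD from ORCID's documentation
print(is_valid_orcid("0000-0002-1825-0098"))   # same iD with a wrong check digit
```

A check like this only catches typing errors; it does not prove that the iD is registered or that it belongs to the person depositing the work.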
27
ORCID: search & link your works in/from DataCite
28
ORCID: add your research products to your profile
29
Summary and conclusions
The Open Science vision can be implemented only if the “openness” paradigm becomes pervasive in research
The reproducibility of science outputs, as well as their re-usability and extensibility, is key to walking the “knowledge path” in both directions
The INFN Open Access Repository is a pilot knowledge preservation repository meant to serve both researchers and citizen scientists
What makes the INFN OAR different from other repositories is:
• Its capability to connect to Science Gateways and exploit cloud resources worldwide to easily reproduce/extend scientific analyses
• Its capability to provide full authorship (and hence credit, reputation and visibility) for all products of a scientist; this is key for a correct evaluation of research (…and of researchers)
30
Authors
R. Barbera (University of Catania and INFN, Italy)
S. Bianco (INFN LNF, Italy)
T. Boccali (INFN Pisa, Italy)
C. Carrubba (University of Catania, Italy)
G. Inserra (University of Catania, Italy)
M. Maggi (INFN Bari, Italy)
D. Menasce (INFN Milano Bicocca, Italy)
R. Ricceri (University of Catania, Italy)
31
Thank you!