2012-03-28 Wf4Ever, preserving workflows as digital research objects
![Page 1: 2012 03-28 Wf4ever, preserving workflows as digital research objects](https://reader033.fdocuments.in/reader033/viewer/2022060108/554e748cb4c90545698b4c2c/html5/thumbnails/1.jpg)
Wf4Ever: Preserving workflows as digital Research Objects
EGI Community Forum 2012, Workflow Systems workshop
Leibniz Supercomputing Centre, Munich, 2012-03-28
Stian Soiland-Reyes, myGrid, University of Manchester
![Page 2: 2012 03-28 Wf4ever, preserving workflows as digital research objects](https://reader033.fdocuments.in/reader033/viewer/2022060108/554e748cb4c90545698b4c2c/html5/thumbnails/2.jpg)
My background
myExperiment - Web 3.0 virtual environment, library and social network for workflows
~5000 registered users
~2200 workflows
~21 different systems
Taverna - Scientific Workflow Management System
~85000 downloads
EU projects: SCAPE, BioVeL, HELIO, e-Lico, VPH-SHARE, EGI-INSPiRE, …
http://www.myexperiment.org/
http://www.taverna.org.uk/
![Page 3: 2012 03-28 Wf4ever, preserving workflows as digital research objects](https://reader033.fdocuments.in/reader033/viewer/2022060108/554e748cb4c90545698b4c2c/html5/thumbnails/3.jpg)
“A biologist would rather share their toothbrush than their gene name”
Mike Ashburner (and others), Professor in Dept of Genetics, University of Cambridge, UK
![Page 4: 2012 03-28 Wf4ever, preserving workflows as digital research objects](https://reader033.fdocuments.in/reader033/viewer/2022060108/554e748cb4c90545698b4c2c/html5/thumbnails/4.jpg)
“Facebook for Scientists” ...but different to Facebook!
A repository of research methods
A social network of people and things
A Social Virtual Research Environment
A probe into researcher behaviour
Open source (BSD) Ruby on Rails app
REST and SPARQL, Linked Data
Influenced BioCatalogue, MethodBox and SysMO-SEEK
myExperiment currently has 5378 members, 292 groups, 2273 workflows, 534 files and 217 packs
http://www.myexperiment.org/
![Page 6: 2012 03-28 Wf4ever, preserving workflows as digital research objects](https://reader033.fdocuments.in/reader033/viewer/2022060108/554e748cb4c90545698b4c2c/html5/thumbnails/6.jpg)
http://www.wf4ever-project.org/
Workflow Preservation
Research Objects
Provenance
Recommendation
Astronomy and Genomics
![Page 7: 2012 03-28 Wf4ever, preserving workflows as digital research objects](https://reader033.fdocuments.in/reader033/viewer/2022060108/554e748cb4c90545698b4c2c/html5/thumbnails/7.jpg)
Challenges
Wf4Ever: Preservation of scientific workflows in data-intensive science
» Scientific workflows enable automation of scientific methods and encourage best practices to be shared
» Workflows need to be preserved for
› Reuse, fundamental for incremental scientific development
› Method reproducibility, key for credit and publication
» Workflow preservation is complex!
» Heterogeneous types of information need to be aggregated, including workflows and related resources forming research objects
» Research objects need to be trusted and understandable n years from now
» Social aspects need to be addressed in order to support reuse in scientific communities
![Page 8: 2012 03-28 Wf4ever, preserving workflows as digital research objects](https://reader033.fdocuments.in/reader033/viewer/2022060108/554e748cb4c90545698b4c2c/html5/thumbnails/8.jpg)
Reusable. The key tenet of Research Objects is to support the sharing and reuse of data, methods and processes.
Repurposeable. Reuse may also involve the reuse of constituent parts of the Research Object.
Repeatable. There should be sufficient information in a Research Object to be able to repeat the study, perhaps years later.
Reproducible. A third party can start with the same inputs and methods and see if a prior result can be confirmed.
Replayable. Studies might involve single investigations that happen in milliseconds or protracted processes that take years.
Referenceable. If research objects are to augment or replace traditional publication methods, then they must be referenceable or citeable.
Revealable. Third parties must be able to audit the steps performed in the research in order to be convinced of the validity of results.
Respectful. Explicit representations of the provenance, lineage and flow of intellectual property.
The R.* dimensions
“Replacing the Paper: The Twelve Rs of the e-Research Record” on http://blogs.nature.com/eresearch/
![Page 9: 2012 03-28 Wf4ever, preserving workflows as digital research objects](https://reader033.fdocuments.in/reader033/viewer/2022060108/554e748cb4c90545698b4c2c/html5/thumbnails/9.jpg)
Forms of decay
Workflow Decay
• Service decay
  • Flux/decay/unavailability
• Data decay
  • Formats/ids/standards
• Infrastructure decay
  • Platform/resources
Experiment Decay
• Methodological changes
• New technologies
• New resources/components
• New data
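The workflow decay categories above can be illustrated with a minimal sketch. It assumes each workflow step records the service it calls and the data format it consumes. The `Step` class, the `check_decay` function and the example URLs are hypothetical and are not part of any Wf4Ever tool. The reachability probe is passed in as a function so the example runs without network access.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class Step:
    """One workflow step (hypothetical structure, for illustration only)."""
    name: str
    service_url: str   # endpoint the step calls
    data_format: str   # format the step expects as input


def check_decay(
    steps: List[Step],
    is_reachable: Callable[[str], bool],
    supported_formats: Set[str],
) -> Dict[str, List[str]]:
    """Return, per step name, the forms of decay detected for that step."""
    report: Dict[str, List[str]] = {}
    for step in steps:
        problems: List[str] = []
        # Service decay: the remote service is gone or unavailable.
        if not is_reachable(step.service_url):
            problems.append("service decay")
        # Data decay: the format is no longer supported.
        if step.data_format not in supported_formats:
            problems.append("data decay")
        report[step.name] = problems
    return report


# Usage with a fake probe: only the first endpoint still answers.
steps = [
    Step("fetch_sequences", "http://example.org/seq", "fasta"),
    Step("align", "http://example.org/gone", "legacy-v1"),
]
alive = {"http://example.org/seq"}
report = check_decay(steps, lambda url: url in alive, {"fasta"})
print(report)
```

In a real setting the probe would issue a request to each service. Infrastructure decay and experiment decay are not covered by this sketch.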
![Page 10: 2012 03-28 Wf4ever, preserving workflows as digital research objects](https://reader033.fdocuments.in/reader033/viewer/2022060108/554e748cb4c90545698b4c2c/html5/thumbnails/10.jpg)
Preservation, Conservation, Recreating
Preserving: Archived Record, Fixed Snapshots, Review, Rerun & Replay
Conserving: Active Instrument, Live, Rerun & Reuse, Repair & Restore
Recreating: Archived Record, Active Instrument, Live, Rebuild, Recycle, Repurpose
![Page 11: 2012 03-28 Wf4ever, preserving workflows as digital research objects](https://reader033.fdocuments.in/reader033/viewer/2022060108/554e748cb4c90545698b4c2c/html5/thumbnails/11.jpg)
Workflow Decay: decay at different abstraction levels
[Figure: workflow abstraction levels, labelled "Flux" (three times) and "Redo". Source: http://www.gridworkflow.org/kwfgrid/gwes/docs/]
![Page 12: 2012 03-28 Wf4ever, preserving workflows as digital research objects](https://reader033.fdocuments.in/reader033/viewer/2022060108/554e748cb4c90545698b4c2c/html5/thumbnails/12.jpg)
Research objects
![Page 13: 2012 03-28 Wf4ever, preserving workflows as digital research objects](https://reader033.fdocuments.in/reader033/viewer/2022060108/554e748cb4c90545698b4c2c/html5/thumbnails/13.jpg)
Research Objects as Social Objects
![Page 14: 2012 03-28 Wf4ever, preserving workflows as digital research objects](https://reader033.fdocuments.in/reader033/viewer/2022060108/554e748cb4c90545698b4c2c/html5/thumbnails/14.jpg)
Research Object model core (simplified)
http://purl.org/wf4ever/ro#
[Figure: classes ro:ResearchObject, ro:Manifest, ro:Resource, ro:AggregatedAnnotation and wfdesc:Workflow; properties ore:aggregates, ore:isDescribedBy and ro:annotatesAggregatedResource]
Note: This figure shows a simplified view of the RO core.
RO specification: http://wf4ever.github.com/ro/
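The terms in the simplified RO core can be read as a handful of RDF statements. The sketch below writes them out as N-Triples using only the Python standard library. The direction of each property is an assumption based on the usual reading of these terms (a research object aggregates resources and is described by a manifest, and an annotation points at the resource it annotates). The function names, the example URIs and the `.ro/manifest.rdf` path are hypothetical.

```python
from typing import List, Tuple

# Namespaces: ro and wfdesc as given on the slides; ORE and RDF are the
# standard namespaces for those vocabularies.
RO = "http://purl.org/wf4ever/ro#"
WFDESC = "http://purl.org/wf4ever/wfdesc#"
ORE = "http://www.openarchives.org/ore/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

Triple = Tuple[str, str, str]


def research_object_triples(ro_uri: str, workflow_uri: str,
                            annotation_uri: str) -> List[Triple]:
    """Describe one research object aggregating one annotated workflow."""
    manifest_uri = ro_uri + ".ro/manifest.rdf"  # hypothetical location
    return [
        (ro_uri, RDF_TYPE, RO + "ResearchObject"),
        (manifest_uri, RDF_TYPE, RO + "Manifest"),
        (ro_uri, ORE + "isDescribedBy", manifest_uri),
        # The aggregated workflow is both a resource and a workflow.
        (ro_uri, ORE + "aggregates", workflow_uri),
        (workflow_uri, RDF_TYPE, RO + "Resource"),
        (workflow_uri, RDF_TYPE, WFDESC + "Workflow"),
        # The annotation is aggregated too and points at the workflow.
        (ro_uri, ORE + "aggregates", annotation_uri),
        (annotation_uri, RDF_TYPE, RO + "AggregatedAnnotation"),
        (annotation_uri, RO + "annotatesAggregatedResource", workflow_uri),
    ]


def to_ntriples(triples: List[Triple]) -> str:
    """Serialise URI-only triples as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


triples = research_object_triples(
    "http://example.org/ro/1/",
    "http://example.org/ro/1/workflow.t2flow",
    "http://example.org/ro/1/.ro/annotations/a1",
)
print(to_ntriples(triples))
```

The full model in the RO specification has more detail than this (for example, annotation bodies), so the output is only a reading aid for the figure.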
![Page 15: 2012 03-28 Wf4ever, preserving workflows as digital research objects](https://reader033.fdocuments.in/reader033/viewer/2022060108/554e748cb4c90545698b4c2c/html5/thumbnails/15.jpg)
Research Object model core http://purl.org/wf4ever/ro#
![Page 16: 2012 03-28 Wf4ever, preserving workflows as digital research objects](https://reader033.fdocuments.in/reader033/viewer/2022060108/554e748cb4c90545698b4c2c/html5/thumbnails/16.jpg)
RO model: Workflow Description http://purl.org/wf4ever/wfdesc#
![Page 17: 2012 03-28 Wf4ever, preserving workflows as digital research objects](https://reader033.fdocuments.in/reader033/viewer/2022060108/554e748cb4c90545698b4c2c/html5/thumbnails/17.jpg)
Workflow Provenance (wfprov) http://purl.org/wf4ever/wfprov#
![Page 18: 2012 03-28 Wf4ever, preserving workflows as digital research objects](https://reader033.fdocuments.in/reader033/viewer/2022060108/554e748cb4c90545698b4c2c/html5/thumbnails/18.jpg)
Technical infrastructure
• Models: Semantic Web encoding
  • Research Object
  • Annotation
  • Provenance
  • Evolution and Versioning
• Services: Web APIs, REST services
  • Foundational, Extension, User
  • APIs, Architecture
• Principles
  • Map into standards
  • Adopt standards
  • Lightweight components
• Ecosystem
  • Command line
  • Portal
  • Third party systems
![Page 19: 2012 03-28 Wf4ever, preserving workflows as digital research objects](https://reader033.fdocuments.in/reader033/viewer/2022060108/554e748cb4c90545698b4c2c/html5/thumbnails/19.jpg)
Services: The Wf4Ever Proposal
[Figure: service layers: Foundation Services, Extension Services, User Clients]
![Page 20: 2012 03-28 Wf4ever, preserving workflows as digital research objects](https://reader033.fdocuments.in/reader033/viewer/2022060108/554e748cb4c90545698b4c2c/html5/thumbnails/20.jpg)
Wf4Ever Reference Implementation (Prototype, Dec 2011)
[Figure: architecture diagram]
Layers: Access & Usage Clients; Lifecycle Services; Data Management & Analysis Services; Storage Services
Components: Stability Evaluation, Completeness Evaluation, Recommender, RO Portal, RO Manager Tool, RO Digital Library, ROBox Dropbox Client, Taverna Workflow Mgmt System
![Page 21: 2012 03-28 Wf4ever, preserving workflows as digital research objects](https://reader033.fdocuments.in/reader033/viewer/2022060108/554e748cb4c90545698b4c2c/html5/thumbnails/21.jpg)
Year 1 (Dec 2010 to Dec 2011) Roadmap
» Exploration (2011)
› Problem specification and requirements identification
› Better understanding of workflow preservation needs from the domains (what does it mean to preserve a scientific workflow?)
› Proofs of concept
› Preliminary models, components, and integrated reference implementation
› Result identification
![Page 22: 2012 03-28 Wf4ever, preserving workflows as digital research objects](https://reader033.fdocuments.in/reader033/viewer/2022060108/554e748cb4c90545698b4c2c/html5/thumbnails/22.jpg)
Year 2 (Dec 2011 to Dec 2012) Roadmap
» Realization/validation (2012)
› Validate the models, architectures and software in practice
› Distributed components with different access/security arrangements, forming REST APIs and specifications
› RO Content Campaign: Generate 1000s of ROs
› First productization phase: Stable releases of models and reference implementation
› Decay monitoring and notification (why is my workflow no longer stable?), reacting to decay, attribution and credit support beyond recommendation, detailed use of provenance
› Execution and interoperability support (SHIWA integration)
![Page 23: 2012 03-28 Wf4ever, preserving workflows as digital research objects](https://reader033.fdocuments.in/reader033/viewer/2022060108/554e748cb4c90545698b4c2c/html5/thumbnails/23.jpg)
Year 3 (Dec 2012 to Dec 2013) Roadmap
» Exploitation (2013)
› Final productization phase
› Deployment in user environments and systems, enhanced with workflow preservation capabilities
› RO-enabled myExperiment
› RO-enabled Galaxy
› RO-enabled Dataverse
› … and more!
› Deployment with publishers, e.g. Elsevier, Digital Science, GigaScience
![Page 24: 2012 03-28 Wf4ever, preserving workflows as digital research objects](https://reader033.fdocuments.in/reader033/viewer/2022060108/554e748cb4c90545698b4c2c/html5/thumbnails/24.jpg)
Collaborations and impact
» SHIWA – Sharing Interoperable Workflows
» Publishers/journals: Elsevier, GigaScience (by BGI)
» OpenPHACTS (nanopublications)
» SCAPE (dataset preservation)
» BioVeL (biodiversity - species preservation!)
» Dataverse (data repository)
» Galaxy (workflow system for genomics)
» GenomeSpace (data integration platform)
![Page 25: 2012 03-28 Wf4ever, preserving workflows as digital research objects](https://reader033.fdocuments.in/reader033/viewer/2022060108/554e748cb4c90545698b4c2c/html5/thumbnails/25.jpg)
Thank you!
Any Questions?
http://www.wf4ever-project.org/
This work is licensed under the Creative Commons Attribution 3.0
Unported License. To view a copy of this license, visit
http://creativecommons.org/licenses/by/3.0/ or send a letter to Creative
Commons, 444 Castro Street, Suite 900, Mountain View, California,
94041, USA.