The Astronomy challenge: How can workflow preservation help?

The Astronomy challenge: How can workflow preservation help? Susana Sánchez, Jose Enrique Ruíz, Lourdes Verdes-Montenegro, Julian Garrido, Juan de Dios Santander-Vela and Wf4Ever team. Instituto de Astrofísica de Andalucía – CSIC SHIWA Summer School. 3 July 2012




Page 1

The Astronomy challenge: How can workflow preservation help?

Susana Sánchez, Jose Enrique Ruíz, Lourdes Verdes-Montenegro, Julian Garrido, Juan de Dios Santander-Vela and Wf4Ever team.

Instituto de Astrofísica de Andalucía – CSIC

SHIWA Summer School. 3 July 2012

Page 2

Summary

» Introduction to AMIGA group

» The astronomy challenge and context of Wf4Ever project

» How Wf4Ever tools can help astronomers

» Our astronomy use case

Page 3

AMIGA Group

Introduction to AMIGA group

AMIGA: Analysis of the interstellar Medium of Isolated GAlaxies

An international collaboration coordinated from the IAA-CSIC

P.I. Lourdes Verdes-Montenegro

http://amiga.iaa.es

• Statistical baseline of isolated galaxies to compare with the behavior of galaxies in denser environments

• Multi-λ study of ~1000 galaxies:

• Need for intensive and complex analysis of multidimensional data

Page 4

Past (or current?) situation

The astronomy challenge and context of Wf4Ever project

Analysis

Python

Science Ready Data

Page 5

The astronomy challenge and context of Wf4Ever project

LHC – Tier 1

Antenna Correlation

http://www.almaobservatory.org

Page 6

A disruptive change in the methodology is needed

The astronomy challenge and context of Wf4Ever project

GRID

Super Computer

Cloud

Analysis

Tools for:

• Sharing

• Inspecting

• Visualizing

• Discovering

• Searching

Page 7

The efficient use of the data and the production of reliable science.

The astronomy challenge and context of Wf4Ever project

» Keywords: efficient use of data and reliable science

› Scientific Workflows:

• Enable automation and expose the flow of scientific methods

• Encourage best practices in packaging the experiment

• Provide a way to share the experiment

› But more is needed:

• Reusability, fundamental for incremental scientific development

• Reproducibility, key for reliable science

Preserve the data and the scientific method

Workflow preservation

Page 8

The astronomy challenge and context of Wf4Ever project

Wf4Ever project

Wf4Ever - Preservation of scientific workflows in data-intensive science

Page 9

The aim of Wf4Ever

The astronomy challenge and context of Wf4Ever project

Technological infrastructure for the preservation and efficient retrieval and reuse of scientific workflows in a range of disciplines

• Encapsulate the scientific methodology (the workflows and all the associated information) in an artefact called Research Object.

• Archival, classification and indexing of Research Objects in scalable semantic repositories, providing advanced access and recommendation capabilities based on monitoring and metrics that evaluate similarity, decay, quality, stability and completeness.

• Creation of scientific communities to collaboratively share, reuse and evolve Research Objects, stimulating the development of new scientific knowledge.

• Use Cases:

  • Astronomy (IAA)

  • Genome-wide Analysis and Biobanking (LUMC)

Page 10

Sharing the experiment

How Wf4Ever tools can help the scientists

- Inputs (files or references)

- Scripts

- Workflows

- Documentation

- Web services

Completeness

Rating

Downloads 36

Citations [2]

Re-used [1]

Comments [4]

Keywords [galaxies][catalogs]

[Previous version | Next version]

Page 11

How to import the experiment

How Wf4Ever tools can help the scientists

INTEROPERABILITY

Page 12

How to import the experiment

How Wf4Ever tools can help the scientists

Extract !

Page 13

Describing the experiment

How Wf4Ever tools can help the scientists

Page 14

Visualizing the experiment

How Wf4Ever tools can help the scientists

A good understanding of the experiment is key for reusability and repurposability

Page 15

Checking the live experiment

How Wf4Ever tools can help the scientists

Stability: Changes made to the RO by different kinds of users can improve it or make it worse

Completeness: It contains all the resources needed to be run, published, shared or repeated

Runnable

Publishable

Shareable

Repeatable

Service up, software working, etc.

The output can be reproduced

Enough annotation.

Service up, software working and all well commented
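
The completeness check described on this slide can be sketched as a comparison between what an RO contains and what each purpose requires. This is only an illustration, not the Wf4Ever implementation: the `REQUIRED` table, the resource-kind names (borrowed from the "Sharing the experiment" slide) and the functions `missing_for` and `completeness_report` are all assumptions made for the example.

```python
# Hypothetical sketch: which resource kinds an RO still lacks for each purpose.
# The requirement table below is an assumption for illustration only.
REQUIRED = {
    "runnable": {"inputs", "workflows", "web services"},
    "repeatable": {"inputs", "workflows", "web services", "scripts"},
    "shareable": {"inputs", "workflows", "documentation"},
    "publishable": {"inputs", "workflows", "scripts", "documentation"},
}


def missing_for(purpose, ro_contents):
    """Return the sorted resource kinds the RO lacks for `purpose`."""
    return sorted(REQUIRED[purpose] - set(ro_contents))


def completeness_report(ro_contents):
    """Map every purpose to the list of resource kinds still missing."""
    return {purpose: missing_for(purpose, ro_contents) for purpose in REQUIRED}


# An RO that holds inputs, workflows and web services, but no docs or scripts.
example_ro = ["inputs", "workflows", "web services"]
report = completeness_report(example_ro)
```

An empty list for a purpose means the RO is complete for it under the assumed requirements; any other result names what still has to be added.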

Page 16

Checking the published experiment

How Wf4Ever tools can help the scientists

Decay: The health of the RO: state of the services (up or down), of the applications (updated or deprecated), permissions to access the input data

Decay Information

Last check was performed 2 days ago and returned one error:

The service SDSS-DR7, needed by the workflow Calculate_galaxy_distances, is down

Check now

Try to repair

Tracking: Ratings by other users, who has used the RO, comments, etc.

Rating

Downloads 36

Citations [2]

Re-used [1]

Comments [4]
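
The decay check shown on this slide could look roughly like the sketch below, which produces one error message per unavailable service. It is an assumption-laden illustration rather than the project's actual tool: the `Dependency` class, the `check_decay` function and the injected `is_up` probe are hypothetical names, and `HypotheticalCatalogService` is an invented second service. Only SDSS-DR7 and Calculate_galaxy_distances come from the slide.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Dependency:
    service: str   # name of the external service, e.g. "SDSS-DR7"
    workflow: str  # workflow inside the RO that needs it


def check_decay(dependencies: List[Dependency],
                is_up: Callable[[str], bool]) -> List[str]:
    """Return one error message per dependency whose service is down.

    `is_up` is injected so the probe (an HTTP ping, a registry lookup, ...)
    can be swapped without touching the check itself.
    """
    errors = []
    for dep in dependencies:
        if not is_up(dep.service):
            errors.append(
                f"The service {dep.service}, needed by the workflow "
                f"{dep.workflow}, is down"
            )
    return errors


# Stub probe standing in for a real availability test (assumption).
down_services = {"SDSS-DR7"}
deps = [
    Dependency("SDSS-DR7", "Calculate_galaxy_distances"),
    Dependency("HypotheticalCatalogService", "Calculate_galaxy_distances"),
]
errors = check_decay(deps, lambda name: name not in down_services)
```

Passing the probe as a parameter keeps the sketch runnable offline; a real monitor would replace the lambda with an actual availability test and run the check on a schedule.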

Page 17

Discovering an experiment

How Wf4Ever tools can help the scientists

Search by keywords

Past ratings

Ratings of similar users
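
Keyword search combined with past ratings can be sketched as a filter followed by a sort. The record layout (`title`, `keywords`, `rating`), the `search` function and the three sample ROs are hypothetical. Recommendation from the ratings of similar users is not covered here, since it would need per-user data that this sketch does not model.

```python
def search(ros, keywords):
    """Return titles of ROs tagged with every keyword, best-rated first."""
    wanted = {k.lower() for k in keywords}
    # Keep only ROs whose keyword set contains all requested keywords.
    hits = [ro for ro in ros if wanted <= {k.lower() for k in ro["keywords"]}]
    # Rank the matches by their past rating, highest first.
    hits.sort(key=lambda ro: ro["rating"], reverse=True)
    return [ro["title"] for ro in hits]


# Invented catalogue entries, for illustration only.
catalogue = [
    {"title": "RO-A", "keywords": ["galaxies", "catalogs"], "rating": 3.5},
    {"title": "RO-B", "keywords": ["galaxies"], "rating": 4.8},
    {"title": "RO-C", "keywords": ["Galaxies", "catalogs"], "rating": 4.2},
]
results = search(catalogue, ["galaxies", "catalogs"])
```

Matching is case-insensitive, so `Galaxies` and `galaxies` are treated as the same keyword.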

Page 18

Functionalities of the RO and tools

How Wf4Ever tools can help the scientists

REUSABILITY AND REPURPOSABILITY

• Annotations for the description of the whole and its components

• Visualization of the relationships between the components

• Versioning of the whole or its components

SHARING

• Restricted access on data and processes

• Ensuring authorship and allowing ROs to be cited

REPRODUCIBILITY

• Completeness checking

• Decay monitoring and notification, reacting to decay

• Execution and interoperability support

DISCOVERY

• Semantic discovery of ROs, processes, web services

• Recommendation capabilities.

Page 19

Why I am here as a student

Our astronomy Use Case

Deploy web-services-based workflows for analysis of multidimensional data on heterogeneous e-infrastructures

ASKAP Cubes Prof. Kevin Vinsen

• Standards for publishing and accessing astronomical data (Service Oriented Architecture)

• Data provider analysis service provider

Cloud

Super Computer

GRID

VIRTUAL OBSERVATORY