Pegasus: Mapping Scientific Workflows onto the Grid Ewa Deelman Center for Grid Technologies USC...

24
Pegasus: Mapping Scientific Workflows onto the Grid Ewa Deelman Center for Grid Technologies USC Information Sciences Institute

Transcript of Pegasus: Mapping Scientific Workflows onto the Grid Ewa Deelman Center for Grid Technologies USC...

Page 1: Pegasus: Mapping Scientific Workflows onto the Grid Ewa Deelman Center for Grid Technologies USC Information Sciences Institute.

Pegasus: Mapping Scientific Workflows onto the Grid

Ewa Deelman

Center for Grid Technologies

USC Information Sciences Institute

Page 2: Pegasus: Mapping Scientific Workflows onto the Grid Ewa Deelman Center for Grid Technologies USC Information Sciences Institute.

Ewa Deelman Information Sciences Institute

Pegasus Acknowledgements

Ewa Deelman, Carl Kesselman, Saurabh Khurana, Gaurang Mehta, Sonal Patil, Gurmeet Singh, Mei-Hui Su, Karan Vahi (Center for Grid Computing, ISI)

James Blythe, Yolanda Gil (Intelligent Systems Division, ISI)

Collaboration with Miron Livny (UW Madison) http://pegasus.isi.edu Research funded as part of the NSF GriPhyN, NVO

and SCEC projects and EU-funded GridLab

Page 3: Pegasus: Mapping Scientific Workflows onto the Grid Ewa Deelman Center for Grid Technologies USC Information Sciences Institute.

Ewa Deelman Information Sciences Institute

Outline

Workflow Management in Grids Pegasus, Planning for Execution in Grids Applications Using Pegasus In-time planning Future Research Directions

Page 4: Pegasus: Mapping Scientific Workflows onto the Grid Ewa Deelman Center for Grid Technologies USC Information Sciences Institute.

Ewa Deelman Information Sciences Institute

Grid Applications

Increasing in the level of complexity Use of individual application components Reuse of individual intermediate data products (files) Description of Data Products using Metadata Attributes

Execution environment is complex and very dynamic Resources come and go Data is replicated Components can be found at various locations or staged in on

demand

Separation between the application description the actual execution description

Page 5: Pegasus: Mapping Scientific Workflows onto the Grid Ewa Deelman Center for Grid Technologies USC Information Sciences Institute.

Ewa Deelman Information Sciences Institute

FFT

FFT filea

/usr/local/bin/fft /home/file1

transfer filea from host1://home/filea

to host2://home/file1

ApplicationDomain

AbstractWorkflow

ConcreteWorkflow

ExecutionEnvironment

host1 host2

Data

Data

host2

App

licat

ion

Dev

elop

men

t and

Exe

cutio

n P

roce

ss

DataTransfer

Resource SelectionData Replica Selection

Transformation InstanceSelection

ApplicationComponentSelection

Retry

Pick different Resources

Specify aDifferentWorkflow

Failure RecoveryMethod

Abstract Workflow

Generation

ConcreteWorkflow

Generation

Page 6: Pegasus: Mapping Scientific Workflows onto the Grid Ewa Deelman Center for Grid Technologies USC Information Sciences Institute.

Ewa Deelman Information Sciences Institute

Why Automate Workflow Generation?

Usability: Limit User’s necessary Grid knowledge Monitoring and Directory Service Replica Location Service

Complexity: User needs to make choices

Alternative application components Alternative files Alternative locations

The user may reach a dead end Many different interdependencies may occur among

components Solution cost:

Evaluate the alternative solution costs Performance Reliability Resource Usage

Global cost: minimizing cost within a community or a virtual organization requires reasoning about individual user’s choices in light of

other user’s choices

Page 7: Pegasus: Mapping Scientific Workflows onto the Grid Ewa Deelman Center for Grid Technologies USC Information Sciences Institute.

Ewa Deelman Information Sciences Institute

GriPhyN’sExecutable Workflow Construction

Build an abstract workflow based on VDL descriptions (Chimera)

Build an executable workflow based on the abstract workflows (Pegasus)

Execute the workflow (Condor’s DAGMan)

AbstractWorfklow

Concrete Workflow

Jobs

Chimera Pegasus DAGMan

VDL

RLS TC MDS

Page 8: Pegasus: Mapping Scientific Workflows onto the Grid Ewa Deelman Center for Grid Technologies USC Information Sciences Institute.

Ewa Deelman Information Sciences Institute

VDL and Abstract Workflow

d1

d2

ba

cb

VDL descriptions

User request data file “c”

d1 d2

ba cAbstract Workflow

Page 9: Pegasus: Mapping Scientific Workflows onto the Grid Ewa Deelman Center for Grid Technologies USC Information Sciences Institute.

Ewa Deelman Information Sciences Institute

Condor’s DAGMan Developed at UW Madison (Livny) Executes a concrete workflow Makes sure the dependencies are followed Execute the jobs specified in the workflow

Execution Data movement Catalog updates

Provides a “rescue DAG” in case of failure

Page 10: Pegasus: Mapping Scientific Workflows onto the Grid Ewa Deelman Center for Grid Technologies USC Information Sciences Institute.

Ewa Deelman Information Sciences Institute

Pegasus:Planning for Execution in Grids

Maps from abstract to concrete workflow Algorithmic and AI-based techniques

Automatically locates physical locations for both components (transformations) and data

Finds appropriate resources to execute Reuses existing data products where applicable Publishes newly derived data products

Chimera virtual data catalog Provides provenance information

Page 11: Pegasus: Mapping Scientific Workflows onto the Grid Ewa Deelman Center for Grid Technologies USC Information Sciences Institute.

Ewa Deelman Information Sciences Institute

Information ComponentsUsed by Pegasus

Globus Monitoring and Discovery Service (MDS) Locates available resources Finds resource properties

Dynamic: load, queue length Static: location of gridftp server, RLS, etc

Globus Replica Location Service Locates data that may be replicated Registers new data products

Transformation Catalog Locates installed executables

Page 12: Pegasus: Mapping Scientific Workflows onto the Grid Ewa Deelman Center for Grid Technologies USC Information Sciences Institute.

Ewa Deelman Information Sciences Institute

Example Workflow Reduction

Original abstract workflow

If “b” already exists (as determined by query to the RLS), the workflow can be reduced

d1 d2ba c

d2b c

Page 13: Pegasus: Mapping Scientific Workflows onto the Grid Ewa Deelman Center for Grid Technologies USC Information Sciences Institute.

Ewa Deelman Information Sciences Institute

Mapping from abstract to concrete

Query RLS, MDS, and TC, schedule computation and data movement

Execute d2 at B

Move b from A

to B

Move c from B

to U

Register c in the

RLS

d2b c

Page 14: Pegasus: Mapping Scientific Workflows onto the Grid Ewa Deelman Center for Grid Technologies USC Information Sciences Institute.

Ewa Deelman Information Sciences Institute

Montage Montage (NASA and NVO) Deliver science-grade

custom mosaics on demand

Produce mosaics from a wide range of data sources (possibly in different spectra)

User-specified parameters of projection, coordinates, size, rotation and spatial sampling.

Mosaic created by Pegasus based Montage from a run of the M101 galaxy images on the Teragrid.

Page 15: Pegasus: Mapping Scientific Workflows onto the Grid Ewa Deelman Center for Grid Technologies USC Information Sciences Institute.

Ewa Deelman Information Sciences Institute

Small Montage Workflow

~1200 nodes

Page 16: Pegasus: Mapping Scientific Workflows onto the Grid Ewa Deelman Center for Grid Technologies USC Information Sciences Institute.

Ewa Deelman Information Sciences Institute

Montage Acknowledgments Bruce Berriman, John Good, Anastasia Laity,

Caltech/IPAC Joseph C. Jacob, Daniel S. Katz, JPL http://montage.ipac. caltech.edu/

Testbed for Montage: Condor pools at USC/ISI, UW Madison, and Teragrid resources at NCSA, PSC, and SDSC.

Montage is funded by the National Aeronautics and Space Administration's Earth Science Technology Office, Computational Technologies Project, under Cooperative Agreement Number NCC5-626 between NASA and the California Institute of Technology.

Page 17: Pegasus: Mapping Scientific Workflows onto the Grid Ewa Deelman Center for Grid Technologies USC Information Sciences Institute.

Ewa Deelman Information Sciences Institute

Applications Using Chimera, Pegasus and DAGMan

GriPhyN applications: High-energy physics: Atlas, CMS (many) Astronomy: SDSS (Fermi Lab, ANL) Gravitational-wave physics: LIGO (Caltech, AEI)

Astronomy: Galaxy Morphology (NCSA, JHU, Fermi, many others,

NVO-funded) Biology

BLAST (ANL, PDQ-funded) Neuroscience

Tomography for Telescience(SDSC, NIH-funded)

Page 18: Pegasus: Mapping Scientific Workflows onto the Grid Ewa Deelman Center for Grid Technologies USC Information Sciences Institute.

Ewa Deelman Information Sciences Institute

Current System

Original Abstract Workflow

Current Pegasus

Pegasus(Abstract Workflow)

DAGMan(CW))

Co

ncre

te W

orfklo

w

Workflow Execution

Page 19: Pegasus: Mapping Scientific Workflows onto the Grid Ewa Deelman Center for Grid Technologies USC Information Sciences Institute.

Ewa Deelman Information Sciences Institute

time

Levels ofabstraction

Application-level

knowledge

Logicaltasks

Tasksbound toresources

and sent forexecution

User’sRequest

Relevantcomponents

Fullabstractworkflow

Partialexecution

Not yetexecuted

executed

Workflow refinement

Taskmatchmaker

Workflow repair

Policyinfo

Workflow Refinement and execution

Page 20: Pegasus: Mapping Scientific Workflows onto the Grid Ewa Deelman Center for Grid Technologies USC Information Sciences Institute.

Ewa Deelman Information Sciences Institute

Incremental Refinement Partition Abstract workflow into partial

workflows

PW A

PW B

PW C

A Particular PartitioningNew Abstract

Workflow

Page 21: Pegasus: Mapping Scientific Workflows onto the Grid Ewa Deelman Center for Grid Technologies USC Information Sciences Institute.

Ewa Deelman Information Sciences Institute

Meta-DAGMan

Pegasus(A)

Pegasus(B)

Pegasus(C)

DAGMan(Su(A))

Su(B)

Su(C)

DAGMan(Su(B))

DAGMan(Su(C))

Pegasus(X) –Pegasus generates the concrete workflow and the submit files for X = Su(X)

DAGMan(Su(X))—DAGMan executes the concrete workflow for X

Page 22: Pegasus: Mapping Scientific Workflows onto the Grid Ewa Deelman Center for Grid Technologies USC Information Sciences Institute.

Ewa Deelman Information Sciences Institute

Conclusions

Pegasus maps complex workflows onto the Grid

Uses Grid information services to find resources, data and executables

Reduces the workflow based on existing intermediate products

Used in many applications Part of GriPhyN’s Virtual Data Toolkit

Page 23: Pegasus: Mapping Scientific Workflows onto the Grid Ewa Deelman Center for Grid Technologies USC Information Sciences Institute.

Ewa Deelman Information Sciences Institute

Future Directions

Investigate various scheduling techniques Investigating fault tolerance issues Enable flexible interactions between workflow

refiners (GriPhyN-wide scope: Pegasus, DAGMan)

http://pegasus.isi.edu

GGF10 workshop on workflow management GGF Workflow management research group

[email protected]

Page 24: Pegasus: Mapping Scientific Workflows onto the Grid Ewa Deelman Center for Grid Technologies USC Information Sciences Institute.

Ewa Deelman Information Sciences Institute

Summary:

The Grid Now Syntax-based matchmaking

of resources to job requirements

Condor matchmaker Attribute based discovery

and selection Scheduling of jobs based on

Grid-able users that specify job execution sequences and computing requirements

Scripting languages Workflow languages, Task graphs

Explicit mappings from task to jobs, simple job brokers

Explicit service negotiation and recovery strategies

The Future Grid Knowledge-based reasoning

about resources enables Semantic matchmaking Aggregate resource reasoning

Task-level reasoning to plan and schedule jobs and resources

More agility and coordination Wide range of users can specify

high level requirements in a mixed-initiative mode

Mapping of high-level requirements to details required for execution

End-to-end resource negotiation and adaptive strategies to accommodate failure