Exploring Issues with Workflow Scheduling on the Grid Rizos Sakellariou University of Manchester, UK...

Page 1:

Exploring Issues with Workflow Scheduling on the Grid

Rizos Sakellariou, University of Manchester, UK

with thanks to:

Henan Zhao and Ewa Deelman for providing slides!

also: Viktor Yarmolenko, Wei Zheng, …

and Anastasios Gounaris for presenting it!

Page 2:

Workflow applications are widely considered a common use case of Grids

LIGO (Pegasus team, ISI) (large-scale)

myGrid, Manchester (small size)

Page 3:

Modelling the problem…

• A workflow is a Directed Acyclic Graph (DAG)

• Scheduling DAGs onto resources is well studied in the context of homogeneous systems – less so, in the context of heterogeneous systems (mostly without taking into account any uncertainty).

• Needless to say that this is an NP-complete problem.

• Are workflows really any type of DAGs or a special type of DAGs? We don’t really know… (some workflows are clearly not DAGs – only DAGs considered here…)

Page 4:

DAG scheduling

• An order in which tasks will be executed needs to be established (e.g., red, yellow, or blue first?)

• Resources need to be chosen for each task (some resources are fast, some are not so fast!)

• The cost of moving data between resources should not outweigh the benefits of parallelism.

Page 5:

Does the order matter?

• If task 6 takes comparatively longer to run, we’d like to execute task 2 just after task 0 finishes (perhaps before tasks 1, 3, 4, 5).

[Figure: example DAG with tasks 0 to 9]

Follow the critical path! This is not really new!
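A minimal sketch of what "follow the critical path" can mean in practice: order tasks by the length of the longest path from each task to the end of the DAG. The graph and execution times below are hypothetical (the figure from the slide did not survive extraction), and `bottom_level` is an illustrative name, not something defined in the talk.

```python
from functools import lru_cache

# Hypothetical DAG: successors of each task (NOT the graph from the slide).
succ = {0: [1, 2], 1: [3], 2: [3], 3: []}
# Hypothetical execution times; task 2 is the comparatively long one.
cost = {0: 2, 1: 1, 2: 10, 3: 3}

@lru_cache(maxsize=None)
def bottom_level(task):
    """Length of the longest path from `task` to an exit node, inclusive."""
    return cost[task] + max((bottom_level(s) for s in succ[task]), default=0)

# Tasks with a larger bottom level sit on longer paths and are more urgent.
order = sorted(succ, key=bottom_level, reverse=True)
print(order)
```

With these numbers the long task 2 is placed ahead of task 1, which mirrors the slide's argument that the successor leading to the longest remaining path should run first.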

Page 6:

Our methodology…

• Revisit the DAG scheduling problem for heterogeneous systems…

• Start with simple static scenarios…
– Even this problem is not well understood, despite the fact that there have been more than 30 heuristics published… (check the proceedings of the Heterogeneous Computing Workshop for a start…)

• Try to build on existing knowledge, as we obtain a good understanding of each step!

Page 7:

Outline of Part I

1. Static DAG scheduling onto heterogeneous systems (i.e., we know computation & communication a priori)

2. Introduce uncertainty in computation times.

3. Handle multiple DAGs at the same time.

[1] Rizos Sakellariou, Henan Zhao. A Hybrid Heuristic for DAG Scheduling on Heterogeneous Systems. Proceedings of the 13th IEEE Heterogeneous Computing Workshop (HCW’04) (in conjunction with IPDPS 2004), Santa Fe, April 2004, IEEE Computer Society Press, 2004.

 

[2] Rizos Sakellariou, Henan Zhao. A low-cost rescheduling policy for efficient mapping of workflows on grid systems. Scientific Programming, 12(4), December 2004, pp. 253-262.

 

[3] Henan Zhao, Rizos Sakellariou. Scheduling Multiple DAGs onto Heterogeneous Systems. Proceedings of the 15th Heterogeneous Computing Workshop (HCW'06) (in conjunction with IPDPS 2006), Rhodes, Apr. 2006, IEEE Computer Society Press.

Page 8:

The starting point for a model…

A DAG, 10 tasks, 3 machines

(assume we know execution times, communication costs)

[Figure: DAG of tasks 0 to 9. Communication costs shown on the edges: 18, 12, 9, 11, 14, 1000, 15, 19, 16, 27, 23, 23, 11, 17, 13]

Execution time of each task on each machine:

Task  M1  M2  M3
0     37  39  27
1     30  20  24
2     21  21  28
3     35  38  31
4     27  24  29
5     29  37  20
6     22  24  30
7     37  26  37
8     35  31  26
9     33  37  21

Page 9:

A simple idea…

Assign nodes to the fastest machine!

Heuristics that take into account the whole structure of the DAG are needed…

[Figure: schedule obtained when each of the tasks 0 to 9 is assigned to its fastest machine]

Makespan is > 1000!

Communication between nodes 4 and 8 takes way too long!!!
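The "fastest machine" rule can be checked against the execution-time table from the previous slide. The sketch below is only an illustration: the rule itself is from the slide, but the assumption that the edge from task 4 to task 8 is the one carrying the cost of 1000 is inferred from the remark above, and the tie-breaking choice is arbitrary.

```python
# Execution times from the table on the previous slide: task -> (M1, M2, M3).
exec_time = {
    0: (37, 39, 27), 1: (30, 20, 24), 2: (21, 21, 28), 3: (35, 38, 31),
    4: (27, 24, 29), 5: (29, 37, 20), 6: (22, 24, 30), 7: (37, 26, 37),
    8: (35, 31, 26), 9: (33, 37, 21),
}
machines = ("M1", "M2", "M3")

def fastest_machine(task):
    """Pick the machine with the smallest execution time for this task."""
    times = exec_time[task]
    return machines[times.index(min(times))]  # ties go to the lower-numbered machine

assignment = {task: fastest_machine(task) for task in exec_time}

# ASSUMPTION: the edge 4 -> 8 is the one labelled 1000 in the figure.
# The cost is only paid when the two tasks run on different machines.
comm_4_8 = 1000 if assignment[4] != assignment[8] else 0
print(assignment, comm_4_8)
```

Task 4 is fastest on M2 and task 8 on M3, so the greedy rule separates them and the expensive transfer has to be paid, which is the point the slide makes about ignoring the structure of the DAG.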

Page 10:

HEFT – a minor change leads to different schedules (~15%):

[Figure: two Gantt charts of tasks 0 to 9 on a time axis from 0 to 170. Makespan: 143 and Makespan: 164]

Still, if we consider the whole DAG…

H. Zhao, R. Sakellariou. An experimental study of the rank function of HEFT. Proceedings of EuroPar’03.
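To see how a small change in a rank function can reorder tasks, consider the upward rank used by HEFT: a task's weight plus the largest (communication cost + rank) over its successors. The slide does not say which change was made; the sketch below assumes, purely for illustration, that it is the way per-machine execution times are collapsed into a single task weight (mean versus best case). The DAG, the times, and the uniform communication cost are all hypothetical.

```python
from statistics import mean

# Hypothetical 4-task DAG with per-machine execution times (three machines).
succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
exec_time = {"A": (5, 5, 5), "B": (10, 10, 40), "C": (15, 15, 15), "D": (5, 5, 5)}
comm = 2  # hypothetical uniform communication cost on every edge

def upward_ranks(aggregate):
    """Upward rank of every task; `aggregate` turns per-machine times into one weight."""
    ranks = {}

    def rank(task):
        if task not in ranks:
            tail = max((comm + rank(s) for s in succ[task]), default=0)
            ranks[task] = aggregate(exec_time[task]) + tail
        return ranks[task]

    for task in succ:
        rank(task)
    return ranks

def priority_order(aggregate):
    """Tasks in decreasing upward rank, i.e. the order a list scheduler would use."""
    ranks = upward_ranks(aggregate)
    return sorted(succ, key=lambda task: -ranks[task])

order_mean = priority_order(mean)  # weight = average over machines
order_best = priority_order(min)   # weight = time on the fastest machine
print(order_mean, order_best)
```

In this toy case the two weightings swap the order of B and C. A different order means different machine choices later on, which is one way a seemingly minor change can end in a noticeably different makespan.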

Page 11:

Hmm…

• This was a rather well defined problem…

• This was just a small change in the algorithm…

• Yet, with big variations in the outcome.

• What about different heuristics?

• What about more generic problems?

Page 12:

DAG scheduling: A Hybrid Heuristic

Trying to find out why there were such differences in the outcome of HEFT… we observed problems with the order… to address those problems we came up with a Hybrid Heuristic… it worked quite well!

Phases:
1. Rank (list scheduling)
2. Create groups of independent tasks
3. Schedule independent tasks

• Can be carried out using any scheduling algorithm for independent tasks, e.g. MinMin, MaxMin, …

• A novel heuristic (Balanced Minimum Completion Time)

R.Sakellariou, H.Zhao. A Hybrid Heuristic for DAG Scheduling on Heterogeneous Systems. Proceedings of the IEEE Heterogeneous Computing Workshop (HCW 04) , 2004.
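A rough sketch of the second phase listed above: walk the ranked list and start a new group whenever the next task depends on a task already in the current group. This is one plausible reading of "create groups of independent tasks", not necessarily the exact procedure of the cited paper. The DAG and the ranked order are hypothetical, and `group_independent` is an illustrative name.

```python
# Hypothetical DAG: direct predecessors of each task.
pred = {0: [], 1: [0], 2: [0], 3: [1, 2], 4: [2], 5: [3, 4]}
# Assumed output of phase 1: tasks in decreasing rank, which is a topological order.
ranked = [0, 2, 1, 4, 3, 5]

def group_independent(ranked, pred):
    """Split a ranked (topologically ordered) task list into groups of independent tasks."""
    groups, current = [], []
    for task in ranked:
        # A direct predecessor in the current group means the task cannot join it.
        if any(p in current for p in pred[task]):
            groups.append(current)
            current = []
        current.append(task)
    if current:
        groups.append(current)
    return groups

groups = group_independent(ranked, pred)
print(groups)
```

Each resulting group contains no dependences, so phase 3 can hand it to any scheduler for independent tasks (the slide names MinMin and MaxMin as options). Checking only direct predecessors is enough here because the ranked list is assumed to respect the dependences.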

Page 13:

Hmm…

• Yes, but, so far, you have used static task execution times… in practice such times are difficult to specify exactly…

• There is an answer for run-time deviations: adjust at run-time…

• But: don’t we need to understand the static case first?

Page 14:

Characterise the Schedule

• Spare time indicates the maximum time that a node, i, may delay without affecting the start time of an immediate successor, j.

• Slack indicates the maximum time that a node, i, may delay without affecting the overall makespan.

• The idea: keep track of the values of the slack and/or the spare time and reschedule only when the delay exceeds slack…(selective rescheduling)

R.Sakellariou, H.Zhao. A low-cost rescheduling policy for efficient mapping of workflows on grid systems. Scientific Programming, 12(4), December 2004, pp. 253-262.

Page 15:

Example

FT(4)=32.5, DAT(4,7)=40.5, ST(7)=45.5 →Spare_Time(4)=5

Slack(8)=0;

Slack(7)=Slack(8)+Spare_Time(7)=0;

Slack(5)=Slack(8)+Spare_Time(5)=6
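The arithmetic in this example can be replayed in a few lines. The sketch reads FT, DAT and ST as finish time, data arrival time and start time, which is an interpretation of the abbreviations rather than something the slide spells out. The spare times of tasks 7 and 5 are the values implied by the two slack equations, and `needs_rescheduling` is a hypothetical helper illustrating the selective rescheduling rule.

```python
# Values from the example above.
FT = {4: 32.5}        # finish time of task 4 (listed for context only)
DAT = {(4, 7): 40.5}  # time at which the data sent by task 4 arrives for task 7
ST = {7: 45.5}        # start time of task 7

def spare_time(i, j):
    """How long task i may slip before it delays the start of its successor j."""
    return ST[j] - DAT[(i, j)]

def needs_rescheduling(delay, slack):
    """Selective rescheduling: react only when the delay exceeds the slack."""
    return delay > slack

spare_4 = spare_time(4, 7)

# Spare times implied by the two slack equations in the example.
spare = {7: 0.0, 5: 6.0}
slack_8 = 0.0
slack_7 = slack_8 + spare[7]
slack_5 = slack_8 + spare[5]
print(spare_4, slack_7, slack_5)
```

Under these numbers a delay of 4 time units in task 5 stays within its slack and can be absorbed, while a delay of 7 would trigger rescheduling.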

Page 16:

Lessons Learned… (simulation and deviations of up to 100%)

• Heuristics that perform better statically also perform better under uncertainty.

• By using the metrics on spare time, one can track the amount of deviation of the makespan from the static estimate. Then, we can minimise the number of times we reschedule, still achieving good results.

Page 17:

Moving on… to multiple DAGs

• It is an idealisation to assume that we have exclusive usage of resources…

• In practice, we may have multiple DAGs competing for resources at the same time…

Henan Zhao, Rizos Sakellariou. Scheduling Multiple DAGs onto Heterogeneous Systems. Proceedings of the 15th Heterogeneous Computing Workshop (HCW'06) (in conjunction with IPDPS 2006), Rhodes, Apr. 2006, IEEE Computer Society Press.

Page 18:

Scheduling Multiple DAGs: Approaches

• Approach 1: Schedule one DAG after the other with existing DAG scheduling algorithms
– Low resource utilization & long overall makespan

• Approach 2: Still one after the other, but do some backfilling and fill the gaps
– Which DAG to schedule first? The one with longest makespan or the one with shortest makespan?

• Approach 3: Alternate between DAGs (either round-robin or using some other form of priority).
– Much better than Approach 1 & 2.
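The difference between Approach 1 and the round-robin variant of Approach 3 can be sketched as two ways of building the order in which tasks are handed to the mapper. This is only the ordering step, not a complete scheduler; the task lists are hypothetical and assumed to be already ranked within each DAG.

```python
from itertools import zip_longest

# Hypothetical ranked task lists, one per DAG (already in priority order).
dag_a = ["a0", "a1", "a2"]
dag_b = ["b0", "b1"]
dag_c = ["c0", "c1", "c2", "c3"]

def one_after_the_other(*dags):
    """Approach 1: all tasks of one DAG, then all tasks of the next."""
    return [task for dag in dags for task in dag]

def round_robin(*dags):
    """Approach 3 (round-robin flavour): take one task from each DAG in turn."""
    missing = object()  # sentinel for DAGs that have run out of tasks
    return [task
            for batch in zip_longest(*dags, fillvalue=missing)
            for task in batch
            if task is not missing]

sequential = one_after_the_other(dag_a, dag_b, dag_c)
interleaved = round_robin(dag_a, dag_b, dag_c)
print(sequential)
print(interleaved)
```

Replacing the fixed rotation with a priority function (for example, one that favours the DAG currently furthest behind) would give the "some other form of priority" variant mentioned in the slide.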

Page 19:

But, is makespan optimisation a good objective when scheduling multiple DAGs?

Page 20:

Mission: Fairness

In multiple DAGs:

• User’s perspective: “I want my DAG to complete execution as soon as possible”.

• System perspective: “I would like to keep as many users as possible happy; I would like to increase resource utilisation (and income)”.

Let’s be fair to users!

(The system may want to take into account different levels of quality of service agreed with each user)

Page 21:

Lessons Learned… Open questions…

• It is possible to achieve reasonably good fairness without affecting makespan.

• An algorithm with good behaviour in the static case appears to make things easier in terms of achieving fairness…

• What is fairness?

• What should be the behaviour when run-time changes occur?

• What about different notions of Quality of Service (e.g., based on SLAs…)

Page 22:

Questions still unanswered…

• What are the representative DAGs (workflows) in the context of Grid computing?

• Extensive evaluation / analysis (theoretical too) is needed. Not clear what is the best makespan we can get (it is not easy to find the critical path…)

• What are the uncertainties involved? How good are the estimates that we can obtain for the execution time / communication cost? Performance prediction is hard…

• How ‘heterogeneous’ are our Grid resources really?

Page 23:

Workflows are not generic DAGs

• Bioinformatics workflows are really small (10s of nodes)

• There are scientific workflows with thousands of nodes (Montage, LIGO, SCEC), but they have a rather regular structure.

• Experience from joint work with the Pegasus team indicates that there may not be much to gain from sophisticated heuristics (paper to be published based on the earlier studies below)

• James Blythe, S. Jain, Ewa Deelman, Yolanda Gil, Karan Vahi, Anirban Mandal, Ken Kennedy: Task scheduling strategies for workflow-based applications in grids. CCGRID 2005: 759-767

• Rizos Sakellariou, Henan Zhao. A Hybrid Heuristic for DAG Scheduling on Heterogeneous Systems. Proceedings of the 13th IEEE Heterogeneous Computing Workshop (HCW’04) (in conjunction with IPDPS 2004), Santa Fe, April 2004, IEEE Computer Society Press, 2004.

Page 24:

Part II

But, there is more (than just shortening the makespan) when scheduling DAGs (workflows)!

Page 25:

Efficient data handling

• Workflow input data is staged dynamically; new data products are generated during execution
• For large workflows: 10,000+ input files (similar order of intermediate/output files)
– If not enough disk space: failures occur

• Solution:
– Determine which data are no longer needed and when
– Add nodes to the workflow to clean up data along the way
– Take into account the disk space available on resources

• Benefits: simulations show up to 57% space improvements for LIGO-like workflows

“Scheduling Data-Intensive Workflows onto Storage-Constrained Distributed Resources”, A. Ramakrishnan, G. Singh, H. Zhao, E. Deelman, R. Sakellariou, K. Vahi, K. Blackburn, D. Meyers, and M. Samidi, CCGrid 2007
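The cleanup idea on this slide can be sketched as follows: a file is no longer needed once every task that touches it has finished, so a cleanup node for that file is added with all of those tasks as parents. This is a simplified illustration under assumed data structures, not the algorithm of the cited paper; the task names, file names, and the `add_cleanup_nodes` helper are hypothetical.

```python
# Hypothetical workflow: files read and written by each task.
reads = {"t1": ["in.dat"], "t2": ["a.dat"], "t3": ["a.dat", "b.dat"]}
writes = {"t1": ["a.dat"], "t2": ["b.dat"], "t3": ["out.dat"]}
keep = {"out.dat"}  # final data products that must not be deleted

def add_cleanup_nodes(reads, writes, keep):
    """Return {cleanup node: parent tasks}, one cleanup node per disposable file."""
    users = {}
    for table in (reads, writes):
        for task, files in table.items():
            for name in files:
                users.setdefault(name, set()).add(task)
    # A cleanup node may only run after every task that reads or writes the file.
    return {f"cleanup_{name}": sorted(tasks)
            for name, tasks in users.items()
            if name not in keep}

cleanup = add_cleanup_nodes(reads, writes, keep)
print(cleanup)
```

The extra dependences are what can slow a workflow down: the scheduler trades some parallelism and runtime for a smaller disk footprint, which is the trade-off quantified on the following slides.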

Page 26:

44% Improvement in footprint for Montage workflow

(when adding cleanup nodes)

Page 27:

LIGO Inspiral Analysis Workflow

Small Workflow: 164 nodes

Full Scale analysis: 185,000 nodes and 466,000 edges; 10 TB of input data and 1 TB of output data

LIGO workflow running on OSG

“Optimizing Workflow Data Footprint”, G. Singh, K. Vahi, A. Ramakrishnan, G. Mehta, E. Deelman, H. Zhao, R. Sakellariou, K. Blackburn, D. Brown, S. Fairhurst, D. Meyers, G. B. Berriman, J. Good, D. S. Katz, Scientific Programming.

Page 28:

LIGO Workflows

26% improvement in disk space usage

50% slower runtime

Page 29:

LIGO Workflows

56% improvement in space usage

3 times slower in runtime

“Optimizing Workflow Data Footprint”, G. Singh, K. Vahi, A. Ramakrishnan, G. Mehta, E. Deelman, H. Zhao, R. Sakellariou, K. Blackburn, D. Brown, S. Fairhurst, D. Meyers, G. B. Berriman, J. Good, D. S. Katz, Scientific Programming.

Page 30:

Lesson Learned…

When scheduling workflows, one may want to trade performance against storage requirements to make it feasible to complete the execution of a workflow!

Page 31:

Part III

But, there are other issues related to performance that have to do with: the workflow execution environment and the queuing mechanisms of traditional systems!

Page 32:

Scientific Analysis

[Figure: workflow evolution. Steps: Construct the Analysis, Select the Input Data, Map the Workflow onto Available Resources, Execute the Workflow. Other labels: Workflow Template, Workflow Instance, Executable Workflow, Tasks to be executed, Grid Resources, Scheduling]

Slide Courtesy: Ewa Deelman, [email protected] www.isi.edu/~deelman pegasus.isi.edu

Page 33:

Execution Environment

Slide Courtesy: Ewa Deelman, [email protected] www.isi.edu/~deelman pegasus.isi.edu

[Figure: execution environment. Labels: LOCAL SUBMIT HOST, Pegasus, DAGMan, Condor Q, Condor-G, Condor-C; four resource blocks, each labelled GRAM, PBS, LSF, Condor, GridFTP, HTTP, Storage]

Page 34:

Queues are evil…

Is Advance Reservation a solution?

Page 35:

Might be… For sure, there are several challenges with respect to workflows: e.g., given a user-specified deadline, how can we make reservations for individual tasks?

Henan Zhao, Rizos Sakellariou. Advance Reservation Policies for Workflows. Proceedings of the 12th Workshop on Job Scheduling Strategies for Parallel Processing, 2006.

Page 36:

Advance Reservation still provides only a limited level of service!

Can we think of a model where:

• users specify their constraints,

• make an agreement (legally binding contract) with the resource owner (Service Level Agreement, SLA)

• it’s up to the system to do the scheduling (based on the SLAs) to honour the agreement.

Viktor Yarmolenko, Rizos Sakellariou. An Evaluation of Heuristics for SLA-based parallel job scheduling. High Performance Grid Computing Workshop, IPDPS, 2006.

http://www.gridscheduling.org

Viktor Yarmolenko, Rizos Sakellariou. Towards Increased Expressiveness in Service Level Agreements. Concurrency and Computation: Practice and Experience, 2007.

Page 37:

SLA based job scheduling

• SLA based job scheduling can offer the levels of service currently missing:
– It happens all the time in the real-world!

• But, there are several key challenges to address:
– Build appropriate protocols (legally binding), behaviour models, etc. for negotiation and re-negotiation
– Pricing Policies (income, penalties, etc…)
– Manage complexity
– Regulation, monitoring, dispute resolution…
– Convince the users to change attitudes!

• Scheduling the SLAs doesn’t appear to be the biggest challenge… But:
– How to schedule workflows using SLAs (how to deal with co-allocation problems, for instance) is a big challenge!
– Needs extensive evaluation!

Page 38:

To summarize…

• Understanding the basic static scenarios and having robust solutions for those scenarios helps the extension to more complex cases…

• Pretty much everything here is addressed by heuristics. Their evaluation requires extensive experimentation. Still:
– No agreement about what DAGs (workflows) look like.
– No agreement about how heterogeneous resources really are.

• There are indications that sophisticated DAG scheduling may not be very relevant for workflows. But, there are optimization problems that relate to:
– Data handling, Licences?, Budget?, (or multiple criteria)…

and, above all…

Page 39:

What is the way to ease the constraints imposed by the traditional queue-based models for job scheduling?

Page 40:

I’d be happy to hear from anyone with interests in these problems.

You are also welcome to come and visit us in Manchester!