
Page 1: Overhead Supercomputing 2011

Workflow Overhead Analysis and Optimizations

Weiwei Chen, Ewa Deelman
Information Sciences Institute, University of Southern California
{wchen,deelman}@isi.edu

WORKS11, Nov 14 2011, Seattle WA

Page 2: Overhead Supercomputing 2011

Outline

•  Introduction
•  Overhead modeling
•  Cumulative overhead
•  Experiments and evaluations
•  Conclusions and future work

Page 3: Overhead Supercomputing 2011

Introduction

•  Workflow Optimization
   •  Scheduling
   •  Reducing Runtime
   •  Reducing and Overlapping Overheads
•  Overheads
•  Benefits
   •  Workflow modeling and simulation
   •  Performance evaluation
   •  New optimization methods

Fig. 1: System Overview

Page 4: Overhead Supercomputing 2011

Outline

•  Introduction
•  Overhead modeling
•  Cumulative overhead
•  Experiments and evaluations
•  Conclusions and future work

Page 5: Overhead Supercomputing 2011

Modeling Overheads

Workflow Events:
•  Workflow Engine Start
•  Workflow Engine Finished

Job Events:
•  Job Release
•  Job Submit
•  Job Execute
•  Job Terminate
•  Postscript Start
•  Postscript Terminate

[Timeline figure: each job contributes a workflow engine delay, queue delay, runtime, and postscript delay; the overall span is the workflow makespan.]

1 http://pegasus.isi.edu/wms/docs/3.1/monitoring_debugging_stats.php#plotting_statistics
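To make the model concrete, here is a minimal Python sketch of how these per-job delays could be derived from the event timestamps above; the field names and the exact event boundaries assigned to each delay are illustrative assumptions, not the actual Pegasus monitoring schema.

```python
# Sketch only: per-job delays derived from the job events listed above.
# Field names and delay boundaries are assumptions for illustration.
from dataclasses import dataclass

@dataclass
class JobEvents:
    release: float      # Job Release (by the workflow engine)
    submit: float       # Job Submit (to the queue)
    execute: float      # Job Execute (starts running)
    terminate: float    # Job Terminate
    post_start: float   # Postscript Start
    post_end: float     # Postscript Terminate

def delays(j: JobEvents) -> dict:
    return {
        "workflow_engine_delay": j.submit - j.release,
        "queue_delay": j.execute - j.submit,
        "runtime": j.terminate - j.execute,
        "postscript_delay": j.post_end - j.post_start,
    }

j = JobEvents(release=0, submit=10, execute=30, terminate=90, post_start=90, post_end=100)
print(delays(j))   # {'workflow_engine_delay': 10, 'queue_delay': 20, 'runtime': 60, 'postscript_delay': 10}
```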

Page 6: Overhead Supercomputing 2011

Outline

•  Introduction
•  Overhead modeling
•  Cumulative overhead
•  Experiments and evaluations
•  Conclusions and future work

Page 7: Overhead Supercomputing 2011

Cumulative Overhead (O1)

O1(workflow engine delay) = 10+10+10 = 30
O1(queue delay) = 10+20+10 = 40
O1(data transfer delay) = 10
O1(postscript delay) = 10+20+10 = 40

•  O1 simply sums overheads of the same type across all jobs.
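A minimal sketch of O1, assuming each job's overhead of a given type is recorded as a (start, end) interval in seconds; the interval values below are hypothetical, chosen only to reproduce the queue-delay arithmetic on this slide.

```python
# O1: sum one type of overhead over all jobs, ignoring any overlap in time.
def o1(intervals):
    """intervals: list of (start, end) tuples, one per job, for a single overhead type."""
    return sum(end - start for start, end in intervals)

queue_delays = [(0, 10), (10, 30), (20, 30)]   # hypothetical per-job queue-delay intervals
print(o1(queue_delays))                        # 10 + 20 + 10 = 40, as on the slide
```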

Page 8: Overhead Supercomputing 2011

Cumulative Overhead (O2)

O2(workflow engine delay) = 20
O2(queue delay) = 30
O2(data transfer delay) = 10
O2(postscript delay) = 40

•  O2 subtracts from O1 the time during which overheads of the same type overlap.
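Continuing the same sketch, O2 can be computed as an interval union per overhead type, so time where overheads of the same type overlap is counted only once; the intervals are the same hypothetical ones as in the O1 sketch.

```python
def merge(intervals):
    """Union of (start, end) intervals as a sorted, non-overlapping list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)   # overlaps: extend the previous interval
        else:
            merged.append([start, end])
    return merged

def o2(intervals):
    return sum(end - start for start, end in merge(intervals))

queue_delays = [(0, 10), (10, 30), (20, 30)]   # hypothetical, as in the O1 sketch
print(o2(queue_delays))                        # the union covers 0-30, so O2 = 30
```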

Page 9: Overhead Supercomputing 2011

Cumulative Overhead (O3)

O3(workflow engine delay) = 20
O3(queue delay) = 20
O3(data transfer delay) = 10
O3(postscript delay) = 30

•  O3 subtracts from O2 the time during which overheads of different types overlap.
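One possible realization of O3, reusing merge() and o2() from the O2 sketch: time already covered by a previously counted overhead type is subtracted, so each second of overhead is charged to at most one type. The priority order that decides which type keeps an overlapping span is an assumption here; the paper may attribute overlaps differently.

```python
def overlap(a, b):
    """Total overlap between two sorted, non-overlapping interval lists."""
    total, i, j = 0.0, 0, 0
    while i < len(a) and j < len(b):
        total += max(0.0, min(a[i][1], b[j][1]) - max(a[i][0], b[j][0]))
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total

def o3(by_type, priority):
    """by_type: {overhead type: list of (start, end)}; priority: assumed attribution order."""
    counted, result = [], {}
    for t in priority:                      # e.g. ["engine", "queue", "transfer", "postscript"]
        merged = merge(by_type[t])          # merge() and o2() as in the O2 sketch
        result[t] = o2(by_type[t]) - overlap(merged, counted)
        counted = merge(counted + merged)
    return result
```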

Page 10: Overhead Supercomputing 2011

Outline

•  Introduction
•  Overhead modeling
•  Cumulative overhead
•  Experiments and evaluations
•  Conclusions and future work

Page 11: Overhead Supercomputing 2011

Experiments

•  Environments:
   •  Amazon EC2
   •  FutureGrid
   •  HPCC
   •  Other clusters
•  Applications:
   •  Biology: Epigenomics, Proteomics, SIPHT
   •  Earthquake science: Broadband, CyberShake
   •  Astronomy: Montage
   •  Physics: LIGO
•  Optimizations:
   •  Job Clustering
   •  Resource Provisioning
   •  Data Pre-Staging
   •  Throttling

Data are available at http://pegasus.isi.edu/workflow_gallery/

Page 12: Overhead Supercomputing 2011

Experiments

Page 13: Overhead Supercomputing 2011

Distribution of Overheads

Page 14: Overhead Supercomputing 2011

Job Clustering

With job clustering, the cumulative overheads decrease greatly because there are far fewer jobs.

[Charts: cumulative overheads without clustering vs. with clustering.]

•  Merging small jobs into a clustered job  

Percentage (%) = cumulative overhead (seconds) / makespan (seconds)
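A toy sketch of the clustering idea (horizontal clustering of tasks on the same workflow level into groups of a fixed size k); the grouping policy and names are illustrative, not Pegasus's actual clustering configuration.

```python
# Group tasks on the same workflow level into clusters of size k, so per-job
# overheads (workflow engine, queue, postscript delays) are paid once per
# cluster rather than once per task. Fixed-size grouping is an illustrative choice.
def cluster(tasks_by_level, k):
    return {level: [tasks[i:i + k] for i in range(0, len(tasks), k)]
            for level, tasks in tasks_by_level.items()}

levels = {1: ["job%d" % i for i in range(20)]}
print(len(cluster(levels, 5)[1]))   # 20 small jobs become 4 clustered jobs
```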

Page 15: Overhead Supercomputing 2011

Resource Provisioning

Compared to O1, O2 and O3 show more clearly that the proportion of the makespan spent on runtime has increased.

[Charts: cumulative overheads without provisioning vs. with provisioning.]

•  Deploy pilot jobs as placeholders  
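A toy sketch of the pilot-job ("placeholder") idea: one pilot is submitted through the batch system, and once it starts it pulls many workflow tasks from a local queue, so the batch queue delay is paid per pilot rather than per task. This is only an illustration, not the provisioning system used in these experiments.

```python
# Toy pilot job: runs inside an already-provisioned slot and executes tasks
# pulled from a queue until it receives a None sentinel.
import queue
import subprocess

def pilot(task_queue):
    while True:
        cmd = task_queue.get()
        if cmd is None:          # sentinel: release the slot
            break
        subprocess.run(cmd, check=False)

tasks = queue.Queue()
for i in range(3):
    tasks.put(["echo", "task %d" % i])
tasks.put(None)
pilot(tasks)                     # one batch queue wait amortized over three tasks
```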

Page 16: Overhead Supercomputing 2011

Outline

•  Introduction
•  Overhead modeling
•  Cumulative overhead
•  Experiments and evaluations
•  Conclusions and future work

Page 17: Overhead Supercomputing 2011

Conclusions and Future Work

Conclusions
•  Overhead Analysis
•  The three metrics (O1, O2, O3) together give a complete view of the overheads

Future Work
•  More optimization methods
•  Dynamic provisioning

Page 18: Overhead Supercomputing 2011

Q & A

•  Pegasus Group: http://pegasus.isi.edu/
•  FutureGrid: https://portal.futuregrid.org/
•  Scripts are available at http://isi.edu/~wchen/techniques.html
•  Data are available at http://pegasus.isi.edu/workflow_gallery/