Overhead Supercomputing 2011
Workflow Overhead Analysis and Optimizations
Weiwei Chen, Ewa Deelman
Information Sciences Institute, University of Southern California
{wchen,deelman}@isi.edu
WORKS11, Nov 14 2011, Seattle WA
Outline
• Introduction • Overhead modeling • Cumulative overhead • Experiments and evaluations • Conclusions and future work
Introduction • Workflow Optimization • Scheduling • Reducing Runtime • Reducing and Overlapping Overheads
• Overheads • Benefits • Workflow modeling and simulation • Performance evaluation • New optimization methods
Fig 1 System Overview
Modeling Overheads
Workflow Events: • Workflow Engine Start • Workflow Engine Finished
Job Events: • Job Release • Job Submit • Job Execute • Job Terminate • Postscript Start • Postscript Terminate
Intervals shown on the timeline: • Workflow engine delay • Queue delay • Runtime • Postscript delay • Makespan
1 http://pegasus.isi.edu/wms/docs/3.1/monitoring_debugging_stats.php#plotting_statistics
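As a rough illustration of how the job events on this slide could be turned into the named intervals, here is a minimal Python sketch. The pairing of events to intervals (release to submit as workflow engine delay, submit to execute as queue delay, execute to terminate as runtime, postscript start to postscript terminate as postscript delay) is an assumption read off the order of the events on the slide. `JobEvents` and `job_intervals` are hypothetical names, not part of Pegasus, and the timestamps are made up.

```python
from dataclasses import dataclass


@dataclass
class JobEvents:
    """Timestamps (seconds) of one job's events; field names are hypothetical."""
    release: float
    submit: float
    execute: float
    terminate: float
    postscript_start: float
    postscript_terminate: float


def job_intervals(ev: JobEvents) -> dict:
    """Map each interval name to a (start, end) pair.

    The event-to-interval pairing below is an assumption, not confirmed by the slides.
    """
    return {
        "workflow engine delay": (ev.release, ev.submit),
        "queue delay": (ev.submit, ev.execute),
        "runtime": (ev.execute, ev.terminate),
        "postscript delay": (ev.postscript_start, ev.postscript_terminate),
    }


# Made-up timestamps for a single job.
ev = JobEvents(release=0, submit=10, execute=30, terminate=90,
               postscript_start=90, postscript_terminate=100)
durations = {name: end - start for name, (start, end) in job_intervals(ev).items()}
```

Keeping (start, end) pairs rather than bare durations matters for the cumulative metrics, since overlaps between jobs can only be detected from the intervals themselves.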
Cumulative Overhead (O1)
O1(workflow engine delay) = 10 + 10 + 10 = 30
O1(queue delay) = 10 + 20 + 10 = 40
O1(data transfer delay) = 10
O1(postscript delay) = 10 + 20 + 10 = 40
• O1 simply sums the overheads of each type over all jobs.
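The O1 sums on this slide can be reproduced with a few lines of Python. This is only a sketch: the per-type durations are taken from the slide, but the job names and the assignment of each duration to a particular job are hypothetical, and `cumulative_o1` is an illustrative name.

```python
from collections import defaultdict

# (job, overhead type, seconds). The per-type durations come from the slide;
# the job names and which job owns which duration are hypothetical.
overheads = [
    ("job1", "workflow engine delay", 10),
    ("job2", "workflow engine delay", 10),
    ("job3", "workflow engine delay", 10),
    ("job1", "queue delay", 10),
    ("job2", "queue delay", 20),
    ("job3", "queue delay", 10),
    ("job1", "data transfer delay", 10),
    ("job1", "postscript delay", 10),
    ("job2", "postscript delay", 20),
    ("job3", "postscript delay", 10),
]


def cumulative_o1(records):
    """Sum durations per overhead type, ignoring any overlap in time."""
    totals = defaultdict(int)
    for _job, kind, seconds in records:
        totals[kind] += seconds
    return dict(totals)


o1 = cumulative_o1(overheads)
```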
Cumulative Overhead (O2)
O2(workflow engine delay) = 20
O2(queue delay) = 30
O2(data transfer delay) = 10
O2(postscript delay) = 40
• O2 subtracts from O1 the overlaps of the same type of overhead.
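One way to compute O2 for a single overhead type is to take the total length of the union of that type's intervals, so that time covered by several jobs at once is counted only once. The sketch below assumes this reading of "subtracting the overlaps"; `union_length` is a hypothetical helper and the intervals are made up, since the slide's timeline is not recoverable from the transcript.

```python
def union_length(intervals):
    """Total time covered by a list of (start, end) intervals, counting overlaps once."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Gap before this interval: close the current run and start a new one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or touching: extend the current run.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


# Made-up queue delays of two jobs that overlap for 5 seconds.
queue_delays = [(0, 10), (5, 25)]
o1_queue = sum(end - start for start, end in queue_delays)  # 10 + 20
o2_queue = union_length(queue_delays)                       # overlap counted once
```

With these made-up intervals O1 is 30 and O2 is 25, the difference being the 5 seconds during which both jobs were queued.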
Cumulative Overhead (O3)
O3(workflow engine delay) = 20
O3(queue delay) = 20
O3(data transfer delay) = 10
O3(postscript delay) = 30
• O3 subtracts the overlaps between different types of overhead from O2.
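The slide does not spell out how an overlap between two different overhead types is attributed, so the following Python sketch rests on a stated assumption: for each type, O3 is taken to be its O2 value minus the time during which any other overhead type is also active. The helper names (`merge`, `intersection_length`, `o3`) and the intervals are hypothetical. The intervals were chosen so that the queue and postscript values happen to match the slide's O2 and O3 numbers; they are not the slide's actual timeline.

```python
def merge(intervals):
    """Merge overlapping (start, end) intervals into a sorted list of disjoint runs."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def intersection_length(a, b):
    """Total overlap between two sorted lists of disjoint intervals."""
    i = j = 0
    total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if hi > lo:
            total += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def o3(by_type, kind):
    """Assumed O3: O2 of `kind` minus time shared with any other overhead type."""
    own = merge(by_type[kind])
    others = merge([iv for k, ivs in by_type.items() if k != kind for iv in ivs])
    o2 = sum(end - start for start, end in own)
    return o2 - intersection_length(own, others)


# Made-up intervals: one queue delay overlaps one postscript delay for 10 seconds.
by_type = {
    "queue delay": [(0, 20), (30, 40)],
    "postscript delay": [(10, 20), (40, 70)],
}
o3_queue = o3(by_type, "queue delay")
o3_postscript = o3(by_type, "postscript delay")
```

Under this assumption a shared stretch of time is removed from both types involved, which is one possible reading of the slide's numbers; other attribution rules are conceivable.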
Experiments
• Environments: • Amazon EC2 • FutureGrid • HPCC • Other clusters
• Applications: • Biology: Epigenomics, Proteomics, SIPHT • Earthquake science: Broadband, CyberShake • Astronomy: Montage • Physics: LIGO
• Optimizations: • Job Clustering • Resource Provisioning • Data Pre-Staging • Throttling
Data are available at http://pegasus.isi.edu/workflow_gallery/
Experiments
Distribution of Overheads
Job Clustering
• Merging small jobs into a clustered job
With job clustering, the cumulative overheads decrease greatly because there are fewer jobs.
[Figure: paired panels comparing overheads without clustering and with clustering]
Percentage (%) = cumulative overhead (seconds) / makespan (seconds) × 100
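The percentage on this slide is a plain ratio of a cumulative overhead to the makespan. A minimal Python sketch follows; `overhead_percentage` is a hypothetical name and the numbers are made up for illustration, not taken from the experiments.

```python
def overhead_percentage(cumulative_overhead_s, makespan_s):
    """Cumulative overhead as a percentage of the workflow makespan."""
    if makespan_s <= 0:
        raise ValueError("makespan must be positive")
    return 100.0 * cumulative_overhead_s / makespan_s


# Made-up values: 40 s of cumulative queue delay in a 200 s workflow.
queue_share = overhead_percentage(40, 200)
```

Because O1 adds up overheads of jobs that may run in parallel, its percentage can exceed 100; the union-based metrics for a single type cannot, since they measure wall-clock time within the makespan.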
Resource Provisioning
• Deploy pilot jobs as placeholders
O2 and O3 show more clearly than O1 that the share of runtime increases with provisioning.
[Figure: paired panels comparing overheads without provisioning and with provisioning]
Conclusions and Future Work
Conclusions • Overhead Analysis • A complete view of these three metrics
Future Work • More optimization methods. • Dynamic provisioning
Q & A
• Pegasus Group: http://pegasus.isi.edu/ • FutureGrid: https://portal.futuregrid.org/ • Scripts are available at http://isi.edu/~wchen/techniques.html • Data are available at http://pegasus.isi.edu/workflow_gallery/