Application Scheduling on Distributed Resources
Transcript of Application Scheduling on Distributed Resources
Francine Berman
U. C. San Diego
and
NPACI
The Computational Grid
• Computational Grid becoming increasingly prevalent as a computational platform
• Focus is on using distributed resources as an ensemble
  – clusters of workstations
  – MPPs
  – remote instruments
  – visualization sites
  – storage archives
Programming the Grid
• How do we write Grid programs?
• How do we achieve program performance?
• First try: extend MPP programs ...
Programming the Grid
MPP Programming Model
– processors, network are uniform
– single administrative domain
– “machine” is typically dedicated to user
Grid Programming Model
– resources are distributed, heterogeneous
– grid may comprise multiple administrative domains
– resources are shared by multiple users
Achieving Program Performance
MPP programs achieve performance by
– dedicating resources
– careful staging of computation and data
– considerable coordination
Computational Grids are dynamic
– load and availability of resources vary with time
– both system and application behavior hard to predict
Grid Programming Challenge: How can programs leverage the deliverable performance of the Grid at execution time?
Scheduling
• Scheduling is fundamental to performance
• On the Computational Grid, scheduling mechanism must
– perceive the performance impact of system resources on the application
– adapt to dynamic conditions
– optimize application schedule for Grid at execution time
Whose Job Is It?
• Application scheduling can be performed by many entities
  – Resource scheduler, job scheduler, program developer, system administrator, user, application scheduler
[Figure: Grid Application Development System. A PSE feeds the source application and libraries through a whole-program compiler to produce a config. object program; a Grid runtime system with a real-time performance monitor, dynamic optimizer, service negotiator, and scheduler negotiates for software components and returns performance feedback when a performance problem arises]
Scheduling and Performance
• Achieving application performance can conflict with system performance goals
– Resource Scheduler -- perf measure is utilization
– Job Scheduler -- perf measure is throughput
– System Administrator -- focuses on system perf
• Goal of scheduling application is to promote application performance over performance of other applications and system components
– Application Scheduler -- perf measure is app.-specific
Self-Centered Scheduling
• Everything in the system is evaluated in terms of its impact on the application.
  – performance of each system component can be considered as a measurable quantity
  – forecasts of quantities relevant to the application can be manipulated to determine a schedule
• This simple paradigm forms the basis for AppLeS.
AppLeS
Joint project with Rich Wolski
• AppLeS = Application-Level Scheduler
• Each application has its own self-centered AppLeS agent.
• Custom application schedule achieved through
  – selection of potentially efficient resource sets
  – performance estimation of dynamic system parameters and application performance for the execution time frame
  – adaptation to perceived dynamic conditions
AppLeS Architecture
• AppLeS incorporates
  – application-specific information
  – dynamic information
  – prediction
• Each AppLeS schedule is customized for its application and environment.
• AppLeS scheduler promotes performance as defined by the user
  – execution time
  – convergence
  – turnaround time
[Figure: AppLeS agent combining the NWS (Wolski), user preferences, and an application performance model; a planner/resource selector maps the application onto actual Grid/cluster resources and infrastructure]
Network Weather Service (Wolski)
• The NWS provides dynamic resource information for AppLeS
• NWS is stand-alone system
• NWS
  – monitors current system state
  – provides best forecast of resource load from multiple models
[Figure: NWS architecture with a sensor interface and a reporting interface; a forecaster draws on multiple models]
The Role of Prediction
• Is monitoring enough for scheduling?
[Figure: Fast Ethernet bandwidth at SDSC. Measurements in megabits per second vs. time of day, Tue through the following Tue]
Monitoring vs. Forecasting
• Monitored data provides a snapshot of what has happened; forecasting tells us what will happen.
• Last value is not always the best predictor...
[Figure: Mean square error performance, SDSC Ethernet. MSE of forecasting models compared with using the monitored data directly]
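The multi-model forecasting idea described above (monitor the resource, score several simple predictors against each new measurement, and report the forecast of whichever model has been most accurate) can be sketched as follows. This is a hypothetical illustration in the spirit of the NWS, not its actual code; the `Forecaster` class and the three model names are invented:

```python
# Sketch of NWS-style forecasting: run several simple predictors over the
# measurement history, accumulate each one's squared error, and report the
# forecast of the lowest-error model so far. (Hypothetical illustration.)

class Forecaster:
    def __init__(self, alpha=0.3):
        self.alpha = alpha  # smoothing factor for exponential smoothing
        # Each model's current prediction for the *next* measurement.
        self.models = {"last_value": None,
                       "running_mean": None,
                       "exp_smoothing": None}
        self.errors = {name: 0.0 for name in self.models}
        self.count = 0

    def update(self, measurement):
        # Score each model's previous prediction against the new measurement.
        for name, pred in self.models.items():
            if pred is not None:
                self.errors[name] += (pred - measurement) ** 2
        # Update each model's next prediction.
        self.models["last_value"] = measurement
        self.count += 1
        prev_mean = self.models["running_mean"]
        if prev_mean is None:
            self.models["running_mean"] = measurement
        else:
            self.models["running_mean"] = prev_mean + (measurement - prev_mean) / self.count
        prev = self.models["exp_smoothing"]
        self.models["exp_smoothing"] = (measurement if prev is None
                                        else self.alpha * measurement + (1 - self.alpha) * prev)

    def forecast(self):
        # Report the prediction of the model with the lowest accumulated error.
        best = min(self.errors, key=self.errors.get)
        return self.models[best]
```

Feeding the forecaster a stream of bandwidth measurements and calling `forecast()` then yields a prediction that automatically falls back to whichever simple model is tracking the resource best, which is why "last value" is not always the winner.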
Using Forecasting in Scheduling
• How much work should each processor be given?
• Jacobi2D AppLeS solves equations for the area Area_i assigned to each processor
[Figure: N x N Jacobi grid partitioned into strips of area Area_i across processors P1, P2, P3]
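The per-processor area computation can be sketched as a load-balancing equation: if processor i is forecast to compute at speed s_i, requiring equal predicted finish times Area_i / s_i for all processors, subject to the areas summing to N*N, gives Area_i = N*N * s_i / sum(s). A minimal sketch with made-up speed forecasts (`partition` and its parameters are illustrative, not from the original):

```python
# Hypothetical sketch: partition an N x N Jacobi grid into strips whose
# areas equalize the predicted finish times of the processors.
# Solving  Area_i / s_i = Area_j / s_j  with  sum(Area_i) = N*N
# gives    Area_i = N*N * s_i / sum(s).

def partition(n, predicted_speeds):
    """Return the grid area to assign each processor, proportional to its
    forecast speed (cells per second)."""
    total_area = n * n
    total_speed = sum(predicted_speeds)
    return [total_area * s / total_speed for s in predicted_speeds]

# Example with invented forecasts for processors P1..P3:
areas = partition(1000, [120.0, 60.0, 20.0])
# Each processor's predicted time Area_i / s_i is now equal.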
[Figure: Fast Ethernet bandwidth at SDSC. Measurements and exponential smoothing predictions in megabits per second vs. time of day, Tue through the following Tue]
Good Predictions Promote Good Schedules
• Jacobi2D experiments
[Figure: Comparison of execution times. Execution time in seconds vs. problem size (1000 to 2000) for compile-time blocked, compile-time irregular strip, and runtime schedules]
SARA: An AppLeS-in-Progress
• SARA = Synthetic Aperture Radar Atlas
  – application developed at JPL and SDSC
• Goal: Assemble/process files for user’s desired image
  – thumbnail image shown to user
  – user selects desired bounding box for more detailed viewing
  – SARA provides detailed image in a variety of formats
Simple SARA
• Simple SARA focuses on obtaining remote data quickly
• Code developed by Alan Su
[Figure: one compute server connected to several data servers. Computation servers and data servers are logical entities, not necessarily different nodes; the network is shared by a variable number of users, and computation is assumed to be done at the compute servers]
Simple SARA AppLeS
• Focus on resource selection problem: Which site can deliver data the fastest?
– Data for image accessed over shared networks
  – Data sets 1.4 - 3 megabytes, representative of SARA file sizes
  – Servers used for experiments (reached via the vBNS or via the general Internet)
    • lolland.cc.gatech.edu
    • sitar.cs.uiuc
    • perigee.chpc.utah.edu
    • mead2.uwashington.edu
    • spin.cacr.caltech.edu
Which is “Closer”?
• Sites on the east coast or sites on the west coast?
• Sites on the vBNS or sites on the general Internet?
• Consistently the same site or different sites at different times?
Depends a lot on traffic ...
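Because "closeness" depends on traffic, the resource-selection step reduces to comparing predicted transfer times built from bandwidth forecasts. A minimal sketch of that decision, assuming made-up forecast numbers and an invented `pick_server` helper (not the actual Simple SARA code):

```python
# Hypothetical sketch of Simple SARA's resource-selection step: given an
# NWS-style bandwidth forecast for each data server, pick the server with
# the smallest predicted transfer time for the requested file.

def pick_server(file_size_mb, forecasts_mbps):
    """forecasts_mbps maps server name -> predicted bandwidth in
    megabits/second; 8 * file_size_mb megabits must be transferred."""
    def predicted_time(server):
        return 8 * file_size_mb / forecasts_mbps[server]
    return min(forecasts_mbps, key=predicted_time)

# Invented forecasts for three of the experiment's servers:
forecasts = {"lolland.cc.gatech.edu": 1.8,
             "perigee.chpc.utah.edu": 4.2,
             "spin.cacr.caltech.edu": 2.9}
best = pick_server(3.0, forecasts)  # 3 MB data set
```

Re-running the selection with fresh forecasts at execution time is what lets the answer to "which is closer?" change as traffic changes.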
Simple SARA Experiments
• Ran back-to-back experiments from remote sites to UCSD/PCL
• Wolski’s Network Weather Service provides forecasts of network load and availability
• Experiments run during normal business hours mid-week
Preliminary Results
• Experiment with larger data set (3 Mbytes)
  – During this time frame, general Internet provides data mostly faster than vBNS
• Experiment with smaller data set (1.4 Mbytes)
  – During this time frame, east coast sites provide data mostly faster than west coast sites
More Preliminary Results
9/21/98 Experiments
• Clinton Grand Jury webcast commenced at trial 62
Distributed Data Applications
• SARA representative of larger class of distributed data applications
• Simple SARA template being extended to accommodate
  – replicated data sources
  – multiple files per image
  – parallel data acquisition
  – intermediate compute sites
  – web interface, etc.
[Figure: a client drawing on multiple compute servers and data servers]
Distributed Data Applications
• Move the computation or move the data?
• Which compute servers to use?
• Which servers to use for multiple files?
A Bushel of AppLeS … almost
• During the first “phase” of the project, we’ve focused on developing AppLeS applications
– Jacobi2D
– DOT
– SRB
– Simple SARA
– Genetic Algorithm
– CompLib
– INS2D
– Tomography, ...
• What have we learned?
Lessons Learned From AppLeS
[Figure: compile-time blocked partitioning vs. run-time AppLeS non-uniform strip partitioning]
• Dynamic information is critical.
Lessons Learned from AppLeS
• Program execution and parameters may exhibit a range of performance
Lessons Learned from AppLeS
• Knowing something about the “goodness” of performance predictions can improve scheduling
[Figure: SOR CompLib execution time in seconds for small, medium, and large problem sizes, comparing SuperAppLeS, AppLeS, and Mentat]
Lessons Learned from AppLeS
• Performance of application sensitive to scheduling policy, data, and system characteristics
Achieving Performance on the Computational Grid
Adaptivity is a fundamental paradigm for achieving performance on the Grid.
• AppLeS uses adaptivity to leverage deliverable resource performance
• Performance impact of all components considered
• AppLeS agents target dynamic, multi-user distributed environments
Related Work
• Application Schedulers
  – Mars, Prophet/Gallop, VDCE
• Scheduling Services
  – Globus GRAM
• Resource Allocators
  – I-Soft, PBS, LSF, Maui Scheduler, Nile
• PSEs
  – Nimrod, NEOS, NetSolve, Ninf
• High-Throughput Schedulers
  – Condor
• Performance Steering
  – Autopilot, SciRun
Current AppLeS Projects
• AppLeS Templates
– distributed data applications
– parameter sweeps
– master/slave applications
– data parallel stencil applications
• Performance Prediction Engineering
  – scheduling with quality of information
    • accuracy
    • lifetime
    • overhead
AppLeS Projects
• Real World Scheduling
  – Contingency Scheduling
    • scheduling during execution
  – Imperfect Scheduling
    • scheduling with
      – partial information
      – poor information
      – dynamically changing information
  – Multischeduling
    • resource economies
    • scheduling “social structure”
The Brave New World
• “Grid-aware” programming will require a comprehensive development and execution environment
  – Adaptation will be a fundamental paradigm
[Figure: Grid Application Development System, as shown earlier]
Project Information
• Thanks to NSF, NPACI, Darpa, DoD, NASA
• AppLeS Corps:
  – Francine Berman
  – Rich Wolski
  – Walfredo Cirne
  – Henri Casanova
  – Marcio Faerman
  – Markus Fischer
  – Jaime Frey
  – Jim Hayes
  – Graziano Obertelli
  – Jenny Schopf
  – Gary Shao
  – Shava Smallen
  – Alan Su
  – Dmitrii Zagorodnov
• AppLeS Home Page: http://www-cse.ucsd.edu/groups/hpcl/apples.html