Scheduling From the Perspective of the Application By Francine Berman & Richard Wolski...


Scheduling From the Perspective of the Application

By Francine Berman & Richard Wolski

Presenter:Kun-chan Lan

Outline of the talk

• Overview

• Case study

• Application-centric scheduling

• AppleS Project

• Result

• Conclusion

Overview..

• Why scheduling is important in a metacomputing system

– Better utilization of resources

– Performance efficiency

• Application-centric scheduling

– Everything is evaluated in terms of its impact on the application

..Overview..

• Metacomputing

– Aggregation of distributed and high-performance resources on coordinated networks, to deliver the performance required to address modern scientific problems

– Heterogeneity (administrative domain, software/hardware architecture, protocol, etc.)

– Contention

Parallel computing vs. Metacomputing

Parallel computing:

• Performance oriented

• Aggregation of resources from a single site (a multi-processor machine)

• Communicates via dedicated devices such as switches, shared memory, etc.

• Homogeneous (hardware/software infrastructure, administrative domain, etc.)

Metacomputing:

• Performance oriented

• Aggregation of resources from multiple sites

• Communicates via a distributed network

• Heterogeneous resources

• A software infrastructure is required to coordinate distributed networks into a communication substrate

Scheduling for parallel computing

• Multiprocessor nodes generally have uniform capabilities

• Usually there is a centralized system scheduler

• Processors are dedicated to the tasks of a single application, so there is no contention

Scheduling for Metacomputing

• Resources are often managed by separate, uncoordinated schedulers: there is no single system scheduler

• Data conversion between sites

• Overlapping of communication and computation to amortize network communication

• Separate optimized algorithms for tasks on different machines

Outline of the talk

• Overview

• Case study

• Application-centric scheduling

• AppleS

• Result

• Conclusion

Case 1: CLEO/NILE

CLEO

• A high-energy physics project

• Each collision detected by CLEO is called an event

• Each event is recorded and passed to a program called “pass2” to compute offline the physical properties of the particles

• Records computed by “pass2” are read and compressed by another program for certain frequently accessed fields

• One terabyte of data is generated per year

Nile..

• A by-product of CLEO

• Each of CLEO’s collaborating institutions is a site

• Goal

– Provide a scalable, fault-tolerant, heterogeneous system of hundreds of commodity workstations, with access to a distributed database in excess of 100 TB

– Resources (CLEO data) are spread across the United States and Canada at 24 collaborating institutions

– Resources can be accessed and used transparently from anywhere by any member of the CLEO collaboration

..Nile..

• Not specific to CLEO; can be used by any application that is easily parallelizable

• Currently implemented in CORBA/Java

• Three components

– Nile Control System (NCS)

– Data Repository

– User Interface

• Interconnecting networks include ATM, FDDI, and Ethernet

Nile..

..Nile..

• NCS:

– Site manager:

• Interface between NCS and clients

• Receives job requests

• For each job request, creates a job manager, stores the job context in the Job Database, and places the job in the queue

• Stateless

..Nile..

• NCS:

– Job DB:

• Stores the state of jobs

– Resource DB:

• Maintains the state of the hardware resources available at the local site

– Data Location Manager:

• Translates the logical data specification in the job profile into a set of corresponding physical data objects, which can be used to determine suitable hosts to run the sub-jobs

..Nile..

• NCS:

– Job Manager:

• Divides a single job into a set of sub-jobs which can be executed in parallel

• Monitors the state of the sub-jobs

• Collects and assembles the results, and passes them back to the site manager

– Planner:

• Produces an execution plan consisting of a list of sub-jobs, each having a host machine and a set of data objects
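
The shape of such an execution plan can be sketched in a few lines of Python. This is only an illustration, not Nile's CORBA/Java implementation: the `SubJob` type, the `plan` function, the file and host names, and the "run each sub-job on the host that stores its data" policy are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class SubJob:
    host: str            # machine chosen to run this sub-job
    data_objects: list   # physical data objects the sub-job will read


def plan(data_location):
    """Build an execution plan: one sub-job per host that holds data.

    data_location maps each physical data object to the host storing it
    (the kind of answer a Data Location Manager would provide).
    """
    by_host = {}
    for obj, host in data_location.items():
        by_host.setdefault(host, []).append(obj)
    return [SubJob(host, sorted(objs)) for host, objs in sorted(by_host.items())]


# Hypothetical data placement, for illustration only.
locations = {"run1.evt": "hostA", "run2.evt": "hostB", "run3.evt": "hostA"}
execution_plan = plan(locations)
for sub_job in execution_plan:
    print(sub_job.host, sub_job.data_objects)
```

A real planner would also consult the Resource DB, for example to skip hosts that are down or overloaded.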

Characteristics of CLEO/NILE

• The quantity of data for the problem is so large that no single site can provide all the resources needed

• Efficient resource allocation is crucial

• Execution sites and network interconnections are heterogeneous

• Some resources are shared with other applications, so performance might vary greatly based on contention for resources

CASE 2: 3-D REACT

• Tries to predict the energy level of a reaction using quantum mechanics

• Simulates a hydrogen-deuterium reaction

• Essentially calculates the solution to a six-dimensional Schrödinger equation, which can be decomposed into three tasks

– LHSF (local hyper-spherical surface function)

– Log-D (logarithmic derivative propagation): uses the result of LHSF as input

– ASY: an asymptotic analysis of the matrices generated during the Log-D calculation

H + D2 → HD + D

Scheduling 3D-REACT

• Distribute 3D-REACT across two computation units

– Cray C90 at SDSC

– 64-node Intel Paragon at Caltech

• The problem is divided into smaller sub-domains of 5-20 surface functions each, so LHSF and Log-D can be executed concurrently

• First the C90 calculates the LHSF for a given sub-domain, and then the result is passed to the Paragon, which calculates the Log-D portion of that sub-domain

• While the Paragon is calculating the first sub-domain, the C90 can start calculating the second sub-domain

• After all the sub-domains have been processed, ASY determines whether the calculation should stop
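
The benefit of overlapping the two machines can be estimated with a small model. The sketch below is not taken from 3D-REACT: it assumes every sub-domain costs the same on each machine, ignores transfer and data-conversion time, and uses made-up timings.

```python
def sequential_time(n, t_lhsf, t_logd):
    """Run LHSF then Log-D for each sub-domain with no overlap."""
    return n * (t_lhsf + t_logd)


def pipelined_time(n, t_lhsf, t_logd):
    """Two-stage pipeline: Log-D of sub-domain i overlaps LHSF of sub-domain i+1."""
    lhsf_done = 0.0   # when the first machine finishes its current sub-domain
    logd_done = 0.0   # when the second machine finishes its current sub-domain
    for _ in range(n):
        lhsf_done += t_lhsf
        # Log-D starts once its input is ready and the second machine is free.
        logd_done = max(lhsf_done, logd_done) + t_logd
    return logd_done


n, t_lhsf, t_logd = 10, 3.0, 2.0            # made-up numbers
print(sequential_time(n, t_lhsf, t_logd))   # 50.0
print(pipelined_time(n, t_lhsf, t_logd))    # 32.0
```

In this model the pipelined time approaches n times the slower stage, which is why balancing the two stages matters.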

Characteristics of 3D-REACT

• The algorithm implemented by a task is optimized for the machine to which it is assigned

– E.g., the Log-D implementation used on the C90 is different from that used on the Paragon

• Computation and communication can be pipelined to amortize communication delays

• Data might need to be converted into a different format when being transferred between sites

– E.g., floating-point data needs to be converted when the C90 sends data to the Paragon

• Scheduling is critical for performance

– Each of the sub-tasks (LHSF/Log-D/ASY) can be executed on either machine

Outline of the talk

• Overview

• Case study

• Application-centric scheduling

• AppleS

• Result

• Conclusion

Generalization of Application-Centric scheduling

• Each application develops a schedule to optimize its own performance without regard to the performance goals of other applications which share the system

• The application-centric schedules of different applications are unrelated

• However, there are still some commonalities which underlie application-centric program development

Components of Application-Centric scheduling..

• Performance criteria/metrics

• Dynamic system state

• Application-specific resource locality

• Application performance characteristics

• User preferences

• Prediction

Performance criteria/metrics

• Performance criteria/metrics vary with the application

– E.g., to minimize execution time

• 3D-REACT: by maximizing speedup over a single-machine implementation

• NILE: by distributing the analysis of independent events

– Some common metrics

• Execution time

• Speedup

• Cost of execution cycles

– Users will attempt to optimize their usage of the same resources for different performance criteria at the same time

Dynamic system state

• Mixture of dedicated and non-dedicated resources

– Should the application wait until the dedicated resources become available, or

– Should it execute with lesser performance on the non-dedicated resources currently available?

• Requires dynamic assessment of

– Current system state

– Resource loads

– Short-term, but accurate, predictions

Application-specific Resource Locality

• Applications seek to use “close” resources?

• “Closeness” is a function of what the application requires from a resource as well as the resource’s capability

– “Distance” of resources: the resource performance deliverable to the application

• Are X and Y close?

[Figure: taskX on resource X and taskY on resource Y]
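
One way to make this notion concrete is to measure "distance" as the time a hand-off between the two tasks would take on a given link, and compare it with what the application can tolerate. The functions, the time budget, and all numbers below are illustrative assumptions, not definitions from the talk.

```python
def transfer_time(megabytes, bandwidth_mb_per_s, latency_s):
    """Time for one task to hand its output to the other."""
    return latency_s + megabytes / bandwidth_mb_per_s


def is_close(megabytes, bandwidth_mb_per_s, latency_s, budget_s):
    """X and Y are 'close' for this application if the hand-off fits its budget."""
    return transfer_time(megabytes, bandwidth_mb_per_s, latency_s) <= budget_s


# Same link (10 MB/s, 50 ms latency), two hypothetical applications.
link = dict(bandwidth_mb_per_s=10.0, latency_s=0.05)
print(is_close(megabytes=1.0, budget_s=1.0, **link))    # small messages: True
print(is_close(megabytes=500.0, budget_s=1.0, **link))  # bulk data: False
```

The same pair of machines is close for one application and far for another, which is the point of the slide's question.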

Application Characteristics

• Implementation-dependent and implementation-independent

• Some common categories of attributes

– Task-specific implementation characteristics

• Computation paradigm, number/size of data structures, data communication pattern, memory requirement, etc.

– Inter-task communication characteristics

• Data format for each task, pipeline size, communication regularity and frequency, etc.

– Application structure information

• Input/output requirements, iteration pattern, etc.

User Preferences

• Not necessarily directly related to application performance

• Act as a filter over the possible resources and implementations available to the user

Role of Prediction

• Prediction tells you

– Potential communication and computation behavior of the application

– Potential availability and load of resources

– Potential performance of the application with respect to candidate schedules

• Sources of prediction

– App-specific or app-independent benchmarks

– Statistical analysis

– Sensed or sampled data

– Analytical models
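
As a small illustration of prediction from sampled data, the sketch below replays a series of load samples against two simple one-step-ahead forecasters and keeps whichever would have been more accurate so far. The forecasters, the error measure, and the sample values are assumptions chosen for the example; the talk does not prescribe a particular method.

```python
def last_value(history):
    """Forecast: the next sample equals the most recent one."""
    return history[-1]


def window_mean(history, k=3):
    """Forecast: the next sample equals the mean of the last k samples."""
    recent = history[-k:]
    return sum(recent) / len(recent)


def mean_abs_error(forecaster, samples):
    """Replay the samples and score the one-step-ahead forecasts."""
    errors = [abs(forecaster(samples[:i]) - samples[i])
              for i in range(1, len(samples))]
    return sum(errors) / len(errors)


# Hypothetical CPU-availability samples (fraction of a CPU that is free).
samples = [0.9, 0.5, 0.9, 0.5, 0.9, 0.5]
best = min((last_value, window_mean), key=lambda f: mean_abs_error(f, samples))
print(best.__name__, round(best(samples), 2))
```

On this oscillating series the windowed mean scores better than repeating the last value, so its forecast is the one a scheduler would use.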

Process of scheduling an application

1. Use user preference to filter out infeasible schedules

2. Use application-specific and dynamic information to develop a schedule

3. Use individual notion of performance and resource locality to evaluate the schedule

4. Predict the performance of candidate schedules

5. Compare and determine the “best schedule” that can be implemented on the available resources
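
The steps above can be sketched as a filter-then-rank loop. Everything here is a simplification for illustration: a "schedule" is reduced to a set of resources, and the `allowed` and `predict_time` callbacks and the machine speeds are hypothetical.

```python
from itertools import combinations


def best_schedule(resources, allowed, predict_time, size=2):
    """Pick the resource set with the lowest predicted execution time."""
    # Step 1: user preferences filter out resources the user will not accept.
    feasible = [r for r in resources if allowed(r)]
    # Step 2: develop candidate schedules (here, simply sets of `size` resources).
    candidates = list(combinations(feasible, size))
    # Steps 3-5: predict each candidate under the application's own metric
    # and keep the best one.
    return min(candidates, key=predict_time)


# Hypothetical machines: name -> relative speed.
speeds = {"a": 1.0, "b": 2.0, "c": 4.0, "d": 8.0}
choice = best_schedule(
    resources=sorted(speeds),
    allowed=lambda r: r != "d",   # the user refuses machine d
    predict_time=lambda rs: 100.0 / sum(speeds[r] for r in rs),
)
print(choice)  # ('b', 'c')
```

Machine d is the fastest but never appears in a candidate, which shows how user preferences act before any performance comparison takes place.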

Outline of the talk

• Overview

• Case study

• Application-centric scheduling

• AppleS

• Result

• Conclusion

AppleS (Application-level Scheduler)

• Each application has its own AppleS agent (a customized scheduler for that application)

• What does AppleS do?

– Selects resources

– Determines a performance-efficient schedule

– Implements that schedule with respect to the appropriate resource management system

• AppleS is NOT a resource management system: it relies on systems such as Globus and Legion

Organization of an AppleS agent

Components of AppleS

• Resource Selector

– Chooses and filters different resource combinations

• Planner

– Generates a description of a resource-dependent schedule from a given resource combination

• Performance Estimator

– Generates an estimate for candidate schedules according to the user’s performance metric

• Coordinator

– Chooses the “best” schedule

• Actuator

– Implements the “best” schedule on the target resource management system

Input of AppleS: Information Pool

• Network Weather Service

– Dynamic information on system state and forecasts of resource load

• Heterogeneous Application Template (HAT)

– Information on the structure, characteristics, and implementation of the application and its tasks

• Model

– Used for performance estimation, planning, and resource selection

• User Specification (US)

– Information on the user’s criteria for performance, execution constraints, preferences for implementation, etc.

Using AppleS

1. The user provides information to AppleS via the HAT and US

2. The Coordinator uses this information to filter out infeasible/possibly-bad schedules

3. The Resource Selector identifies promising sets of resources, and prioritizes them based on the logical “distance” between resources

4. The Planner computes a potential schedule for each viable resource configuration

5. The Performance Estimator evaluates each schedule in terms of the user’s performance objective

6. The Coordinator chooses the best schedule and then implements it with the Actuator

Using AppleS

Example: 3D-REACT

1. Assume implementations of LHSF and Log-D are available for several architectures

2. HAT: specifies the computation-to-communication ratios for LHSF and Log-D, the degree of overlap that is possible between the two, etc., for each implementation

3. The Resource Selector determines viable pairs of resources

4. The Planner identifies a set of candidate schedules

5. The Performance Estimator calculates the transfer unit size between LHSF and Log-D for each candidate schedule

6. The Coordinator sends the best schedule to the Actuator
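
Choosing a transfer unit size is a trade-off: small units keep both machines busy sooner, while large units pay the per-transfer overhead less often. The sketch below searches the 5-20 range under a deliberately simple cost model. The model, the fixed per-transfer overhead, and all timings are assumptions for illustration, not the estimator AppleS actually uses.

```python
import math


def pipeline_time(total, unit, t_lhsf, t_logd, t_overhead):
    """Predicted time when `total` surface functions move in units of `unit`.

    Every unit is treated as full-sized, so this slightly over-estimates
    the cost when `unit` does not divide `total` evenly.
    """
    n_units = math.ceil(total / unit)
    stage1 = unit * t_lhsf + t_overhead   # compute one unit of LHSF, then ship it
    stage2 = unit * t_logd                # Log-D on the receiving machine
    # Two-stage pipeline: fill with the first unit, then the slower stage dominates.
    return stage1 + stage2 + (n_units - 1) * max(stage1, stage2)


def best_unit(total, t_lhsf, t_logd, t_overhead, sizes=range(5, 21)):
    """Return the unit size in `sizes` with the lowest predicted time."""
    return min(sizes, key=lambda u: pipeline_time(total, u, t_lhsf, t_logd, t_overhead))


# Made-up costs: 1.0 s per function for LHSF, 0.8 s for Log-D, 0.5 s per transfer.
print(best_unit(100, 1.0, 0.8, 0.5))          # 10
print(pipeline_time(100, 10, 1.0, 0.8, 0.5))  # 113.0
```

With different machine pairs the stage costs change, so the best unit size has to be recomputed for each candidate schedule.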

Outline of the talk

• Overview

• Case study

• Application-centric scheduling

• AppleS

• Result

• Conclusion

Jacobi2D code..

• A distributed, data-parallel, two-dimensional Jacobi iterative solver

• Commonly used to solve the finite-difference approximation to Poisson's equation

• Variable coefficients are represented as elements of a two-dimensional grid

• At each iteration, the new value of each grid element is defined to be the average of its four nearest neighbors during the previous iteration
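
The update rule is short enough to write out. This is a serial sketch of a single iteration, not the distributed Jacobi2D code; keeping the boundary values fixed and the 4x4 starting grid are assumptions made for the example.

```python
def jacobi_step(grid):
    """Return a new grid where each interior cell is the mean of its 4 neighbors."""
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]   # boundary values stay fixed (assumption)
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            # Read only from the previous iteration's grid.
            new[i][j] = (grid[i - 1][j] + grid[i + 1][j] +
                         grid[i][j - 1] + grid[i][j + 1]) / 4.0
    return new


# 4x4 grid: top edge held at 1.0, everything else starts at 0.0.
grid = [[1.0] * 4] + [[0.0] * 4 for _ in range(3)]
grid = jacobi_step(grid)
print(grid[1])  # [0.0, 0.25, 0.25, 0.0]
```

Because each cell needs only its four neighbors, a processor that owns a region of the grid has to exchange just the cells along that region's border at every iteration.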

..Jacobi2D code

• Typically, the Jacobi computation is parallelized by partitioning the grid into rectangular regions, and then assigning each region to a different processor

• Parallelism vs. communication overhead

P0 is twice as fast as processor P1 or P2

[Figure: experimental testbed; labels: FDDI, RS6000, Alpha workstation]

Three partition methods

• HPF Uniform/Blocked

– Each processor is assigned (at compile time) a relatively equal-sized square region of the grid to compute

• Non-Uniform Strip

– Uses good static estimates of resource performance and uses resource selection to select a resource set from the total resources

• AppleS
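
A non-uniform strip partition can be sketched by giving each processor a number of grid rows proportional to its estimated speed. The function below and its rounding rule are illustrative assumptions; the speed ratios follow the earlier example in which P0 is twice as fast as P1 or P2.

```python
def strip_partition(n_rows, speeds):
    """Split n_rows into contiguous strips sized in proportion to speed."""
    total = sum(speeds)
    rows = [int(n_rows * s / total) for s in speeds]
    # Hand out the rows lost to rounding, fastest processors first.
    leftover = n_rows - sum(rows)
    fastest_first = sorted(range(len(speeds)), key=lambda i: -speeds[i])
    for i in fastest_first[:leftover]:
        rows[i] += 1
    return rows


print(strip_partition(1000, [2.0, 1.0, 1.0]))  # [500, 250, 250]
```

A uniform partition would give each processor a third of the rows, leaving the faster processor idle while the other two finish.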

Memory availability

• Add two IBM SP-2 nodes with 128 MB of memory to the resource pool

• Dedicated access to the two SP-2 nodes and the link between them

• The best partitioning is to split the grid evenly between the two SP-2 nodes, as long as neither partition exceeds the available real memory on its node
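
A rough check of when an even split stops fitting in real memory can be written as below. The assumptions are mine rather than the talk's: 8-byte floating-point cells, two copies of each region (previous and new iteration), and the full 128 MB being available to the application.

```python
def fits_in_memory(n, parts, bytes_per_cell=8, mem_bytes=128 * 2**20):
    """True if an n x n grid split evenly into `parts` fits in each node's memory."""
    cells_per_part = n * n / parts
    # Jacobi keeps two copies of its region: the previous and the new iteration.
    return 2 * cells_per_part * bytes_per_cell <= mem_bytes


print(fits_in_memory(3000, 2))  # True: 72 MB per node
print(fits_in_memory(6000, 2))  # False: 288 MB per node
```

Once the check fails, an even split forces paging, and a scheduler that knows the memory limit would move part of the grid to other machines instead.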

A lot of page swapping

Conclusion

• A performance-efficient schedule must exploit the concurrency of independent application tasks as well as factor in the impact of resource contention, diversity, and autonomy

• AppleS: http://apples.ucsd.edu/, still a work in progress

• Related work: MARS: http://www.uni-paderborn.de/pc2/projects/mol/mars.htm

• CLEO: http://www.lns.cornell.edu/public/CLEO/

• 3D-REACT: http://www.cacr.caltech.edu/Publications/techpubs/CASA/cacr123/web4.htm