The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will...

34
The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego
  • date post

    20-Jan-2016
  • Category

    Documents

  • view

    214
  • download

    0

Transcript of The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will...

Page 1: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

The AppLeS Project: Harvesting the Grid

Francine Berman

U. C. San Diego

Page 2: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

Computing Today

data archives

networks

visualizationinstruments

MPPs

clusters

PCs

Workstations

Wireless

Page 3: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

The Computational Grid• The Computational Grid

– ensemble of heterogeneous, distributed resources

– emerging platform for high-performance computing

• The Grid is the ultimate “supercomputer”– Grids aggregate an enormous number of resources

– Grids support applications that cannot be executed at any single site

Moreover,

– Grids provides increased access to each kind of resource (MPPs, clusters, etc.)

– Grid can be accessed at any time

Page 4: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

Computational Grids• Computational Grids you know and love:

– Your computer lab

– The Internet

– The PACI partnership resources

– Your home computer, CS and all the resources you have access to

• You are already using the Computational Grid …

Page 5: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

Programming the Grid I• Basics

– Need way to login, authenticate in different domains, transfer files, coordinate execution, etc.

Grid Infrastructure

Resources

Globus

Legion

Condor

NetSolve

PVM

Application Development Environment

Page 6: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

Programming the Grid II

• Performance-oriented programming

– Need way to develop and execute performance-efficient programs

– Program must achieve performance in context of • heterogeneity • dynamic behavior• multiple administrative domains

This can be extremely challenging.

– What approaches are most effective to achieve Grid application execution performance?

Page 7: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

Performance-oriented Execution

• Careful scheduling required to achieve application performance in parallel and distributed environments

• For the Grid, adaptivity right approach to leverage deliverable resource capacities

UCSD

UTK

OGI

File Transfer time to SC’99 in Portland [Alan Su]NWS prediction

Page 8: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

• Fundamental components:– Application-centric performance model

• Provides quantifiable measure of Grid resources in terms of their potential impact on the application

– Prediction of deliverable resource performance at execution time

– User’s preferences and performance criteria

These components form the basis for AppLeS.

Adaptive Application Scheduling

Page 9: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

What is AppLeS ?• AppLeS = Application Level Scheduler

– Joint project with Rich Wolski (U. of Tenn.)

• AppLeS is a methodology– Project has investigated adaptive application scheduling using

dynamic information, application-specific performance models, user preferences.

– AppLeS approach based on real-world scheduling.

• AppLeS is software– Have developed multiple AppLeS-enabled applications which

demonstrate the importance and usefulness of adaptive scheduling on the Grid.

Page 10: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

How Does AppLeS Work?• Select resources

• For each feasible resource set, plan a schedule

– For each schedule, predict application performance at execution time

– consider both the prediction and its qualitative attributes

• Deploy the “best” of the schedules wrt user’s performance criteria – execution time– convergence– turnaround time

AppLeS + application• Resource selection• Schedule Planning• Deployment

Grid InfrastructureNWS

Resources

Page 11: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

Network Weather Service (Wolski, U. Tenn.)

• The NWS provides dynamic resource information for AppLeS

• NWS is stand-alone system

• NWS – monitors current system state

– provides best forecast of resource load from multiple models

Sensor Interface

Reporting Interface

Forecaster

Model 2 Model 3Model 1

Page 12: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

AppLeS Example: Jacobi2D

• Jacobi2D is a regular, iterative SPMD application

• Local communication using 5 point stencil

• AppLeS written in KeLP/MPI by Wolski

• Partitioning approach = strip decomposition

• Goal is to minimize execution time

Page 13: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

Jacobi2D AppLeS Resource Selector

• Feasible resources determined according to application-specific “desirability” metrics

– memory

– application-specific benchmark

• Desirability metric used to sort resources

• Feasible resource sets formed from initial subset of sorted desirability list

– Next step is to plan a schedule for each feasible resource set

– Scheduler will choose schedule with best predicted execution time

Page 14: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

Execution time for ith strip

where load = predicted percentage of CPU time available (NWS)comm = time to send and receive messages factored by

predicted BW (NWS)

AppLeS uses time-balancing to determine best partition on a given set of resources

Solve

for

loadipt

OperComp

CommCompAreaT

unloadedi

iiii

))((

Jacobi2D Performance Model and Schedule Planning

p T T T 2 1

}{ iArea

P1 P2 P3

Page 15: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

Jacobi2D Experiments• Experiments compare

– Compile-time block [HPF] partitioning

– Compile-time irregular strip partitioning [no NWS, no resource selection]

– Run-time strip AppLeS partitioning

• Runs for different partitioning methods performed back-to-back on production systems

• Average execution time recorded

• Distributed UCSD/SDSC platform: Sparcs, RS6000, Alpha Farm, SP-2

Page 16: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

Jacobi2D AppLeS Experiments• Representative Jacobi 2D

AppLeS experiment• Adaptive scheduling leverages

deliverable performance of contended system

• Spike occurs when a gateway between PCL and SDSC goes down

• Subsequent AppLeS experiments avoid slow link

0

1

2

3

4

5

6

7

Execu

tion T

ime (

secon

ds)

1000

1100

1200

1300

1400

1500

1600

1700

1800

1900

2000

Problem Size

Comparison of Execution Times

Compile-time Blocked

Compile-time Irregular Strip

Runtime

Page 17: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

Work-in-Progress: AppLeS Templates

• AppLeS-enabled applications perform well in multi-user environments.

• Methodology is right on target but …

– AppLeS must be integrated with application –-- labor-intensive and time- intensive

– You generally can’t just take an AppLeS and plug in a new application

Page 18: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

Templates• Current thrust is to develop AppLeS templates which

– target structurally similar classes of applications

– can be instantiated in a user-friendly timeframe

– provide good application performance

AppLeS Template

ApplicationModule

PerformanceModule

SchedulingModule

DeploymentModule

AP

I

AP

I

AP

I

Network Weather Service

Page 19: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

Case Study: Parameter Sweep Template

• Parameter Sweeps = class of applications which are structured as multiple instances of an “experiment” with distinct parameter sets

• Independent experiments may share input files

• Examples:

– MCell

– INS2DApplication Model

Page 20: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

Example Parameter Sweep Application: MCell

• MCell = General simulator for cellular microphysiology

• Uses Monte Carlo diffusion and chemical reaction algorithm in 3D to simulate complex biochemical interactions of molecules

– Molecular environment represented as 3D space in which trajectories of ligands against cell membranes tracked

• Researchers plan huge runs which will make it possible to model entire cells at molecular level.

– 100,000s of tasks

– 10s of Gbytes of output data

– Would like to perform execution-time computational steering , data analysis and visualization

Page 21: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

PST AppLeS

• Template being developed by Henri Casanova and Graziano Obertelli

• Resource Selection:

– For small parameter sweeps, can dynamically select a performance efficient number of target processors [Gary Shao]

– For large parameter sweeps, can assume that all resources may be used

Platform Model

Cluster

User’s hostand storage

network links

storage

MPP

Page 22: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

• Contingency Scheduling: Allocation developed by dynamically generating a Gantt chart for scheduling unassigned tasks

• Basic skeleton1. Compute the next scheduling event

2. Create a Gantt Chart G

3. For each computation and file transfer currently underway, compute an estimate of its completion time and fill in the corresponding slots in G

4. Select a subset T of the tasks that have not started execution

5. Until each host has been assigned enough work, heuristically assign tasks to hosts, filling in slots in G

6. Implement schedule

Scheduling Parameter Sweeps

1 2 1 2 1 2

Network links

Hosts(Cluster 1)

Hosts(Cluster 2)

Com

puta

tion

Tim

e

Resources

G

Schedulingevent

Schedulingevent

Com

puta

tion

Page 23: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

Parameter Sweep Heuristics• Currently studying scheduling heuristics useful for parameter sweeps in

Grid environments

• HCW 2000 paper compares several heuristics– Min-Min [task/resource that can complete the earliest is assigned first]

– Max-Min [longest of task/earliest resource times assigned first]

– Sufferage [task that would “suffer” most if given a poor schedule assigned first, as computed by max - second max completion times]

– Extended Sufferage [minimal completion times computed for task on each cluster, sufferage heuristic applied to these]

– Workqueue [randomly chosen task assigned first]

• Criteria for evaluation:

– How sensitive are heuristics to location of shared input files and cost of data transmission?

– How sensitive are heuristics to inaccurate performance information?

Page 24: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

Preliminary PST/MCell Results• Comparison of the performance of scheduling heuristics when it is

up to 40 times more expensive to send a shared file across the network than it is to compute a task

• “Extended sufferage” scheduling heuristic takes advantage of file sharing to achieve good application performance

Max-min

Workqueue

XSufferage

Sufferage

Min-min

Page 25: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

Preliminary PST/MCell Results with “Quality of Information”

Page 26: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

AppLeS in Context• Application Scheduling

– Mars, Prophet/Gallop, MSHN, etc.

• Grid Programming Environments– GrADS, Programmers’ Playground, VDCE, Nile, etc.

• Scheduling Services– Globus GRAM, Legion Scheduler/Collection/Enactor

• Resource and High-Throughput Schedulers– PBS, LSF, Maui Scheduler, Condor, etc.

• PSEs– Nimrod, NEOS, NetSolve, Ninf

• Performance Monitoring, Prediction and Steering– Autopilot, SciRun, Network Weather Service, Remos/Remulac, Cumulus

• AppLeS project contributes careful and useful study of adaptive scheduling for Grid environments

Page 27: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

New Directions• Quality of Information

– Adaptive Scheduling

– AppLePilot / GrADS

• Resource Economies– Bushel of AppLeS

– UCSD Active Web

• Application Flexibility– Computational Steering

– Co-allocation

– Target-less computing

Page 28: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

Quality of Information

• How can we deal with imperfect or imprecise predictive information?

• Quantitative measures of qualitative performance attributes can improve scheduling and execution– lifetime

– cost, overhead

– accuracy

– penalty

B CAtim

e

Page 29: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

Using Quality of Information• Stochastic Scheduling: Information about the variability of the

target resources can be used by scheduler to determine allocation

– Resources with more performance variability assigned slightly less work

– Preliminary experiments show that resulting schedule performs well and can be more predictable

SOR Experiment [Jenny Schopf]

Page 30: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

Current “Quality of Information” Projects

• “AppLepilot”

– Combines AppLeS adaptive scheduling methodology with “fuzzy logic” decision making mechanism from Autopilot

– Provides a framework in which to negotiate Grid services and promote application performance

– Collaboration with Reed, Aydt, Wolski

– Builds on the software being developed for GrADS

Page 31: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

GrADS: Building an Adaptive Grid Application Development and Execution Environment• System being developed as performance economy

– Performance contracts used to exchange information, bind resources, renegotiate execution environment

• Joint project with large team of researchers

Ken KennedyAndrew ChienRich WolskiIan FosterCarl Kesselman

Jack DongarraDennis GannonDan ReedLennart JohnssonFran Berman

PSE

Config.object

program

wholeprogramcompiler

Source appli-cation

libraries

Realtimeperf

monitor

Dynamicoptimizer

Grid runtime System

(Globus)

negotiation

Softwarecomponents

Scheduler/Service

Negotiator

Performance feedback

Perfproblem

Grid Application Development System

Page 32: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

In Summary ...

• Development of AppLeS methodology, applications, templates, and models provides a careful investigation of adaptivity for emerging Grid environments

• Goal of current projects is to use real-world strategies to promote performance– dynamic scheduling – qualitative and quantitative modeling– multi-agent environments– resource economies

Page 33: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.

• AppLeS Corps:– Fran Berman, UCSD– Rich Wolski, U. Tenn– Jeff Brown– Henri Casanova– Walfredo Cirne– Holly Dail– Marcio Faerman

– Jim Hayes– Graziano Obertelli– Gary Shao– Otto Sievert– Shava Smallen– Alan Su

• Thanks to NSF, NASA, NPACI, DARPA, DoD

• AppLeS Home Page: http://apples.ucsd.edu

Page 34: The AppLeS Project: Harvesting the Grid Francine Berman U. C. San Diego This presentation will probably involve audience discussion, which will create.