
New Development in the AppLeS Project

or: User-Level Middleware for the Grid

Francine Berman, University of California, San Diego

The Evolving Grid

[Diagram: applications sitting directly on top of resources, with nothing in between]

In the beginning, there were applications and resources, and it took ninja programmers and many months to implement the applications on the Grid …

The Evolving Grid

[Diagram: Grid middleware now layered between applications and resources]

And behold, there were services, and programmers saw that it was good (even though their performance was still often less than desirable) …

The Evolving Grid

[Diagram: user-level middleware emerging as a new layer between applications and Grid middleware]

… and it came to pass that user-level middleware was promised to promote the performance of Grid applications, and the users rejoiced …

The Middleware Promise

• Grid Middleware

– Provides infrastructure/services to enable usability of the Grid

– Promotes portability and retargetability

• User-level Middleware
– Hides the complexity of the Grid for the end-user

– Adapts to dynamic resource performance variations

– Promotes application performance

[Diagram: the full stack: applications over user-level middleware over Grid middleware over resources]

How Do Applications Achieve Performance Now?

• AppLeS = Application-Level Scheduler

– Joint project with R. Wolski

– AppLeS + application = self-scheduling Grid application

– AppLeS-enabled applications adapt to dynamic performance variations in Grid Resources

[Diagram: AppLeS-enabled applications running directly over Grid middleware and resources]

AppLeS Architecture

[Diagram: an AppLeS-enabled application over Grid middleware and resources, with the agent's pipeline inside it]

1. Resource Discovery → accessible resources
2. Resource Selection → feasible resource sets
3. Schedule Planning and Performance Modeling → evaluated schedules
4. Decision Model → “best” schedule
5. Schedule Deployment
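To make the data flow concrete, here is a minimal Python sketch of the pipeline. Every function name and the toy performance model are illustrative assumptions, not actual AppLeS interfaces:

    def discover_resources(grid):
        """Resource Discovery: every resource the user can access."""
        return [r for r in grid if r["accessible"]]

    def select_resources(accessible, app):
        """Resource Selection: candidate (feasible) resource sets."""
        return [accessible[:k] for k in range(1, len(accessible) + 1)]

    def predict_performance(rset, app):
        """Performance Modeling: toy execution-time estimate (lower is better)."""
        return app["work"] / sum(r["mflops"] for r in rset)

    def best_schedule(grid, app):
        """Decision Model: evaluate candidate schedules, keep the 'best' one."""
        feasible = select_resources(discover_resources(grid), app)
        return min(feasible, key=lambda rset: predict_performance(rset, app))

    grid = [{"accessible": True, "mflops": 50}, {"accessible": True, "mflops": 120}]
    print(best_schedule(grid, {"work": 6000}))  # hand this set to Schedule Deployment

A real agent adapts by re-running this loop with fresh forecasts (e.g., from NWS) rather than a static speed rating.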

From AppLeS-enabled applications to User-Level Middleware

[Diagram: on the left, the general stack of applications over user-level middleware, Grid middleware, and resources; on the right, AppLeS-enabled applications over Grid middleware and resources, with the AppLeS agent integrated within the application]

AppLeS User-Level Middleware

• Focus is the development of templates which:
– target structurally similar classes of applications
– can be instantiated in a user-friendly timeframe
– provide good application performance

AppLeS Template Architecture

[Diagram: an Application Module, Scheduling Module, and Deployment Module stacked over Grid middleware and resources]
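The module split might look like the following toy sketch; the class names mirror the diagram, but the bodies are invented placeholders rather than real template code:

    class ApplicationModule:
        """Application-specific part: describe and enumerate the tasks."""
        def tasks(self):
            return ["exp-1", "exp-2", "exp-3"]   # hypothetical experiments

    class SchedulingModule:
        """Template part: map tasks onto resources adaptively."""
        def schedule(self, tasks, resources):
            # simplest placement: round-robin; a real template adapts to
            # dynamic load and performance forecasts instead
            return [(t, resources[i % len(resources)]) for i, t in enumerate(tasks)]

    class DeploymentModule:
        """Template part: launch the placed tasks via Grid middleware."""
        def deploy(self, placement):
            for task, resource in placement:
                print(f"launch {task} on {resource}")

    plan = SchedulingModule().schedule(ApplicationModule().tasks(), ["hostA", "hostB"])
    DeploymentModule().deploy(plan)

The intent of the split is that only the application module changes between structurally similar applications, which is what makes instantiation fast.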

APST – AppLeS Parameter Sweep Template

• Parameter Sweeps = class of applications which are structured as multiple instances of an “experiment” with distinct parameter sets

• Joint work with Henri Casanova
• First AppLeS middleware package to be distributed to users
• Parameter sweeps are a common application structure used in various fields of science and engineering
– Most notably: simulations (Monte Carlo, etc.)

• Large number of tasks and no task precedences in the general case → easy scheduling? Not quite:
– I/O constraints
– need for meaningful partial results
– multiple stages of post-processing
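For illustration, the task set of a sweep is just a cross product of parameter values; the parameter names here are made up:

    from itertools import product

    temperatures = [280, 300, 320]        # hypothetical parameter 1
    seeds = range(4)                      # hypothetical parameter 2

    # each combination is one independent "experiment"; no precedences
    tasks = [{"temperature": t, "seed": s} for t, s in product(temperatures, seeds)]
    print(len(tasks))                     # 12 independent tasks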

APST Scheduling Issues

• Large shared files, if any, must be stored strategically
• Post-processing must minimize file transfers
• Adaptive scheduling necessary to account for changing environment

• Contingency Scheduling: Allocation developed by dynamically generating a Gantt chart for scheduling unassigned tasks between scheduling events

• Basic skeleton (sketched in code after the steps below):

1. Compute the next scheduling event

2. Create a Gantt Chart G

3. For each computation and file transfer currently underway, compute an estimate of its completion time and fill in the corresponding slots in G

4. Select a subset T of the tasks that have not started execution

5. Until each host has been assigned enough work, heuristically assign tasks to hosts, filling in slots in G

6. Implement schedule
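A compact Python sketch of one pass through steps 2 to 6; the duration estimates, batch size, and earliest-free-host rule are placeholder assumptions, not APST's actual implementation:

    def scheduling_event(hosts, running, pending, estimate, batch=8):
        """One contingency-scheduling pass between two scheduling events."""
        # 2. Create a Gantt chart G: earliest free time per host (seconds)
        gantt = {h: 0.0 for h in hosts}
        # 3. Fill in slots for computations/transfers already underway
        for host, remaining in running.items():
            gantt[host] += remaining
        # 4. Select a subset T of tasks that have not started execution
        subset = pending[:batch]
        # 5. Heuristically assign tasks to hosts, filling in slots in G
        assignment = []
        for task in subset:
            host = min(gantt, key=gantt.get)     # earliest-free host
            gantt[host] += estimate(task, host)
            assignment.append((task, host))
        # 6. The caller implements this schedule, then computes the next event
        return assignment

    plan = scheduling_event(
        hosts=["ucsd", "utk"],
        running={"ucsd": 30.0},             # 30 s of work still underway
        pending=["t1", "t2", "t3"],
        estimate=lambda task, host: 45.0,   # toy duration estimate
    )
    print(plan)   # [('t1', 'utk'), ('t2', 'ucsd'), ('t3', 'utk')]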

Scheduling Approach

[Diagram: Gantt chart G, with resources (network links, hosts in Cluster 1, hosts in Cluster 2) on one axis and time on the other; computation slots are filled in between successive scheduling events]

Scheduling Heuristics

Scheduling Algorithms for PS Applications

• Self-scheduling algorithms: workqueue, workqueue w/ work stealing, workqueue w/ work duplication, ...
– Easy to implement and quick
– No need for performance predictions
– Insensitive to data placement

• Gantt chart heuristics: MinMin, MaxMin, Sufferage, XSufferage, ...
– More difficult to implement
– Need performance predictions
– Sensitive to data placement

Simulation results (HCW ’00 paper) show that:
• heuristics are worth it
• XSufferage is a good heuristic even when predictions are bad
• complex environments require better planning (Gantt chart)
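For illustration, here are toy sketches of MinMin and of the Sufferage selection rule. Here eta[t][h] is an estimated completion time for task t on host h; the numbers are invented and hosts are assumed initially idle (real implementations update host availability as they assign):

    def min_min(eta):
        """MinMin: repeatedly pick the task with the smallest best
        completion time and give it its best host."""
        schedule, pending = [], set(eta)
        while pending:
            task = min(pending, key=lambda t: min(eta[t].values()))
            host = min(eta[task], key=eta[task].get)
            schedule.append((task, host))
            pending.remove(task)
        return schedule

    def sufferage_pick(eta, pending):
        """Sufferage: prefer the task that suffers most if denied its
        best host (gap between best and second-best completion times)."""
        def sufferage(t):
            best, second = sorted(eta[t].values())[:2]
            return second - best
        return max(pending, key=sufferage)

    eta = {"t1": {"a": 10, "b": 40}, "t2": {"a": 12, "b": 14}}
    print(min_min(eta))                       # [('t1', 'a'), ('t2', 'a')]
    print(sufferage_pick(eta, {"t1", "t2"}))  # t1: it loses 30 without host a

MaxMin is the same loop with max in place of the outer min; XSufferage computes sufferage values at the cluster rather than the host level.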

APST Architecture

[Diagram: the APST Daemon sits between a command-line APST Client and the Grid resources and middleware (Globus, Legion, NetSolve, Ninf, Condor, IBP, NWS). The client interacts with the daemon's Controller, which triggers the Scheduler (workqueue, workqueue++, and the Gantt chart heuristic algorithms MinMin, MaxMin, Sufferage, XSufferage). The Scheduler actuates an Actuator and stores/retrieves metadata through a Metadata Bookkeeper; the Actuator transfers and executes through a transport API (GASS, IBP, NFS) and an execution API (GRAM, NetSolve, Condor, Ninf, Legion, ...), while the Bookkeeper queries through a metadata API (NWS). A scheduler API allows scheduling algorithms to be swapped.]

APST

• APST is being used for:

– INS2D (NASA Fluid Dynamics application)

– MCell (Salk, Molecular modeling for Biology)

– Tphot (SDSC, Proton Transport application)

– NeuralObjects (NSI, Neural network simulations)

– CS simulation applications for our own research (model validation, long-range forecasting validation)

• Actuator’s APIs are interchangeable and mixable (see the sketch after this list):
– (NetSolve+IBP) + (GRAM+GASS) + (GRAM+NFS)

• Scheduler API allows for dynamic adaptation
• No Grid software is required
– However, lack of it (NWS, GASS, IBP) may lead to poorer performance

• More details in SC’00 paper
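As an illustration of "interchangeable and mixable", a toy sketch in which any execution API can be paired with any transport API; the class and method names are invented, not APST's real interfaces:

    class Transport:                    # data movement: IBP, GASS, NFS, ...
        def __init__(self, name):
            self.name = name
        def put(self, f, host):
            print(f"{self.name}: stage {f} on {host}")

    class Execution:                    # task launch: NetSolve, GRAM, ...
        def __init__(self, name):
            self.name = name
        def run(self, task, host):
            print(f"{self.name}: run {task} on {host}")

    class Actuator:
        """Pairs one execution API with one transport API per resource."""
        def __init__(self, execution, transport):
            self.execution, self.transport = execution, transport
        def launch(self, task, input_file, host):
            self.transport.put(input_file, host)
            self.execution.run(task, host)

    # the mixes mentioned above, side by side in one run
    mixes = [Actuator(Execution("NetSolve"), Transport("IBP")),
             Actuator(Execution("GRAM"), Transport("GASS")),
             Actuator(Execution("GRAM"), Transport("NFS"))]
    mixes[1].launch("task-1", "input.dat", "some.host")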

APST Validation Experiments

[Diagram: validation testbed spanning the University of Tennessee, Knoxville (NetSolve+IBP), the University of California, San Diego (GRAM+GASS), and the Tokyo Institute of Technology (NetSolve+NFS, NetSolve+IBP), driven by an APST Daemon and APST Client]

APST Test Application – MCell

• MCell = General simulator for cellular microphysiology

• Uses Monte Carlo diffusion and chemical reaction algorithm in 3D to simulate complex biochemical interactions of molecules

• Focus of a new multi-disciplinary ITR project
– Will focus on large-scale execution-time computational steering, data analysis, and visualization

Experimental Results

Experimental setting:

MCell simulation with 1,200 tasks:
• composed of 6 Monte Carlo simulations
• input files: 1, 1, 20, 20, 100, and 100 MB

4 scenarios, by initial file placement:
(a) all input files are only in Japan
(b) 100 MB files replicated in California
(c) in addition, one 100 MB file replicated in Tennessee
(d) all input files replicated everywhere

[Plot: results for workqueue vs. the Gantt chart algorithms under the four scenarios]

New Directions: “Mega-programming”

• Grid programs
– Can reasonably obtain some information about the environment (NWS predictions, MDS, HBM, …)
– Can assume that login, authentication, monitoring, etc. are available on target execution machines
– Can assume that programs run to completion on the execution platform

• Mega-programs (see the sketch below)
– Cannot assume any information about the target environment
– Must be structured to treat the target device as an unfriendly host (cannot assume ambient services)
– Must be structured for “throwaway” end devices
– Must be structured to run continuously
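A minimal sketch of a worker structured under these constraints; the server URL, wire format, and back-off interval are all invented for illustration:

    import time
    import urllib.request, urllib.error

    SERVER = "http://example.org/work"       # hypothetical work server

    def compute(unit):
        return unit                          # placeholder for the real computation

    def worker_loop():
        while True:                          # must run continuously
            try:
                with urllib.request.urlopen(SERVER) as resp:
                    unit = resp.read()       # one self-contained work unit
                result = compute(unit)       # no ambient services assumed
                urllib.request.urlopen(SERVER, data=result)   # report back
            except (urllib.error.URLError, OSError):
                time.sleep(60)               # unfriendly host/network: back off
            # nothing is kept locally: the device is "throwaway", and the
            # server reissues any unit whose result never arrives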

Success with Mega-programming

• SETI@home
– Over 2 million users
– Sustains teraflop computing

• Can we run non-embarrassingly parallel codes successfully at this scale?
– Computational biology, genomics …
– Genome@home

Genome@home

• Joint work with Derrick Kondo

• Application template for peer-to-peer platforms

• First algorithm (Needleman-Wunsch Global Alignment) uses dynamic programming

• Plan is to use template with additional genomics applications

• Being developed for “web” rather than Grid environment

Example score table for aligning ATACCG (rows) against GTAAG (columns):

        G  T  A  A  G
    A   0  0  1  1  0
    T   0  1  0  1  1
    A   0  0  2  2  1
    C   0  0  1  2  2
    C   0  0  1  2  2
    G   1  0  1  2  3

Optimal alignments determined by traceback
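Since the slide names the algorithm, here is a compact Needleman-Wunsch sketch with traceback; the +1 match / -1 mismatch / -1 gap scoring is a common textbook choice, not necessarily genome@home's:

    def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
        n, m = len(a), len(b)
        # score[i][j] = best score for aligning a[:i] with b[:j]
        score = [[0] * (m + 1) for _ in range(n + 1)]
        for i in range(1, n + 1):
            score[i][0] = i * gap
        for j in range(1, m + 1):
            score[0][j] = j * gap
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                d = match if a[i - 1] == b[j - 1] else mismatch
                score[i][j] = max(score[i - 1][j - 1] + d,
                                  score[i - 1][j] + gap,
                                  score[i][j - 1] + gap)
        # traceback from the bottom-right corner recovers one optimal alignment
        out_a, out_b, i, j = [], [], n, m
        while i > 0 or j > 0:
            d = match if i and j and a[i - 1] == b[j - 1] else mismatch
            if i and j and score[i][j] == score[i - 1][j - 1] + d:
                out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
            elif i and score[i][j] == score[i - 1][j] + gap:
                out_a.append(a[i - 1]); out_b.append("-"); i -= 1
            else:
                out_a.append("-"); out_b.append(b[j - 1]); j -= 1
        return "".join(reversed(out_a)), "".join(reversed(out_b))

    print(*needleman_wunsch("ATACCG", "GTAAG"), sep="\n")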

Mega-programs

• Provide the algorithmic counterpart for very large scale platforms
– peer-to-peer platforms, Entropia, etc.
– Condor flocks
– Large “free agent” environments
– Globus
– New platforms: networks of low-level devices, etc.

• Different computing paradigm than MPP, Grid

[Diagram: Genome@home layering its algorithms (Algorithm 1, Algorithm 2: DNA Alignment) over Globus, Legion, Entropia, Condor, free agents, …]

• Coming soon to a computer near you:
– Release of APST v0.1 by SC’00
– Release of AMWAT (AppLeS Master/Worker Application Template) v0.1 by Jan ’01
– First prototype of genome@home: 2001
– AppLeS software and papers: http://apples.ucsd.edu

• Thanks!
– NSF, NPACI, NASA

• Grid Computing Lab:
– Fran Berman ([email protected])
– Henri Casanova
– Walfredo Cirne
– Holly Dail
– Marcio Faerman
– Jim Hayes
– Derrick Kondo
– Graziano Obertelli
– Gary Shao
– Otto Sievert
– Shava Smallen
– Alan Su
– Renata Teixeira
– Nadya Williams
– Eric Wing
– Qiao Xin