Jean-Sébastien Gay LIP ENS Lyon, Université Claude Bernard Lyon 1 INRIA Rhône-Alpes

Page 1

Jean-Sébastien Gay

LIP ENS Lyon, Université Claude Bernard Lyon 1

INRIA Rhône-Alpes, GRAAL Research Team

Joint work with the DIET team

Distributed Interactive Engineering Toolbox

DIET Batch and Simbatch: a quick glance

Page 2

RPC and Grid Computing: Grid RPC

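[Figure: GridRPC request flow. The client sends a request for Op(C, A, B) to the agent(s); among servers S1, S2, S3 and S4 the agents answer "S2!"; the client sends A, B, C to S2 and receives the answer (C).]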

Page 3

Outline

1. Introduction

2. DIET-Batch

3. Simbatch

4. Conclusion and perspectives

Page 4

DIET Architecture

[Figure: DIET architecture. Labels: Client, Master Agent (MA), Local Agent (LA), server front end, several MAs connected through JXTA, and the FAST library (application modeling, system availabilities) with LDAP and NWS.]

Page 5

Parallel and batch submissions - 1/2

• Parallel & sequential jobs → transparent for the user

• Submit a parallel job → system dependent
    • NFS: copy the code?
    • MPI: LAM, MPICH?
    • Batch system dependent
    • Numerous batch systems (homogenization?)
    • Batch scheduler behaviour (queues, scripts, etc.)
    • Information about the internal scheduling process
    • Monitoring & performance prediction

[Figure labels: MA, LA, SeD_seq, SeD_parallel, SeD_batch, Frontal, NFS, GLUE, LSF, PBS, Loadleveler, SGE, OAR.]

Page 6

Parallel and batch submissions - 2/2

• 2 APIs
    • Client side: request a sequential or parallel resolution, or let DIET choose the best
    • Server side: a script with generic mnemonics (DIET_NAME_FRONTALE, DIET_NB_NODES, DIET_BATCH_NODESFILE) and a program that must end with a call to diet_submit_call()

• Experiments

Page 7

Performance prediction with batch system

• During the submission stage
    • Need to know when the task will begin/end
    • Need to decide how many processors will be used
    • Need performance prediction!

• Three means
    • Use a probabilistic tool
    • Ask the batch system (only available for MAUI and OAR 2.0)
    • Use a simulator

Page 8

Batch scheduler overview

• Portable Batch System (PBS)
    • First Come First Served (FCFS)

• OAR (v. 1.6)
    • Conservative BackFilling (CBF)

• Torque + Maui
    • Torque only: FCFS
    • Maui: 3 scheduling policies: BESTFIT, FIRSTFIT (CBF), GREEDY

• Sun Grid Engine (SGE)
    • FCFS

• Loadleveler
    • 3 scheduling policies: FCFS, CBF, GANG
    • Possibility to plug external schedulers: EASY, Maui (should soon become the standard scheduler)

Page 9

Grid simulator overview

• Data replication
    • ChicSim: I. Foster; PARallel Simulation Environment for Complex Systems
    • OptorSim: W. H. Bell, D. G. Cameron, R. Carvajal-Schiaffino; Java

• Grid economy
    • GridSim: R. Buyya (Nimrod/G); Java; quite similar to Simgrid

• Non-specialized toolkit
    • Simgrid: H. Casanova, A. Legrand and M. Quinson; C

Page 10

… and their drawbacks

• Minimal support for batch schedulers

• Sometimes lack of functionalities to create them

• Often difficult to reuse
    • Example: OptorSim

• No parallel tasks available
    • Backfilling impossible
    • Lack of realism

Page 11

Simbatch in a nutshell

• Goals
    • Cluster simulation for enhancing realism
    • Prediction tool for DIET

• API for clients
    • Description of the platform in XML files
    • Use of the API in the deployment.xml file
        Example 1: creating a batch process on the host "Frontale"
        <process host="Frontale" function="SB_batch" />
        Example 2: creating a resource
        <process host="Node1" function="SB_node" />
    • Each batch must be described in simbatch.xml
    • A specific load can be simulated for each batch

• API for developers
    • Algorithms are plug-ins
    • Reusable functions
        Find the first matching slot in a Gantt chart:
        slot_t * find_first_slot(cluster_t c, int nb_nodes, double start_time, double duration);
        Empty queues and reschedule:
        void generic_reschedule(cluster_t cluster, void (*schedule)(cluster_t cluster, m_task_t task));
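To give an idea of what a first-slot search such as find_first_slot() has to do, here is a minimal sketch. It is not Simbatch's implementation: reservation_t, nodes_in_use(), fits() and first_slot_start() are hypothetical names, the Gantt chart is reduced to a flat array of reservations, and times are assumed to be non-negative.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified Gantt entry: nb_nodes are busy on [start, end). */
typedef struct {
    double start;
    double end;
    int nb_nodes;
} reservation_t;

/* Number of nodes in use at time t. */
static int nodes_in_use(const reservation_t *r, size_t n, double t)
{
    int used = 0;
    for (size_t i = 0; i < n; i++)
        if (r[i].start <= t && t < r[i].end)
            used += r[i].nb_nodes;
    return used;
}

/* Return 1 if nb_nodes stay free during the whole window [t, t + duration). */
static int fits(const reservation_t *r, size_t n, int total_nodes,
                int nb_nodes, double t, double duration)
{
    /* Usage only rises when a reservation starts, so checking t and every
       reservation start inside the window is enough. */
    if (nodes_in_use(r, n, t) + nb_nodes > total_nodes)
        return 0;
    for (size_t i = 0; i < n; i++)
        if (r[i].start > t && r[i].start < t + duration &&
            nodes_in_use(r, n, r[i].start) + nb_nodes > total_nodes)
            return 0;
    return 1;
}

/* Earliest start >= start_time at which the job fits, or -1.0 if the job
   asks for more nodes than the cluster has. Times must be non-negative. */
double first_slot_start(const reservation_t *r, size_t n, int total_nodes,
                        int nb_nodes, double start_time, double duration)
{
    assert(nb_nodes > 0 && duration > 0.0);
    if (nb_nodes > total_nodes)
        return -1.0;
    if (fits(r, n, total_nodes, nb_nodes, start_time, duration))
        return start_time;

    /* Otherwise the earliest feasible start is the end of some reservation. */
    double best = -1.0;
    for (size_t i = 0; i < n; i++) {
        double t = r[i].end;
        if (t > start_time && (best < 0.0 || t < best) &&
            fits(r, n, total_nodes, nb_nodes, t, duration))
            best = t;
    }
    return best;
}
```

On a 5-node cluster with reservations (0 to 100 s, 3 nodes) and (50 to 200 s, 2 nodes), a 2-node job of 50 s fits immediately at t = 0, which is the kind of hole backfilling exploits, whereas a 3-node job of the same length has to wait until t = 100.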

Page 12

Experiment description

• 2 types of experiments
    • Validation by simulation: parameter variation (topology, scheduling algorithm…)
    • Comparison between simulated and real platforms

• Task generation
    • Inter-arrival time: Poisson law, µ = 300 s
    • Number of resources: U(1,5)
    • Run time: U(600,1800) s
    • Wall time: run time × U(1.1, 1.3)

• Experiment platform
    • 5-node cluster
    • Star topology
    • OAR v. 1.6
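The task-generation laws above can be sketched as a small workload generator. This is an illustration under stated assumptions, not the generator used for the experiments: task_desc_t and generate_tasks() are hypothetical names, U(1,5) is read as a uniform integer between 1 and 5, and "Poisson law" is read as a Poisson arrival process, approximated here in one-second steps (each second a task arrives with probability 1/300, so gaps are geometric with a mean of 300 s). An exponential draw using log() from <math.h> would be the continuous-time equivalent.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical task record; field names are illustrative, not Simbatch's. */
typedef struct {
    double submit_time; /* seconds since the start of the experiment */
    int nb_nodes;       /* requested resources, U(1,5) */
    double run_time;    /* actual run time in seconds, U(600,1800) */
    double wall_time;   /* requested wall time, run_time x U(1.1,1.3) */
} task_desc_t;

/* Uniform draw in [0, 1). */
static double uniform01(void)
{
    return rand() / (RAND_MAX + 1.0);
}

/* Uniform draw in [lo, hi). */
static double uniform(double lo, double hi)
{
    return lo + (hi - lo) * uniform01();
}

/* Fill tasks[0..n-1] with a synthetic workload following the laws above. */
void generate_tasks(task_desc_t *tasks, size_t n, unsigned int seed)
{
    const double mean_gap = 300.0; /* mean inter-arrival time in seconds */
    double now = 0.0;

    assert(tasks != NULL || n == 0);
    srand(seed);
    for (size_t i = 0; i < n; i++) {
        /* One Bernoulli trial per second with success probability
           1/mean_gap: the gap is geometric with mean mean_gap. */
        int gap = 1;
        while (uniform01() >= 1.0 / mean_gap)
            gap++;
        now += gap;

        tasks[i].submit_time = now;
        tasks[i].nb_nodes = 1 + (int)(uniform01() * 5.0);
        tasks[i].run_time = uniform(600.0, 1800.0);
        tasks[i].wall_time = tasks[i].run_time * uniform(1.1, 1.3);
    }
}
```

Because the wall time is always at least 10% above the run time, no generated task is killed for exceeding its reservation. The sequence produced for a given seed depends on the C library's rand().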

Page 13

Validation

Page 14

Simulation precision

• Number of tasks: 100

• Makespan: 23 h

• Error rate on the flow metrics around 1%
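The slide does not say how the error rate was computed. One plausible reading, sketched below, takes the flow of a task as its completion time minus its submission time (the usual scheduling definition) and averages the relative difference between simulated and measured flows. The function names and this choice of metric are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Flow of a task: time spent in the system, from submission to completion. */
static double flow(double submit_time, double end_time)
{
    return end_time - submit_time;
}

/* Mean relative error of the simulated flows against the measured ones. */
double mean_flow_error(const double *submit, const double *real_end,
                       const double *sim_end, size_t n)
{
    double sum = 0.0;

    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        double real_flow = flow(submit[i], real_end[i]);
        double diff = flow(submit[i], sim_end[i]) - real_flow;

        assert(real_flow > 0.0);
        sum += (diff < 0.0 ? -diff : diff) / real_flow;
    }
    return sum / (double)n;
}
```

A returned value of 0.01 corresponds to an error rate of 1%.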

Page 15

Conclusion and perspectives

• DIET-Batch
    • DIET is now able to handle batch schedulers
    • 3 SeD types: sequential, batch, parallel
    • Good performance improvements

• Simbatch
    • Standalone simulations show good results
    • Configuration file available to simulate Lyon's site
    • Excellent tool to replay load

• Next steps
    • Integrate Simbatch in DIET-Batch

Page 16

Questions?

http://graal.ens-lyon.fr/DIET/

http://graal.ens-lyon.fr/simbatch/