Jean-Sébastien Gay LIP ENS Lyon, Université Claude Bernard Lyon 1 INRIA Rhône-Alpes
Jean-Sébastien Gay
LIP ENS Lyon, Université Claude Bernard Lyon 1
INRIA Rhône-Alpes
GRAAL Research Team
Joint work with the DIET team
Distributed Interactive Engineering Toolbox
DIET Batch and Simbatch: a quick glance
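RPC and Grid Computing: GridRPC
[Figure: a Client sends a Request, Op(C, A, B), to the AGENT(s); among the servers S1, S2, S3 and S4 the agent answers "S2 !"; the Client then sends A, B, C to S2 and receives the Answer (C).]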
Outline
1. Introduction
2. Diet-Batch
3. Simbatch
4. Conclusion and perspectives
DIET Architecture
[Figure: DIET architecture. Labels: Client; Master Agent (MA); Local Agent (LA); Server front end; several MAs with JXTA; FAST library (application modeling, system availabilities) with LDAP and NWS; SeD_seq, SeD_parallel and SeD_batch; Frontal, NFS; batch systems LSF, PBS, Loadleveler; GLUE.]
Parallel and batch submissions - 1/2
• Parallel & sequential jobs → transparent for the user
• Submit a parallel job → system dependent
  NFS: copy the code?
  MPI: LAM, MPICH?
  Batch system dependent
  Numerous batch systems (homogenization?)
  Batch scheduler behaviour (queues, scripts, etc.)
  Information about the internal scheduling process
  Monitoring & performance prediction
[Figure labels: SGE, OAR, LA]
Parallel and batch submissions - 2/2
• 2 APIs
  Client side
    Request sequential or parallel resolution, or let DIET choose the best
  Server side
    Script with generic mnemonics: DIET_NAME_FRONTALE, DIET_NB_NODES, DIET_BATCH_NODESFILE
    A program that must end with a call to diet_submit_call()
• Experiments
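The generic mnemonics of the server-side script can be pictured as placeholders that are replaced by concrete values at submission time. The sketch below is only an illustration of that idea: expand_mnemonic() is a hypothetical helper, not a DIET function, and how DIET actually resolves DIET_NB_NODES and the other mnemonics is not described in these slides.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (NOT part of DIET): copy `tmpl` into `out`, replacing
 * every occurrence of the mnemonic `key` with `value`.
 * Returns 0 on success, -1 if `out` (capacity `out_size`) is too small. */
int expand_mnemonic(const char *tmpl, const char *key, const char *value,
                    char *out, size_t out_size)
{
    assert(tmpl && key && value && out);

    size_t key_len = strlen(key);
    size_t val_len = strlen(value);
    size_t used = 0;

    assert(key_len > 0);
    if (out_size == 0)
        return -1;

    while (*tmpl != '\0') {
        if (strncmp(tmpl, key, key_len) == 0) {
            /* Mnemonic found: append the value, skip the mnemonic. */
            if (used + val_len >= out_size)
                return -1;
            memcpy(out + used, value, val_len);
            used += val_len;
            tmpl += key_len;
        } else {
            /* Ordinary character: copy it unchanged. */
            if (used + 1 >= out_size)
                return -1;
            out[used++] = *tmpl++;
        }
    }
    out[used] = '\0';
    return 0;
}
```

For instance, with an invented script line "mpirun -np DIET_NB_NODES ./solver", expanding DIET_NB_NODES with the value "4" yields "mpirun -np 4 ./solver"; the other mnemonics are left untouched until they are expanded in turn.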
Performance prediction with batch system
• During the submission stage
  Need to know when the task will begin/end
  Need to decide how many processors will be used
  Need performance prediction!
• Three means
  Use a probabilistic tool
  Ask the batch system (only available for MAUI and OAR 2.0)
  Use a simulator
Batch scheduler overview
• Portable Batch System (PBS)
  First Come First Served (FCFS)
• OAR (v. 1.6)
  Conservative BackFilling (CBF)
• Torque + Maui
  Torque alone: FCFS
  Maui: 3 scheduling policies: BESTFIT, FIRSTFIT (CBF), GREEDY
• Sun Grid Engine (SGE)
  FCFS
• Loadleveler
  3 scheduling policies: FCFS, CBF, GANG
  Possibility to plug in external schedulers: EASY, Maui (should soon become the standard scheduler)
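To make the FCFS policy listed above concrete, here is a minimal sketch of a strict FCFS schedule on a homogeneous cluster. The job_t type, the MAX_NODES bound and fcfs_schedule() are assumptions made for the illustration; they do not come from any of the batch systems named on this slide.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_NODES 64 /* arbitrary bound for this sketch */

/* Hypothetical job description, for illustration only. */
typedef struct {
    double submit;   /* submission time (s) */
    int nb_nodes;    /* number of nodes requested */
    double duration; /* wall time (s) */
    double start;    /* output: start time computed by the scheduler (s) */
} job_t;

static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Strict FCFS: jobs are started in submission order, and a job never
 * starts before the job queued ahead of it (no backfilling).
 * `jobs` must be sorted by submission time. */
void fcfs_schedule(job_t *jobs, int nb_jobs, int cluster_size)
{
    double free_at[MAX_NODES] = {0}; /* time at which each node becomes free */
    double prev_start = 0.0;

    assert(cluster_size > 0 && cluster_size <= MAX_NODES);
    for (int i = 0; i < nb_jobs; i++) {
        int k = jobs[i].nb_nodes;
        assert(k > 0 && k <= cluster_size);

        /* After sorting, free_at[k - 1] is the earliest time at which
         * k nodes are simultaneously free. */
        qsort(free_at, (size_t)cluster_size, sizeof free_at[0], cmp_double);

        double start = jobs[i].submit;
        if (prev_start > start)
            start = prev_start;
        if (free_at[k - 1] > start)
            start = free_at[k - 1];

        /* Book the k earliest-free nodes until the job ends. */
        for (int n = 0; n < k; n++)
            free_at[n] = start + jobs[i].duration;

        jobs[i].start = start;
        prev_start = start;
    }
}
```

On a 5-node cluster, a 4-node job submitted at t = 0 for 100 s, followed by a 3-node job at t = 10 and a 1-node job at t = 20, gives start times 0, 100 and 100: the 1-node job waits behind the 3-node job although one node is idle. Filling that kind of hole is what backfilling policies such as CBF are for.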
Grid simulator overview
• Data replication
  ChicSim: I. Foster; PARallel Simulation Environment for Complex Systems
  OptorSim: W. H. Bell, D. G. Cameron, R. Carvajal-Schiaffino; Java
• Grid economy
  GridSim: R. Buyya (Nimrod/G); Java; quite similar to Simgrid
• Non-specialized toolkit
  Simgrid: H. Casanova, A. Legrand and M. Quinson; C
… and their drawbacks
• Minimal support for batch schedulers
• Sometimes lack the functionality needed to create them
• Often difficult to reuse
  Example: OptorSim
• No parallel tasks available
  Backfilling impossible
  Lack of realism
Simbatch in a nutshell
• Goals
  Cluster simulation for enhancing realism
  Prediction tool for DIET
• API for clients
  Description of the platform in XML files
  Use of the API in the deployment.xml file
    Example 1: creating a batch process on the host "Frontale"
      <process host="Frontale" function="SB_batch" />
    Example 2: creating a resource
      <process host="Node1" function="SB_node" />
  Each batch must be described in simbatch.xml
  A specific load can be simulated for each batch
• API for developers
  Algorithms are plug-ins
  Reusable functions
    Find the first matching slot in a Gantt chart
      slot_t * find_first_slot(cluster_t c, int nb_nodes, double start_time, double duration);
    Empty queues and reschedule
      void generic_reschedule(cluster_t cluster, void (*schedule)(cluster_t cluster, m_task_t task));
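The idea behind "find the first matching slot in a Gantt chart" can be sketched as follows. This is not Simbatch's find_first_slot(): the slides do not show the layout of slot_t or cluster_t, so resv_t, gantt_t and first_slot_start() are hypothetical names, and the sketch counts free nodes instead of tracking node identities.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reservation: nb_nodes nodes are busy during [start, end). */
typedef struct {
    double start, end;
    int nb_nodes;
} resv_t;

/* Hypothetical Gantt chart: a cluster size and its current reservations. */
typedef struct {
    int size;
    const resv_t *resv;
    size_t nb_resv;
} gantt_t;

/* Number of nodes in use at time t. */
static int used_at(const gantt_t *g, double t)
{
    int used = 0;
    for (size_t i = 0; i < g->nb_resv; i++)
        if (g->resv[i].start <= t && t < g->resv[i].end)
            used += g->resv[i].nb_nodes;
    return used;
}

/* Does a job of nb_nodes nodes fit during [t, t + duration)?
 * Usage only rises at reservation starts, so checking t and every
 * reservation start inside the window is enough. */
static int fits(const gantt_t *g, int nb_nodes, double t, double duration)
{
    if (used_at(g, t) + nb_nodes > g->size)
        return 0;
    for (size_t i = 0; i < g->nb_resv; i++) {
        double s = g->resv[i].start;
        if (s > t && s < t + duration && used_at(g, s) + nb_nodes > g->size)
            return 0;
    }
    return 1;
}

/* Earliest start >= start_time at which the job fits. A job can only
 * become feasible when a reservation ends, so the candidates are
 * start_time itself and the reservation end times after it. */
double first_slot_start(const gantt_t *g, int nb_nodes,
                        double start_time, double duration)
{
    assert(g && nb_nodes > 0 && nb_nodes <= g->size && duration > 0.0);

    if (fits(g, nb_nodes, start_time, duration))
        return start_time;

    int found = 0;
    double best = start_time;
    for (size_t i = 0; i < g->nb_resv; i++) {
        double t = g->resv[i].end;
        if (t > start_time && (!found || t < best) &&
            fits(g, nb_nodes, t, duration)) {
            best = t;
            found = 1;
        }
    }
    /* The end of the last reservation always fits, so a slot was found. */
    assert(found);
    return best;
}
```

With a 5-node cluster holding a 4-node reservation on [0, 100) and a 3-node reservation on [100, 150), a 1-node job of 10 s requested from t = 20 fits immediately, whereas a 3-node job of 50 s requested from t = 10 is placed at t = 150. A backfilling scheduler can be built on repeated calls of this kind.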
Experiment description
• 2 types of experiments
  Validation by simulation: parameter variation (topology, scheduling algorithm…)
  Comparison between simulated platform
• Task generation
  Inter-arrival time: Poisson law, µ = 300 s
  Number of resources: U(1,5)
  Run time: U(600,1800)
  Wall time: run time × U(1.1, 1.3)
• Experiment platform
  5-node cluster
  Star topology
  OAR v. 1.6
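The task generation parameters above can be turned into a small workload generator. The sketch below rests on several assumptions: "Poisson law, µ = 300 s" is read as a Poisson arrival process with a mean inter-arrival time of 300 s, approximated here in 1-second steps (geometric gaps); U(1,5) is read as a uniform integer; and task_desc_t and generate_tasks() are hypothetical names, not Simbatch code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical task description, for illustration only. */
typedef struct {
    double submit;    /* submission time (s) */
    int nb_nodes;     /* resources requested, uniform integer in 1..5 */
    double run_time;  /* actual run time (s), U(600, 1800) */
    double wall_time; /* requested wall time (s), run time x U(1.1, 1.3) */
} task_desc_t;

/* Uniform draw strictly inside (0, 1). */
static double uniform01(void)
{
    return ((double)rand() + 1.0) / ((double)RAND_MAX + 2.0);
}

static double uniform(double lo, double hi)
{
    return lo + (hi - lo) * uniform01();
}

/* Discrete-time approximation of Poisson arrivals: each second a task
 * arrives with probability 1/mean, so the gap between two arrivals is
 * geometric with the requested mean (in seconds). */
static int arrival_gap(double mean)
{
    int gap = 1;
    while (uniform01() >= 1.0 / mean)
        gap++;
    return gap;
}

void generate_tasks(task_desc_t *tasks, int nb_tasks, unsigned seed)
{
    assert(tasks && nb_tasks >= 0);
    srand(seed);

    double now = 0.0;
    for (int i = 0; i < nb_tasks; i++) {
        now += arrival_gap(300.0);
        tasks[i].submit = now;
        tasks[i].nb_nodes = 1 + rand() % 5;
        tasks[i].run_time = uniform(600.0, 1800.0);
        tasks[i].wall_time = tasks[i].run_time * uniform(1.1, 1.3);
    }
}
```

A continuous-time variant would draw each gap as -300 · ln(u) with u uniform in (0, 1); the 1-second discretisation is used here only to stay within the C standard library without the math library.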
Validation
Simulation precision
• Number of tasks: 100
• Makespan: 23 h
• Error rate on the flow metrics around 1%
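The slides do not say how the error rate on the flow metric is computed. One plausible reading, given here purely as an assumption, takes the flow of a task as its completion time minus its submission time and averages the relative gap between simulated and measured flows; mean_flow_error() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed definition (not taken from the slides): mean over all tasks of
 * |flow_sim - flow_real| / flow_real, where flow = completion - submission. */
double mean_flow_error(const double *submit, const double *end_real,
                       const double *end_sim, size_t n)
{
    assert(submit && end_real && end_sim && n > 0);

    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        double flow_real = end_real[i] - submit[i];
        double flow_sim = end_sim[i] - submit[i];
        assert(flow_real > 0.0);

        double diff = flow_sim - flow_real;
        if (diff < 0.0)
            diff = -diff;
        sum += diff / flow_real;
    }
    return sum / (double)n;
}
```

Under this definition, a task measured with a flow of 1000 s and simulated with a flow of 1010 s contributes an error of 1%.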
Conclusion and perspectives
• DIET-Batch
  DIET is now able to handle batch schedulers
  3 SeD types: sequential, batch, parallel
  Good performance improvements
• Simbatch
  Standalone simulations show good results
  Configuration file available to simulate Lyon's site
  Excellent tool to replay load
• Next steps
  Integrate Simbatch in DIET-Batch
Questions?
http://graal.ens-lyon.fr/DIET/
http://graal.ens-lyon.fr/simbatch/