Post on 17-Aug-2020
Mapping Asynchronous Iterative Applications on
Heterogeneous Distributed Architectures
Raphaël Couturier and David Laiymani and Sébastien Miquée
2010/04/23
AND TeamLIFC
University ofFranche-Comté
IntroductionProblem description
ContributionsExperimentsConclusion
Outline
1 Introduction
2 Problem description
3 Contributions
4 Experiments
5 Conclusion & Future works
Sébastien Miquée PDSEC'10 � Mapping Asynchronous Iterative Applications 2010/04/23 2 / 30
IntroductionProblem description
ContributionsExperimentsConclusion
Solving methodsIterative computingComputing environments
Outline
1 Introduction
2 Problem description
3 Contributions
4 Experiments
5 Conclusion & Future works
Sébastien Miquée PDSEC'10 � Mapping Asynchronous Iterative Applications 2010/04/23 3 / 30
IntroductionProblem description
ContributionsExperimentsConclusion
Solving methodsIterative computingComputing environments
Parallel computingSolving methods
Objectives
Solving large scale numerical problems
Direct methods: solve the problem in a �nite operationsnumber
Iterative methods: solve the problem by successive iterationsand give the desired result's approximation
Sébastien Miquée PDSEC'10 � Mapping Asynchronous Iterative Applications 2010/04/23 4 / 30
IntroductionProblem description
ContributionsExperimentsConclusion
Solving methodsIterative computingComputing environments
Solving methodsWhy choosing iterative methods?
Advantages
Some problems could not be solved by direct methods
More simple to parallelize
Could solve problems of a larger size
Main drawback
Convergence study
Sébastien Miquée PDSEC'10 � Mapping Asynchronous Iterative Applications 2010/04/23 5 / 30
IntroductionProblem description
ContributionsExperimentsConclusion
Solving methodsIterative computingComputing environments
Iterative computingThe synchronous model
time
Processor 1
Processor 2
Iterations synchronizations
Communications synchronizations (eventually)
Sébastien Miquée PDSEC'10 � Mapping Asynchronous Iterative Applications 2010/04/23 6 / 30
IntroductionProblem description
ContributionsExperimentsConclusion
Solving methodsIterative computingComputing environments
Iterative computingThe asynchronous model � AIAC
Processor 1
Processor 2
time
No synchronization between iterations nor communications
More iterations than synchronous model
Not all algorithms can be transformed in asynchronous form
Sébastien Miquée PDSEC'10 � Mapping Asynchronous Iterative Applications 2010/04/23 7 / 30
IntroductionProblem description
ContributionsExperimentsConclusion
Solving methodsIterative computingComputing environments
Computing environments
Clusters
Distributed clusters
Computing grids
Toulouse
Bordeaux
RennesOrsay
Lille
Nancy
GrenobleLyon
Sophia
Sébastien Miquée PDSEC'10 � Mapping Asynchronous Iterative Applications 2010/04/23 8 / 30
IntroductionProblem description
ContributionsExperimentsConclusion
JaceP2P-V2Mapping relevanceProblematic
Outline
1 Introduction
2 Problem description
3 Contributions
4 Experiments
5 Conclusion & Future works
Sébastien Miquée PDSEC'10 � Mapping Asynchronous Iterative Applications 2010/04/23 9 / 30
IntroductionProblem description
ContributionsExperimentsConclusion
JaceP2P-V2Mapping relevanceProblematic
The JaceP2P-V2 platform
Execution platform and programming environment
Platform: daemons, super-nodes, spawnersFunctions library
Designed for AIAC model
Fully fault tolerant
"Multi-threaded"
Java language (portability)
AND Team (Jean-Claude Charr)
Sébastien Miquée PDSEC'10 � Mapping Asynchronous Iterative Applications 2010/04/23 10 / 30
IntroductionProblem description
ContributionsExperimentsConclusion
JaceP2P-V2Mapping relevanceProblematic
The JaceP2P-V2 platformDefault �mapping�
Tn
...
T4T3
T2T1
AIAC Application
SN3
SN1
SN2
1
3
2
4
5
Cluster 1
Cluster 2
Cluster 3
Sébastien Miquée PDSEC'10 � Mapping Asynchronous Iterative Applications 2010/04/23 11 / 30
IntroductionProblem description
ContributionsExperimentsConclusion
JaceP2P-V2Mapping relevanceProblematic
Mapping relevance
Conditions
Grid'5000 platform � 200 computing nodes
Simple Mapping algorithm (SMa)
Application using the Multisplitting CG
Problems sizes : 550'000 and 5'000'000
Results � Gains in execution time
For 550'000: 30% faster than without mapping algorithm
For 5'000'000: 40% faster than without mapping algorithm
Sébastien Miquée PDSEC'10 � Mapping Asynchronous Iterative Applications 2010/04/23 12 / 30
IntroductionProblem description
ContributionsExperimentsConclusion
JaceP2P-V2Mapping relevanceProblematic
Problematic
������������
������������
������������
������������
������������
������������
������������
������������
���������
���������
���������
���������
��������
Tn
...
T4T3
T2T1
AIAC Application
clustersSite containing
High latency network
Low latency network
?
Sébastien Miquée PDSEC'10 � Mapping Asynchronous Iterative Applications 2010/04/23 13 / 30
IntroductionProblem description
ContributionsExperimentsConclusion
ModelsRelated worksSpeci�cities of the AIAC modelAlgorithms implementation
Outline
1 Introduction
2 Problem description
3 Contributions
4 Experiments
5 Conclusion & Future works
Sébastien Miquée PDSEC'10 � Mapping Asynchronous Iterative Applications 2010/04/23 14 / 30
IntroductionProblem description
ContributionsExperimentsConclusion
ModelsRelated worksSpeci�cities of the AIAC modelAlgorithms implementation
Application modeling
(�t with synchronous model)
0
1 2 3
54
6
5
14
10 1011
10 10
5
5 5 5 5
125
55
DAG
(�t with asynchronous model)
0
1
2
3
4
5
5 5
5
10
11
10
5
10
14
5
5
10
3
3
TIG
Sébastien Miquée PDSEC'10 � Mapping Asynchronous Iterative Applications 2010/04/23 15 / 30
IntroductionProblem description
ContributionsExperimentsConclusion
ModelsRelated worksSpeci�cities of the AIAC modelAlgorithms implementation
Existing mapping algorithmsMinimization of external links � edgecuts
0
1
2
3
4
5
Goal: avoiding penalizing communications
Algorithms: Metis[1], Chaco[2]. . .
Sébastien Miquée PDSEC'10 � Mapping Asynchronous Iterative Applications 2010/04/23 16 / 30
IntroductionProblem description
ContributionsExperimentsConclusion
ModelsRelated worksSpeci�cities of the AIAC modelAlgorithms implementation
Existing mapping algorithmsMinimization of execution time
0
1
2
3
4
5
Goal: reducing the application execution time
Algorithms : QM[3], FastMap[4], MiniMax[5]. . .
Sébastien Miquée PDSEC'10 � Mapping Asynchronous Iterative Applications 2010/04/23 17 / 30
IntroductionProblem description
ContributionsExperimentsConclusion
ModelsRelated worksSpeci�cities of the AIAC modelAlgorithms implementation
Speci�cities of the AIAC model and JaceP2P-V2
Speci�cities due to the AIAC model and JaceP2P-V2
Only one task per computing node
No task precedence
Speci�cities due to the targeted architecture
Computing node volatility
Tasks state save
Heterogeneity in computing nodes
Heterogeneity in networks
Sébastien Miquée PDSEC'10 � Mapping Asynchronous Iterative Applications 2010/04/23 18 / 30
IntroductionProblem description
ContributionsExperimentsConclusion
ModelsRelated worksSpeci�cities of the AIAC modelAlgorithms implementation
AIAC-QM algorithm
Quick Quality Map (QM) [3]
Optimization of the execution time of each task
Execution time taking care of computation and communication
Does not satisfy model's constraints
Evaluation of AIAC-QM
Adaptation and implementation of QM in JaceP2P-V2 →AIAC-QM
Introduction of a little part of edge-cuts
Sébastien Miquée PDSEC'10 � Mapping Asynchronous Iterative Applications 2010/04/23 19 / 30
IntroductionProblem description
ContributionsExperimentsConclusion
ModelsRelated worksSpeci�cities of the AIAC modelAlgorithms implementation
AIAC-QM algorithm
Principles
Sort computing nodes by computation power
Iterate on each task for searching a better node
Limited number of rounds
Execution time computation based on:
Computing node powerLocality with task's neighbors
Sébastien Miquée PDSEC'10 � Mapping Asynchronous Iterative Applications 2010/04/23 20 / 30
IntroductionProblem description
ContributionsExperimentsConclusion
Multisplitting Conjugate GradientExperiments
Outline
1 Introduction
2 Problem description
3 Contributions
4 Experiments
5 Conclusion & Future works
Sébastien Miquée PDSEC'10 � Mapping Asynchronous Iterative Applications 2010/04/23 21 / 30
IntroductionProblem description
ContributionsExperimentsConclusion
Multisplitting Conjugate GradientExperiments
NPB � Multisplitting Conjugate Gradient
Nas Parallel Benchmark
Conjugate Gradientmatrix computation
Asynchronous versionusing the multisplittingmethod[6]
BX
ADepLeft DepRight
XL
eftX
Right
Sébastien Miquée PDSEC'10 � Mapping Asynchronous Iterative Applications 2010/04/23 22 / 30
IntroductionProblem description
ContributionsExperimentsConclusion
Multisplitting Conjugate GradientExperiments
Conditions of the evaluations
Application and architectures
Multisplitting CG (E : 550'000 and F : 5'000'000)
64 (for E) and 128 (for F) computing nodes
Homogeneous architectures
Architecture 1: 113 nodes � 440 coresArchitecture 2: 213 nodes � 840 cores
Heterogeneous architectures
Architecture 3: 112 nodes � 394 coresArchitecture 4: 212 nodes � 754 cores
No computing node failure
Sébastien Miquée PDSEC'10 � Mapping Asynchronous Iterative Applications 2010/04/23 23 / 30
IntroductionProblem description
ContributionsExperimentsConclusion
Multisplitting Conjugate GradientExperiments
Results 1 � Homogeneous Architectures
Algorithm No SMa AIAC QM F-EC
Execution time 150s 110s 101s 90s
Gains � 27% 33% 40%
Table: Multisplitting CG 550'000 � Architecture 1
Algorithm No SMa AIAC QM F-EC
Execution time 403s 265s 250s 218s
Gains � 34% 38% 46%
Table: Multisplitting CG 5'000'000 � Architecture 2
Sébastien Miquée PDSEC'10 � Mapping Asynchronous Iterative Applications 2010/04/23 24 / 30
IntroductionProblem description
ContributionsExperimentsConclusion
Multisplitting Conjugate GradientExperiments
Results 2 � Heterogeneous Architectures
Algorithm No SMa AIAC QM F-EC
Execution time 498s 341s 273s 385s
Gains � 32% 45% 23%
Table: Multisplitting CG 550'000 � Architecture 3
Algorithm No SMa AIAC QM F-EC
Execution time 943s 594s 453s 660s
Gains � 37% 52% 30%
Table: Multisplitting CG 5'000'000 � Architecture 4
Sébastien Miquée PDSEC'10 � Mapping Asynchronous Iterative Applications 2010/04/23 25 / 30
IntroductionProblem description
ContributionsExperimentsConclusion
Outline
1 Introduction
2 Problem description
3 Contributions
4 Experiments
5 Conclusion & Future works
Sébastien Miquée PDSEC'10 � Mapping Asynchronous Iterative Applications 2010/04/23 26 / 30
IntroductionProblem description
ContributionsExperimentsConclusion
Conclusion & Future works
Conclusion
Mapping is essential for JaceP2P-V2
Adaptation of existing mapping algorithms
Considerable gains on application execution time
Future works
Designing a hybrid algorithm
Mapping tasks' saves
Take care about nodes capacities
Fault tolerance in mapping algorithms
Sébastien Miquée PDSEC'10 � Mapping Asynchronous Iterative Applications 2010/04/23 27 / 30
IntroductionProblem description
ContributionsExperimentsConclusion
References I
G. Karypis and V. Kumar.A fast and high quality multilevel scheme for partioningirregular graphs.SIAM Journal on Scienti�c Computing, 20(1):359�392, 1998.
B. Hendrickson and R. W. Leland.The Chaco User's Guide.Sandia National Laboratory, Albuquerque, 1995.
Sébastien Miquée PDSEC'10 � Mapping Asynchronous Iterative Applications 2010/04/23 28 / 30
IntroductionProblem description
ContributionsExperimentsConclusion
References II
P. Phinjaroenphan.An E�cient, Pratical, Portable Mapping Technique on
Computational Grids.PhD thesis, School of Computer Science and Informationtechnology Science, Engineering and Technology Portfolio,RMIT University, 2006.
S. Sanyal, A. Jain, S. K. Das, and Rupak Biswas.A hierarchical and distributed approach for mapping largeapplications to heterogeneous grids using genetic algorithms.In CLUSTER, pages 496�499, 2003.
Sébastien Miquée PDSEC'10 � Mapping Asynchronous Iterative Applications 2010/04/23 29 / 30
IntroductionProblem description
ContributionsExperimentsConclusion
References III
S. Kumar, S. K. Das, and Rupak Biswas.Graph partitioning for parallel applications in heterogeneousgrid environments.In IPDPS, 2002.
J. Bahi, S. Contassot-Vivier, and R. Couturier.Parallel Iterative Algorithms: from Sequential to Grid
Computing, volume 1 of Numerical Analysis & Scienti�c
Computating, chapter Asynchronous Iterations, pages 124�131.
Chapman & Hall/CRC, 2007.
Sébastien Miquée PDSEC'10 � Mapping Asynchronous Iterative Applications 2010/04/23 30 / 30