moais.imag.fr
description
Transcript of moais.imag.fr
![Page 1: moais.imag.fr](https://reader036.fdocuments.in/reader036/viewer/2022081515/568153e3550346895dc1e2ba/html5/thumbnails/1.jpg)
1
http://moais.imag.fr
Multi-Programming and Scheduling Design for Applications of Interactive Simulation
Jean-Louis Roch & al.
Louvre, Musée de l’Homme Sculpture (Tête)Artist : AnonymeOrigin: Rapa Nui [Easter Island]Date : between the XIst and the XVth centuryDimensions : 1,70 m high
EVALUATION SEMINAR -RESEARCH THEME Num B"Grids and high-performance computing"
March 27-28, 2008
![Page 2: moais.imag.fr](https://reader036.fdocuments.in/reader036/viewer/2022081515/568153e3550346895dc1e2ba/html5/thumbnails/2.jpg)
2
Staff and Skills
Vincent Danjean [MdC 9/2005]
Pierre-François Dutot [MdC 9/2006]
Thierry Gautier [CR] Guillaume Huard [MdC] Grégory Mounié [MdC]
Bruno Raffin [CR INRIA]
Jean-Louis Roch [MdC, Team leader]
Denis Trystram [Prof]
Frédéric Wagner [MdC 9/2006]
1 Invited Prof. Alfredo Goldman [USP Sao Paulo]
19 PhD students, 1 engineer 14 PhDs defended since 2005
Parallel algorithms & programming
Scheduling Interactive applications
1/1/2005: Creation of MOAIS team1/1/2006: Creation of INRIA team-project MOAIS
![Page 3: moais.imag.fr](https://reader036.fdocuments.in/reader036/viewer/2022081515/568153e3550346895dc1e2ba/html5/thumbnails/3.jpg)
3
Evolution of parallel programming
Parallelism everywhere Distributed, Heterogeneous
Grids
Cluster SMP
GPU
MPSoC
MapReduce [Google]
multi-core
TBB [Intel]
Fortress [Sun]
Cuda [NVidia]
…SPIRIT
MPI OpenMP
Cilk++ [CilkArts]
![Page 4: moais.imag.fr](https://reader036.fdocuments.in/reader036/viewer/2022081515/568153e3550346895dc1e2ba/html5/thumbnails/4.jpg)
4
End-to-end parallel programming solutionsfor high-performance interactive computing with provable performances.
optimization computational steering, VR embedded
Performance is multi-objective
output
QAP/Nugent on Grid’5000 [PRISM, GSCOP, DOLPHIN] INRIA Grimage platform
[MOAIS, PERCEPTION, EVASION]
Streaming on MPSoCs [ST]
input
MOAIS objective
![Page 5: moais.imag.fr](https://reader036.fdocuments.in/reader036/viewer/2022081515/568153e3550346895dc1e2ba/html5/thumbnails/5.jpg)
5
To mutually adapt application and scheduling. Proactive/static to the platform : the devices evolve gradually Online/dynamic to the execution context : data and resources Tolerant to data variations, failures, other appli. perturbation,…
From algorithms to applications Scheduling and parallel programming schemes Programming interfaces and tools Target applications : batch scheduling, combinatorial
optimization, computational steering, stream encoding
Approach
![Page 6: moais.imag.fr](https://reader036.fdocuments.in/reader036/viewer/2022081515/568153e3550346895dc1e2ba/html5/thumbnails/6.jpg)
6
Overview
Architecture
Interactive applicationP
erfo
rman
ce
Adaptive controlof execution
model: abstract representation
algorithm: scheduling, fault tolerance
MOAIS
4. Interactivity
3. Adaptive algorithms
2. Interfaces for coordination
1. Scheduling
Research Directions
![Page 7: moais.imag.fr](https://reader036.fdocuments.in/reader036/viewer/2022081515/568153e3550346895dc1e2ba/html5/thumbnails/7.jpg)
7
Research directions and
achievements for 2005-2007
1. Scheduling2. Interfaces for coordination3. Adaptive algorithms4. Interactive applications
![Page 8: moais.imag.fr](https://reader036.fdocuments.in/reader036/viewer/2022081515/568153e3550346895dc1e2ba/html5/thumbnails/8.jpg)
8
1. Scheduling Objective A: modeling of scheduling problems for adaptive applications
Adaptable parallelism degree for efficient coarse grain scheduling Parallel task models: moldable tasks, divisible load
Some results: Comparisons and coupling models: [IJ FCS 06 ]
Off-line: improvement of performance ratio : - 3/2-approximation [SIAM J.Comp 07] instead of 2 [Turek&al] by strip-packing- (3+5) for moldable tasks on a grid of clusters [Europar’06]
On-line: decrease of control overhead : « work-first principle » [Cilk] - Extension to general distributed data-flow computations [ICTTA’06, ICCS’07]
Task == | | … | ...
![Page 9: moais.imag.fr](https://reader036.fdocuments.in/reader036/viewer/2022081515/568153e3550346895dc1e2ba/html5/thumbnails/9.jpg)
9
Objective B: Design of multi-objective scheduling with provable guarantees.
Simultaneous approximation for each objective Approximated solutions of Pareto optimal solutions:
- Makespan/Reliability[SPAA07] - Makespan/Memory [IPDPS08]
t0 2 4 8 16
1. Scheduling
For moldable tasks: yields a
bi-approximation with arbitrary ratio between Cmax and Minsum [WEA05]
To include a smart algorithm inside a recursive doubling (eg. for Makespan) (eg. for Minsum)
Generic -Relaxation scheme [Shmoys&al.]: - Makespan/Minsum [WEA05];
![Page 10: moais.imag.fr](https://reader036.fdocuments.in/reader036/viewer/2022081515/568153e3550346895dc1e2ba/html5/thumbnails/10.jpg)
10
2. Interfaces for coordination Objective: provably efficient control at runtime of the coupling of components with various synchronizations constraints.
Kaapi middleware
Provable performances: - Efficient local serialization “work-first principle”, zero-copy [J. CLSS’07, ICCS07]]
- Scheduling:• coarse-grain graph partitioning + ``work-stealing’
- Fault-tolerance protocols, from scheduling properties• coordinated protocol [ICTTA’06] + original TIC protocol [EIT05, TDSC08]
Positioning: Multi-processors/multi-core architectures: Intel TBB, Cilk++ Grid / global platforms: Tolerate failure and falsification: Satin (FT)
1 struct sum { 2 void operator()(Shared_r < int > a, 3 Shared_r < int > b, 4 Shared_w < int > r ) 5 { r.write(a.read() + b.read()); } 6 } ; 7 8 struct fib { 9 void operator()(int n, Shared_w<int> r) 10 { if (n <2) r.write( n ); 11 else 12 { int r1, r2; 13 Fork< fib >() ( n-1, r1 ) ; 14 Fork< fib >() ( n-2, r2 ) ; 15 Fork< sum >() ( r1, r2, r ) ; 16 } 17 } 18 } ;
runtimeLocal stack
Distributed nestedmacrodataflow graph
![Page 11: moais.imag.fr](https://reader036.fdocuments.in/reader036/viewer/2022081515/568153e3550346895dc1e2ba/html5/thumbnails/11.jpg)
11
2. Kaapi: Support and transfert
Quadratic assignment [ANR CHOC] Finite element computations [ANR DISCOGRID] Cryptographic S-Box selection [ANR SAFESCALE] Probabilistic inference engine [ProBayes]
Distributed implementation of CAPE-Open standard for process engineering computations [IFP]
Cluster implementation of compliant runtime RSI/Indiss-RT
![Page 12: moais.imag.fr](https://reader036.fdocuments.in/reader036/viewer/2022081515/568153e3550346895dc1e2ba/html5/thumbnails/12.jpg)
12
QuickTime™ et undécompresseur
sont requis pour visionner cette image.QuickTime™ et undécompresseur
sont requis pour visionner cette image.
QuickTime™ et undécompresseur
sont requis pour visionner cette image.
QuickTime™ et undécompresseur
sont requis pour visionner cette image.
Sequentialalgorithm
Parallel algo 1 P=2
redundancyOverheads
synchronization
communication
Objective: To design and analyze algorithms that may obliviously adapt their execution under the control of the scheduling
Parallel algo 2 P=100
Parallel algo k P=+∞
3. Adaptive algorithms
…
Which one to select?
![Page 13: moais.imag.fr](https://reader036.fdocuments.in/reader036/viewer/2022081515/568153e3550346895dc1e2ba/html5/thumbnails/13.jpg)
13
3. Adaptive algorithms
Heterogeneous resources, variable speeds: work-stealing to obliviously self-tune granularity
But: work Wp increases when depth Dp decreases : multi-objective problem Adaptive recursive coupling of algorithms [Europar’06, PASCO’07, PDP’08]
- Relaxation: sequential / parallel work-stealing
€
Tp ≤ W p
p.Π ave
+ ODp
Π ave
⎛
⎝ ⎜
⎞
⎠ ⎟
Minimize both the work Wp and the depth Dp
![Page 14: moais.imag.fr](https://reader036.fdocuments.in/reader036/viewer/2022081515/568153e3550346895dc1e2ba/html5/thumbnails/14.jpg)
14
Cache&processor oblivious stream computations [ PDP’07] AWS: adaptive work-stealing for MPSoCs
Use case: HDTV on MPSoCs [ST Microelectronics film grain tech.]
MPSoC
Architecture description[SPIRIT / IP-XACT Simulator]
Application description potential parallelism
[AWS api]
Bridge =
AWSscheduling
Film-grainapplication(streaming)
Ex: HD-TVNoise reduction
Near optimal experimental results [PDP08]
3. Adaptive algorithms
![Page 15: moais.imag.fr](https://reader036.fdocuments.in/reader036/viewer/2022081515/568153e3550346895dc1e2ba/html5/thumbnails/15.jpg)
15
Adaptive 3D-vision [VR’07]
Realtime constraint : 30 frames per sec
Adaptive heterogeneous coupling with Kaapi: CPU+GPU [EGPV’07]
3. Adaptive algorithms
1 .. 16 CPUs
Tim
e (m
s)
Maximum precision
Lev
el o
f d
etai
ls
![Page 16: moais.imag.fr](https://reader036.fdocuments.in/reader036/viewer/2022081515/568153e3550346895dc1e2ba/html5/thumbnails/16.jpg)
16
4. Interactivity Motivation: parallelism for interactive applications
Challenging application: multi-cameras, multi-cpus, multi-GPUs, multi-display
Grimage platform [2004…]
Positioning: other platforms: - [Blue-C, ETH Zurich, 2005 …],[Tele-Immersion@UCBerkeley 2005]
Specificity: collaboration with EVASION(realtime physics simulation) PERCEPTION (computer vision)
-- 30 nodes cluster-- 15 cameras-- 16 projectors
![Page 17: moais.imag.fr](https://reader036.fdocuments.in/reader036/viewer/2022081515/568153e3550346895dc1e2ba/html5/thumbnails/17.jpg)
17
Middleware dedicated to interactive applications Distributed components, moldable Parallel code coupling
Static coarse grain mappingHDTV player on 12 Mpixels display wall (16 projectors)
[CPUs + GPUs]
4. Interactivity
![Page 18: moais.imag.fr](https://reader036.fdocuments.in/reader036/viewer/2022081515/568153e3550346895dc1e2ba/html5/thumbnails/18.jpg)
18
Grids
Cluster SMP
GPU
MPSoC
multi-core
Summary of 2005-2007
AWS
Multi-objective Adaptive Performance Applications are time-consuming but essential to validate scientific approach
![Page 19: moais.imag.fr](https://reader036.fdocuments.in/reader036/viewer/2022081515/568153e3550346895dc1e2ba/html5/thumbnails/19.jpg)
19
Some facts Publications
Contracts
Softwares: Kaapi, FlowVR, Taktuk, AWS
0
5
10
15
20
25
2005 2006 2007
JournalBook/Chap.Int. ConfPhDOthers (Nat.)
• 127 in 3 years , 19 rank #1
- 17 Int. Journal (SIAM J.Comp, IEEE TC, TPDS, TDSC, EJOR, FCS, …) - 59 Int. Conf (SPAA, IPDPS, CCGrid VR, VIS, Europar, ICCS, Siggraph…)
0
100
200
300
400
500
600
700
800
2005 2006 2007
ARC INRIAANR Nat.EuropeIndustryScholarshipsTotal
• Industry partners: STM, IFP, CEA, Bull, C-S, DCN, • 2 ARC, 5 ANRs• 1 pole MINALOGIC• 2 Europe, 1 Ass. team
K€
![Page 20: moais.imag.fr](https://reader036.fdocuments.in/reader036/viewer/2022081515/568153e3550346895dc1e2ba/html5/thumbnails/20.jpg)
20
Highlights 1st prize Plugtest Nov. 2007 Nqueens challenge
SIGGRAPH Aug. 2007 Emerging Technologies Demo
Valorization: start-up (Sep. 2007)- co-founded by former PhD C. Menier [joined MOAIS / PERCEPTION]
- transfer: parallel 3D modeling
Dec. 2006: special Jury prizeNov. 2007: 1st prize
• Nqueens(23) in 2107s with 3654 cores
~4000 visitors
![Page 21: moais.imag.fr](https://reader036.fdocuments.in/reader036/viewer/2022081515/568153e3550346895dc1e2ba/html5/thumbnails/21.jpg)
21
Research directions 2008-2012[1/3]
To push the interactions to large scale
Heterogeneous computing- Complex memory hierarchy
Provable performances vs adversary- Game theory
![Page 22: moais.imag.fr](https://reader036.fdocuments.in/reader036/viewer/2022081515/568153e3550346895dc1e2ba/html5/thumbnails/22.jpg)
22
Research directions 2008-2012[2/3]
Scheduling : multi-objective Large systems, many users, various objectives : equity / fairness Extra global objective to non-cooperative strategies
Coordination interface => Runtime for HIPC on demand Work-stealing based runtime extended to complex memory hierarchy Dependable computing on global computing platforms
![Page 23: moais.imag.fr](https://reader036.fdocuments.in/reader036/viewer/2022081515/568153e3550346895dc1e2ba/html5/thumbnails/23.jpg)
23
Research directions [3/3]
Adaptive algorithms- Large data sets, out-of-core issues - Framework / high level library
High performance interactive computing
- Interactive resolution of complex problem (scheduling)
- Grimage: explore new 3D interactions [PERCEPTION]• Parallelism for adaptive interactive performance
- [EVASION, ALCOVE]
Kaapi partitioning+work-stealing to balance load between heterogeneous resources (CPUs / GPUs )
![Page 24: moais.imag.fr](https://reader036.fdocuments.in/reader036/viewer/2022081515/568153e3550346895dc1e2ba/html5/thumbnails/24.jpg)
24
Summary
“To provide parallel programming schemes, interfaces and tools for high performance interactive computing that enable to achieve provable performances on distributed parallel architectures, from multi-processors system-on-chip to lightweight grids and global computing platforms.”
SIGGRAPH’07 [MOAIS - PERCEPTION - EVASION]
![Page 25: moais.imag.fr](https://reader036.fdocuments.in/reader036/viewer/2022081515/568153e3550346895dc1e2ba/html5/thumbnails/25.jpg)
25
Former members
13 PhDs defended in 2005-2007 Now: 2 at INRIA : Alcove, Cepage
7 in university : Reims, IKI Iran, Luxembourg, Vannes, Damascus, Warsaw, Colima 1 in Postdoc : Iowa SU
1 Start-up co-founder : 4DViews 2 in industry : IFP, Amadeus
2 postdocs Now: Univ. Paris 6, Petrobraz
1 long term visit Axel Krings, Idaho State Univ
3 engineers Now: INRIA/PARIS, industry