moais.imag.fr

25
1 http://moais.imag.fr Multi-Programming and Scheduling Design for Applications of Interactive Simulation Jean-Louis Roch & al. Louvre, Musée de l’Homme Sculpture (Tête) Artist : Anonyme Origin: Rapa Nui [Easter Island] Date : between the XIst and the XVth century Dimensions : 1,70 m high EVALUATION SEMINAR -RESEARCH THEME Num B "Grids and high-performance computing" March 27-28, 2008

description

M ulti-Pr o gramming and Scheduling Design for A pplications of I nteractive S imulation Jean-Louis Roch & al. http://moais.imag.fr. Louvre, Musée de l’Homme Sculpture (Tête) Artist : Anonyme Origin: Rapa Nui [Easter Island] Date : between the XIst and the XVth century - PowerPoint PPT Presentation

Transcript of moais.imag.fr

Page 1: moais.imag.fr

1

http://moais.imag.fr

Multi-Programming and Scheduling Design for Applications of Interactive Simulation

Jean-Louis Roch & al.

Louvre, Musée de l’Homme Sculpture (Tête)Artist : AnonymeOrigin: Rapa Nui [Easter Island]Date : between the XIst and the XVth centuryDimensions : 1,70 m high

EVALUATION SEMINAR -RESEARCH THEME Num B"Grids and high-performance computing"

March 27-28, 2008

Page 2: moais.imag.fr

2

Staff and Skills

Vincent Danjean [MdC 9/2005]

Pierre-François Dutot [MdC 9/2006]

Thierry Gautier [CR] Guillaume Huard [MdC] Grégory Mounié [MdC]

Bruno Raffin [CR INRIA]

Jean-Louis Roch [MdC, Team leader]

Denis Trystram [Prof]

Frédéric Wagner [MdC 9/2006]

1 Invited Prof. Alfredo Goldman [USP Sao Paulo]

19 PhD students, 1 engineer 14 PhDs defended since 2005

Parallel algorithms & programming

Scheduling Interactive applications

1/1/2005: Creation of MOAIS team1/1/2006: Creation of INRIA team-project MOAIS

Page 3: moais.imag.fr

3

Evolution of parallel programming

Parallelism everywhere Distributed, Heterogeneous

Grids

Cluster SMP

GPU

MPSoC

MapReduce [Google]

multi-core

TBB [Intel]

Fortress [Sun]

Cuda [NVidia]

…SPIRIT

MPI OpenMP

Cilk++ [CilkArts]

Page 4: moais.imag.fr

4

End-to-end parallel programming solutionsfor high-performance interactive computing with provable performances.

optimization computational steering, VR embedded

Performance is multi-objective

output

QAP/Nugent on Grid’5000 [PRISM, GSCOP, DOLPHIN] INRIA Grimage platform

[MOAIS, PERCEPTION, EVASION]

Streaming on MPSoCs [ST]

input

MOAIS objective

Page 5: moais.imag.fr

5

To mutually adapt application and scheduling. Proactive/static to the platform : the devices evolve gradually Online/dynamic to the execution context : data and resources Tolerant to data variations, failures, other appli. perturbation,…

From algorithms to applications Scheduling and parallel programming schemes Programming interfaces and tools Target applications : batch scheduling, combinatorial

optimization, computational steering, stream encoding

Approach

Page 6: moais.imag.fr

6

Overview

Architecture

Interactive applicationP

erfo

rman

ce

Adaptive controlof execution

model: abstract representation

algorithm: scheduling, fault tolerance

MOAIS

4. Interactivity 

3. Adaptive algorithms

2. Interfaces for coordination

1. Scheduling  

Research Directions

Page 7: moais.imag.fr

7

Research directions and

achievements for 2005-2007

1. Scheduling2. Interfaces for coordination3. Adaptive algorithms4. Interactive applications

Page 8: moais.imag.fr

8

1. Scheduling Objective A: modeling of scheduling problems for adaptive applications

Adaptable parallelism degree for efficient coarse grain scheduling Parallel task models: moldable tasks, divisible load

Some results: Comparisons and coupling models: [IJ FCS 06 ]

Off-line: improvement of performance ratio : - 3/2-approximation [SIAM J.Comp 07] instead of 2 [Turek&al] by strip-packing- (3+5) for moldable tasks on a grid of clusters [Europar’06]

On-line: decrease of control overhead : « work-first principle » [Cilk] - Extension to general distributed data-flow computations [ICTTA’06, ICCS’07]

Task == | | … | ...

Page 9: moais.imag.fr

9

Objective B: Design of multi-objective scheduling with provable guarantees.

Simultaneous approximation for each objective Approximated solutions of Pareto optimal solutions:

- Makespan/Reliability[SPAA07] - Makespan/Memory [IPDPS08]

t0 2 4 8 16

1. Scheduling

For moldable tasks: yields a

bi-approximation with arbitrary ratio between Cmax and Minsum [WEA05]

To include a smart algorithm inside a recursive doubling (eg. for Makespan) (eg. for Minsum)

Generic -Relaxation scheme [Shmoys&al.]: - Makespan/Minsum [WEA05];

Page 10: moais.imag.fr

10

2. Interfaces for coordination Objective: provably efficient control at runtime of the coupling of components with various synchronizations constraints.

Kaapi middleware

Provable performances: - Efficient local serialization “work-first principle”, zero-copy [J. CLSS’07, ICCS07]]

- Scheduling:• coarse-grain graph partitioning + ``work-stealing’

- Fault-tolerance protocols, from scheduling properties• coordinated protocol [ICTTA’06] + original TIC protocol [EIT05, TDSC08]

Positioning: Multi-processors/multi-core architectures: Intel TBB, Cilk++ Grid / global platforms: Tolerate failure and falsification: Satin (FT)

1 struct sum { 2 void operator()(Shared_r < int > a, 3 Shared_r < int > b, 4 Shared_w < int > r ) 5 { r.write(a.read() + b.read()); } 6 } ; 7 8 struct fib { 9 void operator()(int n, Shared_w<int> r) 10 { if (n <2) r.write( n ); 11 else 12 { int r1, r2; 13 Fork< fib >() ( n-1, r1 ) ; 14 Fork< fib >() ( n-2, r2 ) ; 15 Fork< sum >() ( r1, r2, r ) ; 16 } 17 } 18 } ;

runtimeLocal stack

Distributed nestedmacrodataflow graph

Page 11: moais.imag.fr

11

2. Kaapi: Support and transfert

Quadratic assignment [ANR CHOC] Finite element computations [ANR DISCOGRID] Cryptographic S-Box selection [ANR SAFESCALE] Probabilistic inference engine [ProBayes]

Distributed implementation of CAPE-Open standard for process engineering computations [IFP]

Cluster implementation of compliant runtime RSI/Indiss-RT

Page 12: moais.imag.fr

12

QuickTime™ et undécompresseur

sont requis pour visionner cette image.QuickTime™ et undécompresseur

sont requis pour visionner cette image.

QuickTime™ et undécompresseur

sont requis pour visionner cette image.

QuickTime™ et undécompresseur

sont requis pour visionner cette image.

Sequentialalgorithm

Parallel algo 1 P=2

redundancyOverheads

synchronization

communication

Objective: To design and analyze algorithms that may obliviously adapt their execution under the control of the scheduling

Parallel algo 2 P=100

Parallel algo k P=+∞

3. Adaptive algorithms

Which one to select?

Page 13: moais.imag.fr

13

3. Adaptive algorithms

Heterogeneous resources, variable speeds: work-stealing to obliviously self-tune granularity

But: work Wp increases when depth Dp decreases : multi-objective problem Adaptive recursive coupling of algorithms [Europar’06, PASCO’07, PDP’08]

- Relaxation: sequential / parallel work-stealing

Tp ≤ W p

p.Π ave

+ ODp

Π ave

⎝ ⎜

⎠ ⎟

Minimize both the work Wp and the depth Dp

Page 14: moais.imag.fr

14

Cache&processor oblivious stream computations [ PDP’07] AWS: adaptive work-stealing for MPSoCs

Use case: HDTV on MPSoCs [ST Microelectronics film grain tech.]

MPSoC

Architecture description[SPIRIT / IP-XACT Simulator]

Application description potential parallelism

[AWS api]

Bridge =

AWSscheduling

 Film-grainapplication(streaming)

Ex: HD-TVNoise reduction

Near optimal experimental results [PDP08]

3. Adaptive algorithms

Page 15: moais.imag.fr

15

Adaptive 3D-vision [VR’07]

Realtime constraint : 30 frames per sec

Adaptive heterogeneous coupling with Kaapi: CPU+GPU [EGPV’07]

3. Adaptive algorithms

1 .. 16 CPUs

Tim

e (m

s)

Maximum precision

Lev

el o

f d

etai

ls

Page 16: moais.imag.fr

16

4. Interactivity Motivation: parallelism for interactive applications

Challenging application: multi-cameras, multi-cpus, multi-GPUs, multi-display

Grimage platform [2004…]

Positioning: other platforms: - [Blue-C, ETH Zurich, 2005 …],[Tele-Immersion@UCBerkeley 2005]

Specificity: collaboration with EVASION(realtime physics simulation) PERCEPTION (computer vision)

-- 30 nodes cluster-- 15 cameras-- 16 projectors

Page 17: moais.imag.fr

17

Middleware dedicated to interactive applications Distributed components, moldable Parallel code coupling

Static coarse grain mappingHDTV player on 12 Mpixels display wall (16 projectors)

[CPUs + GPUs]

4. Interactivity

Page 18: moais.imag.fr

18

Grids

Cluster SMP

GPU

MPSoC

multi-core

Summary of 2005-2007

AWS

Multi-objective Adaptive Performance Applications are time-consuming but essential to validate scientific approach

Page 19: moais.imag.fr

19

Some facts Publications

Contracts

Softwares: Kaapi, FlowVR, Taktuk, AWS

0

5

10

15

20

25

2005 2006 2007

JournalBook/Chap.Int. ConfPhDOthers (Nat.)

• 127 in 3 years , 19 rank #1

- 17 Int. Journal (SIAM J.Comp, IEEE TC, TPDS, TDSC, EJOR, FCS, …) - 59 Int. Conf (SPAA, IPDPS, CCGrid VR, VIS, Europar, ICCS, Siggraph…)

0

100

200

300

400

500

600

700

800

2005 2006 2007

ARC INRIAANR Nat.EuropeIndustryScholarshipsTotal

• Industry partners: STM, IFP, CEA, Bull, C-S, DCN, • 2 ARC, 5 ANRs• 1 pole MINALOGIC• 2 Europe, 1 Ass. team

K€

Page 20: moais.imag.fr

20

Highlights 1st prize Plugtest Nov. 2007 Nqueens challenge

SIGGRAPH Aug. 2007 Emerging Technologies Demo

Valorization: start-up (Sep. 2007)- co-founded by former PhD C. Menier [joined MOAIS / PERCEPTION]

- transfer: parallel 3D modeling

Dec. 2006: special Jury prizeNov. 2007: 1st prize

• Nqueens(23) in 2107s with 3654 cores

~4000 visitors

Page 21: moais.imag.fr

21

Research directions 2008-2012[1/3]

To push the interactions to large scale

Heterogeneous computing- Complex memory hierarchy

Provable performances vs adversary- Game theory

Page 22: moais.imag.fr

22

Research directions 2008-2012[2/3]

Scheduling : multi-objective Large systems, many users, various objectives : equity / fairness Extra global objective to non-cooperative strategies

Coordination interface => Runtime for HIPC on demand Work-stealing based runtime extended to complex memory hierarchy Dependable computing on global computing platforms

Page 23: moais.imag.fr

23

Research directions [3/3]

Adaptive algorithms- Large data sets, out-of-core issues - Framework / high level library

High performance interactive computing

- Interactive resolution of complex problem (scheduling)

- Grimage: explore new 3D interactions [PERCEPTION]• Parallelism for adaptive interactive performance

- [EVASION, ALCOVE]

Kaapi partitioning+work-stealing to balance load between heterogeneous resources (CPUs / GPUs )

Page 24: moais.imag.fr

24

Summary

“To provide parallel programming schemes, interfaces and tools for high performance interactive computing that enable to achieve provable performances on distributed parallel architectures, from multi-processors system-on-chip to lightweight grids and global computing platforms.”

SIGGRAPH’07 [MOAIS - PERCEPTION - EVASION]

Page 25: moais.imag.fr

25

Former members

13 PhDs defended in 2005-2007 Now: 2 at INRIA : Alcove, Cepage

7 in university : Reims, IKI Iran, Luxembourg, Vannes, Damascus, Warsaw, Colima 1 in Postdoc : Iowa SU

1 Start-up co-founder : 4DViews 2 in industry : IFP, Amadeus

2 postdocs Now: Univ. Paris 6, Petrobraz

1 long term visit Axel Krings, Idaho State Univ

3 engineers Now: INRIA/PARIS, industry