Bull's view on the future of computing

SurfSara Opening: Cartesius Opening
June 14th, 2013
Jean-Marc DENIS, International Business Director, Extreme Computing Business Unit
© Bull, 2013

Page 2

Cartesius (Renatus, 1596 – 1650)(*)

René Descartes (French: [ʁəne dekaʁt]; Latinized: Renatus Cartesius; adjectival form: "Cartesian"; 31 March 1596 – 11 February 1650) was a French philosopher, mathematician, and writer who spent most of his adult life in the Dutch Republic. He has been dubbed the 'Father of Modern Philosophy'.

Descartes' influence in mathematics is equally apparent: the Cartesian coordinate system, which allows reference to a point in space as a set of numbers and allows algebraic equations to be expressed as geometric shapes in a two-dimensional coordinate system (and conversely, shapes to be described as equations), was named after him. He is credited as the father of analytical geometry, the bridge between algebra and geometry, crucial to the discovery of infinitesimal calculus and analysis. Descartes was also one of the key figures in the Scientific Revolution and has been described as an example of genius.

Descartes was a major figure in 17th-century continental rationalism, later advocated by Baruch Spinoza and Gottfried Leibniz, and opposed by the empiricist school of thought consisting of Hobbes, Locke, Berkeley, Jean-Jacques Rousseau, and Hume. Leibniz, Spinoza and Descartes were all well versed in mathematics as well as philosophy, and Descartes and Leibniz contributed greatly to science as well.

He is perhaps best known for the philosophical statement "Cogito ergo sum" (French: Je pense, donc je suis; English: I think, therefore I am), found in part IV of Discourse on the Method (1637) and §7 of part I of Principles of Philosophy (1644).

The town of La Haye en Touraine was the birthplace of Descartes, although his family home was in nearby Châtellerault. Descartes left La Haye in approximately 1606 to attend the Collège Henri IV at La Flèche. The town was renamed La Haye-Descartes in 1802 in his honor, and then renamed again to Descartes in 1967.

(*) http://en.wikipedia.org/wiki/Ren%C3%A9_Descartes

Page 3

Cartesius (SurfSara, 2013 – … )

Phase 1 (2013)
• 271 TFlops
• 572 compute nodes
• 44800 GB memory
• 1071 TiB storage
• IB FDR

Page 4

Phase 2 (2014)
• 1,349 TFlops (x5)
• 1,652 compute nodes (32 Fat & 1,620 Thin) (x3)
• 112,512 GB memory (x2.5)
• 6,964 TiB storage & 202 GB/s (x7)
• IB FDR (no change)
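The growth factors quoted for Phase 2 can be checked with a few lines of arithmetic. The sketch below assumes that the Phase 2 figures on the slide use a dot as thousands separator (so 1.349 Tflops is read as 1,349 TFlops, and so on); the dictionary and function names are illustrative only.

```python
# Phase 1 and Phase 2 figures as read from the slides.
# Assumption: the Phase 2 values use "." as a thousands separator.
phase1 = {"tflops": 271, "nodes": 572, "memory_gb": 44_800, "storage_tib": 1_071}
phase2 = {"tflops": 1_349, "nodes": 1_652, "memory_gb": 112_512, "storage_tib": 6_964}


def growth_factors(before, after):
    """Return the ratio after/before for every key, rounded to one decimal."""
    return {key: round(after[key] / before[key], 1) for key in before}


factors = growth_factors(phase1, phase2)
for key, factor in factors.items():
    print(f"{key}: x{factor}")
```

The ratios come out at about x5.0, x2.9, x2.5 and x6.5, which the slide rounds to x5, x3, x2.5 and x7.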

Page 5

Why ExaScale Computing?

[Figure: Oil & Gas: better resource detection. Industrial challenges in oil and gas: depth imaging roadmap (courtesy IESP). Horizontal axis: year, 1995 to 2020. Vertical axis: compute power in units of 10^15 flops, from 0.1 to 1000, with markers at 50 TF (50x10^12), 1 PF (10^15) and 10 PF (10^16). "Complexity of algorithm" axis, as listed on the slide: visco-elastic FWI; petro-elastic inversion; elastic FWI; visco-elastic modeling; isotropic/anisotropic FWI; elastic modeling/RTM; isotropic/anisotropic RTM; isotropic/anisotropic modeling; paraxial isotropic/anisotropic imaging; asymptotic approximation imaging. Image quality annotations: "Non-significant image", "Unclear image", "Oil reservoir discovered".]

[Figure: Aircraft: complete multi-physics simulation (courtesy AIRBUS France/IESP). Horizontal axis: year, 1980 to 2030. Axes: "Available Computational Capacity [Flop/s]", from 1 Giga (10^9) through 1 Tera (10^12), 1 Peta (10^15) and 1 Exa (10^18) to 1 Zeta (10^21), and "Capacity: # of Overnight Loads cases run", from 10^2 to 10^6. Labeled milestones: RANS Low Speed; RANS High Speed; HS Design Data Set; Unsteady RANS; LES; CFD-based LOADS & HQ; Aero Optimisation & CFD-CSM; Full MDO; CFD-based noise simulation; Real-time CFD-based in flight simulation. Annotations: "Capability achieved during one night batch"; 'Smart' use of HPC power: algorithms, data mining, knowledge.]

[Figure: Human brain project]

Page 6

(Some) Exascale challenges

[Figure: peak performance in PFlops by year: 1 (2010), 30 (2015), 1,000 (2020), with a x30 step between consecutive years shown.]
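The x30 steps can be converted into an equivalent yearly growth rate. This is a sketch that assumes the chart shows 1 PFlops in 2010, 30 PFlops in 2015 and 1,000 PFlops in 2020; the function name is illustrative.

```python
def annual_growth(factor, years):
    """Constant yearly multiplier that compounds to `factor` over `years`."""
    return factor ** (1.0 / years)


per_year = annual_growth(30, 5)   # one x30 step spans five years
two_steps = 30 * 30               # 2010 -> 2020, two consecutive x30 steps

print(f"yearly multiplier: {per_year:.2f}")
print(f"two x30 steps: x{two_steps}")
```

Under these assumptions performance roughly doubles every year, and two x30 steps give x900, which the slide rounds to 1,000 PFlops.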

Page 7

Addressing the Exascale Challenges

Optimize system Power Consumption (minimize PUE)

Develop new HPC processors

Fix the memory wall (terabyte-scale memory bandwidth)

Terabit interconnect (optical links everywhere)

Non-Volatile Memory (NV-RAM) storage and fast memory

SW complexity: manageability, programming models
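For readers unfamiliar with the PUE metric named in the first item: Power Usage Effectiveness is the total power drawn by the facility divided by the power drawn by the IT equipment alone, so a value of 1 means no overhead at all. The sketch below illustrates the ratio; the load figures are hypothetical and not taken from the slides.

```python
def pue(total_facility_kw, it_equipment_kw):
    """Power Usage Effectiveness: total facility power over IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


# Hypothetical example values, not taken from the slides.
it_load_kw = 1000.0
overhead_kw = 100.0   # cooling, power distribution losses, lighting, ...

print(pue(it_load_kw + overhead_kw, it_load_kw))
```

Minimizing PUE therefore means shrinking the overhead term (cooling, power conversion) relative to the compute load.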

Page 8

Bull focus for ExaScale Computing

Power Consumption
2012: 1 PF at 1 MW
2020: 1000 PF at 20 MW
FLOPS x1000, power x20
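The power target implies a specific gain in energy efficiency, which the following sketch works out from the slide's own figures (1 PF at 1 MW in 2012, 1000 PF at 20 MW in 2020). The variable and function names are illustrative.

```python
PF = 1e15   # flop/s in one petaflop
MW = 1e6    # watts in one megawatt


def gflops_per_watt(flops, watts):
    """Energy efficiency expressed in GFlops per watt."""
    return flops / watts / 1e9


eff_2012 = gflops_per_watt(1 * PF, 1 * MW)
eff_2020 = gflops_per_watt(1000 * PF, 20 * MW)

# Power an exaflop machine would draw if efficiency stayed at the 2012 level.
power_at_2012_efficiency_mw = (1000 * PF) / (eff_2012 * 1e9) / MW

print(f"2012: {eff_2012:.0f} GFlops/W, 2020 target: {eff_2020:.0f} GFlops/W")
print(f"exaflop at 2012 efficiency: {power_at_2012_efficiency_mw:.0f} MW")
```

In other words, x1000 in FLOPS within a x20 power budget requires a x50 improvement in flops per watt; without it, an exaflop system would need on the order of 1000 MW.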

In 2011, 50% of CIOs reported that none of their compute tasks used more than 120 cores

Exponential increase in number of cores

[Figure: average number of cores per supercomputer (Top 20 of Top500), #cores by year, reaching 100 million cores for an exaflops system in 2020.]

Page 9

Bull research program for ExaScale Computing

PUE optimization: down to 1 + ε; (very) hot water; adiabatic computer room

Cogeneration: no wasted energy; any piece of heat is re-used

Supercomputer management: power monitoring tools; use the right HW for the right app

Application optimization: save (a lot) on energy consumption with (very) limited performance degradation

SW stack: OS; communications (MPI but not only); batch; affinity (cpu/mem/node/…); data management (filesystems)

Overcome current interconnect limitations: topology(ies); RDMA mechanisms; latency at large scale

Programming model: (many) different programming models; MIMD+SIMD; languages

Reliability: MTBF close to zero…; automatic recovery mechanisms

Focus areas: power consumption; exponential increase in number of cores

Opportunities for Collaborations
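The remark that MTBF tends toward zero follows from simple scaling: if any node failure counts as a system failure and nodes fail independently at a constant rate, the system MTBF is the node MTBF divided by the node count. The sketch below illustrates this under those stated assumptions; the five-year node MTBF is a hypothetical value, not a figure from the slides.

```python
def system_mtbf_hours(node_mtbf_hours, node_count):
    """MTBF of the whole machine, assuming any single node failure is a
    system failure and nodes fail independently at a constant rate."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return node_mtbf_hours / node_count


# Hypothetical node MTBF of five years, expressed in hours.
node_mtbf = 5 * 365 * 24

for nodes in (1_000, 10_000, 100_000):
    hours = system_mtbf_hours(node_mtbf, nodes)
    print(f"{nodes} nodes -> {hours:.2f} hours between failures")
```

At the node counts expected for exascale, the interval between failures drops below typical job run times, which is why automatic recovery mechanisms appear next to reliability on the slide.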

Page 10

Manageability at ExaScale

Raise level of abstraction

Set of compute resources
• MPI, OpenMP, Threads, CUDA, OpenCL, ...

Parallelism-based compute resources
• Message passing, shared memory

New high-level programming languages
• Locality

"The processor is the new transistor" (Chris Rowen)

Optimize compute environment
• Describe key characteristics of applications
• Select the most appropriate set of node types
• Manage resources with heuristics predicting the future workload

Migrate processes
• Resource fragmentation reduction
• Hardware failure prediction

Allow dynamic application frameworks
• Automatic application load balancing
• Mesh refinement optimization
• Restart lost processes in case of failure
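As an illustration of choosing node types from application characteristics, the sketch below picks a node type from a job's memory-per-core requirement. It is a minimal example under stated assumptions, not Bull's resource manager: the `NodeType` class, the `pick_node_type` function and the core and memory figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeType:
    name: str
    cores: int
    memory_gb: int


# Hypothetical node types; the core and memory figures are placeholders.
NODE_TYPES = [
    NodeType("thin", cores=24, memory_gb=64),
    NodeType("fat", cores=32, memory_gb=256),
]


def pick_node_type(mem_per_core_gb, node_types=NODE_TYPES):
    """Pick the node type with the least memory per core that still fits the job."""
    fitting = [n for n in node_types if n.memory_gb / n.cores >= mem_per_core_gb]
    if not fitting:
        raise ValueError("no node type offers enough memory per core")
    return min(fitting, key=lambda n: n.memory_gb / n.cores)


print(pick_node_type(2.0).name)   # small memory footprint per core
print(pick_node_type(4.0).name)   # larger memory footprint per core
```

Choosing the smallest fitting node type keeps the scarcer large-memory nodes free for the jobs that actually need them, which is one simple way to reduce resource fragmentation.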

Page 11

Programmability at ExaScale

Parallelism / concurrency is easy to grasp

… but much more complex to express in an application program

Distribute tasks and the data to operate on

Old SMP approaches (bulk parallelism à la OpenMP) are making a comeback (cf. MIC)

Old SIMD approaches (bulk parallelism à la CM2/CM5) are making a comeback (cf. CUDA)

At the highest level: message passing (MPI-3), data decomposition

With an increasing degree of parallelism, a hierarchical approach is necessary
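The hierarchical approach can be sketched as a two-level data decomposition: the data is first split into blocks across nodes (the role message passing plays at the top level), and each block is split again across threads within a node. The example below only simulates this with the Python standard library; it makes no MPI or OpenMP calls, the outer level is a plain loop standing in for distributed ranks, and all function names are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor


def block_bounds(n_items, n_parts, part):
    """Half-open [start, stop) range of block `part` when n_items are split
    into n_parts contiguous blocks whose sizes differ by at most one."""
    base, extra = divmod(n_items, n_parts)
    start = part * base + min(part, extra)
    stop = start + base + (1 if part < extra else 0)
    return start, stop


def hierarchical_sum(data, n_nodes, threads_per_node):
    """Two-level sum: split across simulated nodes, then across threads."""

    def node_sum(node):
        lo, hi = block_bounds(len(data), n_nodes, node)
        local = data[lo:hi]   # the block of data owned by this simulated node

        def thread_sum(thread):
            a, b = block_bounds(len(local), threads_per_node, thread)
            return sum(local[a:b])

        with ThreadPoolExecutor(max_workers=threads_per_node) as pool:
            return sum(pool.map(thread_sum, range(threads_per_node)))

    # The outer level is a plain loop standing in for message passing.
    return sum(node_sum(node) for node in range(n_nodes))


print(hierarchical_sum(list(range(1000)), n_nodes=4, threads_per_node=3))
```

The same `block_bounds` helper is reused at both levels, which is the point of a hierarchical decomposition: each level only needs to know how to split the work it was handed.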

Page 12

2020: exascale downscaled to departmental computing (SMEs' computing) and embedded computing

Columns: PetaFlop system (2012) | ExaFlop / data center (2020) | PetaFlop / departmental (2020) | TeraFlop / embedded (2020)

Number of nodes: 3,000-8,000 | 50,000-200,000 (10x) | 50-100 | 1
Computation (Flops & Inst.): 1 PetaFlop | 1 ExaFlop (1000x) | 1 PetaFlop | 1 TeraFlop
Memory capacity (B): 100-200 TB | > 100 PB (1000x) | > 10^14 | > 10^11
Global memory BW (B/s): 200-500 TB/s | > 100 PB/s (1000x) | > 100 TB/s | > 100 GB/s
Interconnect bisection BW: 50-100 TB/s | ~50 PB/s (1000x) | ~10 TB/s | N/A
Storage capacity (B): 1-10 PB | > 1 EB (1000x) | > 1 PB | > 1 TB
Storage BW (B/s): 10-500 GB/s | > 10 TB/s (1000x) | > 10 GB/s | > 10 MB/s
IOP/s: 100,000 | > 100 M (1000x) | > 100,000 | > 100
Power consumption (W): 0.5-1 MW | < 20 MW (20x) | < 20 kW | < 20 W

By 2020: PFlops in a rack, TFlops in a chip
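One way to read the table is to look at the balance between memory and compute in each 2020 column. The sketch below computes bytes of memory per flop/s and bytes/s of memory bandwidth per flop/s, using the lower-bound figures from the table and assuming SI prefixes; the dictionary layout and function name are illustrative.

```python
# Lower-bound figures read from the 2020 columns of the table (SI prefixes assumed).
systems = {
    "exaflop data center": {"flops": 1e18, "memory_b": 100e15, "mem_bw": 100e15},
    "petaflop departmental": {"flops": 1e15, "memory_b": 1e14, "mem_bw": 100e12},
    "teraflop embedded": {"flops": 1e12, "memory_b": 1e11, "mem_bw": 100e9},
}


def balance(system):
    """Bytes of memory per flop/s, and bytes/s of memory bandwidth per flop/s."""
    return (system["memory_b"] / system["flops"],
            system["mem_bw"] / system["flops"])


for name, system in systems.items():
    capacity_ratio, bandwidth_ratio = balance(system)
    print(f"{name}: {capacity_ratio:.2f} B per flop/s, "
          f"{bandwidth_ratio:.2f} B/s per flop/s")
```

All three 2020 columns come out at about 0.1 for both ratios, so the departmental and embedded systems are the data-center design scaled down by factors of one thousand and one million with the same balance.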

Page 13

Cogito Ergo Sum

Computa Ergo Sum

Page 14

Cogito Ergo Sum

Computo Ergo Sum
