
Extreme Scaling vs. Energy Efficiency – A Challenge for Computer Science

Dieter Kranzlmüller
Munich Network Management Team
Ludwig-Maximilians-Universität München (LMU) & Leibniz Supercomputing Centre (LRZ) of the Bavarian Academy of Sciences and Humanities

Leibniz Supercomputing Centre of the Bavarian Academy of Sciences and Humanities


Figure: LRZ campus – a cuboid containing the computing systems (72 x 36 x 36 meters), institute buildings, lecture halls, and the Visualisation Centre.

With 156 employees + 38 extra staff, for more than 100,000 students and more than 30,000 employees, including 8,500 scientists.


Leibniz Supercomputing Centre of the Bavarian Academy of Sciences and Humanities

• Computer Centre for all Munich Universities


IT Service Provider:
• Munich Scientific Network (MWN)
• Web servers
• e-Learning
• E-Mail
• Groupware
• Special equipment:
  – Virtual Reality Laboratory
  – Video Conference
  – Scanners for slides and large documents
  – Large scale plotters

IT Competence Centre:
• Hotline and support
• Consulting (security, networking, scientific computing, …)
• Courses (text editing, image processing, UNIX, Linux, HPC, …)


The Munich Scientific Network (MWN)



Leibniz Supercomputing Centre of the Bavarian Academy of Sciences and Humanities

• Regional Computer Centre for all Bavarian Universities

• Computer Centre for all Munich Universities


Virtual Reality & Visualization Centre (LRZ)



Examples from the V2C


Leibniz Supercomputing Centre of the Bavarian Academy of Sciences and Humanities

• National Supercomputing Centre
• Regional Computer Centre for all Bavarian Universities
• Computer Centre for all Munich Universities



Gauss Centre for Supercomputing (GCS)

• Combination of the 3 German national supercomputing centers:
  – John von Neumann Institute for Computing (NIC), Jülich
  – High Performance Computing Center Stuttgart (HLRS)
  – Leibniz Supercomputing Centre (LRZ), Garching near Munich
• Founded on 13 April 2007
• Hosting member of PRACE (Partnership for Advanced Computing in Europe)


PRACE Research Infrastructure Created

• Establishment of the legal framework
  – PRACE AISBL created with seat in Brussels in April (Association Internationale Sans But Lucratif)
  – 20 members representing 20 European countries
  – Inauguration in Barcelona on June 9
• Funding secured for 2010-2015
  – 400 Million € from France, Germany, Italy, Spain, provided as Tier-0 services on a TCO basis
  – Funding decision for 100 Million € in The Netherlands expected soon
  – 70+ Million € from EC FP7 for preparatory and implementation, Grants INFSO-RI-211528 and 261557, complemented by ~60 Million € from PRACE members



PRACE Tier-0 Systems

• Curie @ GENCI: Bull Cluster, 1.7 PFlop/s
• FERMI @ CINECA: IBM BG/Q, 2.1 PFlop/s
• Hermit @ HLRS: Cray XE6, 1 PFlop/s
• JUQUEEN @ FZJ: IBM Blue Gene/Q, 5.9 PFlop/s
• MareNostrum @ BSC: IBM System X iDataPlex, 1 PFlop/s
• SuperMUC @ LRZ: IBM System X iDataPlex, 3.2 PFlop/s


Leibniz Supercomputing Centre of the Bavarian Academy of Sciences and Humanities

• European Supercomputing Centre
• National Supercomputing Centre
• Regional Computer Centre for all Bavarian Universities
• Computer Centre for all Munich Universities


Figure: the LRZ system landscape – SGI UV, SGI Altix, Linux Clusters, SuperMUC, and Linux hosting and housing.


SuperMUC@LRZ


Video: SuperMUC rendered on SuperMUC by LRZ

http://youtu.be/OlAS6iiqWrQ

Top500 Supercomputer List (June 2012)


www.top500.org


LRZ Supercomputers


SuperMUC Phase II


SuperMUC Phase 1 + 2



In Front of 6.4 PFlop/s


Challenges in Programming and Using these Supercomputers



SuperMUC and its predecessors

Figure: machine-room photos of SuperMUC and its predecessors, with building dimensions of 10 m, 22 m, and 11 m.

LRZ Building Extension


Picture: Ernst A. Graf

Picture: Horst-Dieter Steinhöfer

Figure: Herzog+Partner für StBAM2 (staatl. Hochbauamt München 2)


SuperMUC Architecture

• Compute nodes: 18 thin node islands (each >8,000 cores; SB-EP, 16 cores/node, 2 GB/core) plus 1 fat node island (8,200 cores; WM-EX, 40 cores/node, 6.4 GB/core) → SuperMIG
• Interconnect: non-blocking within each island, pruned tree (4:1) between islands
• I/O nodes attach to NAS at 80 Gbit/s: $HOME 1.5 PB / 10 GB/s, snapshots/replica 1.5 PB (in a separate fire section)
• GPFS for $WORK and $SCRATCH: 10 PB at 200 GB/s
• Internet access; archive and backup ~30 PB; disaster recovery site
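The 4:1 pruned tree is the central bandwidth trade-off in this design: full bandwidth inside an island, a quarter of it between islands. The sketch below models this; the node count per island and the per-link rate (FDR10-class InfiniBand at ~40 Gbit/s) are illustrative assumptions, not figures from the slides.

```python
# Rough model of a two-level island interconnect with a pruned tree.
# All link figures are assumptions for illustration.

def island_bandwidth(nodes_per_island, link_gbit_s, pruning=4):
    """Aggregate bandwidth inside an island vs. its pruned uplinks."""
    intra = nodes_per_island * link_gbit_s  # non-blocking within the island
    inter = intra / pruning                 # only 1/pruning leaves the island
    return intra, inter

# Hypothetical thin island: 512 nodes x 16 cores = 8,192 cores,
# one ~40 Gbit/s (FDR10-class) link per node.
intra, inter = island_bandwidth(512, 40)
print(f"intra-island: {intra / 8:.0f} GB/s, inter-island: {inter / 8:.0f} GB/s")
```

Under these assumptions a communication pattern that stays inside an island sees four times the bandwidth of one that crosses island boundaries, which is why placement on whole islands matters for communication-heavy codes.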

Power Consumption at LRZ


Figure: LRZ electricity consumption per year ("Stromverbrauch in MWh"), on a scale from 0 to 35,000 MWh.


Cooling SuperMUC


SuperMUC Phase 1 & 2 @ LRZ



LRZ Application Mix

• Computational Fluid Dynamics: optimisation of turbines and wings, noise reduction, air conditioning in trains
• Fusion: plasma in a future fusion reactor (ITER)
• Astrophysics: origin and evolution of stars and galaxies
• Solid State Physics: superconductivity, surface properties
• Geophysics: earthquake scenarios
• Material Science: semiconductors
• Chemistry: catalytic reactions
• Medicine and Medical Engineering: blood flow, aneurysms, air conditioning of operating theatres
• Biophysics: properties of viruses, genome analysis
• Climate research: currents in oceans


Increasing numbers

Date   System              Flop/s              Cores
2000   HLRB-I              2 TFlop/s           1,512
2006   HLRB-II             62 TFlop/s          9,728
2012   SuperMUC            3,200 TFlop/s       155,656
2015   SuperMUC Phase II   3.2 + 3.2 PFlop/s   229,960

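The table can be condensed into growth rates with a few lines of arithmetic; a minimal sketch using only the figures above (the 2015 entry is taken as 6.4 PFlop/s in total, combining the two 3.2 PFlop/s phases):

```python
# Growth rates of LRZ flagship systems, figures from the table above.
systems = [
    (2000, "HLRB-I", 2e12, 1512),
    (2006, "HLRB-II", 62e12, 9728),
    (2012, "SuperMUC", 3200e12, 155656),
    (2015, "SuperMUC Phase 1+2", 6400e12, 229960),
]

for (y0, n0, f0, c0), (y1, n1, f1, c1) in zip(systems, systems[1:]):
    years = y1 - y0
    speedup = f1 / f0
    annual = speedup ** (1 / years)    # mean annual growth factor
    per_core = (f1 / c1) / (f0 / c0)   # growth in Flop/s per core
    print(f"{n0} -> {n1}: x{speedup:.0f} in {years} years "
          f"(~x{annual:.2f}/year), Flop/s per core x{per_core:.1f}")
```

The pattern is that an ever larger share of the growth comes from more cores rather than faster cores, which is exactly what drives the scaling challenges discussed below.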


SuperMUC Jobsize 2015 (in Cores)


Challenges on Extreme Scale Systems

• Size: number of cores >100,000
• Complexity / Heterogeneity
• Reliability / Resilience
• Energy consumption as part of the Total Cost of Ownership (TCO):
  – Execute codes with optimal power consumption (or within a certain power band) → frequency scaling (see the sketch below)
  – Optimize for energy-to-solution → allow more codes within a given budget
  – Improved performance → (in most cases) improved energy-to-solution

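To make the frequency-scaling item concrete: for a compute-bound code, runtime falls roughly as 1/f while dynamic power grows roughly as f³, so energy-to-solution has a minimum at an intermediate frequency. The sketch below illustrates this; the power model and every constant in it are illustrative assumptions, not LRZ measurements.

```python
# Illustrative energy-to-solution model under CPU frequency scaling.
# Assumed model: power = static + k * f**3, runtime ~ 1/f (compute-bound).

STATIC_W = 60.0       # static power per node in watts (assumed)
K = 10.0              # dynamic power coefficient, watts per GHz^3 (assumed)
WORK_GHZ_S = 1000.0   # total work, in GHz-seconds (assumed)

def energy_to_solution(f_ghz):
    runtime_s = WORK_GHZ_S / f_ghz       # compute-bound: t ~ 1/f
    power_w = STATIC_W + K * f_ghz ** 3  # node power at frequency f
    return power_w * runtime_s           # joules

# Scan the allowed frequency band and pick the most energy-efficient point.
freqs = [1.2 + 0.1 * i for i in range(13)]  # 1.2 .. 2.4 GHz
best = min(freqs, key=energy_to_solution)
print(f"optimal frequency: {best:.1f} GHz, "
      f"energy-to-solution: {energy_to_solution(best) / 1000:.1f} kJ")
```

In this toy model the optimum lands well below the maximum frequency; a larger static share pushes it back up ("race to idle"), and a power cap simply restricts the scanned band.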


1st LRZ Extreme Scale Workshop

• July 2013: 1st LRZ Extreme Scale Workshop
• Participants: 15 international projects
• Prerequisite: a successful run on 4 islands (32,768 cores)
• Participating groups (software packages): LAMMPS, VERTEX, GADGET, WaLBerla, BQCD, Gromacs, APES, SeisSol, CIAO
• Successful results (>64,000 cores): invited to participate in the ParCo Conference (Sept. 2013), including a publication of their approach


1st LRZ Extreme Scale Workshop

• Regular SuperMUC operation:
  – 4 islands maximum
  – batch scheduling system
• Entire SuperMUC reserved for 2.5 days for the challenge:
  – 0.5 days for testing
  – 2 days for executing
  – 16 (of 19) islands available
• Consumed computing time for all groups:
  – 1 hour of runtime = 130,000 CPU hours
  – 1 year in total

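The 130,000-CPU-hours-per-hour figure follows directly from the island sizes; a quick check, using the thin-island dimensions from the architecture overview:

```python
# Quick check of the CPU-hour accounting quoted above.
islands = 16                # islands available during the challenge
cores_per_island = 8192     # 512 nodes x 16 cores (thin island)
cores = islands * cores_per_island

print(f"{cores:,} cores -> {cores:,} CPU hours per wall-clock hour")
# 131,072 cores, i.e. the ~130,000 CPU hours per hour quoted on the slide.
```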


Results (Sustained TFlop/s on 128,000 cores)


Name       MPI          #cores    Description           TFlop/s/island   TFlop/s max
Linpack    IBM          128,000   TOP500                161              2,560
Vertex     IBM          128,000   Plasma Physics        15               245
GROMACS    IBM, Intel   64,000    Molecular Modelling   40               110
Seissol    IBM          64,000    Geophysics            31               95
waLBerla   IBM          128,000   Lattice Boltzmann     5.6              90
LAMMPS     IBM          128,000   Molecular Modelling   5.6              90
APES       IBM          64,000    CFD                   6                47
BQCD       Intel        128,000   Quantum Physics       10               27
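One way to read the table is to compare each full-run rate against a perfect scale-up of the per-island rate. A small sketch over a few rows; the island counts are inferred here as cores / 8,192 (the thin-island size), which is an approximation on my part:

```python
# Parallel-efficiency estimate from the results table: measured rate
# vs. a perfect linear scale-up of the single-island rate.
results = [  # name, cores, TFlop/s per island, TFlop/s measured
    ("Linpack", 128000, 161, 2560),
    ("Vertex", 128000, 15, 245),
    ("GROMACS", 64000, 40, 110),
    ("waLBerla", 128000, 5.6, 90),
    ("BQCD", 128000, 10, 27),
]

for name, cores, per_island, measured in results:
    islands = round(cores / 8192)    # inferred island count
    ideal = islands * per_island     # perfect linear scaling
    print(f"{name:9s} on {islands:2d} islands: {measured / ideal:4.0%} of ideal")
```

In this reading waLBerla and Linpack scale almost linearly, while BQCD loses most of its per-island rate across islands, consistent with the pruned inter-island network.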

Results

• 5 software packages were running on the maximum of 16 islands:
  – LAMMPS
  – VERTEX
  – GADGET
  – WaLBerla
  – BQCD
• VERTEX reached 245 TFlop/s on 16 islands (A. Marek)



Extreme Scaling Continued

• Lessons learned → stability and scalability
• The LRZ Extreme Scale Benchmark Suite (LESS) will be available in two versions: public and internal
• All teams will have the opportunity to run performance benchmarks after upcoming SuperMUC maintenances
• 2nd LRZ Extreme Scaling Workshop → 2-5 June 2014
  – Full system production runs on 18 islands with sustained PFlop/s (4h SeisSol, 7h GADGET)
  – 4 existing + 6 additional full system applications
  – High I/O bandwidth in user space possible (66 GB/s of 200 GB/s max)
  – Important goal: minimize energy × runtime (3-15 W/core)
• Extreme Scale-Out with the new SuperMUC Phase 2


Extreme Scale-Out SuperMUC Phase 2

• 12 May - 12 June 2015 (30 days)
• Selected group of early users
• Nightly operation: general queue, max 3 islands
• Daytime operation: special queue, max 6 islands (full system)
• Total available: 63,432,000 core hours
• Total used: 43,758,430 core hours (utilisation: 68.98%)

Lessons learned (2015):
• Preparation is everything
• Finding Heisenbugs is difficult
• MPI is at its limits
• Hybrid (MPI + OpenMP) is the way to go
• I/O libraries are getting even more important



Partnership Initiative Computational Sciences πCS

• Individualized services for selected scientific groups – flagship role
  – Dedicated point-of-contact
  – Individual support, guidance, and targeted training & education
  – Planning dependability for use-case-specific optimized IT infrastructures
  – Early access to the latest IT infrastructure (hard- and software) developments, and specification of future requirements
  – Access to the IT competence network and expertise at CS and Math departments
• Partner contribution
  – Embedding IT experts in user groups
  – Joint research projects (including funding)
  – Scientific partnership – equal footing – joint publications
• LRZ benefits
  – Understanding the (current and future) needs and requirements of the respective scientific domain
  – Developing future services for all user groups
  – Thematic focusing: Environmental Computing


SeisSol – Numerical Simulation of Seismic Wave Phenomena


Picture: Alex Breuer (TUM) / Christian Pelties (LMU)

Dr. Christian Pelties, Department of Earth and Environmental Sciences (LMU)
Prof. Michael Bader, Department of Informatics (TUM)

1.42 PFlop/s on 147,456 cores of SuperMUC (44.5% of peak performance)

http://www.uni-muenchen.de/informationen_fuer/presse/presseinformationen/2014/pelties_seisol.html
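The percent-of-peak figure can be checked with simple arithmetic. A sketch, assuming Sandy Bridge-EP cores at 2.7 GHz with 8 double-precision Flop per cycle; these per-core figures are assumptions on my part, not numbers from the slides:

```python
# Check of the percent-of-peak figure for the SeisSol run.
cores = 147_456
ghz = 2.7            # assumed core clock (Sandy Bridge-EP)
flop_per_cycle = 8   # assumed DP Flop/cycle with AVX

peak = cores * ghz * 1e9 * flop_per_cycle  # theoretical peak in Flop/s
sustained = 1.42e15                        # sustained rate from the slide

print(f"peak: {peak / 1e15:.2f} PFlop/s, fraction: {sustained / peak:.1%}")
# ~3.19 PFlop/s peak -> ~44.6%, matching the quoted 44.5% of peak.
```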


Extreme Scaling - Conclusions

• The number of compute cores and the complexity (and heterogeneity) of the systems are steadily increasing
• Users need the possibility to reliably execute (and optimize) their codes on full-size machines with more than 100,000 cores
• The Extreme Scaling Workshop Series @ LRZ offers a number of incentives for users → next workshop in Spring 2016
• The lessons learned from the Extreme Scaling Workshops are very valuable for the operation of the centre
• The LRZ Partnership Initiative Computational Sciences (πCS) tries to improve user support

http://www.sciencedirect.com/science/article/pii/S1877050914003433


Extreme Scaling vs. Energy Efficiency - A Challenge for Computer Science

Dieter Kranzlmüller

[email protected]