HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g....
![Page 1: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/1.jpg)
Atos
Marc Simon
Atos Senior Expert
Global HPC Presales manager
![Page 2: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/2.jpg)
Exascale vs Exaflops
![Page 3: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/3.jpg)
Exascale is not only about Exaflops: Lessons learned from the Petaflops age
First Petaflops system in June 2008 – Roadrunner: 1 Pflops – 2.5 MW, 300 racks
~13 k x IBM PowerXCell 8i accelerators + AMD Opteron
100 M$
10 years later, in June 2018 – Summit: 122 Pflops – 13 MW, 256 racks
~27 k x Nvidia V100 accelerators + Power9
250 M$
The road to Exaflops is longer than expected…
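To put the two systems side by side, the short sketch below derives performance per MW and per M$ from the figures quoted on the slide. The variable and function names are illustrative only, and the numbers are taken as quoted, without checking them against other sources.

```python
# Figures as quoted on the slide: Pflops, power in MW, cost in M$.
systems = {
    "Roadrunner (2008)": {"pflops": 1.0, "mw": 2.5, "cost_musd": 100.0},
    "Summit (2018)": {"pflops": 122.0, "mw": 13.0, "cost_musd": 250.0},
}


def pflops_per_mw(system):
    """Energy efficiency expressed as Pflops per megawatt."""
    return system["pflops"] / system["mw"]


def pflops_per_musd(system):
    """Cost efficiency expressed as Pflops per million dollars."""
    return system["pflops"] / system["cost_musd"]


roadrunner = systems["Roadrunner (2008)"]
summit = systems["Summit (2018)"]

# Raw performance grew by x122 in ten years.
speedup = summit["pflops"] / roadrunner["pflops"]
# Energy efficiency grew by roughly x23, cost efficiency by roughly x49.
energy_gain = pflops_per_mw(summit) / pflops_per_mw(roadrunner)
cost_gain = pflops_per_musd(summit) / pflops_per_musd(roadrunner)
# Factor still missing between Summit and 1 Exaflops (1000 Pflops).
remaining_to_exaflops = 1000.0 / summit["pflops"]

print(f"speedup x{speedup:.0f}, energy x{energy_gain:.1f}, "
      f"cost x{cost_gain:.1f}, remaining x{remaining_to_exaflops:.1f}")
```

Under these figures, ten years delivered about two orders of magnitude in raw performance rather than three, which is one way to read the slide's closing remark.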
![Page 4: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/4.jpg)
Exascale is not only about Exaflops: Lessons learned from the Petaflops age
End of Moore's law…
![Page 5: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/5.jpg)
Exascale is not only about Exaflops: Lessons learned from the Petaflops age
Application performance is not increasing at the same speed…
![Page 6: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/6.jpg)
Exascale is not only about Exaflops: Lessons learned from the Petaflops age
There are Flops… and Flops…
Not all HPC and AI workloads need Double Precision (64 bits) Flops
– Reduced Precision (16 bits) can be enough
– FP16 (IEEE): 5 bits for exponent
– New floating-point format B16 ("B" like Brain): 8 bits for exponent
– B16 has the same dynamic range as FP32 (IEEE)
– FP16/B16 is usually 4x faster than FP64
With special Matrix Multiplication / Machine Learning instructions… when usable
64 bits: x5 Flops
16 bits: x9 Flops… on top of the x4 → x36 vs usual FP64
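The dynamic-range remark can be checked with a few lines of arithmetic. The sketch below derives the largest finite and smallest normal values of each format from its exponent and fraction widths. The exponent widths for FP16 and B16 come from the slide; the fraction widths (10 bits for IEEE FP16, 7 for bfloat16, 23 for FP32, 52 for FP64) are the standard layouts and are not stated in the source. The class and field names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FloatFormat:
    name: str
    exponent_bits: int
    fraction_bits: int

    @property
    def total_bits(self) -> int:
        # One sign bit plus the exponent and fraction fields.
        return 1 + self.exponent_bits + self.fraction_bits

    @property
    def bias(self) -> int:
        return 2 ** (self.exponent_bits - 1) - 1

    @property
    def max_finite(self) -> float:
        # Largest normal value: all-ones fraction, largest usable exponent.
        return (2.0 - 2.0 ** -self.fraction_bits) * 2.0 ** self.bias

    @property
    def min_normal(self) -> float:
        # Smallest positive normal value.
        return 2.0 ** (1 - self.bias)


FP16 = FloatFormat("FP16 (IEEE)", exponent_bits=5, fraction_bits=10)
B16 = FloatFormat("B16 (bfloat16)", exponent_bits=8, fraction_bits=7)
FP32 = FloatFormat("FP32 (IEEE)", exponent_bits=8, fraction_bits=23)
FP64 = FloatFormat("FP64 (IEEE)", exponent_bits=11, fraction_bits=52)

for fmt in (FP16, B16, FP32, FP64):
    print(f"{fmt.name:15s} {fmt.total_bits:2d} bits  "
          f"min normal {fmt.min_normal:.3e}  max {fmt.max_finite:.3e}")
```

With the same 8-bit exponent, B16 covers essentially the same range as FP32 (about 1e-38 to 3e38) while keeping far fewer fraction bits, whereas FP16 tops out at 65504.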
![Page 7: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/7.jpg)
Exascale is not only about Exaflops: Lessons learned from the Petaflops age
Using some (Artificial) Intelligence
AI can:
- distinguish cats from dogs
- play Go much better than Lee Sedol
- analyze 3D data better than anyone
- …
AI can be used in HPC for:
- pre-processing → data assimilation
- post-processing → data analysis
- accelerating computing
- empirical modeling
- …
Precision Medicine
Autonomous Driving
CFD
![Page 8: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/8.jpg)
Exascale is about new usages
Usage
o Infrastructure convergence
o HPC infrastructure – Density / Performance / Reliability
o Handle new paradigm for Bigdata / AI workloads
o Hybrid workloads to maximize insight
o AI augmented simulation
o Datalake – Data analysis
o Pre/post processing
o Data deluge
o Digital twins
o High energy physics – Astrophysics
o HPCaaS
o Private and public cloud
o Hybrid orchestrator
HPC, HTC, Bigdata and AI
![Page 9: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/9.jpg)
Exascale is about new usages
Usage: Digital Twins
Coupling
o Data collection
o Analysis
o Multi Physics simulation
![Page 10: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/10.jpg)
Exascale is about new usages
Usage: Precision Medicine
Predictive
Preventive
Personalised
Participatory
50% of the children born in 2018 will live to be 100 in most countries
![Page 11: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/11.jpg)
Exascale is about new usages
Usage: HPC in the loop – Edge Computing
o Much more data is produced than can be stored
o In-flight pre-processing is necessary
SKA: 250 ExaByte/year
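To give a sense of scale, the yearly SKA volume quoted above can be converted into a sustained data rate. The sketch below is a minimal calculation, assuming a decimal exabyte (10^18 bytes) and a 365-day year; the names are illustrative.

```python
EXABYTE = 10 ** 18                   # decimal exabyte, in bytes (assumption)
SECONDS_PER_YEAR = 365 * 24 * 3600   # 365-day year (assumption)


def sustained_rate_bytes_per_s(bytes_per_year):
    """Average rate needed to absorb a yearly volume without any backlog."""
    return bytes_per_year / SECONDS_PER_YEAR


ska_bytes_per_year = 250 * EXABYTE   # figure quoted on the slide
ska_rate = sustained_rate_bytes_per_s(ska_bytes_per_year)
ska_rate_tb_per_s = ska_rate / 1e12

print(f"SKA sustained rate: about {ska_rate_tb_per_s:.1f} TB/s")
```

An average of several terabytes per second, around the clock, is the kind of stream that has to be reduced in flight before anything is written to storage.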
![Page 12: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/12.jpg)
Exascale – European Initiatives
![Page 13: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/13.jpg)
European Exascale Effort
The European HPC landscape is changing
![Page 14: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/14.jpg)
Code evolution and performance portability
Be ready for exascale
Weather and Climate, Biomedicine, CFD, etc.
Provide services to the user community
Be recognized as an essential industrial partner
Consulting, Co-design, Benchmarking
HORIZON 2020: EU Collaborative Projects (H2020)
Weather Forecast & Climate: Designing next-gen applications
Life Science: Biomedical modelling community
Solid Earth, Energy, VVUQ, Extreme Data: Start, reinforce, prepare future needs
![Page 15: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/15.jpg)
▶ Develop the roadmap for the full length of the EPI Initiative
▶ Develop the first generation of technologies through a co-design approach
▶ Tape out the first-generation chip by integrating the IPs developed
▶ Validate this chip in the HPC context and in the automotive context using a demonstration platform
EPI Proposal for HPC
![Page 16: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/16.jpg)
Public Procurement of Innovations for High Performance Computing
• In this project, a group of leading European supercomputing centres decided to form a buyers group to execute a joint Public Procurement of Innovative Solutions.
• The participants will work together on coordinated roadmaps for providing HPC resources optimized to the needs of European scientists and engineers.
• Energy efficiency and power management
• Data management
• Programming environment and productivity
• Data centre integration
• Maintenance and support
• System and application monitoring
• Security
FEATURES
OBJECTIVES
GOALS
![Page 17: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/17.jpg)
EuroHPC World-Class Supercomputer
▶ 8 hosting sites selected
▶ 19/28 countries
▶ 840 M€ co-investment (EU/countries)
▶ Targeting 2020
▶ Requires diversity
![Page 18: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/18.jpg)
Exascale – Atos Technologies
![Page 19: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/19.jpg)
Exascale Challenges: Technology Trends
Processing, Data Management, Networking, Energy, Applications
![Page 20: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/20.jpg)
Atos HPC roadmap – New technologies
[Figure: roadmap timeline 2019, 2020, 2021, 2022, 2023, leading to Exascale (10^18)]
BullSequana X1000
Open platform
BullSequana XH2000
Exascale Interconnect
High-speed Ethernet
Data Management
Energy & Performance Optimizer
Atos Quantum Learning Machine
![Page 21: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/21.jpg)
Exascale Challenges: Technology Trends
Processing
![Page 22: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/22.jpg)
Exascale Challenges: Processors
Diversity of compute engines
o Adapted to specific needs
Higher and higher TDP to manage
o Up to 800 W…
High Bandwidth Memory
o Targeting a 1:1 Byte/Flop ratio
Interconnection of heterogeneous chips
o Coherent and non-coherent
[Figure: Ideal HPC unit (Core, Core, Core)]
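The 1:1 Byte/Flop target relates memory bandwidth to peak compute. As a rough illustration, the sketch below computes the ratio for a purely hypothetical device (1 TB/s of memory bandwidth, 10 TFlops peak) and the bandwidth it would need to reach the target. None of these numbers come from the slides, and the function names are illustrative.

```python
def bytes_per_flop(mem_bandwidth_gb_s, peak_gflops):
    """Bytes of memory traffic available per floating-point operation."""
    return mem_bandwidth_gb_s / peak_gflops


def bandwidth_for_ratio(peak_gflops, target_ratio=1.0):
    """Memory bandwidth (GB/s) needed to sustain the target Byte/Flop ratio."""
    return peak_gflops * target_ratio


# Hypothetical device, for illustration only (not taken from the slides).
hypothetical_bw_gb_s = 1000.0       # 1 TB/s of memory bandwidth
hypothetical_peak_gflops = 10000.0  # 10 TFlops peak

ratio = bytes_per_flop(hypothetical_bw_gb_s, hypothetical_peak_gflops)
needed_bw_gb_s = bandwidth_for_ratio(hypothetical_peak_gflops)

print(f"ratio: {ratio:.2f} Byte/Flop, "
      f"bandwidth needed for 1:1: {needed_bw_gb_s:.0f} GB/s")
```

For this made-up device the ratio is 0.1 Byte/Flop, so reaching 1:1 would take ten times the memory bandwidth at the same peak compute.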
![Page 23: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/23.jpg)
Exascale Challenges: Technology Trends
Data Management
![Page 24: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/24.jpg)
Exascale Challenges: Data Management
Different spaces for different needs
L1 – Processing storage (e.g. Parallel File System)
L2 – Scalable storage (e.g. Object Storage)
L3 – Archive storage (e.g. HSM)
Lext – External storage (e.g. Cloud Storage)
Storage Manager
![Page 25: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/25.jpg)
Exascale Challenges: Technology Trends
Networking
![Page 26: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/26.jpg)
Exascale Challenges: Networking
High Speed Interconnect requirements
▶ High bandwidth & low latency
o x20 bandwidth – x3 latency improvement in 15 years
▶ Increase Resiliency, Availability and Serviceability features
o Larger fabric to manage
▶ Topology support - Scalability
o Adapted to size and performance requirements
▶ Routing algorithm – Mix and match
o Path relative to type of communication
▶ Adaptive routing & QoS
o Congestion management
▶ Interoperability – Open Standard
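The 15-year bandwidth and latency factors in the list above can be turned into yearly rates to show how differently the two have evolved. This is a minimal sketch, assuming steady compound growth and reading "x3 latency" as a threefold latency improvement; the function name is illustrative.

```python
def annual_factor(total_factor, years):
    """Constant yearly factor that compounds to total_factor over the period."""
    return total_factor ** (1.0 / years)


YEARS = 15
bandwidth_per_year = annual_factor(20, YEARS)  # x20 over 15 years
latency_per_year = annual_factor(3, YEARS)     # x3 over 15 years (assumed improvement)

print(f"bandwidth: +{(bandwidth_per_year - 1) * 100:.1f}% per year")
print(f"latency:   +{(latency_per_year - 1) * 100:.1f}% per year")
```

Under these assumptions bandwidth improved by a bit over 20% per year and latency by under 8% per year, which is why latency-sensitive communication patterns gain much less from newer fabrics.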
![Page 27: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/27.jpg)
Exascale Challenges: Technology Trends
Energy
![Page 28: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/28.jpg)
Exascale Challenges: Energy
Direct Liquid Cooling:
o Compute nodes (CPU, Memory, Drives, GPU)
o High Speed Interconnect: HDR and BXI switches (L1, L2, L3)
o Management network: intra-rack management switches
o Power Supply Unit: DLC shelves
The only components in BullSequana XH2000 that are not liquid cooled are the pumps of the Hydraulic Chassis (HYC)
95% efficiency:
o Warm water up to 40°C inlet
o Heat rejected to air is almost constant
Targeting > 45°C inlet and > 98% efficiency…
BullSequana XH2000: fanless, innovative cooling solution
![Page 29: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/29.jpg)
Exascale Challenges: Technology Trends
Applications
![Page 30: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/30.jpg)
Exascale Challenges: Applications
Smart Hybrid Management
Unify user experience
Bull Super Computing Stack
Integrate and manage HPC, Bigdata, AI and Cloud workflows
Atos Codex AI Suite – Orchestrator
Smart Energy Optimization
Measure the power consumption of your job through the Bull Energy Optimizer (BEO)
Optimize the power consumption of your job through the Bull Dynamic Power Optimizer (BDPO)
Smart IO Management
Manage IO performance through the Bull IO Instrumentation (IOI)
Accelerate some I/O through the Bull Fast IO Library (FIOL)
![Page 31: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/31.jpg)
Exascale Challenges: Collaborating with Communities
Weather Forecast & Climate: Designing next-gen applications
Life Science: Biomedical modelling community
Solid Earth, Energy, VVUQ, Extreme Data: Start, reinforce, prepare future needs
![Page 32: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/32.jpg)
Exascale Challenges: Summary
Processing: Build an open common platform for hybrid processors
Data Management: Data-centric architecture
Networking: Backbone of the modern hybrid / heterogeneous supercomputer
Energy: Increase efficiency
Applications: Adapt for hybrid usage model – Collaboration, Center of Excellence
![Page 33: HPC, AI & Quantum · storage L ext External storage L 3 Archive storage L 1 Processing storage e.g. Parallel File ... Integrate and manage HPC, Bigdata, AI and Cloud workflows Atos](https://reader034.fdocuments.in/reader034/viewer/2022042711/5f7e15a04b8ce31a02128c05/html5/thumbnails/33.jpg)
The Next Generation starts Today!