July 2020
INFINIBAND IN-NETWORK COMPUTING TECHNOLOGY
THE NEW SCIENTIFIC COMPUTING WORLD
NETWORK
EDGE
APPLIANCE
SUPERCOMPUTER
STORAGE
HDR 200G INFINIBAND ACCELERATES NEXT GENERATION HPC AND AI SUPERCOMPUTERS (EXAMPLES)
8K HDR Nodes, Dragonfly+ Topology
9 PetaFLOPS, 3K HDR Nodes, Dragonfly+ Topology
7.5 PetaFLOPS, 2K HDR Nodes, Dragonfly+ Topology
35.5 PetaFLOPS, 2K HDR Nodes, Fat-Tree Topology
23 PetaFLOPS, 5.6K HDR Nodes, Dragonfly+ Topology
HPC/AI Cloud, HDR InfiniBand
HDR Supercomputers
23.5 PetaFLOPS, 8K HDR Nodes, Fat-Tree Topology
27.6 PetaFLOPS, 3K HDR Nodes, Fat-Tree Topology
3K HDR Nodes, 16 PetaFLOPS, Dragonfly+ Topology
INFINIBAND NETWORK TECHNOLOGY FUNDAMENTALS
GPU
CPU
DPU
Smart End-Point
Architected to Scale
Centralized Management
Standard
INFINIBAND ACCELERATED SUPERCOMPUTING
SHARP AI Technology
AI Acceleration Engines
2.5X Higher AI Performance
UFM Cyber AI
Data Center Cyber Intelligence and Analytics
Speed of Light
200Gb/s Data Throughput
RDMA and GPUDirect RDMA
3X Better (Lower) Latency
SHIELD AI Technology
Self-Healing Network
1000X Faster Recovery Time
THE NEW DATA CENTER
Faster Data Speeds and In-Network Computing Enable Higher Performance and Scale
CPU-Centric (Onload)
Must Wait for the Data
Creates Performance Bottlenecks
Security Limitations
[Diagram: Onload Network linking CPU and GPU nodes]
Data-Centric (Offload)
Analyze Data as it Moves!
Higher Performance and Scale
Secured Supercomputing
[Diagram: In-Network Computing fabric linking CPU and GPU nodes]
THE SMARTEST INTERCONNECT
Network
Communication
Application
High Performance Computing
Data Analysis
Deep Learning
Cyber Security
In-Network Computing
NVMe, Containers, OpenStack
Storage / Other Resource Disaggregation
Full Network Transport Offload
RDMA and GPUDirect RDMA
SHIELD (Self-Healing Network)
Enhanced Adaptive Routing and Congestion Control
Connectivity
Ultimate Software Defined Network
Multi-Host and Socket-Direct Technology
Enhanced and Flexible Topologies
SCALABLE HIERARCHICAL AGGREGATION AND REDUCTION PROTOCOL (SHARP)
In-network, tree-based aggregation mechanism
Multiple simultaneous outstanding operations
For HPC (MPI / SHMEM) and Distributed Machine Learning applications
Scalable High Performance Collective Offload
Barrier, Reduce, All-Reduce, Broadcast and more
Sum, Min, Max, Min-loc, Max-loc, OR, XOR, AND
Integer and Floating-Point, 16/32/64 bits
[Diagram: hosts send data up a tree of switches; each switch aggregates the data it receives, and the aggregated result is returned down the tree to the hosts]
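The tree-based aggregation described above can be sketched in software as follows. This is only an illustrative simulation, not the SHARP implementation or its API: the function name `tree_allreduce`, the `radix` parameter (children per switch), and the way hosts are grouped under switches are all assumptions made for the example. It covers a subset of the listed operations and leaves out Min-loc and Max-loc.

```python
from functools import reduce
import operator

# Reduction operations named on the slide, as two-argument Python callables.
# (Min-loc and Max-loc are omitted to keep the sketch short.)
OPS = {
    "sum": operator.add,
    "min": min,
    "max": max,
    "or": operator.or_,
    "xor": operator.xor,
    "and": operator.and_,
}


def tree_allreduce(host_values, op="sum", radix=2):
    """Simulate an all-reduce over a tree of switches.

    host_values: one value per host (hypothetical input layout).
    radix: assumed number of children attached to each switch.
    Returns (per-host results, number of switch levels traversed upward).
    """
    if not host_values:
        raise ValueError("need at least one host value")
    if radix < 2:
        raise ValueError("radix must be at least 2")
    combine = OPS[op]
    level = list(host_values)
    switch_levels = 0
    # Upward phase: each switch aggregates the data from its children.
    while len(level) > 1:
        level = [reduce(combine, level[i:i + radix])
                 for i in range(0, len(level), radix)]
        switch_levels += 1
    # Downward phase: the root's aggregated result is returned to every host.
    return [level[0]] * len(host_values), switch_levels


if __name__ == "__main__":
    results, depth = tree_allreduce([1, 2, 3, 4, 5], op="sum", radix=2)
    print(results, depth)  # [15, 15, 15, 15, 15] 3
```

In this model the number of aggregation levels grows logarithmically with the host count (three levels for five hosts at radix 2, one level for up to `radix` hosts), which is one intuition for why a tree of switches scales well; the slides themselves do not give this derivation.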
SHARP ALLREDUCE PERFORMANCE ADVANTAGES
Providing Flat Latency, 7X Higher Performance
SHARP PERFORMANCE ADVANTAGE OVER ROCE
4X Higher Performance
INFINIBAND SHARP AI PERFORMANCE ADVANTAGE
2.5X Higher Performance
INFINIBAND ACCELERATED AI PLATFORMS
NVIDIA DGX A100 SuperPOD: World’s Most Advanced AI System
AIST: The AI Bridging Cloud Infrastructure
Microsoft Azure: 200 Gigabit HDR InfiniBand Boosts Microsoft Azure High-Performance Computing Cloud Instances
Continental: Advanced Driver Assistance Systems (ADAS)
HDR INFINIBAND
HDR 200G INFINIBAND SOLUTIONS (NOW)
Transceivers
Active Optical and Copper Cables
40 HDR (200Gb/s) InfiniBand Ports
80 HDR100 InfiniBand Ports
Modular Switch - 800 HDR (1600 HDR100) Ports
200Gb/s Adapter
PCIe Gen4
Drivers, Management, Frameworks and Accelerations
UFM, UCX, MPI, SHMEM/PGAS, UPC
System on Chip and SmartNIC
Programmable adapter, Smart Offloads
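The switch port counts above are consistent with each 200Gb/s HDR port being split into two HDR100 ports. The short sketch below checks that arithmetic; it assumes HDR100 means 100Gb/s per port, and the function name `split_ports` is invented for the example.

```python
HDR_GBPS = 200      # per-port rate stated on the slide
HDR100_GBPS = 100   # assumption: an HDR100 port runs at half the HDR rate


def split_ports(hdr_ports):
    """Return (hdr100_ports, total_gbps) for a switch with hdr_ports HDR ports."""
    hdr100_ports = hdr_ports * (HDR_GBPS // HDR100_GBPS)
    total_gbps = hdr_ports * HDR_GBPS
    # Splitting ports changes the port count, not the aggregate bandwidth.
    assert hdr100_ports * HDR100_GBPS == total_gbps
    return hdr100_ports, total_gbps


if __name__ == "__main__":
    print(split_ports(40))   # (80, 8000)
    print(split_ports(800))  # (1600, 160000)
```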
MELLANOX SKYWAY™ INFINIBAND TO ETHERNET GATEWAY
100G EDR / 200G HDR InfiniBand to 100G and 200G Ethernet gateway
400G NDR / 800G XDR InfiniBand speeds ready
Eight EDR/HDR100/HDR InfiniBand ports to eight 100/200G Ethernet ports
Maximum throughput of 1.6 terabits per second
High availability and load balancing
Mellanox Gateway operating system
Scalable and efficient
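The 1.6 terabit figure matches eight ports running at 200Gb/s each. The sketch below shows that calculation, assuming the maximum counts all eight ports in one direction; the slide does not say how the figure is measured, and `gateway_throughput_tbps` is a name made up for the example.

```python
def gateway_throughput_tbps(ports=8, gbps_per_port=200):
    """Aggregate one-direction throughput in Tb/s for the given port setup."""
    return ports * gbps_per_port / 1000


if __name__ == "__main__":
    print(gateway_throughput_tbps(8, 200))  # 1.6
    print(gateway_throughput_tbps(8, 100))  # 0.8
```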
METROX®-2
Seamlessly connects InfiniBand data centers up to 40 kilometers apart
Scalability and load balancing across data centers
Continues compute service in case of data center failures
Standard HDR and EDR InfiniBand end-to-end
Advanced In-Network Computing
Extending InfiniBand to 40km Reach
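A 40 km link adds propagation delay that no equipment can remove. The sketch below estimates it, assuming the link is optical fiber with a refractive index of about 1.468 (a typical value for single-mode fiber, not a figure from the slides). The function name is hypothetical and the result excludes any switch or transceiver latency.

```python
C_VACUUM_KM_PER_S = 299_792.458  # speed of light in vacuum


def fiber_one_way_delay_us(distance_km, refractive_index=1.468):
    """Estimate one-way propagation delay over fiber, in microseconds.

    refractive_index is an assumed typical value for single-mode fiber.
    """
    seconds = distance_km * refractive_index / C_VACUUM_KM_PER_S
    return seconds * 1e6


if __name__ == "__main__":
    one_way = fiber_one_way_delay_us(40)
    print(f"one-way: {one_way:.0f} us, round trip: {2 * one_way:.0f} us")
```

Under these assumptions the one-way delay at 40 km is a little under 200 microseconds, so latency-sensitive traffic between sites sees roughly 0.4 milliseconds of added round-trip time.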
UFM INFINIBAND CYBER INTELLIGENCE AND ANALYTICS PLATFORM
REVOLUTIONIZING SUPERCOMPUTING
AI-Powered InfiniBand Cyber Intelligence and Analytics Platform
Management and Orchestration
Predictive and Preventive Maintenance
Telemetry and Monitoring
Cyber-security and Anomaly Detection
Integration of Real-Time Telemetry with AI Algorithms to Secure Supercomputers, and Enable Predictive Maintenance for OPEX Optimizations
UFM PLATFORMS PORTFOLIO
UFM Cyber-AI: Cyber Intelligence and Analytics
(UFM Cyber-AI includes UFM Enterprise)
UFM Enterprise: Management, Monitoring & Orchestration
(UFM Enterprise includes UFM Telemetry)
UFM Telemetry: Real-Time Monitoring
UFM DASHBOARD
Secure Cable Management
Network Validation
Performance Monitoring
Real-Time Analysis
Prediction Dashboard
Congestion Mapping
Health Reports
Inventory Mapping
SUMMARY
NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.
INFINIBAND DELIVERS HIGHEST PERFORMANCE AND ROI
200G end-to-end, extremely low latency, high message rate, RDMA and GPUDirect
Advanced adaptive routing, congestion control and quality of service for highest network efficiency
In-Network Computing engines for accelerating application performance and scalability
Self-Healing Network with SHIELD for highest network resiliency
Standard: backward and forward compatibility, protecting data center investments
[Diagram: compute servers and NVMe / storage attached to an InfiniBand high-speed network (advanced In-Network Computing, extremely low latency); a high-speed InfiniBand-to-Ethernet gateway reaching Ethernet-attached NVMe / storage; and long-haul InfiniBand]