Transcript: Interconnect Your Future (Mellanox Technologies, December 2018)

Page 1: Interconnect Your Future

Interconnect Topology Considerations Applied To Differing Applications And Clusters

December 2018

© 2018 Mellanox Technologies

Page 2: Mellanox Accelerates Leading HPC and AI Systems

World's top three supercomputers:

1. Summit (CORAL system): world's fastest HPC / AI system, 9.2K InfiniBand nodes
2. Sierra (CORAL system): #2 USA supercomputer, 8.6K InfiniBand nodes
3. Wuxi Supercomputing Center: fastest supercomputer in China, 41K InfiniBand nodes

Page 3: Mellanox InfiniBand and Ethernet Accelerate World-Leading Supercomputers on the Nov'18 TOP500 List

Mellanox connects 53% of overall TOP500 platforms, or 265 systems (InfiniBand and Ethernet), demonstrating 38% growth in 12 months (Nov'17 to Nov'18)

InfiniBand is the Interconnect of Choice for HPC and AI Infrastructures

Mellanox Ethernet is the Interconnect of Choice for Cloud and Hyperscale Platforms

InfiniBand accelerates the fastest HPC and AI supercomputer in the world – Oak Ridge National Laboratory ‘Summit’ system

InfiniBand accelerates the top 3 supercomputers in the world - #1 (USA), #2 (USA), #3 (China)

InfiniBand connects 135 supercomputers, or nearly 55% of overall HPC systems on the TOP500 list

InfiniBand is the most used high-speed interconnect for the TOP500 systems

Mellanox connects 130 Ethernet systems (25 Gigabit and faster), or 51% of total Ethernet systems

The TOP500 list has evolved to include both HPC and cloud / hyperscale (non-HPC) platforms

Nearly half of the platforms on the TOP500 list can be categorized as non-HPC application platforms (mostly Ethernet-based)

Page 4: HPC and AI Need the Most Intelligent Interconnect

Higher data speeds, faster data processing, better data security

Product portfolio: adapters, switches, cables & transceivers, SmartNIC, system on a chip

Page 5: Highest Performance HDR 200G InfiniBand

HDR 200G InfiniBand accelerates next-generation HPC / AI systems.

Adapters: 200Gb/s, 0.6us latency, 215 million messages per second

Switches: 40 HDR (200Gb/s) ports or 80 HDR100 (100Gb/s) ports, 16Tb/s throughput, 15.6 billion messages per second

Cables and transceivers: active optical and copper cables (10 / 25 / 40 / 50 / 56 / 100 / 200Gb/s)
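The switch throughput figure is easy to sanity-check from the port count. A minimal sketch, assuming the 16Tb/s figure counts both directions of each port (the slide does not state how it is derived):

```c
#include <stdio.h>

/* Sanity check of the quoted switch throughput: 40 HDR ports at
   200 Gb/s each, counted in both directions (an assumption). */
int main(void) {
    const int    hdr_ports     = 40;
    const double gbps_per_port = 200.0;
    double aggregate_tbps = hdr_ports * gbps_per_port * 2 / 1000.0;
    printf("Aggregate throughput: %.0f Tb/s\n", aggregate_tbps); /* 16 */
    return 0;
}
```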

Page 6: The Need for Intelligent and Faster Interconnect

CPU-centric (onload): the CPU must wait for the data, which creates performance bottlenecks.

Data-centric (offload): faster data speeds and In-Network Computing enable higher performance and scale. Analyze data as it moves!

[Diagram: CPU/GPU clusters connected by an onload network vs. an In-Network Computing network]

Page 7: In-Network Computing to Enable Data-Centric Data Centers

GPUDirect RDMA

Scalable Hierarchical Aggregation and Reduction Protocol (SHARP)

NVMe over Fabrics

[Diagram: CPU/GPU nodes connected through an In-Network Computing fabric]

Page 8: In-Network Computing Delivers Highest Performance

In-Network Computing (SHARP, Scalable Hierarchical Aggregation and Reduction Protocol): delivers highest application performance, 10X performance acceleration, critical for HPC and machine learning applications

GPUDirect™ RDMA: GPU acceleration technology, 10X performance acceleration, critical for HPC and machine learning applications

Self-Healing Technology: unbreakable data centers, 35X / 5000X faster network recovery

Page 9: Scalable Hierarchical Aggregation and Reduction Protocol (SHARP)

Reliable, scalable, general-purpose primitive: in-network tree-based aggregation mechanism, large number of groups, multiple simultaneous outstanding operations

Applicable to multiple use cases: HPC applications using MPI / SHMEM, distributed machine learning applications

Scalable high-performance collective offload: Barrier, Reduce, All-Reduce, Broadcast and more; Sum, Min, Max, Min-loc, Max-loc, OR, XOR, AND; integer and floating point, 16/32/64 bits (illustrated in the MPI sketch below)

[Diagram: SHARP tree with a root, aggregation nodes, and end nodes (processes running on HCAs)]
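To make the offload concrete, here is a minimal MPI sketch of the kind of collective SHARP accelerates. The application code is unchanged by SHARP; whether the reduction runs on the hosts or in the switch ASICs depends on the MPI stack's SHARP/HCOLL configuration, which varies by installation and is not shown here:

```c
/* Minimal MPI example of a collective that SHARP can offload:
   an All-Reduce SUM across all ranks. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double local = (double)rank;  /* each rank contributes its rank id */
    double sum   = 0.0;

    /* SHARP targets exactly this class of operation: reductions and
       barriers whose result every node needs. */
    MPI_Allreduce(&local, &sum, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    if (rank == 0)
        printf("Sum of ranks = %.0f\n", sum);

    MPI_Finalize();
    return 0;
}
```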

Page 10: 10X Higher Performance with GPUDirect™ RDMA

Accelerates HPC and deep learning performance

Lowest communication latency for GPUs
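A minimal sketch of what GPUDirect RDMA changes at the verbs level: with a peer-memory kernel module loaded (an assumption about the deployment; the module name varies by driver generation), memory returned by cudaMalloc can be registered with the HCA like a host buffer, so the NIC reads and writes GPU memory directly instead of staging through host memory. Error handling and queue-pair setup are omitted:

```c
#include <cuda_runtime.h>
#include <infiniband/verbs.h>

/* Register a GPU buffer for RDMA. With GPUDirect RDMA available,
   the same ibv_reg_mr() call used for host memory works on device
   memory; the peer-memory module resolves the GPU pages. */
struct ibv_mr *register_gpu_buffer(struct ibv_pd *pd, size_t bytes) {
    void *gpu_buf = NULL;
    cudaMalloc(&gpu_buf, bytes);  /* device memory, not host memory */

    return ibv_reg_mr(pd, gpu_buf, bytes,
                      IBV_ACCESS_LOCAL_WRITE |
                      IBV_ACCESS_REMOTE_READ |
                      IBV_ACCESS_REMOTE_WRITE);
}
```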

Page 11: Network Topologies

Page 12: Supporting a Variety of Topologies

Fat Tree, Torus, Dragonfly, Hypercube

Page 13: Traditional Dragonfly vs Dragonfly+

[Diagram: traditional Dragonfly groups vs Dragonfly+ groups, each built from spine (s) and leaf (l) switches]

Page 14: Dragonfly+ Topology

Several "groups", connected using all-to-all links

The topology inside each group can be any topology

Reduces the total cost of the network (fewer long cables)

Utilizes adaptive routing for efficient operation

Simplifies future system expansion

[Diagram: full graph connecting every group to all other groups; Group 1 (hosts 1 .. H), Group 2 (hosts H+1 .. 2H), ... Group G (hosts up to GH), built from border (B) and leaf (L) switches]

1200-node Dragonfly+ system example: three groups (G1, G2, G3), each with 20 leaf switches (1.1 .. 1.20, 2.1 .. 2.20, 3.1 .. 3.20) and 20 HCAs per leaf switch; a sizing sketch follows.
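The example's node count follows from the fan-outs shown on the slide. A back-of-envelope sketch, assuming 40-port switches split evenly, 20 ports down and 20 up (an assumption; the slide only shows the "x 20" fan-outs):

```c
#include <stdio.h>

/* Dragonfly+ sizing for the slide's 1200-node example:
   3 groups x 20 leaf switches x 20 HCAs per leaf. */
int main(void) {
    const int hcas_per_leaf    = 20; /* hosts attached to each leaf switch */
    const int leaves_per_group = 20; /* leaf switches per group */
    const int groups           = 3;  /* G1, G2, G3 in the example */

    int hosts_per_group = hcas_per_leaf * leaves_per_group;  /* 400 */
    int total_hosts     = hosts_per_group * groups;          /* 1200 */

    printf("%d hosts per group, %d hosts total\n",
           hosts_per_group, total_hosts);
    return 0;
}
```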


Page 16: Future Expansion of a Dragonfly+ Based System

Topology expansion of a Fat Tree, or of a regular / Aries-like Dragonfly, requires one of the following:

Reduction of early-phase bisection bandwidth, due to reserving ports on the network switches

Re-cabling the long cables

Dragonfly+ is the only topology that allows system expansion at zero cost: bisection bandwidth is maintained, no ports are reserved, and no re-cabling is required.

[Diagram: Phase 1 deployment, 11 groups x 400 hosts = 4400 hosts]

Page 17: Future Expansion of a Dragonfly+ Based System (continued)

Expansion re-cables only the central racks, a change local to the rack (the phase arithmetic is checked in the sketch below).

[Diagram: Phase 1: 11 x 400 = 4400 hosts; Phase 2: +10 x 400 = 8400 hosts]
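The phase sizes follow directly from the group arithmetic on the slides; a minimal check, assuming 400 hosts per group (20 leaf switches x 20 HCAs, as in the earlier example):

```c
#include <stdio.h>

/* Expansion arithmetic from the slides: 11 groups of 400 hosts in
   Phase 1, then 10 more groups added in Phase 2. */
int main(void) {
    const int hosts_per_group = 20 * 20;            /* 400 */
    int phase1 = 11 * hosts_per_group;              /* 4400 hosts */
    int phase2 = phase1 + 10 * hosts_per_group;     /* 8400 hosts */
    printf("Phase 1: %d hosts; Phase 2: %d hosts\n", phase1, phase2);
    return 0;
}
```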

Page 18: Questions?

Page 19: Thank You