Transcript of "Congestions on statically routed InfiniBand networks"
Branislav Jansík
IT4Innovations national supercomputing center
Mission and Vision
Mission Our mission is to deliver scientifically excellent and industry-relevant research in the fields of high performance computing and embedded systems. We provide state-of-the-art technology and expertise in high performance computing and embedded systems and make it available to Czech and international research teams from academia and industry.
Vision To become a top European Centre of Excellence in IT, with emphasis on high performance computing and embedded systems. With our research, know-how and infrastructure, we aspire to improve the quality of life, increase the competitiveness of the industrial sector and promote the cross-fertilization of high performance computing, embedded systems and other scientific and technical disciplines.
The IT4Innovations national supercomputing center
[Map: Prague, Brno, Ostrava; timeline 2013, 2014, 2015]
94 TFLOPS system, the most powerful supercomputer in the Czech Republic and #6 in Central Europe (June 2013); operational since June 2013. A 2 PFLOPS system follows in summer 2015.
Our research helping the Czech Republic (national grants and bilateral cooperation)
VSB-TUO, CVUT, VUT, CDV, CAMEA, CEDA, CE-Traffic, ELTODO, Kapsch, KVADOS
MOLDIMED
FN OL, FN Brno, MU, VSB-TUO, UMG AV CR, IntellMed, GENERI BIOTECH, Sofigen, IAB, CGB lab, EXBIO Praha
Two [Zr4(OH)14(H2O)10]2+ tetrameric cations, two Mg2+ cations and ten water molecules in the interlayer space of vermiculite.
IT4Innovations Anselm
Resource usage per domain
• Computational Chemistry: 45%
• Physics: 17%
• Engineering: 16%
• Earth Sciences: 8%
• Comp. Fluid Dynamics: 4%
• Meteorology: 3%
• Computational Mathematics: 3%
• Hydrology: 1%
• Other: 3%
Anselm ranked #1 in the Czech Republic
Anselm in numbers
• 209 compute nodes
• 3344 Intel Sandybridge cores
• 64–96 GB RAM per node
• 24 Nvidia Tesla K20 GPUs
• 4 Intel Xeon Phi (240 cores)
Salomon in numbers
• 1008 compute nodes
• 24192 Intel Haswell cores
• 128 GB RAM per node
• 864 Intel Xeon Phi 7120P (52704 cores)
Salomon vs. Anselm
[Bar chart: Rpeak (TFLOPS, scale 0–2000) of Anselm vs. Salomon ("Velký cluster", Czech for "large cluster"), split into Rpeak CPU, Rpeak GPU and Rpeak MIC]
HPC network performance
Characteristics and performance metrics
• Connectivity, distance, diameter
• Latency
• Bisection bandwidth (limits all-to-all communication: FFT, matmul, sorting, etc.)
Very important
• Affects tightly coupled parallel applications
• Consumes a large part of the HPC system budget
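Bisection bandwidth is the minimum, over all balanced partitions of the nodes, of the total capacity of links crossing the cut. A brute-force sketch of that definition on a toy topology (an illustrative 8-node ring with unit-capacity links, not any of the systems in this talk):

```python
from itertools import combinations

# Bisection bandwidth by brute force (feasible only for tiny graphs):
# minimum over all balanced node partitions of the capacity crossing
# the cut. Topology: an 8-node ring with unit-capacity links.
P = 8
links = {(i, (i + 1) % P): 1.0 for i in range(P)}

def cut_capacity(half):
    """Total capacity of links with exactly one endpoint in `half`."""
    half = set(half)
    return sum(cap for (a, b), cap in links.items()
               if (a in half) != (b in half))

def bisection_bandwidth():
    """Minimize cut capacity over all balanced partitions."""
    return min(cut_capacity(c) for c in combinations(range(P), P // 2))

print(bisection_bandwidth())  # ring: 2.0, any balanced cut severs 2 links
```

All-to-all patterns (FFT, transposes, sorting) push half the traffic across some such cut, which is why this single number bounds their throughput.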
Oblivious routing and InfiniBand
InfiniBand
• Subnet manager discovers the topology and computes routes
• Linear Forwarding Tables are written to the switches
Oblivious routing
• Static, no regard for traffic demands
• Connectivity cannot be fully exploited
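A rough model of what "static, no regard for traffic demands" means (an illustrative sketch, not OpenSM's actual table format): each switch holds a fixed destination-to-port mapping, so two flows whose fixed routes share a link contend no matter the load. The topology and node numbers below are made up to mirror the collision example on the next slide:

```python
from collections import Counter

# Toy 2-level tree: leaf switches L0, L1, each with uplinks to spines
# S0 and S1. Node-to-leaf placement is illustrative only.
leaf_of = {1: "L0", 4: "L0", 6: "L1", 14: "L1"}

def static_route(src, dst):
    """Links a packet traverses under a fixed (oblivious) rule:
    always take the uplink via spine S0, whatever the current load."""
    if leaf_of[src] == leaf_of[dst]:
        return [(leaf_of[src], leaf_of[dst])]  # stays inside the leaf
    spine = "S0"  # static choice: never rebalanced to S1
    return [(leaf_of[src], spine), (spine, leaf_of[dst])]

def max_link_load(flows):
    """Number of flows sharing the most-loaded link."""
    load = Counter(link for s, d in flows for link in static_route(s, d))
    return max(load.values())

# Two flows, each forced onto the same uplink by the static table:
print(max_link_load([(1, 6), (4, 14)]))  # 2: each flow gets ~half bandwidth
```

An adaptive router could have sent the second flow via S1 and kept both at full speed; an oblivious one cannot, which is the congestion mechanism the next slide illustrates.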
Congestion on InfiniBand network
• Traffic from 1 -> 6 collides with traffic from 4 -> 14
• Bandwidth is affected!
Slide courtesy of Torsten Hoefler, Timo Schneider and Andrew Lumsdaine, "MINs are not Crossbars"
Effective bisection bandwidth
Bisection patterns
• (P choose P/2) ways to partition P nodes into two halves
• (P/2)! ways to pair the P/2 nodes across the cut
• A huge number of possible patterns
Effective performance
• Simulate (ORCS, T. Schneider et al., 2009)
• Sampling
• Many patterns achieve full bisection bandwidth (FBB)
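The sampling idea can be sketched as a small Monte Carlo estimate (an illustrative toy, not the ORCS simulator; the topology, the `d % 2` static-uplink rule and all numbers are assumptions): draw random bisection patterns, route each pair under a fixed rule, and average the fraction of link speed the flows achieve:

```python
import random
from collections import Counter
from statistics import mean

# Toy topology: 8 nodes on two leaf switches of 4 nodes each, with
# 2 uplinks per leaf. Static rule: cross-traffic to destination d
# always uses uplink d % 2 (an assumed oblivious routing function).
P = 8

def leaf(n):
    return n // 4

def pattern_bandwidth(pairing):
    """Mean fraction of link speed the pattern's flows achieve."""
    load, paths = Counter(), []
    for s, d in pairing:
        if leaf(s) != leaf(d):          # only cross-cut flows contend
            link = (leaf(s), d % 2)     # uplink fixed per destination
            load[link] += 1
            paths.append(link)
    if not paths:
        return 1.0
    return mean(1.0 / load[link] for link in paths)

def sample_ebb(samples=1000, seed=0):
    """Monte Carlo effective bisection bandwidth: random halving,
    random pairing, averaged achieved fraction of link speed."""
    rng = random.Random(seed)
    results = []
    for _ in range(samples):
        nodes = list(range(P))
        rng.shuffle(nodes)
        half_a, half_b = nodes[:P // 2], nodes[P // 2:]
        results.append(pattern_bandwidth(list(zip(half_a, half_b))))
    return mean(results)

print(round(sample_ebb(), 2))  # below 1.0: static routes waste capacity
```

Sampling is unavoidable here: even for modest P, the (P choose P/2) x (P/2)! pattern count is far too large to enumerate, which is exactly why the measurements on the following slides report means and standard deviations over a few hundred samples.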
The systems tested
• CARTESIUS, SURFsara, Netherlands: FDR InfiniBand, 3x 1:1 fat tree islands, 1:3.33 between islands
• C07, Skoda Auto, Czech Republic QDR InfiniBand, All-to-all topology, 8x18 port switches, quadruple links
• ANSELM, IT4Innovations QDR InfiniBand, 1:1 fat tree
CARTESIUS, SURFsara
[Histograms: fraction of samples vs. bandwidth (GB/s) and vs. performance relative to link speed and bisection bandwidth; 100 samples, 320 nodes. Mean: 0.73, Std: 0.02]
C07, Skoda Auto
[Histograms: fraction of samples vs. bandwidth (GB/s) and vs. performance relative to link speed and bisection bandwidth; 529 samples, 64 nodes. Mean: 0.88, Std: 0.05]
ANSELM, IT4I
[Histograms: fraction of samples vs. bandwidth (GB/s) and vs. performance relative to link speed and bisection bandwidth; 183 samples, 92 nodes. Mean: 0.88, Std: 0.04]
Conclusions
• Static routing on HPC InfiniBand networks strongly affects link and bisection bandwidth performance
• Congestion does occur, affecting application bandwidth and load balancing
• Almost all topologies are affected; no significant difference was found between fat tree and all-to-all