A Self-adaptive network for Big Data...
Feroz Zahid, Simula Research Laboratory
Advisors: Ernst Gunnar Gran, Tor Skeie
SC ’16 Doctoral Showcase
Salt Lake City, UT, USA
November 15, 2016
Realizing a Self-Adaptive Network Architecture for HPC Clouds
This presentation walks through my doctoral work, covering our contributions and ‘the big picture’ ahead
Approach and Contributions
Motivation and Challenges
The Big Picture
InfiniBand (IB) is a popular interconnect for HPC systems
Source: Top500 Supercomputers List, http://top500.org/
40.8% share in June 2016 top supercomputers list
A whole array of challenges needs to be addressed to realize a self-adaptive HPC cloud based on a feedback control loop
In this work, the focus has been on the network architecture for HPC clouds
To fully utilize the interconnection network, the network architecture must coordinate with the upper layers of the cloud
We use a bottom-up approach, and first attack individual research challenges associated with HPC cloud networks
• High Network Utilization and Better Load-Balancing
  • Weighted fat-tree routing algorithm (wFatTree)
• Multi-tenancy and Network Isolation
  • Partition-aware fat-tree routing (pFTree)
• Fast Network Reconfiguration
  • SlimUpdate routing algorithm (SlimUpdate)
  • Metabase-aided reconfiguration method
• Efficient Virtualization
  • Routing for virtualized subnets
We use OFED, the de facto standard software stack for IB, and the fat-tree topology for our prototypes
Challenge 1: Efficient Network Utilization
[1] A Weighted Fat-Tree Routing Algorithm for Efficient Load-Balancing in InfiniBand Enterprise Clusters. Zahid, Feroz et al., PDP, 2015.
The wFatTree routing algorithm considers node traffic characteristics to balance load across the network links more efficiently
Figure: de facto fat-tree routing compared with wFatTree routing (node weights of 100 shown)
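To make the idea concrete, here is a minimal sketch that contrasts count-based and weight-based uplink selection at a single leaf switch. It is not the wFatTree implementation from [1]: the function `assign_uplinks`, the node names, and the weights are hypothetical, and a real fat-tree routing engine balances load across the whole topology rather than at one switch.

```python
def assign_uplinks(dest_weights, num_uplinks, use_weights=True):
    """Greedily map each destination to the least-loaded uplink.

    dest_weights: destination name -> expected receive-traffic weight
                  (illustrative values, not taken from the slides).
    use_weights:  if False, every destination counts as 1, which mimics
                  balancing on the number of routes only.
    Returns (assignment, per-uplink load measured in real weights).
    """
    counted = [0] * num_uplinks  # load as seen by the selection rule
    actual = [0] * num_uplinks   # load in real traffic weight
    assignment = {}
    if use_weights:
        # Place heavy destinations first so they end up on different uplinks.
        order = sorted(dest_weights, key=lambda d: -dest_weights[d])
    else:
        order = list(dest_weights)
    for dest in order:
        port = counted.index(min(counted))  # first least-loaded uplink
        assignment[dest] = port
        counted[port] += dest_weights[dest] if use_weights else 1
        actual[port] += dest_weights[dest]
    return assignment, actual


weights = {"A": 100, "B": 1, "C": 100, "D": 1}
_, load_plain = assign_uplinks(weights, 2, use_weights=False)
_, load_weighted = assign_uplinks(weights, 2, use_weights=True)
print(load_plain)     # [200, 2]
print(load_weighted)  # [101, 101]
```

Both variants place two destinations on each uplink, but only the weighted one keeps the two heavy receivers apart.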
Figure panels: 18, 27, and 36 switches with rcv nodes
Challenge 2: Tenant Performance Isolation
[2] Partition-aware Routing to Improve Network Isolation in Multi-tenant Clusters. Zahid, Feroz et al., CCGrid, 2015.
Traditional fat-tree routing in multi-tenant clusters suffers from degraded load balancing and provides no isolation between partitions
Figure panels: Degraded Load Balancing; No Isolation Between Partitions
The pFTree routing algorithm isolates partitions in a multi-tenant cluster without compromising load balancing
Figure panels: Non-oversubscribed Topology; Oversubscribed Topology
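The difference between partition-oblivious and partition-aware assignment can be sketched at a single leaf switch. This is a simplified illustration, not the pFTree algorithm from [2]: the helper names, the node-to-partition mapping, and the rule of reserving whole uplinks per partition are assumptions made for the example.

```python
from collections import defaultdict


def oblivious_assign(dest_partition, num_uplinks):
    """Round-robin destinations over uplinks, ignoring partitions."""
    return {dest: i % num_uplinks for i, dest in enumerate(dest_partition)}


def partition_aware_assign(dest_partition, num_uplinks):
    """Reserve a subset of uplinks per partition, then round-robin inside it.

    Assumes there are at least as many uplinks as partitions.
    """
    partitions = sorted(set(dest_partition.values()))
    if num_uplinks < len(partitions):
        raise ValueError("need at least one uplink per partition")
    reserved = defaultdict(list)
    for uplink in range(num_uplinks):
        reserved[partitions[uplink % len(partitions)]].append(uplink)
    seen = defaultdict(int)
    assignment = {}
    for dest, part in dest_partition.items():
        ports = reserved[part]
        assignment[dest] = ports[seen[part] % len(ports)]
        seen[part] += 1
    return assignment


def shared_uplinks(assignment, dest_partition):
    """Return the uplinks that carry destinations of more than one partition."""
    users = defaultdict(set)
    for dest, port in assignment.items():
        users[port].add(dest_partition[dest])
    return sorted(p for p, parts in users.items() if len(parts) > 1)


nodes = {"n1": "red", "n2": "blue", "n3": "red", "n4": "blue",
         "n5": "blue", "n6": "red", "n7": "blue", "n8": "red"}
plain = oblivious_assign(nodes, 4)
aware = partition_aware_assign(nodes, 4)
print(shared_uplinks(plain, nodes))  # [0, 1, 2, 3]
print(shared_uplinks(aware, nodes))  # []
```

In this toy case both assignments put two destinations on every uplink, so isolation is gained without giving up balance. When a partition outgrows its reserved uplinks, a real algorithm needs a further mechanism that this sketch does not model.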
Challenge 3: Fast Network Reconfiguration
[3] SlimUpdate: Minimal Routing Update for Performance-Based Reconfigurations in Fat-Trees, Zahid, Feroz et al., HiPINEB 2015.
The Minimal Routing Update (MRU) technique aims to preserve the already configured paths in the network on a reconfiguration event
Figure panels: Nodes Shutdown; Link Failure
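A minimal sketch of the MRU idea for the link-failure case, assuming a single leaf switch with four uplinks: a full reroute recomputes every route, while a minimal update moves only the destinations whose uplink disappeared. The function names and the round-robin baseline are hypothetical and do not reproduce the SlimUpdate algorithm from [3].

```python
def full_reroute(dests, uplinks):
    """Recompute every route from scratch, round-robin over the live uplinks."""
    return {d: uplinks[i % len(uplinks)] for i, d in enumerate(dests)}


def minimal_update(old_routes, live_uplinks):
    """Keep every route whose uplink is still alive and move only the rest."""
    live = list(live_uplinks)
    load = {u: 0 for u in live}
    new_routes = {}
    # First pass: preserve unaffected routes and record their load.
    for dest, uplink in old_routes.items():
        if uplink in load:
            new_routes[dest] = uplink
            load[uplink] += 1
    # Second pass: place orphaned destinations on the least-loaded uplink.
    for dest in old_routes:
        if dest not in new_routes:
            target = min(live, key=lambda u: load[u])
            new_routes[dest] = target
            load[target] += 1
    return new_routes


def changed(old, new):
    """Count destinations whose uplink differs between two routings."""
    return sum(1 for d in old if old[d] != new[d])


dests = [f"n{i}" for i in range(8)]
old = full_reroute(dests, [0, 1, 2, 3])
after_full = full_reroute(dests, [0, 1, 3])  # uplink 2 has failed
after_mru = minimal_update(old, [0, 1, 3])
print(changed(old, after_full), changed(old, after_mru))  # 6 2
```

In this toy case both results load the surviving uplinks 3, 3, and 2, but the minimal update touches two routes instead of six.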
The SlimUpdate routing algorithm uses the MRU technique and saves up to 80% of path updates
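| Name | # Nodes | Topology |
|------|---------|----------|
| A | 16 | 4-ary-2-tree |
| B | 32 | 4-ary-2-tree (oversubscribed) |
| C | 64 | 4-ary-3-tree |
| D | 128 | 4-ary-3-tree (oversubscribed) |
| E | 64 | 8-ary-2-tree |
| F | 128 | 8-ary-2-tree (oversubscribed) |
| G | 256 | 16-ary-2-tree |
| H | 512 | 16-ary-2-tree (oversubscribed) |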
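The node counts in the table follow from the k-ary-n-tree definition, which has k^n end nodes and n·k^(n-1) switches. The short check below reproduces them. The 2:1 oversubscription factor is inferred from the node counts in the table and is not stated on the slides, and the function name is hypothetical.

```python
def kary_ntree_size(k, n, oversubscription=1):
    """Return (end_nodes, switches) for a k-ary-n-tree.

    `oversubscription` multiplies the number of end nodes attached to the
    leaf switches; 2 is an assumption that matches the table, not a value
    given in the source.
    """
    return k ** n * oversubscription, n * k ** (n - 1)


# (name, k, n, oversubscription factor, node count listed in the table)
topologies = [
    ("A", 4, 2, 1, 16), ("B", 4, 2, 2, 32),
    ("C", 4, 3, 1, 64), ("D", 4, 3, 2, 128),
    ("E", 8, 2, 1, 64), ("F", 8, 2, 2, 128),
    ("G", 16, 2, 1, 256), ("H", 16, 2, 2, 512),
]
for name, k, n, factor, listed in topologies:
    nodes, switches = kary_ntree_size(k, n, factor)
    print(name, nodes, switches, nodes == listed)
```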
[4] Compact Network Reconfiguration in Fat-Trees, Zahid, Feroz et al., The Journal of Supercomputing, 2016.
In the metabase-aided reconfiguration method, routing is divided into two distinct phases: calculation of paths, and assignment of paths to the actual destinations
Figure panels: Phase I: Calculation of Paths; Phase II: Assignment of Paths
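The two-phase split can be sketched as follows, assuming a two-level tree in which a path towards a leaf switch is identified by the spine it crosses. The class `MetabaseRouter`, its dictionary-based metabase, and the heaviest-first assignment rule are hypothetical. The slides do not say what the metabase in [4] stores, so this shows only why a reconfiguration can skip Phase I.

```python
class MetabaseRouter:
    """Toy two-phase router: paths are computed once, then reassigned."""

    def __init__(self, leaf_switches, spines):
        self.leaf_switches = leaf_switches
        self.spines = spines
        self.metabase = None
        self.path_calculations = 0  # how often Phase I has run

    def calculate_paths(self):
        """Phase I: enumerate candidate paths towards each leaf switch."""
        self.path_calculations += 1
        self.metabase = {leaf: [(spine, leaf) for spine in self.spines]
                         for leaf in self.leaf_switches}

    def assign_paths(self, node_to_leaf, weights):
        """Phase II: hand stored paths to destinations, heaviest first."""
        if self.metabase is None:
            self.calculate_paths()
        used = {leaf: 0 for leaf in self.leaf_switches}
        routes = {}
        for node in sorted(node_to_leaf, key=lambda n: -weights[n]):
            leaf = node_to_leaf[node]
            paths = self.metabase[leaf]
            routes[node] = paths[used[leaf] % len(paths)]
            used[leaf] += 1
        return routes


router = MetabaseRouter(["L0", "L1"], ["S0", "S1"])
placement = {"a": "L0", "b": "L0", "c": "L1", "d": "L1"}
first = router.assign_paths(placement, {"a": 1, "b": 1, "c": 1, "d": 1})
# Performance-based reconfiguration: node b becomes a heavy receiver.
second = router.assign_paths(placement, {"a": 1, "b": 5, "c": 1, "d": 1})
print(first["b"], second["b"])   # ('S1', 'L0') ('S0', 'L0')
print(router.path_calculations)  # 1
```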
Metabase-aided routing substantially reduces network reconfiguration time on performance-based reconfigurations
Figure panels: Non-oversubscribed Topologies; Oversubscribed Topologies
Challenge 4: Efficient Virtualization
[5] Towards InfiniBand SR-IOV vSwitch Architecture, Tasoulas, Evangelos et al., IEEE Cluster, 2015.
The vSwitch architecture has an advantage over the shared-port architecture: it allows configuring routes for individual VMs in the subnet, at the cost of bloating the LID space. Hybrid models can save LIDs
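A back-of-the-envelope calculation shows why the LID space matters. An InfiniBand subnet has 49151 unicast LIDs (0x0001 to 0xBFFF). The sketch assumes a vSwitch variant in which the physical function and every virtual function each hold their own LID, and it ignores the LIDs consumed by switches. The function names are hypothetical.

```python
UNICAST_LIDS = 0xBFFF  # unicast LIDs run from 0x0001 to 0xBFFF (49151 values)


def lids_shared_port(hypervisors):
    """Shared port: the VMs on a hypervisor share the LID of the physical port."""
    return hypervisors


def lids_vswitch_prepopulated(hypervisors, vfs_per_hypervisor):
    """vSwitch (assumed variant): one LID for the PF plus one per VF."""
    return hypervisors * (1 + vfs_per_hypervisor)


def max_hypervisors(vfs_per_hypervisor):
    """Largest hypervisor count that fits, ignoring LIDs used by switches."""
    return UNICAST_LIDS // (1 + vfs_per_hypervisor)


print(lids_shared_port(1000))               # 1000
print(lids_vswitch_prepopulated(1000, 16))  # 17000
print(max_hypervisors(16))                  # 2891
```

Under these assumptions, 16 virtual functions per hypervisor cap the subnet at 2891 hypervisors, which is the kind of pressure that motivates saving LIDs.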
[6] Towards Efficient Virtualization in HPC Environments. Tasoulas, Evangelos, Zahid, Feroz et al., Submitted to an International Journal.
The vSwitchFatTree routing considers VMs in the subnet
[7] Efficient Network Isolation and Load-balancing in Multi-tenant HPC Cluster, Zahid, Feroz et al., Future Generation Computer Systems, 2016.
Weighted pFTree routing (pFTree-Wt) can substantially reduce contention in a partitioned subnet
Big Picture: Enable smart network provisioning for the HPC clouds – combine individual contributions
Figure labels: Weighted Routing; Balanced Traffic; Better Routes; Optimized Algorithms; Partition-aware Routing; Multi-tenancy; Adjust for Load/Faults; Dynamic Optimizations; Monitor->Optimize->Execute Loop
Big Picture: A Self-Adaptive Network Architecture
Thanks for your attention!
State-of-the-art network architecture with static configurations
A self-adaptive network architecture enabling dynamic HPC clouds
In summary, a self-adaptive network architecture can enable HPC clouds to fully utilize the underlying interconnection network