Introduction to Ip Address Management PDF Amazon | Ip Address
Update on Scalable SA Project - OpenFabrics AllianceFuture Development Phases 1. IP address and name...
Transcript of Update on Scalable SA Project - OpenFabrics AllianceFuture Development Phases 1. IP address and name...
![Page 1: Update on Scalable SA Project - OpenFabrics AllianceFuture Development Phases 1. IP address and name resolution 1. Collect up SSA tree 2. Redistribute](https://reader033.fdocuments.in/reader033/viewer/2022052023/6038fa42ecf2b977df7c66fd/html5/thumbnails/1.jpg)
Update on
Scalable SA Project
#OFADevWorkshop
Hal Rosenstock
Mellanox Technologies
![Page 2: Update on Scalable SA Project - OpenFabrics AllianceFuture Development Phases 1. IP address and name resolution 1. Collect up SSA tree 2. Redistribute](https://reader033.fdocuments.in/reader033/viewer/2022052023/6038fa42ecf2b977df7c66fd/html5/thumbnails/2.jpg)
The Problem And The Solution
• SA queried for every connection
• Communication between all nodes creates an n2
load on the SA • In InfiniBand architecture (IBA), SA is a centralized entity
• Other n2 scalability issues – Name to address (DNS)
• Mainly solved by a hosts file
– IP address translation • Relies on ARPs
• Solution: Scalable SA (SSA) – Turns a centralized problem into a distributed one
March 30 – April 2, 2014 #OFADevWorkshop 2
n^2 SA load
![Page 3: Update on Scalable SA Project - OpenFabrics AllianceFuture Development Phases 1. IP address and name resolution 1. Collect up SSA tree 2. Redistribute](https://reader033.fdocuments.in/reader033/viewer/2022052023/6038fa42ecf2b977df7c66fd/html5/thumbnails/3.jpg)
Analysis
March 30 – April 2, 2014 #OFADevWorkshop 3
SM SA
500 MB
1.6 billion
path records
40,000 nodes
50k queries
per second
~ 9 hours
~ 1.5 hours
calculation
![Page 4: Update on Scalable SA Project - OpenFabrics AllianceFuture Development Phases 1. IP address and name resolution 1. Collect up SSA tree 2. Redistribute](https://reader033.fdocuments.in/reader033/viewer/2022052023/6038fa42ecf2b977df7c66fd/html5/thumbnails/4.jpg)
SSA Architecture
March 30 – April 2, 2014 #OFADevWorkshop 4
Localized caching
Data Processing
Database replication
Management Core
Distribution
Access
Client Client
Access
Client Client
Distribution
Access
Client
![Page 5: Update on Scalable SA Project - OpenFabrics AllianceFuture Development Phases 1. IP address and name resolution 1. Collect up SSA tree 2. Redistribute](https://reader033.fdocuments.in/reader033/viewer/2022052023/6038fa42ecf2b977df7c66fd/html5/thumbnails/5.jpg)
Distribution Tree
• Built with rsockets AF_IB support
• Parent selected based on “nearness” based on
hops as well as balancing based on fanouts
March 30 – April 2, 2014 #OFADevWorkshop 5
![Page 6: Update on Scalable SA Project - OpenFabrics AllianceFuture Development Phases 1. IP address and name resolution 1. Collect up SSA tree 2. Redistribute](https://reader033.fdocuments.in/reader033/viewer/2022052023/6038fa42ecf2b977df7c66fd/html5/thumbnails/6.jpg)
rsockets AF_IB rsend/rrecv
performance
• On “luna” class machines as sender and receiver
with 4x QDR links and 1 intervening switch
– 8 core Intel(R) Xeon(R) CPU E5405 @ 2.00GHz
• Default rsocket tuning parameters
• No CPU utilization measurements yet
• SMDB: ~0.5 GB (for 40K nodes)
March 30 – April 2, 2014 #OFADevWorkshop 6
Data Transfer Size in Bytes Elapsed Time
0.5 GB 0.669 seconds
1.0 GB 1.342 seconds
![Page 7: Update on Scalable SA Project - OpenFabrics AllianceFuture Development Phases 1. IP address and name resolution 1. Collect up SSA tree 2. Redistribute](https://reader033.fdocuments.in/reader033/viewer/2022052023/6038fa42ecf2b977df7c66fd/html5/thumbnails/7.jpg)
Distribution Tree
• Number of management nodes needed is
dependent on subnet size and node capability
(CPU speed, memory)
– Combined nodes
• Fanouts in distribution tree for 40K compute
nodes
– 10 distribution per core
– 20 access per distribution
– 200 consumer per access
March 30 – April 2, 2014 #OFADevWorkshop 7
![Page 8: Update on Scalable SA Project - OpenFabrics AllianceFuture Development Phases 1. IP address and name resolution 1. Collect up SSA tree 2. Redistribute](https://reader033.fdocuments.in/reader033/viewer/2022052023/6038fa42ecf2b977df7c66fd/html5/thumbnails/8.jpg)
Core Layer
March 30 – April 2, 2014 #OFADevWorkshop 8
SM SM’
Nodes join SSA tree
Core found at SM LID
raw SM DB SSA DB
extraction and comparison
Manage SSA group - distribution control - monitoring - rebalancing
![Page 9: Update on Scalable SA Project - OpenFabrics AllianceFuture Development Phases 1. IP address and name resolution 1. Collect up SSA tree 2. Redistribute](https://reader033.fdocuments.in/reader033/viewer/2022052023/6038fa42ecf2b977df7c66fd/html5/thumbnails/9.jpg)
Core Performance
• Initial subnet up for ~20K nodes fabric
– Extraction: 0.228 sec
– Comparison: 0.599 sec
• SUBNET UP after no change in fabric
– Extraction: 0.152 sec
– Comparison: 0.100 sec
• SUBNET UP after single switch unlink and relink
– Extraction: 0.190 sec
– Comparison: 0.865 sec
• Measurements above on Intel(R) Xeon(R) CPU E5335 @ 2.00GHz
8 cores & 16G RAM
March 30 – April 2, 2014 #OFADevWorkshop 9
![Page 10: Update on Scalable SA Project - OpenFabrics AllianceFuture Development Phases 1. IP address and name resolution 1. Collect up SSA tree 2. Redistribute](https://reader033.fdocuments.in/reader033/viewer/2022052023/6038fa42ecf2b977df7c66fd/html5/thumbnails/10.jpg)
Distribution Layer
March 30 – April 2, 2014 #OFADevWorkshop 10
SM SM’
Transaction log - incremental updates - lockless
Data agnostic Distributes SSA DB
- relational data model - data versioning (epoch value)
![Page 11: Update on Scalable SA Project - OpenFabrics AllianceFuture Development Phases 1. IP address and name resolution 1. Collect up SSA tree 2. Redistribute](https://reader033.fdocuments.in/reader033/viewer/2022052023/6038fa42ecf2b977df7c66fd/html5/thumbnails/11.jpg)
Access Layer
March 30 – April 2, 2014 #OFADevWorkshop 11
SM SM’ Epoch value - lightweight notification - minimal job impact
Data aware
Formats data - select SA queries - higher-level queries
![Page 12: Update on Scalable SA Project - OpenFabrics AllianceFuture Development Phases 1. IP address and name resolution 1. Collect up SSA tree 2. Redistribute](https://reader033.fdocuments.in/reader033/viewer/2022052023/6038fa42ecf2b977df7c66fd/html5/thumbnails/12.jpg)
Access Layer Notes
• Calculates SMDB into PRDB on per consumer
basis
– Multicore/CPU computation
• Only updates epoch if PRDB for that consumer
has changed
March 30 – April 2, 2014 #OFADevWorkshop 12
![Page 13: Update on Scalable SA Project - OpenFabrics AllianceFuture Development Phases 1. IP address and name resolution 1. Collect up SSA tree 2. Redistribute](https://reader033.fdocuments.in/reader033/viewer/2022052023/6038fa42ecf2b977df7c66fd/html5/thumbnails/13.jpg)
Access Layer Measurements/Future
Improvement(s)
• Half world (HW) PR calculations for 10K node
simulated subnet
• Using GUID buckets/core approach, parallelizing
HW PR calculation works ~16 times faster on 16
core CPU – Single threaded takes 8 min 30 sec for all nodes
– Multi threaded (thread per core) takes 33 seconds
– Parallelization will be less than linear with CPU cores
• Future Improvement(s) – One HW path record per leaf switch used for all the hosts that
are attached to the same leaf switch
March 30 – April 2, 2014 #OFADevWorkshop 13
![Page 14: Update on Scalable SA Project - OpenFabrics AllianceFuture Development Phases 1. IP address and name resolution 1. Collect up SSA tree 2. Redistribute](https://reader033.fdocuments.in/reader033/viewer/2022052023/6038fa42ecf2b977df7c66fd/html5/thumbnails/14.jpg)
Compute Nodes
(Consumer/ACM)
March 30 – April 2, 2014 #OFADevWorkshop 14
SM SM’
Localized cache - compares epoch - pull updates
Integrated with IB ACM - via librdmacm
Publish local data - hostname - IP addresses
![Page 15: Update on Scalable SA Project - OpenFabrics AllianceFuture Development Phases 1. IP address and name resolution 1. Collect up SSA tree 2. Redistribute](https://reader033.fdocuments.in/reader033/viewer/2022052023/6038fa42ecf2b977df7c66fd/html5/thumbnails/15.jpg)
ACM Notes
• ACM pulls PRDB at daemon startup and when
application is resolving routes/paths
– Minimize OS jitter during running job
• ACM is moving to plugin architecture
– ACM version 1 (multicast backend)
– SSA backend
• Other ACM improvements being pursued
– More efficient cache structure
– Single underlying PathRecord cache ?
March 30 – April 2, 2014 #OFADevWorkshop 15
![Page 16: Update on Scalable SA Project - OpenFabrics AllianceFuture Development Phases 1. IP address and name resolution 1. Collect up SSA tree 2. Redistribute](https://reader033.fdocuments.in/reader033/viewer/2022052023/6038fa42ecf2b977df7c66fd/html5/thumbnails/16.jpg)
Combined Node/Layer Support
• Core and access
• Distribution and access
March 30 – April 2, 2014 #OFADevWorkshop 16
![Page 17: Update on Scalable SA Project - OpenFabrics AllianceFuture Development Phases 1. IP address and name resolution 1. Collect up SSA tree 2. Redistribute](https://reader033.fdocuments.in/reader033/viewer/2022052023/6038fa42ecf2b977df7c66fd/html5/thumbnails/17.jpg)
Reliability
March 30 – April 2, 2014 #OFADevWorkshop 17
SM SM’
Local databases - log files for consistency
Primary and backup parents
Error reporting - parent notifies core of error
![Page 18: Update on Scalable SA Project - OpenFabrics AllianceFuture Development Phases 1. IP address and name resolution 1. Collect up SSA tree 2. Redistribute](https://reader033.fdocuments.in/reader033/viewer/2022052023/6038fa42ecf2b977df7c66fd/html5/thumbnails/18.jpg)
System Requirements
• AF_IB capable kernel
– 3.11 and beyond
• librdmacm with AF_IB and keepalive support
– Beyond 1.0.18 release
• libibverbs
• libibumad
– Beyond 1.3.9 release
• OpenSM
– 3.3.17 release or beyond
March 30 – April 2, 2014 #OFADevWorkshop 18
![Page 19: Update on Scalable SA Project - OpenFabrics AllianceFuture Development Phases 1. IP address and name resolution 1. Collect up SSA tree 2. Redistribute](https://reader033.fdocuments.in/reader033/viewer/2022052023/6038fa42ecf2b977df7c66fd/html5/thumbnails/19.jpg)
OpenMPI
• RDMA CM AF_IB connector contributed to
master branch recently
– Thanks to Vasily Filipov @ Mellanox
– Need to work out release details
• Not in 1.7 or 1.6 releases
March 30 – April 2, 2014 #OFADevWorkshop 19
![Page 20: Update on Scalable SA Project - OpenFabrics AllianceFuture Development Phases 1. IP address and name resolution 1. Collect up SSA tree 2. Redistribute](https://reader033.fdocuments.in/reader033/viewer/2022052023/6038fa42ecf2b977df7c66fd/html5/thumbnails/20.jpg)
Deployment
March 30 – April 2, 2014 #OFADevWorkshop 20
SM SA
IB ACM
Shipped by distros
IB SSA
Core package
IB SSA
Distribution
package
Mgmt Nodes
Compute Nodes
![Page 21: Update on Scalable SA Project - OpenFabrics AllianceFuture Development Phases 1. IP address and name resolution 1. Collect up SSA tree 2. Redistribute](https://reader033.fdocuments.in/reader033/viewer/2022052023/6038fa42ecf2b977df7c66fd/html5/thumbnails/21.jpg)
Project Team
• Hal Rosenstock (Mellanox) - Maintainer
• Sean Hefty (Intel)
• Ira Weiny (Intel)
• Susan Coulter (LANL)
• Ilya Nelkenbaum (Mellanox)
• Sasha Kotchubievsky (Mellanox)
• Lenny Verkhovsky (Mellanox)
• Eitan Zahavi (Mellanox)
• Vladimir Koushnir (Mellanox)
March 30 – April 2, 2014 #OFADevWorkshop 21
![Page 22: Update on Scalable SA Project - OpenFabrics AllianceFuture Development Phases 1. IP address and name resolution 1. Collect up SSA tree 2. Redistribute](https://reader033.fdocuments.in/reader033/viewer/2022052023/6038fa42ecf2b977df7c66fd/html5/thumbnails/22.jpg)
Development
• Mostly by Mellanox
– Review by rest of project team
• Verification/regression effort as well
March 30 – April 2, 2014 #OFADevWorkshop 22
![Page 23: Update on Scalable SA Project - OpenFabrics AllianceFuture Development Phases 1. IP address and name resolution 1. Collect up SSA tree 2. Redistribute](https://reader033.fdocuments.in/reader033/viewer/2022052023/6038fa42ecf2b977df7c66fd/html5/thumbnails/23.jpg)
Initial Release
• Path Record Support
• Limitations (Not Part of Initial Release)
– QoS routing and policy
– Virtualization (alias GUIDs)
• Preview – June
• Release - December
March 30 – April 2, 2014 #OFADevWorkshop 23
![Page 24: Update on Scalable SA Project - OpenFabrics AllianceFuture Development Phases 1. IP address and name resolution 1. Collect up SSA tree 2. Redistribute](https://reader033.fdocuments.in/reader033/viewer/2022052023/6038fa42ecf2b977df7c66fd/html5/thumbnails/24.jpg)
Future Development Phases
1. IP address and name resolution
1. Collect <IP address/name, port> up SSA tree
2. Redistribute mappings
3. Resolve path records directly from IP
address/names
2. Event collection and reporting
1. Performance monitoring
March 30 – April 2, 2014 #OFADevWorkshop 24
![Page 25: Update on Scalable SA Project - OpenFabrics AllianceFuture Development Phases 1. IP address and name resolution 1. Collect up SSA tree 2. Redistribute](https://reader033.fdocuments.in/reader033/viewer/2022052023/6038fa42ecf2b977df7c66fd/html5/thumbnails/25.jpg)
Summary
• A scalable, distributed SA
• Works with existing apps with minor modification
• Fault tolerant
• Please contact us if
interested in deploying this!
March 30 – April 2, 2014 #OFADevWorkshop 25
![Page 26: Update on Scalable SA Project - OpenFabrics AllianceFuture Development Phases 1. IP address and name resolution 1. Collect up SSA tree 2. Redistribute](https://reader033.fdocuments.in/reader033/viewer/2022052023/6038fa42ecf2b977df7c66fd/html5/thumbnails/26.jpg)
#OFADevWorkshop
Thank You