Innovating the Cloud Network.… · Control plane disruption < 90 seconds Data plane disruption < 1...
Innovating the Cloud Network
Lihua Yuan, Partner Dev Manager
Xin Liu, Principal Product Manager
Microsoft Azure Networking Team
[Diagram: SONiC components: SNMP, BGP, DHCP, IPv6, LLDP, TeamD, SyncD, RedisDB, and more apps; two items are marked "New"]
Welcome
SONiC Keeps Evolving
Data Plane
• L3 VXLAN Support
• Large Table/Deep Buffer Devices
Routing Stack
• Quagga → FRR
• cRPD from Juniper
Telemetry
• gRPC for streaming telemetry
• Dataplane Telemetry (Dtel) extension
• Virtual Path for streaming telemetry
Reliability
• Warm Reboot
• Routing stack graceful restart
RDMA
• PFC Watermark
• Asymmetric PFC
Configuration
• Incremental config
• ConfigDB
Platform Management
• Sensor/Transceiver monitoring
• Dynamic Parameter Tuning
• Platform Enhancement (PMON)
New Platforms
• Juniper PTX
• Broadcom TH3, JR2
• Mellanox Spectrum II
• Facebook Mini-pack
• Marvell 12.8T Falcon and ARM-based switch
• Innovium Teralynx
• And more
System
• Kernel Upgrade
• Component docker upgrade
• Security patches
Warm Boot: A True Community Effort
Fast Boot
Timeline: OS reboot (kexec) → OS boots up → data plane reset → data plane restored
Layers shown: routing, control plane, data plane
Data plane disruption < 30 seconds
Control plane disruption < 90 seconds
Warm Boot
Timeline: OS reboot → SONiC starts → ASIC warm init → state reconciliation via SAI state-driven API → warm reboot finishes
Layers shown: routing, control plane, data plane
Control plane disruption < 90 seconds
Data plane disruption < 1 second
Warm Boot Architecture
1. Warm boot script stores App/ASIC DB on disk
2. Redis restores App/ASIC DB after reboot
3. OA reads App DB and compiles a new ASIC DB
4. SyncD compares old/new ASIC DB and applies the diff to the ASIC
5. Applications wake up in parallel
• May stage changes to App DB
• OA comes in as usual, updates ASIC DB
• SyncD keeps syncing ASIC DB to hardware
[Diagram components: Network Applications, APP DB, Orchestration Agent, ASIC DB, SyncD, SAI, ASIC; Object Library w/ Redis Backend]
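The compare-and-apply idea in step 4 can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the ASIC DB is modeled as a plain Python dict of object key to attributes, and `diff_asic_views` is a hypothetical helper, not SyncD's actual comparison logic (which has to deal with details such as object IDs that this toy model ignores).

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory model of an ASIC DB snapshot:
# object key -> {attribute name: value}.
AsicView = Dict[str, Dict[str, str]]
Op = Tuple[str, str, Dict[str, str]]


def diff_asic_views(old: AsicView, new: AsicView) -> List[Op]:
    """Return the operations that turn the `old` view into the `new` view.

    Only the differences are emitted, so objects that are identical in
    both snapshots are never touched.
    """
    ops: List[Op] = []
    # Objects that exist only in the old view must be removed.
    for key in sorted(old.keys() - new.keys()):
        ops.append(("remove", key, {}))
    # Objects that exist only in the new view must be created.
    for key in sorted(new.keys() - old.keys()):
        ops.append(("create", key, dict(new[key])))
    # Objects present in both: emit only added or changed attributes.
    # (Attributes dropped from the new view are ignored in this sketch.)
    for key in sorted(old.keys() & new.keys()):
        changed = {a: v for a, v in new[key].items() if old[key].get(a) != v}
        if changed:
            ops.append(("set", key, changed))
    return ops


if __name__ == "__main__":
    old_view = {
        "ROUTE:10.0.0.0/24": {"nexthop": "1"},
        "ROUTE:10.0.1.0/24": {"nexthop": "2"},
    }
    new_view = {
        "ROUTE:10.0.0.0/24": {"nexthop": "3"},
        "ROUTE:10.0.2.0/24": {"nexthop": "2"},
    }
    for op in diff_asic_views(old_view, new_view):
        print(op)
```

Applying only the difference is what lets unchanged forwarding state stay in hardware across the reboot, which is consistent with the sub-second data plane disruption target.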
We are not done yet – Control Plane?
Timeline: OS reboot → SONiC starts → ASIC warm init → state reconciliation via SAI state-driven API → warm reboot finishes
Layers shown: routing, control plane, data plane
What about ARP, DHCP, etc.?
Control Plane Assistant (Upcoming)
[Diagram labels: CPU, ASIC, Send up, Assistant CPU, ASIC, tunnel to A]
• ASIC → Assistant:
• ERSPAN mirror
• Assistant → ASIC:
• Assistant encapsulates the payload meant for neighbors
• ASIC decapsulates and forwards
SONiC Support for Disaggregated Chassis
SONiC Is Powering Microsoft At Cloud Scale
[Diagram: tiered data center topology]
Tier 3 – Regional: T3-1, T3-2, T3-3, T3-4
Tier 2 – Data center: T2-1-1, T2-1-2, … T2-1-8; T2-4-1, T2-4-2, … T2-4-4
Tier 1 – Row Leaf: T1-1, T1-2, … T1-7, T1-8 (per row)
Tier 0 – Rack: T0-1, T0-2, … T0-20 (per row)
Servers
Tier 0 and Tier 1 switches are labeled SONiC
Enabling SONiC Beyond Tier 1?
[Diagram: the same four-tier topology (Tier 0 – Rack, Tier 1 – Row Leaf, Tier 2 – Data center, Tier 3 – Regional), with an added "Chassis" label]
Chassis – the challenges
[Diagram: linecards carrying frontend ASICs with Ethernet ports, connected over a backplane to backend ASICs, all enclosed in sheet metal]
- Power efficiency
- Port density
- Low table scale on backend ASICs
- No standard topology/connectivity
- Proprietary ports/packet format
- Proprietary switching/load balancing
SONiC Support for Disaggregated Chassis
1000+ Ports
CLOS Topology with Ethernet ports
VXLAN-based switching
• Each front end chip is a VXLAN Tunnel End Point (VTEP)
• Packets inside the chassis are encapsulated with VXLAN headers
BGP-EVPN as the internal control plane
• One SONiC/BGP instance per ASIC
• Frontend SONiC directly redistributes routes using EVPN
[Diagram labels: SONiC on each ASIC, BGP-EVPN, VTEP, EBGP]
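To make "encapsulated with VXLAN headers" concrete, here is a minimal sketch of the 8-byte VXLAN header defined in RFC 7348: a flags byte with the VNI-valid bit (0x08), 24 reserved bits, a 24-bit VNI, and 8 reserved bits. The function names `vxlan_encap` and `vxlan_decap` are hypothetical, and the outer Ethernet/IP/UDP headers that a real VTEP also adds are left out. This is not SONiC or ASIC code, only an illustration of the header layout.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag from RFC 7348
VXLAN_HEADER_LEN = 8


def vxlan_encap(vni: int, inner_frame: bytes) -> bytes:
    """Prepend a VXLAN header carrying `vni` to an inner Ethernet frame."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 0: flags in the top byte, then 24 reserved bits.
    # Word 1: 24-bit VNI, then 8 reserved bits.
    header = struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)
    return header + inner_frame


def vxlan_decap(packet: bytes) -> tuple:
    """Strip the VXLAN header and return (vni, inner_frame)."""
    if len(packet) < VXLAN_HEADER_LEN:
        raise ValueError("packet shorter than a VXLAN header")
    word0, word1 = struct.unpack("!II", packet[:VXLAN_HEADER_LEN])
    if not (word0 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag not set")
    return word1 >> 8, packet[VXLAN_HEADER_LEN:]


if __name__ == "__main__":
    frame = b"inner ethernet frame"
    packet = vxlan_encap(100, frame)
    print(packet[:VXLAN_HEADER_LEN].hex())
    print(vxlan_decap(packet))
```

In the chassis design described on the slide, the ingress frontend chip would play the role of the encapsulating VTEP and the egress frontend chip would decapsulate, so the Ethernet-connected ASICs in between only need to forward standard packets.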
SONiC Disaggregated Chassis Demo at Booth
Commercial support
More industry adoption
Powering AI/gaming service
Powering bare metal service
Powering data center ToR/Leaf
Open Invitation
Inviting contributions in all areas
• SONiC/SAI
• Hardware platform
• New features, applications and tools
• Download, test, deploy!
Website: https://azure.github.io/SONiC/
Mailing list: [email protected]
Source code: https://github.com/Azure/SONiC/blob/gh-pages/sourcecode.md
Wiki: https://github.com/Azure/SONiC/wiki/