Approaches to Designing a High-Performance Switch Router
Dr. Vishal Sharma, Principal Consultant
Metanoia, Inc.
Phone: +1 408-955-0910
Email: [email protected]
Web: http://www.metanoia-inc.com
Metanoia, Inc. Critical Systems Thinking™
© Copyright 2002, All Rights Reserved
Classification of Switch Architectures
1st gen.: shared-bus based
  Bus-based with central memory, centralized processing
2nd gen.: advanced shared-bus based
  Bus-based with local memory, distributed processing
3rd gen.: interconnection fabric with multiple parallel paths
  Crossbar or cross-point switch, rings, …
4th gen.: distributed switch
  Interconnect smaller, ASIC-based 1st, 2nd, or 3rd generation switches in a regular topology
  Centralized, high-perf. switch core, with distributed line cards
Switch Architectures: Shared-bus with Central Memory
[Figure: line cards LC1 ... LCN (labeled R, each with DMA) share a backplane bus with a central CPU and memory; arrows numbered 1-4 mark the bus crossings of a packet.]
Without DMA, a packet crosses the bus 4 times (2 times with DMA).
Switch Architectures: Shared-bus with Central Memory
Blocking if bus bandwidth or CPU processing < 4·N·R (2·N·R with DMA)
Delay: a function of memory I/O speed and CPU processing
Throughput: upper-bounded by min(bus speed, CPU power)
  Most commercial Ethernet switching platforms: 1-2 Gb/s backplane
  The most expensive backplanes today could yield up to 20 Gb/s
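The blocking condition above can be checked numerically. The sketch below is illustrative only: it assumes N is the number of line cards and R the per-card line rate (as the figure labels suggest), and the function names and the 8-card example are hypothetical, not taken from the slides.

```python
def min_bus_bandwidth(n_cards: int, rate_bps: int, with_dma: bool) -> int:
    """Bandwidth below which a central-memory shared-bus switch blocks.

    Uses the slide's rule: 4*N*R without DMA, 2*N*R with DMA.
    """
    crossings = 2 if with_dma else 4
    return crossings * n_cards * rate_bps


def is_blocking(bus_bps: int, cpu_bps: int, n_cards: int,
                rate_bps: int, with_dma: bool) -> bool:
    """True if either the bus or the CPU falls short of the required rate."""
    need = min_bus_bandwidth(n_cards, rate_bps, with_dma)
    return bus_bps < need or cpu_bps < need


# Hypothetical example: 8 line cards at 100 Mb/s each.
N, R = 8, 100_000_000
print(min_bus_bandwidth(N, R, with_dma=False))  # 3200000000 (3.2 Gb/s)
print(min_bus_bandwidth(N, R, with_dma=True))   # 1600000000 (1.6 Gb/s)
```

Under these assumed numbers, a 1 Gb/s backplane would block with or without DMA, while a 2 Gb/s backplane would suffice only with DMA.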
Switch Architectures: Shared-bus with Central Memory
Example: Cisco Catalyst 2820 Ethernet Switch (also 1900 family)
  24 10BaseT and 2 100BaseT full-duplex ports (on 2820)
  440 Mbps x 2 = 880 Mbps min. bus throughput required
  Bus bandwidth: 1 Gb/s
  CPU: Intel 486 with 1 MB of flash
  Central memory: 3 MB of RAM
Observations:
  10 Mbps ports require 20 Kpps/port for 64B packets
    Available: 14.8 Kpps per port
  Require: 880 Kpps aggregate forwarding perf. (Ethernet + Fast Eth.)
    Available: 450 Kpps
  Performance is CPU limited (not bus bandwidth limited)
  Latency: ~70 us
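The arithmetic behind these figures can be reproduced as follows. This is a rough sketch that counts only the 64 bytes of each packet (no preamble or inter-frame gap), which is an assumption about how the slide's numbers were derived; the variable names are illustrative.

```python
FRAME_BITS = 64 * 8  # 64-byte packets, framing overhead ignored (assumption)


def pps(rate_bps: int, frame_bits: int = FRAME_BITS) -> float:
    """Packets per second needed to fill a link with fixed-size packets."""
    return rate_bps / frame_bits


# (port count, port rate in bit/s) for the Catalyst 2820 example
ports = [(24, 10_000_000), (2, 100_000_000)]

aggregate_bps = sum(n * r for n, r in ports)       # 440 Mb/s
bus_required_bps = 2 * aggregate_bps               # 880 Mb/s, as on the slide
per_port_pps = pps(10_000_000)                     # 19531.25, about 20 Kpps
aggregate_pps = sum(n * pps(r) for n, r in ports)  # 859375.0

bus_ok = bus_required_bps <= 1_000_000_000         # 1 Gb/s bus is sufficient
cpu_ok = aggregate_pps <= 450_000                  # 450 Kpps CPU is not

print(per_port_pps, aggregate_pps, bus_ok, cpu_ok)
```

Rounding the per-port figure up to 20 Kpps (and 200 Kpps for the 100 Mbps ports) gives 24 x 20 + 2 x 200 = 880 Kpps, which appears to be how the slide arrives at its aggregate requirement. Either way, the bus check passes and the CPU check fails, consistent with the slide's conclusion that the switch is CPU limited.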
Switch Architectures: Shared-bus, Distributed Memory & Processing
[Figure: line cards LC1 ... LCN (labeled R), each with routing/lookup, buffering, and DMA, connected over the backplane to a CPU and memory providing the full routing function. Path 1 is the fast path; path 2 is the slow path.]
Switch Architectures: Shared-bus with Distributed Memory
Blocking if bus bandwidth or CPU processing < 2·N·R (N·R with DMA)
Delay: a function of memory I/O speed and CPU processing
Packet forwarding via dedicated engines, one per line card (LC)
  Allows line rate forwarding, even with small packets
  Enables design parameter adjustment based on LC type
Throughput: upper-bounded by min(bus speed, forwarding engine)
Switch Architectures: Shared-bus with Distributed Memory
Example: 3Com CoreBuilder 5000 Switching System
  17 slots/chassis, 24 10BaseT's/slot or 4 100BaseT's/slot (or port)
  17x24x10 = 4.08 Gb/s minimum bus throughput required!
  Bus bandwidth: 2 Gb/s max.
    3.9 Mpps @ 64B/packet
  CPU + 18MB DRAM: for address learning, fragmentation, SPT algorithm
  Packet switching: custom ASIC + 4MB DRAM per slot: for forwarding, filtering
Observations:
  Require: 480 Kpps/slot (Eth.) or 800 Kpps/slot (Fast Eth.)
    Available: 650 Kpps per switching ASIC
  Performance here is bus-bandwidth limited (not forwarding limited)
  Latency: ~45-100 us
  Jitter: ~5 us
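One way to see why the bus, rather than the per-slot ASIC, is the limit here is to divide the bus packet rate across the slots. The sketch below assumes all 17 slots are active and share the bus equally, and counts only the 64 bytes of each packet; both are simplifying assumptions, not statements from the slides.

```python
FRAME_BITS = 64 * 8  # 64-byte packets, framing overhead ignored (assumption)

slots, ports_per_slot, port_bps = 17, 24, 10_000_000
bus_bps = 2_000_000_000
asic_pps = 650_000  # per-slot switching ASIC, from the slide

offered_bps = slots * ports_per_slot * port_bps  # 4.08 Gb/s
bus_pps = bus_bps / FRAME_BITS                   # 3906250.0, about 3.9 Mpps
bus_pps_per_slot = bus_pps / slots               # equal share per slot (assumption)

oversubscription = offered_bps / bus_bps         # about 2.04x
bus_limited = bus_pps_per_slot < asic_pps

print(offered_bps, bus_pps, round(bus_pps_per_slot), bus_limited)
```

With an equal share, each slot gets roughly 230 Kpps of bus capacity, well under the 650 Kpps its ASIC can forward, so the shared bus saturates first.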
Switch Architectures: Inter-connect Fabric with Multiple Parallel Paths
[Figure: line cards 1 ... N (each with MAC, interface, forwarding, and local memory) and a CPU with memory providing the full routing function, joined by a switch interconnect; backplane and midplane arrangements are shown.]
Switch Architectures: Inter-connect Fabric with Multiple Parallel Paths
Non-blocking (for unicast) if crossbar or shared memory with adequate bandwidth (2NR)
Delay: 10s of us (in an unloaded system)
Throughput: full line rate, subject to queueing discipline
  Provided LC processing & interconnect scheduling keep up
  Note that this is not always the case!
Applicability: state of the art for many current switches/routers
  Cisco GSR 12000 family (high-end, core router 98-99), Ascend GRF (mid-end router, 96-97), Cisco Catalyst 8500 (low-end, enterprise router 97-98)
Switch Architectures: Distributed Switch
Interconnect smaller switches, each with the architecture of a 1st, 2nd, or 3rd generation switch.
The smaller switches are usually ASIC based
Connected in a specific topology, such as a hypercube or mesh (more on this ahead)
[Figure: 1st, 2nd, or 3rd gen. switches joined by a distributed interconnect, with a route processor with memory (RP, Mem).]
Switch Architectures: Distributed Switch
[Figure: two groups of line cards 1 ... N attached to a central switch core with a route processor and memory (RP, Mem), over electrical or optical connections.]
Centralized, high-performance switch core, with distributed line cards
Switch core and line cards may be in different chassis
Interconnect composed of optical or electronic links
Functional Map of Processing in a Typical IP Router
[Figure: functional blocks of a line card. Labels: physical layer (O/E, E/O), input framing, output framing, packet processing (lookup engine with lookup tables, traffic manager with buffer/state memory), uP (to route processor), fabric I/F (to fabric, from fabric), link scheduler with buffer/state memory.]
A Canonical Realization of the Functional Map
[Figure: component-level realization. Labels: transceivers, input framer, output framer, packet processing (network proc. with co-proc., lookup table and buffer memory in SDRAM/DRAM), traffic managers with buffer/state memory, LCP (to route processor), fabric I/F, switch fabric. Interface labels: PCI, SPI-4, SFI-4, 3.125 Gb/s SERDES.]
Juniper M40 and M160: A Comparison
                               M40                       M160
Throughput                     20 Gb/s                   80 Gb/s
Processing @ 64B packets       40 Mpps (1 pkt. proc.)    160 Mpps (4 pkt. procs.)
Back/mid-plane (full duplex)   25.6 Gb/s                 102.4 Gb/s
Data Slots                     8 (4 ports/slot)          8 (4 ports/slot)
Data Ports (max.)              8 OC-48                   8 OC-192
Power (max.)                   1.7 KW                    3.4 KW
Weight                         280 lb                    370 lb
Size                           Half telco rack           Half telco rack
Dimensions (HxWxD in.)         35x19x23.5                35x19x29
Juniper M-Series System Architecture
[Figure: Routing Engine (CPU-based): JUNOS router OS (routing & signaling protocols, system management) with routing process, user interface, chassis mgmt., interface mgmt., and routing table. Forwarding Engine (ASIC-based): computer-scale, ASIC-based centralized packet processor with forwarding table, packet processing, switch fabric, and line cards; packets in, packets out.]
Juniper M-Series Functional System Operation
[Figure: packet flow, steps 1-10 (with 4a, 4b), from input port to output port across the backplane or midplane. Labels: PIC with controller ASIC, FPC with I/O Manager ASIC, Distributed Buffer Manager ASICs, shared memory (distributed on FPCs), Internet Processor II ASIC with forwarding table. Labeled flows: packets, 64B blocks, notification.]
Juniper M-Series Module Organization
[Figure: control plane and data plane modules. Labeled elements: Routing Engine (JUNOS Internet S/W, 128MB), Misc. Control Subsys., 100 Mb/s Ethernet, PCI; FPCs #1 ... #8, each with uP, I/O Manager, and PIC controllers #1 ... #4; Switching & Forwarding Module (Distributed Buffer Mgr., Internet Proc. II, FT, Packet Director). Link labels: M40 Backplane (51.2 Gb/s), 3.2 Gb/s full duplex; M160 Midplane (204.8 Gb/s), 12.8 Gb/s full duplex.]
Cisco Catalyst 6000 Family: A Comparison
                             6009               6513
Throughput (non-blocking)    32 Gb/s            128 Gb/s
Processing @ 256B packets    15 Mpps            100 Mpps (?)
Back/mid-plane               32 Gb/s (bus)      128 Gb/s (switch), 32 Gb/s (bus)
Data Slots†                  8                  10
Data Ports (max.)            128 GbE††          128 GbE
Power (max.)                 ~1.3 KW            > 2.5 KW
Weight                       ~166 lb            240 lb
Size                         >1/3 telco rack    ~Half telco rack
Dimensions (HxWxD in.)       25.2x17.2x18.1     33.3x17.2x18.1
† Only includes usable data slots
†† This maximum number of ports means an oversubscription of 4x (so not non-blocking!)
Cisco Catalyst Family System Architecture
[Figure: First Generation of Catalyst: Catalyst 6000. Packets enter and leave through line cards attached to a bus. Supervisor Engine labels: network management (management engine), MSFC (routing engine, CPU-based, routing table), PFC (forwarding engine, CPU-based, forwarding table). Control plane and data plane are marked.]
Cisco Catalyst Family Functional System Operation
[Figure: packet flow, steps 1-6. Labels: line cards with controller ASICs (64KB, 448KB; #1 ... #4), data bus (32 Gb/s), results bus, control bus, Supervisor Engine (network management, MSFC, PFC, fabric arbitration).]
Cisco Catalyst 6500 System Architecture
[Figure: Second Generation of Catalyst: Catalyst 6500. Packets enter and leave through line cards attached to both a bus (headers) and a switching fabric (data). Supervisor Engine labels: network management (management engine), MSFC (routing engine, CPU-based, routing table), PFC (forwarding engine, ASIC-based, forwarding table). Control plane and data plane are marked.]
Cisco Catalyst 6500 Functional System Operation
[Figure: Second Generation Catalyst: Catalyst 6500, packet flow, steps 1-9. Labels: line cards with ASIC #1 ... ASIC #4 (512KB) and fabric I/F, switching fabric (link labels 8 Gb/s and 16 Gb/s), data bus (32 Gb/s), results bus, control bus, Supervisor Engine (network mgt., MSFC, PFC, fabric arb.).]
Cisco Catalyst 6500 System Architecture
[Figure: Third Generation of Catalyst: Catalyst 6500+. Packets enter and leave through line cards attached to a switching fabric. Labels: Supervisor Engine with network management (management engine), MSFC routing engine with routing table, and PFC forwarding engine; line card with PFC forwarding engine. Control plane and data plane are marked.]
Cisco Catalyst Family Functional System Operation
[Figure: Third Generation of Catalyst: Catalyst 6500+, packet flow, steps 1-9. Labels: line cards with ASIC #1 ... ASIC #4 (512KB), DFC, and fabric I/F; switching fabric; Supervisor Engine (network mgt., MSFC, fabric arb.).]
Building Very High-Speed Switches from Low-speed Components
Problem: scale this architecture to handle higher link speeds
  Emulate output queueing
  Provide some measure of perf., such as bounded delay
[Figure: N x N switch. Input links 1 ... N feed input queues organized as virtual output queues VOQ1,1 ... VOQN,N; a scheduler controls the switch fabric; output queues OQ1 ... OQN feed output links 1 ... N.]
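To make the virtual output queue (VOQ) structure concrete, the following is a minimal sketch of an N x N input-queued switch that keeps one queue per (input, output) pair and, in each time slot, picks a set of cells in which every input and every output is used at most once. The greedy matching used here is an illustrative stand-in chosen for brevity; it is not the scheduler from the slides or the cited papers, and the class and method names are hypothetical.

```python
from collections import deque


class VOQSwitch:
    """N x N switch with one FIFO per (input, output) pair."""

    def __init__(self, n: int):
        self.n = n
        self.voq = [[deque() for _ in range(n)] for _ in range(n)]

    def enqueue(self, inp: int, out: int, cell) -> None:
        """Store a cell arriving at input `inp` destined for output `out`."""
        self.voq[inp][out].append(cell)

    def schedule(self) -> dict:
        """Greedy maximal matching: returns {input: output} for this slot."""
        used_outputs = set()
        match = {}
        for i in range(self.n):
            for j in range(self.n):
                if j not in used_outputs and self.voq[i][j]:
                    match[i] = j
                    used_outputs.add(j)
                    break
        return match

    def step(self) -> dict:
        """Transfer one cell per matched pair; returns {output: cell}."""
        return {j: self.voq[i][j].popleft()
                for i, j in self.schedule().items()}


demo = VOQSwitch(2)
demo.enqueue(0, 0, "a")  # input 0 -> output 0
demo.enqueue(1, 0, "b")  # input 1 -> output 0 (contends with "a")
demo.enqueue(1, 1, "c")  # input 1 -> output 1
print(demo.step())  # {0: 'a', 1: 'c'}
print(demo.step())  # {0: 'b'}
```

In the first slot, cell "c" is delivered even though "b" arrived earlier at the same input and is blocked by contention for output 0. With a single FIFO per input, "c" would have waited behind "b"; separating queues by output avoids that head-of-line blocking.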
Building Very High-Speed Switches from Low-speed Components
[Figure: inputs 1 ... N feed input demultiplexers D1 ... DN, which spread traffic over parallel switches S1 ... Sk; output multiplexers M1 ... MN feed outputs 1 ... N. A sample path runs from input i through Di to Mj and output j. A global scheduler is shown.]
Mneimneh, Sharma & Siu: operate the parallel switch system under the control of a global scheduler
  Requires
    No speedup in the system
    No reordering at outputs
Iyer, Awadallah & McKeown: operate the parallel switches such that they collectively mimic an OQ switch
  Requires
    Speedup in the system
    Emulation of shadow OQ switch
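The role of an input demultiplexer can be illustrated with a simple round-robin spreader. This is only a sketch of the load-splitting idea: round-robin dispatch is an assumption made here for illustration, it is not the scheduling algorithm of either approach cited above, and by itself it does not guarantee cell ordering at the outputs or emulation of an output-queued switch. The function name is hypothetical.

```python
def round_robin_demux(cells: list, k: int) -> list:
    """Spread the cells arriving at one input across k parallel switches.

    Returns a list of k lists; layer m receives every k-th cell,
    starting with cell m.
    """
    layers = [[] for _ in range(k)]
    for seq, cell in enumerate(cells):
        layers[seq % k].append(cell)
    return layers


# Ten consecutive cells from one input, spread over k = 4 slower switches.
layers = round_robin_demux(list(range(10)), 4)
print(layers)  # [[0, 4, 8], [1, 5, 9], [2, 6], [3, 7]]
```

Each of the k switches sees at most every k-th cell from a given input, which is what allows the individual switches to run slower than the external line rate. Deciding which switch each cell should use so that outputs are not overloaded and cells leave in order is the harder problem the cited work addresses.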
Building Very High-Speed Switches: References
[SAN00] S. Iyer, A. Awadallah, N. McKeown, “Analysis of a packet switch with memories running slower than the line rate,” Proc. IEEE Infocom’00, March 2000.
[Sun00] S. Iyer, “Analysis of a packet switch with memories running slower than the line rate,” MS Thesis, Stanford University, May 2000.
[SuM03] S. Iyer, N. McKeown, “Analysis of the parallel packet switch architecture,” to appear in IEEE/ACM Trans. on Networking, April 2003.
[MSS01] S. Mneimneh, V. Sharma, K. Y. Siu, “On scheduling using parallel input-output queued crossbar switches with no speedup,” Proc. IEEE Workshop on High Performance Switching & Routing (HPSR’01), May 2001.
[MSS02] S. Mneimneh, V. Sharma, K. Y. Siu, “Switching using parallel input-output queued switches with no speedup,” IEEE/ACM Trans. on Networking, vol. 10, no. 5, Oct. 2002.
[Mne02] S. Mneimneh, “Algorithms for high-speed switching and routing,” Ph.D. Thesis, MIT, June 2002.