Programmable Switch Hardware
Transcript of Programmable Switch Hardware
![Page 1: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/1.jpg)
Programmable Switch Hardware
ECE/CS598HPN
Radhika Mittal
![Page 2: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/2.jpg)
Conventional SDN
• Programmable control plane.
• Data plane can support high bandwidth.• But has limited flexibility.
• Restricted to conventional packet protocols.
![Page 3: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/3.jpg)
Software Dataplane
• Very extensible and flexible.
• Extensive parallelization to meet performance requirements.• Might still be difficult to achieve 100’s of Gbps.
• Significant cost and power overhead.
![Page 4: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/4.jpg)
Programmable Hardware
• More flexible than conventional switch hardware. • Less flexible than software switches.
• Slightly higher power and cost requirements than conventional switch hardware.• Significantly lower than software switches.
![Page 5: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/5.jpg)
Other alternatives?
Image copied from somewhere on the web.
![Page 6: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/6.jpg)
Forwarding Metamorphosis: Fast Programmable Match-
Action Processing in Hardware for SDN
Pat Bosshart, Glen Gibb, Hun-Seok Kim, George Varghese, Nick McKeown, Martin Izzard,
Fernando Mujica, Mark Horowitz
Acknowledgements: Slides from Pat Bosshart’s SIGCOMM’13 talk
6
![Page 7: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/7.jpg)
Fixed function switch
7
Deparser
In
Queues
Data
OutACLStage
L3Stage
L2Stage
Actio
n:se
tL2D
Stage1
L2Table
L2:128kx48Exactmatch
Actio
n:se
tL2D
,dec
TTL
Stage2
L3Table
L3:16kx32Longestprefixmatch
Actio
n:permit/de
ny
Stage3AC
LTable
ACL:4kTernarymatch
?????????
PBBStage
Parser
X X X X X
![Page 8: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/8.jpg)
What if you need flexibility?
• Flexibility to:• Trade one memory size for another• Add a new table• Add a new header field• Add a different action
• SDN accentuates the need for flexibility• Gives programmatic control to control plane, expects to
be able to use flexibility• OpenFlow designed to exploit flexbility.
8
![Page 9: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/9.jpg)
What about Alternatives?Aren’t there other ways to get flexibility?
• Software? 100x too slow, expensive• NPUs? 10x too slow, expensive• FPGAs? 10x too slow, expensive
9
![Page 10: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/10.jpg)
What the Authors Set Out To Learn• How to design a flexible switch chip?• What does the flexibility cost?
10
![Page 11: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/11.jpg)
RMT Switch ModelEnables flexibility through?
• Programmable parsing: support arbitrary header fields
• Ability to configure number, topology, width, and depths of match-tables.
• Programmable actions: allow a flexible set of actions (including arbitrary packet modifications).
11
![Page 12: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/12.jpg)
What’s Hard about a Flexible Switch Chip?
• Big chip• High frequency• Wiring intensive• Many crossbars• Lots of TCAM• Interaction between physical design and architecture
12
![Page 13: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/13.jpg)
The RMT Abstract Model
• Parse graph• Table graph
13
![Page 14: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/14.jpg)
Arbitrary Fields: The Parse Graph
Ethernet
IPV4 IPV6
TCP UDP
EthernetIPV4TCPPacket:
14
![Page 15: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/15.jpg)
Arbitrary Fields: The Parse Graph
Ethernet
IPV4
TCP UDP
EthernetIPV4TCPPacket:
15
![Page 16: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/16.jpg)
Arbitrary Fields: The Parse Graph
Ethernet
IPV4
TCP UDP
EthernetIPV4RCPTCPPacket:
RCP
16
![Page 17: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/17.jpg)
Arbitrary Fields: Programmable Parser
17
![Page 18: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/18.jpg)
Reconfigurable Match Tables:The Table Graph
18
VLAN
MACFORWARD IPV4-DA
ETHERTYPE
RCPACL
IPV6-DA
![Page 19: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/19.jpg)
Changes to Parse Graph and Table Graph
19
Ethernet
IPV4
TCP UDP
Done
ACL
L2S
L2D
IPV4-DA
ETHERTYPE
TableGraph
VLANVLAN
IPV6-DAIPV6
RCPRCP
ParseGraph
MY-TABLE
![Page 20: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/20.jpg)
But the Parse Graph and Table Graphdon’t show you how to build a switch
20
![Page 21: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/21.jpg)
Match/Action Forwarding Model
ProgrammableParser
DeparserIn
Queues
Actio
n
Stage1
MatchTable
Actio
n
Stage2MatchTable
…
Data
Out
21
Actio
n
StageN
MatchTableMatch
ActionStage
MatchActionStage
MatchActionStage
![Page 22: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/22.jpg)
Performance vs Flexibility• Multiprocessor: memory bottleneck• Change to pipeline• Fixed function chips specialize processors• Flexible switch needs general purpose CPUs
22
Memory
Memory
Memory CPU
CPU
CPU
L2 L3 ACL
![Page 23: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/23.jpg)
TCAM
640b
640b
PhysicalStagen
PhysicalStage2
LogicalTable1
Ethertype
LogicalTable6L2D
8UDP
2VLAN
3IPV4
5IPV6
4L2S
7TCPSRAMHASH
PhysicalStage1
RMT Logical to Physical Table Mapping
ACL
UDPTCP
L2S
L2D
IPV4
ETHVLAN
IPV6
9ACL
TableGraph
23
Actio
n
MatchTable
Actio
n
MatchTable
Actio
n
MatchTable
![Page 24: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/24.jpg)
Detour: CAMs and RAMs
• RAM:• Looks up the value associated with a memory address.
• CAM • Looks up memory address of a given value.• Two types:
• Binary CAM: Exact match (matches on 0 or 1)• Can be implemented using SRAM.
• Ternary CAM (TCAM): Allows wildcard (matches on 0, 1, or X).
![Page 25: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/25.jpg)
Detour: CAMs
Source: https://www.pagiamtzis.com/cam/camintro/
![Page 26: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/26.jpg)
Detour: CAMs
Source: https://www.pagiamtzis.com/cam/camintro/
![Page 27: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/27.jpg)
Detour: CAMs
Source: https://www.pagiamtzis.com/cam/camintro/
![Page 28: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/28.jpg)
TCAM
640b
640b
PhysicalStagen
PhysicalStage2
LogicalTable1
Ethertype
LogicalTable6L2D
8UDP
2VLAN
3IPV4
5IPV6
4L2S
7TCPSRAMHASH
PhysicalStage1
RMT Logical to Physical Table Mapping
ACL
UDPTCP
L2S
L2D
IPV4
ETHVLAN
IPV6
9ACL
TableGraph
28
Actio
n
MatchTable
Actio
n
MatchTable
Actio
n
MatchTable
![Page 29: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/29.jpg)
Instruction
ALU
Matchresult
Action Processing Model
HeaderIn Field
HeaderOut
Field
Data
29
![Page 30: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/30.jpg)
ALU
VLIWInstructionsMatchresult
ModeledasMultipleVLIWCPUsperStage
ALUALUALUALUALUALUALUALU
30
![Page 31: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/31.jpg)
RMT Switch Design
• 64 x 10Gb ports• 960M packets/second• 1GHz pipeline
• Programmable parser• 32 Match/action stages
31
• Huge TCAM:10xcurrentchips• 64KTCAMwordsx640b
• SRAMhash tablesforexactmatches• 128Kwordsx640b
• 224actionprocessorsperstage• AllOpenFlow statisticscounters
![Page 32: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/32.jpg)
Outline• Conventional switch chip are inflexible• SDN demands flexibility…sounds expensive…• How do I do it: The RMT switch model• Flexibility costs less than 15%
32
![Page 33: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/33.jpg)
Cost of Configurability:Comparison with Conventional Switch• Many functions identical: I/O, data buffer, queueing…• Make extra functions optional: statistics• Memory dominates area
• Compare memory area/bit and bit count
• RMT must use memory bits efficiently to compete on cost• Techniques for flexibility
• Match stage unit RAM configurability• Ingress/egress resource sharing• Allows multiple tables per stage• Match memory overhead reduction and multi-word packing
33
![Page 34: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/34.jpg)
Chip Comparison with Fixed Function Switches
Section Area%ofchip ExtraCost
IO, buffer,queue,CPU,etc 37% 0.0%
Matchmemory&logic 54.3% 8.0%
VLIWaction engine 7.4% 5.5%
Parser+deparser 1.3% 0.7%
Totalextra areacost 14.2%
Section Power%ofchip ExtraCost
I/O 26.0% 0.0%
Memoryleakage 43.7% 4.0%
Logicleakage 7.3% 2.5%
RAMactive 2.7% 0.4%
TCAMactive 3.5% 0.0%
Logicactive 16.8% 5.5%
Totalextrapowercost 12.4%
Area
Power
34
![Page 35: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/35.jpg)
Conclusion
• How do we design a flexible chip?• The RMT switch model• Bring processing close to the memories:
• pipeline of many stages• Bring the processing to the wires:
• 224 action CPUs per stage
• How much does it cost?• 15%
• Lots of the details how we designed this in 28nm CMOS are in the paper
35
![Page 36: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/36.jpg)
Limitations on Flexibility
• Your thoughts!
36
![Page 37: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/37.jpg)
Since 2013….
• RMT switch has been commercialized• Barefoot Tofino• 6.5Tb/s
• Adoption of these swiches?
37
![Page 38: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/38.jpg)
Your opinions
• Pros• Proposes RMT as a more flexible alernative to SMT and
MMT. • Shows viability of a flexible design.• Evaluates cost and power requirements, shows they are
not significantly high.• (In contrast to RouteBricks)
• Flexible memory allocation mechanism is innovative and efficient.
38
![Page 39: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/39.jpg)
Your opinions
• Cons• Programmability limitations not discussed? Is it Turing-
complete? • What are the scalability bottlenecks?• Why N=32?• Conflates memory allocation with match-action
processing.• No programmability interface.
• How are low-level configurations generated? • No actual hardware• Security?
39
![Page 40: Programmable Switch Hardware](https://reader033.fdocuments.in/reader033/viewer/2022041506/6252466aa3eacb19a06dfa79/html5/thumbnails/40.jpg)
Your opinions
• Ideas• A compiler for RMT• What can RMT’s programmability enable?• Extending the level of programmability / lifting restrictions.
40