Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...
Transcript of Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...
![Page 1: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/1.jpg)
How NICs Work Today
IETF 105, Montreal, Tuesday July 23, 20191
Tom Herbert, IntelSimon Horman, NetronomeAndy Gospodarek, Broadcom
![Page 2: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/2.jpg)
Fundamentals
2
![Page 3: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/3.jpg)
Terminology● Network Interface Card (NIC): Host’s interface to
physical network● Host Stack: Software stack that performs host side
processing of L2, L3, or L4 protocols● Kernel Stack: Host stack implemented in an OS kernel● Offload: Do something in NIC HW that could be done in
host SW stack● Acceleration: Offload for performance gains
3
![Page 4: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/4.jpg)
Network Interface Cards
4
![Page 5: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/5.jpg)
Evolution of Network Interface Cards● Fundamental support (1990s)
○ Transmit and receive packets○ Basic offloads (Ethernet Checksum Offload!)
● Data plane acceleration (early to mid 2000s)○ Optimization for multi-core CPUs○ Hardware data plane offload — mostly fixed function devices○ Tunneling, IPsec, QoS offloads
● Programmability (2010 onwards)○ FPGAs and NPUs with programmable data plane○ General purpose processor with programmable data and control planes
5
![Page 6: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/6.jpg)
Offload: Motivation
6
● Free up CPU cycles for application● Specialized processing can be more efficient● Save host resources● Scaling performance (low latency/high throughput)● Power savings for some use cases
=> Reduced TCO (marketing slant!)
![Page 7: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/7.jpg)
Less is More Principle● Protocol agnostic is better than protocol specific
○ Avoid protocol ossification○ New protocol support without needing completely new solutions
● Common open APIs are better than proprietary ones○ Avoid vendor lock in○ Differentiation by features, performance, implementation
● Programmability is (generally) good○ Be adaptable, don’t dictate to the user what they are allowed to do○ Aspiration: “write once, run anywhere” model across devices
7
![Page 8: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/8.jpg)
Basic offloads
8
![Page 9: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/9.jpg)
Offload Considerations● TX and RX ● Protocol agnostic versus protocol specific● Stateful versus stateless● Encapsulation● “Always on” versus “opportunistic”● IPv6 and IPv4● How to build protocols to be NIC offload friendly
9
![Page 10: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/10.jpg)
Basic Offloads● Checksum offload● Segmentation offload● Multi-queue and packet steering
10
![Page 11: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/11.jpg)
Checksum Offload
● TCP, UDP, GRE, etc...● NIC offload calculation over data● Checksum offload is ubiquitous● Encapsulation allows multiple checksums in same packet
11
![Page 12: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/12.jpg)
TX Checksum Offload● Device parses tranports and set checksum
○ Device parse packets and set TCP or UDP checksum
● Instruct device where to start and write checksum○ Init csum field, indicate start offset and offset to write csum○ Generic method
12
![Page 13: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/13.jpg)
RX Checksum Offload● Checksum unnecessary
○ Device parses packet and verifies UDP or TCP checksum
● Checksum complete○ Device return 1’s complement sum across words in the packet○ Generic method
13
![Page 14: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/14.jpg)
Segmentation Offload● Stack operates more efficiently
on large packets● Combines with checksum offload to
minimize header processing and per packet overhead
14
![Page 15: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/15.jpg)
Transmit Segmentation Offload● Split big packet into smaller one low in the stack● GSO, Generic Segmentation Offload: SW variants● LSO: HW variant
15
![Page 16: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/16.jpg)
Receive Segmentation Offload
● Coalesce small packets into bigger ones low in stack ● Generic Receive Offload, GRO: SW variant● Large Receive Offload, LRO: HW variant● Difficult to make protocol agnostic!
16
![Page 17: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/17.jpg)
Multi-Queue● Multiple queues exposed by NIC● Queues processed by CPUs● Queues can be accessed and processed in parallel,
technique for load balancing● Queues can also have different properties, e.g. priority● Avoid OOO packets, maintain flow to queue affinity
17
![Page 18: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/18.jpg)
Transmit Queue Selection● XPS, Transmit packet steering
○ Send packets on queue associated with CPU or thread
● Driver selects queue○ Device driver operation○ ndo_select_queue in Linux○ Arbitrary properties (e.g. priority)
18
![Page 19: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/19.jpg)
Receive Packet Steering
● Receive Packet Steering○ Steer to queue based on hash○ RPS is SW variant○ RSS, Receive Side Scaling, is HW
● Receive Flow Steering○ Flow to queue association○ RFS is SW variant○ aRFS, accelerated RFS, is HW variant
19
![Page 20: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/20.jpg)
Data Plane in Hardware
20
![Page 21: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/21.jpg)
Data Plane in Hardware● Fixed or minimally configurable pipeline
○ ASIC with TCAM tables used for configuring pipeline
● Programmable Pipeline○ Network Processing Unit/Network Flow Processor
■ Multi-threaded execution environment for data plane programs
○ FPGA■ Gate-Level Programmable
○ General Purpose Processor■ CPU Complex separate from host
Offload NIC
PHY0
PHY1
PCIeOffloaded Data Plane
PHY0
PHY1
ASIC, FPGA, NPU, or CPU
![Page 22: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/22.jpg)
● Control plane stays in host software stack
● Offload data plane● Hardware Fallback/
Assist datapath in host software stack
● Host software stack implements features of offload data plane
Data Plane Acceleration
22
Offload NICPHY0
PHY1
Host
User Space
Applications
Kernel
Linux
Offloaded Data Plane
Control / DataplaneSoftware Datapath
PHY0
PHY1
PCIe
![Page 23: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/23.jpg)
Data Plane Acceleration● Match/Action● Forwarding● QoS ● TLS and IPsec
23
![Page 24: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/24.jpg)
● Match packet based on headers and metadata
○ e.g: input-device + 5-tuple
● Execute actions based on match○ Forward / Mirror○ Drop ○ Packet/metadata modification
● Stateful actions○ Policing○ Connection tracking
Host Linux
Offload NIC
Software DatapathMatch Tables Actions
Offload Data PlaneMatch Tables Actions
Match/Action
24
![Page 25: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/25.jpg)
● L2 -> Ln● Between physical and logical
devices● HW datapath misses can fall
back to host● Optional tunnel encap/decap
○ VXLAN, GRE, Geneve, …
● And tagging: VLAN, MPLS
Forwarding
25
Host Linux
Offload NIC
Software DatapathMatch Tables Actions
Offload Data PlaneMatch Tables Actions
Tunnel
Tunnel
![Page 26: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/26.jpg)
● Ingress○ No queue○ Police/Meter/Filter
● Egress○ Classifier selects priority○ Scheduler
■ Priority Scheduler, f.e. 802.1p■ Deficit round robin■ TSN■ Shaping: DCB, …
QoS
26
Host Linux
Offload NIC
Software DatapathClassifier Scheduler
Offload Data PlaneClassifier Scheduler
Egress QoS
![Page 27: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/27.jpg)
User Space
Host Linux
Offload SmartNIC
Fallback PathDevice
OffloadFast Path MQ
RED RED RED
Application
Kernel
ApplicationApplicationMQ + RED Offload
27
● Per-device RED in HW● May ECN mark or drop packets
![Page 28: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/28.jpg)
TLS Acceleration
Offload NIC
Host
User Space
Application
Kernel
Linux
FramingkTLS Module
Symmetric Crypto Offload
Encrypt/Decrypt
TLS HeaderRecord Payload
Ciphertext Auth Hash
Established TLS be passed to kTLS
TX Path:
● NIC driver marks packets for crypto offload based on packet socket
● NIC performs encrypt and TX
RX Path:
● NIC performs decrypt and auth● Notifies kTLS of queued data● kTLS skips decrypt of plaintext● Handle Out-Of-Order
TLS HeaderRecord Payload
Plaintext 0000 0000
Auth Hash
28
![Page 29: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/29.jpg)
Crypto Offload
● HW: Encrypt/Decrypt/Integrity/LSO/Checksum● Kernel: Padding/Anti-replay/Counters/Security Policy DB ● User-Space: IKE
Full Offload
● HW: Replay/Encap/Decap/SPD/LSO/Checksum/LRO● Kernel: IP fragmentation/Counters/Configuration● User-Space: IKE
IPsec Acceleration
29
![Page 30: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/30.jpg)
Programmability
30
![Page 31: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/31.jpg)
Programmability● Facilitates rapid protocol development● Quickly fix bugs and security problems● Two main types used today:
○ FPGA/NPU
○ General Purpose Processors
● Emerging trend: What is niche today can be broad tomorrow○ IETF 104 “Forwarding Plane Realities”
31
![Page 32: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/32.jpg)
● Control plane stays in host● Flexible offload data plane controlled
through kernel or user space● Data plane could be expressed by
P4, eBPF, NPL, or other native instruction set
● Dynamically programmed
Programmability with FPGA or NPU
32
User Programmable NICPHY0
PHY1
Host
User Space
Applications
Kernel
Linux
FPGA/NPU Data Plane
Control/Data PlaneSoftware Datapath?
PHY0
PHY1
PCIe
![Page 33: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/33.jpg)
General Purpose Processor
33
● Move host software stack down to the NIC● Dataplane offload to general purpose processor on NIC● Control plane offload
○ Useful in bare metal or multi-tenant deployments○ Network admin can control server networking
● No host resources consumed forwarding network traffic
![Page 34: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/34.jpg)
● Capable of running complete Operating System
● Forwarding functionality moved completely away from server cores down to NIC
Programmable NIC with General Purpose Processor
34
Programmable NICPHY0
PHY1
Host
User Space
Applications
Kernel
Linux
Software Datapath
Control/Data Plane
PHY0
PHY1
General Purpose Processor
![Page 35: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/35.jpg)
Programmable NIC with General Purpose Processor
35
● Programmable NICs also have offload-capable devices
Programmable NIC
Host
User Space
Applications
Kernel
Linux
FPGA/NPU/ASIC
Offloaded Datapath
Control/Data Plane
PHY0
PHY1
Control/Data Plane
Software Datapath
General Purpose Processors
![Page 36: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/36.jpg)
Conclusion and Futures● Networking trends
○ Insatiable need for more bandwidth and lower latency○ Deployment of forward looking IETF protocols
● NICs work with hosts to make this happen○ Offloads will be relevant for foreseeable future○ Programmability and flexibility spur innovation
36
![Page 37: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom ...](https://reader031.fdocuments.in/reader031/viewer/2022020706/61fd208786f1450f37598bbf/html5/thumbnails/37.jpg)
Thank You!
37