A Guide to Today’s Class
- Quick Ethernet Overview
- Basic Data Structures
- Break
- Device Startup and Initialization
- Break
- Packet Reception
- Packet Transmission
- Break
- Device Control
- Special Features
George Neville-Neil (gnn@neville-neil.com) Networking from the Bottom Up: Device Drivers January 30, 2010 1 / 68
Overview
Introduction
- Networking begins and ends at the driver layer
- A day in the life of a packet
- Look into many code files in the kernel
- We will use FreeBSD 7.2 (STABLE) as our reference
Device Driver Section Intro
- Lowest level of code in the kernel
- Deal directly with the hardware
- Use a well defined API when interfacing to the kernel
- Are rarely written from scratch
- We will only describe Ethernet drivers in this class
Network Layering
- Application
- Presentation
- Session
- Transport
- Network
- Data Link
- Physical
Network Layering
- Application (All)
- Presentation (Protocols)
- Session (Should)
- Transport (Transport)
- Network (Network)
- Data Link (Data)
- Physical (Properly)
The Four Paths
- Packets traverse four possible paths in the network code
  - Inbound (for this host)
  - Outbound (from this host)
  - Forwarding (between two interfaces on this host)
  - Error
Four Paths Through The Stack
[Figure: Network Protocol (IPv4, v6, etc.) above interface0 and interface1, with the inbound, outbound, forwarding, and error paths marked]
Ethernet Overview
- Data Link Layer Protocol
- The most common form of wired networking
- Available in many speeds, now up to 10 Gbps
- A simple header followed by data
Ethernet Packet and Encapsulation
Dest Source Type IP Header TCP Header Data ...
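The layout above can be sketched in userland C. This is only an illustration, assuming the standard 14-byte header (6-byte destination, 6-byte source, 2-byte type in network byte order). The names `eth_header`, `eth_parse`, and the `ETH_*` constants are hypothetical and are not the kernel's own definitions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ADDR_LEN  6
#define ETH_HDR_LEN   14      /* Dest (6) + Source (6) + Type (2) */
#define ETH_TYPE_IPV4 0x0800  /* type value carried by IPv4 frames */

/* Parsed copy of the Ethernet header (hypothetical sketch type). */
struct eth_header {
    uint8_t  dest[ETH_ADDR_LEN];
    uint8_t  source[ETH_ADDR_LEN];
    uint16_t type;            /* converted to host byte order */
};

/*
 * Fill in *out from the start of a frame.  Returns a pointer to the
 * encapsulated payload (the IP header for an IPv4 frame), or NULL if
 * the frame is too short to hold a header.
 */
const uint8_t *
eth_parse(const uint8_t *frame, size_t len, struct eth_header *out)
{
    if (len < ETH_HDR_LEN)
        return NULL;
    memcpy(out->dest, frame, ETH_ADDR_LEN);
    memcpy(out->source, frame + ETH_ADDR_LEN, ETH_ADDR_LEN);
    /* The type field is big-endian on the wire. */
    out->type = (uint16_t)((frame[12] << 8) | frame[13]);
    return frame + ETH_HDR_LEN;
}
```

A caller would check `out->type` against `ETH_TYPE_IPV4` before treating the returned pointer as an IP header.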
Data Structures
Memory for Packets
- Packets need to be stored for reception and transmission
- The basic packet memory structures are the mbuf and cluster
  - mbuf structures have several types and purposes
  - Clusters hold only data
  - History dictates that mbufs are named m
- In the kernel we will see many pointers to mbufs
Types of mbufs
- Wholly contained
- Packet Header
- Using a cluster
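The three kinds of mbuf can be illustrated with a simplified userland sketch. It is loosely modeled on the idea of an mbuf and is not the kernel's `struct mbuf`: every `sk_` name and both size constants are assumptions made for the example, and the real structure lays out its fields differently.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SK_MLEN     64    /* hypothetical size of the inline data area */
#define SK_MCLBYTES 2048  /* hypothetical cluster size */
#define SK_F_PKTHDR 0x1   /* first buffer of a packet, carries pkt_len */
#define SK_F_EXT    0x2   /* data lives in an external cluster */

struct sk_mbuf {
    struct sk_mbuf *next;     /* next buffer of the same packet */
    uint8_t *data;            /* start of valid data */
    size_t   len;             /* valid bytes in this buffer */
    int      flags;
    size_t   pkt_len;         /* whole-packet length if SK_F_PKTHDR */
    uint8_t *cluster;         /* external storage if SK_F_EXT */
    uint8_t  inline_buf[SK_MLEN];
};

/* Allocate one buffer; with SK_F_EXT the data pointer uses a cluster. */
struct sk_mbuf *
sk_mbuf_get(int flags)
{
    struct sk_mbuf *m = calloc(1, sizeof(*m));

    if (m == NULL)
        return NULL;
    m->flags = flags;
    if (flags & SK_F_EXT) {
        m->cluster = malloc(SK_MCLBYTES);
        if (m->cluster == NULL) {
            free(m);
            return NULL;
        }
        m->data = m->cluster;
    } else {
        m->data = m->inline_buf;   /* wholly contained */
    }
    return m;
}

/* Storage available behind the data pointer of a fresh buffer. */
size_t
sk_mbuf_capacity(const struct sk_mbuf *m)
{
    return (m->flags & SK_F_EXT) ? SK_MCLBYTES : SK_MLEN;
}

/* Sum the per-buffer lengths along a chain. */
size_t
sk_mbuf_chain_len(const struct sk_mbuf *m)
{
    size_t total = 0;

    for (; m != NULL; m = m->next)
        total += m->len;
    return total;
}

/* Free a whole chain, including any clusters. */
void
sk_mbuf_freem(struct sk_mbuf *m)
{
    while (m != NULL) {
        struct sk_mbuf *next = m->next;

        free(m->cluster);          /* free(NULL) is a no-op */
        free(m);
        m = next;
    }
}
```

A packet would typically be a chain whose first buffer has `SK_F_PKTHDR` set, with bulk data in one or more `SK_F_EXT` buffers behind it.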
Welcome to SMP
- FreeBSD is a multi-threaded, re-entrant kernel
- Only way to scale on multicore and multi-processor systems
- Kernel is full of cooperating tasks
- Inter-process synchronization is required
Kernel Synchronization Primitives
- Spin Locks
- Mutexes
- Reader/Writer Locks
- Shared/Exclusive Locks
- Drivers use mostly spin locks or mutexes
  - See locking(9) for more information
Ethernet Drivers, an Overview
- Implemented in the kernel
  - May be kernel loadable modules (KLD)
- Responsible for getting packets into and out of the system
- Follow a well known set of kernel APIs
- May drop packets
Introducing, the Intel Gigabit Ethernet Driver
- Supports modern Intel Ethernet hardware
- Parts available on motherboards and PCI cards
- A typical example of a modern Ethernet chip
- Driver is well written and maintained by an Intel developer
- A good example to start with
- Data book available at intel.com
- Referred to as igb for short
  - The em driver is the previous incarnation
IGB Features
- Various types of media support
- MSI-X Interrupts
- Jumbo Frames
- Adaptive Interrupt Modulation
- IEEE-1588 (some chips only)
Code Overview
- All FreeBSD device drivers are kept in /usr/src/sys/dev
- The IGB driver resides in /usr/src/sys/dev/e1000/if_igb.[ch]
- Other supporting files also exist but will not be necessary for this class
- The main data structures are in the header file and the main body of the driver is in if_igb.c
- Generic code to support all network drivers is in the /usr/src/sys/net* directories
Network Driver Data Structures
- There are two main data structures in every network driver
  - ifnet and adapter
- The ifnet structure is used to hook the device into the network protocols
- The adapter structure is private to the device
  - The adapter structure is often called the softc
Objects in C and the BSD Kernels
- Since the early days of the BSDs many kernel data structures have contained both data and function pointers
- A clever and cheap way to get the benefits of object orientation without paying for unwanted features
- Function pointers in structures are used throughout the kernel, not just in the network devices
- No need to be alarmed
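A minimal userland sketch of the pattern: a structure carries both data and function pointers, and generic code calls through the pointers without knowing which driver sits behind them. All `sk_` names here are hypothetical and chosen only for the illustration.

```c
#include <stddef.h>

/* A "device object": data plus the methods that operate on it. */
struct sk_dev {
    const char *name;
    int         running;
    int       (*dev_start)(struct sk_dev *);  /* method slots */
    int       (*dev_stop)(struct sk_dev *);
};

/* Driver-specific implementations (stand-ins for real hardware work). */
int
sk_example_start(struct sk_dev *d)
{
    d->running = 1;
    return 0;
}

int
sk_example_stop(struct sk_dev *d)
{
    d->running = 0;
    return 0;
}

/* The driver hooks its functions into the structure once. */
void
sk_dev_setup(struct sk_dev *d, const char *name)
{
    d->name = name;
    d->running = 0;
    d->dev_start = sk_example_start;
    d->dev_stop = sk_example_stop;
}

/* Generic code: works for any driver that filled in the slots. */
int
sk_dev_up(struct sk_dev *d)
{
    return d->dev_start(d);
}

int
sk_dev_down(struct sk_dev *d)
{
    return d->dev_stop(d);
}
```

The generic `sk_dev_up()` never names the driver function directly, which is the cheap form of dynamic dispatch the slide describes.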
ifnet Overview
- The main interface between the driver and the kernel
- Contains data and functions that are generic to all network devices
- Each device instance must have at least one ifnet
adapter
- Contains device specific data
- Hardware registers
- Device control functions
- Pointers to packet rings
- Interrupt vectors
- Statistics
- Always points back to the ifnet structure
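The relationship between the two structures can be sketched as a pair of mutual pointers. The types below are drastically reduced stand-ins: the `sk_` names and the fields shown are assumptions for illustration and do not reproduce the real ifnet or the igb adapter structure.

```c
#include <stddef.h>
#include <stdint.h>

/* Generic, protocol-facing interface state (stand-in for ifnet). */
struct sk_ifnet {
    const char   *if_name;
    void         *if_softc;     /* opaque pointer to the driver's state */
    unsigned long if_ipackets;  /* generic statistics */
};

/* Private, device-specific state (stand-in for the adapter/softc). */
struct sk_adapter {
    struct sk_ifnet *ifp;          /* always points back to the ifnet */
    uint32_t         num_rx_desc;  /* ring sizing */
    unsigned long    dropped_pkts; /* device statistics */
};

/* Tie the two structures together, as an attach routine would. */
void
sk_adapter_link(struct sk_adapter *sc, struct sk_ifnet *ifp)
{
    sc->ifp = ifp;
    ifp->if_softc = sc;
}

/* Generic code hands the driver an ifnet; the driver recovers its softc. */
struct sk_adapter *
sk_ifnet_to_adapter(struct sk_ifnet *ifp)
{
    return (struct sk_adapter *)ifp->if_softc;
}
```

With both pointers in place, a driver entry point that receives only the interface can reach its private state, and device code can update the generic counters.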
IGB adapter structure
Break
- Please take a 10 minute break
Device Initialization
Relevant APIs
- igb_attach()
- igb_ioctl()
- igb_msix_rx()
- igb_msix_tx()
- igb_msix_link()
attach()
- Each device driver must have a way to connect to the kernel
- The igb_attach routine is used to activate a device
- Set up sysctl variables
- Allocate memory
- Set up device registers
- Hook function pointers into place
- Start the device running
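The ordering of these steps, and the need to undo earlier steps when a later one fails, can be sketched in userland C. This is not igb_attach(): the `sk_softc` structure, the 16-byte descriptor size, and the flag fields are all hypothetical, and memory allocation is the only step that can fail in the sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define SK_DESC_SIZE 16   /* hypothetical descriptor size in bytes */

struct sk_softc {
    int   sysctl_ready;   /* control variables registered */
    void *tx_ring;        /* transmit descriptor memory */
    void *rx_ring;        /* receive descriptor memory */
    int   regs_set;       /* device registers programmed */
    int   hooked;         /* function pointers installed */
    int   running;        /* device started */
};

/* Returns 0 on success or an errno value, undoing partial work on failure. */
int
sk_attach(struct sk_softc *sc, size_t ndesc)
{
    int error;

    if (ndesc == 0)
        return EINVAL;

    sc->sysctl_ready = 1;                        /* 1. control variables */

    sc->tx_ring = calloc(ndesc, SK_DESC_SIZE);   /* 2. allocate memory */
    if (sc->tx_ring == NULL) {
        error = ENOMEM;
        goto fail_sysctl;
    }
    sc->rx_ring = calloc(ndesc, SK_DESC_SIZE);
    if (sc->rx_ring == NULL) {
        error = ENOMEM;
        goto fail_tx;
    }

    sc->regs_set = 1;                            /* 3. device registers */
    sc->hooked = 1;                              /* 4. function pointers */
    sc->running = 1;                             /* 5. start the device */
    return 0;

fail_tx:
    free(sc->tx_ring);
    sc->tx_ring = NULL;
fail_sysctl:
    sc->sysctl_ready = 0;
    return error;
}

/* Undo everything sk_attach() did, in reverse order. */
void
sk_detach(struct sk_softc *sc)
{
    sc->running = 0;
    sc->hooked = 0;
    sc->regs_set = 0;
    free(sc->rx_ring);
    sc->rx_ring = NULL;
    free(sc->tx_ring);
    sc->tx_ring = NULL;
    sc->sysctl_ready = 0;
}
```

The labels at the bottom unwind in reverse order of setup, a common C idiom for multi-step initialization.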
Setup Control Variables
- Kernel code can expose controls via sysctl
- Tunables are like sysctls but can only be set at boot
- Used mostly to communicate integers into and out of the kernel
- Also support more complex data structures
Tunables
sysctls
Rings of Packets
- CPU and device share a ring of packet descriptors
- Each descriptor points to a packet buffer
- Used for transmission and reception
- Allows decoupling of the CPU and the device
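A descriptor ring can be sketched as a fixed array with two indices, one advanced by the producer and one by the consumer. The sketch below models a transmit-style ring in plain userland C, with the "device" side played by an ordinary function. The `sk_` names, the ring size, and the descriptor fields are assumptions, and real hardware descriptors also carry status bits and DMA addresses.

```c
#include <stddef.h>
#include <stdint.h>

#define SK_RING_SIZE 8   /* hypothetical number of descriptors */

/* One descriptor: points at a packet buffer and records its length. */
struct sk_desc {
    void    *buf;
    uint16_t len;
};

struct sk_ring {
    struct sk_desc desc[SK_RING_SIZE];
    unsigned head;   /* next descriptor the consumer will take */
    unsigned tail;   /* next descriptor the producer will fill */
};

/*
 * Producer side (the CPU, for transmit).  One slot is kept empty so
 * that head == tail always means "ring empty".
 * Returns 0 on success, -1 if the ring is full.
 */
int
sk_ring_post(struct sk_ring *r, void *buf, uint16_t len)
{
    unsigned next = (r->tail + 1) % SK_RING_SIZE;

    if (next == r->head)
        return -1;
    r->desc[r->tail].buf = buf;
    r->desc[r->tail].len = len;
    r->tail = next;
    return 0;
}

/*
 * Consumer side (standing in for the device).
 * Returns 0 and copies the descriptor out, or -1 if the ring is empty.
 */
int
sk_ring_consume(struct sk_ring *r, struct sk_desc *out)
{
    if (r->head == r->tail)
        return -1;
    *out = r->desc[r->head];
    r->head = (r->head + 1) % SK_RING_SIZE;
    return 0;
}
```

Because each side only advances its own index, the producer can keep queueing work while the consumer drains it at its own pace, which is the decoupling the slide refers to.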
Packet Ring Structures
[Figure: a ring of descriptors, each associated with a packet]
Tx Ring Allocation
Allocate Receive Ring
Set Device Registers
Hook in function pointers
Set device capabilities and Media Type
Add Media Types
Start the device
Break
- Please enjoy a 15 minute break
Packet Reception
rx()
- Interrupt processing
- Work deferral
- Handling basic errors
- Passing packets into the kernel
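These four concerns can be sketched as a single receive loop with a work budget. This is a userland illustration and not the igb receive path: the `sk_` names, the descriptor fields, and the ring size are hypothetical, and "passing the packet into the kernel" is reduced to bumping a counter.

```c
#include <stdint.h>

#define SK_RX_RING 8   /* hypothetical receive ring size */

struct sk_rx_desc {
    uint16_t len;
    uint8_t  ready;    /* set by the "device" when a frame has arrived */
    uint8_t  error;    /* set if the frame was received with an error */
};

struct sk_rxq {
    struct sk_rx_desc desc[SK_RX_RING];
    unsigned      next;        /* next descriptor to examine */
    unsigned long delivered;   /* frames handed up the stack */
    unsigned long dropped;     /* frames discarded due to errors */
};

/*
 * Process at most `budget` received frames.  Returns 1 if frames are
 * still waiting, so the caller can defer the remaining work to a later
 * pass instead of staying in interrupt context, and 0 otherwise.
 */
int
sk_rx_process(struct sk_rxq *q, int budget)
{
    while (budget-- > 0) {
        struct sk_rx_desc *d = &q->desc[q->next];

        if (!d->ready)
            return 0;             /* ring drained */
        if (d->error)
            q->dropped++;         /* basic error handling: discard */
        else
            q->delivered++;       /* stand-in for the hand-off upward */
        d->ready = 0;             /* give the descriptor back */
        d->error = 0;
        q->next = (q->next + 1) % SK_RX_RING;
    }
    return q->desc[q->next].ready != 0;
}
```

Bounding the loop keeps a single busy queue from monopolizing the CPU, at the cost of a second pass when traffic is heavy.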
Message Signalled Interrupts (MSI/X)
- Old style interrupts required raising a line on a chip
- Old style interrupt routine had to be all things to all people
- MSI allows for different functions to be assigned to different channels
- The IGB driver has one channel per receive or transmit queue and a single interrupt for link state changes
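The idea of one handler per channel can be sketched as a table of vectors, each with its own function and argument. The vector numbering below (receive queues first, then transmit queues, then link) and all `sk_` names are assumptions made for the example; they do not describe how the igb driver or the hardware assigns vectors.

```c
#include <stddef.h>

#define SK_NUM_QUEUES  2
#define SK_NUM_VECTORS (2 * SK_NUM_QUEUES + 1)   /* rx + tx + link */

typedef void (*sk_handler_t)(void *arg);

struct sk_vector {
    sk_handler_t handler;
    void        *arg;
};

struct sk_intr_state {
    unsigned rx_events[SK_NUM_QUEUES];
    unsigned tx_events[SK_NUM_QUEUES];
    unsigned link_events;
    struct sk_vector vec[SK_NUM_VECTORS];
};

/* Each handler does one job only; here it just counts its events. */
void sk_msix_rx(void *arg)   { ++*(unsigned *)arg; }
void sk_msix_tx(void *arg)   { ++*(unsigned *)arg; }
void sk_msix_link(void *arg) { ++*(unsigned *)arg; }

/* Assign one vector per receive queue, per transmit queue, and for link. */
void
sk_setup_vectors(struct sk_intr_state *s)
{
    int i;

    for (i = 0; i < SK_NUM_QUEUES; i++) {
        s->vec[i].handler = sk_msix_rx;
        s->vec[i].arg = &s->rx_events[i];
        s->vec[SK_NUM_QUEUES + i].handler = sk_msix_tx;
        s->vec[SK_NUM_QUEUES + i].arg = &s->tx_events[i];
    }
    s->vec[2 * SK_NUM_QUEUES].handler = sk_msix_link;
    s->vec[2 * SK_NUM_QUEUES].arg = &s->link_events;
}

/* Simulate delivery of one message-signalled interrupt. */
int
sk_dispatch(struct sk_intr_state *s, unsigned vector)
{
    if (vector >= SK_NUM_VECTORS)
        return -1;
    s->vec[vector].handler(s->vec[vector].arg);
    return 0;
}
```

No handler has to ask the hardware why it was called, in contrast to a single shared routine that must check every possible cause.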
Receive Interrupt
Receiving a Frame
Receiving a Frame (End of Packet)
Passing in the Packet
Packet Transmission
tx()
- Packets from above
- Work deferral
- Error handling
Protocols Pass Packets Down
- ip_output()
- ether_output()
- ether_output_frame()
- IFQ_HANDOFF()/IFQ_HANDOFF_ADJ()
Handing a Packet Off
A word about queues
- Queues of packets are used throughout the networking stack
- Prevent overuse of resources
- Allow for work deferral
- A good way to connect lightly related modules
- Allow the administrator to tune the system
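A bounded packet queue with a drop counter shows several of these points at once: the length limit caps resource use, the drop count gives the administrator something to watch, and the limit itself is the tuning knob. The `sk_` names are hypothetical; this is a userland sketch, not the kernel's interface queue.

```c
#include <stddef.h>

struct sk_pkt {
    struct sk_pkt *next;
    int            id;
};

struct sk_ifq {
    struct sk_pkt *head;
    struct sk_pkt *tail;
    int len;       /* packets currently queued */
    int maxlen;    /* limit on queued packets (the tunable) */
    int drops;     /* packets refused because the queue was full */
};

/* Returns 0 if queued, -1 if the queue was full and the packet refused. */
int
sk_ifq_enqueue(struct sk_ifq *q, struct sk_pkt *p)
{
    if (q->len >= q->maxlen) {
        q->drops++;
        return -1;
    }
    p->next = NULL;
    if (q->tail == NULL)
        q->head = p;
    else
        q->tail->next = p;
    q->tail = p;
    q->len++;
    return 0;
}

/* Returns the oldest queued packet, or NULL if the queue is empty. */
struct sk_pkt *
sk_ifq_dequeue(struct sk_ifq *q)
{
    struct sk_pkt *p = q->head;

    if (p == NULL)
        return NULL;
    q->head = p->next;
    if (q->head == NULL)
        q->tail = NULL;
    p->next = NULL;
    q->len--;
    return p;
}
```

The producer (the protocol layer) and the consumer (the driver's start routine) only share the queue, so either side can run later than the other.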
The IGB start routine
Draining the Queue
Watchdogs and Drivers
- Hardware is not as perfect as software
- One failure mode is freezing up
- Watchdog routines can be quite harsh
- Continuously resetting a device is not the best way to fix it
- Reading igb_watchdog is left to the reader
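The general shape of a transmit watchdog can be sketched as a countdown that is re-armed whenever the device makes progress. This is a generic illustration and makes no claim about how igb_watchdog works: the `sk_` names, the tick-based timing, and the use of a completed-transmit counter as the progress signal are all assumptions.

```c
struct sk_watchdog {
    int           timer;     /* ticks left before declaring a hang; 0 = disarmed */
    int           timeout;   /* value the timer is re-armed with */
    unsigned long last_done; /* progress counter seen at the previous tick */
    int           resets;    /* how many times the device was reset */
};

/* Arm the watchdog, e.g. when packets are handed to the hardware. */
void
sk_wd_arm(struct sk_watchdog *w, int ticks, unsigned long tx_done)
{
    w->timer = ticks;
    w->timeout = ticks;
    w->last_done = tx_done;
}

/*
 * Called once per tick with the count of completed transmits.
 * Returns 1 if the device was declared hung and "reset", else 0.
 */
int
sk_wd_tick(struct sk_watchdog *w, unsigned long tx_done)
{
    if (w->timer == 0)
        return 0;                  /* disarmed: nothing outstanding */
    if (tx_done != w->last_done) {
        w->last_done = tx_done;    /* progress was made: re-arm */
        w->timer = w->timeout;
        return 0;
    }
    if (--w->timer > 0)
        return 0;
    w->resets++;                   /* stand-in for a full device reset */
    w->timer = 0;                  /* stay disarmed until re-armed */
    return 1;
}
```

Checking for progress before counting down is one way to avoid the repeated resets the slide warns about when the device is merely slow.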
Cleaning up first
Checksum Offloading
- Many protocols require a packet checksum calculation
- Math is hard, and also expensive
- Many 1Gig chips can calculate the checksum in hardware
- For 10Gig this is required to operate at full speed
- A layering violation in the stack
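For a sense of the work being offloaded, here is a straightforward software version of the 16-bit one's-complement Internet checksum used by IPv4, TCP, and UDP. It is a plain reference sketch and not the kernel's optimized routine; the function name `sk_in_cksum` is hypothetical. The CPU has to touch every byte of the data, which is what makes the calculation expensive at high packet rates.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Compute the Internet checksum over `len` bytes, treating the data as
 * big-endian 16-bit words.  The 32-bit accumulator is ample for
 * packet-sized inputs (it cannot overflow below 128 KB of data).
 */
uint16_t
sk_in_cksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    while (len > 1) {
        sum += (uint32_t)((data[0] << 8) | data[1]);
        data += 2;
        len -= 2;
    }
    if (len == 1)                      /* odd trailing byte, zero padded */
        sum += (uint32_t)(data[0] << 8);

    while (sum >> 16)                  /* fold carries back in */
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

Running the same function over data that already contains a correct checksum yields zero, which is how a receiver verifies a header.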
Checksum Offload Code
Setup the Transmit Descriptors
Really transmit the packet
Break
- Please enjoy a 10 minute break
Device Control
Controlling the Device
- Devices need to be controlled
- Setting network layer addresses
- Bringing the interface up and down
- Retrieving the device state
- The ioctl routine is the conduit for control messages and data
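An ioctl routine is typically one large switch on a command code, with a data argument that carries values in or out. The sketch below shows that shape in userland C. The command numbers, the `sk_` names, and the MTU limits are hypothetical stand-ins, not the real socket ioctl commands or the igb limits.

```c
#include <errno.h>

/* Hypothetical command codes for the sketch. */
#define SK_CMD_SET_MTU   1
#define SK_CMD_GET_MTU   2
#define SK_CMD_SET_UP    3

#define SK_MIN_MTU 68     /* assumed lower bound for the example */
#define SK_MAX_MTU 9000   /* assumed jumbo-frame upper bound */

struct sk_ifreq {
    int value;            /* data in (set commands) or out (get commands) */
};

struct sk_if {
    int mtu;
    int up;
};

/* Returns 0 on success or an errno value. */
int
sk_ioctl(struct sk_if *ifp, unsigned long cmd, struct sk_ifreq *req)
{
    switch (cmd) {
    case SK_CMD_SET_MTU:                    /* data in */
        if (req->value < SK_MIN_MTU || req->value > SK_MAX_MTU)
            return EINVAL;
        ifp->mtu = req->value;
        return 0;
    case SK_CMD_GET_MTU:                    /* data out */
        req->value = ifp->mtu;
        return 0;
    case SK_CMD_SET_UP:                     /* interface up or down */
        ifp->up = (req->value != 0);
        return 0;
    default:
        return EINVAL;                      /* unknown command */
    }
}
```

Validating the argument before touching device state matters here, since the request ultimately originates outside the driver.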
Data in/data out
The Big Switch
Setting the MTU
Special Features
Special Features
- Multicast
- Interrupt Moderation
- Checksumming
Multicast
- One to many transmission
- Mostly handled by hardware
- Table size is important for performance
Interrupt Moderation
- System can easily be overwhelmed by interrupts
- Different types of traffic have different needs
  - Low Latency
  - Average Latency
  - Bulk Transmission
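One simple way to picture moderation is as a cap on interrupts per second that depends on the traffic class, which translates into a minimum gap between interrupts. The sketch below does only that arithmetic. The three rates are illustrative numbers chosen for the example, not values taken from the igb driver or the hardware documentation, and the `sk_` names are hypothetical.

```c
enum sk_traffic_class {
    SK_LOW_LATENCY,      /* interactive traffic: interrupt promptly */
    SK_AVG_LATENCY,      /* mixed traffic */
    SK_BULK              /* large transfers: batch many packets */
};

/* Illustrative caps on interrupts per second for each class. */
unsigned
sk_max_intr_rate(enum sk_traffic_class c)
{
    switch (c) {
    case SK_LOW_LATENCY:
        return 70000;
    case SK_AVG_LATENCY:
        return 20000;
    case SK_BULK:
    default:
        return 4000;
    }
}

/* Minimum time between interrupts, in microseconds (rounded down). */
unsigned
sk_intr_interval_us(enum sk_traffic_class c)
{
    return 1000000u / sk_max_intr_rate(c);
}
```

A longer interval means more packets are handled per interrupt, trading latency for lower CPU load.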
Checksumming
- Difficult to get line rate TCP without hardware help
- Leads to a layering violation
- TCP must be aware of hardware checksumming abilities
Section Summary
- All networking device drivers have similar structure
- The hardware details should be hidden
- Drivers are rarely written from scratch
  - Copy when write
Questions?