Post on 07-Mar-2018
Berlin – November 10th, 2011
NetFPGA Programmable Networking for High-Speed Network Prototypes, Research and Teaching
Presented by: Andrew W. Moore
(University of Cambridge)
CHANGE/OFELIA Berlin, Germany
November 10th, 2011
http://NetFPGA.org
Berlin – November 10th, 2011
Tutorial Outline
• Motivation – Introduction – The NetFPGA Platform
• Hardware Overview – NetFPGA 1G – NetFPGA 10G
• The Stanford Base Reference Router – Motivation: Basic IP review – Example 1: Reference Router running on the NetFPGA – Example 2: Understanding buffer size requirements using NetFPGA
• Community Contributions – Altera-DE4 NetFPGA Reference Router (UMassAmherst) – NetThreads (University of Toronto)
• Concluding Remarks
Berlin – November 10th, 2011
Section I: Motivation
Berlin – November 10th, 2011
NetFPGA = Networked FPGA A line-rate, flexible, open networking
platform for teaching and research
Berlin – November 10th, 2011
NetFPGA 1G Board
NetFPGA consists of…
Four elements: • NetFPGA board
• Tools + reference designs
• Contributed projects
• Community
NetFPGA 10G Board
Berlin – November 10th, 2011
NetFPGA 1G NetFPGA 10G 4 x 1Gbps Ethernet
Ports 4 x 10Gbps SFP+
4.5 MB ZBT SRAM 64 MB DDR2 SDRAM
27 MB QDRII-SRAM 288 MB RLDRAM-II
PCI PCI Express x8 Virtex II-Pro 50 Virtex 5 TX240T
NetFPGA Board Comparison
Berlin – November 10th, 2011
FPGA
Memory
1GE
1GE
1GE
1GE
NetFPGA board
PCI
CPU Memory
NetFPGA Board
PC with NetFPGA
Networking Software running on a standard PC
A hardware accelerator built with Field Programmable Gate Array driving Gigabit network links
Berlin – November 10th, 2011
Tools + Reference Designs
Tools: • Compile designs • Verify designs • Interact with hardware
Reference designs: • Router (HW) • Switch (HW) • Network Interface Card (HW) • Router Kit (SW) • SCONE (SW)
Berlin – November 10th, 2011
Contributed Projects
More projects: http://netfpga.org/foswiki/NetFPGA/OneGig/ProjectTable
Project Contributor OpenFlow switch Stanford University Packet generator Stanford University NetFlow Probe Brno University NetThreads University of
Toronto zFilter (Sp)router Ericsson Traffic Monitor University of
Catania DFA UMass Lowell
Berlin – November 10th, 2011
Community
Wiki • Documentation
– User’s Guide – Developer’s Guide
• Encourage users to contribute
Forums • Support by users for users • Active community - 10s-100s of posts/week
Berlin – November 10th, 2011
International Community
Over 1,000 users, using 1,900 cards at 150 universities in 32 countries
Berlin – November 10th, 2011
NetFPGA’s Defining Characteristics • Line-Rate
– Processes back-to-back packets • Without dropping packets • At full rate of Gigabit Ethernet Links
– Operating on packet headers • For switching, routing, and firewall rules
– And packet payloads • For content processing and intrusion prevention
• Open-source Hardware – Similar to open-source software
• Full source code available • BSD-Style License
– But harder, because • Hardware modules must meeting timing • Verilog & VHDL Components have more complex interfaces • Hardware designers need high confidence in specification of modules
Berlin – November 10th, 2011
Test-Driven Design
• Regression tests – Have repeatable results – Define the supported features – Provide clear expectation on functionality
• Example: Internet Router
– Drops packets with bad IP checksum – Performs Longest Prefix Matching on destination address – Forwards IPv4 packets of length 64-1500 bytes – Generates ICMP message for packets with TTL <= 1 – Defines how packets with IP options or non IPv4
… and dozens more … Every feature is defined by a regression test
Berlin – November 10th, 2011
Who, How, Why
Who uses the NetFPGA? – Teachers – Students – Researchers
How do they use the NetFPGA?
– To run the Router Kit – To build modular reference designs
• IPv4 router • 4-port NIC • Ethernet switch, …
Why do they use the NetFPGA? – To measure performance of Internet systems – To prototype new networking systems
Berlin – November 10th, 2011
Section II: Hardware Overview
Berlin – November 10th, 2011
NetFPGA-1G
Berlin – November 10th, 2011
Xilinx Virtex II Pro 50
• 53,000 Logic Cells • Block RAMs • Embedded PowerPC
Berlin – November 10th, 2011
Network and Memory
• Gigabit Ethernet
– 4 RJ45 Ports – Broadcom PHY
• Memories
– 4.5MB Static RAM – 64MB DDR2 Dynamic
RAM
Berlin – November 10th, 2011
Other IO
•PCI
– Memory Mapped Registers
– DMA Packet Transferring
•SATA – Board to Board
communication
Berlin – November 10th, 2011
NetFPGA-10G
• A major upgrade • State-of-the-art technology
Berlin – November 10th, 2011
NetFPGA 1G NetFPGA 10G
4 x 1Gbps Ethernet Ports 4 x 10Gbps SFP+
4.5 MB ZBT SRAM 64 MB DDR2 SDRAM
27 MB QDRII-SRAM 288 MB RLDRAM-II
PCI PCI Express x8
Virtex II-Pro 50 Virtex 5 TX240T
Comparison
Berlin – November 10th, 2011
10 Gigabit Ethernet
• 4 SFP+ Cages • AEL2005 PHY • 10G Support
– Direct Attach Copper – 10GBASE-R Optical
Fiber • 1G Support
– 1000BASE-T Copper – 1000BASE-X Optical
Fiber
Berlin – November 10th, 2011
Others
• QDRII-SRAM – 27MB – Storing routing tables,
counters and statistics • RLDRAM-II
– 288MB – Packet Buffering
• PCI Express x8 – PC Interface
• Expansion Slot
Berlin – November 10th, 2011
Xilinx Virtex 5 TX240T
• Optimized for ultra
high-bandwidth applications
• 48 GTX Transceivers • 4 hard Tri-mode
Ethernet MACs • 1 hard PCI Express
Endpoint
Berlin – November 10th, 2011
Beyond Hardware
• NetFPGA-10G Board • Xilinx EDK based IDE • Reference designs with
ARM AXI4 • Software (embedded and
PC) • Public Repository
(GitHub) • Public Wiki (PBWorks)
Reference Designs AXI4 IPs
Xilinx EDK
MicroBlaze SW PC SW
Wiki, GitHub, User Community
Berlin – November 10th, 2011
NetFPGA-1G Cube Systems
• PCs assembled from parts – Stanford University – Cambridge University
• Pre-built systems available – Accent Technology Inc.
• Details are in the Guide http://netfpga.org/static/guide.html
Berlin – November 10th, 2011
Rackmount NetFPGA-1G Servers
NetFPGA inserts in PCI or PCI-X slot
2U Server (Dell 2950)
Thanks: Brian Cashman for providing machine
1U Server (Accent Technology Inc.)
Berlin – November 10th, 2011
Stanford NetFPGA-1G Cluster
Statistics • Rack of 40 • 1U PCs with
NetFPGAs
• Managed • Power • Console • LANs
• Provides
4*40=160 Gbps of full line-rate processing bandwidth
Berlin – November 10th, 2011
Section III: Network review
Berlin – November 10th, 2011
Internet Protocol (IP)
Data
Data IP Hdr
Eth Hdr Data IP
Hdr
Data to be transmitted:
IP packets:
Ethernet Frames:
Data IP Hdr Data IP
Hdr
Eth Hdr Data IP
Hdr Eth Hdr Data IP
Hdr
…
…
Berlin – November 10th, 2011
Internet Protocol (IP)
Data
Data IP Hdr
…
16 32 4 1
Options (if any)
Destination Address
Source Address
Header Checksum Protocol TTL
Fragment Offset Flags Fragment ID
Total Packet Length T.Service HLen Ver
20 bytes
Berlin – November 10th, 2011
Basic operation of an IP router R3
A
B
C
R1
R2
R4 D
E
F R5
R5 F R3 E R3 D Next Hop Destination
Berlin – November 10th, 2011
Basic operation of an IP router
A
B
C
R1
R2
R3
R4 D
E
F R5
Berlin – November 10th, 2011
Forwarding tables
Entry Destination Port 1 2 ⋮
232
0.0.0.0 0.0.0.1
⋮ 255.255.255.255
1 2 ⋮
12
~ 4 billion entries
Naïve approach: One entry per address
Improved approach: Group entries to reduce table size
Entry Destination Port 1 2 ⋮
50
0.0.0.0 – 127.255.255.255 128.0.0.1 – 128.255.255.255
⋮ 248.0.0.0 – 255.255.255.255
1 2 ⋮
12
IP address 32 bits wide → ~ 4 billion unique address
Berlin – November 10th, 2011
IP addresses as a line
0 232-1
Entry Destination Port 1 2 3 4 5
Stanford Berkeley
North America Asia
Everywhere (default)
1 2 3 4 5
All IP addresses
North America Asia
Berkeley Stanford
Your computer My computer
Berlin – November 10th, 2011
Longest Prefix Match (LPM)
Entry Destination Port 1 2 3 4 5
Stanford Berkeley
North America Asia
Everywhere (default)
1 2 3 4 5
Universities
Continents
Planet
Data To: Stanford
Matching entries: •Stanford •North America •Everywhere
Most specific
Berlin – November 10th, 2011
Longest Prefix Match (LPM)
Entry Destination Port 1 2 3 4 5
Stanford Berkeley
North America Asia
Everywhere (default)
1 2 3 4 5
Universities
Continents
Planet
Data To: Canada
Matching entries: •North America •Everywhere
Most specific
Berlin – November 10th, 2011
Implementing Longest Prefix Match
Entry Destination Port 1 2 3 4 5
Stanford Berkeley
North America Asia
Everywhere (default)
1 2 3 4 5
Most specific
Least specific
Searching
FOUND
Berlin – November 10th, 2011
Basic components of an IP router
Control Plane
Data Plane per-packet processing
Switching Forwarding Table
Routing Table
Routing Protocols
Management & CLI
Softw
are H
ardware
Queuing
Berlin – November 10th, 2011
IP router components in NetFPGA
SCONE
Routing Table
Routing Protocols
Management & CLI
Output Port Lookup
Forwarding Table
Input Arbiter
Output Queues
Switching Queuing
Linux
Routing Table
Routing Protocols
Management & CLI
Router Kit
OR
Softw
are H
ardware
Berlin – November 10th, 2011
Section IV: Example I
Berlin – November 10th, 2011
Operational IPv4 router
Control Plane
Data Plane per-packet processing
Softw
are H
ardware
Routing Table
Routing Protocols
Management & CLI
SCONE
Switching Forwarding Table Queuing
Reference router
Java GUI
Berlin – November 10th, 2011
Streaming video
Berlin – November 10th, 2011
Streaming video
PC & NetFPGA (NetFPGA in PC)
NetFPGA running reference router
Berlin – November 10th, 2011
Streaming video
Video streaming over shortest path
Video client
Video server
Berlin – November 10th, 2011
Streaming video
Video client
Video server
Berlin – November 10th, 2011
Observing the routing tables
Columns: •Subnet address •Subnet mask •Next hop IP •Output ports
Berlin – November 10th, 2011
Example 1
http://www.youtube.com/watch?v=xU5DM5Hzqes
Berlin – November 10th, 2011
Review Exercise 1
NetFPGA as IPv4 router: •Reference hardware + SCONE software •Routing protocol discovers topology
Example 1: •Ring topology •Traffic flows over shortest path •Broken link: automatically route around failure
Berlin – November 10th, 2011
Section IV: Example II
Berlin – November 10th, 2011
Buffers in Routers • Internal Contention
• Congestion
• Pipelining
Berlin – November 10th, 2011
Buffers in Routers
Rx
Rx
Rx
Tx
Tx
Tx
Berlin – November 10th, 2011
Buffers in Routers
• So how large should the buffers be?
Buffer size matters – End-to-end delay
• Transmission, propagation, and queueing delay • The only variable part is queueing delay
– Router architecture • Board space, power consumption, and cost • On chip buffers: higher density, higher capacity • Optical buffers: all-optical routers
Berlin – November 10th, 2011
Buffer Sizing Story
Rule for adjusting W – If an ACK is received: W ← W+1/W – If a packet is lost: W ← W/2
Why 2TxC for a single TCP Flow?
Only W packets may be outstanding
Berlin – November 10th, 2011
Berlin – November 10th, 2011
Rule-of-thumb – Intuition
Rule for adjusting W If an ACK is received: W ← W+1/W If a packet is lost: W ← W/2
Only W packets may be outstanding
Source Dest
t
Window size
Berlin – November 10th, 2011
Synchronized Flows Many TCP Flows • Aggregate window has same
dynamics • Therefore buffer occupancy has
same dynamics • Rule-of-thumb still holds.
• Independent, desynchronized • Central limit theorem says the
aggregate becomes Gaussian • Variance (buffer size)
decreases as N increases
Small Buffers – Intuition
Probability Distribution
t
Buffer Size
t
Berlin – November 10th, 2011
Poisson Traffic Smooth Traffic • Theory. For Poisson arrivals tiny
buffers are enough.
• Example: ρ = 80%, B = 20 pkts loss < 1%
• Loss independent of link rate, RTT, number of flows, etc.
• Question. Can we make traffic look like Poisson when it arrives to the core routers?
• Assumptions: – Minimum distance between
consecutive packets of the same flow;
– Desynchronized flows – Random and independent start
times for flows • Under these assumptions
traffic is be smooth-enough. • In practice: – Slow access links – TCP Pacing
Tiny Buffers – Intuition
M/D/1 Poisson
B D loss < ρB
Berlin – November 10th, 2011
Buffer Sizing Experiments are Difficult
Problem • Convincing network operators not easy • Packet drops are scary • Varying traffic (shape, load, ...) extremely
difficult • Tiny buffers: no guarantees on assumptions
– i.e. slow access or pacing
Berlin – November 10th, 2011
Using NetFPGA to explore buffer size
• Need to reduce buffer size and measure occupancy
• Alas, not possible in commercial routers • So, we will use the NetFPGA instead
Objective:
– Use the NetFPGA to understand how large a buffer we need for a single TCP flow.
Berlin – November 10th, 2011
Reference Router Pipeline
• Five stages – Input interfaces – Input arbitration – Routing decision and
packet modification – Output queuing – Output interfaces
• Packet-based module interface
• Pluggable design
MAC RxQ
CPU RxQ
MAC RxQ
CPU RxQ
MAC RxQ
CPU RxQ
MAC RxQ
CPU RxQ
Input Arbiter
Output Port Lookup
MAC TxQ
CPU TxQ
MAC TxQ
CPU TxQ
MAC TxQ
CPU TxQ
MAC TxQ
CPU TxQ
Output Queues
Berlin – November 10th, 2011
Extending the Reference Pipeline
MAC RxQ
CPU RxQ
MAC RxQ
CPU RxQ
MAC RxQ
CPU RxQ
MAC RxQ
CPU RxQ
Input Arbiter
Output Port Lookup
MAC TxQ
CPU TxQ
MAC TxQ
CPU TxQ
MAC TxQ
CPU TxQ
MAC TxQ
CPU TxQ
Output Queues
Rate Limiter
Berlin – November 10th, 2011
Extending the Reference Pipeline
MAC RxQ
CPU RxQ
MAC RxQ
CPU RxQ
MAC RxQ
CPU RxQ
MAC RxQ
CPU RxQ
Input Arbiter
Output Port Lookup
MAC TxQ
CPU TxQ
MAC TxQ
CPU TxQ
MAC TxQ
CPU TxQ
MAC TxQ
CPU TxQ
Output Queues
Rate Limiter
Event Capture
Berlin – November 10th, 2011
Enhanced Router Pipeline
Two modules added
1. Event Capture to capture output queue events (writes, reads, drops)
2. Rate Limiter to create a bottleneck
MAC RxQ
CPU RxQ
MAC RxQ
CPU RxQ
MAC RxQ
CPU RxQ
MAC RxQ
CPU RxQ
Input Arbiter
Output Port Lookup
MAC TxQ
CPU TxQ
MAC TxQ
CPU TxQ
MAC TxQ
CPU TxQ
MAC TxQ
CPU TxQ
Output Queues
Rate Limiter
Event Capture
Berlin – November 10th, 2011
Topology for Exercise 2
Iperf Client
Iperf Server
Recall: NetFPGA host PC is life-support: power & control So: The host PC may physically route its traffic through the local NetFPGA
PC & NetFPGA (NetFPGA in PC)
NetFPGA running extended reference router
nf2c2
eth1
nf2c1
eth2
Berlin – November 10th, 2011
Review
NetFPGA as flexible platform: •Reference hardware + SCONE software •new modules: event capture and rate-limiting
Example 2: Client Router Server topology
– Observed router with new modules – Started tcp transfer, look at queue occupancy – Observed queue change in response to TCP ARQ
Berlin – November 10th, 2011
Section V: Community Contributions
Berlin – November 10th, 2011
FPGA
Memory
1GE
1GE
1GE
1GE
Running the Router Kit
User-space development, 4x1GE line-rate forwarding
PCI
CPU Memory
OSPF BGP
My Protocol user
kernel Routing
Table
IPv4 Router
1GE
1GE
1GE
1GE
Fwding Table
Packet Buffer
“Mirror”
Berlin – November 10th, 2011
Altera-DE4 NetFPGA Reference Router
• Migration of NetFPGA infrastructure to DE4 Stratix IV – 4X logic vs. Virtex 2
• PCI Express Gen2 – 5.0Gbps/lane data • External DDR2 RAM – 8-Gbyte capacity. • Status: Functional – basic router
performance matches current NetFPGA • Lots of logic for additional functions • Russ Tessier (tessier@ecs.umass.edu)
http://keb302.ecs.umass.edu/de4web/DE4_NetFPGA/
UMassAmherst
This provides a template for all NetFPGA 1G projects
Berlin – November 10th, 2011
FPGA
Memory
1GE
1GE
1GE
1GE
Enhancing Modular Reference Designs
PCI
CPU Memory
NetFPGA Driver
Java GUI Front Panel (Extensible)
PW-OSPF
In Q Mgmt
IP Lookup
L2 Parse
L3 Parse
Out Q Mgmt
1GE
1GE
1GE
1GE Verilog modules interconnected by FIFO interfaces
My Block
Verilog EDA Tools
(Xilinx, Mentor, etc.)
1. Design 2. Simulate 3. Synthesize 4. Download
Berlin – November 10th, 2011
FPGA
Memory
1GE
1GE
1GE
1GE
Creating new systems
PCI
CPU Memory
NetFPGA Driver
1GE
1GE
1GE
1GE
My Design
(1GE MAC is soft/replaceable)
Verilog EDA Tools
(Xilinx, Mentor, etc.)
1. Design 2. Simulate 3. Synthesize 4. Download
Berlin – November 10th, 2011 74
NetThreads, NetThreads-RE, NetTM
Martin Labrecque Gregory Steffan
ECE Dept.
Geoff Salmon Monia Ghobadi Yashar Ganjali
CS Dept. U. of Toronto
•Efficient multithreaded design –Parallel threads deliver performance •System Features –System is easy to program in C –Time to results is very short
Berlin – November 10th, 2011 75
FPGA
Soft processors: processors in the FPGA fabric User uploads program to soft processor Easier to program software than hardware in the FPGA Could be customized at the instruction level
Processor(s) DDR controller
Ethernet MAC
Soft Processors in FPGAs
Berlin – November 10th, 2011
NetThreads
NetThreads, NetThreads-RE & NetTM available with supporting software at:
http://www.netfpga.org/foswiki/bin/view/NetFPGA/OneGig/NetThreads http://www.netfpga.org/foswiki/bin/view/NetFPGA/OneGig/NetThreadsRE
http://netfpga.org/foswiki/bin/view/NetFPGA/OneGig/NetTM
Martin Labrecque martinL@eecg.utoronto.ca
Berlin – November 10th, 2011
Section VI: What to do next?
Berlin – November 10th, 2011
To get started with your project
1. Get familiar with hardware description language
2. Prepare for your project
b) Get a hands-on tutorial
a) Learn NetFPGA by yourself
Berlin – November 10th, 2011
Learn by yourself
Users Guide
NetFPGA website (www.netfpga.org)
Berlin – November 10th, 2011
Learn by yourself
Developers Guide
NetFPGA website (www.netfpga.org)
Forums
Berlin – November 10th, 2011
Support for NetFPGA enhancements provided by
Learn by Yourself
Online tutor – coming soon!
Berlin – November 10th, 2011
Get a hands-on tutorial
Stanford
Cambridge
Berlin – November 10th, 2011
Get a hands-on tutorial
Events
NetFPGA website (www.netfpga.org)
Berlin – November 10th, 2011
Section VII: Conclusion
Berlin – November 10th, 2011
Conclusions
• NetFPGA Provides – Open-source, hardware-accelerated Packet
Processing – Modular interfaces arranged in reference pipeline – Extensible platform for packet processing
• NetFPGA Reference Code Provides – Large library of core packet processing functions – Scripts and GUIs for simulation and system operation – Set of Projects for download from repository
• The NetFPGA Base Code – Well defined functionality defined by regression tests – Function of the projects documented in the Wiki Guide
Berlin – November 10th, 2011
Nick McKeown, Glen Gibb, Jad Naous, David Erickson, G. Adam Covington, John W. Lockwood, Jianying Luo, Brandon Heller, Paul
Hartke, Neda Beheshti, Sara Bolouki, James Zeng, Jonathan Ellithorpe, Sachidanandan Sambandan, Eric Lo
Acknowledgments NetFPGA Team at Stanford University (Past and Present):
NetFPGA Team at University of Cambridge (Past and Present): Andrew Moore, Shahbaz Muhammad, David Miller, Martin Zadnik
All Community members (including but not limited to):
Paul Rodman, Kumar Sanghvi, Wojciech A. Koszek, Yahsar Ganjali, Martin Labrecque, Jeff Shafer,
Eric Keller , Tatsuya Yabe, Bilal Anwer, Yashar Ganjali, Martin Labrecque
Kees Vissers, Michaela Blott, Shep Siegel
Berlin – November 10th, 2011
Special thanks to our Partners:
Other NetFPGA Tutorial Presented At:
SIGMETRICS
Ram Subramanian, Patrick Lysaght, Veena Kumar, Paul Hartke, Anna Acevedo
Xilinx University Program (XUP)
See: http://NetFPGA.org/tutorials/
Berlin – November 10th, 2011
Thanks to our Sponsors: • Support for the NetFPGA project has been provided by
the following companies and institutions Disclaimer: Any opinions, findings, conclusions, or recommendations expressed in these
materials do not necessarily reflect the views of the National Science Foundation or of any other sponsors supporting this project.