Eric Keller Oral General Exam 5/5/08 Multi-Level Architecture for Data Plane Virtualization.
Eric Keller
Oral General Exam
5/5/08
Multi-Level Architecture for Data Plane Virtualization
The Internet (and IP)
• Usage of Internet continuously evolving
• The way packets forwarded hasn’t (IP)
– Meant for communication between machines
– Address tied to fixed location
– Hierarchical addressing
– Best-effort delivery
– Addresses easy to spoof
• Great innovation at the edge (Skype/VoIP, BitTorrent)
– Programmability of hosts at application layer
– Can’t add any functionality into network
Proposed Modifications
• Many proposals to modify some aspect of IP
– No single one is best
– Difficult to deploy
• Publish/Subscribe mechanism for objects
– Instead of routing on machine address, route on object ID
– e.g. DONA (Data-Oriented Network Architecture), scalable simulation
• Route through intermediary points
– Instead of communication between machines
– e.g. i3 (Internet Indirection Infrastructure), DOA (Delegation-Oriented Architecture)
• Flat addressing to separate location from ID
– Instead of hierarchical based on location
– e.g. ROFL (Routing on Flat Labels), SEIZE (scalable and efficient, zero-configuration enterprise)
Challenges
• Want to innovate in the network
– Can’t because networks are closed
• Need to lower barrier for who innovates
– Allow individuals to create a network and define its functionality
• Virtualization as a possible solution
– For both network of future and overlay networks
– Programmable and sharable
– Examples: PlanetLab, VINI
Network Virtualization
• Running multiple virtual networks at the same time over a shared physical infrastructure
– Each virtual network composed of virtual routers having custom functionality
[Figure legend: physical machine; virtual router; virtual network (e.g. blue virtual routers plus blue links)]
Virtual Network Tradeoffs
• Goal: Enable custom data planes per virtual network
– Challenge: How to create the shared network nodes
• Programmability
– How easy is it to add new functionality?
– What is the range of new functionality that can be added?
– Does it extend beyond “software routers”?
• Isolation
– Does resource usage by one virtual network have an effect on others? Faults?
– How secure is it given a shared substrate?
• Performance
– How much overhead is there for sharing?
– What is the forwarding rate? Throughput? Latency?
Virtual Network Tradeoffs
• Network Containers
– Duplicate stack or data structures
– e.g. Trellis, OpenVZ, Logical Router
• Extensible Routers
– Assemble custom routers from common functions
– e.g. Click, Router Plug Ins, Scout
• Virtual Machines + Click
– Run operating system on top of another operating system
– e.g. Xen, PL-VINI (Linux-VServer)
[Figure: each approach is rated on Programmability, Isolation, and Performance]
Outline
• Architecture
• Implementation
– Virtualizing Kernel
– Challenges with kernel execution
– Extending beyond commodity hardware
• Evaluation
• Conclusion/Future Work
User Experience (Creating a virtual network)
• Custom functionality
– Custom user environment on each node (for controlling virtual router)
– Specify single node’s packet handling as graph of common functions
• Isolated from others sharing same node
– Allocated share of resources (e.g. CPU, memory, bandwidth)
– Protected from faults in others (e.g. another virtual router crashing)
• Highest performance possible
[Figure: example virtual network with nodes A1 to A5. For example, a node's user control environment (determine shortest path, populate routing tables) uses a config/query interface to a packet-processing graph (from devices; check header, destination lookup; to devices)]
Lightweight Virtualization
• Combine graphs into single graph
– Provides lightweight virtualization
• Add extra packet processing (e.g. mux/demux)
– Needed to direct packets to the correct graph
• Add resource accounting
[Figure: Graph 1 and Graph 2 are combined into a master graph; inside the master graph, Graph 1 and Graph 2 sit between the input port and the output port]
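The combine step can be sketched in a few lines of Python. This is only an illustration under assumptions, not the tool's actual implementation: the edge-list representation, the `in`/`out` placeholder names, and the `demux`/`mux` node names are all hypothetical.

```python
def combine(graphs):
    """Merge per-network graphs into one master graph.

    graphs maps a network name to a list of (src, dst) edges. Each graph
    uses the placeholder "in" for its entry and "out" for its exit
    (hypothetical convention for this sketch).
    """
    # Shared front end and back end of the master graph.
    master = [("input_port", "demux"), ("mux", "output_port")]
    for net, edges in graphs.items():
        for src, dst in edges:
            # Prefix element names so the two networks cannot collide.
            s = "demux" if src == "in" else f"{net}/{src}"
            d = "mux" if dst == "out" else f"{net}/{dst}"
            master.append((s, d))
    return master


graph1 = [("in", "CheckHeader"), ("CheckHeader", "Lookup"), ("Lookup", "out")]
graph2 = [("in", "Counter"), ("Counter", "out")]
master = combine({"net1": graph1, "net2": graph2})
for edge in master:
    print(edge)
```

The demux node stands in for whatever logic directs each packet to the correct network's graph; resource accounting is left out of the sketch.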
Increasing Performance and Isolation
• Partition into multiple graphs across multiple targets
– Each target with different capabilities (performance, programmability, isolation)
– Add connectivity between targets
– Unified run-time interface (it appears as a single graph), to query and configure the forwarding capabilities
[Figure: Graph 1 and Graph 2 are combined into a master graph, which is partitioned into Target0, Target1, and Target2 graphs]
Examples of Multi-Level
• Fast Path/Slow Path
– IPv4: forwarding in fast path, exceptions in slow path
– i3: Chord ring lookup function in fast path, handling requests in slow path
• Preprocessing
– IPSec: do encryption/decryption in HW, rest in SW
• Offloading
– TCP Offload
– TCP Splicing
• Pipeline of coarse grain services
– e.g. transcoding, firewall
– SoftRouter from Bell Labs
Implementation
• Each network has custom functionality
– Specified as graph of common functions
– Click modular router
• Each network allocated share of resources
– e.g. CPU
– Linux-VServer: single resource accounting for both control and packet processing
• Each network protected from faults in others
– Library of elements considered safe
– Container for unsafe elements
• Highest performance possible
– FPGA for modules with HW option, kernel for modules without
Click Background: Overview
• Software architecture for building flexible and configurable routers
– Widely used, commercially and in research
– Easy to use, flexible, high performance (missing sharable)
• Routers assembled from packet processing modules (Elements)
– Simple and Complex
• Processing is directed graph
• Includes a scheduler
– Schedules tasks (a series of elements)
[Figure: example graph with elements FromDevice(eth0), Counter, and Discard]
Linux-VServer
Linux-VServer + Click + NetFPGA
[Figure labels: Click; Coordinating Process; Installer (three instances); click (three instances); Click on NetFPGA]
Virtual Kernel Mode Click
• Want to run in Kernel mode
– Close to 10x higher performance than user mode
• Use library of ‘safe’ elements
– Since Kernel is shared execution space
• Need resource accounting
– Click scheduler does not do resource accounting
– Want resource accounting system-wide (i.e. not just inside of packet processing)
Resource Accounting with VServer
• Purpose of Resource Accounting
– Provides isolation between virtual networks
• Unified resource accounting
– For packet processing and control
• VServer’s Token Bucket Extension to Linux Scheduler
– Controls eligibility of processes/threads to run
• Integrating with Click
– Each individual Click configuration assigned to its own thread
– Each thread associated with VServer context
– Basic mechanism is to manipulate the task_struct
Unyielding Threads
• Linux kernel threads are cooperative (i.e. must yield)
– Token scheduler controls when eligible to start
• Single long task can have short term disruptions
– Affecting delay and jitter on other virtual networks
• Token bucket does not go negative
– Long term, a virtual network can get more than its share
[Figure: token bucket with tokens added at rate A, bucket size S, minimum tokens to execute M, and tokens consumed at 1 per scheduler tick]
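A small simulation can illustrate why a bucket that never goes negative lets an unyielding thread exceed its share. This is a simplified model, not the VServer scheduler itself: the parameter values (rate A = 0.15 tokens per tick, M = 10, S = 100) and the function name `cpu_share` are assumptions chosen for illustration.

```python
def cpu_share(burst_ticks, rate=0.15, min_tokens=10, bucket_size=100,
              total_ticks=100_000):
    """Fraction of ticks used by a thread that runs burst_ticks per start.

    rate, min_tokens and bucket_size play the roles of A, M and S.
    The thread becomes eligible once it holds min_tokens, then runs for
    burst_ticks without yielding; each tick costs one token, but the
    bucket is floored at zero instead of going negative.
    """
    tokens = 0.0
    remaining = 0  # ticks left in the current unyielding run
    used = 0
    for _ in range(total_ticks):
        tokens = min(bucket_size, tokens + rate)
        if remaining == 0 and tokens >= min_tokens:
            remaining = burst_ticks
        if remaining > 0:
            used += 1
            tokens = max(0.0, tokens - 1)  # the debt is forgotten here
            remaining -= 1
    return used / total_ticks


short_share = cpu_share(burst_ticks=1)     # yields after every tick
long_share = cpu_share(burst_ticks=1000)   # holds the CPU for 1000 ticks
print(f"short tasks: {short_share:.3f}, long tasks: {long_share:.3f}")
```

In this model the thread that yields every tick stays close to the configured rate, while the thread with long runs ends up far above it, because the tokens it "owes" are never charged.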
Unyielding Threads (solution)
• Determine maximum allowable execution time
– e.g. from token bucket parameters, network guarantees
• Determine pipeline’s execution time
– Elements from library have known execution times
– Custom elements’ execution times are unknown
• Break pipeline up (for known)
• Execute inside of container (for unknown)
[Figure: pipeline elem1, elem2, elem3 shown three times; labels include FromKern and ToUser]
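The "break pipeline up" step can be sketched as a greedy grouping of consecutive elements so that no task exceeds the allowed execution time. This is a minimal sketch under assumptions: the function name `split_pipeline`, the 1500-cycle limit, and the per-element cycle counts are illustrative placeholders, and elements with unknown cost are simply rejected (they would run in a container instead).

```python
def split_pipeline(elements, max_cycles):
    """Group consecutive elements into tasks of at most max_cycles each.

    elements is a list of (name, cycles); cycles is None when the
    execution time is unknown (e.g. a custom element).
    """
    tasks, current, used = [], [], 0
    for name, cycles in elements:
        if cycles is None or cycles > max_cycles:
            raise ValueError(f"{name}: unknown or too long, run in a container")
        if current and used + cycles > max_cycles:
            tasks.append(current)      # close the current task here
            current, used = [], 0
        current.append(name)
        used += cycles
    if current:
        tasks.append(current)
    return tasks


pipeline = [("CheckLength", 400), ("RadixIPLookup", 1000),
            ("Counter", 700), ("HashSwitch", 450)]
tasks = split_pipeline(pipeline, max_cycles=1500)
print(tasks)
```

Each inner list is one schedulable task; the scheduler gets a chance to run another virtual network between tasks.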
Custom Elements Written in C++
• Elements have access to global state
– Kernel state/functions
– Click global state
• Could…
– Pre-compile in user mode
– Pre-compile with restricted header files
• Not perfect:
– With C++, you can manipulate pointers
• Instead, custom elements are unknown (“unsafe”)
– Execute in container in user space
Extending beyond commodity HW
• PC + Programmable NIC (e.g. NetFPGA)
– FPGA on PCI card
– 4 GigE ports
– On board SRAM and DRAM
• Jon Turner’s “Pool of Processing Elements” with crossbar
– PEs can be GPP, NPU, FPGA
– Switch Fabric = Crossbar
• Partition between FPGA and Software
– Generalize: Partition among PEs
[Figure: line cards LC1, LC2, ..., LCn and processing engines PE1, PE2, ..., PEm connected by a switch fabric]
FPGA Click
• Two previous approaches
– Cliff: Click graph to Verilog, standard interface on modules
– CUSP: Optimize Click graph by parallelizing internal statements
• Our approach:
– Build on Cliff by integrating FPGAs into Click (the tool)
• Software Analogies
– Connection to outside environment
– Packet Transfer
– Element specification and implementation
– Run-time querying and configuration
– Memory
– Notifiers
– Annotations
[Figure: example graph with elements FromDevice(eth0), Element(LEN 5), and ToDevice(eth0)]
Experimental Evaluation
• Is multi-level the right approach?
– i.e. is it worth effort to support kernel and FPGA
– Does programmability imply less performance?
• What is the overhead of virtualization?
– From container: when you need to go to user space
– From using multiple threads: when running in kernel
• Are the virtual networks isolated in terms of resource usage?
– What is the maximum short-term disruption from unyielding threads?
– How long can a task run without leading to long-term unfairness?
Setup
• PC3000 on Emulab: 3 GHz, 2 GB RAM
[Figure: test topology with nodes n0, n1, n2, n3 and router rtr. Annotations:
– Generates packets from n0 to n1, tagged with time
– Receives packets, diffs the current time and packet time (and stores avg in mem)
– rtr: the router under test (Linux or a Click config)
– Modify header (IP and ETH) to be from n1 to n2]
Is multi-level the right approach?
[Figure: bar chart of peak send rate (64-byte pkts/sec, axis 0 to 700,000) for FPGA, Click-Kernel, Linux, and Click-User]
• Performance benefit going from user to kernel, and kernel to FPGA
• Programmability imply less performance?
– Not sacrificing performance by introducing programmability
What is the overhead of virtualization? From container
• When you must go to user space, what is the cost of executing in a container?
• Overhead of executing in a VServer is minimal
[Figure: bar chart of peak rate (64-byte pkts/sec, axis 0 to 45,000) for Click in container vs. Click not in container]
What is the overhead of virtualization? From using multiple threads
• Put same Click graph in each thread
• Round robin traffic between them
[Figure: elements PollDevice, RoundRobin, one 4portRouter (compound element) per thread, and ToDevice; each thread runs X tasks per yield]
How long to run before yielding
[Figure: peak send rate (64-byte pkts/sec, axis 0 to 160,000) vs. tasks per yield (1 to 100,000, log scale), with one series each for 2, 4, and 10 virtual networks]
• # tasks per yield:
– Low => high context switching, I/O executes often
– High => low context switching, I/O executes infrequently
What is the overhead of virtualization? From using multiple threads
• Given sweet spot for each # of virtual networks
– Increasing number of virtual networks from 1 to 10 does not hurt aggregate performance significantly
• Alternatives to consider
– Single threaded with VServer
– Single threaded, modify Click to do resource accounting
– Integrate polling into threads
[Figure: for 1, 2, 4, 6, 8, and 10 virtual networks, bars show peak send rate (64-byte pkts/sec, axis 0 to 180,000) and lines show tasks per yield (axis 0 to 70)]
What is the maximum short-term disruption from unyielding threads?
• Profile of (some) Elements
– Standard N port router example: ~5400 cycles (1.8 µs)
– RadixIPLookup (167k entries): ~1000 cycles
– Simple Elements: CheckLength ~400 cycles, Counter ~700 cycles, HashSwitch ~450 cycles
• Maximum disruption is length of longest task
– Possible to break up pipelines
[Figure: measurement graph with elements InfiniteSource, RoundTripCycleCount, Elem, and DiscardNoFree]
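The cycle counts translate into wall-clock disruption once a clock rate is fixed. The sketch below assumes the 3 GHz clock of the PC3000 machines from the setup slide; the helper name `cycles_to_us` is hypothetical.

```python
CPU_HZ = 3e9  # assumption: 3 GHz clock, as on the PC3000 test machines


def cycles_to_us(cycles, hz=CPU_HZ):
    """Convert a cycle count to microseconds at the given clock rate."""
    return cycles / hz * 1e6


# Approximate cycle counts taken from the element profile.
profile = {
    "N-port router": 5400,
    "RadixIPLookup": 1000,
    "Counter": 700,
    "HashSwitch": 450,
    "CheckLength": 400,
}
times_us = {name: cycles_to_us(c) for name, c in profile.items()}
for name, us in times_us.items():
    print(f"{name}: {us:.2f} us")
```

At this clock rate the 5400-cycle router example corresponds to the 1.8 µs quoted in the profile.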
How long can a task run without leading to long-term unfairness?
[Figure: two networks, each with InfiniteSource, 4portRouter (compound element), and Discard; labels: Chewy, Count cycles, Limited to 15%]
How long can a task run without leading to long-term unfairness?
• Tasks longer than 1 token can lead to unfairness
• Run long executing elements in user-space
– Performance overhead of user-space is not as big of an issue
[Figure: fraction of CPU for the network with Chewy (axis 0 to 0.8) vs. Chewy load (0 to 900); a zoomed-in panel (fraction 0 to 0.2, load 0 to 50) is annotated “~10k extra cycles / task”]
Conclusion
• Goal: Enable custom data planes per virtual network
• Tradeoffs
– Performance
– Isolation
– Programmability
• Built a multi-level version of Click
– FPGA
– Kernel
– Container
Future Work
• Scheduler
– Investigate alternatives to improve efficiency
• Safety
– Process to certify element as safe (can it be automated?)
• Applications
– Deploy on VINI testbed
– Virtual router migration
• HW/SW Codesign Problem
– Partition decision making
– Specification of elements (G language)
Questions
Backup
Signs of Openness
• There are signs that network owners and equipment providers are opening up
• Peer-to-peer and network provider collaboration
– Allowing intelligent selection of peers
– e.g. Pando/Verizon (P4P), BitTorrent/Comcast
• Router Vendor API: allowing creation of software to run on routers
– e.g. Juniper PSDP, Cisco AXP
• Cheap and easy access to compute power
– Define functionality and communication between machines
– e.g. Amazon EC2, Sun Grid
Example 1: User/Kernel Partition
• Execute “unsafe” elements in container
– Add communication elements
[Figure: a graph of safe elements (s1, s2, s3) and an unsafe element (u1) is split so that s1, s2, s3 stay in the kernel and u1 runs in a user-space container; communication elements ToUser (tu), FromKernel (fk), ToKernel (tk), and FromUser (fu) connect the two sides]
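The partition step can be sketched as follows. This is an illustration under assumptions rather than the actual tool: the edge-list representation, the function name `partition`, and the wiring s1, s2, u1, s3 are hypothetical, and a graph with several crossings would need uniquely named communication elements.

```python
def partition(edges, unsafe):
    """Split a graph into a kernel part and a user-space (container) part.

    Every edge that crosses the boundary is replaced by a pair of
    communication elements: ToUser/FromKernel going up, and
    ToKernel/FromUser coming back down.
    """
    kernel, user = [], []
    for src, dst in edges:
        src_user, dst_user = src in unsafe, dst in unsafe
        if not src_user and not dst_user:
            kernel.append((src, dst))
        elif src_user and dst_user:
            user.append((src, dst))
        elif dst_user:                       # kernel -> user crossing
            kernel.append((src, "ToUser"))
            user.append(("FromKernel", dst))
        else:                                # user -> kernel crossing
            user.append((src, "ToKernel"))
            kernel.append(("FromUser", dst))
    return kernel, user


# Hypothetical wiring: s1 -> s2 -> u1 -> s3, with u1 as the unsafe element.
edges = [("s1", "s2"), ("s2", "u1"), ("u1", "s3")]
kernel, user = partition(edges, unsafe={"u1"})
print("kernel:", kernel)
print("user:  ", user)
```

The same idea generalizes to any pair of targets: only the names of the communication elements inserted at the boundary change.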
Example 2: Non-Commodity HW
• PC + Programmable NIC (e.g. NetFPGA)
– FPGA on PCI card
– 4 GigE ports
– On board SRAM and DRAM
• Jon Turner’s “Pool of Processing Elements” with crossbar
– PEs can be GPP, NPU, FPGA
– Switch Fabric = Crossbar
• Partition between FPGA and Software
– Generalize: Partition among PEs
[Figure: line cards LC1, LC2, ..., LCn and processing engines PE1, PE2, ..., PEm connected by a switch fabric]
Example 2: Non-Commodity HW
• Redrawing the picture for FPGA/SW…
– Elements can have HW implementation, SW implementation, or both (choose one)
[Figure: a graph of hardware elements (hw1, hw2, hw3) and a software element (sw1) is split so that hw1, hw2, hw3 run on the FPGA and sw1 runs in software; communication elements ToCPU (tc), FromDevice (fd), ToDevice (td), and FromCPU (fc) connect the two sides]
Connection to outside environment
• In Linux, the “Board” is set of devices (e.g. eth0)
– Can query Linux for what’s available
– Network driver (to read/write packets)
– Inter process communication (for comm with handlers)
• FPGA is a chip on a board
– Using “eth0” needs pins to connect to, and some on-chip logic (in form of IP Core)
• Board API
– Specify available devices
– Specify size of address block (used by char driver)
– Provide elaborate() function, which generates a top level Verilog module and a UCF file (pin assignments)
Packet Transfer
• In software it is a function call
• In FPGA use a pipeline of elements with a standard interface
• Option 1: Stream packet through, 1 word at a time
– Could just be the header
– Push/Pull a bit tricky
• Option 2: Pass pointer
– But would have to go to memory (inefficient)
[Figure: Element1 connected to Element2 by signals data, ctrl, valid, and ready]
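Option 1 can be pictured with a cycle-by-cycle model of the streaming interface. The handshake rule used here (a word moves only on a cycle where valid and ready are both asserted) is a common convention and an assumption about this design, not something the slides state; the function name and sample words are hypothetical.

```python
def stream_words(words, ready_per_cycle):
    """Move words from Element1 to Element2, at most one word per cycle.

    A word is transferred only on a cycle where the sender asserts valid
    (it still has data) and the receiver asserts ready.
    """
    received = []
    next_word = 0
    for ready in ready_per_cycle:
        valid = next_word < len(words)
        if valid and ready:
            received.append(words[next_word])
            next_word += 1
    return received


header = [0x4500, 0x0054, 0x0000, 0x4000]
# The receiver stalls on the second and fifth cycles.
got = stream_words(header, [True, False, True, True, False, True])
print([hex(w) for w in got])
```

A stalled receiver only delays the transfer; no word is lost or duplicated, which is what lets elements with different processing rates be chained in one pipeline.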
Element specification and implementation
• Need
– Meta-data
– Specify packet processing
– Specify run-time querying handling (next slide)
• Meta-data
– Use Click C++ API
– Ports
– Registers to use specific devices, e.g. FromDevice(eth0) registers to use eth0
• Packet Processing
– Use C++ to print out Verilog, specialized based on instantiation parameters (config. string)
– Standard interface for packet
– Standard interface for handler (currently memory mapped register)
Run-time querying and configuration
• Query state and update configuration in elements
– e.g. “add ADDR/MASK [GW] OUT”
• When Creating Element
– Request Addr Block
– Specify software handlers
– Uses read/write methods to get data
• Allocating Addresses
– Given total size, and size of each element’s requested block
• Generating Decode Logic
[Figure: telnet and click in user space, a char driver in the kernel, and across PCI on the FPGA a decode block in front of elem1, elem2, elem3]
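Address allocation and decoding can be sketched in software. This is a simplified model under assumptions: blocks are packed back to back in request order, whereas real decode logic may align blocks (for example to powers of two) to keep the hardware simple; the function names and block sizes are hypothetical.

```python
def allocate(requests, total_size):
    """Assign each element a base address inside the shared address block.

    requests is a list of (element, size); returns {element: (base, size)}.
    """
    table, base = {}, 0
    for name, size in requests:
        if base + size > total_size:
            raise ValueError("address space exhausted")
        table[name] = (base, size)
        base += size
    return table


def decode(table, addr):
    """Map an address to (element, offset), as the decode logic would."""
    for name, (base, size) in table.items():
        if base <= addr < base + size:
            return name, addr - base
    raise KeyError(addr)


table = allocate([("elem1", 16), ("elem2", 64), ("elem3", 4)], total_size=256)
print(table)
print(decode(table, 20))
```

A read or write arriving from the char driver would be routed by `decode` to one element's handler, with the offset selecting a register inside that element.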
Memory
• In software
– malloc
– static arrays
– Share table through global variables or passing pointer
– Elements that do no packet processing (passed as configuration string to elements)
• In FPGA
– Elements have local memory (registers/BRAM)
– Unshared (off-chip) memories: treat like a device
– Shared (off-chip) global memories (unimplemented): globally shared vs. shared between subset of elements
– Elements that do no packet processing (unimplemented)
Notifiers, Annotations
• Notifiers
– Element registers as listener or notifier
– In FPGA, create extra signal(s) from notifier to listener
• Annotations
– Extra space in Packet data structure
– Used to mark packet with info not in packet, e.g. which input port packet arrived in, result of lookup
– In software, fixed byte array
– In FPGA, packet is streamed through, so adding extra bytes is simple
User/Kernel Communication
• Add communication elements
– Use mknod for each direction
– ToUser/FromUser store packets and provide file functions
– ToKernel/FromKernel use file I/O
[Figure: safe elements (s1, s2, s3) in the kernel and unsafe element (u1) in a user-space container, connected by ToUser (tu), FromKernel (fk), ToKernel (tk), and FromUser (fu)]
FPGA/Software Communication
• Add communication elements
– ToCPU/FromCPU uses device that communicates with Linux over PCI bus
– Network driver in Linux
– ToDevice/FromDevice: standard Click element
[Figure: hardware elements (hw1, hw2, hw3) on the FPGA and software element (sw1) in software, connected by ToCPU (tc), FromDevice (fd), ToDevice (td), and FromCPU (fc)]