Post on 30-Dec-2015
N E T G R O U P • P O L I T E C N I C O D I T O R I N O
Towards Effective Portability of Packet Handling Applications Across Heterogeneous Hardware Platforms
Fulvio Risso (fulvio.risso@polito.it)
A typical packet-based application
[Figure: a packet-based application. Network packets enter a set of very common pieces of (simple) logic related to packet processing (packet filtering, classification, pattern matching, connection tracking, ...); on top of them sits the higher-level processing logic, which produces a generic output.]
The problem
• Many different kinds of processing
– may even need to be updated in real time
• Processing efficiency issues
– we need to optimize these components and exploit dedicated hardware components when available
• Examples of processing requested by applications
– Count traffic (bytes) according to the “ip.source” field of each packet
– Count traffic (bytes) belonging to the following protocols: IP, IPv6, TCP, UDP
– Extract the value of field tcp.seqnumber from TCP packets
– Capture UDP packets whose udp.sport == 53
[Figure: three applications, each issuing its own processing request, all fed by the same network packets.]
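As a rough illustration of the first request on this slide (counting bytes per “ip.source”), the sketch below does the job in plain Python. It is not NetVM code: the function and helper names are hypothetical, and it assumes untagged Ethernet II frames carrying IPv4, so the source address sits 12 bytes into the IP header.

```python
import struct
from collections import Counter

ETH_HEADER_LEN = 14                    # assumption: Ethernet II, no VLAN tag
ETHERTYPE_IPV4 = 0x0800
IP_SRC_OFFSET = ETH_HEADER_LEN + 12    # source address inside the IPv4 header


def count_bytes_by_ip_source(frames):
    """Return a Counter mapping dotted-quad source address -> total frame bytes."""
    totals = Counter()
    for frame in frames:
        if len(frame) < IP_SRC_OFFSET + 4:
            continue                   # too short to hold an IPv4 source address
        (ethertype,) = struct.unpack_from("!H", frame, 12)
        if ethertype != ETHERTYPE_IPV4:
            continue                   # not IPv4: ignore
        src = ".".join(str(b) for b in frame[IP_SRC_OFFSET:IP_SRC_OFFSET + 4])
        totals[src] += len(frame)
    return totals


def make_frame(ethertype, src_ip, payload_len):
    """Hypothetical test helper: a frame with only ethertype and IP source set."""
    eth = bytes(12) + struct.pack("!H", ethertype)
    ip = bytes(12) + bytes(src_ip) + bytes(4)
    return eth + ip + bytes(payload_len)
```

Every application that needs this kind of counter ends up rewriting a loop like this one, which is the duplication the slide points at.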
The (proposed) solution
• The solution
– NetVM (Network Virtual Machine)
• Packet handling programming
– Architecture
– Instruction set
– Programming language
NetVM : JavaVM = IXP2400 : Pentium
Implementing the NetVM ... in user (application) space
[Figure: packets travel from the hardware up to three user-level applications, which request the four example tasks (count bytes per “ip.source”; count bytes per protocol among IP, IPv6, TCP, UDP; extract tcp.seqnumber from TCP packets; capture UDP packets whose udp.sport == 53). The packet-processing block is placed in user (application) space.]
Implementing the NetVM ... in kernel space
[Figure: packets travel from the hardware up to three user-level applications, which request the four example tasks (count bytes per “ip.source”; count bytes per protocol among IP, IPv6, TCP, UDP; extract tcp.seqnumber from TCP packets; capture UDP packets whose udp.sport == 53). The packet-processing block is placed in kernel space.]
Implementing the NetVM ... in hardware
[Figure: packets travel from the hardware (e.g. NIC) up to three user-level applications, which request the four example tasks (count bytes per “ip.source”; count bytes per protocol among IP, IPv6, TCP, UDP; extract tcp.seqnumber from TCP packets; capture UDP packets whose udp.sport == 53). The packet-processing block is placed in hardware.]
The NetVM framework
[Figure: the NetVM tool chain, from the higher-level code (or a packet processing library) down to the (hardware) platform that handles the packets.]
• Higher-level code: define the processing through a high-level language
• Compiler: create program in “intermediate” assembler (NetVM assembler)
• NetVM JIT compiler: create native program for the target hardware platform
– Native code for general-purpose CPU (e.g. x86)
– Native code for network processor (e.g. IXP 2400)
– VHDL code for reprogrammable ASICs
Another Hourglass Model
[Figure: an hourglass with the NetVM at its waist.]
• Above the NetVM (high-level networking interface): NAT, firewall, NIDS, traffic monitor, packet capture, L4/7 switches, router access list, and new applications (Content Delivery Networks, Active Networking)
• Below the NetVM (low-level networking interface): generic hardware (e.g., PC), specialized network hardware (vendor X), specialized network hardware (vendor Y)
Properties
• Optimized to operate on network packets
• Lightweight
• Efficient execution on
– Network processors
– Systems with custom hardware
– User programs can benefit from hw resources thanks to the JIT compilation
Benefits
• Fast network program development
• Application portability
• Hardware implementation
NetVM Architecture
[Figure: the NetVM. Network packets arrive on an input socket and reach the input port of PE1 (e.g. filtering); the output port of PE1 is connected to the input port of PE2 (e.g. classification), whose output port leads to an output socket that delivers packets/other info. Each PE runs NetIL bytecode on a local PU with local memory. The figure also shows an IP reassembly coprocessor, a TCP reassembly coprocessor, a connection tracking coprocessor, the shared memory, the data plane exchange buffer pool and the control plane (API).]
Processing Element Architecture
[Figure: NetPE internals. The local processing unit (with PC, the program counter, and SP, the stack pointer), the evaluation stack, the local variables pool, the code memory, the data memory, the port table and the read-only registers are attached to the NetPE internal communication bus, together with the current exchange buffer and the shared memory.]
Config. registers:
– CML: code memory length
– DML: data memory length
– EBL: exchange buffer length
– PTL: port table length
Processing Element Interconnection
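[Figure: two examples of NetPE interconnection between an input and an output. In the first, NetPE1 (e.g. filtering) is chained to NetPE2 (e.g. classification). The second involves NetPE1 (e.g. IP stats), NetPE2 (e.g. TCP stats) and NetPE3 (e.g. UDP stats), with ports Port1 and Port2 and outputs Output1 and Output2.]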
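The chaining idea (one NetPE pushing packets to the input port of the next) can be mimicked in a few lines of Python. This is only a sketch under stated assumptions: the `NetPE` class and its `connect`, `send` and `push` methods are hypothetical names invented for the example, not the NetVM API, and the two handlers assume untagged Ethernet II frames carrying IPv4.

```python
import struct
from collections import Counter


class NetPE:
    """Hypothetical processing element: a handler plus named output ports."""

    def __init__(self, name, handler):
        self.name = name
        self.handler = handler     # handler(pe, frame) decides what to emit
        self.outputs = {}          # port name -> list of downstream callables

    def connect(self, port, downstream):
        """Attach a downstream callable (e.g. another PE's push) to a port."""
        self.outputs.setdefault(port, []).append(downstream)

    def send(self, port, frame):
        """Deliver the frame to everything connected to the given port."""
        for target in self.outputs.get(port, []):
            target(frame)

    def push(self, frame):
        """Entry point: data arrives on this PE's input port."""
        self.handler(self, frame)


protocol_counts = Counter()


def filtering(pe, frame):
    # Forward only IPv4 frames (ethertype 0x0800 at offset 12).
    if len(frame) >= 14 and struct.unpack_from("!H", frame, 12)[0] == 0x0800:
        pe.send("out1", frame)


def classification(pe, frame):
    # Count frames per IPv4 protocol number (byte 9 of the IP header).
    if len(frame) > 23:
        protocol_counts[frame[23]] += 1


pe1 = NetPE("NetPE1", filtering)
pe2 = NetPE("NetPE2", classification)
pe1.connect("out1", pe2.push)      # NetPE1 output port -> NetPE2 input port
```

Connecting several downstream callables to the same port gives a fan-out layout instead of a chain.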
Exchange Buffer
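[Figure: NetPE1 (e.g. field extraction) hands an exchange buffer over to NetPE2 (e.g. classification). The exchange buffer holds the packet buffer (packet data) together with the extracted field offsets:]
– IP.src: offset 26
– IP.dst: offset 30
– TCP.sport: offset 34
– TCP.dport: offset 36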
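The offsets on this slide match an untagged Ethernet II header (14 bytes) followed by an IPv4 header without options (20 bytes) and a TCP header. The sketch below reads the four fields at exactly those offsets; the function name and the dictionary layout are assumptions made for the example, and a frame with a VLAN tag or IP options would need different offsets.

```python
import ipaddress
import struct

# Offsets from the slide; valid only for Ethernet II + option-less IPv4 + TCP.
FIELD_OFFSETS = {
    "IP.src": (26, "4s"),      # 4 raw bytes
    "IP.dst": (30, "4s"),
    "TCP.sport": (34, "!H"),   # 16-bit unsigned, network byte order
    "TCP.dport": (36, "!H"),
}


def extract_fields(frame):
    """Return the four fields as a dict; addresses are dotted-quad strings."""
    fields = {}
    for name, (offset, fmt) in FIELD_OFFSETS.items():
        (value,) = struct.unpack_from(fmt, frame, offset)
        if fmt == "4s":
            value = str(ipaddress.IPv4Address(value))
        fields[name] = value
    return fields


# Build a dummy frame with only the four fields filled in.
frame = bytearray(54)
frame[26:30] = bytes([192, 168, 0, 1])
frame[30:34] = bytes([192, 168, 0, 2])
struct.pack_into("!HH", frame, 34, 1234, 80)
fields = extract_fields(frame)
```

Storing the offsets once lets a later stage such as classification read the fields without parsing the headers again.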
High Level Code
• NetPFL (Network Packet Filtering Language)
– Example: IPv4 filter
eth.type == 0x800 ReturnPacket on port 1
• Potentially, a C compiler can be created for generating NetVM code
Corresponding NetVM Code
; Push Port Handler
; triggered when data is present on a push-input port
segment .push
.locals 5
.maxstacksize 10
    pop                 ; pop the "calling" port ID
    push 12             ; push the location of the ethertype
    upload.16           ; load the ethertype field
    push 2048           ; push 0x800 (=IP)
    jcmp.eq send_pkt    ; compare the 2 topmost values; jump if true
    ret                 ; otherwise do nothing and return
send_pkt:
    pkt.send out1       ; send the packet to port out1
    ret                 ; return
ends
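To make the stack-based execution model concrete, here is a toy Python interpreter for the handler above. It is a sketch, not the NetVM interpreter: the tuple encoding of the program and the function name are invented for the example, the opcode semantics are inferred from the comments in the listing, and `upload.16` is assumed to read a 16-bit unsigned value in network byte order.

```python
import struct

# The handler from the slide, encoded as tuples (hypothetical encoding).
PROGRAM = [
    ("pop",),                   # pop the "calling" port ID
    ("push", 12),               # location of the ethertype
    ("upload.16",),             # load the ethertype field
    ("push", 2048),             # 0x800 (=IP)
    ("jcmp.eq", "send_pkt"),    # jump if the two topmost values are equal
    ("ret",),
    ("label", "send_pkt"),
    ("pkt.send", "out1"),
    ("ret",),
]


def run_push_handler(program, packet, calling_port=0):
    """Run the program once; return the list of ports the packet was sent to."""
    labels = {ins[1]: i for i, ins in enumerate(program) if ins[0] == "label"}
    stack = [calling_port]      # the handler starts with the port ID on the stack
    sent = []
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "pop":
            stack.pop()
        elif op == "push":
            stack.append(args[0])
        elif op == "upload.16":
            offset = stack.pop()
            stack.append(struct.unpack_from("!H", packet, offset)[0])
        elif op == "jcmp.eq":
            b, a = stack.pop(), stack.pop()
            if a == b:
                pc = labels[args[0]]
        elif op == "pkt.send":
            sent.append(args[0])
        elif op == "ret":
            break
        elif op == "label":
            pass                # labels are markers, not instructions
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return sent
```

Calling `run_push_handler(PROGRAM, frame)` on a frame whose bytes 12 and 13 are `08 00` returns `["out1"]`; any other ethertype returns an empty list. Frames shorter than 14 bytes raise `struct.error` in this sketch.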
Numerical results
• Filtering on “IP” packets
– Interpreted code (no JIT)
– PC Dual Xeon, 1GB RAM, 2GHz clock
• Promising performance
– Stack-based architecture (less efficient)
– NetVM interpreter not really optimized
– NetVM interpreter is not the “definitive” target platform

Berkeley Packet Filter (Win32)    NetPE
64 CPU ticks (1)                  392 CPU ticks (1)

(1) 3 BPF instructions against 7 NetIL instructions
Distributed Packet Processing
[Figure: a user application spreads packet-processing tasks over several network devices: Ethernet phone, router, ADSL modem, remote workstation, local workstation.]
Example tasks:
– Count IPv6 and IPv6-in-IPv4 packets
– Send an alarm when a SIP INVITE is received
– Reassemble all TCP sessions on port 8888 and look for the keyword “MP3” in them
– Capture PPPoE packets
– Get a summary of each TCP session
Some new ideas for discussion
• Is the NetVM suitable for implementing a complete packet-based application?
• Is the NetVM suitable for hiding the complexity of network processors?
Developing a complete application
NetVM goals:
• Portability
– Other technologies already offer a solution to this problem (e.g. Java, CLR)
• Performance
– What we need is something that allows very high performance on packet-processing code
Can we create complete, portable applications using the NetVM?
Developing a complete application
[Figure: the same packet-based application (packet filtering, classification, pattern matching, connection tracking, ..., plus the higher-level processing logic) drawn twice. The first drawing is labeled with both the NetVM and a general-purpose CPU; the second is labeled with the NetVM only.]
A note about high performance
• Is the NetVM suitable for hiding the complexity of network processors (or ASICs)?
– Network processors have different architectures, designed so that the last bit of performance can be squeezed out of them
– This is one of the reasons a large number of companies are still developing ASICs
– Would engineers fancy developing NetVM code?
• You cannot avoid some performance penalty with NetVM
The NetVM target
• Which is the most appropriate target for the NetVM?
– Packet capture with some basic (and customizable) packet processing
– Anything else?
• What about “complex” applications (e.g. firewall)?
– The current model cannot guarantee portability
• Should we stay with a “simple” NetPE model, or would a “service processor” model be better?
– Requires at least a C compiler
– Currently implementing Snort in the NetVM