TCPSplitter: A Reconfigurable Hardware Based TCP Flow Monitor
David V. Schuehler


Outline

• Motivation
• Target Platform
• Design
• Possible Applications
• Results
• Conclusion


MOTIVATION


Why work with TCP?

• Over 85% of Internet traffic is TCP based
• The Internet is growing
• TCP is a proven, reliable transport for data delivery
• Provides high-speed active networks the ability to work with TCP flows


Why not implement a full TCP stack in hardware?

• Complex protocol stack
• Several interactions on the client interface (sockets?)
• Difficult to achieve high performance
• Large memories required for reassembly
• Limited number of simultaneous connections


Solution

• Develop a TCP flow monitor: TCPSplitter
• Utilize existing hardware infrastructure (FPX)
• Expand upon the Layered Protocol Wrappers


TARGET PLATFORM


Configuration


Washington University Gigabit Switch


FPX Module


FPX Internal Structure

[Block diagram: the NID and RAD FPGAs sit between the switch and line card; the RAD connects to external SRAM and SDRAM and is programmed through the NID, with traffic carried over virtual circuits]

RAD: Reprogrammable Application Device
• Xilinx XCV1000E FPGA
• External SRAM/SDRAM
• Reprogrammable

NID: Network Interface Device
• XCV600E FPGA
• Controls FPX
• Programs RAD
• Forwards traffic


DESIGN


Goals

• High speed design
• Small FPGA footprint
• Simple client interface
• Support a large number of flows


Challenges

• Dealing with dropped frames
• Packet reordering
• Maintaining state for a large number of flows
• Developing an efficient implementation
• Processing data at line rates
• Minimizing resource requirements


Assumptions/Limitations

• All frames must flow through the switch
• Frames traversing in the opposite direction are handled as a separate flow (see the sketch below)
• In-order processing of frames for each flow
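Because each direction is its own flow, a classifier can derive a flow identifier directly from the unsorted 4-tuple. Here is a minimal C sketch of that idea; the actual classifier is FPGA logic, and the struct, hash function, and field names below are illustrative assumptions, not the design's:

```c
#include <stdint.h>

/* Illustrative flow key: source and destination are NOT sorted, so the
 * two directions of a TCP connection hash to different identifiers,
 * matching TCPSplitter's one-flow-per-direction model. */
struct flow_key {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
};

/* Toy hash reducing the key to an n_bits-wide flow identifier
 * (n_bits = 8 for the 256-flow configuration described later). */
uint32_t flow_id(const struct flow_key *k, unsigned n_bits)
{
    uint32_t h = k->src_ip ^ (k->dst_ip * 2654435761u)
               ^ ((uint32_t)k->src_port << 16 | k->dst_port);
    h ^= h >> 16;
    return h & ((1u << n_bits) - 1);
}
```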


TCPSplitter Data Flow

[Data flow diagram: IP frames pass from the IP Wrapper into the TCP Splitter, which forwards IP frames onward and delivers a per-flow byte stream to the Client Application]


Input Processing

• Flow Classification
• TCP Checksum Engine
• Input State Machine
• Control FIFO
• Frame FIFO
• Output State Machine
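The checksum engine verifies each segment's TCP checksum, which covers an IPv4 pseudo-header plus the TCP header and payload. Below is a reference sketch of that computation in C; the hardware works on 32-bit words per clock cycle, whereas this software version sums 16 bits at a time:

```c
#include <stddef.h>
#include <stdint.h>

/* One's-complement sum over 16-bit words, as used by the TCP checksum. */
static uint32_t csum_add(uint32_t sum, const uint8_t *data, size_t len)
{
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)data[i] << 8 | data[i + 1];
    if (len & 1)                      /* pad odd trailing byte with zero */
        sum += (uint32_t)data[len - 1] << 8;
    return sum;
}

/* Verify a TCP segment: sum the pseudo-header (addresses, protocol,
 * TCP length) and the segment itself; a valid checksum yields 0xFFFF. */
int tcp_checksum_ok(uint32_t src_ip, uint32_t dst_ip,
                    const uint8_t *segment, uint16_t tcp_len)
{
    uint32_t sum = (src_ip >> 16) + (src_ip & 0xFFFF)
                 + (dst_ip >> 16) + (dst_ip & 0xFFFF)
                 + 6 /* IPPROTO_TCP */ + tcp_len;
    sum = csum_add(sum, segment, tcp_len);
    while (sum >> 16)                 /* fold carries back in */
        sum = (sum & 0xFFFF) + (sum >> 16);
    return sum == 0xFFFF;
}
```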


Layout

[Block diagram: the TCPProc circuit comprises TCPInput (Frame FIFO, Input State Machine, Checksum Engine, Flow Classifier, Output State Machine, Control FIFO) and TCPOutput (Packet Routing), sitting between the IP Input/Output wrappers and the Client Application]


Packet Routing Decisions

• Forward to outbound IP stack only
• Forward to both Client App and outbound IP stack
• Discard packet


Packet Routing

• Non-TCP packets → IP stack
• Invalid TCP checksum → drop
• TCP SYN packets → IP stack
• Seq # < expected Seq # → IP stack
• Seq # > expected Seq # → drop
• Else → client AND IP stack
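These rules form a simple priority decision. A C sketch of the routing logic as the slide states it (the enum names and flag arguments are our invention; the real design makes this decision in hardware for each frame):

```c
#include <stdbool.h>
#include <stdint.h>

enum route { TO_IP_STACK, TO_CLIENT_AND_IP_STACK, DROP };

/* Apply the slide's rules in priority order. `expected` is the stored
 * per-flow expected sequence number. */
enum route route_packet(bool is_tcp, bool checksum_ok, bool is_syn,
                        uint32_t seq, uint32_t expected)
{
    if (!is_tcp)        return TO_IP_STACK;     /* non-TCP packet    */
    if (!checksum_ok)   return DROP;            /* invalid checksum  */
    if (is_syn)         return TO_IP_STACK;     /* TCP SYN           */
    if (seq < expected) return TO_IP_STACK;     /* old/retransmitted */
    if (seq > expected) return DROP;            /* ahead of expected */
    return TO_CLIENT_AND_IP_STACK;              /* in sequence       */
}
```

A production comparison would use modulo-2^32 sequence arithmetic to handle wraparound; the plain comparisons above mirror the slide's notation.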


Client Interface

• 1-bit Clock
• 1-bit Reset
• 32-bit Data Word
• 1-bit Data Enable
• 4-bit Start/End of Data Signals
• 2-bit Valid Data Bytes
• N-bit Flow Identifier
• 2-bit Start/End of Flow Signals
• 1-bit TCA
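As a mental model only, the per-cycle signals delivered to the client application can be written as a C record. Clock and reset are omitted, FLOW_ID_BITS stands in for the design-dependent N, and the field names (and the reading of TCA as a flow-control handshake) are our assumptions:

```c
#include <stdint.h>

#define FLOW_ID_BITS 8  /* the slide's "N"; 8 bits suffices for 256 flows */

/* Signals presented to the client application on each clock cycle;
 * widths follow the slide's client interface list. */
struct client_if {
    uint32_t data_word;              /* 32-bit data word              */
    unsigned data_enable : 1;        /* data word valid this cycle    */
    unsigned start_end_data : 4;     /* start/end-of-data signals     */
    unsigned valid_bytes : 2;        /* valid bytes in the final word */
    unsigned flow_id : FLOW_ID_BITS; /* flow identifier               */
    unsigned start_end_flow : 2;     /* start/end-of-flow signals     */
    unsigned tca : 1;                /* TCA flow-control signal       */
};
```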


POSSIBLE APPLICATIONS


Possible Application 1

Simultaneous update of multiple active network nodes

[Diagram: a programmer's connection reaches an endpoint through Switches A, B, C, and D, updating multiple active network nodes simultaneously]


Possible Application 2

Dynamic loading of customizable QoS algorithms

[Diagram: traffic from a source to a destination crosses a network of switches A through F, each of which can load customized QoS algorithms]


Possible Application 3

Monitoring content of all TCP flows for security

[Diagram: a switch connecting several workstations to the Internet monitors the content of all TCP flows]


RESULTS


Synthesis Results for Xilinx XCV1000E-7

                         TCPSplitter        Full Wrappers (Cell + Frame + IP + TCP + Client)
Space/LUTs               617 (2%)           4954 (20%)
Register bits            503 (2%)           4933 (20%)
Input processing delay   7 clock cycles *   44-68 clock cycles *

* Plus length of packet in 32-bit words


Sample Run

[Simulation waveform annotated with start of frame, byte count, IP payload, TCP payload, end of frame, and flow ID]


Current State of Research

• Developed and simulated design
• Handles 256 simultaneous flows (33 bits * 256 entries = 1,056 bytes)
• Synthesizes at 74 MHz
• Simple test client counts TCP data bytes
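The memory figures are easy to verify. A quick check in C (the 33-bit record plausibly holds a 32-bit expected sequence number plus one flag bit, but that breakdown is our assumption; the slides give only the total):

```c
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    const uint64_t bits_per_flow = 33;  /* per-flow state, from the slide */

    /* 256 flows: 33 * 256 = 8,448 bits = 1,056 bytes */
    printf("256 flows:    %llu bytes\n",
           (unsigned long long)(bits_per_flow * 256 / 8));

    /* 262,144 flows: 33 * 262,144 = 8,650,752 bits, about 1.03 MByte,
     * matching the "1 MByte (+)" figure on the Future Directions slide */
    printf("262144 flows: %llu bytes\n",
           (unsigned long long)(bits_per_flow * 262144 / 8));
    return 0;
}
```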


Future Directions

• Execute design in hardware
• Increase the number of simultaneous flows (262,144 flows require only 1 MByte (+) of RAM)
• Develop more elaborate client applications
• Improve processing performance
• Implement sliding window – passive solution
• Enhance frame generation utility for simulations


CONCLUSION


Conclusion

• Runs on a reconfigurable hardware platform
• Processes packets at Gigabit line rates
• Monitors all TCP flows
• Generates the proper byte stream for each flow
• Requires only minimal memory (33 bits/flow)
• Simple client interface demonstrated


Acknowledgments

Advisor: Dr. John Lockwood


Questions