High-performance Architecture for Dynamically Updatable Packet Classification on FPGA

Author: Yun R. Qu, Shijie Zhou, and Viktor K. Prasanna
Publisher: ANCS 2013
Presenter: Chun-Sheng Hsueh
Date: 2013/11/13

INTRODUCTION

This paper presents a 2-dimensional pipelined architecture for packet classification on FPGA, which achieves high throughput while supporting dynamic updates.

The performance of the architecture does not depend on rule set features.

The architecture also efficiently supports range searches in individual fields.

INTRODUCTION

The architecture consists of multiple self-reconfigurable Processing Elements (PEs); it sustains high performance for packet classification and supports efficient dynamic updates of the rule set.

INTRODUCTION

The contributions in this work include:

◦ Scalable architecture

◦ Distributed update algorithms

◦ Implementation tradeoffs

◦ Superior throughput (190 Gbps with 1 million updates/s for a 1K 15-tuple rule set)

◦ Energy efficiency (compared to TCAM, the architecture sustains 2x throughput and supports fast dynamic updates with 4x energy efficiency)

BACKGROUND

Classic Packet Classification

OpenFlow Packet Classification

Packet Classification Techniques

Decision-tree based approaches

Decomposition based approaches

BV-based Approaches on FPGA

ARCHITECTURE

Challenges:

◦ Classic packet classification requires range matching in the port number fields. Since neither FSBV nor StrideBV supports range matching directly, they often need a range-to-prefix conversion, which can lead to rule set expansion.

◦ In BV-based approaches, with or without striding, the bit vectors in each pipeline stage for N rules are N bits long. The length of the longest wire connecting different memory modules therefore also grows at a rate of O(N), which degrades the throughput of the pipeline.
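
The rule set expansion caused by range-to-prefix conversion can be illustrated with a minimal sketch. The function name and the ternary string encoding ('0', '1', '*') are assumptions made for illustration and do not come from the slides; the splitting strategy is the standard greedy one, not necessarily what FSBV or StrideBV tooling uses.

```python
def range_to_prefixes(lo, hi, width):
    """Split the inclusive range [lo, hi] into ternary prefixes.

    Hypothetical helper: each prefix is a string of `width` characters
    drawn from '0', '1' and '*' (wildcard).
    """
    prefixes = []
    while lo <= hi:
        # Largest power-of-two block that is aligned at lo ...
        size = lo & -lo if lo else 1 << width
        # ... shrunk until it fits inside what is left of the range.
        while size > hi - lo + 1:
            size >>= 1
        bits = size.bit_length() - 1  # number of wildcard bits
        head = format(lo >> bits, "b").zfill(width - bits) if bits < width else ""
        prefixes.append(head + "*" * bits)
        lo += size
    return prefixes


if __name__ == "__main__":
    # A single 4-bit range rule becomes several prefix rules.
    print(range_to_prefixes(1, 14, 4))
    # A 16-bit port range can expand into many prefixes.
    print(len(range_to_prefixes(1, 65534, 16)))
```

A rule with ranges in both port fields multiplies the two prefix counts, which is why supporting range match directly in the PE avoids the expansion.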

Modular PE

2-dimensional Pipelined Architecture

Vertical: the upward (up) and downward (down) directions in which the input packet header bits of subfield j are propagated in a pipelined fashion.

Horizontal: the forward (right) and backward (left) directions in which the bit vectors are propagated in a pipelined fashion.

Striding and Clustering

For prefix match, we construct a BV array consisting of 2^s bit vectors, each of length n; this requires a data memory of size 2^s * n.
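
As a rough software sketch of this idea, the BV array for one s-bit stride can be built and queried as follows. The rule encoding (ternary strings) and all function names are assumptions for illustration; the hardware stores these arrays in the PEs' data memory rather than in Python lists.

```python
def build_bv_array(patterns, s):
    """Return 2**s bit vectors; bit i of entry v is set when rule i matches v.

    patterns[i] is rule i's s-character ternary pattern for this stride,
    written with '0', '1' and '*' (hypothetical encoding).
    """
    array = []
    for value in range(2 ** s):
        bits = format(value, "b").zfill(s)
        bv = 0
        for i, pattern in enumerate(patterns):
            if all(p in ("*", b) for p, b in zip(pattern, bits)):
                bv |= 1 << i
        array.append(bv)
    return array


def build_tables(rules, s):
    """One BV array per s-bit stride of the rule strings."""
    width = len(rules[0])
    return [build_bv_array([r[j:j + s] for r in rules], s)
            for j in range(0, width, s)]


def classify(tables, header_bits, s):
    """AND together the bit vectors selected by each stride of the header."""
    result = -1  # all ones
    for k, array in enumerate(tables):
        stride = header_bits[k * s:(k + 1) * s]
        result &= array[int(stride, 2)]
    return result


if __name__ == "__main__":
    rules = ["10**", "1***", "0110"]   # n = 3 rules over a 4-bit field
    tables = build_tables(rules, 2)    # stride s = 2
    print(bin(classify(tables, "1011", 2)))  # bits 0 and 1 set: rules 0 and 1 match
```

Each table has 2^s entries of n bits, which is the 2^s * n memory size stated above.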

For range match, the data memory needs to store two s-bit range boundaries (lower and upper) for each rule. Therefore the data memory has to be configured as 2n * s.

To use the data memory as storage for both the BV array and the range boundaries, the data memory is configured to be max[2^s, 2n] * max[n, s].
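
A small worked example of the sizing rule, reading the shared configuration as max[2^s, 2n] entries of max[n, s] bits. The function name and the sample values of s and n are assumptions chosen only to show the arithmetic.

```python
def data_memory_shape(s, n):
    """Return (depth, width) of a PE data memory usable for both match types.

    Prefix match needs 2**s entries of n bits (the BV array);
    range match needs 2*n entries of s bits (two boundaries per rule).
    """
    depth = max(2 ** s, 2 * n)
    width = max(n, s)
    return depth, width


if __name__ == "__main__":
    s, n = 4, 16
    print("prefix match:", (2 ** s, n))        # 16 entries of 16 bits
    print("range match: ", (2 * n, s))         # 32 entries of 4 bits
    print("shared:      ", data_memory_shape(s, n))
```

With s = 4 and n = 16 the shared memory is 32 entries of 16 bits, so it is deeper than prefix match alone would need.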

DYNAMIC UPDATES

Architecture for Dynamic Updates

Storing valid bits:

◦ The paper uses an extra column of PEs, each storing the n valid bits for its horizontal pipeline. This column of PEs is placed as the first vertical pipeline on the left of the 2-dimensional pipelined architecture.

◦ Valid bits are extracted at run time and output to the next PE in the horizontal pipeline.
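
One way to picture the role of the valid bits is the sketch below. It assumes that each later PE ANDs its own bit vector into the one it receives, as in other BV-based designs; the class and function names are hypothetical and the slides do not give this level of detail.

```python
class ValidBitColumn:
    """Software stand-in for the first-column PE of one horizontal pipeline."""

    def __init__(self, n):
        self.n = n
        self.valid = 0  # n valid bits, all rules initially invalid

    def insert(self, rid):
        self.valid |= 1 << rid      # mark rule `rid` as valid

    def delete(self, rid):
        self.valid &= ~(1 << rid)   # deletion only clears the valid bit


def horizontal_pipeline(valid_bits, stage_bvs):
    """Propagate a bit vector left to right, starting from the valid bits."""
    bv = valid_bits
    for stage_bv in stage_bvs:
        bv &= stage_bv              # assumed per-PE AND
    return bv


if __name__ == "__main__":
    col = ValidBitColumn(4)
    col.insert(0)
    col.insert(2)
    stages = [0b1111, 0b0101]       # made-up per-PE match vectors
    print(bin(horizontal_pipeline(col.valid, stages)))
    col.delete(2)                   # rule 2 can no longer match
    print(bin(horizontal_pipeline(col.valid, stages)))
```

Under this assumption a deleted rule never reaches the output, even though its entries in the other PEs are left untouched.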

Architecture for Dynamic Updates

Logic-based rule decoder

◦ To save I/O pins on the FPGA, the pins for the packet header (L pins in total) are reused to input a new rule.

◦ During an update, the control signals are generated by the rule decoder.

Architecture for Dynamic Updates

The rule decoder is in charge of:

◦ RID check (for all update operations, only in PEs of the first column)

◦ Rule translation (for modification/insertion, in all PEs)

◦ Validity check (for deletion/insertion, only in PEs of the first column)

◦ Construction of up-to-date valid bits (for insertion, only in PEs of the first column)

Update Schedule and Overhead

PERFORMANCE EVALUATION

The authors conducted experiments using Xilinx ISE Design Suite 14.5, targeting the Virtex 6 XC6VLX760 FFG1760-2 FPGA.

This device has 118560 logic slices, 1200 I/O pins, 26Mb BRAM, and can be configured to realize a large amount of distRAM (up to 8Mb).

It has 2 slices in a Configurable Logic Block (CLB), each slice having 4 LUTs and 8 flip-flops.

The experiments use randomly generated bit vectors, together with random packet headers for both classic (d = 5, L = 104) and OpenFlow (d = 15, L = 356) packet classification.
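
The classic header length can be checked from the usual 5-tuple field widths. The field names below are illustrative, and the stride length s = 4 is an assumed value used only to show how L divides into s-bit subfields.

```python
import math

# Standard widths (in bits) of the classic 5-tuple fields.
CLASSIC_FIELDS = {
    "src_ip": 32,
    "dst_ip": 32,
    "src_port": 16,
    "dst_port": 16,
    "protocol": 8,
}


def header_length(fields):
    """Total header length L in bits."""
    return sum(fields.values())


def num_subfields(total_bits, s):
    """Number of s-bit subfields needed to cover a header of total_bits."""
    return math.ceil(total_bits / s)


if __name__ == "__main__":
    L = header_length(CLASSIC_FIELDS)
    print(len(CLASSIC_FIELDS), L)       # d = 5, L = 104
    print(num_subfields(L, 4))          # classic header with s = 4
    print(num_subfields(356, 4))        # OpenFlow header with s = 4
```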

The designs achieve very high clock rates (200 ~ 400 MHz) with small variations across designs. These correspond to high throughput for OpenFlow packet classification (128 ~ 256 Gbps).
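
The relation between clock rate and throughput can be sketched as below. The slide does not say how the Gbps figures are derived; the defaults of two packets per clock cycle and a 40-byte minimum packet size are assumptions, chosen because they reproduce the quoted range.

```python
def throughput_gbps(clock_mhz, packets_per_cycle=2, min_packet_bytes=40):
    """Throughput in Gbps for minimum-size packets.

    packets_per_cycle and min_packet_bytes are assumed values,
    not figures taken from the slides.
    """
    bits_per_cycle = packets_per_cycle * min_packet_bytes * 8
    return clock_mhz * 1e6 * bits_per_cycle / 1e9


if __name__ == "__main__":
    for mhz in (200, 400):
        print(mhz, "MHz ->", throughput_gbps(mhz), "Gbps")
```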
