Applied Research Laboratory, Edward W. Spitznagel, 7 October 2015: Packet Classification for Core...


Applied Research Laboratory

Edward W. Spitznagel

April 19, 2023

Packet Classification for Core Routers: Is there an alternative to CAMs?

Paper by:

Florin Baboescu, Sumeet Singh, George Varghese

Presentation by:

Edward W. Spitznagel


Outline

• Introduction

• Packet Classification Problem

• Extended Grid-of-Tries (EGT)

– Grid-of-Tries

– Extending Grid-of-Tries into EGT

– Path Compression

– Results

• Summary


Packet Classification Problem

• Suppose you are a firewall, or QoS router, or network monitor ...

• You are given a list of rules (filters) to determine how to process incoming packets, based on the packet header fields

• Goal: when a packet arrives, find the least-cost rule that matches the packet’s header fields

Filter  Source Address  Destination Address  Source Port  Destination Port  Protocol  Action  Cost
a       11*             01*                  2-4          0-15              TCP       fwd 7   2
b       01*             0010                 3-15         3-15              UDP       fwd 2   10
c       0101            *                    3            *                 *         deny    5
d       1101            101*                 *            *                 ICMP      fwd 5   7


Packet Classification Problem

• Example: packet arrives with header (0101, 0010, 3, 5, UDP)

– classification result: filter c

– filter b also matches, but c has lower cost

• Easy when we have only a few rules; very hard with 100,000 rules and packets arriving at 40 Gb/s

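The example above can be reproduced with a brute-force linear search, which is the baseline every faster scheme is measured against. This is only a sketch: the `Filter` class, the bit-string encoding of prefixes (with "" standing for *), and the 4-bit port range are assumptions made for illustration, not part of the paper.

```python
from dataclasses import dataclass

@dataclass
class Filter:
    name: str
    src: str        # address prefix as a bit string; "" is the wildcard *
    dst: str
    sport: tuple    # inclusive (low, high) port range
    dport: tuple
    proto: str      # "*" matches any protocol
    action: str
    cost: int

ANY_PORT = (0, 15)  # assumption: the toy ports on the slide fit in 4 bits

FILTERS = [
    Filter("a", "11", "01", (2, 4), (0, 15), "TCP", "fwd 7", 2),
    Filter("b", "01", "0010", (3, 15), (3, 15), "UDP", "fwd 2", 10),
    Filter("c", "0101", "", (3, 3), ANY_PORT, "*", "deny", 5),
    Filter("d", "1101", "101", ANY_PORT, ANY_PORT, "ICMP", "fwd 5", 7),
]

def matches(f, src, dst, sport, dport, proto):
    """True if every field of the header satisfies filter f."""
    return (src.startswith(f.src) and dst.startswith(f.dst)
            and f.sport[0] <= sport <= f.sport[1]
            and f.dport[0] <= dport <= f.dport[1]
            and f.proto in ("*", proto))

def classify(filters, header):
    """Return the least-cost matching filter, or None if nothing matches."""
    hits = [f for f in filters if matches(f, *header)]
    return min(hits, key=lambda f: f.cost) if hits else None

HEADER = ("0101", "0010", 3, 5, "UDP")
best = classify(FILTERS, HEADER)
print(best.name, best.action)  # c deny
```

The search touches every filter for every packet, which is why it stops being practical at the rule counts and line rates mentioned above.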


Packet Classification - Metrics

• Metrics for evaluating classification algorithms:

– Time complexity of classifying a packet (often expressed as the number of memory accesses required)

– Storage requirements of data structures

– Number of fields that can be handled


Packet Classification in Core Routers

• Many core routers have “fairly large” (e.g. 2000 rule) databases

– Expected to grow; in fact, may be limited by current technology

• Classification in core routers must be done quickly

– Emerging core routers operate at 40 Gb/s. With 40-byte packets, that means one packet every 8 nsec

• Thus the general belief that brute-force hardware (TCAMs) will be necessary to support packet classification in core routers
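The 8 nsec figure follows directly from the link rate and packet size. A small sketch of the arithmetic (the function name is made up for this example):

```python
def packet_interarrival_ns(link_gbps, packet_bytes):
    """Time between back-to-back packets, in nanoseconds.

    1 Gb/s is one bit per nanosecond, so bits / Gb/s gives nanoseconds.
    """
    return packet_bytes * 8 / link_gbps

budget = packet_interarrival_ns(40, 40)
print(budget)  # 8.0
```

Every memory access made by a classification algorithm has to fit inside this per-packet budget, which is why lookup cost is counted in memory accesses.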


Packet Classification - TCAM disadvantages

• Ternary CAMs (TCAMs) have disadvantages

– Density Scaling: 10-12 transistors per bit of TCAM (vs. 4-6 transistors per bit of SRAM)

– Power Scaling: high power consumption, due to performing all comparisons in parallel

– Time Scaling: 5-10 nsec for a TCAM operation

– Extra Chips: requires TCAM chip(s) and bridge ASIC

– Rule Multiplication for ranges: arbitrary ranges are represented by sets of prefixes; very inefficient

• Thus, we consider an algorithmic solution...


Packet Classification trends

• Packet classification in 2D: several good methods

– Grid of Tries, Area-based QuadTrees, FIS-trees, Tuple-space search, range trees and fractional cascading

• Classification in K dimensions, where K>2, is hard

– $O(\log^{K-1} N)$ time and linear space, or $O(\log N)$ time and $O(N^K)$ space, for N filters in K dimensions

• Modern algorithms: use heuristics to exploit the structure and properties that real-world filter databases tend to have.

– Example: RFC and HiCuts algorithms


Extended Grid of Tries (EGT)

• Observation: Core router tables studied have a low maximum filter depth in the 2D space defined by <Source IP Address, Destination IP Address>

– in this case, “low” means 20 or less

– i.e. no point in this 2D plot of filters is covered by more than 20 filters

[Figure: filters a, b, c, d plotted in the 2D space of Source Address vs. Dest. Address, each axis running from 0 to 0xFFFF]
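The maximum 2D filter depth can be measured directly from a filter list. The sketch below is one way to do it, not the method used in the paper: because prefixes are either nested or disjoint, the deepest point always lies in the region formed by some filter's source prefix and some filter's destination prefix, so it is enough to count the filters covering each such pair. Prefixes are written as bit strings with "" for *, and the four filters are the address fields from the earlier example table.

```python
def covers(prefix, other):
    """True if `prefix` (a bit string, "" for *) is a prefix of `other`."""
    return other.startswith(prefix)

def max_filter_depth(filters):
    """Largest number of (src, dst) prefix filters covering any single point."""
    srcs = {s for s, _ in filters}
    dsts = {d for _, d in filters}
    best = 0
    for s in srcs:
        for d in dsts:
            # Filters whose rectangle contains the whole region s x d.
            depth = sum(1 for fs, fd in filters
                        if covers(fs, s) and covers(fd, d))
            best = max(best, depth)
    return best

# (source prefix, destination prefix) for filters a, b, c, d
FILTERS_2D = [("11", "01"), ("01", "0010"), ("0101", ""), ("1101", "101")]
depth = max_filter_depth(FILTERS_2D)
print(depth)  # 2
```

This brute-force count is cubic in the number of filters, which is acceptable for offline analysis of a database but not for lookups.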


Extended Grid of Tries (EGT)

• The Basic Idea:

– Use an existing 2D scheme to classify with respect to Source IP and Dest. IP

– Then, do linear search over a small list of possible matches (at most 20, but typically around 5)

• EGT: use Grid-of-Tries as the 2D scheme
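The two-stage structure can be sketched as follows. The 2D stage here is a plain scan standing in for Grid-of-Tries, purely to keep the example short; the tuple layout and function names are assumptions made for this sketch.

```python
# (name, src prefix, dst prefix, (sport lo, hi), (dport lo, hi), proto, cost)
FILTERS = [
    ("a", "11", "01", (2, 4), (0, 15), "TCP", 2),
    ("b", "01", "0010", (3, 15), (3, 15), "UDP", 10),
    ("c", "0101", "", (3, 3), (0, 15), "*", 5),
    ("d", "1101", "101", (0, 15), (0, 15), "ICMP", 7),
]

def match_2d(filters, src, dst):
    """Stage 1 stand-in: return ALL filters matching on the two addresses."""
    return [f for f in filters
            if src.startswith(f[1]) and dst.startswith(f[2])]

def classify(filters, src, dst, sport, dport, proto):
    """Stage 2: linear search over the short candidate list."""
    best = None
    for f in match_2d(filters, src, dst):
        _, _, _, (slo, shi), (dlo, dhi), fproto, cost = f
        if (slo <= sport <= shi and dlo <= dport <= dhi
                and fproto in ("*", proto)):
            if best is None or cost < best[6]:
                best = f
    return best

candidates = match_2d(FILTERS, "0101", "0010")
winner = classify(FILTERS, "0101", "0010", 3, 5, "UDP")
print(len(candidates), winner[0])  # 2 c
```

The point of the design is that stage 2 only ever sees as many filters as the maximum 2D depth, so its cost is bounded by a small constant for the databases studied.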


Grid of Tries - Intuition

• Imagine a search trie containing Dest. Address prefixes

• Now add a Source Address trie under each Dest. prefix

– Filters are stored in these tries, perhaps multiple times


Grid of Tries - Intuition

• Reduce storage by storing each filter only once

– But we now need to backtrack to ancestors’ source tries during a search...
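A minimal sketch of this store-once layout, before any switch or jump pointers are added: each filter lives only in the source trie hanging off its own destination prefix, so a search has to visit the source trie of every destination prefix on the path (the "backtracking" cost). Class and function names are made up for this illustration, and prefixes are bit strings with "" for *.

```python
class Node:
    def __init__(self):
        self.children = {}       # "0" / "1" -> Node
        self.filters = []        # filter names ending at this source-trie node
        self.source_trie = None  # set only on destination-trie nodes

def walk_or_create(root, bits):
    node = root
    for b in bits:
        node = node.children.setdefault(b, Node())
    return node

def build(filters):
    """filters: iterable of (name, src prefix, dst prefix)."""
    dest_root = Node()
    for name, src, dst in filters:
        dnode = walk_or_create(dest_root, dst)
        if dnode.source_trie is None:
            dnode.source_trie = Node()
        walk_or_create(dnode.source_trie, src).filters.append(name)
    return dest_root

def search_all(dest_root, src_addr, dst_addr):
    """Collect every filter matching both addresses."""
    found = []
    dnode, depth = dest_root, 0
    while dnode is not None:
        if dnode.source_trie is not None:
            # Search this destination prefix's source trie from its root.
            snode = dnode.source_trie
            found.extend(snode.filters)
            for b in src_addr:
                snode = snode.children.get(b)
                if snode is None:
                    break
                found.extend(snode.filters)
        if depth == len(dst_addr):
            break
        dnode = dnode.children.get(dst_addr[depth])
        depth += 1
    return found

grid = build([("a", "11", "01"), ("b", "01", "0010"),
              ("c", "0101", ""), ("d", "1101", "101")])
print(search_all(grid, "0101", "0010"))  # ['c', 'b']
```

Each visited source trie is searched from its root, which is the repeated work that switch pointers (and, in EGT, jump pointers) are meant to avoid.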


Grid of Tries

• Use switch pointers to improve search efficiency

– allows us to jump to the next source trie among ancestors, instead of backtracking


Extended Grid of Tries

• EGT uses jump pointers instead of switch pointers

– EGT requires the 2D search to return all filters matching in those dimensions

– Thus, some of the nodes skipped by a switch pointer cannot be skipped in an EGT search

• So, search complexity is a bit higher than in ordinary Grid-of-Tries

– worst case search takes W + (H+1)*W = (H+2)*W time, where W = time to find best prefix in a single trie, and H = max trie height (H = 32 for IPv4)

– but the authors expect that it typically takes L*W, with L being a small value (reflecting the low maximum prefix containment seen in most filter databases)
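To put numbers on the two bounds, the sketch below evaluates them in units of W. The value L = 5 is a hypothetical choice for illustration only; the slide says just that L is expected to be small.

```python
def egt_worst_case(W, H=32):
    """Worst-case EGT search time: W + (H + 1) * W = (H + 2) * W."""
    return W + (H + 1) * W

def egt_expected(W, L):
    """Expected search time L * W for a small containment depth L."""
    return L * W

worst = egt_worst_case(W=1)        # in units of W, with H = 32 for IPv4
typical = egt_expected(W=1, L=5)   # L = 5 is an assumed example value
print(worst, typical)  # 34 5
```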


EGT with Path Compression (EGT-PC)

• EGT-PC adds Path Compression whereby single branching paths are removed

– Improves search time and storage requirements, particularly for small filter sets
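Path compression on a binary trie can be sketched as below: any chain of nodes that has exactly one child and stores no filter is folded into a single edge labelled with the skipped bits. This is a generic illustration of the technique with made-up names, not the data layout used in the paper.

```python
class TrieNode:
    def __init__(self):
        self.children = {}  # edge label (bit string) -> TrieNode
        self.filters = []

def insert(root, bits, name):
    node = root
    for b in bits:
        node = node.children.setdefault(b, TrieNode())
    node.filters.append(name)

def compress(node):
    """Fold single-branching, filter-free chains into multi-bit edges."""
    new_children = {}
    for label, child in node.children.items():
        while len(child.children) == 1 and not child.filters:
            (next_label, next_child), = child.children.items()
            label += next_label
            child = next_child
        compress(child)
        new_children[label] = child
    node.children = new_children

def count_nodes(node):
    return 1 + sum(count_nodes(c) for c in node.children.values())

root = TrieNode()
for bits, name in [("11", "a"), ("01", "b"), ("0101", "c"), ("1101", "d")]:
    insert(root, bits, name)

before = count_nodes(root)
compress(root)
after = count_nodes(root)
print(before, after)  # 9 5
```

In this small example the node count drops from 9 to 5, and a lookup follows at most two edges instead of four, which is the kind of gain the slide attributes to small filter sets.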


EGT-PC: Results

• Storage requirements: impressively low (almost as low as TCAM!)

– since we store each filter only once

Storage, in terms of number of 32-bit words

• Classification time is good, but not as impressive

– also a result of storing each filter once: we therefore may need to traverse multiple Source tries

Memory accesses, in terms of 32-bit word accesses


EGT-PC: Results

• Memory usage by component:

• Storage for list is proportional to number of filters

• Storage for trie is roughly proportional to number of filters

• Path compression reduces storage by a factor of 3, roughly


EGT-PC: Results with larger databases

• Larger databases are generated using smaller ones as a core

– randomly generated prefixes for Source Address and Destination Address, using the prefix length distributions from the original databases

– Other fields are randomly derived from the distributions in the original databases

• Memory Accesses: still not bad, even for large databases

• Storage Requirements: still appear to be linear
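The database-growing procedure described on this slide can be sketched roughly as follows. Only the two address fields are handled, the seed filters are the toy ones from the earlier example, and all names are hypothetical; the actual generator used in the paper is not shown in the slides.

```python
import random
from collections import Counter

def prefix_length_distribution(prefixes):
    """Count how often each prefix length occurs in the seed database."""
    return Counter(len(p) for p in prefixes)

def random_prefix(length_counts, rng):
    """Draw a length from the seed distribution, then random bits."""
    lengths = list(length_counts)
    weights = [length_counts[n] for n in lengths]
    n = rng.choices(lengths, weights=weights)[0]
    return "".join(rng.choice("01") for _ in range(n))

def grow_database(seed_filters, target_size, rng):
    """Keep the seed filters as a core and append random (src, dst) pairs."""
    src_dist = prefix_length_distribution(s for s, _ in seed_filters)
    dst_dist = prefix_length_distribution(d for _, d in seed_filters)
    grown = list(seed_filters)
    while len(grown) < target_size:
        grown.append((random_prefix(src_dist, rng),
                      random_prefix(dst_dist, rng)))
    return grown

SEED = [("11", "01"), ("01", "0010"), ("0101", ""), ("1101", "101")]
grown = grow_database(SEED, 20, random.Random(0))
print(len(grown))  # 20
```

A fuller version would also draw ports and protocol from the seed distributions, as the slide describes.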


EGT-PC: Remarks

• May only work well with core routers

• Lookups:

– faster than HiCuts; not as fast or as deterministic as RFC.

– can easily be characterized by maximum 2D filter depth

• Storage requirements: quite good

– using Grid-of-Tries for the 2D scheme is a wise choice (storage efficiency)

• Very nice to have results comparing several different algorithms (unlike nearly all previous papers)

• It is possible to apply the basic EGT idea, but with a different 2D scheme

– Tuple Space, FIS-trees, RFC in 2D, and perhaps Area-based QuadTrees

– The trick is that the 2D scheme must be modified to return all filters matching those 2 dimensions (rather than just the least-cost filter matching those 2 dimensions)


Comparison of different algorithms

[Figure: RFC, Linear Search, EGT, EGT-PC, HiCuts-1, HiCuts-4 and TCAM placed on two Best-to-Worst scales, one for Lookup Speed and one for Storage Requirements]


Summary

• Packet Classification: Given packet P and list of filters F, find the least-cost filter in F that matches P

– Important metrics: Lookup time, data structure size

• Extended Grid of Tries

– Core routers have a low maximum filter depth in the 2D space defined by <Src. Addr, Dest. Addr>

– Thus, we can perform a 2D search via Grid of Tries, and then do a linear search over the short list of filters that match in 2D; we can also add path compression to the trie

– Lookup time is fairly good; storage requirements are very good.


Thanks -- Questions?



Backup slides to follow...


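Geometric Representation

• Filters with K fields can be represented geometrically in K dimensions

• Example:

Filter  Source Address  Source Port
a       xxx             2-3
b       010             0-7
c       xx1             7

[Figure: filters a, b, c drawn in the plane of Source Address (x-axis, ticks 0, 2, 4, 6) vs. Source Port (y-axis, ticks 0, 2, 4, 6); filter c appears as four separate pieces]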


Ternary CAMs

• Most popular practical approach to high-performance packet classification

• Hardware compares query word (packet header) to all stored words (filters) in parallel

– each bit of a stored word can be 0, 1, or X (don’t care)

• Very fast, but not without drawbacks:

– High power consumption limits scalability

– Inefficient representation of ranges


Ternary CAM - Example

Filter  Source Address  Destination Address
a       11xx            xxxx
b       0xxx            01xx
c       xxxx            0110

Packet: Src. Addr. 1110, Dest. Addr. 0110

Query: 11100110

TCAM Address  Contents  Result
0             11xxxxxx  Match!
1             0xxx01xx  Doesn’t Match
2             xxxx0110  Match!

(Now perform priority resolution...)
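The lookup in this example can be imitated in software as below. A real TCAM compares all entries in parallel in hardware; the loop here only reproduces the result. Resolving priority by picking the lowest matching address is an assumption made for this sketch, since the slide does not say how priority resolution is done.

```python
def ternary_match(query, pattern):
    """True if every pattern bit is 'x' (don't care) or equals the query bit."""
    return (len(query) == len(pattern)
            and all(p in ("x", q) for q, p in zip(query, pattern)))

# Stored words, indexed by TCAM address (filters a, b, c from the slide).
TCAM = ["11xxxxxx", "0xxx01xx", "xxxx0110"]

def tcam_lookup(tcam, query):
    """Return (all matching addresses, winning address or None)."""
    hits = [addr for addr, word in enumerate(tcam)
            if ternary_match(query, word)]
    # Assumed priority rule: the lowest matching address wins.
    return hits, (min(hits) if hits else None)

hits, winner = tcam_lookup(TCAM, "1110" + "0110")
print(hits, winner)  # [0, 2] 0
```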


Range Matching in TCAMs

• Convert ranges into sets of prefixes

– 1-4 becomes 001, 01*, and 100

– 3-5 becomes 011 and 10*

Filter  Source Port  Destination Port
F       1-4          3-5

[Figure: filter F drawn as a rectangle in the plane of Source Port (x-axis, ticks 0, 2, 4, 6) vs. Destination Port (y-axis, ticks 0, 2, 4, 6)]
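The range-to-prefix conversion can be done greedily: repeatedly take the largest power-of-two block that starts at the current low end, is aligned there, and still fits in the range. The sketch below is a standard way to write this, with a made-up function name; it reproduces the two conversions on the slide.

```python
def range_to_prefixes(lo, hi, width):
    """Split the inclusive range [lo, hi] of width-bit values into prefixes."""
    prefixes = []
    while lo <= hi:
        # Largest block size allowed by the alignment of lo.
        size = lo & -lo if lo else 1 << width
        # Shrink the block until it fits inside what is left of the range.
        while size > hi - lo + 1:
            size //= 2
        fixed = width - (size.bit_length() - 1)  # number of specified bits
        head = format(lo >> (width - fixed), "0{}b".format(fixed)) if fixed else ""
        prefixes.append(head + ("*" if fixed < width else ""))
        lo += size
    return prefixes

print(range_to_prefixes(1, 4, 3))  # ['001', '01*', '100']
print(range_to_prefixes(3, 5, 3))  # ['011', '10*']
```

For a 16-bit field, the range 1-65534 is a worst case for this procedure and yields 30 prefixes.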


Range Matching in TCAMs

• With two 16-bit range fields, a single rule could require up to 900 TCAM entries!

• Typical case: entire filter set expands by a factor of 2 to 6

[Figure: the range filter from the previous slide split into six pieces a-f in the plane of Source Port (x-axis, ticks 0, 2, 4, 6) vs. Destination Port (y-axis, ticks 0, 2, 4, 6)]

Filter  Source Port  Destination Port
a       001          10*
b       01*          10*
c       100          10*
d       001          011
e       01*          011
f       100          011
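The six entries above are the cross product of the prefix sets for the two port ranges, and the figure of 900 comes from the same multiplication in the worst case. A short sketch, with function names invented for this example and the prefix lists copied from the previous slide:

```python
from itertools import product

def expand(src_prefixes, dst_prefixes):
    """One TCAM entry per combination of source and destination prefix."""
    return list(product(src_prefixes, dst_prefixes))

def worst_case_prefixes(width):
    """A range over a width-bit field needs at most 2 * width - 2 prefixes."""
    return 2 * width - 2

# Source port 1-4 and destination port 3-5, as prefix sets.
entries = expand(["001", "01*", "100"], ["10*", "011"])
print(len(entries))                   # 6
print(worst_case_prefixes(16) ** 2)   # 900
```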