Routing Lookups and Packet Classification: Theory and Practice


Transcript of Routing Lookups and Packet Classification: Theory and Practice

Page 1: Routing Lookups and Packet Classification:  Theory and Practice

Routing Lookups and Packet Classification: Theory and Practice

Pankaj GuptaDepartment of Computer Science

Stanford [email protected]

http://www.stanford.edu/~pankaj

August 18, 2000

Hot Interconnects 8: High Performance Switching and Routing

Page 2: Routing Lookups and Packet Classification:  Theory and Practice

2

Tutorial Outline

• Introduction – What this tutorial is about

• Routing lookups – Background, lookup schemes

• Packet Classification – Background, classification schemes

• Implementation choices for given design requirements

Page 3: Routing Lookups and Packet Classification:  Theory and Practice

3

Request to you

• Please ask lots of questions! – But I may not be able to answer all of them right now

• I am here to learn, so please share your experiences, thoughts and opinions freely

Page 4: Routing Lookups and Packet Classification:  Theory and Practice

4

What is this tutorial about?

Page 5: Routing Lookups and Packet Classification:  Theory and Practice

5

Internet: Mesh of Routers

The Internet Core

Edge Router

Campus Area Network

Page 6: Routing Lookups and Packet Classification:  Theory and Practice

6

RFC 1812: Requirements for IPv4 Routers

• Must perform an IP datagram forwarding decision (called forwarding)

• Must send the datagram out the appropriate interface (called switching)

Optionally: a router MAY choose to perform special processing on incoming packets

Page 7: Routing Lookups and Packet Classification:  Theory and Practice

7

Examples of special processing

• Filtering packets for security reasons
• Delivering packets according to a pre-agreed delay guarantee
• Treating high priority packets preferentially
• Maintaining statistics on the number of packets sent by various routers

Page 8: Routing Lookups and Packet Classification:  Theory and Practice

8

Special Processing Requires Identification of Flows

• All packets of a flow obey a pre-defined rule and are processed similarly by the router

• E.g. a flow = (src-IP-address, dst-IP-address), or a flow = (dst-IP-prefix, protocol) etc.

• Router needs to identify the flow of every incoming packet and then perform appropriate special processing

Page 9: Routing Lookups and Packet Classification:  Theory and Practice

9

Flow-aware vs Flow-unaware Routers

• Flow-aware router: keeps track of flows and performs similar processing on packets in a flow

• Flow-unaware router (packet-by-packet router): treats each incoming packet individually

Page 10: Routing Lookups and Packet Classification:  Theory and Practice

10

What this tutorial is about:

• Algorithms and techniques that an IP router uses to decide where to forward the packets next (routing lookup)

• Algorithms and techniques that a flow-aware router uses to classify packets into flows (packet classification)

Page 11: Routing Lookups and Packet Classification:  Theory and Practice

11

Routing Lookups

Page 12: Routing Lookups and Packet Classification:  Theory and Practice

12

Routing Lookups: Outline

• Background and problem definition

• Lookup schemes
• Comparative evaluation

Page 13: Routing Lookups and Packet Classification:  Theory and Practice

13

Lookup in an IP Router

Unicast destination address based lookup

[Figure: the header of an incoming packet enters the forwarding engine, where a next-hop computation consults a forwarding table of (destination-prefix, next-hop) entries.]

Page 14: Routing Lookups and Packet Classification:  Theory and Practice

14

Packet-by-packet Router

[Figure: linecards, each with its own forwarding decision logic and forwarding table, connected by an interconnect and managed by a routing processor.]

Page 15: Routing Lookups and Packet Classification:  Theory and Practice

15

Packet-by-packet Router: Basic Architectural Components

[Figure: routing control sits above a per-packet datapath consisting of routing lookup, switching, and scheduling.]

Page 16: Routing Lookups and Packet Classification:  Theory and Practice

16

ATM and MPLS Switches: Direct Lookup

[Figure: the incoming (port, VCI/label) is used directly as a memory address; the data read out is the outgoing (port, VCI/label).]

Page 17: Routing Lookups and Packet Classification:  Theory and Practice

17

IPv4 Addresses

• 32-bit addresses
• Dotted quad notation: e.g. 12.33.32.1
• Can be represented as integers on the IP number line [0, 2^32 - 1]: a.b.c.d denotes the integer a*2^24 + b*2^16 + c*2^8 + d

IP Number Line: 0.0.0.0 ... 255.255.255.255
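As a quick illustrative sketch (not part of the slides), this is the mapping from a dotted quad to its position on the IP number line:

    def quad_to_int(addr: str) -> int:
        """Map a dotted quad a.b.c.d to a*2^24 + b*2^16 + c*2^8 + d."""
        a, b, c, d = (int(x) for x in addr.split("."))
        return (a << 24) | (b << 16) | (c << 8) | d

    assert quad_to_int("0.0.0.0") == 0
    assert quad_to_int("255.255.255.255") == 2**32 - 1
    assert quad_to_int("12.33.32.1") == 12 * 2**24 + 33 * 2**16 + 32 * 2**8 + 1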

Page 18: Routing Lookups and Packet Classification:  Theory and Practice

18

Class-based Addressing

[Figure: IP number line from 0.0.0.0 to 255.255.255.255, divided into classes A, B, C, D, E with boundaries at 128.0.0.0 and 192.0.0.0.]

Class          Range                          MS bits   netid       hostid
A              0.0.0.0 - 128.0.0.0            0         bits 1-7    bits 8-31
B              128.0.0.0 - 191.255.255.255    10        bits 2-15   bits 16-31
C              192.0.0.0 - 223.255.255.255    110       bits 3-23   bits 24-31
D (multicast)  224.0.0.0 - 239.255.255.255    1110      -           -
E (reserved)   240.0.0.0 - 255.255.255.255    11110     -           -

Page 19: Routing Lookups and Packet Classification:  Theory and Practice

19

Lookups with Class-based Addresses

[Figure: per-class exact-match tables on netid, e.g. netid 23 -> Port 1 (Class A), 186.21 -> Port 2 (Class B), 192.33.32 -> Port 3 (Class C) for destination address 192.33.32.1.]

Page 20: Routing Lookups and Packet Classification:  Theory and Practice

20

Problems with Class-based Addressing

• Fixed netid-hostid boundaries too inflexible: rapid depletion of address space

• Exponential growth in size of routing tables

Page 21: Routing Lookups and Packet Classification:  Theory and Practice

21

Exponential Growth in Routing Table Sizes

[Chart: number of BGP routes advertised, growing exponentially over time.]

Page 22: Routing Lookups and Packet Classification:  Theory and Practice

22

Classless Addressing (and CIDR)

• Eliminated class boundaries
• Introduced the notion of a variable length prefix between 0 and 32 bits long
• Prefixes represented by P/l: e.g., 122/8, 212.128/13, 34.43.32/22, 10.32.32.2/32 etc.
• An l-bit prefix represents an aggregation of 2^(32-l) IP addresses

Page 23: Routing Lookups and Packet Classification:  Theory and Practice

23

CIDR: Hierarchical Route Aggregation

[Figure: sites S (192.2.1/24) and T (192.2.2/24) reach the backbone through ISP P (192.2.0/22) via router R2; ISP Q (200.11.0/22) attaches via another router. The backbone routing table needs only the aggregate entry 192.2.0/22 -> R2, shown on the IP number line.]

Page 24: Routing Lookups and Packet Classification:  Theory and Practice

24

Size of the Routing Table

Source: http://www.telstra.net/ops/bgptable.html

[Chart: number of active BGP prefixes vs. date.]

Page 25: Routing Lookups and Packet Classification:  Theory and Practice

25

Classless Addressing

[Figure: with class-based addressing the number line 0.0.0.0-255.255.255.255 is carved into fixed regions A, B, C; with classless addressing, prefixes such as 23/8, 191/8, 191.23/16, 191.128.192/18 and 191.23.14/23 cover arbitrary, possibly nested, intervals.]

Page 26: Routing Lookups and Packet Classification:  Theory and Practice

26

Non-aggregatable Prefixes: (1) Multi-homed Networks

[Figure: a multi-homed network advertising 192.2.2/24 is reachable both through ISP P (192.2.0/22, via R2) and directly via R3, so the backbone routing table must carry both 192.2.0/22 -> R2 and the more specific 192.2.2/24 -> R3.]

Page 27: Routing Lookups and Packet Classification:  Theory and Practice

27

Non-aggregatable Prefixes: (2) Change of Provider

[Figure: site T (192.2.2/24) switches from ISP P (192.2.0/22) to ISP Q (200.11.0/22); its prefix no longer aggregates with its provider's block, so the backbone routing table carries 192.2.0/22 -> R2 as well as 192.2.2/24 -> R3, shown on the IP number line.]

Page 28: Routing Lookups and Packet Classification:  Theory and Practice

28

Routing Lookups with CIDR

[Figure: routing table entries 192.2.0/22 -> R2, 192.2.2/24 -> R3 and 200.11.0/22 -> R4 drawn on the IP number line, with example destination addresses 192.2.0.1, 192.2.2.100 and 200.11.0.33.]

Find the most specific route, or the longest matching prefix among all the prefixes matching the destination address of an incoming packet

Page 29: Routing Lookups and Packet Classification:  Theory and Practice

29

Longest Prefix Match is Harder than Exact Match

• The destination address of an arriving packet does not carry with it the information needed to determine the length of the longest matching prefix

• Hence, one needs to search both the space of all prefix lengths and the space of all prefixes of a given length

Page 30: Routing Lookups and Packet Classification:  Theory and Practice

30

Metrics for Lookup Algorithms

• Speed
• Storage requirements
• Low update time
• Ability to handle large routing tables
• Flexibility in implementation
• Low preprocessing time

Page 31: Routing Lookups and Packet Classification:  Theory and Practice

31

Maximum Bandwidth per Installed Fiber

[Chart: single fiber capacity (Gb/s), on a log scale from 0.01 to 100000, vs. year (1980-2005), growing roughly 2x per year. Source: Lucent]

Page 32: Routing Lookups and Packet Classification:  Theory and Practice

32

Maximum Bandwidth per Router Port, and Lookup Performance Required

Year     Line   Line rate (Gbps)   40B (Mpps)   84B (Mpps)   354B (Mpps)
1997-98  OC3    0.155              0.48         0.23         0.054
1998-99  OC12   0.622              1.94         0.92         0.22
1999-00  OC48   2.5                7.81         3.72         0.88
2000-01  OC192  10.0               31.25        14.88        3.53
2002-03  OC768  40.0               125          59.52        14.12
         1GE    1.0                3.13         1.49         0.35
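The packet-rate columns follow directly from the line rate and the packet size; a small sketch (illustrative, not from the slides) that reproduces two of the table entries:

    def mpps(line_rate_gbps: float, packet_bytes: int) -> float:
        """Worst-case packet rate in Mpps for back-to-back packets of one size."""
        return line_rate_gbps * 1e9 / (packet_bytes * 8) / 1e6

    print(round(mpps(2.5, 40), 2))    # ~7.81 Mpps: OC48 with 40-byte packets
    print(round(mpps(10.0, 84), 2))   # ~14.88 Mpps: OC192 with 84-byte packets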

Page 33: Routing Lookups and Packet Classification:  Theory and Practice

33

Size of Routing Table?

• Currently, 85K entries
• At 25K per year, 230-256K prefixes for the next 5 years
• Decreasing costs of transmission may increase the rate of routing table growth
• At 50K per year, need 350-400K prefixes for the next 5 years

Page 34: Routing Lookups and Packet Classification:  Theory and Practice

34

Routing Update Rate?

• Currently a peak of a few hundred BGP updates per second
• Hence, 1K per second is a must
• 5-10K updates/second seems to be safe
• BGP limitations may be a bottleneck first
• Updates should be atomic, and should interfere little with normal lookups

Page 35: Routing Lookups and Packet Classification:  Theory and Practice

35

Routing Lookups: Outline

• Background and problem definition

• Lookup schemes
• Comparative evaluation

Page 36: Routing Lookups and Packet Classification:  Theory and Practice

36

Example Forwarding Table (5-bit Prefixes)

Prefix Next-hop

P1 111* H1

P2 10* H2

P3 1010* H3

P4 10101 H4

Page 37: Routing Lookups and Packet Classification:  Theory and Practice

37

Linear Search

• Keep prefixes in a linked list
• O(N) storage, O(N) lookup time, O(1) update complexity
• Improve average time by keeping the linked list sorted in order of prefix lengths
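A minimal sketch of the linear-search scheme on the 5-bit example table, with the list kept in decreasing prefix-length order so the first hit is the longest match (illustrative names and structure):

    # Forwarding table from the example slide: (prefix bits, next hop)
    table = [("10101", "H4"), ("1010", "H3"), ("111", "H1"), ("10", "H2")]

    def lookup(addr_bits: str):
        """Return the next hop of the longest matching prefix, or None."""
        for prefix, next_hop in table:        # sorted by decreasing prefix length
            if addr_bits.startswith(prefix):
                return next_hop
        return None

    print(lookup("10111"))   # -> H2 (longest match is 10*)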

Page 38: Routing Lookups and Packet Classification:  Theory and Practice

38

Caching Addresses

[Figure: route-cache architecture. Each line card (MAC, DMA, local buffer memory) forwards packets whose destinations hit its cache on the fast path; misses take the slow path through the CPU and central buffer memory.]

Page 39: Routing Lookups and Packet Classification:  Theory and Practice

39

Caching Addresses

Advantages
  Increased average lookup performance

Disadvantages
  Decreased locality in backbone traffic
  Cache size
  Cache management overhead
  Hardware implementation difficult

Page 40: Routing Lookups and Packet Classification:  Theory and Practice

40

Radix Trie

P1  111*    H1
P2  10*     H2
P3  1010*   H3
P4  10101   H4

[Figure: binary radix trie built from the prefixes above, with internal nodes A-H. A lookup of 10111 follows bits 1-0-1-1-1 as far as possible and returns P2, the longest matching prefix on the path; adding P5 = 1110* creates one new node I. Each trie node holds a next-hop pointer (if a prefix ends there), a left pointer and a right pointer.]
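A small illustrative sketch of a 1-bit (binary) trie over bit-string prefixes, using the example table; this is only a readable model of the idea, not an implementation tuned for routers:

    class Node:
        def __init__(self):
            self.child = {"0": None, "1": None}
            self.next_hop = None           # set if a prefix ends at this node

    root = Node()

    def insert(prefix: str, next_hop: str) -> None:
        node = root
        for bit in prefix:
            if node.child[bit] is None:
                node.child[bit] = Node()
            node = node.child[bit]
        node.next_hop = next_hop

    def lookup(addr_bits: str):
        """Walk the trie, remembering the last prefix seen (the longest match)."""
        node, best = root, None
        for bit in addr_bits:
            node = node.child[bit]
            if node is None:
                break
            if node.next_hop is not None:
                best = node.next_hop
        return best

    for p, h in [("111", "H1"), ("10", "H2"), ("1010", "H3"), ("10101", "H4")]:
        insert(p, h)
    print(lookup("10111"))   # -> H2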

Page 41: Routing Lookups and Packet Classification:  Theory and Practice

41

Radix Trie

• W-bit prefixes: O(W) lookup, O(NW) storage and O(W) update complexity

Advantages
  Simplicity
  Extensible to wider fields

Disadvantages
  Worst case lookup slow
  Wastage of storage space in chains

Page 42: Routing Lookups and Packet Classification:  Theory and Practice

42

Leaf-pushed Binary Trie

[Figure: leaf-pushed binary trie for the same example. Each node holds either a left pointer or a next-hop, and either a right pointer or a next-hop; the prefixes are pushed down to the leaves, so P2 appears at more than one leaf.]

P1  111*    H1
P2  10*     H2
P3  1010*   H3
P4  10101   H4

Page 43: Routing Lookups and Packet Classification:  Theory and Practice

43

PATRICIA

[Figure: PATRICIA tree for the same example. Each internal node (A-D) stores the bit position to inspect (2, 3, 5 here) plus left and right pointers; the prefixes P1-P4 sit at the leaves. A lookup of 10111 follows the inspected bits and may have to backtrack to find the longest matching prefix.]

P1  111*    H1
P2  10*     H2
P3  1010*   H3
P4  10101   H4

Page 44: Routing Lookups and Packet Classification:  Theory and Practice

44

PATRICIA

• W-bit prefixes: O(W^2) lookup, O(N) storage and O(W) update complexity

Advantages
  Decreased storage
  Extensible to wider fields

Disadvantages
  Worst case lookup slow
  Backtracking makes implementation complex

Page 45: Routing Lookups and Packet Classification:  Theory and Practice

45

Path-compressed Tree

[Figure: path-compressed tree for the same example. Each node stores a variable-length bitstring, a next-hop (if a prefix ends there) and a bit position, e.g. (1, -, 2) at the root A, (10, P2, 4) at B and (1010, P3, 5) at D, with P1 and P4 at the leaves. A lookup of 10111 compares the stored bitstrings as it descends.]

P1  111*    H1
P2  10*     H2
P3  1010*   H3
P4  10101   H4

Page 46: Routing Lookups and Packet Classification:  Theory and Practice

46

Path-compressed Tree

• W-bit prefixes: O(W) lookup, O(N) storage and O(W) update complexity

Advantages
  Decreased storage

Disadvantages
  Worst case lookup slow

Page 47: Routing Lookups and Packet Classification:  Theory and Practice

47

Early Lookup Schemes

• BSD Unix [sklower91]: Patricia, expected lookup time = 1.44 log N

• Dynamic prefix trie [doeringer96]: Patricia variant, complex insertion/deletion; 40K entries consumed 2 MB at 0.3-0.5 Mpps

Page 48: Routing Lookups and Packet Classification:  Theory and Practice

48

Multi-bit Tries

[Figure: a binary trie has depth W, degree 2 and stride 1 bit; a multi-ary trie has depth W/k, degree 2^k and stride k bits.]

Page 49: Routing Lookups and Packet Classification:  Theory and Practice

49

Prefix Expansion with Multi-bit Tries

If stride = k bits, prefix lengths that are not a multiple of k need to be expanded. E.g., k = 2:

Prefix   Expanded prefixes
0*       00*, 01*
11*      11*

Maximum number of expanded prefixes corresponding to one non-expanded prefix = 2^(k-1)
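A sketch of the expansion step for an arbitrary stride k (illustrative; it simply rounds each prefix length up to the next multiple of k):

    def expand(prefix: str, k: int) -> list:
        """Expand a bit-string prefix so that its length becomes a multiple of k."""
        target = -(-len(prefix) // k) * k   # round the length up to a multiple of k
        pad = target - len(prefix)
        if pad == 0:
            return [prefix]
        return [prefix + format(i, "0{}b".format(pad)) for i in range(2 ** pad)]

    print(expand("0", 2))      # ['00', '01']
    print(expand("11", 2))     # ['11']
    print(expand("10101", 2))  # ['101010', '101011'] -- P4 of the example with k = 2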

Page 50: Routing Lookups and Packet Classification:  Theory and Practice

50

Four-ary Trie (k=2)

[Figure: four-ary trie (k = 2) for the example table after expansion. Each node holds four child pointers (ptr00, ptr01, ptr10, ptr11) plus a next-hop pointer if a prefix ends there; P1 and P4 each expand into two entries (P11/P12 and P41/P42). A lookup of 10111 consumes two bits per step.]

P1  111*    H1
P2  10*     H2
P3  1010*   H3
P4  10101   H4

Page 51: Routing Lookups and Packet Classification:  Theory and Practice

51

Compressed Trie (k=8)

[Figure: fixed-stride trie with an 8-8-8-8 split, giving levels L8, L16, L24 and L32: only 4 memory accesses per lookup.]

Page 52: Routing Lookups and Packet Classification:  Theory and Practice

52

Prefix Expansion Increases Storage Consumption

• Replication of next-hop pointers
• Greater number of unused (null) pointers in a node

Time ~ W/k, Storage ~ (NW/k) * 2^(k-1)

Page 53: Routing Lookups and Packet Classification:  Theory and Practice

53

Generalization: Different Strides at Each Trie Level

• 16-8-8 split
• 4-10-10-8 split
• 24-8 split
• 21-3-8 split

Page 54: Routing Lookups and Packet Classification:  Theory and Practice

54

Choice of Strides: Controlled Prefix Expansion [Sri98]

Given a forwarding table and a desired number of memory accesses in the worst case (i.e., maximum tree depth D), a dynamic programming algorithm computes the optimal sequence of strides that minimizes the storage requirements; it runs in O(W^2 D) time.

Advantages
  Optimal storage under these constraints

Disadvantages
  Updates lead to sub-optimality anyway
  Hardware implementation difficult

Page 55: Routing Lookups and Packet Classification:  Theory and Practice

55

Further Generalization: Different Stride at Each Node [Sri98]

Given a forwarding table and a desired number of memory accesses in the worst case (i.e., maximum tree depth D), a dynamic programming algorithm computes the optimal stride at each node that minimizes the storage requirements; it runs in O(N W^2 D) time.

Page 56: Routing Lookups and Packet Classification:  Theory and Practice

56

Stride Optimization: Implementation Results (38816 prefixes, 300 MHz Pentium II)

                 Two levels       Three levels
Fixed-stride     49 MB, 1 ms      1.8 MB, 1 ms
Varying-stride   1.6 MB, 130 ms   0.57 MB, 871 ms

Page 57: Routing Lookups and Packet Classification:  Theory and Practice

57

Lulea Algorithm [lulea98]

[Figure: Lulea trie with a 16-8-8 split: levels L16, L24 and L32.]

Page 58: Routing Lookups and Packet Classification:  Theory and Practice

58

Lulea Algorithm

[Figure: with the 16-8-8 split, each level is summarized by a bitmap (e.g. 1000101110001111) marking where the stored next-hop/pointer values change, so runs of identical entries are not stored repeatedly.]

Page 59: Routing Lookups and Packet Classification:  Theory and Practice

59

Lulea Algorithm

[Figure: the level bitmap is chopped into codewords (10001010, 11100010, 10000010, 10110100, 11000000) stored in a codeword array together with per-codeword offsets (R1,0; R2,3; R3,7; R4,9; R5,0); a base index array (0, 13) and a pointer array (P1, P2, P3, P4) hold the actual next-hop pointers.]

Page 60: Routing Lookups and Packet Classification:  Theory and Practice

60

Lulea Algorithm

33K entries: 160 KB, average 2 Mpps

Advantages
  Extremely small data structure – can fit in L1/L2 cache

Disadvantages
  Scalability to larger tables?
  Incremental updates not supported

Page 61: Routing Lookups and Packet Classification:  Theory and Practice

61

Binary Search on Trie Levels [wald98]

[Figure: binary search over the trie levels to locate the longest prefix P matching an address.]

Page 62: Routing Lookups and Packet Classification:  Theory and Practice

62

Binary Search on Trie Levels

Prefix length   Hash table contents
8               10
12
16              10.1, 10.2
22              10.1.10, 10.1.32, 10.2.64

Example prefixes: 10/8, 10.1/16, 10.1.10/22, 10.1.32/22, 10.2.64/22
Example addresses: 10.1.10.4, 10.2.3.9
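A simplified sketch of binary search over prefix lengths with markers, using a subset of the example prefixes (10/8, 10.1/16, 10.1.10/22); the layout and names are illustrative, and the full [wald98] scheme handles corner cases (e.g. markers with no shorter matching prefix) that this toy version omits:

    prefixes = {                          # bit-string prefix -> next hop (example values)
        "00001010": "A",                              # 10/8
        "0000101000000001": "B",                      # 10.1/16
        "0000101000000001000010": "C",                # 10.1.10/22
    }
    levels = sorted({len(p) for p in prefixes})       # [8, 16, 22]
    tables = {l: {} for l in levels}                  # one hash table per length

    def best_match(bits):
        """Longest real prefix that is a prefix of bits (None if there is none)."""
        cands = [p for p in prefixes if bits.startswith(p)]
        return max(cands, key=len) if cands else None

    # Insert each prefix, plus markers (with a precomputed best match) at the
    # levels a binary search visits on the way to the prefix's own level.
    for p in prefixes:
        i = levels.index(len(p))
        lo, hi = 0, len(levels) - 1
        while lo <= hi:
            mid = (lo + hi) // 2
            if mid == i:
                tables[levels[mid]][p] = p            # the prefix itself
                break
            elif mid < i:
                m = p[:levels[mid]]                   # marker
                tables[levels[mid]].setdefault(m, best_match(m))
                lo = mid + 1
            else:
                hi = mid - 1

    def lookup(addr_bits):
        lo, hi, best = 0, len(levels) - 1, None
        while lo <= hi:
            mid = (lo + hi) // 2
            entry = tables[levels[mid]].get(addr_bits[:levels[mid]])
            if entry is not None:
                best = entry          # a prefix, or a marker's precomputed best match
                lo = mid + 1          # a hit: try longer lengths
            else:
                hi = mid - 1          # a miss: try shorter lengths
        return prefixes.get(best) if best else None

    print(lookup("00001010000000010000101000000100"))   # 10.1.10.4 -> C
    print(lookup("00001010000000100000001100001001"))   # 10.2.3.9  -> A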

Page 63: Routing Lookups and Packet Classification:  Theory and Practice

63

Binary Search on Trie Levels

33K entries: 1.4 MB, 1.2-2.2 Mpps

Advantages
  Scales nicely to IPv6

Disadvantages
  Multiple hashed memory accesses
  Incremental updates complex

Page 64: Routing Lookups and Packet Classification:  Theory and Practice

64

Binary Search on Prefix Intervals [lampson98]

Prefix      Interval
P1  *       0000-1111
P2  00*     0000-0011
P3  1*      1000-1111
P4  1101    1101-1101
P5  001*    0010-0011

[Figure: the prefixes partition the number line 0000-1111 into six disjoint intervals I1-I6, each with a single best-matching prefix.]
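A minimal sketch of lookup by binary search on the precomputed intervals above (the endpoint list and the per-interval answers come from the example; the code structure is illustrative):

    import bisect

    # Upper bound of each interval I1..I6 and its precomputed best-matching
    # prefix, derived from P1 *, P2 00*, P3 1*, P4 1101, P5 001*.
    upper_bounds = [0b0001, 0b0011, 0b0111, 0b1100, 0b1101, 0b1111]
    best_prefix  = ["P2",   "P5",   "P1",   "P3",   "P4",   "P3"]

    def lookup(addr: int) -> str:
        """Binary-search for the interval containing addr: O(log 2N) comparisons."""
        i = bisect.bisect_left(upper_bounds, addr)
        return best_prefix[i]

    print(lookup(0b0010))   # -> P5
    print(lookup(0b1011))   # -> P3
    print(lookup(0b1101))   # -> P4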

Page 65: Routing Lookups and Packet Classification:  Theory and Practice

65

Alphabetic Tree

[Figure: a binary search tree over the interval boundaries (0001, 0011, 0111, 1100, 1101) whose leaves are the intervals I1-I6 induced by P1-P5 on the number line 0000-1111; each comparison branches on '>'.]

Page 66: Routing Lookups and Packet Classification:  Theory and Practice

66

Multiway Search on Intervals

38K entries: 0.95 MB, 2.1 Mpps

Advantages
  Space is O(N)

Disadvantages
  Incremental updates complex

Page 67: Routing Lookups and Packet Classification:  Theory and Practice

67

Depth-constrained Near-optimal Alphabetic Tree

• Redraw the binary search tree based on the probability of access of routing table entries:
  – Minimize average lookup time
  – But keep worst case lookup time bounded

40% improvement in lookup time with a small relaxation in worst case lookup time.

Page 68: Routing Lookups and Packet Classification:  Theory and Practice

68

Routing Lookups in Hardware [gupta98]

[Chart: number of prefixes vs. prefix length in the MAE-EAST routing table, April 11, 2000 (source: www.merit.edu).]

Page 69: Routing Lookups and Packet Classification:  Theory and Practice

69

Routing Lookups in Hardware

[Figure: the first 24 bits of the destination address (142.19.6 of 142.19.6.14) index directly into a table with 2^24 = 16M entries; for prefixes up to 24 bits long the entry (flag bit 1) holds the next hop.]

Page 70: Routing Lookups and Packet Classification:  Theory and Practice

70

Routing Lookups in Hardware

[Figure: for prefixes longer than 24 bits, the first-table entry (flag bit 0) holds a base pointer into a second table; the last 8 bits of the address (14 of 128.3.72.14) are added as an offset to the base to read the next hop.]
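A software sketch of this two-table (24-bit plus 8-bit) organization; in hardware these are flat DRAM arrays, and a real implementation fills a range of first-table slots per short prefix, which this toy version only notes in a comment:

    TBL24 = {}     # keyed by the top 24 bits: ("hop", next_hop) or ("ptr", base index)
    TBLlong = []   # 2^8-entry blocks for prefixes longer than 24 bits

    def add_short(prefix24: int, next_hop: str) -> None:
        # A real table writes 2^(24 - prefix_len) consecutive slots; here we
        # store only the single /24 slot for brevity.
        TBL24[prefix24] = ("hop", next_hop)

    def add_long(prefix24: int, low8: int, next_hop: str, default_hop: str) -> None:
        base = len(TBLlong)
        TBLlong.extend([default_hop] * 256)        # allocate one 2^8-entry block
        TBLlong[base + low8] = next_hop
        TBL24[prefix24] = ("ptr", base)

    def lookup(addr: int):
        hi, lo = addr >> 8, addr & 0xFF            # first 24 bits, last 8 bits
        kind, val = TBL24.get(hi, ("hop", None))
        return val if kind == "hop" else TBLlong[val + lo]

    add_short((142 << 16) | (19 << 8) | 6, "A")                      # 142.19.6/24 -> A
    add_long((128 << 16) | (3 << 8) | 72, 14, "C", default_hop="B")  # 128.3.72/24 -> B, 128.3.72.14/32 -> C
    print(lookup((142 << 24) | (19 << 16) | (6 << 8) | 14))   # -> A
    print(lookup((128 << 24) | (3 << 16) | (72 << 8) | 14))   # -> C
    print(lookup((128 << 24) | (3 << 16) | (72 << 8) | 99))   # -> B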

Page 71: Routing Lookups and Packet Classification:  Theory and Practice

71

Routing Lookups in Hardware

[Figure: generalization to an n + m split: a first table of 2^n entries covers prefixes up to n bits; each entry for longer prefixes points to a block of 2^m entries in a second table, which holds the next hops for the longer prefixes.]

Page 72: Routing Lookups and Packet Classification:  Theory and Practice

72

Routing Lookups in Hardware

Various compression schemes can be employed to decrease the storage requirements: e.g. employ carefully chosen variable length strides, bitmap compression etc.

Advantages
  20 Mpps with 50 ns DRAM, or 66 Mpps with e-DRAM
  Easy to implement in hardware

Disadvantages
  Large memory required (9-33 MB)
  Depends on prefix-length distribution

Page 73: Routing Lookups and Packet Classification:  Theory and Practice

73

Content-addressable Memory (CAM)

• Fully associative memory
• Exact match operation in a single clock cycle: parallel compare

Page 74: Routing Lookups and Packet Classification:  Theory and Practice

74

Lookups with Ternary-CAM

[Figure: the destination address is compared in parallel against every entry of the TCAM memory array (prefixes P32 down to P8 stored at locations 0 to M); a priority encoder selects the first matching location, which indexes the next-hop RAM.]

Page 75: Routing Lookups and Packet Classification:  Theory and Practice

75

Lookups with TCAM

Advantages
  Fast: 15-20 ns

Disadvantages
  Expensive (and low density): 0.25 MB at 50 MHz costs $30-$75
  High power: 5-8 W
  Updates slow

Page 76: Routing Lookups and Packet Classification:  Theory and Practice

76

Updates with TCAM

[Figure: TCAM entries (P32 ... P8) kept in order of decreasing prefix length at locations 0 to M, with empty space left between groups; the issue is how to manage this free space so that updates stay fast [Hoti'00].]

Page 77: Routing Lookups and Packet Classification:  Theory and Practice

77

Routing Lookups: Outline

• Background and problem definition

• Lookup schemes
• Comparative evaluation

Page 78: Routing Lookups and Packet Classification:  Theory and Practice

78

Performance Comparison: Complexity

Algorithm                      Lookup     Storage     Update
Binary trie                    W          NW          W
Patricia                       W^2        N           W
Path-compressed trie           W          N           W
Multi-ary trie                 W/k        N * 2^k     -
LC trie                        W          N           -
Lulea                          -          -           -
Binary search on trie levels   log W      N log W     -
Binary search on intervals     log(2N)    N           -
TCAM                           1          N           W

Page 79: Routing Lookups and Packet Classification:  Theory and Practice

79

Performance Comparison

Algorithm                                        Lookup (ns)   Storage (KB)
Patricia (BSD)                                   2500          3262
Multi-way fixed-stride optimal trie (3 levels)   298           1930
Multi-way fixed-stride optimal trie (5 levels)   428           660
LC trie                                          -             700
Lulea                                            409           160
Binary search on trie levels                     650           1600
6-way search on intervals                        490           950
Lookups with direct access                       15-60         9000-33000
TCAM                                             15-20         512

Page 80: Routing Lookups and Packet Classification:  Theory and Practice

80

Routing Lookups: References

• [lulea98] A. Brodnik, S. Carlsson, M. Degermark, S. Pink. “Small Forwarding Tables for Fast Routing Lookups”, Sigcomm 1997, pp 3-14.

• [gupta98] P. Gupta, S. Lin, N.McKeown. “Routing lookups in hardware at memory access speeds”, Infocom 1998, pp 1241-1248, vol. 3.

• P. Gupta, B. Prabhakar, S. Boyd. “Near-optimal routing lookups with bounded worst case performance,” Proc. Infocom, March 2000

• [lampson98] B. Lampson, V. Srinivasan, G. Varghese. “ IP lookups using multiway and multicolumn search”, Infocom 1998, pp 1248-56, vol. 3.

Page 81: Routing Lookups and Packet Classification:  Theory and Practice

81

Routing lookups : References (contd)

• [wald98] M. Waldvogel, G. Varghese, J. Turner, B. Plattner. “Scalable high speed IP routing lookups”, Sigcomm 1997, pp 25-36.

• [LC-trie] S. Nilsson, G. Karlsson. “Fast address lookup for Internet routers”, IFIP Intl Conf on Broadband Communications, Stuttgart, Germany, April 1-3, 1998.

• [sri98] V. Srinivasan, G.Varghese. “Fast IP lookups using controlled prefix expansion”, Sigmetrics, June 1998

• TCAM vendors: netlogicmicro.com, laratech.com, mosaid.com, sibercore.com

Page 82: Routing Lookups and Packet Classification:  Theory and Practice

82

Packet Classification

Page 83: Routing Lookups and Packet Classification:  Theory and Practice

83

Packet Classification: Outline

• Background and problem definition

• Classification schemes
• Comparative evaluation

Page 84: Routing Lookups and Packet Classification:  Theory and Practice

84

Flow-aware vs Flow-unaware Routers (recap)

• Flow-aware router: keeps track of flows and performs similar processing on packets in a flow

• Flow-unaware router (packet-by-packet router): treats each incoming packet individually

Page 85: Routing Lookups and Packet Classification:  Theory and Practice

85

Why Flow-aware Router?

ISPs want to provide differentiated services: the capability to distinguish and isolate traffic belonging to different flows based on negotiated service agreements (rules or policies). This is classification, and on top of it routers require additional mechanisms: admission control, resource reservation, per-flow queueing, fair scheduling etc.

Page 86: Routing Lookups and Packet Classification:  Theory and Practice

86

Need for Differentiated Services

[Figure: example topology: ISP1, ISP2 and ISP3 interconnect at a NAP; enterprise networks E1 and E2 attach to ISP1, with interfaces X, Y and Z marked.]

Service           Example
Traffic Shaping   Ensure that ISP3 does not inject more than 50 Mbps of total traffic on interface X, of which no more than 10 Mbps is email traffic
Packet Filtering  Deny all traffic from ISP2 (on interface X) destined to E2
Policy Routing    Send all voice-over-IP traffic arriving from E1 (on interface Y) and destined to E2 via a separate ATM network

Page 87: Routing Lookups and Packet Classification:  Theory and Practice

87

More Value added Services

• Differentiated services
  – Regard traffic from Autonomous System #33 as `platinum grade'

• Accounting and Billing
  – Treat all video traffic as highest priority and perform accounting for this type of traffic

• Committed Access Rate (rate limiting)
  – Rate limit WWW traffic from sub-interface #739 to 10 Mbps

Page 88: Routing Lookups and Packet Classification:  Theory and Practice

88

Multi-field Packet Classification

Given a classifier with N rules, find the action associated with the highest priority rule matching an incoming packet.

         Field 1       Field 2        ...   Field k   Action
Rule 1   5.3.90/21     2.13.8.11/32   ...   UDP       A1
Rule 2   5.168.3/24    152.133/16     ...   TCP       A2
...      ...           ...            ...   ...       ...
Rule N   5.168/16      152/8          ...   ANY       AN

Example: the packet (5.168.3.32, 152.133.171.71, ..., TCP) matches both Rule 2 and Rule N; Rule 2 is the higher-priority match.

Page 89: Routing Lookups and Packet Classification:  Theory and Practice

89

Packet Header Fields for Classification

[Figure: packet layout in order of transmission: MAC header (L2-DA, L2-SA), network layer header (L3-DA, L3-SA, L3-PROT), transport layer header (L4-SP, L4-DP, L4-PROT), then the payload.]

DA = Destination Address, SA = Source Address, PROT = Protocol, SP = Source port, DP = Destination port
L2 = layer 2 (e.g., Ethernet), L3 = layer 3 (e.g., IP), L4 = layer 4 (e.g., TCP)

Page 90: Routing Lookups and Packet Classification:  Theory and Practice

90

Flow-aware Router: Basic Architectural Components

[Figure: control plane (routing, resource reservation, admission control, SLAs) above a per-packet datapath consisting of routing lookup, packet classification, special processing, switching and scheduling.]

Page 91: Routing Lookups and Packet Classification:  Theory and Practice

91

Packet Classification

[Figure: the header of an incoming packet is matched by the packet classification stage of the forwarding engine against a classifier (policy database) of (predicate, action) rules; the result is the action to apply.]

Page 92: Routing Lookups and Packet Classification:  Theory and Practice

92

Packet Classification: Problem Definition

Given a classifier C with N rules Rj, 1 <= j <= N, where Rj consists of three entities:

1) A regular expression Rj[i], 1 <= i <= d, on each of the d header fields,

2) A number, pri(Rj), indicating the priority of the rule in the classifier, and

3) An action, referred to as action(Rj).

For an incoming packet P with the header considered as a d-tuple of points (P1, P2, ..., Pd), the d-dimensional packet classification problem is to find the rule Rm with the highest priority among all the rules Rj matching the d-tuple, i.e., pri(Rm) > pri(Rj) for all j != m, 1 <= j <= N, such that Pi matches Rj[i], 1 <= i <= d. We call rule Rm the best matching rule for packet P.
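A direct linear-scan reading of this definition, useful as a reference implementation; the rule format (two IP prefix fields plus exact/wildcard port and protocol, as in the next slide's example) and the helper names are illustrative, and the prefix test is simplified to whole-octet masks:

    def prefix_match(pred: str, value: str) -> bool:
        """'152.168.3/255.255.255'-style or '*' predicates on dotted quads."""
        if pred == "*":
            return True
        net = pred.split("/")[0].split(".")
        return value.split(".")[: len(net)] == net    # compare whole octets only

    def field_match(pred, value):
        return pred == "*" or pred == value

    # (L3-DA, L3-SA, L4-DP, L4-PROT, action), highest priority first
    rules = [
        ("152.163.190.69/255.255.255.255", "152.163.80.11/255.255.255.255", "*", "*", "Deny"),
        ("152.168.3/255.255.255", "152.163.200.157/255.255.255.255", "www", "udp", "Deny"),
        ("*", "*", "*", "*", "Deny"),
    ]

    def classify(da, sa, dp, prot):
        for r_da, r_sa, r_dp, r_prot, action in rules:   # first match = best match
            if (prefix_match(r_da, da) and prefix_match(r_sa, sa)
                    and field_match(r_dp, dp) and field_match(r_prot, prot)):
                return action
        return None

    print(classify("152.168.3.21", "152.163.200.157", "www", "udp"))  # -> Deny (rule 2)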

Page 93: Routing Lookups and Packet Classification:  Theory and Practice

93

Example 4D classifier

Rule  L3-DA                             L3-SA                              L4-DP        L4-PROT   Action
R1    152.163.190.69/255.255.255.255    152.163.80.11/255.255.255.255      *            *         Deny
R2    152.168.3/255.255.255             152.163.200.157/255.255.255.255    eq www       udp       Deny
R3    152.168.3/255.255.255             152.163.200.157/255.255.255.255    range 20-21  udp       Permit
R4    152.168.3/255.255.255             152.163.200.157/255.255.255.255    eq www       tcp       Deny
R5    *                                 *                                  *            *         Deny

Page 94: Routing Lookups and Packet Classification:  Theory and Practice

94

Example Classification Results

Pkt Hdr   L3-DA            L3-SA             L4-DP   L4-PROT   Rule, Action
P1        152.163.190.69   152.163.80.11     www     tcp       R1, Deny
P2        152.168.3.21     152.163.200.157   www     udp       R2, Deny

Page 95: Routing Lookups and Packet Classification:  Theory and Practice

95

Classification is a Generalization of Lookup

• Classifier = routing table
• One dimension (destination address)
• Rule = routing table entry
• Regular expression = prefix
• Action = (next-hop-address, port)
• Priority = prefix-length

Page 96: Routing Lookups and Packet Classification:  Theory and Practice

96

Metrics for Classification Algorithms

• Speed
• Storage requirements
• Low update time
• Ability to handle large classifiers
• Flexibility in implementation
• Low preprocessing time
• Scalability in the number of header fields
• Flexibility in rule specification

Page 97: Routing Lookups and Packet Classification:  Theory and Practice

97

Size of Classifier?

• Microflow recognition: 128K-1M flows in a metro/edge router
• Firewall applications: 8-16K
• Wildcarded filters: 16-128K
• Depends heavily on where your box will be deployed

Page 98: Routing Lookups and Packet Classification:  Theory and Practice

98

Packet Classification: Outline

• Background and problem definition

• Classification schemes
• Comparative evaluation

Page 99: Routing Lookups and Packet Classification:  Theory and Practice

99

Example Classifier

Rule   Destination Address   Source Address

R1 0* 10*

R2 0* 01*

R3 0* 1*

R4 00* 1*

R5 00* 11*

R6 10* 1*

R7 * 00*

Page 100: Routing Lookups and Packet Classification:  Theory and Practice

100

Set-pruning Tries [Tsuchiya, Sri98]

[Figure: set-pruning tries. A trie on the destination-address (DA) dimension; each DA node points to a source-address (SA) trie containing every rule whose DA prefix matches that node, so rules such as R7 are replicated in several SA tries.]

Rule  DA   SA
R1    0*   10*
R2    0*   01*
R3    0*   1*
R4    00*  1*
R5    00*  11*
R6    10*  1*
R7    *    00*

O(N^2) memory

Page 101: Routing Lookups and Packet Classification:  Theory and Practice

101

Grid-of-Tries [Sri98]

[Figure: grid-of-tries. The DA trie's nodes each point to an SA trie holding only the rules stored at that DA node, so each rule is stored once.]

O(NW) memory, O(W^2) lookup

Rule  DA   SA
R1    0*   10*
R2    0*   01*
R3    0*   1*
R4    00*  1*
R5    00*  11*
R6    10*  1*
R7    *    00*

Page 102: Routing Lookups and Packet Classification:  Theory and Practice

102

Grid-of-Tries [Sri98]

[Figure: grid-of-tries with switch pointers: when a search in an SA trie fails, it jumps directly into the SA trie of the next shorter matching DA prefix instead of restarting, reducing the lookup cost.]

O(NW) memory, O(2W) lookup

(Same example classifier as above.)

Page 103: Routing Lookups and Packet Classification:  Theory and Practice

103

Grid-of-Tries

Advantages
  Good solution for two dimensions

Disadvantages
  Static solution
  Not easily extensible to more than two dimensions

20K entries: 2 MB, 9 memory accesses (with expansion)

Page 104: Routing Lookups and Packet Classification:  Theory and Practice

104

Geometric Interpretation in 2D

[Figure: each rule R1-R7 is a rectangle in the plane spanned by dimension #1 and dimension #2, and each packet (P1, P2) is a point. A fully specified pair such as (144.24/16, 64/24) is a small rectangle; a pair with a wildcard such as (128.16.46.23, *) is a line.]

Page 105: Routing Lookups and Packet Classification:  Theory and Practice

105

Bitmap-intersection [Lak98]

[Figure: the projections of rules R1-R4 onto each dimension define intervals, and each interval stores a bitmap (e.g. 1100, 1011) of the rules it overlaps. A lookup finds the interval containing the packet in each dimension, ANDs the bitmaps, and takes the highest-priority rule whose bit survives.]

Page 106: Routing Lookups and Packet Classification:  Theory and Practice

106

Bitmap-intersection

Advantages
  Good solution for multiple dimensions, for small classifiers

Disadvantages
  Static solution
  Large memory bandwidth (scales linearly in N)
  Large amount of memory (scales quadratically in N)
  Hardware-optimized

512 rules: 1 Mpps with a single FPGA (33 MHz) and five 1 Mb SRAM chips

Page 107: Routing Lookups and Packet Classification:  Theory and Practice

107

2D classification [Lak98]

[Figure: 2D view for [Lak98]: rules R1-R7 specified as prefixes (of length 3 and 4) in one dimension and as ranges in the other, with packet P1 as a point.]

Page 108: Routing Lookups and Packet Classification:  Theory and Practice

108

2D Classification [Lak98]: Preprocessing

• Store the prefixes in a trie
• With each prefix, store the set of intervals that form a rectangle with that prefix as the other side
• Store those intervals as a set of non-overlapping disjoint intervals

Page 109: Routing Lookups and Packet Classification:  Theory and Practice

109

2D Classification [Lak98]: Lookup

• For each prefix length:
  – Find the prefix matching the incoming point, and the set of non-overlapping intervals associated with that prefix
  – Search for the non-overlapping interval that contains the point

• Repeat for all prefix lengths

Page 110: Routing Lookups and Packet Classification:  Theory and Practice

110

2D Classification [Lak98]: Complexity

• Lookups: O(W log N) with N two-dimensional rules; O(W + log N) using fractional cascading
• Space: O(N)
• Static data structure

Page 111: Routing Lookups and Packet Classification:  Theory and Practice

111

Crossproducting [Sri98]

[Figure: the distinct rule projections cut the first dimension into intervals 1-9 and the second into intervals 1-6; a packet maps to a pair of interval indices, e.g. P1 -> (8,4) and P2 -> (1,3), which index a precomputed crossproduct table of best matching rules.]

Page 112: Routing Lookups and Packet Classification:  Theory and Practice

112

Crossproducting

Advantages
  Fast accesses
  Suitable for multiple fields

Disadvantages
  Large amount of memory
  Need caching for bigger classifiers (> 50 rules)

50 rules: 1.5 MB; need caching (on-demand crossproducting) for bigger classifiers
Needs d 1-D lookups + 1 memory access, O(N^d) space

Page 113: Routing Lookups and Packet Classification:  Theory and Practice

113

Space-time Tradeoff

Point location among N non-overlapping regions in d dimensions requires either O(log N) time with O(N^d) space, or O(log^(d-1) N) time with O(N) space.

Need help: exploit structure in real-life classifiers.

Page 114: Routing Lookups and Packet Classification:  Theory and Practice

114

Recursive Flow Classification [Gupta99]

Observations:

• Difficult to achieve both high classification rate and reasonable storage in the worst case
• Real classifiers exhibit structure and redundancy
• A practical scheme could exploit this structure and redundancy

Page 115: Routing Lookups and Packet Classification:  Theory and Practice

115

RFC: Classifier Dataset

• 793 classifiers from 101 ISP and enterprise networks with a total of 41505 rules.

• 40 classifiers: more than 100 rules. Biggest classifier had 1733 rules.

• Maximum of 4 fields per rule: source IP address, destination IP address, protocol and destination port number.

Page 116: Routing Lookups and Packet Classification:  Theory and Practice

116

Structure of the Classifiers

[Figure: three non-overlapping rules R1, R2, R3 in the plane create 4 distinct regions.]

Page 117: Routing Lookups and Packet Classification:  Theory and Practice

117

Structure of the Classifiers

[Figure: when R1, R2 and R3 overlap, additional regions such as {R1, R2}, {R2, R3} and {R1, R2, R3} appear, for 7 distinct regions in total.]

Dataset: the 1733-rule classifier defined only 4316 distinct regions (the worst case is about 10^13!)

Page 118: Routing Lookups and Packet Classification:  Theory and Practice

118

Recursive Flow Classification

[Figure: one-step mapping from the 2^S = 2^128 possible packet headers directly to 2^T = 2^12 classes, vs. a multi-step (recursive) mapping through intermediate spaces of size 2^64 and 2^32.]

Page 119: Routing Lookups and Packet Classification:  Theory and Practice

119

Chunking of a Packet

[Figure: the packet header fields (source L3 address, destination L3 address, L4 protocol and flags, source L4 port, destination L4 port, type of service) are split into chunks, Chunk #0 through Chunk #7.]

Page 120: Routing Lookups and Packet Classification:  Theory and Practice

120

Packet Flow

[Figure: RFC packet flow. In Phase 0 the header chunks index into small tables; in Phases 1-3 the resulting indices are combined through further tables (the reduction), shrinking the 128-bit header through intermediate widths (64, 32, 16 bits) until the final table yields the index of the action.]

Page 121: Routing Lookups and Packet Classification:  Theory and Practice

121

Choice of Reduction Tree

[Figure: two possible reduction trees over the six chunks 0-5: one with P = 3 phases needs 10 memory accesses, one with P = 4 phases needs 11 memory accesses.]

Page 122: Routing Lookups and Packet Classification:  Theory and Practice

122

RFC: Storage Requirements

[Chart: RFC memory in Mbytes vs. number of rules.]

Page 123: Routing Lookups and Packet Classification:  Theory and Practice

123

RFC: Classification Time

• Pipelined hardware: 30 Mpps (worst case, OC192) using two 4 Mb SRAMs and two 64 Mb SDRAMs at 125 MHz.

• Software (3 phases): 1 Mpps in the worst case and 1.4-1.7 Mpps in the average case (average case OC48). [Performance measured using the Intel VTune simulator on a Windows NT platform.]

Page 124: Routing Lookups and Packet Classification:  Theory and Practice

124

RFC: Pros and Cons

Advantages
  Exploits structure of real-life classifiers
  Suitable for multiple fields
  Supports non-contiguous masks
  Fast accesses

Disadvantages
  Depends on structure of classifiers
  Large pre-processing time
  Incremental updates slow
  Large worst-case storage requirements

Page 125: Routing Lookups and Packet Classification:  Theory and Practice

125

Hierarchical Intelligent Cuttings (HiCuts)

[Gupta99]

• No single good solution for all cases – But real classifiers have structure

• Perhaps an algorithm can exploit this structure– A heuristic hybrid scheme …

Observations:

Page 126: Routing Lookups and Packet Classification:  Theory and Practice

126

HiCuts: Basic Idea

[Figure: a decision tree repeatedly cuts the rule set {R1, R2, R3, ..., Rn} into smaller subsets, e.g. {R1, R3, R4}, {R1, R2, R5}, {R8, Rn}, until each leaf holds at most Binth rules (BinThreshold = maximum subset size = 3 here); leaves are then searched linearly.]

Page 127: Routing Lookups and Packet Classification:  Theory and Practice

127

Heuristics to Exploit Classifier Structure

• Picking a suitable dimension to hicut across:
  – Minimize the maximum number of rules falling into any one partition, OR
  – Maximize the entropy of the distribution of rules across the partitions, OR
  – Maximize the number of distinct specifications in one dimension

• Picking a suitable number of partitions (hicuts) to be made:
  – Affects the space consumed and the classification time; tuned by a parameter, spfac

Page 128: Routing Lookups and Packet Classification:  Theory and Practice

128

HiCuts:Number of Memory Accesses

Binth = 8, spfac = 4

[Chart: number of memory accesses vs. number of rules (log scale), compared against crossproducting.]

Page 129: Routing Lookups and Packet Classification:  Theory and Practice

129

HiCuts: Storage Requirements

Binth = 8, spfac = 4

[Chart: space in Kbytes (log scale) vs. number of rules (log scale).]

Page 130: Routing Lookups and Packet Classification:  Theory and Practice

130

Incremental Update Time

Binth = 8, spfac = 4, 333 MHz Pentium II running Linux

[Chart: incremental update time in seconds (log scale) vs. number of rules (log scale).]

Page 131: Routing Lookups and Packet Classification:  Theory and Practice

131

HiCuts: Pros and Cons

Advantages
  Exploits structure of real-life classifiers
  Adapts data structure
  Suitable for multiple fields
  Supports incremental updates

Disadvantages
  Depends on structure of classifiers
  Large pre-processing time
  Large worst-case storage requirements

Page 132: Routing Lookups and Packet Classification:  Theory and Practice

132

Tuple Space Search [Suri99]

Decompose the classification problem into a number of exact match problems, then use hashing.

Rule   Prefixes       Tuple
R1     (01*, 111*)    [2,3]
R2     (11*, 010*)    [2,3]
R3     (1*, *)        [1,0]

Use one hash table for each tuple, and search all the hash tables sequentially.
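A minimal tuple-space sketch for the two-field example above (the tuple is simply the pair of prefix lengths); names and structure are illustrative:

    # Rules: (field-1 prefix, field-2 prefix, name); "" stands for *.
    rules = [("01", "111", "R1"), ("11", "010", "R2"), ("1", "", "R3")]

    tables = {}   # tuple (len1, len2) -> hash table keyed by concatenated prefix bits
    for f1, f2, name in rules:
        tables.setdefault((len(f1), len(f2)), {})[f1 + f2] = name

    def classify(bits1: str, bits2: str):
        """Probe every tuple's hash table; return all matching rules."""
        matches = []
        for (l1, l2), table in tables.items():
            key = bits1[:l1] + bits2[:l2]
            if key in table:
                matches.append(table[key])
        return matches

    print(classify("0110", "1110"))   # -> ['R1'] (found in tuple [2,3])
    print(classify("1010", "0000"))   # -> ['R3'] (found in tuple [1,0])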

Page 133: Routing Lookups and Packet Classification:  Theory and Practice

133

Improved TSS via Precomputation

• Extension of “binary search on trie levels”

• If a probe of [2,3,3] succeeds, there is no need to search less specific tuples such as [1,2,1]

• If a probe of [2,3,3] fails, there is no need to search more specific tuples such as [4,5,6]

• Search the tuple space intelligently (a decision tree on the tuple space)

Page 134: Routing Lookups and Packet Classification:  Theory and Practice

134

TSS: Pros and Cons

Advantages
  Suitable for multiple fields
  Supports incremental updates
  Fast classification and updates on average

Disadvantages
  Large pre-processing time
  Multiple hashed-memory accesses

Page 135: Routing Lookups and Packet Classification:  Theory and Practice

135

Area-based Quad Tree [Buddhikot99]

[Figure: area-based quad tree. The 2D space is recursively split into four quadrants (00, 01, 10, 11); each node stores its crossing filter set, e.g. {R1, R2} at the root and {R5}, {R3, R4} at children, and packet P1 descends to a leaf.]

Lookup: two 1-D longest prefix match operations at every node on the path from the root to a leaf.

O(N) space; O(W log N) lookup time, O(W + log N) using fractional cascading.

Page 136: Routing Lookups and Packet Classification:  Theory and Practice

136

AQT: Efficient Updates

[Figure: an update replaces an old group of prefixes with a new one.]

Partition prefixes into groups and do the pre-computation per group instead of per interval: O(aW) search and O(a N^(1/a)) updates.

Page 137: Routing Lookups and Packet Classification:  Theory and Practice

137

2-D Classification Using FIS Tree [Feldmann00]

[Figure: rules R1-R5 as rectangles; packet P1 is located by searching an FIS (fat inverted segment) tree built over the projections on the x-axis.]

l levels, O(l n^(1+1/l)) space, (l+1) 1-D lookups

Page 138: Routing Lookups and Packet Classification:  Theory and Practice

138

FIS Tree: Experimental Study

Number of rules   Levels in FIS tree   Storage space   Number of memory accesses
4-60 K            2                    < 5 MB          < 15
~10^6             3                    < 100 MB        < 18

Rulesets constructed using netflow data from AT&T Worldnet. Experiments done using static 2-D FIS trees.

Page 139: Routing Lookups and Packet Classification:  Theory and Practice

139

Ternary CAMs

Advantages
  Suitable for multiple fields
  Fast: 16-20 ns (50-66 Mpps)
  Simple to understand

Disadvantages
  Inflexible: range-to-prefix blowup
  Density: largest available in 2000 is 32K x 128 bits (but can be cascaded)
  Management software and on-chip logic: non-trivial complexity
  Power: 5-8 W
  Incremental updates: slow
  DRAM-based CAMs: higher density, but soft errors are a problem
  Cost: $30-$160 for 1 Mb

Page 140: Routing Lookups and Packet Classification:  Theory and Practice

140

Range-to-prefix Blowup

Rule   Range     Maximal Prefixes
R1     [3,11]    0011, 01**, 10**
R2     [2,7]     001*, 01**
R3     [4,11]    01**, 10**
R4     [4,7]     01**
R5     [1,14]    0001, 001*, 01**, 10**, 110*, 1110

Maximum memory blowup = factor of (2W-2)^d
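A sketch of the standard range-to-prefix decomposition on a W-bit field (illustrative), which reproduces rows of the table above:

    def range_to_prefixes(lo: int, hi: int, w: int = 4) -> list:
        """Split [lo, hi] into a minimal set of prefixes over w-bit values."""
        out = []
        while lo <= hi:
            # Largest aligned block that starts at lo and stays within [lo, hi]
            size = lo & -lo if lo else 1 << w
            while size > hi - lo + 1:
                size //= 2
            bits = w - size.bit_length() + 1          # length of this prefix
            out.append(format(lo >> (w - bits), "0{}b".format(bits)) + "*" * (w - bits))
            lo += size
        return out

    print(range_to_prefixes(1, 14))   # ['0001', '001*', '01**', '10**', '110*', '1110']
    print(range_to_prefixes(4, 7))    # ['01**']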

Page 141: Routing Lookups and Packet Classification:  Theory and Practice

141

Packet Classification: References

• [Lak98] T. V. Lakshman, D. Stiliadis. “High speed policy based packet forwarding using efficient multi-dimensional range matching”, Sigcomm 1998, pp 191-202

• [Sri98] V. Srinivasan, S. Suri, G. Varghese and M. Waldvogel. “Fast and scalable layer 4 switching”, Sigcomm 1998, pp 203-214

• [Suri99] V. Srinivasan, G. Varghese, S. Suri. “Fast packet classification using tuple space search”, Sigcomm 1999, pp 135-146

• [Gupta99] P. Gupta, N. McKeown, “Packet classification using hierarchical intelligent cuttings,” Hot Interconnects VII, 1999

Page 142: Routing Lookups and Packet Classification:  Theory and Practice

142

Packet Classification: References (contd.)

• [Gupta99] P. Gupta, N. McKeown, “Packet classification on multiple fields,” Sigcomm 1999, pp 147-160

• [Buddhikot99] M. M. Buddhikot, S. Suri, and M. Waldvogel, “Space decomposition techniques for fast layer-4 switching,” Protocols for High Speed Networks, vol. 66, no. 6, pp 277-283, 1999

• [Feldmann00] A. Feldmann and S. Muthukrishnan, “Tradeoffs for packet classification,” Infocom 2000

• T. Woo, “A modular approach to packet classification: algorithms and results, “ Infocom 2000

Page 143: Routing Lookups and Packet Classification:  Theory and Practice

143

Special Instances of Classification

• Multicast
  – PIM-SM: longest prefix matching on the source and group address; try (S,G), then (*,G), then (*,*,RP); check the incoming interface
  – DVMRP: incoming interface check followed by an (S,G) lookup

• IPv6: 128-bit destination address field

Page 144: Routing Lookups and Packet Classification:  Theory and Practice

144

Implementation Choices Given Design Requirements

Disclaimer: These are my opinions

Page 145: Routing Lookups and Packet Classification:  Theory and Practice

145

Design Requirement LU1

Requirements: 2.5 Gbps, 100K routes

Choices:
a) 2-4 TCAMs
b) On-chip logic with one external SDRAM chip (using multibit tries)
c) On-chip e-DRAM

Page 146: Routing Lookups and Packet Classification:  Theory and Practice

146

Design Requirement LU2

Requirements: 10 Gbps, 256K routes

Choices:
a) 4-8 TCAMs
b) On-chip logic with 2-4 external SDRAM chips (using multibit tries)
c) On-chip e-DRAM

Page 147: Routing Lookups and Packet Classification:  Theory and Practice

147

Design Requirement PC1

Requirements: 10 Gbps classification up to L4, 16-64K comparatively static 128-bit entries

Choices:
a) 1-4 TCAMs
b) On-chip logic with 2 external SDRAM and 2 SRAM chips (using RFC)
c) Off-chip SRAMs (using HiCuts)

Page 148: Routing Lookups and Packet Classification:  Theory and Practice

148

Your Design Here

Requirements:

Choices:

Page 149: Routing Lookups and Packet Classification:  Theory and Practice

149

Lookup/Classification Chip Vendors

• Switch-on
• Fastchip
• Agere
• Solidum
• Siliconaccess
• TCAM vendors: Netlogic, Lara, Sibercore, Mosaid, Klsi etc.

Page 150: Routing Lookups and Packet Classification:  Theory and Practice

150

Summary

• Both problems are well studied by now, but increasing line rates and database sizes continue to present interesting opportunities

• Still needed: a high-speed (~OC192), dynamic, generic, multi-field classification algorithm for a large number of (up to a million) rules

Page 151: Routing Lookups and Packet Classification:  Theory and Practice

151

Thanks! I will appreciate direct feedback at [email protected]