An Efficient, Hardware-based Multi-Hash Scheme for High Speed IP Lookup

Socrates Demetriades, Michel Hanna, Sangyeun Cho and Rami Melhem. Hot Interconnects 2008.


Page 1: An Efficient, Hardware-based Multi-Hash Scheme for     High Speed IP Lookup

An Efficient, Hardware-based Multi-Hash Scheme for High Speed IP Lookup

Hot Interconnects 2008

Socrates Demetriades, Michel Hanna, Sangyeun Cho and Rami Melhem

Page 2: An Efficient, Hardware-based Multi-Hash Scheme for     High Speed IP Lookup

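Background: IP Lookup in a Core Router

Incoming packet -> look up IP address -> outgoing link

Forwarding table: IP address | Next Hop

Example: the address 10101110 matches the entry 1010**** (Port 2) by longest prefix matching, so the packet is forwarded to Port 2.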

Page 3: An Efficient, Hardware-based Multi-Hash Scheme for     High Speed IP Lookup

Motivation

Increasing Internet traffic
- High-speed links: optical technology -> link rates ~100 Gbps
- High-speed routers: TCAM-based forwarding engines
- Larger forwarding tables: TCAMs FAIL to scale.

Page 4: An Efficient, Hardware-based Multi-Hash Scheme for     High Speed IP Lookup

IP Lookup Schemes

1. TCAM-based schemes. [idt, netlogic, micron, CoolCAM]
   1. Fast and constant lookup time
   2. High cost and power consumption
2. Trie-based schemes. [Eatherton04, Devroye03,…]
   1. Multi-cycle lookup latencies and low worst-case throughput.
   2. Performance and scalability are fundamentally tied to the IP address length.
3. Hash-based schemes. [Srinivasan98, Hasan06, Kaxiras05,…]
   1. Key-length-independent latencies
   2. Easy to implement in hardware
   3. Hashing collisions -> space inefficiency
   4. Hash keys (prefixes) include "don't care" bits, which complicate hashing.

Page 5: An Efficient, Hardware-based Multi-Hash Scheme for     High Speed IP Lookup

Overview

Problem: Hash-based schemes can be power and cost efficient but are still space inefficient or slow.

Goal: A hardware-based forwarding engine that has:
1. Constant and high-speed lookup throughput.
2. Space efficiency.
3. Good scaling with increasing forwarding table sizes.
4. Low cost and power consumption.

Proposal: A h/w-based multi-hash architecture with high throughput (1 packet lookup per memory cycle) that is also space and power efficient.

Page 6: An Efficient, Hardware-based Multi-Hash Scheme for     High Speed IP Lookup

Outline

- Introduction
- High Speed and Space Efficient Implementation
- Selecting hashing bits / Dealing with wildcard bits
- Experimental Evaluation
- Summary

Page 7: An Efficient, Hardware-based Multi-Hash Scheme for     High Speed IP Lookup

h/w Hash-based IP Lookup

[Figure: the key (IP address) enters the hash index generator, which selects one of the 2^R rows of a C-way associative memory array (C entries per row). The C keys fetched from that row (key1, key2, …, keyC) go to the matching processors (match1, match2, …, matchC) and then to the LPM logic.]

- Much more power efficient than TCAM.
- High throughput.

Page 8: An Efficient, Hardware-based Multi-Hash Scheme for     High Speed IP Lookup

Hash-based IP Lookup example

[Figure: the key (IP address) 10101110 / 8 bits enters the hash index generator and selects one row of the table (2^R rows, C entries per row). The fetched keys are 1010****, 1111**** and 10100001. Only 1010**** matches, and its next hop is returned.]

Page 9: An Efficient, Hardware-based Multi-Hash Scheme for     High Speed IP Lookup

Hash-based IP Lookup - LPM

[Figure: the key (IP address) 10101110 / 8 bits enters the hash index generator and selects one row of the table (2^R rows, C entries per row). The fetched keys are 1010****, 101011** and 1010111*. All three match; LPM (Longest Prefix Match) selects 1010111*, and its next hop is returned.]
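The LPM step on this slide can be sketched in a few lines of Python. This is only an illustration of the selection rule, not the hardware matching logic: prefixes are written as bit strings with trailing `*` wildcards as on the slide, and the function names and port labels are hypothetical.

```python
def matches(prefix: str, address: str) -> bool:
    """Return True if every non-wildcard bit of prefix equals the address bit."""
    return len(prefix) == len(address) and all(
        p == "*" or p == a for p, a in zip(prefix, address)
    )


def longest_prefix_match(row, address):
    """Among the entries fetched from one row, pick the match with the most fixed bits."""
    hits = [(prefix, hop) for prefix, hop in row if matches(prefix, address)]
    if not hits:
        return None
    # Assumes wildcards are trailing, so the prefix length is the string
    # length without its '*' suffix.
    return max(hits, key=lambda entry: len(entry[0].rstrip("*")))


# Row contents taken from the slide; the port labels are hypothetical placeholders.
row = [("1010****", "port A"), ("101011**", "port B"), ("1010111*", "port C")]
print(longest_prefix_match(row, "10101110"))
```

In hardware the C comparisons run in parallel; the list comprehension here only mirrors the outcome of that step.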

Page 10: An Efficient, Hardware-based Multi-Hash Scheme for     High Speed IP Lookup

Hash index generation

Simple XOR-folding hash function

[Figure: IP prefix or incoming IP address -> Bit-Select mechanism -> N selected bits, split into F bits and R = N - F bits -> Skew -> XOR (XOR hash function) -> R-bit hash index.]
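A minimal sketch of bit selection followed by XOR folding is given below. The slide does not define the skew, so it is modelled here as a left rotation of the F-bit part inside an R-bit word; that choice, the requirement F <= R, and the function names are assumptions made for illustration.

```python
def select_bits(address: int, positions) -> int:
    """Concatenate the address bits at the given positions (0 = least significant)."""
    value = 0
    for pos in positions:
        value = (value << 1) | ((address >> pos) & 1)
    return value


def xor_fold_index(selected: int, n: int, f: int, skew: int = 0) -> int:
    """Fold N selected bits into an R-bit index, with R = N - F."""
    r = n - f
    if not 0 < f <= r:
        raise ValueError("this sketch assumes 0 < F <= R")
    mask = (1 << r) - 1
    low = selected & mask                       # R-bit part
    high = (selected >> r) & ((1 << f) - 1)     # F-bit part
    s = skew % r
    # Assumed skew: rotate the F-bit part left by s positions within R bits.
    rotated = ((high << s) | (high >> (r - s))) & mask
    return low ^ rotated


index = xor_fold_index(0b101011001010, n=12, f=4, skew=4)
print(format(index, "08b"))
```

The result always fits in R bits, so it can address one of the 2^R rows directly.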

Page 11: An Efficient, Hardware-based Multi-Hash Scheme for     High Speed IP Lookup

Inserting / Hashing IP prefixes

Single hash table: space utilization = 30%

[Figure: bucket load versus bucket index for a single hash table, showing the used memory space against the total available memory space.]

Balanced is better.

Page 12: An Efficient, Hardware-based Multi-Hash Scheme for     High Speed IP Lookup

How to improve the utilization of the hash table

- Powerful hash functions -> complexity -> delay on the critical lookup path.
- Adaptive perfect or semi-perfect hash functions -> the whole routing table must be rehashed periodically, a very time-consuming process.
- Using multiple hash functions (MHT) -> increased space efficiency.

Our proposal: multi-hashing scheme (MHT) + items are allowed to migrate during the insertion operation.

Page 13: An Efficient, Hardware-based Multi-Hash Scheme for     High Speed IP Lookup

IP prefix insertion (multi-hashing)

[Figure: three hash tables indexed by h1, h2 and h3; legend: used entry.]

Page 14: An Efficient, Hardware-based Multi-Hash Scheme for     High Speed IP Lookup

Hashing IP prefixes: multi-hashing

[Figure: bucket load versus bucket index for the two configurations below.]

- Single hashing (single hash table): space utilization 30%
- Multi-hashing with 3 hash tables: space utilization 50%

Page 15: An Efficient, Hardware-based Multi-Hash Scheme for     High Speed IP Lookup

Migrations are allowed during the insertion operation

[Figure: three hash tables indexed by h1, h2 and h3.]

Insertion time?

Page 16: An Efficient, Hardware-based Multi-Hash Scheme for     High Speed IP Lookup

Hashing prefixes: MHT + migrations

(a) Single hashing (single hash table): space utilization 30%
(b) Multi-hashing with 3 hash tables: space utilization 50%
(c) Multi-hashing with 3 hash tables + migrations: space utilization 70%

Page 17: An Efficient, Hardware-based Multi-Hash Scheme for     High Speed IP Lookup

Crisis: Handling unresolved collisions

[Figure: three hash tables indexed by h1, h2 and h3, backed by a victim TCAM.]
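The insertion path described on the last few slides (multi-hashing, migration during insertion, and a victim store for unresolved collisions) can be sketched as follows. The slides do not give the placement policy, the migration depth, or the hash functions, so this sketch assumes first-fit placement, a single level of migration, a plain list standing in for the victim TCAM, and three toy hash functions on integer keys. The class and method names are hypothetical.

```python
class MultiHashTable:
    """d hash tables with `ways` entries per row, plus a victim list."""

    def __init__(self, hashes, rows, ways):
        self.hashes = hashes      # one function per table (h1, h2, h3, ...)
        self.ways = ways
        self.tables = [[[] for _ in range(rows)] for _ in hashes]
        self.victims = []         # stand-in for the victim TCAM

    def _place(self, key, skip=None):
        """First-fit: store key in the first candidate bucket with a free entry."""
        for t, h in enumerate(self.hashes):
            bucket = self.tables[t][h(key)]
            if t != skip and len(bucket) < self.ways:
                bucket.append(key)
                return True
        return False

    def insert(self, key):
        if self._place(key):
            return "stored"
        # All candidate buckets are full: try to move one resident item
        # to one of its own alternative buckets (one migration level only).
        for t, h in enumerate(self.hashes):
            bucket = self.tables[t][h(key)]
            for resident in list(bucket):
                if self._place(resident, skip=t):
                    bucket.remove(resident)
                    bucket.append(key)
                    return "migrated"
        self.victims.append(key)  # unresolved collision
        return "victim"

    def contains(self, key):
        in_tables = any(
            key in self.tables[t][h(key)] for t, h in enumerate(self.hashes)
        )
        return in_tables or key in self.victims


# Toy hash functions (assumptions); each must return a value in range(rows).
hashes = [lambda k: k % 2, lambda k: (k // 2) % 2, lambda k: (k // 4) % 2]
mht = MultiHashTable(hashes, rows=2, ways=1)
for key in (0, 2, 14, 6, 22, 30):
    print(key, mht.insert(key))
```

A lookup still probes one row per table plus the victim store, so migrations add cost only at insertion time.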

Page 18: An Efficient, Hardware-based Multi-Hash Scheme for     High Speed IP Lookup

Outline

- Introduction
- High Speed and Space Efficient Implementation
- Selecting hashing bits / Dealing with wildcard bits
- Experimental Evaluation
- Summary

Page 19: An Efficient, Hardware-based Multi-Hash Scheme for     High Speed IP Lookup

Selecting hashing bits from prefixes

10101110************************ / length = 8 bits
1010111000110111**************** / length = 16 bits
101011100011011110001110******** / length = 24 bits

- No prefix has length < 8 bits.
- Rightmost bits have higher entropy and are more suitable for hashing.
- Routing tables become larger when wildcard bits participate in hashing.

Page 20: An Efficient, Hardware-based Multi-Hash Scheme for     High Speed IP Lookup

Supporting wildcard bits in hashing

Current technique: Convert each prefix of length x to a set of new prefixes of length L = x + k, so that the wildcard bits are eliminated up to length L. Then hash the whole expanded set of prefixes. [Srinivasan et al.]

-> Each prefix expands the table by 2^k prefixes.

Example: 1010111000110111**************** / length = 16 bits expands to
10101110001101110000 /16
10101110001101110001 /16
…
10101110001101111110 /16
10101110001101111111 /16
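The expansion step can be illustrated with a short sketch. The function name is hypothetical, and prefixes are again written as bit strings with trailing `*` wildcards; for the /16 example on this slide, k = 4 gives the 2^4 = 16 expanded prefixes.

```python
from itertools import product


def expand_prefix(prefix: str, k: int) -> list:
    """Replace the first k wildcard bits of prefix with every 0/1 combination."""
    known = prefix.rstrip("*")
    wildcards = len(prefix) - len(known)
    if k > wildcards:
        raise ValueError("cannot expand past the end of the address")
    tail = "*" * (wildcards - k)
    return [known + "".join(bits) + tail for bits in product("01", repeat=k)]


expanded = expand_prefix("1010111000110111" + "*" * 16, k=4)
print(len(expanded))
print(expanded[0])
print(expanded[-1])
```

Each extra expanded bit doubles the number of keys to insert, which is the table growth the slide points out.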

Page 21: An Efficient, Hardware-based Multi-Hash Scheme for     High Speed IP Lookup

Control Wildcard Resolution (CWR)

Prefix expansion: 1010111000110111************** -> 16 keys to be inserted
10101110001101110000
10101110001101110001
…
10101110001101111110
10101110001101111111

CWR: 1010111000110111************** -> 4 keys to be inserted
0111011011100 (index)
0111011011101 (index)
0111011011110 (index)
0111011011111 (index)

CWR: Select bits from any carefully predefined positions.
-> Allows a sensitivity analysis that can find optimal configuration points for maximum space efficiency.
-> Faster insertion time per prefix.
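One way to read the 16-versus-4 comparison on this slide is that only the wildcard bits falling on the selected hashing positions need to be resolved. The sketch below follows that reading; it is an interpretation of the slide, not the authors' implementation. The bit positions are hypothetical (counted from the left, 0 = most significant) and chosen so that two wildcard positions are selected, which yields 2^2 = 4 keys.

```python
from itertools import product


def cwr_selected_bits(prefix: str, positions) -> list:
    """Resolve only the wildcards that fall on the selected bit positions."""
    picked = [prefix[p] for p in positions]
    wild = [i for i, bit in enumerate(picked) if bit == "*"]
    keys = []
    for combo in product("01", repeat=len(wild)):
        bits = picked[:]
        for i, value in zip(wild, combo):
            bits[i] = value
        keys.append("".join(bits))
    return keys


prefix = "1010111000110111" + "*" * 16       # the /16 prefix from the slide
positions = list(range(7, 18))               # hypothetical bit-select configuration
for key in cwr_selected_bits(prefix, positions):
    print(key)
```

Each resulting bit string would then be passed to the hash index generator. Selecting fewer wildcard positions lowers the number of insertions per prefix.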

Page 22: An Efficient, Hardware-based Multi-Hash Scheme for     High Speed IP Lookup

Outline

- Introduction
- High Speed and Space Efficient Implementation
- Selecting hashing bits / Dealing with wildcard bits
- Experimental Evaluation
- Summary

Page 23: An Efficient, Hardware-based Multi-Hash Scheme for     High Speed IP Lookup

Lookup Architecture

[Figure: incoming packet's IP address (32 bits) -> Bit-Select mechanism -> R+F bits (selected bits for index generation) -> R-bit hash index. The T+F bits (TAG) form the tag to match; matches feed the LPM logic.]

Page 24: An Efficient, Hardware-based Multi-Hash Scheme for     High Speed IP Lookup

Sensitivity Analysis: different bit-select configurations

1. Advantage over the standard MHT scheme.
2. Very small deviation of the points around the trend line -> a practical guarantee that the unresolved collisions will not be far from an estimated value.

Page 25: An Efficient, Hardware-based Multi-Hash Scheme for     High Speed IP Lookup

Comparison for h/w based schemes

                  | TCAM                               | IPStash                                            | New scheme
Description       | h/w CAM-based                      | h/w hash-based                                     | h/w hash-based
Throughput        | 1                                  | 1/3                                                | 1
Space efficiency  | Best                               | Very good (state of the art for hash-based)        | Good
Power consumption | CAM => high consumption per lookup | 2.2 mem accesses per lookup + many row comparators | 1 mem access per lookup + few row comparators

Page 26: An Efficient, Hardware-based Multi-Hash Scheme for     High Speed IP Lookup


Space Efficiency - Comparison

Load Factor = Routing table size / Available space capacity

Page 27: An Efficient, Hardware-based Multi-Hash Scheme for     High Speed IP Lookup

Power Consumption

Even with load factor = 0.5:
- 8x more power efficient than TCAM
- 2x more power efficient than IPStash

Page 28: An Efficient, Hardware-based Multi-Hash Scheme for     High Speed IP Lookup


Victim TCAM space requirements

The percentage of the ‘unresolved collisions’ is an accurate estimator of the victim space that is required for the corresponding load factor.

Page 29: An Efficient, Hardware-based Multi-Hash Scheme for     High Speed IP Lookup

Summary

IP lookup using TCAMs is expensive. Current hash-based approaches are promising but are either space inefficient or limited by low lookup throughput.

The proposed h/w-based multi-hash lookup scheme has:
1. High-speed lookup throughput.
   - Requires 1 memory access time per packet lookup.
2. Space efficiency.
   - Effective load factor of 70% with < 5% victim TCAM.
3. Low power consumption and cost.
   - 8x less power than dynamic TCAMs.
   - Best among hash-based schemes.
   - Simple and easy hardware implementation.
4. Scalability to future routing table sizes and the IPv6 transition.
   - All methods and techniques used scale well.

Page 30: An Efficient, Hardware-based Multi-Hash Scheme for     High Speed IP Lookup

Questions

Source code: www.cs.pitt.edu/~socrates/HBip