An Efficient, Hardware-based Multi-Hash Scheme for High Speed IP Lookup
1
Hot Interconnects 2008
Socrates Demetriades, Michel Hanna, Sangyeun Cho and Rami Melhem.
2
Background: IP Lookup in a Core Router
Incoming packet -> look up IP address (IP address -> next hop) -> outgoing link
Example: address 10101110 matches prefix 1010**** (Port 2) under longest prefix matching, so the packet leaves on Port 2.
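The lookup step on this slide can be sketched in a few lines of Python. This is only an illustration of longest prefix matching on 8-bit strings: the function names and the second table entry are assumptions made for the example, and only the `1010****` -> Port 2 entry comes from the slide.

```python
from typing import Dict, Optional


def matches(prefix: str, address: str) -> bool:
    """True if every non-wildcard bit of the prefix equals the address bit."""
    return len(prefix) == len(address) and all(
        p in ("*", a) for p, a in zip(prefix, address)
    )


def prefix_length(prefix: str) -> int:
    """Number of specified (non-wildcard) leading bits."""
    return len(prefix.rstrip("*"))


def longest_prefix_match(table: Dict[str, str], address: str) -> Optional[str]:
    """Return the next hop of the longest matching prefix, or None."""
    candidates = [p for p in table if matches(p, address)]
    if not candidates:
        return None
    return table[max(candidates, key=prefix_length)]


# Hypothetical forwarding table; only "1010****" -> "Port 2" is from the slide.
table = {"1010****": "Port 2", "0*******": "Port 1"}
print(longest_prefix_match(table, "10101110"))  # Port 2
```

A software scan like this visits every prefix; the schemes compared in the following slides exist precisely to avoid that cost in hardware.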
3
Motivation
Increasing Internet traffic
- High-speed links: optical technology -> link rates ~100 Gbps
- High-speed routers: TCAM-based forwarding engines
- Larger forwarding tables: TCAMs FAIL to scale.
4
IP Lookup Schemes
1. TCAM-based schemes. [idt, netlogic, micron, CoolCAM]
   1. Fast and constant lookup time
   2. High cost and power consumption
2. Trie-based schemes. [Eatherton04, Devroye03,…]
   1. Multi-cycle lookup latencies and low worst-case throughput.
   2. Performance and scalability are fundamentally tied to the IP address length.
3. Hash-based schemes. [Srinivasan98, Hasan06, Kaxiras05,…]
   1. Key-length-independent latencies
   2. Easy to implement in hardware
   3. Hashing collisions -> space inefficiency
   4. Hash keys (prefixes) include “don’t care” bits, which complicate hashing.
5
Overview
Problem: Hash-based schemes can be power and cost efficient but are still space inefficient or slow.
Goal: A hardware-based forwarding engine that has:
1. Constant and high-speed lookup throughput.
2. Space efficiency.
3. Good scaling with increasing forwarding table sizes.
4. Low cost and power consumption.
Proposal: A h/w-based multi-hash architecture that delivers high throughput (1 packet lookup per memory cycle) while remaining space and power efficient.
6
Outline
- Introduction
- High Speed and Space Efficient Implementation
- Selecting hashing bits / Dealing with wildcard bits
- Experimental Evaluation
- Summary
7
h/w Hash-based IP Lookup
[Figure: the key (IP address) feeds the hash index generator, which selects one of the 2^R rows of a C-way associative memory array (C entries per row). The C keys fetched (key1, key2, …, keyc) go to the matching processors (match1, match2, …, matchc), followed by the LPM logic.]
A much more power-efficient scheme than TCAM, with high throughput.
8
Hash-based IP Lookup example
[Figure: the key (IP address) 10101110 / 8 bits goes through the hash index generator and selects one of the 2^R rows of C entries. The keys fetched from that row are 1010****, 1111**** and 10100001 (key1 … keyj, compared as match1 … matchj). Only 1010**** matches, and its next hop is returned.]
9
Hash-based IP Lookup - LPM
[Figure: same structure as the previous example, with key (IP address) 10101110 / 8 bits. The keys fetched from the selected row are 1010****, 101011** and 1010111*. All three match; LPM (Longest Prefix Match) selects 1010111* and returns its next hop.]
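The two lookup examples above can be mimicked in software. The sketch below is a toy model, not the hardware design: the values of `R` and `C`, the use of the leading bits as the hash index, and the next-hop labels are all assumptions chosen so that the slide's prefixes land in the same row.

```python
R = 2  # assumed index width: 2**R = 4 rows
C = 4  # assumed associativity: up to C keys per row


def hash_index(bits: str) -> int:
    """Toy index generator: the R leading bits (never wildcards here)."""
    return int(bits[:R], 2)


def matches(prefix: str, address: str) -> bool:
    """A stored key matches if all of its specified bits agree."""
    return all(p in ("*", a) for p, a in zip(prefix, address))


def build(entries):
    """Place each (prefix, next_hop) pair in the row chosen by its index."""
    rows = [[] for _ in range(2 ** R)]
    for prefix, next_hop in entries:
        row = rows[hash_index(prefix)]
        if len(row) >= C:
            raise OverflowError("row is full: " + prefix)
        row.append((prefix, next_hop))
    return rows


def lookup(rows, address):
    row = rows[hash_index(address)]  # one memory access fetches up to C keys
    hits = [entry for entry in row if matches(entry[0], address)]
    # LPM logic: keep the hit with the most specified bits.
    return max(hits, key=lambda e: len(e[0].rstrip("*")), default=None)


rows = build([
    ("1010****", "hop A"),  # next-hop labels are hypothetical
    ("101011**", "hop B"),
    ("1010111*", "hop C"),
    ("1111****", "hop D"),
])
print(lookup(rows, "10101110"))  # ('1010111*', 'hop C')
```

In hardware the C comparisons happen in parallel in the matching processors; the list comprehension here only models their combined result.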
10
Hash index generation
Simple XOR-folding hash function
[Figure, labels in order: IP prefix or incoming IP address -> bit-select mechanism -> N selected bits, split into F bits and R = N – F bits -> skew -> XOR -> R-bit hash index (XOR hash function).]
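A minimal sketch of bit selection followed by XOR folding is shown below. The slide gives only the labels (N selected bits, F, R = N – F, skew, XOR), so the details here are assumptions: the low R bits are taken as the base index, the remaining F bits are XORed into it, and the skew is modelled as a left shift of the F-bit field. The function names and the example address are hypothetical.

```python
from typing import List


def select_bits(address: int, positions: List[int]) -> int:
    """Concatenate the address bits at the given positions.

    Position 0 is the most significant bit of a 32-bit address.
    """
    value = 0
    for pos in positions:
        value = (value << 1) | ((address >> (31 - pos)) & 1)
    return value


def xor_fold(selected: int, n: int, r: int, skew: int = 0) -> int:
    """Fold n selected bits into an r-bit index.

    Assumed layout: the low r bits form the base index and the
    remaining f = n - r bits are shifted left by `skew` and XORed in.
    """
    mask = (1 << r) - 1
    low_r = selected & mask
    high_f = selected >> r
    return low_r ^ ((high_f << skew) & mask)


address = 0xAE378E00                           # 10101110 00110111 10001110 00000000
bits = select_bits(address, list(range(8, 24)))  # N = 16 bits -> 0x378E
print(hex(xor_fold(bits, n=16, r=12)))          # 0x78d
print(hex(xor_fold(bits, n=16, r=12, skew=8)))  # 0x48e
```

Using a different skew per table is one cheap way to derive several hash functions from the same selected bits, which is how this sketch would feed a multi-hash table; whether the original design does exactly this is not stated on the slide.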
11
Inserting / Hashing IP prefixes
[Figure: bucket load versus bucket index for a single hash table, showing used memory space against total available memory space.]
Space utilization = 30%. Balanced is better.
12
How to improve the utilization of the hash table
- Powerful hash functions -> complexity -> delay on the critical lookup path.
- Adaptive perfect or semi-perfect hash functions -> the whole routing table must be rehashed periodically, a very time-consuming process.
- Multiple hash functions (MHT) -> increased space efficiency.
Our proposal: a multi-hashing scheme (MHT) in which items are allowed to migrate during the insertion operation.
13
IP prefix insertion (multi-hashing)
[Figure: the prefix is hashed with h1, h2 and h3, one per hash table; legend: used entry.]
14
Hashing IP prefixes: multi-hashing
[Figure: bucket load versus bucket index for single hashing (single hash table) and for multi-hashing with 3 hash tables.]
Space utilization: 30% with single hashing, 50% with multi-hashing with 3 hash tables.
15
Migrations are allowed during the insertion operation
Insertion time?
h1 h2 h3
16
Hashing prefixes: MHT + migrations
(a) Single hashing (single hash table): space utilization 30%
(b) Multi-hashing with 3 hash tables: space utilization 50%
(c) Multi-hashing with 3 hash tables + migrations: space utilization 70%
17
Crisis: Handling unresolved collisions
Victim TCAM
h1 h2 h3
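The insertion path built up over the last slides (three hash functions, migration during insertion, and a victim TCAM for unresolved collisions) can be sketched as follows. The slides do not spell out the migration policy, so this example assumes a simple one-step policy: if all three candidate buckets are full, try to move one resident to another of its own candidate buckets. The class name, the multiplicative hash constants and the plain list standing in for the victim TCAM are all assumptions.

```python
class MultiHashTable:
    # Arbitrary odd constants used as three toy hash functions (assumed).
    MULTIPLIERS = (0x9E3779B1, 0x85EBCA6B, 0xC2B2AE35)

    def __init__(self, rows: int, capacity: int):
        self.rows = rows
        self.capacity = capacity
        self.tables = [[[] for _ in range(rows)] for _ in self.MULTIPLIERS]
        self.victim = []  # stands in for the victim TCAM

    def _bucket(self, t: int, key: int) -> list:
        """Bucket of table t that hash function t assigns to the key."""
        mixed = (key * self.MULTIPLIERS[t]) & 0xFFFFFFFF
        return self.tables[t][(mixed >> 16) % self.rows]

    def _free_bucket(self, key: int):
        """Least-loaded candidate bucket with a free entry, or None."""
        options = [self._bucket(t, key) for t in range(len(self.tables))]
        options = [b for b in options if len(b) < self.capacity]
        return min(options, key=len, default=None)

    def _migrate(self, key: int):
        """Move one resident elsewhere to free an entry for the key."""
        for t in range(len(self.tables)):
            bucket = self._bucket(t, key)
            for resident in bucket:
                other = self._free_bucket(resident)
                if other is not None:
                    bucket.remove(resident)
                    other.append(resident)
                    return bucket  # now has one free entry
        return None

    def insert(self, key: int) -> None:
        target = self._free_bucket(key)
        if target is None:
            target = self._migrate(key)
        # Unresolved collision: fall back to the victim store.
        (self.victim if target is None else target).append(key)

    def contains(self, key: int) -> bool:
        return key in self.victim or any(
            key in self._bucket(t, key) for t in range(len(self.tables))
        )


table = MultiHashTable(rows=8, capacity=2)
keys = [(i * 2654435761) % 2 ** 32 for i in range(1, 41)]
for k in keys:
    table.insert(k)
print(all(table.contains(k) for k in keys), len(table.victim))
```

Choosing the least-loaded candidate bucket keeps the tables balanced, and migration only runs on the insertion path, so a lookup still probes one bucket per table plus the victim store.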
18
Outline
- Introduction
- High Speed and Space Efficient Implementation
- Selecting hashing bits / Dealing with wildcard bits
- Experimental Evaluation
- Summary
19
Selecting hashing bits from prefixes
10101110************************ / length = 8 bits
1010111000110111**************** / length = 16 bits
101011100011011110001110******** / length = 24 bits
- No prefix has length < 8 bits
- Rightmost bits have higher entropy and are more suitable for hashing.
- Routing tables become larger when wildcard bits participate in hashing.
20
Supporting wildcard bits in hashing
Current technique: Convert each prefix of length x to a set of new prefixes of length L = x + k, so that the wildcard bits are eliminated up to length L. Then hash the whole expanded set of prefixes. [Srinivasan et al.]
-> Each prefix expands the table by 2^k prefixes.
10101110001101110000 /16
10101110001101110001 /16
…
10101110001101111110 /16
10101110001101111111 /16
1010111000110111**************** / length = 16 bits
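The expansion step described above is easy to reproduce. The sketch below is an illustration only: `expand_prefix` is a hypothetical helper name, and prefixes are modelled as 32-character strings of `0`, `1` and `*`.

```python
from itertools import product
from typing import List


def expand_prefix(prefix: str, k: int) -> List[str]:
    """Replace the first k wildcard bits with every 0/1 combination."""
    specified = prefix.rstrip("*")
    wildcards = len(prefix) - len(specified)
    if k > wildcards:
        raise ValueError("prefix has fewer than k wildcard bits")
    tail = "*" * (wildcards - k)
    return [specified + "".join(bits) + tail for bits in product("01", repeat=k)]


prefix = "1010111000110111" + "*" * 16  # the /16 prefix from the slide
expanded = expand_prefix(prefix, k=4)
print(len(expanded))     # 16, i.e. 2**4
print(expanded[0][:20])  # 10101110001101110000
print(expanded[-1][:20]) # 10101110001101111111
```

Each extra bit of expansion doubles the number of stored entries, which is the table growth the slide points out.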
21
Control Wildcard Resolution (CWR)
Prefix expansion of 1010111000110111****************:
10101110001101110000
10101110001101110001
…
10101110001101111110
10101110001101111111
16 keys to be inserted
With CWR, for the same prefix 1010111000110111****************:
0111011011100 (index)
0111011011101 (index)
0111011011110 (index)
0111011011111 (index)
4 keys to be inserted
CWR: Select bits from any carefully predefined positions
-> Allows sensitivity analysis that can find optimal configuration points for maximum space efficiency.
-> Faster insertion time per prefix
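The effect of choosing the selected bit positions can be illustrated with a small sketch. It assumes that only the wildcard bits falling on selected positions need to be resolved; `cwr_keys` and the position ranges are hypothetical, and the strings produced are the selected bits before hashing, so they are not expected to equal the index values shown on the slide.

```python
from itertools import product
from typing import List


def cwr_keys(prefix: str, positions: List[int]) -> List[str]:
    """All values the selected positions can take for this prefix.

    Only wildcard bits that fall on a selected position are enumerated.
    """
    selected = [prefix[p] for p in positions]
    free = [i for i, bit in enumerate(selected) if bit == "*"]
    keys = []
    for bits in product("01", repeat=len(free)):
        key = selected[:]
        for i, bit in zip(free, bits):
            key[i] = bit
        keys.append("".join(key))
    return keys


prefix = "1010111000110111" + "*" * 16  # the /16 prefix from the slide

# Hypothetical selection of 13 positions, two of which are wildcards here.
keys = cwr_keys(prefix, list(range(5, 18)))
print(len(keys))  # 4
print(keys[0])    # 1100011011100

# A selection that avoids the wildcard region needs no expansion at all.
print(len(cwr_keys(prefix, list(range(4, 16)))))  # 1
```

Because the number of inserted keys depends on how many selected positions land on wildcards, different position sets can be compared directly, which is what makes a sensitivity analysis over bit-select configurations possible.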
22
Outline
- Introduction
- High Speed and Space Efficient Implementation
- Selecting hashing bits / Dealing with wildcard bits
- Experimental Evaluation
- Summary
23
Lookup Architecture
[Figure: the incoming packet's IP address (32 bits) enters the bit-select mechanism. R+F bits are selected for index generation and yield the R-bit hash index; T+F bits form the tag to match against the fetched entries; the match results feed the LPM stage.]
24
Sensitivity Analysis
Different bit-select configurations
1. Advantage over the standard MHT scheme.
2. Very small deviation of the points around the trend line. -> A practical guarantee that the number of unresolved collisions will not be far from an estimated value.
25
Comparison for h/w based schemes
                  | TCAM | IPStash | New scheme
Description       | h/w CAM-based | h/w hash-based | h/w hash-based
Throughput        | 1 | 1/3 | 1
Space efficiency  | Best | Very good (state of the art for hash-based) | Good
Power consumption | CAM => high consumption per lookup | 2.2 mem accesses per lookup + many row comparators | 1 mem access per lookup + few row comparators
26
Space Efficiency - Comparison
Load Factor = Routing table size / Available space capacity
27
Power Consumption
Even with load factor = 0.5:
- 8x more power efficient than TCAM
- 2x more power efficient than IPStash
28
Victim TCAM space requirements
The percentage of the ‘unresolved collisions’ is an accurate estimator of the victim space that is required for the corresponding load factor.
29
Summary
IP lookup using TCAMs is expensive. Current hash-based approaches are promising but are either space inefficient or limited by low lookup throughput.
The proposed h/w-based multi-hash lookup scheme has:
1. High-speed lookup throughput.
   Requires 1 mem access time per packet lookup.
2. Space efficiency.
   Effective load factor of 70% with < 5% victim TCAM.
3. Low power consumption and cost.
   8x less power than dynamic TCAMs. Best among hash-based schemes. Simple and easy hardware implementation.
4. Scalability to future routing table sizes and the IPv6 transition.
   All methods and techniques used scale well.