Efficient Memory Utilization on Network Processors for Deep Packet Inspection
description
Transcript of Efficient Memory Utilization on Network Processors for Deep Packet Inspection
Efficient Memory Utilization on Network Processors
for Deep Packet Inspection
Piti Piyachon
Yan Luo
Electrical and Computer Engineering Department
University of Massachusetts Lowell
ANCS 2006 U Mass Lowell
Our Contributions• Study parallelism of a pattern matching
algorithm• Propose Bit-Byte Aho-Corasick
Deterministic Finite Automata• Construct memory model to find optimal
settings to minimize the memory usage of DFA
ANCS 2006 U Mass Lowell
DPI and Pattern Matching• Deep Packet Inspection
– Inspect: packet header & payload
– Detect: computer viruses, worms, spam, etc.
– Network intrusion detection application: Bro, Snort, etc.
• Pattern Matching requirements1. Matching predefined multiple patterns (keywords, or strings) at
the same time
2. Keywords can be any size.
3. Keywords can be anywhere in the payload of a packet.
4. Matching at line speed
5. Flexibility to accommodate new rule sets
ANCS 2006 U Mass Lowell
Classical Aho-Corasick (AC) DFA: example 1
• A set of keywords– {he, her, him, his}
accept state
0
4
3
h
e
2
r
1
5
sm
i
6
he
herhim his
hhh
h
h
h
start state
accept state accept stateaccept state
Failure edges back to state 1 are shown as dash line.
Failure edges back to state 0 are not shown.
ANCS 2006 U Mass Lowell
Memory Matrix Model of AC DFA
• Snort (Dec’05): 2733 keywords
• 256 next state pointers– width = 15 bits
• > 27,000 states
• keyword-ID width = 2733 bits
• 27538 x (2733 + 256 x 15) = 22 MB
22 MB is too big for on-chip RAM
256 Next State Pointers
255 254 3 2 1 0Keyword-ID
<15> <15> <15> <15> <15> <15><2733>
0
1
2
3
27538
state#
....
....
....
....
ANCS 2006 U Mass Lowell
0
3
0
0
2
1
1
he
her himhis
k0: he : 0 , 0k1: her : 0 , 0 , 1k2: h im : 0 , 0, 0k3: h is : 0 , 0, 1
4
0
Failure edges are not shown.
Bit-1 DFA
0
3
1
0
2
0
1
he
her
him his
k0: he : 1 , 0k1: her : 1 , 0 , 0k2: h im: 1, 1, 1k3: h is : 1 , 1 , 0
5
1
Bit-3 DFA
4
1
6
0
Failure edges are not shown.
Bit-AC DFA (Tan-Sherwood’s Bit-Split)k0 h e 0110 1000 0110 0101k1 h e r 0110 1000 0110 0101 0111 0010k2 h i m 0110 1000 0110 1001 0110 1101k3 h i s 0110 1000 0110 1001 0111 0011
Need 8 bit-DFA
ANCS 2006 U Mass Lowell
Memory Matrix of Bit-AC DFA
• Snort (Dec’05): 2733 keywords
• 2 next state pointers– width = 9 bits
• 361 states
• keyword-ID width = 16 bits
• 1368 DFA• 1368 x 361 x (16 + 2 x 9) = 2 MB
2 Next State Pointers
1 0Keyword-ID
<9> <9><16>
0
1
2
3
361
state#
....
....
....
....
ANCS 2006 U Mass Lowell
Bit-AC DFA Techniques• Shrinking the width of keyword-ID
– From 2733 to 16 bits– By dividing 2733 keywords into 171 subsets
• Each subset has 16 keywords
• Reducing next state pointers – From 256 to 2 pointers– By dividing each input byte into 1 bits– Need 8 bit-DFA
• Extra benefits – The number of states (per DFA) reduces from ~27,000 to ~300 states.– The width of next state pointer reduces from 15 to 9 bits.
• Memory– Reduced from 22 MB to 2 MB
• The number of DFA = ?– With 171 subsets, each subset has 8 DFA. – Total DFA = 171 x 8 = 1,368 DFA
What can we do better to reduce the memory usage?
2 Next State Pointers
1 0Keyword-ID
<9> <9><16>
0
1
2
3
512
state#
....
....
....
....
ANCS 2006 U Mass Lowell
Classical AC DFA: example 2
Failure edges are not shown.
0
e
1
l
2
e
3
m
4
5
e
6
n
7
t
s
8
p
9
a
10
r
11
a
12
13
l
14
l
15
e
l
16
m
17
a
18
n
19
a
20
21
g
e
22
e
23
m
24
o
25
r
26
y
27
k0 k1k2 k3
M atch Found atk0: e lements state 8k1: paralle l state 16k2: manage state 22k3: memory state 27
28 states
Byte-AC DFAbyte0 byte1 byte2 byte3 byte0 byte1 byte2 byte3
k0 elements e l e m e n t sk1 parallel p a r a l l e lk2 manage m a n a g ek3 memory m e m o r y
M atch Found atk0: e lements state 2k1: para lle l state 4k2: m anage state 6k3: m emory state 8
0
3
2
1
4
5
6
8
e
l g
r
e
p m
k0
k1k2
k3
0
3
2
1
M atch Found atk0: e lemen ts state 2k1: pa ra llel state 4k2: manage state 6k3: memory state 8
4 6
7
8
n
l e
y
l
a
e
k0
k1k2
k3
0
3
2
1
M atch Found atk0: e lemen ts state 2k1: paralle l state 4k2: manage state 6k3: mem ory state 8
4
5
6
7
8
t
e (any)
(any)
e
r n
m
k0
k1k2
k3
0
3
2
1
M atch Found atk0: e lem ents state 2k1: para lle l state 4k2: manage state 4k3: memo ry state 8
4
7
8
s
l
(any)
m
a
o
k0
k1k2
k3
Byte 0 Byte 1
Byte 2 Byte 3
• Considering 4 bytes at a time
• 4 DFA
• < 9 states / DFA
• 256 next state pointers!
Similar to Dharmapurikar-Lockwood’s JACK DFA, ANCS’05
ANCS 2006 U Mass Lowell
Bit-Byte-AC DFAbyte0 byte1 byte2 byte3 byte0 byte1 byte2 byte3
k0 e l e m e n t s 0110 0101 0110 1100 0110 0101 0110 1101 0110 0101 0110 1110 0111 0100 0111 0011k1 p a r a l l e l 0111 0000 0110 0001 0111 0010 0110 0001 0110 1100 0110 1100 0110 0101 0110 1100k2 m a n a g e 0110 1101 0110 0001 0110 1110 0110 0001 0110 0111 0110 0101 xxxx xxxx xxxx xxxxk3 m e m o r y 0110 1101 0110 0101 0110 1101 0110 1111 0111 0010 0111 1001 xxxx xxxx xxxx xxxx
k0: e lements 0 ,0k1: para lle l 0 ,1k2: m anage 1 ,0k3: m emory 1 ,0
0
2
1
3
4
5
0 1 0
01
k0 k1 k2
bit 3, Byte 0
k3
0
1
3
1
k2
bit 6, Byte 2
k3
k0: e lemen ts 1 ,1k1: pa ralle l 1 ,1k2: manage 1 ,xk3: mem ory 1 ,x
0
2
1
k0
k1
k2
k3
• 4 bytes at a time
• Each byte divides into bits.
• 32 DFA (= 4 x 8)
• < 6 states/DFA
• 2 next state pointers
ANCS 2006 U Mass Lowell
Memory Matrix of Bit-Byte-AC DFA
• Snort (Dec’05): 2733 keywords • 4 bytes at a time• < 36 states/DFA• 2 next state pointers
– width = 6 bits• keyword-ID width = 3 bits• 29152 DFA (= 911 x 32)• 29152 x 36 x (3 + 2 x 6) = 1.9 MB
2 Next State Pointers
1 0Keyword-ID
<6> <6><3>
0
1
2
3
36
state#
....
....
....
....
• 1.9 MB is a little better than 2 MB.
• This is because
• It is not any optimal setting.
• Each DFA has different number of states.
• Don’t need to provide same size of memory matrix for every DFA.
ANCS 2006 U Mass Lowell
Bit-Byte-AC DFA Techniques• Still keeping the width of keyword-ID as low as Bit-DFA.
• Still keeping next state pointers as small as Bit-DFA.
• Reducing states per DFA by
– Skipping bytes
– Exploiting more shared states than Bit-DFA
• Results of reducing states per DFA
– from ~27,000 to 36 states
– The width of next state pointer reduces from 15 to 6 bits.
ANCS 2006 U Mass Lowell
Construction of Bit-Byte AC DFAbyte0 byte1 byte2 byte3 byte0 byte1 byte2 byte3
k0 e l e m e n t s 0110 0101 0110 1100 0110 0101 0110 1101 0110 0101 0110 1110 0111 0100 0111 0011k1k2k3
bit 3 of byte 0
4 bytes (considered) at a time
ANCS 2006 U Mass Lowell
Construction of Bit-Byte AC DFA
k0: e lements 0k1:k2:k3:
0
1
0
bit 3, Byte 0
byte0 byte1 byte2 byte3 byte0 byte1 byte2 byte3k0 e l e m e n t s 0110 0101 0110 1100 0110 0101 0110 1101 0110 0101 0110 1110 0111 0100 0111 0011k1k2k3
4 bytes (considered) at a time
ANCS 2006 U Mass Lowell
Construction of Bit-Byte AC DFA
k0: e lements 0 ,0k1:k2:k3:
0
2
1
0
0
k0
bit 3, Byte 0
byte0 byte1 byte2 byte3 byte0 byte1 byte2 byte3k0 e l e m e n t s 0110 0101 0110 1100 0110 0101 0110 1101 0110 0101 0110 1110 0111 0100 0111 0011k1k2k3
4 bytes (considered) at a time
ANCS 2006 U Mass Lowell
Construction of Bit-Byte AC DFA
k0: e lements 0 ,0k1: para lle l 0k2:k3:
0
2
1
0
0
k0
bit 3, Byte 0
byte0 byte1 byte2 byte3 byte0 byte1 byte2 byte3k0 e l e m e n t s 0110 0101 0110 1100 0110 0101 0110 1101 0110 0101 0110 1110 0111 0100 0111 0011k1 p a r a l l e l 0111 0000 0110 0001 0111 0010 0110 0001 0110 1100 0110 1100 0110 0101 0110 1100k2k3
4 bytes (considered) at a time
ANCS 2006 U Mass Lowell
Construction of Bit-Byte AC DFA
k0: e lements 0 ,0k1: para lle l 0 ,1k2:k3:
0
2
1
3
0 1
0
k0 k1
bit 3, Byte 0
byte0 byte1 byte2 byte3 byte0 byte1 byte2 byte3k0 e l e m e n t s 0110 0101 0110 1100 0110 0101 0110 1101 0110 0101 0110 1110 0111 0100 0111 0011k1 p a r a l l e l 0111 0000 0110 0001 0111 0010 0110 0001 0110 1100 0110 1100 0110 0101 0110 1100k2k3
4 bytes (considered) at a time
ANCS 2006 U Mass Lowell
Construction of Bit-Byte AC DFA
k0: e lements 0 ,0k1: para lle l 0 ,1k2: m anage 1k3:
0
2
1
3
4
0 1
01
k0 k1
bit 3, Byte 0
byte0 byte1 byte2 byte3 byte0 byte1 byte2 byte3k0 e l e m e n t s 0110 0101 0110 1100 0110 0101 0110 1101 0110 0101 0110 1110 0111 0100 0111 0011k1 p a r a l l e l 0111 0000 0110 0001 0111 0010 0110 0001 0110 1100 0110 1100 0110 0101 0110 1100k2 m a n a g e 0110 1101 0110 0001 0110 1110 0110 0001 0110 0111 0110 0101 xxxx xxxx xxxx xxxxk3
4 bytes (considered) at a time
ANCS 2006 U Mass Lowell
Construction of Bit-Byte AC DFA
k0: e lements 0 ,0k1: para lle l 0 ,1k2: m anage 1 ,0k3:
0
2
1
3
4
5
0 1 0
01
k0 k1 k2
bit 3, Byte 0
byte0 byte1 byte2 byte3 byte0 byte1 byte2 byte3k0 e l e m e n t s 0110 0101 0110 1100 0110 0101 0110 1101 0110 0101 0110 1110 0111 0100 0111 0011k1 p a r a l l e l 0111 0000 0110 0001 0111 0010 0110 0001 0110 1100 0110 1100 0110 0101 0110 1100k2 m a n a g e 0110 1101 0110 0001 0110 1110 0110 0001 0110 0111 0110 0101 xxxx xxxx xxxx xxxxk3
4 bytes (considered) at a time
ANCS 2006 U Mass Lowell
Construction of Bit-Byte AC DFA
k0: e lements 0 ,0k1: para lle l 0 ,1k2: m anage 1 ,0k3: m emory 1
0
2
1
3
4
5
0 1 0
01
k0 k1 k2
bit 3, Byte 0
byte0 byte1 byte2 byte3 byte0 byte1 byte2 byte3k0 e l e m e n t s 0110 0101 0110 1100 0110 0101 0110 1101 0110 0101 0110 1110 0111 0100 0111 0011k1 p a r a l l e l 0111 0000 0110 0001 0111 0010 0110 0001 0110 1100 0110 1100 0110 0101 0110 1100k2 m a n a g e 0110 1101 0110 0001 0110 1110 0110 0001 0110 0111 0110 0101 xxxx xxxx xxxx xxxxk3 m e m o r y 0110 1101 0110 0101 0110 1101 0110 1111 0111 0010 0111 1001 xxxx xxxx xxxx xxxx
4 bytes (considered) at a time
ANCS 2006 U Mass Lowell
Construction of Bit-Byte AC DFA
k0: e lements 0 ,0k1: para lle l 0 ,1k2: m anage 1 ,0k3: m emory 1 ,0
0
2
1
3
4
5
0 1 0
01
k0 k1 k2
bit 3, Byte 0
k3
byte0 byte1 byte2 byte3 byte0 byte1 byte2 byte3k0 e l e m e n t s 0110 0101 0110 1100 0110 0101 0110 1101 0110 0101 0110 1110 0111 0100 0111 0011k1 p a r a l l e l 0111 0000 0110 0001 0111 0010 0110 0001 0110 1100 0110 1100 0110 0101 0110 1100k2 m a n a g e 0110 1101 0110 0001 0110 1110 0110 0001 0110 0111 0110 0101 xxxx xxxx xxxx xxxxk3 m e m o r y 0110 1101 0110 0101 0110 1101 0110 1111 0111 0010 0111 1001 xxxx xxxx xxxx xxxx
4 bytes (considered) at a time
ANCS 2006 U Mass Lowell
Construction of Bit-Byte AC DFA
k0: e lements 0 ,0k1: para lle l 0 ,1k2: m anage 1 ,0k3: m emory 1 ,0
0
2
1
3
4
5
0 1 0
01
k0 k1 k2
bit 3, Byte 0
k3
byte0 byte1 byte2 byte3 byte0 byte1 byte2 byte3k0 e l e m e n t s 0110 0101 0110 1100 0110 0101 0110 1101 0110 0101 0110 1110 0111 0100 0111 0011k1 p a r a l l e l 0111 0000 0110 0001 0111 0010 0110 0001 0110 1100 0110 1100 0110 0101 0110 1100k2 m a n a g e 0110 1101 0110 0001 0110 1110 0110 0001 0110 0111 0110 0101 xxxx xxxx xxxx xxxxk3 m e m o r y 0110 1101 0110 0101 0110 1101 0110 1111 0111 0010 0111 1001 xxxx xxxx xxxx xxxx
Failure edges are not shown.
ANCS 2006 U Mass Lowell
Construction of Bit-Byte AC DFAbyte0 byte1 byte2 byte3 byte0 byte1 byte2 byte3
k0 e l e m e n t s 0110 0101 0110 1100 0110 0101 0110 1101 0110 0101 0110 1110 0111 0100 0111 0011k1 p a r a l l e l 0111 0000 0110 0001 0111 0010 0110 0001 0110 1100 0110 1100 0110 0101 0110 1100k2 m a n a g e 0110 1101 0110 0001 0110 1110 0110 0001 0110 0111 0110 0101 xxxx xxxx xxxx xxxxk3 m e m o r y 0110 1101 0110 0101 0110 1101 0110 1111 0111 0010 0111 1001 xxxx xxxx xxxx xxxx
0
1
3
1
k2
bit 6, Byte 2
k3
k0: e lemen ts 1 ,1k1: paralle l 1 ,1k2: manage 1 ,xk3: mem ory 1 ,x
0
2
1
k0
k1
k2
k3
ANCS 2006 U Mass Lowell
Construction of Bit-Byte AC DFA
k0: e lements 0 ,0k1: para lle l 0 ,1k2: m anage 1 ,0k3: m emory 1 ,0
0
2
1
3
4
5
0 1 0
01
k0 k1 k2
bit 3, Byte 0
k3
0
1
3
1
k2
bit 6, Byte 2
k3
k0: e lemen ts 1 ,1k1: pa ralle l 1 ,1k2: manage 1 ,xk3: mem ory 1 ,x
0
2
1
k0
k1
k2
k3
byte0 byte1 byte2 byte3 byte0 byte1 byte2 byte3k0 e l e m e n t s 0110 0101 0110 1100 0110 0101 0110 1101 0110 0101 0110 1110 0111 0100 0111 0011k1 p a r a l l e l 0111 0000 0110 0001 0111 0010 0110 0001 0110 1100 0110 1100 0110 0101 0110 1100k2 m a n a g e 0110 1101 0110 0001 0110 1110 0110 0001 0110 0111 0110 0101 xxxx xxxx xxxx xxxxk3 m e m o r y 0110 1101 0110 0101 0110 1101 0110 1111 0111 0010 0111 1001 xxxx xxxx xxxx xxxx
32 bit-byte DFA need to be constructed.
ANCS 2006 U Mass Lowell
0
2
1
3
4
5
0 1 0
01
k0 k1 k2
bit 3, Byte 0
k3
0
1
3
1
k2
bit 6, Byte 2
k3
0
2
1
k0
k1
k2
k3
Bit-Byte-DFA: Searchingbyte0 byte1 byte2 byte3 byte0 byte1 byte2 byte3 byte0 byte1 byte2 byte3
input
ANCS 2006 U Mass Lowell
0
2
1
3
4
5
0 1 0
01
k0 k1 k2
bit 3, Byte 0
k3
0
1
3
1
k2
bit 6, Byte 2
k3
0
2
1
k0
k1
k2
k3
byte0 byte1 byte2 byte3 byte0 byte1 byte2 byte3 byte0 byte1 byte2 byte3a b 1 2
0110 0001 0110 0010 0011 0001 0011 0010input
A failure edge is shown as necessary.
0
Bit-Byte-DFA: Searching
ANCS 2006 U Mass Lowell
0
2
1
3
4
5
0 1 0
01
k0 k1 k2
bit 3, Byte 0
k3
0
1
3
1
k2
bit 6, Byte 2
k3
0
2
1
k0
k1
k2
k3
byte0 byte1 byte2 byte3 byte0 byte1 byte2 byte3 byte0 byte1 byte2 byte3a b 1 2 m e m o
0110 0001 0110 0010 0011 0001 0011 0010 0110 1101 0110 0101 0110 1101 0110 1111input
Bit-Byte-DFA: Searching
ANCS 2006 U Mass Lowell
0
2
1
3
4
5
0 1 0
01
k0 k1 k2
bit 3, Byte 0
k3
0
1
3
1
k2
bit 6, Byte 2
k3
0
2
1
k0
k1
k2
k3
A failure edge is shown as necessary.
byte0 byte1 byte2 byte3 byte0 byte1 byte2 byte3 byte0 byte1 byte2 byte3a b 1 2 m e m o r y e f
0110 0001 0110 0010 0011 0001 0011 0010 0110 1101 0110 0101 0110 1101 0110 1111 0111 0010 0111 1001 0110 0101 0110 0110input
0
Bit-Byte-DFA: Searching
ANCS 2006 U Mass Lowell
0
2
1
3
4
5
0 1 0
01
k0 k1 k2
bit 3, Byte 0
k3
0
1
3
1
k2
bit 6, Byte 2
k3
0
2
1
k0
k1
k2
k3
Match=> (keyword) ‘memory’Only all 32 bit-DFA find the match in their
own!
byte0 byte1 byte2 byte3 byte0 byte1 byte2 byte3 byte0 byte1 byte2 byte3a b 1 2 m e m o r y e f
0110 0001 0110 0010 0011 0001 0011 0010 0110 1101 0110 0101 0110 1101 0110 1111 0111 0010 0111 1001 0110 0101 0110 0110input
Bit-Byte-DFA: Searching
ANCS 2006 U Mass Lowell
Find the optimal settings to minimize memory• When k = keywords per subset
– The width of keyword-ID = k bits– k = 1, 2, 3, … , K – when K = the number of keywords in the whole set.
• Snort (Dec.2005) : K = 2733 keywords
• b = bit(s) extracted for each byte– b = 1, 2, 4, 8
– # of next state pointers = 2b
– The example 2: b = 1– Beyond b > 8
• > 256 next state pointers
• B = Bytes considered at a time – B = 1, 2, 3, … – The example 2: B = 4
• Total Memory (T) is a function of k, b, and B.– T = f(k, b, B)
2 Next State Pointers
1 0Keyword-ID
<6> <6><3>
0
1
2
3
36
state#
....
....
....
....
ANCS 2006 U Mass Lowell
T’s Formula
Total memory of all bit-ACs in all subset
kK
j
bB
iijij ekBbkT
1
8
1
),,(
k
KNsubsetwhen ,
DefinitionK The number of all Keywords in the whole rule setk The number of keyword in each subset 1, 2, 3, …, Kb GroupedBit: The number of bits grouped to divide 8 bits (of a byte) 1, 2, 4, 8B The number of bytes considered at a time 1, 2, 3, …
The number of next state pointers 2, 4, 16, 256
The number of states in i th bit-level AC in j th subset
The number of bits used to encode states in i th bit-level AC in j th subset
ije
ij
iie 2log, b2andb
B
N
N
subset
bitDFA 8
2 Next State Pointers
1 0Keyword-ID
<6> <6><3>
0
1
2
3
36
....
....
....
....
k e e
ANCS 2006 U Mass Lowell
250
270
290
310
330
350
370
390
410
1 11 21 31 41 51 61 71 81 91
T (KB)
Bit-Byte-AC DFA: B=16, b=2
k (keywords-per-subset)
Find the optimal k• Each pair of (b, B) has one optimal k for a minimal T.
T_min at k=12
ANCS 2006 U Mass Lowell
Find the optimal b40
6.2
355.
7
314.
9
295.
2
295.
2 341.
6
395.
8
345.
4
307.
4
289.
2
271.
9
273.
4
773.5
672.9
598.0
562.8
502.6
424.0
200
300
400
500
600
700
800
B=1 B=2 B=4 B=8 B=16 B=32
b=1b=2b=4
T (KB)
k =3 3 3
k =3 3 3
k =4 4 4
k =6 5 4
k =9 12 18
k =18 33 34
• Each setting of k, b, and B has different optimal point.– Choosing only the optimal setting to
compare.
• b = 2 is the best.
ANCS 2006 U Mass Lowell
Find the optimal B
100.00
87.26
82.38
77.65 77.51
74.5375.56
73.0672.17
68.69 69.06
65
70
75
80
85
90
95
100
1 2 3 4 5 6 7 8 9 16 32
T (normalized to the base case B =1)
B
k=3 k=3 k=3 k=4 k=4 k=4 k=4 k=5 k=5 k=12 k=32
395 KB
• b = 2
• T reduces while B increases.– Non-linearly
• B > 16, – T begins to increase.
• B = 16 is the best for Snort (Dec’05).
ANCS 2006 U Mass Lowell
39
5.8
4
34
5.4
1
32
6.1
0
30
7.3
9
30
6.8
3
29
5.0
2
29
9.1
1
28
9.1
9
28
5.6
7
27
1.9
1
2001.57
6064.27
0
1000
2000
3000
4000
5000
6000
7000
1 2 3 4 5 6 7 8 9 16
T (KB)
B
Brodie's
Tan's
Comparing with Existing Works
• Tan-Sherwood’s, Brodie-Cytron-Taylor’s, and Ours
• Our Bit-Byte DFA when B=16– The optimal point at b=2 and k=12
– 272 KB
– 14 % of 2001 KB (Tan’s)
– 4 % of 6064 KB (Brodie’s)
ANCS 2006 U Mass Lowell
Comparing with Existing Works
• Tan-Sherwood’s and Ours: At B = 13
95
.84
34
5.4
1
32
6.1
0
30
7.3
9
30
6.8
3
29
5.0
2
29
9.1
1
28
9.1
9
28
5.6
7
27
1.9
1
2001.57
6064.27
0
1000
2000
3000
4000
5000
6000
7000
1 2 3 4 5 6 7 8 9 16
T (KB)
B
Brodie's
Tan's
• (Tan’s on ASIC)
– 2001 KB
– k = 16 is not the optimal setting for B=1.
– Each bit-DFA uses same storage’s capacity, which fits the largest one (worst case).
• (Ours on NP)
– 396 KB < 2001 KB
– k = 3 is the optimal setting for B=1.
– Each bit-DFA uses exactly memory space to hold it.
ANCS 2006 U Mass Lowell
Results with an NP Simulator
• NePSim2– An open source IXP24xx/28xx simulator
• NP Architecture based on IXP2855– 16 MicroEngines (MEs)– 512 KB– 1.4 GHz
• Bit-Byte AC DFA: b=2, B=16, k=12– T = 272 KB– 5 Gbps
ANCS 2006 U Mass Lowell
Conclusion
• Bit-Byte DFA model can reduce memory usage up to 86%.
• Implementing on NP uses on-chip memory more efficiently without wasting space, comparing to ASIC.
• NP has flexibility to accommodate
• The optimal setting of k, b, and B.
• Different sizes of Bit-Byte DFA.
• New rule sets in the future.
• The optimal setting may change.
• The performance (using a NP simulator) satisfies line speed up to 5 Gbps throughput.