Shift-based Pattern Matching for Compressed Web Traffic Author: Anat Bremler-Barr, Yaron Koral, Victor Zigdon Publisher: IEEE HPSR, 2011 Presenter: Kai-Yang, Liu Date: 2011/11/2


Shift-based Pattern Matching for Compressed Web Traffic

Author:

Anat Bremler-Barr, Yaron Koral, Victor Zigdon

Publisher: IEEE HPSR, 2011

Presenter: Kai-Yang, Liu

Date: 2011/11/2


INTRODUCTION

• Two-thirds of the top 1000 most popular sites, including Yahoo!, Google, MSN, YouTube, and Facebook, use HTTP compression to speed up their content downloads.


The GZIP Algorithm

• LZ77 compression: a series of bytes (characters) can be compressed if the same series has already appeared earlier in the data. The algorithm replaces each repeated string with a (distance, length) pair.

For example, the text ‘abcdefgabcde’ can be compressed to ‘abcdefg(7,5)’. LZ77 refers to such a pair as a “pointer” and to uncompressed bytes as “literals”.

• Huffman coding: reduces the symbol coding size by encoding frequent symbols with fewer bits.
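The pointer mechanism can be illustrated with a minimal decoding sketch. It assumes a hypothetical token format that is not part of GZIP itself: a single-character string is a literal and a `(distance, length)` tuple is a pointer. Huffman decoding is left out.

```python
def lz77_decode(tokens):
    """Expand a list of literals and (distance, length) pointers into text.

    The token format is an assumption made for illustration only.
    """
    out = []
    for tok in tokens:
        if isinstance(tok, tuple):
            distance, length = tok
            # Copy byte by byte so a pointer may overlap its own output
            # (distance < length), as LZ77 allows.
            for _ in range(length):
                out.append(out[-distance])
        else:
            out.append(tok)  # literal byte
    return "".join(out)


# The slide's example: 'abcdefg(7,5)' expands back to 'abcdefgabcde'.
example = list("abcdefg") + [(7, 5)]
decoded = lz77_decode(example)
```

Here `decoded` holds the original text from the slide; the pointer copies 5 bytes starting 7 bytes back.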


INTRODUCTION

• Recent work (the ACCH algorithm) presents a technique for pattern matching on compressed traffic that decompresses the traffic and then uses data from the decompression phase to accelerate the matching process.

• We present the Shift-based Pattern matching for Compressed traffic algorithm (SPC), which accelerates the Modified Wu-Manber algorithm (MWM) on compressed traffic.


THE MODIFIED WU-MANBER ALGORITHM

• MWM trims all patterns to their m-byte prefix, where m is the size of the shortest pattern.

• MWM uses a predefined group of bytes, of size B, to determine the shift value.

• MWM starts by precomputing two tables: a skip-shift table called ShiftTable and a patterns hash table called Ptrns.

• The scan is performed using a virtual scan window of size m. The shift value is determined by indexing ShiftTable with the B-byte suffix of the scan window.
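The steps above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names `build_tables` and `mwm_scan` are hypothetical, the tables are plain dictionaries, and keying Ptrns by the full m-byte prefix is an assumption.

```python
def build_tables(patterns, B=2):
    """Precompute ShiftTable and Ptrns (both modeled as dicts here)."""
    m = min(len(p) for p in patterns)   # size of the shortest pattern
    assert m >= B, "B must not exceed the shortest pattern length"
    default_shift = m - B + 1           # block occurs in no prefix
    shift_table, ptrns = {}, {}
    for p in patterns:
        prefix = p[:m]                  # trim to the m-byte prefix
        for j in range(m - B + 1):
            block = prefix[j:j + B]
            dist_to_end = m - B - j     # 0 when the block is the suffix
            old = shift_table.get(block, default_shift)
            shift_table[block] = min(old, dist_to_end)
        # Assumption: Ptrns is keyed by the whole m-byte prefix.
        ptrns.setdefault(prefix, []).append(p)
    return m, shift_table, ptrns, default_shift


def mwm_scan(text, patterns, B=2):
    """Return (start, pattern) for every pattern occurrence in text."""
    m, shift_table, ptrns, default_shift = build_tables(patterns, B)
    matches = []
    end = m                             # exclusive end of the scan window
    while end <= len(text):
        shift = shift_table.get(text[end - B:end], default_shift)
        if shift == 0:
            start = end - m
            # Candidate prefix match: verify each full pattern.
            for p in ptrns.get(text[start:end], []):
                if text.startswith(p, start):
                    matches.append((start, p))
            shift = 1
        end += shift
    return matches


found = mwm_scan("zzabcdezzxyzxyzz", ["abcd", "bcde", "xyzxyz"])
```

With these inputs m is 4, so a B-byte block that appears in no prefix lets the window jump m - B + 1 = 3 bytes at once.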




SHIFT-BASED PATTERN MATCHING FOR COMPRESSED TRAFFIC (SPC)

• The bytes referred to by the pointers were already scanned; hence, if we have prior knowledge that an area does not contain patterns, we can skip scanning most of it.

• Even if no patterns were found when the referred area was scanned, patterns may still occur at the boundaries of the pointer.

• The general method of the algorithm is a combined technique that scans the uncompressed portions of the data using MWM and skips most of the data represented by the LZ77 pointers.
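The skipping idea can be illustrated with a deliberately simplified sketch. It is not the SPC algorithm itself: the slides do not show the paper's bookkeeping, so the `partial` list, the token format (single-character literals and `(distance, length)` tuples), and the function names are all assumptions. For brevity, windows that must be scanned are checked directly in place of MWM shifting. A window that lies entirely inside a pointer reuses the result recorded for the referred area; windows crossing a pointer boundary are always checked.

```python
def decode_with_origin(tokens):
    """Decode tokens; record for each byte the pointer that produced it."""
    out, origin = [], []   # origin[i]: None or (pointer_end, distance)
    for tok in tokens:
        if isinstance(tok, tuple):
            distance, length = tok
            end = len(out) + length
            for _ in range(length):
                out.append(out[-distance])
                origin.append((end, distance))
        else:
            out.append(tok)
            origin.append(None)
    return "".join(out), origin


def scan_with_skips(tokens, patterns):
    """Return (matches, skipped): matches as (start, pattern) pairs and the
    number of window positions answered from earlier scan results."""
    text, origin = decode_with_origin(tokens)
    m = min(len(p) for p in patterns)
    by_prefix = {}
    for p in patterns:
        by_prefix.setdefault(p[:m], []).append(p)
    partial = [False] * len(text)   # partial[i]: m-byte prefix match at i
    matches, skipped = [], 0
    for i in range(len(text) - m + 1):
        info = origin[i]
        if info is not None and i + m <= info[0]:
            # Window lies inside one pointer: same bytes as the window
            # `distance` back, so reuse that earlier result.
            partial[i] = partial[i - info[1]]
            skipped += 1
        else:
            # Literal data or a pointer boundary: check the window.
            partial[i] = text[i:i + m] in by_prefix
        if partial[i]:
            for p in by_prefix[text[i:i + m]]:
                if text.startswith(p, i):
                    matches.append((i, p))
    return matches, skipped


tokens = list("abcdefg") + [(7, 5)] + list("fx") + [(9, 6)]
matches, skipped = scan_with_skips(tokens, ["cdef", "gab", "xfg"])
```

Note that a reused prefix match still triggers full-pattern verification, because the rest of the pattern may extend past the end of the pointer.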




EXPERIMENTAL RESULTS

• Data set: We collected HTTP pages encoded with GZIP, taken from a list constructed from the Alexa website, which maintains web traffic metrics and top-site lists.

• Pattern set: Our pattern sets were gathered from two different sources: ModSecurity, an open source web application firewall, and Snort, an open source network intrusion prevention system.


SPC Characteristics Analysis

• To understand the impact of B and m, we examined the skip ratio, Sr: the percentage of characters the algorithm skips.

• The Snort pattern set contains many short patterns: 410 distinct patterns of length ≤ 3, 539 of length 4, and 381 of length 5. Because m is the size of the shortest pattern, short patterns force a small m and therefore small shifts.

• To circumvent this problem, we inspected the rules containing these patterns. Most of the short patterns can be eliminated by using a longer pattern within the same rule or by relying on specific flow parameters.


EXPERIMENTAL RESULTS (Skip Ratio)


EXPERIMENTAL RESULTS (Throughput)


EXPERIMENTAL RESULTS (Storage)
