Gregex: GPU based High Speed Regular Expression Matching Engine

Date: 101/1/11
Publisher: 2011 Fifth International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing
Authors: Lei Wang, Shuhui Chen, Yong Tang, Jinshu Su
Presenter: Shi-qu Yu


INTRODUCTION

Gregex is a Graphics Processing Unit (GPU) based regular expression matching engine for deep packet inspection (DPI).

Gregex leverages the computational power and high memory bandwidth of GPUs by storing data in the appropriate GPU memory spaces and running a large number of GPU threads concurrently, so that many packets are processed in parallel.

THE PROPOSED GREGEX-Framework

The matching result buffer is a one-dimensional array allocated in global device memory; its size equals the number of packets that the GPU processes at a time.
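The one-cell-per-packet layout of the result buffer can be illustrated with a small CPU-side sketch. This is not Gregex code: the loop stands in for the GPU threads, and `run_batch` and `toy_match` are hypothetical names introduced only for illustration.

```python
def run_batch(packets, match_fn):
    """Return one result cell per packet in the batch."""
    results = [0] * len(packets)           # buffer size == number of packets in the batch
    for k, payload in enumerate(packets):  # iteration k stands in for GPU thread k
        results[k] = match_fn(payload)     # thread k writes only cell k
    return results


def toy_match(payload):
    # Hypothetical stand-in for the real matcher:
    # returns ID 1 if the bytes b"ab" occur in the payload, else 0.
    return 1 if b"ab" in payload else 0
```

Because each thread writes only its own cell, no synchronization between threads is needed when storing results.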


THE PROPOSED GREGEX-Workflow

1. Pre-processing phase
2. Signature matching phase
3. Post-processing phase

Pre-processing phase

Compiling regular expressions to DFA

Once the DFA has been constructed, the state transition table is copied to the texture memory of the GPU in two steps:
1. Copy the state transition table from CPU memory to GPU global memory.
2. Bind the state transition table in global memory to the texture cache.
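To make the state transition table concrete, here is a minimal CPU-side sketch of a flattened table and the lookup loop that walks it. The DFA is built by hand for the single pattern "ab" (an assumption for illustration; Gregex compiles its DFA from Snort signatures), and the names `build_table`, `match`, and `ACCEPT` are hypothetical. A flat array of this kind is what would be copied to global memory and bound to the texture cache.

```python
ALPHABET = 256          # one column per byte value
NUM_STATES = 3          # 0: start, 1: saw "a", 2: saw "ab" (accepting)
ACCEPT = {2: 1}         # accepting state -> regular expression ID (hypothetical ID 1)


def build_table():
    """Build a flattened table: table[state * ALPHABET + byte] -> next state."""
    table = [0] * (NUM_STATES * ALPHABET)
    for state in range(NUM_STATES):
        for byte in range(ALPHABET):
            if state == 2:
                nxt = 2                      # accepting state is absorbing
            elif byte == ord("a"):
                nxt = 1
            elif state == 1 and byte == ord("b"):
                nxt = 2
            else:
                nxt = 0
            table[state * ALPHABET + byte] = nxt
    return table


def match(table, payload):
    """Return the matching regular expression ID, or 0 if nothing matches."""
    state = 0
    for byte in payload:
        state = table[state * ALPHABET + byte]   # one table lookup per input byte
        if state in ACCEPT:
            return ACCEPT[state]
    return 0
```

Every input byte costs exactly one table lookup, which is why the memory space holding the table matters so much for throughput.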

Transferring packets to GPU

Gregex copies packets to device memory in batches.
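One way to prepare such batches is to pack packets into a single contiguous buffer so that each batch needs only one host-to-device copy. The sketch below assumes fixed-size slots with zero padding and truncation of oversized packets; the slides do not describe the actual buffer layout, and `pack_batches`, `batch_size`, and `slot_bytes` are hypothetical names.

```python
def pack_batches(packets, batch_size, slot_bytes):
    """Yield (buffer, lengths) pairs, one pair per batch of packets."""
    for start in range(0, len(packets), batch_size):
        batch = packets[start:start + batch_size]
        buf = bytearray(len(batch) * slot_bytes)   # zero-filled contiguous buffer
        lengths = []
        for i, payload in enumerate(batch):
            payload = payload[:slot_bytes]         # assumption: truncate oversized packets
            offset = i * slot_bytes
            buf[offset:offset + len(payload)] = payload
            lengths.append(len(payload))           # real length, so padding is not scanned
        yield bytes(buf), lengths
```

Larger batches amortize the fixed cost of each transfer, at the price of higher latency for the first packet in the batch.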

Signature matching phase

Post-processing phase

When all GPU threads have finished matching, the matching result array is copied to CPU memory. The kth cell of the matching result array contains the ID of the regular expression that matches the kth packet; if no match occurs, it is set to zero.
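On the host side, interpreting the copied array comes down to skipping the zero cells. A minimal sketch, with `collect_matches` as a hypothetical helper name:

```python
def collect_matches(results):
    """Return (packet index, regular expression ID) pairs for matched packets."""
    # A cell value of 0 means "no match", so only nonzero cells are reported.
    return [(k, regex_id) for k, regex_id in enumerate(results) if regex_id != 0]
```

For example, the result array `[0, 3, 0, 7]` means that packet 1 matched regular expression 3 and packet 3 matched regular expression 7. Using zero as the "no match" marker implies that regular expression IDs start at 1.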

Optimizations

1) Asynchronous packets Transfer with Page-locked memory (ATP):

Asynchronous copy: cudaMemcpyAsync performs a nonblocking transfer, so control is returned immediately to the host thread.

Zero copy: zero copy requires mapped page-locked memory and enables GPU threads to access host memory directly.
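A common reason to use nonblocking copies is to overlap the transfer of the next batch with matching on the current one. The toy timing model below illustrates that effect only; it is not a measurement and not Gregex code. It assumes one copy engine, one kernel at a time, and enough buffers that copies never wait for a free slot. `pipeline_time` and the millisecond figures in the usage note are hypothetical.

```python
def pipeline_time(copy_ms, kernel_ms, overlap):
    """Total time to transfer and match all batches under a simple two-stage model."""
    copy_done = 0.0     # time at which the latest copy finished
    kernel_done = 0.0   # time at which the latest kernel finished
    for c, k in zip(copy_ms, kernel_ms):
        if overlap:
            copy_done += c                               # copies run back to back
        else:
            copy_done = max(copy_done, kernel_done) + c  # copy waits for previous kernel
        # A kernel starts once its data has arrived and the previous kernel is done.
        kernel_done = max(copy_done, kernel_done) + k
    return kernel_done
```

With three batches that each take 2 ms to copy and 3 ms to match, the model gives 15 ms without overlap and 11 ms with overlap, because only the first copy remains exposed.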

2) Coalesced global memory access in regular expression matching

Coalesced global memory Access by Buffering packets to shared memory (CAB):

In this work, coalesced global memory access is obtained by having each half-warp read contiguous locations of global memory into shared memory.

We use s_packets, a 32×32 shared memory array of 32-bit words, to "buffer" packets from global memory for every thread.
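The difference between the two access patterns can be seen from the word addresses that the 16 threads of a half-warp touch in a single step. The sketch below only does the index arithmetic on the CPU. It assumes each packet occupies 32 consecutive 32-bit words in global memory (matching the row width of the 32×32 array); the exact indexing used by Gregex is not given in the slides, and the function names are hypothetical.

```python
HALF_WARP = 16  # threads whose accesses are combined on this GPU generation


def direct_addresses(words_per_packet, j):
    """Word addresses when thread t reads word j of its own packet directly."""
    return [t * words_per_packet + j for t in range(HALF_WARP)]


def cooperative_addresses(step):
    """Word addresses when the half-warp cooperatively loads one 16-word chunk."""
    return [step * HALF_WARP + lane for lane in range(HALF_WARP)]


def is_contiguous(addrs):
    """True if the addresses form one unbroken run of consecutive words."""
    return all(b - a == 1 for a, b in zip(addrs, addrs[1:]))
```

Direct reads are 32 words apart (`direct_addresses(32, 0)` starts 0, 32, 64, ...), so they cannot be combined into one memory transaction. The cooperative load touches consecutive words, and each thread then scans its own packet from the shared memory buffer.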

EVALUATION RESULTS

The test platform is a PC with a 2.66 GHz Intel Core 2 Duo processor, 4 GB of memory, and an NVIDIA GeForce GTX 260 GPU card. The GTX 260 GPU contains 216 SPs organized in 27 SMs, running at 1.35 GHz, with 896 MB of global memory.

Gregex uses signatures in the rule set released with Snort 2.7. The rule set consists of 56 different signature sets.

Packets Transfer Performance

Regular Expression Matching Performance


Overall throughput of Gregex