Fast Snippet Generation Approach Based On CPU-GPU...

23
Huazhong University of Science and Technology Fast Snippet Generation Approach Based On CPU GPU Approach Based On CPU-GPU Hybrid System Di Li Ri Li Xi G Ding Liu, Ruixuan Li, Xiwu Gu, Kunmei Wen, Heng He, Guoqiang Gao Huazhong University of Science and Technology, Wuhan, China

Transcript of Fast Snippet Generation Approach Based On CPU-GPU...

  • Huazhong University ofScience and Technology

    Fast Snippet Generation Approach Based On CPU GPUApproach Based On CPU-GPU

    Hybrid System

    Di Li R i Li Xi GDing Liu, Ruixuan Li, Xiwu Gu,Kunmei Wen, Heng He, Guoqiang Gao

    Huazhong University of Science and Technology,Wuhan, China

  • Outline

    Backgroundg

    Motivation

    Approach

    E i t Experiments

    Conclusion

    Huazhong University of Science and Technology

  • What is Snippet?

    Snippet is a set of fragments which are extracted from the source documentsource document.

    Huazhong University of Science and Technology

  • Characteristics of Snippet

    Snippets help users understand the relevance of th d t d ththe documents and the query.

    The workload of snippet generation is very hugehuge.

    As different queries on a document will generate different snippets, we cannot take the document as the key to cache the snippets.

    Huazhong University of Science and Technology

  • Snippet Generation Approach

    Query-independent approach (static)

    Generate snippet without query, and the snippet is always the same for a certain documentalways the same for a certain document.

    Query-biased approach (dynamic)Query-biased approach (dynamic)

    Generate snippet with query, and the snippet of a pp q y, ppcertain document is different when using different query.

    Huazhong University of Science and Technology

  • Query-biased Snippet Generation

    Document-based approach

    Each < document, query > pair as an isolate process unit.

    Easy to parallelizeEasy to parallelize

    Index-based approach (recursive)

    Holger Bast , et al. WWW 2009

    Using inverted index lists to intersect recursively.Using inverted index lists to intersect recursively.

    Huazhong University of Science and Technology

  • Challenges

    Highlighter is an implementation of document-b d h th t i id l d f i tbased approach that is widely used for snippet generation in enterprise search engine

    li tiapplications.

    There are two defects in this approach.There are two defects in this approach.

    It adopts a serial computational model and this may lead to low efficiencylead to low efficiency.

    The high relevant fragment may be cut off and will cause lower snippet precision.

    Huazhong University of Science and Technology

  • Consideration of Snippet Generation

    Lightweight task

    Computation intensive

    Hard to cache

    S i G i M i ll t k b tSnippet Generation: Massive small tasks, but the processes are the same.

    GPU

    Huazhong University of Science and Technology

  • Contributions of this Paper

    Design a process stream for snippet generation li ti f h i i CPU GPUapplication of search engines using CPU-GPU

    hybrid architecture.

    Use sliding window for document segmentation.

    Parallelize the process of snippet generation for each segmentation and scoringeach segmentation and scoring.

    Huazhong University of Science and Technology

  • Snippet Generation Process

    Pre-Process Document segmentation d f t filt iand fragment filtering.

    C l l t th l f hFragment Scoring Calculate the relevance of each fragment against the query

    Selection & Construction Select the top relevant fragments as output snippetg p pp

    Huazhong University of Science and Technology

  • Document Segmentation

    Two types of document segmentationf Truncating segmentation: cut the document to fragments with

    the same length.• The drawback of this method is that high relevant fragment may beThe drawback of this method is that high relevant fragment may be

    cut off in the process.

    • Compared to sliding method, the amount of fragments is smaller, and it does good to performanceand it does good to performance.

    Sliding segmentation: segment a document as a sliding window, g g g g ,and all the fragments have the same length.

    • This method will not cut off the high relevant fragment, but it will generate much more fragments than the truncating methodgenerate much more fragments than the truncating method.

    • Besides, we should filter out the useless and duplicate part.

    • We propose using GPU to deal with more fragmentsWe propose using GPU to deal with more fragments.

    Huazhong University of Science and Technology

  • Fragment Scoring

    The common method to evaluate the relevance of a document and a query is based on support vectordocument and a query is based on support vector machine (SVM).

    f We consider a fragment as a small document. As the fragment is much more shorter than a document, we simplify the calculation as followssimplify the calculation as follows.

    Huazhong University of Science and Technology

  • Architecture

    Huazhong University of Science and Technology

  • Interaction between CPU and GPU

    Huazhong University of Science and Technology

  • Pipeline

    P1: Preprocessing Segment the documents into fragment sets, and filter out

    the irrelevance fragments

    P2: Transferring & scoring Transfer D Q HitV to GPU Transfer D, Q, HitV to GPU Score in GPU Transfer back to host memory Transfer back to host memoryP3: Sorting & outputting

    Huazhong University of Science and Technology

  • Experiment settings

    Intel Core2-Duo processors (2.2 GHz and 1 MB cache each) and 4 GB of main memoryeach) and 4 GB of main memory.

    GPU device is NVIDIA GTX200 / GTX250.

    All the codes are implemented in C++ and CUDA.

    Th l t d f R t C E li h The corpus was selected from Reuters Corpus English Language 2006, 2007 and 2008.

    The queries were selected from 2006 TREC efficiency topics.

    Data set containing 3,000,000 document-query pairs.

    Huazhong University of Science and Technology

  • Performance with different process unit size

    30

    35

    40

    ) 1.11.2

    1.3

    15

    20

    25

    ut (p

    airs

    /ms)

    0.8

    0.9

    1.0

    se ti

    me

    (ms)

    0

    5

    10

    thou

    ghpu

    0.5

    0.6

    0.7

    resp

    ons

    0 10 20 30 40

    number of pairs per unit0 10 20 30 40

    0.4

    number of pairs per unit

    We adopt 15 as the process unit size in the following experiments.

    Throughput and response time with different process unit size

    Huazhong University of Science and Technology

  • Efficiency under different load levels

    We tested the pipeline under different pressure to find out the performance under different work loadsout the performance under different work loads.

    200

    250

    (us) 0.9

    1.0

    1.1

    ms)

    40

    50

    ms)

    100

    150

    me

    cost

    per

    pai

    r (

    0.5

    0.6

    0.7

    0.8

    spon

    se ti

    me

    (m

    10

    20

    30

    ough

    put (

    pairs

    /m

    0 10 20 30 40 500

    50tim

    load level (pairs/ms)0 10 20 30 40 50

    0.3

    0.4res

    load level (pairs/ms)0 10 20 30 40 50

    0

    10

    thr

    load level (pairs/ms)

    (a) Time consuming of a pair (b) Response time measurement (c) Throughput measurement

    Huazhong University of Science and Technology

  • Performance comparison tests

    The test results show that, when the work load is more than 5 pairs/ms, our approach has a much betterthan 5 pairs/ms, our approach has a much better performance than Highlighter in both throughput and average time cost by a pair.

    32

    GPU running our approachHighlighter

    600 GPU running our approach Highlighter 1.8

    GPU running our approachHighlighter

    16

    24

    g g CPU running our approach

    (pai

    rs/m

    s)

    300

    400

    500 CPU running our approach

    er p

    air (

    us)

    1.2

    1.5

    Highlighter CPU running our approach

    time

    (ms)

    8

    16

    thro

    ughp

    ut

    100

    200

    time

    cost

    p

    0.3

    0.6

    0.9

    resp

    onse

    t

    0 10 20 30 40 500

    load level (pairs/ms)0 10 20 30 40 50

    0

    load level (pairs/ms)0 4 8 12 16 20 24 28 32

    load level (pairs/ms)

    (a) Throughput Measurement (b) Time consuming of a pair (c) Response time measurement

    Huazhong University of Science and Technology

  • Economic advantage

    We compare our approach using GPU and CPU with other high performance multi core processorsother high performance multi-core processors.

    GTX250 + Intel Intel Core2 Due Intel i5 2300 AMD Opteron8 Core2 Due(x2) (x2) (x4)

    p6134 (x6)

    Speed up 5.8 1.8 3.7 5.5

    price($) 107+77=184 77 190 507

    Huazhong University of Science and Technology

  • Snippet precision

    We compare our approach using GPU and CPU with Highlighter on CPU processorHighlighter on CPU processor.

    We use harmonic mean of precision and recall (F1) as the precision measurement to test the snippet precision.

    r p F1

    Our approach 0 48 0 44 0 459Our approach 0.48 0.44 0.459

    Highlighter 0.46 0.43 0.444

    Huazhong University of Science and Technology

  • Conclusions

    We propose a CPU-GPU hybrid system for snippet generation application of search enginesgeneration application of search engines.

    We employ sliding window for document We employ sliding window for document segmentation.

    We parallelize the process of snippet generation for each segmentation and scoring.g g

    We design a pipeline system for stream processing that achieves a preferable performance.

    Huazhong University of Science and Technology

  • Thanks!

    http://idc hust edu cnhttp://idc.hust.edu.cnHuazhong University of Science and Technology