Schneiderman Kanade Viola Jones Presentation


  • 8/12/2019 Schneiderman Kanade Viola Jones Presentation

    1/20

    Object Detection Using the Statistics of Parts

    Henry Schneiderman and Takeo Kanade

    Robust Real-time Object Detection

    Paul Viola and Michael Jones

    Presented by Nicholas Chan, 16-721 Advanced Perception


    2/20

    Object Detection Using the Statistics of Parts

    Object detectors trained using information on image parts

    Henry Schneiderman and Takeo Kanade


    3/20

    So what's a part?

    Intuitively, a part is a portion of an object. For the purposes of image processing, a part is a group of features that are statistically dependent.

    The assumption is that certain groups of pixels in an image tend to appear together and are (relatively) independent of other groups.


    4/20

    Choosing parts

    First, a wavelet transform is applied to the image. This decorrelates the pixels, localizing dependencies and therefore producing more focused parts.

    A wavelet transform is the result of applying a series of wavelet filters to an image. The result is horizontal, vertical and diagonal responses for several scales.
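
    As a rough illustration of the decomposition described above, here is a minimal sketch in Python with NumPy. It uses the Haar wavelet purely for simplicity, which is an assumption: the slide does not say which filters the paper uses. The function names (haar_level, haar_pyramid) and the default of three levels are illustrative only.

```python
import numpy as np

def haar_level(img):
    """One level of a 2D Haar decomposition (illustrative choice of wavelet)."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    img = img[: h - h % 2, : w - w % 2]  # crop to even size so 2x2 blocks tile it
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    approx = (a + b + c + d) / 4.0    # low-pass image passed to the next scale
    row_diff = (a + b - c - d) / 4.0  # top minus bottom: responds to horizontal edges
    col_diff = (a - b + c - d) / 4.0  # left minus right: responds to vertical edges
    diag = (a - b - c + d) / 4.0      # diagonal response
    return approx, (row_diff, col_diff, diag)

def haar_pyramid(img, levels=3):
    """Repeat haar_level on the approximation to get responses at several scales."""
    bands = []
    approx = np.asarray(img, dtype=float)
    for _ in range(levels):
        approx, detail = haar_level(approx)
        bands.append(detail)
    return approx, bands
```

    Each entry of bands holds the three oriented responses for one scale, which is the kind of input the local operators on the next slide would combine.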


    5/20

    Choosing parts (2)

    Next, seventeen hand-designed local operators are applied across the image. These local operators combine pairs of filter results from the wavelet transform. Some relate horizontal to vertical responses, whereas others relate responses to those of the same orientation but different scale.

    The output is discrete over 3^8 values. These are the parts.
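
    One way to get an output that is discrete over 3^8 = 6561 values is to quantize eight coefficients to three levels each and read the result as a base-3 number. The sketch below does exactly that. It is an assumption about how the count arises, not something the slide spells out; the threshold t and the function names are hypothetical.

```python
import numpy as np

def quantize3(coeffs, t=0.5):
    """Map each coefficient to 0, 1 or 2 using a hypothetical threshold t."""
    coeffs = np.asarray(coeffs, dtype=float)
    # below -t -> 0, between -t and t -> 1, above t -> 2
    return (coeffs > -t).astype(int) + (coeffs > t).astype(int)

def part_index(coeffs, t=0.5):
    """Encode eight quantized coefficients as one integer in [0, 3**8)."""
    digits = quantize3(coeffs, t)
    if digits.size != 8:
        raise ValueError("expected exactly 8 coefficients")
    return int(sum(int(d) * 3 ** i for i, d in enumerate(digits)))
```

    The resulting integer can be used directly as a key into a probability table, one table per local operator.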

    Are we even talking about parts of anything anymore..?


    6/20

    Choosing parts (3)

    [Diagram labels: Intra-subband; Inter-orientation; Inter-frequency; Inter-frequency/Inter-orientation; Local operator (x3); Box o' Mystery; Parts]


    7/20

    Classification by parts

    Using this definition of parts and the base assumption that pixels within parts are independent of those outside parts, a classifier can be obtained:

    $$\prod_{r} \frac{P(\text{part}_r \mid \text{object})}{P(\text{part}_r \mid \text{non-object})}$$

    A simple independence assumption
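
    The product of per-part ratios can be sketched as follows. Working in log space and comparing against a threshold are implementation choices assumed here, not taken from the slide; p_obj and p_non are hypothetical lookup tables mapping a part value to its probability under each class.

```python
import math

def log_likelihood_ratio(parts, p_obj, p_non):
    """Sum over parts of log[P(part | object) / P(part | non-object)]."""
    return sum(math.log(p_obj[r] / p_non[r]) for r in parts)

def classify(parts, p_obj, p_non, log_threshold=0.0):
    """Declare 'object' when the log-ratio exceeds an assumed threshold."""
    return log_likelihood_ratio(parts, p_obj, p_non) > log_threshold
```

    Summing logs instead of multiplying raw ratios avoids underflow when many parts are combined. Every probability in the tables must be nonzero for the logarithm to be defined.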


    8/20

    Learning by parts

    P(part | object) and P(part | non-object) are calculated with a simple MLE:

    $$P(\text{part} \mid \text{object}) = \frac{\text{count}(\text{part} \,\&\, \text{object})}{\text{count}(\text{object})}$$

    AdaBoost is used to improve classification accuracy (more on this later).
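
    The counting estimate above can be sketched in a few lines. This assumes each training example of a class contributes one observed part value; the function name mle_part_probs is hypothetical.

```python
from collections import Counter

def mle_part_probs(part_values):
    """Estimate P(part | class) as count(part & class) / count(class).

    part_values holds one observed part value per training example of the class.
    """
    counts = Counter(part_values)
    total = len(part_values)
    return {value: n / total for value, n in counts.items()}
```

    Running it once on the object examples and once on the non-object examples gives the two tables the classifier needs. Part values never seen in training get no entry, which corresponds to a zero probability under the plain MLE.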


    9/20

    Detection examples


    10/20

    Robust Real-time Object Detection

    Paul Viola and Michael Jones

    High-speed face detection with good accuracy


    11/20

    The detector

    A simple filter bank with learned weights applied across the image

    But with some notable performance-boosting implementation tricks


    12/20

    Three big speed gains

    Integral image representation and rectangle features

    Selection of a small but effective feature set with AdaBoost

    Cascading simple detectors to quickly eliminate false positives


    13/20

    The integral image representation

    An image representation that stores the sum of the intensity values above and to the left of the image point.

    x, y

    IntegralImage(x,y) = Sum of the values in the grey region
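
    A minimal sketch of building this representation with NumPy, assuming a grayscale image stored as a 2D array indexed [row, column] and an inclusive convention (the point itself is counted in the sum):

```python
import numpy as np

def integral_image(img):
    """ii[y, x] = sum of img[0..y, 0..x], inclusive, via two cumulative sums."""
    return np.asarray(img, dtype=np.int64).cumsum(axis=0).cumsum(axis=1)
```

    The whole table is built in one pass over the image, so its cost is paid once per image rather than once per filter.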

    So what's it good for?


    14/20

    The integral image representation

    This representation allows rectangular feature responses to be calculated in constant time. Rectangular features are simple filters that have only +1 and -1 values and are, well, rectangles.

    Two-rectangle features. Three-rectangle features. I bet you can guess what these are called.

    With an integral image and rectangular features, filter responses are just a fixed number of table lookups and additions away.
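
    To make the "fixed number of lookups" concrete, here is a sketch of a rectangle sum and one two-rectangle feature. The zero-padded table and the function names are choices made for this example; the left-minus-right layout is just one possible two-rectangle feature.

```python
import numpy as np

def padded_integral(img):
    """Integral image with an extra zero row and column, so ii[y, x] = sum of img[:y, :x]."""
    img = np.asarray(img, dtype=np.int64)
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of any rectangle from four table lookups, independent of its size."""
    bottom, right = top + height, left + width
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

def two_rect_feature(ii, top, left, height, width):
    """Left half weighted +1, right half weighted -1 (width assumed even)."""
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))
```

    The padding removes the special cases at the image border: a rectangle touching the top or left edge reads zeros instead of needing a bounds check.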


    15/20

    Speed gain number two: AdaBoost-selected features

    AdaBoost is used to select the best set of rectangular features.

    AdaBoost iteratively trains a classifier by emphasizing misclassified training data. Assigned feature weights are used to select the most important features.

    Top two features weighted by AdaBoost
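
    The selection idea can be sketched with textbook discrete AdaBoost over single-feature threshold classifiers (decision stumps), so that each round picks one feature. This is a generic formulation, not necessarily the exact variant used in the paper; labels are assumed to be +1/-1 and X is assumed to hold one precomputed feature response per column.

```python
import numpy as np

def stump_predict(X, feature, threshold, polarity):
    """Predict +1 where polarity * x < polarity * threshold, else -1."""
    return np.where(polarity * X[:, feature] < polarity * threshold, 1, -1)

def best_stump(X, y, w):
    """Exhaustively find the stump with the lowest weighted error."""
    best = (np.inf, 0, 0.0, 1)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                err = w[stump_predict(X, j, thr, pol) != y].sum()
                if err < best[0]:
                    best = (err, j, thr, pol)
    return best

def adaboost(X, y, rounds):
    """Return a list of (alpha, feature, threshold, polarity), one per round."""
    w = np.full(len(y), 1.0 / len(y))
    model = []
    for _ in range(rounds):
        err, j, thr, pol = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)     # keep the log finite
        alpha = 0.5 * np.log((1 - err) / err)     # weight of this feature
        pred = stump_predict(X, j, thr, pol)
        w = w * np.exp(-alpha * y * pred)         # emphasize the mistakes
        w = w / w.sum()
        model.append((alpha, j, thr, pol))
    return model

def predict(model, X):
    """Sign of the alpha-weighted vote of all selected stumps."""
    score = sum(a * stump_predict(X, j, t, p) for a, j, t, p in model)
    return np.where(score >= 0, 1, -1)
```

    The feature indices stored in the model are the selected features, and the alpha values are the weights that rank them.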


    16/20

    Intermediate results

    The face detector using 200 AdaBoost-selected features achieved a 1 in 14084 false positive rate when tuned for a 95% classification rate.

    A 384x288 image took 0.7 seconds to scan.

    There are more improvements to be made


    17/20

    Speed gain number three: Cascading detectors

    Instead of applying all 200 filters at every location in the image, train several simpler classifiers to quickly eliminate easy negatives.

    Each successive filter can be trained on true positives and the false positives passed by the filters before it.

    The filters are trained to allow approximately 10% false positives.

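    [Diagram: an image segment passed to a single 200-feature classifier that accepts or rejects it, next to an image segment passed through successive 20-feature classifiers, each of which can reject it.]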
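
    The early-exit behavior of the cascade can be sketched as below. Each stage is modeled as a hypothetical callable that returns True to pass a window on; the second helper shows the arithmetic behind the 10% figure under the assumption that stages err independently.

```python
def cascade_accepts(window, stages):
    """Run the stages in order and reject as soon as any stage says no."""
    for stage in stages:
        if not stage(window):
            return False  # later (more expensive) stages are never evaluated
    return True

def combined_false_positive_rate(per_stage_rate, n_stages):
    """Overall rate if every stage passes the same fraction of negatives,
    assuming the stages err independently."""
    return per_stage_rate ** n_stages
```

    With a per-stage rate of 0.1, three stages already bring the combined rate down to about 1 in 1000, while most windows are discarded after the first, cheapest stage.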


    18/20

    Cascade improvements

    The cascading features provide comparable accuracy, but ten times the speed.


    19/20

    Results

    Good accuracy with very fast evaluation.

    0.067 seconds per image.

    An average of 8 out of 4297 features evaluated.


    20/20

    Detection examples