Schneiderman Kanade Viola Jones Presentation

8/12/2019 Schneiderman Kanade Viola Jones Presentation

1/20

Object Detection Using the Statistics of Parts

Henry Schneiderman and Takeo Kanade

Robust Real-time Object Detection

Paul Viola and Michael Jones

Presented by Nicholas Chan

16-721 Advanced Perception


2/20

Object Detection Using the Statistics of Parts

Object detectors trained using information on image parts

Henry Schneiderman and Takeo Kanade


3/20

So what's a part?

Intuitively, a part is a portion of an object. For the purposes of image processing, a part is a group of features that are statistically dependent.

The assumption is that certain groups of pixels in an image tend to appear together and are (relatively) independent of other groups.


4/20

Choosing parts

First, a wavelet transform is applied to the image. This decorrelates the pixels, localizing dependencies and therefore producing more focused parts.

A wavelet transform is the result of applying a series of wavelet filters to an image. The result is horizontal, vertical and diagonal responses for several scales.
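To make "horizontal, vertical and diagonal responses for several scales" concrete, here is a minimal sketch of a multi-level 2-D wavelet decomposition. It uses the Haar wavelet purely because it is the simplest to write down; the slides do not say which wavelet filters the authors used, so the filter choice, the function names `haar_level` and `haar_pyramid`, and the 1/4 normalization are all assumptions made for illustration.

```python
def haar_level(image):
    """One level of a 2-D Haar transform on a list of rows (even height and width).

    Returns four half-size subbands: the low-pass approximation and the
    horizontal, vertical and diagonal detail responses.
    """
    approx, horiz, vert, diag = [], [], [], []
    for r in range(0, len(image), 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, len(image[0]), 2):
            p, q = image[r][c], image[r][c + 1]          # top-left, top-right
            s, t = image[r + 1][c], image[r + 1][c + 1]  # bottom-left, bottom-right
            a_row.append((p + q + s + t) / 4.0)  # local average
            h_row.append((p + q - s - t) / 4.0)  # top vs. bottom (horizontal edges)
            v_row.append((p - q + s - t) / 4.0)  # left vs. right (vertical edges)
            d_row.append((p - q - s + t) / 4.0)  # diagonal structure
        approx.append(a_row)
        horiz.append(h_row)
        vert.append(v_row)
        diag.append(d_row)
    return approx, horiz, vert, diag


def haar_pyramid(image, levels):
    """Re-apply haar_level to the approximation to get responses at several scales."""
    bands = []
    current = image
    for _ in range(levels):
        current, h, v, d = haar_level(current)
        bands.append((h, v, d))  # one (horizontal, vertical, diagonal) triple per scale
    return current, bands
```

Calling `haar_pyramid(image, 2)` on a 4x4 image returns the final 1x1 approximation plus one triple of detail subbands per scale, finest scale first.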


5/20

Choosing parts (2)

Next, seventeen hand-designed local operators are applied across the image. These local operators combine pairs of filter results from the wavelet transform. Some relate horizontal to vertical responses, whereas others relate responses to those of the same orientation but different scale.

The output is discrete over 3^8 values. These are the parts.

Are we even talking about parts of anything anymore..?
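One way an operator's output could land on one of 3^8 = 6561 discrete values is to take eight wavelet coefficients, quantize each to three levels, and read the result as a base-3 number. The slide only states the 3^8 figure; the eight-coefficients-by-three-levels split, the threshold value, and the names `quantize` and `part_index` are assumptions for this sketch.

```python
def quantize(coefficient, threshold=1.0):
    """Map a coefficient to one of three levels.

    0 = clearly negative, 1 = near zero, 2 = clearly positive.
    The threshold is a hypothetical value, not one taken from the slides.
    """
    if coefficient < -threshold:
        return 0
    if coefficient > threshold:
        return 2
    return 1


def part_index(coefficients, threshold=1.0):
    """Encode eight quantized coefficients as one integer in [0, 3**8)."""
    if len(coefficients) != 8:
        raise ValueError("expected exactly 8 coefficients")
    index = 0
    for value in coefficients:
        index = index * 3 + quantize(value, threshold)  # append one base-3 digit
    return index
```

Each distinct index is then treated as one possible value of the part, which is what makes the simple counting-based probability tables used later in the talk feasible.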


6/20

Choosing parts (3)

[Diagram: intra-subband, inter-orientation, inter-frequency and inter-frequency/inter-orientation coefficient groupings feed the local operators (the "Box o' Mystery"), which output the parts.]


7/20

Classification by parts

Using this definition of parts and the base assumption that pixels within parts are independent of those outside parts, a classifier can be obtained:

$$\prod_{r} \frac{P(\text{part}_r \mid \text{object})}{P(\text{part}_r \mid \text{non-object})}$$

A simple independence assumption
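The classifier on this slide multiplies one likelihood ratio per part. A minimal sketch of evaluating it follows, done in log space so that a long product does not underflow. The probability tables are plain dictionaries here, and the decision threshold `log_threshold` is a hypothetical parameter: the slide shows the ratio but not the value it is compared against.

```python
import math


def log_likelihood_ratio(parts, p_given_object, p_given_non_object):
    """Sum of log P(part_r | object) - log P(part_r | non-object) over all parts r."""
    total = 0.0
    for part in parts:
        total += math.log(p_given_object[part]) - math.log(p_given_non_object[part])
    return total


def is_object(parts, p_given_object, p_given_non_object, log_threshold=0.0):
    """Declare "object" when the log of the product of ratios exceeds the threshold."""
    score = log_likelihood_ratio(parts, p_given_object, p_given_non_object)
    return score > log_threshold
```

Because of the independence assumption, each part contributes one additive term, so evaluation cost grows linearly with the number of parts observed in the window.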


8/20

Learning by parts

P(part | object) and P(part | non-object) are calculated with a simple MLE:

$$P(\text{part} \mid \text{object}) = \frac{\text{count}(\text{part} \,\&\, \text{object})}{\text{count}(\text{object})}$$
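The counting estimate above can be sketched in a few lines. The input format (a list of `(part_value, label)` pairs) and the function name `estimate_part_probabilities` are assumptions for illustration, not the authors' training code.

```python
from collections import Counter


def estimate_part_probabilities(observations):
    """Maximum-likelihood estimate of P(part | label) by counting.

    observations: iterable of (part_value, label) pairs, where label is
    "object" or "non-object". Returns a dict keyed by (part_value, label).
    """
    joint = Counter()      # count(part & label)
    per_label = Counter()  # count(label)
    for part, label in observations:
        joint[(part, label)] += 1
        per_label[label] += 1
    return {key: n / per_label[key[1]] for key, n in joint.items()}
```

Part values never observed with a label get no entry at all, which corresponds to an estimate of zero. A practical version would probably need some smoothing before taking ratios, but the slide does not say how the authors handled that.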

AdaBoost is used to improve classification accuracy (more on this later).


9/20

Detection examples


10/20

Robust Real-time Object Detection

Paul Viola and Michael Jones

High-speed face detection with good accuracy


11/20

The detector

A simple filter bank with learned weights applied across the image

But with some notable performance-boosting implementation tricks


12/20

Three big speed gains

Integral image representation and rectangle features

Selection of a small but effective feature set with AdaBoost

Cascading simple detectors to quickly eliminate false positives


13/20

The integral image representation

An image representation that stores the sum of the intensity values above and to the left of the image point.

[Figure: image point (x, y) with the grey region above and to its left]

IntegralImage(x, y) = sum of the values in the grey region
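The definition above can be computed in a single pass over the image. This is a minimal sketch that treats the image as a list of rows and includes the point (x, y) itself in the sum; the inclusive convention and the function name `integral_image` are choices made here for illustration.

```python
def integral_image(image):
    """Return ii where ii[y][x] is the sum of image[r][c] for all r <= y and c <= x."""
    rows, cols = len(image), len(image[0])
    ii = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        row_sum = 0  # running sum of the current row up to column x
        for x in range(cols):
            row_sum += image[y][x]
            above = ii[y - 1][x] if y > 0 else 0  # everything in the rows above
            ii[y][x] = row_sum + above
    return ii
```

For the 2x2 image `[[1, 2], [3, 4]]` this yields `[[1, 3], [4, 10]]`: the bottom-right entry is the sum of the whole image.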

So what's it good for?


14/20

The integral image representation

This representation allows rectangular feature responses to be calculated in constant time. Rectangular features are simple filters that have only +1 and -1 values and are, well, rectangles.

[Figure: two-rectangle features, three-rectangle features, and "I bet you can guess what these are called"]

With an integral image and rectangular features, filter responses are just a fixed number of table lookups and additions away.
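The "fixed number of table lookups" can be sketched as follows: any rectangle sum needs four lookups, so a two-rectangle feature needs a small constant number regardless of its size. This version pads the integral image with a zero row and column to avoid boundary checks; the padding, the (top, left, height, width) argument order, and the function names are assumptions for this sketch.

```python
def padded_integral_image(image):
    """ii[y][x] = sum of image[r][c] for r < y and c < x (extra zero row and column)."""
    rows, cols = len(image), len(image[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for y in range(rows):
        row_sum = 0
        for x in range(cols):
            row_sum += image[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels inside a rectangle, using exactly four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Left half weighted +1, right half weighted -1 (width must be even)."""
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))
```

The cost of `two_rect_feature` is the same for a 2x2 window and a 200x200 window, which is what makes scanning the image at many positions and scales affordable.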


15/20

Speed gain number two: AdaBoost selected features

AdaBoost is used to select the best set of rectangular features.

AdaBoost iteratively trains a classifier by emphasizing misclassified training data. Assigned feature weights are used to select the most important features.

Top two features weighted by AdaBoost
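A rough sketch of how boosting can double as feature selection: in each round, fit a one-feature threshold classifier (a stump) for every candidate feature, keep the one with the lowest weighted error, then increase the weights of the examples it got wrong. This follows the textbook discrete AdaBoost update with +1/-1 labels, which is an assumption here; the slides do not spell out the exact variant, and the function names are hypothetical.

```python
import math


def best_stump(values, labels, weights):
    """Lowest weighted-error threshold and polarity for a single feature.

    The stump predicts +1 when polarity * value < polarity * threshold, else -1.
    """
    best = (float("inf"), None, None)
    for threshold in sorted(set(values)):
        for polarity in (1, -1):
            error = 0.0
            for value, label, weight in zip(values, labels, weights):
                prediction = 1 if polarity * value < polarity * threshold else -1
                if prediction != label:
                    error += weight
            if error < best[0]:
                best = (error, threshold, polarity)
    return best


def adaboost_select(features, labels, rounds):
    """features[j][i] is feature j on example i; labels are +1 or -1.

    Returns one (feature_index, threshold, polarity, alpha) tuple per round.
    """
    n = len(labels)
    weights = [1.0 / n] * n
    chosen = []
    for _ in range(rounds):
        candidates = []
        for j, values in enumerate(features):
            error, threshold, polarity = best_stump(values, labels, weights)
            candidates.append((error, j, threshold, polarity))
        error, j, threshold, polarity = min(candidates)  # best feature this round
        error = min(max(error, 1e-10), 1 - 1e-10)        # keep the log finite
        alpha = 0.5 * math.log((1 - error) / error)      # weight of this feature
        chosen.append((j, threshold, polarity, alpha))
        # Emphasize the examples this stump misclassified.
        updated = []
        for i in range(n):
            prediction = 1 if polarity * features[j][i] < polarity * threshold else -1
            updated.append(weights[i] * math.exp(-alpha * labels[i] * prediction))
        total = sum(updated)
        weights = [w / total for w in updated]
    return chosen
```

The list returned by `adaboost_select` is the selected feature set: the feature indices say which rectangle features to keep, and the `alpha` values are their weights in the final classifier.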


16/20

Intermediate results

The face detector using 200 AdaBoost-selected features achieved a 1 in 14084 false positive rate when tuned for a 95% classification rate.

A 384x288 image took 0.7 seconds to scan.

There are more improvements to be made


17/20

Speed gain number three: Cascading detectors

Instead of applying all 200 filters at every location in the image, train several simpler classifiers to quickly eliminate easy negatives.

Each successive filter can be trained on true positives and the false positives passed by the filters before it. The filters are trained to allow approximately 10% false positives.

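[Diagram: a single 200-feature classifier that accepts or rejects each image segment, compared with a cascade of 20-feature stages in which each stage either rejects the segment or passes it on to the next stage.]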
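The control flow of a cascade is short enough to sketch directly: run the stages in order and stop at the first rejection, so most windows cost only one or two cheap stages. Each stage is modelled here as any callable returning True (pass on) or False (reject); that interface, the function names, and the assumption that stages err independently in the rate calculation are illustrative choices, not details from the slides.

```python
def run_cascade(stages, window):
    """Evaluate stages in order, stopping at the first rejection.

    Returns (accepted, number_of_stages_evaluated).
    """
    for count, stage in enumerate(stages, start=1):
        if not stage(window):
            return False, count  # rejected early; later stages never run
    return True, len(stages)


def cascade_false_positive_rate(per_stage_rate, num_stages):
    """Overall false positive rate if every stage passes the same fraction of
    the negatives it sees and the stages err independently (an idealization)."""
    return per_stage_rate ** num_stages
```

With the roughly 10% per-stage false positive rate mentioned above, `cascade_false_positive_rate(0.1, 3)` gives 0.001 under that idealization, which shows how a chain of individually weak stages can reach a low overall rate.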


18/20

Cascade improvements

The cascade provides comparable accuracy, but ten times the speed.


19/20

Results

Good accuracy with very fast evaluation.

0.067 Seconds per image. An average of 8 out of 4297 features evaluated.


20/20

Detection examples
