ICCV2009: Max-Margin Additive Classifiers for Detection

Max-Margin Additive Classifiers for Detection Subhransu Maji & Alexander Berg University of California at Berkeley Columbia University ICCV 2009, Kyoto, Japan


Transcript of ICCV2009: Max-Margin Additive Classifiers for Detection

Page 1: ICCV2009: Max-Margin Additive Classifiers for Detection

Max-Margin Additive Classifiers for Detection

Subhransu Maji & Alexander Berg

University of California at Berkeley, Columbia University

ICCV 2009, Kyoto, Japan

Page 2: ICCV2009: Max-Margin Additive Classifiers for Detection

Accuracy vs. Evaluation Time for SVM Classifiers

[Plot: accuracy against evaluation time. Labels: Non-linear Kernel, Linear Kernel]

Page 3: ICCV2009: Max-Margin Additive Classifiers for Detection

Accuracy vs. Evaluation Time for SVM Classifiers

[Plot: accuracy against evaluation time. Labels: Non-linear Kernel, Linear Kernel. Annotation: Our CVPR 08]

Page 4: ICCV2009: Max-Margin Additive Classifiers for Detection

Accuracy vs. Evaluation Time for SVM Classifiers

[Plot: accuracy against evaluation time. Labels: Non-linear Kernel, Additive Kernel, Linear Kernel. Annotation: Our CVPR 08]

Page 5: ICCV2009: Max-Margin Additive Classifiers for Detection

Accuracy vs. Evaluation Time for SVM Classifiers

[Plot: accuracy against evaluation time. Labels: Non-linear Kernel, Additive Kernel (labeled twice), Linear Kernel. Annotation: Our CVPR 08]

Page 6: ICCV2009: Max-Margin Additive Classifiers for Detection

Accuracy vs. Evaluation Time for SVM Classifiers

[Plot: accuracy against evaluation time. Labels: Non-linear Kernel, Additive Kernel (labeled twice), Linear Kernel]

Our CVPR 08: Made it possible to use SVMs with additive kernels for detection.

Page 7: ICCV2009: Max-Margin Additive Classifiers for Detection

Additive Classifiers

• Much work already uses them!
  – SVMs with additive kernels are additive classifiers

• Histogram based kernels
  – Histogram intersection, chi-squared kernel
  – Pyramid Match Kernel (Grauman & Darrell, ICCV’05)
  – Spatial Pyramid Match Kernel (Lazebnik et al., CVPR’06)
  – …
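The claim that SVMs with additive kernels are additive classifiers follows in one line. The sketch below assumes the standard kernel SVM decision function; the symbols (support vectors x_i, labels y_i, dual coefficients α_i, bias b, per-dimension kernel k_d, per-dimension function h_d) are notation chosen here, not taken from the slide.

```latex
K(\mathbf{x}, \mathbf{z}) = \sum_{d=1}^{D} k_d(x_d, z_d)
\quad\Longrightarrow\quad
f(\mathbf{x}) = \sum_{i} \alpha_i y_i \, K(\mathbf{x}, \mathbf{x}_i) + b
  = \sum_{d=1}^{D} \underbrace{\sum_{i} \alpha_i y_i \, k_d(x_d, x_{i,d})}_{h_d(x_d)} + b
```

For histogram intersection, k_d(s, t) = min(s, t), so the classifier is a sum of one-dimensional functions h_d, one per feature dimension.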

Page 8: ICCV2009: Max-Margin Additive Classifiers for Detection

Accuracy vs. Training Time for SVM Classifiers

[Plot: accuracy against training time. Labels: Linear Kernel, Non-linear]

Page 9: ICCV2009: Max-Margin Additive Classifiers for Detection

Accuracy vs. Training Time for SVM Classifiers

[Plot: accuracy against training time. Labels: Linear, Non-linear. Annotation: <=1990s]

Page 10: ICCV2009: Max-Margin Additive Classifiers for Detection

Accuracy vs. Training Time for SVM Classifiers

[Plot: accuracy against training time. Labels: Linear, Non-linear. Annotation: Today]

E.g. Cutting Plane, Stochastic Gradient Descent, Dual Coordinate Descent

Page 11: ICCV2009: Max-Margin Additive Classifiers for Detection

Accuracy vs. Training Time for SVM Classifiers

[Plot: accuracy against training time. Labels: Linear, Additive, Non-linear. Annotation: Our CVPR 08]

Page 12: ICCV2009: Max-Margin Additive Classifiers for Detection

Accuracy vs. Training Time for SVM Classifiers

[Plot: accuracy against training time. Labels: Linear, Non-linear, Additive. Annotation: Our CVPR 08]

Page 13: ICCV2009: Max-Margin Additive Classifiers for Detection

Accuracy vs. Training Time for SVM Classifiers

[Plot: accuracy against training time. Labels: Linear, Non-linear, Additive. Annotation: This Paper]

Page 14: ICCV2009: Max-Margin Additive Classifiers for Detection

Accuracy vs. Training Time for SVM Classifiers

[Plot: accuracy against training time. Labels: Linear, Non-linear, Additive]

This Paper: Makes it possible to train additive classifiers very fast.

Page 15: ICCV2009: Max-Margin Additive Classifiers for Detection

Summary

• Additive classifiers are widely used and can provide better accuracy than linear

• Our CVPR 08: SVMs with additive kernels are additive classifiers and can be evaluated in O(#dim) -- same as linear.

• This work: additive classifiers can be trained directly as efficiently (up to a small constant) as the best approaches for training linear classifiers.

An example

                 Additive Kernel SVM   Our Additive Classifier   Linear SVM
Time (train)     1000                  10                        10
Time (test)      1000                  1                         1
Accuracy         95 %                  94 %                      82 %
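The O(#dim) evaluation cost mentioned in the summary can be illustrated for a single dimension. This is a minimal sketch, not the authors' code: it assumes the per-dimension function has the min-kernel form h(s) = Σ c_i · min(s, x_i), precomputes h on a grid once, and then answers each query by linear interpolation. The names `h_exact`, `build_table`, `h_lookup` and the toy coefficients are hypothetical.

```python
from bisect import bisect_right

def h_exact(s, coeffs, support):
    """Direct evaluation of h(s) = sum_i c_i * min(s, x_i); cost grows with #support vectors."""
    return sum(c * min(s, x) for c, x in zip(coeffs, support))

def build_table(coeffs, support, grid):
    """Precompute h at every grid point (done once, after training)."""
    return [h_exact(g, coeffs, support) for g in grid]

def h_lookup(s, grid, table):
    """Evaluate h(s) by linear interpolation between the two neighbouring grid points."""
    j = min(max(bisect_right(grid, s) - 1, 0), len(grid) - 2)
    t = (s - grid[j]) / (grid[j + 1] - grid[j])
    return (1 - t) * table[j] + t * table[j + 1]

# Toy 1-D example with hypothetical coefficients c_i = alpha_i * y_i.
coeffs = [0.5, -1.0, 2.0]
support = [0.25, 0.5, 0.75]
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
table = build_table(coeffs, support, grid)

value = h_lookup(0.6, grid, table)  # 0.825, same as h_exact(0.6, coeffs, support)
```

Here h is piecewise linear with breakpoints at the support values, and every support value lies on the grid, so the lookup is exact. With a coarser grid it becomes an approximation, and with a uniform grid the bin index can be computed directly instead of searched.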

Page 16: ICCV2009: Max-Margin Additive Classifiers for Detection

Support Vector Machines

Kernel Function
• Inner product in the embedded space
• Can learn non-linear boundaries in input space

Classification Function

Kernel Trick

[Figure: Input Space and Embedded Space]
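The formulas on this slide did not survive in the transcript. In standard SVM notation (an assumption here: Φ is the embedding, x_i the support vectors, y_i their labels, α_i the dual coefficients, b the bias), the kernel function and classification function are usually written as:

```latex
K(\mathbf{x}, \mathbf{z}) = \langle \Phi(\mathbf{x}), \Phi(\mathbf{z}) \rangle,
\qquad
f(\mathbf{x}) = \sum_{i} \alpha_i y_i \, K(\mathbf{x}, \mathbf{x}_i) + b
  = \langle \mathbf{w}, \Phi(\mathbf{x}) \rangle + b,
\qquad
\mathbf{w} = \sum_{i} \alpha_i y_i \, \Phi(\mathbf{x}_i)
```

The kernel trick is that f can be evaluated through K alone, without ever forming Φ explicitly.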

Page 17: ICCV2009: Max-Margin Additive Classifiers for Detection

Embeddings…

• These embeddings can be high dimensional (even infinite)

• Our approach is based on embeddings that approximate kernels.

• We’d like this to be as accurate as possible

• We are going to use fast linear classifier training algorithms on the embedded features, so sparseness is important.

Page 18: ICCV2009: Max-Margin Additive Classifiers for Detection

Key Idea: Embedding an Additive Kernel

• Additive kernels are easy to embed: just embed each dimension independently

• Linear embedding for the min kernel for integers

• For non-integers, can approximate by quantizing
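The three bullets above can be sketched in a few lines. This assumes the integer embedding is the unary code (n ones followed by zeros), whose dot product reproduces min exactly. The slide's own formula is not in the transcript, and the function names below are hypothetical.

```python
def unary(n, size):
    """Unary code of an integer n (0 <= n <= size): n ones followed by zeros."""
    return [1] * n + [0] * (size - n)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def quantized(x, bins):
    """Encode x in [0, 1] by rounding bins * x to the nearest integer."""
    return unary(int(round(x * bins)), bins)

def encode_vector(xs, bins):
    """Additive kernel: embed each dimension independently and concatenate."""
    code = []
    for x in xs:
        code.extend(quantized(x, bins))
    return code

# Integers: the dot product of unary codes equals min exactly.
exact = dot(unary(3, 8), unary(5, 8))  # 3 == min(3, 5)

# Non-integers: quantizing first only approximates min(x, y).
bins = 10
approx = dot(quantized(0.32, bins), quantized(0.58, bins)) / bins  # 0.3, while min is 0.32

# Two dimensions: approximates min(0.2, 0.5) + min(0.7, 0.4) = 0.6.
multi = dot(encode_vector([0.2, 0.7], bins), encode_vector([0.5, 0.4], bins)) / bins
```

The gap between 0.3 and 0.32 in the second case is the quantization error.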

Page 19: ICCV2009: Max-Margin Additive Classifiers for Detection

Issues: Embedding Error

• Quantization leads to large errors

• Better encoding

[Figure labels: x, y]
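The slide's figure for the better encoding is not in the transcript. One encoding with lower error, assumed here for illustration, keeps the fractional remainder in the first partially filled bin instead of rounding it away. The name `frac_encode` is hypothetical.

```python
import math

def frac_encode(x, bins):
    """Ones for every full bin below x, then the fractional remainder, then zeros."""
    scaled = x * bins
    full = min(int(math.floor(scaled)), bins)
    code = [0.0] * bins
    for j in range(full):
        code[j] = 1.0
    if full < bins:
        code[full] = scaled - full  # fractional part, in [0, 1)
    return code

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

bins = 10
# Different bins: the dot product recovers min(x, y) up to floating-point rounding.
different = dot(frac_encode(0.32, bins), frac_encode(0.58, bins)) / bins  # ~0.32
# Same bin: full bins plus the product of the remainders, 3 + 0.2 * 0.6.
same = dot(frac_encode(0.32, bins), frac_encode(0.36, bins)) / bins      # ~0.312
```

With this encoding the error is zero when x and y fall in different bins. When they share a bin, with remainders a and b, the error is (min(a, b) - a·b) / bins, which is at most 0.25 / bins.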

Page 20: ICCV2009: Max-Margin Additive Classifiers for Detection

Issues: Sparsity

• Represent with sparse values
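The slide's formula is missing from the transcript. As an illustration of what a sparse representation can look like, the sketch below assumes an interpolation-style encoding with at most two non-zeros per dimension, so that a linear classifier on the code is a piecewise linear function of x. The names `sparse_encode` and `sparse_dot` and the weights are hypothetical.

```python
import math

def sparse_encode(x, bins):
    """Return {index: value} with at most two non-zeros for x in [0, 1].

    Indices refer to the bins + 1 grid points 0, 1/bins, ..., 1.
    """
    scaled = x * bins
    a = min(int(math.floor(scaled)), bins - 1)
    alpha = scaled - a
    code = {a: 1.0 - alpha, a + 1: alpha}
    return {j: v for j, v in code.items() if v != 0.0}

def sparse_dot(w, code):
    """Dot product that touches only the non-zero entries."""
    return sum(w[j] * v for j, v in code.items())

w = [0.0, 1.0, 4.0, 9.0, 16.0]  # hypothetical weights, one per grid point
code = sparse_encode(0.625, 4)  # {2: 0.5, 3: 0.5}
score = sparse_dot(w, code)     # 6.5, halfway between w[2] and w[3]
```

The cost of each dot product is then independent of the number of bins, which is what matters to a linear solver that iterates over non-zero features.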

Page 21: ICCV2009: Max-Margin Additive Classifiers for Detection

Linear vs. Encoded SVMs

• Linear SVM objective (solve with LIBLINEAR): [equation not in transcript]

• Encoded SVM objective (not practical): [equation not in transcript]
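For orientation, the standard L2-regularized hinge-loss objective that LIBLINEAR can solve is written out below, followed by the same objective with each input replaced by its encoding. This is the textbook form, assumed here because the slide's own equations are not in the transcript. C is the usual trade-off parameter, φ denotes the encoding, and the bias term is omitted.

```latex
\min_{\mathbf{w}} \; \frac{1}{2}\,\mathbf{w}^{\top}\mathbf{w}
  + C \sum_{i} \max\bigl(0,\; 1 - y_i\,\mathbf{w}^{\top}\mathbf{x}_i\bigr)
  \qquad \text{(linear SVM)}

\min_{\mathbf{w}} \; \frac{1}{2}\,\mathbf{w}^{\top}\mathbf{w}
  + C \sum_{i} \max\bigl(0,\; 1 - y_i\,\mathbf{w}^{\top}\phi(\mathbf{x}_i)\bigr)
  \qquad \text{(encoded SVM)}
```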

Page 22: ICCV2009: Max-Margin Additive Classifiers for Detection

Linear vs. Encoded SVMs

• Linear SVM objective (solve with LIBLINEAR): [equation not in transcript]

• Encoded SVM modified (custom solver): [equation not in transcript]

Encourages smooth functions

Closely approximates min kernel SVM

Custom solver : PWLSGD (see paper)

Page 23: ICCV2009: Max-Margin Additive Classifiers for Detection

Linear vs. Encoded SVMs

• Linear SVM objective (solve with LIBLINEAR): [equation not in transcript]

• Encoded SVM objective (solve with LIBLINEAR): [equation not in transcript]

Page 24: ICCV2009: Max-Margin Additive Classifiers for Detection

Additive Classifier Choices

                          Encoding
Regularization            linear    piecewise linear
I                         ✔         ✔
[label not captured]      ✔         ✔

Also labeled on the slide: IKSVM

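Page 25: ICCV2009: Max-Margin Additive Classifiers for Detection

Additive Classifier Choices

[Table: Regularization (rows: I, plus one row whose label was not captured) by Encoding (columns: linear, piecewise linear); all four cells marked ✔; IKSVM labeled alongside]

Accuracy Increases

Evaluation times are similar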

Page 26: ICCV2009: Max-Margin Additive Classifiers for Detection

Additive Classifier Choices

[Table: Regularization (rows: I, plus one row whose label was not captured) by Encoding (columns: linear, piecewise linear); all four cells marked ✔; IKSVM labeled alongside]

Accuracy Increases (shown twice)

Evaluation times are similar

Page 27: ICCV2009: Max-Margin Additive Classifiers for Detection

Additive Classifier Choices

[Table: Regularization (rows: I, plus one row whose label was not captured) by Encoding (columns: linear, piecewise linear); all four cells marked ✔; IKSVM labeled alongside]

Accuracy Increases (shown twice)

Few lines of code + standard solver, e.g. LIBLINEAR

Standard solver, e.g. LIBSVM

Page 28: ICCV2009: Max-Margin Additive Classifiers for Detection

Additive Classifier Choices

[Table: Regularization (rows: I, plus one row whose label was not captured) by Encoding (columns: linear, piecewise linear); all four cells marked ✔; IKSVM labeled alongside]

Accuracy Increases (shown twice)

Custom solver

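Page 29: ICCV2009: Max-Margin Additive Classifiers for Detection

Additive Classifier Choices

[Table: Regularization (rows: I, plus one row whose label was not captured) by Encoding (columns: linear, piecewise linear); cell contents not captured; IKSVM labeled alongside]

Accuracy Increases (shown twice)

Classifier Notations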

Page 30: ICCV2009: Max-Margin Additive Classifiers for Detection

Experiments

• “Small” Scale: Caltech 101 (Fei-Fei et al.)

• “Medium” Scale: DC Pedestrians (Munder & Gavrila)

• “Large” Scale: INRIA Pedestrians (Dalal & Triggs)

Page 31: ICCV2009: Max-Margin Additive Classifiers for Detection

Experiment : DC Pedestrians

20,000 features, 656 dimensional

100 bins for encoding

6-fold cross validation

[Plot points (training time, accuracy): (1.89s, 72.98%), (2.98s, 85.71%), (1.86s, 88.80%), (3.18s, 89.25%), (363s, 89.05%)]

100x faster

training time ~ linear SVM

accuracy ~ kernel SVM

Page 32: ICCV2009: Max-Margin Additive Classifiers for Detection

Experiment : Caltech 101

30 training examples per category

100 bins for encoding

Pyramid HOG + Spatial Pyramid Match Kernel

[Plot points (training time, accuracy): (41s, 46.15%), (2687s, 56.49%), (291s, 55.35%), (102s, 54.8%), (90s, 51.64%)]

10x faster

Small loss in accuracy

Page 33: ICCV2009: Max-Margin Additive Classifiers for Detection

Experiment : INRIA Pedestrians

SPHOG: 39,000 features, 2268 dimensional

100 bins for encoding

Cross Validation Plots

[Plot points (training time, score): (20s, 0.82), (27s, 0.88), (140 mins, 0.95), (76s, 0.94), (122s, 0.85)]

300x faster

training time ~ linear SVM

accuracy ~ kernel SVM

trains the detector in < 2 mins

Page 34: ICCV2009: Max-Margin Additive Classifiers for Detection

Experiment : INRIA Pedestrians

SPHOG: 39,000 features, 2268 dimensional

100 bins for encoding

Cross Validation Plots

300x faster

training time ~ linear SVM

accuracy ~ kernel SVM

trains the detector in < 2 mins

Page 35: ICCV2009: Max-Margin Additive Classifiers for Detection

Take Home Messages

• Additive models are practical for large scale data

• Can be trained discriminatively:
  – Poor man’s version : encode + Linear SVM Solver
  – Middle man’s version : encode + Custom Solver
  – Rich man’s version : Min Kernel SVM

• Embedding only approximates kernels, which leads to a small loss in accuracy but up to 100x speedup in training time

• Everyone should use: see code on our websites
  – Fast IKSVM from CVPR’08, Encoded SVMs, etc.

Page 36: ICCV2009: Max-Margin Additive Classifiers for Detection

Thank You