Lecture 15 -Fei-Fei Li
Lecture 15: Object recognition:
Part‐based generative models
Professor Fei‐Fei Li
Stanford Vision Lab
8‐Nov‐11
What we will learn today?
• Introduction
• Constellation model
  – Weakly supervised training
  – One‐shot learning
• (Problem Set 4 (Q1))
Challenges: intra‐class variation
Usual Challenges:
Variability due to:
• View point
• Illumination
• Occlusions
Basic issues
• Representation
  – 2D Bag of Words (BoW) models
  – Part‐based models
  – Multi‐view models (Lecture #19)
• Learning
  – Generative & Discriminative BoW models
  – Generative models
  – Probabilistic Hough voting
• Recognition
  – Classification with BoW
  – Classification with Part‐based models
Problem with bag‐of‐words
• All spatial arrangements of the same features have equal probability under bag‐of‐words methods
• Location information is important
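A minimal sketch of why location matters, assuming a toy set of hypothetical detections (the visual words and coordinates are made up for illustration): a bag‐of‐words histogram is unchanged when the same features are moved to different positions.

```python
from collections import Counter
import random

def bow_histogram(features):
    """Count visual words, ignoring where each one was observed."""
    return Counter(word for word, _position in features)

# Hypothetical detections: (visual word, (x, y) position).
face = [("eye", (30, 40)), ("eye", (70, 40)), ("nose", (50, 60)), ("mouth", (50, 80))]

# Scramble the layout: same words, positions reassigned at random.
rng = random.Random(0)
positions = [pos for _word, pos in face]
rng.shuffle(positions)
scrambled = [(word, pos) for (word, _), pos in zip(face, positions)]

# The two layouts produce identical histograms, so BoW cannot tell them apart.
same_histogram = bow_histogram(face) == bow_histogram(scrambled)
print(same_histogram)
```

A part‐based model avoids this by scoring the positions as well as the identities of the features.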
Model: Parts and Structure
Parts and Structure Literature
• Fischler & Elschlager 1973
• Yuille ‘91
• Brunelli & Poggio ‘93
• Lades, v.d. Malsburg et al. ‘93
• Cootes, Lanitis, Taylor et al. ‘95
• Amit & Geman ‘95, ‘99
• Perona et al. ‘95, ‘96, ’98, ’00, ‘03
• Huttenlocher et al. ’00
• Agarwal & Roth ’02
etc…
The Constellation Model
• T. Leung: Representation. Shape statistics – F&G ’95; Affine invariant shape – CVPR ‘98
• M. Burl: Detection. CVPR ‘96; ECCV ‘98
• M. Weber, M. Welling: Unsupervised Learning. ECCV ‘00; Multiple views - F&G ’00; Discovering categories - CVPR ’00
• R. Fergus: Joint shape & appearance learning; Generic feature detectors. CVPR ’03; Polluted datasets - ECCV ‘04
• L. Fei-Fei: One-Shot Learning; Incremental learning. ICCV ’03; CVPR ‘04
Deformations
[Figure: parts A, B, C and D under deformation]
Presence / Absence of Features
occlusion
Background clutter
Generative probabilistic model
Foreground model
• Gaussian shape pdf
• Prob. of detection (e.g. 0.8, 0.75, 0.9 for three parts)
Clutter model
• Uniform shape pdf
• # detections: $p_{\text{Poisson}}(N_1 \mid \lambda_1)$, $p_{\text{Poisson}}(N_2 \mid \lambda_2)$, $p_{\text{Poisson}}(N_3 \mid \lambda_3)$
Assumptions:
(a) Clutter independent of foreground detections
(b) Clutter detections independent of each other
Example
1. Object part positions
2. Part absence
3a. Number of false detections ($N_1$, $N_2$, $N_3$)
3b. Position of false detections
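The four ingredients of the example (part positions, part absence, number and position of false detections) can be sketched as a sampler. This is an illustration under stated assumptions, not the lecture's implementation: the part means, noise level, clutter rates and image size are made up, and the shape pdf is simplified to independent isotropic Gaussians rather than a joint Gaussian over all parts. Only the detection probabilities 0.8, 0.75 and 0.9 come from the slide.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative values (assumptions), except p_detect which is from the slide.
part_means = np.array([[30.0, 40.0], [70.0, 40.0], [50.0, 70.0]])  # mean (x, y) per part
part_std = 3.0                              # isotropic shape noise (simplification)
p_detect = np.array([0.8, 0.75, 0.9])       # prob. that each part is detected
clutter_rate = np.array([2.0, 1.0, 3.0])    # Poisson mean of false detections per part type
image_size = 100.0

def sample_image():
    """Draw the detections of every part type: foreground plus clutter."""
    detections = []
    for p in range(len(part_means)):
        points = []
        # 1. and 2.: the part is present with prob. p_detect, at a Gaussian position.
        if rng.random() < p_detect[p]:
            points.append(rng.normal(part_means[p], part_std))
        # 3a.: number of false detections is Poisson; 3b.: their positions are uniform.
        n_false = rng.poisson(clutter_rate[p])
        points.extend(rng.uniform(0.0, image_size, size=(n_false, 2)))
        detections.append(np.array(points).reshape(-1, 2))
    return detections

sample = sample_image()
for p, pts in enumerate(sample):
    print(f"part type {p + 1}: {len(pts)} detections")
```

Because clutter is drawn independently of the foreground and of other clutter points, the sampler respects assumptions (a) and (b).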
Learning Models ‘Manually’
• Obtain set of training images
• Choose parts
• Label parts by hand, train detectors
• Learn model from labeled parts
Recognition
1. Run part detectors exhaustively over image
   Hypothesis: $h = (h_1, h_2, h_3, h_4)$ with $h_p \in \{0, \ldots, N_p\}$, e.g. $h = (2, 0, 3, 2)$
2. Try different combinations of detections in model
   - Allow detections to be missing (occlusion)
3. Pick hypothesis which maximizes:
   $\dfrac{p(\text{Data} \mid \text{Object}, \text{Hyp})}{p(\text{Data} \mid \text{Clutter}, \text{Hyp})}$
4. If ratio is above threshold, then instance detected
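The recognition steps can be sketched as an exhaustive search over hypotheses. The sketch makes several simplifying assumptions that are not from the lecture: parts are scored with independent isotropic Gaussians, clutter is uniform over a 100 x 100 image, the Poisson count terms are ignored, and the threshold of 1.0 is arbitrary. A hypothesis entry of 0 marks a missing part.

```python
import itertools
import numpy as np

# Illustrative model parameters (assumptions).
part_means = np.array([[30.0, 40.0], [70.0, 40.0], [50.0, 70.0]])
part_std = 3.0
p_detect = np.array([0.8, 0.75, 0.9])
clutter_density = 1.0 / (100.0 * 100.0)

def gaussian_density(x, mean, std):
    """Isotropic 2D Gaussian density."""
    sq_dist = np.sum((x - mean) ** 2)
    return np.exp(-0.5 * sq_dist / std**2) / (2.0 * np.pi * std**2)

def likelihood_ratio(hypothesis, detections):
    """Simplified p(Data | Object, Hyp) / p(Data | Clutter, Hyp).

    hypothesis[p] = 0 means part p is missing; k > 0 selects the k-th
    detection of part type p as the foreground part.
    """
    ratio = 1.0
    for p, choice in enumerate(hypothesis):
        if choice == 0:
            ratio *= 1.0 - p_detect[p]  # part occluded or missed
        else:
            x = detections[p][choice - 1]
            ratio *= p_detect[p] * gaussian_density(x, part_means[p], part_std) / clutter_density
    return ratio

def best_hypothesis(detections):
    """Try every combination of detections, allowing missing parts."""
    ranges = [range(len(d) + 1) for d in detections]
    return max(itertools.product(*ranges), key=lambda h: likelihood_ratio(h, detections))

# Hypothetical detector output: N1 = 2, N2 = 0, N3 = 3 candidate positions.
detections = [
    np.array([[90.0, 10.0], [31.0, 41.0]]),
    np.empty((0, 2)),
    np.array([[5.0, 5.0], [80.0, 95.0], [49.0, 69.0]]),
]
h = best_hypothesis(detections)
ratio = likelihood_ratio(h, detections)
threshold = 1.0  # assumed decision threshold
print(h, ratio > threshold)
```

The number of hypotheses grows as the product of (N_p + 1) over all parts, which is why efficient search techniques matter in practice.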
So far…
• Representation
  – Joint model of part locations
  – Ability to deal with background clutter and occlusions
• Learning
  – Manual construction of part detectors
  – Estimate parameters of shape density
• Recognition
  – Run part detectors over image
  – Try combinations of features in model
  – Use efficient search techniques to make fast
Unsupervised Learning
Weber & Welling et al.
(Semi) Unsupervised learning
• Know if image contains object or not
• But no segmentation of object or manual selection of features
Unsupervised detector training - 1
• Highly textured neighborhoods are selected automatically
• Produces 100-1000 patterns per image
[Figure: example pattern patch, both sides labelled 10]
Unsupervised detector training - 2
“Pattern Space” (100+ dimensions)
Unsupervised detector training - 3
100-1000 images → ~100 detectors
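One plausible way to turn many patterns into a small set of detectors is to cluster them in pattern space and keep one template per cluster. The sketch below uses plain k-means, which is an assumption (the slides do not name the clustering method), on synthetic 100-dimensional vectors that stand in for flattened patches. A small k keeps the demo fast; the slide quotes about 100 detectors.

```python
import numpy as np

def kmeans(patterns, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    centers = patterns[rng.choice(len(patterns), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Assign each pattern to its nearest center.
        dists = np.linalg.norm(patterns[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its members (leave it if empty).
        for j in range(k):
            members = patterns[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return centers, labels

# Synthetic stand-in for patterns pooled over the training images.
rng = np.random.default_rng(1)
patterns = rng.normal(size=(600, 100))
detectors, labels = kmeans(patterns, k=10)
print(detectors.shape)  # one template per cluster
```

Each cluster center can then be used as a template that is matched against new images.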
Learning
• Task: Estimation of model parameters
• Take training images. Pick set of detectors. Apply detectors.
• Chicken and Egg type problem, since we initially know neither:
  - Model parameters
  - Assignment of regions to foreground / background
• Let the assignments be a hidden variable and use EM algorithm to learn them and the model parameters
ML using EM
1. Current estimate
   [Figure: current estimate overlaid on Image 1, Image 2, …, Image i]
2. Assign probabilities to constellations (large P for likely ones, small P for unlikely ones)
3. Use probabilities as weights to re-estimate parameters. Example:
   (Large P) · x + (Small P) · x + … = new estimate of $\mu$
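The weighted re-estimation in step 3 can be sketched for the simplest possible case. The assumptions here are mine, not the lecture's: each "constellation" is a single 2D point, exactly one candidate per image comes from the object, the Gaussian is isotropic with a fixed standard deviation, and only the mean is re-estimated. All data are synthetic.

```python
import numpy as np

def em_update_mean(candidates_per_image, mean, std):
    """One EM iteration for the mean of an isotropic Gaussian.

    candidates_per_image: list of (n_i, 2) arrays of candidate positions.
    """
    weighted_sum = np.zeros(2)
    for candidates in candidates_per_image:
        # E-step: probability of each candidate under the current estimate.
        sq_dist = np.sum((candidates - mean) ** 2, axis=1)
        log_w = -0.5 * sq_dist / std**2
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        # Accumulate "Large P * x + Small P * x + ...".
        weighted_sum += w @ candidates
    # M-step: weights sum to 1 in every image, so divide by the image count.
    return weighted_sum / len(candidates_per_image)

rng = np.random.default_rng(0)
true_mean = np.array([50.0, 60.0])
images = []
for _ in range(200):
    foreground = rng.normal(true_mean, 3.0)[None, :]
    clutter = rng.uniform(0.0, 100.0, size=(3, 2))
    images.append(np.vstack([foreground, clutter]))

mean = np.array([40.0, 50.0])  # deliberately poor starting estimate
for _ in range(20):
    mean = em_update_mean(images, mean, std=10.0)
print(mean)
```

The assignments are never made explicitly; each candidate simply contributes in proportion to its probability.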
Detector Selection
• Try out different combinations of detectors (greedy search)
Detectors (100) → Choice 1 → Parameter Estimation → Model 1
Detectors (100) → Choice 2 → Parameter Estimation → Model 2
Predict / measure model performance (validation set or directly from model)
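A greedy search over detector combinations might look like the forward selection below. This is only a sketch: `score` is a hypothetical callback standing in for "estimate parameters, then measure model performance", and the toy usefulness values are made up. The lecture does not specify the exact search procedure.

```python
def greedy_select(detector_ids, n_parts, score):
    """Greedy forward selection: repeatedly add the detector that helps most."""
    chosen = []
    remaining = list(detector_ids)
    for _ in range(n_parts):
        best = max(remaining, key=lambda d: score(chosen + [d]))
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Toy score: each detector has a made-up usefulness; interactions are ignored.
usefulness = {0: 0.2, 1: 0.9, 2: 0.5, 3: 0.7, 4: 0.1}

def toy_score(subset):
    return sum(usefulness[d] for d in subset)

selected = greedy_select(usefulness.keys(), n_parts=3, score=toy_score)
print(selected)  # [1, 3, 2]
```

Each call to `score` would involve a full parameter estimation in the real system, which is why learning is slow when many combinations are tried.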
Frontal Views of Faces
• 200 Images (100 training, 100 testing)
• 30 people, different for training and testing
Learned face model
Test Error: 6% (4 Parts)
[Figure panels: Pre-selected Parts; Parts in Model; Model Foreground pdf; Sample Detection]
Face images
Background images
Car from Rear
Test Error: 13% (5 Parts)
[Figure panels: Preselected Parts; Parts in Model; Model Foreground pdf; Sample Detection]
Detections of Cars
Background Images
3D Object recognition – Multiple mixture components
3D Orientation Tuning
[Figure: orientation tuning curve, % correct (50 to 100) versus angle in degrees (0 to 100), from frontal to profile views]
So far (2)…
• Representation
  – Multiple mixture components for different viewpoints
• Learning
  – Now semi-unsupervised
  – Automatic construction and selection of part detectors
  – Estimation of parameters using EM
• Recognition
  – As before
• Issues:
  – Learning is slow (many combinations of detectors)
  – Appearance learnt first, then shape
Issues
• Speed of learning
  – Slow (many combinations of detectors)
• Appearance learnt first, then shape
  – Difficult to learn part that has stable location but variable appearance
  – Each detector is used as a cross-correlation filter, giving a hard definition of the part’s appearance
• Would like a fully probabilistic representation of the object
Object categorization
Fergus et al.
CVPR ’03, IJCV ‘06
Detection & Representation of regions
• Find regions within image
• Use salient region operator (Kadir & Brady 01)
• Location: (x,y) coords. of region centre
• Scale: radius of region (pixels)
• Appearance: 11x11 patch → normalize → projection onto PCA basis → coefficients (c1, c2, …, c15)
Gives representation of appearance in low-dimensional vector space
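The appearance pipeline (11x11 patch, normalize, project onto a PCA basis, keep 15 coefficients) can be sketched with NumPy. The normalization used here (zero mean, unit norm) is an assumed choice, the function names are hypothetical, and random arrays stand in for real patches cropped around salient regions.

```python
import numpy as np

def normalize_patch(patch):
    """Zero-mean, unit-norm intensity normalization (one common choice)."""
    flat = patch.astype(float).ravel()
    flat = flat - flat.mean()
    norm = np.linalg.norm(flat)
    return flat / norm if norm > 0 else flat

def fit_pca_basis(patches, n_components=15):
    """Learn a PCA basis from flattened patches of shape (n, 121)."""
    mean = patches.mean(axis=0)
    # Rows of vt are principal directions, sorted by decreasing variance.
    _u, _s, vt = np.linalg.svd(patches - mean, full_matrices=False)
    return mean, vt[:n_components]

def describe(patch, mean, basis):
    """Project a normalized 11x11 patch onto the basis: (c1, ..., c15)."""
    return basis @ (normalize_patch(patch) - mean)

# Random stand-in patches; real ones would be rescaled salient regions.
rng = np.random.default_rng(0)
training = np.array([normalize_patch(p) for p in rng.random((500, 11, 11))])
mean, basis = fit_pca_basis(training)
coeffs = describe(rng.random((11, 11)), mean, basis)
print(coeffs.shape)  # (15,)
```

Reducing 121 pixel values to 15 coefficients keeps the Gaussian appearance densities small enough to estimate from a few hundred images.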
Motorbikes example
• Kadir & Brady saliency region detector
Generative probabilistic model (2)
Foreground model
• Gaussian shape pdf
• Gaussian part appearance pdf
• Gaussian relative scale pdf (over log(scale))
• Prob. of detection (e.g. 0.8, 0.75, 0.9)
Clutter model
• Uniform shape pdf
• Gaussian background appearance pdf
• Uniform relative scale pdf (over log(scale))
• Poisson pdf on # detections
based on Burl, Weber et al. [ECCV ’98, ’00]
Motorbikes
Samples from appearance model
Recognized Motorbikes
Background images evaluated with motorbike model
Frontal faces
Airplanes
Spotted cats
Summary of results
Dataset         Fixed scale experiment   Scale invariant experiment
Motorbikes      7.5                      6.7
Faces           4.6                      4.6
Airplanes       9.8                      7.0
Cars (Rear)     15.2                     9.7
Spotted cats    10.0                     10.0
(% equal error rate)
Note: Within each series, same settings used for all datasets
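The equal error rate is the error at the decision threshold where the miss rate and the false alarm rate coincide. A minimal sketch of how it might be computed from per-image scores follows; the scores are made up and the simple threshold scan is one of several reasonable ways to locate the crossing point.

```python
import numpy as np

def equal_error_rate(object_scores, background_scores):
    """Error rate at the threshold where misses and false alarms are closest.

    Every observed score is tried as a threshold; an image is declared
    "object present" when its score is >= the threshold.
    """
    thresholds = np.sort(np.concatenate([object_scores, background_scores]))
    best_gap, eer = np.inf, None
    for t in thresholds:
        miss_rate = np.mean(object_scores < t)              # objects rejected
        false_alarm_rate = np.mean(background_scores >= t)  # background accepted
        gap = abs(miss_rate - false_alarm_rate)
        if gap < best_gap:
            best_gap, eer = gap, 0.5 * (miss_rate + false_alarm_rate)
    return eer

# Made-up likelihood-ratio scores for 5 object and 5 background images.
obj = np.array([0.9, 0.8, 0.7, 0.4, 0.95])
bg = np.array([0.1, 0.3, 0.5, 0.2, 0.05])
eer = equal_error_rate(obj, bg)
print(f"{100 * eer:.1f}% equal error rate")
```

With these toy scores, one object image scores below one background image, so one of five is wrong on each side at the crossing threshold.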
Comparison to other methods
Dataset        Ours    Others
Motorbikes     7.5     16.0   Weber et al. [ECCV ‘00]
Faces          4.6     6.0    Weber
Airplanes      9.8     32.0   Weber
Cars (Side)    11.5    21.0   Agarwal & Roth [ECCV ’02]
(% equal error rate)
Why this design?
• Generic features seem to do well in finding consistent parts of the object
• Some categories perform badly – different feature types needed
• Why PCA representation?
  – Tried ICA, FLD, Oriented filter responses etc.
  – But PCA worked best
• Fully probabilistic representation lets us use tools from machine learning community
S. Savarese, 2003
P. Bruegel, 1562
One-Shot learning
Fei-Fei et al.
ICCV ’03, PAMI ‘06
Algorithm                                      Training Examples   Categories
Burl, et al.; Weber, et al.; Fergus, et al.    200 ~ 400           Faces, Motorbikes, Spotted cats, Airplanes, Cars
Viola et al.                                   ~10,000             Faces
Schneiderman, et al.                           ~2,000              Faces, Cars
Rowley et al.                                  ~500                Faces
Generalisation performance
[Figure: classification error (%), 0 to 60, versus log2(training images), 1 to 9, with train and test curves for a 6 part Motorbike model; the number of training examples used previously is marked]
How do we do better than what statisticians have told us?
• Intuition 1: use Prior information
• Intuition 2: make best use of training information
Prior knowledge: means
[Figure panels: Shape; Appearance]
Bayesian framework
P(object | test, train) vs. P(clutter | test, train)
Bayes Rule:
$p(\text{object} \mid \text{test}, \text{train}) \propto p(\text{test} \mid \text{object}, \text{train})\, p(\text{object})$
Expansion by parametrization:
$p(\text{test} \mid \text{object}, \text{train}) = \int p(\text{test} \mid \theta, \text{object})\, p(\theta \mid \text{object}, \text{train})\, d\theta$
Previous Work: ML point estimate, $p(\theta \mid \text{object}, \text{train}) \approx \delta(\theta - \theta^{\mathrm{ML}})$
One-Shot learning: $p(\theta \mid \text{object}, \text{train}) \propto p(\text{train} \mid \theta, \text{object})\, p(\theta)$
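The difference between plugging in an ML estimate and integrating over the parameters can be seen in a toy 1D case. Everything in this sketch is an assumption made for illustration: the "model" is a Gaussian with known noise variance and unknown mean, the prior is Gaussian (so the posterior is conjugate and available in closed form), and there is a single training example. The actual constellation model has many more parameters.

```python
import numpy as np

def normal_pdf(x, mean, var):
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

# Toy model: test ~ N(theta, noise_var) with unknown mean theta.
noise_var = 1.0
prior_mean, prior_var = 0.0, 4.0   # assumed prior p(theta)
train = np.array([2.0])            # a single training example

# Previous work: plug in the ML point estimate of theta.
theta_ml = train.mean()

# One-shot learning: conjugate Gaussian posterior p(theta | train).
post_var = 1.0 / (1.0 / prior_var + len(train) / noise_var)
post_mean = post_var * (prior_mean / prior_var + train.sum() / noise_var)

def predictive_ml(x):
    return normal_pdf(x, theta_ml, noise_var)

def predictive_bayes(x):
    """Closed form of the integral over theta for this Gaussian case."""
    return normal_pdf(x, post_mean, noise_var + post_var)

def predictive_monte_carlo(x, n_samples=200_000, seed=0):
    """Approximate the same integral by averaging over posterior samples."""
    rng = np.random.default_rng(seed)
    thetas = rng.normal(post_mean, np.sqrt(post_var), size=n_samples)
    return normal_pdf(x, thetas, noise_var).mean()

x_test = 0.5
print(predictive_ml(x_test), predictive_bayes(x_test), predictive_monte_carlo(x_test))
```

With one training example the ML estimate trusts that example completely, while the Bayesian predictive is pulled toward the prior and is wider, which reflects the remaining uncertainty about the parameters.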
Model Structure
[Figure: model (θ) space containing object models θ1, θ2, …, θn]
Each object model θ
• Gaussian shape pdf
• Gaussian part appearance pdf
model distribution: p(θ)
• conjugate distribution of p(train | θ, object)
Learning Model Distribution
• use Prior information
• Bayesian learning
• marginalize over theta
Variational EM (Attias, Hinton, Minka, etc.)
$p(\theta \mid \text{object}, \text{train}) \propto p(\text{train} \mid \theta, \text{object})\, p(\theta)$
Variational EM
[Diagram: random initialization → E-Step → M-Step → new θ’s, repeated; prior knowledge of p(θ) is combined with the data to give a new estimate of p(θ | train)]
Experiments
Training: 1-6 randomly drawn images
Testing: 50 fg / 50 bg images, object present/absent
Datasets: faces, airplanes, motorbikes, spotted cats
Faces
Airplanes
Motorbikes
Spotted cats
Experiments: obtaining priors
[Figure: priors obtained for spotted cats, airplanes, motorbikes and faces, shown in model (θ) space]
[Four result plots; horizontal axis: number of training examples]
Algorithm                                      Training Examples   Categories                                          Results (error)
Burl, et al.; Weber, et al.; Fergus, et al.    200 ~ 400           Faces, Motorbikes, Spotted cats, Airplanes, Cars    5.6 - 10 %
Viola et al.                                   ~10,000             Faces                                               7 - 21 %
Schneiderman, et al.                           ~2,000              Faces, Cars                                         5.6 - 17 %
Rowley et al.                                  ~500                Faces                                               7.5 - 24.1 %
Bayesian One-Shot                              1 ~ 5               Faces, Motorbikes, Spotted cats, Airplanes          8 - 15 %
What we have learned today?
• Introduction
• Constellation model
  – Weakly supervised training
  – One‐shot learning
• (Problem Set 4 (Q1))