How many object categories are there? Biederman 1987.
-
date post
19-Dec-2015 -
Category
Documents
-
view
219 -
download
1
Transcript of How many object categories are there? Biederman 1987.
![Page 1: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/1.jpg)
![Page 2: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/2.jpg)
How many object categories are there?
Biederman 1987
![Page 3: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/3.jpg)
So what does object recognition involve?
![Page 4: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/4.jpg)
Verification: is that a lamp?
![Page 5: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/5.jpg)
Detection: are there people?
![Page 6: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/6.jpg)
Identification: is that Potala Palace?
![Page 7: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/7.jpg)
Object categorization
mountain
building
tree
banner
vendorpeople
street lamp
![Page 8: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/8.jpg)
Scene and context categorization
• outdoor
• city
• …
![Page 9: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/9.jpg)
Computational photography
![Page 10: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/10.jpg)
Assisted driving
meters
met
ers
Ped
Ped
Car
Lane detection
Pedestrian and car detection
• Collision warning systems with adaptive cruise control, • Lane departure warning systems, • Rear object detection systems,
![Page 11: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/11.jpg)
Improving online search
Query:STREET
Organizing photo collections
![Page 12: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/12.jpg)
Challenges 1: view point variation
Michelangelo 1475-1564
![Page 13: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/13.jpg)
Challenges 2: illumination
slide credit: S. Ullman
![Page 14: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/14.jpg)
Challenges 3: occlusion
Magritte, 1957
![Page 15: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/15.jpg)
Challenges 4: scale
![Page 16: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/16.jpg)
Challenges 5: deformation
Xu, Beihong 1943
![Page 17: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/17.jpg)
Challenges 6: background clutter
Klimt, 1913
![Page 18: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/18.jpg)
Challenges 7: intra-class variation
![Page 19: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/19.jpg)
![Page 20: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/20.jpg)
IssuesIssues
• RepresentationHow to represent image, memory?– Low or high level? Concrete or abstract?– Appearance, shape, or both?– Local or global?– Invariance, dimensionality reduction
• Memory contactWhen do you make it?– Bottom up? Top down?– Difficulty of computation
![Page 21: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/21.jpg)
IssuesIssues
• Learning
• Recognition– Discriminative or generative?– Specific object or object category?– One object or many?– Categorization or detection?– Scaling up?
![Page 22: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/22.jpg)
Local, low level Representation
The image itself
![Page 23: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/23.jpg)
Local, low level Representation
![Page 24: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/24.jpg)
Representation
– Part-based
![Page 25: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/25.jpg)
Representation
– Part-based
![Page 26: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/26.jpg)
The correspondence problem• Model with P parts• Image with N possible assignments for each part• Consider mapping to be 1-1
• NP combinations!!!
![Page 27: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/27.jpg)
Deformable Template Matching
QueryTemplate
Berg, Berg and Malik CVPR 2005
• Formulate problem as Integer Quadratic Programming• O(NP) in general• Use approximations that allow P=50 and N=2550 in <2 secs
![Page 28: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/28.jpg)
Jigsaw approach: Borenstein and Ullman, 2002
![Page 29: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/29.jpg)
Globalrepresentation
![Page 30: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/30.jpg)
– Unclear how to model categories, so we learn what distinguishes them rather than manually specify the difference -- hence current interest in machine learning
Learning
![Page 31: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/31.jpg)
– Scale / orientation range to search over – Speed– Context
Recognition
![Page 32: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/32.jpg)
Geons
–Biederman, description of geons. Are they still view invariant when describing a geon?
• 3D shape’s occluding contour depends on viewpoint. May be straight from one view, curved from another.
• Metric properties not truly invariant.
• Maybe more like quasi-invariants.
![Page 33: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/33.jpg)
Geons for Recognition
• Analogy to speech.– 36 different geons.– Different relations between them.– Millions of ways of putting a few geons
together.
![Page 34: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/34.jpg)
Chamfer Matching
id
For every edge point in the transformed object, compute the distance to the nearest image edge point. Sum distances.
![Page 35: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/35.jpg)
![Page 36: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/36.jpg)
Linear subspaces
• Classification can be expensive– Must either search (e.g., nearest neighbors) or store large PDF’s
Suppose the data points are arranged as above• Idea—fit a line, classifier measures distance to line
convert x into v1, v2 coordinates
What does the v2 coordinate measure?
What does the v1 coordinate measure?
- distance to line- use it for classification—near 0 for orange pts
- position along line- use it to specify which orange point it is
![Page 37: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/37.jpg)
Dimensionality reduction
How to find v1 and v2 ?- work out on board
Dimensionality reduction• We can represent the orange points with only their v1 coordinates
– since v2 coordinates are all essentially 0
• This makes it much cheaper to store and compare points
• A bigger deal for higher dimensional problems
![Page 38: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/38.jpg)
Linear subspacesConsider the variation along direction v among all of the orange points:
What unit vector v minimizes var?
What unit vector v maximizes var?
Solution: v1 is eigenvector of A with largest eigenvalue v2 is eigenvector of A with smallest eigenvalue
![Page 39: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/39.jpg)
Principle component analysis
• Suppose each data point is N-dimensional– Same procedure applies:
– The eigenvectors of A define a new coordinate system• eigenvector with largest eigenvalue captures the most variation among
training vectors x• eigenvector with smallest eigenvalue has least variation
– We can compress the data by only using the top few eigenvectors• corresponds to choosing a “linear subspace”
– represent points on a line, plane, or “hyper-plane”
• these eigenvectors are known as the principle components
![Page 40: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/40.jpg)
The space of faces
• An image is a point in a high dimensional space– An N x M image is a point in RNM
– We can define vectors in this space as we did in the 2D case
+=
![Page 41: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/41.jpg)
Dimensionality reduction
• The set of faces is a “subspace” of the set of images– Suppose it is K dimensional– We can find the best subspace using PCA– This is like fitting a “hyper-plane” to the set of faces
• spanned by vectors v1, v2, ..., vK
• any face
![Page 42: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/42.jpg)
Eigenfaces
• PCA extracts the eigenvectors of A– Gives a set of vectors v1, v2, v3, ...
– Each one of these vectors is a direction in face space
• what do these look like?
![Page 43: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/43.jpg)
Projecting onto the eigenfaces
• The eigenfaces v1, ..., vK span the space of faces
– A face is converted to eigenface coordinates by
![Page 44: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/44.jpg)
Recognition with eigenfaces• Algorithm1. Process the image database (set of images with labels)
• Run PCA—compute eigenfaces• Calculate the K coefficients for each image
2. Given a new image (to be recognized) x, calculate K coefficients
3. Detect if x is a face
4. If it is a face, who is it?
• Find closest labeled face in database• nearest-neighbor in K-dimensional space
![Page 45: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/45.jpg)
Choosing the dimension K
K NMi =
eigenvalues
• How many eigenfaces to use?• Look at the decay of the eigenvalues
– the eigenvalue tells you the amount of variance “in the direction” of that eigenface
– ignore eigenfaces with low variance
![Page 46: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/46.jpg)
One simple method: skin detection
• Skin pixels have a distinctive range of colors– Corresponds to region(s) in RGB color space
• for visualization, only R and G components are shown above
skin
Skin classifier• A pixel X = (R,G,B) is skin if it is in the skin region• But how to find this region?
![Page 47: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/47.jpg)
Skin detection
• Learn the skin region from examples– Manually label pixels in one or more “training images” as skin or not skin– Plot the training data in RGB space
• skin pixels shown in orange, non-skin pixels shown in blue• some skin pixels may be outside the region, non-skin pixels inside. Why?
Skin classifier• Given X = (R,G,B): how to determine if it is skin or not?
![Page 48: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/48.jpg)
Skin classification techniques
Skin classifier• Given X = (R,G,B): how to determine if it is skin or not?
• Nearest neighbor– find labeled pixel closest to X– choose the label for that pixel
• Data modeling– fit a model (curve, surface, or volume) to each class
• Probabilistic data modeling– fit a probability model to each class
![Page 49: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/49.jpg)
Probabilistic skin classification
• Now we can model uncertainty– Each pixel has a probability of being skin or not skin
•
Skin classifier• Given X = (R,G,B): how to determine if it is skin or not?• Choose interpretation of highest probability
– set X to be a skin pixel if and only if
Where do we get and ?
![Page 50: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/50.jpg)
Learning conditional PDF’s
• We can calculate P(R | skin) from a set of training images– It is simply a histogram over the pixels in the training images
• each bin Ri contains the proportion of skin pixels with color Ri
This doesn’t work as well in higher-dimensional spaces. Why not?
Approach: fit parametric PDF functions • common choice is rotated Gaussian
– center – covariance
» orientation, size defined by eigenvecs, eigenvals
![Page 51: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/51.jpg)
Learning conditional PDF’s
• We can calculate P(R | skin) from a set of training images– It is simply a histogram over the pixels in the training images
• each bin Ri contains the proportion of skin pixels with color Ri
But this isn’t quite what we want• Why not? How to determine if a pixel is skin?• We want P(skin | R) not P(R | skin)
• How can we get it?
![Page 52: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/52.jpg)
Bayes rule
• In terms of our problem:what we measure
(likelihood)domain knowledge
(prior)
what we want(posterior)
normalization term
The prior: P(skin)• Could use domain knowledge
– P(skin) may be larger if we know the image contains a person– for a portrait, P(skin) may be higher for pixels in the center
• Could learn the prior from the training set. How?– P(skin) may be proportion of skin pixels in training set
![Page 53: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/53.jpg)
Skin detection results
![Page 54: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/54.jpg)
Object categorization: Object categorization: the statistical viewpointthe statistical viewpoint
)|( imagezebrap
)( ezebra|imagnopvs.
• Bayes rule:
)(
)(
)|(
)|(
)|(
)|(
zebranop
zebrap
zebranoimagep
zebraimagep
imagezebranop
imagezebrap
posterior ratio likelihood ratio prior ratio
![Page 55: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/55.jpg)
Discriminative
• Direct modeling of
Zebra
Non-zebra
Decisionboundary
)|(
)|(
imagezebranop
imagezebrap
![Page 56: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/56.jpg)
• Model and
Generative)|( zebraimagep ) |( zebranoimagep
Low Middle
High MiddleLow
)|( zebranoimagep)|( zebraimagep
![Page 57: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/57.jpg)
ObjectObject Bag of ‘words’Bag of ‘words’
![Page 58: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/58.jpg)
• Looser definition– Independent features
A clarification: definition of “BoW”
![Page 59: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/59.jpg)
Feature-detector view
![Page 60: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/60.jpg)
• No rigorous geometric information of the object components
• It’s intuitive to most of us that objects are made of parts – no such information
• Not extensively tested yet for– View point invariance– Scale invariance
• Segmentation and localization unclear
Weakness of the modelWeakness of the model
![Page 61: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/61.jpg)
Part 2: part-based models
by Rob Fergus (MIT)
![Page 62: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/62.jpg)
Representation
![Page 63: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/63.jpg)
Model: Parts and Structure
![Page 64: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/64.jpg)
Representation• Object as set of parts
– Generative representation
• Model:
– Relative locations between parts
– Appearance of part
• Issues:
– How to model location
– How to represent appearance
– Sparse or dense (pixels or regions)
– How to handle occlusion/clutter
Figure from [Fischler & Elschlager 73]
![Page 65: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/65.jpg)
Pose: Chamfer Matching with the Distance Transform
00
0 0000
11
1
11
1
11
1111 1
1
22
22
2222
22
22 33
3333
33 4
Example: Each pixel has (Manhattan) distance to nearest edge pixel.
![Page 66: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/66.jpg)
00
0 0000
11
1
11
1
11
1111 1
1
22
22
2222
22
22 33
3333
33 4
Then, try all translations of model edges. Add distances under each edge pixel.
![Page 67: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/67.jpg)
Pose: Ransac
• Match enough features in model to features in image to determine pose.
• Examples:– match a point and determine translation.– match a corner and determine translation and
rotation.– Points and translation, rotation, scaling?– Lines and rotation and translation?
![Page 68: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/68.jpg)
Recap: Recognition w/ RANSAC
1. Find features in model and image.– Such as corners.
2. Match enough to determine pose.– Such as 3 points for planar object, scaled orthographic
projection.
3. Determine pose.4. Project rest of object features into image.5. Look to see how many image features they match.
– Example: with bounded error, count how many object features project near an image feature.
6. Repeat steps 2-5 a bunch of times.7. Pick pose that matches most features.
![Page 69: How many object categories are there? Biederman 1987.](https://reader030.fdocuments.in/reader030/viewer/2022020117/56649d385503460f94a11d65/html5/thumbnails/69.jpg)
Other Viewpoint Invariants
• Parallelism
• Convexity
• Common region
• ….