What we didn’t have time for CS664 Lecture 26 Thursday 12/02/04 Some slides c/o Dan Huttenlocher,...
What we didn’t have time for
CS664 Lecture 26 Thursday 12/02/04
Some slides c/o Dan Huttenlocher, Stefano Soatto, Sebastian Thrun
Administrivia
Final project is due at noon on Friday 12/17
Write-up only (5MB max)
Be sure to include some pictures
Send me email if you missed any quiz for a good reason
Outline
Geometry
Graph-based segmentation
Statistics
Geometry
Homogeneous coordinates
Identify a point in the image plane with the ray passing through that point (pixel)
$(x, y) \equiv (\lambda x, \lambda y, \lambda)$ for non-zero $\lambda$
$(X, Y, Z) \equiv (X/Z, Y/Z, 1)$ for non-zero $Z$
Advantages
Many non-linear operations become linear in homogeneous coordinates
Example: $(X, Y, Z)$ projects to $(fX/Z, fY/Z)$
Camera projection matrix
[Figure: a 2D point obtained by applying the 3x4 camera projection matrix to a 3D point]
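The projection example can be written as a single matrix product in homogeneous coordinates. The following is a minimal sketch, not the lecture's own code: the function name `project`, the focal length value, and the choice of a camera at the origin looking down the Z axis are assumptions made for illustration.

```python
def project(P, X):
    """Apply a 3x4 projection matrix P to a 3D point X = (X, Y, Z)."""
    Xh = list(X) + [1.0]  # homogeneous 3D point (X, Y, Z, 1)
    xh = [sum(P[r][c] * Xh[c] for c in range(4)) for r in range(3)]
    if xh[2] == 0:
        raise ValueError("point has no finite image (third coordinate is zero)")
    # Divide out the scale factor to return to ordinary 2D coordinates.
    return (xh[0] / xh[2], xh[1] / xh[2])


f = 2.0  # hypothetical focal length
# Assumed camera: at the origin, looking down the Z axis, no skew or offset.
P = [[f, 0.0, 0.0, 0.0],
     [0.0, f, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0]]

print(project(P, (3.0, 6.0, 4.0)))  # (fX/Z, fY/Z) = (1.5, 3.0)
```

The non-linear division by Z happens only once, at the end; everything before it is a linear map, which is the advantage the slide points to.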
Epipolar geometry
[Figure labels: epipole, epipolar plane, epipolar line]
Stefano Soatto (c) 2002
Pencil of planes
Different epipolar planes for different scene points x
Plane defined by camera origins + x
Epipolar lines are important
For pixel p in I there is a corresponding epipolar line in I’
This allows us to limit the search!
Generalization of stereo to arbitrary camera positions
Classical stereo has parallel cameras
[Figure label: p]
Example: verged stereo
Examples: motion
[Figure panels: Parallel to Image Plane; Forward]
Essential matrix E
Ex is perpendicular to x’s epipolar line in the other image
So if x’ corresponds to x then $x'^T E x = 0$
Captures the scene geometry
We assume the cameras are calibrated
Otherwise we get the fundamental matrix
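The constraint $x'^T E x = 0$ can be checked numerically. The sketch below assumes one common convention that the slides do not spell out: a point X in the first camera's frame appears at RX + t in the second, and the essential matrix is built as $E = [t]_\times R$, so that Ex equals t × (Rx). The rotation, translation, and scene point are made-up values.

```python
def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def matvec(M, v):
    return tuple(dot(row, v) for row in M)


def epipolar_residual(R, t, x, x_prime):
    """Return x'^T E x, with E = [t]_x R (assumed convention), so E x = t x (R x)."""
    Ex = cross(t, matvec(R, x))
    return dot(x_prime, Ex)


# Hypothetical relative pose: 90 degree rotation about Z plus a translation.
R = [[0.0, -1.0, 0.0],
     [1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0]]
t = (1.0, 0.0, 0.5)

X = (2.0, 1.0, 4.0)                                   # scene point, camera 1 frame
X2 = tuple(a + b for a, b in zip(matvec(R, X), t))    # same point, camera 2 frame

x = (X[0] / X[2], X[1] / X[2], 1.0)                   # calibrated image point in I
x_prime = (X2[0] / X2[2], X2[1] / X2[2], 1.0)         # corresponding point in I'

print(epipolar_residual(R, t, x, x_prime))            # close to zero
print(epipolar_residual(R, t, (0.3, 0.1, 1.0), x_prime))  # a non-matching pixel
```

A true correspondence gives a residual near zero (up to rounding), while an arbitrary pixel generally does not, which is what lets the constraint limit the search to a line.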
Estimating the geometry
The essential matrix has 5 parameters
Can estimate from 5 corresponding points
Fundamental matrix has 7
The question of “how few perfect correspondences do you need” has spawned an unfortunately large literature
Yet more optimization
We can estimate the essential matrix from a bunch of point matches
A similar technique can be used to compute structure from motion
Bundle adjustment
RANSAC (line fitting)
Variant of generate-and-test
Pick a small set of points at random
Fit them via least squares
Points “far” from this line are outliers
Repeat until you find a line with very few outliers
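The loop described above can be sketched in a few lines. This is an illustrative version under stated assumptions, not a reference implementation: it runs a fixed number of iterations instead of stopping at "very few outliers", it measures "far" by perpendicular distance against a hypothetical `threshold`, and it refits on the inliers at the end. All names and parameter values are invented for the example.

```python
import math
import random


def fit_line(points):
    """Least-squares fit of y = m*x + b; returns None if all x values coincide."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        return None
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    m = sxy / sxx
    return m, mean_y - m * mean_x


def ransac_line(points, sample_size=2, threshold=0.5, iterations=200, seed=0):
    rng = random.Random(seed)
    best_line, best_inliers = None, []
    for _ in range(iterations):
        line = fit_line(rng.sample(points, sample_size))  # generate a hypothesis
        if line is None:
            continue
        m, b = line
        # Test it: points within `threshold` of the line count as inliers.
        inliers = [(x, y) for x, y in points
                   if abs(m * x - y + b) / math.hypot(m, 1.0) <= threshold]
        if len(inliers) > len(best_inliers):
            best_line, best_inliers = line, inliers
    if best_inliers:
        best_line = fit_line(best_inliers) or best_line  # refit on all inliers
    return best_line, best_inliers


# Ten points on y = 2x + 1 plus three gross outliers.
data = [(float(x), 2.0 * x + 1.0) for x in range(10)]
data += [(2.0, 30.0), (7.0, -20.0), (5.0, 40.0)]
line, inliers = ransac_line(data)
print(line, len(inliers))
```

A plain least-squares fit over all thirteen points would be pulled toward the outliers; the random sampling is what keeps them from influencing the final line.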
RANSAC (camera geometry)
Pick a small set of corresponding pixels
At least 5 (essential) or 7 (fundamental)
Compute the matrix from these
See how many corresponding pixels this matrix explains
Graph-based Segmentation
Segmentation by min cut
[Figure labels: image pixels, similarity measure w, minimum cut]
* From Khurram Hassan-Shafique CAP5415 Computer Vision 2003
Min cuts don’t segment well
[Figure labels: ideal cut; cuts with lesser weight than the ideal cut]
* Slide from Khurram Hassan-Shafique CAP5415 Computer Vision 2003
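Normalized cuts
Instead of the min cut, minimize
$N(A,B) = \frac{w(A,B)}{\sum_{x \in A,\, y \in V} w(x,y)} + \frac{w(A,B)}{\sum_{z \in B,\, y \in V} w(z,y)}$
Here $w(A,B)$ is the total weight of the edges cut between A and B, and V is the set of all vertices
Measure of dis-similarity between the sets A and B
NP-hard to minimize
Rely on continuous approximation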
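To see why normalizing helps, the sketch below evaluates both the plain cut weight and the normalized cut on a tiny hand-made graph. It only scores given partitions; it does not implement the continuous approximation used to minimize the criterion. The graph, the weights, and the helper names `w` and `normalized_cut` are all assumptions for illustration.

```python
def w(weights, A, B):
    """Sum of w(x, y) over ordered pairs x in A, y in B (edges are undirected)."""
    return sum(weights.get(frozenset((x, y)), 0.0)
               for x in A for y in B if x != y)


def normalized_cut(weights, A, B):
    """Cut weight divided by each side's total connection to all vertices V."""
    V = set(A) | set(B)
    cut = w(weights, A, B)
    return cut / w(weights, A, V) + cut / w(weights, B, V)


# Hypothetical graph: two tight clusters {0,1,2} and {3,4,5} joined by one
# edge of weight 2, with vertex 5 only weakly attached to its own cluster.
edges = {(0, 1): 5.0, (0, 2): 5.0, (1, 2): 5.0,
         (3, 4): 5.0, (3, 5): 0.5, (4, 5): 0.5,
         (2, 3): 2.0}
weights = {frozenset(e): wt for e, wt in edges.items()}

balanced = ({0, 1, 2}, {3, 4, 5})   # the "ideal" cut
isolated = ({5}, {0, 1, 2, 3, 4})   # cuts off a single weakly linked vertex

for name, (A, B) in (("balanced", balanced), ("isolated", isolated)):
    print(name, w(weights, A, B), normalized_cut(weights, A, B))
```

On this graph the plain cut weight is lower for the isolated vertex than for the balanced split, while the normalized score ranks them the other way around.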
Normalized cuts examples
Limitations of normalized cuts
Works by binary partitioning
Slow and memory-intensive
Textured backgrounds are problems
Other graph-based methods
Many other variants on min cuts
Typical cuts, nested cuts, etc.
No clear winner for segmentation
Perhaps mean shift?
MST-based segmentation
Minimum spanning tree is the cheapest way to connect all pixels into a single component (or “region”)
Merge two components when the cheapest edge between them is cheap compared to a measure of the internal variation
Provably good segmentation under a fairly natural definition
Neither too coarse nor too fine
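The merge rule can be sketched on a 1D "image" where each pixel is linked to its neighbor. The slide does not give the exact test, so this sketch assumes the Felzenszwalb-Huttenlocher form: internal variation is the largest edge already merged into a component, relaxed by k/|C| for a hypothetical scale parameter `k`. The function name and sample values are invented.

```python
def segment(values, k=3.0):
    """Label each position of a 1D signal with the root of its component."""
    n = len(values)
    parent = list(range(n))
    size = [1] * n
    internal = [0.0] * n  # largest edge weight merged into each component so far

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Edges between neighboring pixels, processed cheapest first (Kruskal order).
    edges = sorted((abs(values[i] - values[i + 1]), i, i + 1) for i in range(n - 1))
    for weight, i, j in edges:
        a, b = find(i), find(j)
        if a == b:
            continue
        # Merge only if the edge is cheap relative to both components'
        # internal variation (assumed threshold: internal + k / size).
        if weight <= min(internal[a] + k / size[a], internal[b] + k / size[b]):
            parent[b] = a
            size[a] += size[b]
            internal[a] = max(internal[a], internal[b], weight)
    return [find(i) for i in range(n)]


signal = [10, 11, 10, 12, 50, 51, 49, 50]
labels = segment(signal)
print(labels)
```

Small fluctuations inside each plateau are absorbed, but the large step between the two plateaus is never cheap enough to merge across. Larger values of `k` favor larger regions.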
Example output
Solves many problems with normalized cuts
More statistics
Dimensionality reduction
We can represent orange points only by their v1 coordinate
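A small sketch of this idea for 2D points follows. It assumes that v1 in the missing figure denotes the direction of greatest variance (the first principal component), and it uses the closed-form orientation of a 2x2 covariance matrix; the function names and the sample points are made up.

```python
import math


def principal_axis(points):
    """Return the mean and the unit direction of greatest variance of 2D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Orientation of the major axis of the 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (mx, my), (math.cos(theta), math.sin(theta))


def to_v1(point, mean, v1):
    """One-number representation: coordinate of the point along v1."""
    return (point[0] - mean[0]) * v1[0] + (point[1] - mean[1]) * v1[1]


def from_v1(coord, mean, v1):
    """Reconstruct a 2D point from its v1 coordinate."""
    return (mean[0] + coord * v1[0], mean[1] + coord * v1[1])


pts = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # points on y = 2x
mean, v1 = principal_axis(pts)
coords = [to_v1(p, mean, v1) for p in pts]
print(v1, coords)
print([from_v1(c, mean, v1) for c in coords])
```

Because these points lie exactly on a line, one coordinate per point reconstructs them without loss; for noisy data the reconstruction is the closest point on the v1 axis.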
Eigenfaces
An n-pixel image is a point in $\mathbb{R}^n$
Find low-dimensional representation of face images (from a training set)
Recognition by finding the closest face in face space
Markov Random Fields
MRF defining property:
$\Pr(f_p \mid f_q,\ q \neq p) = \Pr(f_p \mid f_q,\ q \in N_p)$
Hammersley-Clifford Theorem:
$\Pr(f) \sim \exp\left\{ -\sum_{(p,q)} V_{(p,q)}(f_p, f_q) \right\}$
[Figure labels: image pixels (vertices); neighborhood relationships (n-links)]
$f_p$ - disparity at pixel p
$f = (f_1, \ldots, f_m)$ - configuration
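MAP estimation of an MRF
$\hat{f} = \arg\max_f \Pr(f \mid O)$
$\hat{f} = \arg\max_f \Pr(O \mid f)\,\Pr(f)$ (Bayes rule)
$\hat{f} = \arg\max_f \exp\left\{ \sum_p \ln g(O_p \mid f_p) - \sum_{(p,q)} V_{(p,q)}(f_p, f_q) \right\}$
$O$: observed data
$\Pr(O \mid f)$: likelihood function (sensor noise)
$\Pr(f)$: prior (MRF model)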
Energy minimization
$E(f) = -\sum_p \ln g(O_p \mid f_p) + \sum_{(p,q)} V_{(p,q)}(f_p, f_q)$
First sum: data term (sensor noise)
Second sum: smoothness term (MRF prior)
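To make the two terms concrete, here is a toy sketch on a 1D chain of pixels with brute-force minimization. The slides leave g and V general, so the specific choices here are assumptions: Gaussian sensor noise, which makes the data term a squared difference scaled by a hypothetical `sigma`, and a Potts smoothness term that charges a hypothetical `lam` per label change between neighbors. Exhaustive search is only feasible because the example is tiny.

```python
from itertools import product


def energy(f, observed, lam=1.0, sigma=0.5):
    """E(f) = data term + smoothness term on a 1D chain of pixels."""
    # Data term: -ln g(O_p | f_p) for Gaussian noise, dropping the constant.
    data = sum((o - fp) ** 2 / (2.0 * sigma ** 2) for o, fp in zip(observed, f))
    # Smoothness term: Potts penalty for each pair of neighbors that disagree.
    smooth = sum(lam for fp, fq in zip(f, f[1:]) if fp != fq)
    return data + smooth


def map_estimate(observed, labels=(0, 1), lam=1.0, sigma=0.5):
    """Minimize E(f) by trying every configuration (exponential cost)."""
    return min(product(labels, repeat=len(observed)),
               key=lambda f: energy(f, observed, lam, sigma))


observed = [0.0, 0.1, 0.8, 0.1, 1.0, 0.9, 1.0]
print(map_estimate(observed, lam=0.0))  # data term only: per-pixel thresholding
print(map_estimate(observed, lam=1.0))  # the prior smooths away the isolated 0.8
```

With `lam=0.0` each pixel simply takes its nearest label; with the smoothness term switched on, the lone noisy reading is overruled by its neighbors.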