Stereo
Dan Kong
Stereo vision
Triangulate on two images of the same scene point to recover depth.
Camera calibration
Finding all correspondences
Computing depth or surfaces
[Figure: left and right cameras separated by a baseline, viewing a scene point at some depth]
Outline
Basic stereo equations
Constraints and assumptions
Window-based matching
Cooperative stereo
Dynamic programming
Graph cut and belief propagation
Segmentation-based method
Pinhole Camera Model
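[Figure: pinhole camera with optical centre O, axes x, y, z, focal length f, image plane and virtual image; scene point P(X, Y, Z)]
A scene point $P = (X, Y, Z)$ projects to the image point $(x, y)$:
\[ (X, Y, Z) \;\rightarrow\; (x, y, 1) = \left( \frac{fX}{Z}, \frac{fY}{Z}, 1 \right), \qquad x = \frac{fX}{Z}, \quad y = \frac{fY}{Z} \]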
Basic Stereo Derivations
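[Figure: two cameras with optical centres $O_1$ and $O_2$, focal length $f$ and baseline $B$; the scene point $P(X, Y, Z)$ projects to $p_1$ in the first image and $p_2$ in the second]
Derive an expression for $Z$ as a function of $x_1$, $x_2$, $f$ and $B$.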
Basic Stereo Derivations
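[Figure: two cameras with optical centres $O_1$ and $O_2$, focal length $f$ and baseline $B$, viewing the scene point $P(X, Y, Z)$]
\[ x_1 = \frac{fX}{Z}, \qquad x_2 = \frac{f(X - B)}{Z} = x_1 - \frac{fB}{Z} \]
\[ Z = \frac{fB}{x_1 - x_2} = \frac{fB}{d}, \qquad d = x_1 - x_2 \quad \text{(disparity)} \]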
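The relation $Z = fB/d$ can be checked numerically. The following is a minimal sketch; the function name, the focal length of 700 pixels and the baseline of 0.12 m are illustrative assumptions, not values from the slides.

```python
import numpy as np

def depth_from_disparity(disparity, focal_length, baseline):
    """Apply Z = f * B / d element-wise; non-positive disparities map to infinity."""
    d = np.asarray(disparity, dtype=float)
    depth = np.full(d.shape, np.inf)
    valid = d > 0                      # d = 0 corresponds to a point at infinity
    depth[valid] = focal_length * baseline / d[valid]
    return depth

# Hypothetical rig: f = 700 px, B = 0.12 m, disparities in pixels.
depths = depth_from_disparity([84.0, 42.0, 0.0], focal_length=700.0, baseline=0.12)
```

Halving the disparity doubles the depth, so depth resolution degrades quickly for distant points.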
Stereo Constraint
Color constancy: the color of any world point remains constant from image to image.
This assumption holds under the Lambertian model.
In practice, given photometric camera calibration and typical scenes, color constancy holds well enough for most stereo algorithms.
Stereo Constraint
Epipolar geometry: the epipolar geometry is the fundamental constraint in stereo.
Rectification aligns epipolar lines with scanlines.
[Figure: epipolar plane, with the epipolar line for p and the epipolar line for p']
Stereo Constraint
Uniqueness and continuity: proposed by Marr and Poggio.
"Each item from each image may be assigned at most one disparity value," and the "disparity" varies smoothly almost everywhere.
Correspondence Using Window-based matching
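[Figure: a window in the left image is compared with windows along the same scanline of the right image; the SSD error is plotted against disparity]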
Sum of Squared (Pixel) Differences
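[Figure: an $m$ by $m$ window $w_L$ at $(x_L, y_L)$ in the left image and the window $w_R$ at $(x_L - d, y_L)$ in the right image]
$w_L$ and $w_R$ are corresponding $m$ by $m$ windows of pixels.
We define the window function:
\[ W_m(x, y) = \{ (u, v) \mid x - \tfrac{m}{2} \le u \le x + \tfrac{m}{2},\; y - \tfrac{m}{2} \le v \le y + \tfrac{m}{2} \} \]
The SSD cost measures the intensity difference as a function of disparity:
\[ C_r(x, y, d) = \sum_{(u, v) \in W_m(x, y)} \left[ I_L(u, v) - I_R(u - d, v) \right]^2 \]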
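The SSD cost can be turned into a simple winner-take-all matcher. This is a minimal sketch, assuming rectified grayscale images stored as NumPy arrays; the function name and the parameters `max_disp` and `half` (the window half-width, so m = 2 * half + 1) are hypothetical.

```python
import numpy as np

def ssd_disparity(left, right, max_disp, half):
    """For each left pixel, pick the disparity d that minimizes the SSD cost."""
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    rows, cols = left.shape
    disp = np.zeros((rows, cols), dtype=int)
    for y in range(half, rows - half):
        for x in range(half, cols - half):
            win_l = left[y - half:y + half + 1, x - half:x + half + 1]
            best_cost, best_d = np.inf, 0
            for d in range(max_disp + 1):
                if x - d - half < 0:          # right window would leave the image
                    break
                win_r = right[y - half:y + half + 1,
                              x - d - half:x - d + half + 1]
                cost = np.sum((win_l - win_r) ** 2)   # sum over I_L(u,v) - I_R(u-d,v)
                if cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y, x] = best_d
    return disp

# Synthetic check: the right image is the left image shifted by 3 pixels.
rng = np.random.default_rng(0)
left = rng.random((12, 20))
right = np.roll(left, -3, axis=1)
disp = ssd_disparity(left, right, max_disp=5, half=1)
```

Border pixels, where the window does not fit, are left at disparity 0 in this sketch.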
Image Normalization
Even when the cameras are identical models, there can be differences in gain and sensitivity.
The cameras do not see exactly the same surfaces, so their overall light levels can differ.
For these reasons and more, it is a good idea to normalize the pixels in each window:
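Average pixel:
\[ \bar{I} = \frac{1}{|W_m(x, y)|} \sum_{(u, v) \in W_m(x, y)} I(u, v) \]
Window magnitude:
\[ \| I \|_{W_m(x, y)} = \sqrt{ \sum_{(u, v) \in W_m(x, y)} [I(u, v)]^2 } \]
Normalized pixel:
\[ \hat{I}(x, y) = \frac{I(x, y) - \bar{I}}{\| I - \bar{I} \|_{W_m(x, y)}} \]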
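Images as Vectors
[Figure: $m$ by $m$ windows $w_L$ and $w_R$ in the left and right images; rows 1, 2 and 3 of a window are concatenated into a single vector]
"Unwrap" the image window to form a vector, using raster scan order.
Each window is a vector in an $m^2$-dimensional vector space. Normalization makes them unit length.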
Normalized Correlation
[Figure: the window vectors $w_L$ and $w_R(d)$]
Normalized Correlation
\[ C_{NC}(d) = \sum_{(u, v) \in W_m(x, y)} \hat{I}_L(u, v)\, \hat{I}_R(u - d, v) = w_L \cdot w_R(d) = \cos\theta \]
\[ d^* = \arg\max_d \; w_L \cdot w_R(d) \]
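Normalization and normalized correlation for a single pair of windows can be sketched as follows. The function names are hypothetical; the sketch assumes the windows are already extracted as small 2D arrays.

```python
import numpy as np

def normalize_window(window):
    """Unwrap a window in raster order, subtract its mean, scale to unit length."""
    w = np.asarray(window, dtype=float).ravel()
    w = w - w.mean()
    norm = np.linalg.norm(w)
    return w / norm if norm > 0 else w   # a constant window stays the zero vector

def ncc(win_left, win_right):
    """Dot product of the two unit vectors, i.e. the cosine of the angle between them."""
    return float(np.dot(normalize_window(win_left), normalize_window(win_right)))

w = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0]])
score_same = ncc(w, 2.5 * w + 10.0)   # same pattern under a different gain and offset
```

Because the mean is removed and the length is normalized, a gain or offset difference between the two cameras does not change the score.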
Results Using window-based Method
[Figure: left image and its disparity map]
Images courtesy of Point Grey Research
Stereo Results
[Figure: left image and its disparity map]
Problems with Window-based matching
Disparity within the window must be constant.
Biases the results towards fronto-parallel surfaces.
Blurs across depth discontinuities.
Performs poorly in textureless regions.
Gives erroneous results in occluded regions.
Cooperative Stereo Algorithm
Based on two basic assumptions by Marr and Poggio:
Uniqueness: at most a single unique match exists for each pixel.
Continuity: disparity values are generally continuous, i.e., smooth within a local neighborhood.
Disparity Space Image (DSI)
The 3D disparity space has dimensions row r, column c and disparity d. Each element (r, c, d) of the disparity space projects to the pixel (r, c) in the left image and to the pixel (r, c + d) in the right image.
The DSI represents the confidence or likelihood of a particular match.
Illustration of DSI
(r, c) slices for different d
(c, d) slice for r = 151
Definition
$L_n(r, c, d)$: match value assigned to element (r, c, d) at iteration n
$L_0(r, c, d)$: initial values computed from SSD or NCC
$\Psi(r, c, d)$: inhibition area for element (r, c, d)
$\Phi(r, c, d)$: local support area for element (r, c, d)
Illustration of Inhibitory and Support Regions
Iterative Updating DSI
[Equations (1) to (4): iterative update of the match values $L_n(r, c, d)$]
Explicit Detection of Occlusion
Identify occlusions by examining the magnitude of the converged values in conjunction with the uniqueness constraint.
Summary of Cooperative Stereo
Prepare a 3D array, (r, c, d): (r, c) for each pixel in the reference image and d for the range of disparity.
Set initial match values using a function of image intensities, such as normalized correlation or SSD.
Iteratively update match values using (4) until the match values converge.
For each pixel (r, c), find the element (r, c, d) with the maximum match value.
If the maximum match value is higher than a threshold, output the disparity d; otherwise, declare an occlusion.
[Figure: match values $L_0$ and $L_n$]
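The summary steps can be sketched in code. The update rule itself is not reproduced in this transcript, so the sketch assumes an update of the form $L_{n+1} = L_0 \cdot (S_n / \sum_{\Psi} S_n)^\alpha$, where $S_n$ sums $L_n$ over a local support box, in the spirit of Zitnick and Kanade's cooperative algorithm. The box half-width `k`, the exponent `alpha`, the `threshold` and both function names are hypothetical. The inhibition area is taken to be the two lines of sight through (r, c, d): all d at left pixel (r, c), and all elements projecting to right pixel (r, c + d).

```python
import numpy as np

def box_sum(vol, k):
    """Sum of vol over a (2k+1)^3 box around each element (zero padded)."""
    rows, cols, disps = vol.shape
    padded = np.pad(vol, k, mode="constant")
    out = np.zeros_like(vol)
    for dr in range(2 * k + 1):
        for dc in range(2 * k + 1):
            for dd in range(2 * k + 1):
                out += padded[dr:dr + rows, dc:dc + cols, dd:dd + disps]
    return out

def cooperative_stereo(L0, n_iters=5, alpha=2.0, k=1, threshold=0.005):
    """Iterate the assumed update, then pick the best disparity per pixel."""
    rows, cols, disps = L0.shape
    L0 = L0.astype(float)
    L = L0.copy()
    for _ in range(n_iters):
        S = box_sum(L, k)                        # local support
        left_sum = S.sum(axis=2)                 # line of sight of the left pixel
        right_sum = np.zeros((rows, cols + disps))
        for d in range(disps):                   # line of sight of right pixel c + d
            right_sum[:, d:d + cols] += S[:, :, d]
        inhibition = np.empty_like(S)
        for d in range(disps):
            # The element itself lies on both lines, so subtract it once.
            inhibition[:, :, d] = left_sum + right_sum[:, d:d + cols] - S[:, :, d]
        L = L0 * (S / np.maximum(inhibition, 1e-12)) ** alpha
    best = L.argmax(axis=2)
    occluded = L.max(axis=2) < threshold         # weak maximum: declare an occlusion
    return best, occluded, L

# Synthetic DSI: disparity 2 has a strong initial match value everywhere.
rng = np.random.default_rng(0)
L0 = 0.1 + 0.05 * rng.random((6, 8, 5))
L0[:, :, 2] = 0.9
best, occluded, L = cooperative_stereo(L0, n_iters=3)
```

Multiplying by $L_0$ at every iteration keeps the result tied to the image evidence, while the ratio suppresses competing matches along the same lines of sight.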
MRF Stereo Model
Local evidence function $\phi(x_p, y_p)$: an $L \times 1$ vector
Compatibility function $\psi(x_p, x_n)$: an $L \times L$ matrix
Disparity Optimization
Joint probability of the MRF:
\[ P(x_1, x_2, \ldots, x_N, y_1, y_2, \ldots, y_N) = \prod_{(i, j)} \psi(x_i, x_j) \prod_{p} \phi(x_p, y_p) \qquad (1) \]
The disparity optimization step requires choosing an estimator for $x_1, \ldots, x_N$:
MMSE: the estimate of $x_i$ is the mean of the marginal distribution of $x_i$.
MAP: the labeling of $x_1, \ldots, x_N$ that maximizes the above joint probability.
Equivalence to Energy Minimization
Taking the negative log of equation 1:
\[ E(x_1, x_2, \ldots, x_N, y_1, y_2, \ldots, y_N) = -\sum_{(i, j)} \log \psi(x_i, x_j) - \sum_{p} \log \phi(x_p, y_p) \qquad (2) \]
In graph cut, equation 2 is expressed as:
\[ E(x_1, x_2, \ldots, x_N, y_1, y_2, \ldots, y_N) = \sum_{(i, j)} V(x_i, x_j) + \sum_{p} D(x_p, y_p) \qquad (3) \]
Maximizing the probability in equation 1 is equivalent to minimizing the energy in equation 3.
Stereo Matching Using Belief Propagation
Belief propagation is an iterative inference algorithm that propagates messages in the Markov network.
$m_{st}(x_s, x_t)$: message that node $x_s$ sends to node $x_t$
$m_s(x_s, y_s)$: message that observed node $y_s$ sends to node $x_s$
$b_s(x_s)$: belief at node $x_s$
We simplify $m_{st}(x_s, x_t)$ as $m_{st}(x_t)$, and $m_s(x_s, y_s)$ as $m_s(x_s)$.
Belief Propagation Algorithm
Initialize messages as uniform distributions
Iteratively update messages for i = 1:T
Compute the belief at each node and output the disparity
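The three steps above can be sketched on a single scanline. The message update equations are not in this transcript, so the sketch assumes the standard min-sum (negative log max-product) form and, as a simplification, a chain of pixels instead of the full 2D grid. All names are hypothetical; `data_cost` plays the role of the negative log local evidence and `pair_cost` the negative log compatibility.

```python
import numpy as np

def bp_chain(data_cost, pair_cost, n_iters):
    """Min-sum belief propagation on a chain of N nodes with L labels each.

    data_cost: (N, L) array; pair_cost: (L, L) array indexed [label_i, label_i+1].
    """
    n, num_labels = data_cost.shape
    # In the log domain a uniform message is a constant, here zero.
    fwd = np.zeros((n, num_labels))   # fwd[i]: message arriving at i from i - 1
    bwd = np.zeros((n, num_labels))   # bwd[i]: message arriving at i from i + 1
    for _ in range(n_iters):
        new_fwd = np.zeros_like(fwd)
        new_bwd = np.zeros_like(bwd)
        for i in range(1, n):
            # Sender i - 1 combines its evidence with what it heard from i - 2.
            h = data_cost[i - 1] + fwd[i - 1]
            new_fwd[i] = np.min(h[:, None] + pair_cost, axis=0)
        for i in range(n - 1):
            h = data_cost[i + 1] + bwd[i + 1]
            new_bwd[i] = np.min(pair_cost + h[None, :], axis=1)
        fwd, bwd = new_fwd, new_bwd
    belief = data_cost + fwd + bwd    # lower is better in the cost domain
    return belief.argmin(axis=1)

# Small example: 5 pixels, 3 disparity labels, constant penalty for label changes.
rng = np.random.default_rng(1)
D = rng.random((5, 3))
V = 0.3 * (1.0 - np.eye(3))
labels = bp_chain(D, V, n_iters=5)
```

On a chain the messages settle after N - 1 iterations and the beliefs give the exact minimum-energy labeling; on the loopy image grid the same updates are only approximate.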
Illustration of BP
BP Results
Stereo As a Pixel-Labeling Problem
Let P be a set of pixels and L be a label set. The goal is to find a labeling f which minimizes some energy. For stereo, the labels are disparities.
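The classic form of the energy function is:
\[ E(f) = \sum_{p \in P} D_p(f_p) + \sum_{\text{neighbors } (p, q)} V_{p,q}(f_p, f_q) \]
Energy Function:
The data term $D_p(f_p)$ measures how appropriate a label is for the pixel $p$ given the observed data. In stereo, this term corresponds to the match cost or likelihood.
The smoothness term $V_{p,q}(f_p, f_q)$ encodes the prior or smoothness constraint. In stereo, the so-called Potts model is used:
\[ V_{p,q}(f_p, f_q) = \begin{cases} 0 & \text{if } f_p = f_q \\ \lambda(I) & \text{otherwise} \end{cases} \]
where the penalty $\lambda(I)$ may depend on the image intensities $I$ at $p$ and $q$.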
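Evaluating this energy for a given labeling is straightforward. The sketch below assumes a constant Potts penalty `lam` (no intensity dependence) and a 4-connected neighborhood; the function name and the array layout of `data_cost` are hypothetical.

```python
import numpy as np

def potts_energy(labels, data_cost, lam):
    """Data term plus Potts smoothness term on a 4-connected pixel grid.

    labels: (rows, cols) integer disparities; data_cost: (rows, cols, num_labels).
    """
    labels = np.asarray(labels)
    rows, cols = np.indices(labels.shape)
    data = data_cost[rows, cols, labels].sum()          # sum of D_p(f_p)
    # Count neighboring pairs whose labels differ, horizontally and vertically.
    jumps = (np.count_nonzero(labels[:, 1:] != labels[:, :-1])
             + np.count_nonzero(labels[1:, :] != labels[:-1, :]))
    return float(data + lam * jumps)

# Toy problem: label 0 is free everywhere, label 1 costs 1 per pixel.
data_cost = np.zeros((2, 3, 2))
data_cost[:, :, 1] = 1.0
flat = np.zeros((2, 3), dtype=int)
step = np.array([[0, 0, 1], [0, 0, 1]])
energy_flat = potts_energy(flat, data_cost, lam=2.0)
energy_step = potts_energy(step, data_cost, lam=2.0)
```

The step labeling pays for two pixels with label 1 and for two label changes across the vertical boundary, which is what graph cuts or belief propagation would trade off against the data term.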
Two Energy Minimization Algorithms via Graph Cuts
Swap algorithm
Two Energy Minimization Algorithms via Graph Cuts
α-expansion algorithm
Moves
Graph Cuts Results
[Figure: disparity maps from graph cuts and from belief propagation]
Ordering Constraint
If an object a is to the left of an object b in the left image, then object a will also appear to the left of object b in the right image.
[Figure: the ordering constraint, and a case where it fails]
Stereo Correspondences
[Figure: left scanline and right scanline]
Match intensities sequentially between two scanlines
Stereo Correspondences
[Figure: left and right scanlines with matched pixels, a left occlusion and a right occlusion]
Search Over Correspondences
Three cases:
Sequential: cost of match
Left occluded: cost of no match
Right occluded: cost of no match
[Figure: search grid with the left scanline on one axis and the right scanline on the other, showing left occluded pixels and right occluded pixels]
Standard 3-move Dynamic Programming for Stereo
Dynamic programming yields the optimal path through the grid. This is the best set of matches that satisfies the ordering constraint.
[Figure: path from Start to End through the grid of left scanline against right scanline, with left occluded pixels and right occluded pixels]
Dynamic Programming
Efficient algorithm for solving sequential decision (optimal path) problems.
[Figure: trellis with states i = 1, 2, 3 at each of the stages t = 1, t = 2, t = 3, ..., t = T]
How many paths through this trellis? $3^T$
Dynamic Programming
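[Figure: trellis stages with costs $C_{t-1}$, $C_t$, $C_{t+1}$ and transition costs $\Pi_{12}$, $\Pi_{22}$, $\Pi_{32}$ into state 2]
Suppose cost can be decomposed into stages:
$\Pi_{ij}$ = cost of going from state $i$ to state $j$
States: i = 1, i = 2, i = 3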
Dynamic Programming
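[Figure: trellis stages with costs $C_{t-1}$, $C_t$, $C_{t+1}$ and transition costs $\Pi_{12}$, $\Pi_{22}$, $\Pi_{32}$ from states i = 1, 2, 3 into state j = 2]
Principle of Optimality for an n-stage assignment problem:
\[ C_t(j) = \min_i \left( \Pi_{ij} + C_{t-1}(i) \right) \]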
Dynamic Programming
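[Figure: trellis stages with costs $C_{t-1}$, $C_t$, $C_{t+1}$; for state j = 2 the back pointer is $b_t(2) = 2$]
\[ C_t(j) = \min_i \left( \Pi_{ij} + C_{t-1}(i) \right) \]
\[ b_t(j) = \arg\min_i \left( \Pi_{ij} + C_{t-1}(i) \right) \]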
Stereo Matching with Dynamic Programming
Pseudo-code describing how to calculate the optimal match
Stereo Matching with Dynamic Programming
Pseudo-code describing how to reconstruct the optimal path
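The pseudo-code itself is not part of this transcript. As a stand-in, here is a minimal sketch of 3-move dynamic programming on one pair of scanlines, assuming a squared intensity difference as the match cost and a constant `occlusion_cost` for each unmatched pixel. The function name and these cost choices are assumptions, not the slides' exact formulation.

```python
import numpy as np

def dp_scanline(left, right, occlusion_cost):
    """Return the optimal cost and the list of (left_index, right_index) matches."""
    n, m = len(left), len(right)
    cost = np.zeros((n + 1, m + 1))
    move = np.zeros((n + 1, m + 1), dtype=int)   # 0 match, 1 skip left, 2 skip right
    cost[:, 0] = np.arange(n + 1) * occlusion_cost
    cost[0, :] = np.arange(m + 1) * occlusion_cost
    move[1:, 0] = 1
    move[0, 1:] = 2
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            options = (
                cost[i - 1, j - 1] + (left[i - 1] - right[j - 1]) ** 2,  # match
                cost[i - 1, j] + occlusion_cost,   # left pixel has no match
                cost[i, j - 1] + occlusion_cost,   # right pixel has no match
            )
            move[i, j] = int(np.argmin(options))
            cost[i, j] = options[move[i, j]]
    # Reconstruct the optimal path by following the stored moves backwards.
    matches = []
    i, j = n, m
    while i > 0 or j > 0:
        if move[i, j] == 0:
            matches.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif move[i, j] == 1:
            i -= 1
        else:
            j -= 1
    return float(cost[n, m]), matches[::-1]

# Right scanline is the left one shifted by one pixel (disparity 1).
left_line = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
right_line = np.array([20.0, 30.0, 40.0, 50.0, 60.0])
total_cost, matches = dp_scanline(left_line, right_line, occlusion_cost=5.0)
```

Because every move advances along at least one scanline, the recovered matches are monotonic and therefore satisfy the ordering constraint by construction.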
Results
Local errors may be propagated along a scan-line and no inter scan-line consistency is enforced.
Assumption Behind Segmentation-based Stereo
Depth discontinuities tend to correlate well with color edges
Disparity variation within a segment is small
Approximate the scene with piecewise planar surfaces
Segmentation-based stereo
A plane equation is fitted in each segment based on an initial disparity estimate obtained by SSD or correlation
Global matching criterion: if a depth map is good, warping the reference image to the other view according to this depth will render an image that matches the real view
Optimization by iterative neighborhood depth hypothesizing
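The plane fitting step can be sketched with ordinary least squares. This assumes the plane is parameterized as d = a x + b y + c over the pixels of one segment. The function name is hypothetical, and a practical system would likely use a robust fit to cope with outliers in the initial disparities.

```python
import numpy as np

def fit_disparity_plane(xs, ys, ds):
    """Least-squares fit of d = a * x + b * y + c to the pixels of one segment."""
    A = np.column_stack([xs, ys, np.ones(len(xs))])
    coeffs, *_ = np.linalg.lstsq(A, ds, rcond=None)
    return coeffs   # array [a, b, c]

# Synthetic segment whose initial disparities lie exactly on a plane.
xs, ys = np.meshgrid(np.arange(5.0), np.arange(4.0))
xs, ys = xs.ravel(), ys.ravel()
ds = 0.5 * xs - 0.25 * ys + 3.0
plane = fit_disparity_plane(xs, ys, ds)
```

Each segment then carries three plane parameters instead of one disparity per pixel, which is what makes hypothesizing a neighbor's depth for a whole segment cheap.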
Hypothesizing neighborhood depth
Correct depth is propagated to reduce fattening effect:
Hypothesizing neighborhood depth
Background depth is hypothesized for unmatched region:
Result
Another Result