Stereo Dan Kong. Stereo vision Triangulate on two images of the same scene point to recover depth....


Page 1

Stereo

Dan Kong

Page 2

Stereo vision

Triangulate on two images of the same scene point to recover depth.

Camera calibration
Finding all correspondences
Computing depth or surfaces

[Figure: left and right cameras, with the baseline and the depth labeled.]

Page 3

Outline

Basic stereo equations
Constraints and assumptions
Window-based matching
Cooperative stereo
Dynamic programming
Graph cut and belief propagation
Segmentation-based method

Page 4

Pinhole Camera Model

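[Figure: pinhole camera with optical center $O$, axes $x$, $y$, $z$, focal length $f$, image plane, virtual image, and scene point $P(X, Y, Z)$.]

$$(X, Y, Z) \mapsto (x, y, 1) = \left( \frac{fX}{Z}, \frac{fY}{Z}, 1 \right)$$

$$x = \frac{fX}{Z}, \qquad y = \frac{fY}{Z}$$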
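As a minimal sketch of the pinhole projection $x = fX/Z$, $y = fY/Z$, the following Python function maps a scene point to image coordinates. The function name `project` and the sample values are illustrative assumptions, not part of the slides.

```python
def project(point, f):
    """Project a scene point (X, Y, Z) to image coordinates (x, y).

    Assumes the pinhole model x = f*X/Z, y = f*Y/Z with the point in front
    of the camera; `project` is an illustrative name, not from the slides.
    """
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (f * X / Z, f * Y / Z)


# Doubling the depth of a point halves its image coordinates.
near = project((2.0, 1.0, 4.0), f=2.0)
far = project((2.0, 1.0, 8.0), f=2.0)
print(near, far)
```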

Page 5

Basic Stereo Derivations

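[Figure: two pinhole cameras with optical centers $O_1$ and $O_2$ (axes $x$, $y$, $z$), focal length $f$, separated by baseline $B$; the scene point $P_1(X, Y, Z)$ projects to $p_1$ and $p_2$.]

Derive an expression for $Z$ as a function of $x_1$, $x_2$, $f$, and $B$.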

Page 6

Basic Stereo Derivations

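[Figure: two pinhole cameras with optical centers $O_1$ and $O_2$ (axes $x$, $y$, $z$), focal length $f$, and baseline $B$, viewing the scene point $P_1(X, Y, Z)$.]

$$x_1 = \frac{fX}{Z}, \qquad x_2 = \frac{f(X - B)}{Z} = x_1 - \frac{fB}{Z}$$

$$Z = \frac{fB}{x_1 - x_2} = \frac{fB}{d}$$

where $d = x_1 - x_2$ is the disparity.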
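The relation $Z = fB/d$ with $d = x_1 - x_2$ can be checked numerically. This is a minimal sketch, assuming the second camera is displaced by $B$ along the x axis and both cameras share focal length $f$; the function name and the sample numbers are hypothetical.

```python
def depth_from_disparity(x1, x2, f, B):
    """Recover depth Z = f*B/d from the disparity d = x1 - x2."""
    d = x1 - x2
    if d <= 0:
        raise ValueError("disparity must be positive for a point at finite depth")
    return f * B / d


# Project a known point into both cameras, then recover its depth.
f, B = 500.0, 0.1          # assumed focal length (pixels) and baseline (meters)
X, Z = 0.4, 2.5            # assumed scene point coordinates
x1 = f * X / Z             # projection in camera 1
x2 = f * (X - B) / Z       # projection in camera 2, shifted by the baseline
print(depth_from_disparity(x1, x2, f, B))
```

Because depth is inversely proportional to disparity, a fixed disparity error costs more depth accuracy for distant points than for near ones.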

Page 7

Stereo Constraint

Color constancy

The color of any world point remains constant from image to image.

This assumption holds under the Lambertian model.

In practice, given photometric camera calibration and typical scenes, color constancy holds well enough for most stereo algorithms.

Page 8

Stereo Constraint

Epipolar geometry

The epipolar geometry is the fundamental constraint in stereo.

Rectification aligns epipolar lines with scanlines.

[Figure: epipolar plane, with the epipolar line for p and the epipolar line for p’.]

Page 9

Stereo Constraint

Uniqueness and continuity

Proposed by Marr and Poggio: “Each item from each image may be assigned at most one disparity value,” and the “disparity” varies smoothly almost everywhere.

Page 10

Correspondence Using Window-based matching

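[Figure: a window in the left image is compared against windows along the same scanline in the right image; the plot shows SSD error against disparity.]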

Page 11

Sum of Squared (Pixel) Differences

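[Figure: left and right images $I_L$ and $I_R$ with $m \times m$ windows $w_L$ and $w_R$, centered at $(x_L, y_L)$ and $(x_L - d, y_L)$.]

$w_L$ and $w_R$ are corresponding $m$ by $m$ windows of pixels.

We define the window function:

$$W_m(x, y) = \{\, u, v \mid x - \tfrac{m}{2} \le u \le x + \tfrac{m}{2},\; y - \tfrac{m}{2} \le v \le y + \tfrac{m}{2} \,\}$$

The SSD cost measures the intensity difference as a function of disparity:

$$C_r(x, y, d) = \sum_{(u,v) \in W_m(x,y)} [I_L(u, v) - I_R(u - d, v)]^2$$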
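A minimal sketch of SSD window matching follows, assuming rectified grayscale images stored as lists of rows (indexed `[v][u]`) and an odd window size `m`. The function names, the synthetic test pattern, and the disparity search range are illustrative assumptions.

```python
def ssd_cost(left, right, x, y, d, m):
    """Sum of squared differences between the left window at (x, y)
    and the right window at (x - d, y)."""
    h = m // 2
    total = 0
    for v in range(y - h, y + h + 1):
        for u in range(x - h, x + h + 1):
            diff = left[v][u] - right[v][u - d]
            total += diff * diff
    return total


def best_disparity(left, right, x, y, m, max_d):
    """Pick the disparity in 0..max_d with the lowest SSD cost."""
    h = m // 2
    # Keep only disparities whose right-image window stays inside the image.
    candidates = [d for d in range(max_d + 1) if x - h - d >= 0]
    return min(candidates, key=lambda d: ssd_cost(left, right, x, y, d, m))


# Synthetic pair: the right image is the left image shifted by 3 pixels.
W, H = 20, 9
left = [[(7 * u * u + 13 * v + u * v) % 31 for u in range(W)] for v in range(H)]
right = [[left[v][u + 3] if u + 3 < W else 0 for u in range(W)] for v in range(H)]
print(best_disparity(left, right, x=10, y=4, m=3, max_d=6))
```

This brute-force search recomputes every window from scratch; practical implementations usually reuse partial sums between neighboring windows.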

Page 12

Image Normalization

Even when the cameras are identical models, there can be differences in gain and sensitivity.

The cameras do not see exactly the same surfaces, so their overall light levels can differ.

For these reasons and more, it is a good idea to normalize the pixels in each window:

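Average pixel:

$$\bar{I} = \frac{1}{|W_m(x,y)|} \sum_{(u,v) \in W_m(x,y)} I(u, v)$$

Window magnitude:

$$\|I\|_{W_m(x,y)} = \sqrt{\sum_{(u,v) \in W_m(x,y)} [I(u, v)]^2}$$

Normalized pixel:

$$\hat{I}(x, y) = \frac{I(x, y) - \bar{I}}{\|I - \bar{I}\|_{W_m(x,y)}}$$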

Page 13

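Images as Vectors

[Figure: left and right images with $m \times m$ windows $w_L$ and $w_R$; the rows of $w_L$ (row 1, row 2, row 3) are concatenated into a single vector.]

“Unwrap” image to form vector, using raster scan order.

Each window is a vector in an $m^2$-dimensional vector space. Normalization makes them unit length.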

Page 14

Normalized Correlation

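[Figure: window vectors $w_L$ and $w_R(d)$ separated by the angle $\theta$.]

Normalized correlation:

$$C_{NC}(d) = \sum_{(u,v) \in W_m(x,y)} \hat{I}_L(u, v)\, \hat{I}_R(u - d, v) = w_L \cdot w_R(d) = \cos\theta$$

$$d^* = \arg\max_d \; w_L \cdot w_R(d)$$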
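A minimal sketch of normalized correlation follows: each window is mean-subtracted and scaled to unit length, so the score is the cosine of the angle between the two window vectors. The image layout (`[v][u]` lists), the function names, and the synthetic gain and offset are assumptions made for illustration.

```python
import math


def normalized_window(img, x, y, m):
    """Unwrap the m-by-m window at (x, y) in raster order, subtract its
    mean, and scale it to unit length."""
    h = m // 2
    vals = [img[v][u] for v in range(y - h, y + h + 1)
            for u in range(x - h, x + h + 1)]
    mean = sum(vals) / len(vals)
    centered = [p - mean for p in vals]
    norm = math.sqrt(sum(c * c for c in centered))
    if norm == 0:
        return [0.0] * len(vals)   # textureless window: no usable direction
    return [c / norm for c in centered]


def ncc(left, right, x, y, d, m):
    """Dot product of the normalized left window at (x, y) and the
    normalized right window at (x - d, y)."""
    wl = normalized_window(left, x, y, m)
    wr = normalized_window(right, x - d, y, m)
    return sum(a * b for a, b in zip(wl, wr))


# The right image is the left image shifted by 3, with a different gain and offset.
W, H = 20, 9
left = [[(7 * u * u + 13 * v + u * v) % 31 for u in range(W)] for v in range(H)]
right = [[2 * left[v][u + 3] + 10 if u + 3 < W else 0 for u in range(W)]
         for v in range(H)]
print(ncc(left, right, 10, 4, 3, 3), ncc(left, right, 10, 4, 0, 3))
```

Unlike raw SSD, the score at the true shift is unaffected by the gain and offset difference between the two images.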

Page 15

Results Using window-based Method

Left Disparity Map

Images courtesy of Point Grey Research

Page 16

Stereo Results

Left Disparity map

Page 17

Problems with Window-based matching

Disparity within the window must be constant.

Biases the results towards fronto-parallel surfaces.

Blurs across depth discontinuities.

Performs poorly in textureless regions.

Produces erroneous results in occluded regions.

Page 18

Cooperative Stereo Algorithm

Based on two basic assumptions by Marr and Poggio:

Uniqueness: at most a single unique match exists for each pixel.

Continuity: disparity values are generally continuous, i.e., smooth within a local neighborhood.

Page 19

Disparity Space Image (DSI)

The 3D disparity space has dimensions row r, column c, and disparity d. Each element (r, c, d) of the disparity space projects to the pixel (r, c) in the left image and to the pixel (r, c + d) in the right image.

DSI represents the confidence or likelihood of a particular match.

Page 20

Illustration of DSI

(r, c) slices for different d

(c, d) slice for r = 151

Page 21

Definition

$L_n(r, c, d)$: match value assigned to element (r, c, d) at iteration n

$L_0(r, c, d)$: initial values computed from SSD or NCC

$\Psi(r, c, d)$: inhibition area for element (r, c, d)

$\Phi(r, c, d)$: local support area for element (r, c, d)

Page 22

Illustration of Inhibitory and Support Regions

Page 23

Iterative Updating DSI

[Equations (1) to (4): not recoverable from the transcript.]

Page 24

Explicit Detection of Occlusion

Identify occlusions by examining the magnitude of the converged values in conjunction with the uniqueness constraint.

Page 25

Summary of Cooperative Stereo

Prepare a 3D array, (r, c, d): (r, c) for each pixel in the reference image and d for the range of disparity.

Set initial match values $L_0$ using a function of image intensities, such as normalized correlation or SSD.

Iteratively update match values $L_n$ using (4) until the match values converge.

For each pixel (r, c), find the element (r, c, d) with the maximum match value.

If the maximum match value is higher than a threshold, output the disparity d; otherwise, declare an occlusion.
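The five steps above can be sketched for a single scanline. The update rule used here is a stand-in assumption, because the slide's equation (4) is not available in the text: each match value is its initial value scaled by the local support divided by the support summed over the inhibition area, raised to a power `alpha`. The similarity function, the three-pixel support area, the fixed iteration count (in place of a convergence test), and the threshold are also illustrative choices.

```python
def cooperative_scanline(left, right, max_d, iters=5, alpha=2.0, threshold=0.1):
    """Cooperative matching on one scanline; element (c, d) pairs left pixel c
    with right pixel c + d. Returns a disparity or None (occlusion) per pixel."""
    W, D = len(left), max_d + 1
    # Steps 1-2: disparity space with initial match values (assumed similarity).
    L0 = [[1.0 / (1.0 + (left[c] - right[c + d]) ** 2) if c + d < W else 0.0
           for d in range(D)] for c in range(W)]
    L = [row[:] for row in L0]
    # Step 3: iterative update (fixed number of iterations in this sketch).
    for _ in range(iters):
        # Local support: neighboring columns at the same disparity.
        S = [[sum(L[cc][d] for cc in range(max(0, c - 1), min(W, c + 2)))
              for d in range(D)] for c in range(W)]
        new = [[0.0] * D for _ in range(W)]
        for c in range(W):
            for d in range(D):
                # Inhibition: elements sharing the same left pixel (includes self)...
                inhib = sum(S[c][dd] for dd in range(D))
                # ...and elements sharing the same right pixel c + d.
                for dd in range(D):
                    cc = c + d - dd
                    if dd != d and 0 <= cc < W:
                        inhib += S[cc][dd]
                if inhib > 0:
                    new[c][d] = L0[c][d] * (S[c][d] / inhib) ** alpha
        L = new
    # Steps 4-5: pick the strongest match, or declare an occlusion.
    result = []
    for c in range(W):
        best = max(range(D), key=lambda d: L[c][d])
        result.append(best if L[c][best] > threshold else None)
    return result


left_row = [1, 5, 9, 3, 7, 2, 8, 4]
right_row = [0, 0, 1, 5, 9, 3, 7, 2]   # left_row shifted right by 2 pixels
print(cooperative_scanline(left_row, right_row, max_d=3))
```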

Page 26

MRF Stereo Model

Local evidence function $\phi(x_p, y_p)$: an L×1 vector

Compatibility function $\psi(x_p, x_n)$: an L×L matrix

Page 27

Disparity Optimization

Joint probability of MRF:

$$P(x_1, x_2, \ldots, x_N, y_1, y_2, \ldots, y_N) = \prod_{(i,j)} \psi(x_i, x_j) \prod_p \phi(x_p, y_p) \qquad (1)$$

The disparity optimization step requires choosing an estimator for $x_1, \ldots, x_N$:

MMSE: estimate of the mean of the marginal distribution of $x_i$

MAP: the labeling of $x_1, \ldots, x_N$ that maximizes the above joint probability

Page 28

Equivalence to Energy Minimization

Taking the negative log of equation 1:

$$E(x_1, x_2, \ldots, x_N, y_1, y_2, \ldots, y_N) = -\sum_{(i,j)} \log \psi(x_i, x_j) - \sum_p \log \phi(x_p, y_p) \qquad (2)$$

In graph cut, equation 2 is expressed as:

$$E(x_1, x_2, \ldots, x_N, y_1, y_2, \ldots, y_N) = \sum_{(i,j)} V(x_i, x_j) + \sum_p D(x_p, y_p) \qquad (3)$$

Maximizing the probability in equation 1 is equivalent to minimizing the energy in equation 3.
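The equivalence between maximizing the joint probability and minimizing the energy can be checked by brute force on a tiny model. This is a minimal sketch on a three-node chain; the particular `psi` and `phi` functions, the observations, and the label set are made-up assumptions, and the probability is left unnormalized.

```python
import itertools
import math

EDGES = [(0, 1), (1, 2)]   # neighboring node pairs (i, j) of a 3-node chain
OBS = [0, 2, 2]            # assumed observations y_p
LABELS = range(3)          # assumed disparity labels


def psi(a, b):
    """Compatibility between neighboring labels (assumed form)."""
    return math.exp(-0.5 * abs(a - b))


def phi(x, y):
    """Local evidence for label x given observation y (assumed form)."""
    return math.exp(-(x - y) ** 2)


def joint_probability(x):
    """Unnormalized product of compatibility and evidence terms."""
    p = 1.0
    for i, j in EDGES:
        p *= psi(x[i], x[j])
    for k in range(len(x)):
        p *= phi(x[k], OBS[k])
    return p


def energy(x):
    """Negative log of the joint probability: V = -log psi, D = -log phi."""
    e = 0.0
    for i, j in EDGES:
        e -= math.log(psi(x[i], x[j]))
    for k in range(len(x)):
        e -= math.log(phi(x[k], OBS[k]))
    return e


labelings = list(itertools.product(LABELS, repeat=3))
map_labeling = max(labelings, key=joint_probability)
min_energy_labeling = min(labelings, key=energy)
print(map_labeling, min_energy_labeling)
```

Exhaustive enumeration is only feasible for toy problems; the number of labelings grows exponentially with the number of pixels, which is why approximate methods such as belief propagation and graph cuts are used.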

Page 29

Stereo Matching Using Belief Propagation

Belief propagation is an iterative inference algorithm that propagates messages in the Markov network.

$m_{st}(x_s, x_t)$: message that node $x_s$ sends to $x_t$

$m_s(x_s, y_s)$: message that observed node $y_s$ sends to $x_s$

$b_s(x_s)$: belief at node $x_s$

We simplify $m_{st}(x_s, x_t)$ as $m_{st}(x_t)$, and $m_s(x_s, y_s)$ as $m_s(x_s)$.

Page 30

Belief Propagation Algorithm

Initialize messages as a uniform distribution

Iteratively update messages for i = 1:T

Compute the belief at each node and output the disparity
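The three steps above can be sketched on a one-dimensional chain of nodes. The slide's update equations are not available in the text, so this sketch assumes the standard max-product form: the message from s to t maximizes, over $x_s$, the product of the compatibility, the local evidence at s, and the incoming messages from the other neighbors of s. The data layout and the example potentials are hypothetical.

```python
import math


def belief_propagation_chain(phi, psi, T):
    """Max-product BP on a chain.

    phi[s][x]  : local evidence for label x at node s (list of L-vectors)
    psi[a][b]  : compatibility of neighboring labels a and b (L x L matrix)
    Returns the label with the highest belief at each node.
    """
    N, L = len(phi), len(phi[0])
    # Step 1: initialize every message m_st(x_t) as a uniform distribution.
    msgs = {(s, t): [1.0 / L] * L
            for s in range(N) for t in (s - 1, s + 1) if 0 <= t < N}
    # Step 2: synchronous message updates for T iterations.
    for _ in range(T):
        new = {}
        for (s, t) in msgs:
            out = []
            for xt in range(L):
                best = 0.0
                for xs in range(L):
                    val = psi[xs][xt] * phi[s][xs]
                    for k in (s - 1, s + 1):       # neighbors of s other than t
                        if k != t and (k, s) in msgs:
                            val *= msgs[(k, s)][xs]
                    best = max(best, val)
                out.append(best)
            z = sum(out)                           # normalize for numerical stability
            new[(s, t)] = [o / z for o in out]
        msgs = new
    # Step 3: belief = local evidence times all incoming messages.
    labels = []
    for s in range(N):
        belief = list(phi[s])
        for k in (s - 1, s + 1):
            if (k, s) in msgs:
                belief = [b * m for b, m in zip(belief, msgs[(k, s)])]
        labels.append(max(range(L), key=lambda x: belief[x]))
    return labels


obs = [0, 2, 2]                                    # assumed observations
phi = [[math.exp(-(x - y) ** 2) for x in range(3)] for y in obs]
psi = [[math.exp(-0.5 * abs(a - b)) for b in range(3)] for a in range(3)]
print(belief_propagation_chain(phi, psi, T=3))
```

On a chain (or any tree) the messages converge to exact values; on the loopy grid of an image the same updates are applied as an approximation.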

Page 31

Illustration of BP

Page 32

BP Results

Page 33

Stereo As a Pixel-Labeling Problem

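Let P be a set of pixels and L be a label set. The goal is to find a labeling f that minimizes some energy. For stereo, the labels are disparities.

The classic form of the energy function is:

$$E(f) = \sum_{p \in P} D_p(f_p) + \sum_{\{p,q\} \in N} V_{p,q}(f_p, f_q)$$

where N is the set of neighboring pixel pairs.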

Page 34

Energy Function:

The energy function $D_p(f_p)$ measures how appropriate a label is for the pixel $p$ given the observed data. In stereo, this term corresponds to the match cost or likelihood.

The energy term $V_{p,q}(f_p, f_q)$ encodes the prior or smoothness constraint. In stereo, the so-called Potts model is used:

$$V_{p,q}(f_p, f_q) = \begin{cases} 0 & \text{if } f_p = f_q \\ \rho_I(\Delta I) & \text{otherwise} \end{cases}$$
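A minimal sketch of evaluating a pixel-labeling energy with a Potts smoothness term follows. It assumes a 4-connected grid and a constant penalty `K` for every label discontinuity, which is a simplification: the penalty in the slides may depend on the intensity difference between the two pixels. The function name and data layout are illustrative.

```python
def potts_energy(labels, data_cost, K):
    """Total energy of a labeling on a 4-connected grid.

    labels[r][c]        : label f_p assigned to pixel p = (r, c)
    data_cost[r][c][l]  : data term D_p(l) for assigning label l to pixel p
    K                   : constant Potts penalty for unequal neighboring labels
    """
    H, W = len(labels), len(labels[0])
    energy = 0
    for r in range(H):
        for c in range(W):
            energy += data_cost[r][c][labels[r][c]]
            # Count each neighboring pair once: right and down neighbors only.
            if c + 1 < W and labels[r][c] != labels[r][c + 1]:
                energy += K
            if r + 1 < H and labels[r][c] != labels[r + 1][c]:
                energy += K
    return energy


zero_cost = [[[0, 0] for _ in range(2)] for _ in range(2)]
print(potts_energy([[0, 0], [0, 1]], zero_cost, K=5))
```

The Potts term charges the same amount for any label change, however large, so it favors piecewise-constant disparity maps with sharp boundaries.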

Page 35

Two Energy Minimization Algorithms via Graph Cuts

Swap algorithm

Page 36

Two Energy Minimization Algorithms via Graph Cuts

Expansion algorithm

Page 37

Moves

Page 38

Graph Cuts Results

Graph Cuts Belief Propagation

Page 39

Ordering Constraint

If an object a appears to the left of an object b in the left image, then object a will also appear to the left of object b in the right image.

Ordering constraint… …and its failure

Page 40

Stereo Correspondences

[Figure: left scanline and right scanline.]

Match intensities sequentially between two scanlines

Page 41

Stereo Correspondences

[Figure: left scanline and right scanline with three matched segments, a left occlusion, and a right occlusion.]

Page 42

Search Over Correspondences

Three cases:

Sequential: cost of match

Left occluded: cost of no match

Right occluded: cost of no match

[Figure: search grid with the left scanline on one axis and the right scanline on the other; left occluded pixels and right occluded pixels are marked.]

Page 43

Standard 3-move Dynamic Programming for Stereo

Dynamic programming yields the optimal path through the grid. This is the best set of matches that satisfies the ordering constraint.

[Figure: grid with the left scanline and right scanline as axes and a path from Start to End; left occluded pixels and right occluded pixels are marked.]

Page 44

Dynamic Programming

Efficient algorithm for solving sequential decision (optimal path) problems.

[Figure: trellis with states $i = 1, 2, 3$ at each stage $t = 1, 2, 3, \ldots, T$.]

How many paths through this trellis? $3^T$

Page 45

Dynamic Programming

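[Figure: trellis columns $C_{t-1}$, $C_t$, $C_{t+1}$ with states $i = 1, 2, 3$; the edges into state 2 are labeled $\Pi_{12}$, $\Pi_{22}$, $\Pi_{32}$.]

Suppose cost can be decomposed into stages:

$\Pi_{ij}$ = cost of going from state $i$ to state $j$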

Page 46

Dynamic Programming

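[Figure: trellis columns $C_{t-1}$, $C_t$, $C_{t+1}$ with states $i = 1, 2, 3$; edges $\Pi_{12}$, $\Pi_{22}$, $\Pi_{32}$ lead into state $j = 2$.]

Principle of Optimality for an n-stage assignment problem:

$$C_t(j) = \min_i \left( \Pi_{ij} + C_{t-1}(i) \right)$$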

Page 47

Dynamic Programming

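[Figure: trellis columns $C_{t-1}$, $C_t$, $C_{t+1}$ with states $i = 1, 2, 3$ and state $j = 2$; the back-pointer shown is $b_t(2) = 2$.]

$$C_t(j) = \min_i \left( \Pi_{ij} + C_{t-1}(i) \right)$$

$$b_t(j) = \arg\min_i \left( \Pi_{ij} + C_{t-1}(i) \right)$$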
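The recursion for the stage costs and back-pointers can be sketched as a short Python function. It assumes the transition costs are given as one matrix per stage and that every state starts with zero cost; the function name and the example costs are illustrative.

```python
def min_cost_path(stage_costs, n_states):
    """Find the cheapest path through a trellis.

    stage_costs[t][i][j] is the cost of going from state i at stage t
    to state j at stage t + 1. Returns (total cost, list of states).
    """
    C = [0.0] * n_states            # best cost of reaching each state so far
    back = []                       # back[t][j]: best predecessor of state j
    for trans in stage_costs:
        new_C, b = [], []
        for j in range(n_states):
            i_best = min(range(n_states), key=lambda i: C[i] + trans[i][j])
            b.append(i_best)
            new_C.append(C[i_best] + trans[i_best][j])
        C = new_C
        back.append(b)
    # Start from the cheapest final state and follow the back-pointers.
    j = min(range(n_states), key=lambda s: C[s])
    total = C[j]
    path = [j]
    for b in reversed(back):
        j = b[j]
        path.append(j)
    path.reverse()
    return total, path


costs = [
    [[1, 5, 9], [4, 2, 8], [7, 6, 3]],   # stage 0 -> stage 1
    [[2, 9, 9], [9, 0, 9], [9, 9, 4]],   # stage 1 -> stage 2
]
print(min_cost_path(costs, n_states=3))
```

Each stage costs a number of operations proportional to the square of the number of states, so the total work grows linearly with the number of stages even though the number of paths grows exponentially.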

Page 48

Stereo Matching with Dynamic Programming

Pseudo-code describing how to calculate the optimal match

Page 49

Stereo Matching with Dynamic Programming

Pseudo-code describing how to reconstruct the optimal path
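A minimal sketch of one way to compute and then reconstruct such a match follows, using three moves per grid cell: match a left pixel with a right pixel, leave a left pixel unmatched, or leave a right pixel unmatched. The squared-difference match cost, the constant `occlusion_cost`, and the move names are assumptions for illustration, not the pseudo-code from the slides.

```python
def dp_scanline(left, right, occlusion_cost):
    """Match two scanlines under the ordering constraint.

    Returns (total cost, list of (left index, right index) matches).
    """
    n, m = len(left), len(right)
    INF = float("inf")
    C = [[INF] * (m + 1) for _ in range(n + 1)]
    move = [[None] * (m + 1) for _ in range(n + 1)]
    C[0][0] = 0.0
    # Forward pass: fill the cost grid with the cheapest of the three moves.
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                c = C[i - 1][j - 1] + (left[i - 1] - right[j - 1]) ** 2
                if c < C[i][j]:
                    C[i][j], move[i][j] = c, "match"
            if i > 0:
                c = C[i - 1][j] + occlusion_cost
                if c < C[i][j]:
                    C[i][j], move[i][j] = c, "left_only"
            if j > 0:
                c = C[i][j - 1] + occlusion_cost
                if c < C[i][j]:
                    C[i][j], move[i][j] = c, "right_only"
    # Backward pass: follow the stored moves from the end to the start.
    matches = []
    i, j = n, m
    while i > 0 or j > 0:
        if move[i][j] == "match":
            matches.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif move[i][j] == "left_only":
            i -= 1
        else:
            j -= 1
    matches.reverse()
    return C[n][m], matches


left_line = [10, 20, 30, 40, 50]
right_line = [20, 30, 40, 50, 60]
print(dp_scanline(left_line, right_line, occlusion_cost=5))
```

The disparity of a matched pair is the difference between its two indices; unmatched pixels on either scanline are the occlusions.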

Page 50

Results

Local errors may be propagated along a scan-line and no inter scan-line consistency is enforced.

Page 51

Assumption Behind Segmentation-based Stereo

Depth discontinuities tend to correlate well with color edges

Disparity variation within a segment is small

Approximate the scene with piecewise planar surfaces

Page 52

Segmentation-based stereo

A plane equation is fitted in each segment based on an initial disparity estimate obtained from SSD or correlation

Global matching criterion: if a depth map is good, warping the reference image to the other view according to this depth will render an image that matches the real view

Optimization by iterative neighborhood depth hypothesizing
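The plane-fitting step can be sketched as a least-squares fit of $d = ax + by + c$ to the initial disparities inside one segment. Plain least squares is an assumption here (the slides do not say which fitting method is used, and a robust fit may be preferable when the initial disparities contain outliers); the function names and sample data are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_disparity_plane(samples):
    """Least-squares fit of d = a*x + b*y + c to (x, y, d) samples."""
    n = float(len(samples))
    sx = sum(x for x, y, d in samples)
    sy = sum(y for x, y, d in samples)
    sd = sum(d for x, y, d in samples)
    sxx = sum(x * x for x, y, d in samples)
    syy = sum(y * y for x, y, d in samples)
    sxy = sum(x * y for x, y, d in samples)
    sxd = sum(x * d for x, y, d in samples)
    syd = sum(y * d for x, y, d in samples)
    # Normal equations M * (a, b, c)^T = rhs.
    M = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxd, syd, sd]
    det = det3(M)
    if abs(det) < 1e-12:
        raise ValueError("samples are degenerate (too few or collinear)")
    # Solve by Cramer's rule: replace one column at a time with rhs.
    params = []
    for k in range(3):
        Mk = [row[:] for row in M]
        for r in range(3):
            Mk[r][k] = rhs[r]
        params.append(det3(Mk) / det)
    return tuple(params)


segment = [(x, y, 0.5 * x - 0.25 * y + 3.0) for x in range(3) for y in range(3)]
print(fit_disparity_plane(segment))
```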

Page 53

Hypothesizing neighborhood depth

Correct depth is propagated to reduce fattening effect:

Page 54

Hypothesizing neighborhood depth

Background depth is hypothesized for unmatched region:

Page 55

Result

Page 56

Another Result