Geometry 3: Stereo Reconstruction Introduction to Computer Vision Ronen Basri Weizmann Institute of...

Geometry 3: Stereo Reconstruction

Introduction to Computer Vision

Ronen Basri

Weizmann Institute of Science

Material covered

• Pinhole camera model, perspective projection
• Two view geometry, general case:
  • Epipolar geometry, the essential matrix
  • Camera calibration, the fundamental matrix
• Two view geometry, degenerate cases
  • Homography (planes, camera rotation)
  • A taste of projective geometry
• Stereo vision: 3D reconstruction from two views
• Multi-view geometry, reconstruction through factorization

Summary of last lecture

Model | Homography | Perspective (calibrated) | Perspective (uncalibrated) | Orthographic
Form | $p' \cong Hp$ | $p'^T E p = 0$ | $p'^T F p = 0$ | $a x + b y + c x' + d y' + e = 0$
Properties | One-to-one (group) | Concentric epipolar lines | Concentric epipolar lines | Parallel epipolar lines
DOFs | 8(5) | 8(5) | 8(7) | 4
Eqs/pnt | 2 | 1 | 1 | 1
Minimal configuration | 4 | 5+ (8, linear) | 7+ (8, linear) | 4
Depth | No | Yes, up to scale | Yes, projective structure | Affine structure (third view required for Euclidean structure)

Camera rotation

• Images obtained by rotating the camera about its optical axis are related by a homography

• Verify that the homography does not depend on the scene depth
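The verification can be sketched as follows. This is a hedged reconstruction, not the slide's own formula: it assumes calibrated image coordinates, a scene point $P$ with depth $Z$ in the first camera, and a pure rotation $R$ between the views. The symbols $K$ (intrinsics), $Z$ and $Z'$ are introduced here for illustration.

```latex
\begin{aligned}
P' &= R P, \qquad p = \tfrac{1}{Z} P, \qquad p' = \tfrac{1}{Z'} P' \\
p' &= \tfrac{Z}{Z'}\, R\, p \;\cong\; R\, p
\end{aligned}
```

The depth enters only through the scalar $Z/Z'$, which is irrelevant in homogeneous coordinates, so the mapping is the same for every scene point. With an intrinsic matrix $K$ the same argument gives $H = K R K^{-1}$.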

Planar scene

• For a planar scene, the two images are related by a homography determined by the plane and the relative camera motion
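One common way to write the plane-induced homography is sketched below. The conventions are assumptions chosen for illustration and may differ from the slide: calibrated coordinates, camera motion $P' = RP + t$, and scene plane $n^T P = d$.

```latex
\begin{aligned}
P' &= R P + t = R P + t\,\frac{n^{T} P}{d} = \Big(R + \tfrac{1}{d}\, t\, n^{T}\Big) P \\
p' &\cong H p, \qquad H = R + \tfrac{1}{d}\, t\, n^{T}
\end{aligned}
```

The substitution $n^T P / d = 1$ holds only for points on the plane, which is why the relation is a homography for planar scenes and not in general.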

Epipolar lines

[Figure: camera centers O and O′ joined by the baseline; the epipolar plane meets each image plane in an epipolar line]

$p'^T E p = 0$

Rectification

• Rectification: rotation and scaling of each camera’s coordinate frame to make the epipolar lines horizontal and at equal heights, by bringing the two image planes parallel to the baseline

• Rectification is achieved by applying a homography to each of the two images

Rectification

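[Figure: the two image planes are warped by the homographies $H_l$ and $H_r$ so that they become parallel to the baseline between O and O′]

$q'^T H_l^{-T} E H_r^{-1} q = 0$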
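The rectified constraint follows from substituting the warped coordinates into the epipolar constraint. A short derivation, assuming $q = H_r p$ and $q' = H_l p'$ (this assignment of the two homographies to the two images is inferred from the formula and is not stated in the text):

```latex
\begin{aligned}
p &= H_r^{-1} q, \qquad p' = H_l^{-1} q' \\
0 &= p'^{T} E\, p = (H_l^{-1} q')^{T} E\, (H_r^{-1} q) = q'^{T} H_l^{-T} E\, H_r^{-1} q
\end{aligned}
```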

Cyclopean coordinates

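• In a rectified stereo rig with baseline of length $b$ and focal length $f$, we place the origin at the midpoint between the camera centers.

• A point $(X, Y, Z)$ is projected to:
  • Left image: $x_l = f (X + b/2) / Z$, $y_l = f Y / Z$
  • Right image: $x_r = f (X - b/2) / Z$, $y_r = f Y / Z$

• Cyclopean coordinates: $X = \frac{b (x_l + x_r)}{2 (x_l - x_r)}$, $Y = \frac{b (y_l + y_r)}{2 (x_l - x_r)}$, $Z = \frac{b f}{x_l - x_r}$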

Disparity

• Disparity is inversely proportional to depth
• Constant disparity ⇔ constant depth
• Larger baseline: more stable reconstruction of depth (but more occlusions, and correspondence is harder)

(Note that disparity is defined in a rectified rig in a cyclopean coordinate frame)
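A minimal sketch of depth recovery from disparity in a rectified rig with a cyclopean origin. The function names and the sign convention (left camera at $X = -b/2$, disparity $x_l - x_r > 0$) are assumptions made for this illustration, not taken from the slides.

```python
import math


def project_rectified(X, Y, Z, baseline, focal):
    """Project a 3D point (cyclopean frame) into the left and right images."""
    xl = focal * (X + baseline / 2.0) / Z
    xr = focal * (X - baseline / 2.0) / Z
    y = focal * Y / Z  # same row in both images after rectification
    return (xl, y), (xr, y)


def triangulate_rectified(left, right, baseline, focal):
    """Recover (X, Y, Z) from a pair of matched image points."""
    (xl, yl), (xr, yr) = left, right
    disparity = xl - xr
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the rig")
    X = baseline * (xl + xr) / (2.0 * disparity)
    Y = baseline * (yl + yr) / (2.0 * disparity)
    Z = baseline * focal / disparity  # depth is inversely proportional to disparity
    return X, Y, Z


left, right = project_rectified(0.3, -0.2, 4.0, baseline=0.5, focal=2.0)
X, Y, Z = triangulate_rectified(left, right, baseline=0.5, focal=2.0)
```

Because $Z = bf / (x_l - x_r)$, a fixed error in the measured disparity produces a smaller depth error when the baseline $b$ is larger, which is the stability claim above.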

The correspondence problem

• Stereo matching is ill-posed:
  • Matching ambiguity: different regions may look similar
  • Specular reflectance: multiple depth values

Random dot stereogram

• Depth is perceived from a pair of random dot images
• Stereo perception is based solely on local information (low level)

Moving random dots

Compared elements for correspondence

• Single pixel intensities
• Pixel color
• Small window, often using normalized correlation to offset gain
• Features and edges
• Mini segments
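Window matching with normalized correlation can be sketched as below. This is an illustrative sketch on a single rectified scanline; the function names, the window half-width, and the search range are hypothetical choices.

```python
import numpy as np


def ncc(a, b, eps=1e-8):
    """Normalized cross-correlation of two windows (invariant to gain and offset)."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))


def best_disparity(left_row, right_row, x, half=2, max_disp=8):
    """Find the disparity d maximizing NCC between the window around x in the
    left row and the window around x - d in the right row."""
    left_row = np.asarray(left_row, dtype=float)
    right_row = np.asarray(right_row, dtype=float)
    patch = left_row[x - half : x + half + 1]
    scores = []
    for d in range(max_disp + 1):
        xr = x - d
        if xr - half < 0:  # window would leave the image
            break
        scores.append(ncc(patch, right_row[xr - half : xr + half + 1]))
    return int(np.argmax(scores))


right = np.array([3, 7, 1, 9, 4, 8, 2, 6, 5, 0, 7, 3, 9, 1], dtype=float)
left = 2.0 * np.roll(right, 3) + 10.0  # shifted by 3 pixels, with gain 2 and offset 10
d = best_disparity(left, right, x=8)
```

Subtracting the window mean and dividing by the norms is what makes the score insensitive to the gain and offset applied to the left row in this example.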

Dynamic programming

• Each pair of epipolar lines is compared independently
• Local cost: sum of a unary term and a binary term
  • Unary term: cost of a single match
  • Binary term: cost of a change of disparity (occlusion)
• Analogous to string matching (‘diff’ in Unix)

String matching

• Swing → String
• Cost: #substitutions + #insertions + #deletions

[Figure: alignment grid with the letters S t r i n g along one axis and S w i n g along the other]
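The string-matching cost can be computed with the classic edit-distance recurrence. A minimal sketch, assuming unit cost for each substitution, insertion, and deletion (the function name is illustrative):

```python
def edit_distance(source, target):
    """Minimum number of substitutions, insertions and deletions
    needed to turn `source` into `target`."""
    n, m = len(source), len(target)
    # cost[i][j] = distance between source[:i] and target[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        cost[i][0] = i  # delete all i characters
    for j in range(m + 1):
        cost[0][j] = j  # insert all j characters
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitute = cost[i - 1][j - 1] + (source[i - 1] != target[j - 1])
            delete = cost[i - 1][j] + 1
            insert = cost[i][j - 1] + 1
            cost[i][j] = min(substitute, delete, insert)
    return cost[n][m]


print(edit_distance("Swing", "String"))  # 2: substitute w -> t, insert r
```

Diagonal steps in the table correspond to matching or substituting a character, and horizontal or vertical steps to insertions and deletions, which is the same grid structure used for stereo scanlines.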

Stereo with dynamic programming

• Shortest path in a grid
• Diagonals: constant disparity
• Moving along the diagonal: pay unary cost (cost of pixel match)
• Moving sideways: pay binary cost, i.e. disparity change (occlusion, right or left)
• Cost prefers fronto-parallel planes. Penalty is paid for tilted planes

Dynamic programming on a grid

• Complexity?
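A minimal sketch of scanline stereo as a shortest path on a grid. The unary cost (absolute intensity difference), the constant occlusion penalty, and the function name are assumptions made for this illustration. Filling the table for rows of lengths n and m takes O(nm) time in this sketch.

```python
import numpy as np


def scanline_dp(left, right, occlusion_cost=1.0):
    """Align two 1-D intensity rows; returns (total_cost, list of (i, j) matches)."""
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    n, m = len(left), len(right)
    cost = np.zeros((n + 1, m + 1))
    cost[:, 0] = occlusion_cost * np.arange(n + 1)
    cost[0, :] = occlusion_cost * np.arange(m + 1)
    # move: 0 = diagonal (match), 1 = skip a left pixel, 2 = skip a right pixel
    move = np.zeros((n + 1, m + 1), dtype=int)
    move[1:, 0] = 1
    move[0, 1:] = 2
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            options = (
                cost[i - 1, j - 1] + abs(left[i - 1] - right[j - 1]),  # unary cost
                cost[i - 1, j] + occlusion_cost,  # left pixel occluded
                cost[i, j - 1] + occlusion_cost,  # right pixel occluded
            )
            move[i, j] = int(np.argmin(options))
            cost[i, j] = options[move[i, j]]
    # Backtrack along the cheapest path to read off the matches.
    matches = []
    i, j = n, m
    while i > 0 or j > 0:
        if move[i, j] == 0:
            matches.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif move[i, j] == 1:
            i -= 1
        else:
            j -= 1
    matches.reverse()
    return float(cost[n, m]), matches


total, matches = scanline_dp([10, 10, 50, 90, 10], [10, 50, 90, 10, 10])
```

In the example the bright pixels (50, 90) are matched at a constant offset i - j = 1, and the path pays two occlusion penalties to absorb the shift.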

Probability interpretation: the Viterbi algorithm

• Markov chain
• States: discrete set of disparities
• Log probabilities: product → sum
• Maximum likelihood: minimize sum of negative logs
• Viterbi algorithm: equivalent to shortest path
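The equivalence between maximum likelihood on a Markov chain and a shortest path can be sketched as follows. The inputs are assumed to be negative log probabilities already; the array layout and the function name are illustrative.

```python
import numpy as np


def viterbi_min_cost(unary, pairwise):
    """unary: (T, S) negative log-likelihood of each state at each step.
    pairwise: (S, S) negative log transition probability, indexed [prev, cur].
    Returns (minimum total cost, best state sequence)."""
    unary = np.asarray(unary, dtype=float)
    pairwise = np.asarray(pairwise, dtype=float)
    T, S = unary.shape
    best = unary[0].copy()
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        total = best[:, None] + pairwise  # total[prev, cur]
        back[t] = np.argmin(total, axis=0)  # cheapest predecessor of each state
        best = total[back[t], np.arange(S)] + unary[t]
    path = [int(np.argmin(best))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    path.reverse()
    return float(best.min()), path


unary = [[0, 2], [2, 0], [0, 2]]
strong = [[0, 3], [3, 0]]      # expensive state changes
weak = [[0, 0.5], [0.5, 0]]    # cheap state changes
print(viterbi_min_cost(unary, strong))  # (2.0, [0, 0, 0])
print(viterbi_min_cost(unary, weak))    # (1.0, [0, 1, 0])
```

With a high transition cost the chain stays in one state despite the middle observation; with a low one it follows the data, which mirrors the trade-off between the unary and binary terms in stereo.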

Dynamic programming: pros and cons

• Advantages:
  • Simple, efficient
  • Achieves global optimum
  • Generally works well
• Disadvantages:
  • Works separately on each epipolar line, does not enforce smoothness across epipolar lines
  • Prefers fronto-parallel planes
  • Too local? (considers only immediate neighbors)

Markov random field

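• Graph $G = (\mathcal{V}, \mathcal{E})$. In our case the graph is a 4-connected grid representing one image

• States: disparity

• Minimize energy of the form $E(d) = \sum_{p \in \mathcal{V}} D_p(d_p) + \sum_{(p,q) \in \mathcal{E}} V_{pq}(d_p, d_q)$

• Interpreted as negative log probabilities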

Iterated conditional modes (ICM)

• Initialize states (= disparities) for every pixel
• Repeatedly update each pixel to the most likely disparity given the values assigned to its neighbors: $d_p \leftarrow \arg\min_{d} \left[ D_p(d) + \sum_{q \in N(p)} V_{pq}(d, d_q) \right]$
• Markov blanket: the state of a pixel only depends on the states of its immediate neighbors
• Similar to Gauss-Seidel iterations
• Slow convergence to an (often bad) local minimum
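A minimal ICM sketch on a 4-connected grid. The Potts pairwise term, the initialization from the unary costs, and the function name are assumptions made for this illustration.

```python
import numpy as np


def icm(unary, smooth_weight=1.0, sweeps=10):
    """unary: (H, W, L) data cost of each of L labels at each pixel.
    Pairwise term (assumed): Potts penalty smooth_weight * [label != neighbor]."""
    unary = np.asarray(unary, dtype=float)
    H, W, L = unary.shape
    labels = unary.argmin(axis=2)  # initialize from the data term alone
    candidates = np.arange(L)
    for _ in range(sweeps):
        changed = False
        for y in range(H):
            for x in range(W):
                cost = unary[y, x].copy()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < H and 0 <= nx < W:
                        # pay the penalty for every label that differs from this neighbor
                        cost += smooth_weight * (candidates != labels[ny, nx])
                best = int(cost.argmin())
                if best != labels[y, x]:
                    labels[y, x] = best
                    changed = True
        if not changed:  # local minimum reached
            break
    return labels


unary = np.zeros((3, 3, 2))
unary[:, :, 1] = 1.0          # every pixel prefers label 0 ...
unary[1, 1] = [1.0, 0.5]      # ... except the center, which weakly prefers label 1
labels = icm(unary)
```

Each update uses the neighbors' current labels immediately, as in Gauss-Seidel iterations. In the example the weak preference of the center pixel is overridden by its four neighbors.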

Graph cuts: expansion moves

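• Assume $D_p$ is non-negative and $V_{pq}$ is a metric: $V(\alpha, \beta) = 0 \Leftrightarrow \alpha = \beta$, $V(\alpha, \beta) = V(\beta, \alpha) \ge 0$, and $V(\alpha, \beta) \le V(\alpha, \gamma) + V(\gamma, \beta)$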

• We can apply more semi-global moves using minimal s-t cuts

• Converges faster to a better (local) minimum

α-Expansion

• In any one round, an expansion move allows each pixel to either
  • change its state to α, or
  • maintain its previous state

• Each round is implemented via max flow/min cut

• One iteration: apply expansion moves sequentially with all possible disparity values

• Repeat until convergence

α-Expansion

• Every round achieves a globally optimal solution over one expansion move
• Energy is monotonically non-increasing between rounds
• At convergence the energy is optimal with respect to all expansion moves, and within a scale factor of the global optimum: $E(\hat{d}) \le 2c\, E(d^*)$, where $c = \max_{\alpha \ne \beta} V(\alpha, \beta) / \min_{\alpha \ne \beta} V(\alpha, \beta)$

α-Expansion (1D example)

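[Figure: s-t graph for two neighboring pixels p and q with current states $d_p$ and $d_q$. Each cut corresponds to one outcome of the expansion move, and its cost is the sum of the severed edges:]

• Both switch to α: $D_p(\alpha) + D_q(\alpha)$, with $V_{pq}(\alpha, \alpha) = 0$
• Both keep their states: $D_p(d_p) + D_q(d_q)$. But what about $V_{pq}(d_p, d_q)$? An auxiliary node adds it: $D_p(d_p) + D_q(d_q) + V_{pq}(d_p, d_q)$
• p keeps its state, q switches: $D_p(d_p) + V_{pq}(d_p, \alpha) + D_q(\alpha)$
• p switches, q keeps its state: $D_q(d_q) + V_{pq}(\alpha, d_q) + D_p(\alpha)$
• A cut that severs both $V_{pq}(\alpha, d_q)$ and $V_{pq}(d_p, \alpha)$ instead of $V_{pq}(d_p, d_q)$: such a cut cannot be obtained due to the triangle inequality: $V_{pq}(d_p, d_q) \le V_{pq}(d_p, \alpha) + V_{pq}(\alpha, d_q)$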

Common metrics

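• Potts model: $V(\alpha, \beta) = K \cdot [\alpha \ne \beta]$

• Truncated $\ell_1$: $V(\alpha, \beta) = \min(|\alpha - \beta|, T)$

• Truncated squared difference is not a metric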
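The metric property can be checked numerically on a small label set. A sketch, assuming integer disparity labels; the function names, the constant `k`, and the truncation thresholds `t` are illustrative choices.

```python
from itertools import product


def potts(a, b, k=1.0):
    """Potts model: constant penalty for any change of label."""
    return k * (a != b)


def truncated_l1(a, b, t=2):
    """Absolute difference, truncated at t."""
    return min(abs(a - b), t)


def truncated_squared(a, b, t=4):
    """Squared difference, truncated at t."""
    return min((a - b) ** 2, t)


def is_metric(v, labels):
    """Check identity, symmetry and the triangle inequality over all label triples."""
    for a, b in product(labels, repeat=2):
        if v(a, b) != v(b, a) or (v(a, b) == 0) != (a == b):
            return False
    return all(v(a, c) <= v(a, b) + v(b, c) for a, b, c in product(labels, repeat=3))


labels = range(5)
print(is_metric(potts, labels))              # True
print(is_metric(truncated_l1, labels))       # True
print(is_metric(truncated_squared, labels))  # False: V(0,2) = 4 > V(0,1) + V(1,2) = 2
```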

Reconstruction with graph-cuts

[Figure panels: original image, graph-cut result, ground truth]

A different application: detect skyline

• Input: one image, oriented with sky above
• Objective: find the skyline in the image
• Graph: grid
• Two states: sky, ground
• Unary (data) term:
  • State = sky: low if blue, otherwise high
  • State = ground: high if blue, otherwise low
• Binary term for vertical connections:
  • If state(node) = sky then state(node above) = sky (infinity if not)
  • If state(node) = ground then state(node below) = ground
• Solve with expansion move. This is a two-state problem, so graph cut finds the global optimum in one expansion move
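When only the vertical ordering constraint is used, each column decouples and the optimal labeling is "sky above a split row, ground below it". The sketch below exploits that by trying every split row per column with cumulative sums. It is a simplified stand-in for this special case and does not implement graph cut; the function name and the cost arrays are hypothetical.

```python
import numpy as np


def skyline_rows(sky_cost, ground_cost):
    """sky_cost, ground_cost: (H, W) unary costs, row 0 at the top of the image.
    Returns, for each column, the number of top rows labeled sky (0..H)."""
    sky_cost = np.asarray(sky_cost, dtype=float)
    ground_cost = np.asarray(ground_cost, dtype=float)
    H, W = sky_cost.shape
    zeros = np.zeros((1, W))
    # sky_prefix[k] = cost of labeling rows [0, k) as sky
    sky_prefix = np.vstack([zeros, np.cumsum(sky_cost, axis=0)])
    # ground_suffix[k] = cost of labeling rows [k, H) as ground
    ground_suffix = np.vstack([np.cumsum(ground_cost[::-1], axis=0)[::-1], zeros])
    return np.argmin(sky_prefix + ground_suffix, axis=0)


# Toy "blueness" map: 1 = blue pixel, 0 = not blue.
blue = np.array([[1, 1],
                 [1, 0],
                 [0, 0],
                 [0, 0]], dtype=float)
rows = skyline_rows(sky_cost=1.0 - blue, ground_cost=blue)
print(rows.tolist())  # [2, 1]
```

Adding horizontal smoothness between neighboring columns couples the columns again, which is where a min-cut formulation becomes useful.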
