Zoltan Kato: Computer Vision Standard stereo setup
Transcript of Zoltan Kato: Computer Vision Standard stereo setup
6. 3D Reconstruction6. 3D Reconstruction
Computer VisionComputer Vision
ZoltanZoltan KatoKatohttp://www.inf.uhttp://www.inf.uhttp://www.inf.u---szeged.hu/~katoszeged.hu/~katoszeged.hu/~kato///
2
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Standard stereo setup• Image planes
of cameras are parallel.
• Focal points are at same height.
• Focal lengths same.
• epipolarlines are horizontal scan lines.
BfxxZ rl ,,, offunction a as for expression Derive
),,( ZYXP
lCy
x
z
f−
rCy
x
z
B
lp
rp
3
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
adapted fromD. Young
Disparity and depth
4
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Depth reconstruction
Cl Cr
P(X,Y,Z)
pl pr
B
Z
xl xr
f
Then given Z, we can compute X and Y.B is the stereo baselined measures the difference in retinal position between corresponding points
Disparity: d=xl-xr
rl
lr
xxBfZ
ZB
fZxxB
−=
=−−+
dBfZ =
5
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Stereo disparity example
Left image Right imagecourtesy of D. Young
Cameras have parallel optical axes, separated by horizontal translation (approximately)
6
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Stereo disparity example
Left and right edge images, superimposed
courtesy of D. Young
Disparity getssmaller with increasing depth
7
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
What If…?
),,(1 ZYXP =1Oy
x
z
f−
2Oy
x
z
B
1p
2p
),,(1 ZYXP =1Oy
x
z
1p
f−2Oy
x
z
2p
8
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Two view (stereo) geometry
Geometric relations between two views are fullydescribed by recovered 3X3 matrix F
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
• Brings two views into standard stereo setup• reproject image
planes onto common plane parallel to line between optical centers
• Notice, only focal point of camera really matters
(Seitz)
Planar rectification
10
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
e
e
He001
=
Image pair rectification• simplify stereo matching by warping the images• Apply projective transformation so that epipolar lines
correspond to horizontal scanlines• map epipole e to (1,0,0)• try to minimize image distortion• problem when epipole in (or close to) the image
11
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
• The key aspect of epipolar geometry is its linear constraint on where a point in one image can be in the other
• By matching pixels (or features) along epipolar lines and measuring the disparity between them, we can construct a depth map (scene point depth is inversely proportional to disparity)
View 1 View 2 Computed depth mapcourtesy of P. Debevec
Extracting Structure
12
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Stereo matching• What should be matched?
• Pixels?• Collections of pixels?• Edges?• Objects?
• Correlation-based algorithms• Produce a DENSE set of correspondences
• Feature-based algorithms• Produce a SPARSE set of correspondences
13
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Stereo matching: constraints• Photometric constraint• Epipolar constraint (through rectification) • Ordering constraint• Uniqueness constraint• Disparity limit• Disparity continuity constraint
14
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Photometric constraint• Photometric constraint
• Assumption: Same world point has same intensity in both images
• Issues: noise, specularity,…• matching standalone pixels won’t work
• match windows centered around individual pixels.
==??
ff gg
15
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Image Normalization• Even when the cameras are identical models,
there can be differences in gain and sensitivity.• The cameras do not see exactly the same
surfaces, so their overall light levels can differ.• For these reasons and more, it is a good idea to
normalize the pixels in each window:
pixel Normalized ),(),(ˆ
magnitude Window )],([
pixel Average ),(
),(
),(),(
2),(
),(),(),(
1
yxW
yxWvuyxW
yxWvuyxW
m
mm
mm
IIIyxIyxI
vuII
vuII
−−
=
=
=
∑
∑
∈
∈
16
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Left Right
LwRw
m
m
Lw
Lw
row 1
row 2
row 3
m
m
m
“Unwrap”image to form vector, using raster scan order
Each window is a vector in an m2 dimensional vector space. Normalization makes them unit length.
Images as Vectors
17
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Image Metrics
Lw
)(dwR
2
),(),(
2SSD
)(
)],(ˆ),(ˆ[)(
dww
vduIvuIdC
RL
yxWvuRL
m
−=
−−= ∑∈
(Normalized) Sum of Squared Differences (SSD)
Normalized Correlation (NC)
θcos)(
),(ˆ),(ˆ)(),(),(
NC
=⋅=
−= ∑∈
dww
vduIvuIdC
RL
yxWvuRL
m
18
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Correspondence via correlation
Rectified images
Left Right
scanline
SSD error
disparity
)(maxarg)(minarg 2* dwwdwwd RLdRLd ⋅=−=
19
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
W = 3 W = 20
Better results with adaptive window• T. Kanade and M. Okutomi, A Stereo Matching
Algorithm with an Adaptive Window: Theory and Experiment,, Proc. International Conference on Robotics and Automation, 1991.
• D. Scharstein and R. Szeliski. Stereo matching with nonlinear diffusion. International Journal of Computer Vision, 28(2):155-174, July 1998
(Seitz)
Window size
20
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Epipolar constraint
• Match pixels along corresponding epipolar lines (1D search)
21
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
… …Left scanline Right scanline
Match
Match
MatchOcclusion Disocclusion
Ordering constraint• order of points in two images is usually the same
surfacesurface sliceslice
22
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Three cases:• Sequential – cost of match• Occluded – cost of no match• Disoccluded – cost of no match
Left scanline
Right scanline
Occluded Pixels
Disoccluded Pixels
Ordering constraint
23
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Surface as a path
11 22 33 4,54,5 66 11 2,32,3 44 55 66
2211 33 4,54,5 6611
2,32,3
4455
66
surfacesurface sliceslice surfacesurface as a as a pathpath
occlusion right
occlusion left
24
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Uniqueness constraint• In an image pair each pixel has at most one
corresponding pixel• In general one corresponding pixel• In case of occlusion/disocclusion there is none
25
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Disparity constraint• Range of expected scene depths guides maximum
possible disparity.• Search only a segment of epipolar line
surfacesurface sliceslicesurfacesurface as a as a pathpath
bounding box
dispa
rityba
nd
use reconsructed features to determine bounding box
constantdisparitysurfaces
26
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Disparity continuity constraint• Assume piecewise continuous surface• piecewise continuous disparity
• In general disparity changes continuously• discontinuities at occluding boundaries
• Unfortunately, this makes the problem 2D again.• Solved with a host of graph algorithms, Markov
Random Fields, Belief Propagation, ….
27
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Stereo matching• Constraints
• epipolar• ordering• uniqueness• disparity limit• disparity gradient limit
• Trade-off• Matching cost (data)• Discontinuities (prior)
Optimal path(dynamic programming )
Similarity measure(SSD or NC)
(Cox et al. CVGIP’96; Koch’96; Falkenhagen´97; Van Meerbergen,Vergauwen,Pollefeys,VanGool IJCV‘02)
28
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Disparity map
image I(x,y) image I´(x´,y´)Disparity map D(x,y)
(x´,y´)=(x+D(x,y),y)
29
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Feature-based Methods• Conceptually very similar to Correlation-based
methods, but:• They only search for correspondences of a
sparse set of image features.• Correspondences are given by the most similar
feature pairs.• Similarity measure must be adapted to the type of
feature used.
30
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Features commonly used• Corners
• Similarity measured in terms of • surrounding gray values (SSD, NC) • location
• Edges, Lines• Similarity measured in terms of:
• orientation• contrast• coordinates of edge or line’s midpoint• length of line
31
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Example: Comparing lines• ll and lr: line lengths • θl and θr: line orientations• (xl,yl) and (xr,yr): midpoints• cl and cr: average contrast along lines• ωl ωθ ωm ωc : weights controlling influence
The more similar the lines, the larger S is!32
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
LEFT IMAGE
corner line
structure
Correspondence By Features
33
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Correspondence By FeaturesRIGHT IMAGE
corner line
structure
• Search in the right image… the disparity (dx, dy) is the displacement when the similarity measure is maximum
34
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Computing Correspondence• Which method is better?
• Edges tend to fail in dense texture (outdoors)• Correlation tends to fail in smooth featureless
areas• Both methods fail for smooth surfaces
35
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Three geometric questions1. Correspondence geometry: Given an image
point x in the first view, how does this constrain the position of the corresponding point x’ in the second image?
2. Camera geometry (motion): Given a set of corresponding image points xi ↔x’i, i=1,…,n, what are the camera matrixes P and P’ for the two views?
3. Scene geometry (structure): Given corresponding image points xi ↔x’i and cameras P, P’, what is the position of (their pre-image) X in space?
36
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Terminology• Point correspondences: xi↔x‘i
• Original scene (pre-image): Xi
• Projective, affine, similarity reconstruction =• reconstruction that is identical to original up to
projective, affine, similarity transformation
• Literature: Metric and Euclidean reconstruction = similarity reconstruction
37
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
• Recall that canonical camera matrices P, P’ can be computed from fundamental matrix F• E.g. P=[I|0] and P’=[[e’]xF|e’],
• Triangulation: Back-projection of rays from image points x, x’ to 3-D point of intersection X such that x=PX and x’=P’X
from Hartley & Zisserman
Computing Structure
38
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
( )( )iii XHPHPXx S-1S==
λt]t'RR'|K[RR'λ0t'R'-R' t]|K[RPH TT
TT1-
S +−=
=
Reconstruction ambiguity: similarity
39
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
( )( )iii XHPHPXx P-1
P==
Reconstruction ambiguity: projective
40
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
The projective reconstruction theoremIf a set of point correspondences in two views determine thefundamental matrix uniquely, then the scene and camerasmay be reconstructed from these correspondences alone, and any two such reconstructions from thesecorrespondences are projectively equivalent
( ) ( )( )
( ) 2i2i1i11i-1
11i2
ii12-1
12-1
12
2i221i11i
XPxXPHXHPHXP0FxFxHXXHPPHPP
X,'P,PX,'P,Pxx
====
=′==′=′=
′↔ :except
i
⇒ along same ray of P2, idem for P‘2two possibilities: X2i=HX1i, or points along baseline
key result: allows reconstruction from pair of uncalibrated images
41
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
• Errors in • points x, x’ & F such that x’TFx=0 or
• X such that x=PX and x’=P’X• This means that rays are skew — they don’t
intersect
from Hartley & Zisserman
Triangulation: Issues
42
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
PXx = XP'x'=
Point reconstruction
43
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
XP'x'=PXx =
0XP'x =×( ) ( )( ) ( )
( ) ( ) 0XpXp0XpXp0XpXp
1T2T
2T3T
1T3T
=−
=−
=−
yxyx
−−−−
=2T3T
1T3T
2T3T
1T3T
p'p''p'p''pppp
A
yxyx
0AX =
homogeneous 1X =
)1,,,( ZYX
inhomogeneous
invariance?
e)(HX)(AH-1 =
algebraic error yes, constraint no (except for affine)
Linear triangulation
44
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
• Reconstruct matches in projective frame byminimizing the reprojectionerror
• Non-iterative optimal solution (seeHartley&Sturm,CVIU´97)
( ) ( )2222
11 XP,xXP,x dd +
Geometric error
45
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
• Pick that exactly satisfies camera geometry so that and , and which minimizes
• Can use as error function for nonlinear minimization on two views• Polynomial solution exists
from Hartley & Zisserman
Optimal Projective-Invariant Triangulation: Reprojection Error
46
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Covariance of Structure Recovery• Bigger angle between rays Less uncertainty
• Can’t triangulate points on baseline (epipoles) because rays intersect along entire length
from Hartley & Zisserman
47
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Line reconstruction
LX0LXP'l'
PlL
T
T
lineon point afor =
=
doesn‘t work for epipolar plane
48
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
from Hartley & Zisserman
Reconstructions related by a 4 x 4 projection H
Two views from which F and hence P, P’ are computed
Projective Reconstruction Ambiguity
49
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Less ambiguity
Properties of transformations (2-D)from Hartley & Zisserman
Metric/
Hierarchy of transformations
50
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Stratified Reconstruction• Idea: Try to upgrade reconstruction to differ from
the truth by a less ambiguous transformation • Use additional constraints imposed by:
• Scene: • known 3-D points (no 4 coplanar) Euclidean reconstruction• Identify parallel, orthogonal lines in scene
• Camera calibration: Known K, K’ Metric/similarity reconstruction
• Camera motion: Known R, t
51
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Projective Affine Upgrade• Identify plane at infinity π∞ (in the “true”
coordinate frame, π∞ = (0, 0, 0, 1)T) • E.g., intersection points of three sets of parallel
lines define a plane• E.g., if one camera is known to be affine
52
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Projective Affine Upgrade• Then apply 4 x 4 transformation:
• This is the 3-D analog of affine image rectification via the line at infinity l∞
• Things that can be computed/constructed with only affine ambiguity:• Midpoint of two points• Centroid of group of points• Lines parallel to other lines, planes
=
∞Tπ0|I
H ( )iX,P'P,
( ) ( )TT 1,0,0,0,,,π aDCBA=∞
( )TT 1,0,0,0πH- =∞
53
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Translational motion• points at infinity are fixed for a pure translation
• reconstruction of xi↔ x’i is on π∞
×× == ]e'[]e[F0]|[IP =
]e'|[IP'=
54
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Scene constraints• Parallel lines
• parallel lines intersect at infinity• reconstruction of corresponding vanishing point yields
point on plane at infinity• 3 sets of parallel lines allow to uniquely determine π∞
• Remarks: • in presence of noise determining the intersection of
parallel lines is a delicate problem• obtaining vanishing point in one image can be sufficient
55
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Affine Reconstruction Ambiguity
Affine reconstructions
56
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Affine Metric Upgrade• Identify absolute conic Ω∞ on π∞ via image
of absolute conic (IAC) ω• From scene
• E.g., orthogonal lines• From known camera calibration
• Completely constrained: ω = K-T K-1
• Partially constrained:• Zero skew• Square pixels
• Same camera took all images (e.g. moving camera)
57
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
The infinity homography
∞
∞
( )T0,X~X =∞
X~M'x' =∞X~Mx =∞
m]|[MP = ]m'|[M'P'=-1MM'H =∞
m]Ma|[MA10aAm]|[MP +=
=
-1-1MAAM'H =∞
Unchanged underaffine transformations
0]|[IP = e]|[HP ∞=affine reconstruction:
58
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Affine to metric• Identify absolute conic Ω∞
• transform so that• then projective
transformation relating original and reconstruction is a similarity transformation
• in practice, find ω (image of absolute conic) • ω back-projects to cone that
intersects π∞ in Ω∞
ω*
Ω*
projection
constraints
∞∞ =++Ω on π ,0: 222 ZYX
59
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Affine to metric• Given
• possible transformation from affine to metric is
m]|[MP = ω
= 10
0AH-1
( ) 1TT ωMMAA −=
(cholesky factorisation)
60
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Same camera for all images• Same intrinsics same image of the
absolute conic• For example: moving camera
• Given sufficient images there is in general only one conic that projects to the same image in all images, i.e. the absolute conic• This approach is called self-calibration
• Transfer of IAC:-1-TωHHω' ∞∞=
61
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Direct metric reconstruction using ω
• Approach 1• calibrated reconstruction
• Approach 2• compute projective reconstruction• back-project ω from both images
• intersection defines Ω∞ and its support plane π∞
KKKω -1-T ⇒=
62
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Direct metric reconstruction using ground truth
• Use control points XEi with know coordinates to go directly from projective to metric• Need 5 points (no 4 coplanar)• 2 linear eq. in H-1 per view, 3 for
two views)
ii HXXE = Eii XPHx -1=
63
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Metric Reconstruction Example
Only overall scale ambiguity remains—i.e., what are units of length?
from Hartley & Zisserman
Original views Synthesized views of reconstruction
64
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
Metric Reconstruction• Objective: Given two uncalibrated images compute (PM,P‘M,XMi) (i.e.
within similarity of original scene and cameras)
• Algorithm1. Compute projective reconstruction (P,P‘,Xi)
1. Compute F from xi↔x‘I2. Compute P, P‘ from F3. Triangulate Xi from xi↔x‘I
2. Rectify reconstruction from projective to metric1. Direct method: compute H from control points
2. Stratified method:1. Affine reconstruction: compute π∞
2. Metric reconstruction: compute IAC ω
ii HXXE = -1M PHP = -1
M HPP ′=′ ii HXXM =
=
∞π0|I H
= 10
0AH-1
( ) 1TT ωMMAA −=
65
ZoltanZoltan Kato: Computer VisionKato: Computer Vision
metricπ∞Ω∞
F,H∞
ω,ω’
Points correspondences and internal camera calibration
affineπ∞F,H∞
point correspondences including vanishing points
projectiveFpoint correspondences
reconstruction ambiguity
3-space objects
View relations and projective objects
Image information provided
Reconstruction summary