Inverse Depth Monocular SLAM -...
Transcript of Inverse Depth Monocular SLAM -...
Javier CiveraJ.M.M. Montiel A.J. Davison
Inverse Depth Monocular SLAMJ. Civera, A.J. Davison and J.M.Montiel
Javier CiveraJ.M.M. Montiel A.J. Davison
Problem Statement
• Sequential simultaneous sensor location and map building at frame rate, 30Hz.
• Camera moves freely in 3D, 6dof camera motion
• Outdoors real scenes contains close and distant, even at infinity, features
• Main contribution codifying scene points with its inverse depth:
1. Deals with low parallax cases2. Deals with both distant and
close points3. Map features are initialised
just from one image
Javier CiveraJ.M.M. Montiel A.J. Davison
Camera Geometry: Pure Bearing-only Sensor
X[X,Y,Z]
camera,optical center
O
f
• Camera detects rays• A ray is defined by the optical
center O and the observed point• The images is used as the method
to determined the detected ray• Depth is not detected
x[x,y]x
yY
Z
X
Earth thefrom far light years at gallaxies of image
910
Javier CiveraJ.M.M. Montiel A.J. Davison
Points at Infinity
• projective cameras do observe points at infinity• parallel lines meet at infinity, a projective camera does observe this
intersection point as vanishing point• we intend to code and exploit this points at infinity in the monocular
SLAM problem
Javier CiveraJ.M.M. Montiel A.J. Davison
Parallax
• No parallax geometries– Camera rotation– Camera observing a scene
plane• Low parallax cases
– Distant features compared with camera translation
– Initial feature observation
Javier CiveraJ.M.M. Montiel A.J. Davison
State of the Art
• SLAM, initially proposed by Smith and Cheesman, 1986, widespread usage in robotics for multisensor fusion [Casetellanos 1999], [Ferder 1999] [Thrun 2004]
– Sequential approach– Ability to close loops, identifying features previously observed as
reobserved. Complexity is linked to the scene not to the number of observations processed.
• SLAM used for computer vision, [Castellanos 2000], [Davison 1998] combined with odometry
• Monocular SLAM vision [Davison 2003]– Camera "following the laws of mechanics" motion model– Vision as the only sensor, no odometry.– Synergic usage of vision geometry and vision photometric map– Low parallax points avoided:
» Points represented as XYZ, only works with points close to the camera
» Delayed initialization• Monocular SLAM Fast SLAM, inverse depth delayed
initialization [Eade 2006]
SFM
,com
pute
r vis
ion
met
hods
Javier CiveraJ.M.M. Montiel A.J. Davison
State of the Art
• Photogrammetric bundle adjustment, 60's– Normally only close points
• Computer vision geometry Hartley & Zisserman [Hartley 2003]– Robust statistics– Matching between several shots enforcing a coherence with a
projective camera model– Applied to individual shots, to sequences with varying camera
parameters– Applied for robot navigation [Nister 2003, Mouragnon 2006]– Not sequential– Wide-baseline performance– Routine usage of points at infinity
• Model selection problem [Torr ICCV98]– No parallax, homography model– Parallax epipolar geometry model– Increasing the frame rate, the interframe motion closes to a
homography
SLA
M m
etho
ds
Javier CiveraJ.M.M. Montiel A.J. Davison
Camera Geometry: Pure Bearing-only Sensor
X[X,Y,Z]
camera,optical center
O
f
• Camera detects rays• A ray is defined by the optical
center O and the observed point• The images is used as the method
to determined the detected ray• Depth is not detected
x[x,y]x
yY
Z
X
Earth thefrom far light years at gallaxies of image
910
Javier CiveraJ.M.M. Montiel A.J. Davison
Gaussianity of the Inverse Depth Coding
o5.0=α
o0.1=σ o0.1=σ
-2 -1 0 10
100
200
300
400
500
600
700
800
900
x
z
x z representation
-1 -0.5 0 0.50
0.002
0.004
0.006
0.008
0.01
0.012
0.014
0.016
0.018
ρ =1/
d
θ(deg)
ρθrepresentation
Simulation: computing depth of a point from 2 views at known camera locations• Non Gaussian in XZ• Gaussian in 1/d, theta
Javier CiveraJ.M.M. Montiel A.J. Davison
Camera Motion Priors
W
CC
CC
Constant Velocity Motion Modelsmooth camera motionimpulse acceleration noise
( )WCWC qr ,
WvC
Cω
⎟⎟⎟⎟⎟
⎠
⎞
⎜⎜⎜⎜⎜
⎝
⎛
=
C
W
WC
WC
v
ωVqr
x
Javier CiveraJ.M.M. Montiel A.J. Davison
C
i
i
i
zyx
Wr−⎟⎟⎟
⎠
⎞
⎜⎜⎜
⎝
⎛
( )⎟⎟⎟⎟
⎠
⎞
⎜⎜⎜⎜
⎝
⎛
+⎟⎟⎟
⎠
⎞
⎜⎜⎜
⎝
⎛−⎟⎟⎟
⎠
⎞
⎜⎜⎜
⎝
⎛=
⎟⎟⎟
⎠
⎞
⎜⎜⎜
⎝
⎛= ii
WC
i
i
i
iCW
z
y
xC
zyx
hhh
φθρ ,mrRh
Measurement equation
Scene Point Coding in Inverse Depth. Measurement Equation
W
WCr
ii
d=ρ1
⎟⎟⎟
⎠
⎞
⎜⎜⎜
⎝
⎛
i
i
i
zyx ( )ii φθ ,m
Inverse dept point coding
z
y
y hh
dfvv −= 0
Ch
0z
x
x hh
dfuu −=
Bearing only camera measurement
C
( )WCWC qr ,
Javier CiveraJ.M.M. Montiel A.J. Davison
Scene Point Coding in Inverse Depth. Measurement Equation
W
C
( )ii φθ ,m
WCr
Ch parallaxangle
( )WCWC qr ,
C
i
i
i
zyx
Wr−⎟⎟⎟
⎠
⎞
⎜⎜⎜
⎝
⎛
( )iii
i
i
i
zyx
φθρ
,1 m+⎟⎟⎟
⎠
⎞
⎜⎜⎜
⎝
⎛
scene pointαii
d=ρ1
⎟⎟⎟
⎠
⎞
⎜⎜⎜
⎝
⎛
i
i
i
zyx
( )
( )⎟⎟⎟⎟
⎠
⎞
⎜⎜⎜⎜
⎝
⎛
+⎟⎟⎟
⎠
⎞
⎜⎜⎜
⎝
⎛−⎟⎟⎟
⎠
⎞
⎜⎜⎜
⎝
⎛
=⎟⎟⎟
⎠
⎞
⎜⎜⎜
⎝
⎛−+
⎟⎟⎟
⎠
⎞
⎜⎜⎜
⎝
⎛=
⎟⎟⎟
⎠
⎞
⎜⎜⎜
⎝
⎛=
−=−=
iiWC
i
i
i
iCW
WCii
ii
i
iCW
z
y
xC
z
y
yz
x
x
zyx
zyx
hhh
hh
dfvv
hh
dfuu
φθρ
φθρ
,
,
00
1
mrR
rmRh
( )
zero togoes , baseline, low locations, camera close
zero togoesparallax ,0 point,distant
parallax ,
observed is
feature thefirst time theof those, , ,
WC
i
i
i
i
WC
i
i
i
i
ii
i
i
i
zyx
zyx
zyx
r
r
m
−⎟⎟⎟
⎠
⎞
⎜⎜⎜
⎝
⎛
⇒→
⎟⎟⎟
⎠
⎞
⎜⎜⎜
⎝
⎛−⎟⎟⎟
⎠
⎞
⎜⎜⎜
⎝
⎛
⎟⎟⎟
⎠
⎞
⎜⎜⎜
⎝
⎛
ρ
ρ
φθ
Javier CiveraJ.M.M. Montiel A.J. Davison
Scene Point Coding in Inverse Depth. Measurement Equation
W
C
( )ii φθ ,m
WCr
Ch parallaxangle
( )WCWC qr ,
C
i
i
i
zyx
Wr−⎟⎟⎟
⎠
⎞
⎜⎜⎜
⎝
⎛
( )iii
i
i
i
zyx
φθρ
,1 m+⎟⎟⎟
⎠
⎞
⎜⎜⎜
⎝
⎛
scene pointαii
d=ρ1
⎟⎟⎟
⎠
⎞
⎜⎜⎜
⎝
⎛
i
i
i
zyx
( )
( )( )
:observed isinfinity at point only parallax, low
,
,
00
iiCWC
iiWC
i
i
i
iCWC
z
y
yz
x
x
zyx
hh
dfvv
hh
dfuu
φθ
φθρ
mRh
mrRh
≈
⎟⎟⎟⎟
⎠
⎞
⎜⎜⎜⎜
⎝
⎛
+⎟⎟⎟
⎠
⎞
⎜⎜⎜
⎝
⎛−⎟⎟⎟
⎠
⎞
⎜⎜⎜
⎝
⎛=
−=−=
Javier CiveraJ.M.M. Montiel A.J. Davison
EKF sequential processing
storedpatch
Estimation at step k-1
Prediction at step k
Update at step k
stored patch is searched in allacceptance region for the feature prediction
Javier CiveraJ.M.M. Montiel A.J. Davison
[ ]infinite including and fromregion a covers
2,2- interval at the th
so dinitialize is covarinace its and at nobservatio featurefirst thefrom dinitialize are
covariance ingcorrespond theand ,,,,nobservatioan just from observed are points New
min
0
d
zyx
ρρ
ρ
σρσρ
σρρ
φθ
+
Inverse Depth Feature Initialization Priors
00 21σρ +
ρσρ 210
min
+=d
0
1ρ
⎟⎟⎟
⎠
⎞
⎜⎜⎜
⎝
⎛
i
i
i
zyx
( )ii φθ ,m
ρσρ 20 −
∞
0ρ
Javier CiveraJ.M.M. Montiel A.J. Davison
Inverse Depth Feature Initialization
( )vu,
estimatelocation map and cameracurrent
ˆ
ˆ
ˆˆˆ
ˆ1
kk
nCkk
Wkk
WCkk
WCkk
v Py
y
Vqr
x⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡
⎟⎟⎟⎟⎟⎟
⎠
⎞
⎜⎜⎜⎜⎜⎜
⎝
⎛
= M
ω
0ρρ =i
Javier CiveraJ.M.M. Montiel A.J. Davison
Dimensional Monocular SLAM
0
00
a
,,
,
ρ
ω
α
σρσσσσ
V
tzi
Δσ,z
( )TnCWWCWC yyvqr L1,,,, ωMonocular SLAM
Processing
camera location
scenemap
Calibrated image measurements
Frame Rate
Camera Motion Priors
Intial depth Prior
Javier CiveraJ.M.M. Montiel A.J. Davison
Inverse Depth Estimation History
200 300 400 500 600 700 800-0.03
-0.02
-0.01
0
0.01
0.02
0.03
0 100 200 300 400 500 600 700 800 900-1.5
-1
-0.5
0
0.5
1
1.5
2
200 300 400 500 600 700 800-0.02
-0.01
0
0.01
0.02
0.03
0.04
0.05
0.06
0.07
0 100 200 300 400 500 600 700 800 900-1.5
-1
-0.5
0
0.5
1
1.5
near feature (3), eventually excludes infinite from the acceptance region
distant feature (11), infinite always included in the acceptance region11 3
113
11 3
113
(b)
11 3
11
3
(a)
11
Javier CiveraJ.M.M. Montiel A.J. Davison
200 400 6000
0.2
0.4
0.6
0.8
1rx
frame number
met
ers
200 400 6000
0.2
0.4
0.6
0.8
1ry
frame number
met
ers
200 400 6000
0.2
0.4
0.6
0.8
1rz
frame number
met
ers
200 400 6000
0.5
1
1.5
2ψ (rotx)
frame numberde
g
200 400 6000
0.5
1
1.5
2θ (roty)
frame number
deg
200 400 6000
0.5
1
1.5
2φ (rotz)
frame number
deg
Loop Closing
Javier CiveraJ.M.M. Montiel A.J. Davison
Linearity Index
Javier CiveraJ.M.M. Montiel A.J. Davison
Inverse depth linearityanalysis
linearitypoor L ,1cos0
tion initializaat
d ↑≈⇒≈ ααlinearitypoor
L ,1cos0 tion initializaat
d ↑≈⇒≈ αα
linearitypoor L ,1cos0
tion initializaat
d ↑≈⇒≈ αα
estimation whole thealong eperformanc goodlinearity
but ,cos1 gathering,parallax after
linearity L , 0cos10
tion initializaat
ρ
ρ
σα
αα
−
↓≈−⇒≈
Javier CiveraJ.M.M. Montiel A.J. Davison
Inverse depth to XYZ conversion• Inverse depth good performance along the whole estimation process• Inverse depth coding needs 6 parameters• XYZ coding good performance for reduced depth uncertainty• So, it is not mandatory to switch from inverse depth to XYZ, but
computing overhead can be reduce• Switching criteria based on the linearity index
Javier CiveraJ.M.M. Montiel A.J. Davison
Inverse depth to XYZ conversion threshold
Javier CiveraJ.M.M. Montiel A.J. Davison
0 100 200 300 400 500 600 7000
200
400
600
dim
ensi
on
state vector dimension
0 100 200 300 400 500 600 70070
80
90
100
% re
duct
ion
% dimension reduction for code swithching
0 100 200 300 400 500 600 7000
50
100
# m
ap 3
D p
oint
s
# map 3D points
swithching inverse depth to XYZ
all map features in inverse depth
total # 3D points
total # 3D points inverse depth
total # 3D points in XYZ
Switching evolution
Javier CiveraJ.M.M. Montiel A.J. Davison
Switching evolution
(a)
(c)
(b)
(d)
Inverse depth coding
Depth coding
Javier CiveraJ.M.M. Montiel A.J. Davison
Bibliografía
Davison, AJ , Reid, ID, Molton, ND , Stasse, O :” MonoSLAM: Real-time single camera SLAM” IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 29 (6): 1052-1067 JUN 2007
J.M.M. Montiel, Javier Civera y Andrew J. Davison: “Unified Inverse DepthParametrization for Monocular SLAM”. Robotics: Science and Systems Conference2006.
Javier Civera , Andrew J. Davison and J.M.M. Montiel,: “Inverse Depth to DepthConversion for Monocular SLAM”. IEEE Int Conf on Robotics and Automation Rome, April 2007.
Javier Civera, Andrew J. Davison, J. M. M. Montiel. Dimensionless Monocular SLAM. 3rd Iberian Conference on Pattern Recognition and Image Analysis (IbPRIA), Girona, 2007
J.M.M Montiel home page: http://webdiis.unizar.es/~josemari/
Anderw Davison home page http://www.doc.ic.ac.uk/~ajd/