Inverse Depth Monocular SLAM -...

Javier CiveraJ.M.M. Montiel A.J. Davison

Inverse Depth Monocular SLAMJ. Civera, A.J. Davison and J.M.Montiel


Problem Statement

• Sequential simultaneous sensor location and map building at frame rate, 30Hz.

• Camera moves freely in 3D, 6dof camera motion

• Outdoors real scenes contains close and distant, even at infinity, features

• Main contribution codifying scene points with its inverse depth:

1. Deals with low parallax cases2. Deals with both distant and

close points3. Map features are initialised

just from one image


Camera Geometry: Pure Bearing-only Sensor

X[X,Y,Z]

camera,optical center

O

f

• Camera detects rays• A ray is defined by the optical

center O and the observed point• The images is used as the method

to determined the detected ray• Depth is not detected

x[x,y]x

yY

Z

X

Earth thefrom far light years at gallaxies of image

910


Points at Infinity

• projective cameras do observe points at infinity• parallel lines meet at infinity, a projective camera does observe this

intersection point as vanishing point• we intend to code and exploit this points at infinity in the monocular

SLAM problem


Parallax

• No parallax geometries– Camera rotation– Camera observing a scene

plane• Low parallax cases

– Distant features compared with camera translation

– Initial feature observation


State of the Art

• SLAM, initially proposed by Smith and Cheesman, 1986, widespread usage in robotics for multisensor fusion [Casetellanos 1999], [Ferder 1999] [Thrun 2004]

– Sequential approach– Ability to close loops, identifying features previously observed as

reobserved. Complexity is linked to the scene not to the number of observations processed.

• SLAM used for computer vision, [Castellanos 2000], [Davison 1998] combined with odometry

• Monocular SLAM vision [Davison 2003]– Camera "following the laws of mechanics" motion model– Vision as the only sensor, no odometry.– Synergic usage of vision geometry and vision photometric map– Low parallax points avoided:

» Points represented as XYZ, only works with points close to the camera

» Delayed initialization• Monocular SLAM Fast SLAM, inverse depth delayed

initialization [Eade 2006]

SFM

,com

pute

r vis

ion

met

hods


State of the Art

• Photogrammetric bundle adjustment, 60's– Normally only close points

• Computer vision geometry Hartley & Zisserman [Hartley 2003]– Robust statistics– Matching between several shots enforcing a coherence with a

projective camera model– Applied to individual shots, to sequences with varying camera

parameters– Applied for robot navigation [Nister 2003, Mouragnon 2006]– Not sequential– Wide-baseline performance– Routine usage of points at infinity

• Model selection problem [Torr ICCV98]– No parallax, homography model– Parallax epipolar geometry model– Increasing the frame rate, the interframe motion closes to a

homography

SLA

M m

etho

ds


Camera Geometry: Pure Bearing-only Sensor

X[X,Y,Z]

camera,optical center

O

f

• Camera detects rays• A ray is defined by the optical

center O and the observed point• The images is used as the method

to determined the detected ray• Depth is not detected

x[x,y]x

yY

Z

X

Earth thefrom far light years at gallaxies of image

910


Gaussianity of the Inverse Depth Coding

o5.0=α

o0.1=σ o0.1=σ

-2 -1 0 10

100

200

300

400

500

600

700

800

900

x

z

x z representation

-1 -0.5 0 0.50

0.002

0.004

0.006

0.008

0.01

0.012

0.014

0.016

0.018

ρ =1/

d

θ(deg)

ρθrepresentation

Simulation: computing depth of a point from 2 views at known camera locations• Non Gaussian in XZ• Gaussian in 1/d, theta


Camera Motion Priors

W

CC

CC

Constant Velocity Motion Modelsmooth camera motionimpulse acceleration noise

( )WCWC qr ,

WvC

Cω

⎟⎟⎟⎟⎟

⎠

⎞

⎜⎜⎜⎜⎜

⎝

⎛

=

C

W

WC

WC

v

ωVqr

x


C

i

i

i

zyx

Wr−⎟⎟⎟

⎠

⎞

⎜⎜⎜

⎝

⎛

( )⎟⎟⎟⎟

⎠

⎞

⎜⎜⎜⎜

⎝

⎛

+⎟⎟⎟

⎠

⎞

⎜⎜⎜

⎝

⎛−⎟⎟⎟

⎠

⎞

⎜⎜⎜

⎝

⎛=

⎟⎟⎟

⎠

⎞

⎜⎜⎜

⎝

⎛= ii

WC

i

i

i

iCW

z

y

xC

zyx

hhh

φθρ ,mrRh

Measurement equation

Scene Point Coding in Inverse Depth. Measurement Equation

W

WCr

ii

d=ρ1

⎟⎟⎟

⎠

⎞

⎜⎜⎜

⎝

⎛

i

i

i

zyx ( )ii φθ ,m

Inverse dept point coding

z

y

y hh

dfvv −= 0

Ch

0z

x

x hh

dfuu −=

Bearing only camera measurement

C

( )WCWC qr ,



W

C

( )ii φθ ,m

WCr

Ch parallaxangle

( )WCWC qr ,

C

i

i

i

zyx

Wr−⎟⎟⎟

⎠

⎞

⎜⎜⎜

⎝

⎛

( )iii

i

i

i

zyx

φθρ

,1 m+⎟⎟⎟

⎠

⎞

⎜⎜⎜

⎝

⎛

scene pointαii

d=ρ1

⎟⎟⎟

⎠

⎞

⎜⎜⎜

⎝

⎛

i

i

i

zyx

( )

( )⎟⎟⎟⎟

⎠

⎞

⎜⎜⎜⎜

⎝

⎛

+⎟⎟⎟

⎠

⎞

⎜⎜⎜

⎝

⎛−⎟⎟⎟

⎠

⎞

⎜⎜⎜

⎝

⎛

=⎟⎟⎟

⎠

⎞

⎜⎜⎜

⎝

⎛−+

⎟⎟⎟

⎠

⎞

⎜⎜⎜

⎝

⎛=

⎟⎟⎟

⎠

⎞

⎜⎜⎜

⎝

⎛=

−=−=

iiWC

i

i

i

iCW

WCii

ii

i

iCW

z

y

xC

z

y

yz

x

x

zyx

zyx

hhh

hh

dfvv

hh

dfuu

φθρ

φθρ

,

,

00

1

mrR

rmRh

( )

zero togoes , baseline, low locations, camera close

zero togoesparallax ,0 point,distant

parallax ,

observed is

feature thefirst time theof those, , ,

WC

i

i

i

i

WC

i

i

i

i

ii

i

i

i

zyx

zyx

zyx

r

r

m

−⎟⎟⎟

⎠

⎞

⎜⎜⎜

⎝

⎛

⇒→

⎟⎟⎟

⎠

⎞

⎜⎜⎜

⎝

⎛−⎟⎟⎟

⎠

⎞

⎜⎜⎜

⎝

⎛

⎟⎟⎟

⎠

⎞

⎜⎜⎜

⎝

⎛

ρ

ρ

φθ



W

C

( )ii φθ ,m

WCr

Ch parallaxangle

( )WCWC qr ,

C

i

i

i

zyx

Wr−⎟⎟⎟

⎠

⎞

⎜⎜⎜

⎝

⎛

( )iii

i

i

i

zyx

φθρ

,1 m+⎟⎟⎟

⎠

⎞

⎜⎜⎜

⎝

⎛

scene pointαii

d=ρ1

⎟⎟⎟

⎠

⎞

⎜⎜⎜

⎝

⎛

i

i

i

zyx

( )

( )( )

:observed isinfinity at point only parallax, low

,

,

00

iiCWC

iiWC

i

i

i

iCWC

z

y

yz

x

x

zyx

hh

dfvv

hh

dfuu

φθ

φθρ

mRh

mrRh

≈

⎟⎟⎟⎟

⎠

⎞

⎜⎜⎜⎜

⎝

⎛

+⎟⎟⎟

⎠

⎞

⎜⎜⎜

⎝

⎛−⎟⎟⎟

⎠

⎞

⎜⎜⎜

⎝

⎛=

−=−=


EKF sequential processing

storedpatch

Estimation at step k-1

Prediction at step k

Update at step k

stored patch is searched in allacceptance region for the feature prediction


[ ]infinite including and fromregion a covers

2,2- interval at the th

so dinitialize is covarinace its and at nobservatio featurefirst thefrom dinitialize are

covariance ingcorrespond theand ,,,,nobservatioan just from observed are points New

min

0

d

zyx

ρρ

ρ

σρσρ

σρρ

φθ

+

Inverse Depth Feature Initialization Priors

00 21σρ +

ρσρ 210

min

+=d

0

1ρ

⎟⎟⎟

⎠

⎞

⎜⎜⎜

⎝

⎛

i

i

i

zyx

( )ii φθ ,m

ρσρ 20 −

∞

0ρ


Inverse Depth Feature Initialization

( )vu,

estimatelocation map and cameracurrent

ˆ

ˆ

ˆˆˆ

ˆ1

kk

nCkk

Wkk

WCkk

WCkk

v Py

y

Vqr

x⎥⎥⎥

⎦

⎤

⎢⎢⎢

⎣

⎡

⎟⎟⎟⎟⎟⎟

⎠

⎞

⎜⎜⎜⎜⎜⎜

⎝

⎛

= M

ω

0ρρ =i


Dimensional Monocular SLAM

0

00

a

,,

,

ρ

ω

α

σρσσσσ

V

tzi

Δσ,z

( )TnCWWCWC yyvqr L1,,,, ωMonocular SLAM

Processing

camera location

scenemap

Calibrated image measurements

Frame Rate

Camera Motion Priors

Intial depth Prior


Inverse Depth Estimation History

200 300 400 500 600 700 800-0.03

-0.02

-0.01

0

0.01

0.02

0.03

0 100 200 300 400 500 600 700 800 900-1.5

-1

-0.5

0

0.5

1

1.5

2

200 300 400 500 600 700 800-0.02

-0.01

0

0.01

0.02

0.03

0.04

0.05

0.06

0.07

0 100 200 300 400 500 600 700 800 900-1.5

-1

-0.5

0

0.5

1

1.5

near feature (3), eventually excludes infinite from the acceptance region

distant feature (11), infinite always included in the acceptance region11 3

113

11 3

113

(b)

11 3

11

3

(a)

11


200 400 6000

0.2

0.4

0.6

0.8

1rx

frame number

met

ers

200 400 6000

0.2

0.4

0.6

0.8

1ry

frame number

met

ers

200 400 6000

0.2

0.4

0.6

0.8

1rz

frame number

met

ers

200 400 6000

0.5

1

1.5

2ψ (rotx)

frame numberde

g

200 400 6000

0.5

1

1.5

2θ (roty)

frame number

deg

200 400 6000

0.5

1

1.5

2φ (rotz)

frame number

deg

Loop Closing


Linearity Index


Inverse depth linearityanalysis

linearitypoor L ,1cos0

tion initializaat

d ↑≈⇒≈ ααlinearitypoor

L ,1cos0 tion initializaat

d ↑≈⇒≈ αα

linearitypoor L ,1cos0

tion initializaat

d ↑≈⇒≈ αα

estimation whole thealong eperformanc goodlinearity

but ,cos1 gathering,parallax after

linearity L , 0cos10

tion initializaat

ρ

ρ

σα

αα

−

↓≈−⇒≈


Inverse depth to XYZ conversion• Inverse depth good performance along the whole estimation process• Inverse depth coding needs 6 parameters• XYZ coding good performance for reduced depth uncertainty• So, it is not mandatory to switch from inverse depth to XYZ, but

computing overhead can be reduce• Switching criteria based on the linearity index


Inverse depth to XYZ conversion threshold


0 100 200 300 400 500 600 7000

200

400

600

dim

ensi

on

state vector dimension

0 100 200 300 400 500 600 70070

80

90

100

% re

duct

ion

% dimension reduction for code swithching

0 100 200 300 400 500 600 7000

50

100

# m

ap 3

D p

oint

s

# map 3D points

swithching inverse depth to XYZ

all map features in inverse depth

total # 3D points

total # 3D points inverse depth

total # 3D points in XYZ

Switching evolution


Switching evolution

(a)

(c)

(b)

(d)

Inverse depth coding

Depth coding


Bibliografía

Davison, AJ , Reid, ID, Molton, ND , Stasse, O :” MonoSLAM: Real-time single camera SLAM” IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 29 (6): 1052-1067 JUN 2007

J.M.M. Montiel, Javier Civera y Andrew J. Davison: “Unified Inverse DepthParametrization for Monocular SLAM”. Robotics: Science and Systems Conference2006.

Javier Civera , Andrew J. Davison and J.M.M. Montiel,: “Inverse Depth to DepthConversion for Monocular SLAM”. IEEE Int Conf on Robotics and Automation Rome, April 2007.

Javier Civera, Andrew J. Davison, J. M. M. Montiel. Dimensionless Monocular SLAM. 3rd Iberian Conference on Pattern Recognition and Image Analysis (IbPRIA), Girona, 2007

J.M.M Montiel home page: http://webdiis.unizar.es/~josemari/

Anderw Davison home page http://www.doc.ic.ac.uk/~ajd/

Inverse Depth Monocular SLAM -...

Documents

Transcript of Inverse Depth Monocular SLAM -...