Vision Algorithms for Mobile Robotics
Lecture 13: Visual Inertial Fusion
Davide Scaramuzza
http://rpg.ifi.uzh.ch


Lab Exercise – This afternoon

Bundle Adjustment

Visual Inertial Odometry (VIO)

References

• Scaramuzza, Zhang, Visual-Inertial Odometry of Aerial Robots, Encyclopedia of Robotics, Springer, 2019. PDF

• Huang, Visual-inertial navigation: A concise review, IEEE International Conference on Robotics and Automation (ICRA), 2019. PDF

Outline

• What is an IMU and why do we need it?

• IMU model

• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods

• Camera-IMU extrinsic calibration and synchronization

What is an IMU

• Inertial Measurement Unit
  • Gyroscope: measures angular velocity
  • Accelerometer: measures linear acceleration

Mechanical Gyroscope

Mechanical Accelerometer

What is an IMU

• Different categories:
  • Mechanical ($100,000–1M)
  • Optical ($20,000–100k)
  • MEMS (from $1 (phones) to $1,000; the higher cost because they have a microchip running a Kalman filter)

• For small mobile robots & drones, MEMS IMUs are mostly used:
  • Cheap
  • Power efficient
  • Light weight and solid state

MEMS Accelerometer

A spring-like structure connects the device to a seismic mass vibrating in a capacitive divider. The capacitive divider converts the displacement of the seismic mass into an electric signal. Damping is created by the gas sealed in the device.


MEMS Gyroscopes

• MEMS gyroscopes measure the Coriolis forces acting on MEMS vibrating structures (tuning forks, vibrating wheels, or resonant solids).

• Their working principle is similar to the halteres of a fly.

• Halteres are small structures of some two-winged insects, such as flies. They are flapped rapidly and function as gyroscopes, informing the insect about rotation of the body during flight.

Why do we need an IMU

• Monocular vision is scale-ambiguous.

• Pure vision is not robust enough:
  • Low texture
  • High Dynamic Range (HDR)
  • High-speed motion

Robustness is a critical issue. Tesla accident: "The autopilot sensors on the Model S failed to distinguish a white tractor-trailer crossing the highway against a bright sky." [The Guardian]

High Dynamic Range

Motion blur

Why is IMU alone not enough

• Pure IMU integration leads to large drift (especially with cheap IMUs).
  • We will see this later mathematically.
  • Intuition:
    • Double integration of the acceleration to get the position (e.g., in the 1D scenario, x(t₁) = x₀ + v₀(t₁ − t₀) + ∫∫_{t₀}^{t₁} a(t) dt²): if there is a constant bias in the acceleration, the position error grows proportionally to t².
    • Integration of the angular velocity to get the orientation: if there is a bias in the angular velocity, the orientation error grows proportionally to the time t.
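The t² growth of the position error under a constant accelerometer bias can be checked numerically. A minimal sketch (all numbers are made-up, illustrative values):

```python
import numpy as np

bias = 0.05      # m/s^2, constant accelerometer bias (assumed value)
dt = 0.005       # s, i.e. a 200 Hz IMU
T = 10.0         # s, total integration time

t = np.arange(0.0, T, dt)
a_measured = np.zeros_like(t) + bias   # true acceleration is zero; only the bias remains

# Double integration (Euler): acceleration -> velocity -> position
v = np.cumsum(a_measured) * dt
x = np.cumsum(v) * dt

# The position error grows quadratically: x(T) ≈ 0.5 * bias * T^2
print(x[-1])               # ≈ 2.5 m of drift after only 10 s
print(0.5 * bias * T**2)   # the analytic prediction
```

Even a small 0.05 m/s² bias produces meters of drift within seconds, which is why IMU-only dead-reckoning is not viable.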

Table from Vectornav, one of the best IMU companies. Errors were computed assuming the device at rest. https://www.vectornav.com/resources/ins-error-budget

(Table columns: Automotive / Smartphone & Drone accelerometers)

Why visual inertial fusion?

• IMU and vision are complementary.

• What cameras and IMU have in common: both estimate the pose incrementally (known as dead-reckoning), which suffers from drift over time. Solution: loop detection and loop closure.

Cameras:
✓ Precise in slow motion
✓ More informative than an IMU (measure light energy from the environment)
✗ Limited output rate (~100 Hz)
✗ Scale ambiguity in monocular setup
✗ Lack of robustness to HDR and high speed

IMU:
✓ Insensitive to motion blur, HDR, texture
✓ Can predict next feature position
✓ High output rate (~1,000 Hz)
✓ The higher the acceleration, the more accurate the IMU (due to higher signal-to-noise ratio)
✗ Large relative uncertainty at low acceleration/angular velocity
✗ Ambiguity in gravity acceleration (the IMU measures the total acceleration)

Outline

• What is an IMU and why do we need it?

• IMU model

• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods

• Camera-IMU extrinsic calibration and synchronization

IMU Measurement Model

• Measures the angular velocity ω̃_B(t) and the acceleration ã_B(t) in the body frame B. The raw IMU measurements (denoted with a tilde) are corrupted by biases and noise; the true ω (in the body frame) and the true a (in the world frame) are the quantities to estimate:

ω̃_B(t) = ω_B(t) + b^g(t) + n^g(t)

ã_B(t) = R_BW(t) (a_W(t) − g) + b^a(t) + n^a(t)

Notation:

• The superscript g stands for gyroscope and a for accelerometer.

• R_BW is the rotation of the World frame W with respect to the Body frame B.

• The gravity g is expressed in the World frame.

• Biases b and noise n are expressed in the body frame.
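The measurement model above is easy to simulate. A hedged sketch, with made-up noise levels and a hypothetical `imu_measurement` helper:

```python
import numpy as np

rng = np.random.default_rng(0)
g_W = np.array([0.0, 0.0, -9.81])   # gravity in the world frame

def imu_measurement(R_BW, a_W, w_B, b_a, b_g, sigma_a=0.02, sigma_g=0.002):
    """Generate raw IMU readings from the true motion (measurement model above).

    R_BW: world-to-body rotation (3x3); a_W: true acceleration in the world frame;
    w_B: true angular velocity in the body frame. Noise sigmas are assumed values.
    """
    a_meas = R_BW @ (a_W - g_W) + b_a + sigma_a * rng.standard_normal(3)
    w_meas = w_B + b_g + sigma_g * rng.standard_normal(3)
    return a_meas, w_meas

# A device at rest with identity orientation: the accelerometer reads ≈ -g,
# i.e. about +9.81 m/s^2 along z (it measures the total, gravity-included, force).
a_m, w_m = imu_measurement(np.eye(3), np.zeros(3), np.zeros(3),
                           np.zeros(3), np.zeros(3))
```

Note the "at rest" output: the accelerometer does not read zero, which is exactly the gravity ambiguity listed among the IMU drawbacks.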

IMU Noise Model

• Additive Gaussian white noise: n^g(t), n^a(t)

• Bias: b^g(t), b^a(t), modeled as a random walk: ḃ(t) = σ_b w(t), with w(t) ~ N(0, 1)

• The gyroscope and accelerometer biases are considered slowly varying "constants". Their temporal fluctuation is modeled by saying that the derivative of the bias is a zero-mean Gaussian noise with standard deviation σ_b.

• Some facts about IMU biases:
  • They change with temperature and with mechanical and atmospheric pressure.
  • They may change every time the IMU is started.
  • Good news: they can be estimated (see later).

Trawny, Roumeliotis, Indirect Kalman filter for 3D attitude estimation, Technical Report, University of Minnesota, 2005. PDF
More info on the noise model: https://github.com/ethz-asl/kalibr/wiki/IMU-Noise-Model
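The bias random walk (the derivative of the bias is white noise) can be simulated by accumulating scaled Gaussian increments. A sketch with an assumed random-walk density `sigma_b`:

```python
import numpy as np

rng = np.random.default_rng(1)
sigma_b = 1e-4     # bias random-walk density (assumed value)
dt = 0.005         # 200 Hz IMU
n_steps = 200_000  # about 17 minutes of data

# Discretization of db/dt = sigma_b * w(t):
#   b[k+1] = b[k] + sigma_b * sqrt(dt) * w_k,   w_k ~ N(0, 1)
b = np.zeros(n_steps)
b[1:] = np.cumsum(sigma_b * np.sqrt(dt) * rng.standard_normal(n_steps - 1))

# The standard deviation of the bias grows like sigma_b * sqrt(t):
# the bias wanders slowly, which is why it can be tracked and estimated online.
print(b.std())
```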

IMU Integration Model

• The IMU Integration Model computes the position, orientation, and velocity of the IMU in the world frame. To do this, we must first compute the acceleration a(t) in the world frame from the measured one ã(t) in the body frame (see Slide 13):

a(t) = R_WB(t) (ã(t) − b(t)) + g

• The position p_k at time t_k can then be predicted from the position p_{k−1} at time t_{k−1} by integrating all the inertial measurements ã_j, ω̃_j within that time interval:

p_k = p_{k−1} + v_{k−1} (t_k − t_{k−1}) + ∫∫_{t_{k−1}}^{t_k} ( R_WB(t) (ã(t) − b(t)) + g ) dt²

• A similar expression can be obtained to predict the velocity v_k and the orientation R_WB of the IMU in the world frame, as functions of both ã_j and ω̃_j.
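A discrete one-step version of this integration can be sketched as follows. This is a first-order Euler sketch only; real implementations use quaternions or the SO(3) exponential and higher-order integration:

```python
import numpy as np

g_W = np.array([0.0, 0.0, -9.81])   # gravity in the world frame

def skew(w):
    """3x3 skew-symmetric matrix of a 3-vector."""
    return np.array([[0, -w[2], w[1]], [w[2], 0, -w[0]], [-w[1], w[0], 0]])

def integrate_imu(p, v, R_WB, a_meas, w_meas, b_a, b_g, dt):
    """One Euler step of the IMU integration model (illustrative sketch)."""
    a_W = R_WB @ (a_meas - b_a) + g_W          # bias-corrected body accel -> world frame
    p_new = p + v * dt + 0.5 * a_W * dt**2     # integrate position
    v_new = v + a_W * dt                       # integrate velocity
    R_new = R_WB @ (np.eye(3) + skew(w_meas - b_g) * dt)  # first-order rotation update
    return p_new, v_new, R_new

# Device at rest, identity orientation, perfect measurements:
# the accelerometer reads -g in the body frame, and the state stays put.
p, v, R = np.zeros(3), np.zeros(3), np.eye(3)
p, v, R = integrate_imu(p, v, R, -g_W, np.zeros(3), np.zeros(3), np.zeros(3), 0.005)
```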

IMU Integration Model

For convenience, the IMU Integration Model is normally written as

(p_k, q_k, v_k) = f( (p_{k−1}, q_{k−1}, v_{k−1}), u ),   or, more compactly,   x_k = f(x_{k−1}, u)

where

• x = (p, q, v) represents the IMU state, comprising position, orientation, and velocity.

• q is the IMU orientation R_WB (usually represented using quaternions).

• u = (ã_j, ω̃_j) are the accelerometer and gyroscope measurements in the time interval [t_{k−1}, t_k].

Outline

• What is an IMU and why do we need it?

• IMU model

• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods

• Camera-IMU extrinsic calibration and synchronization

Visual Inertial Odometry

Different paradigms exist:

• Loosely coupled
  • Treats VO and IMU as two separate (not coupled) black boxes.
  • Each black box estimates pose and velocity, from visual data (up to a scale) and inertial data (absolute scale).
  • Easy to implement.
  • Inaccurate; should not be used.

• Tightly coupled
  • Makes use of the raw sensors' measurements:
    • 2D features
    • IMU readings
  • More accurate.
  • More implementation effort.

In this lecture, we will only see tightly coupled approaches.

The Loosely Coupled Approach

(Block diagram: images → Feature tracking → 2D features → VO → position (up to a scale) & orientation; IMU measurements → IMU Integration → position, orientation, velocity; both outputs → Fusion → refined position, orientation, velocity)

The Tightly Coupled Approach

(Block diagram: images → Feature tracking → 2D features; 2D features + IMU measurements → Fusion → refined position, orientation, velocity)

Filtering Visual Inertial Formulation

• System states:

  Tightly coupled: X = [ p(t), q(t), v(t), b_a(t), b_g(t), L_1, L_2, …, L_k ]

  Loosely coupled: X = [ p(t), q(t), v(t), b_a(t), b_g(t) ]

Corke, Lobo, Dias, An Introduction to Inertial and Visual Sensing, International Journal of Robotics Research (IJRR), 2017. PDF

Outline

• What is an IMU and why do we need it?

• IMU model

• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods

• Camera-IMU extrinsic calibration and synchronization

Closed-form Solution (1D case)

• From the camera, we can only get the absolute position x up to an unknown scale factor s, thus

x = s x̃

where x̃ is the relative motion estimated by the camera.

• From the IMU:

x = x₀ + v₀ (t₁ − t₀) + ∫∫_{t₀}^{t₁} a(t) dt²

• By equating them:

s x̃ = x₀ + v₀ (t₁ − t₀) + ∫∫_{t₀}^{t₁} a(t) dt²

• As shown in [Martinelli'14], if we assume to know x₀ (usually we set it to 0), then, for 6DOF, both s and v₀ can be determined in closed form from a single feature observation and 3 views.

Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF

Closed-form Solution (1D case)

With x₀ = 0 and views at t₀, t₁, t₂:

s x̃₁ = v₀ (t₁ − t₀) + ∫∫_{t₀}^{t₁} a(t) dt²

s x̃₂ = v₀ (t₂ − t₀) + ∫∫_{t₀}^{t₂} a(t) dt²

which can be rearranged as a linear system in the unknowns s and v₀:

[ x̃₁   (t₀ − t₁) ] [ s  ]   [ ∫∫_{t₀}^{t₁} a(t) dt² ]
[ x̃₂   (t₀ − t₂) ] [ v₀ ] = [ ∫∫_{t₀}^{t₂} a(t) dt² ]

Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF
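The 2×2 linear system above can be solved directly. A toy sketch with made-up values: we simulate the camera's up-to-scale measurements from a known trajectory and recover s and v₀ exactly:

```python
import numpy as np

# Illustrative ground truth (all values are made up)
t0, t1, t2 = 0.0, 0.5, 1.0
s_true, v0_true = 2.0, 0.3
a_const = 1.0                       # constant true acceleration, for simplicity

def double_integral(t):             # closed form of ∫∫_{t0}^{t} a dt² for constant a
    return 0.5 * a_const * (t - t0)**2

# True metric displacements (with x0 = 0), then the up-to-scale camera readings
x1 = v0_true * (t1 - t0) + double_integral(t1)
x2 = v0_true * (t2 - t0) + double_integral(t2)
x1_tilde, x2_tilde = x1 / s_true, x2 / s_true

# Linear system  [x̃1 (t0-t1); x̃2 (t0-t2)] [s; v0] = [∫∫a dt²; ∫∫a dt²]
A = np.array([[x1_tilde, t0 - t1],
              [x2_tilde, t0 - t2]])
b = np.array([double_integral(t1), double_integral(t2)])
s_est, v0_est = np.linalg.solve(A, b)
print(s_est, v0_est)   # recovers s = 2.0 and v0 = 0.3
```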

Closed-form Solution (general case)

• Considers N feature observations and the full 6DOF case.

• Can be used to initialize filters and non-linear optimizers (which always need an initialization point) [3].

• More complex to derive than the 1D case, but it also reaches a linear system of equations A X = S that can be solved using the pseudoinverse.

X is the vector of unknowns:
• Absolute scale s
• Initial velocity v₀
• 3D point distances (with respect to the first camera)
• Direction of the gravity vector
• Biases

A and S contain 2D feature coordinates, acceleration, and angular velocity measurements.

[1] Martinelli, Vision and IMU data fusion: Closed-form solutions for attitude, speed, absolute scale, and bias determination, IEEE Transactions on Robotics (T-RO), 2012. PDF
[2] Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF
[3] Kaiser, Martinelli, Fontana, Scaramuzza, Simultaneous state initialization and gyroscope bias calibration in visual inertial aided navigation, IEEE Robotics and Automation Letters (RA-L), 2017. PDF

Outline

• What is an IMU and why do we need it?

• IMU model

• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods

• Camera-IMU extrinsic calibration and synchronization

Non-linear Optimization Methods

VIO is solved as a non-linear least-squares optimization problem over the states, the landmarks, and the biases:

{X, L, b_a, b_g} = argmin_{X, L, b_a, b_g}  Σ_{k=1}^{N} ‖f(x_{k−1}, u) − x_k‖²_{Λ_k}  +  Σ_{k=1}^{N} Σ_{i=1}^{M} ‖π(x_k, L_i) − z_{i,k}‖²_{Σ_{i,k}}

(first term: IMU residuals; second term: reprojection residuals, i.e., the Bundle Adjustment term)

[1] Jung, Taylor, Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results, International Conference on Computer Vision and Pattern Recognition (CVPR), 2001. PDF
[2] Strelow, Singh, Motion estimation from image and inertial measurements, International Journal of Robotics Research (IJRR), 2004. PDF

Non-linear Optimization Methods

VIO is solved as a non-linear least-squares optimization problem:

{X, L, b_a, b_g} = argmin_{X, L, b_a, b_g}  Σ_{k=1}^{N} ‖f(x_{k−1}, u) − x_k‖²_{Λ_k}  +  Σ_{k=1}^{N} Σ_{i=1}^{M} ‖π(x_k, L_i) − z_{i,k}‖²_{Σ_{i,k}}

(first term: IMU residuals; second term: reprojection residuals, i.e., the Bundle Adjustment term)

where

• X = {x₁, …, x_N}: set of robot state estimates x_k (position, velocity, orientation) at frame times k

• L = {L₁, …, L_M}: 3D landmarks

• f(x_{k−1}, u): pose prediction update from the integration of the IMU measurements u = (ã_j, ω̃_j)

• π(x_k, L_i): measurement update from the projection of landmark L_i onto camera frame I_k

• z_{i,k}: observed features

• Λ_k: state covariance from the IMU integration f(x_{k−1}, u)

• Σ_{i,k}: covariance of the noisy 2D feature measurements

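The structure of this cost function can be illustrated on a toy 1D example where both residual types are linear, so the problem reduces to a weighted linear least-squares. All values below are made up; a real VIO problem is non-linear and solved iteratively:

```python
import numpy as np

rng = np.random.default_rng(2)
x_true = np.array([0.0, 1.0, 2.1, 3.0])    # true 1D positions at frames k
l_true = 10.0                               # one 1D landmark
u = np.diff(x_true) + 0.05 * rng.standard_normal(3)    # noisy IMU increments
z = (l_true - x_true) + 0.10 * rng.standard_normal(4)  # noisy camera "projections"
s_imu, s_cam = 0.05, 0.10                   # residual weights (sqrt of covariances)

# Unknowns theta = [x_0, x_1, x_2, x_3, l]; build the weighted system A theta = rhs
rows, rhs = [], []
for k in range(1, 4):            # IMU residuals: (x_{k-1} + u_k) - x_k
    r = np.zeros(5); r[k - 1], r[k] = 1 / s_imu, -1 / s_imu
    rows.append(r); rhs.append(-u[k - 1] / s_imu)
for k in range(4):               # camera residuals: (l - x_k) - z_k
    r = np.zeros(5); r[4], r[k] = 1 / s_cam, -1 / s_cam
    rows.append(r); rhs.append(z[k] / s_cam)
r = np.zeros(5); r[0] = 1e6      # strong prior x_0 = 0 (fixes the gauge freedom)
rows.append(r); rhs.append(0.0)

theta, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
x_est, l_est = theta[:4], theta[4]
```

The two residual groups play the roles of the IMU and reprojection terms above; the per-residual weights correspond to the covariances Λ_k and Σ_{i,k}.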

Case Study 1 OKVIS

Because the complexity of the optimization is cubic in the number of camera poses and features, real-time operation becomes infeasible as the trajectory and the map grow over time. OKVIS proposed to only optimize the current pose and a window of past keyframes.

Leutenegger, Lynen, Bosse, Siegwart, Furgale, Keyframe-based visual–inertial odometry using nonlinear optimization, International Journal of Robotics Research (IJRR), 2015. PDF

Case Study 2 SVO+GTSAM

Solves the same optimization problem as OKVIS, but:

• Keeps all the frames (from the start to the end of the trajectory).

• To make the optimization efficient, it:
  • Marginalizes the 3D landmarks: rather than triangulating landmarks, only their visual bearing measurements are kept, and the visual constraints are enforced via the epipolar line distance (Lecture 08).
  • Pre-integrates the IMU data between keyframes (see later).

• Optimization solved using factor graphs via GTSAM.

• Very fast, because it only optimizes the poses that are affected by a new observation.

Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb. 2017. Best Paper Award 2018.


Problem with IMU integration

• The integration of the IMU measurements f(x_{k−1}, u) from k − 1 to k depends on the state estimate at time k − 1.

• During the optimization, every time the linearization point at k − 1 changes, the integration between k − 1 and k must be re-evaluated, thus slowing down the optimization.

• Idea: Preintegration
  • Defines relative motion increments, expressed in the body frame, which are independent of the global position, orientation, and velocity at k [1].
  • [2] uses this theory by leveraging the manifold structure of the rotation group SO(3).

[1] Lupton, Sukkarieh, Visual-inertial-aided navigation for high-dynamic motion in built environments without initial conditions, IEEE Transactions on Robotics (T-RO), 2012. PDF
[2] Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb. 2017. Best Paper Award 2018.

IMU Pre-Integration

Standard approach — evaluate the errors (prediction vs. estimate) in the global frame:

e_R = R̂(ω̃, R_{k−1})ᵀ R_k
e_v = v̂(ω̃, ã, v_{k−1}) − v_k
e_p = p̂(ω̃, ã, p_{k−1}) − p_k

→ the integration must be repeated every time the previous state changes.

Preintegration — evaluate relative errors:

e_R = (ΔR̃)ᵀ ΔR
e_v = Δṽ − Δv
e_p = Δp̃ − Δp

→ preintegration of the IMU deltas (ΔR̃, Δṽ, Δp̃) from ω̃, ã is possible, with no initial condition required.
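The relative motion increments can be computed from the raw IMU samples alone. A simplified sketch (gravity and bias corrections, which the full on-manifold formulation handles explicitly, are omitted here):

```python
import numpy as np

def skew(w):
    return np.array([[0, -w[2], w[1]], [w[2], 0, -w[0]], [-w[1], w[0], 0]])

def exp_so3(w):
    """Rotation matrix from a rotation vector (Rodrigues' formula)."""
    th = np.linalg.norm(w)
    if th < 1e-9:
        return np.eye(3) + skew(w)
    K = skew(w / th)
    return np.eye(3) + np.sin(th) * K + (1 - np.cos(th)) * (K @ K)

def preintegrate(gyro, accel, dt):
    """Relative motion increments (ΔR, Δv, Δp) from raw IMU samples.
    Note: they depend only on the measurements, not on the state at k-1."""
    dR, dv, dp = np.eye(3), np.zeros(3), np.zeros(3)
    for w, a in zip(gyro, accel):
        dp = dp + dv * dt + 0.5 * (dR @ a) * dt**2
        dv = dv + (dR @ a) * dt
        dR = dR @ exp_so3(w * dt)
    return dR, dv, dp

# Example: constant forward acceleration, no rotation, 100 samples at 200 Hz
gyro = np.zeros((100, 3))
accel = np.tile([1.0, 0.0, 0.0], (100, 1))
dR, dv, dp = preintegrate(gyro, accel, 0.005)   # dv ≈ [0.5, 0, 0], dp ≈ [0.125, 0, 0]
```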

GTSAM

• GTSAM [1] is a library that implements sensor fusion based on factor graphs and Bayes trees.

• Factor graphs are a tool borrowed from machine learning [2] that:
  • naturally capture the spatial dependencies among different sensor measurements;
  • can represent many robotics problems, like pose graphs and SLAM.

[1] Dellaert, Kaess, Factor Graphs for Robot Perception, Foundations and Trends in Robotics, 2017. PDF, Code
[2] Kschischang, Frey, Loeliger, Factor Graphs and the Sum-Product Algorithm, IEEE Transactions on Information Theory, 2001. PDF

GTSAM

• GTSAM uses factor graphs and Bayes trees to implement incremental inference.

• A Bayes tree is a data structure, a directed tree, that represents the connectivity among the past robot states and measurements.

• Incremental inference means that, when new robot states or measurements are added to the graph, only a local part of the tree is updated.

[1] Kaess, Ranganathan, Dellaert, "iSAM: Incremental Smoothing and Mapping", IEEE Transactions on Robotics (T-RO), 2008. PDF

Outline

• What is an IMU and why do we need it?

• IMU model

• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods

• Camera-IMU extrinsic calibration and synchronization

Non-linear optimization vs Filtering

Non-linear optimization methods:
• Optimize a window of multiple states (or all the states) using non-linear least-squares optimization.
• Multiple iterations (re-linearize at each iteration).
• Achieve the highest accuracy.
• Slower (but fast with GTSAM).

Filtering methods:
• Solve the same problem by running only one iteration of the optimization function (e.g., using an Extended Kalman Filter (EKF)).
• One linearization iteration only.
• Sensitive to the linearization point.
• Fastest.

Case Study 1 ROVIO

• EKF state: X = [ p(t), q(t), v(t), b_a(t), b_g(t), L_1, L_2, …, L_k ]

• Basic idea:
  1. Prediction step: predicts the next position, velocity, orientation, and features using the IMU integration model.
  2. Measurement update: refines the state by leveraging the visual constraints (ROVIO minimizes the photometric error between corresponding points; an alternative would be the reprojection error).

Bloesch, Burri, Omari, Hutter, Siegwart, Iterated extended Kalman filter based visual-inertial odometry using direct photometric feedback, International Journal of Robotics Research (IJRR), 2017. PDF, Code
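The prediction step of such a filter can be sketched on a toy 1D state [position, velocity], propagated with an IMU acceleration; the covariance grows until a measurement update corrects it. This is an illustrative sketch, not ROVIO's actual photometric formulation; all values are made up:

```python
import numpy as np

dt = 0.005                            # 200 Hz IMU
F = np.array([[1.0, dt],
              [0.0, 1.0]])            # state transition (constant-velocity model)
B = np.array([0.5 * dt**2, dt])       # how the measured acceleration enters the state
Q = np.diag([1e-8, 1e-6])             # process noise from IMU noise/bias (assumed)

def ekf_predict(x, P, a_meas):
    """EKF prediction: propagate the mean and inflate the covariance."""
    x_new = F @ x + B * a_meas
    P_new = F @ P @ F.T + Q           # uncertainty grows with every IMU-only step
    return x_new, P_new

x, P = np.zeros(2), np.eye(2) * 1e-4
for _ in range(200):                  # 1 s of IMU-only prediction at a = 0.1 m/s^2
    x, P = ekf_predict(x, P, a_meas=0.1)
# x ≈ [0.05, 0.1]; P has grown, awaiting a visual measurement update
```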

ROVIO Problems

• The complexity of the EKF grows quadratically in the number of estimated landmarks; thus, a maximum of ~20 landmarks are tracked to allow real-time operation.

• Only updates the most recent state.

Case Study 2 MSCKF

• MSCKF (Multi-State Constraint Kalman Filter) updates multiple past poses (p_{C1}, q_{C1}), …, (p_{CN}, q_{CN}) in addition to the current state (p(t), q(t), v(t)). State vector:

X = [ p(t), q(t), v(t), b_a(t), b_g(t), p_{C1}, q_{C1}, …, p_{CN}, q_{CN} ]

• Prediction step: same as ROVIO.

• Measurement update, differently from ROVIO:
  • Landmark positions are not added to the state vector; thus, it can run very fast, independently of the number of features.
  • The visual constraint is obtained from the epipolar line distance (Lecture 08).

• Used in spacecraft landing (NASA/JPL Moon and Mars landing), DJI drones, Google ARCore.

• Released open source within the OpenVINS project: https://github.com/rpng/open_vins

[1] Mourikis, Roumeliotis, A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation, International Conference on Robotics and Automation (ICRA), 2007. PDF
[2] Li, Mourikis, High-precision, consistent EKF-based visual–inertial odometry, International Journal of Robotics Research (IJRR), 2013. PDF

MSCKF running in Google ARCore (formerly Google Tango)


Outline

• What is an IMU and why do we need it?

• IMU model

• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods

• Camera-IMU extrinsic calibration and synchronization

Camera-IMU Calibration

• Goal: estimate the rigid-body transformation T_BC and the time offset t_d between the camera and the IMU, caused by communication delays and internal sensor delays (introduced by filters and logic).

• Assumptions: camera and IMU rigidly attached; camera intrinsically calibrated.

• Data:
  • Image points from a calibration pattern (checkerboard or QR board)
  • IMU measurements: accelerometer ã_k and gyroscope ω̃_k

(Figure: camera image timestamps and IMU timestamps on a common time axis t, offset by t_d)

Kalibr Toolbox

Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

• Code: https://github.com/ethz-asl/kalibr/wiki/camera-imu-calibration

Kalibr Toolbox

• Solves a non-linear least-squares optimization problem similar to the one seen before, but also optimizes over T_BC and t_d:

{X, L, T_BC, t_d, b_a, b_g} = argmin_{X, L, T_BC, t_d, b_a, b_g}  Σ_{k=1}^{N} ‖f(x_{k−1}, u) − x_k‖²_{Λ_k}  +  Σ_{k=1}^{N} Σ_{i=1}^{M} ‖π(x_k, L_i) − z_{i,k}‖²_{Σ_{i,k}}

(first term: IMU residuals; second term: reprojection residuals, the Bundle Adjustment term)

• Continuous-time modelling, using splines for X.

• Numerical solver: Levenberg-Marquardt.

Kalibr Toolbox

• Generates a report after optimizing the cost function:

Residuals
  Reprojection error [px]:     0.0976, 0.051
  Gyroscope error [rad/s]:     0.0167, 0.009
  Accelerometer error [m/s^2]: 0.0595, 0.031

Transformation T_ci (imu to cam):
[[ 0.99995526 -0.00934911 -0.00143776  0.00008436]
 [ 0.00936458  0.99989388  0.01115983  0.00197427]
 [ 0.00133327 -0.0111728   0.99993669 -0.05054946]
 [ 0.          0.          0.          1.        ]]

Time shift (delay t_d), cam0 to imu0 [s] (t_imu = t_cam + shift):
0.00270636270255
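The reported quantities are used as follows: T_ci maps points from the IMU frame to the camera frame, and the time shift aligns the two sensor clocks. A small sketch using the numbers from the report above:

```python
import numpy as np

# Calibration output (from the Kalibr report shown above)
T_ci = np.array([[ 0.99995526, -0.00934911, -0.00143776,  0.00008436],
                 [ 0.00936458,  0.99989388,  0.01115983,  0.00197427],
                 [ 0.00133327, -0.0111728,   0.99993669, -0.05054946],
                 [ 0.0,         0.0,         0.0,         1.0       ]])
shift = 0.00270636270255          # cam0 to imu0 [s]

# Express a point given in the IMU frame in the camera frame (homogeneous coords)
p_imu = np.array([0.1, 0.0, 0.2, 1.0])   # made-up example point
p_cam = T_ci @ p_imu

# Align a camera timestamp with the IMU clock: t_imu = t_cam + shift
t_cam = 1234.500                  # made-up camera timestamp [s]
t_imu = t_cam + shift
```

The rotation block of T_ci is close to the identity (camera and IMU nearly aligned), and the ~5 cm z-translation and ~2.7 ms delay are exactly the kind of small offsets that would corrupt tightly coupled fusion if left uncalibrated.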

Popular Datasets for VIO VI-SLAM

EuRoC: drone with IMU and stereo camera.

Blackbird: drone flying fast with VR goggles (synthetic images and real dynamics + IMU).

UZH-FPV Drone Racing: drone flying fast with event camera, stereo camera, IMU.

MVSEC: drone, motorcycle, car with event camera, stereo camera, 3D LiDAR, GPS, IMU.

Understanding Check

Are you able to answer the following questions?

• Why is it recommended to use an IMU for visual odometry?
• Why not just an IMU, without a camera?
• What is the basic idea behind MEMS IMUs?
• What is the drift of a consumer IMU?
• What is the IMU measurement model (formula)?
• What causes the bias in an IMU?
• How do we model the bias?
• How do we integrate the acceleration to get the position (formula)?
• What is the definition of loosely coupled and tightly coupled visual inertial fusion?
• How does non-linear optimization-based visual inertial odometry work? Can you write down the cost function and illustrate its meaning?

Page 2: Vision Algorithms for Mobile Robotics Lecture 01 Introduction

Lab Exercise ndash This afternoon

Bundle Adjustment

2

Visual Inertial Odometry (VIO)

References

bull Scaramuzza Zhang Visual-Inertial Odometry of Aerial Robots Encyclopedia of Robotics Springer 2019 PDF

bull Huang Visual-inertial navigation A concise review International conference on Robotics and Automation (ICRA) 2019 PDF

3

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

4

What is an IMU

bull Inertial Measurement Unitbull Gyroscope Angular velocity

bull Accelerometer Linear Accelerations

5

Mechanical Gyroscope

Mechanical Accelerometer

What is an IMU

bull Different categoriesbull Mechanical ($100000-1M)

bull Optical ($20000-100k)

bull MEMS (from 1$ (phones) to 1000$ (higher cost because they have a microchip running a Kalman filter))

bull For small mobile robots amp drones MEMS IMU are mostly usedbull Cheap

bull Power efficient

bull Light weight and solid state

6

MEMS Accelerometer

A spring-like structure connects the device to a seismic mass vibrating in a capacitive divider A capacitive divider converts the displacement of the seismic mass into an electric signal Damping is created by the gas sealed in the device

7

aaspring

capacitive divider

M M M

MEMS Gyroscopes

bull MEMS gyroscopes measure the Coriolis forces acting on MEMS vibrating structures (tuning forks vibrating wheels or resonant solids)

bull Their working principle is similar to the haltere of a fly

bull Haltere are small structures of some two-winged insects such as flies They are flapped rapidly and function as gyroscopes informing the insect about rotation of the body during flight

8

Why do we need an IMU

bull Monocular vision is scale ambiguous

bull Pure vision is not robust enoughbull Low texture

bull High Dynamic Range (HDR)

bull High speed motion

Robustness is a critical issue Tesla accidentldquoThe autopilot sensors on the Model S failed to distinguish a white tractor-trailer crossing the highway against a bright sky rdquo [The Guardian]

9

High Dynamic Range

Motion blur

Why is IMU alone not enough

bull Pure IMU integration will lead to large drift (especially in cheap IMUs)bull Will see later mathematically

bull Intuition

bull Double integration of acceleration to get position (eg 1D scenario 119909 = 1199090 + 1199070(1199051 minus 1199050) + 1199050

1199051 119886 119905 1198891199052 )

If there is a constant bias in the acceleration the error of position will be proportional to 119957120784

bull Integration of angular velocity to get orientation if there is a bias in angular velocity the error is proportional to the time 119957

10

Table from Vectornav one of the best IMU companies Errors were computed assuming the device at rest httpswwwvectornavcomresourcesins-error-budget

AutomotiveSmartphoneamp Drone accelerometers

bull IMU and vision are complementary

bull What cameras and IMU have in common both estimate the pose incrementally (known as dead-reckoning) which suffers from drift over time Solution loop detection and loop closure

Cameras IMU

Precise in slow motion More informative than an IMU (measures light energy

from the environment)

ᵡ Limited output rate (~100 Hz)ᵡ Scale ambiguity in monocular setupᵡ Lack of robustness to HDR and high speed

Insensitive to motion blur HDR texture Can predict next feature position High output rate (~1000 Hz) The higher the acceleration the more accurate the IMU

(due to higher signal to noise ratio)

ᵡ Large relative uncertainty when at low accelerationangular velocity

ᵡ Ambiguity in gravity acceleration (IMU measures the total acceleration)

Why visual inertial fusion

11

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

12

IMU Measurement Model

bull Measures angular velocity ω119861 119905 and acceleration 119886119861(119905) in the body frame 119861

Notation

bull The superscript 119892 stands for gyroscope and 119886 for accelerometer

bull 119877119861119882 is the rotation of the World frame 119882 with respect to Body frame 119861

bull The gravity 119892 is expressed in the World frame

bull Biases and noise are expressed in the body frame13

IMU biases + noise

Raw IMU measurements true 120654 (in body frame) and true 119834 (in

world frame) to estimate

119886119861 119905 = 119877119861119882 119905 119886119882 119905 minus 119892 + 119887119886 119905 + 119899119886(119905)

ω119861 119905 = ω119861 119905 + 119887119892 119905 + 119899119892(119905)

IMU Noise Model

bull Additive Gaussian white noise

bull Bias

bull The gyroscope and accelerometer biases are considered slowly varying ldquoconstantsrdquo Their temporal fluctuation is modeled by saying that the derivative of the bias is a zero-mean Gaussian noise with standard deviation 120590119887

bull Some facts about IMU biasesbull They change with temperature and mechanical and atmospheric pressurebull They may change every time the IMU is startedbull Good news they can be estimated (see later)

14

g at tn n

g at tb b

( ) bt tb w 119856 t ~119821(o 1)

Trawny Roumeliotis Indirect Kalman filter for 3D attitude estimation Technical Report University of Minnesota 2005 PDFMore info on the noise model httpsgithubcomethz-aslkalibrwikiIMU-Noise-Model

IMU Integration Model

• The IMU Integration Model computes the position, orientation, and velocity of the IMU in the world frame. To do this, we must first compute the acceleration a(t) in the world frame from the measured one ã(t) in the body frame (see Slide 13):

a(t) = R_WB(t) (ã(t) − b(t)) + g

• The position p_k at time t_k can then be predicted from the position p_{k−1} at time t_{k−1} by integrating all the inertial measurements {ã_j, ω̃_j} within that time interval:

p_k = p_{k−1} + v_{k−1} (t_k − t_{k−1}) + ∬_{t_{k−1}}^{t_k} (R_WB(t) (ã(t) − b(t)) + g) dt²

• A similar expression can be obtained to predict the velocity v_k and the orientation R_WB of the IMU in the world frame as functions of both ã_j and ω̃_j

15

Trawny, Roumeliotis, Indirect Kalman filter for 3D attitude estimation, Technical Report, University of Minnesota, 2005. PDF
More info on the noise model: https://github.com/ethz-asl/kalibr/wiki/IMU-Noise-Model
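The integration model can be sketched as a simple Euler loop. This is a minimal illustration, with names of my choosing: biases are assumed already subtracted, and a real implementation would use a higher-order integration scheme.

```python
import numpy as np

def skew(w):
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def so3_exp(phi):
    """Rodrigues' formula: rotation matrix of the rotation vector phi."""
    th = np.linalg.norm(phi)
    if th < 1e-12:
        return np.eye(3) + skew(phi)
    K = skew(phi / th)
    return np.eye(3) + np.sin(th) * K + (1.0 - np.cos(th)) * (K @ K)

def imu_step(p, v, R, gyro, accel, dt, g=np.array([0.0, 0.0, -9.81])):
    """One Euler step of the integration model: the measured specific force
    is rotated into the world frame and gravity is added back,
    a_world = R_WB (a_meas - b) + g (bias b assumed already removed)."""
    a_world = R @ accel + g
    p = p + v * dt + 0.5 * a_world * dt**2
    v = v + a_world * dt
    R = R @ so3_exp(gyro * dt)
    return p, v, R

# Device at rest: the accelerometer measures the specific force -g (+9.81 up),
# so the dead-reckoned position, velocity, and orientation should stay put
p, v, R = np.zeros(3), np.zeros(3), np.eye(3)
for _ in range(200):  # 1 s at 200 Hz
    p, v, R = imu_step(p, v, R, np.zeros(3), np.array([0.0, 0.0, 9.81]), 0.005)
```

Replacing the perfect measurements with biased ones (e.g. from the noise-model sketch above) makes the quadratic position drift visible immediately.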

IMU Integration Model

For convenience, the IMU Integration Model is normally written as

(p_k, q_k, v_k) = f((p_{k−1}, q_{k−1}, v_{k−1}), u),   or, more compactly,   x_k = f(x_{k−1}, u)

where

• x = (p, q, v) represents the IMU state, comprising position, orientation, and velocity

• q is the IMU orientation R_WB (usually represented using quaternions)

• u = {ã_j, ω̃_j} are the accelerometer and gyroscope measurements in the time interval [t_{k−1}, t_k]

16

Trawny, Roumeliotis, Indirect Kalman filter for 3D attitude estimation, Technical Report, University of Minnesota, 2005. PDF
More info on the noise model: https://github.com/ethz-asl/kalibr/wiki/IMU-Noise-Model

Outline

• What is an IMU and why do we need it

• IMU model

• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods

• Camera-IMU extrinsic calibration and Synchronization

17

Visual Inertial Odometry

Different paradigms exist:

• Loosely coupled
  • Treats VO and IMU as two separate (not coupled) black boxes
  • Each black box estimates pose and velocity from visual (up to a scale) and inertial data (absolute scale)
  • Easy to implement
  • Inaccurate; should not be used

• Tightly coupled
  • Makes use of the raw sensors' measurements:
    • 2D features
    • IMU readings
  • More accurate
  • More implementation effort

In this lecture, we will only see tightly coupled approaches.

18

The Loosely Coupled Approach

19

[Block diagram: images → Feature tracking → 2D features → VO → position (up to a scale) & orientation; IMU measurements → IMU Integration → position, orientation & velocity; both outputs → Fusion → refined position, orientation & velocity]

The Tightly Coupled Approach

20

[Block diagram: images → Feature tracking → 2D features; 2D features + IMU measurements → Fusion → refined position, orientation & velocity]

Filtering Visual Inertial Formulation

• System states:

Tightly coupled: X = {p(t), q(t), v(t), b_a(t), b_g(t), L_1, L_2, …, L_k}

Loosely coupled: X = {p(t), q(t), v(t), b_a(t), b_g(t)}

21

Corke, Lobo, Dias, An Introduction to Inertial and Visual Sensing, International Journal of Robotics Research (IJRR), 2007. PDF

Outline

• What is an IMU and why do we need it

• IMU model

• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods

• Camera-IMU extrinsic calibration and Synchronization

22

Closed-form Solution (1D case)

• From the camera, we can only get the absolute position x up to an unknown scale factor s, thus

x = s x̃

where x̃ is the relative motion estimated by the camera

• From the IMU:

x = x_0 + v_0 (t_1 − t_0) + ∬_{t_0}^{t_1} a(t) dt²

• By equating them:

s x̃ = x_0 + v_0 (t_1 − t_0) + ∬_{t_0}^{t_1} a(t) dt²

• As shown in [Martinelli'14], if we assume to know x_0 (usually we set it to 0), then, for 6DOF, both s and v_0 can be determined in closed form from a single feature observation and 3 views

23

Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF

[Figure: two camera views at t_0 and t_1 observing landmark L_1]

Closed-form Solution (1D case)

24

Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF

With x_0 = 0 and three views at t_0, t_1, t_2:

s x̃_1 = v_0 (t_1 − t_0) + ∬_{t_0}^{t_1} a(t) dt²

s x̃_2 = v_0 (t_2 − t_0) + ∬_{t_0}^{t_2} a(t) dt²

i.e., a linear system in the unknowns s and v_0:

[ x̃_1  (t_0 − t_1) ] [ s  ]   [ ∬_{t_0}^{t_1} a(t) dt² ]
[ x̃_2  (t_0 − t_2) ] [ v_0 ] = [ ∬_{t_0}^{t_2} a(t) dt² ]

[Figure: three camera views at t_0, t_1, t_2 observing landmark L_1]
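The 2×2 system can be solved directly. Below is a toy numerical check with made-up ground-truth values (my own numbers, chosen so the double integrals of the constant acceleration have a closed form):

```python
import numpy as np

# Hypothetical ground truth: 1D constant acceleration, x_0 = 0
a, v0_true, s_true = 2.0, 0.5, 3.0
t0, t1, t2 = 0.0, 1.0, 2.0

def pos(t):            # true position x(t)
    return v0_true * (t - t0) + 0.5 * a * (t - t0) ** 2

def double_int(t):     # double integral of the (constant) acceleration from t0 to t
    return 0.5 * a * (t - t0) ** 2

# The camera measures position only up to scale: x_tilde = x / s
x1, x2 = pos(t1) / s_true, pos(t2) / s_true

# Linear system from the two constraints  s*x_tilde + v0*(t0 - t) = double_int(t)
A = np.array([[x1, t0 - t1],
              [x2, t0 - t2]])
b = np.array([double_int(t1), double_int(t2)])
s, v0 = np.linalg.solve(A, b)   # recovers s = 3.0, v0 = 0.5
```

With real (noisy) IMU data the double integrals are computed numerically, and with more than two constraints one would solve in a least-squares sense instead.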

Closed-form Solution (general case)

• Considers N feature observations and the full 6DOF case

• Can be used to initialize filters and non-linear optimizers (which always need an initialization point) [3]

• More complex to derive than the 1D case, but it also reaches a linear system of equations, A X = S, that can be solved using the pseudoinverse

25

X is the vector of unknowns:
• Absolute scale s
• Initial velocity v_0
• 3D point distances (w.r.t. the first camera)
• Direction of the gravity vector
• Biases

A and S contain 2D feature coordinates, acceleration, and angular velocity measurements

[1] Martinelli, Vision and IMU data fusion: Closed-form solutions for attitude, speed, absolute scale, and bias determination, IEEE Transactions on Robotics (T-RO), 2012. PDF
[2] Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF
[3] Kaiser, Martinelli, Fontana, Scaramuzza, Simultaneous state initialization and gyroscope bias calibration in visual inertial aided navigation, IEEE Robotics and Automation Letters (R-AL), 2017. PDF

Outline

• What is an IMU and why do we need it

• IMU model

• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods

• Camera-IMU extrinsic calibration and Synchronization

26

Non-linear Optimization Methods

VIO is solved as a non-linear Least Squares optimization problem over the states, the landmarks, and the biases:

{X, L, b_a, b_g} = argmin_{X, L, b_a, b_g}  Σ_{k=1}^{N} ‖f(x_{k−1}, u) − x_k‖²_{Λ_k} + Σ_{k=1}^{N} Σ_{i=1}^{M} ‖π(x_k, L_i) − z_{i,k}‖²_{Σ_{i,k}}

(first sum: IMU residuals; second sum: reprojection residuals, i.e., the Bundle Adjustment term)

27

[1] Jung, Taylor, Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results, International Conference on Computer Vision and Pattern Recognition (CVPR), 2001. PDF
[2] Strelow, Singh, Motion estimation from image and inertial measurements, International Journal of Robotics Research (IJRR), 2004. PDF

Non-linear Optimization Methods

VIO is solved as a non-linear Least Squares optimization problem over

{X, L, b_a, b_g} = argmin_{X, L, b_a, b_g}  Σ_{k=1}^{N} ‖f(x_{k−1}, u) − x_k‖²_{Λ_k} + Σ_{k=1}^{N} Σ_{i=1}^{M} ‖π(x_k, L_i) − z_{i,k}‖²_{Σ_{i,k}}

where

• X = {x_1, …, x_N}: set of robot state estimates x_k (position, velocity, orientation) at frame times k

• L = {L_1, …, L_M}: 3D landmarks

• f(x_{k−1}, u): pose prediction update from integration of the IMU measurements u = {ã_j, ω̃_j}

• π(x_k, L_i): measurement update from projection of landmark L_i onto camera frame I_k

• z_{i,k}: observed features

• Λ_k: state covariance from the IMU integration f(x_{k−1}, u)

• Σ_{i,k}: covariance of the noisy 2D feature measurements

(first sum: IMU residuals; second sum: reprojection residuals, i.e., the Bundle Adjustment term)

28

[1] Jung, Taylor, Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results, International Conference on Computer Vision and Pattern Recognition (CVPR), 2001. PDF
[2] Strelow, Singh, Motion estimation from image and inertial measurements, International Journal of Robotics Research (IJRR), 2004. PDF
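The structure of this cost can be sketched generically. The following is an illustrative evaluation of the two residual terms with Mahalanobis weighting; all names are my own, and a real solver would minimize this cost (e.g. with Gauss-Newton or Levenberg-Marquardt) rather than just evaluate it.

```python
import numpy as np

def mahalanobis_sq(r, cov):
    """Squared Mahalanobis norm ||r||^2_Cov = r^T Cov^-1 r."""
    return float(r @ np.linalg.solve(cov, r))

def vio_cost(states, landmarks, f, pi, u, z, Lam, Sig):
    """Tightly coupled VIO cost: IMU residuals + reprojection residuals.
    f(x_prev, u_k) -> predicted state; pi(x, L) -> predicted feature."""
    cost = 0.0
    for k in range(1, len(states)):
        cost += mahalanobis_sq(f(states[k - 1], u[k]) - states[k], Lam)
        for i, L in enumerate(landmarks):
            cost += mahalanobis_sq(pi(states[k], L) - z[(i, k)], Sig)
    return cost

# Toy 1D usage: state = position, "projection" = landmark minus position
states = [np.array([0.0]), np.array([1.0])]
landmarks = [np.array([5.0])]
u = {1: np.array([1.0])}            # IMU says: moved by 1
z = {(0, 1): np.array([4.0])}       # feature observed at landmark - position
f = lambda x, du: x + du
pi = lambda x, L: L - x
c = vio_cost(states, landmarks, f, pi, u, z, np.eye(1), np.eye(1))
```

In this toy setup, the IMU prediction and the feature "projection" match the data exactly, so the cost is zero; perturbing any state makes it positive.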

Case Study 1 OKVIS

Because the complexity of the optimization is cubic in the number of camera poses and features, real-time operation becomes infeasible as the trajectory and the map grow over time. OKVIS proposed to optimize only the current pose and a window of past keyframes.

29

Leutenegger, Lynen, Bosse, Siegwart, Furgale, Keyframe-based visual–inertial odometry using nonlinear optimization, International Journal of Robotics Research (IJRR), 2015. PDF

Case Study 2 SVO+GTSAM

Solves the same optimization problem as OKVIS, but:

• Keeps all the frames (from the start to the end of the trajectory)

• To make the optimization efficient:
  • Marginalizes 3D landmarks (rather than triangulating landmarks, only their visual bearing measurements are kept, and the visual constraints are enforced by the Epipolar Line Distance (Lecture 08))
  • Pre-integrates the IMU data between keyframes (see later)

• Optimization solved using Factor Graphs via GTSAM
  • Very fast, because it only optimizes the poses that are affected by a new observation

30

Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb 2017. Best Paper Award 2018.


Problem with IMU integration

• The integration of IMU measurements f(x_{k−1}, u) from k−1 to k depends on the state estimate at time k−1

• During the optimization, every time the linearization point at k−1 changes, the integration between k−1 and k must be re-evaluated, thus slowing down the optimization

• Idea: Preintegration
  • Defines relative motion increments, expressed in the body frame, which are independent of the global position, orientation, and velocity at k [1]
  • [2] uses this theory by leveraging the manifold structure of the rotation group SO(3)

33

[1] Lupton, Sukkarieh, Visual-inertial-aided navigation for high-dynamic motion in built environments without initial conditions, IEEE Transactions on Robotics (T-RO), 2012. PDF
[2] Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb 2017. Best Paper Award 2018.

{X, L, b_a, b_g} = argmin_{X, L, b_a, b_g}  Σ_{k=1}^{N} ‖f(x_{k−1}, u) − x_k‖²_{Λ_k} + Σ_{k=1}^{N} Σ_{i=1}^{M} ‖π(x_k, L_i) − z_{i,k}‖²_{Σ_{i,k}}

(first sum: IMU residuals; second sum: reprojection residuals, i.e., the Bundle Adjustment term)

IMU Pre-Integration

34

Measurements ω̃, ã → preintegrated increments ΔR̃, Δṽ, Δp̃ (prediction), to be compared against the states R_k, p_k, v_k (estimate)

Standard: evaluate the errors in the global frame

e_R = R̂(ω̃, R_{k−1})ᵀ R_k,   e_v = v̂(ω̃, ã, v_{k−1}) − v_k,   e_p = p̂(ω̃, ã, p_{k−1}) − p_k

→ repeat the integration when the previous state changes

Preintegration: evaluate relative errors

e_R = ΔR̃ᵀ ΔR,   e_v = Δṽ − Δv,   e_p = Δp̃ − Δp

→ preintegration of the IMU deltas is possible, with no initial condition required
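The relative increments can be sketched as follows. This is an illustrative simplification (my own names; zero biases assumed and gravity handling left out — the actual on-manifold formulation also propagates bias Jacobians and noise):

```python
import numpy as np

def skew(w):
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def so3_exp(phi):
    """Rodrigues' formula: rotation matrix of the rotation vector phi."""
    th = np.linalg.norm(phi)
    if th < 1e-12:
        return np.eye(3) + skew(phi)
    K = skew(phi / th)
    return np.eye(3) + np.sin(th) * K + (1.0 - np.cos(th)) * (K @ K)

def preintegrate(gyro, accel, dt):
    """Relative motion increments dR, dv, dp between two keyframes, computed
    from the raw measurements only: they do not depend on the global
    position/orientation/velocity at the first keyframe, so they never need
    to be re-integrated when that state changes during the optimization."""
    dR, dv, dp = np.eye(3), np.zeros(3), np.zeros(3)
    for w, a in zip(gyro, accel):
        a_rot = dR @ a
        dp = dp + dv * dt + 0.5 * a_rot * dt**2
        dv = dv + a_rot * dt
        dR = dR @ so3_exp(w * dt)
    return dR, dv, dp

# 1 s of samples at 100 Hz: no rotation, constant specific force along x
gyro = [np.zeros(3)] * 100
accel = [np.array([1.0, 0.0, 0.0])] * 100
dR, dv, dp = preintegrate(gyro, accel, 0.01)
```

For this input the increments reduce to the familiar kinematics: dv ≈ a·t = 1 m/s along x and dp ≈ ½·a·t² = 0.5 m along x, with dR the identity.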

GTSAM

• GTSAM [1] is a library that implements sensor fusion based on factor graphs and Bayes trees

• Factor graphs are a tool borrowed from machine learning [2] that:
  • naturally capture the spatial dependencies among different sensor measurements
  • can represent many robotics problems, like pose graphs and SLAM

35

[1] Dellaert, Kaess, Factor Graphs for Robot Perception, Foundations and Trends in Robotics, 2017. PDF, Code
[2] Kschischang, Frey, Loeliger, Factor Graphs and the Sum-Product Algorithm, IEEE Transactions on Information Theory, 2001. PDF

GTSAM

• GTSAM uses factor graphs and Bayes trees to implement incremental inference

• A Bayes tree is a data structure (a directed tree) that represents the connectivity among the past robot states and measurements

• Incremental inference means that, when new robot states or measurements are added to the graph, only a local part of the tree is updated

36

[1] Kaess, Ranganathan, Dellaert, "iSAM: Incremental Smoothing and Mapping", IEEE Transactions on Robotics (T-RO), 2008. PDF

Outline

• What is an IMU and why do we need it

• IMU model

• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods

• Camera-IMU extrinsic calibration and Synchronization

37

Non-linear optimization vs Filtering

38

Non-linear Optimization methods:
• Optimize a window of multiple states (or all the states) using non-linear Least-Squares optimization
• Multiple iterations (re-linearizes at each iteration)
• Achieves the highest accuracy
• Slower (but fast with GTSAM)

Filtering methods:
• Solve the same problem by running only one iteration of the optimization function (e.g., using the Extended Kalman Filter (EKF))
• 1 linearization iteration only
• Sensitive to the linearization point
• Fastest
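The one-iteration vs. re-linearize distinction can be seen on a toy scalar problem with measurement model h(x) = x² (my own choice for illustration, with no prior term): the filter applies a single linearize-and-solve step, the optimizer re-linearizes until convergence.

```python
def linearize_and_solve(x, z, h, H):
    """One Gauss-Newton step on the scalar residual z - h(x):
    linearize h at the current estimate x and solve for the correction."""
    return x + (z - h(x)) / H(x)

h = lambda x: x ** 2          # toy non-linear measurement model
H = lambda x: 2.0 * x         # its Jacobian
z, x0 = 4.0, 1.0              # measurement and linearization point

x_filter = linearize_and_solve(x0, z, h, H)   # filter: one iteration only
x_opt = x0
for _ in range(10):                           # optimizer: re-linearize each time
    x_opt = linearize_and_solve(x_opt, z, h, H)
```

From the linearization point x₀ = 1, the single step lands at 2.5, while the iterated solution converges to the true value 2 — illustrating why one iteration is faster but sensitive to the linearization point.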

Case Study 1 ROVIO

• EKF state: X = {p(t), q(t), v(t), b_a(t), b_g(t), L_1, L_2, …, L_k}

• Basic idea:
  1. Prediction step: predicts the next position, velocity, orientation, and features using the IMU integration model
  2. Measurement update: refines the state by leveraging the visual constraints (ROVIO minimizes the photometric error between corresponding points; an alternative would be the reprojection error)

39

Bloesch, Burri, Omari, Hutter, Siegwart, Iterated extended Kalman filter based visual-inertial odometry using direct photometric feedback, International Journal of Robotics Research (IJRR), 2017. PDF, Code

ROVIO Problems

• The complexity of the EKF grows quadratically in the number of estimated landmarks
  • Thus, a maximum of ~20 landmarks are tracked to allow real-time operation

• Only updates the most recent state

40

Case Study 2 MSCKF

• MSCKF (Multi-State Constraint Kalman Filter) updates multiple past poses {p_C1, q_C1, …, p_CN, q_CN} in addition to the current state {p(t), q(t), v(t)}. State vector:

X = {p(t), q(t), v(t), b_a(t), b_g(t), p_C1, q_C1, …, p_CN, q_CN}

• Prediction step: same as ROVIO

• Measurement update, differently from ROVIO:
  • Landmark positions are not added to the state vector; thus, it can run very fast, independently of the number of features
  • The visual constraint is obtained from the Epipolar Line Distance (Lecture 08)

• Used in spacecraft landing (NASA/JPL Moon and Mars landing), DJI drones, Google ARCore

• Released open source within the OpenVINS project: https://github.com/rpng/open_vins

41

[1] Mourikis, Roumeliotis, A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation, International Conference on Robotics and Automation (ICRA), 2007. PDF
[2] Li, Mourikis, High-precision, consistent EKF-based visual–inertial odometry, International Journal of Robotics Research (IJRR), 2013. PDF
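The state bookkeeping can be sketched as follows. This is an illustration of the idea, not the paper's implementation (names and window size are mine): on each new image the current pose is cloned into a bounded window, and the oldest clone is dropped when the window is full.

```python
import numpy as np

class MSCKFState:
    """Sketch of the MSCKF state layout: current IMU state plus a bounded
    sliding window of cloned past camera poses; landmarks are never
    part of the state, so the filter size is independent of feature count."""
    def __init__(self, max_clones=10):
        self.p, self.v = np.zeros(3), np.zeros(3)
        self.q = np.array([0.0, 0.0, 0.0, 1.0])     # unit quaternion
        self.b_a, self.b_g = np.zeros(3), np.zeros(3)
        self.clones = []                            # past (p, q) camera poses
        self.max_clones = max_clones

    def augment(self):
        """On every new image, clone the current pose into the window
        (state augmentation); discard the oldest clone when full
        (in the real filter, it is marginalized with its covariance)."""
        self.clones.append((self.p.copy(), self.q.copy()))
        if len(self.clones) > self.max_clones:
            self.clones.pop(0)

state = MSCKFState(max_clones=3)
for _ in range(5):
    state.augment()        # the window saturates at 3 cloned poses
```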

MSCKF running in Google ARCore (former Google Tango)

42


Outline

• What is an IMU and why do we need it

• IMU model

• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods

• Camera-IMU extrinsic calibration and Synchronization

43

Camera-IMU Calibration

• Goal: estimate the rigid-body transformation T_BC and the time offset t_d between the camera and the IMU, caused by communication delays and internal sensor delays (introduced by filters and logic)

• Assumptions: camera and IMU rigidly attached; camera intrinsically calibrated

• Data:
  • Image points from a calibration pattern (checkerboard or QR board)
  • IMU measurements: accelerometer ã_k and gyroscope ω̃_k

44

[Figure: timeline t showing image timestamps vs. IMU timestamps, offset by t_d]
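A common way to get a coarse initial t_d is to cross-correlate an IMU-derived signal (e.g. gyro magnitude) with the corresponding camera-derived signal (e.g. rotation rate from visual tracking). This sketch — all names mine, and a simplification of what Kalibr actually does, since Kalibr estimates t_d jointly with T_BC in a continuous-time batch optimization — recovers a known 50 ms delay from synthetic signals:

```python
import numpy as np

def estimate_time_offset(imu_signal, cam_signal, dt):
    """Coarse t_d estimate: the shift maximizing the cross-correlation
    between the two signals. Positive result: the camera signal lags."""
    a = cam_signal - np.mean(cam_signal)
    b = imu_signal - np.mean(imu_signal)
    corr = np.correlate(a, b, mode="full")
    shift = int(np.argmax(corr)) - (len(b) - 1)
    return shift * dt

# Synthetic smooth signal; the "camera" observes it 5 samples (50 ms) late
rng = np.random.default_rng(1)
base = np.convolve(rng.standard_normal(1200), np.ones(10) / 10, mode="same")
imu = base[100:1100]
cam = base[95:1095]            # cam[n] = imu[n - 5]
td = estimate_time_offset(imu, cam, dt=0.01)
```

This only resolves t_d to the nearest sample; the spline-based joint optimization below refines it to sub-millisecond accuracy.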

Kalibr Toolbox

45

Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

• Code: https://github.com/ethz-asl/kalibr/wiki/camera-imu-calibration

Kalibr Toolbox

46

Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

• Solves a non-linear Least Squares optimization problem similar to that seen before, but also optimizes over T_BC and t_d:

{X, L, T_BC, t_d, b_a, b_g} = argmin_{X, L, T_BC, t_d, b_a, b_g}  Σ_{k=1}^{N} ‖f(x_{k−1}, u) − x_k‖²_{Λ_k} + Σ_{k=1}^{N} Σ_{i=1}^{M} ‖π(x_k, L_i) − z_{i,k}‖²_{Σ_{i,k}}

(first sum: IMU residuals; second sum: reprojection residuals, i.e., the Bundle Adjustment term)

• Continuous-time modelling using splines for X

• Numerical solver: Levenberg-Marquardt

Kalibr Toolbox

47

Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

• Generates a report after optimizing the cost function:

Residuals:
  Reprojection error [px]:      0.0976, 0.051
  Gyroscope error [rad/s]:      0.0167, 0.009
  Accelerometer error [m/s^2]:  0.0595, 0.031

Transformation T_ci (imu to cam):
  [[ 0.99995526 -0.00934911 -0.00143776  0.00008436]
   [ 0.00936458  0.99989388  0.01115983  0.00197427]
   [ 0.00133327 -0.0111728   0.99993669 -0.05054946]
   [ 0.          0.          0.          1.        ]]

Time shift (delay t_d):
  cam0 to imu0 [s] (t_imu = t_cam + shift): 0.00270636270255

Popular Datasets for VIO / VI-SLAM

48

• EuRoC: drone with IMU and stereo camera
• Blackbird: drone flying fast with VR goggles (synthetic images and real dynamics + IMU)
• UZH-FPV Drone Racing: drone flying fast with event camera, stereo camera, IMU
• MVSEC: drone, motorcycle, car with event camera, stereo camera, 3D lidar, GPS, IMU

Understanding Check

Are you able to answer the following questions?

• Why is it recommended to use an IMU for Visual Odometry?
• Why not just an IMU, without a camera?
• What is the basic idea behind MEMS IMUs?
• What is the drift of a consumer IMU?
• What is the IMU measurement model (formula)?
• What causes the bias in an IMU?
• How do we model the bias?
• How do we integrate the acceleration to get the position (formula)?
• What is the definition of loosely coupled and tightly coupled visual inertial fusion?
• How does non-linear optimization-based visual inertial odometry work? Can you write down the cost function and illustrate its meaning?

49



IMU Measurement Model

bull Measures angular velocity ω119861 119905 and acceleration 119886119861(119905) in the body frame 119861

Notation

bull The superscript 119892 stands for gyroscope and 119886 for accelerometer

bull 119877119861119882 is the rotation of the World frame 119882 with respect to Body frame 119861

bull The gravity 119892 is expressed in the World frame

bull Biases and noise are expressed in the body frame13

IMU biases + noise

Raw IMU measurements true 120654 (in body frame) and true 119834 (in

world frame) to estimate

119886119861 119905 = 119877119861119882 119905 119886119882 119905 minus 119892 + 119887119886 119905 + 119899119886(119905)

ω119861 119905 = ω119861 119905 + 119887119892 119905 + 119899119892(119905)

IMU Noise Model

bull Additive Gaussian white noise

bull Bias

bull The gyroscope and accelerometer biases are considered slowly varying ldquoconstantsrdquo Their temporal fluctuation is modeled by saying that the derivative of the bias is a zero-mean Gaussian noise with standard deviation 120590119887

bull Some facts about IMU biasesbull They change with temperature and mechanical and atmospheric pressurebull They may change every time the IMU is startedbull Good news they can be estimated (see later)

14

g at tn n

g at tb b

( ) bt tb w 119856 t ~119821(o 1)

Trawny Roumeliotis Indirect Kalman filter for 3D attitude estimation Technical Report University of Minnesota 2005 PDFMore info on the noise model httpsgithubcomethz-aslkalibrwikiIMU-Noise-Model

IMU Integration Model

bull The IMU Integration Model computes the position orientation and velocity of the IMU in the world frame To do this we must first compute the acceleration 119886 119905 in the world frame from the measured one 119886(119905) in the body frame (see Slide 13)

bull The position 119901119896 at time 119905119896 can then be predicted from the position 119901119896minus1 at time 119905119896minus1 by integrating all the inertial measurements 119886119895 ω119895 within that time interval

bull A similar expression can be obtained to predict the velocity 119907119896 and orientation 119877119882119861 of the IMU in the world frame as functions of both 119886119895 and ω119895

15Trawny Roumeliotis Indirect Kalman filter for 3D attitude estimation Technical Report University of Minnesota 2005 PDF

More info on the noise model httpsgithubcomethz-aslkalibrwikiIMU-Noise-Model

119901119896 = 119901119896minus1 + 119907119896minus1 119905119896 minus 119905119896minus1 +ඵ119905119896minus1

119905119896

119877119882119861(119905) 119886 119905 minus 119887 119905 + 119892)1198891199052

119886(119905) = 119877119882119861(119905) 119886 119905 minus 119887 119905 + 119892

IMU Integration Model

For convenience the IMU Integration Model is normally written as

where

bull 119909 =119901119902119907

represents the IMU state comprising position orientation and velocity

bull 119902 is the IMU orientation 119929119934119913 (usually represented using quaternions)

bull 119906 = 119886119895 ω119895 are the accelerometer and gyroscope measurements in the time interval 119905119896minus1 119905119896

16Trawny Roumeliotis Indirect Kalman filter for 3D attitude estimation Technical Report University of Minnesota 2005 PDF

More info on the noise model httpsgithubcomethz-aslkalibrwikiIMU-Noise-Model

119909119896 = 119891 119909119896minus1 119906

119901119896119902119896119907119896

= 119891

119901119896minus1119902119896minus1119907119896minus1

119906 or more compactly

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

17

Visual Inertial Odometry

Different paradigms exist

bull Loosely coupled

bull Treats VO and IMU as two separate (not coupled) black boxes

bull Each black box estimates pose and velocity from visual (up to a scale) and inertial data (absolute scale)

bull Easy to implement

bull Inaccurate Should not be used

bull Tightly coupled

bull Makes use of the raw sensorsrsquo measurements

bull 2D features

bull IMU readings

bull More accurate

bull More implementation effort

In this lecture we will only see tightly coupled approaches

18

The Loosely Coupled Approach

19

Feature tracking

VOimages

Position (up to a scale) amporientation

IMU Integration

PositionOrientation

Velocity

2D features

FusionRefined Position

OrientationVelocity

IMU measurements

The Tightly Coupled Approach

20

Feature tracking

images

IMU measurements

2D features

FusionRefined Position

OrientationVelocity

Filtering Visual Inertial Formulation

bull System states

21

Tightly coupled

Loosely coupled

Corke Lobo Dias An Introduction to Inertial and Visual Sensing International Journal of Robotics Research (IJRR) 2017 PDF

X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905 1198711 1198712 hellip 119871119896

X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

22

Closed-form Solution (1D case)

bull From the camera we can only get the absolute position 119909 up to a unknown scale factor 119904 thus

119909 = 119904 119909

where 119909 is the relative motion estimated by the camera

bull From the IMU

119909 = 1199090 + 1199070(1199051 minus 1199050) +ඵ1199050

1199051

119886 119905 1198891199052

bull By equating them

119904 119909 = 1199090 + 1199070 1199051 minus 1199050 +ඵ1199050

1199051

119886 119905 1198891199052

bull As shown in [Martinellirsquo14] if we assume to know 1199090 (usually we set it to 0) then for 6DOF both 119956 and 119959120782can be determined in closed form from a single feature observation and 3 views

23Martinelli Closed-form solution of visual-inertial structure from motion International Journal of Computer Vision (IJCV) 2014 PDF

1199050 1199051

1198711

Closed-form Solution (1D case)

24Martinelli Closed-form solution of visual-inertial structure from motion International Journal of Computer Vision (IJCV) 2014 PDF

119904 1199091 = 1199070 1199051 minus 1199050 +ඵ1199050

1199051

119886 119905 1198891199052

119904 1199092 = 1199070 1199052 minus 1199050 +ඵ1199050

1199052

119886 119905 1198891199052

1199091 (1199050minus1199051)1199092 (1199050minus1199052)

1199041199070

=

ඵ1199050

1199051

119886 119905 1198891199052

ඵ1199050

2

119886 119905 1198891199052

1199050 1199051 1199052

1198711

Closed-form Solution (general case)

bull Considers 119873 feature observations and the full 6DOF case

bull Can be used to initialize filters and non-linear optimizers (which always need an initialization point) [3]

bull More complex to derive than the 1D case But it also reaches a linear system of equations that can be solved using the pseudoinverse

25

119935 is the vector of unknownsbull Absolute scale 119904bull Initial velocity 1199070bull 3D Point distances (wrt the first camera) bull Direction of the gravity vector bull Biases

119912 and 119930 contain 2D feature coordinates acceleration and angular velocity measurements

[1] Martinelli Vision and IMU data fusion Closed-form solutions for attitude speed absolute scale and bias determination IEEE Transactions on Robotics (T-RO) 2012 PDF[2] Martinelli Closed-form solution of visual-inertial structure from motion International Journal of Computer Vision (IJCV) 2014 PDF[3] Kaiser Martinelli Fontana Scaramuzza Simultaneous state initialization and gyroscope bias calibration in visual inertial aided navigation IEEE Robotics and Automation Letters (R-AL) 2017 PDF

Outline

• What is an IMU and why do we need it?

• IMU model

• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods

• Camera-IMU extrinsic calibration and Synchronization

26

Non-linear Optimization Methods

VIO is solved as a non-linear least-squares optimization problem over:

27

[1] Jung, Taylor, Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results, International Conference on Computer Vision and Pattern Recognition (CVPR), 2001. PDF
[2] Strelow, Singh, Motion estimation from image and inertial measurements, International Journal of Robotics Research (IJRR), 2004. PDF

$$\{X, L, b_a, b_g\} = \arg\min_{X, L, b_a, b_g} \; \sum_{k=1}^{N} \big\lVert f(x_{k-1}, u) - x_k \big\rVert^2_{\Lambda_k} \;+\; \sum_{k=1}^{N} \sum_{i=1}^{M} \big\lVert \pi(x_k, L_i) - z_{ik} \big\rVert^2_{\Sigma_{ik}}$$

(first sum: IMU residuals; second sum: reprojection residuals, the Bundle Adjustment term)

Non-linear Optimization Methods

VIO is solved as a non-linear least-squares optimization problem, where:

• X = {x₁, …, x_N}: set of robot state estimates x_k (position, velocity, orientation) at frame times k

• L = {L₁, …, L_M}: set of 3D landmarks

• f(x_{k−1}, u): pose prediction from the integration of the IMU measurements u = {ã_j, ω̃_j}

• π(x_k, L_i): measurement prediction from the projection of landmark L_i onto camera frame I_k; z_ik are the observed features

• Λ_k: state covariance from the IMU integration f(x_{k−1}, u)

• Σ_ik: covariance of the noisy 2D feature measurements

28

$$\{X, L, b_a, b_g\} = \arg\min_{X, L, b_a, b_g} \; \sum_{k=1}^{N} \big\lVert f(x_{k-1}, u) - x_k \big\rVert^2_{\Lambda_k} \;+\; \sum_{k=1}^{N} \sum_{i=1}^{M} \big\lVert \pi(x_k, L_i) - z_{ik} \big\rVert^2_{\Sigma_{ik}}$$

(first sum: IMU residuals; second sum: reprojection residuals, the Bundle Adjustment term)

[1] Jung, Taylor, Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results, International Conference on Computer Vision and Pattern Recognition (CVPR), 2001. PDF
[2] Strelow, Singh, Motion estimation from image and inertial measurements, International Journal of Robotics Research (IJRR), 2004. PDF
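The structure of this cost function can be illustrated with a toy 1D example. The sketch below is not a real VIO pipeline: it uses noiseless 1D "IMU" odometry terms and 1D range-like "reprojection" terms, so the problem becomes linear and can be solved in one least-squares step; all state and landmark values are made up. A prior on the first state fixes the gauge freedom.

```python
import numpy as np

# Toy 1D "VIO as least squares": unknowns are four robot positions x1..x4 and
# two landmark positions L1, L2. IMU terms link consecutive states, vision-like
# terms link states to landmarks, and a prior on x1 fixes the gauge.
x_true = np.array([0.0, 1.0, 2.0, 3.0])
L_true = np.array([5.0, 10.0])
u = np.diff(x_true)                    # "IMU" relative-motion measurements
z = L_true[:, None] - x_true[None, :]  # "vision" measurements z_ik = L_i - x_k

rows, rhs = [], []
def add_residual(coeffs, value):
    row = np.zeros(6)                  # unknown ordering: [x1..x4, L1, L2]
    for idx, c in coeffs:
        row[idx] = c
    rows.append(row)
    rhs.append(value)

add_residual([(0, 1.0)], 0.0)                          # prior: x1 = 0
for k in range(1, 4):                                  # IMU: x_k - x_{k-1} = u_k
    add_residual([(k, 1.0), (k - 1, -1.0)], u[k - 1])
for i in range(2):                                     # vision: L_i - x_k = z_ik
    for k in range(4):
        add_residual([(4 + i, 1.0), (k, -1.0)], z[i, k])

theta, *_ = np.linalg.lstsq(np.vstack(rows), np.array(rhs), rcond=None)
# theta recovers [x1..x4, L1, L2] exactly in this noiseless linear case
```

In real VIO the residuals are non-linear in the states (rotations, projections), so the same stacked system is re-linearized and solved iteratively (Gauss-Newton / Levenberg-Marquardt).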

Case Study 1: OKVIS

Because the complexity of the optimization is cubic in the number of camera poses and features, real-time operation becomes infeasible as the trajectory and the map grow over time. OKVIS proposed to optimize only the current pose and a window of past keyframes.

29
Leutenegger, Lynen, Bosse, Siegwart, Furgale, Keyframe-based visual–inertial odometry using nonlinear optimization, International Journal of Robotics Research (IJRR), 2015. PDF

Case Study 2: SVO+GTSAM

Solves the same optimization problem as OKVIS, but:

• Keeps all the frames (from the start to the end of the trajectory)

• To make the optimization efficient:
  • Marginalizes 3D landmarks: rather than triangulating landmarks, only their visual bearing measurements are kept, and the visual constraints are enforced via the Epipolar Line Distance (Lecture 08)
  • Pre-integrates the IMU data between keyframes (see later)

• Optimization solved using factor graphs via GTSAM

• Very fast because it only optimizes the poses that are affected by a new observation

30
Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb 2017. Best Paper Award 2018

SVO+GTSAM

31
Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb 2017. Best Paper Award 2018

SVO+GTSAM

32
Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb 2017. Best Paper Award 2018

Problem with IMU integration

• The integration of IMU measurements f(x_{k−1}, u) from k−1 to k depends on the state estimate at time k−1

• During optimization, every time the linearization point at k−1 changes, the integration between k−1 and k must be re-evaluated, thus slowing down the optimization

• Idea: Preintegration
  • Defines relative motion increments, expressed in the body frame, which are independent of the global position, orientation, and velocity at k [1]
  • [2] uses this theory by leveraging the manifold structure of the rotation group SO(3)

33

[1] Lupton, Sukkarieh, Visual-inertial-aided navigation for high-dynamic motion in built environments without initial conditions, IEEE Transactions on Robotics (T-RO), 2012. PDF
[2] Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb 2017. Best Paper Award 2018

$$\{X, L, b_a, b_g\} = \arg\min_{X, L, b_a, b_g} \; \sum_{k=1}^{N} \big\lVert f(x_{k-1}, u) - x_k \big\rVert^2_{\Lambda_k} \;+\; \sum_{k=1}^{N} \sum_{i=1}^{M} \big\lVert \pi(x_k, L_i) - z_{ik} \big\rVert^2_{\Sigma_{ik}}$$

(first sum: IMU residuals; second sum: reprojection residuals, the Bundle Adjustment term)

IMU Pre-Integration

34

Measurements (ω̃, ã) → relative increments ΔR̃, Δṽ, Δp̃; states R_k, p_k, v_k.

Standard — evaluate the error in the global frame:

e_R = R̂(ω̃, R_{k−1})ᵀ R_k
e_v = v̂(ω̃, ã, v_{k−1}) − v_k
e_p = p̂(ω̃, ã, p_{k−1}) − p_k

→ the integration must be repeated whenever the previous state changes.

Preintegration — evaluate relative errors:

e_R = (ΔR̃)ᵀ ΔR
e_v = Δṽ − Δv
e_p = Δp̃ − Δp

→ preintegration of the IMU deltas is possible, with no initial condition required.

(Quantities with a tilde/hat are predictions from the measurements; the others are the current estimates.)
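The relative increments can be sketched numerically. The following is a minimal Euler-integration sketch (biases and noise ignored; function names and test values are illustrative, not the paper's implementation): the deltas depend only on the measurements, so they can be cached and re-used when the linearization point of state k−1 changes.

```python
import numpy as np

def skew(w):
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def so3_exp(phi):
    # Rodrigues' formula: exponential map of SO(3)
    theta = np.linalg.norm(phi)
    if theta < 1e-9:
        return np.eye(3) + skew(phi)
    K = skew(phi / theta)
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def preintegrate(gyro, accel, dt):
    """Accumulate the relative motion increments (dR, dv, dp), expressed in
    the body frame at time k-1; they do not involve the global pose/velocity."""
    dR, dv, dp = np.eye(3), np.zeros(3), np.zeros(3)
    for w, a in zip(gyro, accel):
        dp = dp + dv * dt + 0.5 * (dR @ a) * dt ** 2
        dv = dv + (dR @ a) * dt
        dR = dR @ so3_exp(np.asarray(w) * dt)
    return dR, dv, dp

def predict(R, v, p, dR, dv, dp, Dt, g=np.array([0.0, 0.0, -9.81])):
    # Re-usable prediction of state k from state k-1 using the cached deltas
    return R @ dR, v + g * Dt + R @ dv, p + v * Dt + 0.5 * g * Dt ** 2 + R @ dp

# 1 s of constant body acceleration, no rotation (made-up values)
gyro = [np.zeros(3)] * 100
accel = [np.array([1.0, 0.0, 0.0])] * 100
dR, dv, dp = preintegrate(gyro, accel, 0.01)
R_new, v_new, p_new = predict(np.eye(3), np.zeros(3), np.zeros(3), dR, dv, dp, 1.0)
```

If the optimizer later moves state k−1, only the cheap `predict` call is repeated; the expensive loop in `preintegrate` is not.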

GTSAM

• GTSAM [1] is a library that implements sensor fusion based on factor graphs and Bayes trees

• Factor graphs are a tool borrowed from machine learning [2] that:
  • naturally capture the spatial dependencies among different sensor measurements
  • can represent many robotics problems, like pose graphs and SLAM

35

[1] Dellaert, Kaess, Factor Graphs for Robot Perception, Foundations and Trends in Robotics, 2017. PDF Code
[2] Kschischang, Frey, Loeliger, Factor Graphs and the Sum-Product Algorithm, IEEE Transactions on Information Theory, 2001. PDF

GTSAM

bull GTSAM uses factor graphs and Bayes trees to implement incremental inference

• A Bayes tree is a data structure (a directed tree) that visually represents the connectivity among the past robot states and measurements

• Incremental inference means that, when new robot states or measurements are added to the graph, only a local part of the tree is updated

36
[1] Kaess, Ranganathan, Dellaert, "iSAM: Incremental Smoothing and Mapping", IEEE Transactions on Robotics (T-RO), 2008. PDF

Outline

• What is an IMU and why do we need it?

• IMU model

• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods

• Camera-IMU extrinsic calibration and Synchronization

37

Non-linear optimization vs Filtering

38

Non-linear optimization methods:
• Optimize a window of multiple states (or all the states) using non-linear least-squares optimization
• Multiple iterations (re-linearizes at each iteration)
• Achieves the highest accuracy
• Slower (but fast with GTSAM)

Filtering methods:
• Solve the same problem by running only one iteration of the optimization function (e.g., using an Extended Kalman Filter (EKF))
• 1 linearization iteration only
• Sensitive to the linearization point
• Fastest

Case Study 1: ROVIO

• EKF state: X = [p(t), q(t), v(t), b_a(t), b_g(t), L₁, L₂, …, L_k]

• Basic idea:
  1. Prediction step: predicts the next position, velocity, orientation, and features using the IMU integration model
  2. Measurement update: refines the state by leveraging the visual constraint (ROVIO minimizes the photometric error between corresponding points; an alternative would be the reprojection error)

39
Bloesch, Burri, Omari, Hutter, Siegwart, Iterated extended Kalman filter based visual-inertial odometry using direct photometric feedback, International Journal of Robotics Research (IJRR), 2017. PDF Code

ROVIO Problems

• The complexity of the EKF grows quadratically with the number of estimated landmarks; thus, a maximum of ~20 landmarks are tracked to allow real-time operation

• Only updates the most recent state

40
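The predict/update structure used by filtering methods can be sketched with a minimal linear Kalman filter in 1D. This is only an illustration of the idea, not ROVIO's photometric update; all noise values are made up, and the measurements here are noise-free so the filter tracks the truth exactly.

```python
import numpy as np

dt = 0.01
F = np.array([[1.0, dt], [0.0, 1.0]])   # transition for state [position, velocity]
B = np.array([0.5 * dt ** 2, dt])       # how acceleration enters the state
H = np.array([[1.0, 0.0]])              # vision observes position only
Q = 1e-6 * np.eye(2)                    # process (IMU) noise, made up
Rn = np.array([[1e-2]])                 # visual measurement noise, made up

def kf_step(x, P, a_meas, z_vis):
    # Prediction step with the IMU integration model
    x = F @ x + B * a_meas
    P = F @ P @ F.T + Q
    # Measurement update with the visual position constraint
    y = z_vis - H @ x                   # innovation
    S = H @ P @ H.T + Rn
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    x = x + (K @ y).ravel()
    P = (np.eye(2) - K @ H) @ P
    return x, P

# Constant acceleration of 1 m/s^2 for 2 s
x_est, P = np.zeros(2), np.eye(2)
x_true = np.zeros(2)
for _ in range(200):
    x_true = F @ x_true + B * 1.0
    x_est, P = kf_step(x_est, P, 1.0, x_true[0])
```

Note the contrast with the optimization-based methods: each new measurement triggers exactly one linearized update of the current state, instead of re-optimizing a whole window.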

Case Study 2: MSCKF

• MSCKF (Multi-State Constraint Kalman Filter) updates multiple past poses p_C1, q_C1, …, p_CN, q_CN in addition to the current state p(t), q(t), v(t). State vector:

X = [p(t), q(t), v(t), b_a(t), b_g(t), p_C1, q_C1, …, p_CN, q_CN]

• Prediction step: same as ROVIO

• Measurement update — differently from ROVIO:
  • landmark positions are not added to the state vector, so it can run very fast, independently of the number of features
  • the visual constraint is obtained from the Epipolar Line Distance (Lecture 08)

• Used in spacecraft landing (NASA/JPL Moon and Mars landings), DJI drones, Google ARCore

• Released open source within the OpenVins project: https://github.com/rpng/open_vins

41

[1] Mourikis, Roumeliotis, A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation, International Conference on Robotics and Automation (ICRA), 2007. PDF
[2] Li, Mourikis, High-precision, consistent EKF-based visual–inertial odometry, International Journal of Robotics Research (IJRR), 2013. PDF

MSCKF running in Google ARCore (former Google Tango)

42

[1] Mourikis, Roumeliotis, A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation, International Conference on Robotics and Automation (ICRA), 2007. PDF
[2] Li, Mourikis, High-precision, consistent EKF-based visual–inertial odometry, International Journal of Robotics Research (IJRR), 2013. PDF

Outline

• What is an IMU and why do we need it?

• IMU model

• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods

• Camera-IMU extrinsic calibration and Synchronization

43

Camera-IMU Calibration

• Goal: estimate the rigid-body transformation T_BC and the time offset t_d between the camera and the IMU, caused by communication delays and internal sensor delays (introduced by filters and logic)

• Assumptions: camera and IMU rigidly attached; camera intrinsically calibrated

• Data:
  • Image points from a calibration pattern (checkerboard or QR board)
  • IMU measurements: accelerometer ã_k and gyroscope ω̃_k

44

[Figure: image and IMU measurement timelines, offset by the time delay t_d]

Kalibr Toolbox

45
Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF Code

• Code: https://github.com/ethz-asl/kalibr/wiki/camera-imu-calibration

Kalibr Toolbox

46
Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF Code

• Solves a non-linear least-squares optimization problem, similar to the one seen before, but also optimizing over T_BC and t_d

• Continuous-time modelling of X using splines

• Numerical solver: Levenberg-Marquardt

$$\{X, L, T_{BC}, t_d, b_a, b_g\} = \arg\min_{X, L, T_{BC}, t_d, b_a, b_g} \; \sum_{k=1}^{N} \big\lVert f(x_{k-1}, u) - x_k \big\rVert^2_{\Lambda_k} \;+\; \sum_{k=1}^{N} \sum_{i=1}^{M} \big\lVert \pi(x_k, L_i) - z_{ik} \big\rVert^2_{\Sigma_{ik}}$$

(first sum: IMU residuals; second sum: reprojection residuals, the Bundle Adjustment term)

Kalibr Toolbox

47
Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF Code

• Generates a report after optimizing the cost function:

Residuals:
  Reprojection error [px]:      0.0976  0.051
  Gyroscope error [rad/s]:      0.0167  0.009
  Accelerometer error [m/s^2]:  0.0595  0.031

Transformation T_ci (imu to cam):
  [[ 0.99995526  -0.00934911  -0.00143776   0.00008436]
   [ 0.00936458   0.99989388   0.01115983   0.00197427]
   [ 0.00133327  -0.0111728    0.99993669  -0.05054946]
   [ 0.           0.           0.           1.        ]]

Time shift (delay d), cam0 to imu0 [s] (t_imu = t_cam + shift):
  0.00270636270255
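Downstream, the estimated time shift is applied before fusing the two streams, e.g., by moving the camera timestamps into the IMU clock and interpolating the IMU channels at those corrected instants. A small sketch (the signal and rates below are made up; `np.interp` performs plain linear interpolation):

```python
import numpy as np

# Move camera timestamps into the IMU clock using the estimated delay,
# then sample an IMU channel at those corrected times.
shift = 0.00270636270255                  # [s], t_imu = t_cam + shift
cam_stamps = np.array([0.0, 0.05, 0.10])  # camera frame times, camera clock
cam_in_imu = cam_stamps + shift           # same instants in the IMU clock

imu_t = np.arange(0.0, 0.2, 0.005)        # 200 Hz IMU timestamps (made up)
imu_gyro_z = np.sin(imu_t)                # a fake gyroscope channel
gyro_at_frames = np.interp(cam_in_imu, imu_t, imu_gyro_z)
```

Ignoring even a few milliseconds of offset correlates the wrong IMU samples with each image, which degrades the fused estimate, especially during fast motion.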

Popular Datasets for VIO VI-SLAM

48

• EuRoC: drone with IMU and stereo camera

• Blackbird: drone flying fast with VR goggles (synthetic images and real dynamics + IMU)

• UZH FPV Drone Racing: drone flying fast with event camera, stereo camera, IMU

• MVSEC: drone, motorcycle, car with event camera, stereo camera, 3D lidar, GPS, IMU

Understanding Check

Are you able to answer the following questions?

• Why is it recommended to use an IMU for Visual Odometry?

• Why not just use an IMU, without a camera?

• What is the basic idea behind MEMS IMUs?

• What is the drift of a consumer IMU?

• What is the IMU measurement model (formula)?

• What causes the bias in an IMU?

• How do we model the bias?

• How do we integrate the acceleration to get the position (formula)?

• What is the definition of loosely coupled and tightly coupled visual inertial fusion?

• How does non-linear optimization-based visual inertial odometry work? Can you write down the cost function and illustrate its meaning?

49


Why do we need an IMU

• Monocular vision is scale-ambiguous

• Pure vision is not robust enough:
  • Low texture
  • High Dynamic Range (HDR)
  • High-speed motion

Robustness is a critical issue. Tesla accident: "The autopilot sensors on the Model S failed to distinguish a white tractor-trailer crossing the highway against a bright sky." [The Guardian]

9

[Images: high-dynamic-range scene; motion blur]

Why is IMU alone not enough

• Pure IMU integration will lead to large drift (especially with cheap IMUs). We will see this later mathematically. Intuition:

  • Double integration of acceleration to get position, e.g., in the 1D scenario:

$$x = x_0 + v_0(t_1 - t_0) + \iint_{t_0}^{t_1} a(t)\, dt^2$$

    If there is a constant bias in the acceleration, the position error grows proportionally to t²

  • Integration of angular velocity to get orientation: if there is a bias in the angular velocity, the orientation error grows proportionally to the time t

10

Table from VectorNav, one of the best IMU companies; errors were computed assuming the device at rest: https://www.vectornav.com/resources/ins-error-budget
[Table: position drift of automotive- and smartphone/drone-grade accelerometers]
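The quadratic error growth is easy to reproduce numerically: double-integrating a small constant accelerometer bias (0.1 m/s², a made-up value) for just 10 s already yields about 5 m of position drift.

```python
# Double-integrating a constant accelerometer bias: position error grows as t^2.
dt, T, bias = 0.01, 10.0, 0.1    # step [s], duration [s], bias [m/s^2] (made up)
v = p = 0.0
for _ in range(int(T / dt)):
    a_meas = 0.0 + bias          # true acceleration is zero; only the bias remains
    v += a_meas * dt
    p += v * dt
# p is approximately 0.5 * bias * T^2 = 5 m of drift after only 10 s
```

Doubling the duration quadruples the drift, which is why IMU-only dead reckoning is unusable beyond a few seconds with consumer-grade sensors.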

Why visual inertial fusion?

• IMU and vision are complementary

• What cameras and IMUs have in common: both estimate the pose incrementally (known as dead-reckoning), which suffers from drift over time. Solution: loop detection and loop closure

Cameras:
✓ Precise in slow motion
✓ More informative than an IMU (measure light energy from the environment)
✗ Limited output rate (~100 Hz)
✗ Scale ambiguity in monocular setup
✗ Lack of robustness to HDR scenes and high-speed motion

IMU:
✓ Insensitive to motion blur, HDR, texture
✓ Can predict the next feature position
✓ High output rate (~1,000 Hz)
✓ The higher the acceleration, the more accurate the IMU (due to higher signal-to-noise ratio)
✗ Large relative uncertainty at low acceleration/angular velocity
✗ Ambiguity between gravity and acceleration (the IMU measures the total acceleration)

11

Outline

• What is an IMU and why do we need it?

• IMU model

• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods

• Camera-IMU extrinsic calibration and Synchronization

12

IMU Measurement Model

• Measures the angular velocity ω_B(t) and the acceleration ã_B(t) in the body frame B. The raw measurements are corrupted by the IMU biases b and noise n; the quantities to estimate are the true ω (in the body frame) and the true a (in the world frame):

$$\tilde\omega_B(t) = \omega_B(t) + b_g(t) + n_g(t)$$

$$\tilde a_B(t) = R_{BW}(t)\,\big(a_W(t) - g\big) + b_a(t) + n_a(t)$$

Notation:

• The superscript g stands for gyroscope and a for accelerometer

• R_BW is the rotation of the World frame W with respect to the Body frame B

• The gravity g is expressed in the World frame

• Biases and noise are expressed in the body frame

13
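A quick simulation of this measurement model (all numbers are illustrative, and the body frame is aligned with the world frame for simplicity, so R_BW = I):

```python
import numpy as np

rng = np.random.default_rng(0)

g_W = np.array([0.0, 0.0, -9.81])   # gravity in the world frame
R_BW = np.eye(3)                     # body aligned with world, for simplicity
a_W = np.array([1.0, 0.0, 0.0])      # true acceleration in the world frame
w_B = np.array([0.0, 0.0, 0.1])      # true angular velocity in the body frame
b_a = np.array([0.05, 0.0, 0.0])     # accelerometer bias (made up)
b_g = np.array([0.01, 0.0, 0.0])     # gyroscope bias (made up)
sigma_a, sigma_g = 0.02, 0.002       # noise standard deviations (made up)

# a~_B(t) = R_BW(t) (a_W(t) - g) + b_a(t) + n_a(t)
a_meas = R_BW @ (a_W - g_W) + b_a + sigma_a * rng.standard_normal(3)
# w~_B(t) = w_B(t) + b_g(t) + n_g(t)
w_meas = w_B + b_g + sigma_g * rng.standard_normal(3)
# Note: at rest (a_W = 0) the accelerometer reads about +9.81 m/s^2 along z,
# the reaction to gravity, not gravity itself.
```

The −g term is what creates the gravity/acceleration ambiguity mentioned earlier: the sensor cannot separate the two without knowing its orientation.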

IMU Noise Model

• Additive Gaussian white noise: the measurement noise terms n(t) are modeled as zero-mean Gaussian white noise

• Bias: the gyroscope and accelerometer biases are considered slowly varying "constants". Their temporal fluctuation is modeled by saying that the derivative of the bias is a zero-mean Gaussian noise with standard deviation σ_b:

$$\dot b(t) = w(t), \qquad w(t) \sim \mathcal N(0, \sigma_b^2)$$

• Some facts about IMU biases:
  • They change with temperature and with mechanical and atmospheric pressure
  • They may change every time the IMU is started
  • Good news: they can be estimated (see later)

14

Trawny, Roumeliotis, Indirect Kalman filter for 3D attitude estimation, Technical Report, University of Minnesota, 2005. PDF
More info on the noise model: https://github.com/ethz-asl/kalibr/wiki/IMU-Noise-Model
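The bias random walk can be simulated by accumulating Gaussian increments scaled by √dt (σ_b and the rates below are made-up values, not a specific sensor's datasheet):

```python
import numpy as np

# Discrete simulation of the bias model b'(t) = w(t), w ~ N(0, sigma_b^2):
# in discrete time the bias accumulates Gaussian increments scaled by sqrt(dt).
rng = np.random.default_rng(1)
sigma_b, dt, n = 1e-3, 0.01, 10_000   # made-up continuous-time bias density
increments = sigma_b * np.sqrt(dt) * rng.standard_normal(n)
bias = np.cumsum(increments)          # a slowly varying "constant"
```

The √dt scaling is what makes the discrete simulation consistent with the continuous-time model: halving dt doubles the number of steps but keeps the variance accumulated per second the same.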

IMU Integration Model

• The IMU integration model computes the position, orientation, and velocity of the IMU in the world frame. To do this, we must first compute the acceleration a(t) in the world frame from the measured one ã(t) in the body frame (see Slide 13):

$$a(t) = R_{WB}(t)\,\big(\tilde a(t) - b(t)\big) + g$$

• The position p_k at time t_k can then be predicted from the position p_{k−1} at time t_{k−1} by integrating all the inertial measurements {ã_j, ω̃_j} within that time interval:

$$p_k = p_{k-1} + v_{k-1}(t_k - t_{k-1}) + \iint_{t_{k-1}}^{t_k} \big[\, R_{WB}(t)\,(\tilde a(t) - b(t)) + g \,\big]\, dt^2$$

• Similar expressions can be obtained to predict the velocity v_k and the orientation R_WB of the IMU in the world frame, as functions of both ã_j and ω̃_j

15

Trawny, Roumeliotis, Indirect Kalman filter for 3D attitude estimation, Technical Report, University of Minnesota, 2005. PDF
More info on the noise model: https://github.com/ethz-asl/kalibr/wiki/IMU-Noise-Model
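A discrete (Euler) sketch of this integration model, with illustrative function names and values. As a physical check, a stationary IMU that measures exactly +9.81 m/s² along z (the gravity reaction, zero bias) should produce zero motion:

```python
import numpy as np

def skew(w):
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def integrate_imu(p, v, R, accels, gyros, b_a, dt, g=np.array([0.0, 0.0, -9.81])):
    """Euler integration of one IMU interval: rotate the bias-corrected body
    acceleration to the world frame, add gravity, then propagate p, v, R."""
    for a_m, w_m in zip(accels, gyros):
        a_world = R @ (a_m - b_a) + g          # a(t) = R_WB (a~ - b) + g
        p = p + v * dt + 0.5 * a_world * dt ** 2
        v = v + a_world * dt
        R = R @ (np.eye(3) + skew(w_m * dt))   # first-order orientation update
    return p, v, R

# Stationary IMU: measures exactly the gravity reaction, no rotation
accels = [np.array([0.0, 0.0, 9.81])] * 100
gyros = [np.zeros(3)] * 100
p, v, R = integrate_imu(np.zeros(3), np.zeros(3), np.eye(3),
                        accels, gyros, np.zeros(3), 0.01)
# p and v stay at zero: the measured specific force cancels gravity exactly
```

Real implementations use a proper SO(3) exponential for the orientation update and higher-order (e.g., midpoint or RK4) integration; the first-order form above is only to make the structure of the model visible.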

IMU Integration Model

For convenience, the IMU integration model is normally written as

$$x_k = f(x_{k-1}, u), \qquad \text{i.e.,} \qquad \begin{bmatrix} p_k \\ q_k \\ v_k \end{bmatrix} = f\!\left( \begin{bmatrix} p_{k-1} \\ q_{k-1} \\ v_{k-1} \end{bmatrix},\, u \right)$$

where:

• x = [p, q, v] represents the IMU state, comprising position, orientation, and velocity

• q is the IMU orientation R_WB (usually represented using quaternions)

• u = {ã_j, ω̃_j} are the accelerometer and gyroscope measurements in the time interval [t_{k−1}, t_k]

16

Trawny, Roumeliotis, Indirect Kalman filter for 3D attitude estimation, Technical Report, University of Minnesota, 2005. PDF
More info on the noise model: https://github.com/ethz-asl/kalibr/wiki/IMU-Noise-Model

Outline

• What is an IMU and why do we need it?

• IMU model

• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods

• Camera-IMU extrinsic calibration and Synchronization

17

Visual Inertial Odometry

Different paradigms exist:

• Loosely coupled:
  • Treats VO and IMU as two separate (not coupled) black boxes
  • Each black box estimates pose and velocity, from visual (up to a scale) and inertial data (absolute scale)
  • Easy to implement
  • Inaccurate; should not be used

• Tightly coupled:
  • Makes use of the raw sensor measurements: 2D features and IMU readings
  • More accurate
  • More implementation effort

In this lecture we will only see tightly coupled approaches.

18

The Loosely Coupled Approach

19

[Pipeline: images → feature tracking → 2D features → VO → position (up to a scale) & orientation → fusion; IMU measurements → IMU integration → position, orientation, velocity → fusion → refined position, orientation, and velocity]

The Tightly Coupled Approach

20

[Pipeline: images → feature tracking → 2D features → fusion; IMU measurements → fusion → refined position, orientation, and velocity]

Filtering Visual Inertial Formulation

• System states:

Tightly coupled: X = [p(t), q(t), v(t), b_a(t), b_g(t), L₁, L₂, …, L_k]

Loosely coupled: X = [p(t), q(t), v(t), b_a(t), b_g(t)]

21

Corke, Lobo, Dias, An Introduction to Inertial and Visual Sensing, International Journal of Robotics Research (IJRR), 2007. PDF

Outline

• What is an IMU and why do we need it?

• IMU model

• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods

• Camera-IMU extrinsic calibration and Synchronization

22

Closed-form Solution (1D case)

bull From the camera we can only get the absolute position 119909 up to a unknown scale factor 119904 thus

119909 = 119904 119909

where 119909 is the relative motion estimated by the camera

bull From the IMU

119909 = 1199090 + 1199070(1199051 minus 1199050) +ඵ1199050

1199051

119886 119905 1198891199052

bull By equating them

119904 119909 = 1199090 + 1199070 1199051 minus 1199050 +ඵ1199050

1199051

119886 119905 1198891199052

bull As shown in [Martinellirsquo14] if we assume to know 1199090 (usually we set it to 0) then for 6DOF both 119956 and 119959120782can be determined in closed form from a single feature observation and 3 views

23Martinelli Closed-form solution of visual-inertial structure from motion International Journal of Computer Vision (IJCV) 2014 PDF

1199050 1199051

1198711

Closed-form Solution (1D case)

24Martinelli Closed-form solution of visual-inertial structure from motion International Journal of Computer Vision (IJCV) 2014 PDF

119904 1199091 = 1199070 1199051 minus 1199050 +ඵ1199050

1199051

119886 119905 1198891199052

119904 1199092 = 1199070 1199052 minus 1199050 +ඵ1199050

1199052

119886 119905 1198891199052

1199091 (1199050minus1199051)1199092 (1199050minus1199052)

1199041199070

=

ඵ1199050

1199051

119886 119905 1198891199052

ඵ1199050

2

119886 119905 1198891199052

1199050 1199051 1199052

1198711

Closed-form Solution (general case)

bull Considers 119873 feature observations and the full 6DOF case

bull Can be used to initialize filters and non-linear optimizers (which always need an initialization point) [3]

bull More complex to derive than the 1D case But it also reaches a linear system of equations that can be solved using the pseudoinverse

25

119935 is the vector of unknownsbull Absolute scale 119904bull Initial velocity 1199070bull 3D Point distances (wrt the first camera) bull Direction of the gravity vector bull Biases

119912 and 119930 contain 2D feature coordinates acceleration and angular velocity measurements

[1] Martinelli Vision and IMU data fusion Closed-form solutions for attitude speed absolute scale and bias determination IEEE Transactions on Robotics (T-RO) 2012 PDF[2] Martinelli Closed-form solution of visual-inertial structure from motion International Journal of Computer Vision (IJCV) 2014 PDF[3] Kaiser Martinelli Fontana Scaramuzza Simultaneous state initialization and gyroscope bias calibration in visual inertial aided navigation IEEE Robotics and Automation Letters (R-AL) 2017 PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

26

Non-linear Optimization Methods

VIO is solved as non-linear Least Square optimization problem over

27

[1] Jung Taylor Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results International Conference on Computer Vision and Pattern Recognition (CVPR) 2001 PDF[2] Sterlow Singh Motion estimation from image and inertial measurements International Journal of Robotics Research (IJRR) 2004 PDF

X L 119887119886 119887119892 = 119886119903119892119898119894119899X L 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

Non-linear Optimization Methods

VIO is solved as non-linear Least Square optimization problem over

where

bull 119883 = 1199091 hellip 119909119873 set of robot state estimates 119909119896 (position velocity orientation) at frame times 119896

bull 119871 = 1198711 hellip 119871119872 3D landmarks

bull 119891 119909119896minus1 119906 pose prediction update from integration of IMU measurements 119906 = 119886119895 ω119895

bull 120587(119909119896 119897119894) measurement update from projection of landmark 119871119894 onto camera frame 119868119896bull 119911119894119896 observed features

bull 120556119896 state covariance from the IMU integration 119891 119909119896minus1 119906

bull 120564119894119896 is the covariance from the noisy 2D feature measurements

28

X L 119887119886 119887119892 = 119886119903119892119898119894119899X L 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

[1] Jung Taylor Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results International Conference on Computer Vision and Pattern Recognition (CVPR) 2001 PDF[2] Sterlow Singh Motion estimation from image and inertial measurements International Journal of Robotics Research (IJRR) 2004 PDF

Case Study 1 OKVIS

Because the complexity of the optimization is cubic with respect to the number of cameras poses and features real-time operation becomes infeasible as the trajectory and the map grow over time OKVIS proposed to only optimize the current pose and a window of past keyframes

29Leutenegger Lynen Bosse Siegwart Furgale Keyframe-based visualndashinertial odometry using nonlinear optimization International Journal

of Robotics Research (IJRR) 2015 PDF

Case Study 2 SVO+GTSAM

Solves the same optimization problem as OKVIS but

bull Keeps all the frames (from the start to the end of the trajectory)

bull To make the optimization efficient

bull Marginalizes 3D landmarks (rather than triangulating landmarks only their visual bearing measurements are kept and the visual constraintare enforced by the Epipolar Line Distance (Lecture 08))

bull pre-integrates the IMU data between keyframes (see later)

bull Optimization solved using Factor Graphs via GTSAM

bull Very fast because it only optimizes the poses that are affected by a new observation

30Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry

IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

SVO+GTSAM

31Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry

IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

SVO+GTSAM

32Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry

IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

Problem with IMU integration

bull The integration of IMU measurements 119891 119909119896minus1 119906 from 119896 minus 1 to 119896 is related to the state estimation at time 119896 minus 1

bull During optimization every time the linearization point at 119896 minus 1 changes the integration between 119896 minus 1and 119896 must be re-evaluated thus slowing down the optimization

bull Idea Preintegration

bull defines relative motion increments expressed in body frame which are independent on the global position orientation and velocity at 119896 [1]

bull [2] uses this theory by leveraging the manifold structure of the rotation group SO(3)

33

[1] Lupton Sukkarieh Visual-inertial-aided navigation for high-dynamic motion in built environments without initial conditions IEEE Transactions on Robotics (T-RO) 2012 PDF[2] Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

X L 119887119886 119887119892 = 119886119903119892119898119894119899X L 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

IMU Pre-Integration

34

120596 119886 Δ ෨119877 Δ 119907 Δ 119901

119877119896 119901119896 v119896

StandardEvaluate error in global frame

PreintegrationEvaluate relative errors

119942119877 = Δ ෨119877119879 Δ119877

119942v = Δv minus Δv

119942119901 = Δ 119901 minus Δ119901

119942119877 = 119877 120596 119877119896minus1119879119877119896

119942v = ොv(120596 119886 v119896minus1) minus v119896

119942119901 = Ƹ119901(120596 119886 119901119896minus1) minus 119901119896

Repeat integration when previous state changes

Preintegration of IMU deltas possible with no initial condition required

Prediction Estimate

GTSAM

bull GTSAM [1] is a library that implements sensor fusion based on factor graphs and Bayes trees

bull Factor graphs are a tool borrowed from machine learning [2] that

bull naturally capture the spatial dependences among different sensor measurements

bull can represent many robotics problems like pose graphs and SLAM

35

[1] Dellaert and Kaess, Factor Graphs for Robot Perception, Foundations and Trends in Robotics, 2017. PDF, Code
[2] Kschischang, Frey, Loeliger, Factor Graphs and the Sum-Product Algorithm, Transactions on Information Theory, 2001. PDF
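GTSAM's own API is not shown here; as a hedged illustration of what a factor graph encodes, the toy below builds the sparse least-squares problem for a 1-D pose chain (all measurements and noise weights are made up). Each factor touches only a few variables, which is exactly the sparsity a factor graph captures:

```python
import numpy as np

# Toy 1-D pose chain x0, x1, x2.
# Each factor: (list of (variable index, coefficient), measurement).
factors = [
    ([(0, 1.0)], 0.0),             # prior factor: x0 = 0
    ([(0, -1.0), (1, 1.0)], 1.0),  # odometry factor: x1 - x0 = 1
    ([(1, -1.0), (2, 1.0)], 1.0),  # odometry factor: x2 - x1 = 1
    ([(2, 1.0)], 2.1),             # absolute measurement: x2 = 2.1
]

# Stack the factors into a (sparse) Jacobian A and measurement vector b,
# then solve the linear least-squares problem min ||A x - b||^2.
A = np.zeros((len(factors), 3))
b = np.zeros(len(factors))
for r, (terms, z) in enumerate(factors):
    for var, coeff in terms:
        A[r, var] = coeff
    b[r] = z

x, *_ = np.linalg.lstsq(A, b, rcond=None)  # -> [0.025, 1.05, 2.075]
```

In GTSAM the same structure would be expressed by adding prior and between factors to a nonlinear factor graph; the Bayes tree then lets iSAM2 re-solve only the part affected by a new factor.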

GTSAM

• GTSAM uses factor graphs and Bayes trees to implement incremental inference

• A Bayes tree is a data structure (a directed tree) that represents the connectivity among the past robot states and measurements

• Incremental inference means that, when new robot states or measurements are added to the graph, only a local part of the tree is updated

36
[1] Kaess, Ranganathan, Dellaert, "iSAM: Incremental Smoothing and Mapping", IEEE Transactions on Robotics (T-RO), 2008. PDF

Outline

• What is an IMU and why do we need it

• IMU model

• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods

• Camera-IMU extrinsic calibration and Synchronization

37

Non-linear optimization vs Filtering

38

Non-linear optimization methods:
• Optimize a window of multiple states (or all the states) using non-linear least-squares optimization
• Multiple iterations (re-linearizes at each iteration)
• Achieves the highest accuracy
• Slower (but fast with GTSAM)

Filtering methods:
• Solve the same problem by running only one iteration of the optimization (e.g., using an Extended Kalman Filter (EKF))
• One linearization iteration only
• Sensitive to the linearization point
• Fastest
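The linearization-point sensitivity in the table above can be seen in a toy scalar problem (made-up numbers): estimating x from a measurement z = x² with a poor initial guess, a single Gauss-Newton step (the filter-like regime) lands far from the optimum, while re-linearizing at each iteration converges:

```python
def gauss_newton_step(x, z):
    # Linearize h(x) = x**2 at the current x, solve the scalar normal equation
    r = x * x - z      # residual at the linearization point
    J = 2.0 * x        # Jacobian of h at x
    return x - r / J

z = 4.0                                # true state: x = 2
one_step = gauss_newton_step(1.0, z)   # single iteration from x = 1 -> 2.5

x = 1.0
for _ in range(10):                    # re-linearize each iteration -> 2.0
    x = gauss_newton_step(x, z)
```

The single step commits to the Jacobian evaluated at the bad guess x = 1; the iterated solver recovers because each pass re-linearizes at the improved estimate.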

Case Study 1 ROVIO

• EKF state: X = [ p(t), q(t), v(t), b_a(t), b_g(t), L_1, L_2, …, L_k ]
• Basic idea:

  1. Prediction step: predicts the next position, velocity, orientation, and features using the IMU integration model

  2. Measurement update: refines the state by leveraging a visual constraint (ROVIO minimizes the photometric error between corresponding points; an alternative would be the reprojection error)

39
Bloesch, Burri, Omari, Hutter, Siegwart, Iterated extended Kalman filter based visual-inertial odometry using direct photometric feedback, International Journal of Robotics Research (IJRR), 2017. PDF, Code
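ROVIO's actual state also carries orientation, biases, and landmark parameters, and its update uses photometric errors; as a hedged sketch of the two-step prediction/update loop only, here is a 1-D EKF with state [position, velocity], the IMU acceleration as prediction input, and a generic position-like visual measurement:

```python
import numpy as np

dt = 0.01
F = np.array([[1.0, dt], [0.0, 1.0]])   # state transition for [p, v]
B = np.array([0.5 * dt**2, dt])         # effect of the accel measurement
H = np.array([[1.0, 0.0]])              # the measurement observes p only
Q = 1e-4 * np.eye(2)                    # IMU (process) noise
R = np.array([[1e-2]])                  # visual (measurement) noise

def predict(x, P, a):
    # Prediction step: integrate the IMU measurement a
    return F @ x + B * a, F @ P @ F.T + Q

def update(x, P, z):
    # Measurement update: fuse a scalar position measurement z
    S = H @ P @ H.T + R                 # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    x = x + (K @ (z - H @ x)).ravel()
    P = (np.eye(2) - K @ H) @ P
    return x, P

x, P = np.zeros(2), np.eye(2)
for _ in range(100):                    # 1 s of constant 1 m/s^2 input
    x, P = predict(x, P, 1.0)
x, P = update(x, P, np.array([0.55]))   # then one visual update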

ROVIO Problems

• Complexity of the EKF grows quadratically in the number of estimated landmarks
  • Thus, a maximum of ~20 landmarks are tracked to allow real-time operation

• Only updates the most recent state

40

Case Study 2 MSCKF

• MSCKF (Multi-State Constraint Kalman Filter) updates multiple past poses p_C1, q_C1, …, p_CN, q_CN in addition to the current state p(t), q(t), v(t) (see the state vector below)

• Prediction step: same as ROVIO

• Measurement update: differently from ROVIO,
  • landmark positions are not added to the state vector, so it can run very fast, independently of the number of features
  • the visual constraint is obtained from the Epipolar Line Distance (Lecture 08)

• Used in spacecraft landing (NASA/JPL Moon and Mars landing), DJI drones, Google ARCore

• Released open source within the OpenVINS project: https://github.com/rpng/open_vins

41

[1] Mourikis, Roumeliotis, A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation, International Conference on Robotics and Automation (ICRA), 2007. PDF
[2] Li, Mourikis, High-precision, consistent EKF-based visual-inertial odometry, International Journal of Robotics Research (IJRR), 2013. PDF

X = [ p(t), q(t), v(t), b_a(t), b_g(t), p_C1, q_C1, …, p_CN, q_CN ]
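A distinctive MSCKF mechanism is stochastic cloning: when a new image arrives, the current pose is appended to the state vector and the covariance is augmented accordingly. The numbers and dimensions below are illustrative only (a 2-D "pose" inside a 4-D state); this is a sketch of the augmentation step, not a full filter:

```python
import numpy as np

P = np.diag([0.10, 0.10, 0.20, 0.20])  # covariance of the current state
J = np.array([[1.0, 0.0, 0.0, 0.0],    # Jacobian of the clone w.r.t. the
              [0.0, 1.0, 0.0, 0.0]])   # state: it just copies the pose part

# Augmented covariance after cloning the pose into the state
P_aug = np.block([[P,     P @ J.T],
                  [J @ P, J @ P @ J.T]])
```

Because landmarks never enter the state, the filter size is bounded by the pose-window length N, which is what keeps the update cost independent of the number of tracked features.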

MSCKF running in Google ARCore (formerly Google Tango)

42


Outline

• What is an IMU and why do we need it

• IMU model

• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods

• Camera-IMU extrinsic calibration and Synchronization

43

Camera-IMU Calibration

• Goal: estimate the rigid-body transformation T_BC and the time offset t_d between the camera and the IMU, caused by communication delays and internal sensor delays (introduced by filters and logic)

• Assumptions: camera and IMU rigidly attached; camera intrinsically calibrated

• Data:
  • Image points from a calibration pattern (checkerboard or QR board)
  • IMU measurements: accelerometer a_k and gyroscope ω_k

44

[Figure: timeline showing image timestamps and IMU timestamps offset by the time delay t_d]
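Kalibr estimates t_d jointly inside its spline-based optimization (next slides); a common way to get an initial guess (an assumption of this sketch, not Kalibr's exact procedure) is to cross-correlate the gyroscope rate with the rotation rate derived from the camera and read off the lag of the correlation peak. With synthetic signals:

```python
import numpy as np

rate = 200.0                      # both signals resampled to 200 Hz
t = np.arange(0.0, 5.0, 1.0 / rate)
true_delay = 0.035                # synthetic ground-truth offset [s]

def motion(t):
    # Made-up angular-rate magnitude profile
    return np.sin(2 * np.pi * 0.7 * t) + 0.5 * np.sin(2 * np.pi * 1.9 * t)

gyro = motion(t)                  # |omega| as measured by the IMU
cam = motion(t - true_delay)      # |omega| as recovered from the images

xc = np.correlate(cam - cam.mean(), gyro - gyro.mean(), mode="full")
lag = int(np.argmax(xc)) - (len(t) - 1)   # lag of the peak, in samples
t_d = lag / rate                  # positive: the camera lags the IMU
```

The recovered t_d here matches the injected 0.035 s delay; on real data the peak is broader, which is why Kalibr then refines t_d inside the full optimization.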

Kalibr Toolbox

45
Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

• Code: https://github.com/ethz-asl/kalibr/wiki/camera-imu-calibration

Kalibr Toolbox

46
Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

• Solves a non-linear least-squares optimization problem similar to the one seen before, but also optimizes over T_BC and t_d

• Continuous-time modelling of X using splines

• Numerical solver: Levenberg-Marquardt

\[
\{X, L, T_{BC}, t_d, b_a, b_g\} = \operatorname*{argmin}_{X,\,L,\,T_{BC},\,t_d,\,b_a,\,b_g}\;
\underbrace{\sum_{k=1}^{N} \big\| f(x_{k-1}, u) - x_k \big\|^2_{\Lambda_k}}_{\text{IMU residuals}}
\;+\;
\underbrace{\sum_{k=1}^{N} \sum_{i=1}^{M} \big\| \pi(x_k, L_i) - z_{ik} \big\|^2_{\Sigma_{ik}}}_{\text{Reprojection residuals (Bundle Adjustment term)}}
\]

Kalibr Toolbox

47
Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

• Generates a report after optimizing the cost function:

Residuals:
  Reprojection error [px]:      0.0976  0.051
  Gyroscope error [rad/s]:      0.0167  0.009
  Accelerometer error [m/s^2]:  0.0595  0.031

Transformation T_ci (imu to cam):
  [[ 0.99995526 -0.00934911 -0.00143776  0.00008436]
   [ 0.00936458  0.99989388  0.01115983  0.00197427]
   [ 0.00133327 -0.0111728   0.99993669 -0.05054946]
   [ 0.          0.          0.          1.        ]]

Time shift (delay d), cam0 to imu0 [s] (t_imu = t_cam + shift):
  0.00270636270255

Popular Datasets for VIO and VI-SLAM

48

• EuRoC: drone with IMU and stereo camera

• Blackbird: drone flying fast with VR goggles (synthetic images and real dynamics + IMU)

• UZH FPV Drone Racing: drone flying fast with event camera, stereo camera, IMU

• MVSEC: drone, motorcycle, car with event camera, stereo camera, 3D lidar, GPS, IMU

Understanding Check

Are you able to answer the following questions?

• Why is it recommended to use an IMU for Visual Odometry?

• Why not just an IMU without a camera?

• What is the basic idea behind MEMS IMUs?

• What is the drift of a consumer IMU?

• What is the IMU measurement model (formula)?

• What causes the bias in an IMU?

• How do we model the bias?

• How do we integrate the acceleration to get the position (formula)?

• What is the definition of loosely coupled and tightly coupled visual inertial fusion?

• How does non-linear optimization-based visual inertial odometry work? Can you write down the cost function and illustrate its meaning?

49


bull During optimization every time the linearization point at 119896 minus 1 changes the integration between 119896 minus 1and 119896 must be re-evaluated thus slowing down the optimization

bull Idea Preintegration

bull defines relative motion increments expressed in body frame which are independent on the global position orientation and velocity at 119896 [1]

bull [2] uses this theory by leveraging the manifold structure of the rotation group SO(3)

33

[1] Lupton Sukkarieh Visual-inertial-aided navigation for high-dynamic motion in built environments without initial conditions IEEE Transactions on Robotics (T-RO) 2012 PDF[2] Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

X L 119887119886 119887119892 = 119886119903119892119898119894119899X L 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

IMU Pre-Integration

34

120596 119886 Δ ෨119877 Δ 119907 Δ 119901

119877119896 119901119896 v119896

StandardEvaluate error in global frame

PreintegrationEvaluate relative errors

119942119877 = Δ ෨119877119879 Δ119877

119942v = Δv minus Δv

119942119901 = Δ 119901 minus Δ119901

119942119877 = 119877 120596 119877119896minus1119879119877119896

119942v = ොv(120596 119886 v119896minus1) minus v119896

119942119901 = Ƹ119901(120596 119886 119901119896minus1) minus 119901119896

Repeat integration when previous state changes

Preintegration of IMU deltas possible with no initial condition required

Prediction Estimate

GTSAM

bull GTSAM [1] is a library that implements sensor fusion based on factor graphs and Bayes trees

bull Factor graphs are a tool borrowed from machine learning [2] that

bull naturally capture the spatial dependences among different sensor measurements

bull can represent many robotics problems like pose graphs and SLAM

35

[1] Dellaert and Kaess Factor Graphs for Robot Perception Foundations and Trends in Robotics 2017 PDF Code[2] Kschischang Frey Loeliger Factor Graphs and the Sum-Product Algorithm Transactions on Information Theory 2001 PDF

GTSAM

bull GTSAM uses factor graphs and Bayes trees to implement incremental inference

bull A Bayes tree is a data structure a directed tree that visually represents the connectivity among the past robot statesand measurements

bull Incremental inference means that when new robot states or measurements are added to the graph only a local part of the tree is updated

36[1] Kaess Ranganathan Dellaer ldquoiSAM Incremental Smoothing and Mappingrdquo IEEE Transactions on Robotics (T-RO) 2008 PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

37

Non-linear optimization vs Filtering

38

Non-linear Optimization methods Filtering methods

Optimize a window of multiple states (or all the states) using non-linear Least-Squares optimization

Solve the same problem by running only one iteration of the optimization function (eg using Extended Kalman Filter (EKF))

Multiple iterations (it re-linearizes at each iteration)

Acheives the highest accuracy

Slower (but fast with GTSAM)

1 Linearization iteration only

Sensitive to linearization point

Fastest

Case Study 1 ROVIO

bull EKF state X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905 1198711 1198712 hellip 119871119896bull Basic idea

1 Prediction step predicts next position velocity orientation and features using IMU integration model

2 Measurement update refines state by leveraging visual constraint (ROVIO minimizes the photometric error between corresponding points (alternative would be the reprojection error))

39Bloesch Burri Omari Hutter Siegwart Iterated extended Kalman filter based visual-inertial odometry using direct photometric feedback

International Journal of Robotics Research (IJRR) 2017 PDF Code

ROVIO Problems

bull Complexity of the EKF grows quadratically in the number of estimated landmarksbull Thus max 20 landmarks are tracked to allow real-time operation

bull Only updates the most recent state

40

Case Study 2 MSCKF

bull MSCKF (Multi-State Constraint Kalman Filter) updates multiple past poses 119901C1 119902C1 hellip 119901C119873 119902CN in addition to the current state 119901 119905 119902 119905 119907 119905 State vector

bull Prediction step same as ROVIO

bull Measurement updatebull Differently from ROVIO

bull landmark positions are not added to the state vector thus can run very fast independently of the number of featuresbull Visual constraint is obtained from the Epipolar Line Distance (Lecture 08)

bull Used in spacecraft landing (NASAJPL Moon and Mars landing) DJI drones Google ARCore

bull Released open source within the OpenVins project httpsgithubcomrpngopen_vins

41

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905 119901C1 119902C1 hellip 119901C119873 119902CN

MSCKF running in Google ARCore (former Google Tango)

42

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

43

Camera-IMU Calibration

bull Goal estimate the rigid-body transformation 119931119913119914 and time offset 119957119941 between the camera and the IMU caused by communication delays and the internal sensor delays (introduced by filters and logic)

bull Assumptions Camera and IMU rigidly attached Camera intrinsically calibrated

bull Data

bull Image points from calibration pattern (checkerboard or QR board)

bull IMU measurements accelerometer 119886119896 and gyroscope 120596119896

44

119905

images

IMU

119957119941

Kalibr Toolbox

45Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Code httpsgithubcomethz-aslkalibrwikicamera-imu-calibration

Kalibr Toolbox

46Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Solves a non-linear Least Square optimization problem similar to that seen before but also optimizes over 119879119861119862 119905119889

bull Continuous-time modelling using splines for 119883

bull Numerical solver Levenberg-Marquardt

119883 119871 119879119861119862 119905119889 119887119886 119887119892 = 119886119903119892119898119894119899119883 119871119879119861119862119905119889 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

Kalibr Toolbox

47Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Generates a report after optimizing the cost function

Residuals

Reprojection error [px] 00976 0051

Gyroscope error [rads] 00167 0009

Accelerometer error [ms^2] 00595 0031

Transformation T_ci (imu to cam)

[[ 099995526 -000934911 -000143776 000008436]

[ 000936458 099989388 001115983 000197427]

[ 000133327 -00111728 099993669 -005054946]

[ 0 0 0 1 ]]

Time shift (delay d)

cam0 to imu0 [s] (t_imu = t_cam + shift)

000270636270255

Popular Datasets for VIO / VI-SLAM

48

• EuRoC: drone with IMU and stereo camera
• Blackbird: drone flying fast with VR goggles (synthetic images and real dynamics + IMU)
• UZH-FPV Drone Racing: drone flying fast with event camera, stereo camera, IMU
• MVSEC: drone, motorcycle, car with event camera, stereo camera, 3D lidar, GPS, IMU

Understanding Check

Are you able to answer the following questions?

• Why is it recommended to use an IMU for Visual Odometry?
• Why not just an IMU without a camera?
• What is the basic idea behind MEMS IMUs?
• What is the drift of a consumer IMU?
• What is the IMU measurement model (formula)?
• What causes the bias in an IMU?
• How do we model the bias?
• How do we integrate the acceleration to get the position (formula)?
• What is the definition of loosely coupled and tightly coupled visual inertial fusion?
• How does non-linear optimization-based visual inertial odometry work? Can you write down the cost function and illustrate its meaning?

49

Page 7: Vision Algorithms for Mobile Robotics Lecture 01 Introduction

MEMS Accelerometer

A spring-like structure connects the device to a seismic mass vibrating in a capacitive divider A capacitive divider converts the displacement of the seismic mass into an electric signal Damping is created by the gas sealed in the device

7

aaspring

capacitive divider

M M M

MEMS Gyroscopes

bull MEMS gyroscopes measure the Coriolis forces acting on MEMS vibrating structures (tuning forks vibrating wheels or resonant solids)

bull Their working principle is similar to the haltere of a fly

bull Haltere are small structures of some two-winged insects such as flies They are flapped rapidly and function as gyroscopes informing the insect about rotation of the body during flight

8

Why do we need an IMU

bull Monocular vision is scale ambiguous

bull Pure vision is not robust enoughbull Low texture

bull High Dynamic Range (HDR)

bull High speed motion

Robustness is a critical issue Tesla accidentldquoThe autopilot sensors on the Model S failed to distinguish a white tractor-trailer crossing the highway against a bright sky rdquo [The Guardian]

9

High Dynamic Range

Motion blur

Why is IMU alone not enough

bull Pure IMU integration will lead to large drift (especially in cheap IMUs)bull Will see later mathematically

bull Intuition

bull Double integration of acceleration to get position (eg 1D scenario 119909 = 1199090 + 1199070(1199051 minus 1199050) + 1199050

1199051 119886 119905 1198891199052 )

If there is a constant bias in the acceleration the error of position will be proportional to 119957120784

bull Integration of angular velocity to get orientation if there is a bias in angular velocity the error is proportional to the time 119957

10

Table from Vectornav one of the best IMU companies Errors were computed assuming the device at rest httpswwwvectornavcomresourcesins-error-budget

AutomotiveSmartphoneamp Drone accelerometers

bull IMU and vision are complementary

bull What cameras and IMU have in common both estimate the pose incrementally (known as dead-reckoning) which suffers from drift over time Solution loop detection and loop closure

Cameras IMU

Precise in slow motion More informative than an IMU (measures light energy

from the environment)

ᵡ Limited output rate (~100 Hz)ᵡ Scale ambiguity in monocular setupᵡ Lack of robustness to HDR and high speed

Insensitive to motion blur HDR texture Can predict next feature position High output rate (~1000 Hz) The higher the acceleration the more accurate the IMU

(due to higher signal to noise ratio)

ᵡ Large relative uncertainty when at low accelerationangular velocity

ᵡ Ambiguity in gravity acceleration (IMU measures the total acceleration)

Why visual inertial fusion

11

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

12

IMU Measurement Model

bull Measures angular velocity ω119861 119905 and acceleration 119886119861(119905) in the body frame 119861

Notation

bull The superscript 119892 stands for gyroscope and 119886 for accelerometer

bull 119877119861119882 is the rotation of the World frame 119882 with respect to Body frame 119861

bull The gravity 119892 is expressed in the World frame

bull Biases and noise are expressed in the body frame13

IMU biases + noise

Raw IMU measurements true 120654 (in body frame) and true 119834 (in

world frame) to estimate

119886119861 119905 = 119877119861119882 119905 119886119882 119905 minus 119892 + 119887119886 119905 + 119899119886(119905)

ω119861 119905 = ω119861 119905 + 119887119892 119905 + 119899119892(119905)

IMU Noise Model

bull Additive Gaussian white noise

bull Bias

bull The gyroscope and accelerometer biases are considered slowly varying ldquoconstantsrdquo Their temporal fluctuation is modeled by saying that the derivative of the bias is a zero-mean Gaussian noise with standard deviation 120590119887

bull Some facts about IMU biasesbull They change with temperature and mechanical and atmospheric pressurebull They may change every time the IMU is startedbull Good news they can be estimated (see later)

14

g at tn n

g at tb b

( ) bt tb w 119856 t ~119821(o 1)

Trawny Roumeliotis Indirect Kalman filter for 3D attitude estimation Technical Report University of Minnesota 2005 PDFMore info on the noise model httpsgithubcomethz-aslkalibrwikiIMU-Noise-Model

IMU Integration Model

bull The IMU Integration Model computes the position orientation and velocity of the IMU in the world frame To do this we must first compute the acceleration 119886 119905 in the world frame from the measured one 119886(119905) in the body frame (see Slide 13)

bull The position 119901119896 at time 119905119896 can then be predicted from the position 119901119896minus1 at time 119905119896minus1 by integrating all the inertial measurements 119886119895 ω119895 within that time interval

bull A similar expression can be obtained to predict the velocity 119907119896 and orientation 119877119882119861 of the IMU in the world frame as functions of both 119886119895 and ω119895

15Trawny Roumeliotis Indirect Kalman filter for 3D attitude estimation Technical Report University of Minnesota 2005 PDF

More info on the noise model httpsgithubcomethz-aslkalibrwikiIMU-Noise-Model

119901119896 = 119901119896minus1 + 119907119896minus1 119905119896 minus 119905119896minus1 +ඵ119905119896minus1

119905119896

119877119882119861(119905) 119886 119905 minus 119887 119905 + 119892)1198891199052

119886(119905) = 119877119882119861(119905) 119886 119905 minus 119887 119905 + 119892

IMU Integration Model

For convenience the IMU Integration Model is normally written as

where

bull 119909 =119901119902119907

represents the IMU state comprising position orientation and velocity

bull 119902 is the IMU orientation 119929119934119913 (usually represented using quaternions)

• u = {ã_j, ω̃_j} are the accelerometer and gyroscope measurements in the time interval [t_{k−1}, t_k]

16
Trawny, Roumeliotis, Indirect Kalman filter for 3D attitude estimation, Technical Report, University of Minnesota, 2005. PDF

More info on the noise model: https://github.com/ethz-asl/kalibr/wiki/IMU-Noise-Model

( p_k, q_k, v_k ) = f( ( p_{k−1}, q_{k−1}, v_{k−1} ), u ),   or more compactly:   x_k = f( x_{k−1}, u )
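A minimal dead-reckoning sketch of x_k = f(x_{k−1}, u), using simple Euler integration and a rotation matrix for the orientation. The gravity value, scalar biases, and function names are illustrative assumptions, not from the lecture:

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix such that skew(a) @ b == cross(a, b)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def exp_so3(phi):
    """Rodrigues' formula: rotation matrix for a rotation vector phi."""
    angle = np.linalg.norm(phi)
    if angle < 1e-12:
        return np.eye(3) + skew(phi)
    k = skew(phi / angle)
    return np.eye(3) + np.sin(angle) * k + (1.0 - np.cos(angle)) * (k @ k)

GRAVITY = np.array([0.0, 0.0, -9.81])  # assumed world-frame gravity

def propagate(p, R, v, imu_samples, dt, b_a=0.0, b_g=0.0):
    """x_k = f(x_{k-1}, u): integrate body-frame IMU samples (a_tilde, w_tilde)."""
    for a_tilde, w_tilde in imu_samples:
        a_world = R @ (a_tilde - b_a) + GRAVITY   # a(t) = R_WB (a_tilde - b) + g
        p = p + v * dt + 0.5 * a_world * dt ** 2  # position update
        v = v + a_world * dt                      # velocity update
        R = R @ exp_so3((w_tilde - b_g) * dt)     # orientation update
    return p, R, v
```

For an IMU at rest with R = I, the accelerometer measures ã = (0, 0, 9.81), the gravity term cancels, and the state stays put; any unmodeled bias, however, is integrated twice, so the position error grows with t².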

Outline

• What is an IMU and why do we need it

• IMU model

• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods

• Camera-IMU extrinsic calibration and Synchronization

17

Visual Inertial Odometry

Different paradigms exist

bull Loosely coupled

bull Treats VO and IMU as two separate (not coupled) black boxes

bull Each black box estimates pose and velocity from visual (up to a scale) and inertial data (absolute scale)

bull Easy to implement

• Inaccurate. Should not be used

bull Tightly coupled

• Makes use of the raw sensors' measurements:

bull 2D features

bull IMU readings

bull More accurate

bull More implementation effort

In this lecture we will only see tightly coupled approaches

18

The Loosely Coupled Approach

19

(Block diagram: images → Feature tracking → 2D features → VO → position (up to a scale) & orientation → Fusion; IMU measurements → IMU Integration → position, orientation, velocity → Fusion; Fusion outputs the refined position, orientation, and velocity.)

The Tightly Coupled Approach

20

(Block diagram: images → Feature tracking → 2D features → Fusion; IMU measurements → Fusion; Fusion outputs the refined position, orientation, and velocity.)

Filtering Visual Inertial Formulation

bull System states

21

Tightly coupled: X = { p(t), q(t), v(t), b_a(t), b_g(t), L_1, L_2, …, L_k }

Loosely coupled: X = { p(t), q(t), v(t), b_a(t), b_g(t) }

Corke, Lobo, Dias, An Introduction to Inertial and Visual Sensing, International Journal of Robotics Research (IJRR), 2007. PDF

Outline

• What is an IMU and why do we need it

• IMU model

• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods

• Camera-IMU extrinsic calibration and Synchronization

22

Closed-form Solution (1D case)

• From the camera, we can only get the absolute position x up to an unknown scale factor s, thus

x = s x̃

where x̃ is the relative motion estimated by the camera

• From the IMU:

x = x_0 + v_0 (t_1 − t_0) + ∬_{t_0}^{t_1} a(t) dt²

• By equating them:

s x̃ = x_0 + v_0 (t_1 − t_0) + ∬_{t_0}^{t_1} a(t) dt²

• As shown in [Martinelli'14], if we assume to know x_0 (usually we set it to 0), then, for 6DOF, both s and v_0 can be determined in closed form from a single feature observation and 3 views

23
Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF

(Figure: a feature L_1 observed at times t_0 and t_1.)

Closed-form Solution (1D case)

24
Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF

s x̃_1 = v_0 (t_1 − t_0) + ∬_{t_0}^{t_1} a(t) dt²

s x̃_2 = v_0 (t_2 − t_0) + ∬_{t_0}^{t_2} a(t) dt²

which, in matrix form, gives a linear system in the unknowns s and v_0:

[ x̃_1  (t_0 − t_1) ] [ s  ]     [ ∬_{t_0}^{t_1} a(t) dt² ]
[ x̃_2  (t_0 − t_2) ] [ v_0 ]  =  [ ∬_{t_0}^{t_2} a(t) dt² ]

(Figure: a feature L_1 observed at times t_0, t_1, and t_2.)
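The linear system above can be checked numerically on synthetic 1D data (all values hypothetical): with a constant acceleration, x_0 = 0, and t_0 = 0, the true scale and initial velocity are recovered exactly.

```python
import numpy as np

# Hypothetical ground truth: scale s = 2, v0 = 0.5 m/s, constant accel 0.3 m/s^2
s_true, v0_true, a = 2.0, 0.5, 0.3
t1, t2 = 1.0, 2.0

def true_pos(t):                      # x(t) with x0 = 0 and t0 = 0
    return v0_true * t + 0.5 * a * t ** 2

# The camera reports positions only up to scale: x_tilde = x / s
x1_t, x2_t = true_pos(t1) / s_true, true_pos(t2) / s_true

# Double integrals of the measured acceleration over [t0, t1] and [t0, t2]
I1, I2 = 0.5 * a * t1 ** 2, 0.5 * a * t2 ** 2

# [ x~1  (t0 - t1) ] [ s  ]   [ I1 ]
# [ x~2  (t0 - t2) ] [ v0 ] = [ I2 ]      (here t0 = 0)
A = np.array([[x1_t, -t1],
              [x2_t, -t2]])
s_est, v0_est = np.linalg.solve(A, np.array([I1, I2]))
```

With noisy measurements or more views, np.linalg.lstsq would replace the exact solve.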

Closed-form Solution (general case)

bull Considers 119873 feature observations and the full 6DOF case

bull Can be used to initialize filters and non-linear optimizers (which always need an initialization point) [3]

• More complex to derive than the 1D case, but it also leads to a linear system of equations that can be solved using the pseudoinverse

25

X is the vector of unknowns:
• Absolute scale s
• Initial velocity v_0
• 3D point distances (w.r.t. the first camera)
• Direction of the gravity vector
• Biases

A and S contain 2D feature coordinates, acceleration, and angular velocity measurements

[1] Martinelli, Vision and IMU data fusion: Closed-form solutions for attitude, speed, absolute scale, and bias determination, IEEE Transactions on Robotics (T-RO), 2012. PDF
[2] Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF
[3] Kaiser, Martinelli, Fontana, Scaramuzza, Simultaneous state initialization and gyroscope bias calibration in visual inertial aided navigation, IEEE Robotics and Automation Letters (R-AL), 2017. PDF

Outline

• What is an IMU and why do we need it

• IMU model

• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods

• Camera-IMU extrinsic calibration and Synchronization

26

Non-linear Optimization Methods

VIO is solved as a non-linear least-squares optimization problem over:

27

[1] Jung, Taylor, Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results, International Conference on Computer Vision and Pattern Recognition (CVPR), 2001. PDF
[2] Strelow, Singh, Motion estimation from image and inertial measurements, International Journal of Robotics Research (IJRR), 2004. PDF

{X, L, b_a, b_g} = argmin_{X, L, b_a, b_g}  Σ_{k=1..N} ‖ f(x_{k−1}, u) − x_k ‖²_{Λ_k}  +  Σ_{k=1..N} Σ_{i=1..M} ‖ π(x_k, L_i) − z_{ik} ‖²_{Σ_{ik}}

(first sum: IMU residuals; second sum: reprojection residuals, the Bundle Adjustment term)

Non-linear Optimization Methods

VIO is solved as a non-linear least-squares optimization problem over:

where

• X = {x_1, …, x_N}: set of robot state estimates x_k (position, velocity, orientation) at frame times k

• L = {L_1, …, L_M}: 3D landmarks

• f(x_{k−1}, u): pose prediction update from the integration of the IMU measurements u = {ã_j, ω̃_j}

• π(x_k, L_i): measurement update from the projection of landmark L_i onto camera frame I_k
• z_{ik}: observed features

• Λ_k: state covariance from the IMU integration f(x_{k−1}, u)

• Σ_{ik}: covariance of the noisy 2D feature measurements

28

{X, L, b_a, b_g} = argmin_{X, L, b_a, b_g}  Σ_{k=1..N} ‖ f(x_{k−1}, u) − x_k ‖²_{Λ_k}  +  Σ_{k=1..N} Σ_{i=1..M} ‖ π(x_k, L_i) − z_{ik} ‖²_{Σ_{ik}}

(first sum: IMU residuals; second sum: reprojection residuals, the Bundle Adjustment term)

[1] Jung, Taylor, Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results, International Conference on Computer Vision and Pattern Recognition (CVPR), 2001. PDF
[2] Strelow, Singh, Motion estimation from image and inertial measurements, International Journal of Robotics Research (IJRR), 2004. PDF
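To make the two residual terms concrete, here is a sketch of how this cost could be evaluated. Names and data layout are assumptions, and states are treated as plain vectors, ignoring the manifold structure of rotations:

```python
import numpy as np

def mahalanobis_sq(r, cov):
    """Squared Mahalanobis norm ||r||^2_Sigma = r^T Sigma^{-1} r."""
    return float(r @ np.linalg.solve(cov, r))

def vio_cost(states, landmarks, imu_pred, project, observations,
             imu_covs, feat_covs):
    """Evaluate the VIO least-squares cost.

    states:       list of state vectors x_0 .. x_N
    imu_pred:     f(x_{k-1}, u) -> predicted x_k from IMU integration
    project:      pi(x_k, L_i) -> predicted 2D feature position
    observations: {(i, k): z_ik} measured 2D features
    """
    cost = 0.0
    for k in range(1, len(states)):              # IMU residuals
        r = imu_pred(states[k - 1]) - states[k]
        cost += mahalanobis_sq(r, imu_covs[k])
    for (i, k), z in observations.items():       # reprojection residuals
        r = project(states[k], landmarks[i]) - z
        cost += mahalanobis_sq(r, feat_covs[(i, k)])
    return cost
```

An optimizer (e.g., Gauss-Newton or Levenberg-Marquardt) then perturbs X, L, and the biases to minimize this sum.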

Case Study 1 OKVIS

Because the complexity of the optimization is cubic in the number of camera poses and features, real-time operation becomes infeasible as the trajectory and the map grow over time. OKVIS proposed to only optimize the current pose and a window of past keyframes

29
Leutenegger, Lynen, Bosse, Siegwart, Furgale, Keyframe-based visual-inertial odometry using nonlinear optimization, International Journal of Robotics Research (IJRR), 2015. PDF

Case Study 2 SVO+GTSAM

Solves the same optimization problem as OKVIS but

bull Keeps all the frames (from the start to the end of the trajectory)

bull To make the optimization efficient

• Marginalizes 3D landmarks (rather than triangulating landmarks, only their visual bearing measurements are kept, and the visual constraints are enforced by the Epipolar Line Distance (Lecture 08))

bull pre-integrates the IMU data between keyframes (see later)

bull Optimization solved using Factor Graphs via GTSAM

bull Very fast because it only optimizes the poses that are affected by a new observation

30
Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb. 2017. Best Paper Award 2018

SVO+GTSAM

31
Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb. 2017. Best Paper Award 2018

SVO+GTSAM

32
Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb. 2017. Best Paper Award 2018

Problem with IMU integration

• The integration of IMU measurements f(x_{k−1}, u) from k−1 to k depends on the state estimate at time k−1

• During optimization, every time the linearization point at k−1 changes, the integration between k−1 and k must be re-evaluated, thus slowing down the optimization

• Idea: Preintegration
  • defines relative motion increments expressed in the body frame, which are independent of the global position, orientation, and velocity at k [1]
  • [2] uses this theory by leveraging the manifold structure of the rotation group SO(3)

33

[1] Lupton, Sukkarieh, Visual-inertial-aided navigation for high-dynamic motion in built environments without initial conditions, IEEE Transactions on Robotics (T-RO), 2012. PDF
[2] Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb. 2017. Best Paper Award 2018

{X, L, b_a, b_g} = argmin_{X, L, b_a, b_g}  Σ_{k=1..N} ‖ f(x_{k−1}, u) − x_k ‖²_{Λ_k}  +  Σ_{k=1..N} Σ_{i=1..M} ‖ π(x_k, L_i) − z_{ik} ‖²_{Σ_{ik}}

(first sum: IMU residuals; second sum: reprojection residuals, the Bundle Adjustment term)

IMU Pre-Integration

34

Measurements: ω̃, ã.  Preintegrated deltas: ΔR̃, Δṽ, Δp̃.  States: R_k, p_k, v_k

Standard (evaluate errors in the global frame):
e_R = R̂(ω̃, R_{k−1})^T R_k
e_v = v̂(ω̃, ã, v_{k−1}) − v_k
e_p = p̂(ω̃, ã, p_{k−1}) − p_k
→ the integration must be repeated whenever the previous state changes

Preintegration (evaluate relative errors):
e_R = ΔR̃^T ΔR
e_v = Δṽ − Δv
e_p = Δp̃ − Δp
→ preintegration of the IMU deltas is possible, with no initial condition required

(quantities with ˆ or ˜ are predictions from the IMU measurements; the others are estimates)
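A minimal sketch of computing such preintegrated deltas with plain Euler integration (illustrative only; the on-manifold formulation of [2] additionally propagates covariances and bias Jacobians). Gravity is deliberately absent: it is re-added when the deltas are composed with the global state.

```python
import numpy as np

def skew(v):
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def exp_so3(phi):
    """Rodrigues' formula: rotation matrix for a rotation vector phi."""
    angle = np.linalg.norm(phi)
    if angle < 1e-12:
        return np.eye(3) + skew(phi)
    k = skew(phi / angle)
    return np.eye(3) + np.sin(angle) * k + (1.0 - np.cos(angle)) * (k @ k)

def preintegrate(samples, dt, b_a=0.0, b_g=0.0):
    """Relative motion increments (dR, dv, dp) in the body frame at k-1.

    They depend only on the IMU samples and the bias guesses, not on the
    global position, orientation, or velocity -- so they need not be
    re-integrated when the linearization point changes.
    """
    dR, dv, dp = np.eye(3), np.zeros(3), np.zeros(3)
    for a_tilde, w_tilde in samples:
        a = dR @ (a_tilde - b_a)
        dp = dp + dv * dt + 0.5 * a * dt ** 2
        dv = dv + a * dt
        dR = dR @ exp_so3((w_tilde - b_g) * dt)
    return dR, dv, dp
```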

GTSAM

bull GTSAM [1] is a library that implements sensor fusion based on factor graphs and Bayes trees

bull Factor graphs are a tool borrowed from machine learning [2] that

• naturally capture the spatial dependencies among different sensor measurements

bull can represent many robotics problems like pose graphs and SLAM

35

[1] Dellaert and Kaess, Factor Graphs for Robot Perception, Foundations and Trends in Robotics, 2017. PDF, Code
[2] Kschischang, Frey, Loeliger, Factor Graphs and the Sum-Product Algorithm, Transactions on Information Theory, 2001. PDF

GTSAM

bull GTSAM uses factor graphs and Bayes trees to implement incremental inference

• A Bayes tree is a data structure, a directed tree, that visually represents the connectivity among the past robot states and measurements

• Incremental inference means that, when new robot states or measurements are added to the graph, only a local part of the tree is updated

36
[1] Kaess, Ranganathan, Dellaert, "iSAM: Incremental Smoothing and Mapping", IEEE Transactions on Robotics (T-RO), 2008. PDF

Outline

• What is an IMU and why do we need it

• IMU model

• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods

• Camera-IMU extrinsic calibration and Synchronization

37

Non-linear optimization vs Filtering

38

Non-linear optimization methods:
• Optimize a window of multiple states (or all the states) using non-linear least-squares optimization
• Multiple iterations (re-linearizes at each iteration)
• Achieves the highest accuracy
• Slower (but fast with GTSAM)

Filtering methods:
• Solve the same problem by running only one iteration of the optimization function (e.g., using the Extended Kalman Filter (EKF))
• 1 linearization iteration only
• Sensitive to the linearization point
• Fastest
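The "1 linearization iteration vs. multiple iterations" distinction can be seen on a toy scalar problem (a hypothetical example, not from the lecture): estimating x from a measurement z = h(x) with h(x) = x², a single Gauss-Newton step (roughly what an EKF update does) stops short of the minimum, while re-linearizing converges.

```python
def gauss_newton(x0, z, n_iters):
    """Scalar Gauss-Newton for the residual r(x) = z - x**2."""
    x = x0
    for _ in range(n_iters):
        J = 2.0 * x          # Jacobian dh/dx, re-evaluated at each new estimate
        r = z - x ** 2       # residual at the current linearization point
        x = x + r / J        # least-squares step
    return x

z_meas, x_init = 9.0, 2.0
x_filter_like = gauss_newton(x_init, z_meas, 1)    # one linearization, EKF-style
x_optimizer   = gauss_newton(x_init, z_meas, 20)   # re-linearize to convergence
```

The single-step estimate lands at 3.25, while iterating recovers the true value 3; the gap grows the poorer the linearization point, which is why filters are sensitive to it.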

Case Study 1 ROVIO

• EKF state: X = { p(t), q(t), v(t), b_a(t), b_g(t), L_1, L_2, …, L_k }
• Basic idea:

1. Prediction step: predicts the next position, velocity, orientation, and features using the IMU integration model

2. Measurement update: refines the state by leveraging the visual constraints (ROVIO minimizes the photometric error between corresponding points; an alternative would be the reprojection error)

39
Bloesch, Burri, Omari, Hutter, Siegwart, Iterated extended Kalman filter based visual-inertial odometry using direct photometric feedback, International Journal of Robotics Research (IJRR), 2017. PDF, Code

ROVIO Problems

• Complexity of the EKF grows quadratically in the number of estimated landmarks
  • Thus, a maximum of 20 landmarks are tracked to allow real-time operation

bull Only updates the most recent state

40

Case Study 2 MSCKF

• MSCKF (Multi-State Constraint Kalman Filter) updates multiple past poses p_C1, q_C1, …, p_CN, q_CN in addition to the current state p(t), q(t), v(t). State vector:

bull Prediction step same as ROVIO

• Measurement update: differently from ROVIO,
  • landmark positions are not added to the state vector; thus, it can run very fast independently of the number of features
  • the visual constraint is obtained from the Epipolar Line Distance (Lecture 08)

• Used in spacecraft landing (NASA/JPL Moon and Mars landing), DJI drones, Google ARCore

• Released open source within the OpenVins project: https://github.com/rpng/open_vins

41

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

X = { p(t), q(t), v(t), b_a(t), b_g(t), p_C1, q_C1, …, p_CN, q_CN }

MSCKF running in Google ARCore (formerly Google Tango)

42

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

Outline

• What is an IMU and why do we need it

• IMU model

• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods

• Camera-IMU extrinsic calibration and Synchronization

43

Camera-IMU Calibration

• Goal: estimate the rigid-body transformation T_BC between the camera and the IMU, and the time offset t_d caused by communication delays and internal sensor delays (introduced by filters and logic)

• Assumptions: camera and IMU rigidly attached; camera intrinsically calibrated

bull Data

bull Image points from calibration pattern (checkerboard or QR board)

• IMU measurements: accelerometer {ã_k} and gyroscope {ω̃_k}

44

(Figure: timelines of image and IMU measurements; the camera timestamps are offset from the IMU by the delay t_d.)
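One common way to obtain an initial guess for t_d (a trick often used before batch optimization; Kalibr itself estimates t_d inside its continuous-time optimization) is to cross-correlate the gyroscope rate with the angular rate inferred from the camera poses. A sketch on synthetic signals, with invented function names:

```python
import numpy as np

def estimate_time_offset(imu_rate, cam_rate, dt):
    """Return the delay of cam_rate w.r.t. imu_rate via cross-correlation."""
    a = imu_rate - imu_rate.mean()
    b = cam_rate - cam_rate.mean()
    corr = np.correlate(a, b, mode="full")
    # lag such that a[n + lag] best aligns with b[n]
    lag = np.argmax(corr) - (len(b) - 1)
    return -lag * dt   # positive: the camera stream lags the IMU

# Synthetic check: camera-derived rate = gyro rate delayed by 5 samples (50 ms)
t = np.arange(0.0, 10.0, 0.01)
gyro = np.sin(2 * np.pi * 0.5 * t)
cam = np.roll(gyro, 5)
t_d = estimate_time_offset(gyro, cam, dt=0.01)
```

The recovered offset then seeds the joint optimization, which refines t_d to sub-sample accuracy.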

Kalibr Toolbox

45
Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

• Code: https://github.com/ethz-asl/kalibr/wiki/camera-imu-calibration

Kalibr Toolbox

46
Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

• Solves a non-linear least-squares optimization problem similar to that seen before, but also optimizes over T_BC and t_d

• Continuous-time modeling of X using splines

bull Numerical solver Levenberg-Marquardt

{X, L, T_BC, t_d, b_a, b_g} = argmin_{X, L, T_BC, t_d, b_a, b_g}  Σ_{k=1..N} ‖ f(x_{k−1}, u) − x_k ‖²_{Λ_k}  +  Σ_{k=1..N} Σ_{i=1..M} ‖ π(x_k, L_i) − z_{ik} ‖²_{Σ_{ik}}

(first sum: IMU residuals; second sum: reprojection residuals, the Bundle Adjustment term)

Kalibr Toolbox

47
Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

bull Generates a report after optimizing the cost function

Residuals
  Reprojection error [px]:      0.0976, 0.051
  Gyroscope error [rad/s]:      0.0167, 0.009
  Accelerometer error [m/s^2]:  0.0595, 0.031

Transformation T_ci (imu to cam):
  [[ 0.99995526 -0.00934911 -0.00143776  0.00008436]
   [ 0.00936458  0.99989388  0.01115983  0.00197427]
   [ 0.00133327 -0.0111728   0.99993669 -0.05054946]
   [ 0.          0.          0.          1.        ]]

Time shift (delay d):
  cam0 to imu0 [s] (t_imu = t_cam + shift):
  0.00270636270255

Popular Datasets for VIO VI-SLAM

48

EuRoC: drone with IMU and stereo camera

Blackbird: drone flying fast with VR goggles (synthetic images and real dynamics + IMU)

UZH FPV Drone Racing: drone flying fast with event camera, stereo camera, IMU

MVSEC: drone, motorcycle, car with event camera, stereo camera, 3D lidar, GPS, IMU

Understanding Check

Are you able to answer the following questions

bull Why is it recommended to use an IMU for Visual Odometry

bull Why not just an IMU without a camera

bull What is the basic idea behind MEMS IMUs

bull What is the drift of a consumer IMU

bull What is the IMU measurement model (formula)

bull What causes the bias in an IMU

bull How do we model the bias

bull How do we integrate the acceleration to get the position (formula)

bull What is the definition of loosely coupled and tightly coupled visual inertial fusion

• How does non-linear optimization-based visual inertial odometry work? Can you write down the cost function and illustrate its meaning?

49


IMU Noise Model

bull Additive Gaussian white noise

bull Bias

bull The gyroscope and accelerometer biases are considered slowly varying ldquoconstantsrdquo Their temporal fluctuation is modeled by saying that the derivative of the bias is a zero-mean Gaussian noise with standard deviation 120590119887

bull Some facts about IMU biasesbull They change with temperature and mechanical and atmospheric pressurebull They may change every time the IMU is startedbull Good news they can be estimated (see later)

14

g at tn n

g at tb b

( ) bt tb w 119856 t ~119821(o 1)

Trawny Roumeliotis Indirect Kalman filter for 3D attitude estimation Technical Report University of Minnesota 2005 PDFMore info on the noise model httpsgithubcomethz-aslkalibrwikiIMU-Noise-Model

IMU Integration Model

bull The IMU Integration Model computes the position orientation and velocity of the IMU in the world frame To do this we must first compute the acceleration 119886 119905 in the world frame from the measured one 119886(119905) in the body frame (see Slide 13)

bull The position 119901119896 at time 119905119896 can then be predicted from the position 119901119896minus1 at time 119905119896minus1 by integrating all the inertial measurements 119886119895 ω119895 within that time interval

bull A similar expression can be obtained to predict the velocity 119907119896 and orientation 119877119882119861 of the IMU in the world frame as functions of both 119886119895 and ω119895

15Trawny Roumeliotis Indirect Kalman filter for 3D attitude estimation Technical Report University of Minnesota 2005 PDF

More info on the noise model httpsgithubcomethz-aslkalibrwikiIMU-Noise-Model

119901119896 = 119901119896minus1 + 119907119896minus1 119905119896 minus 119905119896minus1 +ඵ119905119896minus1

119905119896

119877119882119861(119905) 119886 119905 minus 119887 119905 + 119892)1198891199052

119886(119905) = 119877119882119861(119905) 119886 119905 minus 119887 119905 + 119892

IMU Integration Model

For convenience the IMU Integration Model is normally written as

where

bull 119909 =119901119902119907

represents the IMU state comprising position orientation and velocity

bull 119902 is the IMU orientation 119929119934119913 (usually represented using quaternions)

bull 119906 = 119886119895 ω119895 are the accelerometer and gyroscope measurements in the time interval 119905119896minus1 119905119896

16Trawny Roumeliotis Indirect Kalman filter for 3D attitude estimation Technical Report University of Minnesota 2005 PDF

More info on the noise model httpsgithubcomethz-aslkalibrwikiIMU-Noise-Model

119909119896 = 119891 119909119896minus1 119906

119901119896119902119896119907119896

= 119891

119901119896minus1119902119896minus1119907119896minus1

119906 or more compactly

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

17

Visual Inertial Odometry

Different paradigms exist

bull Loosely coupled

bull Treats VO and IMU as two separate (not coupled) black boxes

bull Each black box estimates pose and velocity from visual (up to a scale) and inertial data (absolute scale)

bull Easy to implement

bull Inaccurate Should not be used

bull Tightly coupled

bull Makes use of the raw sensorsrsquo measurements

bull 2D features

bull IMU readings

bull More accurate

bull More implementation effort

In this lecture we will only see tightly coupled approaches

18

The Loosely Coupled Approach

19

Feature tracking

VOimages

Position (up to a scale) amporientation

IMU Integration

PositionOrientation

Velocity

2D features

FusionRefined Position

OrientationVelocity

IMU measurements

The Tightly Coupled Approach

20

Feature tracking

images

IMU measurements

2D features

FusionRefined Position

OrientationVelocity

Filtering Visual Inertial Formulation

bull System states

21

Tightly coupled

Loosely coupled

Corke Lobo Dias An Introduction to Inertial and Visual Sensing International Journal of Robotics Research (IJRR) 2017 PDF

X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905 1198711 1198712 hellip 119871119896

X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

22

Closed-form Solution (1D case)

bull From the camera we can only get the absolute position 119909 up to a unknown scale factor 119904 thus

119909 = 119904 119909

where 119909 is the relative motion estimated by the camera

bull From the IMU

119909 = 1199090 + 1199070(1199051 minus 1199050) +ඵ1199050

1199051

119886 119905 1198891199052

bull By equating them

119904 119909 = 1199090 + 1199070 1199051 minus 1199050 +ඵ1199050

1199051

119886 119905 1198891199052

bull As shown in [Martinellirsquo14] if we assume to know 1199090 (usually we set it to 0) then for 6DOF both 119956 and 119959120782can be determined in closed form from a single feature observation and 3 views

23Martinelli Closed-form solution of visual-inertial structure from motion International Journal of Computer Vision (IJCV) 2014 PDF

1199050 1199051

1198711

Closed-form Solution (1D case)

24Martinelli Closed-form solution of visual-inertial structure from motion International Journal of Computer Vision (IJCV) 2014 PDF

119904 1199091 = 1199070 1199051 minus 1199050 +ඵ1199050

1199051

119886 119905 1198891199052

119904 1199092 = 1199070 1199052 minus 1199050 +ඵ1199050

1199052

119886 119905 1198891199052

1199091 (1199050minus1199051)1199092 (1199050minus1199052)

1199041199070

=

ඵ1199050

1199051

119886 119905 1198891199052

ඵ1199050

2

119886 119905 1198891199052

1199050 1199051 1199052

1198711

Closed-form Solution (general case)

bull Considers 119873 feature observations and the full 6DOF case

bull Can be used to initialize filters and non-linear optimizers (which always need an initialization point) [3]

bull More complex to derive than the 1D case But it also reaches a linear system of equations that can be solved using the pseudoinverse

25

119935 is the vector of unknownsbull Absolute scale 119904bull Initial velocity 1199070bull 3D Point distances (wrt the first camera) bull Direction of the gravity vector bull Biases

119912 and 119930 contain 2D feature coordinates acceleration and angular velocity measurements

[1] Martinelli Vision and IMU data fusion Closed-form solutions for attitude speed absolute scale and bias determination IEEE Transactions on Robotics (T-RO) 2012 PDF[2] Martinelli Closed-form solution of visual-inertial structure from motion International Journal of Computer Vision (IJCV) 2014 PDF[3] Kaiser Martinelli Fontana Scaramuzza Simultaneous state initialization and gyroscope bias calibration in visual inertial aided navigation IEEE Robotics and Automation Letters (R-AL) 2017 PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

26

Non-linear Optimization Methods

VIO is solved as non-linear Least Square optimization problem over

27

[1] Jung Taylor Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results International Conference on Computer Vision and Pattern Recognition (CVPR) 2001 PDF[2] Sterlow Singh Motion estimation from image and inertial measurements International Journal of Robotics Research (IJRR) 2004 PDF

X L 119887119886 119887119892 = 119886119903119892119898119894119899X L 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

Non-linear Optimization Methods

VIO is solved as non-linear Least Square optimization problem over

where

bull 119883 = 1199091 hellip 119909119873 set of robot state estimates 119909119896 (position velocity orientation) at frame times 119896

bull 119871 = 1198711 hellip 119871119872 3D landmarks

bull 119891 119909119896minus1 119906 pose prediction update from integration of IMU measurements 119906 = 119886119895 ω119895

bull 120587(119909119896 119897119894) measurement update from projection of landmark 119871119894 onto camera frame 119868119896bull 119911119894119896 observed features

bull 120556119896 state covariance from the IMU integration 119891 119909119896minus1 119906

bull 120564119894119896 is the covariance from the noisy 2D feature measurements

28

X L 119887119886 119887119892 = 119886119903119892119898119894119899X L 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

[1] Jung Taylor Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results International Conference on Computer Vision and Pattern Recognition (CVPR) 2001 PDF[2] Sterlow Singh Motion estimation from image and inertial measurements International Journal of Robotics Research (IJRR) 2004 PDF

Case Study 1: OKVIS

Because the complexity of the optimization is cubic in the number of camera poses and features, real-time operation becomes infeasible as the trajectory and the map grow over time. OKVIS proposed to only optimize the current pose and a window of past keyframes.

29

Leutenegger, Lynen, Bosse, Siegwart, Furgale, Keyframe-based visual-inertial odometry using nonlinear optimization, International Journal of Robotics Research (IJRR), 2015. PDF

Case Study 2: SVO+GTSAM

Solves the same optimization problem as OKVIS, but:

• Keeps all the frames (from the start to the end of the trajectory)
• To make the optimization efficient:
  • Marginalizes 3D landmarks: rather than triangulating landmarks, only their visual bearing measurements are kept, and the visual constraints are enforced by the Epipolar Line Distance (Lecture 08)
  • Pre-integrates the IMU data between keyframes (see later)
• Optimization solved using Factor Graphs via GTSAM
• Very fast because it only optimizes the poses that are affected by a new observation

30

Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb 2017. Best Paper Award 2018.


Problem with IMU integration

• The integration of IMU measurements f(x_{k−1}, u) from k−1 to k depends on the state estimate at time k−1.

• During optimization, every time the linearization point at k−1 changes, the integration between k−1 and k must be re-evaluated, thus slowing down the optimization.

• Idea: Preintegration
  • defines relative motion increments, expressed in the body frame, which are independent of the global position, orientation, and velocity at k [1]
  • [2] builds on this theory by leveraging the manifold structure of the rotation group SO(3)

33

[1] Lupton, Sukkarieh, Visual-inertial-aided navigation for high-dynamic motion in built environments without initial conditions, IEEE Transactions on Robotics (T-RO), 2012. PDF
[2] Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb 2017. Best Paper Award 2018.

{X, L, b_a, b_g} = argmin_{X, L, b_a, b_g} Σ_{k=1}^{N} ‖f(x_{k−1}, u) − x_k‖²_{Λ_k} + Σ_{k=1}^{N} Σ_{i=1}^{M} ‖π(x_k, L_i) − z_{ik}‖²_{Σ_{ik}}

IMU residuals | Reprojection residuals (Bundle Adjustment term)

IMU Pre-Integration

Standard: evaluate the errors in the global frame
  e_R = R̂(ω̃, R_{k−1})ᵀ R_k
  e_v = v̂(ω̃, ã, v_{k−1}) − v_k
  e_p = p̂(ω̃, ã, p_{k−1}) − p_k
  → the integration must be repeated whenever the previous state estimate changes

Preintegration: evaluate relative errors between the increments ΔR̃, Δṽ, Δp̃ (preintegrated from the raw measurements ω̃, ã) and the estimated increments ΔR, Δv, Δp
  e_R = (ΔR̃)ᵀ ΔR
  e_v = Δṽ − Δv
  e_p = Δp̃ − Δp
  → preintegration of the IMU deltas is possible, with no initial condition required

(hatted/tilded quantities: predictions from the IMU measurements; plain quantities: state estimates)

34
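The relative increments themselves are obtained by composing the raw measurements. A simplified sketch of the preintegrated deltas (Euler integration on SO(3); biases assumed already removed and noise propagation omitted; see [2] above for the full on-manifold treatment):

```python
import numpy as np

def skew(w):
    # 3x3 skew-symmetric matrix of a 3-vector
    return np.array([[0, -w[2], w[1]], [w[2], 0, -w[0]], [-w[1], w[0], 0]])

def so3_exp(w):
    # Rodrigues' formula: rotation matrix for a rotation vector w
    th = np.linalg.norm(w)
    if th < 1e-12:
        return np.eye(3) + skew(w)
    K = skew(w / th)
    return np.eye(3) + np.sin(th) * K + (1 - np.cos(th)) * (K @ K)

def preintegrate(omegas, accs, dt):
    """Relative motion increments (dR, dv, dp) expressed in the body frame at
    time k-1, independent of the global pose and velocity."""
    dR, dv, dp = np.eye(3), np.zeros(3), np.zeros(3)
    for w, a in zip(omegas, accs):
        dp = dp + dv * dt + 0.5 * (dR @ a) * dt**2
        dv = dv + (dR @ a) * dt
        dR = dR @ so3_exp(w * dt)
    return dR, dv, dp
```

These deltas are what the relative residuals e_R, e_v, e_p above compare against, so they never need to be recomputed when the linearization point changes.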

GTSAM

• GTSAM [1] is a library that implements sensor fusion based on factor graphs and Bayes trees.

• Factor graphs are a tool borrowed from machine learning [2] that:
  • naturally captures the spatial dependencies among different sensor measurements
  • can represent many robotics problems, like pose graphs and SLAM

35

[1] Dellaert and Kaess, Factor Graphs for Robot Perception, Foundations and Trends in Robotics, 2017. PDF, Code
[2] Kschischang, Frey, Loeliger, Factor Graphs and the Sum-Product Algorithm, Transactions on Information Theory, 2001. PDF

GTSAM

• GTSAM uses factor graphs and Bayes trees to implement incremental inference [1].

• A Bayes tree is a data structure (a directed tree) that represents the connectivity among the past robot states and measurements.

• Incremental inference means that, when new robot states or measurements are added to the graph, only a local part of the tree is updated.

36

[1] Kaess, Ranganathan, Dellaert, "iSAM: Incremental Smoothing and Mapping", IEEE Transactions on Robotics (T-RO), 2008. PDF
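As a toy illustration of what a factor graph encodes (deliberately not using GTSAM's actual API), consider a 1D pose chain with one odometry factor per edge, a loop-closure factor, and a prior anchoring the first pose. Because the residuals are linear, the whole graph reduces to one least-squares solve:

```python
import numpy as np

# Factors: (i, j, measured offset x_j - x_i); the last one is a "loop closure"
# that is slightly inconsistent with the chain, so the solver spreads the error.
factors = [(0, 1, 1.0), (1, 2, 1.1), (2, 3, 0.9), (0, 3, 3.05)]
n = 4
A = np.zeros((len(factors) + 1, n))
b = np.zeros(len(factors) + 1)
for r, (i, j, m) in enumerate(factors):
    A[r, i], A[r, j], b[r] = -1.0, 1.0, m   # residual: (x_j - x_i) - m
A[-1, 0], b[-1] = 1.0, 0.0                  # prior factor: x_0 = 0
x, *_ = np.linalg.lstsq(A, b, rcond=None)   # least-squares over all factors
print(x)
```

Each row of A is one factor; GTSAM builds the same kind of sparse system from its factor types and, via the Bayes tree, re-solves only the part affected by a new factor.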

Outline

• What is an IMU and why do we need it
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and Synchronization

37

Non-linear optimization vs Filtering

Non-linear optimization methods:
• Optimize a window of multiple states (or all the states) using non-linear Least-Squares optimization
• Multiple iterations (re-linearizes at each iteration)
• Achieves the highest accuracy
• Slower (but fast with GTSAM)

Filtering methods:
• Solve the same problem by running only one iteration of the optimization function (e.g., using an Extended Kalman Filter (EKF))
• 1 linearization iteration only
• Sensitive to the linearization point
• Fastest

38

Case Study 1: ROVIO

• EKF state: X = {p(t), q(t), v(t), b_a(t), b_g(t), L_1, L_2, …, L_k}
• Basic idea:
  1. Prediction step: predicts the next position, velocity, orientation, and features using the IMU integration model
  2. Measurement update: refines the state by leveraging the visual constraints (ROVIO minimizes the photometric error between corresponding points; an alternative would be the reprojection error)

39

Bloesch, Burri, Omari, Hutter, Siegwart, Iterated extended Kalman filter based visual-inertial odometry using direct photometric feedback, International Journal of Robotics Research (IJRR), 2017. PDF, Code
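The predict/update loop can be sketched with a toy 1D EKF (state [p, v], IMU acceleration as control input, a position-like visual measurement). This only mirrors the structure of the filter; ROVIO's real state, IMU model, and photometric update are far richer:

```python
import numpy as np

def ekf_predict(x, P, a, dt, Q):
    F = np.array([[1, dt], [0, 1]])          # Jacobian of the motion model
    x = np.array([x[0] + x[1] * dt + 0.5 * a * dt**2, x[1] + a * dt])
    return x, F @ P @ F.T + Q                # propagate mean and covariance

def ekf_update(x, P, z, R):
    H = np.array([[1.0, 0.0]])               # we observe position only
    y = z - H @ x                            # innovation
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)           # Kalman gain
    x = x + (K @ y).ravel()
    P = (np.eye(2) - K @ H) @ P
    return x, P

x, P = np.zeros(2), np.eye(2)
Q, R = 1e-3 * np.eye(2), np.array([[1e-2]])
for _ in range(100):                         # 1 s of IMU prediction, a = 0.1
    x, P = ekf_predict(x, P, 0.1, 0.01, Q)
x, P = ekf_update(x, P, np.array([0.1]), R)  # one visual update pulls the state
```

Between camera frames only the cheap prediction runs, which is why filters can exploit the full IMU rate.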

ROVIO: Problems

• The complexity of the EKF grows quadratically with the number of estimated landmarks
  • Thus, only up to ~20 landmarks are tracked, to allow real-time operation
• Only updates the most recent state

40

Case Study 2: MSCKF

• MSCKF (Multi-State Constraint Kalman Filter) updates multiple past camera poses {p_C1, q_C1, …, p_CN, q_CN} in addition to the current state {p(t), q(t), v(t)}.

State vector: X = {p(t), q(t), v(t), b_a(t), b_g(t), p_C1, q_C1, …, p_CN, q_CN}

• Prediction step: same as ROVIO
• Measurement update, differently from ROVIO:
  • landmark positions are not added to the state vector, so it can run very fast, independently of the number of features
  • the visual constraint is obtained from the Epipolar Line Distance (Lecture 08)
• Used in spacecraft landing (NASA/JPL Moon and Mars landing), DJI drones, Google ARCore
• Released open source within the OpenVins project: https://github.com/rpng/open_vins

41

[1] Mourikis, Roumeliotis, A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation, International Conference on Robotics and Automation (ICRA), 2007. PDF
[2] Li, Mourikis, High-precision, consistent EKF-based visual-inertial odometry, International Journal of Robotics Research (IJRR), 2013. PDF

MSCKF running in Google ARCore (formerly Google Tango)

42


Outline

• What is an IMU and why do we need it
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and Synchronization

43

Camera-IMU Calibration

• Goal: estimate the rigid-body transformation T_BC between the camera and the IMU, and the time offset t_d caused by communication delays and the internal sensor delays (introduced by filters and logic)
• Assumptions: camera and IMU rigidly attached, camera intrinsically calibrated
• Data:
  • Image points from a calibration pattern (checkerboard or QR board)
  • IMU measurements: accelerometer {a_k} and gyroscope {ω_k}

44

[Figure: image and IMU measurement timelines along a common time axis t, offset by the delay t_d]
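Kalibr estimates t_d inside its batch optimization (next slides). As a lightweight illustration of why the offset is observable (this is not Kalibr's method), one can cross-correlate the gyroscope signal with the angular rate inferred from camera rotations, both resampled to a common rate; the correlation peak gives the delay:

```python
import numpy as np

rate = 200.0                              # common resampling rate [Hz], assumed
t = np.arange(0, 10, 1 / rate)
true_td = 0.045                           # ground-truth delay [s] (toy value)
gyro = np.sin(2 * np.pi * 0.7 * t) + 0.5 * np.sin(2 * np.pi * 1.3 * t)
cam = np.interp(t - true_td, t, gyro)     # camera signal lags the IMU by t_d

g = gyro - gyro.mean()
c = cam - cam.mean()
xcorr = np.correlate(c, g, mode='full')   # peak index encodes the lag
shift = np.argmax(xcorr) - (len(g) - 1)
t_d = shift / rate
print(f"estimated t_d = {t_d:.3f} s")
```

This only recovers a constant delay; Kalibr's continuous-time formulation estimates t_d jointly with T_BC and all states.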

Kalibr Toolbox

• Code: https://github.com/ethz-asl/kalibr/wiki/camera-imu-calibration

45

Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

Kalibr Toolbox

• Solves a non-linear Least-Squares optimization problem similar to the one seen before, but also optimizes over T_BC and t_d:

{X, L, T_BC, t_d, b_a, b_g} = argmin_{X, L, T_BC, t_d, b_a, b_g} Σ_{k=1}^{N} ‖f(x_{k−1}, u) − x_k‖²_{Λ_k} + Σ_{k=1}^{N} Σ_{i=1}^{M} ‖π(x_k, L_i) − z_{ik}‖²_{Σ_{ik}}

IMU residuals | Reprojection residuals (Bundle Adjustment term)

• Continuous-time modelling using splines for X
• Numerical solver: Levenberg-Marquardt

46

Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

Kalibr Toolbox

• Generates a report after optimizing the cost function:

Residuals
  Reprojection error [px]:      0.0976, 0.051
  Gyroscope error [rad/s]:      0.0167, 0.009
  Accelerometer error [m/s^2]:  0.0595, 0.031

Transformation T_ci (imu to cam):
[[ 0.99995526 -0.00934911 -0.00143776  0.00008436]
 [ 0.00936458  0.99989388  0.01115983  0.00197427]
 [ 0.00133327 -0.0111728   0.99993669 -0.05054946]
 [ 0.          0.          0.          1.        ]]

Time shift (delay d):
cam0 to imu0 [s] (t_imu = t_cam + shift):
0.00270636270255

47

Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

Popular Datasets for VIO / VI-SLAM

• EuRoC: drone with IMU and stereo camera
• Blackbird: drone flying fast with VR goggles (synthetic images and real dynamics + IMU)
• UZH FPV Drone Racing: drone flying fast with event camera, stereo camera, IMU
• MVSEC: drone, motorcycle, car with event camera, stereo camera, 3D lidar, GPS, IMU

48

Understanding Check

Are you able to answer the following questions?

• Why is it recommended to use an IMU for Visual Odometry?
• Why not just an IMU, without a camera?
• What is the basic idea behind MEMS IMUs?
• What is the drift of a consumer IMU?
• What is the IMU measurement model (formula)?
• What causes the bias in an IMU?
• How do we model the bias?
• How do we integrate the acceleration to get the position (formula)?
• What is the definition of loosely coupled and tightly coupled visual-inertial fusion?
• How does non-linear, optimization-based visual-inertial odometry work? Can you write down the cost function and illustrate its meaning?

49


Why do we need an IMU?

• Monocular vision is scale-ambiguous
• Pure vision is not robust enough:
  • Low texture
  • High Dynamic Range (HDR)
  • High-speed motion

Robustness is a critical issue. Tesla accident: "The autopilot sensors on the Model S failed to distinguish a white tractor-trailer crossing the highway against a bright sky." [The Guardian]

9

[Figure: example images showing High Dynamic Range and motion blur]

Why is IMU alone not enough?

• Pure IMU integration will lead to large drift (especially with cheap IMUs)
  • We will see this later mathematically
• Intuition:
  • Double integration of acceleration to get position (e.g., 1D scenario): x_1 = x_0 + v_0(t_1 − t_0) + ∬_{t_0}^{t_1} a(t) dt²
    If there is a constant bias in the acceleration, the position error will be proportional to t²
  • Integration of angular velocity to get orientation: if there is a bias in the angular velocity, the error is proportional to the time t

10

[Table from Vectornav, one of the best IMU companies, comparing automotive-, smartphone-, and drone-grade accelerometers. Errors were computed assuming the device at rest. https://www.vectornav.com/resources/ins-error-budget]
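The quadratic growth is easy to check numerically. With an illustrative (assumed) accelerometer bias b, double integration gives a position error of 0.5·b·t²:

```python
# Position error from a constant accelerometer bias b: e(t) = 0.5 * b * t^2.
# b is an illustrative value, roughly the order of a consumer-grade IMU bias.
b = 0.01  # [m/s^2]
for t in (1.0, 10.0, 60.0):
    print(f"t = {t:5.1f} s  ->  position error = {0.5 * b * t**2:.3f} m")
```

Even this small bias produces meters of error within a minute, which is why the IMU alone cannot be trusted for position and the visual measurements are needed to keep the drift bounded.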

Why visual-inertial fusion?

• IMU and vision are complementary.
• What cameras and IMUs have in common: both estimate the pose incrementally (known as dead-reckoning), which suffers from drift over time. Solution: loop detection and loop closure.

Cameras:
✓ Precise in slow motion
✓ More informative than an IMU (measure light energy from the environment)
✗ Limited output rate (~100 Hz)
✗ Scale ambiguity in monocular setup
✗ Lack of robustness to HDR and high speed

IMU:
✓ Insensitive to motion blur, HDR, texture
✓ Can predict the next feature position
✓ High output rate (~1,000 Hz)
✓ The higher the acceleration, the more accurate the IMU (due to higher signal-to-noise ratio)
✗ Large relative uncertainty at low acceleration/angular velocity
✗ Ambiguity with gravity (the IMU measures the total acceleration)

11

Outline

• What is an IMU and why do we need it
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and Synchronization

12

IMU Measurement Model

• Measures the angular velocity ω̃_B(t) and the acceleration ã_B(t) in the body frame B:

ã_B(t) = R_BW(t) (a_W(t) − g) + b_a(t) + n_a(t)
ω̃_B(t) = ω_B(t) + b_g(t) + n_g(t)

(left-hand sides: raw IMU measurements; ω_B: true angular velocity, in the body frame; a_W: true acceleration, in the world frame; b + n: IMU biases and noise)

Notation:
• The superscript g stands for gyroscope and a for accelerometer
• R_BW is the rotation of the World frame W with respect to the Body frame B
• The gravity g is expressed in the World frame
• Biases and noise are expressed in the body frame

13

IMU Noise Model

• Additive Gaussian white noise: n_g(t), n_a(t)
• Bias: ḃ(t) = σ_b w(t), with w(t) ~ N(0, 1)
  • The gyroscope and accelerometer biases are considered slowly varying "constants". Their temporal fluctuation is modeled by saying that the derivative of the bias is a zero-mean Gaussian noise with standard deviation σ_b.
• Some facts about IMU biases:
  • They change with temperature and with mechanical and atmospheric pressure
  • They may change every time the IMU is started
  • Good news: they can be estimated (see later)

14

Trawny, Roumeliotis, Indirect Kalman filter for 3D attitude estimation, Technical Report, University of Minnesota, 2005. PDF
More info on the noise model: https://github.com/ethz-asl/kalibr/wiki/IMU-Noise-Model
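The noise model can be simulated directly. A sketch for a single gyroscope axis at rest, with illustrative (assumed) noise densities discretized in the usual way (white noise scaled by 1/√dt, bias increments by √dt):

```python
import numpy as np

rng = np.random.default_rng(0)
dt, n = 0.005, 20000                     # 200 Hz for 100 s
sigma_g, sigma_bg = 1e-3, 1e-5          # illustrative noise densities
omega_true = np.zeros(n)                 # device at rest
# bias random walk: the derivative of the bias is zero-mean Gaussian noise
b = np.cumsum(sigma_bg * np.sqrt(dt) * rng.standard_normal(n))
n_g = sigma_g / np.sqrt(dt) * rng.standard_normal(n)   # additive white noise
omega_meas = omega_true + b + n_g
heading = np.cumsum(omega_meas) * dt     # integrated orientation drifts from 0
print(f"heading drift after 100 s: {np.degrees(heading[-1]):.4f} deg")
```

Integrating the measured rate accumulates both noise terms, so the heading wanders away from zero even though the device never moved; this is exactly the drift that the visual measurements must correct.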

IMU Integration Model

• The IMU integration model computes the position, orientation, and velocity of the IMU in the world frame. To do this, we must first compute the acceleration a(t) in the world frame from the measured one ã(t) in the body frame (see Slide 13):

a(t) = R_WB(t) (ã(t) − b(t)) + g

• The position p_k at time t_k can then be predicted from the position p_{k−1} at time t_{k−1} by integrating all the inertial measurements {ã_j, ω̃_j} within that time interval:

p_k = p_{k−1} + v_{k−1}(t_k − t_{k−1}) + ∬_{t_{k−1}}^{t_k} [R_WB(t) (ã(t) − b(t)) + g] dt²

• Similar expressions can be obtained to predict the velocity v_k and the orientation R_WB of the IMU in the world frame, as functions of both ã_j and ω̃_j.

15

Trawny, Roumeliotis, Indirect Kalman filter for 3D attitude estimation, Technical Report, University of Minnesota, 2005. PDF
More info on the noise model: https://github.com/ethz-asl/kalibr/wiki/IMU-Noise-Model

IMU Integration Model

For convenience, the IMU integration model is normally written as

[p_k; q_k; v_k] = f([p_{k−1}; q_{k−1}; v_{k−1}], u),  or, more compactly,  x_k = f(x_{k−1}, u)

where

• x = [p; q; v] represents the IMU state, comprising position, orientation, and velocity
• q is the IMU orientation R_WB (usually represented using quaternions)
• u = {ã_j, ω̃_j} are the accelerometer and gyroscope measurements in the time interval [t_{k−1}, t_k]

16

Trawny, Roumeliotis, Indirect Kalman filter for 3D attitude estimation, Technical Report, University of Minnesota, 2005. PDF
More info on the noise model: https://github.com/ethz-asl/kalibr/wiki/IMU-Noise-Model
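One Euler step of this prediction can be sketched as follows (rotation kept as a matrix R_WB with a first-order orientation update; a real implementation would use quaternions, the SO(3) exponential, and estimated biases; here the bias is assumed already subtracted):

```python
import numpy as np

g = np.array([0.0, 0.0, -9.81])   # gravity in the world frame

def skew(w):
    return np.array([[0, -w[2], w[1]], [w[2], 0, -w[0]], [-w[1], w[0], 0]])

def f_imu(p, R, v, a_meas, w_meas, dt):
    """One step of x_k = f(x_{k-1}, u) for the state (p, R_WB, v)."""
    a_world = R @ a_meas + g                  # a(t) = R_WB(t) a_B(t) + g
    p = p + v * dt + 0.5 * a_world * dt**2    # integrate twice for position
    v = v + a_world * dt
    R = R @ (np.eye(3) + skew(w_meas * dt))   # first-order rotation update
    return p, R, v
```

For a device at rest the measured specific force cancels gravity and the state stays put; with zero specific force the model predicts free fall.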

Outline

• What is an IMU and why do we need it
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and Synchronization

17

Visual Inertial Odometry

Different paradigms exist:

• Loosely coupled:
  • Treats VO and IMU as two separate (not coupled) black boxes
  • Each black box estimates pose and velocity, from visual data (up to a scale) and from inertial data (absolute scale)
  • Easy to implement
  • Inaccurate: should not be used
• Tightly coupled:
  • Makes use of the raw sensors' measurements:
    • 2D features
    • IMU readings
  • More accurate
  • More implementation effort

In this lecture, we will only see tightly coupled approaches.

18

The Loosely Coupled Approach

[Block diagram: images → feature tracking → 2D features → VO → position (up to a scale) and orientation; IMU measurements → IMU integration → position, orientation, velocity; both branches → Fusion → refined position, orientation, velocity]

19

The Tightly Coupled Approach

[Block diagram: images → feature tracking → 2D features; 2D features + IMU measurements → Fusion → refined position, orientation, velocity]

20

Filtering: Visual-Inertial Formulation

• System states:

Tightly coupled: X = {p(t), q(t), v(t), b_a(t), b_g(t), L_1, L_2, …, L_k}
Loosely coupled: X = {p(t), q(t), v(t), b_a(t), b_g(t)}

21

Corke, Lobo, Dias, An Introduction to Inertial and Visual Sensing, International Journal of Robotics Research (IJRR), 2007. PDF

Outline

• What is an IMU and why do we need it
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and Synchronization

22

Closed-form Solution (1D case)

• From the camera, we can only get the absolute position x up to an unknown scale factor s, thus

x = s x̃

where x̃ is the relative motion estimated by the camera.

• From the IMU:

x = x_0 + v_0(t_1 − t_0) + ∬_{t_0}^{t_1} a(t) dt²

• By equating them:

s x̃ = x_0 + v_0(t_1 − t_0) + ∬_{t_0}^{t_1} a(t) dt²

• As shown in [Martinelli '14], if we assume to know x_0 (usually we set it to 0), then for 6DOF both s and v_0 can be determined in closed form from a single feature observation and 3 views.

23

Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF

[Figure: camera at times t_0 and t_1 observing landmark L_1]

Closed-form Solution (1D case)

s x̃_1 = v_0(t_1 − t_0) + ∬_{t_0}^{t_1} a(t) dt²
s x̃_2 = v_0(t_2 − t_0) + ∬_{t_0}^{t_2} a(t) dt²

In matrix form:

[ x̃_1  (t_0 − t_1) ] [ s  ]   [ ∬_{t_0}^{t_1} a(t) dt² ]
[ x̃_2  (t_0 − t_2) ] [ v_0 ] = [ ∬_{t_0}^{t_2} a(t) dt² ]

24

Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF

[Figure: camera at times t_0, t_1, t_2 observing landmark L_1]
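These two equations form a 2×2 linear system in the unknowns (s, v_0). A numeric sketch with an invented constant-acceleration trajectory (all values illustrative, t_0 = 0, x_0 = 0):

```python
import numpy as np

s_true, v0_true, a = 2.0, 0.5, 0.3        # toy ground truth
t1, t2 = 1.0, 2.0
x = lambda t: v0_true * t + 0.5 * a * t**2        # true 1D position
x_cam = np.array([x(t1) / s_true, x(t2) / s_true])  # vision: positions up to scale
S = np.array([0.5 * a * t1**2, 0.5 * a * t2**2])    # IMU double integrals

# Rows encode s * x_cam_i - v0 * t_i = S_i
A = np.array([[x_cam[0], -t1],
              [x_cam[1], -t2]])
s_est, v0_est = np.linalg.solve(A, S)
print(s_est, v0_est)                      # recovers the scale and initial velocity
```

With noise-free data the solve is exact; in practice more observations are stacked and the system is solved in a least-squares sense, as on the next slide.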

Closed-form Solution (general case)

• Considers N feature observations and the full 6DOF case
• Can be used to initialize filters and non-linear optimizers (which always need an initialization point) [3]
• More complex to derive than the 1D case, but it also reaches a linear system of equations that can be solved using the pseudoinverse

X is the vector of unknowns:
• Absolute scale s
• Initial velocity v_0
• 3D point distances (w.r.t. the first camera)
• Direction of the gravity vector
• Biases

A and S contain the 2D feature coordinates and the acceleration and angular velocity measurements.

25

[1] Martinelli, Vision and IMU data fusion: Closed-form solutions for attitude, speed, absolute scale, and bias determination, IEEE Transactions on Robotics (T-RO), 2012. PDF
[2] Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF
[3] Kaiser, Martinelli, Fontana, Scaramuzza, Simultaneous state initialization and gyroscope bias calibration in visual inertial aided navigation, IEEE Robotics and Automation Letters (RA-L), 2017. PDF

Outline

• What is an IMU and why do we need it
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and Synchronization

26

Non-linear Optimization Methods

VIO is solved as non-linear Least Square optimization problem over

27

[1] Jung Taylor Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results International Conference on Computer Vision and Pattern Recognition (CVPR) 2001 PDF[2] Sterlow Singh Motion estimation from image and inertial measurements International Journal of Robotics Research (IJRR) 2004 PDF

X L 119887119886 119887119892 = 119886119903119892119898119894119899X L 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

Non-linear Optimization Methods

VIO is solved as non-linear Least Square optimization problem over

where

bull 119883 = 1199091 hellip 119909119873 set of robot state estimates 119909119896 (position velocity orientation) at frame times 119896

bull 119871 = 1198711 hellip 119871119872 3D landmarks

bull 119891 119909119896minus1 119906 pose prediction update from integration of IMU measurements 119906 = 119886119895 ω119895

bull 120587(119909119896 119897119894) measurement update from projection of landmark 119871119894 onto camera frame 119868119896bull 119911119894119896 observed features

bull 120556119896 state covariance from the IMU integration 119891 119909119896minus1 119906

bull 120564119894119896 is the covariance from the noisy 2D feature measurements

28

X L 119887119886 119887119892 = 119886119903119892119898119894119899X L 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

[1] Jung Taylor Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results International Conference on Computer Vision and Pattern Recognition (CVPR) 2001 PDF[2] Sterlow Singh Motion estimation from image and inertial measurements International Journal of Robotics Research (IJRR) 2004 PDF

Case Study 1 OKVIS

Because the complexity of the optimization is cubic with respect to the number of cameras poses and features real-time operation becomes infeasible as the trajectory and the map grow over time OKVIS proposed to only optimize the current pose and a window of past keyframes

29Leutenegger Lynen Bosse Siegwart Furgale Keyframe-based visualndashinertial odometry using nonlinear optimization International Journal

of Robotics Research (IJRR) 2015 PDF

Case Study 2 SVO+GTSAM

Solves the same optimization problem as OKVIS but

bull Keeps all the frames (from the start to the end of the trajectory)

bull To make the optimization efficient

bull Marginalizes 3D landmarks (rather than triangulating landmarks only their visual bearing measurements are kept and the visual constraintare enforced by the Epipolar Line Distance (Lecture 08))

bull pre-integrates the IMU data between keyframes (see later)

bull Optimization solved using Factor Graphs via GTSAM

bull Very fast because it only optimizes the poses that are affected by a new observation

30Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry

IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

SVO+GTSAM

31Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry

IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

SVO+GTSAM

32Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry

IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

Problem with IMU integration

bull The integration of IMU measurements 119891 119909119896minus1 119906 from 119896 minus 1 to 119896 is related to the state estimation at time 119896 minus 1

bull During optimization every time the linearization point at 119896 minus 1 changes the integration between 119896 minus 1and 119896 must be re-evaluated thus slowing down the optimization

bull Idea Preintegration

bull defines relative motion increments expressed in body frame which are independent on the global position orientation and velocity at 119896 [1]

bull [2] uses this theory by leveraging the manifold structure of the rotation group SO(3)

33

[1] Lupton Sukkarieh Visual-inertial-aided navigation for high-dynamic motion in built environments without initial conditions IEEE Transactions on Robotics (T-RO) 2012 PDF[2] Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

X L 119887119886 119887119892 = 119886119903119892119898119894119899X L 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

IMU Pre-Integration

34

120596 119886 Δ ෨119877 Δ 119907 Δ 119901

119877119896 119901119896 v119896

StandardEvaluate error in global frame

PreintegrationEvaluate relative errors

119942119877 = Δ ෨119877119879 Δ119877

119942v = Δv minus Δv

119942119901 = Δ 119901 minus Δ119901

119942119877 = 119877 120596 119877119896minus1119879119877119896

119942v = ොv(120596 119886 v119896minus1) minus v119896

119942119901 = Ƹ119901(120596 119886 119901119896minus1) minus 119901119896

Repeat integration when previous state changes

Preintegration of IMU deltas possible with no initial condition required

Prediction Estimate

GTSAM

bull GTSAM [1] is a library that implements sensor fusion based on factor graphs and Bayes trees

bull Factor graphs are a tool borrowed from machine learning [2] that

bull naturally capture the spatial dependences among different sensor measurements

bull can represent many robotics problems like pose graphs and SLAM

35

[1] Dellaert and Kaess Factor Graphs for Robot Perception Foundations and Trends in Robotics 2017 PDF Code[2] Kschischang Frey Loeliger Factor Graphs and the Sum-Product Algorithm Transactions on Information Theory 2001 PDF

GTSAM

bull GTSAM uses factor graphs and Bayes trees to implement incremental inference

bull A Bayes tree is a data structure a directed tree that visually represents the connectivity among the past robot statesand measurements

bull Incremental inference means that when new robot states or measurements are added to the graph only a local part of the tree is updated

36[1] Kaess Ranganathan Dellaer ldquoiSAM Incremental Smoothing and Mappingrdquo IEEE Transactions on Robotics (T-RO) 2008 PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

37

Non-linear optimization vs Filtering

38

Non-linear Optimization methods Filtering methods

Optimize a window of multiple states (or all the states) using non-linear Least-Squares optimization

Solve the same problem by running only one iteration of the optimization function (eg using Extended Kalman Filter (EKF))

Multiple iterations (it re-linearizes at each iteration)

Acheives the highest accuracy

Slower (but fast with GTSAM)

1 Linearization iteration only

Sensitive to linearization point

Fastest

Case Study 1 ROVIO

bull EKF state X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905 1198711 1198712 hellip 119871119896bull Basic idea

1 Prediction step predicts next position velocity orientation and features using IMU integration model

2 Measurement update refines state by leveraging visual constraint (ROVIO minimizes the photometric error between corresponding points (alternative would be the reprojection error))

39Bloesch Burri Omari Hutter Siegwart Iterated extended Kalman filter based visual-inertial odometry using direct photometric feedback

International Journal of Robotics Research (IJRR) 2017 PDF Code

ROVIO Problems

bull Complexity of the EKF grows quadratically in the number of estimated landmarksbull Thus max 20 landmarks are tracked to allow real-time operation

bull Only updates the most recent state

40

Case Study 2 MSCKF

bull MSCKF (Multi-State Constraint Kalman Filter) updates multiple past poses 119901C1 119902C1 hellip 119901C119873 119902CN in addition to the current state 119901 119905 119902 119905 119907 119905 State vector

bull Prediction step same as ROVIO

bull Measurement updatebull Differently from ROVIO

bull landmark positions are not added to the state vector thus can run very fast independently of the number of featuresbull Visual constraint is obtained from the Epipolar Line Distance (Lecture 08)

bull Used in spacecraft landing (NASAJPL Moon and Mars landing) DJI drones Google ARCore

bull Released open source within the OpenVins project httpsgithubcomrpngopen_vins

41

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905 119901C1 119902C1 hellip 119901C119873 119902CN

MSCKF running in Google ARCore (former Google Tango)

42

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

43

Camera-IMU Calibration

bull Goal estimate the rigid-body transformation 119931119913119914 and time offset 119957119941 between the camera and the IMU caused by communication delays and the internal sensor delays (introduced by filters and logic)

bull Assumptions Camera and IMU rigidly attached Camera intrinsically calibrated

bull Data

bull Image points from calibration pattern (checkerboard or QR board)

• IMU measurements: accelerometer a_k and gyroscope ω_k

44

[Figure: camera image timestamps vs. IMU timestamps on a common time axis t, offset by the delay t_d]

Kalibr Toolbox

45
Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

• Code: https://github.com/ethz-asl/kalibr/wiki/camera-imu-calibration

Kalibr Toolbox

46
Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

• Solves a non-linear Least-Squares optimization problem, similar to the one seen before, but also optimizes over T_BC and t_d

bull Continuous-time modelling using splines for 119883

bull Numerical solver Levenberg-Marquardt

{X, L, T_BC, t_d, b_a, b_g} = argmin_{X, L, T_BC, t_d, b_a, b_g}  Σ_{k=1..N} ‖f(x_{k−1}, u) − x_k‖²_{Λ_k}  +  Σ_{k=1..N} Σ_{i=1..M} ‖π(x_k, L_i) − z_{ik}‖²_{Σ_{ik}}

IMU residuals | Reprojection residuals (Bundle Adjustment term)

Kalibr Toolbox

47
Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

bull Generates a report after optimizing the cost function

Residuals
  Reprojection error [px]:     0.0976, 0.051
  Gyroscope error [rad/s]:     0.0167, 0.009
  Accelerometer error [m/s^2]: 0.0595, 0.031

Transformation T_ci (imu to cam):
[[ 0.99995526 -0.00934911 -0.00143776  0.00008436]
 [ 0.00936458  0.99989388  0.01115983  0.00197427]
 [ 0.00133327 -0.0111728   0.99993669 -0.05054946]
 [ 0.          0.          0.          1.        ]]

Time shift (delay d):
cam0 to imu0 [s] (t_imu = t_cam + shift): 0.00270636270255
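The intuition behind the recovered time shift can be illustrated with a toy cross-correlation sketch in Python (signals and delay are assumed for illustration; Kalibr itself estimates t_d inside the spline-based optimization, not this way): a delay between two sensors observing the same rotation shows up as the lag that maximizes their cross-correlation.

```python
# Toy sketch: recover a known inter-sensor delay by cross-correlation.
# Assumed values: 200 Hz sampling, a 12-sample delay (t_d = 0.06 s).
import math

dt = 0.005                 # sample period [s], assumed
shift = 12                 # true delay in samples, assumed
a = [math.sin(0.3 * k) for k in range(3000)]                     # e.g. gyro rate
b = [a[k - shift] if k >= shift else 0.0 for k in range(3000)]   # delayed copy

def xcorr_peak(a, b, max_lag=50):
    """Return the lag (in samples) that maximizes the cross-correlation."""
    best_lag, best_val = 0, float("-inf")
    for lag in range(max_lag + 1):
        v = sum(a[k - lag] * b[k] for k in range(lag, len(b)))
        if v > best_val:
            best_lag, best_val = lag, v
    return best_lag

lag = xcorr_peak(a, b)
print(lag * dt)            # recovered delay in seconds (≈ t_d)
```

In practice Kalibr's joint continuous-time estimation is preferred, since the offset is refined together with T_BC rather than in a separate alignment step.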

Popular Datasets for VIO VI-SLAM

48

EuRoC

Drone with IMU and stereo camera

Blackbird

Drone flying fast with VR goggles (synthetic images and real dynamics + IMU)

UZH FPV Drone Racing

Drone flying fast with event camera stereo camera IMU

MVSEC

Drone motorcycle car with event camera stereo camera

3D lidar GPS IMU

Understanding Check

Are you able to answer the following questions

• Why is it recommended to use an IMU for Visual Odometry?

• Why not just an IMU, without a camera?

• What is the basic idea behind MEMS IMUs?

• What is the drift of a consumer IMU?

• What is the IMU measurement model (formula)?

• What causes the bias in an IMU?

• How do we model the bias?

• How do we integrate the acceleration to get the position (formula)?

• What is the definition of loosely coupled and tightly coupled visual inertial fusion?

• How does non-linear optimization-based visual inertial odometry work? Can you write down the cost function and illustrate its meaning?

49

Why is IMU alone not enough

• Pure IMU integration will lead to large drift (especially with cheap IMUs)
• We will see this mathematically later

bull Intuition

• Double integration of the acceleration to get the position (e.g., 1D scenario: x = x_0 + v_0(t_1 − t_0) + ∬_{t_0}^{t_1} a(t) dt²). If there is a constant bias in the acceleration, the position error is proportional to t².

• Integration of the angular velocity to get the orientation: if there is a bias in the angular velocity, the error is proportional to the time t.
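This error growth can be checked numerically. A short Python sketch with assumed bias values: doubling the integration time roughly quadruples the position error (t² growth) but only doubles the orientation error (t growth).

```python
# Numerical sketch of drift growth under constant biases (values assumed).
dt = 0.001          # integration step [s], hypothetical 1 kHz IMU
b_acc = 0.05        # constant accelerometer bias [m/s^2], assumed
b_gyro = 0.01       # constant gyroscope bias [rad/s], assumed

def position_error(n_steps):
    """Double-integrate the acceleration bias (Euler integration)."""
    v = p = 0.0
    for _ in range(n_steps):
        v += b_acc * dt
        p += v * dt
    return p

def orientation_error(n_steps):
    """Single-integrate the gyro bias (closed form of the sum)."""
    return n_steps * b_gyro * dt

r_pos = position_error(20_000) / position_error(10_000)       # ≈ 4: ~t^2
r_ori = orientation_error(20_000) / orientation_error(10_000) # = 2: ~t
print(r_pos, r_ori)
```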

10

Table from Vectornav one of the best IMU companies Errors were computed assuming the device at rest httpswwwvectornavcomresourcesins-error-budget

[Table: drift of automotive vs. smartphone & drone grade accelerometers]

bull IMU and vision are complementary

• What cameras and IMUs have in common: both estimate the pose incrementally (known as dead-reckoning), which suffers from drift over time. Solution: loop detection and loop closure

Cameras
+ Precise in slow motion
+ More informative than an IMU (measure light energy from the environment)
− Limited output rate (~100 Hz)
− Scale ambiguity in monocular setup
− Lack of robustness to HDR and high speed

IMU
+ Insensitive to motion blur, HDR, low texture
+ Can predict the next feature position
+ High output rate (~1,000 Hz)
+ The higher the acceleration, the more accurate the IMU (due to higher signal-to-noise ratio)
− Large relative uncertainty at low acceleration/angular velocity
− Ambiguity in gravity: the IMU measures the total acceleration

Why visual inertial fusion

11

Outline

bull What is an IMU and why do we need it

bull IMU model

• Visual Inertial Odometry
  • Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

12

IMU Measurement Model

• Measures the angular velocity ω_B(t) and the acceleration ã_B(t) in the body frame B

Notation

• The superscripts g and a stand for gyroscope and accelerometer

• R_BW is the rotation of the World frame W with respect to the Body frame B

• The gravity g is expressed in the World frame

• Biases and noise are expressed in the body frame

13

IMU biases + noise

Raw IMU measurements; the true ω (in the body frame) and the true a (in the world frame) are the quantities to estimate:

ã_B(t) = R_BW(t) (a_W(t) − g) + b_a(t) + n_a(t)

ω̃_B(t) = ω_B(t) + b_g(t) + n_g(t)
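A minimal Python sketch of this measurement model, with assumed bias and noise values (not calibrated constants), makes the role of each term explicit; note that at rest the accelerometer reads −g, i.e. about +9.81 m/s² upward.

```python
# Sketch of the IMU measurement model (bias/noise values assumed):
#   a~_B(t) = R_BW(t) (a_W(t) - g) + b_a(t) + n_a(t)
#   w~_B(t) = w_B(t) + b_g(t) + n_g(t)
import numpy as np

g = np.array([0.0, 0.0, -9.81])         # gravity in the world frame
b_a = np.array([0.02, -0.01, 0.03])     # accelerometer bias, assumed
b_g = np.array([0.001, 0.002, -0.001])  # gyroscope bias, assumed
rng = np.random.default_rng(0)

def simulate_imu(R_BW, a_W, w_B, sigma_a=0.01, sigma_g=0.001):
    """Return noisy (accel, gyro) readings for the true motion (a_W, w_B)."""
    n_a = rng.normal(0.0, sigma_a, 3)
    n_g = rng.normal(0.0, sigma_g, 3)
    a_meas = R_BW @ (a_W - g) + b_a + n_a
    w_meas = w_B + b_g + n_g
    return a_meas, w_meas

# Device at rest, identity attitude: the accelerometer measures -g (plus
# bias and noise), i.e. roughly +9.81 m/s^2 on the z axis.
a_meas, w_meas = simulate_imu(np.eye(3), np.zeros(3), np.zeros(3))
print(a_meas)
```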

IMU Noise Model

bull Additive Gaussian white noise

bull Bias

• The gyroscope and accelerometer biases are considered slowly varying "constants". Their temporal fluctuation is modeled by saying that the derivative of the bias is a zero-mean Gaussian noise with standard deviation σ_b

• Some facts about IMU biases:
• They change with temperature and with mechanical and atmospheric pressure
• They may change every time the IMU is started
• Good news: they can be estimated (see later)

14

n_g(t), n_a(t): additive Gaussian white noise

ḃ(t) = σ_b w(t), with w(t) ~ N(0, 1)

Trawny, Roumeliotis, Indirect Kalman filter for 3D attitude estimation, Technical Report, University of Minnesota, 2005. PDF
More info on the noise model: https://github.com/ethz-asl/kalibr/wiki/IMU-Noise-Model
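In discrete time this bias model becomes a random walk, b_{k+1} = b_k + σ_b √dt · w_k with w_k ~ N(0, 1). A short Python sketch with assumed values shows the slow wandering (its standard deviation grows as σ_b √t):

```python
# Random-walk simulation of a slowly varying IMU bias (values assumed).
import math
import random

random.seed(1)
dt = 0.005          # 200 Hz IMU, assumed
sigma_b = 0.0001    # bias random-walk density, assumed

def simulate_bias(n_steps, b0=0.0):
    """Discrete random walk: b_{k+1} = b_k + sigma_b * sqrt(dt) * w_k."""
    b = b0
    trace = [b]
    for _ in range(n_steps):
        b += sigma_b * math.sqrt(dt) * random.gauss(0.0, 1.0)
        trace.append(b)
    return trace

trace = simulate_bias(100_000)   # ~8 minutes of data
# The bias drifts slowly away from its initial value.
print(trace[-1])
```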

IMU Integration Model

• The IMU Integration Model computes the position, orientation, and velocity of the IMU in the world frame. To do this, we must first compute the acceleration a(t) in the world frame from the measured one ã(t) in the body frame (see Slide 13)

• The position p_k at time t_k can then be predicted from the position p_{k−1} at time t_{k−1} by integrating all the inertial measurements a_j, ω_j within that time interval

• A similar expression can be obtained to predict the velocity v_k and the orientation R_WB of the IMU in the world frame as functions of both a_j and ω_j

15
Trawny, Roumeliotis, Indirect Kalman filter for 3D attitude estimation, Technical Report, University of Minnesota, 2005. PDF
More info on the noise model: https://github.com/ethz-asl/kalibr/wiki/IMU-Noise-Model

p_k = p_{k−1} + v_{k−1}(t_k − t_{k−1}) + ∬_{t_{k−1}}^{t_k} ( R_WB(t) (ã(t) − b(t)) + g ) dt²

a(t) = R_WB(t) (ã(t) − b(t)) + g

IMU Integration Model

For convenience the IMU Integration Model is normally written as

where

• x = [p, q, v]ᵀ represents the IMU state, comprising position, orientation, and velocity

• q is the IMU orientation R_WB (usually represented using quaternions)

• u = {a_j, ω_j} are the accelerometer and gyroscope measurements in the time interval [t_{k−1}, t_k]

16
Trawny, Roumeliotis, Indirect Kalman filter for 3D attitude estimation, Technical Report, University of Minnesota, 2005. PDF
More info on the noise model: https://github.com/ethz-asl/kalibr/wiki/IMU-Noise-Model

[p_k, q_k, v_k]ᵀ = f([p_{k−1}, q_{k−1}, v_{k−1}]ᵀ, u),  or, more compactly:  x_k = f(x_{k−1}, u)
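A minimal planar Python sketch of the prediction x_k = f(x_{k−1}, u) (2D, yaw-only, with biases and gravity ignored for clarity — a simplification of the full 3D model above):

```python
# Planar dead-reckoning sketch of x_k = f(x_{k-1}, u): Euler-integrate the
# gyro (yaw rate) and the body-frame acceleration (2D toy, bias/gravity-free).
import math

def f(state, u, dt=0.01):
    """state = (px, py, yaw, vx, vy); u = (a_body, omega_z)."""
    px, py, yaw, vx, vy = state
    a_body, omega = u
    ax = a_body * math.cos(yaw)   # rotate body acceleration into the world frame
    ay = a_body * math.sin(yaw)
    return (px + vx * dt, py + vy * dt, yaw + omega * dt,
            vx + ax * dt, vy + ay * dt)

# Constant forward acceleration, no rotation: after 1 s, v ~ 1 m/s, p ~ 0.5 m.
state = (0.0, 0.0, 0.0, 0.0, 0.0)
for _ in range(100):
    state = f(state, (1.0, 0.0))
print(state[0], state[3])
```

Running the same loop with a small constant offset added to `a_body` reproduces the t² position drift discussed earlier.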

Outline

bull What is an IMU and why do we need it

bull IMU model

• Visual Inertial Odometry
  • Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

17

Visual Inertial Odometry

Different paradigms exist

bull Loosely coupled

bull Treats VO and IMU as two separate (not coupled) black boxes

bull Each black box estimates pose and velocity from visual (up to a scale) and inertial data (absolute scale)

bull Easy to implement

• Inaccurate; should not be used

bull Tightly coupled

bull Makes use of the raw sensorsrsquo measurements

bull 2D features

bull IMU readings

bull More accurate

bull More implementation effort

In this lecture we will only see tightly coupled approaches

18

The Loosely Coupled Approach

19

[Block diagram: images → feature tracking → 2D features → VO → position (up to a scale) & orientation; IMU measurements → IMU integration → position, orientation, velocity; both outputs → fusion → refined position, orientation, velocity]

The Tightly Coupled Approach

20

[Block diagram: images → feature tracking → 2D features; 2D features + IMU measurements → fusion → refined position, orientation, velocity]

Filtering Visual Inertial Formulation

bull System states

21

Tightly coupled: X = [p(t), q(t), v(t), b_a(t), b_g(t), L_1, L_2, …, L_k]

Loosely coupled: X = [p(t), q(t), v(t), b_a(t), b_g(t)]

Corke, Lobo, Dias, An Introduction to Inertial and Visual Sensing, International Journal of Robotics Research (IJRR), 2017. PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

• Visual Inertial Odometry
  • Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

22

Closed-form Solution (1D case)

• From the camera, we can only get the absolute position x up to an unknown scale factor s, thus

x = s x̃

where x̃ is the relative motion estimated by the camera

• From the IMU:

x = x_0 + v_0(t_1 − t_0) + ∬_{t_0}^{t_1} a(t) dt²

• By equating them:

s x̃ = x_0 + v_0(t_1 − t_0) + ∬_{t_0}^{t_1} a(t) dt²

• As shown in [Martinelli'14], if we assume to know x_0 (usually we set it to 0), then for 6DOF both s and v_0 can be determined in closed form from a single feature observation and 3 views

23
Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF

[Figure: feature L_1 observed at times t_0 and t_1]

Closed-form Solution (1D case)

24Martinelli Closed-form solution of visual-inertial structure from motion International Journal of Computer Vision (IJCV) 2014 PDF

s x̃_1 = v_0(t_1 − t_0) + ∬_{t_0}^{t_1} a(t) dt²

s x̃_2 = v_0(t_2 − t_0) + ∬_{t_0}^{t_2} a(t) dt²

In matrix form:

[ x̃_1  (t_0 − t_1) ] [ s   ]   [ ∬_{t_0}^{t_1} a(t) dt² ]
[ x̃_2  (t_0 − t_2) ] [ v_0 ] = [ ∬_{t_0}^{t_2} a(t) dt² ]

[Figure: feature L_1 observed at times t_0, t_1, t_2]
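The two equations above form a 2×2 linear system in the unknowns s and v_0. A Python sketch with assumed ground-truth values (and noise-free data, t_0 = 0) shows that the system recovers them exactly:

```python
# 1D closed-form sketch: solve for the scale s and initial velocity v0
# from two scale-ambiguous camera readings and the IMU double integrals.
# Ground-truth values below are assumed for illustration.
import numpy as np

s_true, v0_true, a = 2.0, 0.3, 1.5      # scale, initial velocity, accel; t0 = 0

def x(t):
    """True 1D position with x0 = 0 and constant acceleration."""
    return v0_true * t + 0.5 * a * t * t

t1, t2 = 0.5, 1.0
x1_tilde, x2_tilde = x(t1) / s_true, x(t2) / s_true   # camera: position / s
I1, I2 = 0.5 * a * t1**2, 0.5 * a * t2**2             # IMU double integrals

# Rows implement  s * x~_i - v0 * t_i = double integral of a over [0, t_i].
A = np.array([[x1_tilde, -t1],
              [x2_tilde, -t2]])
rhs = np.array([I1, I2])
s_est, v0_est = np.linalg.solve(A, rhs)
print(s_est, v0_est)   # recovers s ~ 2.0 and v0 ~ 0.3
```

With noisy measurements and more views, the same structure yields an overdetermined system solved with the pseudoinverse, as in the general case below.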

Closed-form Solution (general case)

• Considers N feature observations and the full 6DOF case

bull Can be used to initialize filters and non-linear optimizers (which always need an initialization point) [3]

bull More complex to derive than the 1D case But it also reaches a linear system of equations that can be solved using the pseudoinverse

25

X is the vector of unknowns:
• Absolute scale s
• Initial velocity v_0
• 3D point distances (with respect to the first camera)
• Direction of the gravity vector
• Biases

A and S contain the 2D feature coordinates and the acceleration and angular velocity measurements

[1] Martinelli, Vision and IMU data fusion: Closed-form solutions for attitude, speed, absolute scale and bias determination, IEEE Transactions on Robotics (T-RO), 2012. PDF
[2] Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF
[3] Kaiser, Martinelli, Fontana, Scaramuzza, Simultaneous state initialization and gyroscope bias calibration in visual inertial aided navigation, IEEE Robotics and Automation Letters (R-AL), 2017. PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

• Visual Inertial Odometry
  • Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

26

Non-linear Optimization Methods

VIO is solved as a non-linear Least-Squares optimization problem over:

27

[1] Jung, Taylor, Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results, International Conference on Computer Vision and Pattern Recognition (CVPR), 2001. PDF
[2] Strelow, Singh, Motion estimation from image and inertial measurements, International Journal of Robotics Research (IJRR), 2004. PDF

{X, L, b_a, b_g} = argmin_{X, L, b_a, b_g}  Σ_{k=1..N} ‖f(x_{k−1}, u) − x_k‖²_{Λ_k}  +  Σ_{k=1..N} Σ_{i=1..M} ‖π(x_k, L_i) − z_{ik}‖²_{Σ_{ik}}

IMU residuals | Reprojection residuals (Bundle Adjustment term)

Non-linear Optimization Methods

VIO is solved as a non-linear Least-Squares optimization problem over:

where

• X = {x_1, …, x_N}: set of robot state estimates x_k (position, velocity, orientation) at the frame times k

• L = {L_1, …, L_M}: 3D landmarks

• f(x_{k−1}, u): pose prediction from the integration of the IMU measurements u = {a_j, ω_j}

• π(x_k, L_i): measurement prediction from the projection of landmark L_i onto camera frame I_k

• z_{ik}: observed features

• Λ_k: state covariance from the IMU integration f(x_{k−1}, u)

• Σ_{ik}: covariance of the noisy 2D feature measurements

28

{X, L, b_a, b_g} = argmin_{X, L, b_a, b_g}  Σ_{k=1..N} ‖f(x_{k−1}, u) − x_k‖²_{Λ_k}  +  Σ_{k=1..N} Σ_{i=1..M} ‖π(x_k, L_i) − z_{ik}‖²_{Σ_{ik}}

IMU residuals | Reprojection residuals (Bundle Adjustment term)

[1] Jung, Taylor, Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results, International Conference on Computer Vision and Pattern Recognition (CVPR), 2001. PDF
[2] Strelow, Singh, Motion estimation from image and inertial measurements, International Journal of Robotics Research (IJRR), 2004. PDF
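A toy 1D Python sketch of this cost function, with stand-in models for f and π and assumed covariances (everything below is illustrative, not an actual VIO backend): when the states, landmarks, and measurements are consistent, every residual — and hence the whole cost — vanishes.

```python
# Toy 1D evaluation of the VIO cost: IMU residuals weighted by the
# integration covariance (lam), plus reprojection residuals weighted by
# the feature-measurement covariance (sig). All models are stand-ins.
def f(x_prev, u):
    return x_prev + u                 # stand-in for IMU integration

def pi(x_k, L_i):
    return L_i - x_k                  # stand-in for camera projection

def vio_cost(X, L, U, Z, lam=0.1, sig=0.5):
    J = 0.0
    for k in range(1, len(X)):        # IMU term
        r = f(X[k - 1], U[k - 1]) - X[k]
        J += r * r / lam
    for k in range(len(X)):           # reprojection (Bundle Adjustment) term
        for i, L_i in enumerate(L):
            r = pi(X[k], L_i) - Z[k][i]
            J += r * r / sig
    return J

X = [0.0, 1.0, 2.0]                   # robot states (1D positions)
L = [5.0, 7.0]                        # landmarks
U = [1.0, 1.0]                        # IMU increments between frames
Z = [[L_i - x for L_i in L] for x in X]   # perfect measurements

print(vio_cost(X, L, U, Z))           # consistent data: cost is 0.0
```

A real solver would now apply Gauss-Newton or Levenberg-Marquardt to minimize this cost over X, L, and the biases.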

Case Study 1 OKVIS

Because the complexity of the optimization grows cubically with the number of camera poses and features, real-time operation becomes infeasible as the trajectory and the map grow over time. OKVIS proposed to only optimize the current pose and a window of past keyframes.

29
Leutenegger, Lynen, Bosse, Siegwart, Furgale, Keyframe-based visual–inertial odometry using nonlinear optimization, International Journal of Robotics Research (IJRR), 2015. PDF

Case Study 2 SVO+GTSAM

Solves the same optimization problem as OKVIS but

bull Keeps all the frames (from the start to the end of the trajectory)

bull To make the optimization efficient

• Marginalizes 3D landmarks (rather than triangulating landmarks, only their visual bearing measurements are kept, and the visual constraints are enforced by the Epipolar Line Distance (Lecture 08))

bull pre-integrates the IMU data between keyframes (see later)

bull Optimization solved using Factor Graphs via GTSAM

bull Very fast because it only optimizes the poses that are affected by a new observation

30
Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb 2017. Best Paper Award 2018


Problem with IMU integration

• The integration of the IMU measurements f(x_{k−1}, u) from k−1 to k depends on the state estimate at time k−1

• During the optimization, every time the linearization point at k−1 changes, the integration between k−1 and k must be re-evaluated, thus slowing down the optimization

• Idea: Preintegration

• Defines relative motion increments, expressed in the body frame, which are independent of the global position, orientation, and velocity at k [1]

• [2] uses this theory by leveraging the manifold structure of the rotation group SO(3)

33

[1] Lupton, Sukkarieh, Visual-inertial-aided navigation for high-dynamic motion in built environments without initial conditions, IEEE Transactions on Robotics (T-RO), 2012. PDF
[2] Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb 2017. Best Paper Award 2018

{X, L, b_a, b_g} = argmin_{X, L, b_a, b_g}  Σ_{k=1..N} ‖f(x_{k−1}, u) − x_k‖²_{Λ_k}  +  Σ_{k=1..N} Σ_{i=1..M} ‖π(x_k, L_i) − z_{ik}‖²_{Σ_{ik}}

IMU residuals | Reprojection residuals (Bundle Adjustment term)

IMU Pre-Integration

34

Measurements: ω̃, ã; preintegrated deltas: ΔR̃, Δṽ, Δp̃; states: R_k, p_k, v_k

Standard — evaluate errors in the global frame (the integration must be repeated whenever the previous state changes):
e_R = R̂(ω̃, R_{k−1})ᵀ R_k
e_v = v̂(ω̃, ã, v_{k−1}) − v_k
e_p = p̂(ω̃, ã, p_{k−1}) − p_k

Preintegration — evaluate relative errors (the IMU deltas require no initial condition and can be precomputed):
e_R = ΔR̃ᵀ ΔR
e_v = Δṽ − Δv
e_p = Δp̃ − Δp
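A planar (yaw-only, gravity ignored) Python sketch of the preintegration idea — a simplification of the SO(3) formulation above: the deltas are computed from the raw measurements alone, so they never need to be re-integrated when the estimate at k−1 changes.

```python
# Planar sketch of IMU preintegration (2D toy, gravity/biases ignored):
# the deltas depend only on the raw (a, omega) samples, not on the global
# state at k-1, so re-linearizing the state does not require re-integration.
import math

def preintegrate(measurements, dt=0.01):
    """measurements: list of (a_body, omega). Returns (dtheta, dv, dp),
    all expressed in the body frame at the start of the interval."""
    theta, dvx, dvy, dpx, dpy = 0.0, 0.0, 0.0, 0.0, 0.0
    for a, w in measurements:
        dpx += dvx * dt
        dpy += dvy * dt
        dvx += a * math.cos(theta) * dt
        dvy += a * math.sin(theta) * dt
        theta += w * dt
    return theta, (dvx, dvy), (dpx, dpy)

meas = [(1.0, 0.2)] * 100            # constant accel + slow turn, assumed
d1 = preintegrate(meas)
d2 = preintegrate(meas)              # no initial condition needed: identical
print(d1[0])                         # yaw increment: 0.2 rad/s over 1 s
```

In the optimizer, these deltas are attached to the factor between x_{k−1} and x_k, and only the (cheap) relative-error evaluation is repeated at each iteration.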

GTSAM

bull GTSAM [1] is a library that implements sensor fusion based on factor graphs and Bayes trees

bull Factor graphs are a tool borrowed from machine learning [2] that

bull naturally capture the spatial dependences among different sensor measurements

bull can represent many robotics problems like pose graphs and SLAM

35

[1] Dellaert and Kaess, Factor Graphs for Robot Perception, Foundations and Trends in Robotics, 2017. PDF, Code
[2] Kschischang, Frey, Loeliger, Factor Graphs and the Sum-Product Algorithm, Transactions on Information Theory, 2001. PDF

GTSAM

bull GTSAM uses factor graphs and Bayes trees to implement incremental inference

• A Bayes tree is a data structure (a directed tree) that represents the connectivity among the past robot states and measurements

bull Incremental inference means that when new robot states or measurements are added to the graph only a local part of the tree is updated

36
[1] Kaess, Ranganathan, Dellaert, "iSAM: Incremental Smoothing and Mapping", IEEE Transactions on Robotics (T-RO), 2008. PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

• Visual Inertial Odometry
  • Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

37

Non-linear optimization vs Filtering

38

Non-linear optimization methods:
• Optimize a window of multiple states (or all the states) using non-linear Least-Squares optimization
• Multiple iterations (re-linearize at each iteration)
• Achieve the highest accuracy
• Slower (but fast with GTSAM)

Filtering methods:
• Solve the same problem by running only one iteration of the optimization function (e.g., using an Extended Kalman Filter (EKF))
• One linearization iteration only
• Sensitive to the linearization point
• Fastest
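The difference can be seen on a toy 1D problem (an assumed measurement model z = x², purely illustrative): an iterated Gauss-Newton solver re-linearizes at every step and converges, while a filter-style update stops after a single linearization and keeps the linearization error.

```python
# Toy 1D comparison: iterated Gauss-Newton vs. a single linearization step.
# Measurement model h(x) = x^2 (assumed); measurement z = 4, so true x = 2.
def gauss_newton(z, x0, iters):
    x = x0
    for _ in range(iters):
        J = 2.0 * x                    # Jacobian of h(x) = x^2 at current x
        x = x + (z - x * x) / J        # one linearized correction
    return x

z, x0 = 4.0, 1.0
one_step  = gauss_newton(z, x0, 1)     # single linearization: overshoots to 2.5
converged = gauss_newton(z, x0, 10)    # re-linearizing converges to 2.0
print(one_step, converged)
```

A real EKF additionally weighs the correction by prior and measurement covariances, but the sensitivity to the single linearization point is the same effect shown here.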

Case Study 1 ROVIO

• EKF state: X = [p(t), q(t), v(t), b_a(t), b_g(t), L_1, L_2, …, L_k]

• Basic idea:

1. Prediction step: predicts the next position, velocity, orientation, and features using the IMU integration model

2. Measurement update: refines the state by leveraging the visual constraint (ROVIO minimizes the photometric error between corresponding points; an alternative would be the reprojection error)

39
Bloesch, Burri, Omari, Hutter, Siegwart, Iterated extended Kalman filter based visual-inertial odometry using direct photometric feedback, International Journal of Robotics Research (IJRR), 2017. PDF, Code

ROVIO Problems

• The complexity of the EKF grows quadratically with the number of estimated landmarks
• Thus, a maximum of ~20 landmarks are tracked to allow real-time operation

bull Only updates the most recent state

40

Case Study 2 MSCKF

bull MSCKF (Multi-State Constraint Kalman Filter) updates multiple past poses 119901C1 119902C1 hellip 119901C119873 119902CN in addition to the current state 119901 119905 119902 119905 119907 119905 State vector

bull Prediction step same as ROVIO

bull Measurement updatebull Differently from ROVIO

bull landmark positions are not added to the state vector thus can run very fast independently of the number of featuresbull Visual constraint is obtained from the Epipolar Line Distance (Lecture 08)

bull Used in spacecraft landing (NASAJPL Moon and Mars landing) DJI drones Google ARCore

bull Released open source within the OpenVins project httpsgithubcomrpngopen_vins

41

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905 119901C1 119902C1 hellip 119901C119873 119902CN

MSCKF running in Google ARCore (former Google Tango)

42

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

43

Camera-IMU Calibration

bull Goal estimate the rigid-body transformation 119931119913119914 and time offset 119957119941 between the camera and the IMU caused by communication delays and the internal sensor delays (introduced by filters and logic)

bull Assumptions Camera and IMU rigidly attached Camera intrinsically calibrated

bull Data

bull Image points from calibration pattern (checkerboard or QR board)

bull IMU measurements accelerometer 119886119896 and gyroscope 120596119896

44

119905

images

IMU

119957119941

Kalibr Toolbox

45Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Code httpsgithubcomethz-aslkalibrwikicamera-imu-calibration

Kalibr Toolbox

46Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Solves a non-linear Least Square optimization problem similar to that seen before but also optimizes over 119879119861119862 119905119889

bull Continuous-time modelling using splines for 119883

bull Numerical solver Levenberg-Marquardt

119883 119871 119879119861119862 119905119889 119887119886 119887119892 = 119886119903119892119898119894119899119883 119871119879119861119862119905119889 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

Kalibr Toolbox

47Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Generates a report after optimizing the cost function

Residuals

Reprojection error [px] 00976 0051

Gyroscope error [rads] 00167 0009

Accelerometer error [ms^2] 00595 0031

Transformation T_ci (imu to cam)

[[ 099995526 -000934911 -000143776 000008436]

[ 000936458 099989388 001115983 000197427]

[ 000133327 -00111728 099993669 -005054946]

[ 0 0 0 1 ]]

Time shift (delay d)

cam0 to imu0 [s] (t_imu = t_cam + shift)

000270636270255

Popular Datasets for VIO VI-SLAM

48

EuRoC

Drone with IMU and stereo camera

Blackbird

Drone flying fast with VR goggles (synthetic images and real dynamics + IMU)

UZH FPV Drone Racing

Drone flying fast with event camera stereo camera IMU

MVSEC

Drone motorcycle car with event camera stereo camera

3D lidar GPS IMU

Understanding Check

Are you able to answer the following questions

bull Why is it recommended to use an IMU for Visual Odometry

bull Why not just an IMU without a camera

bull What is the basic idea behind MEMS IMUs

bull What is the drift of a consumer IMU

bull What is the IMU measurement model (formula)

bull What causes the bias in an IMU

bull How do we model the bias

bull How do we integrate the acceleration to get the position (formula)

bull What is the definition of loosely coupled and tightly coupled visual inertial fusion

bull How does non-linear optimization-based visual inertial odometry Can you write down the cost function and illustrate its meaning

49

Page 11: Vision Algorithms for Mobile Robotics Lecture 01 Introduction

bull IMU and vision are complementary

bull What cameras and IMU have in common both estimate the pose incrementally (known as dead-reckoning) which suffers from drift over time Solution loop detection and loop closure

Cameras IMU

Precise in slow motion More informative than an IMU (measures light energy

from the environment)

ᵡ Limited output rate (~100 Hz)ᵡ Scale ambiguity in monocular setupᵡ Lack of robustness to HDR and high speed

Insensitive to motion blur HDR texture Can predict next feature position High output rate (~1000 Hz) The higher the acceleration the more accurate the IMU

(due to higher signal to noise ratio)

ᵡ Large relative uncertainty when at low accelerationangular velocity

ᵡ Ambiguity in gravity acceleration (IMU measures the total acceleration)

Why visual inertial fusion

11

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

12

IMU Measurement Model

bull Measures angular velocity ω119861 119905 and acceleration 119886119861(119905) in the body frame 119861

Notation

bull The superscript 119892 stands for gyroscope and 119886 for accelerometer

bull 119877119861119882 is the rotation of the World frame 119882 with respect to Body frame 119861

bull The gravity 119892 is expressed in the World frame

bull Biases and noise are expressed in the body frame13

IMU biases + noise

Raw IMU measurements true 120654 (in body frame) and true 119834 (in

world frame) to estimate

119886119861 119905 = 119877119861119882 119905 119886119882 119905 minus 119892 + 119887119886 119905 + 119899119886(119905)

ω119861 119905 = ω119861 119905 + 119887119892 119905 + 119899119892(119905)

IMU Noise Model

bull Additive Gaussian white noise

bull Bias

bull The gyroscope and accelerometer biases are considered slowly varying ldquoconstantsrdquo Their temporal fluctuation is modeled by saying that the derivative of the bias is a zero-mean Gaussian noise with standard deviation 120590119887

bull Some facts about IMU biasesbull They change with temperature and mechanical and atmospheric pressurebull They may change every time the IMU is startedbull Good news they can be estimated (see later)

14

g at tn n

g at tb b

( ) bt tb w 119856 t ~119821(o 1)

Trawny Roumeliotis Indirect Kalman filter for 3D attitude estimation Technical Report University of Minnesota 2005 PDFMore info on the noise model httpsgithubcomethz-aslkalibrwikiIMU-Noise-Model

IMU Integration Model

bull The IMU Integration Model computes the position orientation and velocity of the IMU in the world frame To do this we must first compute the acceleration 119886 119905 in the world frame from the measured one 119886(119905) in the body frame (see Slide 13)

bull The position 119901119896 at time 119905119896 can then be predicted from the position 119901119896minus1 at time 119905119896minus1 by integrating all the inertial measurements 119886119895 ω119895 within that time interval

bull A similar expression can be obtained to predict the velocity 119907119896 and orientation 119877119882119861 of the IMU in the world frame as functions of both 119886119895 and ω119895

15Trawny Roumeliotis Indirect Kalman filter for 3D attitude estimation Technical Report University of Minnesota 2005 PDF

More info on the noise model httpsgithubcomethz-aslkalibrwikiIMU-Noise-Model

119901119896 = 119901119896minus1 + 119907119896minus1 119905119896 minus 119905119896minus1 +ඵ119905119896minus1

119905119896

119877119882119861(119905) 119886 119905 minus 119887 119905 + 119892)1198891199052

119886(119905) = 119877119882119861(119905) 119886 119905 minus 119887 119905 + 119892

IMU Integration Model

For convenience the IMU Integration Model is normally written as

where

bull 119909 =119901119902119907

represents the IMU state comprising position orientation and velocity

bull 119902 is the IMU orientation 119929119934119913 (usually represented using quaternions)

bull 119906 = 119886119895 ω119895 are the accelerometer and gyroscope measurements in the time interval 119905119896minus1 119905119896

16Trawny Roumeliotis Indirect Kalman filter for 3D attitude estimation Technical Report University of Minnesota 2005 PDF

More info on the noise model httpsgithubcomethz-aslkalibrwikiIMU-Noise-Model

119909119896 = 119891 119909119896minus1 119906

119901119896119902119896119907119896

= 119891

119901119896minus1119902119896minus1119907119896minus1

119906 or more compactly

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

17

Visual Inertial Odometry

Different paradigms exist

bull Loosely coupled

bull Treats VO and IMU as two separate (not coupled) black boxes

bull Each black box estimates pose and velocity from visual (up to a scale) and inertial data (absolute scale)

bull Easy to implement

• Inaccurate; should not be used

bull Tightly coupled

• Makes use of the raw sensors' measurements

bull 2D features

bull IMU readings

bull More accurate

bull More implementation effort

In this lecture we will only see tightly coupled approaches

18

The Loosely Coupled Approach

19

(Block diagram: images → feature tracking → 2D features → VO → position (up to a scale) & orientation; IMU measurements → IMU integration → position, orientation, velocity; both outputs → fusion → refined position, orientation, velocity)

The Tightly Coupled Approach

20

(Block diagram: images → feature tracking → 2D features; 2D features and IMU measurements → fusion → refined position, orientation, velocity)

Filtering Visual Inertial Formulation

bull System states

21

Tightly coupled:

$X = \{ p(t), q(t), v(t), b_a(t), b_g(t), L_1, L_2, \dots, L_k \}$

Loosely coupled:

$X = \{ p(t), q(t), v(t), b_a(t), b_g(t) \}$

Corke, Lobo, Dias, An Introduction to Inertial and Visual Sensing, International Journal of Robotics Research (IJRR), 2017. PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

• Visual Inertial Odometry

• Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

22

Closed-form Solution (1D case)

• From the camera, we can only get the absolute position $x$ up to an unknown scale factor $s$, thus

$x = s\,\tilde x$

where $\tilde x$ is the relative motion estimated by the camera

• From the IMU:

$x = x_0 + v_0\,(t_1 - t_0) + \iint_{t_0}^{t_1} a(t)\, dt^2$

• By equating them:

$s\,\tilde x = x_0 + v_0\,(t_1 - t_0) + \iint_{t_0}^{t_1} a(t)\, dt^2$

• As shown in [Martinelli'14], if we assume to know $x_0$ (usually we set it to 0), then, for 6DOF, both $s$ and $v_0$ can be determined in closed form from a single feature observation and 3 views

23
Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF

(Figure: a landmark $L_1$ observed at times $t_0$ and $t_1$)

Closed-form Solution (1D case)

24
Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF

$s\,\tilde x_1 = v_0\,(t_1 - t_0) + \iint_{t_0}^{t_1} a(t)\, dt^2$

$s\,\tilde x_2 = v_0\,(t_2 - t_0) + \iint_{t_0}^{t_2} a(t)\, dt^2$

In matrix form:

$\begin{pmatrix} \tilde x_1 & (t_0 - t_1) \\ \tilde x_2 & (t_0 - t_2) \end{pmatrix} \begin{pmatrix} s \\ v_0 \end{pmatrix} = \begin{pmatrix} \iint_{t_0}^{t_1} a(t)\, dt^2 \\ \iint_{t_0}^{t_2} a(t)\, dt^2 \end{pmatrix}$

(Figure: a landmark $L_1$ observed at times $t_0$, $t_1$, and $t_2$)
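With made-up numbers ($x_0 = 0$, constant acceleration), the 2×2 system above can be solved directly; `x_imu`, `s_true`, and `v0_true` are synthetic stand-ins used only to generate consistent measurements:

```python
import numpy as np

# Ground truth used to synthesize measurements (constant acceleration a = 2 m/s^2)
s_true, v0_true, a = 0.5, 1.0, 2.0

def x_imu(t):
    # True position with x0 = 0: x(t) = v0*t + 0.5*a*t^2
    return v0_true * t + 0.5 * a * t * t

t1, t2 = 1.0, 2.0                 # t0 = 0
x1_tilde = x_imu(t1) / s_true     # up-to-scale camera measurements: x = s * x_tilde
x2_tilde = x_imu(t2) / s_true

# s*x_tilde_i - v0*t_i = double-integral term (= 0.5*a*t_i^2 here)
A = np.array([[x1_tilde, -t1],
              [x2_tilde, -t2]])
b = np.array([0.5 * a * t1**2, 0.5 * a * t2**2])
s_est, v0_est = np.linalg.solve(A, b)   # recovers s = 0.5 and v0 = 1.0
```

The double integrals, which in practice come from summing accelerometer samples, reduce here to $\tfrac{1}{2} a\, t_i^2$ because the acceleration is constant.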

Closed-form Solution (general case)

bull Considers 119873 feature observations and the full 6DOF case

bull Can be used to initialize filters and non-linear optimizers (which always need an initialization point) [3]

• More complex to derive than the 1D case, but it also leads to a linear system of equations, $A X = S$, that can be solved using the pseudoinverse

25

$X$ is the vector of unknowns:
• Absolute scale $s$
• Initial velocity $v_0$
• 3D point distances (w.r.t. the first camera)
• Direction of the gravity vector
• Biases

$A$ and $S$ contain 2D feature coordinates, acceleration, and angular velocity measurements

[1] Martinelli, Vision and IMU data fusion: Closed-form solutions for attitude, speed, absolute scale, and bias determination, IEEE Transactions on Robotics (T-RO), 2012. PDF
[2] Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF
[3] Kaiser, Martinelli, Fontana, Scaramuzza, Simultaneous state initialization and gyroscope bias calibration in visual inertial aided navigation, IEEE Robotics and Automation Letters (R-AL), 2017. PDF
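As a toy illustration of the last step (the real $A$ and $S$ of [1, 2] stack feature and inertial terms): an overdetermined linear system $A X = S$ solved in the least-squares sense with the pseudoinverse:

```python
import numpy as np

rng = np.random.default_rng(0)
X_true = np.array([2.0, -1.0, 0.5])   # stand-in unknowns (e.g. scale, v0, distance)
A = rng.standard_normal((10, 3))      # 10 stacked measurement equations, 3 unknowns
S = A @ X_true                        # noise-free right-hand side

X_est = np.linalg.pinv(A) @ S         # least-squares solution via pseudoinverse
```

With noisy measurements the same expression returns the minimum-norm least-squares estimate; `np.linalg.lstsq` is the numerically preferred equivalent.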

Outline

bull What is an IMU and why do we need it

bull IMU model

• Visual Inertial Odometry

• Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

26

Non-linear Optimization Methods

VIO is solved as a non-linear least-squares optimization problem over:

27

[1] Jung, Taylor, Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results, International Conference on Computer Vision and Pattern Recognition (CVPR), 2001. PDF
[2] Strelow, Singh, Motion estimation from image and inertial measurements, International Journal of Robotics Research (IJRR), 2004. PDF

$\{X, L, b_a, b_g\} = \operatorname*{argmin}_{X, L, b_a, b_g} \; \sum_{k=1}^{N} \big\| f(x_{k-1}, u) - x_k \big\|^2_{\Lambda_k} \; + \; \sum_{k=1}^{N} \sum_{i=1}^{M} \big\| \pi(x_k, L_i) - z_{ik} \big\|^2_{\Sigma_{ik}}$

where the first term contains the IMU residuals and the second term the reprojection residuals (Bundle Adjustment term).

Non-linear Optimization Methods

VIO is solved as a non-linear least-squares optimization problem over:

where

• $X = \{x_1, \dots, x_N\}$: set of robot state estimates $x_k$ (position, velocity, orientation) at frame times $k$

• $L = \{L_1, \dots, L_M\}$: 3D landmarks

• $f(x_{k-1}, u)$: pose prediction update from integration of the IMU measurements $u = (\tilde a_j, \tilde\omega_j)$

• $\pi(x_k, L_i)$: measurement update from projection of landmark $L_i$ onto camera frame $I_k$

• $z_{ik}$: observed features

• $\Lambda_k$: state covariance from the IMU integration $f(x_{k-1}, u)$

• $\Sigma_{ik}$: covariance of the noisy 2D feature measurements

28

$\{X, L, b_a, b_g\} = \operatorname*{argmin}_{X, L, b_a, b_g} \; \sum_{k=1}^{N} \big\| f(x_{k-1}, u) - x_k \big\|^2_{\Lambda_k} \; + \; \sum_{k=1}^{N} \sum_{i=1}^{M} \big\| \pi(x_k, L_i) - z_{ik} \big\|^2_{\Sigma_{ik}}$

IMU residuals + reprojection residuals (Bundle Adjustment term)

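The two residual sums can be sketched as a function of the states; `f`, `pi`, and the trivial 1D example below are placeholders for the actual IMU prediction and camera projection models:

```python
import numpy as np

def mahalanobis_sq(e, Sigma):
    # Squared Mahalanobis norm ||e||^2_Sigma = e^T Sigma^-1 e
    return float(e @ np.linalg.solve(Sigma, e))

def vio_cost(xs, landmarks, zs, f, pi, Lam, Sig):
    """xs: states x_0..x_N; zs[k][i]: observed feature of landmark i at frame k.
    f: IMU prediction x_k = f(x_{k-1}); pi: projection pi(x_k, L_i)."""
    cost = 0.0
    for k in range(1, len(xs)):                        # IMU residuals
        cost += mahalanobis_sq(f(xs[k - 1]) - xs[k], Lam)
    for k in range(1, len(xs)):                        # reprojection residuals
        for i, L in enumerate(landmarks):
            cost += mahalanobis_sq(pi(xs[k], L) - zs[k][i], Sig)
    return cost

# Toy 1D check: perfect predictions and measurements give zero cost
perfect_cost = vio_cost(
    xs=[np.zeros(1), np.zeros(1)],
    landmarks=[np.ones(1)],
    zs={1: [np.ones(1)]},
    f=lambda x: x,              # stationary motion model (placeholder)
    pi=lambda x, L: L - x,      # 1D "projection" (placeholder)
    Lam=np.eye(1), Sig=np.eye(1))
```

A solver such as Gauss-Newton or Levenberg-Marquardt then minimizes this cost over the states, landmarks, and biases.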

Case Study 1 OKVIS

Because the complexity of the optimization is cubic in the number of camera poses and features, real-time operation becomes infeasible as the trajectory and the map grow over time. OKVIS proposed to optimize only the current pose and a window of past keyframes.

29
Leutenegger, Lynen, Bosse, Siegwart, Furgale, Keyframe-based visual-inertial odometry using nonlinear optimization, International Journal of Robotics Research (IJRR), 2015. PDF

Case Study 2 SVO+GTSAM

Solves the same optimization problem as OKVIS but

bull Keeps all the frames (from the start to the end of the trajectory)

bull To make the optimization efficient

• Marginalizes 3D landmarks (rather than triangulating landmarks, only their visual bearing measurements are kept, and the visual constraints are enforced by the Epipolar Line Distance (Lecture 08))

bull pre-integrates the IMU data between keyframes (see later)

bull Optimization solved using Factor Graphs via GTSAM

bull Very fast because it only optimizes the poses that are affected by a new observation

30
Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb. 2017. Best Paper Award 2018


Problem with IMU integration

• The integration of IMU measurements $f(x_{k-1}, u)$ from $k-1$ to $k$ depends on the state estimate at time $k-1$

• During optimization, every time the linearization point at $k-1$ changes, the integration between $k-1$ and $k$ must be re-evaluated, thus slowing down the optimization

bull Idea Preintegration

• Defines relative motion increments, expressed in the body frame, which are independent of the global position, orientation, and velocity at $k$ [1]

bull [2] uses this theory by leveraging the manifold structure of the rotation group SO(3)

33

[1] Lupton, Sukkarieh, Visual-inertial-aided navigation for high-dynamic motion in built environments without initial conditions, IEEE Transactions on Robotics (T-RO), 2012. PDF
[2] Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb. 2017. Best Paper Award 2018

$\{X, L, b_a, b_g\} = \operatorname*{argmin}_{X, L, b_a, b_g} \; \sum_{k=1}^{N} \big\| f(x_{k-1}, u) - x_k \big\|^2_{\Lambda_k} \; + \; \sum_{k=1}^{N} \sum_{i=1}^{M} \big\| \pi(x_k, L_i) - z_{ik} \big\|^2_{\Sigma_{ik}}$

IMU residuals + reprojection residuals (Bundle Adjustment term)

IMU Pre-Integration

34

Given the measurements $\tilde\omega, \tilde a$, the preintegrated deltas $\Delta\tilde R, \Delta\tilde v, \Delta\tilde p$ (predictions), and the state estimates $R_k, p_k, v_k$:

Standard: evaluate the error in the global frame (the integration must be repeated whenever the previous state changes):

$e_R = \hat R(\tilde\omega, R_{k-1})^T R_k, \quad e_v = \hat v(\tilde\omega, \tilde a, v_{k-1}) - v_k, \quad e_p = \hat p(\tilde\omega, \tilde a, p_{k-1}) - p_k$

Preintegration: evaluate relative errors (preintegration of IMU deltas is possible with no initial condition required):

$e_R = \Delta\tilde R^T\, \Delta R, \quad e_v = \Delta\tilde v - \Delta v, \quad e_p = \Delta\tilde p - \Delta p$
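A sketch of accumulating the preintegrated deltas $(\Delta R, \Delta v, \Delta p)$ from the raw samples between $k-1$ and $k$, independent of the state at $k-1$; bias correction and noise propagation, which the on-manifold formulation handles carefully, are omitted:

```python
import numpy as np

def skew(w):
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def preintegrate(gyro, accel, dt):
    """Accumulate relative motion increments (dR, dv, dp), expressed in the
    body frame at time k-1, from gyro/accel samples between k-1 and k."""
    dR, dv, dp = np.eye(3), np.zeros(3), np.zeros(3)
    for w, a in zip(gyro, accel):
        dp = dp + dv * dt + 0.5 * dR @ a * dt**2   # position increment
        dv = dv + dR @ a * dt                      # velocity increment
        dR = dR @ (np.eye(3) + skew(w) * dt)       # first-order exponential map
    return dR, dv, dp

# Constant 1 m/s^2 forward acceleration, no rotation, 100 samples of 10 ms
n, dt = 100, 0.01
dR, dv, dp = preintegrate([np.zeros(3)] * n, [np.array([1.0, 0.0, 0.0])] * n, dt)
```

Because the deltas never reference the global state, they are computed once per interval and reused at every re-linearization, which is exactly what makes preintegration fast.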

GTSAM

bull GTSAM [1] is a library that implements sensor fusion based on factor graphs and Bayes trees

bull Factor graphs are a tool borrowed from machine learning [2] that

• naturally capture the spatial dependencies among different sensor measurements

bull can represent many robotics problems like pose graphs and SLAM

35

[1] Dellaert, Kaess, Factor Graphs for Robot Perception, Foundations and Trends in Robotics, 2017. PDF, Code
[2] Kschischang, Frey, Loeliger, Factor Graphs and the Sum-Product Algorithm, IEEE Transactions on Information Theory, 2001. PDF

GTSAM

bull GTSAM uses factor graphs and Bayes trees to implement incremental inference

• A Bayes tree is a data structure (a directed tree) that represents the connectivity among the past robot states and measurements

bull Incremental inference means that when new robot states or measurements are added to the graph only a local part of the tree is updated

36
[1] Kaess, Ranganathan, Dellaert, "iSAM: Incremental Smoothing and Mapping", IEEE Transactions on Robotics (T-RO), 2008. PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

• Visual Inertial Odometry

• Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

37

Non-linear optimization vs Filtering

38

Non-linear optimization methods:
• Optimize a window of multiple states (or all the states) using non-linear least-squares optimization
• Multiple iterations (re-linearizes at each iteration)
• Achieves the highest accuracy
• Slower (but fast with GTSAM)

Filtering methods:
• Solve the same problem by running only one iteration of the optimization (e.g., using the Extended Kalman Filter (EKF))
• One linearization iteration only
• Sensitive to the linearization point
• Fastest

Case Study 1 ROVIO

• EKF state: $X = \{ p(t), q(t), v(t), b_a(t), b_g(t), L_1, L_2, \dots, L_k \}$

• Basic idea:

1 Prediction step predicts next position velocity orientation and features using IMU integration model

2 Measurement update refines state by leveraging visual constraint (ROVIO minimizes the photometric error between corresponding points (alternative would be the reprojection error))

39
Bloesch, Burri, Omari, Hutter, Siegwart, Iterated extended Kalman filter based visual-inertial odometry using direct photometric feedback, International Journal of Robotics Research (IJRR), 2017. PDF, Code
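The two EKF steps can be sketched with a generic linear measurement update; this stands in for ROVIO's photometric update, which is more involved, and all matrices below are toy values:

```python
import numpy as np

def ekf_predict(x, P, F, Q):
    # Propagate state and covariance with the (linearized) IMU model
    return F @ x, F @ P @ F.T + Q

def ekf_update(x, P, z, H, R):
    # Standard EKF measurement update with innovation z - H x
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)                 # Kalman gain
    x_new = x + K @ (z - H @ x)
    P_new = (np.eye(len(x)) - K @ H) @ P
    return x_new, P_new

# Toy 1D state: one prediction, then a direct position measurement of 1.0
x, P = np.array([0.0]), np.eye(1)
x, P = ekf_predict(x, P, F=np.eye(1), Q=0.01 * np.eye(1))
x, P = ekf_update(x, P, z=np.array([1.0]), H=np.eye(1), R=0.01 * np.eye(1))
```

After the update the estimate moves close to the measurement and the covariance shrinks, which is the filtering analogue of one Gauss-Newton iteration on the most recent state.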

ROVIO Problems

• Complexity of the EKF grows quadratically in the number of estimated landmarks
• Thus, a maximum of ~20 landmarks are tracked to allow real-time operation

bull Only updates the most recent state

40

Case Study 2 MSCKF

• MSCKF (Multi-State Constraint Kalman Filter) updates multiple past poses $p_{C_1}, q_{C_1}, \dots, p_{C_N}, q_{C_N}$ in addition to the current state $p(t), q(t), v(t)$. State vector:

bull Prediction step same as ROVIO

bull Measurement updatebull Differently from ROVIO

• Landmark positions are not added to the state vector, thus it can run very fast independently of the number of features
• The visual constraint is obtained from the Epipolar Line Distance (Lecture 08)

bull Used in spacecraft landing (NASAJPL Moon and Mars landing) DJI drones Google ARCore

bull Released open source within the OpenVins project httpsgithubcomrpngopen_vins

41

[1] Mourikis, Roumeliotis, A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation, International Conference on Robotics and Automation (ICRA), 2007. PDF
[2] Li, Mourikis, High-precision, consistent EKF-based visual-inertial odometry, International Journal of Robotics Research (IJRR), 2013. PDF

$X = \{ p(t), q(t), v(t), b_a(t), b_g(t), p_{C_1}, q_{C_1}, \dots, p_{C_N}, q_{C_N} \}$

MSCKF running in Google ARCore (former Google Tango)

42

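Keeping past camera poses in the state is commonly done by "stochastic cloning": when a new image arrives, the current pose estimate is appended to the state and the covariance is augmented with the proper cross-correlations. The slides do not spell this mechanism out; the sketch below assumes a flat state vector and a selector matrix `J` picking the cloned entries:

```python
import numpy as np

def clone_state(x, P, pose_idx):
    """Append a copy of the pose sub-state x[pose_idx] to the state vector and
    augment the covariance: P' = [[P, P J^T], [J P, J P J^T]], J the selector."""
    J = np.zeros((len(pose_idx), len(x)))
    J[np.arange(len(pose_idx)), pose_idx] = 1.0
    x_aug = np.concatenate([x, x[pose_idx]])
    top = np.hstack([P, P @ J.T])
    bot = np.hstack([J @ P, J @ P @ J.T])
    return x_aug, np.vstack([top, bot])

x = np.array([1.0, 2.0, 3.0])         # toy flattened state (e.g. p, q, v entries)
P = np.diag([0.1, 0.2, 0.3])
x_aug, P_aug = clone_state(x, P, pose_idx=[0, 1])
```

The clone starts fully correlated with the current pose, which is what later lets a single feature track constrain several past poses at once.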

Outline

bull What is an IMU and why do we need it

bull IMU model

• Visual Inertial Odometry

• Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

43

Camera-IMU Calibration

• Goal: estimate the rigid-body transformation $T_{BC}$ and the time offset $t_d$ between the camera and the IMU, caused by communication delays and the internal sensor delays (introduced by filters and logic)

• Assumptions: camera and IMU rigidly attached; camera intrinsically calibrated

bull Data

bull Image points from calibration pattern (checkerboard or QR board)

• IMU measurements: accelerometer $\tilde a_k$ and gyroscope $\tilde\omega_k$

44

(Figure: camera and IMU measurement timelines, offset by the time delay $t_d$)

Kalibr Toolbox

45
Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

bull Code httpsgithubcomethz-aslkalibrwikicamera-imu-calibration

Kalibr Toolbox

46

• Solves a non-linear least-squares optimization problem similar to the one seen before, but also optimizes over $T_{BC}$ and $t_d$

• Continuous-time modeling of $X$ using splines

bull Numerical solver Levenberg-Marquardt

$\{X, L, T_{BC}, t_d, b_a, b_g\} = \operatorname*{argmin}_{X, L, T_{BC}, t_d, b_a, b_g} \; \sum_{k=1}^{N} \big\| f(x_{k-1}, u) - x_k \big\|^2_{\Lambda_k} \; + \; \sum_{k=1}^{N} \sum_{i=1}^{M} \big\| \pi(x_k, L_i) - z_{ik} \big\|^2_{\Sigma_{ik}}$

IMU residuals + reprojection residuals (Bundle Adjustment term)

Kalibr Toolbox

47

• Generates a report after optimizing the cost function:

Residuals
  Reprojection error [px]:      0.0976, 0.051
  Gyroscope error [rad/s]:      0.0167, 0.009
  Accelerometer error [m/s^2]:  0.0595, 0.031

Transformation T_ci (imu to cam):
  [[ 0.99995526 -0.00934911 -0.00143776  0.00008436]
   [ 0.00936458  0.99989388  0.01115983  0.00197427]
   [ 0.00133327 -0.0111728   0.99993669 -0.05054946]
   [ 0.          0.          0.          1.        ]]

Time shift (delay d), cam0 to imu0 [s] (t_imu = t_cam + shift):
  0.00270636270255
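As a usage sketch with the report's numbers: transform a homogeneous point from the IMU frame to the camera frame with T_ci, and move a camera timestamp onto the IMU clock with the estimated shift (the point and timestamp values are made up):

```python
import numpy as np

# Values from the Kalibr report above
T_ci = np.array([[ 0.99995526, -0.00934911, -0.00143776,  0.00008436],
                 [ 0.00936458,  0.99989388,  0.01115983,  0.00197427],
                 [ 0.00133327, -0.0111728 ,  0.99993669, -0.05054946],
                 [ 0.        ,  0.        ,  0.        ,  1.        ]])
t_shift = 0.00270636270255            # t_imu = t_cam + shift

p_imu = np.array([0.1, 0.0, 0.3, 1.0])   # homogeneous point in the IMU frame
p_cam = T_ci @ p_imu                      # the same point in the camera frame

t_cam = 1000.0                            # a camera timestamp [s]
t_imu = t_cam + t_shift                   # expressed on the IMU clock
```

Note that the rotation part of the reported transformation is (numerically) orthonormal, as a valid rigid-body transform requires.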

Popular Datasets for VIO VI-SLAM

48

• EuRoC: drone with IMU and stereo camera

• Blackbird: drone flying fast with VR goggles (synthetic images and real dynamics + IMU)

• UZH FPV Drone Racing: drone flying fast with event camera, stereo camera, IMU

• MVSEC: drone, motorcycle, car with event camera, stereo camera, 3D lidar, GPS, IMU

Understanding Check

Are you able to answer the following questions

• Why is it recommended to use an IMU for Visual Odometry?

• Why not just an IMU, without a camera?

• What is the basic idea behind MEMS IMUs?

• What is the drift of a consumer IMU?

• What is the IMU measurement model (formula)?

• What causes the bias in an IMU?

• How do we model the bias?

• How do we integrate the acceleration to get the position (formula)?

• What is the definition of loosely coupled and tightly coupled visual inertial fusion?

• How does non-linear optimization-based visual inertial odometry work? Can you write down the cost function and illustrate its meaning?

49

Page 12: Vision Algorithms for Mobile Robotics Lecture 01 Introduction

Outline

bull What is an IMU and why do we need it

bull IMU model

• Visual Inertial Odometry

• Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

12

IMU Measurement Model

• Measures the angular velocity $\tilde\omega_B(t)$ and the acceleration $\tilde a_B(t)$ in the body frame $B$

Notation:

• The superscript $g$ stands for gyroscope and $a$ for accelerometer

• $R_{BW}$ is the rotation of the World frame $W$ with respect to the Body frame $B$

• The gravity $g$ is expressed in the World frame

• Biases and noise are expressed in the body frame

13

Raw IMU measurements; the true $\omega$ (in the body frame) and the true $a$ (in the world frame) are to be estimated; IMU biases + noise:

$\tilde a_B(t) = R_{BW}(t)\,\big( a_W(t) - g \big) + b_a(t) + n_a(t)$

$\tilde\omega_B(t) = \omega_B(t) + b_g(t) + n_g(t)$

IMU Noise Model

• Additive Gaussian white noise: $n^g(t)$, $n^a(t)$

• Bias: $b^g(t)$, $b^a(t)$

• The gyroscope and accelerometer biases are considered slowly varying "constants". Their temporal fluctuation is modeled by saying that the derivative of the bias is a zero-mean Gaussian noise with standard deviation $\sigma_b$:

$\dot b(t) = \sigma_b\, w(t), \qquad w(t) \sim \mathcal{N}(0, 1)$

• Some facts about IMU biases:
• They change with temperature and with mechanical and atmospheric pressure
• They may change every time the IMU is started
• Good news: they can be estimated (see later)

14

Trawny, Roumeliotis, Indirect Kalman filter for 3D attitude estimation, Technical Report, University of Minnesota, 2005. PDF

More info on the noise model: https://github.com/ethz-asl/kalibr/wiki/IMU-Noise-Model
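The noise model can be simulated directly; the σ values below are made up, and the √dt factors discretize the continuous-time model in the usual way (white noise scales as σ_n/√dt, the bias random walk step as σ_b·√dt):

```python
import numpy as np

def simulate_gyro(omega_true, dt, sigma_n=0.01, sigma_b=0.001, seed=0):
    """Corrupt a true angular-velocity sequence with the noise model:
    omega_meas(t) = omega_true(t) + b(t) + n(t),  with b_dot = sigma_b * w(t)."""
    rng = np.random.default_rng(seed)
    b = 0.0
    meas = []
    for w in omega_true:
        b += sigma_b * np.sqrt(dt) * rng.standard_normal()   # bias random walk
        n = sigma_n / np.sqrt(dt) * rng.standard_normal()    # discretized white noise
        meas.append(w + b + n)
    return np.array(meas)

# 10 s of a stationary gyroscope sampled at 100 Hz
meas = simulate_gyro(np.zeros(1000), dt=0.01)
```

Simulations like this are how the slowly drifting bias (visible as a wandering mean) is distinguished from the fast white-noise component when characterizing a sensor.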

IMU Integration Model

• The IMU Integration Model computes the position, orientation, and velocity of the IMU in the world frame. To do this, we must first compute the acceleration $a(t)$ in the world frame from the measured one $\tilde a(t)$ in the body frame (see Slide 13)

• The position $p_k$ at time $t_k$ can then be predicted from the position $p_{k-1}$ at time $t_{k-1}$ by integrating all the inertial measurements $\tilde a_j, \tilde\omega_j$ within that time interval

bull A similar expression can be obtained to predict the velocity 119907119896 and orientation 119877119882119861 of the IMU in the world frame as functions of both 119886119895 and ω119895

15Trawny Roumeliotis Indirect Kalman filter for 3D attitude estimation Technical Report University of Minnesota 2005 PDF

More info on the noise model httpsgithubcomethz-aslkalibrwikiIMU-Noise-Model

119901119896 = 119901119896minus1 + 119907119896minus1 119905119896 minus 119905119896minus1 +ඵ119905119896minus1

119905119896

119877119882119861(119905) 119886 119905 minus 119887 119905 + 119892)1198891199052

119886(119905) = 119877119882119861(119905) 119886 119905 minus 119887 119905 + 119892

IMU Integration Model

For convenience the IMU Integration Model is normally written as

where

bull 119909 =119901119902119907

represents the IMU state comprising position orientation and velocity

bull 119902 is the IMU orientation 119929119934119913 (usually represented using quaternions)

bull 119906 = 119886119895 ω119895 are the accelerometer and gyroscope measurements in the time interval 119905119896minus1 119905119896

16Trawny Roumeliotis Indirect Kalman filter for 3D attitude estimation Technical Report University of Minnesota 2005 PDF

More info on the noise model httpsgithubcomethz-aslkalibrwikiIMU-Noise-Model

119909119896 = 119891 119909119896minus1 119906

119901119896119902119896119907119896

= 119891

119901119896minus1119902119896minus1119907119896minus1

119906 or more compactly

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

17

Visual Inertial Odometry

Different paradigms exist

bull Loosely coupled

bull Treats VO and IMU as two separate (not coupled) black boxes

bull Each black box estimates pose and velocity from visual (up to a scale) and inertial data (absolute scale)

bull Easy to implement

bull Inaccurate Should not be used

bull Tightly coupled

bull Makes use of the raw sensorsrsquo measurements

bull 2D features

bull IMU readings

bull More accurate

bull More implementation effort

In this lecture we will only see tightly coupled approaches

18

The Loosely Coupled Approach

19

Feature tracking

VOimages

Position (up to a scale) amporientation

IMU Integration

PositionOrientation

Velocity

2D features

FusionRefined Position

OrientationVelocity

IMU measurements

The Tightly Coupled Approach

20

Feature tracking

images

IMU measurements

2D features

FusionRefined Position

OrientationVelocity

Filtering Visual Inertial Formulation

bull System states

21

Tightly coupled

Loosely coupled

Corke Lobo Dias An Introduction to Inertial and Visual Sensing International Journal of Robotics Research (IJRR) 2017 PDF

X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905 1198711 1198712 hellip 119871119896

X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

22

Closed-form Solution (1D case)

bull From the camera we can only get the absolute position 119909 up to a unknown scale factor 119904 thus

119909 = 119904 119909

where 119909 is the relative motion estimated by the camera

bull From the IMU

119909 = 1199090 + 1199070(1199051 minus 1199050) +ඵ1199050

1199051

119886 119905 1198891199052

bull By equating them

119904 119909 = 1199090 + 1199070 1199051 minus 1199050 +ඵ1199050

1199051

119886 119905 1198891199052

bull As shown in [Martinellirsquo14] if we assume to know 1199090 (usually we set it to 0) then for 6DOF both 119956 and 119959120782can be determined in closed form from a single feature observation and 3 views

23Martinelli Closed-form solution of visual-inertial structure from motion International Journal of Computer Vision (IJCV) 2014 PDF

1199050 1199051

1198711

Closed-form Solution (1D case)

24Martinelli Closed-form solution of visual-inertial structure from motion International Journal of Computer Vision (IJCV) 2014 PDF

119904 1199091 = 1199070 1199051 minus 1199050 +ඵ1199050

1199051

119886 119905 1198891199052

119904 1199092 = 1199070 1199052 minus 1199050 +ඵ1199050

1199052

119886 119905 1198891199052

1199091 (1199050minus1199051)1199092 (1199050minus1199052)

1199041199070

=

ඵ1199050

1199051

119886 119905 1198891199052

ඵ1199050

2

119886 119905 1198891199052

1199050 1199051 1199052

1198711

Closed-form Solution (general case)

bull Considers 119873 feature observations and the full 6DOF case

bull Can be used to initialize filters and non-linear optimizers (which always need an initialization point) [3]

bull More complex to derive than the 1D case But it also reaches a linear system of equations that can be solved using the pseudoinverse

25

119935 is the vector of unknownsbull Absolute scale 119904bull Initial velocity 1199070bull 3D Point distances (wrt the first camera) bull Direction of the gravity vector bull Biases

119912 and 119930 contain 2D feature coordinates acceleration and angular velocity measurements

[1] Martinelli Vision and IMU data fusion Closed-form solutions for attitude speed absolute scale and bias determination IEEE Transactions on Robotics (T-RO) 2012 PDF[2] Martinelli Closed-form solution of visual-inertial structure from motion International Journal of Computer Vision (IJCV) 2014 PDF[3] Kaiser Martinelli Fontana Scaramuzza Simultaneous state initialization and gyroscope bias calibration in visual inertial aided navigation IEEE Robotics and Automation Letters (R-AL) 2017 PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

26

Non-linear Optimization Methods

VIO is solved as non-linear Least Square optimization problem over

27

[1] Jung Taylor Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results International Conference on Computer Vision and Pattern Recognition (CVPR) 2001 PDF[2] Sterlow Singh Motion estimation from image and inertial measurements International Journal of Robotics Research (IJRR) 2004 PDF

X L 119887119886 119887119892 = 119886119903119892119898119894119899X L 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

Non-linear Optimization Methods

VIO is solved as non-linear Least Square optimization problem over

where

bull 119883 = 1199091 hellip 119909119873 set of robot state estimates 119909119896 (position velocity orientation) at frame times 119896

bull 119871 = 1198711 hellip 119871119872 3D landmarks

bull 119891 119909119896minus1 119906 pose prediction update from integration of IMU measurements 119906 = 119886119895 ω119895

bull 120587(119909119896 119897119894) measurement update from projection of landmark 119871119894 onto camera frame 119868119896bull 119911119894119896 observed features

bull 120556119896 state covariance from the IMU integration 119891 119909119896minus1 119906

bull 120564119894119896 is the covariance from the noisy 2D feature measurements

28

X L 119887119886 119887119892 = 119886119903119892119898119894119899X L 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

[1] Jung Taylor Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results International Conference on Computer Vision and Pattern Recognition (CVPR) 2001 PDF[2] Sterlow Singh Motion estimation from image and inertial measurements International Journal of Robotics Research (IJRR) 2004 PDF

Case Study 1 OKVIS

Because the complexity of the optimization is cubic with respect to the number of cameras poses and features real-time operation becomes infeasible as the trajectory and the map grow over time OKVIS proposed to only optimize the current pose and a window of past keyframes

29Leutenegger Lynen Bosse Siegwart Furgale Keyframe-based visualndashinertial odometry using nonlinear optimization International Journal

of Robotics Research (IJRR) 2015 PDF

Case Study 2 SVO+GTSAM

Solves the same optimization problem as OKVIS but

bull Keeps all the frames (from the start to the end of the trajectory)

bull To make the optimization efficient

bull Marginalizes 3D landmarks (rather than triangulating landmarks only their visual bearing measurements are kept and the visual constraintare enforced by the Epipolar Line Distance (Lecture 08))

bull pre-integrates the IMU data between keyframes (see later)

bull Optimization solved using Factor Graphs via GTSAM

bull Very fast because it only optimizes the poses that are affected by a new observation

30Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry

IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

SVO+GTSAM

31Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry

IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

SVO+GTSAM

32Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry

IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

Problem with IMU integration

bull The integration of IMU measurements 119891 119909119896minus1 119906 from 119896 minus 1 to 119896 is related to the state estimation at time 119896 minus 1

bull During optimization every time the linearization point at 119896 minus 1 changes the integration between 119896 minus 1and 119896 must be re-evaluated thus slowing down the optimization

bull Idea Preintegration

bull defines relative motion increments expressed in body frame which are independent on the global position orientation and velocity at 119896 [1]

bull [2] uses this theory by leveraging the manifold structure of the rotation group SO(3)

33

[1] Lupton Sukkarieh Visual-inertial-aided navigation for high-dynamic motion in built environments without initial conditions IEEE Transactions on Robotics (T-RO) 2012 PDF[2] Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

X L 119887119886 119887119892 = 119886119903119892119898119894119899X L 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

IMU Pre-Integration

34

Given the raw measurements \tilde{\omega}, \tilde{a}, the preintegrated deltas \Delta\tilde{R}, \Delta\tilde{v}, \Delta\tilde{p}, and the states R_k, p_k, v_k:

Standard: evaluate the error in the global frame (the integration must be repeated whenever the previous state changes):

e_R = \hat{R}(\tilde{\omega}, R_{k-1})^T R_k
e_v = \hat{v}(\tilde{\omega}, \tilde{a}, v_{k-1}) - v_k
e_p = \hat{p}(\tilde{\omega}, \tilde{a}, p_{k-1}) - p_k

Preintegration: evaluate relative errors (preintegration of the IMU deltas is possible, with no initial condition required):

e_R = \Delta\tilde{R}^T \Delta R
e_v = \Delta\tilde{v} - \Delta v
e_p = \Delta\tilde{p} - \Delta p

(hatted and tilde quantities: predictions from the IMU measurements; the remaining quantities: current estimates)
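A numerical sketch of how these relative increments can be accumulated (simple Euler integration with numpy; variable names are illustrative, and this omits the bias Jacobians and covariance propagation of [2]):

```python
import numpy as np

def skew(w):
    """Skew-symmetric matrix of a 3-vector."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def so3_exp(phi):
    """Exponential map of SO(3) (Rodrigues' formula)."""
    theta = np.linalg.norm(phi)
    if theta < 1e-9:
        return np.eye(3) + skew(phi)
    A = skew(phi / theta)
    return np.eye(3) + np.sin(theta) * A + (1.0 - np.cos(theta)) * (A @ A)

def preintegrate(acc, gyr, dt, b_a, b_g):
    """Accumulate relative increments dR, dv, dp in the body frame at time
    k-1. Gravity and the global state at k-1 enter only later, when the
    residuals are evaluated, so these deltas need no initial condition."""
    dR, dv, dp = np.eye(3), np.zeros(3), np.zeros(3)
    for a, w in zip(acc, gyr):
        a_b = a - b_a                                 # bias-corrected accel
        dp = dp + dv * dt + 0.5 * (dR @ a_b) * dt**2
        dv = dv + (dR @ a_b) * dt
        dR = dR @ so3_exp((w - b_g) * dt)             # right-multiply rotation
    return dR, dv, dp

# constant 1 m/s^2 along body x, no rotation: 10 samples at 10 Hz (1 s)
acc = [np.array([1.0, 0.0, 0.0])] * 10
gyr = [np.zeros(3)] * 10
dR, dv, dp = preintegrate(acc, gyr, 0.1, np.zeros(3), np.zeros(3))
```

Because the increments are relative, the same (dR, dv, dp) can be reused unchanged when the optimizer moves the state at k-1.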

GTSAM

• GTSAM [1] is a library that implements sensor fusion based on factor graphs and Bayes trees

• Factor graphs are a tool borrowed from machine learning [2] that:
  • naturally capture the spatial dependencies among different sensor measurements
  • can represent many robotics problems, like pose graphs and SLAM

35

[1] Dellaert and Kaess, Factor Graphs for Robot Perception, Foundations and Trends in Robotics, 2017. PDF. Code
[2] Kschischang, Frey, Loeliger, Factor Graphs and the Sum-Product Algorithm, Transactions on Information Theory, 2001. PDF
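To illustrate what a factor graph encodes, here is a toy 1-D linear pose graph solved as ordinary least squares (this is not GTSAM's API; GTSAM handles the general non-linear case and incremental inference):

```python
import numpy as np

# states x0, x1, x2; each row of A is one factor's (linearized) residual
A = np.array([
    [1.0,  0.0, 0.0],    # prior factor:     x0      = 0
    [-1.0, 1.0, 0.0],    # odometry factor:  x1 - x0 = 1
    [0.0, -1.0, 1.0],    # odometry factor:  x2 - x1 = 1
    [-1.0, 0.0, 1.0],    # relative factor:  x2 - x0 = 2.1 (inconsistent)
])
z = np.array([0.0, 1.0, 1.0, 2.1])

# the least-squares estimate spreads the 0.1 inconsistency over the factors
x, *_ = np.linalg.lstsq(A, z, rcond=None)
```

Each factor only touches the variables in its row, which is the sparsity that Bayes trees exploit for incremental updates.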

GTSAM

• GTSAM uses factor graphs and Bayes trees to implement incremental inference

• A Bayes tree is a data structure (a directed tree) that visually represents the connectivity among the past robot states and measurements

• Incremental inference means that when new robot states or measurements are added to the graph, only a local part of the tree is updated

36

[1] Kaess, Ranganathan, Dellaert, "iSAM: Incremental Smoothing and Mapping", IEEE Transactions on Robotics (T-RO), 2008. PDF

Outline

• What is an IMU and why do we need it?
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and Synchronization

37

Non-linear optimization vs Filtering

38

Non-linear optimization methods:
• Optimize a window of multiple states (or all the states) using non-linear Least-Squares optimization
• Multiple iterations (it re-linearizes at each iteration)
• Achieves the highest accuracy
• Slower (but fast with GTSAM)

Filtering methods:
• Solve the same problem by running only one iteration of the optimization function (e.g., using an Extended Kalman Filter (EKF))
• 1 linearization iteration only
• Sensitive to the linearization point
• Fastest
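A minimal sketch of the filtering column: one linear Kalman-filter cycle with toy 1-D matrices (EKF-based VIO additionally re-linearizes the IMU and measurement models around the current estimate; all values here are illustrative):

```python
import numpy as np

def kf_step(x, P, F, Q, H, R, z):
    """One Kalman-filter cycle: in VIO, the IMU integration plays the role
    of F (prediction) and the visual constraints that of H, z (update)."""
    x_pred = F @ x                        # predict the state ...
    P_pred = F @ P @ F.T + Q              # ... and its covariance
    S = H @ P_pred @ H.T + R              # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)   # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

# 1-D toy: prior x = 0 (variance 1), static model, measurement z = 2 (variance 1)
x_new, P_new = kf_step(np.array([0.0]), np.eye(1), np.eye(1),
                       np.zeros((1, 1)), np.eye(1), np.eye(1), np.array([2.0]))
```

Running this single update is the "one iteration" of the table: the gain K blends prediction and measurement once, with no re-linearization.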

Case Study 1 ROVIO

• EKF state: X = \{p(t), q(t), v(t), b_a(t), b_g(t), L_1, L_2, \dots, L_K\}

• Basic idea:

1. Prediction step: predicts the next position, velocity, orientation, and features using the IMU integration model

2. Measurement update: refines the state by leveraging the visual constraint (ROVIO minimizes the photometric error between corresponding points; an alternative would be the reprojection error)

39

Bloesch, Burri, Omari, Hutter, Siegwart, Iterated extended Kalman filter based visual-inertial odometry using direct photometric feedback, International Journal of Robotics Research (IJRR), 2017. PDF. Code

ROVIO Problems

• The complexity of the EKF grows quadratically in the number of estimated landmarks; thus, a maximum of about 20 landmarks are tracked to allow real-time operation

• Only updates the most recent state

40

Case Study 2 MSCKF

• MSCKF (Multi-State Constraint Kalman Filter) updates multiple past poses p_{C1}, q_{C1}, …, p_{CN}, q_{CN} in addition to the current state p(t), q(t), v(t). State vector (shown below):

• Prediction step: same as ROVIO

• Measurement update. Differently from ROVIO:
  • landmark positions are not added to the state vector; thus, it can run very fast independently of the number of features
  • the visual constraint is obtained from the Epipolar Line Distance (Lecture 08)

• Used in spacecraft landing (NASA/JPL Moon and Mars landing), DJI drones, Google ARCore

• Released open source within the OpenVins project: https://github.com/rpng/open_vins

41

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li, Mourikis, High-precision, consistent EKF-based visual-inertial odometry, International Journal of Robotics Research (IJRR), 2013. PDF

X = \{p(t), q(t), v(t), b_a(t), b_g(t), p_{C1}, q_{C1}, \dots, p_{CN}, q_{CN}\}

MSCKF running in Google ARCore (former Google Tango)

42

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li, Mourikis, High-precision, consistent EKF-based visual-inertial odometry, International Journal of Robotics Research (IJRR), 2013. PDF

Outline

• What is an IMU and why do we need it?
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and Synchronization

43

Camera-IMU Calibration

• Goal: estimate the rigid-body transformation T_{BC} and the time offset t_d between the camera and the IMU, caused by communication delays and the internal sensor delays (introduced by filters and logic)

• Assumptions: camera and IMU rigidly attached; camera intrinsically calibrated

• Data:
  • image points from a calibration pattern (checkerboard or QR board)
  • IMU measurements: accelerometer a_k and gyroscope ω_k

44

[Figure: image and IMU measurement timestamps on a common time axis t, offset by the delay t_d]
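As a hedged sketch of how the time offset can be initialized in practice: cross-correlate the gyroscope rotation-rate magnitude with the rotation rate inferred from the camera and take the lag of the correlation peak (the signals below are synthetic, and a calibration tool such as Kalibr then refines t_d jointly in its optimization):

```python
import numpy as np

rate = 200.0                              # common resampling rate [Hz]
t = np.arange(0.0, 5.0, 1.0 / rate)
true_shift = 0.025                        # hypothetical 25 ms camera delay

# synthetic rotation-rate magnitudes: the camera signal lags the IMU by t_d
gyro_mag = np.abs(np.sin(2.0 * np.pi * 1.3 * t))
cam_mag = np.abs(np.sin(2.0 * np.pi * 1.3 * (t - true_shift)))

# peak of the cross-correlation gives the lag of the camera w.r.t. the IMU
corr = np.correlate(cam_mag - cam_mag.mean(),
                    gyro_mag - gyro_mag.mean(), mode="full")
lag = np.argmax(corr) - (len(t) - 1)      # in samples; positive = camera lags
t_d = lag / rate                          # initial guess for the time offset
```

The estimate is quantized to the resampling period, which is why a continuous-time refinement of t_d is still needed afterwards.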

Kalibr Toolbox

45

Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF. Code

• Code: https://github.com/ethz-asl/kalibr/wiki/camera-imu-calibration

Kalibr Toolbox

46

Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF. Code

• Solves a non-linear Least-Squares optimization problem similar to that seen before, but also optimizes over T_{BC} and t_d

• Continuous-time modelling using splines for X

• Numerical solver: Levenberg-Marquardt

\{X, L, T_{BC}, t_d, b_a, b_g\} = \operatorname{argmin}_{X, L, T_{BC}, t_d, b_a, b_g} \sum_{k=1}^{N} \lVert f(x_{k-1}, u) - x_k \rVert^2_{\Lambda_k} + \sum_{k=1}^{N} \sum_{i=1}^{M} \lVert \pi(x_k, L_i) - z_{ik} \rVert^2_{\Sigma_{ik}}

(first term: IMU residuals; second term: reprojection residuals, i.e., the Bundle Adjustment term)

Kalibr Toolbox

47

Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF. Code

• Generates a report after optimizing the cost function:

Residuals
  Reprojection error [px]:      0.0976, 0.051
  Gyroscope error [rad/s]:      0.0167, 0.009
  Accelerometer error [m/s^2]:  0.0595, 0.031

Transformation T_ci (imu to cam):
[[ 0.99995526 -0.00934911 -0.00143776  0.00008436]
 [ 0.00936458  0.99989388  0.01115983  0.00197427]
 [ 0.00133327 -0.0111728   0.99993669 -0.05054946]
 [ 0.          0.          0.          1.        ]]

Time shift (delay d), cam0 to imu0 [s] (t_imu = t_cam + shift):
0.00270636270255

Popular Datasets for VIO / VI-SLAM

48

• EuRoC: drone with IMU and stereo camera
• Blackbird: drone flying fast with VR goggles (synthetic images and real dynamics + IMU)
• UZH FPV Drone Racing: drone flying fast with event camera, stereo camera, IMU
• MVSEC: drone, motorcycle, car with event camera, stereo camera, 3D lidar, GPS, IMU

Understanding Check

Are you able to answer the following questions?

• Why is it recommended to use an IMU for Visual Odometry?
• Why not just an IMU without a camera?
• What is the basic idea behind MEMS IMUs?
• What is the drift of a consumer IMU?
• What is the IMU measurement model (formula)?
• What causes the bias in an IMU?
• How do we model the bias?
• How do we integrate the acceleration to get the position (formula)?
• What is the definition of loosely coupled and tightly coupled visual inertial fusion?
• How does non-linear optimization-based visual inertial odometry work? Can you write down the cost function and illustrate its meaning?

49

Page 13: Vision Algorithms for Mobile Robotics Lecture 01 Introduction

IMU Measurement Model

• Measures the angular velocity ω_B(t) and the acceleration a_B(t) in the body frame B

Notation:
• The superscript g stands for gyroscope and a for accelerometer
• R_{BW} is the rotation of the World frame W with respect to the Body frame B
• The gravity g is expressed in the World frame
• Biases and noise are expressed in the body frame

The raw IMU measurements (left-hand sides) are corrupted by biases and noise; the true ω (in the body frame) and the true a (in the world frame) are the quantities to estimate:

\tilde{a}_B(t) = R_{BW}(t) \left( a_W(t) - g \right) + b_a(t) + n_a(t)
\tilde{\omega}_B(t) = \omega_B(t) + b_g(t) + n_g(t)

13
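A small numerical sketch of this measurement model (numpy; the function name and noise levels are illustrative):

```python
import numpy as np

GRAVITY_W = np.array([0.0, 0.0, -9.81])   # gravity in the world frame

def simulate_imu(R_BW, a_W, omega_B, b_a, b_g, sigma_a, sigma_g, rng):
    """Apply the model: a_meas = R_BW (a_W - g) + b_a + n_a,
                        w_meas = omega_B + b_g + n_g."""
    n_a = sigma_a * rng.standard_normal(3)   # accelerometer white noise
    n_g = sigma_g * rng.standard_normal(3)   # gyroscope white noise
    a_meas = R_BW @ (a_W - GRAVITY_W) + b_a + n_a
    w_meas = omega_B + b_g + n_g
    return a_meas, w_meas

# a body at rest with identity orientation "measures" -g, i.e. +9.81 along z
a_meas, w_meas = simulate_imu(np.eye(3), np.zeros(3), np.zeros(3),
                              np.zeros(3), np.zeros(3), 0.0, 0.0,
                              np.random.default_rng(0))
```

Note the accelerometer senses specific force, which is why a resting IMU reads +9.81 m/s^2 upward rather than zero.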

IMU Noise Model

• Additive Gaussian white noise: n_g(t), n_a(t)

• Bias: b_g(t), b_a(t)
  • The gyroscope and accelerometer biases are considered slowly varying "constants". Their temporal fluctuation is modeled by saying that the derivative of the bias is a zero-mean Gaussian noise with standard deviation σ_b:

    \dot{b}(t) = \sigma_b \, w(t), \quad w(t) \sim N(0, 1)

• Some facts about IMU biases:
  • They change with temperature and mechanical and atmospheric pressure
  • They may change every time the IMU is started
  • Good news: they can be estimated (see later)

14

Trawny, Roumeliotis, Indirect Kalman filter for 3D attitude estimation, Technical Report, University of Minnesota, 2005. PDF
More info on the noise model: https://github.com/ethz-asl/kalibr/wiki/IMU-Noise-Model
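The bias random walk above can be simulated in discrete time as follows (a sketch; the values of σ_b and dt are made up):

```python
import numpy as np

def simulate_bias(n_steps, dt, sigma_b, seed=0):
    """Discretized bias random walk: b[k] = b[k-1] + sigma_b * sqrt(dt) * w[k],
    with w[k] ~ N(0, 1), the discrete counterpart of b_dot = sigma_b * w(t)."""
    rng = np.random.default_rng(seed)
    steps = sigma_b * np.sqrt(dt) * rng.standard_normal(n_steps)
    return np.cumsum(steps)

bias = simulate_bias(n_steps=1000, dt=0.005, sigma_b=1e-3)  # 200 Hz for 5 s
```

The sqrt(dt) factor is what makes the discrete walk's variance grow linearly with time, matching the continuous model.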

IMU Integration Model

• The IMU Integration Model computes the position, orientation, and velocity of the IMU in the world frame. To do this, we must first compute the acceleration a(t) in the world frame from the measured one \tilde{a}(t) in the body frame (see Slide 13)

• The position p_k at time t_k can then be predicted from the position p_{k-1} at time t_{k-1} by integrating all the inertial measurements \tilde{a}_j, \tilde{\omega}_j within that time interval

• A similar expression can be obtained to predict the velocity v_k and the orientation R_{WB} of the IMU in the world frame as functions of both \tilde{a}_j and \tilde{\omega}_j

15

Trawny, Roumeliotis, Indirect Kalman filter for 3D attitude estimation, Technical Report, University of Minnesota, 2005. PDF

More info on the noise model: https://github.com/ethz-asl/kalibr/wiki/IMU-Noise-Model

p_k = p_{k-1} + v_{k-1} (t_k - t_{k-1}) + \iint_{t_{k-1}}^{t_k} \left( R_{WB}(t) \left( \tilde{a}(t) - b(t) \right) + g \right) dt^2

where the world-frame acceleration is recovered as a(t) = R_{WB}(t) \left( \tilde{a}(t) - b(t) \right) + g
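One discrete step of this integration model, as an illustrative sketch (simple Euler integration; real implementations use quaternions and higher-order schemes):

```python
import numpy as np

def skew(w):
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def so3_exp(phi):
    """Exponential map of SO(3) (Rodrigues' formula)."""
    theta = np.linalg.norm(phi)
    if theta < 1e-9:
        return np.eye(3) + skew(phi)
    A = skew(phi / theta)
    return np.eye(3) + np.sin(theta) * A + (1.0 - np.cos(theta)) * (A @ A)

def integrate_step(p, v, R_WB, a_meas, w_meas, b_a, b_g, dt,
                   g=np.array([0.0, 0.0, -9.81])):
    """Recover the world-frame acceleration a = R_WB (a_meas - b_a) + g,
    then integrate velocity, position, and orientation over one step."""
    a_W = R_WB @ (a_meas - b_a) + g
    p_new = p + v * dt + 0.5 * a_W * dt**2
    v_new = v + a_W * dt
    R_new = R_WB @ so3_exp((w_meas - b_g) * dt)
    return p_new, v_new, R_new

# a stationary, level IMU measures -g (i.e. +9.81 on body z) and should not move
p, v, R = integrate_step(np.zeros(3), np.zeros(3), np.eye(3),
                         np.array([0.0, 0.0, 9.81]), np.zeros(3),
                         np.zeros(3), np.zeros(3), dt=0.01)
```

Any residual bias error in b_a or b_g is integrated once or twice here, which is why pure IMU dead reckoning drifts quickly.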

IMU Integration Model

For convenience the IMU Integration Model is normally written as

where

• x = (p, q, v) represents the IMU state, comprising position, orientation, and velocity

• q is the IMU orientation R_{WB} (usually represented using quaternions)

• u = \{\tilde{a}_j, \tilde{\omega}_j\} are the accelerometer and gyroscope measurements in the time interval [t_{k-1}, t_k]

16

Trawny, Roumeliotis, Indirect Kalman filter for 3D attitude estimation, Technical Report, University of Minnesota, 2005. PDF

More info on the noise model: https://github.com/ethz-asl/kalibr/wiki/IMU-Noise-Model

\begin{pmatrix} p_k \\ q_k \\ v_k \end{pmatrix} = f \left( \begin{pmatrix} p_{k-1} \\ q_{k-1} \\ v_{k-1} \end{pmatrix}, u \right), \quad \text{or more compactly:} \quad x_k = f(x_{k-1}, u)

Outline

• What is an IMU and why do we need it?
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and Synchronization

17

Visual Inertial Odometry

Different paradigms exist

• Loosely coupled:
  • Treats VO and IMU as two separate (not coupled) black boxes
  • Each black box estimates pose and velocity, from visual data (up to a scale) and from inertial data (absolute scale)
  • Easy to implement
  • Inaccurate; should not be used

• Tightly coupled:
  • Makes use of the raw sensors' measurements:
    • 2D features
    • IMU readings
  • More accurate
  • More implementation effort

In this lecture we will only see tightly coupled approaches

18

The Loosely Coupled Approach

19

[Block diagram: images → Feature tracking → 2D features → VO → position (up to a scale) and orientation → Fusion; IMU measurements → IMU Integration → position, orientation, velocity → Fusion; Fusion → refined position, orientation, velocity]

The Tightly Coupled Approach

20

[Block diagram: images → Feature tracking → 2D features → Fusion; IMU measurements → Fusion; Fusion → refined position, orientation, velocity]

Filtering Visual Inertial Formulation

• System states:

Tightly coupled: X = \{p(t), q(t), v(t), b_a(t), b_g(t), L_1, L_2, \dots, L_K\}

Loosely coupled: X = \{p(t), q(t), v(t), b_a(t), b_g(t)\}

21

Corke, Lobo, Dias, An Introduction to Inertial and Visual Sensing, International Journal of Robotics Research (IJRR), 2017. PDF

Outline

• What is an IMU and why do we need it?
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and Synchronization

22

Closed-form Solution (1D case)

• From the camera we can only get the absolute position x up to an unknown scale factor s, thus:

x = s \tilde{x}

where \tilde{x} is the relative motion estimated by the camera

• From the IMU:

x = x_0 + v_0 (t_1 - t_0) + \iint_{t_0}^{t_1} a(t) \, dt^2

• By equating them:

s \tilde{x} = x_0 + v_0 (t_1 - t_0) + \iint_{t_0}^{t_1} a(t) \, dt^2

• As shown in [Martinelli'14], if we assume to know x_0 (usually we set it to 0), then for 6DOF both s and v_0 can be determined in closed form from a single feature observation and 3 views

23

Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF

[Figure: camera at times t_0 and t_1 observing landmark L_1]

Closed-form Solution (1D case)

24

Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF

s \tilde{x}_1 = v_0 (t_1 - t_0) + \iint_{t_0}^{t_1} a(t) \, dt^2

s \tilde{x}_2 = v_0 (t_2 - t_0) + \iint_{t_0}^{t_2} a(t) \, dt^2

In matrix form:

\begin{pmatrix} \tilde{x}_1 & (t_0 - t_1) \\ \tilde{x}_2 & (t_0 - t_2) \end{pmatrix} \begin{pmatrix} s \\ v_0 \end{pmatrix} = \begin{pmatrix} \iint_{t_0}^{t_1} a(t) \, dt^2 \\ \iint_{t_0}^{t_2} a(t) \, dt^2 \end{pmatrix}

[Figure: camera at times t_0, t_1, t_2 observing landmark L_1]
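Plugging toy numbers into this linear system shows that s and v_0 are indeed recovered (all values hypothetical: constant acceleration, x_0 = 0):

```python
import numpy as np

s_true, v0_true, a = 2.0, 0.5, 1.0        # ground truth (made-up values)
t0, t1, t2 = 0.0, 1.0, 2.0

def double_int(t):                        # double integral of constant a from t0 = 0
    return 0.5 * a * t**2

def x_metric(t):                          # true metric position, x0 = 0
    return v0_true * t + double_int(t)

# up-to-scale positions as a monocular camera would report them
x1_t, x2_t = x_metric(t1) / s_true, x_metric(t2) / s_true

A = np.array([[x1_t, t0 - t1],
              [x2_t, t0 - t2]])
b = np.array([double_int(t1), double_int(t2)])
s_est, v0_est = np.linalg.solve(A, b)     # recovers scale and initial velocity
```

With noisy measurements and more views the square system becomes overdetermined and is solved with the pseudoinverse instead.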

Closed-form Solution (general case)

• Considers N feature observations and the full 6DOF case

• Can be used to initialize filters and non-linear optimizers (which always need an initialization point) [3]

• More complex to derive than the 1D case, but it also reaches a linear system of equations that can be solved using the pseudoinverse

25

X is the vector of unknowns:
• Absolute scale s
• Initial velocity v_0
• 3D point distances (w.r.t. the first camera)
• Direction of the gravity vector
• Biases

A and S contain 2D feature coordinates, acceleration, and angular velocity measurements

[1] Martinelli, Vision and IMU data fusion: Closed-form solutions for attitude, speed, absolute scale, and bias determination, IEEE Transactions on Robotics (T-RO), 2012. PDF
[2] Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF
[3] Kaiser, Martinelli, Fontana, Scaramuzza, Simultaneous state initialization and gyroscope bias calibration in visual inertial aided navigation, IEEE Robotics and Automation Letters (RA-L), 2017. PDF

Outline

• What is an IMU and why do we need it?
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and Synchronization

26

Non-linear Optimization Methods

VIO is solved as a non-linear Least-Squares optimization problem over X, L, b_a, b_g:

27

[1] Jung, Taylor, Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results, International Conference on Computer Vision and Pattern Recognition (CVPR), 2001. PDF
[2] Strelow, Singh, Motion estimation from image and inertial measurements, International Journal of Robotics Research (IJRR), 2004. PDF

\{X, L, b_a, b_g\} = \operatorname{argmin}_{X, L, b_a, b_g} \sum_{k=1}^{N} \lVert f(x_{k-1}, u) - x_k \rVert^2_{\Lambda_k} + \sum_{k=1}^{N} \sum_{i=1}^{M} \lVert \pi(x_k, L_i) - z_{ik} \rVert^2_{\Sigma_{ik}}

(first term: IMU residuals; second term: reprojection residuals, i.e., the Bundle Adjustment term)

Non-linear Optimization Methods

VIO is solved as a non-linear Least-Squares optimization problem over X, L, b_a, b_g:

where

• X = \{x_1, \dots, x_N\}: set of robot state estimates x_k (position, velocity, orientation) at the frame times k

• L = \{L_1, \dots, L_M\}: 3D landmarks

• f(x_{k-1}, u): pose prediction from the integration of the IMU measurements u = \{\tilde{a}_j, \tilde{\omega}_j\}

• \pi(x_k, L_i): measurement prediction from the projection of landmark L_i onto camera frame I_k

• z_{ik}: observed features

• \Lambda_k: state covariance from the IMU integration f(x_{k-1}, u)

• \Sigma_{ik}: covariance of the noisy 2D feature measurements

28

\{X, L, b_a, b_g\} = \operatorname{argmin}_{X, L, b_a, b_g} \sum_{k=1}^{N} \lVert f(x_{k-1}, u) - x_k \rVert^2_{\Lambda_k} + \sum_{k=1}^{N} \sum_{i=1}^{M} \lVert \pi(x_k, L_i) - z_{ik} \rVert^2_{\Sigma_{ik}}

(first term: IMU residuals; second term: reprojection residuals, i.e., the Bundle Adjustment term)

[1] Jung, Taylor, Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results, International Conference on Computer Vision and Pattern Recognition (CVPR), 2001. PDF
[2] Strelow, Singh, Motion estimation from image and inertial measurements, International Journal of Robotics Research (IJRR), 2004. PDF
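A toy 1-D instance of this cost, minimized with Gauss-Newton iterations that re-linearize at each step (the projection stand-in and all numeric values are made up):

```python
import numpy as np

x_pred = 2.0            # f(x_{k-1}, u): IMU-predicted state (toy value)
z = 0.52                # observed feature value (toy value)
sigma_imu, sigma_cam = 0.1, 0.05

def proj(x):            # stand-in for the projection pi(x, L)
    return 1.0 / x

def residuals(x):       # whitened IMU residual and reprojection residual
    return np.array([(x - x_pred) / sigma_imu,
                     (proj(x) - z) / sigma_cam])

def jacobian(x):
    return np.array([[1.0 / sigma_imu],
                     [(-1.0 / x**2) / sigma_cam]])

x = x_pred              # initialize at the IMU prediction
for _ in range(20):     # Gauss-Newton: re-linearize at every iteration
    r, J = residuals(x), jacobian(x)
    dx, *_ = np.linalg.lstsq(J, -r, rcond=None)
    x = x + dx[0]
```

The converged x trades off the slightly inconsistent IMU and visual terms according to their covariances, exactly as the full cost does over many states and landmarks.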

Case Study 1 OKVIS

Because the complexity of the optimization is cubic in the number of camera poses and features, real-time operation becomes infeasible as the trajectory and the map grow over time. OKVIS proposed to optimize only the current pose and a window of past keyframes.

29

Leutenegger, Lynen, Bosse, Siegwart, Furgale, Keyframe-based visual-inertial odometry using nonlinear optimization, International Journal of Robotics Research (IJRR), 2015. PDF

Case Study 2 SVO+GTSAM

Solves the same optimization problem as OKVIS but

• Keeps all the frames (from the start to the end of the trajectory)

• To make the optimization efficient, it:
  • marginalizes 3D landmarks (rather than triangulating landmarks, only their visual bearing measurements are kept, and the visual constraints are enforced by the Epipolar Line Distance (Lecture 08))
  • pre-integrates the IMU data between keyframes (see later)

• Optimization solved using Factor Graphs via GTSAM

• Very fast because it only optimizes the poses that are affected by a new observation

30

Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb 2017. Best Paper Award 2018.

SVO+GTSAM

31

Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb 2017. Best Paper Award 2018.

SVO+GTSAM

32

Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb 2017. Best Paper Award 2018.


Page 14: Vision Algorithms for Mobile Robotics Lecture 01 Introduction

IMU Noise Model

bull Additive Gaussian white noise

bull Bias

bull The gyroscope and accelerometer biases are considered slowly varying ldquoconstantsrdquo Their temporal fluctuation is modeled by saying that the derivative of the bias is a zero-mean Gaussian noise with standard deviation 120590119887

bull Some facts about IMU biasesbull They change with temperature and mechanical and atmospheric pressurebull They may change every time the IMU is startedbull Good news they can be estimated (see later)

14

g at tn n

g at tb b

( ) bt tb w 119856 t ~119821(o 1)

Trawny Roumeliotis Indirect Kalman filter for 3D attitude estimation Technical Report University of Minnesota 2005 PDFMore info on the noise model httpsgithubcomethz-aslkalibrwikiIMU-Noise-Model

IMU Integration Model

bull The IMU Integration Model computes the position orientation and velocity of the IMU in the world frame To do this we must first compute the acceleration 119886 119905 in the world frame from the measured one 119886(119905) in the body frame (see Slide 13)

bull The position 119901119896 at time 119905119896 can then be predicted from the position 119901119896minus1 at time 119905119896minus1 by integrating all the inertial measurements 119886119895 ω119895 within that time interval

bull A similar expression can be obtained to predict the velocity 119907119896 and orientation 119877119882119861 of the IMU in the world frame as functions of both 119886119895 and ω119895

15Trawny Roumeliotis Indirect Kalman filter for 3D attitude estimation Technical Report University of Minnesota 2005 PDF

More info on the noise model httpsgithubcomethz-aslkalibrwikiIMU-Noise-Model

119901119896 = 119901119896minus1 + 119907119896minus1 119905119896 minus 119905119896minus1 +ඵ119905119896minus1

119905119896

119877119882119861(119905) 119886 119905 minus 119887 119905 + 119892)1198891199052

119886(119905) = 119877119882119861(119905) 119886 119905 minus 119887 119905 + 119892

IMU Integration Model

For convenience the IMU Integration Model is normally written as

where

• x = (p, q, v) represents the IMU state, comprising position, orientation, and velocity
• q is the IMU orientation R_WB (usually represented using quaternions)
• u = {a_j, ω_j} are the accelerometer and gyroscope measurements in the time interval [t_{k-1}, t_k]

Trawny, Roumeliotis, Indirect Kalman filter for 3D attitude estimation, Technical Report, University of Minnesota, 2005. PDF

More info on the noise model: https://github.com/ethz-asl/kalibr/wiki/IMU-Noise-Model

(p_k, q_k, v_k) = f( (p_{k-1}, q_{k-1}, v_{k-1}), u ),  or more compactly:  x_k = f(x_{k-1}, u)
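One step of this model can be sketched in a few lines. This is a plain Euler-integration sketch with illustrative function names, not any particular library's implementation; for simplicity it uses a rotation matrix instead of a quaternion and separate accelerometer/gyroscope biases b_a, b_g:

```python
import numpy as np

def so3_exp(w):
    """Rotation matrix from a rotation vector w (Rodrigues' formula)."""
    th = np.linalg.norm(w)
    if th < 1e-12:
        return np.eye(3)
    k = w / th
    K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(th) * K + (1 - np.cos(th)) * (K @ K)

def integrate_imu(p, v, R, gyro, accel, dt, ba, bg, g=np.array([0.0, 0.0, -9.81])):
    """One Euler step of x_k = f(x_{k-1}, u) with u = (gyro, accel)."""
    a_world = R @ (accel - ba) + g            # bias-corrected accel rotated into the world frame
    p = p + v * dt + 0.5 * a_world * dt ** 2  # position update
    v = v + a_world * dt                      # velocity update
    R = R @ so3_exp((gyro - bg) * dt)         # orientation update
    return p, v, R
```

For a stationary, level IMU the accelerometer measures the specific force (0, 0, 9.81), so repeated integration leaves position, velocity, and orientation unchanged.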

Outline

• What is an IMU and why do we need it
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and Synchronization

Visual Inertial Odometry

Different paradigms exist:

• Loosely coupled
  • Treats VO and IMU as two separate (not coupled) black boxes
  • Each black box estimates pose and velocity, from visual (up to a scale) and inertial data (absolute scale)
  • Easy to implement
  • Inaccurate; should not be used
• Tightly coupled
  • Makes use of the raw sensors' measurements:
    • 2D features
    • IMU readings
  • More accurate
  • More implementation effort

In this lecture we will only see tightly coupled approaches.

The Loosely Coupled Approach

[Block diagram: images → Feature tracking → 2D features → VO → position (up to a scale) & orientation; IMU measurements → IMU Integration → position, orientation, velocity; both branches → Fusion → refined position, orientation, velocity]

The Tightly Coupled Approach

[Block diagram: images → Feature tracking → 2D features; 2D features + IMU measurements → Fusion → refined position, orientation, velocity]

Filtering: Visual Inertial Formulation

• System states:

Tightly coupled:  X = [ p(t), q(t), v(t), b_a(t), b_g(t), L_1, L_2, …, L_k ]

Loosely coupled:  X = [ p(t), q(t), v(t), b_a(t), b_g(t) ]

Corke, Lobo, Dias, An Introduction to Inertial and Visual Sensing, International Journal of Robotics Research (IJRR), 2007. PDF

Outline

• What is an IMU and why do we need it
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and Synchronization

Closed-form Solution (1D case)

• From the camera, we can only get the absolute position x up to an unknown scale factor s, thus

x = s x̃

where x̃ is the relative motion estimated by the camera.

• From the IMU:

x = x_0 + v_0 (t_1 − t_0) + ∬_{t_0}^{t_1} a(t) dt²

• By equating them:

s x̃ = x_0 + v_0 (t_1 − t_0) + ∬_{t_0}^{t_1} a(t) dt²

• As shown in [Martinelli'14], if we assume to know x_0 (usually we set it to 0), then, for the 6DOF case, both s and v_0 can be determined in closed form from a single feature observation and 3 views.

Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF

[Figure: camera observing landmark L_1 at times t_0 and t_1]

Closed-form Solution (1D case)

Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF

s x̃_1 = v_0 (t_1 − t_0) + ∬_{t_0}^{t_1} a(t) dt²

s x̃_2 = v_0 (t_2 − t_0) + ∬_{t_0}^{t_2} a(t) dt²

In matrix form:

[ x̃_1  (t_0 − t_1) ] [ s   ]   [ ∬_{t_0}^{t_1} a(t) dt² ]
[ x̃_2  (t_0 − t_2) ] [ v_0 ] = [ ∬_{t_0}^{t_2} a(t) dt² ]

[Figure: camera observing landmark L_1 at times t_0, t_1, t_2]
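A quick numerical check of this 2×2 system (simulated 1D data with assumed ground truth s = 2.0, v_0 = 0.5, and a constant acceleration; the numbers are made up for illustration):

```python
import numpy as np

s_true, v0_true, a = 2.0, 0.5, 0.3
t0, t1, t2 = 0.0, 1.0, 2.0

def double_integral(a, t0, t):
    """Closed form of the double integral of a constant acceleration a."""
    return 0.5 * a * (t - t0) ** 2

# True displacements and the up-to-scale camera measurements xt = x / s
x1 = v0_true * (t1 - t0) + double_integral(a, t0, t1)
x2 = v0_true * (t2 - t0) + double_integral(a, t0, t2)
xt1, xt2 = x1 / s_true, x2 / s_true

# [xt1 (t0 - t1); xt2 (t0 - t2)] [s; v0] = [double integrals of a]
A = np.array([[xt1, t0 - t1], [xt2, t0 - t2]])
rhs = np.array([double_integral(a, t0, t1), double_integral(a, t0, t2)])
s_est, v0_est = np.linalg.solve(A, rhs)  # recovers s = 2.0 and v0 = 0.5
```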

Closed-form Solution (general case)

• Considers N feature observations and the full 6DOF case
• Can be used to initialize filters and non-linear optimizers (which always need an initialization point) [3]
• More complex to derive than the 1D case, but it also leads to a linear system of equations that can be solved using the pseudoinverse


X is the vector of unknowns:
• Absolute scale s
• Initial velocity v_0
• 3D point distances (w.r.t. the first camera)
• Direction of the gravity vector
• Biases

A and S contain 2D feature coordinates, acceleration, and angular velocity measurements.

[1] Martinelli, Vision and IMU data fusion: Closed-form solutions for attitude, speed, absolute scale and bias determination, IEEE Transactions on Robotics (T-RO), 2012. PDF
[2] Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF
[3] Kaiser, Martinelli, Fontana, Scaramuzza, Simultaneous state initialization and gyroscope bias calibration in visual inertial aided navigation, IEEE Robotics and Automation Letters (R-AL), 2017. PDF
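With N feature observations the linear system A X = S is overdetermined, and the pseudoinverse gives its least-squares solution. A generic sketch of that last step (the matrix entries here are random stand-ins, not the actual structure of A):

```python
import numpy as np

rng = np.random.default_rng(1)
X_true = np.array([1.5, 0.2, -0.7])   # stand-ins for e.g. scale, velocity, gravity terms
A = rng.standard_normal((20, 3))      # 20 measurement rows, 3 unknowns
S = A @ X_true                        # noise-free synthetic measurements
X_est = np.linalg.pinv(A) @ S         # least-squares solution via the pseudoinverse
```

With noisy measurements the same line returns the minimum-residual estimate instead of the exact solution.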

Outline

• What is an IMU and why do we need it
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and Synchronization

Non-linear Optimization Methods

VIO is solved as a non-linear Least Squares optimization problem over the states, landmarks, and biases:

{X, L, b_a, b_g} = argmin_{X, L, b_a, b_g}  Σ_{k=1}^{N} ‖f(x_{k-1}, u) − x_k‖²_{Λ_k}  +  Σ_{k=1}^{N} Σ_{i=1}^{M} ‖π(x_k, L_i) − z_{ik}‖²_{Σ_ik}

The first term contains the IMU residuals, the second the reprojection residuals (Bundle Adjustment term).

[1] Jung, Taylor, Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results, International Conference on Computer Vision and Pattern Recognition (CVPR), 2001. PDF
[2] Strelow, Singh, Motion estimation from image and inertial measurements, International Journal of Robotics Research (IJRR), 2004. PDF

Non-linear Optimization Methods

VIO is solved as a non-linear Least Squares optimization problem:

{X, L, b_a, b_g} = argmin_{X, L, b_a, b_g}  Σ_{k=1}^{N} ‖f(x_{k-1}, u) − x_k‖²_{Λ_k}  +  Σ_{k=1}^{N} Σ_{i=1}^{M} ‖π(x_k, L_i) − z_{ik}‖²_{Σ_ik}

where:

• X = {x_1, …, x_N}: set of robot state estimates x_k (position, velocity, orientation) at frame times k
• L = {L_1, …, L_M}: 3D landmarks
• f(x_{k-1}, u): pose prediction from the integration of the IMU measurements u = {a_j, ω_j}
• π(x_k, L_i): measurement prediction from the projection of landmark L_i onto camera frame I_k
• z_ik: observed features
• Λ_k: state covariance from the IMU integration f(x_{k-1}, u)
• Σ_ik: covariance of the noisy 2D feature measurements

The first term contains the IMU residuals, the second the reprojection residuals (Bundle Adjustment term).

[1] Jung, Taylor, Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results, International Conference on Computer Vision and Pattern Recognition (CVPR), 2001. PDF
[2] Strelow, Singh, Motion estimation from image and inertial measurements, International Journal of Robotics Research (IJRR), 2004. PDF
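A sketch of evaluating this two-term cost in plain Python (illustrative data structures; a real solver would minimize the cost iteratively, e.g. with Gauss-Newton or Levenberg-Marquardt):

```python
import numpy as np

def mahalanobis_sq(r, cov):
    """Squared Mahalanobis norm of residual r under covariance cov."""
    return float(r @ np.linalg.solve(cov, r))

def vio_cost(states, imu_pred, landmark_proj, features, imu_cov, feat_cov):
    """IMU residuals plus reprojection residuals.

    states[k]          : state estimate x_k (a plain vector here)
    imu_pred[k]        : f(x_{k-1}, u), the IMU-predicted state
    landmark_proj[k][i]: predicted feature position of landmark L_i in frame k
    features[k][i]     : z_ik, the observed feature
    """
    cost = 0.0
    for k in range(len(states)):
        cost += mahalanobis_sq(imu_pred[k] - states[k], imu_cov)
        for i in range(len(features[k])):
            cost += mahalanobis_sq(landmark_proj[k][i] - features[k][i], feat_cov)
    return cost
```

The covariances act as weights: a confident sensor (small covariance) makes its residual expensive, pulling the solution toward agreeing with it.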

Case Study 1: OKVIS

Because the complexity of the optimization is cubic in the number of camera poses and features, real-time operation becomes infeasible as the trajectory and the map grow over time. OKVIS proposed to only optimize the current pose and a window of past keyframes.

Leutenegger, Lynen, Bosse, Siegwart, Furgale, Keyframe-based visual–inertial odometry using nonlinear optimization, International Journal of Robotics Research (IJRR), 2015. PDF

Case Study 2: SVO+GTSAM

Solves the same optimization problem as OKVIS, but:

• Keeps all the frames (from the start to the end of the trajectory)
• To make the optimization efficient:
  • marginalizes 3D landmarks (rather than triangulating landmarks, only their visual bearing measurements are kept, and the visual constraints are enforced by the Epipolar Line Distance (Lecture 08))
  • pre-integrates the IMU data between keyframes (see later)
• Optimization solved using Factor Graphs via GTSAM
• Very fast, because it only optimizes the poses that are affected by a new observation

Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb. 2017. Best Paper Award 2018.


Problem with IMU integration

• The integration of IMU measurements f(x_{k-1}, u) from k−1 to k depends on the state estimate at time k−1.

• During the optimization, every time the linearization point at k−1 changes, the integration between k−1 and k must be re-evaluated, thus slowing down the optimization.

• Idea: Preintegration
  • defines relative motion increments, expressed in the body frame, which are independent of the global position, orientation, and velocity at k [1]
  • [2] uses this theory by leveraging the manifold structure of the rotation group SO(3)


[1] Lupton, Sukkarieh, Visual-inertial-aided navigation for high-dynamic motion in built environments without initial conditions, IEEE Transactions on Robotics (T-RO), 2012. PDF
[2] Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb. 2017. Best Paper Award 2018.

{X, L, b_a, b_g} = argmin_{X, L, b_a, b_g}  Σ_{k=1}^{N} ‖f(x_{k-1}, u) − x_k‖²_{Λ_k}  +  Σ_{k=1}^{N} Σ_{i=1}^{M} ‖π(x_k, L_i) − z_{ik}‖²_{Σ_ik}

(IMU residuals + reprojection residuals / Bundle Adjustment term)

IMU Pre-Integration

Standard — evaluate the errors in the global frame (prediction vs. estimate):

e_R = R̂(ω̃, R_{k-1})ᵀ R_k
e_v = v̂(ω̃, ã, v_{k-1}) − v_k
e_p = p̂(ω̃, ã, p_{k-1}) − p_k

→ the integration must be repeated every time the previous state changes.

Preintegration — evaluate relative errors, using the preintegrated deltas ΔR̃, Δṽ, Δp̃ computed from ω̃, ã:

e_R = ΔR̃ᵀ ΔR
e_v = Δṽ − Δv
e_p = Δp̃ − Δp

→ preintegration of the IMU deltas is possible, with no initial condition required.
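A minimal Euler-style sketch of computing the deltas ΔR, Δv, Δp from a batch of IMU samples. This is a simplification of the on-manifold formulation in [2] (no bias correction, no noise propagation; function and variable names are illustrative):

```python
import numpy as np

def so3_exp(w):
    """Rotation matrix from a rotation vector w (Rodrigues' formula)."""
    th = np.linalg.norm(w)
    if th < 1e-12:
        return np.eye(3)
    k = w / th
    K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(th) * K + (1 - np.cos(th)) * (K @ K)

def preintegrate(gyro, accel, dt):
    """Relative motion increments dR, dv, dp in the body frame at time k-1.

    They depend only on the IMU samples, so they need not be recomputed
    when the linearization point of the state at k-1 changes."""
    dR, dv, dp = np.eye(3), np.zeros(3), np.zeros(3)
    for w, a in zip(gyro, accel):
        dp = dp + dv * dt + 0.5 * (dR @ a) * dt ** 2
        dv = dv + (dR @ a) * dt
        dR = dR @ so3_exp(w * dt)
    # Gravity and the initial velocity are absent on purpose: they enter
    # later, when the deltas are compared against the states in the residuals.
    return dR, dv, dp
```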

GTSAM

• GTSAM [1] is a library that implements sensor fusion based on factor graphs and Bayes trees.

• Factor graphs are a tool borrowed from machine learning [2] that:
  • naturally capture the spatial dependencies among different sensor measurements
  • can represent many robotics problems, like pose graphs and SLAM

[1] Dellaert, Kaess, Factor Graphs for Robot Perception, Foundations and Trends in Robotics, 2017. PDF, Code
[2] Kschischang, Frey, Loeliger, Factor Graphs and the Sum-Product Algorithm, IEEE Transactions on Information Theory, 2001. PDF

GTSAM

• GTSAM uses factor graphs and Bayes trees to implement incremental inference.

• A Bayes tree is a data structure, a directed tree, that visually represents the connectivity among the past robot states and measurements.

• Incremental inference means that, when new robot states or measurements are added to the graph, only a local part of the tree is updated.

[1] Kaess, Ranganathan, Dellaert, "iSAM: Incremental Smoothing and Mapping", IEEE Transactions on Robotics (T-RO), 2008. PDF

Outline

• What is an IMU and why do we need it
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and Synchronization

Non-linear optimization vs Filtering

Non-linear Optimization methods:
• Optimize a window of multiple states (or all the states) using non-linear Least-Squares optimization
• Multiple iterations (it re-linearizes at each iteration)
• Achieves the highest accuracy
• Slower (but fast with GTSAM)

Filtering methods:
• Solve the same problem by running only one iteration of the optimization (e.g., using an Extended Kalman Filter (EKF))
• 1 linearization iteration only
• Sensitive to the linearization point
• Fastest

Case Study 1: ROVIO

• EKF state: X = [ p(t), q(t), v(t), b_a(t), b_g(t), L_1, L_2, …, L_k ]
• Basic idea:
  1. Prediction step: predicts the next position, velocity, orientation, and features using the IMU integration model
  2. Measurement update: refines the state by leveraging the visual constraint (ROVIO minimizes the photometric error between corresponding points; an alternative would be the reprojection error)

Bloesch, Burri, Omari, Hutter, Siegwart, Iterated extended Kalman filter based visual-inertial odometry using direct photometric feedback, International Journal of Robotics Research (IJRR), 2017. PDF, Code
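The predict/update cycle that these two steps instantiate can be sketched with a generic, textbook EKF (a linear measurement model for brevity; this is not ROVIO's actual photometric update):

```python
import numpy as np

def ekf_predict(x, P, F, Q):
    """Prediction: propagate the state and covariance through (linearized) dynamics F."""
    return F @ x, F @ P @ F.T + Q

def ekf_update(x, P, z, H, R):
    """Measurement update: correct the state with measurement z and model H."""
    S = H @ P @ H.T + R               # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)    # Kalman gain
    x = x + K @ (z - H @ x)           # state correction
    P = (np.eye(len(x)) - K @ H) @ P  # covariance correction
    return x, P
```

In ROVIO the prediction uses the IMU integration model and the update linearizes the photometric error, but the filter structure is this same two-step loop.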

ROVIO Problems

• Complexity of the EKF grows quadratically in the number of estimated landmarks
  • thus, max 20 landmarks are tracked to allow real-time operation

• Only updates the most recent state


Case Study 2: MSCKF

• MSCKF (Multi-State Constraint Kalman Filter) updates multiple past camera poses p_C1, q_C1, …, p_CN, q_CN in addition to the current state p(t), q(t), v(t). State vector:

• Prediction step: same as ROVIO

• Measurement update — differently from ROVIO:
  • landmark positions are not added to the state vector, thus it can run very fast independently of the number of features
  • the visual constraint is obtained from the Epipolar Line Distance (Lecture 08)

• Used in spacecraft landing (NASA/JPL Moon and Mars landing), DJI drones, Google ARCore

• Released open source within the OpenVins project: https://github.com/rpng/open_vins


[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

X = [ p(t), q(t), v(t), b_a(t), b_g(t), p_C1, q_C1, …, p_CN, q_CN ]

MSCKF running in Google ARCore (formerly Google Tango)


Outline

• What is an IMU and why do we need it
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and Synchronization

Camera-IMU Calibration

• Goal: estimate the rigid-body transformation T_BC and the time offset t_d between the camera and the IMU, caused by communication delays and internal sensor delays (introduced by filters and logic).

• Assumptions: camera and IMU rigidly attached; camera intrinsically calibrated.

• Data:
  • Image points from a calibration pattern (checkerboard or QR board)
  • IMU measurements: accelerometer a_k and gyroscope ω_k


[Figure: camera and IMU measurement timelines, offset by the time delay t_d]
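Kalibr estimates t_d jointly inside its spline-based optimization (next slides). As an illustration that the offset is observable from the data alone, a common lightweight initialization — an assumption here, not this slide's method — cross-correlates two uniformly sampled signals, e.g. the gyroscope rate magnitude against a camera-derived angular rate:

```python
import numpy as np

def estimate_time_offset(sig_cam, sig_imu, dt):
    """Delay of sig_imu relative to sig_cam (equal-length signals sampled at 1/dt),
    taken from the peak of the cross-correlation."""
    a = sig_cam - sig_cam.mean()
    b = sig_imu - sig_imu.mean()
    corr = np.correlate(b, a, mode="full")  # slide b against a
    lag = np.argmax(corr) - (len(a) - 1)    # samples by which b lags a
    return lag * dt

# Toy signals: the "IMU" sees the same motion pulse 30 ms after the camera
dt = 0.005
t = np.arange(0.0, 2.0, dt)
cam = np.exp(-((t - 0.80) ** 2) / (2 * 0.05 ** 2))
imu = np.exp(-((t - 0.83) ** 2) / (2 * 0.05 ** 2))
offset = estimate_time_offset(cam, imu, dt)  # close to 0.03 s
```

The resolution of this estimate is one sample period; a joint optimization like Kalibr's refines t_d continuously.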

Kalibr Toolbox

Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

• Code: https://github.com/ethz-asl/kalibr/wiki/camera-imu-calibration

Kalibr Toolbox

Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

• Solves a non-linear Least Squares optimization problem similar to the one seen before, but also optimizes over T_BC and t_d
• Continuous-time modelling of X using splines
• Numerical solver: Levenberg-Marquardt

{X, L, T_BC, t_d, b_a, b_g} = argmin_{X, L, T_BC, t_d, b_a, b_g}  Σ_{k=1}^{N} ‖f(x_{k-1}, u) − x_k‖²_{Λ_k}  +  Σ_{k=1}^{N} Σ_{i=1}^{M} ‖π(x_k, L_i) − z_{ik}‖²_{Σ_ik}

(IMU residuals + reprojection residuals / Bundle Adjustment term)

Kalibr Toolbox

Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

• Generates a report after optimizing the cost function:

  Residuals
    Reprojection error [px]:      0.0976, 0.051
    Gyroscope error [rad/s]:      0.0167, 0.009
    Accelerometer error [m/s^2]:  0.0595, 0.031

  Transformation T_ci (imu to cam):
    [[ 0.99995526 -0.00934911 -0.00143776  0.00008436]
     [ 0.00936458  0.99989388  0.01115983  0.00197427]
     [ 0.00133327 -0.0111728   0.99993669 -0.05054946]
     [ 0.          0.          0.          1.        ]]

  Time shift (delay d), cam0 to imu0 [s] (t_imu = t_cam + shift):
    0.00270636270255

Popular Datasets for VIO / VI-SLAM

• EuRoC: drone with IMU and stereo camera
• Blackbird: drone flying fast with VR goggles (synthetic images and real dynamics + IMU)
• UZH FPV Drone Racing: drone flying fast with event camera, stereo camera, IMU
• MVSEC: drone, motorcycle, car with event camera, stereo camera, 3D lidar, GPS, IMU

Understanding Check

Are you able to answer the following questions?

• Why is it recommended to use an IMU for Visual Odometry?
• Why not just an IMU without a camera?
• What is the basic idea behind MEMS IMUs?
• What is the drift of a consumer IMU?
• What is the IMU measurement model (formula)?
• What causes the bias in an IMU?
• How do we model the bias?
• How do we integrate the acceleration to get the position (formula)?
• What is the definition of loosely coupled and tightly coupled visual inertial fusion?
• How does non-linear optimization-based visual inertial odometry work? Can you write down the cost function and illustrate its meaning?



40

Case Study 2 MSCKF

bull MSCKF (Multi-State Constraint Kalman Filter) updates multiple past poses 119901C1 119902C1 hellip 119901C119873 119902CN in addition to the current state 119901 119905 119902 119905 119907 119905 State vector

bull Prediction step same as ROVIO

bull Measurement updatebull Differently from ROVIO

bull landmark positions are not added to the state vector thus can run very fast independently of the number of featuresbull Visual constraint is obtained from the Epipolar Line Distance (Lecture 08)

bull Used in spacecraft landing (NASAJPL Moon and Mars landing) DJI drones Google ARCore

bull Released open source within the OpenVins project httpsgithubcomrpngopen_vins

41

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905 119901C1 119902C1 hellip 119901C119873 119902CN

MSCKF running in Google ARCore (former Google Tango)

42

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

43

Camera-IMU Calibration

bull Goal estimate the rigid-body transformation 119931119913119914 and time offset 119957119941 between the camera and the IMU caused by communication delays and the internal sensor delays (introduced by filters and logic)

bull Assumptions Camera and IMU rigidly attached Camera intrinsically calibrated

bull Data

bull Image points from calibration pattern (checkerboard or QR board)

bull IMU measurements accelerometer 119886119896 and gyroscope 120596119896

44

119905

images

IMU

119957119941

Kalibr Toolbox

45Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Code httpsgithubcomethz-aslkalibrwikicamera-imu-calibration

Kalibr Toolbox

46Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Solves a non-linear Least Square optimization problem similar to that seen before but also optimizes over 119879119861119862 119905119889

bull Continuous-time modelling using splines for 119883

bull Numerical solver Levenberg-Marquardt

119883 119871 119879119861119862 119905119889 119887119886 119887119892 = 119886119903119892119898119894119899119883 119871119879119861119862119905119889 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

Kalibr Toolbox

47Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Generates a report after optimizing the cost function

Residuals

Reprojection error [px] 00976 0051

Gyroscope error [rads] 00167 0009

Accelerometer error [ms^2] 00595 0031

Transformation T_ci (imu to cam)

[[ 099995526 -000934911 -000143776 000008436]

[ 000936458 099989388 001115983 000197427]

[ 000133327 -00111728 099993669 -005054946]

[ 0 0 0 1 ]]

Time shift (delay d)

cam0 to imu0 [s] (t_imu = t_cam + shift)

000270636270255

Popular Datasets for VIO VI-SLAM

48

EuRoC

Drone with IMU and stereo camera

Blackbird

Drone flying fast with VR goggles (synthetic images and real dynamics + IMU)

UZH FPV Drone Racing

Drone flying fast with event camera stereo camera IMU

MVSEC

Drone motorcycle car with event camera stereo camera

3D lidar GPS IMU

Understanding Check

Are you able to answer the following questions

bull Why is it recommended to use an IMU for Visual Odometry

bull Why not just an IMU without a camera

bull What is the basic idea behind MEMS IMUs

bull What is the drift of a consumer IMU

bull What is the IMU measurement model (formula)

bull What causes the bias in an IMU

bull How do we model the bias

bull How do we integrate the acceleration to get the position (formula)

bull What is the definition of loosely coupled and tightly coupled visual inertial fusion

bull How does non-linear optimization-based visual inertial odometry Can you write down the cost function and illustrate its meaning

49

Page 16: Vision Algorithms for Mobile Robotics Lecture 01 Introduction

IMU Integration Model

For convenience, the IMU integration model is normally written as

  [p_k, q_k, v_k] = f([p_{k-1}, q_{k-1}, v_{k-1}], u),   or, more compactly:   x_k = f(x_{k-1}, u)

where

• x = [p, q, v] represents the IMU state, comprising position, orientation, and velocity
• q is the IMU orientation R_WB (usually represented using quaternions)
• u = {a_j, ω_j} are the accelerometer and gyroscope measurements in the time interval [t_{k-1}, t_k]

16
Trawny, Roumeliotis, Indirect Kalman filter for 3D attitude estimation, Technical Report, University of Minnesota, 2005. PDF
More info on the noise model: https://github.com/ethz-asl/kalibr/wiki/IMU-Noise-Model
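A minimal Euler dead-reckoning sketch of x_k = f(x_{k-1}, u), using a rotation matrix for the orientation q instead of quaternions (illustrative names; bias and noise terms are omitted):

```python
import numpy as np

def skew(w):
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_so3(w):
    """Rodrigues' formula: rotation vector -> rotation matrix."""
    th = np.linalg.norm(w)
    if th < 1e-8:
        return np.eye(3) + skew(w)
    K = skew(w / th)
    return np.eye(3) + np.sin(th) * K + (1.0 - np.cos(th)) * (K @ K)

def imu_step(p, q, v, a_meas, w_meas, dt, g=np.array([0.0, 0.0, -9.81])):
    """One Euler step of x_k = f(x_{k-1}, u); q is kept as a rotation
    matrix R_WB (body -> world)."""
    a_world = q @ a_meas + g                 # specific force to world frame
    p_new = p + v * dt + 0.5 * a_world * dt**2
    v_new = v + a_world * dt
    q_new = q @ exp_so3(w_meas * dt)         # integrate body angular rate
    return p_new, q_new, v_new
```

A stationary IMU measures the specific force [0, 0, +9.81], which cancels gravity, so the state stays put; in practice drift enters through the noise and bias terms this sketch leaves out.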

Outline

• What is an IMU and why do we need it?
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and Synchronization

17

Visual Inertial Odometry

Different paradigms exist:

• Loosely coupled
  • Treats VO and IMU as two separate (not coupled) black boxes
  • Each black box estimates pose and velocity, from visual data (up to a scale) and from inertial data (absolute scale)
  • Easy to implement
  • Inaccurate; should not be used
• Tightly coupled
  • Makes use of the raw sensors' measurements:
    • 2D features
    • IMU readings
  • More accurate
  • More implementation effort

In this lecture we will only see tightly coupled approaches.

18

The Loosely Coupled Approach

19

[Block diagram] images → Feature tracking → 2D features → VO → position (up to a scale) & orientation. IMU measurements → IMU Integration → position, orientation, velocity. The two outputs are then fused → refined position, orientation, velocity.

The Tightly Coupled Approach

20

[Block diagram] images → Feature tracking → 2D features. The 2D features and the raw IMU measurements are fed directly to the fusion → refined position, orientation, velocity.

Filtering Visual Inertial Formulation

• System states:

  Tightly coupled:  X = {p(t), q(t), v(t), b_a(t), b_g(t), L_1, L_2, …, L_k}
  Loosely coupled:  X = {p(t), q(t), v(t), b_a(t), b_g(t)}

21
Corke, Lobo, Dias, An Introduction to Inertial and Visual Sensing, International Journal of Robotics Research (IJRR), 2007. PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

22

Closed-form Solution (1D case)

• From the camera, we can only get the position x up to an unknown scale factor s, thus

  x = s x̃

  where x̃ is the relative motion estimated by the camera.

• From the IMU:

  x = x_0 + v_0 (t_1 − t_0) + ∬_{t_0}^{t_1} a(t) dt²

• By equating them:

  s x̃ = x_0 + v_0 (t_1 − t_0) + ∬_{t_0}^{t_1} a(t) dt²

• As shown in [Martinelli'14], if we assume to know x_0 (usually we set it to 0), then for 6DOF both s and v_0 can be determined in closed form from a single feature observation and 3 views.

23
Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF
[Figure: camera observing landmark L_1 at times t_0 and t_1]

Closed-form Solution (1D case)

24Martinelli Closed-form solution of visual-inertial structure from motion International Journal of Computer Vision (IJCV) 2014 PDF

s x̃_1 = v_0 (t_1 − t_0) + ∬_{t_0}^{t_1} a(t) dt²

s x̃_2 = v_0 (t_2 − t_0) + ∬_{t_0}^{t_2} a(t) dt²

In matrix form:

  [ x̃_1   (t_0 − t_1) ] [ s  ]   =   [ ∬_{t_0}^{t_1} a(t) dt² ]
  [ x̃_2   (t_0 − t_2) ] [ v_0 ]       [ ∬_{t_0}^{t_2} a(t) dt² ]

[Figure: camera observing landmark L_1 at times t_0, t_1, t_2]
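A numerical check of the stacked system above with invented values (s = 2, v_0 = 0.5, constant acceleration a = 0.3 m/s²), solved with the pseudoinverse as in the general case:

```python
import numpy as np

# Toy 1D example: true scale s = 2.0, true initial velocity v0 = 0.5,
# constant acceleration a = 0.3 m/s^2, x0 = 0.
s_true, v0_true, a = 2.0, 0.5, 0.3
t0, t1, t2 = 0.0, 1.0, 2.0

def double_int(t):                 # double integral of a from t0 to t
    return 0.5 * a * (t - t0) ** 2

# Up-to-scale positions the camera would report: x_tilde = x / s
x1 = (v0_true * (t1 - t0) + double_int(t1)) / s_true
x2 = (v0_true * (t2 - t0) + double_int(t2)) / s_true

# Stack x_tilde_i * s + (t0 - t_i) * v0 = double_int(t_i) and solve
A = np.array([[x1, t0 - t1],
              [x2, t0 - t2]])
b = np.array([double_int(t1), double_int(t2)])
s_est, v0_est = np.linalg.pinv(A) @ b      # recovers s = 2.0, v0 = 0.5
```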

Closed-form Solution (general case)

• Considers N feature observations and the full 6DOF case
• Can be used to initialize filters and non-linear optimizers (which always need an initialization point) [3]
• More complex to derive than the 1D case, but it also reaches a linear system of equations, A X = S, that can be solved using the pseudoinverse

25

X is the vector of unknowns:
• Absolute scale s
• Initial velocity v_0
• 3D point distances (w.r.t. the first camera)
• Direction of the gravity vector
• Biases

A and S contain 2D feature coordinates, acceleration, and angular velocity measurements.

[1] Martinelli, Vision and IMU data fusion: Closed-form solutions for attitude, speed, absolute scale, and bias determination, IEEE Transactions on Robotics (T-RO), 2012. PDF
[2] Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF
[3] Kaiser, Martinelli, Fontana, Scaramuzza, Simultaneous state initialization and gyroscope bias calibration in visual inertial aided navigation, IEEE Robotics and Automation Letters (RA-L), 2017. PDF

Outline

• What is an IMU and why do we need it?
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and Synchronization

26

Non-linear Optimization Methods

VIO is solved as a non-linear least-squares optimization problem over X, L, b_a, b_g:

{X, L, b_a, b_g} = argmin_{X, L, b_a, b_g}  Σ_{k=1}^{N} || f(x_{k-1}, u) − x_k ||²_{Λ_k}  +  Σ_{k=1}^{N} Σ_{i=1}^{M} || π(x_k, L_i) − z_{ik} ||²_{Σ_{ik}}

(first sum: IMU residuals; second sum: reprojection residuals, the Bundle Adjustment term)

27
[1] Jung, Taylor, Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results, International Conference on Computer Vision and Pattern Recognition (CVPR), 2001. PDF
[2] Strelow, Singh, Motion estimation from image and inertial measurements, International Journal of Robotics Research (IJRR), 2004. PDF

Non-linear Optimization Methods

VIO is solved as a non-linear least-squares optimization problem:

{X, L, b_a, b_g} = argmin_{X, L, b_a, b_g}  Σ_{k=1}^{N} || f(x_{k-1}, u) − x_k ||²_{Λ_k}  +  Σ_{k=1}^{N} Σ_{i=1}^{M} || π(x_k, L_i) − z_{ik} ||²_{Σ_{ik}}

(first sum: IMU residuals; second sum: reprojection residuals, the Bundle Adjustment term)

where

• X = {x_1, …, x_N}: set of robot state estimates x_k (position, velocity, orientation) at frame times k
• L = {L_1, …, L_M}: 3D landmarks
• f(x_{k-1}, u): pose prediction from integration of the IMU measurements u = {a_j, ω_j}
• π(x_k, L_i): projection of landmark L_i onto camera frame I_k
• z_{ik}: observed 2D features
• Λ_k: state covariance from the IMU integration f(x_{k-1}, u)
• Σ_{ik}: covariance of the noisy 2D feature measurements

28
[1] Jung, Taylor, Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results, International Conference on Computer Vision and Pattern Recognition (CVPR), 2001. PDF
[2] Strelow, Singh, Motion estimation from image and inertial measurements, International Journal of Robotics Research (IJRR), 2004. PDF
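A toy 1D instance of this cost, with invented values and unit covariances, using SciPy's generic solver as a stand-in for a real VIO back-end: the states x_k are scalar positions, f(x_{k-1}, u) = x_{k-1} + u_k, and the "projection" of a single known landmark L is π(x_k, L) = L − x_k.

```python
import numpy as np
from scipy.optimize import least_squares

x_true = np.array([0.0, 1.0, 2.1, 2.9])
L_lm = 10.0
u = np.diff(x_true) + 0.05 * np.array([1.0, -1.0, 1.0])        # noisy IMU increments
z = (L_lm - x_true) + 0.03 * np.array([-1.0, 1.0, -1.0, 1.0])  # noisy "projections"

def residuals(x):
    r_imu = (x[:-1] + u) - x[1:]     # f(x_{k-1}, u) - x_k
    r_proj = (L_lm - x) - z          # pi(x_k, L) - z_k
    return np.concatenate([r_imu, r_proj])

sol = least_squares(residuals, x0=np.zeros(4))
# sol.x is close to x_true; weighting each residual block by the inverse
# covariances (Lambda_k, Sigma_ik) would give the full cost of the slide.
```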

Case Study 1 OKVIS

Because the complexity of the optimization is cubic in the number of camera poses and features, real-time operation becomes infeasible as the trajectory and the map grow over time. OKVIS therefore only optimizes the current pose and a window of past keyframes.

29
Leutenegger, Lynen, Bosse, Siegwart, Furgale, Keyframe-based visual-inertial odometry using nonlinear optimization, International Journal of Robotics Research (IJRR), 2015. PDF

Case Study 2 SVO+GTSAM

Solves the same optimization problem as OKVIS, but:

• Keeps all the frames (from the start to the end of the trajectory)
• To make the optimization efficient:
  • marginalizes 3D landmarks (rather than triangulating landmarks, only their visual bearing measurements are kept, and the visual constraints are enforced by the Epipolar Line Distance (Lecture 08))
  • pre-integrates the IMU data between keyframes (see later)
• Optimization solved using Factor Graphs via GTSAM
  • Very fast because it only optimizes the poses that are affected by a new observation

30
Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb 2017. Best Paper Award 2018.

SVO+GTSAM

31
Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb 2017. Best Paper Award 2018.

SVO+GTSAM

32
Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb 2017. Best Paper Award 2018.

Problem with IMU integration

• The integration of IMU measurements f(x_{k-1}, u) from k−1 to k depends on the state estimate at time k−1.

• During optimization, every time the linearization point at k−1 changes, the integration between k−1 and k must be re-evaluated, thus slowing down the optimization.

• Idea: Preintegration
  • [1] defines relative motion increments, expressed in the body frame, which are independent of the global position, orientation, and velocity at k
  • [2] uses this theory by leveraging the manifold structure of the rotation group SO(3)

33
[1] Lupton, Sukkarieh, Visual-inertial-aided navigation for high-dynamic motion in built environments without initial conditions, IEEE Transactions on Robotics (T-RO), 2012. PDF
[2] Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb 2017. Best Paper Award 2018.

{X, L, b_a, b_g} = argmin_{X, L, b_a, b_g}  Σ_{k=1}^{N} || f(x_{k-1}, u) − x_k ||²_{Λ_k}  +  Σ_{k=1}^{N} Σ_{i=1}^{M} || π(x_k, L_i) − z_{ik} ||²_{Σ_{ik}}

(first sum: IMU residuals; second sum: reprojection residuals, the Bundle Adjustment term)

IMU Pre-Integration

34

Given the IMU measurements ω̃, ã between two frames, preintegration defines relative increments ΔR̃, Δṽ, Δp̃ between the states R_k, p_k, v_k (hatted and tilded quantities are predictions from the measurements; plain symbols are the current estimates).

Standard: evaluate errors in the global frame:

  e_R = R̂(ω̃, R_{k-1})ᵀ R_k
  e_v = v̂(ω̃, ã, v_{k-1}) − v_k
  e_p = p̂(ω̃, ã, p_{k-1}) − p_k

  → the integration must be repeated whenever the previous state changes.

Preintegration: evaluate relative errors:

  e_R = ΔR̃ᵀ ΔR
  e_v = Δṽ − Δv
  e_p = Δp̃ − Δp

  → preintegration of IMU deltas is possible, with no initial condition required.
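The preintegrated deltas can be sketched as a plain accumulation over the IMU samples between two frames (bias correction and covariance propagation, which the papers above handle, are omitted; gravity is applied later, when the deltas are composed with the state at k−1):

```python
import numpy as np

def skew(w):
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def exp_so3(w):
    """Rodrigues' formula (exponential map of SO(3))."""
    th = np.linalg.norm(w)
    if th < 1e-8:
        return np.eye(3) + skew(w)
    K = skew(w / th)
    return np.eye(3) + np.sin(th) * K + (1.0 - np.cos(th)) * (K @ K)

def preintegrate(gyro, accel, dt):
    """Accumulate body-frame deltas (dR, dv, dp) over a buffer of IMU
    samples; they depend only on the measurements, not on the global
    state at frame k-1."""
    dR, dv, dp = np.eye(3), np.zeros(3), np.zeros(3)
    for w, a in zip(gyro, accel):
        dp = dp + dv * dt + 0.5 * (dR @ a) * dt**2
        dv = dv + (dR @ a) * dt
        dR = dR @ exp_so3(w * dt)
    return dR, dv, dp
```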

GTSAM

• GTSAM [1] is a library that implements sensor fusion based on factor graphs and Bayes trees.

• Factor graphs are a tool borrowed from machine learning [2] that:
  • naturally capture the spatial dependencies among different sensor measurements
  • can represent many robotics problems, like pose graphs and SLAM

35
[1] Dellaert and Kaess, Factor Graphs for Robot Perception, Foundations and Trends in Robotics, 2017. PDF, Code
[2] Kschischang, Frey, Loeliger, Factor Graphs and the Sum-Product Algorithm, Transactions on Information Theory, 2001. PDF

GTSAM

• GTSAM uses factor graphs and Bayes trees to implement incremental inference.

• A Bayes tree is a data structure (a directed tree) that visually represents the connectivity among the past robot states and measurements.

• Incremental inference means that, when new robot states or measurements are added to the graph, only a local part of the tree is updated.

36
[1] Kaess, Ranganathan, Dellaert, "iSAM: Incremental Smoothing and Mapping", IEEE Transactions on Robotics (T-RO), 2008. PDF

Outline

• What is an IMU and why do we need it?
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and Synchronization

37

Non-linear optimization vs Filtering

38

Non-linear optimization methods:
• Optimize a window of multiple states (or all the states) using non-linear least-squares optimization
• Multiple iterations (re-linearizes at each iteration)
• Achieves the highest accuracy
• Slower (but fast with GTSAM)

Filtering methods:
• Solve the same problem by running only one iteration of the optimization (e.g., using an Extended Kalman Filter (EKF))
• 1 linearization iteration only
• Sensitive to the linearization point
• Fastest

Case Study 1 ROVIO

• EKF state: X = {p(t), q(t), v(t), b_a(t), b_g(t), L_1, L_2, …, L_k}
• Basic idea:
  1. Prediction step: predicts the next position, velocity, orientation, and features using the IMU integration model
  2. Measurement update: refines the state by leveraging the visual constraints (ROVIO minimizes the photometric error between corresponding points; an alternative would be the reprojection error)

39
Bloesch, Burri, Omari, Hutter, Siegwart, Iterated extended Kalman filter based visual-inertial odometry using direct photometric feedback, International Journal of Robotics Research (IJRR), 2017. PDF, Code
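ROVIO's filter is an iterated EKF on an involved state; as a stand-in, here is a generic linear Kalman predict/update on a toy constant-acceleration state, just to illustrate the two phases named above (all matrices and values are invented):

```python
import numpy as np

# State x = [position, velocity]; "IMU" acceleration drives the prediction.
dt = 0.1
F = np.array([[1.0, dt], [0.0, 1.0]])     # state transition
B = np.array([0.5 * dt**2, dt])           # acceleration input
H = np.array([[1.0, 0.0]])                # we directly measure position
Q = 1e-4 * np.eye(2)                      # process noise
R = np.array([[1e-2]])                    # measurement noise

def predict(x, P, a_meas):
    x = F @ x + B * a_meas
    P = F @ P @ F.T + Q
    return x, P

def update(x, P, z):
    y = z - H @ x                         # innovation
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)        # Kalman gain
    x = x + (K @ y).ravel()
    P = (np.eye(2) - K @ H) @ P
    return x, P

x, P = np.zeros(2), np.eye(2)
for _ in range(50):                       # constant acceleration of 1 m/s^2
    x, P = predict(x, P, 1.0)
x, P = update(x, P, np.array([12.5]))     # position after 5 s is 12.5 m
```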

ROVIO Problems

bull Complexity of the EKF grows quadratically in the number of estimated landmarksbull Thus max 20 landmarks are tracked to allow real-time operation

bull Only updates the most recent state

40

Case Study 2 MSCKF

• MSCKF (Multi-State Constraint Kalman Filter) updates multiple past poses p_{C1}, q_{C1}, …, p_{CN}, q_{CN} in addition to the current state p(t), q(t), v(t). State vector:

  X = {p(t), q(t), v(t), b_a(t), b_g(t), p_{C1}, q_{C1}, …, p_{CN}, q_{CN}}

• Prediction step: same as ROVIO

• Measurement update; differently from ROVIO:
  • landmark positions are not added to the state vector, so it can run very fast, independently of the number of features
  • the visual constraint is obtained from the Epipolar Line Distance (Lecture 08)

• Used in spacecraft landing (NASA/JPL Moon and Mars landing), DJI drones, Google ARCore

• Released open source within the OpenVINS project: https://github.com/rpng/open_vins

41
[1] Mourikis, Roumeliotis, A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation, International Conference on Robotics and Automation (ICRA), 2007. PDF
[2] Li, Mourikis, High-precision, consistent EKF-based visual-inertial odometry, International Journal of Robotics Research (IJRR), 2013. PDF

MSCKF running in Google ARCore (formerly Google Tango)

42

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

Outline

• What is an IMU and why do we need it?
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and Synchronization

43

Camera-IMU Calibration

• Goal: estimate the rigid-body transformation T_BC and the time offset t_d between the camera and the IMU, caused by communication delays and the internal sensor delays (introduced by filters and logic)

• Assumptions: camera and IMU rigidly attached; camera intrinsically calibrated

• Data:
  • Image points from a calibration pattern (checkerboard or QR board)
  • IMU measurements: accelerometer a_k and gyroscope ω_k

44
[Figure: image and IMU measurement timelines, offset by the delay t_d]

Kalibr Toolbox

45
Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

• Code: https://github.com/ethz-asl/kalibr/wiki/camera-imu-calibration

Kalibr Toolbox

46
Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

• Solves a non-linear least-squares optimization problem similar to the one seen before, but also optimizes over T_BC and t_d:

{X, L, T_BC, t_d, b_a, b_g} = argmin_{X, L, T_BC, t_d, b_a, b_g}  Σ_{k=1}^{N} || f(x_{k-1}, u) − x_k ||²_{Λ_k}  +  Σ_{k=1}^{N} Σ_{i=1}^{M} || π(x_k, L_i) − z_{ik} ||²_{Σ_{ik}}

(first sum: IMU residuals; second sum: reprojection residuals, the Bundle Adjustment term)

• Continuous-time modelling of X using splines
• Numerical solver: Levenberg-Marquardt

Kalibr Toolbox

47Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

• Generates a report after optimizing the cost function:

Residuals
  Reprojection error [px]:      0.0976, 0.051
  Gyroscope error [rad/s]:      0.0167, 0.009
  Accelerometer error [m/s^2]:  0.0595, 0.031

Transformation T_ci (imu to cam):
[[ 0.99995526 -0.00934911 -0.00143776  0.00008436]
 [ 0.00936458  0.99989388  0.01115983  0.00197427]
 [ 0.00133327 -0.0111728   0.99993669 -0.05054946]
 [ 0.          0.          0.          1.        ]]

Time shift (delay d):
cam0 to imu0 [s] (t_imu = t_cam + shift):
0.00270636270255
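Kalibr estimates t_d jointly within the spline-based optimization; a common way to obtain a coarse initial guess is to cross-correlate the gyroscope rotation-rate magnitude with the rotation rate inferred from the camera. A toy sketch on synthetic signals (all values invented):

```python
import numpy as np

# Synthetic rotation-rate magnitudes sampled at 100 Hz; the "IMU" signal
# lags the "camera" signal by a known 30 ms delay (3 samples).
fs = 100.0
t = np.arange(0.0, 10.0, 1.0 / fs)
cam_rate = np.abs(np.sin(2 * np.pi * 0.7 * t) + 0.3 * np.sin(2 * np.pi * 2.1 * t))
imu_rate = np.interp(t - 0.03, t, cam_rate)      # delayed copy of cam_rate

# Cross-correlate the zero-mean signals and take the lag of the peak:
c = np.correlate(imu_rate - imu_rate.mean(), cam_rate - cam_rate.mean(), "full")
lag = int(np.argmax(c)) - (len(t) - 1)
t_d = lag / fs        # ~0.03 s, with the convention t_imu = t_cam + t_d
```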

Popular Datasets for VIO VI-SLAM

48

• EuRoC: drone with IMU and stereo camera
• Blackbird: drone flying fast with VR goggles (synthetic images and real dynamics + IMU)
• UZH FPV Drone Racing: drone flying fast with event camera, stereo camera, IMU
• MVSEC: drone, motorcycle, car with event camera, stereo camera, 3D lidar, GPS, IMU

Understanding Check

Are you able to answer the following questions?

• Why is it recommended to use an IMU for Visual Odometry?
• Why not just an IMU, without a camera?
• What is the basic idea behind MEMS IMUs?
• What is the drift of a consumer IMU?
• What is the IMU measurement model (formula)?
• What causes the bias in an IMU?
• How do we model the bias?
• How do we integrate the acceleration to get the position (formula)?
• What is the definition of loosely coupled and tightly coupled visual inertial fusion?
• How does non-linear optimization-based visual inertial odometry work? Can you write down the cost function and illustrate its meaning?

49

Page 17: Vision Algorithms for Mobile Robotics Lecture 01 Introduction

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

17

Visual Inertial Odometry

Different paradigms exist

bull Loosely coupled

bull Treats VO and IMU as two separate (not coupled) black boxes

bull Each black box estimates pose and velocity from visual (up to a scale) and inertial data (absolute scale)

bull Easy to implement

bull Inaccurate Should not be used

bull Tightly coupled

bull Makes use of the raw sensorsrsquo measurements

bull 2D features

bull IMU readings

bull More accurate

bull More implementation effort

In this lecture we will only see tightly coupled approaches

18

The Loosely Coupled Approach

19

Feature tracking

VOimages

Position (up to a scale) amporientation

IMU Integration

PositionOrientation

Velocity

2D features

FusionRefined Position

OrientationVelocity

IMU measurements

The Tightly Coupled Approach

20

Feature tracking

images

IMU measurements

2D features

FusionRefined Position

OrientationVelocity

Filtering Visual Inertial Formulation

bull System states

21

Tightly coupled

Loosely coupled

Corke Lobo Dias An Introduction to Inertial and Visual Sensing International Journal of Robotics Research (IJRR) 2017 PDF

X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905 1198711 1198712 hellip 119871119896

X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

22

Closed-form Solution (1D case)

bull From the camera we can only get the absolute position 119909 up to a unknown scale factor 119904 thus

119909 = 119904 119909

where 119909 is the relative motion estimated by the camera

bull From the IMU

119909 = 1199090 + 1199070(1199051 minus 1199050) +ඵ1199050

1199051

119886 119905 1198891199052

bull By equating them

119904 119909 = 1199090 + 1199070 1199051 minus 1199050 +ඵ1199050

1199051

119886 119905 1198891199052

bull As shown in [Martinellirsquo14] if we assume to know 1199090 (usually we set it to 0) then for 6DOF both 119956 and 119959120782can be determined in closed form from a single feature observation and 3 views

23Martinelli Closed-form solution of visual-inertial structure from motion International Journal of Computer Vision (IJCV) 2014 PDF

1199050 1199051

1198711

Closed-form Solution (1D case)

24Martinelli Closed-form solution of visual-inertial structure from motion International Journal of Computer Vision (IJCV) 2014 PDF

119904 1199091 = 1199070 1199051 minus 1199050 +ඵ1199050

1199051

119886 119905 1198891199052

119904 1199092 = 1199070 1199052 minus 1199050 +ඵ1199050

1199052

119886 119905 1198891199052

1199091 (1199050minus1199051)1199092 (1199050minus1199052)

1199041199070

=

ඵ1199050

1199051

119886 119905 1198891199052

ඵ1199050

2

119886 119905 1198891199052

1199050 1199051 1199052

1198711

Closed-form Solution (general case)

bull Considers 119873 feature observations and the full 6DOF case

bull Can be used to initialize filters and non-linear optimizers (which always need an initialization point) [3]

bull More complex to derive than the 1D case But it also reaches a linear system of equations that can be solved using the pseudoinverse

25

119935 is the vector of unknownsbull Absolute scale 119904bull Initial velocity 1199070bull 3D Point distances (wrt the first camera) bull Direction of the gravity vector bull Biases

119912 and 119930 contain 2D feature coordinates acceleration and angular velocity measurements

[1] Martinelli Vision and IMU data fusion Closed-form solutions for attitude speed absolute scale and bias determination IEEE Transactions on Robotics (T-RO) 2012 PDF[2] Martinelli Closed-form solution of visual-inertial structure from motion International Journal of Computer Vision (IJCV) 2014 PDF[3] Kaiser Martinelli Fontana Scaramuzza Simultaneous state initialization and gyroscope bias calibration in visual inertial aided navigation IEEE Robotics and Automation Letters (R-AL) 2017 PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

26

Non-linear Optimization Methods

VIO is solved as non-linear Least Square optimization problem over

27

[1] Jung Taylor Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results International Conference on Computer Vision and Pattern Recognition (CVPR) 2001 PDF[2] Sterlow Singh Motion estimation from image and inertial measurements International Journal of Robotics Research (IJRR) 2004 PDF

X L 119887119886 119887119892 = 119886119903119892119898119894119899X L 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

Non-linear Optimization Methods

VIO is solved as non-linear Least Square optimization problem over

where

• X = {x_1, …, x_N}: set of robot state estimates x_k (position, velocity, orientation) at frame times k
• L = {L_1, …, L_M}: 3D landmarks
• f(x_{k−1}, u): pose prediction from the integration of the IMU measurements u = {a_j, ω_j}
• π(x_k, L_i): measurement update from the projection of landmark L_i onto camera frame I_k
• z_ik: observed features
• Λ_k: state covariance from the IMU integration f(x_{k−1}, u)
• Σ_ik: covariance of the noisy 2D feature measurements

28

{X, L, b_a, b_g}* = argmin over {X, L, b_a, b_g} of

    Σ_{k=1}^{N} || f(x_{k−1}, u) − x_k ||²_{Λ_k}  +  Σ_{k=1}^{N} Σ_{i=1}^{M} || π(x_k, L_i) − z_{ik} ||²_{Σ_{ik}}

(first sum: IMU residuals; second sum: reprojection residuals, the Bundle Adjustment term)

[1] Jung, Taylor, Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2001. PDF
[2] Strelow, Singh, Motion Estimation from Image and Inertial Measurements, International Journal of Robotics Research (IJRR), 2004. PDF
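The structure of this cost function can be illustrated with a deliberately tiny example. The 1D toy below is not any specific VIO system: states x_k are scalar positions, the "IMU" contributes relative displacements u_k, the landmarks L_i are known scalars, and the "projection" is simply z_ik = L_i − x_k; all numeric values are made up.

```python
import numpy as np
from scipy.optimize import least_squares

u = np.array([1.0, 1.0, 1.0])            # IMU-predicted displacements between frames
L = np.array([5.0, 10.0])                # landmark positions (kept fixed here)
z = np.array([[5.0, 10.0],               # z[k, i]: measurement of landmark i at frame k
              [3.9, 9.1],
              [3.0, 8.0],
              [2.1, 6.9]])
sigma_imu, sigma_cam = 0.1, 0.5          # sqrt of the Lambda / Sigma weights

def residuals(x):
    r_imu = (x[:-1] + u - x[1:]) / sigma_imu              # f(x_{k-1}, u) - x_k
    r_cam = ((L[None, :] - x[:, None]) - z) / sigma_cam   # pi(x_k, L_i) - z_ik
    return np.concatenate([r_imu, r_cam.ravel()])

sol = least_squares(residuals, np.zeros(4))  # Levenberg-Marquardt-style solver
print(sol.x)                                 # close to the true states [0, 1, 2, 3]
```

The solver balances the two residual classes exactly as in the cost above: tight IMU weights keep consecutive states one unit apart, while the (noisier) landmark observations anchor the absolute positions.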

Case Study 1: OKVIS

Because the complexity of the optimization is cubic in the number of camera poses and features, real-time operation becomes infeasible as the trajectory and the map grow over time. OKVIS proposed to optimize only the current pose and a window of past keyframes.

29
Leutenegger, Lynen, Bosse, Siegwart, Furgale, Keyframe-Based Visual-Inertial Odometry Using Nonlinear Optimization, International Journal of Robotics Research (IJRR), 2015. PDF

Case Study 2: SVO+GTSAM

Solves the same optimization problem as OKVIS, but:
• Keeps all the frames (from the start to the end of the trajectory)
• To make the optimization efficient:
  • Marginalizes the 3D landmarks: rather than triangulating landmarks, only their visual bearing measurements are kept, and the visual constraints are enforced via the epipolar line distance (Lecture 08)
  • Pre-integrates the IMU data between keyframes (see later)
• Optimization solved using factor graphs via GTSAM
• Very fast because it only optimizes the poses that are affected by a new observation

30
Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb. 2017. Best Paper Award 2018.

SVO+GTSAM

31
Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb. 2017. Best Paper Award 2018.

SVO+GTSAM

32
Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb. 2017. Best Paper Award 2018.

Problem with IMU integration

• The integration of the IMU measurements f(x_{k−1}, u) from k−1 to k depends on the state estimate at time k−1
• During the optimization, every time the linearization point at k−1 changes, the integration between k−1 and k must be re-evaluated, thus slowing down the optimization
• Idea: preintegration
  • Define relative motion increments, expressed in the body frame, that are independent of the global position, orientation, and velocity at k [1]
  • [2] extends this theory by leveraging the manifold structure of the rotation group SO(3)

33

[1] Lupton, Sukkarieh, Visual-Inertial-Aided Navigation for High-Dynamic Motion in Built Environments Without Initial Conditions, IEEE Transactions on Robotics (T-RO), 2012. PDF
[2] Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb. 2017. Best Paper Award 2018. PDF

{X, L, b_a, b_g}* = argmin over {X, L, b_a, b_g} of

    Σ_{k=1}^{N} || f(x_{k−1}, u) − x_k ||²_{Λ_k}  +  Σ_{k=1}^{N} Σ_{i=1}^{M} || π(x_k, L_i) − z_{ik} ||²_{Σ_{ik}}

(first sum: IMU residuals; second sum: reprojection residuals, the Bundle Adjustment term)

IMU Pre-Integration

34

IMU measurements (ω, a) → preintegrated deltas ΔR̃, Δṽ, Δp̃;  state estimates: R_k, p_k, v_k

Standard: evaluate the errors in the global frame (the integration must be repeated whenever the previous state changes):
  e_R = R̂(ω, R_{k−1})ᵀ R_k
  e_v = v̂(ω, a, v_{k−1}) − v_k
  e_p = p̂(ω, a, p_{k−1}) − p_k

Preintegration: evaluate relative errors (the IMU deltas are preintegrated once; no initial condition is required):
  e_R = ΔR̃ᵀ ΔR
  e_v = Δṽ − Δv
  e_p = Δp̃ − Δp

(In each error, the hatted/tilded term is the prediction from the IMU measurements and the other term is the current estimate.)
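A minimal numeric sketch of the preintegrated deltas, assuming bias- and noise-free samples and ignoring gravity for brevity (the on-manifold formulation in [2] additionally tracks the bias Jacobians and the covariance of the increments):

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def preintegrate(omegas, accs, dt):
    """Accumulate Delta R, Delta v, Delta p from raw gyro/accel samples.

    The increments depend only on the IMU data between the two keyframes,
    not on the global state at the earlier keyframe.
    """
    dR = np.eye(3)       # rotation increment
    dv = np.zeros(3)     # velocity increment (in the frame of keyframe k-1)
    dp = np.zeros(3)     # position increment
    for w, a in zip(omegas, accs):
        dp = dp + dv * dt + 0.5 * (dR @ a) * dt**2
        dv = dv + (dR @ a) * dt
        dR = dR @ R.from_rotvec(w * dt).as_matrix()   # right-multiply on SO(3)
    return dR, dv, dp

# 100 samples at 1 kHz: constant yaw rate, constant body-frame acceleration
omegas = np.tile([0.0, 0.0, 0.1], (100, 1))
accs = np.tile([1.0, 0.0, 0.0], (100, 1))
dR, dv, dp = preintegrate(omegas, accs, dt=1e-3)
print(R.from_matrix(dR).as_rotvec())   # accumulated rotation, about [0, 0, 0.01] rad
```

When a linearization point changes during the optimization, these deltas stay fixed; only the cheap composition with the new state at k−1 is re-evaluated.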

GTSAM

• GTSAM [1] is a library that implements sensor fusion based on factor graphs and Bayes trees
• Factor graphs are a tool borrowed from machine learning [2] that:
  • naturally capture the spatial dependencies among different sensor measurements
  • can represent many robotics problems, like pose graphs and SLAM

35

[1] Dellaert, Kaess, Factor Graphs for Robot Perception, Foundations and Trends in Robotics, 2017. PDF. Code.
[2] Kschischang, Frey, Loeliger, Factor Graphs and the Sum-Product Algorithm, IEEE Transactions on Information Theory, 2001. PDF

GTSAM

• GTSAM uses factor graphs and Bayes trees to implement incremental inference
• A Bayes tree is a data structure, a directed tree, that represents the connectivity among the past robot states and measurements
• Incremental inference means that, when new robot states or measurements are added to the graph, only a local part of the tree is updated

36
[1] Kaess, Ranganathan, Dellaert, "iSAM: Incremental Smoothing and Mapping", IEEE Transactions on Robotics (T-RO), 2008. PDF

Outline

• What is an IMU and why do we need it?
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and synchronization

37

Non-linear optimization vs Filtering

38

Non-linear optimization methods:
• Optimize a window of multiple states (or all the states) using non-linear least-squares optimization
• Multiple iterations (it re-linearizes at each iteration)
• Achieves the highest accuracy
• Slower (but fast with GTSAM)

Filtering methods:
• Solve the same problem by running only one iteration of the optimization (e.g., using an Extended Kalman Filter (EKF))
• One linearization iteration only
• Sensitive to the linearization point
• Fastest
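The "one iteration only" filtering step can be sketched with a generic linear EKF. The constant-velocity model below is purely illustrative (it is not the ROVIO or MSCKF model): the prediction plays the role of the IMU motion model, and the single Kalman correction plays the role of the visual update.

```python
import numpy as np

def ekf_step(x, P, F, Q, H, R_cov, z):
    # Prediction: propagate mean and covariance through the motion model
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Update: one Kalman correction with the measurement z (no re-linearization)
    S = H @ P_pred @ H.T + R_cov          # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)   # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

# Toy: state [position, velocity], position measured, dt = 0.1 s
F = np.array([[1.0, 0.1], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
x, P = np.zeros(2), np.eye(2)
x, P = ekf_step(x, P, F, np.eye(2) * 1e-3, H, np.array([[0.01]]), np.array([1.0]))
print(x)  # state pulled strongly toward the position measurement
```

An optimization method would instead keep re-linearizing and re-solving over the whole window, which is why the filter is faster but more sensitive to the linearization point.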

Case Study 1: ROVIO

• EKF state: X = {p(t), q(t), v(t), b_a(t), b_g(t), L_1, L_2, …, L_k}
• Basic idea:
  1. Prediction step: predicts the next position, velocity, orientation, and features using the IMU integration model
  2. Measurement update: refines the state by leveraging the visual constraints (ROVIO minimizes the photometric error between corresponding points; an alternative would be the reprojection error)

39
Bloesch, Burri, Omari, Hutter, Siegwart, Iterated Extended Kalman Filter Based Visual-Inertial Odometry Using Direct Photometric Feedback, International Journal of Robotics Research (IJRR), 2017. PDF. Code.

ROVIO: Problems

• The complexity of the EKF grows quadratically with the number of estimated landmarks
  • Thus, a maximum of ~20 landmarks are tracked to allow real-time operation
• Only updates the most recent state

40

Case Study 2: MSCKF

• MSCKF (Multi-State Constraint Kalman Filter) updates multiple past poses {p_C1, q_C1, …, p_CN, q_CN} in addition to the current state {p(t), q(t), v(t)}
• Prediction step: same as ROVIO
• Measurement update; differently from ROVIO:
  • Landmark positions are not added to the state vector; thus, it can run very fast, independently of the number of features
  • The visual constraint is obtained from the epipolar line distance (Lecture 08)
• Used in spacecraft landing (NASA/JPL Moon and Mars landing), DJI drones, Google ARCore
• Released open source within the OpenVINS project: https://github.com/rpng/open_vins

41

[1] Mourikis, Roumeliotis, A Multi-State Constraint Kalman Filter for Vision-Aided Inertial Navigation, International Conference on Robotics and Automation (ICRA), 2007. PDF
[2] Li, Mourikis, High-Precision, Consistent EKF-Based Visual-Inertial Odometry, International Journal of Robotics Research (IJRR), 2013. PDF

State vector: X = {p(t), q(t), v(t), b_a(t), b_g(t), p_C1, q_C1, …, p_CN, q_CN}

MSCKF running in Google ARCore (formerly Google Tango)

42

[1] Mourikis, Roumeliotis, A Multi-State Constraint Kalman Filter for Vision-Aided Inertial Navigation, International Conference on Robotics and Automation (ICRA), 2007. PDF
[2] Li, Mourikis, High-Precision, Consistent EKF-Based Visual-Inertial Odometry, International Journal of Robotics Research (IJRR), 2013. PDF

Outline

• What is an IMU and why do we need it?
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and synchronization

43

Camera-IMU Calibration

• Goal: estimate the rigid-body transformation T_BC and the time offset t_d between the camera and the IMU, caused by communication delays and internal sensor delays (introduced by filters and logic)
• Assumptions: camera and IMU rigidly attached; camera intrinsically calibrated
• Data:
  • Image points from a calibration pattern (checkerboard or QR board)
  • IMU measurements: accelerometer {a_k} and gyroscope {ω_k}

44

(Figure: timeline of image timestamps vs. IMU timestamps, offset by the time delay t_d)
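Kalibr estimates t_d jointly within its batch optimization (next slide). A common rough way to get an initial guess for the offset, sketched here with synthetic signals rather than real camera poses, is to cross-correlate the gyroscope's angular-rate magnitude with the angular rate differentiated from the camera's rotation estimates:

```python
import numpy as np

rate = 100.0                                   # common resampling rate [Hz]
t = np.arange(0.0, 10.0, 1.0 / rate)
true_shift = 0.05                              # ground-truth camera delay [s]

# Stand-ins for |omega| from the gyro and from differentiated camera poses;
# the camera-derived signal is a delayed copy of the IMU one.
imu_rate = np.abs(np.sin(2 * np.pi * 0.7 * t))
cam_rate = np.abs(np.sin(2 * np.pi * 0.7 * (t - true_shift)))

# Cross-correlate the zero-mean signals and pick the best-aligning lag
corr = np.correlate(imu_rate - imu_rate.mean(),
                    cam_rate - cam_rate.mean(), mode="full")
lag = np.argmax(corr) - (len(t) - 1)           # lag in samples (negative: IMU leads)
t_d = -lag / rate                              # estimated camera delay w.r.t. the IMU
print(t_d)                                     # close to 0.05 s
```

The resolution of this estimate is one sample period; the spline-based continuous-time optimization then refines t_d to sub-sample accuracy.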

Kalibr Toolbox

45
Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF. Code.

• Code: https://github.com/ethz-asl/kalibr/wiki/camera-imu-calibration

Kalibr Toolbox

46
Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF. Code.

• Solves a non-linear least-squares optimization problem similar to the one seen before, but also optimizes over T_BC and t_d
• Continuous-time modeling, using splines for X
• Numerical solver: Levenberg-Marquardt

{X, L, T_BC, t_d, b_a, b_g}* = argmin over {X, L, T_BC, t_d, b_a, b_g} of

    Σ_{k=1}^{N} || f(x_{k−1}, u) − x_k ||²_{Λ_k}  +  Σ_{k=1}^{N} Σ_{i=1}^{M} || π(x_k, L_i) − z_{ik} ||²_{Σ_{ik}}

(first sum: IMU residuals; second sum: reprojection residuals, the Bundle Adjustment term)

Kalibr Toolbox

47
Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF. Code.

• Generates a report after optimizing the cost function:

Residuals:
  Reprojection error [px]:      0.0976  0.051
  Gyroscope error [rad/s]:      0.0167  0.009
  Accelerometer error [m/s^2]:  0.0595  0.031

Transformation T_ci (imu to cam):
[[ 0.99995526 -0.00934911 -0.00143776  0.00008436]
 [ 0.00936458  0.99989388  0.01115983  0.00197427]
 [ 0.00133327 -0.0111728   0.99993669 -0.05054946]
 [ 0.          0.          0.          1.        ]]

Time shift (delay d), cam0 to imu0 [s] (t_imu = t_cam + shift):
0.00270636270255

Popular Datasets for VIO / VI-SLAM

48

EuRoC: drone with IMU and stereo camera

Blackbird: drone flying fast with VR goggles (synthetic images and real dynamics + IMU)

UZH FPV Drone Racing: drone flying fast with event camera, stereo camera, IMU

MVSEC: drone, motorcycle, car with event camera, stereo camera, 3D LiDAR, GPS, IMU

Understanding Check

Are you able to answer the following questions?

• Why is it recommended to use an IMU for Visual Odometry?
• Why not just an IMU, without a camera?
• What is the basic idea behind MEMS IMUs?
• What is the drift of a consumer IMU?
• What is the IMU measurement model (formula)?
• What causes the bias in an IMU?
• How do we model the bias?
• How do we integrate the acceleration to get the position (formula)?
• What is the definition of loosely coupled and tightly coupled visual inertial fusion?
• How does non-linear optimization-based visual inertial odometry work? Can you write down the cost function and illustrate its meaning?

49

Page 18: Vision Algorithms for Mobile Robotics Lecture 01 Introduction

Visual Inertial Odometry

Different paradigms exist:

• Loosely coupled:
  • Treats VO and IMU as two separate (not coupled) black boxes
  • Each black box estimates pose and velocity, from visual data (up to a scale) and from inertial data (absolute scale)
  • Easy to implement
  • Inaccurate; should not be used
• Tightly coupled:
  • Makes use of the raw sensors' measurements:
    • 2D features
    • IMU readings
  • More accurate
  • More implementation effort

In this lecture we will only see tightly coupled approaches.

18

The Loosely Coupled Approach

19

(Block diagram: images → feature tracking → 2D features → VO → position (up to a scale) & orientation; IMU measurements → IMU integration → position, orientation, velocity; both results are then fused into a refined position, orientation, and velocity)

The Tightly Coupled Approach

20

(Block diagram: images → feature tracking → 2D features; the 2D features and the raw IMU measurements are fused directly into a refined position, orientation, and velocity)

Filtering Visual Inertial Formulation

• System states:

21

Tightly coupled: X = {p(t), q(t), v(t), b_a(t), b_g(t), L_1, L_2, …, L_k}

Loosely coupled: X = {p(t), q(t), v(t), b_a(t), b_g(t)}

Corke, Lobo, Dias, An Introduction to Inertial and Visual Sensing, International Journal of Robotics Research (IJRR), 2007. PDF

Outline

• What is an IMU and why do we need it?
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and synchronization

22

Closed-form Solution (1D case)

• From the camera we can only get the absolute position x up to an unknown scale factor s, thus:

x = s x̃

where x̃ is the relative motion estimated by the camera.

• From the IMU:

x = x_0 + v_0 (t_1 − t_0) + ∬_{t_0}^{t_1} a(t) dt²

• By equating them:

s x̃ = x_0 + v_0 (t_1 − t_0) + ∬_{t_0}^{t_1} a(t) dt²

• As shown in [Martinelli'14], if we assume to know x_0 (usually we set it to 0), then for 6DOF both s and v_0 can be determined in closed form from a single feature observation and 3 views

23
Martinelli, Closed-Form Solution of Visual-Inertial Structure from Motion, International Journal of Computer Vision (IJCV), 2014. PDF

(Figure: camera at times t_0 and t_1 observing landmark L_1)

Closed-form Solution (1D case)

24
Martinelli, Closed-Form Solution of Visual-Inertial Structure from Motion, International Journal of Computer Vision (IJCV), 2014. PDF

s x̃_1 = v_0 (t_1 − t_0) + ∬_{t_0}^{t_1} a(t) dt²

s x̃_2 = v_0 (t_2 − t_0) + ∬_{t_0}^{t_2} a(t) dt²

In matrix form:

[ x̃_1  (t_0 − t_1) ] [ s   ]   [ ∬_{t_0}^{t_1} a(t) dt² ]
[ x̃_2  (t_0 − t_2) ] [ v_0 ] = [ ∬_{t_0}^{t_2} a(t) dt² ]

(Figure: camera at times t_0, t_1, t_2 observing landmark L_1)
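The 2×2 system above can be checked numerically. In the sketch below the ground-truth scale and initial velocity are chosen arbitrarily, and a constant acceleration makes the double integrals analytic:

```python
import numpy as np

s_true, v0 = 2.0, 0.4
t0, t1, t2 = 0.0, 0.5, 1.0
a = 1.0                                        # constant acceleration
I1 = 0.5 * a * (t1 - t0) ** 2                  # double integral over [t0, t1]
I2 = 0.5 * a * (t2 - t0) ** 2                  # double integral over [t0, t2]

x1 = v0 * (t1 - t0) + I1                       # true metric displacements
x2 = v0 * (t2 - t0) + I2
x1_tilde, x2_tilde = x1 / s_true, x2 / s_true  # camera's up-to-scale estimates

# Solve [x_tilde, (t0 - t)] @ [s, v0] = double integral, for both views
A = np.array([[x1_tilde, t0 - t1],
              [x2_tilde, t0 - t2]])
b = np.array([I1, I2])
s_est, v0_est = np.linalg.solve(A, b)
print(s_est, v0_est)                           # recovers 2.0 and 0.4
```

The system is solvable because the IMU terms are metric while the camera term is only up to scale; a zero acceleration would make the two rows proportional and the system degenerate.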

Closed-form Solution (general case)

• Considers N feature observations and the full 6DOF case
• Can be used to initialize filters and non-linear optimizers (which always need an initialization point) [3]
• More complex to derive than the 1D case, but it also leads to a linear system of equations that can be solved using the pseudoinverse

25

X is the vector of unknowns:
• Absolute scale s
• Initial velocity v_0
• 3D point distances (w.r.t. the first camera)
• Direction of the gravity vector
• Biases

A and S contain the 2D feature coordinates and the acceleration and angular-velocity measurements.

[1] Martinelli, Vision and IMU Data Fusion: Closed-Form Solutions for Attitude, Speed, Absolute Scale, and Bias Determination, IEEE Transactions on Robotics (T-RO), 2012. PDF
[2] Martinelli, Closed-Form Solution of Visual-Inertial Structure from Motion, International Journal of Computer Vision (IJCV), 2014. PDF
[3] Kaiser, Martinelli, Fontana, Scaramuzza, Simultaneous State Initialization and Gyroscope Bias Calibration in Visual Inertial Aided Navigation, IEEE Robotics and Automation Letters (R-AL), 2017. PDF


Page 19: Vision Algorithms for Mobile Robotics Lecture 01 Introduction

The Loosely Coupled Approach

19

Feature tracking

VOimages

Position (up to a scale) amporientation

IMU Integration

PositionOrientation

Velocity

2D features

FusionRefined Position

OrientationVelocity

IMU measurements

The Tightly Coupled Approach

20

Feature tracking

images

IMU measurements

2D features

FusionRefined Position

OrientationVelocity

Filtering Visual Inertial Formulation

bull System states

21

Tightly coupled

Loosely coupled

Corke Lobo Dias An Introduction to Inertial and Visual Sensing International Journal of Robotics Research (IJRR) 2017 PDF

X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905 1198711 1198712 hellip 119871119896

X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

22

Closed-form Solution (1D case)

• From the camera we can only get the position x up to an unknown scale factor s, thus

x = s x̃

where x̃ is the relative motion estimated by the camera.

• From the IMU:

x = x0 + v0 (t1 − t0) + ∫∫_{t0}^{t1} a(t) dt²

• By equating them:

s x̃ = x0 + v0 (t1 − t0) + ∫∫_{t0}^{t1} a(t) dt²

• As shown in [Martinelli'14], if we assume to know x0 (usually we set it to 0), then for 6DOF both s and v0 can be determined in closed form from a single feature observation and 3 views.

23 Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF

(Figure: 1D timeline with camera views at t0 and t1 observing landmark L1.)

Closed-form Solution (1D case)

24 Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF

s x̃1 = v0 (t1 − t0) + ∫∫_{t0}^{t1} a(t) dt²

s x̃2 = v0 (t2 − t0) + ∫∫_{t0}^{t2} a(t) dt²

In matrix form:

[ x̃1  (t0 − t1) ] [ s  ]   =   [ ∫∫_{t0}^{t1} a(t) dt² ]
[ x̃2  (t0 − t2) ] [ v0 ]       [ ∫∫_{t0}^{t2} a(t) dt² ]

(Figure: 1D timeline with camera views at t0, t1, t2 observing landmark L1.)
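The 2×2 linear system above can be solved directly. A minimal numerical sketch (synthetic, noise-free data with constant acceleration; all names illustrative, not part of the lecture material):

```python
import numpy as np

# Solve the 1D closed-form visual-inertial alignment for the absolute
# scale s and the initial velocity v0, assuming x0 = 0.

def double_integral(a, t, t_end):
    """Numerical double integral of a(t) from t[0] to t_end (trapezoidal)."""
    m = t <= t_end + 1e-9
    ts, av = t[m], a[m]
    dts = np.diff(ts)
    # velocity = single integral of a, then integrate velocity once more
    v = np.concatenate([[0.0], np.cumsum(0.5 * (av[1:] + av[:-1]) * dts)])
    return float(np.sum(0.5 * (v[1:] + v[:-1]) * dts))

a_true, v0_true, s_true = 0.8, 0.3, 2.5        # ground truth
t = np.linspace(0.0, 2.0, 2001)
a = np.full_like(t, a_true)                    # accelerometer samples
t0, t1, t2 = 0.0, 1.0, 2.0

def x(ti):                                     # metric position, x0 = 0
    return v0_true * ti + 0.5 * a_true * ti**2

x1_t, x2_t = x(t1) / s_true, x(t2) / s_true    # camera: positions up to scale

# [x1_t (t0 - t1); x2_t (t0 - t2)] [s; v0] = [I1; I2]
A = np.array([[x1_t, t0 - t1],
              [x2_t, t0 - t2]])
b = np.array([double_integral(a, t, t1), double_integral(a, t, t2)])
s_est, v0_est = np.linalg.solve(A, b)
print(round(s_est, 6), round(v0_est, 6))       # 2.5 0.3
```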

Closed-form Solution (general case)

bull Considers 119873 feature observations and the full 6DOF case

bull Can be used to initialize filters and non-linear optimizers (which always need an initialization point) [3]

• More complex to derive than the 1D case, but it also leads to a linear system of equations A X = S that can be solved using the pseudoinverse.

25

X is the vector of unknowns:
• Absolute scale s
• Initial velocity v0
• 3D point distances (w.r.t. the first camera)
• Direction of the gravity vector
• Biases

A and S contain the 2D feature coordinates, the acceleration, and the angular velocity measurements.

[1] Martinelli, Vision and IMU data fusion: Closed-form solutions for attitude, speed, absolute scale and bias determination, IEEE Transactions on Robotics (T-RO), 2012. PDF
[2] Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF
[3] Kaiser, Martinelli, Fontana, Scaramuzza, Simultaneous state initialization and gyroscope bias calibration in visual inertial aided navigation, IEEE Robotics and Automation Letters (RA-L), 2017. PDF
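As a sketch of that final solve (with a hypothetical A and S; the real ones stack the feature and inertial terms from [1, 2]), the pseudoinverse / least-squares step looks like:

```python
import numpy as np

# Illustrative only: recover the unknown vector X of an overdetermined
# linear system A X = S via least squares (pseudoinverse).

rng = np.random.default_rng(1)
A = rng.standard_normal((12, 4))        # 12 equations, 4 unknowns
X_true = np.array([2.0, -1.0, 0.5, 3.0])
S = A @ X_true                          # noise-free "measurements"
X_est, *_ = np.linalg.lstsq(A, S, rcond=None)
print(np.round(X_est, 6))               # recovers X_true
```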

Outline

• What is an IMU and why do we need it?
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and Synchronization

26

Non-linear Optimization Methods

VIO is solved as a non-linear Least-Squares optimization problem over X, L, b_a, b_g:

27

[1] Jung, Taylor, Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results, International Conference on Computer Vision and Pattern Recognition (CVPR), 2001. PDF
[2] Strelow, Singh, Motion estimation from image and inertial measurements, International Journal of Robotics Research (IJRR), 2004. PDF

{X, L, b_a, b_g} = argmin over {X, L, b_a, b_g} of

Σ_{k=1}^{N} ‖ f(x_{k−1}, u) − x_k ‖²_{Λ_k}  +  Σ_{k=1}^{N} Σ_{i=1}^{M} ‖ π(x_k, L_i) − z_{ik} ‖²_{Σ_{ik}}

(first term: IMU residuals; second term: reprojection residuals, i.e., the Bundle Adjustment term)

Non-linear Optimization Methods

VIO is solved as a non-linear Least-Squares optimization problem over X, L, b_a, b_g,

where

• X = {x_1, ..., x_N} is the set of robot state estimates x_k (position, velocity, orientation) at frame times k

• L = {L_1, ..., L_M} are the 3D landmarks

• f(x_{k−1}, u) is the pose prediction from the integration of the IMU measurements u = {a_j, ω_j}

• π(x_k, L_i) is the measurement update from the projection of landmark L_i onto camera frame I_k

• z_{ik} are the observed features

• Λ_k is the state covariance from the IMU integration f(x_{k−1}, u)

• Σ_{ik} is the covariance of the noisy 2D feature measurements

28

{X, L, b_a, b_g} = argmin over {X, L, b_a, b_g} of

Σ_{k=1}^{N} ‖ f(x_{k−1}, u) − x_k ‖²_{Λ_k}  +  Σ_{k=1}^{N} Σ_{i=1}^{M} ‖ π(x_k, L_i) − z_{ik} ‖²_{Σ_{ik}}

(first term: IMU residuals; second term: reprojection residuals, i.e., the Bundle Adjustment term)
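To make the two residual terms concrete, here is a deliberately tiny 1D instantiation (scalar states, f(x_{k−1}, u) = x_{k−1} + u, π(x_k, L_i) = L_i − x_k; all names illustrative, not the actual VIO models):

```python
import numpy as np

# Toy 1D version of the VIO cost: sum of weighted IMU residuals and
# "reprojection" residuals, evaluated at a given set of states/landmarks.

def vio_cost(X, L, u, z, lam=1.0, sig=1.0):
    c = 0.0
    for k in range(1, len(X)):          # IMU residuals ||f(x_{k-1},u) - x_k||^2
        r = (X[k - 1] + u[k - 1]) - X[k]
        c += r * r / lam
    for k in range(len(X)):             # reprojection residuals ||pi - z_ik||^2
        for i in range(len(L)):
            r = (L[i] - X[k]) - z[k, i]
            c += r * r / sig
    return c

# Noise-free toy data: the true states/landmarks zero out both residual terms
X_true = np.array([0.0, 1.0, 2.0])
L_true = np.array([5.0, -3.0])
u = np.diff(X_true)                     # perfect "IMU" increments
z = L_true[None, :] - X_true[:, None]   # perfect feature measurements
print(vio_cost(X_true, L_true, u, z))        # 0.0
print(vio_cost(X_true + 0.1, L_true, u, z))  # > 0
```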

[1] Jung, Taylor, Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results, International Conference on Computer Vision and Pattern Recognition (CVPR), 2001. PDF
[2] Strelow, Singh, Motion estimation from image and inertial measurements, International Journal of Robotics Research (IJRR), 2004. PDF

Case Study 1 OKVIS

Because the complexity of the optimization is cubic in the number of camera poses and features, real-time operation becomes infeasible as the trajectory and the map grow over time. OKVIS proposed to optimize only the current pose and a window of past keyframes.

29 Leutenegger, Lynen, Bosse, Siegwart, Furgale, Keyframe-based visual-inertial odometry using nonlinear optimization, International Journal of Robotics Research (IJRR), 2015. PDF

Case Study 2 SVO+GTSAM

Solves the same optimization problem as OKVIS but

• Keeps all the frames (from the start to the end of the trajectory).

• To make the optimization efficient, it:
  • Marginalizes the 3D landmarks: rather than triangulating the landmarks, only their visual bearing measurements are kept, and the visual constraints are enforced via the Epipolar Line Distance (Lecture 08).
  • Pre-integrates the IMU data between keyframes (see later).

• The optimization is solved using Factor Graphs via GTSAM.

• Very fast, because it only optimizes the poses that are affected by a new observation.

30 Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb 2017. Best Paper Award 2018.

SVO+GTSAM (slides 31-32: video demonstrations)

Problem with IMU integration

• The integration of IMU measurements f(x_{k−1}, u) from k−1 to k depends on the state estimate at time k−1.

• During the optimization, every time the linearization point at k−1 changes, the integration between k−1 and k must be re-evaluated, thus slowing down the optimization.

• Idea: Preintegration
  • [1] defines relative motion increments, expressed in the body frame, which are independent of the global position, orientation, and velocity at k−1.
  • [2] builds on this theory by leveraging the manifold structure of the rotation group SO(3).

33

[1] Lupton, Sukkarieh, Visual-inertial-aided navigation for high-dynamic motion in built environments without initial conditions, IEEE Transactions on Robotics (T-RO), 2012. PDF
[2] Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb 2017. Best Paper Award 2018.

{X, L, b_a, b_g} = argmin over {X, L, b_a, b_g} of

Σ_{k=1}^{N} ‖ f(x_{k−1}, u) − x_k ‖²_{Λ_k}  +  Σ_{k=1}^{N} Σ_{i=1}^{M} ‖ π(x_k, L_i) − z_{ik} ‖²_{Σ_{ik}}

(first term: IMU residuals; second term: reprojection residuals, i.e., the Bundle Adjustment term)

IMU Pre-Integration

34

The IMU measurements (ω, ã) are summarized into preintegrated deltas ΔR̃, Δṽ, Δp̃ between the states (R_k, p_k, v_k). Tilded quantities are predictions from the measurements; plain symbols are the current estimates.

Standard approach (evaluate the errors in the global frame):

e_R = R̂(ω, R_{k−1})ᵀ R_k
e_v = v̂(ω, ã, v_{k−1}) − v_k
e_p = p̂(ω, ã, p_{k−1}) − p_k

→ the integration must be repeated every time the previous state changes.

Preintegration (evaluate relative errors):

e_R = ΔR̃ᵀ ΔR
e_v = Δṽ − Δv
e_p = Δp̃ − Δp

→ preintegration of the IMU deltas is possible; no initial condition is required.
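A minimal sketch of what the preintegrated deltas compute (assuming noise- and bias-free measurements, with SO(3) handled via Rodrigues' formula; this is not the on-manifold uncertainty propagation of [2]):

```python
import numpy as np

# Accumulate body-frame preintegrated deltas (dR, dv, dp) over a batch of
# IMU samples. The deltas depend only on the gyro/accel readings, not on
# the global state at frame k-1, so they need not be re-integrated when
# the linearization point changes.

def exp_so3(w):
    """Rodrigues' formula: axis-angle vector -> rotation matrix."""
    th = np.linalg.norm(w)
    if th < 1e-12:
        return np.eye(3)
    K = np.array([[0, -w[2], w[1]],
                  [w[2], 0, -w[0]],
                  [-w[1], w[0], 0]]) / th
    return np.eye(3) + np.sin(th) * K + (1 - np.cos(th)) * (K @ K)

def preintegrate(omegas, accs, dt):
    dR, dv, dp = np.eye(3), np.zeros(3), np.zeros(3)
    for w, a in zip(omegas, accs):
        dp = dp + dv * dt + 0.5 * (dR @ a) * dt**2
        dv = dv + (dR @ a) * dt
        dR = dR @ exp_so3(w * dt)
    return dR, dv, dp

# Example: constant rotation about z, zero acceleration, for one second
n, dt = 100, 0.01
omegas = np.tile([0.0, 0.0, np.pi / 2], (n, 1))   # 90 deg/s
accs = np.zeros((n, 3))
dR, dv, dp = preintegrate(omegas, accs, dt)
print(np.round(dR, 3))   # ~ rotation of 90 degrees about z
```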

GTSAM

• GTSAM [1] is a library that implements sensor fusion based on factor graphs and Bayes trees.

• Factor graphs are a tool borrowed from machine learning [2] that:
  • naturally capture the spatial dependencies among different sensor measurements
  • can represent many robotics problems, like pose graphs and SLAM

35

[1] Dellaert, Kaess, Factor Graphs for Robot Perception, Foundations and Trends in Robotics, 2017. PDF, Code
[2] Kschischang, Frey, Loeliger, Factor Graphs and the Sum-Product Algorithm, Transactions on Information Theory, 2001. PDF

GTSAM

• GTSAM uses factor graphs and Bayes trees to implement incremental inference.

• A Bayes tree is a data structure (a directed tree) that represents the connectivity among the past robot states and measurements.

• Incremental inference means that, when new robot states or measurements are added to the graph, only a local part of the tree is updated.

36 [1] Kaess, Ranganathan, Dellaert, "iSAM: Incremental Smoothing and Mapping", IEEE Transactions on Robotics (T-RO), 2008. PDF

Outline

• What is an IMU and why do we need it?
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and Synchronization

37

Non-linear optimization vs Filtering

38

Non-linear Optimization methods:
• Optimize a window of multiple states (or all the states) using non-linear Least-Squares optimization
• Multiple iterations (re-linearizes at each iteration)
• Achieves the highest accuracy
• Slower (but fast with GTSAM)

Filtering methods:
• Solve the same problem by running only one iteration of the optimization (e.g., using an Extended Kalman Filter (EKF))
• 1 linearization iteration only
• Sensitive to the linearization point
• Fastest

Case Study 1 ROVIO

• EKF state: X = {p(t), q(t), v(t), b_a(t), b_g(t), L_1, L_2, ..., L_k}

• Basic idea:

1. Prediction step: predicts the next position, velocity, orientation, and features using the IMU integration model.

2. Measurement update: refines the state by leveraging the visual constraints (ROVIO minimizes the photometric error between corresponding points; an alternative would be the reprojection error).

39 Bloesch, Burri, Omari, Hutter, Siegwart, Iterated extended Kalman filter based visual-inertial odometry using direct photometric feedback, International Journal of Robotics Research (IJRR), 2017. PDF, Code
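The predict/update cycle can be sketched with a generic EKF on a toy 1D constant-velocity state (a stand-in for ROVIO's much richer state and photometric update; all parameters are illustrative):

```python
import numpy as np

# Generic EKF skeleton on state [p, v]: the prediction step plays the role
# of the IMU integration model; a noiseless position measurement stands in
# for the visual update. Illustrative only, not ROVIO's actual model.

def ekf_predict(x, P, dt, q=1e-2):
    F = np.array([[1.0, dt], [0.0, 1.0]])   # motion-model Jacobian
    return F @ x, F @ P @ F.T + q * np.eye(2)

def ekf_update(x, P, z, r=1e-2):
    H = np.array([[1.0, 0.0]])              # measure position only
    S = H @ P @ H.T + r                     # innovation covariance (1x1)
    K = P @ H.T / S                         # Kalman gain (2x1)
    x = x + (K * (z - H @ x)).ravel()
    P = (np.eye(2) - K @ H) @ P
    return x, P

x, P = np.array([0.0, 0.0]), np.eye(2)      # start with unknown velocity
dt = 0.1
for k in range(1, 51):                      # true motion: p(t) = 1.0 * t
    x, P = ekf_predict(x, P, dt)
    x, P = ekf_update(x, P, z=1.0 * k * dt)
print(np.round(x, 2))                       # converges toward [5.0, 1.0]
```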

ROVIO Problems

• The complexity of the EKF grows quadratically in the number of estimated landmarks.
• Thus, a maximum of about 20 landmarks are tracked to allow real-time operation.

• Only updates the most recent state.

40

Case Study 2 MSCKF

• MSCKF (Multi-State Constraint Kalman Filter) updates multiple past poses {p_C1, q_C1, ..., p_CN, q_CN} in addition to the current state {p(t), q(t), v(t)}.

• State vector:

X = {p(t), q(t), v(t), b_a(t), b_g(t), p_C1, q_C1, ..., p_CN, q_CN}

• Prediction step: same as ROVIO.

• Measurement update, differently from ROVIO:
  • Landmark positions are not added to the state vector; thus it can run very fast, independently of the number of features.
  • The visual constraint is obtained from the Epipolar Line Distance (Lecture 08).

• Used in spacecraft landing (NASA/JPL Moon and Mars landing), DJI drones, Google ARCore.

• Released open source within the OpenVins project: https://github.com/rpng/open_vins

41

[1] Mourikis, Roumeliotis, A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation, International Conference on Robotics and Automation (ICRA), 2007. PDF

[2] Li, Mourikis, High-precision, consistent EKF-based visual-inertial odometry, International Journal of Robotics Research (IJRR), 2013. PDF

MSCKF running in Google ARCore (formerly Google Tango)

42


Outline

• What is an IMU and why do we need it?
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and Synchronization

43

Camera-IMU Calibration

• Goal: estimate the rigid-body transformation T_BC and the time offset t_d between the camera and the IMU, caused by communication delays and internal sensor delays (introduced by filters and logic).

• Assumptions: camera and IMU rigidly attached; camera intrinsically calibrated.

• Data:
  • Image points from a calibration pattern (checkerboard or QR board)
  • IMU measurements: accelerometer {a_k} and gyroscope {ω_k}

44

(Figure: timelines of image and IMU timestamps, offset by the delay t_d.)
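A simple way to get an initial guess of t_d (a cruder alternative to Kalibr's continuous-time approach, shown only for intuition) is to cross-correlate the gyroscope rate with the rotation rate inferred from the camera frames and pick the lag with maximum correlation. Synthetic data throughout:

```python
import numpy as np

# Estimate the camera-IMU time offset t_d by discrete cross-correlation.
# The "camera" signal is a copy of the gyro signal delayed by 5 samples.

rng = np.random.default_rng(0)
rate = 200.0                               # sample rate [Hz]
n = 2000
gyro = rng.standard_normal(n)              # angular-rate signal seen by the IMU
shift = 5                                  # ground-truth delay: 5 samples = 25 ms
cam = np.concatenate([np.zeros(shift), gyro[:-shift]])

def xcorr_lag(a, b, max_lag=50):
    """Lag l maximizing sum_i a[i] * b[i + l], for l in [-max_lag, max_lag]."""
    best_l, best_c = 0, -np.inf
    for l in range(-max_lag, max_lag + 1):
        a_seg = a[max(0, -l): len(a) - max(0, l)]
        b_seg = b[max(0, l): len(b) + min(0, l)]
        c = float(np.dot(a_seg, b_seg))
        if c > best_c:
            best_l, best_c = l, c
    return best_l

t_d = xcorr_lag(gyro, cam) / rate          # here t_cam = t_imu + t_d
print(t_d)   # 0.025
```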

Kalibr Toolbox

45 Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

• Code: https://github.com/ethz-asl/kalibr/wiki/camera-imu-calibration

Kalibr Toolbox

46 Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

• Solves a non-linear Least-Squares optimization problem similar to the one seen before, but also optimizes over T_BC and t_d.

• Continuous-time modelling of X using splines.

• Numerical solver: Levenberg-Marquardt.

{X, L, T_BC, t_d, b_a, b_g} = argmin over {X, L, T_BC, t_d, b_a, b_g} of

Σ_{k=1}^{N} ‖ f(x_{k−1}, u) − x_k ‖²_{Λ_k}  +  Σ_{k=1}^{N} Σ_{i=1}^{M} ‖ π(x_k, L_i) − z_{ik} ‖²_{Σ_{ik}}

(first term: IMU residuals; second term: reprojection residuals, i.e., the Bundle Adjustment term)

Kalibr Toolbox

47 Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

bull Generates a report after optimizing the cost function

Residuals:
  Reprojection error [px]:     0.0976  0.051
  Gyroscope error [rad/s]:     0.0167  0.009
  Accelerometer error [m/s^2]: 0.0595  0.031

Transformation T_ci (imu to cam):
  [[ 0.99995526  -0.00934911  -0.00143776   0.00008436]
   [ 0.00936458   0.99989388   0.01115983   0.00197427]
   [ 0.00133327  -0.0111728    0.99993669  -0.05054946]
   [ 0            0            0            1         ]]

Time shift (delay t_d), cam0 to imu0 [s] (t_imu = t_cam + shift): 0.00270636270255

Popular Datasets for VIO VI-SLAM

48

• EuRoC: drone with IMU and stereo camera

• Blackbird: drone flying fast with VR goggles (synthetic images and real dynamics + IMU)

• UZH FPV Drone Racing: drone flying fast with event camera, stereo camera, IMU

• MVSEC: drone, motorcycle, car with event camera, stereo camera, 3D lidar, GPS, IMU

Understanding Check

Are you able to answer the following questions?

• Why is it recommended to use an IMU for Visual Odometry?
• Why not just an IMU without a camera?
• What is the basic idea behind MEMS IMUs?
• What is the drift of a consumer IMU?
• What is the IMU measurement model (formula)?
• What causes the bias in an IMU?
• How do we model the bias?
• How do we integrate the acceleration to get the position (formula)?
• What is the definition of loosely coupled and tightly coupled visual inertial fusion?
• How does non-linear optimization-based visual inertial odometry work? Can you write down the cost function and illustrate its meaning?

49

Page 20: Vision Algorithms for Mobile Robotics Lecture 01 Introduction

The Tightly Coupled Approach

20

Feature tracking

images

IMU measurements

2D features

FusionRefined Position

OrientationVelocity

Filtering Visual Inertial Formulation

bull System states

21

Tightly coupled

Loosely coupled

Corke Lobo Dias An Introduction to Inertial and Visual Sensing International Journal of Robotics Research (IJRR) 2017 PDF

X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905 1198711 1198712 hellip 119871119896

X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

22

Closed-form Solution (1D case)

bull From the camera we can only get the absolute position 119909 up to a unknown scale factor 119904 thus

119909 = 119904 119909

where 119909 is the relative motion estimated by the camera

bull From the IMU

119909 = 1199090 + 1199070(1199051 minus 1199050) +ඵ1199050

1199051

119886 119905 1198891199052

bull By equating them

119904 119909 = 1199090 + 1199070 1199051 minus 1199050 +ඵ1199050

1199051

119886 119905 1198891199052

bull As shown in [Martinellirsquo14] if we assume to know 1199090 (usually we set it to 0) then for 6DOF both 119956 and 119959120782can be determined in closed form from a single feature observation and 3 views

23Martinelli Closed-form solution of visual-inertial structure from motion International Journal of Computer Vision (IJCV) 2014 PDF

1199050 1199051

1198711

Closed-form Solution (1D case)

24Martinelli Closed-form solution of visual-inertial structure from motion International Journal of Computer Vision (IJCV) 2014 PDF

119904 1199091 = 1199070 1199051 minus 1199050 +ඵ1199050

1199051

119886 119905 1198891199052

119904 1199092 = 1199070 1199052 minus 1199050 +ඵ1199050

1199052

119886 119905 1198891199052

1199091 (1199050minus1199051)1199092 (1199050minus1199052)

1199041199070

=

ඵ1199050

1199051

119886 119905 1198891199052

ඵ1199050

2

119886 119905 1198891199052

1199050 1199051 1199052

1198711

Closed-form Solution (general case)

bull Considers 119873 feature observations and the full 6DOF case

bull Can be used to initialize filters and non-linear optimizers (which always need an initialization point) [3]

bull More complex to derive than the 1D case But it also reaches a linear system of equations that can be solved using the pseudoinverse

25

119935 is the vector of unknownsbull Absolute scale 119904bull Initial velocity 1199070bull 3D Point distances (wrt the first camera) bull Direction of the gravity vector bull Biases

119912 and 119930 contain 2D feature coordinates acceleration and angular velocity measurements

[1] Martinelli Vision and IMU data fusion Closed-form solutions for attitude speed absolute scale and bias determination IEEE Transactions on Robotics (T-RO) 2012 PDF[2] Martinelli Closed-form solution of visual-inertial structure from motion International Journal of Computer Vision (IJCV) 2014 PDF[3] Kaiser Martinelli Fontana Scaramuzza Simultaneous state initialization and gyroscope bias calibration in visual inertial aided navigation IEEE Robotics and Automation Letters (R-AL) 2017 PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

26

Non-linear Optimization Methods

VIO is solved as non-linear Least Square optimization problem over

27

[1] Jung Taylor Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results International Conference on Computer Vision and Pattern Recognition (CVPR) 2001 PDF[2] Sterlow Singh Motion estimation from image and inertial measurements International Journal of Robotics Research (IJRR) 2004 PDF

X L 119887119886 119887119892 = 119886119903119892119898119894119899X L 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

Non-linear Optimization Methods

VIO is solved as non-linear Least Square optimization problem over

where

bull 119883 = 1199091 hellip 119909119873 set of robot state estimates 119909119896 (position velocity orientation) at frame times 119896

bull 119871 = 1198711 hellip 119871119872 3D landmarks

bull 119891 119909119896minus1 119906 pose prediction update from integration of IMU measurements 119906 = 119886119895 ω119895

bull 120587(119909119896 119897119894) measurement update from projection of landmark 119871119894 onto camera frame 119868119896bull 119911119894119896 observed features

bull 120556119896 state covariance from the IMU integration 119891 119909119896minus1 119906

bull 120564119894119896 is the covariance from the noisy 2D feature measurements

28

X L 119887119886 119887119892 = 119886119903119892119898119894119899X L 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

[1] Jung Taylor Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results International Conference on Computer Vision and Pattern Recognition (CVPR) 2001 PDF[2] Sterlow Singh Motion estimation from image and inertial measurements International Journal of Robotics Research (IJRR) 2004 PDF

Case Study 1 OKVIS

Because the complexity of the optimization is cubic with respect to the number of cameras poses and features real-time operation becomes infeasible as the trajectory and the map grow over time OKVIS proposed to only optimize the current pose and a window of past keyframes

29Leutenegger Lynen Bosse Siegwart Furgale Keyframe-based visualndashinertial odometry using nonlinear optimization International Journal

of Robotics Research (IJRR) 2015 PDF

Case Study 2 SVO+GTSAM

Solves the same optimization problem as OKVIS but

bull Keeps all the frames (from the start to the end of the trajectory)

bull To make the optimization efficient

bull Marginalizes 3D landmarks (rather than triangulating landmarks only their visual bearing measurements are kept and the visual constraintare enforced by the Epipolar Line Distance (Lecture 08))

bull pre-integrates the IMU data between keyframes (see later)

bull Optimization solved using Factor Graphs via GTSAM

bull Very fast because it only optimizes the poses that are affected by a new observation

30Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry

IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

SVO+GTSAM

31Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry

IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

SVO+GTSAM

32Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry

IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

Problem with IMU integration

bull The integration of IMU measurements 119891 119909119896minus1 119906 from 119896 minus 1 to 119896 is related to the state estimation at time 119896 minus 1

bull During optimization every time the linearization point at 119896 minus 1 changes the integration between 119896 minus 1and 119896 must be re-evaluated thus slowing down the optimization

bull Idea Preintegration

bull defines relative motion increments expressed in body frame which are independent on the global position orientation and velocity at 119896 [1]

bull [2] uses this theory by leveraging the manifold structure of the rotation group SO(3)

33

[1] Lupton Sukkarieh Visual-inertial-aided navigation for high-dynamic motion in built environments without initial conditions IEEE Transactions on Robotics (T-RO) 2012 PDF[2] Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

X L 119887119886 119887119892 = 119886119903119892119898119894119899X L 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

IMU Pre-Integration

34

120596 119886 Δ ෨119877 Δ 119907 Δ 119901

119877119896 119901119896 v119896

StandardEvaluate error in global frame

PreintegrationEvaluate relative errors

119942119877 = Δ ෨119877119879 Δ119877

119942v = Δv minus Δv

119942119901 = Δ 119901 minus Δ119901

119942119877 = 119877 120596 119877119896minus1119879119877119896

119942v = ොv(120596 119886 v119896minus1) minus v119896

119942119901 = Ƹ119901(120596 119886 119901119896minus1) minus 119901119896

Repeat integration when previous state changes

Preintegration of IMU deltas possible with no initial condition required

Prediction Estimate

GTSAM

bull GTSAM [1] is a library that implements sensor fusion based on factor graphs and Bayes trees

bull Factor graphs are a tool borrowed from machine learning [2] that

bull naturally capture the spatial dependences among different sensor measurements

bull can represent many robotics problems like pose graphs and SLAM

35

[1] Dellaert and Kaess Factor Graphs for Robot Perception Foundations and Trends in Robotics 2017 PDF Code[2] Kschischang Frey Loeliger Factor Graphs and the Sum-Product Algorithm Transactions on Information Theory 2001 PDF

GTSAM

bull GTSAM uses factor graphs and Bayes trees to implement incremental inference

bull A Bayes tree is a data structure a directed tree that visually represents the connectivity among the past robot statesand measurements

bull Incremental inference means that when new robot states or measurements are added to the graph only a local part of the tree is updated

36[1] Kaess Ranganathan Dellaer ldquoiSAM Incremental Smoothing and Mappingrdquo IEEE Transactions on Robotics (T-RO) 2008 PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

37

Non-linear optimization vs Filtering

38

Non-linear Optimization methods Filtering methods

Optimize a window of multiple states (or all the states) using non-linear Least-Squares optimization

Solve the same problem by running only one iteration of the optimization function (eg using Extended Kalman Filter (EKF))

Multiple iterations (it re-linearizes at each iteration)

Acheives the highest accuracy

Slower (but fast with GTSAM)

1 Linearization iteration only

Sensitive to linearization point

Fastest

Case Study 1 ROVIO

bull EKF state X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905 1198711 1198712 hellip 119871119896bull Basic idea

1 Prediction step predicts next position velocity orientation and features using IMU integration model

2 Measurement update refines state by leveraging visual constraint (ROVIO minimizes the photometric error between corresponding points (alternative would be the reprojection error))

39Bloesch Burri Omari Hutter Siegwart Iterated extended Kalman filter based visual-inertial odometry using direct photometric feedback

International Journal of Robotics Research (IJRR) 2017 PDF Code

ROVIO Problems

bull Complexity of the EKF grows quadratically in the number of estimated landmarksbull Thus max 20 landmarks are tracked to allow real-time operation

bull Only updates the most recent state

40

Case Study 2 MSCKF

bull MSCKF (Multi-State Constraint Kalman Filter) updates multiple past poses 119901C1 119902C1 hellip 119901C119873 119902CN in addition to the current state 119901 119905 119902 119905 119907 119905 State vector

bull Prediction step same as ROVIO

bull Measurement updatebull Differently from ROVIO

bull landmark positions are not added to the state vector thus can run very fast independently of the number of featuresbull Visual constraint is obtained from the Epipolar Line Distance (Lecture 08)

bull Used in spacecraft landing (NASAJPL Moon and Mars landing) DJI drones Google ARCore

bull Released open source within the OpenVins project httpsgithubcomrpngopen_vins

41

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905 119901C1 119902C1 hellip 119901C119873 119902CN

MSCKF running in Google ARCore (former Google Tango)

42

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

43

Camera-IMU Calibration

bull Goal estimate the rigid-body transformation 119931119913119914 and time offset 119957119941 between the camera and the IMU caused by communication delays and the internal sensor delays (introduced by filters and logic)

bull Assumptions Camera and IMU rigidly attached Camera intrinsically calibrated

bull Data

bull Image points from calibration pattern (checkerboard or QR board)

bull IMU measurements accelerometer 119886119896 and gyroscope 120596119896

44

119905

images

IMU

119957119941

Kalibr Toolbox

45Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Code httpsgithubcomethz-aslkalibrwikicamera-imu-calibration

Kalibr Toolbox

46Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Solves a non-linear Least Square optimization problem similar to that seen before but also optimizes over 119879119861119862 119905119889

bull Continuous-time modelling using splines for 119883

bull Numerical solver Levenberg-Marquardt

119883 119871 119879119861119862 119905119889 119887119886 119887119892 = 119886119903119892119898119894119899119883 119871119879119861119862119905119889 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

Kalibr Toolbox

47Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

• Generates a report after optimizing the cost function:

Residuals:
  Reprojection error [px]:      0.0976, 0.051
  Gyroscope error [rad/s]:      0.0167, 0.009
  Accelerometer error [m/s^2]:  0.0595, 0.031

Transformation T_ci (imu to cam):
  [[ 0.99995526 -0.00934911 -0.00143776  0.00008436]
   [ 0.00936458  0.99989388  0.01115983  0.00197427]
   [ 0.00133327 -0.0111728   0.99993669 -0.05054946]
   [ 0.          0.          0.          1.        ]]

Time shift (delay d), cam0 to imu0 [s] (t_imu = t_cam + shift):
  0.00270636270255

Popular Datasets for VIO / VI-SLAM

48

• EuRoC: drone with IMU and stereo camera
• Blackbird: drone flying fast with VR goggles (synthetic images and real dynamics + IMU)
• UZH FPV Drone Racing: drone flying fast with event camera, stereo camera, IMU
• MVSEC: drone, motorcycle, car with event camera, stereo camera, 3D lidar, GPS, IMU

Understanding Check

Are you able to answer the following questions?

• Why is it recommended to use an IMU for Visual Odometry?
• Why not just an IMU without a camera?
• What is the basic idea behind MEMS IMUs?
• What is the drift of a consumer IMU?
• What is the IMU measurement model (formula)?
• What causes the bias in an IMU?
• How do we model the bias?
• How do we integrate the acceleration to get the position (formula)?
• What is the definition of loosely coupled and tightly coupled visual inertial fusion?
• How does non-linear optimization-based visual inertial odometry work? Can you write down the cost function and illustrate its meaning?

49

Page 21: Vision Algorithms for Mobile Robotics, Lecture 13: Visual Inertial Fusion

Filtering: Visual Inertial Formulation

• System states:

Tightly coupled (the landmarks are part of the state):
X = {p(t), q(t), v(t), b_a(t), b_g(t), L_1, L_2, …, L_K}

Loosely coupled:
X = {p(t), q(t), v(t), b_a(t), b_g(t)}

21

Corke, Lobo, Dias, An Introduction to Inertial and Visual Sensing, International Journal of Robotics Research (IJRR), 2017. PDF

Outline

• What is an IMU and why do we need it
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and Synchronization

22

Closed-form Solution (1D case)

• From the camera we can only get the absolute position x up to an unknown scale factor s, thus

x = s x̃

where x̃ is the relative motion estimated by the camera.

• From the IMU:

x = x0 + v0 (t1 − t0) + ∬_{t0}^{t1} a(t) dt²

• By equating them:

s x̃ = x0 + v0 (t1 − t0) + ∬_{t0}^{t1} a(t) dt²

• As shown in [Martinelli'14], if we assume to know x0 (usually we set it to 0), then for 6DOF both s and v0 can be determined in closed form from a single feature observation and 3 views.

23. Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF

[Figure: timeline t0, t1 observing landmark L1]

Closed-form Solution (1D case)

24. Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF

s x̃1 = v0 (t1 − t0) + ∬_{t0}^{t1} a(t) dt²

s x̃2 = v0 (t2 − t0) + ∬_{t0}^{t2} a(t) dt²

In matrix form:

[ x̃1  (t0 − t1) ] [ s  ]   [ ∬_{t0}^{t1} a(t) dt² ]
[ x̃2  (t0 − t2) ] [ v0 ] = [ ∬_{t0}^{t2} a(t) dt² ]

[Figure: timeline t0, t1, t2 observing landmark L1]
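The 2×2 linear system on this slide can be solved numerically once the double integrals of the measured acceleration are available. Below is an illustrative sketch under the stated assumptions (1D motion, x0 = 0, noise-free accelerometer samples); the function names are ours, not Martinelli's.

```python
import numpy as np

def double_integral(t, a, t_end):
    """Compute the double integral of a(t) from t[0] to t_end with trapezoidal
    rules: first integrate acceleration to velocity, then velocity to position."""
    m = t <= t_end + 1e-9
    tt, aa = t[m], a[m]
    v = np.concatenate(([0.0], np.cumsum(0.5 * (aa[1:] + aa[:-1]) * np.diff(tt))))
    return float(np.sum(0.5 * (v[1:] + v[:-1]) * np.diff(tt)))

def closed_form_1d(x1_tilde, x2_tilde, t, a, t1, t2):
    """Solve  [x~1 (t0-t1); x~2 (t0-t2)] [s; v0] = [double int.; double int.]
    for the absolute scale s and the initial velocity v0, with t0 = t[0]."""
    t0 = t[0]
    A = np.array([[x1_tilde, t0 - t1],
                  [x2_tilde, t0 - t2]])
    rhs = np.array([double_integral(t, a, t1),
                    double_integral(t, a, t2)])
    s, v0 = np.linalg.solve(A, rhs)
    return s, v0
```

For example, with constant acceleration and a true scale of 3, the solver recovers (s, v0) up to the integration error of the sampled acceleration.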

Closed-form Solution (general case)

• Considers N feature observations and the full 6DOF case
• Can be used to initialize filters and non-linear optimizers (which always need an initialization point) [3]
• More complex to derive than the 1D case, but it also leads to a linear system of equations (A X = S) that can be solved using the pseudoinverse

25

X is the vector of unknowns:
• Absolute scale s
• Initial velocity v0
• 3D point distances (w.r.t. the first camera)
• Direction of the gravity vector
• Biases

A and S contain 2D feature coordinates, acceleration, and angular velocity measurements.

[1] Martinelli, Vision and IMU data fusion: Closed-form solutions for attitude, speed, absolute scale and bias determination, IEEE Transactions on Robotics (T-RO), 2012. PDF
[2] Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF
[3] Kaiser, Martinelli, Fontana, Scaramuzza, Simultaneous state initialization and gyroscope bias calibration in visual inertial aided navigation, IEEE Robotics and Automation Letters (RA-L), 2017. PDF

Outline

• What is an IMU and why do we need it
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and Synchronization

26

Non-linear Optimization Methods

VIO is solved as a non-linear least-squares optimization problem over the states X, the landmarks L, and the biases b_a, b_g:

27

[1] Jung, Taylor, Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results, International Conference on Computer Vision and Pattern Recognition (CVPR), 2001. PDF
[2] Strelow, Singh, Motion estimation from image and inertial measurements, International Journal of Robotics Research (IJRR), 2004. PDF

{X, L, b_a, b_g} = argmin_{X, L, b_a, b_g}  Σ_{k=1}^{N} ‖f(x_{k−1}, u) − x_k‖²_{Λ_k}  +  Σ_{k=1}^{N} Σ_{i=1}^{M} ‖π(x_k, L_i) − z_{ik}‖²_{Σ_{ik}}

(first term: IMU residuals; second term: reprojection residuals, i.e., the Bundle Adjustment term)

Non-linear Optimization Methods

VIO is solved as a non-linear least-squares optimization problem:

{X, L, b_a, b_g} = argmin_{X, L, b_a, b_g}  Σ_{k=1}^{N} ‖f(x_{k−1}, u) − x_k‖²_{Λ_k}  +  Σ_{k=1}^{N} Σ_{i=1}^{M} ‖π(x_k, L_i) − z_{ik}‖²_{Σ_{ik}}

(first term: IMU residuals; second term: reprojection residuals, i.e., the Bundle Adjustment term)

where

• X = {x_1, …, x_N}: set of robot state estimates x_k (position, velocity, orientation) at frame times k
• L = {L_1, …, L_M}: 3D landmarks
• f(x_{k−1}, u): pose prediction update from integration of the IMU measurements u = {a_j, ω_j}
• π(x_k, L_i): measurement update from the projection of landmark L_i onto camera frame I_k
• z_{ik}: observed features
• Λ_k: state covariance from the IMU integration f(x_{k−1}, u)
• Σ_{ik}: covariance of the noisy 2D feature measurements

28

[1] Jung, Taylor, Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results, International Conference on Computer Vision and Pattern Recognition (CVPR), 2001. PDF
[2] Strelow, Singh, Motion estimation from image and inertial measurements, International Journal of Robotics Research (IJRR), 2004. PDF
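To make the structure of the objective concrete, here is a toy evaluation of the two residual terms. The interfaces are hypothetical and deliberately minimal: in a real solver, f would integrate actual IMU measurements u, π would be a camera projection model, and each residual would carry its own information matrix.

```python
import numpy as np

def vio_cost(X, L, z, f, pi, Lam_inv, Sig_inv):
    """Evaluate the VIO objective
        sum_k ||f(x_{k-1}, u) - x_k||^2_{Lam_k} + sum_{k,i} ||pi(x_k, L_i) - z_{ik}||^2_{Sig_ik}
    where the Mahalanobis norms are written with information matrices
    (inverse covariances). X: list of state vectors; L: list of landmarks;
    z[k][i]: observed feature; f: IMU prediction (the IMU inputs u are
    assumed baked into f); pi: projection model."""
    cost = 0.0
    for k in range(1, len(X)):
        r = f(X[k - 1]) - X[k]          # IMU residual
        cost += r @ Lam_inv @ r
    for k in range(len(X)):
        for i, Li in enumerate(L):
            r = pi(X[k], Li) - z[k][i]  # reprojection residual
            cost += r @ Sig_inv @ r
    return cost
```

A solver such as Gauss-Newton or Levenberg-Marquardt then minimizes this cost over X, L, and the biases.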

Case Study 1: OKVIS

Because the complexity of the optimization is cubic in the number of camera poses and features, real-time operation becomes infeasible as the trajectory and the map grow over time. OKVIS proposed to optimize only the current pose and a window of past keyframes.

29. Leutenegger, Lynen, Bosse, Siegwart, Furgale, Keyframe-based visual–inertial odometry using nonlinear optimization, International Journal of Robotics Research (IJRR), 2015. PDF

Case Study 2: SVO+GTSAM

Solves the same optimization problem as OKVIS, but:

• Keeps all the frames (from the start to the end of the trajectory)
• To make the optimization efficient:
  • Marginalizes 3D landmarks (rather than triangulating landmarks, only their visual bearing measurements are kept, and the visual constraints are enforced by the Epipolar Line Distance (Lecture 08))
  • Pre-integrates the IMU data between keyframes (see later)
• Optimization solved using factor graphs via GTSAM
• Very fast because it only optimizes the poses that are affected by a new observation

30. Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb. 2017. Best Paper Award 2018

SVO+GTSAM

31. Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb. 2017. Best Paper Award 2018


Problem with IMU integration

• The integration of IMU measurements f(x_{k−1}, u) from k − 1 to k depends on the state estimate at time k − 1
• During the optimization, every time the linearization point at k − 1 changes, the integration between k − 1 and k must be re-evaluated, thus slowing down the optimization
• Idea: Preintegration
  • Define relative motion increments, expressed in the body frame, which are independent of the global position, orientation, and velocity at k [1]
  • [2] uses this theory by leveraging the manifold structure of the rotation group SO(3)

33

[1] Lupton, Sukkarieh, Visual-inertial-aided navigation for high-dynamic motion in built environments without initial conditions, IEEE Transactions on Robotics (T-RO), 2012. PDF
[2] Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb. 2017. Best Paper Award 2018

{X, L, b_a, b_g} = argmin_{X, L, b_a, b_g}  Σ_{k=1}^{N} ‖f(x_{k−1}, u) − x_k‖²_{Λ_k}  +  Σ_{k=1}^{N} Σ_{i=1}^{M} ‖π(x_k, L_i) − z_{ik}‖²_{Σ_{ik}}

(first term: IMU residuals; second term: reprojection residuals, i.e., the Bundle Adjustment term)

IMU Pre-Integration

34

From the raw measurements (ω̃, ã), one can either predict the global states (R_k, v_k, p_k) or the relative increments (ΔR̃, Δṽ, Δp̃). Quantities with a tilde or hat are predictions from the measurements; plain symbols are the current estimates.

Standard: evaluate the errors in the global frame:

e_R = R̂(ω̃, R_{k−1})ᵀ R_k
e_v = v̂(ω̃, ã, v_{k−1}) − v_k
e_p = p̂(ω̃, ã, p_{k−1}) − p_k

→ the integration must be repeated whenever the previous state changes.

Preintegration: evaluate relative errors:

e_R = ΔR̃ᵀ ΔR
e_v = Δṽ − Δv
e_p = Δp̃ − Δp

→ preintegration of the IMU deltas is possible, with no initial condition required.
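A minimal Euler-integration sketch of how the measured deltas ΔR̃, Δṽ, Δp̃ are accumulated from raw gyro and accelerometer samples. Biases and noise are ignored here, and the on-manifold formulation in the paper additionally propagates covariances and bias Jacobians; this only illustrates why the deltas are independent of the global state at k − 1.

```python
import numpy as np

def so3_exp(w):
    """Rodrigues' formula: map a rotation vector to a rotation matrix."""
    th = np.linalg.norm(w)
    if th < 1e-10:
        return np.eye(3)
    K = np.array([[0.0, -w[2], w[1]],
                  [w[2], 0.0, -w[0]],
                  [-w[1], w[0], 0.0]]) / th
    return np.eye(3) + np.sin(th) * K + (1.0 - np.cos(th)) * (K @ K)

def preintegrate(gyro, acc, dt):
    """Accumulate body-frame increments (dR, dv, dp) from IMU samples.
    They depend only on the measurements, not on the global pose/velocity
    at k-1, so they need not be recomputed when the linearization point
    of the optimization changes."""
    dR, dv, dp = np.eye(3), np.zeros(3), np.zeros(3)
    for w, a in zip(gyro, acc):
        dp = dp + dv * dt + 0.5 * (dR @ a) * dt**2   # position increment
        dv = dv + (dR @ a) * dt                      # velocity increment
        dR = dR @ so3_exp(w * dt)                    # rotation increment
    return dR, dv, dp
```

When the state at k − 1 is re-linearized, the global prediction is recovered by composing these cached deltas with the new state, instead of re-integrating the raw measurements.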

GTSAM

• GTSAM [1] is a library that implements sensor fusion based on factor graphs and Bayes trees
• Factor graphs are a tool borrowed from machine learning [2] that:
  • naturally capture the spatial dependencies among different sensor measurements
  • can represent many robotics problems, like pose graphs and SLAM

35

[1] Dellaert, Kaess, Factor Graphs for Robot Perception, Foundations and Trends in Robotics, 2017. PDF, Code
[2] Kschischang, Frey, Loeliger, Factor Graphs and the Sum-Product Algorithm, IEEE Transactions on Information Theory, 2001. PDF

GTSAM

• GTSAM uses factor graphs and Bayes trees to implement incremental inference
• A Bayes tree is a data structure (a directed tree) that represents the connectivity among the past robot states and measurements
• Incremental inference means that, when new robot states or measurements are added to the graph, only a local part of the tree is updated

36

[1] Kaess, Ranganathan, Dellaert, "iSAM: Incremental Smoothing and Mapping", IEEE Transactions on Robotics (T-RO), 2008. PDF

Outline

• What is an IMU and why do we need it
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and Synchronization

37

Non-linear optimization vs. Filtering

38

Non-linear optimization methods:
• Optimize a window of multiple states (or all the states) using non-linear least-squares optimization
• Multiple iterations (re-linearizes at each iteration)
• Achieves the highest accuracy
• Slower (but fast with GTSAM)

Filtering methods:
• Solve the same problem by running only one iteration of the optimization function (e.g., using an Extended Kalman Filter (EKF))
• 1 linearization iteration only
• Sensitive to the linearization point
• Fastest

Case Study 1: ROVIO

• EKF state: X = {p(t), q(t), v(t), b_a(t), b_g(t), L_1, L_2, …, L_K}
• Basic idea:
  1. Prediction step: predicts the next position, velocity, orientation, and features using the IMU integration model
  2. Measurement update: refines the state by leveraging visual constraints (ROVIO minimizes the photometric error between corresponding points; an alternative would be the reprojection error)

39. Bloesch, Burri, Omari, Hutter, Siegwart, Iterated extended Kalman filter based visual-inertial odometry using direct photometric feedback, International Journal of Robotics Research (IJRR), 2017. PDF, Code
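The two steps can be illustrated with a deliberately simplified linear 1D position-velocity filter. ROVIO's actual filter is an iterated EKF on the manifold with photometric measurement errors; this sketch only shows the predict/update skeleton, and all names are ours.

```python
import numpy as np

def ekf_predict(x, P, a, dt, Q):
    """Prediction step: propagate position/velocity with the IMU
    (accelerometer) input and inflate the covariance by process noise Q."""
    F = np.array([[1.0, dt], [0.0, 1.0]])           # constant-velocity model
    x = F @ x + np.array([0.5 * dt**2, dt]) * a     # IMU integration
    P = F @ P @ F.T + Q
    return x, P

def ekf_update(x, P, z, R):
    """Measurement update: a visual observation of the position refines the
    state via the standard Kalman gain."""
    H = np.array([[1.0, 0.0]])                      # we observe position only
    S = H @ P @ H.T + R                             # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)                  # Kalman gain
    x = x + (K @ (z - H @ x)).ravel()
    P = (np.eye(2) - K @ H) @ P
    return x, P
```

An accurate visual measurement (small R) pulls the predicted position toward the observation and sharply reduces the position covariance, which is exactly the role the photometric update plays in ROVIO.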

ROVIO: Problems

• The complexity of the EKF grows quadratically with the number of estimated landmarks
  • Thus, a maximum of ~20 landmarks are tracked to allow real-time operation
• Only updates the most recent state

40

Case Study 2: MSCKF

• MSCKF (Multi-State Constraint Kalman Filter) updates multiple past poses p_C1, q_C1, …, p_CN, q_CN in addition to the current state p(t), q(t), v(t). State vector:

X = {p(t), q(t), v(t), b_a(t), b_g(t), p_C1, q_C1, …, p_CN, q_CN}

• Prediction step: same as ROVIO
• Measurement update: differently from ROVIO:
  • landmark positions are not added to the state vector, thus it can run very fast, independently of the number of features
  • the visual constraint is obtained from the Epipolar Line Distance (Lecture 08)
• Used in spacecraft landing (NASA/JPL Moon and Mars landing), DJI drones, Google ARCore
• Released open source within the OpenVINS project: https://github.com/rpng/open_vins

41

[1] Mourikis, Roumeliotis, A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation, International Conference on Robotics and Automation (ICRA), 2007. PDF
[2] Li, Mourikis, High-precision, consistent EKF-based visual–inertial odometry, International Journal of Robotics Research (IJRR), 2013. PDF
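The key bookkeeping step behind the sliding window of past poses is stochastic cloning: whenever a new image arrives, a copy of the current pose sub-state is appended to the state vector, and the covariance is expanded with the corresponding cross-correlations. A hypothetical minimal version (the indices and the helper name are ours):

```python
import numpy as np

def clone_state(x, P, pose_idx):
    """Append a copy of the pose entries of x (selected by pose_idx) to the
    state vector, and expand the covariance consistently: the clone starts
    fully correlated with the original pose entries."""
    J = np.vstack([np.eye(len(x)),               # keep the existing state
                   np.eye(len(x))[pose_idx, :]]) # duplicate the pose rows
    return J @ x, J @ P @ J.T
```

Because only poses (and not landmarks) are cloned, the filter size stays bounded by the window length N, which is what lets MSCKF run fast independently of the number of tracked features.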

MSCKF running in Google ARCore (formerly Google Tango)

42

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

43

Camera-IMU Calibration

bull Goal estimate the rigid-body transformation 119931119913119914 and time offset 119957119941 between the camera and the IMU caused by communication delays and the internal sensor delays (introduced by filters and logic)

bull Assumptions Camera and IMU rigidly attached Camera intrinsically calibrated

bull Data

bull Image points from calibration pattern (checkerboard or QR board)

bull IMU measurements accelerometer 119886119896 and gyroscope 120596119896

44

119905

images

IMU

119957119941

Kalibr Toolbox

45Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Code httpsgithubcomethz-aslkalibrwikicamera-imu-calibration

Kalibr Toolbox

46Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Solves a non-linear Least Square optimization problem similar to that seen before but also optimizes over 119879119861119862 119905119889

bull Continuous-time modelling using splines for 119883

bull Numerical solver Levenberg-Marquardt

119883 119871 119879119861119862 119905119889 119887119886 119887119892 = 119886119903119892119898119894119899119883 119871119879119861119862119905119889 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

Kalibr Toolbox

47Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Generates a report after optimizing the cost function

Residuals

Reprojection error [px] 00976 0051

Gyroscope error [rads] 00167 0009

Accelerometer error [ms^2] 00595 0031

Transformation T_ci (imu to cam)

[[ 099995526 -000934911 -000143776 000008436]

[ 000936458 099989388 001115983 000197427]

[ 000133327 -00111728 099993669 -005054946]

[ 0 0 0 1 ]]

Time shift (delay d)

cam0 to imu0 [s] (t_imu = t_cam + shift)

000270636270255

Popular Datasets for VIO VI-SLAM

48

EuRoC

Drone with IMU and stereo camera

Blackbird

Drone flying fast with VR goggles (synthetic images and real dynamics + IMU)

UZH FPV Drone Racing

Drone flying fast with event camera stereo camera IMU

MVSEC

Drone motorcycle car with event camera stereo camera

3D lidar GPS IMU

Understanding Check

Are you able to answer the following questions

bull Why is it recommended to use an IMU for Visual Odometry

bull Why not just an IMU without a camera

bull What is the basic idea behind MEMS IMUs

bull What is the drift of a consumer IMU

bull What is the IMU measurement model (formula)

bull What causes the bias in an IMU

bull How do we model the bias

bull How do we integrate the acceleration to get the position (formula)

bull What is the definition of loosely coupled and tightly coupled visual inertial fusion

bull How does non-linear optimization-based visual inertial odometry Can you write down the cost function and illustrate its meaning

49

Page 22: Vision Algorithms for Mobile Robotics Lecture 01 Introduction

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

22

Closed-form Solution (1D case)

bull From the camera we can only get the absolute position 119909 up to a unknown scale factor 119904 thus

119909 = 119904 119909

where 119909 is the relative motion estimated by the camera

bull From the IMU

119909 = 1199090 + 1199070(1199051 minus 1199050) +ඵ1199050

1199051

119886 119905 1198891199052

bull By equating them

119904 119909 = 1199090 + 1199070 1199051 minus 1199050 +ඵ1199050

1199051

119886 119905 1198891199052

bull As shown in [Martinellirsquo14] if we assume to know 1199090 (usually we set it to 0) then for 6DOF both 119956 and 119959120782can be determined in closed form from a single feature observation and 3 views

23Martinelli Closed-form solution of visual-inertial structure from motion International Journal of Computer Vision (IJCV) 2014 PDF

1199050 1199051

1198711

Closed-form Solution (1D case)

24Martinelli Closed-form solution of visual-inertial structure from motion International Journal of Computer Vision (IJCV) 2014 PDF

119904 1199091 = 1199070 1199051 minus 1199050 +ඵ1199050

1199051

119886 119905 1198891199052

119904 1199092 = 1199070 1199052 minus 1199050 +ඵ1199050

1199052

119886 119905 1198891199052

1199091 (1199050minus1199051)1199092 (1199050minus1199052)

1199041199070

=

ඵ1199050

1199051

119886 119905 1198891199052

ඵ1199050

2

119886 119905 1198891199052

1199050 1199051 1199052

1198711

Closed-form Solution (general case)

bull Considers 119873 feature observations and the full 6DOF case

bull Can be used to initialize filters and non-linear optimizers (which always need an initialization point) [3]

bull More complex to derive than the 1D case But it also reaches a linear system of equations that can be solved using the pseudoinverse

25

119935 is the vector of unknownsbull Absolute scale 119904bull Initial velocity 1199070bull 3D Point distances (wrt the first camera) bull Direction of the gravity vector bull Biases

119912 and 119930 contain 2D feature coordinates acceleration and angular velocity measurements

[1] Martinelli Vision and IMU data fusion Closed-form solutions for attitude speed absolute scale and bias determination IEEE Transactions on Robotics (T-RO) 2012 PDF[2] Martinelli Closed-form solution of visual-inertial structure from motion International Journal of Computer Vision (IJCV) 2014 PDF[3] Kaiser Martinelli Fontana Scaramuzza Simultaneous state initialization and gyroscope bias calibration in visual inertial aided navigation IEEE Robotics and Automation Letters (R-AL) 2017 PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

26

Non-linear Optimization Methods

VIO is solved as non-linear Least Square optimization problem over

27

[1] Jung Taylor Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results International Conference on Computer Vision and Pattern Recognition (CVPR) 2001 PDF[2] Sterlow Singh Motion estimation from image and inertial measurements International Journal of Robotics Research (IJRR) 2004 PDF

X L 119887119886 119887119892 = 119886119903119892119898119894119899X L 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

Non-linear Optimization Methods

VIO is solved as non-linear Least Square optimization problem over

where

bull 119883 = 1199091 hellip 119909119873 set of robot state estimates 119909119896 (position velocity orientation) at frame times 119896

bull 119871 = 1198711 hellip 119871119872 3D landmarks

bull 119891 119909119896minus1 119906 pose prediction update from integration of IMU measurements 119906 = 119886119895 ω119895

bull 120587(119909119896 119897119894) measurement update from projection of landmark 119871119894 onto camera frame 119868119896bull 119911119894119896 observed features

bull 120556119896 state covariance from the IMU integration 119891 119909119896minus1 119906

bull 120564119894119896 is the covariance from the noisy 2D feature measurements

28

X L 119887119886 119887119892 = 119886119903119892119898119894119899X L 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

[1] Jung Taylor Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results International Conference on Computer Vision and Pattern Recognition (CVPR) 2001 PDF[2] Sterlow Singh Motion estimation from image and inertial measurements International Journal of Robotics Research (IJRR) 2004 PDF

Case Study 1 OKVIS

Because the complexity of the optimization is cubic with respect to the number of cameras poses and features real-time operation becomes infeasible as the trajectory and the map grow over time OKVIS proposed to only optimize the current pose and a window of past keyframes

29Leutenegger Lynen Bosse Siegwart Furgale Keyframe-based visualndashinertial odometry using nonlinear optimization International Journal

of Robotics Research (IJRR) 2015 PDF

Case Study 2 SVO+GTSAM

Solves the same optimization problem as OKVIS but

bull Keeps all the frames (from the start to the end of the trajectory)

bull To make the optimization efficient

bull Marginalizes 3D landmarks (rather than triangulating landmarks only their visual bearing measurements are kept and the visual constraintare enforced by the Epipolar Line Distance (Lecture 08))

bull pre-integrates the IMU data between keyframes (see later)

bull Optimization solved using Factor Graphs via GTSAM

bull Very fast because it only optimizes the poses that are affected by a new observation

30Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry

IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

SVO+GTSAM

31Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry

IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

SVO+GTSAM

32Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry

IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

Problem with IMU integration

bull The integration of IMU measurements 119891 119909119896minus1 119906 from 119896 minus 1 to 119896 is related to the state estimation at time 119896 minus 1

bull During optimization every time the linearization point at 119896 minus 1 changes the integration between 119896 minus 1and 119896 must be re-evaluated thus slowing down the optimization

bull Idea Preintegration

bull defines relative motion increments expressed in body frame which are independent on the global position orientation and velocity at 119896 [1]

bull [2] uses this theory by leveraging the manifold structure of the rotation group SO(3)

33

[1] Lupton Sukkarieh Visual-inertial-aided navigation for high-dynamic motion in built environments without initial conditions IEEE Transactions on Robotics (T-RO) 2012 PDF[2] Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

X L 119887119886 119887119892 = 119886119903119892119898119894119899X L 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

IMU Pre-Integration

34

120596 119886 Δ ෨119877 Δ 119907 Δ 119901

119877119896 119901119896 v119896

StandardEvaluate error in global frame

PreintegrationEvaluate relative errors

119942119877 = Δ ෨119877119879 Δ119877

119942v = Δv minus Δv

119942119901 = Δ 119901 minus Δ119901

119942119877 = 119877 120596 119877119896minus1119879119877119896

119942v = ොv(120596 119886 v119896minus1) minus v119896

119942119901 = Ƹ119901(120596 119886 119901119896minus1) minus 119901119896

Repeat integration when previous state changes

Preintegration of IMU deltas possible with no initial condition required

Prediction Estimate

GTSAM

bull GTSAM [1] is a library that implements sensor fusion based on factor graphs and Bayes trees

bull Factor graphs are a tool borrowed from machine learning [2] that

bull naturally capture the spatial dependences among different sensor measurements

bull can represent many robotics problems like pose graphs and SLAM

35

[1] Dellaert and Kaess Factor Graphs for Robot Perception Foundations and Trends in Robotics 2017 PDF Code[2] Kschischang Frey Loeliger Factor Graphs and the Sum-Product Algorithm Transactions on Information Theory 2001 PDF

GTSAM

bull GTSAM uses factor graphs and Bayes trees to implement incremental inference

bull A Bayes tree is a data structure a directed tree that visually represents the connectivity among the past robot statesand measurements

bull Incremental inference means that when new robot states or measurements are added to the graph only a local part of the tree is updated

36[1] Kaess Ranganathan Dellaer ldquoiSAM Incremental Smoothing and Mappingrdquo IEEE Transactions on Robotics (T-RO) 2008 PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

37

Non-linear optimization vs Filtering

38

Non-linear Optimization methods Filtering methods

Optimize a window of multiple states (or all the states) using non-linear Least-Squares optimization

Solve the same problem by running only one iteration of the optimization function (eg using Extended Kalman Filter (EKF))

Multiple iterations (it re-linearizes at each iteration)

Acheives the highest accuracy

Slower (but fast with GTSAM)

1 Linearization iteration only

Sensitive to linearization point

Fastest

Case Study 1 ROVIO

bull EKF state X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905 1198711 1198712 hellip 119871119896bull Basic idea

1 Prediction step predicts next position velocity orientation and features using IMU integration model

2 Measurement update refines state by leveraging visual constraint (ROVIO minimizes the photometric error between corresponding points (alternative would be the reprojection error))

39Bloesch Burri Omari Hutter Siegwart Iterated extended Kalman filter based visual-inertial odometry using direct photometric feedback

International Journal of Robotics Research (IJRR) 2017 PDF Code

ROVIO Problems

bull Complexity of the EKF grows quadratically in the number of estimated landmarksbull Thus max 20 landmarks are tracked to allow real-time operation

bull Only updates the most recent state

40

Case Study 2 MSCKF

bull MSCKF (Multi-State Constraint Kalman Filter) updates multiple past poses 119901C1 119902C1 hellip 119901C119873 119902CN in addition to the current state 119901 119905 119902 119905 119907 119905 State vector

bull Prediction step same as ROVIO

bull Measurement updatebull Differently from ROVIO

bull landmark positions are not added to the state vector thus can run very fast independently of the number of featuresbull Visual constraint is obtained from the Epipolar Line Distance (Lecture 08)

bull Used in spacecraft landing (NASAJPL Moon and Mars landing) DJI drones Google ARCore

bull Released open source within the OpenVins project httpsgithubcomrpngopen_vins

41

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905 119901C1 119902C1 hellip 119901C119873 119902CN

MSCKF running in Google ARCore (former Google Tango)

42

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

Outline

• What is an IMU and why do we need it?
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and Synchronization

Camera-IMU Calibration

• Goal: estimate the rigid-body transformation T_BC and the time offset t_d between the camera and the IMU, caused by communication delays and internal sensor delays (introduced by filters and logic).

• Assumptions: camera and IMU rigidly attached; camera intrinsically calibrated.

• Data:
  • Image points from a calibration pattern (checkerboard or QR board)
  • IMU measurements: accelerometer a_k and gyroscope ω_k

(Figure: image and IMU measurement timelines, offset by the time delay t_d)
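Kalibr estimates t_d jointly inside its spline optimization; a common way to obtain an initial guess (a sketch under that assumption, not Kalibr's actual code) is to cross-correlate the gyroscope rate with the angular rate inferred from frame-to-frame camera rotations:

```python
import numpy as np

def estimate_time_offset(t, cam_rate, imu_rate):
    """Initial guess of the camera-IMU time offset by cross-correlating
    the angular-rate magnitude inferred from camera rotations with the
    gyroscope magnitude, both resampled on the uniform timeline t.
    Positive return value means the IMU stream lags: t_imu = t_cam + t_d."""
    dt = t[1] - t[0]
    a = imu_rate - imu_rate.mean()
    b = cam_rate - cam_rate.mean()
    corr = np.correlate(a, b, mode="full")   # lags -(N-1) .. N-1
    lag = np.argmax(corr) - (len(b) - 1)
    return lag * dt

# toy check: the "IMU" signal is the camera signal delayed by 5 samples
t = np.arange(0, 10, 0.01)
sig = np.sin(2 * np.pi * 0.5 * t)
td = estimate_time_offset(t, sig, np.roll(sig, 5))
```

This only gives a coarse, sample-quantized t_d; the joint optimization then refines it to sub-sample accuracy.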

Kalibr Toolbox

Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

• Code: https://github.com/ethz-asl/kalibr/wiki/camera-imu-calibration

Kalibr Toolbox

Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

• Solves a non-linear Least-Squares optimization problem similar to the one seen before, but also optimizes over T_BC and t_d
• Continuous-time modelling of X using splines
• Numerical solver: Levenberg-Marquardt

$$\{X, L, T_{BC}, t_d, b_a, b_g\} = \operatorname*{argmin}_{X, L, T_{BC}, t_d, b_a, b_g} \; \sum_{k=1}^{N} \lVert f(x_{k-1}, u) - x_k \rVert^2_{\Lambda_k} \;+\; \sum_{k=1}^{N} \sum_{i=1}^{M} \lVert \pi(x_k, L_i) - z_{ik} \rVert^2_{\Sigma_{ik}}$$

(first term: IMU residuals; second term: reprojection residuals, i.e., the Bundle Adjustment term)
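As a toy illustration of the continuous-time modelling, a trajectory represented by a cubic B-spline can be evaluated at an arbitrary time, e.g. at the shifted camera timestamp t + t_d. This sketch assumes uniform knot spacing, a simplification of Kalibr's spline machinery:

```python
import numpy as np

# Uniform cubic B-spline basis in matrix form (power basis [1, u, u^2, u^3])
M = (1.0 / 6.0) * np.array([[ 1.0,  4.0,  1.0, 0.0],
                            [-3.0,  0.0,  3.0, 0.0],
                            [ 3.0, -6.0,  3.0, 0.0],
                            [-1.0,  3.0, -3.0, 1.0]])

def spline_eval(ctrl, t):
    """Evaluate a uniform cubic B-spline with control points ctrl (K x D)
    at time t in [0, K-3]; the knot spacing is 1."""
    i = min(int(t), len(ctrl) - 4)       # index of the active segment
    u = t - i                            # local coordinate in [0, 1]
    U = np.array([1.0, u, u**2, u**3])
    return U @ M @ ctrl[i:i + 4]
```

Because the basis weights sum to one and the curve is C2-continuous, derivatives of the spline give smooth velocity and acceleration predictions to compare against the IMU measurements.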

Kalibr Toolbox

Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

• Generates a report after optimizing the cost function:

Residuals:
  Reprojection error [px]:      0.0976, 0.051
  Gyroscope error [rad/s]:      0.0167, 0.009
  Accelerometer error [m/s^2]:  0.0595, 0.031

Transformation T_ci (IMU to camera):
  [[ 0.99995526 -0.00934911 -0.00143776  0.00008436]
   [ 0.00936458  0.99989388  0.01115983  0.00197427]
   [ 0.00133327 -0.0111728   0.99993669 -0.05054946]
   [ 0.          0.          0.          1.        ]]

Time shift (delay t_d), cam0 to imu0 [s] (t_imu = t_cam + shift):
  0.00270636270255

Popular Datasets for VIO / VI-SLAM

• EuRoC: drone with IMU and stereo camera
• Blackbird: drone flying fast with VR goggles (synthetic images and real dynamics + IMU)
• UZH FPV Drone Racing: drone flying fast with event camera, stereo camera, IMU
• MVSEC: drone, motorcycle, car with event camera, stereo camera, 3D lidar, GPS, IMU

Understanding Check

Are you able to answer the following questions?

• Why is it recommended to use an IMU for Visual Odometry?
• Why not just an IMU without a camera?
• What is the basic idea behind MEMS IMUs?
• What is the drift of a consumer IMU?
• What is the IMU measurement model (formula)?
• What causes the bias in an IMU?
• How do we model the bias?
• How do we integrate the acceleration to get the position (formula)?
• What is the definition of loosely coupled and tightly coupled visual inertial fusion?
• How does non-linear optimization-based visual inertial odometry work? Can you write down the cost function and illustrate its meaning?

Page 23: Vision Algorithms for Mobile Robotics Lecture 01 Introduction

Closed-form Solution (1D case)

• From the camera we can only get the position x up to an unknown scale factor s, thus

  $x = s\,\tilde{x}$

  where $\tilde{x}$ is the relative motion estimated by the camera.

• From the IMU:

  $x = x_0 + v_0 (t_1 - t_0) + \iint_{t_0}^{t_1} a(t)\, dt^2$

• By equating them:

  $s\,\tilde{x} = x_0 + v_0 (t_1 - t_0) + \iint_{t_0}^{t_1} a(t)\, dt^2$

• As shown in [Martinelli'14], if we assume to know $x_0$ (usually we set it to 0), then for 6DOF both s and $v_0$ can be determined in closed form from a single feature observation and 3 views.

Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF

(Figure: camera poses at times t_0 and t_1 observing landmark L_1)

Closed-form Solution (1D case)

$s\,\tilde{x}_1 = v_0 (t_1 - t_0) + \iint_{t_0}^{t_1} a(t)\, dt^2$

$s\,\tilde{x}_2 = v_0 (t_2 - t_0) + \iint_{t_0}^{t_2} a(t)\, dt^2$

In matrix form:

$$\begin{bmatrix} \tilde{x}_1 & (t_0 - t_1) \\ \tilde{x}_2 & (t_0 - t_2) \end{bmatrix} \begin{bmatrix} s \\ v_0 \end{bmatrix} = \begin{bmatrix} \iint_{t_0}^{t_1} a(t)\, dt^2 \\ \iint_{t_0}^{t_2} a(t)\, dt^2 \end{bmatrix}$$

Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF

(Figure: camera poses at times t_0, t_1, t_2 observing landmark L_1)
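The 2×2 system above can be solved directly. A numpy sketch with synthetic numbers (constant acceleration; the ground-truth s and v_0 are chosen for illustration and used only to generate the simulated measurements):

```python
import numpy as np

# hypothetical ground truth, used only to simulate the data
s_true, v0_true, a = 2.0, 0.3, 1.5
t0, t1, t2 = 0.0, 1.0, 2.0

def double_int(a, t):
    """Double integral of a constant acceleration from 0 to t."""
    return 0.5 * a * t ** 2

# unscaled camera displacements, so that s * x_tilde = v0*t + double_int(a, t)
x1 = (v0_true * t1 + double_int(a, t1)) / s_true
x2 = (v0_true * t2 + double_int(a, t2)) / s_true

# the 2x2 linear system from the slide: [x~; (t0-t)] [s; v0] = [double integrals]
A = np.array([[x1, t0 - t1],
              [x2, t0 - t2]])
b = np.array([double_int(a, t1), double_int(a, t2)])
s, v0 = np.linalg.solve(A, b)   # recovers s_true and v0_true
```

With noisy data or more than two views the system becomes overdetermined and is solved in the least-squares sense, which is exactly the general case described next.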

Closed-form Solution (general case)

• Considers N feature observations and the full 6DOF case.

• Can be used to initialize filters and non-linear optimizers (which always need an initialization point) [3].

• More complex to derive than the 1D case, but it also leads to a linear system of equations that can be solved using the pseudoinverse.

X is the vector of unknowns:
• Absolute scale s
• Initial velocity v_0
• 3D point distances (w.r.t. the first camera)
• Direction of the gravity vector
• Biases

A and S contain the 2D feature coordinates, acceleration, and angular velocity measurements.

[1] Martinelli, Vision and IMU data fusion: Closed-form solutions for attitude, speed, absolute scale and bias determination, IEEE Transactions on Robotics (T-RO), 2012. PDF
[2] Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF
[3] Kaiser, Martinelli, Fontana, Scaramuzza, Simultaneous state initialization and gyroscope bias calibration in visual inertial aided navigation, IEEE Robotics and Automation Letters (R-AL), 2017. PDF

Outline

• What is an IMU and why do we need it?
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and Synchronization

Non-linear Optimization Methods

VIO is solved as a non-linear Least-Squares optimization problem over X, L, b_a, b_g:

$$\{X, L, b_a, b_g\} = \operatorname*{argmin}_{X, L, b_a, b_g} \; \sum_{k=1}^{N} \lVert f(x_{k-1}, u) - x_k \rVert^2_{\Lambda_k} \;+\; \sum_{k=1}^{N} \sum_{i=1}^{M} \lVert \pi(x_k, L_i) - z_{ik} \rVert^2_{\Sigma_{ik}}$$

(first term: IMU residuals; second term: reprojection residuals, i.e., the Bundle Adjustment term)

[1] Jung, Taylor, Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results, International Conference on Computer Vision and Pattern Recognition (CVPR), 2001. PDF
[2] Strelow, Singh, Motion estimation from image and inertial measurements, International Journal of Robotics Research (IJRR), 2004. PDF

Non-linear Optimization Methods

VIO is solved as a non-linear Least-Squares optimization problem

$$\{X, L, b_a, b_g\} = \operatorname*{argmin}_{X, L, b_a, b_g} \; \sum_{k=1}^{N} \lVert f(x_{k-1}, u) - x_k \rVert^2_{\Lambda_k} \;+\; \sum_{k=1}^{N} \sum_{i=1}^{M} \lVert \pi(x_k, L_i) - z_{ik} \rVert^2_{\Sigma_{ik}}$$

where

• X = {x_1, …, x_N}: set of robot state estimates x_k (position, velocity, orientation) at frame times k
• L = {L_1, …, L_M}: 3D landmarks
• f(x_{k-1}, u): pose prediction from the integration of the IMU measurements u = {a_j, ω_j}
• π(x_k, L_i): measurement update from the projection of landmark L_i onto camera frame I_k
• z_{ik}: observed features
• Λ_k: state covariance from the IMU integration f(x_{k-1}, u)
• Σ_{ik}: covariance of the noisy 2D feature measurements

The first term contains the IMU residuals; the second the reprojection residuals (Bundle Adjustment term).

[1] Jung, Taylor, Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results, International Conference on Computer Vision and Pattern Recognition (CVPR), 2001. PDF
[2] Strelow, Singh, Motion estimation from image and inertial measurements, International Journal of Robotics Research (IJRR), 2004. PDF
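A schematic numpy sketch of how the two residual types are stacked and whitened by their covariances. This is heavily simplified relative to the cost above (states reduced to positions, identity camera rotation, hypothetical values), purely to show the structure:

```python
import numpy as np

def whiten(r, Sigma):
    """Return Sigma^(-1/2) r, so that ||whiten(r)||^2 equals ||r||^2_Sigma."""
    return np.linalg.solve(np.linalg.cholesky(Sigma), r)

def vio_residuals(states, landmarks, imu_pred, z, Lam, Sig):
    """Stack whitened IMU residuals f(x_{k-1}, u) - x_k and reprojection
    residuals pi(x_k, L_i) - z_ik; an optimizer minimizes their squared norm."""
    res = []
    for k in range(1, len(states)):
        res.append(whiten(imu_pred[k] - states[k], Lam))      # IMU term
    for k, x in enumerate(states):
        for i, L in enumerate(landmarks):
            p = L - x                                         # point in camera frame
            res.append(whiten(p[:2] / p[2] - z[k][i], Sig))   # reprojection term
    return np.concatenate(res)
```

Feeding the stacked residual vector to a Gauss-Newton or Levenberg-Marquardt solver yields the maximum-a-posteriori estimate under Gaussian noise; at the ground truth the residual vector is zero.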

Case Study 1: OKVIS

Because the complexity of the optimization grows cubically with the number of camera poses and features, real-time operation becomes infeasible as the trajectory and the map grow over time. OKVIS proposed to optimize only the current pose and a window of past keyframes.

Leutenegger, Lynen, Bosse, Siegwart, Furgale, Keyframe-based visual–inertial odometry using nonlinear optimization, International Journal of Robotics Research (IJRR), 2015. PDF

Case Study 2: SVO+GTSAM

Solves the same optimization problem as OKVIS, but:

• Keeps all the frames (from the start to the end of the trajectory).

• To make the optimization efficient, it:
  • marginalizes the 3D landmarks (rather than triangulating landmarks, only their visual bearing measurements are kept, and the visual constraints are enforced via the Epipolar Line Distance (Lecture 08));
  • pre-integrates the IMU data between keyframes (see later).

• Optimization solved using Factor Graphs via GTSAM.

• Very fast because it only optimizes the poses that are affected by a new observation.

Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb. 2017. Best Paper Award 2018.

Problem with IMU integration

• The integration of IMU measurements f(x_{k-1}, u) from k−1 to k depends on the state estimate at time k−1.

• During the optimization, every time the linearization point at k−1 changes, the integration between k−1 and k must be re-evaluated, thus slowing down the optimization.

• Idea: Preintegration
  • defines relative motion increments expressed in the body frame, which are independent of the global position, orientation, and velocity at k [1];
  • [2] uses this theory by leveraging the manifold structure of the rotation group SO(3).

$$\{X, L, b_a, b_g\} = \operatorname*{argmin}_{X, L, b_a, b_g} \; \sum_{k=1}^{N} \lVert f(x_{k-1}, u) - x_k \rVert^2_{\Lambda_k} \;+\; \sum_{k=1}^{N} \sum_{i=1}^{M} \lVert \pi(x_k, L_i) - z_{ik} \rVert^2_{\Sigma_{ik}}$$

(first term: IMU residuals; second term: reprojection residuals, i.e., the Bundle Adjustment term)

[1] Lupton, Sukkarieh, Visual-inertial-aided navigation for high-dynamic motion in built environments without initial conditions, IEEE Transactions on Robotics (T-RO), 2012. PDF
[2] Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb. 2017. Best Paper Award 2018.

IMU Pre-Integration

Inputs: gyroscope ω and accelerometer a; states R_k, p_k, v_k; preintegrated increments $\Delta\tilde{R}, \Delta\tilde{v}, \Delta\tilde{p}$ (prediction vs. estimate).

Standard: evaluate the errors in the global frame
  $e_R = \hat{R}(\omega, R_{k-1})^T R_k$
  $e_v = \hat{v}(\omega, a, v_{k-1}) - v_k$
  $e_p = \hat{p}(\omega, a, p_{k-1}) - p_k$
→ the integration must be repeated whenever the previous state changes.

Preintegration: evaluate relative errors
  $e_R = \Delta\tilde{R}^T \Delta R$
  $e_v = \Delta\tilde{v} - \Delta v$
  $e_p = \Delta\tilde{p} - \Delta p$
→ preintegration of the IMU deltas is possible, with no initial condition required.
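A minimal sketch of computing the preintegrated deltas by Euler integration of raw gyroscope and accelerometer samples. Gravity compensation, noise propagation, and the bias Jacobians of [2] are omitted; this only illustrates why the increments are independent of the global state:

```python
import numpy as np

def so3_exp(w):
    """Rodrigues formula: rotation matrix for rotation vector w."""
    th = np.linalg.norm(w)
    if th < 1e-12:
        return np.eye(3)
    k = w / th
    K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(th) * K + (1 - np.cos(th)) * (K @ K)

def preintegrate(gyro, accel, dt):
    """Relative motion increments expressed in the body frame at the first
    keyframe; no global position, orientation, or velocity is needed."""
    dR, dv, dp = np.eye(3), np.zeros(3), np.zeros(3)
    for w, a in zip(gyro, accel):
        dp = dp + dv * dt + 0.5 * (dR @ a) * dt ** 2
        dv = dv + (dR @ a) * dt
        dR = dR @ so3_exp(w * dt)
    return dR, dv, dp
```

When the optimizer changes the linearization point at k−1, these deltas stay valid and only cheap corrections are applied, instead of re-running the whole integration.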

GTSAM

• GTSAM [1] is a library that implements sensor fusion based on factor graphs and Bayes trees.

• Factor graphs are a tool borrowed from machine learning [2] that:
  • naturally capture the spatial dependencies among different sensor measurements;
  • can represent many robotics problems, like pose graphs and SLAM.

[1] Dellaert, Kaess, Factor Graphs for Robot Perception, Foundations and Trends in Robotics, 2017. PDF, Code
[2] Kschischang, Frey, Loeliger, Factor Graphs and the Sum-Product Algorithm, IEEE Transactions on Information Theory, 2001. PDF
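A tiny 1-D analogue of a factor graph (deliberately not GTSAM's API): each factor contributes a whitened residual row, and the weighted least-squares solve fuses a prior factor with odometry "between" factors:

```python
import numpy as np

def solve_pose_chain(prior, odom, sigma_p=0.1, sigma_o=0.2):
    """1-D pose graph: one prior factor on x_0 and between factors
    x_{k+1} - x_k = odom_k. The problem is linear-Gaussian, so a single
    weighted least-squares solve is exact."""
    n = len(odom) + 1
    rows, rhs = [], []
    r = np.zeros(n)
    r[0] = 1.0
    rows.append(r / sigma_p)                    # prior factor on x_0
    rhs.append(prior / sigma_p)
    for k, u in enumerate(odom):                # between factors
        r = np.zeros(n)
        r[k], r[k + 1] = -1.0, 1.0
        rows.append(r / sigma_o)
        rhs.append(u / sigma_o)
    A, b = np.array(rows), np.array(rhs)
    return np.linalg.lstsq(A, b, rcond=None)[0]

x = solve_pose_chain(prior=0.0, odom=[1.0, 1.0, 1.0])
```

Each factor touches only a few variables, so the system matrix is sparse; GTSAM exploits exactly this sparsity (via the Bayes tree) to update only the affected part of the solution when a new factor arrives.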

GTSAM

• GTSAM uses factor graphs and Bayes trees to implement incremental inference.

• A Bayes tree is a data structure (a directed tree) that represents the connectivity among the past robot states and measurements.

• Incremental inference means that, when new robot states or measurements are added to the graph, only a local part of the tree is updated.

[1] Kaess, Ranganathan, Dellaert, "iSAM: Incremental Smoothing and Mapping", IEEE Transactions on Robotics (T-RO), 2008. PDF

Outline

• What is an IMU and why do we need it?
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and Synchronization

Non-linear optimization vs. Filtering

Non-linear Optimization methods:
• Optimize a window of multiple states (or all the states) using non-linear Least-Squares optimization
• Multiple iterations (re-linearize at each iteration)
• Achieve the highest accuracy
• Slower (but fast with GTSAM)

Filtering methods:
• Solve the same problem by running only one iteration of the optimization (e.g., using an Extended Kalman Filter (EKF))
• 1 linearization iteration only
• Sensitive to the linearization point
• Fastest
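The "one iteration vs. many" distinction in the comparison above can be seen on a scalar toy problem, fitting x to a measurement z = x² (an illustrative analogy with hypothetical numbers, not an actual EKF implementation):

```python
import numpy as np

def gn_steps(x0, z, iters):
    """Minimize (z - x^2)^2 by Gauss-Newton. iters=1 mimics a filter-style
    single linearization; a larger iters mimics batch optimization, which
    re-linearizes until convergence."""
    x = x0
    for _ in range(iters):
        J = 2 * x            # Jacobian of the measurement model x^2
        r = z - x ** 2       # residual at the current linearization point
        x = x + r / J        # scalar Gauss-Newton update
    return x

x_filter = gn_steps(x0=1.0, z=4.0, iters=1)    # stops after one linearization
x_batch = gn_steps(x0=1.0, z=4.0, iters=10)    # converges to the true value 2
```

With a poor initial guess, the single-step estimate inherits the linearization error, while the iterated solve converges; this is exactly the accuracy-vs-speed trade-off of the table.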


Page 24: Vision Algorithms for Mobile Robotics Lecture 01 Introduction

Closed-form Solution (1D case)

24Martinelli Closed-form solution of visual-inertial structure from motion International Journal of Computer Vision (IJCV) 2014 PDF

119904 1199091 = 1199070 1199051 minus 1199050 +ඵ1199050

1199051

119886 119905 1198891199052

119904 1199092 = 1199070 1199052 minus 1199050 +ඵ1199050

1199052

119886 119905 1198891199052

1199091 (1199050minus1199051)1199092 (1199050minus1199052)

1199041199070

=

ඵ1199050

1199051

119886 119905 1198891199052

ඵ1199050

2

119886 119905 1198891199052

1199050 1199051 1199052

1198711

Closed-form Solution (general case)

bull Considers 119873 feature observations and the full 6DOF case

bull Can be used to initialize filters and non-linear optimizers (which always need an initialization point) [3]

bull More complex to derive than the 1D case But it also reaches a linear system of equations that can be solved using the pseudoinverse

25

119935 is the vector of unknownsbull Absolute scale 119904bull Initial velocity 1199070bull 3D Point distances (wrt the first camera) bull Direction of the gravity vector bull Biases

119912 and 119930 contain 2D feature coordinates acceleration and angular velocity measurements

[1] Martinelli Vision and IMU data fusion Closed-form solutions for attitude speed absolute scale and bias determination IEEE Transactions on Robotics (T-RO) 2012 PDF[2] Martinelli Closed-form solution of visual-inertial structure from motion International Journal of Computer Vision (IJCV) 2014 PDF[3] Kaiser Martinelli Fontana Scaramuzza Simultaneous state initialization and gyroscope bias calibration in visual inertial aided navigation IEEE Robotics and Automation Letters (R-AL) 2017 PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

26

Non-linear Optimization Methods

VIO is solved as non-linear Least Square optimization problem over

27

[1] Jung Taylor Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results International Conference on Computer Vision and Pattern Recognition (CVPR) 2001 PDF[2] Sterlow Singh Motion estimation from image and inertial measurements International Journal of Robotics Research (IJRR) 2004 PDF

X L 119887119886 119887119892 = 119886119903119892119898119894119899X L 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

Non-linear Optimization Methods

VIO is solved as non-linear Least Square optimization problem over

where

bull 119883 = 1199091 hellip 119909119873 set of robot state estimates 119909119896 (position velocity orientation) at frame times 119896

bull 119871 = 1198711 hellip 119871119872 3D landmarks

bull 119891 119909119896minus1 119906 pose prediction update from integration of IMU measurements 119906 = 119886119895 ω119895

bull 120587(119909119896 119897119894) measurement update from projection of landmark 119871119894 onto camera frame 119868119896bull 119911119894119896 observed features

bull 120556119896 state covariance from the IMU integration 119891 119909119896minus1 119906

bull 120564119894119896 is the covariance from the noisy 2D feature measurements

28

X L 119887119886 119887119892 = 119886119903119892119898119894119899X L 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

[1] Jung Taylor Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results International Conference on Computer Vision and Pattern Recognition (CVPR) 2001 PDF[2] Sterlow Singh Motion estimation from image and inertial measurements International Journal of Robotics Research (IJRR) 2004 PDF

Case Study 1 OKVIS

Because the complexity of the optimization is cubic with respect to the number of cameras poses and features real-time operation becomes infeasible as the trajectory and the map grow over time OKVIS proposed to only optimize the current pose and a window of past keyframes

29Leutenegger Lynen Bosse Siegwart Furgale Keyframe-based visualndashinertial odometry using nonlinear optimization International Journal

of Robotics Research (IJRR) 2015 PDF

Case Study 2 SVO+GTSAM

Solves the same optimization problem as OKVIS but

bull Keeps all the frames (from the start to the end of the trajectory)

bull To make the optimization efficient

bull Marginalizes 3D landmarks (rather than triangulating landmarks only their visual bearing measurements are kept and the visual constraintare enforced by the Epipolar Line Distance (Lecture 08))

bull pre-integrates the IMU data between keyframes (see later)

bull Optimization solved using Factor Graphs via GTSAM

bull Very fast because it only optimizes the poses that are affected by a new observation

30Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry

IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

SVO+GTSAM

31Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry

IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

SVO+GTSAM

32Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry

IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

Problem with IMU integration

bull The integration of IMU measurements 119891 119909119896minus1 119906 from 119896 minus 1 to 119896 is related to the state estimation at time 119896 minus 1

bull During optimization every time the linearization point at 119896 minus 1 changes the integration between 119896 minus 1and 119896 must be re-evaluated thus slowing down the optimization

bull Idea Preintegration

bull defines relative motion increments expressed in body frame which are independent on the global position orientation and velocity at 119896 [1]

bull [2] uses this theory by leveraging the manifold structure of the rotation group SO(3)

33

[1] Lupton Sukkarieh Visual-inertial-aided navigation for high-dynamic motion in built environments without initial conditions IEEE Transactions on Robotics (T-RO) 2012 PDF[2] Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

X L 119887119886 119887119892 = 119886119903119892119898119894119899X L 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

IMU Pre-Integration

34

120596 119886 Δ ෨119877 Δ 119907 Δ 119901

119877119896 119901119896 v119896

StandardEvaluate error in global frame

PreintegrationEvaluate relative errors

119942119877 = Δ ෨119877119879 Δ119877

119942v = Δv minus Δv

119942119901 = Δ 119901 minus Δ119901

119942119877 = 119877 120596 119877119896minus1119879119877119896

119942v = ොv(120596 119886 v119896minus1) minus v119896

119942119901 = Ƹ119901(120596 119886 119901119896minus1) minus 119901119896

Repeat integration when previous state changes

Preintegration of IMU deltas possible with no initial condition required

Prediction Estimate

GTSAM

bull GTSAM [1] is a library that implements sensor fusion based on factor graphs and Bayes trees

bull Factor graphs are a tool borrowed from machine learning [2] that

bull naturally capture the spatial dependences among different sensor measurements

bull can represent many robotics problems like pose graphs and SLAM

35

[1] Dellaert and Kaess Factor Graphs for Robot Perception Foundations and Trends in Robotics 2017 PDF Code[2] Kschischang Frey Loeliger Factor Graphs and the Sum-Product Algorithm Transactions on Information Theory 2001 PDF

GTSAM

bull GTSAM uses factor graphs and Bayes trees to implement incremental inference

bull A Bayes tree is a data structure a directed tree that visually represents the connectivity among the past robot statesand measurements

bull Incremental inference means that when new robot states or measurements are added to the graph only a local part of the tree is updated

36[1] Kaess Ranganathan Dellaer ldquoiSAM Incremental Smoothing and Mappingrdquo IEEE Transactions on Robotics (T-RO) 2008 PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

37

Non-linear optimization vs Filtering

38

Non-linear Optimization methods Filtering methods

Optimize a window of multiple states (or all the states) using non-linear Least-Squares optimization

Solve the same problem by running only one iteration of the optimization function (eg using Extended Kalman Filter (EKF))

Multiple iterations (it re-linearizes at each iteration)

Acheives the highest accuracy

Slower (but fast with GTSAM)

1 Linearization iteration only

Sensitive to linearization point

Fastest

Case Study 1 ROVIO

bull EKF state X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905 1198711 1198712 hellip 119871119896bull Basic idea

1 Prediction step predicts next position velocity orientation and features using IMU integration model

2 Measurement update refines state by leveraging visual constraint (ROVIO minimizes the photometric error between corresponding points (alternative would be the reprojection error))

39Bloesch Burri Omari Hutter Siegwart Iterated extended Kalman filter based visual-inertial odometry using direct photometric feedback

International Journal of Robotics Research (IJRR) 2017 PDF Code

ROVIO Problems

bull Complexity of the EKF grows quadratically in the number of estimated landmarksbull Thus max 20 landmarks are tracked to allow real-time operation

bull Only updates the most recent state

40

Case Study 2 MSCKF

bull MSCKF (Multi-State Constraint Kalman Filter) updates multiple past poses 119901C1 119902C1 hellip 119901C119873 119902CN in addition to the current state 119901 119905 119902 119905 119907 119905 State vector

bull Prediction step same as ROVIO

bull Measurement updatebull Differently from ROVIO

bull landmark positions are not added to the state vector thus can run very fast independently of the number of featuresbull Visual constraint is obtained from the Epipolar Line Distance (Lecture 08)

bull Used in spacecraft landing (NASAJPL Moon and Mars landing) DJI drones Google ARCore

bull Released open source within the OpenVins project httpsgithubcomrpngopen_vins

41

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905 119901C1 119902C1 hellip 119901C119873 119902CN

MSCKF running in Google ARCore (former Google Tango)

42

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

Outline

• What is an IMU and why do we need it?

• IMU model

• Visual Inertial Odometry:
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods

• Camera-IMU extrinsic calibration and Synchronization

43

Camera-IMU Calibration

• Goal: estimate the rigid-body transformation T_BC and the time offset t_d between the camera and the IMU, caused by communication delays and the internal sensor delays (introduced by filters and logic)

• Assumptions: camera and IMU rigidly attached; camera intrinsically calibrated

• Data:
  • image points from a calibration pattern (checkerboard or QR board)
  • IMU measurements: accelerometer a_k and gyroscope ω_k

44

[Figure: timeline t showing image timestamps and IMU timestamps offset by the time delay t_d]

Kalibr Toolbox

Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

• Code: https://github.com/ethz-asl/kalibr/wiki/camera-imu-calibration

Kalibr Toolbox


• Solves a non-linear least-squares optimization problem, similar to the one seen before, but also optimizes over T_BC and t_d

• Continuous-time modelling using splines for X

• Numerical solver: Levenberg-Marquardt

\{X, L, T_{BC}, t_d, b_a, b_g\} = \operatorname{argmin}_{X, L, T_{BC}, t_d, b_a, b_g} \; \sum_{k=1}^{N} \left\| f(x_{k-1}, u) - x_k \right\|_{\Lambda_k}^2 \; + \; \sum_{k=1}^{N} \sum_{i=1}^{M} \left\| \pi(x_k, L_i) - z_{ik} \right\|_{\Sigma_{ik}}^2

First term: IMU residuals. Second term: reprojection residuals (Bundle Adjustment term).

Kalibr Toolbox


• Generates a report after optimizing the cost function:

Residuals:
  Reprojection error [px]:      0.0976, 0.051
  Gyroscope error [rad/s]:      0.0167, 0.009
  Accelerometer error [m/s^2]:  0.0595, 0.031

Transformation T_ci (imu to cam):
[[ 0.99995526 -0.00934911 -0.00143776  0.00008436]
 [ 0.00936458  0.99989388  0.01115983  0.00197427]
 [ 0.00133327 -0.0111728   0.99993669 -0.05054946]
 [ 0.          0.          0.          1.        ]]

Time shift (delay d), cam0 to imu0 [s] (t_imu = t_cam + shift):
0.00270636270255
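The numbers in such a report are used directly downstream. A small sketch (assuming the usual homogeneous-coordinates convention for T_ci) of applying the estimated extrinsics and time offset:

```python
import numpy as np

# Values from the Kalibr report above
T_ci = np.array([  # maps points from the IMU frame into the camera frame
    [0.99995526, -0.00934911, -0.00143776,  0.00008436],
    [0.00936458,  0.99989388,  0.01115983,  0.00197427],
    [0.00133327, -0.0111728,   0.99993669, -0.05054946],
    [0.0,         0.0,         0.0,         1.0       ],
])
shift = 0.00270636270255  # t_imu = t_cam + shift

def imu_point_in_camera(p_imu):
    """Map a 3D point expressed in the IMU frame into the camera frame."""
    p_h = np.append(np.asarray(p_imu, dtype=float), 1.0)  # homogeneous coords
    return (T_ci @ p_h)[:3]

def cam_stamp_to_imu_clock(t_cam):
    """Convert a camera timestamp to the IMU clock using the estimated delay."""
    return t_cam + shift

origin_in_cam = imu_point_in_camera([0.0, 0.0, 0.0])
# The IMU origin sits about 5 cm behind the camera along its z-axis
```

The translation column shows the lever arm between the two sensors, which is exactly the quantity that would corrupt the fusion if left uncalibrated.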

Popular Datasets for VIO VI-SLAM

48

EuRoC: drone with IMU and stereo camera

Blackbird: drone flying fast with VR goggles (synthetic images and real dynamics + IMU)

UZH FPV Drone Racing: drone flying fast with event camera, stereo camera, IMU

MVSEC: drone, motorcycle, car with event camera, stereo camera, 3D lidar, GPS, IMU

Understanding Check

Are you able to answer the following questions?

• Why is it recommended to use an IMU for Visual Odometry?

• Why not just an IMU without a camera?

• What is the basic idea behind MEMS IMUs?

• What is the drift of a consumer IMU?

• What is the IMU measurement model (formula)?

• What causes the bias in an IMU?

• How do we model the bias?

• How do we integrate the acceleration to get the position (formula)?

• What is the definition of loosely coupled and tightly coupled visual inertial fusion?

• How does non-linear optimization-based visual inertial odometry work? Can you write down the cost function and illustrate its meaning?

49

Page 25: Vision Algorithms for Mobile Robotics Lecture 13 Visual Inertial Fusion

Closed-form Solution (general case)

• Considers N feature observations and the full 6-DOF case

• Can be used to initialize filters and non-linear optimizers (which always need an initialization point) [3]

• More complex to derive than the 1D case, but it also leads to a linear system of equations that can be solved using the pseudoinverse

25

X is the vector of unknowns:
• Absolute scale s
• Initial velocity v_0
• 3D point distances (w.r.t. the first camera)
• Direction of the gravity vector
• Biases

A and S contain 2D feature coordinates, acceleration, and angular velocity measurements.

[1] Martinelli, Vision and IMU data fusion: Closed-form solutions for attitude, speed, absolute scale and bias determination, IEEE Transactions on Robotics (T-RO), 2012. PDF
[2] Martinelli, Closed-form solution of visual-inertial structure from motion, International Journal of Computer Vision (IJCV), 2014. PDF
[3] Kaiser, Martinelli, Fontana, Scaramuzza, Simultaneous state initialization and gyroscope bias calibration in visual inertial aided navigation, IEEE Robotics and Automation Letters (RA-L), 2017. PDF
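The closed-form approach ultimately stacks the constraints into an overdetermined linear system A·X = S and solves it with the pseudoinverse. A generic sketch of that last step (with a random placeholder A, not the actual matrix built from feature and IMU measurements):

```python
import numpy as np

# Toy overdetermined linear system A x = s. In the closed-form VIO solution,
# A and s are built from feature bearings and IMU measurements; here they
# are random placeholders just to show the pseudoinverse solve.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 4))   # 20 constraints, 4 unknowns
x_true = np.array([2.0, -1.0, 0.5, 3.0])
s = A @ x_true                     # noiseless "measurements"

# Least-squares solution via the Moore-Penrose pseudoinverse
x_hat = np.linalg.pinv(A) @ s
```

With noiseless data and full column rank, the pseudoinverse recovers the unknowns exactly; with noisy data it returns the least-squares estimate.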

Outline

• What is an IMU and why do we need it?

• IMU model

• Visual Inertial Odometry:
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods

• Camera-IMU extrinsic calibration and Synchronization

26

Non-linear Optimization Methods

VIO is solved as a non-linear least-squares optimization problem over:

27

[1] Jung, Taylor, Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results, International Conference on Computer Vision and Pattern Recognition (CVPR), 2001. PDF
[2] Strelow, Singh, Motion estimation from image and inertial measurements, International Journal of Robotics Research (IJRR), 2004. PDF

\{X, L, b_a, b_g\} = \operatorname{argmin}_{X, L, b_a, b_g} \; \sum_{k=1}^{N} \left\| f(x_{k-1}, u) - x_k \right\|_{\Lambda_k}^2 \; + \; \sum_{k=1}^{N} \sum_{i=1}^{M} \left\| \pi(x_k, L_i) - z_{ik} \right\|_{\Sigma_{ik}}^2

First term: IMU residuals. Second term: reprojection residuals (Bundle Adjustment term).
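As a sanity check of what the two terms mean, the cost can be evaluated numerically on a 1-D toy problem (all models here, the prediction `f`, the projection `pi`, and the covariances, are made-up stand-ins, not a real VIO pipeline):

```python
import numpy as np

def mahalanobis_sq(r, cov):
    """Squared Mahalanobis norm ||r||^2_cov = r^T cov^{-1} r."""
    r = np.atleast_1d(r).astype(float)
    return float(r @ np.linalg.solve(cov, r))

def vio_cost(states, landmarks, f, pi, u, z, Lam, Sig):
    """Toy evaluation of the VIO cost above.

    f (IMU prediction) and pi (camera projection) are user-supplied
    stand-ins; Lam and Sig are the IMU and measurement covariances.
    """
    J = 0.0
    for k in range(1, len(states)):                 # IMU residuals
        J += mahalanobis_sq(f(states[k - 1], u) - states[k], Lam)
    for k, x in enumerate(states):                  # reprojection residuals
        for i, L in enumerate(landmarks):
            J += mahalanobis_sq(pi(x, L) - z[k][i], Sig)
    return J

# 1-D toy world: state = position, landmark at 5, "projection" = distance
states = [np.array([0.0]), np.array([1.0])]
landmarks = [np.array([5.0])]
f = lambda x, u: x + u                # IMU prediction: move by u
pi = lambda x, L: L - x               # toy projection model
z = [[np.array([5.0])], [np.array([4.0])]]
J = vio_cost(states, landmarks, f, pi, np.array([1.0]), z, np.eye(1), np.eye(1))
# Perfect prediction and perfect measurements give zero cost
```

Perturbing any measurement makes the corresponding residual, and hence the cost, strictly positive, which is what the optimizer trades off against the IMU term.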

Non-linear Optimization Methods

VIO is solved as a non-linear least-squares optimization problem over:

where

• X = {x_1, …, x_N}: set of robot state estimates x_k (position, velocity, orientation) at frame times k

• L = {L_1, …, L_M}: 3D landmarks

• f(x_{k−1}, u): pose prediction from the integration of the IMU measurements u = {a_j, ω_j}

• π(x_k, L_i): measurement update from the projection of landmark L_i onto camera frame I_k

• z_ik: observed features

• Λ_k: state covariance from the IMU integration f(x_{k−1}, u)

• Σ_ik: covariance of the noisy 2D feature measurements

28

\{X, L, b_a, b_g\} = \operatorname{argmin}_{X, L, b_a, b_g} \; \sum_{k=1}^{N} \left\| f(x_{k-1}, u) - x_k \right\|_{\Lambda_k}^2 \; + \; \sum_{k=1}^{N} \sum_{i=1}^{M} \left\| \pi(x_k, L_i) - z_{ik} \right\|_{\Sigma_{ik}}^2

First term: IMU residuals. Second term: reprojection residuals (Bundle Adjustment term).


Case Study 1 OKVIS

Because the complexity of the optimization is cubic in the number of camera poses and features, real-time operation becomes infeasible as the trajectory and the map grow over time. OKVIS proposed to optimize only the current pose and a window of past keyframes.

Leutenegger, Lynen, Bosse, Siegwart, Furgale, Keyframe-based visual–inertial odometry using nonlinear optimization, International Journal of Robotics Research (IJRR), 2015. PDF
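The windowing idea can be sketched with toy bookkeeping (illustrative class and method names; OKVIS marginalizes old states into a prior rather than simply dropping them, as done here):

```python
from collections import deque

class SlidingWindowSketch:
    """Toy keyframe window in the spirit of OKVIS: keep the newest frames
    plus a bounded set of past keyframes; everything older leaves the
    optimization (OKVIS marginalizes it; here we simply drop it)."""
    def __init__(self, num_keyframes=5, num_recent=3):
        self.keyframes = deque(maxlen=num_keyframes)
        self.recent = deque(maxlen=num_recent)

    def add_frame(self, frame_id, is_keyframe):
        self.recent.append(frame_id)
        if is_keyframe:
            self.keyframes.append(frame_id)

    def active_states(self):
        # States that take part in the optimization at this step
        return sorted(set(self.keyframes) | set(self.recent))

w = SlidingWindowSketch(num_keyframes=2, num_recent=2)
for fid in range(6):
    w.add_frame(fid, is_keyframe=(fid % 2 == 0))
# Only a bounded set of states is optimized, regardless of trajectory length
```

Because the number of active states is constant, each optimization step has bounded cost even on arbitrarily long trajectories.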

Case Study 2 SVO+GTSAM

Solves the same optimization problem as OKVIS, but:

• Keeps all the frames (from the start to the end of the trajectory)

• To make the optimization efficient:
  • marginalizes 3D landmarks (rather than triangulating landmarks, only their visual bearing measurements are kept, and the visual constraints are enforced by the Epipolar Line Distance (Lecture 08))
  • pre-integrates the IMU data between keyframes (see later)

• Optimization solved using Factor Graphs via GTSAM

• Very fast because it only optimizes the poses that are affected by a new observation

Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb. 2017. Best Paper Award 2018


Problem with IMU integration

• The integration of IMU measurements f(x_{k−1}, u) from k − 1 to k depends on the state estimate at time k − 1

• During optimization, every time the linearization point at k − 1 changes, the integration between k − 1 and k must be re-evaluated, thus slowing down the optimization

• Idea: Preintegration
  • defines relative motion increments, expressed in the body frame, which are independent of the global position, orientation, and velocity at k [1]
  • [2] extends this theory by leveraging the manifold structure of the rotation group SO(3)

33

[1] Lupton, Sukkarieh, Visual-inertial-aided navigation for high-dynamic motion in built environments without initial conditions, IEEE Transactions on Robotics (T-RO), 2012. PDF
[2] Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb. 2017. Best Paper Award 2018. PDF

\{X, L, b_a, b_g\} = \operatorname{argmin}_{X, L, b_a, b_g} \; \sum_{k=1}^{N} \left\| f(x_{k-1}, u) - x_k \right\|_{\Lambda_k}^2 \; + \; \sum_{k=1}^{N} \sum_{i=1}^{M} \left\| \pi(x_k, L_i) - z_{ik} \right\|_{\Sigma_{ik}}^2

First term: IMU residuals. Second term: reprojection residuals (Bundle Adjustment term).

IMU Pre-Integration

34

[Figure: two consecutive states (R_k, p_k, v_k) connected by IMU measurements ω̃, ã, summarized into the preintegrated increments ΔR̃, Δṽ, Δp̃]

Standard: evaluate the errors in the global frame (the integration must be repeated whenever the previous state changes):

e_R = R̂(ω̃) R_{k−1}^T R_k
e_v = v̂(ω̃, ã, v_{k−1}) − v_k
e_p = p̂(ω̃, ã, p_{k−1}) − p_k

Preintegration: evaluate relative errors (preintegration of the IMU deltas is possible, with no initial condition required):

e_R = ΔR̃^T ΔR
e_v = Δṽ − Δv
e_p = Δp̃ − Δp

(hatted/tilded quantities are predictions from the IMU measurements; plain symbols are the current estimates)
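A minimal sketch of accumulating the relative increments (ΔR, Δv, Δp), assuming a simple Euler integration scheme and ignoring the bias Jacobians and noise-covariance propagation that the actual on-manifold method carries along:

```python
import numpy as np

def so3_exp(w):
    """Rodrigues' formula: rotation matrix for rotation vector w."""
    th = np.linalg.norm(w)
    if th < 1e-12:
        return np.eye(3)
    k = w / th
    K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(th) * K + (1 - np.cos(th)) * (K @ K)

def preintegrate(gyro, acc, dt):
    """Accumulate relative increments (dR, dv, dp) in the body frame at k-1.

    Simplified Euler scheme: real implementations (Forster et al., T-RO 2017)
    also track bias Jacobians and propagate the noise covariance.
    """
    dR, dv, dp = np.eye(3), np.zeros(3), np.zeros(3)
    for w, a in zip(gyro, acc):
        dp = dp + dv * dt + 0.5 * (dR @ a) * dt**2
        dv = dv + (dR @ a) * dt
        dR = dR @ so3_exp(w * dt)
    return dR, dv, dp

# Constant forward acceleration, no rotation, for 1 s at 100 Hz
n, dt = 100, 0.01
dR, dv, dp = preintegrate([np.zeros(3)] * n, [np.array([1.0, 0, 0])] * n, dt)
# Expect dR = I, dv = a*t = 1 m/s, dp = 0.5*a*t^2 = 0.5 m
```

Note that nothing in `preintegrate` references a global pose or velocity; that independence is exactly what lets the increments be reused when the linearization point at k−1 changes.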

GTSAM

bull GTSAM [1] is a library that implements sensor fusion based on factor graphs and Bayes trees

• Factor graphs are a tool borrowed from machine learning [2] that:
  • naturally capture the spatial dependencies among different sensor measurements
  • can represent many robotics problems, like pose graphs and SLAM

35

[1] Dellaert and Kaess, Factor Graphs for Robot Perception, Foundations and Trends in Robotics, 2017. PDF, Code
[2] Kschischang, Frey, Loeliger, Factor Graphs and the Sum-Product Algorithm, IEEE Transactions on Information Theory, 2001. PDF

GTSAM

bull GTSAM uses factor graphs and Bayes trees to implement incremental inference

• A Bayes tree is a data structure (a directed tree) that visually represents the connectivity among the past robot states and measurements

• Incremental inference means that, when new robot states or measurements are added to the graph, only a local part of the tree is updated

[1] Kaess, Ranganathan, Dellaert, "iSAM: Incremental Smoothing and Mapping", IEEE Transactions on Robotics (T-RO), 2008. PDF
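The locality that factor graphs expose can be sketched with plain-Python bookkeeping (this is not GTSAM's API, just an illustration of why adding a factor touches only the variables it connects):

```python
from collections import defaultdict

class FactorGraphSketch:
    """Minimal factor-graph bookkeeping: variables are nodes, factors connect
    subsets of them. Adding a factor only touches the variables it involves,
    which is what makes incremental inference (iSAM / Bayes trees) cheap."""
    def __init__(self):
        self.factors = []                  # list of (name, variables)
        self.adjacency = defaultdict(set)  # variable -> indices of its factors

    def add_factor(self, name, variables):
        idx = len(self.factors)
        self.factors.append((name, tuple(variables)))
        for v in variables:
            self.adjacency[v].add(idx)
        return variables                   # the only states that need re-updating

g = FactorGraphSketch()
g.add_factor("prior", ["x0"])
g.add_factor("imu_01", ["x0", "x1"])
touched = g.add_factor("reproj_1_L3", ["x1", "L3"])
# Adding the reprojection factor leaves x0 untouched
```

In GTSAM the same locality is exploited through the Bayes tree: only the affected branch is re-eliminated when new factors arrive.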

Outline

• What is an IMU and why do we need it?

• IMU model

• Visual Inertial Odometry:
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods

• Camera-IMU extrinsic calibration and Synchronization

37

Non-linear optimization vs Filtering

38

Non-linear Optimization methods:
• Optimize a window of multiple states (or all the states) using non-linear least-squares optimization
• Multiple iterations (re-linearizes at each iteration)
• Achieves the highest accuracy
• Slower (but fast with GTSAM)

Filtering methods:
• Solve the same problem by running only one iteration of the optimization (e.g., using an Extended Kalman Filter (EKF))
• 1 linearization iteration only
• Sensitive to the linearization point
• Fastest
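The one-linearization vs. re-linearization distinction can be seen on a scalar toy problem (estimating x from the nonlinear measurement z = x²; this is an illustration of Gauss-Newton iteration, not a full EKF):

```python
def gauss_newton_sqrt(z, x0, iters):
    """Estimate x from the nonlinear measurement z = x^2 by Gauss-Newton.

    One iteration mimics a single EKF-style linearization; more iterations
    re-linearize around the latest estimate and converge to the minimum.
    """
    x = x0
    for _ in range(iters):
        r = z - x * x          # residual of the measurement model
        J = 2.0 * x            # Jacobian of x^2 at the current estimate
        x = x + r / J          # Gauss-Newton step (scalar normal equations)
    return x

one_iter = gauss_newton_sqrt(z=4.0, x0=1.0, iters=1)    # single linearization
many_iters = gauss_newton_sqrt(z=4.0, x0=1.0, iters=10) # iterated, converges to 2
```

A single linearization from a poor initial guess leaves a residual error (here 2.5 instead of 2), which is exactly the linearization-point sensitivity of filtering; iterating removes it at the cost of more computation.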

Page 27: Vision Algorithms for Mobile Robotics Lecture 13 Visual Inertial Fusion

Non-linear Optimization Methods

VIO is solved as a non-linear Least-Squares optimization problem over the states X, the landmarks L, and the IMU biases b_a, b_g:

27

[1] Jung, Taylor, Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results, International Conference on Computer Vision and Pattern Recognition (CVPR), 2001. [PDF]
[2] Strelow, Singh, Motion estimation from image and inertial measurements, International Journal of Robotics Research (IJRR), 2004. [PDF]

$\{X, L, b_a, b_g\} = \operatorname{argmin}_{X, L, b_a, b_g} \sum_{k=1}^{N} \| f(x_{k-1}, u) - x_k \|^2_{\Lambda_k} + \sum_{k=1}^{N} \sum_{i=1}^{M} \| \pi(x_k, L_i) - z_{ik} \|^2_{\Sigma_{ik}}$

(first sum: IMU residuals; second sum: reprojection residuals, i.e. the Bundle Adjustment term)

Non-linear Optimization Methods

VIO is solved as a non-linear Least-Squares optimization problem over the quantities above, where:

• X = {x_1, …, x_N}: set of robot state estimates x_k (position, velocity, orientation) at frame times k

• L = {L_1, …, L_M}: set of 3D landmarks

• f(x_{k-1}, u): pose prediction from the integration of the IMU measurements u = {a_j, ω_j}

• π(x_k, L_i): measurement prediction from the projection of landmark L_i onto camera frame I_k

• z_{ik}: observed features

• Λ_k: state covariance from the IMU integration f(x_{k-1}, u)

• Σ_{ik}: covariance of the noisy 2D feature measurements

28

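In code, evaluating this objective amounts to summing Mahalanobis-weighted residuals. The sketch below is a hypothetical illustration of the cost structure, where the callables f and pi stand in for the IMU integration and the camera projection:

```python
import numpy as np

def mahalanobis_sq(r, cov):
    """Squared Mahalanobis norm ||r||^2_cov = r^T cov^-1 r."""
    return float(r @ np.linalg.solve(cov, r))

def vio_cost(states, landmarks, f, pi, u, z, Lam, Sig):
    """Evaluate the VIO objective: IMU residuals plus reprojection residuals.
    f(x_prev, u_k) predicts the next state from IMU data; pi(x_k, L_i)
    projects landmark L_i; z[k][i] is the observed feature. Sketch only."""
    cost = 0.0
    for k in range(1, len(states)):
        r_imu = f(states[k - 1], u[k]) - states[k]          # IMU residual
        cost += mahalanobis_sq(r_imu, Lam[k])
        for i, L in enumerate(landmarks):
            r_vis = pi(states[k], L) - z[k][i]              # reprojection residual
            cost += mahalanobis_sq(r_vis, Sig[k][i])
    return cost

# Toy check with consistent predictions (all quantities hypothetical):
states = [np.zeros(2), np.ones(2)]
u = {1: np.ones(2)}
landmarks = [np.array([1.0, 2.0])]
z = {1: [landmarks[0] - states[1]]}
Lam = {1: np.eye(2)}
Sig = {1: [np.eye(2)]}
c = vio_cost(states, landmarks, lambda x, uk: x + uk, lambda x, L: L - x, u, z, Lam, Sig)
```

When predictions match the states and measurements exactly, both residual terms vanish and the cost is zero.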

Case Study 1: OKVIS

Because the complexity of the optimization is cubic in the number of camera poses and features, real-time operation becomes infeasible as the trajectory and the map grow over time. OKVIS therefore optimizes only the current pose and a window of past keyframes.

Leutenegger, Lynen, Bosse, Siegwart, Furgale, Keyframe-based visual-inertial odometry using nonlinear optimization, International Journal of Robotics Research (IJRR), 2015. [PDF]

Case Study 2: SVO+GTSAM

Solves the same optimization problem as OKVIS, but:

• Keeps all the frames (from the start to the end of the trajectory)

• To make the optimization efficient, it:

  • marginalizes the 3D landmarks: rather than triangulating them, only their visual bearing measurements are kept, and the visual constraints are enforced by the Epipolar Line Distance (Lecture 08)

  • pre-integrates the IMU data between keyframes (see later)

• Optimization is solved using Factor Graphs via GTSAM

• Very fast, because it only optimizes the poses that are affected by a new observation

Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb 2017. Best Paper Award 2018.
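The landmark-marginalization step above can be illustrated on the normal equations of the least-squares problem: eliminating the landmark block with the Schur complement yields a smaller system over the poses with the same solution. A numpy sketch (block sizes and names are illustrative):

```python
import numpy as np

def marginalize_landmarks(H, g, n_x):
    """Eliminate the landmark block of the normal equations H * delta = g via
    the Schur complement, leaving a reduced system over the first n_x pose
    variables. Illustrative sketch, not the SVO+GTSAM implementation."""
    Hxx, Hxl = H[:n_x, :n_x], H[:n_x, n_x:]
    Hlx, Hll = H[n_x:, :n_x], H[n_x:, n_x:]
    gx, gl = g[:n_x], g[n_x:]
    Hll_inv = np.linalg.inv(Hll)             # landmark block, invertible here
    H_red = Hxx - Hxl @ Hll_inv @ Hlx        # Schur complement of Hll in H
    g_red = gx - Hxl @ Hll_inv @ gl
    return H_red, g_red

# The reduced system gives the same pose solution as the full system:
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
H = A.T @ A + np.eye(6)                      # symmetric positive definite
g = rng.standard_normal(6)
H_red, g_red = marginalize_landmarks(H, g, n_x=2)
```

Since landmarks far outnumber poses in practice, solving the reduced system is much cheaper than solving the full one.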


Problem with IMU integration

• The integration of the IMU measurements f(x_{k-1}, u) from k−1 to k depends on the state estimate at time k−1

• During the optimization, every time the linearization point at k−1 changes, the integration between k−1 and k must be re-evaluated, thus slowing down the optimization

• Idea: Preintegration

  • defines relative motion increments, expressed in the body frame, which are independent of the global position, orientation, and velocity at k [1]

  • [2] builds on this theory by leveraging the manifold structure of the rotation group SO(3)

33

[1] Lupton, Sukkarieh, Visual-inertial-aided navigation for high-dynamic motion in built environments without initial conditions, IEEE Transactions on Robotics (T-RO), 2012. [PDF]
[2] Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb 2017. Best Paper Award 2018. [PDF]
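For intuition, one Euler step of the IMU prediction f(x_{k-1}, u) can be sketched as below (simplified: biases and noise are omitted, and the function names are illustrative, not from any specific library):

```python
import numpy as np

def so3_exp(w, dt):
    """Rotation matrix for the incremental rotation w*dt (Rodrigues' formula)."""
    theta = np.linalg.norm(w) * dt
    if theta < 1e-12:
        return np.eye(3)
    k = w / np.linalg.norm(w)
    K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

def integrate_imu(R, p, v, gyro, accel, dt, g=np.array([0.0, 0.0, -9.81])):
    """One Euler step of the IMU prediction: world-frame kinematics driven by
    body-frame measurements (gyro [rad/s], accel = specific force [m/s^2])."""
    a_world = R @ accel + g                  # rotate specific force, add gravity
    p_new = p + v * dt + 0.5 * a_world * dt**2
    v_new = v + a_world * dt
    R_new = R @ so3_exp(gyro, dt)            # attitude update on SO(3)
    return R_new, p_new, v_new
```

A stationary IMU measures +9.81 m/s^2 along z (the reaction to gravity), so the world-frame acceleration is zero and position and velocity stay put. Because R, p, v enter the step directly, changing the state at k−1 forces a re-integration, which is exactly the problem preintegration removes.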


IMU Pre-Integration

34

The IMU measurements (ω, a) between two frames yield the measured deltas $(\Delta\tilde{R}, \Delta\tilde{v}, \Delta\tilde{p})$; the state estimates at the frames are $(R_k, p_k, v_k)$.

Standard (evaluate the error in the global frame): the prediction depends on the previous state, so the integration must be repeated whenever the previous state changes:

$e_R = \hat{R}(\omega, R_{k-1})^T R_k, \qquad e_v = \hat{v}(\omega, a, v_{k-1}) - v_k, \qquad e_p = \hat{p}(\omega, a, p_{k-1}) - p_k$

Preintegration (evaluate relative errors): the IMU deltas can be preintegrated once, with no initial condition required:

$e_R = \Delta\tilde{R}^T \Delta R, \qquad e_v = \Delta\tilde{v} - \Delta v, \qquad e_p = \Delta\tilde{p} - \Delta p$
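The relative increments can be accumulated directly from the raw measurements. A minimal numpy sketch (bias terms and the noise-covariance propagation of [2] are omitted):

```python
import numpy as np

def skew(w):
    return np.array([[0, -w[2], w[1]], [w[2], 0, -w[0]], [-w[1], w[0], 0]])

def preintegrate(gyros, accels, dt):
    """Accumulate the relative motion increments (dR, dv, dp) between two
    frames, expressed in the body frame of the first one. They depend only on
    the IMU measurements, not on the global state at k-1, so they need not be
    recomputed when the linearization point changes. Sketch only."""
    dR, dv, dp = np.eye(3), np.zeros(3), np.zeros(3)
    for w, a in zip(gyros, accels):
        dp = dp + dv * dt + 0.5 * (dR @ a) * dt**2
        dv = dv + (dR @ a) * dt
        th = np.linalg.norm(w) * dt
        if th > 1e-12:                       # exponential map on SO(3)
            K = skew(w / np.linalg.norm(w))
            dR = dR @ (np.eye(3) + np.sin(th) * K + (1 - np.cos(th)) * K @ K)
    return dR, dv, dp

# Constant forward acceleration, no rotation, 10 samples covering 1 second:
dR, dv, dp = preintegrate([np.zeros(3)] * 10,
                          [np.array([1.0, 0.0, 0.0])] * 10, dt=0.1)
```

With no rotation and 1 m/s^2 over 1 s, the deltas reduce to the familiar kinematics: Δv = 1 m/s and Δp = 0.5 m, regardless of where and how fast the body actually was at k−1.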

GTSAM

• GTSAM [1] is a library that implements sensor fusion based on factor graphs and Bayes trees

• Factor graphs are a tool borrowed from machine learning [2] that:

  • naturally capture the spatial dependencies among different sensor measurements

  • can represent many robotics problems, like pose graphs and SLAM

35

[1] Dellaert, Kaess, Factor Graphs for Robot Perception, Foundations and Trends in Robotics, 2017. [PDF] [Code]
[2] Kschischang, Frey, Loeliger, Factor Graphs and the Sum-Product Algorithm, IEEE Transactions on Information Theory, 2001. [PDF]

GTSAM

• GTSAM uses factor graphs and Bayes trees to implement incremental inference

• A Bayes tree is a data structure (a directed tree) that represents the connectivity among the past robot states and measurements

• Incremental inference means that, when new robot states or measurements are added to the graph, only a local part of the tree is updated

[1] Kaess, Ranganathan, Dellaert, "iSAM: Incremental Smoothing and Mapping", IEEE Transactions on Robotics (T-RO), 2008. [PDF]
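The incremental-update idea can be made concrete with a toy factor graph in plain Python (this is NOT the GTSAM API): a new measurement involves only a few variables, and only the region of the graph connected to them needs to be re-solved.

```python
# Toy factor graph: each factor connects a small set of variables. When a new
# measurement arrives, incremental inference (as in iSAM / Bayes trees) only
# re-eliminates the part of the graph connected to the affected variables.
factors = []  # list of frozensets of variable ids

def add_factor(variables):
    factors.append(frozenset(variables))

def affected_variables(new_factor_vars):
    """Variables sharing a factor with the new measurement -- the only
    region of the tree that needs to be updated."""
    touched = set(new_factor_vars)
    for vs in factors:
        if vs & touched:
            touched |= vs
    return touched

add_factor({"x0"})          # prior on the first pose
add_factor({"x0", "x1"})    # IMU/odometry factor x0 -> x1
add_factor({"x1", "x2"})    # IMU/odometry factor x1 -> x2
add_factor({"x2", "l1"})    # projection factor: landmark l1 seen from x2
# A new observation of l1 from x2 touches only a local region of the graph:
region = affected_variables({"x2", "l1"})
```

Here the affected region is {x1, x2, l1}; the old pose x0 is untouched, which is why incremental inference scales so well on long trajectories.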

Outline

• What is an IMU and why do we need it?

• IMU model

• Visual Inertial Odometry

  • Closed-form solution

  • Non-linear optimization methods

  • Filtering methods

• Camera-IMU extrinsic calibration and Synchronization

37

Non-linear optimization vs Filtering

38

Non-linear Optimization methods:

• Optimize a window of multiple states (or all the states) using non-linear Least-Squares optimization

• Multiple iterations (re-linearizes at each iteration)

• Achieves the highest accuracy

• Slower (but fast with GTSAM)

Filtering methods:

• Solve the same problem by running only one iteration of the optimization (e.g., using the Extended Kalman Filter (EKF))

• One linearization iteration only

• Sensitive to the linearization point

• Fastest
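The difference between the two columns shows up already on a toy scalar problem: a filter linearizes once per measurement, while batch optimization re-linearizes until convergence (illustrative sketch, not any specific estimator):

```python
# Toy scalar problem: estimate x from the measurement z = h(x) = x^2 (true x = 2).
# A filter linearizes ONCE per measurement; batch optimization re-linearizes.
def h(x):
    return x * x

def H(x):          # Jacobian of h
    return 2 * x

def gauss_newton(x0, z, iters):
    x = x0
    for _ in range(iters):
        x = x + (z - h(x)) / H(x)   # one Gauss-Newton step on (z - h(x))^2
    return x

z = 4.0                                    # noise-free measurement of true x = 2
x_filter = gauss_newton(1.0, z, iters=1)   # EKF-style: single update -> 2.5
x_batch = gauss_newton(1.0, z, iters=10)   # optimizer: iterate to convergence
```

Starting from the prior x = 1, the single-iteration update lands at 2.5 (the residual linearization error), while the iterated solver converges to the true value, illustrating both the accuracy gap and the sensitivity to the linearization point.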

Case Study 1: ROVIO

• EKF state: X = {p(t), q(t), v(t), b_a(t), b_g(t), L_1, L_2, …, L_K}

• Basic idea:

  1. Prediction step: predicts the next position, velocity, orientation, and features using the IMU integration model

  2. Measurement update: refines the state by leveraging the visual constraints (ROVIO minimizes the photometric error between corresponding points; an alternative would be the reprojection error)

Bloesch, Burri, Omari, Hutter, Siegwart, Iterated extended Kalman filter based visual-inertial odometry using direct photometric feedback, International Journal of Robotics Research (IJRR), 2017. [PDF] [Code]

ROVIO: Problems

• The complexity of the EKF grows quadratically in the number of estimated landmarks

  • Thus, a maximum of ~20 landmarks are tracked to allow real-time operation

• Only updates the most recent state

40

Case Study 2: MSCKF

41

• MSCKF (Multi-State Constraint Kalman Filter) updates multiple past poses p_C1, q_C1, …, p_CN, q_CN in addition to the current state p(t), q(t), v(t)

• State vector: X = {p(t), q(t), v(t), b_a(t), b_g(t), p_C1, q_C1, …, p_CN, q_CN}

• Prediction step: same as ROVIO

• Measurement update (differently from ROVIO):

  • landmark positions are not added to the state vector, so the filter can run very fast, independently of the number of features

  • the visual constraint is obtained from the Epipolar Line Distance (Lecture 08)

• Used in spacecraft landing (NASA/JPL Moon and Mars landing), DJI drones, Google ARCore

• Released open source within the OpenVins project: https://github.com/rpng/open_vins

[1] Mourikis, Roumeliotis, A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation, International Conference on Robotics and Automation (ICRA), 2007. [PDF]
[2] Li, Mourikis, High-precision, consistent EKF-based visual-inertial odometry, International Journal of Robotics Research (IJRR), 2013. [PDF]

MSCKF running in Google ARCore (formerly Google Tango)

42
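The sliding window of pose clones can be sketched with a bounded container: each new image clones the current camera pose into the state, the oldest clone is dropped when the window is full, and landmarks never enter the state. Names and block sizes below are illustrative, not the original implementation.

```python
from collections import deque

class MsckfStateSketch:
    """Sliding window of past camera-pose clones, MSCKF-style. Landmarks are
    never added to the state, so the state size stays bounded by the window
    length no matter how many features are tracked (illustrative sketch)."""

    def __init__(self, max_clones=5):
        # current IMU state: p, q, v, b_a, b_g (placeholders here)
        self.imu_state = {"p": None, "q": None, "v": None, "b_a": None, "b_g": None}
        self.clones = deque(maxlen=max_clones)   # past (p_Ci, q_Ci) pairs

    def augment(self, p_C, q_C):
        """On each new image, clone the current camera pose into the window;
        the oldest clone is dropped (marginalized) when the window is full."""
        self.clones.append((p_C, q_C))

    def state_dim(self):
        # 3 (p) + 4 (q) + 3 (v) + 3 (b_a) + 3 (b_g) = 16, plus 7 per clone
        return 16 + 7 * len(self.clones)
```

Augmenting five times with a window of three keeps only the three newest clones, so the state dimension stays fixed while old poses are marginalized away.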

Outline

• What is an IMU and why do we need it?

• IMU model

• Visual Inertial Odometry

  • Closed-form solution

  • Non-linear optimization methods

  • Filtering methods

• Camera-IMU extrinsic calibration and Synchronization

43

Camera-IMU Calibration

44

• Goal: estimate the rigid-body transformation T_BC and the time offset t_d between the camera and the IMU, caused by communication delays and internal sensor delays (introduced by filters and logic)

• Assumptions: camera and IMU rigidly attached; camera intrinsically calibrated

• Data:

  • Image points from a calibration pattern (checkerboard or QR board)

  • IMU measurements: accelerometer a_k and gyroscope ω_k

[Figure: timeline t of image timestamps vs. IMU timestamps, shifted by the offset t_d]
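In practice, the time offset t_d is often initialized by cross-correlating the angular-rate magnitude from the gyroscope with the rotation rate recovered from the images; Kalibr then refines t_d inside the full spline-based optimization. A hypothetical sketch (the helper, signal names, and the resampling step are assumptions):

```python
import numpy as np

def estimate_time_offset(gyro_rate, cam_rate, dt):
    """Initialize the camera-IMU time offset t_d by cross-correlating the
    angular-rate magnitude from the gyroscope with the rotation rate recovered
    from the images. Both signals are assumed already resampled to a common
    period dt (hypothetical helper, sign convention illustrative)."""
    g = gyro_rate - np.mean(gyro_rate)
    c = cam_rate - np.mean(cam_rate)
    corr = np.correlate(g, c, mode="full")
    lag = int(np.argmax(corr)) - (len(c) - 1)   # samples the gyro stream lags
    return lag * dt

# Synthetic check: a chirp delayed by 5 samples at dt = 0.01 s
n = np.arange(400)
cam = np.sin(0.0005 * n**2)      # chirp: unambiguous alignment peak
gyro = np.roll(cam, 5)
gyro[:5] = 0.0
t_d = estimate_time_offset(gyro, cam, dt=0.01)
```

The recovered offset matches the injected 50 ms delay; a real pipeline would then hand this initial t_d to the continuous-time optimizer above.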

Kalibr Toolbox

45Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Code httpsgithubcomethz-aslkalibrwikicamera-imu-calibration

Kalibr Toolbox

46Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Solves a non-linear Least Square optimization problem similar to that seen before but also optimizes over 119879119861119862 119905119889

bull Continuous-time modelling using splines for 119883

bull Numerical solver Levenberg-Marquardt

119883 119871 119879119861119862 119905119889 119887119886 119887119892 = 119886119903119892119898119894119899119883 119871119879119861119862119905119889 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

Kalibr Toolbox

47Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Generates a report after optimizing the cost function

Residuals

Reprojection error [px] 00976 0051

Gyroscope error [rads] 00167 0009

Accelerometer error [ms^2] 00595 0031

Transformation T_ci (imu to cam)

[[ 099995526 -000934911 -000143776 000008436]

[ 000936458 099989388 001115983 000197427]

[ 000133327 -00111728 099993669 -005054946]

[ 0 0 0 1 ]]

Time shift (delay d)

cam0 to imu0 [s] (t_imu = t_cam + shift)

000270636270255

Popular Datasets for VIO VI-SLAM

48

EuRoC

Drone with IMU and stereo camera

Blackbird

Drone flying fast with VR goggles (synthetic images and real dynamics + IMU)

UZH FPV Drone Racing

Drone flying fast with event camera stereo camera IMU

MVSEC

Drone motorcycle car with event camera stereo camera

3D lidar GPS IMU

Understanding Check

Are you able to answer the following questions

bull Why is it recommended to use an IMU for Visual Odometry

bull Why not just an IMU without a camera

bull What is the basic idea behind MEMS IMUs

bull What is the drift of a consumer IMU

bull What is the IMU measurement model (formula)

bull What causes the bias in an IMU

bull How do we model the bias

bull How do we integrate the acceleration to get the position (formula)

bull What is the definition of loosely coupled and tightly coupled visual inertial fusion

bull How does non-linear optimization-based visual inertial odometry Can you write down the cost function and illustrate its meaning

49

Page 28: Vision Algorithms for Mobile Robotics Lecture 01 Introduction

Non-linear Optimization Methods

VIO is solved as non-linear Least Square optimization problem over

where

bull 119883 = 1199091 hellip 119909119873 set of robot state estimates 119909119896 (position velocity orientation) at frame times 119896

bull 119871 = 1198711 hellip 119871119872 3D landmarks

bull 119891 119909119896minus1 119906 pose prediction update from integration of IMU measurements 119906 = 119886119895 ω119895

bull 120587(119909119896 119897119894) measurement update from projection of landmark 119871119894 onto camera frame 119868119896bull 119911119894119896 observed features

bull 120556119896 state covariance from the IMU integration 119891 119909119896minus1 119906

bull 120564119894119896 is the covariance from the noisy 2D feature measurements

28

X L 119887119886 119887119892 = 119886119903119892119898119894119899X L 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

[1] Jung Taylor Camera Trajectory Estimation using Inertial Sensor Measurements and Structure from Motion Results International Conference on Computer Vision and Pattern Recognition (CVPR) 2001 PDF[2] Sterlow Singh Motion estimation from image and inertial measurements International Journal of Robotics Research (IJRR) 2004 PDF

Case Study 1 OKVIS

Because the complexity of the optimization is cubic with respect to the number of cameras poses and features real-time operation becomes infeasible as the trajectory and the map grow over time OKVIS proposed to only optimize the current pose and a window of past keyframes

29Leutenegger Lynen Bosse Siegwart Furgale Keyframe-based visualndashinertial odometry using nonlinear optimization International Journal

of Robotics Research (IJRR) 2015 PDF

Case Study 2 SVO+GTSAM

Solves the same optimization problem as OKVIS but

bull Keeps all the frames (from the start to the end of the trajectory)

bull To make the optimization efficient

bull Marginalizes 3D landmarks (rather than triangulating landmarks only their visual bearing measurements are kept and the visual constraintare enforced by the Epipolar Line Distance (Lecture 08))

bull pre-integrates the IMU data between keyframes (see later)

bull Optimization solved using Factor Graphs via GTSAM

bull Very fast because it only optimizes the poses that are affected by a new observation

30Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry

IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

SVO+GTSAM

31Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry

IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

SVO+GTSAM

32Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry

IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

Problem with IMU integration

bull The integration of IMU measurements 119891 119909119896minus1 119906 from 119896 minus 1 to 119896 is related to the state estimation at time 119896 minus 1

bull During optimization every time the linearization point at 119896 minus 1 changes the integration between 119896 minus 1and 119896 must be re-evaluated thus slowing down the optimization

bull Idea Preintegration

bull defines relative motion increments expressed in body frame which are independent on the global position orientation and velocity at 119896 [1]

bull [2] uses this theory by leveraging the manifold structure of the rotation group SO(3)

33

[1] Lupton Sukkarieh Visual-inertial-aided navigation for high-dynamic motion in built environments without initial conditions IEEE Transactions on Robotics (T-RO) 2012 PDF[2] Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

X L 119887119886 119887119892 = 119886119903119892119898119894119899X L 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

IMU Pre-Integration

34

120596 119886 Δ ෨119877 Δ 119907 Δ 119901

119877119896 119901119896 v119896

StandardEvaluate error in global frame

PreintegrationEvaluate relative errors

119942119877 = Δ ෨119877119879 Δ119877

119942v = Δv minus Δv

119942119901 = Δ 119901 minus Δ119901

119942119877 = 119877 120596 119877119896minus1119879119877119896

119942v = ොv(120596 119886 v119896minus1) minus v119896

119942119901 = Ƹ119901(120596 119886 119901119896minus1) minus 119901119896

Repeat integration when previous state changes

Preintegration of IMU deltas possible with no initial condition required

Prediction Estimate

GTSAM

bull GTSAM [1] is a library that implements sensor fusion based on factor graphs and Bayes trees

bull Factor graphs are a tool borrowed from machine learning [2] that

bull naturally capture the spatial dependences among different sensor measurements

bull can represent many robotics problems like pose graphs and SLAM

35

[1] Dellaert and Kaess Factor Graphs for Robot Perception Foundations and Trends in Robotics 2017 PDF Code[2] Kschischang Frey Loeliger Factor Graphs and the Sum-Product Algorithm Transactions on Information Theory 2001 PDF

GTSAM

bull GTSAM uses factor graphs and Bayes trees to implement incremental inference

bull A Bayes tree is a data structure a directed tree that visually represents the connectivity among the past robot statesand measurements

bull Incremental inference means that when new robot states or measurements are added to the graph only a local part of the tree is updated

36[1] Kaess Ranganathan Dellaer ldquoiSAM Incremental Smoothing and Mappingrdquo IEEE Transactions on Robotics (T-RO) 2008 PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

37

Non-linear optimization vs Filtering

38

Non-linear Optimization methods Filtering methods

Optimize a window of multiple states (or all the states) using non-linear Least-Squares optimization

Solve the same problem by running only one iteration of the optimization function (eg using Extended Kalman Filter (EKF))

Multiple iterations (it re-linearizes at each iteration)

Acheives the highest accuracy

Slower (but fast with GTSAM)

1 Linearization iteration only

Sensitive to linearization point

Fastest

Case Study 1 ROVIO

bull EKF state X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905 1198711 1198712 hellip 119871119896bull Basic idea

1 Prediction step predicts next position velocity orientation and features using IMU integration model

2 Measurement update refines state by leveraging visual constraint (ROVIO minimizes the photometric error between corresponding points (alternative would be the reprojection error))

39Bloesch Burri Omari Hutter Siegwart Iterated extended Kalman filter based visual-inertial odometry using direct photometric feedback

International Journal of Robotics Research (IJRR) 2017 PDF Code

ROVIO Problems

bull Complexity of the EKF grows quadratically in the number of estimated landmarksbull Thus max 20 landmarks are tracked to allow real-time operation

bull Only updates the most recent state

40

Case Study 2 MSCKF

bull MSCKF (Multi-State Constraint Kalman Filter) updates multiple past poses 119901C1 119902C1 hellip 119901C119873 119902CN in addition to the current state 119901 119905 119902 119905 119907 119905 State vector

bull Prediction step same as ROVIO

bull Measurement updatebull Differently from ROVIO

bull landmark positions are not added to the state vector thus can run very fast independently of the number of featuresbull Visual constraint is obtained from the Epipolar Line Distance (Lecture 08)

bull Used in spacecraft landing (NASAJPL Moon and Mars landing) DJI drones Google ARCore

bull Released open source within the OpenVins project httpsgithubcomrpngopen_vins

41

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905 119901C1 119902C1 hellip 119901C119873 119902CN

MSCKF running in Google ARCore (former Google Tango)

42

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

43

Camera-IMU Calibration

bull Goal estimate the rigid-body transformation 119931119913119914 and time offset 119957119941 between the camera and the IMU caused by communication delays and the internal sensor delays (introduced by filters and logic)

bull Assumptions Camera and IMU rigidly attached Camera intrinsically calibrated

bull Data

bull Image points from calibration pattern (checkerboard or QR board)

bull IMU measurements accelerometer 119886119896 and gyroscope 120596119896

44

119905

images

IMU

119957119941

Kalibr Toolbox

45Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Code httpsgithubcomethz-aslkalibrwikicamera-imu-calibration

Kalibr Toolbox

46Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Solves a non-linear Least Square optimization problem similar to that seen before but also optimizes over 119879119861119862 119905119889

bull Continuous-time modelling using splines for 119883

bull Numerical solver Levenberg-Marquardt

119883 119871 119879119861119862 119905119889 119887119886 119887119892 = 119886119903119892119898119894119899119883 119871119879119861119862119905119889 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

Kalibr Toolbox

47Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Generates a report after optimizing the cost function

Residuals

Reprojection error [px] 00976 0051

Gyroscope error [rads] 00167 0009

Accelerometer error [ms^2] 00595 0031

Transformation T_ci (imu to cam)

[[ 099995526 -000934911 -000143776 000008436]

[ 000936458 099989388 001115983 000197427]

[ 000133327 -00111728 099993669 -005054946]

[ 0 0 0 1 ]]

Time shift (delay d)

cam0 to imu0 [s] (t_imu = t_cam + shift)

000270636270255

Popular Datasets for VIO VI-SLAM

48

EuRoC

Drone with IMU and stereo camera

Blackbird

Drone flying fast with VR goggles (synthetic images and real dynamics + IMU)

UZH FPV Drone Racing

Drone flying fast with event camera stereo camera IMU

MVSEC

Drone motorcycle car with event camera stereo camera

3D lidar GPS IMU

Understanding Check

Are you able to answer the following questions

bull Why is it recommended to use an IMU for Visual Odometry

bull Why not just an IMU without a camera

bull What is the basic idea behind MEMS IMUs

bull What is the drift of a consumer IMU

bull What is the IMU measurement model (formula)

bull What causes the bias in an IMU

bull How do we model the bias

bull How do we integrate the acceleration to get the position (formula)

bull What is the definition of loosely coupled and tightly coupled visual inertial fusion

bull How does non-linear optimization-based visual inertial odometry Can you write down the cost function and illustrate its meaning

49

Page 29: Vision Algorithms for Mobile Robotics Lecture 01 Introduction

Case Study 1 OKVIS

Because the complexity of the optimization is cubic with respect to the number of cameras poses and features real-time operation becomes infeasible as the trajectory and the map grow over time OKVIS proposed to only optimize the current pose and a window of past keyframes

29Leutenegger Lynen Bosse Siegwart Furgale Keyframe-based visualndashinertial odometry using nonlinear optimization International Journal

of Robotics Research (IJRR) 2015 PDF

Case Study 2 SVO+GTSAM

Solves the same optimization problem as OKVIS but

bull Keeps all the frames (from the start to the end of the trajectory)

bull To make the optimization efficient

bull Marginalizes 3D landmarks (rather than triangulating landmarks only their visual bearing measurements are kept and the visual constraintare enforced by the Epipolar Line Distance (Lecture 08))

bull pre-integrates the IMU data between keyframes (see later)

bull Optimization solved using Factor Graphs via GTSAM

bull Very fast because it only optimizes the poses that are affected by a new observation

30Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry

IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

SVO+GTSAM

31Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry

IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

SVO+GTSAM

32Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry

IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

Problem with IMU integration

bull The integration of IMU measurements 119891 119909119896minus1 119906 from 119896 minus 1 to 119896 is related to the state estimation at time 119896 minus 1

bull During optimization every time the linearization point at 119896 minus 1 changes the integration between 119896 minus 1and 119896 must be re-evaluated thus slowing down the optimization

bull Idea Preintegration

bull defines relative motion increments expressed in body frame which are independent on the global position orientation and velocity at 119896 [1]

bull [2] uses this theory by leveraging the manifold structure of the rotation group SO(3)

33

[1] Lupton Sukkarieh Visual-inertial-aided navigation for high-dynamic motion in built environments without initial conditions IEEE Transactions on Robotics (T-RO) 2012 PDF[2] Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

X L 119887119886 119887119892 = 119886119903119892119898119894119899X L 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

IMU Pre-Integration

34

120596 119886 Δ ෨119877 Δ 119907 Δ 119901

119877119896 119901119896 v119896

StandardEvaluate error in global frame

PreintegrationEvaluate relative errors

119942119877 = Δ ෨119877119879 Δ119877

119942v = Δv minus Δv

119942119901 = Δ 119901 minus Δ119901

119942119877 = 119877 120596 119877119896minus1119879119877119896

119942v = ොv(120596 119886 v119896minus1) minus v119896

119942119901 = Ƹ119901(120596 119886 119901119896minus1) minus 119901119896

Repeat integration when previous state changes

Preintegration of IMU deltas possible with no initial condition required

Prediction Estimate

GTSAM

bull GTSAM [1] is a library that implements sensor fusion based on factor graphs and Bayes trees

bull Factor graphs are a tool borrowed from machine learning [2] that

bull naturally capture the spatial dependences among different sensor measurements

bull can represent many robotics problems like pose graphs and SLAM

35

[1] Dellaert and Kaess Factor Graphs for Robot Perception Foundations and Trends in Robotics 2017 PDF Code[2] Kschischang Frey Loeliger Factor Graphs and the Sum-Product Algorithm Transactions on Information Theory 2001 PDF

GTSAM

bull GTSAM uses factor graphs and Bayes trees to implement incremental inference

bull A Bayes tree is a data structure a directed tree that visually represents the connectivity among the past robot statesand measurements

bull Incremental inference means that when new robot states or measurements are added to the graph only a local part of the tree is updated

36[1] Kaess Ranganathan Dellaer ldquoiSAM Incremental Smoothing and Mappingrdquo IEEE Transactions on Robotics (T-RO) 2008 PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometry

bull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

37

Non-linear optimization vs Filtering

38

Non-linear Optimization methods:

bull Optimize a window of multiple states (or all the states) using non-linear Least-Squares optimization

bull Multiple iterations (re-linearizes at each iteration)

bull Achieves the highest accuracy

bull Slower (but fast with GTSAM)

Filtering methods:

bull Solve the same problem by running only one iteration of the optimization function (e.g., using an Extended Kalman Filter (EKF))

bull 1 linearization iteration only

bull Sensitive to the linearization point

bull Fastest

Case Study 1 ROVIO

bull EKF state: $X = [p(t), q(t), v(t), b_a(t), b_g(t), L_1, L_2, \ldots, L_k]$

bull Basic idea:

1. Prediction step: predicts the next position, velocity, orientation, and features using the IMU integration model

2. Measurement update: refines the state by leveraging visual constraints (ROVIO minimizes the photometric error between corresponding points; an alternative would be the reprojection error)

39

Bloesch, Burri, Omari, Hutter, Siegwart, Iterated extended Kalman filter based visual-inertial odometry using direct photometric feedback, International Journal of Robotics Research (IJRR), 2017. PDF, Code
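The predict/update loop of an EKF-based pipeline can be illustrated with a deliberately simplified example: a 1D position-velocity state driven by an accelerometer input, corrected by a position-like visual measurement. This is a toy sketch of the filtering recursion, not ROVIO's photometric formulation:

```python
import numpy as np

dt = 0.1
F = np.array([[1.0, dt], [0.0, 1.0]])   # state transition for [p, v]
B = np.array([0.5 * dt**2, dt])         # accelerometer input mapping
H = np.array([[1.0, 0.0]])              # we observe position only
Q = 1e-3 * np.eye(2)                    # process noise covariance
R = np.array([[1e-2]])                  # measurement noise covariance

x, P = np.zeros(2), np.eye(2)

def predict(x, P, a):
    """IMU-driven prediction step."""
    return F @ x + B * a, F @ P @ F.T + Q

def update(x, P, z):
    """Visual measurement update."""
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    x = x + K @ (z - H @ x)
    P = (np.eye(2) - K @ H) @ P
    return x, P

for k in range(50):                     # constant acceleration a = 1
    x, P = predict(x, P, 1.0)
    z = 0.5 * ((k + 1) * dt) ** 2       # noiseless "visual" position
    x, P = update(x, P, np.array([z]))
```

In ROVIO the state additionally contains orientation, biases, and up to ~20 landmarks, and the update linearizes the photometric error; the predict/update skeleton is the same.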

ROVIO Problems

bull Complexity of the EKF grows quadratically in the number of estimated landmarks

bull Thus, max 20 landmarks are tracked to allow real-time operation

bull Only updates the most recent state

40

Case Study 2 MSCKF

bull MSCKF (Multi-State Constraint Kalman Filter) updates multiple past poses $p_{C_1}, q_{C_1}, \ldots, p_{C_N}, q_{C_N}$ in addition to the current state $p(t), q(t), v(t)$ in the state vector

bull Prediction step same as ROVIO

bull Measurement updatebull Differently from ROVIO

bull landmark positions are not added to the state vector thus can run very fast independently of the number of featuresbull Visual constraint is obtained from the Epipolar Line Distance (Lecture 08)

bull Used in spacecraft landing (NASA/JPL Moon and Mars landing), DJI drones, Google ARCore

bull Released open source within the OpenVins project: https://github.com/rpng/open_vins

41

[1] Mourikis, Roumeliotis, A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation, International Conference on Robotics and Automation (ICRA), 2007. PDF

[2] Li, Mourikis, High-precision, consistent EKF-based visual-inertial odometry, International Journal of Robotics Research (IJRR), 2013. PDF

State vector: $X = [p(t), q(t), v(t), b_a(t), b_g(t), p_{C_1}, q_{C_1}, \ldots, p_{C_N}, q_{C_N}]$

MSCKF running in Google ARCore (formerly Google Tango)

42

[1] Mourikis, Roumeliotis, A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation, International Conference on Robotics and Automation (ICRA), 2007. PDF

[2] Li, Mourikis, High-precision, consistent EKF-based visual-inertial odometry, International Journal of Robotics Research (IJRR), 2013. PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometry

bull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

43

Camera-IMU Calibration

bull Goal: estimate the rigid-body transformation $T_{BC}$ and the time offset $t_d$ between the camera and the IMU. The offset is caused by communication delays and internal sensor delays (introduced by filters and logic)

bull Assumptions Camera and IMU rigidly attached Camera intrinsically calibrated

bull Data

bull Image points from calibration pattern (checkerboard or QR board)

bull IMU measurements accelerometer 119886119896 and gyroscope 120596119896

44

[Figure: timeline of image timestamps and IMU timestamps, offset by the time delay $t_d$]
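Kalibr estimates $t_d$ inside the joint optimization below; a simpler, commonly used way to get an initial guess (an assumption of this sketch, not Kalibr's method) is to cross-correlate the gyroscope rate with the rotation rate derived from the camera images:

```python
import numpy as np

def estimate_delay(sig_cam, sig_imu, dt):
    """Return the delay (seconds) by which sig_imu lags sig_cam,
    found as the peak of their cross-correlation."""
    sig_cam = sig_cam - sig_cam.mean()
    sig_imu = sig_imu - sig_imu.mean()
    corr = np.correlate(sig_imu, sig_cam, mode="full")
    lag = np.argmax(corr) - (len(sig_cam) - 1)   # samples of delay
    return lag * dt

# Synthetic check: the "IMU" signal is the camera-derived rotation
# rate delayed by 5 samples (50 ms at 100 Hz).
t = np.arange(0.0, 10.0, 0.01)
cam = np.sin(2 * np.pi * 0.5 * t)
imu = np.roll(cam, 5)
print(estimate_delay(cam, imu, 0.01))  # ~0.05
```

The recovered shift corresponds to the report's convention t_imu = t_cam + shift; the joint spline-based optimization then refines it to sub-millisecond accuracy.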

Kalibr Toolbox

45

Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

bull Code: https://github.com/ethz-asl/kalibr/wiki/camera-imu-calibration

Kalibr Toolbox

46

Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

bull Solves a non-linear Least-Squares optimization problem, similar to the one seen before, but also optimizes over $T_{BC}$ and $t_d$

bull Continuous-time modelling using splines for $X$

bull Numerical solver: Levenberg-Marquardt

$$
\{X, L, T_{BC}, t_d, b_a, b_g\} = \arg\min_{X, L, T_{BC}, t_d, b_a, b_g} \sum_{k=1}^{N} \left\| f(x_{k-1}, u) - x_k \right\|^2_{\Lambda_k} + \sum_{k=1}^{N} \sum_{i=1}^{M} \left\| \pi(x_k, L_i) - z_{ik} \right\|^2_{\Sigma_{ik}}
$$

The first term contains the IMU residuals, the second the reprojection residuals (Bundle Adjustment term).

Kalibr Toolbox

47

Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

bull Generates a report after optimizing the cost function

Residuals (mean, std):

Reprojection error [px]: 0.0976, 0.051

Gyroscope error [rad/s]: 0.0167, 0.009

Accelerometer error [m/s^2]: 0.0595, 0.031

Transformation T_ci (imu to cam):

[[ 0.99995526 -0.00934911 -0.00143776  0.00008436]
 [ 0.00936458  0.99989388  0.01115983  0.00197427]
 [ 0.00133327 -0.0111728   0.99993669 -0.05054946]
 [ 0.          0.          0.          1.        ]]

Time shift (delay d), cam0 to imu0 [s] (t_imu = t_cam + shift):

0.00270636270255

Popular Datasets for VIO VI-SLAM

48

EuRoC: drone with IMU and stereo camera

Blackbird: drone flying fast with VR goggles (synthetic images and real dynamics + IMU)

UZH FPV Drone Racing: drone flying fast with event camera, stereo camera, IMU

MVSEC: drone, motorcycle, car with event camera, stereo camera, 3D lidar, GPS, IMU

Understanding Check

Are you able to answer the following questions

bull Why is it recommended to use an IMU for Visual Odometry?

bull Why not just an IMU, without a camera?

bull What is the basic idea behind MEMS IMUs?

bull What is the drift of a consumer IMU?

bull What is the IMU measurement model (formula)?

bull What causes the bias in an IMU?

bull How do we model the bias?

bull How do we integrate the acceleration to get the position (formula)?

bull What is the definition of loosely coupled and tightly coupled visual inertial fusion?

bull How does non-linear optimization-based visual inertial odometry work? Can you write down the cost function and illustrate its meaning?

49

Page 30: Vision Algorithms for Mobile Robotics Lecture 01 Introduction

Case Study 2 SVO+GTSAM

Solves the same optimization problem as OKVIS but

bull Keeps all the frames (from the start to the end of the trajectory)

bull To make the optimization efficient

bull Marginalizes 3D landmarks (rather than triangulating landmarks, only their visual bearing measurements are kept, and the visual constraints are enforced by the Epipolar Line Distance (Lecture 08))

bull pre-integrates the IMU data between keyframes (see later)

bull Optimization solved using Factor Graphs via GTSAM

bull Very fast because it only optimizes the poses that are affected by a new observation

30

Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb. 2017. Best Paper Award 2018

SVO+GTSAM

31

Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb. 2017. Best Paper Award 2018

SVO+GTSAM

32

Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb. 2017. Best Paper Award 2018

Problem with IMU integration

bull The integration of IMU measurements $f(x_{k-1}, u)$ from $k-1$ to $k$ depends on the state estimate at time $k-1$

bull During optimization, every time the linearization point at $k-1$ changes, the integration between $k-1$ and $k$ must be re-evaluated, thus slowing down the optimization

bull Idea Preintegration

bull defines relative motion increments, expressed in the body frame, which are independent of the global position, orientation, and velocity at $k$ [1]

bull [2] uses this theory by leveraging the manifold structure of the rotation group SO(3)

33

[1] Lupton, Sukkarieh, Visual-inertial-aided navigation for high-dynamic motion in built environments without initial conditions, IEEE Transactions on Robotics (T-RO), 2012. PDF

[2] Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb. 2017. Best Paper Award 2018



Page 32: Vision Algorithms for Mobile Robotics Lecture 01 Introduction

SVO+GTSAM

32Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry

IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

Problem with IMU integration

bull The integration of IMU measurements 119891 119909119896minus1 119906 from 119896 minus 1 to 119896 is related to the state estimation at time 119896 minus 1

bull During optimization every time the linearization point at 119896 minus 1 changes the integration between 119896 minus 1and 119896 must be re-evaluated thus slowing down the optimization

bull Idea Preintegration

bull defines relative motion increments expressed in body frame which are independent on the global position orientation and velocity at 119896 [1]

bull [2] uses this theory by leveraging the manifold structure of the rotation group SO(3)

33

[1] Lupton Sukkarieh Visual-inertial-aided navigation for high-dynamic motion in built environments without initial conditions IEEE Transactions on Robotics (T-RO) 2012 PDF[2] Forster Carlone Dellaert Scaramuzza On-Manifold Preintegration for Real-Time Visual-Inertial Odometry IEEE Transactions on Robotics (T-RO) Feb 2017 Best Paper Award 2018

X L 119887119886 119887119892 = 119886119903119892119898119894119899X L 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

IMU Pre-Integration

34

120596 119886 Δ ෨119877 Δ 119907 Δ 119901

119877119896 119901119896 v119896

StandardEvaluate error in global frame

PreintegrationEvaluate relative errors

119942119877 = Δ ෨119877119879 Δ119877

119942v = Δv minus Δv

119942119901 = Δ 119901 minus Δ119901

119942119877 = 119877 120596 119877119896minus1119879119877119896

119942v = ොv(120596 119886 v119896minus1) minus v119896

119942119901 = Ƹ119901(120596 119886 119901119896minus1) minus 119901119896

Repeat integration when previous state changes

Preintegration of IMU deltas possible with no initial condition required

Prediction Estimate

GTSAM

bull GTSAM [1] is a library that implements sensor fusion based on factor graphs and Bayes trees

bull Factor graphs are a tool borrowed from machine learning [2] that

bull naturally capture the spatial dependences among different sensor measurements

bull can represent many robotics problems like pose graphs and SLAM

35

[1] Dellaert and Kaess Factor Graphs for Robot Perception Foundations and Trends in Robotics 2017 PDF Code[2] Kschischang Frey Loeliger Factor Graphs and the Sum-Product Algorithm Transactions on Information Theory 2001 PDF

GTSAM

bull GTSAM uses factor graphs and Bayes trees to implement incremental inference

bull A Bayes tree is a data structure a directed tree that visually represents the connectivity among the past robot statesand measurements

bull Incremental inference means that when new robot states or measurements are added to the graph only a local part of the tree is updated

36[1] Kaess Ranganathan Dellaer ldquoiSAM Incremental Smoothing and Mappingrdquo IEEE Transactions on Robotics (T-RO) 2008 PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

37

Non-linear optimization vs Filtering

38

Non-linear Optimization methods Filtering methods

Optimize a window of multiple states (or all the states) using non-linear Least-Squares optimization

Solve the same problem by running only one iteration of the optimization function (eg using Extended Kalman Filter (EKF))

Multiple iterations (it re-linearizes at each iteration)

Acheives the highest accuracy

Slower (but fast with GTSAM)

1 Linearization iteration only

Sensitive to linearization point

Fastest

Case Study 1 ROVIO

bull EKF state: X = {p(t), q(t), v(t), b_a(t), b_g(t), L_1, L_2, …, L_k}
bull Basic idea:

1. Prediction step: predicts the next position, velocity, orientation, and features using the IMU integration model

2. Measurement update: refines the state by leveraging the visual constraint (ROVIO minimizes the photometric error between corresponding points; an alternative would be the reprojection error)

39
Bloesch, Burri, Omari, Hutter, Siegwart, Iterated extended Kalman filter based visual-inertial odometry using direct photometric feedback, International Journal of Robotics Research (IJRR), 2017. PDF, Code
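To make the two steps concrete, here is a minimal 1-D EKF sketch (not ROVIO's actual filter): the prediction step integrates a hypothetical accelerometer reading at 100 Hz, and the update step fuses a position pseudo-measurement standing in for the visual constraint at 10 Hz. All values are made up.

```python
# Toy 1-D EKF with state x = [position, velocity]; hypothetical numbers.
# predict() plays the role of the IMU integration model, update() the
# role of the visual measurement update.

def predict(x, P, a, dt, q=1e-3):
    # x' = F x + B a, with F = [[1, dt], [0, 1]] and B = [0.5 dt^2, dt]
    p, v = x
    x = [p + v * dt + 0.5 * a * dt * dt, v + a * dt]
    F = [[1.0, dt], [0.0, 1.0]]
    # P' = F P F^T + Q (Q = q * I, additive process noise)
    FP = [[sum(F[i][k] * P[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
    P = [[sum(FP[i][k] * F[j][k] for k in range(2)) + (q if i == j else 0.0)
          for j in range(2)] for i in range(2)]
    return x, P

def update(x, P, z, r=1e-2):
    # H = [1, 0]: the "camera" observes position only
    y = z - x[0]                      # innovation
    s = P[0][0] + r                   # innovation covariance
    K = [P[0][0] / s, P[1][0] / s]    # Kalman gain
    x = [x[0] + K[0] * y, x[1] + K[1] * y]
    P = [[(1 - K[0]) * P[0][0], (1 - K[0]) * P[0][1]],
         [P[1][0] - K[1] * P[0][0], P[1][1] - K[1] * P[0][1]]]
    return x, P

x, P = [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]
for k in range(100):
    x, P = predict(x, P, a=0.1, dt=0.01)   # IMU at 100 Hz, constant accel
    if k % 10 == 9:                         # camera at 10 Hz
        x, P = update(x, P, z=0.5 * 0.1 * (0.01 * (k + 1)) ** 2)
```

After one second the filter state is close to the true position 0.5·a·t² = 0.05 m and velocity 0.1 m/s; without the updates, the covariance (and, with noisy data, the error) would keep growing.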

ROVIO Problems

bull Complexity of the EKF grows quadratically in the number of estimated landmarks
bull Thus, max 20 landmarks are tracked to allow real-time operation

bull Only updates the most recent state

40

Case Study 2 MSCKF

bull MSCKF (Multi-State Constraint Kalman Filter) updates multiple past poses p_C1, q_C1, …, p_CN, q_CN in addition to the current state p(t), q(t), v(t) (state vector below)

bull Prediction step: same as ROVIO

bull Measurement update: differently from ROVIO,
bull landmark positions are not added to the state vector, so it can run very fast independently of the number of features
bull the visual constraint is obtained from the Epipolar Line Distance (Lecture 08)

bull Used in spacecraft landing (NASAJPL Moon and Mars landing) DJI drones Google ARCore

bull Released open source within the OpenVins project: https://github.com/rpng/open_vins

41

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

X = {p(t), q(t), v(t), b_a(t), b_g(t), p_C1, q_C1, …, p_CN, q_CN}
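A sketch of the epipolar-line-distance residual mentioned above (Lecture 08), assuming the fundamental matrix F relating the two views is known; the matrix and pixel coordinates below are hypothetical.

```python
# Point-to-epipolar-line distance: the kind of visual residual a
# structureless filter can use instead of storing landmarks in the state.

def epipolar_line_distance(F, x1, x2):
    # l = F x1 is the epipolar line of x1 in image 2, l = (a, b, c)
    l = [sum(F[i][j] * x1[j] for j in range(3)) for i in range(3)]
    # distance from x2 = (u, v, 1) to the line a*u + b*v + c = 0
    num = abs(sum(l[i] * x2[i] for i in range(3)))
    den = (l[0] ** 2 + l[1] ** 2) ** 0.5
    return num / den

# Example: pure translation along x gives F of skew-symmetric form,
# so epipolar lines are horizontal and the residual is the vertical offset.
F = [[0.0, 0.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]
x1 = [100.0, 50.0, 1.0]   # homogeneous pixel coords in image 1
x2 = [120.0, 52.0, 1.0]   # 2 px off the (horizontal) epipolar line
d = epipolar_line_distance(F, x1, x2)  # -> 2.0
```

Minimizing this distance over the filter states constrains the relative camera poses without ever estimating the 3-D point itself.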

MSCKF running in Google ARCore (formerly Google Tango)

42

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometry
bull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

43

Camera-IMU Calibration

bull Goal: estimate the rigid-body transformation T_BC between the camera and the IMU, as well as the time offset t_d caused by communication delays and the internal sensor delays (introduced by filters and logic)

bull Assumptions: camera and IMU rigidly attached; camera intrinsically calibrated

bull Data

bull Image points from calibration pattern (checkerboard or QR board)

bull IMU measurements: accelerometer a_k and gyroscope ω_k

44

[Figure: timeline of image and IMU sample timestamps, offset by the delay t_d]

Kalibr Toolbox

45
Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

bull Code: https://github.com/ethz-asl/kalibr/wiki/camera-imu-calibration

Kalibr Toolbox

46
Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

bull Solves a non-linear Least-Squares optimization problem similar to the one seen before, but also optimizes over T_BC and t_d

bull Continuous-time modelling of X using splines

bull Numerical solver Levenberg-Marquardt

{X, L, T_BC, t_d, b_a, b_g} = argmin_{X, L, T_BC, t_d, b_a, b_g}  Σ_{k=1..N} ||f(x_{k−1}, u) − x_k||²_{Λk}  +  Σ_{k=1..N} Σ_{i=1..M} ||π(x_k, L_i) − z_ik||²_{Σik}

(first term: IMU residuals; second term: reprojection residuals, the Bundle Adjustment term)
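For the time offset t_d specifically, a common coarse initialization (which a joint optimization like Kalibr's then refines) is to cross-correlate the angular rate measured by the gyro with the angular rate derived from the camera poses. A toy sketch on a deterministic synthetic signal; all numbers are hypothetical.

```python
# Coarse camera-IMU time-offset estimate by cross-correlation of two
# resampled angular-rate signals. Synthetic data: the "camera" signal is
# the same sequence delayed by a known number of samples.
import random

dt = 0.005                         # both signals resampled to 200 Hz
true_lag = 6                       # t_d = 6 * dt = 0.03 s (t_imu = t_cam + t_d)

rng = random.Random(0)             # deterministic synthetic angular-rate signal
base = [rng.uniform(-1.0, 1.0) for _ in range(2100)]
gyro = base[50:2050]                         # sampled on the IMU clock
cam = base[50 - true_lag:2050 - true_lag]    # camera-derived rate, delayed by t_d

def best_shift(a, b, dt, max_lag):
    # argmax over integer lags of the cross-correlation sum(a[i] * b[i + lag])
    best, best_lag = -float("inf"), 0
    for lag in range(-max_lag, max_lag + 1):
        s = sum(a[i] * b[i + lag] for i in range(max_lag, len(a) - max_lag))
        if s > best:
            best, best_lag = s, lag
    return best_lag * dt

t_d = best_shift(gyro, cam, dt, max_lag=20)  # -> 0.03
```

This only recovers t_d to the resampling resolution dt; the continuous-time spline optimization then estimates the sub-sample remainder jointly with T_BC and the biases.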

Kalibr Toolbox

47
Furgale, Rehder, Siegwart, Unified Temporal and Spatial Calibration for Multi-Sensor Systems, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

bull Generates a report after optimizing the cost function

Residuals:

Reprojection error [px]: 0.0976, 0.051
Gyroscope error [rad/s]: 0.0167, 0.009
Accelerometer error [m/s^2]: 0.0595, 0.031

Transformation T_ci (imu to cam):

[[ 0.99995526 -0.00934911 -0.00143776  0.00008436]
 [ 0.00936458  0.99989388  0.01115983  0.00197427]
 [ 0.00133327 -0.0111728   0.99993669 -0.05054946]
 [ 0.          0.          0.          1.        ]]

Time shift (delay d), cam0 to imu0 [s] (t_imu = t_cam + shift):

0.00270636270255

Popular Datasets for VIO VI-SLAM

48

EuRoC: drone with IMU and stereo camera

Blackbird: drone flying fast with VR goggles (synthetic images and real dynamics + IMU)

UZH FPV Drone Racing: drone flying fast with event camera, stereo camera, IMU

MVSEC: drone, motorcycle, car with event camera, stereo camera, 3D lidar, GPS, IMU

Understanding Check

Are you able to answer the following questions

bull Why is it recommended to use an IMU for Visual Odometry

bull Why not just an IMU without a camera

bull What is the basic idea behind MEMS IMUs

bull What is the drift of a consumer IMU

bull What is the IMU measurement model (formula)

bull What causes the bias in an IMU

bull How do we model the bias

bull How do we integrate the acceleration to get the position (formula)

bull What is the definition of loosely coupled and tightly coupled visual inertial fusion

bull How does non-linear optimization-based visual inertial odometry work? Can you write down the cost function and illustrate its meaning?

49

Page 33: Vision Algorithms for Mobile Robotics Lecture 01 Introduction

Problem with IMU integration

bull The integration of IMU measurements f(x_{k−1}, u) from k − 1 to k depends on the state estimate at time k − 1

bull During the optimization, every time the linearization point at k − 1 changes, the integration between k − 1 and k must be re-evaluated, thus slowing down the optimization

bull Idea Preintegration

bull defines relative motion increments expressed in the body frame, which are independent of the global position, orientation, and velocity at k [1]

bull [2] uses this theory by leveraging the manifold structure of the rotation group SO(3)

33

[1] Lupton, Sukkarieh, Visual-inertial-aided navigation for high-dynamic motion in built environments without initial conditions, IEEE Transactions on Robotics (T-RO), 2012. PDF
[2] Forster, Carlone, Dellaert, Scaramuzza, On-Manifold Preintegration for Real-Time Visual-Inertial Odometry, IEEE Transactions on Robotics (T-RO), Feb. 2017. Best Paper Award 2018

{X, L, b_a, b_g} = argmin_{X, L, b_a, b_g}  Σ_{k=1..N} ||f(x_{k−1}, u) − x_k||²_{Λk}  +  Σ_{k=1..N} Σ_{i=1..M} ||π(x_k, L_i) − z_ik||²_{Σik}

(first term: IMU residuals; second term: reprojection residuals, the Bundle Adjustment term)
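As a sanity check of what the two terms mean, here is a toy 1-D evaluation of this cost, with a trivial propagation model f and a 1-D stand-in for the projection π; all values are hypothetical.

```python
# Toy 1-D evaluation of the two-term VIO cost: IMU residuals
# ||f(x_{k-1}, u) - x_k||^2 plus reprojection residuals
# ||pi(x_k, L_i) - z_ik||^2 (unit information weights for simplicity).

def vio_cost(xs, landmarks, u, z, w_imu=1.0, w_repr=1.0):
    # f: toy propagation x_k = x_{k-1} + u (stand-in for IMU integration)
    imu = sum(w_imu * ((xs[k - 1] + u) - xs[k]) ** 2 for k in range(1, len(xs)))
    # pi: 1-D "projection" = landmark position relative to the state
    repr_ = sum(w_repr * ((L - xs[k]) - z[k][i]) ** 2
                for k in range(len(xs)) for i, L in enumerate(landmarks))
    return imu + repr_

xs = [0.0, 1.0, 2.1]            # estimated states x_0, x_1, x_2
landmarks = [5.0]               # one estimated landmark
z = [[5.0], [4.0], [2.9]]       # observed relative positions z_ik
c = vio_cost(xs, landmarks, u=1.0, z=z)  # -> 0.01
```

Only the second state disagrees with the IMU propagation (2.1 instead of 2.0), so the cost is exactly that one squared residual; the optimizer would move the states and landmarks to shrink both terms jointly.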

IMU Pre-Integration

34

Measurements ω̃, ã are summarized into the preintegrated deltas ΔR̃, Δṽ, Δp̃; the states are R_k, p_k, v_k
(ˆ/˜ = prediction from the IMU measurements; plain symbols = estimates)

Standard: evaluate the error in the global frame
e_R = R̂(ω, R_{k−1})^T R_k
e_v = v̂(ω, a, v_{k−1}) − v_k
e_p = p̂(ω, a, p_{k−1}) − p_k
→ the integration must be repeated whenever the previous state changes

Preintegration: evaluate relative errors
e_R = ΔR̃^T ΔR
e_v = Δṽ − Δv
e_p = Δp̃ − Δp
→ preintegration of the IMU deltas is possible, with no initial condition required
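A minimal planar (2-D) sketch of the idea, with biases, noise, and gravity omitted: the deltas are computed once from the raw measurements between k−1 and k, and can then be composed with any state at k−1 without re-integration.

```python
# Planar IMU preintegration sketch. preintegrate() depends only on the
# gyro/accel samples, not on the global state; predict() composes the
# deltas with an arbitrary state at k-1.
import math

def preintegrate(omegas, accs, dt):
    dtheta, dv, dp = 0.0, [0.0, 0.0], [0.0, 0.0]
    for w, a in zip(omegas, accs):               # a is in the body frame
        c, s = math.cos(dtheta), math.sin(dtheta)
        aw = [c * a[0] - s * a[1], s * a[0] + c * a[1]]  # rotate into frame k-1
        dp = [dp[0] + dv[0] * dt + 0.5 * aw[0] * dt * dt,
              dp[1] + dv[1] * dt + 0.5 * aw[1] * dt * dt]
        dv = [dv[0] + aw[0] * dt, dv[1] + aw[1] * dt]
        dtheta += w * dt
    return dtheta, dv, dp

def predict(theta, v, p, deltas, T):
    # Compose the state at k-1 with the preintegrated deltas over duration T
    dtheta, dv, dp = deltas
    c, s = math.cos(theta), math.sin(theta)
    rot = lambda x: [c * x[0] - s * x[1], s * x[0] + c * x[1]]
    dpw, dvw = rot(dp), rot(dv)
    return (theta + dtheta,
            [v[0] + dvw[0], v[1] + dvw[1]],
            [p[0] + v[0] * T + dpw[0], p[1] + v[1] * T + dpw[1]])

# 1 s of data: no rotation, constant 0.1 m/s^2 body acceleration along x
deltas = preintegrate([0.0] * 100, [[0.1, 0.0]] * 100, dt=0.01)
theta, v, p = predict(0.0, [0.0, 0.0], [0.0, 0.0], deltas, T=1.0)
```

If the optimizer later changes the state at k−1, only the cheap composition in predict() is repeated; the loop over the raw samples is not, which is exactly the saving preintegration buys.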


Page 34: Vision Algorithms for Mobile Robotics Lecture 01 Introduction

IMU Pre-Integration

34

120596 119886 Δ ෨119877 Δ 119907 Δ 119901

119877119896 119901119896 v119896

StandardEvaluate error in global frame

PreintegrationEvaluate relative errors

119942119877 = Δ ෨119877119879 Δ119877

119942v = Δv minus Δv

119942119901 = Δ 119901 minus Δ119901

119942119877 = 119877 120596 119877119896minus1119879119877119896

119942v = ොv(120596 119886 v119896minus1) minus v119896

119942119901 = Ƹ119901(120596 119886 119901119896minus1) minus 119901119896

Repeat integration when previous state changes

Preintegration of IMU deltas possible with no initial condition required

Prediction Estimate

GTSAM

bull GTSAM [1] is a library that implements sensor fusion based on factor graphs and Bayes trees

bull Factor graphs are a tool borrowed from machine learning [2] that

bull naturally capture the spatial dependences among different sensor measurements

bull can represent many robotics problems like pose graphs and SLAM

35

[1] Dellaert and Kaess Factor Graphs for Robot Perception Foundations and Trends in Robotics 2017 PDF Code[2] Kschischang Frey Loeliger Factor Graphs and the Sum-Product Algorithm Transactions on Information Theory 2001 PDF

GTSAM

bull GTSAM uses factor graphs and Bayes trees to implement incremental inference

bull A Bayes tree is a data structure a directed tree that visually represents the connectivity among the past robot statesand measurements

bull Incremental inference means that when new robot states or measurements are added to the graph only a local part of the tree is updated

36[1] Kaess Ranganathan Dellaer ldquoiSAM Incremental Smoothing and Mappingrdquo IEEE Transactions on Robotics (T-RO) 2008 PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

37

Non-linear optimization vs Filtering

38

Non-linear Optimization methods Filtering methods

Optimize a window of multiple states (or all the states) using non-linear Least-Squares optimization

Solve the same problem by running only one iteration of the optimization function (eg using Extended Kalman Filter (EKF))

Multiple iterations (it re-linearizes at each iteration)

Acheives the highest accuracy

Slower (but fast with GTSAM)

1 Linearization iteration only

Sensitive to linearization point

Fastest

Case Study 1 ROVIO

bull EKF state X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905 1198711 1198712 hellip 119871119896bull Basic idea

1 Prediction step predicts next position velocity orientation and features using IMU integration model

2 Measurement update refines state by leveraging visual constraint (ROVIO minimizes the photometric error between corresponding points (alternative would be the reprojection error))

39Bloesch Burri Omari Hutter Siegwart Iterated extended Kalman filter based visual-inertial odometry using direct photometric feedback

International Journal of Robotics Research (IJRR) 2017 PDF Code

ROVIO Problems

bull Complexity of the EKF grows quadratically in the number of estimated landmarksbull Thus max 20 landmarks are tracked to allow real-time operation

bull Only updates the most recent state

40

Case Study 2 MSCKF

bull MSCKF (Multi-State Constraint Kalman Filter) updates multiple past poses 119901C1 119902C1 hellip 119901C119873 119902CN in addition to the current state 119901 119905 119902 119905 119907 119905 State vector

bull Prediction step same as ROVIO

bull Measurement updatebull Differently from ROVIO

bull landmark positions are not added to the state vector thus can run very fast independently of the number of featuresbull Visual constraint is obtained from the Epipolar Line Distance (Lecture 08)

bull Used in spacecraft landing (NASAJPL Moon and Mars landing) DJI drones Google ARCore

bull Released open source within the OpenVins project httpsgithubcomrpngopen_vins

41

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905 119901C1 119902C1 hellip 119901C119873 119902CN

MSCKF running in Google ARCore (former Google Tango)

42

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

43

Camera-IMU Calibration

bull Goal estimate the rigid-body transformation 119931119913119914 and time offset 119957119941 between the camera and the IMU caused by communication delays and the internal sensor delays (introduced by filters and logic)

bull Assumptions Camera and IMU rigidly attached Camera intrinsically calibrated

bull Data

bull Image points from calibration pattern (checkerboard or QR board)

bull IMU measurements accelerometer 119886119896 and gyroscope 120596119896

44

119905

images

IMU

119957119941

Kalibr Toolbox

45Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Code httpsgithubcomethz-aslkalibrwikicamera-imu-calibration

Kalibr Toolbox

46Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Solves a non-linear Least Square optimization problem similar to that seen before but also optimizes over 119879119861119862 119905119889

bull Continuous-time modelling using splines for 119883

bull Numerical solver Levenberg-Marquardt

119883 119871 119879119861119862 119905119889 119887119886 119887119892 = 119886119903119892119898119894119899119883 119871119879119861119862119905119889 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

Kalibr Toolbox

47Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Generates a report after optimizing the cost function

Residuals

Reprojection error [px] 00976 0051

Gyroscope error [rads] 00167 0009

Accelerometer error [ms^2] 00595 0031

Transformation T_ci (imu to cam)

[[ 099995526 -000934911 -000143776 000008436]

[ 000936458 099989388 001115983 000197427]

[ 000133327 -00111728 099993669 -005054946]

[ 0 0 0 1 ]]

Time shift (delay d)

cam0 to imu0 [s] (t_imu = t_cam + shift)

000270636270255

Popular Datasets for VIO VI-SLAM

48

EuRoC

Drone with IMU and stereo camera

Blackbird

Drone flying fast with VR goggles (synthetic images and real dynamics + IMU)

UZH FPV Drone Racing

Drone flying fast with event camera stereo camera IMU

MVSEC

Drone motorcycle car with event camera stereo camera

3D lidar GPS IMU

Understanding Check

Are you able to answer the following questions

bull Why is it recommended to use an IMU for Visual Odometry

bull Why not just an IMU without a camera

bull What is the basic idea behind MEMS IMUs

bull What is the drift of a consumer IMU

bull What is the IMU measurement model (formula)

bull What causes the bias in an IMU

bull How do we model the bias

bull How do we integrate the acceleration to get the position (formula)

bull What is the definition of loosely coupled and tightly coupled visual inertial fusion

bull How does non-linear optimization-based visual inertial odometry Can you write down the cost function and illustrate its meaning

49

Page 35: Vision Algorithms for Mobile Robotics Lecture 01 Introduction

GTSAM

bull GTSAM [1] is a library that implements sensor fusion based on factor graphs and Bayes trees

bull Factor graphs are a tool borrowed from machine learning [2] that

bull naturally capture the spatial dependences among different sensor measurements

bull can represent many robotics problems like pose graphs and SLAM

35

[1] Dellaert and Kaess Factor Graphs for Robot Perception Foundations and Trends in Robotics 2017 PDF Code[2] Kschischang Frey Loeliger Factor Graphs and the Sum-Product Algorithm Transactions on Information Theory 2001 PDF

GTSAM

bull GTSAM uses factor graphs and Bayes trees to implement incremental inference

bull A Bayes tree is a data structure a directed tree that visually represents the connectivity among the past robot statesand measurements

bull Incremental inference means that when new robot states or measurements are added to the graph only a local part of the tree is updated

36[1] Kaess Ranganathan Dellaer ldquoiSAM Incremental Smoothing and Mappingrdquo IEEE Transactions on Robotics (T-RO) 2008 PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

37

Non-linear optimization vs Filtering

38

Non-linear Optimization methods Filtering methods

Optimize a window of multiple states (or all the states) using non-linear Least-Squares optimization

Solve the same problem by running only one iteration of the optimization function (eg using Extended Kalman Filter (EKF))

Multiple iterations (it re-linearizes at each iteration)

Acheives the highest accuracy

Slower (but fast with GTSAM)

1 Linearization iteration only

Sensitive to linearization point

Fastest

Case Study 1 ROVIO

bull EKF state X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905 1198711 1198712 hellip 119871119896bull Basic idea

1 Prediction step predicts next position velocity orientation and features using IMU integration model

2 Measurement update refines state by leveraging visual constraint (ROVIO minimizes the photometric error between corresponding points (alternative would be the reprojection error))

39Bloesch Burri Omari Hutter Siegwart Iterated extended Kalman filter based visual-inertial odometry using direct photometric feedback

International Journal of Robotics Research (IJRR) 2017 PDF Code

ROVIO Problems

bull Complexity of the EKF grows quadratically in the number of estimated landmarksbull Thus max 20 landmarks are tracked to allow real-time operation

bull Only updates the most recent state

40

Case Study 2 MSCKF

bull MSCKF (Multi-State Constraint Kalman Filter) updates multiple past poses 119901C1 119902C1 hellip 119901C119873 119902CN in addition to the current state 119901 119905 119902 119905 119907 119905 State vector

bull Prediction step same as ROVIO

bull Measurement updatebull Differently from ROVIO

bull landmark positions are not added to the state vector thus can run very fast independently of the number of featuresbull Visual constraint is obtained from the Epipolar Line Distance (Lecture 08)

bull Used in spacecraft landing (NASAJPL Moon and Mars landing) DJI drones Google ARCore

bull Released open source within the OpenVins project httpsgithubcomrpngopen_vins

41

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905 119901C1 119902C1 hellip 119901C119873 119902CN

MSCKF running in Google ARCore (former Google Tango)

42

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

43

Camera-IMU Calibration

bull Goal estimate the rigid-body transformation 119931119913119914 and time offset 119957119941 between the camera and the IMU caused by communication delays and the internal sensor delays (introduced by filters and logic)

bull Assumptions Camera and IMU rigidly attached Camera intrinsically calibrated

bull Data

bull Image points from calibration pattern (checkerboard or QR board)

bull IMU measurements accelerometer 119886119896 and gyroscope 120596119896

44

119905

images

IMU

119957119941

Kalibr Toolbox

45Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Code httpsgithubcomethz-aslkalibrwikicamera-imu-calibration

Kalibr Toolbox

46Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Solves a non-linear Least Square optimization problem similar to that seen before but also optimizes over 119879119861119862 119905119889

bull Continuous-time modelling using splines for 119883

bull Numerical solver Levenberg-Marquardt

119883 119871 119879119861119862 119905119889 119887119886 119887119892 = 119886119903119892119898119894119899119883 119871119879119861119862119905119889 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

Kalibr Toolbox

47Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Generates a report after optimizing the cost function

Residuals

Reprojection error [px] 00976 0051

Gyroscope error [rads] 00167 0009

Accelerometer error [ms^2] 00595 0031

Transformation T_ci (imu to cam)

[[ 099995526 -000934911 -000143776 000008436]

[ 000936458 099989388 001115983 000197427]

[ 000133327 -00111728 099993669 -005054946]

[ 0 0 0 1 ]]

Time shift (delay d)

cam0 to imu0 [s] (t_imu = t_cam + shift)

000270636270255

Popular Datasets for VIO VI-SLAM

48

EuRoC

Drone with IMU and stereo camera

Blackbird

Drone flying fast with VR goggles (synthetic images and real dynamics + IMU)

UZH FPV Drone Racing

Drone flying fast with event camera stereo camera IMU

MVSEC

Drone motorcycle car with event camera stereo camera

3D lidar GPS IMU

Understanding Check

Are you able to answer the following questions

bull Why is it recommended to use an IMU for Visual Odometry

bull Why not just an IMU without a camera

bull What is the basic idea behind MEMS IMUs

bull What is the drift of a consumer IMU

bull What is the IMU measurement model (formula)

bull What causes the bias in an IMU

bull How do we model the bias

bull How do we integrate the acceleration to get the position (formula)

bull What is the definition of loosely coupled and tightly coupled visual inertial fusion

bull How does non-linear optimization-based visual inertial odometry Can you write down the cost function and illustrate its meaning

49

Page 36: Vision Algorithms for Mobile Robotics Lecture 01 Introduction

GTSAM

• GTSAM uses factor graphs and Bayes trees to implement incremental inference

• A Bayes tree is a data structure (a directed tree) that represents the connectivity among the past robot states and measurements

• Incremental inference means that when new robot states or measurements are added to the graph, only a local part of the tree is updated

[1] Kaess, Ranganathan, Dellaert, "iSAM: Incremental Smoothing and Mapping", IEEE Transactions on Robotics (T-RO), 2008. PDF
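A minimal sketch of why incremental inference is cheap, written in information (least-squares) form rather than with GTSAM's actual Bayes-tree machinery: a new factor only touches the information-matrix blocks of the variables it involves, so only that local part needs re-elimination. The 1D robot and all variable names are illustrative:

```python
import numpy as np

def add_odometry_factor(H, b, i, j, meas, sigma=0.1):
    """Add the factor (x_j - x_i - meas)^2 / sigma^2 in information form."""
    w = 1.0 / sigma**2
    H[i, i] += w;  H[j, j] += w
    H[i, j] -= w;  H[j, i] -= w
    b[i] -= w * meas
    b[j] += w * meas

n = 4
H = np.zeros((n, n)); b = np.zeros(n)
H[0, 0] += 1e6                        # strong prior pinning x0 = 0
for k in range(2):                    # odometry factors x0->x1, x1->x2
    add_odometry_factor(H, b, k, k + 1, 1.0)

H_before = H.copy()
add_odometry_factor(H, b, 2, 3, 1.0)  # a new measurement arrives

# Only the blocks of the involved variables (2 and 3) changed; a Bayes
# tree would re-eliminate just that part of the tree.
changed = np.argwhere(H != H_before)
print(changed)                        # rows/cols 2 and 3 only
x = np.linalg.solve(H, b)
print(np.round(x, 3))                 # ~[0, 1, 2, 3]
```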

Outline

• What is an IMU and why do we need it?

• IMU model

• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods

• Camera-IMU extrinsic calibration and synchronization

Non-linear optimization vs Filtering

Non-linear optimization methods:
• Optimize a window of multiple states (or all the states) using non-linear Least-Squares optimization
• Multiple iterations (re-linearizes at each iteration)
• Achieves the highest accuracy
• Slower (but fast with GTSAM)

Filtering methods:
• Solve the same problem by running only one iteration of the optimization function (e.g., using an Extended Kalman Filter (EKF))
• One linearization iteration only
• Sensitive to the linearization point
• Fastest
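The contrast above can be made concrete on a scalar toy problem: estimating x from a prior and one nonlinear measurement z = x². A filter-style update linearizes once (one Gauss-Newton step); an optimizer re-linearizes until convergence. All numbers are illustrative, not from any real VIO system:

```python
import numpy as np

def gauss_newton(x0, z, prior, n_iters, sigma_z=0.1, sigma_p=1.0):
    """Minimize ((z - x^2)/sigma_z)^2 + ((prior - x)/sigma_p)^2 by Gauss-Newton."""
    x = x0
    for _ in range(n_iters):
        J = np.array([[2 * x / sigma_z], [1.0 / sigma_p]])        # stacked Jacobian
        r = np.array([(z - x**2) / sigma_z, (prior - x) / sigma_p])
        dx = np.linalg.lstsq(J, r, rcond=None)[0][0]
        x = x + dx                                                # re-linearize here
    return x

x_filter = gauss_newton(1.0, z=4.0, prior=1.0, n_iters=1)   # EKF-like: one step
x_optim  = gauss_newton(1.0, z=4.0, prior=1.0, n_iters=20)  # iterated to convergence
print(x_filter, x_optim)   # the iterated estimate is much closer to sqrt(4) = 2
```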

Case Study 1: ROVIO

• EKF state: X = { p(t), q(t), v(t), b_a(t), b_g(t), L_1, L_2, …, L_K }
• Basic idea:

1. Prediction step: predicts the next position, velocity, orientation, and features using the IMU integration model

2. Measurement update: refines the state by leveraging a visual constraint (ROVIO minimizes the photometric error between corresponding points; an alternative would be the reprojection error)

Bloesch, Burri, Omari, Hutter, Siegwart, "Iterated extended Kalman filter based visual-inertial odometry using direct photometric feedback", International Journal of Robotics Research (IJRR), 2017. PDF, Code
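The prediction step above boils down to dead-reckoning the IMU between updates. A minimal 2D sketch, assuming a scalar heading and illustrative variable names (a real filter propagates a 3D quaternion and the covariance as well):

```python
import numpy as np

def predict(p, v, theta, a_meas, w_meas, b_a, b_g, dt,
            g=np.array([0.0, -9.81])):
    """One IMU-integration step: bias-correct, rotate to world, integrate."""
    w = w_meas - b_g                       # bias-corrected angular rate
    a_body = a_meas - b_a                  # bias-corrected specific force
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    a_world = R @ a_body + g               # remove gravity in the world frame
    p = p + v * dt + 0.5 * a_world * dt**2
    v = v + a_world * dt
    theta = theta + w * dt
    return p, v, theta

# Hover-like case: the accelerometer senses only the reaction to gravity,
# so the predicted state should stay still.
p, v, theta = np.zeros(2), np.zeros(2), 0.0
for _ in range(100):
    p, v, theta = predict(p, v, theta,
                          a_meas=np.array([0.0, 9.81]), w_meas=0.0,
                          b_a=np.zeros(2), b_g=0.0, dt=0.01)
print(p, v, theta)   # ~[0, 0], ~[0, 0], 0.0
```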

ROVIO: Problems

• The complexity of the EKF grows quadratically with the number of estimated landmarks
• Thus, at most ~20 landmarks are tracked to allow real-time operation
• Only updates the most recent state

Case Study 2: MSCKF

• MSCKF (Multi-State Constraint Kalman Filter) updates multiple past poses p_C1, q_C1, …, p_CN, q_CN in addition to the current state p(t), q(t), v(t)

• State vector: X = { p(t), q(t), v(t), b_a(t), b_g(t), p_C1, q_C1, …, p_CN, q_CN }

• Prediction step: same as ROVIO

• Measurement update, differently from ROVIO:
  • landmark positions are not added to the state vector, thus it can run very fast, independently of the number of features
  • the visual constraint is obtained from the epipolar line distance (Lecture 08)

• Used in spacecraft landing (NASA/JPL Moon and Mars landing), DJI drones, Google ARCore

• Released open source within the OpenVINS project: https://github.com/rpng/open_vins

[1] Mourikis, Roumeliotis, "A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation", International Conference on Robotics and Automation (ICRA), 2007. PDF

[2] Li, Mourikis, "High-precision, consistent EKF-based visual–inertial odometry", International Journal of Robotics Research (IJRR), 2013. PDF

MSCKF running in Google ARCore (former Google Tango)
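The MSCKF's state bookkeeping can be sketched in a few lines: on each new image the current pose is cloned into a fixed-size window of past camera poses, and landmarks are never added, so the state dimension stays bounded. This is an illustrative sketch of the bookkeeping only, not the filter itself:

```python
from collections import deque

N = 5                                   # max past camera poses kept in the state
current = {"p": [0.0, 0.0, 0.0], "q": [1.0, 0.0, 0.0, 0.0], "v": [0.0, 0.0, 0.0]}
past_poses = deque(maxlen=N)            # holds (frame id, p_Ci, q_Ci) clones

for k in range(8):                      # 8 images arrive
    # clone the current pose into the sliding window; the deque silently
    # drops (marginalizes, in the real filter) the oldest clone when full
    past_poses.append((k, list(current["p"]), list(current["q"])))

print(len(past_poses))                  # 5: state dimension stays bounded
print(past_poses[0][0])                 # 3: clones for frames 0-2 were dropped
```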

Outline

• What is an IMU and why do we need it?

• IMU model

• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods

• Camera-IMU extrinsic calibration and synchronization

Camera-IMU Calibration

• Goal: estimate the rigid-body transformation T_BC and the time offset t_d between the camera and the IMU, caused by communication delays and internal sensor delays (introduced by filters and logic)

• Assumptions: camera and IMU rigidly attached; camera intrinsically calibrated

• Data:
  • Image points from a calibration pattern (checkerboard or QR board)
  • IMU measurements: accelerometer a_k and gyroscope ω_k

[Figure: timeline of image and IMU timestamps, offset by the delay t_d]

Kalibr Toolbox

Furgale, Rehder, Siegwart, "Unified Temporal and Spatial Calibration for Multi-Sensor Systems", IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

• Code: https://github.com/ethz-asl/kalibr/wiki/camera-imu-calibration

Kalibr Toolbox

Furgale, Rehder, Siegwart, "Unified Temporal and Spatial Calibration for Multi-Sensor Systems", IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

• Solves a non-linear Least-Squares optimization problem similar to the one seen before, but also optimizes over T_BC and t_d
• Continuous-time modeling using splines for X
• Numerical solver: Levenberg-Marquardt

{X, L, T_BC, t_d, b_a, b_g} = argmin_{X, L, T_BC, t_d, b_a, b_g}  Σ_{k=1}^{N} ‖f(x_{k−1}, u) − x_k‖²_{Λ_k}  +  Σ_{k=1}^{N} Σ_{i=1}^{M} ‖π(x_k, L_i) − z_{ik}‖²_{Σ_{ik}}

where the first sum contains the IMU residuals and the double sum the reprojection residuals (the Bundle Adjustment term).
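An intuition for why the time offset t_d is observable: the gyroscope and the camera-derived rotation rate sense the same signal, shifted by t_d, so cross-correlating them recovers the shift. The toy sketch below uses synthetic signals and an illustrative 25 ms delay; it is not Kalibr's actual spline-based solver:

```python
import numpy as np

dt = 0.005                              # 200 Hz sampling
t = np.arange(0.0, 10.0, dt)
true_shift = 0.025                      # assumed delay: 25 ms = 5 samples

def rot_rate(t):                        # synthetic angular-rate signal
    return np.sin(2 * np.pi * 0.7 * t) + 0.5 * np.sin(2 * np.pi * 2.1 * t)

gyro = rot_rate(t)                      # IMU timeline
cam_rate = rot_rate(t - true_shift)     # camera timeline, delayed by t_d

# cross-correlate the zero-mean signals and pick the lag of maximum correlation
corr = np.correlate(cam_rate - cam_rate.mean(), gyro - gyro.mean(), "full")
lag = np.argmax(corr) - (len(t) - 1)
print(lag * dt)                         # ~0.025 s
```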

Kalibr Toolbox

Furgale, Rehder, Siegwart, "Unified Temporal and Spatial Calibration for Multi-Sensor Systems", IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

• Generates a report after optimizing the cost function:

Residuals:
  Reprojection error [px]:      0.0976, 0.051
  Gyroscope error [rad/s]:      0.0167, 0.009
  Accelerometer error [m/s^2]:  0.0595, 0.031

Transformation T_ci (imu to cam):
  [[ 0.99995526 -0.00934911 -0.00143776  0.00008436]
   [ 0.00936458  0.99989388  0.01115983  0.00197427]
   [ 0.00133327 -0.0111728   0.99993669 -0.05054946]
   [ 0.          0.          0.          1.        ]]

Time shift (delay d):
  cam0 to imu0 [s] (t_imu = t_cam + shift):
  0.00270636270255

Popular Datasets for VIO / VI-SLAM

• EuRoC: drone with IMU and stereo camera
• Blackbird: drone flying fast with VR goggles (synthetic images and real dynamics + IMU)
• UZH FPV Drone Racing: drone flying fast with event camera, stereo camera, IMU
• MVSEC: drone, motorcycle, car with event camera, stereo camera, 3D lidar, GPS, IMU

Understanding Check

Are you able to answer the following questions?

• Why is it recommended to use an IMU for Visual Odometry?
• Why not just an IMU without a camera?
• What is the basic idea behind MEMS IMUs?
• What is the drift of a consumer IMU?
• What is the IMU measurement model (formula)?
• What causes the bias in an IMU?
• How do we model the bias?
• How do we integrate the acceleration to get the position (formula)?
• What is the definition of loosely coupled and tightly coupled visual inertial fusion?
• How does non-linear optimization-based visual inertial odometry work? Can you write down the cost function and illustrate its meaning?

Page 37: Vision Algorithms for Mobile Robotics Lecture 01 Introduction

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

37

Non-linear optimization vs Filtering

38

Non-linear Optimization methods Filtering methods

Optimize a window of multiple states (or all the states) using non-linear Least-Squares optimization

Solve the same problem by running only one iteration of the optimization function (eg using Extended Kalman Filter (EKF))

Multiple iterations (it re-linearizes at each iteration)

Acheives the highest accuracy

Slower (but fast with GTSAM)

1 Linearization iteration only

Sensitive to linearization point

Fastest

Case Study 1 ROVIO

bull EKF state X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905 1198711 1198712 hellip 119871119896bull Basic idea

1 Prediction step predicts next position velocity orientation and features using IMU integration model

2 Measurement update refines state by leveraging visual constraint (ROVIO minimizes the photometric error between corresponding points (alternative would be the reprojection error))

39Bloesch Burri Omari Hutter Siegwart Iterated extended Kalman filter based visual-inertial odometry using direct photometric feedback

International Journal of Robotics Research (IJRR) 2017 PDF Code

ROVIO Problems

bull Complexity of the EKF grows quadratically in the number of estimated landmarksbull Thus max 20 landmarks are tracked to allow real-time operation

bull Only updates the most recent state

40

Case Study 2 MSCKF

bull MSCKF (Multi-State Constraint Kalman Filter) updates multiple past poses 119901C1 119902C1 hellip 119901C119873 119902CN in addition to the current state 119901 119905 119902 119905 119907 119905 State vector

bull Prediction step same as ROVIO

bull Measurement updatebull Differently from ROVIO

bull landmark positions are not added to the state vector thus can run very fast independently of the number of featuresbull Visual constraint is obtained from the Epipolar Line Distance (Lecture 08)

bull Used in spacecraft landing (NASAJPL Moon and Mars landing) DJI drones Google ARCore

bull Released open source within the OpenVins project httpsgithubcomrpngopen_vins

41

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905 119901C1 119902C1 hellip 119901C119873 119902CN

MSCKF running in Google ARCore (former Google Tango)

42

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

43

Camera-IMU Calibration

bull Goal estimate the rigid-body transformation 119931119913119914 and time offset 119957119941 between the camera and the IMU caused by communication delays and the internal sensor delays (introduced by filters and logic)

bull Assumptions Camera and IMU rigidly attached Camera intrinsically calibrated

bull Data

bull Image points from calibration pattern (checkerboard or QR board)

bull IMU measurements accelerometer 119886119896 and gyroscope 120596119896

44

119905

images

IMU

119957119941

Kalibr Toolbox

45Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Code httpsgithubcomethz-aslkalibrwikicamera-imu-calibration

Kalibr Toolbox

46Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Solves a non-linear Least Square optimization problem similar to that seen before but also optimizes over 119879119861119862 119905119889

bull Continuous-time modelling using splines for 119883

bull Numerical solver Levenberg-Marquardt

119883 119871 119879119861119862 119905119889 119887119886 119887119892 = 119886119903119892119898119894119899119883 119871119879119861119862119905119889 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

Kalibr Toolbox

47Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Generates a report after optimizing the cost function

Residuals

Reprojection error [px] 00976 0051

Gyroscope error [rads] 00167 0009

Accelerometer error [ms^2] 00595 0031

Transformation T_ci (imu to cam)

[[ 099995526 -000934911 -000143776 000008436]

[ 000936458 099989388 001115983 000197427]

[ 000133327 -00111728 099993669 -005054946]

[ 0 0 0 1 ]]

Time shift (delay d)

cam0 to imu0 [s] (t_imu = t_cam + shift)

000270636270255

Popular Datasets for VIO VI-SLAM

48

EuRoC

Drone with IMU and stereo camera

Blackbird

Drone flying fast with VR goggles (synthetic images and real dynamics + IMU)

UZH FPV Drone Racing

Drone flying fast with event camera stereo camera IMU

MVSEC

Drone motorcycle car with event camera stereo camera

3D lidar GPS IMU

Understanding Check

Are you able to answer the following questions

bull Why is it recommended to use an IMU for Visual Odometry

bull Why not just an IMU without a camera

bull What is the basic idea behind MEMS IMUs

bull What is the drift of a consumer IMU

bull What is the IMU measurement model (formula)

bull What causes the bias in an IMU

bull How do we model the bias

bull How do we integrate the acceleration to get the position (formula)

bull What is the definition of loosely coupled and tightly coupled visual inertial fusion

bull How does non-linear optimization-based visual inertial odometry Can you write down the cost function and illustrate its meaning

49

Page 38: Vision Algorithms for Mobile Robotics Lecture 01 Introduction

Non-linear optimization vs Filtering

38

Non-linear Optimization methods Filtering methods

Optimize a window of multiple states (or all the states) using non-linear Least-Squares optimization

Solve the same problem by running only one iteration of the optimization function (eg using Extended Kalman Filter (EKF))

Multiple iterations (it re-linearizes at each iteration)

Acheives the highest accuracy

Slower (but fast with GTSAM)

1 Linearization iteration only

Sensitive to linearization point

Fastest

Case Study 1 ROVIO

bull EKF state X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905 1198711 1198712 hellip 119871119896bull Basic idea

1 Prediction step predicts next position velocity orientation and features using IMU integration model

2 Measurement update refines state by leveraging visual constraint (ROVIO minimizes the photometric error between corresponding points (alternative would be the reprojection error))

39Bloesch Burri Omari Hutter Siegwart Iterated extended Kalman filter based visual-inertial odometry using direct photometric feedback

International Journal of Robotics Research (IJRR) 2017 PDF Code

ROVIO Problems

bull Complexity of the EKF grows quadratically in the number of estimated landmarksbull Thus max 20 landmarks are tracked to allow real-time operation

bull Only updates the most recent state

40

Case Study 2 MSCKF

bull MSCKF (Multi-State Constraint Kalman Filter) updates multiple past poses 119901C1 119902C1 hellip 119901C119873 119902CN in addition to the current state 119901 119905 119902 119905 119907 119905 State vector

bull Prediction step same as ROVIO

bull Measurement updatebull Differently from ROVIO

bull landmark positions are not added to the state vector thus can run very fast independently of the number of featuresbull Visual constraint is obtained from the Epipolar Line Distance (Lecture 08)

bull Used in spacecraft landing (NASAJPL Moon and Mars landing) DJI drones Google ARCore

bull Released open source within the OpenVins project httpsgithubcomrpngopen_vins

41

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905 119901C1 119902C1 hellip 119901C119873 119902CN

MSCKF running in Google ARCore (former Google Tango)

42

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

43

Camera-IMU Calibration

bull Goal estimate the rigid-body transformation 119931119913119914 and time offset 119957119941 between the camera and the IMU caused by communication delays and the internal sensor delays (introduced by filters and logic)

bull Assumptions Camera and IMU rigidly attached Camera intrinsically calibrated

bull Data

bull Image points from calibration pattern (checkerboard or QR board)

bull IMU measurements accelerometer 119886119896 and gyroscope 120596119896

44

119905

images

IMU

119957119941

Kalibr Toolbox

45Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Code httpsgithubcomethz-aslkalibrwikicamera-imu-calibration

Kalibr Toolbox

46Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Solves a non-linear Least Square optimization problem similar to that seen before but also optimizes over 119879119861119862 119905119889

bull Continuous-time modelling using splines for 119883

bull Numerical solver Levenberg-Marquardt

119883 119871 119879119861119862 119905119889 119887119886 119887119892 = 119886119903119892119898119894119899119883 119871119879119861119862119905119889 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

Kalibr Toolbox

47Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Generates a report after optimizing the cost function

Residuals

Reprojection error [px] 00976 0051

Gyroscope error [rads] 00167 0009

Accelerometer error [ms^2] 00595 0031

Transformation T_ci (imu to cam)

[[ 099995526 -000934911 -000143776 000008436]

[ 000936458 099989388 001115983 000197427]

[ 000133327 -00111728 099993669 -005054946]

[ 0 0 0 1 ]]

Time shift (delay d)

cam0 to imu0 [s] (t_imu = t_cam + shift)

000270636270255

Popular Datasets for VIO VI-SLAM

48

EuRoC

Drone with IMU and stereo camera

Blackbird

Drone flying fast with VR goggles (synthetic images and real dynamics + IMU)

UZH FPV Drone Racing

Drone flying fast with event camera stereo camera IMU

MVSEC

Drone motorcycle car with event camera stereo camera

3D lidar GPS IMU

Understanding Check

Are you able to answer the following questions

bull Why is it recommended to use an IMU for Visual Odometry

bull Why not just an IMU without a camera

bull What is the basic idea behind MEMS IMUs

bull What is the drift of a consumer IMU

bull What is the IMU measurement model (formula)

bull What causes the bias in an IMU

bull How do we model the bias

bull How do we integrate the acceleration to get the position (formula)

bull What is the definition of loosely coupled and tightly coupled visual inertial fusion

bull How does non-linear optimization-based visual inertial odometry Can you write down the cost function and illustrate its meaning

49

Page 39: Vision Algorithms for Mobile Robotics Lecture 01 Introduction

Case Study 1 ROVIO

bull EKF state X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905 1198711 1198712 hellip 119871119896bull Basic idea

1 Prediction step predicts next position velocity orientation and features using IMU integration model

2 Measurement update refines state by leveraging visual constraint (ROVIO minimizes the photometric error between corresponding points (alternative would be the reprojection error))

39Bloesch Burri Omari Hutter Siegwart Iterated extended Kalman filter based visual-inertial odometry using direct photometric feedback

International Journal of Robotics Research (IJRR) 2017 PDF Code

ROVIO Problems

bull Complexity of the EKF grows quadratically in the number of estimated landmarksbull Thus max 20 landmarks are tracked to allow real-time operation

bull Only updates the most recent state

40

Case Study 2 MSCKF

bull MSCKF (Multi-State Constraint Kalman Filter) updates multiple past poses 119901C1 119902C1 hellip 119901C119873 119902CN in addition to the current state 119901 119905 119902 119905 119907 119905 State vector

bull Prediction step same as ROVIO

bull Measurement updatebull Differently from ROVIO

bull landmark positions are not added to the state vector thus can run very fast independently of the number of featuresbull Visual constraint is obtained from the Epipolar Line Distance (Lecture 08)

bull Used in spacecraft landing (NASAJPL Moon and Mars landing) DJI drones Google ARCore

bull Released open source within the OpenVins project httpsgithubcomrpngopen_vins

41

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905 119901C1 119902C1 hellip 119901C119873 119902CN

MSCKF running in Google ARCore (former Google Tango)

42

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

43

Camera-IMU Calibration

bull Goal estimate the rigid-body transformation 119931119913119914 and time offset 119957119941 between the camera and the IMU caused by communication delays and the internal sensor delays (introduced by filters and logic)

bull Assumptions Camera and IMU rigidly attached Camera intrinsically calibrated

bull Data

bull Image points from calibration pattern (checkerboard or QR board)

bull IMU measurements accelerometer 119886119896 and gyroscope 120596119896

44

119905

images

IMU

119957119941

Kalibr Toolbox

45Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Code httpsgithubcomethz-aslkalibrwikicamera-imu-calibration

Kalibr Toolbox

46Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Solves a non-linear Least Square optimization problem similar to that seen before but also optimizes over 119879119861119862 119905119889

bull Continuous-time modelling using splines for 119883

bull Numerical solver Levenberg-Marquardt

119883 119871 119879119861119862 119905119889 119887119886 119887119892 = 119886119903119892119898119894119899119883 119871119879119861119862119905119889 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

Kalibr Toolbox

47Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Generates a report after optimizing the cost function

Residuals

Reprojection error [px] 00976 0051

Gyroscope error [rads] 00167 0009

Accelerometer error [ms^2] 00595 0031

Transformation T_ci (imu to cam)

[[ 099995526 -000934911 -000143776 000008436]

[ 000936458 099989388 001115983 000197427]

[ 000133327 -00111728 099993669 -005054946]

[ 0 0 0 1 ]]

Time shift (delay d)

cam0 to imu0 [s] (t_imu = t_cam + shift)

000270636270255

Popular Datasets for VIO VI-SLAM

48

EuRoC

Drone with IMU and stereo camera

Blackbird

Drone flying fast with VR goggles (synthetic images and real dynamics + IMU)

UZH FPV Drone Racing

Drone flying fast with event camera stereo camera IMU

MVSEC

Drone motorcycle car with event camera stereo camera

3D lidar GPS IMU

Understanding Check

Are you able to answer the following questions

bull Why is it recommended to use an IMU for Visual Odometry

bull Why not just an IMU without a camera

bull What is the basic idea behind MEMS IMUs

bull What is the drift of a consumer IMU

bull What is the IMU measurement model (formula)

bull What causes the bias in an IMU

bull How do we model the bias

bull How do we integrate the acceleration to get the position (formula)

bull What is the definition of loosely coupled and tightly coupled visual inertial fusion

bull How does non-linear optimization-based visual inertial odometry Can you write down the cost function and illustrate its meaning

49

Page 40: Vision Algorithms for Mobile Robotics Lecture 01 Introduction

ROVIO Problems

bull Complexity of the EKF grows quadratically in the number of estimated landmarksbull Thus max 20 landmarks are tracked to allow real-time operation

bull Only updates the most recent state

40

Case Study 2 MSCKF

bull MSCKF (Multi-State Constraint Kalman Filter) updates multiple past poses 119901C1 119902C1 hellip 119901C119873 119902CN in addition to the current state 119901 119905 119902 119905 119907 119905 State vector

bull Prediction step same as ROVIO

bull Measurement updatebull Differently from ROVIO

bull landmark positions are not added to the state vector thus can run very fast independently of the number of featuresbull Visual constraint is obtained from the Epipolar Line Distance (Lecture 08)

bull Used in spacecraft landing (NASAJPL Moon and Mars landing) DJI drones Google ARCore

bull Released open source within the OpenVins project httpsgithubcomrpngopen_vins

41

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

X = 119901 119905 119902 119905 119907 119905 b119886 119905 b119892 119905 119901C1 119902C1 hellip 119901C119873 119902CN

MSCKF running in Google ARCore (former Google Tango)

42

[1] Mourikis Roumeliotis A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation International Conference on Robotics and Automation (ICRA) 2007 PDF

[2] Li Mourikis High-precision consistent EKF-based visualndashinertial odometry International Journal of Robotics Research (IJRR) 2013 PDF

Outline

bull What is an IMU and why do we need it

bull IMU model

bull Visual Inertial Odometrybull Closed-form solution

bull Non-linear optimization methods

bull Filtering methods

bull Camera-IMU extrinsic calibration and Synchronization

43

Camera-IMU Calibration

bull Goal estimate the rigid-body transformation 119931119913119914 and time offset 119957119941 between the camera and the IMU caused by communication delays and the internal sensor delays (introduced by filters and logic)

bull Assumptions Camera and IMU rigidly attached Camera intrinsically calibrated

bull Data

bull Image points from calibration pattern (checkerboard or QR board)

bull IMU measurements accelerometer 119886119896 and gyroscope 120596119896

44

119905

images

IMU

119957119941

Kalibr Toolbox

45Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Code httpsgithubcomethz-aslkalibrwikicamera-imu-calibration

Kalibr Toolbox

46Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Solves a non-linear Least Square optimization problem similar to that seen before but also optimizes over 119879119861119862 119905119889

bull Continuous-time modelling using splines for 119883

bull Numerical solver Levenberg-Marquardt

119883 119871 119879119861119862 119905119889 119887119886 119887119892 = 119886119903119892119898119894119899119883 119871119879119861119862119905119889 119887119886 119887119892

119896=1

119873

119891 119909119896minus1 119906 minus 119909119896 1205561198962 +

119896=1

119873

119894=1

119872

120587(119909119896 119871119894) minus 119911119894119896 120564119894119896

2

IMU residuals Reprojection residuals(Bundle Adjustment term)

Kalibr Toolbox

47Furgale Rehder Siegwart Unified Temporal and Spatial Calibration for Multi-Sensor Systems IEEERSJ International Conference on Intelligent Robots and Systems (IROS) 2013 PDF Code

bull Generates a report after optimizing the cost function

Residuals

Reprojection error [px] 00976 0051

Gyroscope error [rads] 00167 0009

Accelerometer error [ms^2] 00595 0031

Transformation T_ci (imu to cam)

[[ 099995526 -000934911 -000143776 000008436]

[ 000936458 099989388 001115983 000197427]

[ 000133327 -00111728 099993669 -005054946]

[ 0 0 0 1 ]]

Time shift (delay d)

cam0 to imu0 [s] (t_imu = t_cam + shift)

000270636270255

Popular Datasets for VIO VI-SLAM

48

EuRoC

Drone with IMU and stereo camera

Blackbird

Drone flying fast with VR goggles (synthetic images and real dynamics + IMU)

UZH FPV Drone Racing

Drone flying fast with event camera stereo camera IMU

MVSEC

Drone motorcycle car with event camera stereo camera

3D lidar GPS IMU

Understanding Check

Are you able to answer the following questions

bull Why is it recommended to use an IMU for Visual Odometry

bull Why not just an IMU without a camera

bull What is the basic idea behind MEMS IMUs

bull What is the drift of a consumer IMU

bull What is the IMU measurement model (formula)

bull What causes the bias in an IMU

bull How do we model the bias

bull How do we integrate the acceleration to get the position (formula)

bull What is the definition of loosely coupled and tightly coupled visual inertial fusion

bull How does non-linear optimization-based visual inertial odometry Can you write down the cost function and illustrate its meaning

49

Page 41: Vision Algorithms for Mobile Robotics Lecture 01 Introduction

Case Study 2: MSCKF

• MSCKF (Multi-State Constraint Kalman Filter) updates multiple past poses p_C1, q_C1, …, p_CN, q_CN in addition to the current state p(t), q(t), v(t) (see the state vector below).

• Prediction step: same as ROVIO.

• Measurement update: differently from ROVIO,
  • landmark positions are not added to the state vector, so it can run very fast, independently of the number of features;
  • the visual constraint is obtained from the Epipolar Line Distance (Lecture 08).

• Used in spacecraft landing (NASA/JPL Moon and Mars landings), DJI drones, and Google ARCore.

• Released open source within the OpenVINS project: https://github.com/rpng/open_vins

41
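The epipolar-line-distance constraint used in the measurement update can be sketched as follows (a minimal sketch in normalized image coordinates; the example essential matrix and points in the test are illustrative):

```python
import numpy as np

def epipolar_line_distance(E, x1, x2):
    """Distance of x2 from the epipolar line of x1 (normalized coordinates).

    E: 3x3 essential matrix; x1, x2: homogeneous points [u, v, 1].
    """
    l = E @ x1                          # epipolar line in image 2: l . x2 = 0
    return abs(x2 @ l) / np.hypot(l[0], l[1])
```

For a geometrically consistent feature correspondence the distance is zero up to noise, which is exactly what the filter update penalizes without ever putting the landmark itself into the state.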

[1] Mourikis, Roumeliotis, "A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation", International Conference on Robotics and Automation (ICRA), 2007. PDF

[2] Li, Mourikis, "High-precision, consistent EKF-based visual–inertial odometry", International Journal of Robotics Research (IJRR), 2013. PDF

State vector:
X = [p(t), q(t), v(t), b_a(t), b_g(t), p_C1, q_C1, …, p_CN, q_CN]
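The sliding window of past pose clones can be sketched as follows (illustrative only, not the OpenVINS data layout): on each new image the current camera pose is cloned into the state, and the oldest clone is dropped (marginalized) once the window is full.

```python
from collections import deque

class MsckfWindow:
    """Toy container for the p_Ci, q_Ci pose clones in the MSCKF state."""

    def __init__(self, max_clones=10):
        self.clones = deque()
        self.max_clones = max_clones

    def augment(self, p, q):
        # Clone the current camera pose into the state on image arrival.
        self.clones.append((p, q))
        if len(self.clones) > self.max_clones:
            self.clones.popleft()   # marginalize the oldest pose clone
```

Because only this fixed-size window of poses (and no landmarks) lives in the state, the filter's cost stays bounded regardless of how many features are tracked.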

MSCKF running in Google ARCore (former Google Tango)

42


Outline

• What is an IMU and why do we need it?
• IMU model
• Visual Inertial Odometry
  • Closed-form solution
  • Non-linear optimization methods
  • Filtering methods
• Camera-IMU extrinsic calibration and Synchronization

43

Camera-IMU Calibration

• Goal: estimate the rigid-body transformation T_BC and the time offset t_d between the camera and the IMU, caused by communication delays and internal sensor delays (introduced by filters and logic)

• Assumptions: camera and IMU rigidly attached; camera intrinsically calibrated

• Data:
  • image points of a calibration pattern (checkerboard or QR board)
  • IMU measurements: accelerometer a_k and gyroscope ω_k

44

[Figure: timeline of image and IMU timestamps, offset by the delay t_d]

Kalibr Toolbox

Furgale, Rehder, Siegwart, "Unified Temporal and Spatial Calibration for Multi-Sensor Systems", IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

• Code: https://github.com/ethz-asl/kalibr/wiki/camera-imu-calibration

Kalibr Toolbox

Furgale, Rehder, Siegwart, "Unified Temporal and Spatial Calibration for Multi-Sensor Systems", IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013. PDF, Code

• Solves a non-linear least-squares optimization problem, similar to the one seen before, but also optimizes over T_BC and t_d

• Continuous-time modelling of X using splines

• Numerical solver: Levenberg-Marquardt

{X, L, T_BC, t_d, b_a, b_g} = argmin_{X, L, T_BC, t_d, b_a, b_g} Σ_{k=1}^{N} ‖f(x_{k−1}, u) − x_k‖²_{Λ_k} + Σ_{k=1}^{N} Σ_{i=1}^{M} ‖π(x_k, L_i) − z_{ik}‖²_{Σ_{ik}}

IMU residuals (first term) + Reprojection residuals (second term, the Bundle Adjustment term)
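To see how a time offset t_d can fall out of such a least-squares problem, here is a toy 1-D sketch (NOT Kalibr's implementation: a sinusoidal angular-rate signal stands in for the spline trajectory, and a single-parameter Gauss-Newton loop for Levenberg-Marquardt):

```python
import numpy as np

t = np.linspace(0.0, 10.0, 1001)
true_td = 0.05                      # ground-truth camera-IMU delay [s]
omega = lambda tt: np.sin(1.3 * tt)

gyro = omega(t)                     # angular rate sampled on the IMU clock
cam = omega(t + true_td)            # camera-derived rate on the (delayed) camera clock

mask = t < 9.9                      # avoid extrapolating past the last gyro sample
td = 0.0
for _ in range(10):
    r = np.interp(t[mask] + td, t, gyro) - cam[mask]    # residuals r(td)
    eps = 1e-6                                          # numerical Jacobian dr/dtd
    J = (np.interp(t[mask] + td + eps, t, gyro) - np.interp(t[mask] + td, t, gyro)) / eps
    td -= (J @ r) / (J @ J)                             # Gauss-Newton step
```

Because the angular rate seen by the gyroscope and the one inferred from the images are the same signal shifted by t_d, the residuals are minimized when td matches the true delay, and the loop recovers it.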
