LiDAR SLAM Utilizing Normal Distribution Transform and ...gracegao/publications...In NDT, the 3D FOV...

13
LiDAR SLAM Utilizing Normal Distribution Transform and Measurement Consensus Ashwin Vivek Kanhere and Grace Xingxin Gao Stanford University BIOGRAPHIES Ashwin Vivek Kanhere is a PhD student in the Navigation and Autonomous Vehicles Laboratory in the department of Aero- nautics and Astronautics at Stanford University. He received his B.Tech degree in Aerospace Engineering from the Indian Institute of Technology Bombay, Mumbai in 2017 and an MS in Aerospace Engineering from the University of Illinois at Urbana-Champaign in 2019. His research interests are reliable navigation, with a particular interest in Computer Vision and LiDAR applications. Grace Xingxin Gao is an assistant professor in the Department of Aeronautics and Astronautics at Stanford University. Before joining Stanford University, she was an assistant professor at University of Illinois at Urbana-Champaign. She obtained her Ph.D. degree at Stanford University. Her research is on robust and secure positioning, navigation and timing with applications to manned and unmanned aerial vehicles, robotics and power systems. ABSTRACT Parametric point cloud approximations, such as the Normal Distribution Transform (NDT), are used for LiDAR navigation that requires compact maps. However, the state-of-the-art in these algorithms do not provide a reliability assessment. In addition, the computational efficiency needs to be improved for real-time applications. We propose a consensus-NDT Simultaneous Localization and Mapping (SLAM) framework, which utilizes measurement consensus, to increase computational efficiency and estimate localization reliability. In our approach, measurement consensus is quantified using a two-tiered method. In the first tier, a voxel consensus metric evaluates feature-level measurement consensus. In the second tier, voxel consensus metrics for all features are combined to yield the localization consensus metric, which estimates localization reliability. By selecting high measurement consensus voxels, computational efficiency is improved during the LiDAR odometry and map update steps. We perform experiments with real-world data which show that the proposed framework improves localization computation efficiency without compromising accuracy as compared to na¨ ıve NDT SLAM. Additional experiments show that the localization consensus metric is a valid metric for estimating localization reliability. 1 I NTRODUCTION Simultaneous Localization and Mapping (SLAM) [1] [2] [3] is an important tool to process LiDAR point clouds for their use in mobile robot navigation. A key element of LiDAR SLAM is the scan matching process, in which test point clouds are aligned with a prior reference of the environment. The reference may be a map generated prior to operation or point clouds from previous instances in time. Commonly used scan matching techniques are variants of the Iterative Closest Points (ICP) algorithm [6] [7]. Maps built using ICP scan matching contain dense point clouds, making them unsuitable for real-time applications where storage and computational resources are limited. These compact map and fast lookup requirements of real-time LiDAR applications are fulfilled [9] by using parametric point cloud approximations [8] while maintaining localization accuracy [10]. To estimate the localization reliability of these techniques, the goodness of match between the test and reference measurements must be

Transcript of LiDAR SLAM Utilizing Normal Distribution Transform and ...gracegao/publications...In NDT, the 3D FOV...

Page 1: LiDAR SLAM Utilizing Normal Distribution Transform and ...gracegao/publications...In NDT, the 3D FOV of a point cloud is divided into voxels (cubes) and the points measured inside

LiDAR SLAM Utilizing Normal DistributionTransform and Measurement Consensus

Ashwin Vivek Kanhere and Grace Xingxin GaoStanford University

BIOGRAPHIES

Ashwin Vivek Kanhere is a PhD student in the Navigation and Autonomous Vehicles Laboratory in the department of Aero-nautics and Astronautics at Stanford University. He received his B.Tech degree in Aerospace Engineering from the IndianInstitute of Technology Bombay, Mumbai in 2017 and an MS in Aerospace Engineering from the University of Illinois atUrbana-Champaign in 2019. His research interests are reliable navigation, with a particular interest in Computer Vision andLiDAR applications.Grace Xingxin Gao is an assistant professor in the Department of Aeronautics and Astronautics at Stanford University. Beforejoining Stanford University, she was an assistant professor at University of Illinois at Urbana-Champaign. She obtained herPh.D. degree at Stanford University. Her research is on robust and secure positioning, navigation and timing with applicationsto manned and unmanned aerial vehicles, robotics and power systems.

ABSTRACT

Parametric point cloud approximations, such as the Normal Distribution Transform (NDT), are used for LiDAR navigation thatrequires compact maps. However, the state-of-the-art in these algorithms do not provide a reliability assessment. In addition,the computational efficiency needs to be improved for real-time applications.

We propose a consensus-NDT Simultaneous Localization and Mapping (SLAM) framework, which utilizes measurementconsensus, to increase computational efficiency and estimate localization reliability. In our approach, measurement consensus isquantified using a two-tiered method. In the first tier, a voxel consensus metric evaluates feature-level measurement consensus.In the second tier, voxel consensus metrics for all features are combined to yield the localization consensus metric, whichestimates localization reliability. By selecting high measurement consensus voxels, computational efficiency is improved duringthe LiDAR odometry and map update steps.

We perform experiments with real-world data which show that the proposed framework improves localization computationefficiency without compromising accuracy as compared to naıve NDT SLAM. Additional experiments show that the localizationconsensus metric is a valid metric for estimating localization reliability.

1 INTRODUCTION

Simultaneous Localization and Mapping (SLAM) [1] [2] [3] is an important tool to process LiDAR point clouds for their use inmobile robot navigation. A key element of LiDAR SLAM is the scan matching process, in which test point clouds are alignedwith a prior reference of the environment. The reference may be a map generated prior to operation or point clouds fromprevious instances in time.

Commonly used scan matching techniques are variants of the Iterative Closest Points (ICP) algorithm [6] [7]. Maps builtusing ICP scan matching contain dense point clouds, making them unsuitable for real-time applications where storage andcomputational resources are limited. These compact map and fast lookup requirements of real-time LiDAR applications arefulfilled [9] by using parametric point cloud approximations [8] while maintaining localization accuracy [10]. To estimatethe localization reliability of these techniques, the goodness of match between the test and reference measurements must be

Page 2: LiDAR SLAM Utilizing Normal Distribution Transform and ...gracegao/publications...In NDT, the 3D FOV of a point cloud is divided into voxels (cubes) and the points measured inside

quantified. The goodness of match for the point cloud depends on the goodness of match on the feature level. The feature-levelgoodness of match is impacted by the presence of faults, such as outliers and occlusions, in the corresponding measurements.By detecting the presence of such faults, feature and subsequently overall localization reliability can be estimated for a test-reference pair.

The arrangement of features in the LiDAR’s field of view (FOV) also impacts localization uncertainty and reliability.Physically connected objects in the environment are represented as point clusters in measurements. Scan matching errors ofpoint in these clusters are highly correlated due to the shared physical nature of the real-world object. When such a pointcluster contains measurement faults, like when it measures the same moving object, the related scan matching error induces adisproportionately large drift in the navigation solution. Clusters of low reliability voxels are detrimental to navigation solutionsand must therefore be considered while estimating the reliability of localization solutions.

Prior work in studying the impact of measurement faults on the reliability of laser-based navigation quantifies the effectof feature-match uncertainty, or the probability of hazardously misleading information. In [11] and [12], the authors theorizethat erroneous feature correspondences are the main source of faults in laser-based navigation. By estimating the probabilityof incorrect feature correspondence, the authors quantify the probability of an unreliable (hazardously misleading) navigationsolution. In [13] the impact of feature uncertainty on the resulting navigation solution is used to quantify navigation reliability.

Prior approaches quantify overall localization reliability but potentially faulty features are not removed from the scan match-ing pipeline. Rather than giving both potentially faulty and non-faulty measurements the same weight during the scan matchingprocess, computation efficiency can be improved by using only reliable features. Removing potentially faulty measurementswill result in more efficient computation of the navigation solution without sacrificing accuracy. As an additional benefit,reliable features can be used landmarks during the map update process.

Features that provide consistently similar measurements across different instances of time are a category that are reliable forlocalization purposes. Quantifying the consensus between measurements and a prior reference and selecting high measurementconsensus features provides a method of extracting such reliable features. The choice of approach to estimate measurementconsensus depends on the fraction of outliers present in the measurement. For low outlier fractions and large number ofmeasurements, Random Sample Consensus (RANSAC) [14] provides an efficient method for identifying [15] and utilizing [16]features with high measurement consensus. Alternatively, when the outlier fraction is high or the number of measurements arelow, a nominal measurement model is used [17]. Such a model provides an estimate of acceptable measurement variation anda faulty measurement is detected when this variation is exceeded.

In this paper, the Normal Distribution Transform (NDT) [18] [19] is used as the parametric point cloud approximationfor the prior reference. In NDT, the 3D FOV of a point cloud is divided into voxels (cubes) and the points measured insideeach voxel are approximated by a 3D Gaussian distribution. NDT approximates the large variety of measured features with acommon set of parameters resulting in a compact map. An additional benefit of using NDT is the elimination of the potentiallyfaulty feature correspondence step [11] from the scan matching pipeline.

The NDT approximation of the reference point cloud provides an estimate of a nominal point cloud error model. This modelis used to evaluate the measurement consensus supporting a candidate pose for a particular test-reference pair. Intuitively,high measurement consensus for a majority of features in the LiDAR’s FOV implies high measurement consensus supportingthe candidate pose. This motivates an approach in which measurement consensus is first quantified by a feature-level voxelconsensus metric. The voxel consensus metrics are then combined for all features to obtain the localization consensus metric,which estimates reliability of the navigation solution. By integrating both the voxel and localization consensus metric in a NDTSLAM framework, computational efficiency can be increased and the reliability of localization solutions can be estimated.

Our contribution

This work is based on the author’s MS thesis [24] which contains an expanded discussion of this paper’s contents. The contri-butions of this work are the following:

1. We design a consensus-NDT SLAM architecture that utilizes measurement consensus for faster convergence to the lo-calization solution.

2. We define a consensus metric for each voxel to detect and remove potentially faulty features, voxels containing outliers,occlusions and mismatched features, from the scan matching pipeline.

3. We propose a Consensus Weighted Dilution of Precision (CWDOP)-based approach for the combination of the voxel

2

Page 3: LiDAR SLAM Utilizing Normal Distribution Transform and ...gracegao/publications...In NDT, the 3D FOV of a point cloud is divided into voxels (cubes) and the points measured inside

consensus metrics. The resultant localization consensus metric quantifies the localization reliability while accounting forthe impact of voxel arrangements on reliability.

4. We benchmarked the localization performance of consensus-NDT SLAM against naıve NDT SLAM using data collectedon the campus of the University of Illinois at Urbana-Champaign.

5. We use data from the KITTI Vision Benchmark Suite [20] to demonstrate the effectiveness of the proposed localizationconsensus metric in estimating the reliability of a localization solution.

The remainder of this paper is arranged as follows: Section 2 provides a brief background on NDT; Section 3 describes theconsensus-NDT SLAM architecture, with odometry and mapping components used in this work. Section 4 details two-tieredmeasurement consensus evaluation technique; Section 5 describes the validation procedure and discusses experimental results;and finally, Section 6 concludes this paper.

2 BACKGROUND

2.1 Normal Distribution Transform

For completeness, key details of the 3D NDT approximation used in our work are included here. A detailed discussion of whichcan be found in [19].

In NDT, the 3D space of the LiDAR’s FOV is divided into a grid of overlapping voxels and measured points are binnedinto voxels containing them. If a voxel was previously unoccupied, the new measured points are used to create an approximateGaussian distribution for that voxel. If a voxel was occupied and has a corresponding Gaussian approximation, the new pointsare utilized to update the existing distribution.

For this iterative update process, we define the sum of points in voxel j at time t asmj,t. Sj,t is defined as the autocorrelationof the points present in voxel j at time t. The number of points present in voxel j at time t is given by Nj,t.

When new points are binned in the voxel, the sum of points and autocorrelation corresponding to voxel j are updated using

mt,j = mt−1,j +∑i

pi,j and (1)

St,j = St−1,j +∑i

pi,jpTi,j (2)

respectively. Here pi,j is the ith point binned in voxel j at time t, given by pi,j = [pxi,j , pyi,j , p

zi,j ]

T . These equations are thenused to update the parameters of the Gaussian distribution occupying voxel j using

µj,t =mt,j

Nt,j, and (3)

Σj,t =St − µj,tµ

Tj,t

Nt. (4)

respectively.In the following sections, the notation is reduced to µj and Σj , which refer to the mean and covariance of the Gaussian

distribution in voxel j. Further throughout the paper, i indexes over all points present in a voxel while j indexes over alloccupied voxels in the NDT approximation.

To prevent degeneracy in representations of features like lines and planes, during the update the smallest eigenvalue of thecovariance matrix is set to be greater than 0.001 times the largest eigenvalue. Moreover, the calculation of consensus for a voxelrequires a redundancy of points, as a result of which the update step is carried out only if Nt ≥ 5.

2.2 NDT Scan Matching

Scan matching provides an affine transformation, x = [Tx, Ty, Tz, φ, θ, ψ]T , that maps the test point cloud to the referencepoint cloud. This transformation is applied to a point pi by means of a matrix A(x) such that vi(x) = A(x)pi, where

3

Page 4: LiDAR SLAM Utilizing Normal Distribution Transform and ...gracegao/publications...In NDT, the 3D FOV of a point cloud is divided into voxels (cubes) and the points measured inside

Figure 1: The key steps executed for an individual key frame in consensus-NDT SLAM.

vi(x) = [vxi , vyi , v

zi , 1]T and pi(x) are 3D homogeneous points. A(x) is expressed as A(x) =

[R(x) T (x)

0 1

], where

T (x) = [Tx, Ty, Tz]T is the applied translation. R(x) the rotation matrix corresponding to an intrinsic rotation of φ, θ and ψapplied about the x, y and z axes respectively [24].

Scan matching in ICP and its variants is an iterative process. Each iteration begins by forming correspondences betweenfeatures extracted from point clouds at different instances. A pose that minimizes the distance between corresponding featuresis then passed to the next iteration. This process is carried out until the desired error tolerances for convergence are achieved.On the other hand, NDT scan matching eliminates the feature correspondence step and directly minimizes a score betweenthe test point cloud and reference NDT approximation. An analytical expression for the solution is not known and the posealigning the test measurement with the prior reference is obtained by numerically optimizing an objective function. For a giventransformation vector x, the objective function is obtained by combining the score for each voxel

sj(x) = −∑i

exp(−qTi,jΣ

−1j qi,j

2) , (5)

where qi,j(x) = vi,j(x) − µj , µj and Σj are the mean and covariance corresponding to voxel j, as described in Section 2.1.vi,j(x) are points binned in voxel j after transformation by x. This dependence of the points and score function on the appliedtransformation x is assumed to exist throughout the remainder of this paper and (x) is dropped from the notation.

Ideally, when the test point cloud is aligned with the reference NDT, the score will be minimized for every voxel. In orderto model this effect while increasing robustness to measurement noise, the objective function is defined as the sum of the scorefor each voxel corresponding to the pair of reference NDT, ndtn, and test point cloud, pcm,

S(x;ndtn; pcm) = S(x) =∑j

sj(x) . (6)

Analytical expressions for both the Jacobian and Hessian of the scan matching objective function exist [24]. As a result,algorithms such as Newton’s Conjugate Gradient Method [23] can be used to perform numerical optimization and match thetest point cloud to the NDT reference.

3 CONSENSUS-NDT SLAM

Figure 1 gives an overview of the consensus-NDT SLAM framework proposed in this paper. Similar to [21], in consensus-NDT SLAM LiDAR odometry is performed for each consecutive pair of measured point clouds and the map is updated onceper keyframe. We define a keyframe as a collection of all point clouds that fulfill a pre-determined similarity criterion. The ratioof common features to total features is used as the similarity criterion in this work. Once this similarity falls below a threshold,a new keyframe is created and the process repeats.

In the LiDAR odometry component, the NDT approximation of the previously measured point cloud (reference pointcloud) is compared to the latest measured point cloud (test point cloud). The LiDAR odometry is a modification of the NDT

4

Page 5: LiDAR SLAM Utilizing Normal Distribution Transform and ...gracegao/publications...In NDT, the 3D FOV of a point cloud is divided into voxels (cubes) and the points measured inside

Figure 2: The LiDAR odometry component of the algorithm. Only high consensus voxels are used for the fine optimizationstep, reducing computational load and improving convergence time.

scan matching process. In calculating LiDAR odometry, feature-level measurement consensus is quantified by means of a voxelconsensus metric (Cv), that is defined for each occupied voxel. The voxel consensus metric is used to retain high measurementconsensus voxels in a two-step process. The result of this modified scan matching process is a pose that transforms the testpoint cloud to the reference point cloud.

The map update takes place when LiDAR odometry has been obtained for all point clouds in a keyframe. Inputs to themap update step are the prior map of the surroundings and the current keyframe. The NDT approximations of each pointcloud in the keyframe and solutions from the LiDAR odometry for all consecutive pairs are also passed to the map updatecomponent of the framework. The LiDAR odometry solution is used as an initial condition for the map update step. As aresult of the map update, new landmarks are added to the map. These landmarks consist of high consensus voxels, Gaussiandistributions (µj ,Σj) for a voxel with centre [mx

j ,myj ,m

zj ]T . This component also refines the localization solution, the LiDAR

pose x(t) = [Tx, Ty, Tz, φ, θ, ψ]T , for all given point clouds and provides an estimate of the corresponding reliability in theform of a localization consensus metric (Cm).

3.1 LiDAR Odometry

In the nave implementation of the NDT SLAM framework used in this paper, scan matching, described previously in Section2.2, is performed for point cloud pairs during this step. In the consensus-NDT SLAM framework, a voxel consensus metricthresholding component is added to the scan matching pipeline. The output from the LiDAR odometry is a transformation thatmatches the test point cloud to the reference point cloud with consensus aiding. As a result, the notation used in this section issimilar to that of Section 2.2.

The LiDAR odometry approach utilizes voxel consensus metric thresholding to retain high consensus voxels at an inter-mediate stage. This thresholding step reduces the weight measurement faults, such as outliers, mismatches and occlusions,are given during the scan matching process. This results in faster convergence to the navigation solution while maintaininglocalization accuracy. To perform LiDAR odometry between two consecutive point clouds, the following steps, depicted inFigure 2, are taken.

1. The optimization problem is initialized assuming no motion between the consecutive point clouds i.e. x0 = [0, 0, 0, 0, 0, 0]T

and an NDT approximation is created for the reference point cloud.

2. Scan matching is performed using all NDT voxels and measured points. The result of this coarse optimization stepyields an intermediate affine transformation x = [Tx, Ty, Tz, φ, θ, ψ]T . Points belonging to low consensus voxels arealso utilized during this coarse optimization step and add to the computation cost of optimization.

3. Voxels with Cv < η are removed from the NDT approximation to reduce the computation cost of each iteration duringoptimization. For this work, η = 0.7 is used as the threshold.

4. With x as the initial condition scan matching is performed again with the remaining high consensus voxels. The outputof this step is an estimate of the LiDAR odometry, x = [Tx, Ty, Tz, φ, θ, ψ]T , between the consecutive point cloud pair.

3.2 Map Update

The map update component optimizes LiDAR poses corresponding to point clouds in the keyframe. As a result of this step,estimates of the LiDAR poses for each time instance x(t), the corresponding localization consensus metrics Cm and an updated

5

Page 6: LiDAR SLAM Utilizing Normal Distribution Transform and ...gracegao/publications...In NDT, the 3D FOV of a point cloud is divided into voxels (cubes) and the points measured inside

Figure 3: A simplified schematic of the mapping component of the algorithm. High consensus voxels are utilized to optimizethe objective function and as map landmarks.

posterior map are obtained. A simplified schematic of this process is given in Figure 3.Naıve optimization of the objective function is computationally expensive due to the high dimensionality of the argument

and the large number of features in the accompanying NDT approximations. To solve this problem, map update in consensus-NDT SLAM is performed using only high consensus voxels and with the LiDAR odometry solution for each pair as an initialcondition. For mapping, the objective function

Smap(x1, x2, . . . , xk) =∑l

S (xl;ndtmap; pcl) +∑l

∑m l 6=m

S(xl − xm, ndtl, pcm) (7)

is minimized with respect to the LiDAR pose vectors for each time instant x. x denotes the absolute LiDAR pose in a navigationframe of reference. The map and LiDAR poses share this common navigation frame of reference in which poses are obtained byiteratively combining the relative pose between consecutive point clouds x, which is output by the LiDAR odometry component.

The score of a point cloud from the reference map is captured in the term S (xj ;ndtmap; pcj). xj is the pose of pcj withrespect to the reference map. The score of each point cloud in the keyframe with respect to the NDT approximation of otherpoint clouds in the keyframe is captured in the term S(xi − xj , ndtj , pci). In this term, xj − xi represents the transformationthat maps pci to ndtj . Minimizing the objective in Equation (7) ensures an equal balance between point clouds aligning withthe map (first term) and each other (second term). The point clouds in the keyframe are transformed by the optimal odometryvectors xi and high consensus voxels from these transformed point clouds update the map using Equations (3) and (4).

4 MEASUREMENT CONSENSUS METRIC

The measurement consensus supporting a candidate pose for a given test-reference pair is obtained using a two-tiered approach.The feature-level measurement consensus is first estimated by voxel consensus metrics (Cv) for each occupied voxel. Theseare then combined to obtain the localization consensus metric (Cm), which estimates localization reliability.

4.1 Voxel Consensus Metric

For evaluation of the voxel consensus metric, we extend outlier detection tests [25] [26] for GPS pseudoranges to points insideNDT voxels. The reference Gaussian distribution models the shape, size and location of the feature contained in a voxel.Although this feature is rarely a single point, the mean of the Gaussian can be treated analogous to the expected measurementfor that feature. The covariance of the Gaussian gives a nominal error model for measurements of the feature contained in thevoxel. The measured points are analogous to the received measurements and are assumed to be 3D normally distributed.Forvoxel j, the χ2 distributed Residual Sum of Squares (RSS) [25] is

rj =∑i

qTi,jΣ−1j qi,j

Nj − 4, (8)

where the notation is common to that used in Section 2.2. Nj are the number of measured points binned inside voxel j. Thefactor Nj − 4 in the denominator is the degrees of freedom possessed by the χ2 distribution. At least four points are requiredto obtain a non-degenerate 3D Gaussian approximation for a voxel giving Nj − 4 degrees of freedom for Nj points.

6

Page 7: LiDAR SLAM Utilizing Normal Distribution Transform and ...gracegao/publications...In NDT, the 3D FOV of a point cloud is divided into voxels (cubes) and the points measured inside

Figure 4: Comparison of the voxel consensus metric for different cases: (a) successful feature match between NDT referenceand test measurements; (b) occlusion in test measurements; (c) feature mismatch between NDT reference and test measure-ments, and (d) outliers present in test measurements. In this figure, red points denote the test measurements whereas blue pointsare drawn from the Gaussian distribution approximating the rectangular block present in the voxel. In cases (b), (c) and (d) Cv

decreases, indicating the presence of measurement faults.

To compare measurement consensus across different voxels, the RSS must be scaled to a common range irrespective of thenumber of points or shape of feature inside a voxel. We smoothly normalize the RSS which gives a consensus metric between 0and 1, indicating low and high consensus between the measured points and underlying Gaussian respectively. A high consensuswould imply that the measured points are drawn from the prior Gaussian with a high probability whereas a low consensusimplies that the measured points represent a feature different from the prior Gaussian.

To achieve this smooth normalization, the RSS is first linearly scaled between τhigh and τlow and then passed through asigmoid function to give

Cv = sigm(s− 2s

(r − τlow

τhigh − τlow

)). (9)

The hyperparameter s = 3 is chosen such that Cv ≥ 0.95 when r ≤ τlow and Cv ≤ 0.05 when r ≥ τhigh. The hyperparametersτhigh = χ2

0.999,Nj−4 and τlow = χ20.001,Nj−4 correspond to a measurement fault detection test for a p-value of 0.001. The

notation χ2p,q represents the inverse CDF of a χ2 distribution at p for q degrees of freedom.

The simulated example shown in Figure 4 gives an example of a case where the presence of outliers (b), feature mismatches(c) and feature occlusions (d) is reflected in the decreased value of the voxel consensus metric.

4.2 Localization Consensus Metric

The Dilution of Precision (DOP) quantifies the effect of satellite geometry on the susceptibility of the navigation solution tomeasurement error [28] [29]. A similar formulation is used to combine voxel consensus metrics and capture the effect of theirarrangement on localization reliability.

The geometric spread of feature-level measurement consensus is measured by the CWDOP,

CWDOP =√∑

Hii , where H =(GT ΩG

)−1. (10)

7

Page 8: LiDAR SLAM Utilizing Normal Distribution Transform and ...gracegao/publications...In NDT, the 3D FOV of a point cloud is divided into voxels (cubes) and the points measured inside

Figure 5: Examples of Cm for different arrangements of the same voxels: (a) voxels with both high and low voxel consensusmetrics spread throughout the LiDAR FOV, (b) low consensus voxels clustered in part of the LiDAR FOV, which susceptible tolarge drifts in the navigation solution. A decrease in Cm detects the presence of low consensus voxel clusters.

The geometry matrix of the occupied voxels is

G =

µx1 µy

1 µz1

...µxk µy

k µzk

...µxM µy

M µzM

, (11)

where[µxj µy

j µzj

]are unit vectors corresponding to the mean of the voxel Gaussian distributions. Ω is a diagonal matrix

containing the corresponding voxel consensus metrics and is expressed as

Ω =

C2v,1

. . .C2

v,j

. . .C2

v,M

. (12)

Similar to our previous approach, we obtain the localization consensus metric by normalizing CWDOP to a common range,irrespective of the number of occupied voxels in the LiDAR’s FOV. The localization consensus metric

Cm =DOP

CWDOP, (13)

lies between 0 and 1, indicating low and high consensus respectively. Here DOP =

√tr(GT G

)−1 is the geometric ar-rangement of the voxels, and captures the best possible scenario: when Cv = 1 for each voxel. The normalization given inEquation (13) quantifies the spread of the features with respect to the voxel consensus. Any arrangement of voxels will have ahigher CWDOP than this case, indicating a poorer arrangement of voxels and a lower value ofCm. AsCm is normalized and notimpacted by the number of features or distance metric used to match them, it quantifies the localization solution measurementconsensus irrespective of the algorithm used.

5 RESULTS AND DISCUSSION

For a practical validation of the SLAM algorithm, we built a map using data collected on the campus of the University of Illinoisat Urbana-Champaign. The point clouds were measured using an off the shelf 16 channel LiDAR.

8

Page 9: LiDAR SLAM Utilizing Normal Distribution Transform and ...gracegao/publications...In NDT, the 3D FOV of a point cloud is divided into voxels (cubes) and the points measured inside

For validation of the consensus metric, synced and rectified point clouds from the raw dataset of the KITTI Vision Bench-mark Suite [20] with the corresponding ground truth estimates from an independent GPS/IMU system are used.

5.1 Localization and Mapping Results

The algorithm localizes the robot in, and maps the previously unseen environment. The map obtained from this experimentusing consensus-NDT SLAM is given in Figure 6. Localization results from consensus-NDT SLAM are benchmarked againstnaıve NDT SLAM. Our implementation of naıve NDT SLAM does not utilize the consensus metric in any capacity, but issimilar to the consensus-NDT SLAM in all other aspects.

AlgorithmAverage Itera-tions per pointcloud pair

Average time perpoint cloud pair

Average time periteration (coarseoptimization)

Average time periteration (fine op-timization)

Consensus-NDT SLAM (VoxelLength = 50 cm) 20 110 s 7 s 1 s

NDT SLAM (Voxel Length = 50 cm) 28 196 s 7 s 7 s

Table 1: Number of optimization iterations and time per iteration for consensus-NDT SLAM and naıve NDT SLAM imple-mentations. Consensus-NDT SLAM converges faster to a navigation solution than naıve NDT SLAM

Results reported in Table 1 show that consensus-NDT SLAM requires fewer optimization iterations per point cloud pairto converge to the navigation solution. Iterations in the fine optimization step of consensus-NDT SLAM are also faster due toreduced computational load achieved by retaining only high consensus voxels. Both these factors results in faster optimiza-tion convergence for consensus-NDT SLAM. This convergence time improvement is achieved without compromising on thelocalization accuracy of the algorithm as shown in Table 2.

Algorithm ∆x ∆y ∆z ∆φ ∆θ ∆ψ

Consensus-NDT SLAM (VoxelLength = 50 cm) 7.6 cm 7.4 cm 1.4 cm 0.13 0.14 0.38

NDT SLAM (Voxel Length = 50 cm) 7.8 cm 7.3 cm 1.8 cm 0.15 0.14 0.40

Table 2: RMS error for implementations of consensus-NDT SLAM and NDT SLAM. The localization accuracy of bothconsensus-NDT SLAM and naıve NDT SLAM is similar, despite the faster convergence and lower computational load pro-vided by consensus enhancement.

5.2 Voxel Consensus Metric Examples

Thresholding with the voxel consensus metric detects instances of outliers, feature mismatches and occlusions present betweenconsecutive point clouds. An example of the result of this thresholding is given in Figure 7 where sparsely measured anddynamic objects are the primary types of measurement faults detected. A specific example of feature-level measurement faultsdetected is given in Figure 8 which shows a particular feature mismatch from the same test-reference pair as in Figure 7.

9

Page 10: LiDAR SLAM Utilizing Normal Distribution Transform and ...gracegao/publications...In NDT, the 3D FOV of a point cloud is divided into voxels (cubes) and the points measured inside

Figure 6: Map of a corridor that was constructed using point clouds collected on the campus of the University of Illinois atUrbana-Champaign.

Figure 7: Faulty measurements detected by the voxel consensus metric: (a) NDT approximation of reference point cloud; (b)NDT approximation of reference point cloud shaded by measurement consensus with test point cloud. White denotes highconsensus while black denotes low consensus in the heat map color reference.

Figure 8: Feature mismatch detected by the voxel consensus metric: (a) Reference point cloud (cyan) and transformed testpoint cloud (pink) showing feature mismatch; (b) Reference point cloud shaded by voxel consensus metric indicating low valueof Cv and successful mismatch detection.

10

Page 11: LiDAR SLAM Utilizing Normal Distribution Transform and ...gracegao/publications...In NDT, the 3D FOV of a point cloud is divided into voxels (cubes) and the points measured inside

Figure 9: The variation in Cm with increasing deviation from the ground truth: (a) translational deviation from ground truthestimate along x-axis; (b) rotational deviation from ground truth estimated roll angle. Increasing deviation from ground truth isdetected by a decrease in Cm.

5.3 Localization Consensus Metric Results

The localization consensus metric formulation is validated by plotting its variation with increasing deviation from the estimatedground truth pose. For an effective experiment, dynamic objects are removed from the reference point cloud using the estimatedground truth. As the ground truth estimate is obtained independently from the LiDAR measurements, this method removes mostdynamic objects from the point clouds. The results of this experiment, plotted in Figure 9, show that Cm decreases with anincrease in the translational and rotational deviation from the ground truth estimate. This indicates that the formulation for Cm

provides a viable estimate for the reliability of the navigation solution.

6 CONCLUSIONS

In this work, we presented a consensus-NDT SLAM algorithm that improves the optimization convergence performance whilemaintaining localization accuracy. Our proposed framework utilizes measurement consensus in the LiDAR odometry and mapupdate components. A two-tiered measurement consensus estimation process is utilized to evaluate the reliability of featuresand navigation solutions. The voxel consensus metric detects feature-level measurement faults and the localization consensusmetric quantifies the localization reliability while accounting for the effect of different arrangements of voxels.

Experiments conducted using real world data collected on the campus of the University of Illinois at Urbana-Champaignshow that the accuracy of the proposed SLAM algorithm is comparable to standard NDT SLAM while providing faster con-vergence. Additional experiments on point clouds from the KITTI Vision Benchmark Suite show that the proposed consensusmetric quantifies the goodness of match between a measured point cloud and the reference irrespective of the scan matchingalgorithm used.

7 ACKNOWLEDGMENTS

The material in this paper is based on work supported by the United States Army CERDEC under contract number W911NF-13-1-0086. We would also like to thank Akshay Shetty and Shubhendra Chauhan for their help with the data collection on theUniversity of Illinois at Urbana-Champaign campus.

11

Page 12: LiDAR SLAM Utilizing Normal Distribution Transform and ...gracegao/publications...In NDT, the 3D FOV of a point cloud is divided into voxels (cubes) and the points measured inside

REFERENCES

[1] T. Bailey and H. Durrant-Whyte, “Simultaneous localization and mapping (SLAM): Part II,” IEEE Robotics and Automa-tion Magazine, vol. 13, no. 3, pp. 108–117, 2006.

[2] H. Durrant-Whyte and T. Bailey, “Simultaneous localization and mapping: Part I,” IEEE Robotics and Automation Mag-azine, vol. 13, no. 2, pp. 99–108, 2006.

[3] C. Cadena, L. Carlone, H. Carrillo, Y. Latif, D. Scaramuzza, J. Neira, I. Reid, and J. J. Leonard, “Past, present, and futureof simultaneous localization and mapping: Toward the robust-perception age,” IEEE Transactions on Robotics, vol. 32,no. 6, pp. 1309–1332, 2016.

[4] A. Soloviev, D. Bates, and F. Van Graas, “Tight coupling of laser scanner and inertial measurements for a fully autonomousrelative navigation solution,” Navigation, vol. 54, no. 3, pp. 189–205, 2007.

[5] A. V. Kanhere and G. X. Gao, “Integrity for gps/lidar fusion utilizing a raim framework,” in 31st International TechnicalMeeting of the Satellite Division of the Institute of Navigation, ION GNSS+ 2018. Institute of Navigation, 2018, pp.3145–3155.

[6] D. Chetverikov, D. Svirko, D. Stepanov, and P. Krsek, “The trimmed iterative closest point algorithm,” in Object recogni-tion supported by user interaction for service robots, vol. 3. IEEE, 2002, pp. 545–548.

[7] A. Segal, D. Haehnel, and S. Thrun, “Generalized-icp.” in Robotics: science and systems, vol. 2, no. 4. Seattle, WA,2009, p. 435.

[8] M. Magnusson, N. Vaskevicius, T. Stoyanov, K. Pathak, and A. Birk, “Beyond points: Evaluating recent 3D scan-matchingalgorithms,” Proceedings - IEEE International Conference on Robotics and Automation, vol. 2015-June, no. June, pp.3631–3637, 2015.

[9] R. W. Wolcott and R. M. Eustice, “Fast LIDAR localization using multiresolution Gaussian mixture maps,” Proceedings- IEEE International Conference on Robotics and Automation, vol. 2015-June, no. June, pp. 2814–2821, 2015.

[10] D. Droeschel, J. Stuckler, and S. Behnke, “Local multi-resolution representation for 6d motion estimation and mappingwith a continuously rotating 3d laser scanner,” in 2014 IEEE International Conference on Robotics and Automation(ICRA). IEEE, 2014, pp. 5221–5226.

[11] M. Joerger, M. Jamoom, M. Spenko, and B. Pervan, “Integrity of laser-based feature extraction and data association,”Proceedings of the IEEE/ION Position, Location and Navigation Symposium, PLANS 2016, pp. 557–571, 2016.

[12] M. Joerger and B. Pervan, “Continuity Risk of Feature Extraction for Laser-Based Navigation,” Proceedings of the 2017International Technical Meeting of The Institute of Navigation, pp. 839–855, 2018.

[13] C. D. Larson, “An integrity framework for image-based navigation systems,” Air Force Institute of Technology, Wright-Patterson AFB, OH, Tech. Rep., 2010.

[14] R. Schnabel, R. Wahl, and R. Klein, “Efficient ransac for point-cloud shape detection,” in Computer graphics forum,vol. 26, no. 2. Wiley Online Library, 2007, pp. 214–226.

[15] F. Tarsha-Kurdi, T. Landes, and P. Grussenmeyer, “Hough-transform and extended ransac algorithms for automatic detec-tion of 3d building roof planes from lidar data,” in ISPRS Workshop on Laser Scanning 2007 and SilviLaser 2007, vol. 36,2007, pp. 407–412.

[16] D. Fontanelli, L. Ricciato, and S. Soatto, “A fast ransac-based registration algorithm for accurate localization in unknownenvironments using lidar measurements,” in 2007 IEEE International Conference on Automation Science and Engineering.IEEE, 2007, pp. 597–602.

[17] W. G. Cochran, “The χ2 test of goodness of fit,” The Annals of Mathematical Statistics, pp. 315–345, 1952.

[18] P. Biber and W. Strasser, “The normal distributions transform: a new approach to laser scan matching,” 2004 IEEE/RSJInternational Conference on Intelligent Robots and Systems, IROS, no. November 2003, pp. 2743–2748, 2004.

12

Page 13: LiDAR SLAM Utilizing Normal Distribution Transform and ...gracegao/publications...In NDT, the 3D FOV of a point cloud is divided into voxels (cubes) and the points measured inside

[19] E. Takeuchi and T. Tsubouchi, “A 3-D scan matching using improved 3-D normal distributions transform for mobilerobotic mapping,” IEEE International Conference on Intelligent Robots and Systems, pp. 3068–3073, 2006.

[20] A. Geiger, P. Lenz, C. Stiller, and R. Urtasun, “Vision meets robotics: The kitti dataset,” International Journal of RoboticsResearch (IJRR), 2013.

[21] J. Zhang and S. Singh, “Loam: Lidar odometry and mapping in real-time.” in Robotics: Science and Systems, vol. 2, 2014,p. 9.

[22] D. Titterton, J. L. Weston, and J. Weston, Strapdown inertial navigation technology. IET, 2004, vol. 17.

[23] M. R. Hestenes, “Multiplier and gradient methods,” Journal of optimization theory and applications, vol. 4, no. 5, pp.303–320, 1969.

[24] A. Kanhere, “Measurement consensus metric aided lidar localiztion and mapping,” 2019.

[25] B. W. PARKINSON and P. AXELRAD, “Autonomous GPS Integrity Monitoring Using the Pseudorange Residual,” Nav-igation, vol. 35, no. 2, pp. 255–274, 1988.

[26] S. Hewitson and J. Wang, “Gnss receiver autonomous integrity monitoring (raim) performance analysis,” Gps Solutions,vol. 10, no. 3, pp. 155–170, 2006.

[27] A. Shetty and G. X. Gao, “Covariance estimation for gps-lidar sensor fusion for uavs,” in Proceedings of the 30th Interna-tional Technical Meeting of The Satellite Division of the Institute of Navigation (ION GNSS+ 2017), Portland, OR, USA,2017.

[28] C. S. Chen, “Weighted geometric dilution of precision calculations with matrix multiplication,” Sensors (Switzerland),vol. 15, no. 1, pp. 803–817, 2015.

[29] S. H. Doong, “A closed-form formula for GPS GDOP computation,” GPS Solutions, vol. 13, no. 3, pp. 183–190, 2009.

13