Automatic Segmentation of 3D Laser Point Clouds by Ellipsoidal Region Growing

Frederick Pauling

University of Queensland

Brisbane, Australia

[email protected]

Michael Bosse

CSIRO ICT Centre

Brisbane, Australia

[email protected]

Robert Zlot

CSIRO ICT Centre

Brisbane, Australia

[email protected]

Abstract

We present and evaluate two variants of an algorithm for simultaneously segmenting and modeling a mixed-density unstructured 3D point cloud by ellipsoidal (Gaussian) region growing. The base algorithm merges initial ellipsoids into larger ellipsoidal segments with a minimum spanning tree algorithm. The variants differ only in the merge criterion used: a threshold on a generalised distance measure defined on the merge candidates. The first variant (shape-distance) considers the relative shape, orientation and position of the ellipsoids, and can grow regions across missing or sparse data, whilst the second (density-distance) attempts to maintain a good fit to the data by setting a minimum sample density threshold on the merged ellipsoid. Adjusting the threshold in each case changes the quality and degree of segmentation achieved. The threshold parameter is tuned by minimising Akaike’s Information Criterion (AIC) with respect to the threshold value. Experiments show that thresholds selected in this way lead to low complexity models and are stable across different environments. The shape-distance measure segments large-scale structures more readily than the density-distance measure, but leads to higher AIC scores, and higher model complexity.

1 Introduction

Three-dimensional lidar point clouds are an important type of perceptual model for many robotics applications such as autonomous mapping and navigation. However, they often have high redundancy, which incurs processing overhead, and they lack explicit structure, which limits semantic processes such as object classification. Segmenting a point cloud into connected regions, and modelling those segments with parameterised shape primitives, provides both data compression and explicit structure. An appropriate model can also assist in the task of object classification and identification, and can help to solve a key problem in mobile robotics, Simultaneous Localisation and Mapping (SLAM), by simplifying data association and reducing feature complexity.

The segmentation and modelling algorithm described in this paper has been designed for use on mixed-density point clouds obtained in outdoor environments; for example those containing planar building and ground-plane structures, cylindrical tree-trunks and poles, and diffuse, amorphous vegetation. It has also been developed specifically to deal with the uneven sampling density and noise present in our dataset due to the acquisition method.

The algorithm simultaneously segments and models a point cloud by growing ellipsoidal regions via a minimum spanning-tree. The region-growing criterion is formulated as a threshold on a distance measure, defined on the merge candidate ellipsoids. Two distance measures are proposed in this paper, each based on different principles, and each producing a different style of segmentation. Varying the threshold parameter gives different degrees of segmentation, and we select the threshold value by minimising Akaike’s Information Criterion (AIC) [Akaike, 1974] with respect to the threshold parameter. AIC ranks models according to how they balance variance with bias, and leads to the selection of the simplest model which best explains the data (the principle of parsimony, or Occam’s Razor). Using AIC also allows meaningful comparison of the variants at their optimal threshold values, and hence forms a relatively objective method of selecting the best algorithm or measure for the segmentation.

By using ellipsoids as the primary data abstraction, we are able to natively represent planar, cylindrical and spherical regions in the data. Each ellipsoid is effectively the multivariate normal PDF of the modelled segment, and this allows the algorithm to deal with uneven sampling density and noise by defining the distance measures to operate on the principal distribution components, rather than purely local properties. Ellipsoids are generated from the moment statistics of the modelled segments, and region growing is achieved by incrementally merging these statistics (rather than re-computing them each time). We have selected efficient methods wherever possible with a view towards online implementation in the future; however, the implementation is not optimised and the computational efficiency is not evaluated in this paper.

Australasian Conference on Robotics and Automation (ACRA), December 2-4, 2009, Sydney, Australia

Many techniques have been developed for segmenting high-resolution point clouds derived from terrestrial 3D lidar scanners, such as those used for surveying and reverse-engineering civil works [Haala and Brenner, 1999; Slob and Hack, 2004]. However, these techniques typically rely on the terrestrial scanner’s high sampling density, low error and regular acquisition scan pattern for accurate results [Klasing et al., 2008; Rabbania et al., 2006]. Because fixed 3D terrestrial scanners are expensive, relatively slow, and produce very large amounts of data, many roboticists use cheaper 2D scanners to acquire 3D point clouds. This is usually achieved by mounting the scanner on a vehicle and moving through the environment. Successive 2D scans are registered in the world-frame to produce a 3D point cloud. Typical approaches include mounting the scanner in a fixed vertical plane [Fruh and Zakhor, 2001; Thrun et al., 2000], or actuating the scanner about some axis: either nodding [Cole et al., 2005] or rotating [Bosse and Zlot, 2009a].

The point clouds we use are derived from an in-house spinning 2D SICK laser scanner with the spin axis along the centre beam of the scan plane [Bosse and Zlot, 2009a]. The rotating platform is mounted on a vehicle (Figure 2), and driven around semi-industrial and off-road woodland environments (Figure 1). The scanner generates a forward-facing hemispherical scan sweep from each one-second half-revolution. This unusual scan pattern generates very uneven sampling densities when the scans are globally registered [Bosse and Zlot, 2009b], and some slight registration errors persist from the scan matching process, thereby introducing additional noise. The approach introduced in this paper, however, is not dependent on the scanning modality, and has been successfully applied to 3D point cloud data acquired using other laser configurations as well.

In general, this method of acquiring 3D point clouds can introduce significant noise into the mapping process due to vibration and uncertainty in the pose of the robot. The sampling density is also typically uneven due to varying linear and angular rates of the vehicle, as well as density drop-off with distance for actuated scanners. Pose error can also lead to ghosting and surface diffusion when a feature is composed of multiple overlapping scans. These problems present a difficult challenge when designing a segmentation algorithm. Indeed, many existing segmentation methods are rendered unsuitable for use on such data as they rely too heavily on even and dense sampling in the point cloud; for example, segmenting based on local surface normal and curvature estimates [Huang and Menq, 2001; Rabbania et al., 2006]. In addition, many such methods employ intensive pre-processing steps to clean the data, which can limit their efficiency. Therefore, the challenge we faced in this work was to develop a segmentation method which naturally dealt with the mixed sampling density and noise present in our dataset, yet was efficient enough to bring it within reach of on-line application.

Figure 1: Two representative environments for which segmentations are produced and evaluated in the paper. (a) Woodland Scene. (b) Light-industrial Scene.

We now review some recent related work in the area. Rusu et al. [2008] use outlier removal, robust plane estimation, persistent feature histograms and re-sampling to limit the effects of noise and uneven sampling density. However, their work is mainly concerned with modelling planar surfaces in high resolution scans of indoor environments, and the processing is conceived of as initially taking place offline. Our work forgoes such pre-processing steps in favour of efficiency.

Unnikrishnan and Hebert [2003] use projection pursuit and robust estimators to extract planar structures from point clouds similar to those we examine. Clustering is performed at the voxel level via a mean-shift in parameter space, followed by further mode extraction in projection space to separate aligned but offset planes. By contrast, our shape-distance measure natively separates dominant modes in normal and Euclidean space as regions grow, without sequential processing steps (these measures will be introduced in Section 2.4). Their approach is also less general than ours since it considers non-planar structure to be clutter, where we extract cylindrical and spheroidal structure explicitly.

Vandapel et al. [2004] use neighbourhood covariance and filtering techniques to first classify points and then generate segments. Whilst our approach uses a similar local-shape analysis for the initial step, we do not classify shape, but instead use the descriptors as parameters in one of our distance measures to estimate neighbourhood similarity, thereby guiding the region growing process.

Figure 2: Data is collected from the spinning laser mounted on the roof of a Toyota Prado 4WD.

Stamos and Allen [2002] segment relatively dense point clouds using a local planar similarity constraint with the goal of discovering lines of intersection between planes for image registration purposes. The planarity constraint is similar to one component of our shape-distance measure, and is used to segment smoothly varying surfaces of buildings from the point cloud. The authors fit a plane to each of the produced segments under an assumption of global planarity; however, the planarity constraints only act locally (where ours are global), and the authors do not mention whether global planarity tests are performed. Their paper also does not consider cylindrical or diffuse distributions, which are prominent in natural and outdoor environments.

Melkumyan [2009] uses an efficient surface segmentation method based on region continuity and edge-point detection. The approach requires the scan order of the points to be known, unlike the present method, and it is unclear how effective the technique would be for the spinning laser configuration used in this paper.

The remainder of the paper is organised as follows. In Section 2, we cover our approach to the problem. We first provide background on the ellipsoid modelling, then define the distance measures used, and finally give an overview of the algorithm. In Section 3 we cover the AIC ranking methodology and in Section 4 discuss some results in the context of their AIC ranking and their subjective qualities. Examples illustrate the results where appropriate. Finally, in Section 5 we summarise the work and look at future research directions.

2 Approach

Our method simultaneously segments and models a point cloud through a graph-based ellipsoidal region-growing process. At the heart of the method is a minimum spanning-tree (MST) implementation which grows ellipsoidal segments from initial ellipsoids as the tree expands. A maximum edge-weight condition is imposed to prevent bad merges from occurring and this truncates the tree, leaving the graph (and hence the point cloud) segmented into discrete regions. These regions typically represent objects or sub-components of objects in the environment. The resulting set of ellipsoids is a parameterised model of the point cloud data, and the labelling of points belonging to each ellipsoid is a segmentation of the point cloud. Two methods of edge-weight calculation are presented, and defined as generalised distance measures on the ellipsoids in Section 2.4. These measures lead to different segmentation styles, and varying the maximum edge-weight threshold for each leads to different granularities of segmentation.

Figure 3: Ellipsoid shape-spectrum, coloured by RGB = √[c, s, p]. Spherical shapes are at top, planar shapes lower-left, cylindrical shapes lower-right, and scalene (amorphous) ellipsoids are near the centre.

2.1 Ellipsoids

Ellipsoids are used as the only shape primitive because they naturally model planar, cylindrical, and spheroidal distributions such as walls, tree-trunks and foliage, respectively. Another compelling reason for modelling with ellipsoids is that they can be efficiently generated directly from the moment statistics of the underlying point distribution.

An ellipsoid is defined as the set of points x which fall within the region:

$$(x - \mu)^T \Sigma^{-1} (x - \mu) < 1 \tag{1}$$

where µ is the centroid and Σ is the covariance of the ellipsoid, which is given by

$$\Sigma = \frac{1}{k} \sum_{i=1}^{k} (p_i - \mu)^T (p_i - \mu) \tag{2}$$

where $p_i = (x_i, y_i, z_i)$ is the ith point, and k is the total number of points in the point set. The eigenvectors of Σ give the principal axes of the ellipsoid, and the square-root of the eigenvalues gives the one-sigma axis-lengths. Furthermore, we can classify how planar (p), cylindrical (c), or spherical (s) an ellipsoid is by looking at the relative sizes of the eigenvalues as:

$$p = \frac{2(e_2 - e_3)}{e_1 + e_2 + e_3} \tag{3}$$

$$c = \frac{e_1 - e_2}{e_1 + e_2 + e_3} \tag{4}$$

$$s = 1 - p - c = \frac{3 e_3}{e_1 + e_2 + e_3} \tag{5}$$

where $e_1, e_2, e_3$ are the descending sorted eigenvalues of the ellipsoid’s covariance matrix (equivalent to the squared axis lengths). These shape descriptors were originally developed for the purpose of 3D scan matching [Bosse and Zlot, 2009a], and in this paper we apply them to define a distance measure between two ellipsoids in Section 2.4.
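As a quick illustration of Equations 3 to 5, the following minimal Python sketch evaluates the three descriptors for idealised shapes. The function name `shape_descriptors` and the example eigenvalues are hypothetical, chosen only for illustration; the sketch assumes the eigenvalues are already sorted in descending order and have a positive sum.

```python
def shape_descriptors(e1, e2, e3):
    """Return (p, c, s) for eigenvalues sorted so that e1 >= e2 >= e3 >= 0."""
    total = e1 + e2 + e3
    if total <= 0.0:
        raise ValueError("eigenvalues must have a positive sum")
    p = 2.0 * (e2 - e3) / total   # planarity, Equation 3
    c = (e1 - e2) / total         # cylindricality, Equation 4
    s = 3.0 * e3 / total          # sphericity, Equation 5 (equals 1 - p - c)
    return p, c, s


# Idealised shapes (hypothetical eigenvalues): a flat disc, a thin rod, a ball.
disc = shape_descriptors(1.0, 1.0, 0.0)
rod = shape_descriptors(1.0, 0.0, 0.0)
ball = shape_descriptors(1.0, 1.0, 1.0)
```

The three descriptors always sum to one, so an ellipsoid can be read as a mixture of the three pure shapes.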

The ellipsoid shape spectrum (Figure 3) demonstrates the versatility of using ellipsoids as a model. More complex shapes can be approximated reasonably well by groupings of these shapes, and thus we consider the ellipsoid shape spectrum to be a sufficient basis on which to model the world. A limitation of our approach is that smoothly varying non-planar regions, or curved cylinders, are typically either broken into multiple segments, or approximated with a high variance by a single ellipsoid. Other researchers have used local smoothness criteria to perform the segmentation [Stamos and Allen, 2002; Rabbania et al., 2006] and achieve a low-variance model but at the expense of a high number of parameters.

Two or more ellipsoids can be easily and efficiently merged into a new ellipsoid by combining their moment statistics and re-centralising the covariance about the new centroid. This ability to incrementally combine the model parameters, instead of recomputing them at each merge, is fundamental to the efficiency of our region-growing approach. Also, by simultaneously modeling and segmenting with the same abstraction, we preclude the need for separate segmentation and modeling stages, and thus obtain a further increase in efficiency.
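The merge can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' implementation: the `(n, mean, covariance)` tuple layout and the helper names `moments` and `merge` are assumptions, and the covariance uses the 1/k convention of Equation 2.

```python
def moments(points):
    """Return (n, mean, covariance) of a list of 3D points (1/k convention)."""
    n = len(points)
    mean = [sum(p[i] for p in points) / n for i in range(3)]
    cov = [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / n
            for j in range(3)] for i in range(3)]
    return n, mean, cov


def merge(a, b):
    """Merge two (n, mean, covariance) triples without revisiting the points."""
    na, mu_a, cov_a = a
    nb, mu_b, cov_b = b
    n = na + nb
    # New centroid: count-weighted average of the two centroids.
    mu = [(na * mu_a[i] + nb * mu_b[i]) / n for i in range(3)]
    # Re-centre each covariance about the new centroid before pooling.
    da = [mu_a[i] - mu[i] for i in range(3)]
    db = [mu_b[i] - mu[i] for i in range(3)]
    cov = [[(na * (cov_a[i][j] + da[i] * da[j]) +
             nb * (cov_b[i][j] + db[i] * db[j])) / n
            for j in range(3)] for i in range(3)]
    return n, mu, cov


# Hypothetical point sets: merging the statistics matches recomputing from scratch.
left = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
right = [(4.0, 1.0, 1.0), (5.0, 1.0, 3.0)]
merged = merge(moments(left), moments(right))
direct = moments(left + right)
```

The merge cost is constant per pair of segments, independent of how many points each segment already contains.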

This method of (directly) generating the model parameters is not robust to outliers and these can arbitrarily attract the mean and inflate/skew the principal axes of the covariance, affecting the segmentation process.

We intend to look into using a more robust parameter estimator, such as Minimum Volume Ellipsoid [Rousseeuw and Zomeren, 1990], but this approach would be less efficient to compute, and would require a significant improvement in the results to be justified.

2.2 Initial Processing

The ten nearest neighbours of each point are found using a kd-tree, and we remove isolated points whose closest neighbour distance is above two metres. This reduces the effect of outliers on the segmentation process and also reduces the computation time. Initial ellipsoids are generated about each point from the eigendecomposition of the sample covariance Σ of the local neighbourhood. The union of intersecting local neighbourhoods forms a graph on the point cloud, where each point is a vertex and neighbouring vertices define an edge. Typical scenes contain disjoint graphs where regions are spatially separated. The MST algorithm operates on these graphs.
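The neighbourhood step can be sketched as follows. To keep the example dependency-free it swaps the kd-tree for a brute-force O(n²) search, which gives the same neighbours but would be far too slow for clouds of the size used here; the function names and the four-point cloud are hypothetical, and coordinates are assumed to be in metres.

```python
import math


def k_nearest(points, k=10):
    """For each point, return (distance, index) pairs of its k nearest other points."""
    table = []
    for i, p in enumerate(points):
        ranked = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        table.append(ranked[:k])
    return table


def drop_isolated(points, max_gap=2.0):
    """Remove points whose closest neighbour is further away than max_gap."""
    nearest = k_nearest(points, k=1)
    return [p for p, nn in zip(points, nearest) if nn and nn[0][0] <= max_gap]


def neighbour_edges(points, k=10):
    """Undirected edges (i, j) with i < j linking each point to its k nearest neighbours."""
    edges = set()
    for i, nn in enumerate(k_nearest(points, k)):
        for _, j in nn:
            edges.add((min(i, j), max(i, j)))
    return sorted(edges)


# Hypothetical cloud: three nearby points and one far-away outlier.
cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (50.0, 50.0, 50.0)]
kept = drop_isolated(cloud)
edges = neighbour_edges(kept, k=2)
```

Each remaining point would then receive an initial ellipsoid from the covariance of its neighbourhood, and `edges` provides the graph on which the MST operates.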

Initial edge weights for the MST algorithm are computed based on the Euclidean distance between the centroids of the neighbouring ellipsoids, with additional weight applied to planar ellipsoids depending on their planarity, degree of normal misalignment and planar offset. Thus, lower edge weights correspond to ellipsoids likely to be part of the same region (and vice versa).

2.3 Segmentation and Modelling

The point cloud is segmented and modeled by progressively and selectively merging connected ellipsoids into larger surfaces or regions. We use a thresholded version of Kruskal’s MST algorithm (a greedy region growing algorithm) to perform the segmentation.

Edges are processed in ascending order of initial weight, and ellipsoids are merged only if the generalised distance (edge-weight) between the merge candidates does not exceed a user-defined threshold. Emerging sub-trees of the MST are modelled as ellipsoids resulting from the merging of all the initial ellipsoids at the sub-tree vertices. Hence the algorithm attempts to construct maximally similar regions by only merging sufficiently similar ellipsoids. Enforcing a maximum edge-weight (distance) condition truncates the growth of the minimum spanning tree, and causes the point cloud to remain segmented into separate connected regions.
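The thresholded Kruskal loop described above can be sketched with a union-find structure. This is a minimal sketch under stated assumptions, not the authors' code: the `segment` signature is hypothetical, the segment model is passed in as opaque objects with caller-supplied `distance` and `merge` functions, and the toy example uses one-dimensional `(count, mean)` models in place of ellipsoids.

```python
def segment(num_vertices, edges, models, distance, merge, threshold):
    """Thresholded Kruskal region growing.

    edges  : list of (initial_weight, u, v)
    models : one initial model per vertex
    Returns (labels, segments): a root label per vertex and a dict root -> model.
    """
    parent = list(range(num_vertices))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    segments = dict(enumerate(models))
    for _, u, v in sorted(edges, key=lambda e: e[0]):   # ascending initial weight
        ru, rv = find(u), find(v)
        if ru == rv:
            continue                                    # already in the same sub-tree
        # Merge only if the current segment models are similar enough.
        if distance(segments[ru], segments[rv]) <= threshold:
            parent[rv] = ru
            segments[ru] = merge(segments[ru], segments.pop(rv))
    return [find(i) for i in range(num_vertices)], segments


# Toy 1D example (hypothetical): models are (count, mean) pairs.
values = [0.0, 0.1, 0.2, 5.0, 5.1]
toy_models = [(1, v) for v in values]
toy_edges = [(0.1, 0, 1), (0.1, 1, 2), (4.8, 2, 3), (0.1, 3, 4)]
labels, segments = segment(
    len(values), toy_edges, toy_models,
    distance=lambda a, b: abs(a[1] - b[1]),
    merge=lambda a, b: (a[0] + b[0], (a[0] * a[1] + b[0] * b[1]) / (a[0] + b[0])),
    threshold=1.0,
)
```

Because the distance is re-evaluated on the current segment models rather than on the initial edge weight, the long edge between the two clusters is rejected and two segments remain.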

There are four major design decisions for segmenting a point cloud with an MST: the representation, the distance measure for calculating the initial edge weights, the distance measure for re-calculating the edge weights based on bulk properties (if any), and finally, the distance threshold for adding an edge to the tree (i.e., for merging regions). The first three are structural choices which determine the ‘style’ of segmentation, and the fourth is a user tunable parameter (or set of parameters) which determines the granularity of segmentation. Finding the best set of choices is essentially a multi-dimensional optimisation problem. We use an information theoretic measure to help navigate the solution space as it provides a consistent method of scoring the resulting models.

2.4 Distance Measures

We present and evaluate two different distance measures for comparing two ellipsoids. Each measure leads to qualitatively and quantitatively different segmentations when used in the present algorithm. The first measure, density-distance, sets a threshold on the minimum ‘density’ of the merged ellipsoid and attempts to maintain a good fit to the data. The second measure, shape-distance, utilises relative shape, orientation, position and mass information, and can grow regions across missing or sparse data.

Density Distance

Since we are segmenting a point cloud via an evolving model of the data, we consider that the model should remain faithful to the underlying point distribution by fitting it closely with minimal empty space. Analogous to maintaining a high sampling density in the ellipsoids, density-distance is defined so that ellipsoids are allowed to merge only if the post-merge sample density remains above a threshold.

The density ρ of an ellipsoid is defined as ρ = m/V, where m and V are the mass (a function of the sample count) and volume of the ellipsoid respectively. Uneven sampling density in our data means that setting the mass equal to the sample count does not lead to sensible segmentations. To mitigate this problem we assume that most points lie on a surface, and normalise the surface sample density by setting the mass (m′) of each point to the square of the mean neighbour distance

$$m' = \left( \frac{1}{k} \sum_{i=1}^{k} |p - p_i| \right)^2 \tag{6}$$

where k is the number of neighbours (excluding self), p is the point under consideration, and $p_i$ is the ith neighbouring point.

Then, given two ellipsoids a and b to be merged, the density of the merged ellipsoid $\rho_{ab}$ is given by $\rho_{ab} = (m'_a + m'_b)/V_{ab}$, where $m'_a$ and $m'_b$ are the normalised masses of ellipsoids a and b and $V_{ab}$ is the volume of the merged ellipsoid.
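A minimal sketch of the density computation follows. The text does not state how the ellipsoid volume is evaluated, so the sketch assumes the one-sigma ellipsoid of Equation 1, whose volume is (4/3)π√det(Σ); the function names and the numerical inputs are likewise hypothetical.

```python
import math


def point_mass(p, neighbours):
    """Normalised mass m' of a point: squared mean distance to its neighbours (Equation 6)."""
    mean_dist = sum(math.dist(p, q) for q in neighbours) / len(neighbours)
    return mean_dist ** 2


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def ellipsoid_volume(cov):
    """Assumed volume: one-sigma ellipsoid with semi-axes sqrt(e_i)."""
    return 4.0 / 3.0 * math.pi * math.sqrt(max(det3(cov), 0.0))


def merged_density(mass_a, mass_b, merged_cov):
    """Density of the merged ellipsoid: (m'_a + m'_b) / V_ab."""
    volume = ellipsoid_volume(merged_cov)
    return math.inf if volume == 0.0 else (mass_a + mass_b) / volume


# Hypothetical inputs: a point with neighbours 1 m and 3 m away, and a unit covariance.
mass = point_mass((0.0, 0.0, 0.0), [(1.0, 0.0, 0.0), (3.0, 0.0, 0.0)])
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
rho = merged_density(2.0, 2.0, identity)
```

A merge would then be accepted only if `rho` stays above the chosen density threshold; here the mass of a segment is taken to be the sum of the masses of its points.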

Shape-Distance

This measure adapts the s, p and c shape parameters (defined in Section 2.1) to a three-component distance measure that generates a dissimilarity (distance) between two ellipsoids. This measure was designed to overcome the problems of uneven sampling density and noise by comparing merge candidates based on their dominant distribution characteristics: mass, location, shape, and orientation. It is more general than typical measures for plane extraction since it natively handles comparisons between any ellipsoids and hence between any combination of planar, cylindrical, spherical, and intermediate distributions. By definition, this measure assumes that the segments should be topologically homogeneous (e.g., globally planar segments are locally planar), and thus shape-distance estimates how unlikely it is that two ellipsoids belong to the same region. A threshold on the shape-distance determines which ellipsoids are too dissimilar to be merged.

Planar ellipsoids are given a distance based on how out-of-plane and misaligned they are. Cylindrical ellipsoids have a distance based on how off-axis and misaligned they are. Finally, spherical ellipsoids are given a distance related only to the Euclidean distance between the centroids (since there is no preferred orientation or position). Most ellipsoids have a shape identity somewhere between these extremes, so we weight each of the shape-specific component measures by how planar, cylindrical and spherical the ellipsoids are respectively, and normalise with respect to the relevant eigenvalue. Shape-distance $D_{shape}$ is computed as

$$D_{shape} = D_{planar} + D_{cylindrical} + D_{spherical} \tag{7}$$

where

$$D_{planar} = \frac{\|v_{3a}^T \Delta\mu\|^2}{e_{3a}} \frac{n_a}{n_a + n_b} + \frac{\|v_{3b}^T \Delta\mu\|^2}{e_{3b}} \frac{n_b}{n_a + n_b} + \frac{\left(1 - \|v_{3a}^T v_{3b}\|^2\right) \dfrac{p_a n_a + p_b n_b}{n_a + n_b}}{\dfrac{e_{3a}}{e_{2a}} + \dfrac{e_{3b}}{e_{2b}}}$$

$$D_{cylindrical} = \frac{(1 - p_a) \|v_{2a}^T \Delta\mu\|^2}{e_{2a}} \frac{n_a}{n_a + n_b} + \frac{(1 - p_b) \|v_{2b}^T \Delta\mu\|^2}{e_{2b}} \frac{n_b}{n_a + n_b} + \frac{\left(1 - \|v_{1a}^T v_{1b}\|^2\right) \dfrac{c_a n_a + c_b n_b}{n_a + n_b}}{\dfrac{e_{2a}}{e_{1a}} + \dfrac{e_{2b}}{e_{1b}}}$$

$$D_{spherical} = \frac{s_a \|v_{1a}^T \Delta\mu\|^2}{e_{3a}} \frac{n_a}{n_a + n_b} + \frac{s_b \|v_{1b}^T \Delta\mu\|^2}{e_{3a}} \frac{n_b}{n_a + n_b}$$

where, for distributions a and b, $\Delta\mu$ is the vector between the centroids, $v_{ia}, v_{ib}$ and $e_{ia}, e_{ib}$ are the ith eigenvector and ith eigenvalue of the covariance, $n_a, n_b$ are the sample counts, and $p_{a,b}$, $c_{a,b}$, $s_{a,b}$ are the planarity, cylindricality, and sphericity measures.
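The following Python sketch evaluates the three components for a pair of ellipsoids. It follows one reading of the printed component definitions, in which each orientation term is the misalignment, weighted by the count-averaged shape descriptor, divided by the sum of the two eigenvalue ratios; that reading, the dictionary layout (`mu`, `vals`, `vecs`, `n`) and the example values are all assumptions. Eigenvalues must be strictly positive and sorted in descending order, with unit eigenvectors in matching order.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def descriptors(vals):
    """Planarity, cylindricality and sphericity from Equations 3 to 5."""
    e1, e2, e3 = vals
    total = e1 + e2 + e3
    return 2.0 * (e2 - e3) / total, (e1 - e2) / total, 3.0 * e3 / total


def shape_distance(a, b):
    """Shape-distance between two ellipsoids (assumed reading of Equation 7)."""
    n = a["n"] + b["n"]
    wa, wb = a["n"] / n, b["n"] / n                      # sample-count weights
    d = [y - x for x, y in zip(a["mu"], b["mu"])]        # vector between centroids
    pa, ca, sa = descriptors(a["vals"])
    pb, cb, sb = descriptors(b["vals"])
    e1a, e2a, e3a = a["vals"]
    e1b, e2b, e3b = b["vals"]
    v1a, v2a, v3a = a["vecs"]
    v1b, v2b, v3b = b["vecs"]
    planar = (dot(v3a, d) ** 2 / e3a * wa + dot(v3b, d) ** 2 / e3b * wb
              + (1.0 - dot(v3a, v3b) ** 2) * (pa * wa + pb * wb)
              / (e3a / e2a + e3b / e2b))
    cylindrical = ((1.0 - pa) * dot(v2a, d) ** 2 / e2a * wa
                   + (1.0 - pb) * dot(v2b, d) ** 2 / e2b * wb
                   + (1.0 - dot(v1a, v1b) ** 2) * (ca * wa + cb * wb)
                   / (e2a / e1a + e2b / e1b))
    # Both spherical terms divide by e3a, as printed in the source.
    spherical = (sa * dot(v1a, d) ** 2 / e3a * wa
                 + sb * dot(v1b, d) ** 2 / e3a * wb)
    return planar + cylindrical + spherical


def flat_patch(mu):
    """Hypothetical thin planar ellipsoid lying in the xy-plane."""
    return {"mu": mu, "n": 50, "vals": (1.0, 1.0, 0.01),
            "vecs": ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))}


same = shape_distance(flat_patch((0.0, 0.0, 0.0)), flat_patch((0.0, 0.0, 0.0)))
in_plane = shape_distance(flat_patch((0.0, 0.0, 0.0)), flat_patch((1.0, 0.0, 0.0)))
out_of_plane = shape_distance(flat_patch((0.0, 0.0, 0.0)), flat_patch((0.0, 0.0, 1.0)))
```

Under this reading, two coplanar patches offset within their plane are much closer than two parallel patches offset along the normal, which is the behaviour the text describes for planar ellipsoids.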

3 Evaluation Using AIC

Akaike’s Information Criterion (AIC) is an information theoretic criterion developed by Hirotsugu Akaike [1974] for ranking statistical models based on how they balance bias (the number of model parameters) and variance (modelling error). AIC does not require the existence of ground-truth, and scores candidate models on a relative scale which allows for comparisons between models.


Given a particular data set and a selection of models which attempt to explain the data, AIC assigns a score to each model as

$$\mathrm{AIC} = -2 \log(L) + 2K \tag{8}$$

where log(L) is the log of the maximised value of the likelihood function for the estimated model, and K is the number of free parameters in the model. For an ellipsoid we have K = 9; i.e., 3 for the mean, and 6 for the covariance. The model with the lowest AIC score is the one which best explains the data with a minimum number of free parameters, and may be considered the best model of the data (from the given selection). We use a standard variant of AIC, known as AICc, which corrects for low bias estimates when there are a relatively large number of estimated parameters compared to the number of samples:

$$\mathrm{AICc} = -2 \log(L) + 2K \left( \frac{n}{n - K - 1} \right) \tag{9}$$

where n is the number of samples [Burnham and Anderson, 2001]. AICc converges asymptotically to AIC as n grows. We use AICc since many of the ellipsoids have relatively small sample counts compared to the number of parameters. In practice we see some degenerate ellipsoids with fewer than K + 2 points so, to prevent the bias term from going to infinity or negative, we set a lower limit on the denominator term as max(1, n − K − 1).
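The effect of the clamped denominator is easy to see numerically. The sketch below evaluates only the bias term of Equation 9; the function name `aicc_bias` and the sample counts are hypothetical.

```python
def aicc_bias(n, K=9):
    """Bias term 2K * n / (n - K - 1) of Equation 9, with the denominator clamped to at least 1."""
    return 2.0 * K * n / max(1, n - K - 1)


large_segment = aicc_bias(1000)   # close to the plain AIC penalty 2K = 18
small_segment = aicc_bias(11)     # n - K - 1 = 1, so the penalty is 2K * n
degenerate = aicc_bias(5)         # n - K - 1 < 1, denominator clamped to 1
```

For well-populated ellipsoids the term approaches 2K, while ellipsoids with few points are penalised in proportion to their sample count instead of producing an infinite or negative value.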

AIC must be evaluated with respect to the Maximum Likelihood (ML) model parameters. Since the modelling ellipsoids are effectively multivariate Gaussian distributions, the ML parameters for our model are simply the mean and covariance of the segments, which we have already computed.

Using the formulation given by Bozdogan [1986], the variance term in Equation 9 for a multivariate Gaussian distribution is:

$$-2 \log(L) = n \left( d \log(2\pi) + \log|\Sigma| + d \right) \tag{10}$$

where d = 3 is the number of dimensions. Thus AICc can be written as

$$\mathrm{AICc} = n \left( d \log(2\pi) + \log|\Sigma| + d \right) + 2K \left( \frac{n}{n - K - 1} \right) \tag{11}$$

Many features of interest (such as planes and cylinders) have relatively uniform sample distributions along one or more principal axes, and this invalidates the assumption of normal error distribution. However, these residuals are not of great importance since (for example) we would typically care only about the out-of-plane residuals for planar features, and the off-axis residuals for thin cylindrical features. Hence we modulate the variance along each eigenvector according to the shape of the distribution (again utilising p, c, s). This is achieved by first decomposing Σ into principal components S and V with

$$\Sigma = V S V^T \tag{12}$$

where V is the matrix of eigenvectors of Σ, and S is the diagonal matrix of eigenvalues of Σ (in descending order of size). We then modulate the eigenvalues in S by the shape of the ellipsoid, and impose a lower bound of 0.01 (the square of our measurement noise estimate) on the eigenvalues to prevent degeneracy when p = 1 or c = 1:

$$S_{mod} = \mathrm{diag}\left( e_1 s,\; e_2 (1 - p),\; e_3 \right) \tag{13}$$

where $e_1$, $e_2$ and $e_3$ are the descending sorted eigenvalues, and s = 1 − p − c. The covariance matrix is finally recomposed as

$$\Sigma_{mod} = V S_{mod} V^T \tag{14}$$

To see how the shape modulation works, consider the cases where p = 1, c = 1 and s = 1 (i.e., p = 0, c = 0). For the p = 1 case, only the out-of-plane variance (along $e_3$) is significant; for the c = 1 case the variance has $e_2$ and $e_3$ components, perpendicular to the major axis; and for the s = 1 case all three components are utilised.

To calculate the overall AICc score for a modeled scene, the AICc score for each ellipsoid is computed separately, and these scores are summed. When comparing algorithms, the algorithm which generates the lowest AICc score for a given scene can be considered to have the best trade-off between fit and bias.
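Putting Equations 11 to 14 together, the scene score can be sketched as below. Because V is orthonormal, the determinant of the modulated covariance is the product of the modulated eigenvalues, so only the eigenvalues and the sample count of each ellipsoid are needed. The function names and the example scene are hypothetical, and the sketch assumes the 0.01 floor is applied to each eigenvalue after modulation.

```python
import math

NOISE_FLOOR = 0.01   # lower bound on the modulated eigenvalues
K = 9                # free parameters per ellipsoid: 3 (mean) + 6 (covariance)
D = 3                # number of dimensions


def ellipsoid_aicc(eigenvalues, n):
    """AICc of one ellipsoid from its descending eigenvalues and sample count."""
    e1, e2, e3 = eigenvalues
    total = e1 + e2 + e3
    p = 2.0 * (e2 - e3) / total
    c = (e1 - e2) / total
    s = 1.0 - p - c
    # Equation 13, followed by the assumed per-eigenvalue floor.
    modulated = [max(v, NOISE_FLOOR) for v in (e1 * s, e2 * (1.0 - p), e3)]
    log_det = sum(math.log(v) for v in modulated)            # log|Sigma_mod|
    variance = n * (D * math.log(2.0 * math.pi) + log_det + D)
    bias = 2.0 * K * n / max(1, n - K - 1)
    return variance + bias


def scene_aicc(ellipsoids):
    """Sum of per-ellipsoid AICc scores over (eigenvalues, n) pairs."""
    return sum(ellipsoid_aicc(ev, n) for ev, n in ellipsoids)


# Hypothetical scene: one planar and one roughly cylindrical segment.
scene = [((1.0, 1.0, 0.04), 100), ((1.0, 0.05, 0.04), 40)]
total_score = scene_aicc(scene)
ball_score = ellipsoid_aicc((1.0, 1.0, 1.0), 100)
plane_score = ellipsoid_aicc((1.0, 1.0, 0.04), 100)
```

With the modulation in place, a thin plane scores lower than a ball with the same in-plane extent and sample count, since its in-plane spread no longer counts as modelling error.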

4 Experimental Results

The segmentation technique was applied to point clouds from a number of different environments, including light industrial and off-road lightly wooded scenes (as shown in Figure 6). For each environment, 25 seconds of scan data were segmented with varying distance thresholds, using both distance measures. The AICc score was computed for the resulting models in each case. The optimal threshold for each distance measure (on each scene) was selected as that which minimised the AICc score of the produced model. Some example plots of AICc vs. threshold for the light-industrial scene are shown in Figure 4, where the minima are clearly seen. The optimal threshold selected by AICc for shape-distance appears to be quite stable across different environments, as indicated by the approximately coincident minima in Figure 7. This desirable property may negate any requirement to re-tune the threshold for each local point cloud, although preliminary experiments indicate that different sensor arrangements have different optima. While we have seen similar results with other distance measures, further investigation is required to identify whether this outcome extends to all distance measures in general.

Figure 6: Comparison of segmentations produced by using shape-distance and density-distance at their minimum AICc (optimal) threshold values for two different environments. The left column is an off-road woodland scene, and the right column is a light industrial scene. See Figure 1 for photographs. (a) Point Cloud - Woodland - 115,828 points. (b) Point Cloud - Light Industrial - 150,250 points. (c) Shape Distance - Minimum AICc Segmentation - 12,391 Ellipsoids. (d) Shape Distance - Minimum AICc Segmentation - 7,173 Ellipsoids. (e) Density Distance - Minimum AICc Segmentation - 4,057 Ellipsoids. (f) Density Distance - Minimum AICc Segmentation - 5,061 Ellipsoids.

Plots of AICc vs. merge completion (Figure 5) reveal that AICc minima occur very far into the merge process (generally 80% completion or higher). This reflects AIC’s preference for low complexity models that explain the most data.

Representative Segmentations

Figure 6 shows the minimum AICc threshold segmentations for each distance measure in two representative environments, corresponding to the photos in Figure 1. The left-hand column in Figure 6 shows an off-road woodland scene composed of a ground-plane, cylindrical tree trunks, and diffuse vegetation. The right-hand column shows a light industrial scene composed primarily of planar building facade and road surfaces, as well as some trees on the right hand side of the road. The shape-distance measure shows good generalisation and tends to extract large-scale features; however, this comes at the expense of higher variance and a higher number of ellipsoids, whereas density-distance maintains a tight fit with fewer ellipsoids at the expense of large-scale structure. Density-distance typically produces lower AICc segmentations with fewer ellipsoids as a result of the better fit to the data and the lack of geometric constraints on segment growth. The colouration of the ellipsoids is based on their shape according to Figure 3. Shape-distance leads to intuitively appealing segmentations by emphasising dominant structures, particularly in well-structured environments such as the industrial scene. Density-distance leads to subjective over-segmentation, but gives better data compression by providing a well fitted model with fewer parameters. Thus the choice of measure to use in a given situation depends on whether an accurate model is more important than a good segmentation. Figure 5 compares the measures based on AICc scores versus model simplicity for each scene. The density-distance measure clearly generates the most parsimonious models.

Figure 4: AICc score vs. threshold parameter plots used for selecting optimal threshold values in each case; AICc minima are indicated. Segmentations at the optimal threshold values appear in Figure 6. Note that the measures are more easily compared using the relative merge completion in Figure 5. Both panels plot AICc score (unitless) against threshold value and show the variance term, the bias term, and their sum (AICc = Variance + Bias). (a) AICc vs. Threshold - Shape Distance, Industrial Scene: the minimum AICc occurs at a threshold value near 0.13 (plotted minimum at (0.13162, 8.949e+05)). (b) AICc vs. Threshold - Density Distance, Industrial Scene: the minimum occurs at a threshold value of around 1.7 (plotted minimum at (1.7567, 1.3584e+05)).

Figure 5: Plots of AICc score versus relative merge completion ratio (number of merges/number of points) show that the AICc minima occur very late in the merge process, later in the case of density distance and with a lower AICc score. This effect is due to AIC’s preference for low-complexity models. Both panels plot AICc score (unitless) against relative merge completion for shape distance and density distance. (a) AICc vs. Relative Merge Completion - Shape Distance and Density Distance, Forest Scene (plot title: Light-Industrial Scene). (b) AICc vs. Relative Merge Completion - Shape Distance and Density Distance, Industrial Scene (plot title: Woodland Scene).

Woodland Scene This scene is dominated by a diffuse tree canopy and well-connected ground-plane. Density-distance appears to deal with the diffuse regions better than shape-distance and leads to a simpler model


[Figure 7 plot omitted. Title: AICc vs. Threshold (Shape Distance, All Scenes); x-axis, Threshold Value; y-axis, AICc Score (unitless, ×10^6); legend, Mixed Industrial, Woodland, Roadway, Industrial.]

Figure 7: The optimal threshold parameter for shape-distance selected by AICc is stable across very different types of environments, as evidenced by the similar AICc minima. In addition to the example industrial and woodland scenes, we also include results from a road scene and a more structured industrial scene.

of such regions in general. The ground plane is composed of tessellated ellipsoids due to point-cloud density variations. This effect could be suppressed by re-sampling the point cloud to a more uniform density, but so far we have tried to limit pre-processing of the data as much as possible. Shape-distance performs extremely well on planar surfaces and this is evident as the ground-plane has been extracted as a single segment. Adjacent tree-trunks tend to be merged by shape-distance in this scene (although some of the effects are due to smearing from scan mis-registration). The cylinder merging effect can be reduced (if desired) by increasing the cylindrical offset weighting in the measure (see Equation 8).
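The re-sampling step mentioned above was not applied in our experiments. As an illustration only, one common way to obtain a more uniform density is voxel-grid downsampling, in which all points falling in the same cubic cell are replaced by their centroid. The sketch below assumes points are (x, y, z) tuples in metres; the function name and the `voxel_size` parameter are hypothetical.

```python
import math
from collections import defaultdict


def voxel_downsample(points, voxel_size):
    """Replace the points in each occupied voxel with their centroid."""
    if voxel_size <= 0:
        raise ValueError("voxel_size must be positive")
    buckets = defaultdict(list)
    for p in points:
        # Integer cell index along each axis; floor handles negative coordinates.
        key = tuple(math.floor(c / voxel_size) for c in p)
        buckets[key].append(p)
    resampled = []
    for cell_points in buckets.values():
        n = len(cell_points)
        resampled.append(tuple(sum(axis) / n for axis in zip(*cell_points)))
    return resampled


# Three nearby points collapse to one centroid; the distant point is kept.
cloud = [(0.1, 0.1, 0.1), (0.2, 0.2, 0.2), (0.3, 0.3, 0.3), (5.0, 5.0, 5.0)]
uniform = voxel_downsample(cloud, voxel_size=1.0)
```

The cell size trades density uniformity against detail: cells larger than the finest structure of interest would smooth that structure away before any ellipsoids are fitted.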

Industrial Scene This scene is typical of light industrial building complexes and is dominated by the planar features of the roadway and building walls, with a moderate amount of vegetation in garden areas. Shape-distance generates a very simple model of the dominant features in such environments and even does a fair job of clustering compact canopy structures into spheroidal segments. Many small ellipsoids remain unmerged (not rendered) and the total model complexity remains higher than with density-distance. By contrast, the density-distance measure leads to a better model of the data with fewer parameters, but does not reveal large-scale structure.

Issues Diffuse objects generally have no good model (except globally perhaps) and are arbitrarily segmented based on the emerging shapes of the seed regions. This property makes shape-distance a worse choice than density-distance for scenes composed of many such distributions. Curved objects and surfaces tend to be over-segmented or represented by a single ellipsoid with high variance. Initial ellipsoids are not good models of sharp features and tend toward more spherical shapes that can affect the resulting model in various (detrimental) ways.
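The high-variance effect for curved surfaces can be seen in a small synthetic example. The sketch below fits a single Gaussian (sample mean and covariance) to a flat patch and to a parabolic patch sampled on the same grid. The patches, grid spacing and function name are hypothetical, and the example is axis-aligned so that the z-variance can be read directly as the variance normal to the nominal surface.

```python
def fit_gaussian(points):
    """Return the sample mean and 3x3 sample covariance of 3D points."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean = [sum(p[i] for p in points) / n for i in range(3)]
    cov = [[0.0] * 3 for _ in range(3)]
    for p in points:
        d = [p[i] - mean[i] for i in range(3)]
        for i in range(3):
            for j in range(3):
                cov[i][j] += d[i] * d[j]
    for i in range(3):
        for j in range(3):
            cov[i][j] /= n - 1  # unbiased estimate
    return mean, cov


grid = [i / 10 for i in range(-10, 11)]
flat_patch = [(x, y, 0.0) for x in grid for y in grid]
curved_patch = [(x, y, x * x) for x in grid for y in grid]  # parabolic sheet

_, cov_flat = fit_gaussian(flat_patch)
_, cov_curved = fit_gaussian(curved_patch)
# cov_flat[2][2] is zero, while cov_curved[2][2] is clearly positive:
# one ellipsoid can only absorb the curvature as extra thickness.
```

The flat patch yields a degenerate (planar) ellipsoid, whereas the curved patch produces a noticeably thicker one, which is the single-ellipsoid failure mode described above.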

5 Conclusion and Future Work

In this paper we sought to show that a segmentation/model which was generated by manipulating intermediate models of the data can be evaluated on a consistent basis by evaluating the model (rather than the produced segments). AICc has been demonstrated to be a powerful and consistent method of model selection and parameter tuning. We also presented a novel shape-distance measure which generates a dissimilarity between a pair of ellipsoids in Euclidean space.

The shape-distance measure was shown to give more intuitively pleasing segmentations than density-distance at the cost of comparatively poor model fit and high AICc scores. Density-distance maintains a good fit to the data, but subjectively over-segments. The results illustrate the benefit of using AICc for parameter tuning by selecting sensible threshold values.

The sampling characteristics of our current sensor restrict us to fairly coarse global models at this stage, and we are looking at using a different sensor with higher resolution to help generate finer-grained models with more discriminative power. We will also investigate the effects of additional pre-processing of the point cloud to further remove noise, and perhaps generate more robust initial ellipsoids, before segmenting the data. Post-processing the results might also achieve a more desirable model; for example, by removing small ellipsoids to reduce local over-segmentation.
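As a rough illustration of the post-processing idea, the sketch below drops ellipsoids supported by fewer than a minimum number of points. This is not part of the present implementation; the dictionary layout, the `n_points` field and the `min_points` parameter are assumptions made for the example.

```python
def prune_small_ellipsoids(ellipsoids, min_points):
    """Split ellipsoids into (kept, removed) by their supporting point count."""
    if min_points < 0:
        raise ValueError("min_points must be non-negative")
    kept, removed = [], []
    for ellipsoid in ellipsoids:
        if ellipsoid["n_points"] >= min_points:
            kept.append(ellipsoid)
        else:
            removed.append(ellipsoid)
    return kept, removed


# Hypothetical segmentation result: two well-supported segments and one fragment.
segments = [
    {"id": 0, "n_points": 1200},
    {"id": 1, "n_points": 7},
    {"id": 2, "n_points": 340},
]
kept, removed = prune_small_ellipsoids(segments, min_points=50)
```

Returning the removed ellipsoids as well as the kept ones leaves open whether their points are discarded or reassigned to a neighbouring segment.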

Future work will attempt to further verify the validity of the AICc threshold tuning method theoretically.

One issue that can confound the segmentation algorithm is the interaction of multiple overlapping objects. This problem is particularly apparent in the woodland environments we investigated, where a sparse shrub appears at the base of a tree. In some cases (for instance, where one component is more porous, or one passes through another) additional analysis may be able to recover the independent components, thereby allowing us to separate these objects into multiple overlapping segments (e.g., a cylindrical tree trunk within a spherical shrub). Another issue is that relatively small holes in planar surfaces, such as windows and open doorways, may also be ignored in the model in favour of fitting a larger wall segment to the data. This problem is challenging since, in the case of a hole due to occlusion, it would be preferable to fill in such a gap. This trade-off can be considered when designing an appropriate distance measure, and depends on the relative importance of model compactness and accuracy.

As these algorithms are presently implemented in MATLAB, they are not yet suitable for online application. However, we envision that this goal could be achieved by re-implementation in a more efficient programming language.

Overall, the modeling and segmentation techniques developed appear to form a good basis for future work in object classification and change detection in the context of outdoor autonomous robotics applications. Another potential area of application is in data association for SLAM. The current approach we have taken uses the raw ellipsoids extracted from the point cloud data [Bosse and Zlot, 2009a], and we plan to further investigate whether the merged ellipsoids are sufficiently stable such that they can be used to improve data association efficiency.

References

[Akaike, 1974] H. Akaike. A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6):716–723, December 1974. Original paper published in 1971.

[Bosse and Zlot, 2009a] M. Bosse and R. Zlot. Continuous 3D scan-matching with a spinning 2D laser. In IEEE International Conference on Robotics and Automation, 2009.

[Bosse and Zlot, 2009b] M. Bosse and R. Zlot. Place recognition using regional point descriptors for 3D mapping. In Field and Service Robotics, 2009.

[Bozdogan, 1986] H. Bozdogan. Multi-sample cluster analysis as an alternative to multiple comparison procedures. Bulletin of Informatics and Cybernetics, 22(1):95–130, 1986.

[Burnham and Anderson, 2001] K. P. Burnham and D. R. Anderson. Kullback-Leibler information as a basis for strong inference in ecological studies. Wildlife Research, 28(2):111–119, May 2001.

[Cole et al., 2005] D. Cole, A. Harrison, and P. Newman. Using naturally salient regions for SLAM with 3D laser data. In International Conference on Robotics and Automation, SLAM Workshop, 2005.

[Fruh and Zakhor, 2001] C. Fruh and A. Zakhor. 3D model generation for cities using aerial photographs and ground level laser scans. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, volume 2, 2001.

[Haala and Brenner, 1999] N. Haala and C. Brenner. Extraction of buildings and trees in urban environments. ISPRS Journal of Photogrammetry and Remote Sensing, 54(2-3):130–137, 1999.

[Huang and Menq, 2001] J. Huang and C.-H. Menq. Automatic data segmentation for geometric feature extraction from unorganized 3-D coordinate points. IEEE Transactions on Robotics and Automation, 17(3):268–279, June 2001.

[Klasing et al., 2008] K. Klasing, D. Wollherr, and M. Buss. A clustering method for efficient segmentation of 3D laser data. In IEEE International Conference on Robotics and Automation, 2008.

[Melkumyan, 2009] N. Melkumyan. Surface-based Synthesis of 3D Maps for Outdoor Unstructured Environments. 2009.

[Rabbania et al., 2006] T. Rabbani, F. A. van den Heuvel, and G. Vosselman. Segmentation of point clouds using smoothness constraint. In ISPRS Commission V Symposium ‘Image Engineering and Vision Metrology’, Dresden, Germany, September 2006.

[Rousseeuw and Zomeren, 1990] P. J. Rousseeuw and B. C. van Zomeren. Unmasking multivariate outliers and leverage points. Journal of the American Statistical Association, 85(411):633–639, 1990.

[Rusu et al., 2008] R. B. Rusu, Z. C. Marton, N. Blodow, M. Dolha, and M. Beetz. Towards 3D point cloud based object maps for household environments. Robotics and Autonomous Systems, 56(11):927–941, 2008.

[Slob and Hack, 2004] S. Slob and R. Hack. 3D terrestrial laser scanning as a new field measurement and monitoring technique. Lecture Notes in Earth Sciences, pages 179–189, 2004.

[Stamos and Allen, 2002] I. Stamos and P. K. Allen. Geometry and texture recovery of scenes of large scale. Computer Vision and Image Understanding, 88:94–118, November 2002.

[Thrun et al., 2000] S. Thrun, W. Burgard, and D. Fox. A real-time algorithm for mobile robot mapping with applications to multi-robot and 3D mapping. In IEEE International Conference on Robotics and Automation, 2000.

[Unnikrishnan and Hebert, 2003] R. Unnikrishnan and M. Hebert. Robust extraction of multiple structures from non-uniformly sampled data. In IEEE/RSJ International Conference on Intelligent Robots and Systems, 2003.

[Vandapel et al., 2004] N. Vandapel, D. F. Huber, A. Kapuria, and M. Hebert. Natural terrain classification using 3-D ladar data. In IEEE International Conference on Robotics and Automation, 2004.

