A New Upsampling Method for Mobile LiDAR Data

Ruisheng Wang, Jeff Bach, Jane Macfarlane
NAVTEQ Corporation, 425 W Randolph Street, Chicago, IL, USA
{ruisheng.wang, jeff.bach, jane.macfarlane}@navteq.com

Frank P. Ferrie
Centre for Intelligent Machines, McGill University, Montreal, Quebec, Canada
[email protected]

Abstract

We present a novel method to upsample mobile LiDAR data using panoramic images collected in urban environments. Our method differs from existing methods in the following aspects: first, we consider point visibility with respect to a given viewpoint and use only visible points for interpolation; second, we present a multi-resolution depth-map-based visibility computation method; third, we present ray casting methods for upsampling mobile LiDAR data that incorporate constraints from the color information of spherical images. Experiments show the effectiveness of the proposed approach.

1. Introduction

There are various ways to measure range. For instance, stereo vision aims at creating range maps at the resolution of a camera image from two or more images. This method is known to be quite fragile in practice because of the difficulty of establishing correct correspondences where texture is sparse or occlusions occur. Laser range cameras can generate accurate and dense 3D points reliably, but the resolution of a range image is normally lower than that of a camera image. Because they scan the scene line by line, laser range cameras are also not suited to acquiring dynamic scenes. Newer sensors such as time-of-flight (TOF) cameras can capture a full depth frame at video frame rates; however, their image resolution is low and their noise level is very high [5].

This paper addresses the problem of upsampling LiDAR (Light Detection And Ranging) data using panoramic images, both collected from a moving vehicle. We focus on scanning laser range finders since they are currently the only viable sensors for high-resolution range data acquisition in outdoor environments. Other sensors such as flash LiDARs do not work in bright sunlight or at long distances [7].

Outdoor environments are more challenging than well-controlled laboratory environments. Fig.1 shows a typical scene in an urban environment and its related problems. The first is the see-through problem in the LiDAR data.

Figure 1 See-through problem and invalid LiDAR points returned from the building interior

Fig. 1a and Fig. 1b show the image and the LiDAR data viewed from the same viewpoint, respectively. One can clearly see the side face of the building in the LiDAR data in Fig. 1b, even though it is not visible in the corresponding intensity image in Fig. 1a. The reason is that LiDAR data are sparse, so occluded objects can be seen through the gaps between LiDAR points. Upsampling this type of LiDAR data is problematic for any of the existing upsampling methods. The second problem arises from invalid LiDAR points returned from the building interior. In urban settings, windows without curtains often return signals from the interior. Fig. 1c shows a top-down view of the same building as in Fig. 1b. One can clearly see the erroneous LiDAR points indicated by the red arrows.

This paper proposes a new upsampling method that effectively deals with these problems. The novel contributions of this paper are:

• A new upsampling method using panoramic images incorporating LiDAR point visibility information

• A new method to compute LiDAR point visibility with respect to a given viewpoint by using multi-resolution depth maps generated from a Quadrilateralized Spherical Cube

• A new interpolation scheme that computes interpolated points using Ray-Plane and Ray-Triangle Intersection.

1.1. Related Work

There appears to be relatively little work on using co-registered intensity images to upsample range data, at least in comparison to pure image-based super-resolution, e.g., [9, 10, 11]. One of the first attempts reported was based on Markov Random Fields (MRFs) [1, 2, 3, 4]. A common assumption here is that depth discontinuities in a scene often co-occur with color or brightness changes in the associated camera images. Problems occur when a depth discontinuity is not visible in the color channel. Yang et al. [5] proposed an iterative bilateral filtering method for enhancing the resolution of range images, and compared their approach with MRF, showing that their method allows for sub-pixel accuracy. Andreasson et al. [6] compared five different interpolation schemes with the MRF method [3] and summarized four different metrics for confidence measures of interpolated range data. The assumption in their approach is similar to that of the MRF method, which uses color similarity as an indication of depth similarity. In [8], points are projected onto segmented color images, and bilinear interpolation is used to compute the depth value of the grid samples that belong to the same region. [7] addresses the problem of upsampling range data in dynamic environments based on a Gaussian framework. There is also some work on super-resolution from depth data only; the goal there is to enhance resolution by using depth maps of a static scene acquired from slightly displaced viewpoints [12, 13].

2. Data Acquisition

Data is collected by NAVTEQ using the data collection vehicle shown in Fig. 2. This mobile mapping system is composed of a 360-degree LiDAR sensor (Velodyne HDL-64E), a Ladybug 3 video camera, six high-resolution cameras, a GPS, an Inertial Measurement Unit (IMU), and a Distance Measurement Instrument (DMI). The Velodyne LiDAR sensor consists of 64 lasers mounted on upper and lower blocks of 32 lasers each, and the entire unit spins. This design allows each of the 64 lasers to fire thousands of times per second, generating over one million points per second. The Ladybug 3 covers more than 80 percent of a full sphere with six high-quality 1600x1200 Sony CCD sensors, and provides up to 12 MP images at 15 frames per second (FPS). All of these sensors are geo-referenced through the GPS and IMU.

Figure 2 Data collection vehicle “NAVTEQ True”

3. The Method

This work focuses on man-made, large-scale outdoor environments; the method consists of the following steps. The first is refinement of the registration between the LiDAR points and the panoramic images. Multi-resolution depth maps are then generated from the LiDAR data using a Quadrilateralized Spherical Cube mapping [16]. Based on these, point visibility with respect to a given viewpoint is computed. All visible points are then projected onto the images. Finally, the interpolated points are computed using one of two ray casting methods. Each of these steps is explained in greater detail in the following subsections.

3.1. LiDAR-to-Image Registration

The LiDAR and image data collected from the "NAVTEQ True" platform are co-registered. However, due to system calibration errors and IMU drift, there is often a misalignment between the LiDAR points and the images, which is a common problem in many current mobile mapping systems. Accurate registration between LiDAR points and images is crucial for image-based upsampling methods, so the registration accuracy must be refined. We first convert the geodetic coordinates into Earth-Centered, Earth-Fixed (ECEF) coordinates [14], and then transform them into local tangent plane (LTP) coordinates. The panoramic images are mapped onto a sphere and viewed through a single perspective (the center of the Ladybug camera) to create linear perspective images.

Each LiDAR point $p = (x, y, z)^T$ in LTP coordinates is converted to spherical coordinates (θ, φ) by

$$\theta = \arccos\!\left(\frac{z}{\sqrt{x^2 + y^2 + z^2}}\right), \qquad \varphi = \operatorname{atan2}(y, x), \qquad (1)$$

where θ is the inclination, θ ∈ [0, π], and φ is the azimuth, φ ∈ (−π, π]. Each point's corresponding location (r, c) in the panoramic image is computed by


$$r = \operatorname{int}\!\left[\frac{\theta}{\pi}\, H\right], \qquad c = \operatorname{int}\!\left[\left(\frac{\varphi}{2\pi} + 0.5\right) W\right], \qquad (2)$$

where H and W are the height and width of the panoramic images, respectively. We then project the spherical image and the LiDAR points onto a plane using OpenGL rendering. The Mutual Information (MI) metric [15] is employed to measure the similarity of the range image and the camera image in each iteration of the registration algorithm. We utilize the downhill simplex method to infer the camera pose parameters in six degrees of freedom (6DOF) by maximizing the MI of the range and camera images. Fig. 3 shows an example of this registration: the left picture shows the original panoramic image, and the right shows the LiDAR-to-image registration rendered on a plane.

Figure 3 An example of the LiDAR-to-Image registration. Left: original panoramic image. Right: registration of the LiDAR and panoramic image displayed on a plane
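For concreteness, the point-to-pixel mapping of equations (1) and (2) can be sketched as follows. This is an illustrative fragment, not the authors' implementation: the point is assumed to be expressed relative to the panorama center, and the function and type names are made up for the example.

```cpp
#include <cmath>

struct PixelCoord { int r, c; };

// Map a LiDAR point (x, y, z) in LTP coordinates, expressed relative to the
// panorama center, to a pixel (r, c) of an H x W equirectangular panorama,
// following equations (1) and (2).
PixelCoord projectToPanorama(double x, double y, double z, int H, int W) {
    const double kPi   = 3.14159265358979323846;
    const double range = std::sqrt(x * x + y * y + z * z);
    const double theta = std::acos(z / range);               // inclination in [0, pi]
    const double phi   = std::atan2(y, x);                   // azimuth in (-pi, pi]
    PixelCoord p;
    p.r = static_cast<int>((theta / kPi) * H);               // row from inclination
    p.c = static_cast<int>((phi / (2.0 * kPi) + 0.5) * W);   // column from azimuth
    return p;
}
```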

3.2. Quadrilateralized Spherical Cube

To generate depth maps for the data collected by the rotating 360-degree Velodyne laser, we need a projection that preserves photometric integrity and gives equal "attention" to every direction of view. The desired projection must also not produce singularities at the poles or elsewhere, a problem with alternative mappings (e.g., the cylindrical equal-area projection and the plate carrée projection). The Quadrilateralized Spherical Cube (QSC), developed in [16], is an ideal projection that satisfies these requirements: it is an equal-area projection for mapping data collected on a spherical surface using a curvilinear projection [16].

3.3. Depth Map Generation

We start by dividing the sphere (representing directions of view) into six equal areas that correspond to the faces of an inscribed cube with vertices at |x| = |y| = |z|. Each face of the cube is then further divided into square bins, where the number of bins along each edge is a power of 2. If the level of hierarchical subdivision is N, the number of bins on each face is 2^(2N) and the total number of bins is 6·2^(2N). To generate depth maps at different resolutions, one simply chooses an appropriate value of N to increase or decrease the size of each bin. In this paper, we use N = 8 (256×256 = 65,536 bins per face) for the high-resolution depth map and N = 5 (32×32 bins per face) for the low-resolution one. For simplicity, we give the mapping equations for the case of |z| > |x| > |y|. A point on a cube face, (u, v), can be represented by

$$u = \sqrt{\frac{1 - z/\sqrt{x^2 + y^2 + z^2}}{1 - 1/\sqrt{2 + (y/x)^2}}}, \qquad v = \frac{12}{\pi}\, u \left(\arctan\!\left(\frac{y}{x}\right) - \arcsin\!\left(\frac{y}{\sqrt{2\,(x^2 + y^2)}}\right)\right). \qquad (3)$$

Depth maps are generated at the panoramic image locations. All LiDAR points are converted into local coordinates centered at a panoramic image location and then mapped onto the bins of the six cube faces. If multiple points fall into the same bin, the point with minimum distance to the image location is chosen to represent the depth map. Since the QSC mapping is an equal-area projection, the approximate width w_i of each bin mapped onto a unit sphere can be computed by

$$w_i = \frac{2.0\,\pi}{6 \cdot 2^{N}}. \qquad (4)$$

This bin size is used for adaptive thresholding in later steps.
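The depth-map construction can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: only the +z cube face is handled, points are binned gnomonically (u = x/z, v = y/z) rather than with the equal-area QSC mapping of equation (3), and the LidarPoint type is hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>
#include <vector>

struct LidarPoint { double x, y, z; };   // local coordinates centered at the panorama

// Depth map for the +z cube face at subdivision level N: each of the 2^N x 2^N bins
// keeps the minimum range among the points that fall into it.
std::vector<double> buildFaceDepthMap(const std::vector<LidarPoint>& pts, int N) {
    const int bins = 1 << N;                          // bins per face edge
    std::vector<double> depth(static_cast<size_t>(bins) * bins,
                              std::numeric_limits<double>::infinity());
    for (const LidarPoint& p : pts) {
        if (p.z <= 0.0 || std::fabs(p.x) > p.z || std::fabs(p.y) > p.z)
            continue;                                 // point does not project onto the +z face
        const double u = p.x / p.z, v = p.y / p.z;    // face coordinates in [-1, 1]
        const int col = std::min(bins - 1, static_cast<int>((u + 1.0) * 0.5 * bins));
        const int row = std::min(bins - 1, static_cast<int>((v + 1.0) * 0.5 * bins));
        const double range = std::sqrt(p.x * p.x + p.y * p.y + p.z * p.z);
        double& d = depth[static_cast<size_t>(row) * bins + col];
        if (range < d) d = range;                     // keep the closest return per bin
    }
    return depth;
}
```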

Outlier Removal
Depth maps are assumed to be piecewise smooth, hence outliers need to be removed before proceeding to the visibility computation. The LiDAR points falling into each bin are first transformed into depth values by computing their distance to the camera center, and then sorted in ascending order. Starting with the point associated with the minimum depth value, a distance-difference histogram is computed with respect to all other points in the bin. This histogram is used to gauge the number of proximal points, based on an adaptive threshold that is the product of w_i and the current minimum depth. If the percentage of proximal points is greater than a prescribed threshold (60% in the experiments reported here), then the current minimum depth value is taken as the depth of the current bin. Otherwise the process is repeated with the next depth value on the list until a suitable candidate is found. This serves to filter spurious points arising from signal noise and bleed-through (e.g., through windows).
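A rough sketch of this per-bin selection, with the histogram reduced to a simple proximal-point count and with assumed names, might look like the following; the exact implementation details are not given in the paper.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Per-bin depth selection: the smallest depth value whose fraction of "proximal"
// points (within the adaptive threshold w_i * candidate depth) reaches minFraction.
// Returns a negative value if no candidate qualifies.
double selectBinDepth(std::vector<double> depths, double wi, double minFraction = 0.6) {
    if (depths.empty()) return -1.0;
    std::sort(depths.begin(), depths.end());          // ascending depth values
    for (std::size_t i = 0; i < depths.size(); ++i) {
        const double candidate = depths[i];
        const double threshold = wi * candidate;      // adaptive threshold
        std::size_t proximal = 0;
        for (double d : depths)
            if (std::fabs(d - candidate) <= threshold) ++proximal;
        if (static_cast<double>(proximal) / depths.size() >= minFraction)
            return candidate;                         // accept this depth for the bin
    }
    return -1.0;                                      // bin left without a depth
}
```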


Figure 4 Visibility computation. (a) The viewing direction. (b) Visible points (red) without outlier removal, rendered from another viewpoint. (c) Visible points (red) with outlier removal, rendered from another viewpoint. (d) A top-down view of (c).

3.4. Visibility Computation

One way to compute point visibility is to reconstruct a surface and then determine the visibility of each point based on the reconstructed surface (triangular mesh). There are many surface reconstruction algorithms in the computer graphics and computational geometry communities, e.g., [22]. However, surface reconstruction from noisy point clouds is a more difficult problem that often requires additional information, such as normals and sufficiently dense point input. An alternative is to compute the visibility of point clouds directly, without explicitly reconstructing the underlying surface [23, 24]. Our method belongs to this category.

The purpose of generating depth maps is to compute visibility information for each LiDAR point. As described in the introduction, the "see-through" points and the invalid LiDAR points returned from building interiors make upsampling methods problematic. Our solution is to use only the LiDAR points that are visible with respect to the viewpoint for the upsampling. We use a smoothness assumption, namely that objects are spatially coherent in any local region, to remove such points. The first step is to build a coarse-resolution depth map over the cube faces. This resolution guarantees that the minimum-depth point in each bin comes from a building façade rather than a building interior; in our experiments we found that it effectively eliminates LiDAR points returned from building interiors, so that a correct depth map can be generated. We then build a high-resolution cube map that records all the points falling into each bin. The visibility assumption rests on the observation that points visible in the high-resolution map will lie within a particular threshold of the depth of the corresponding low-resolution bin. Hence visibility is determined implicitly in the high-resolution map by eliminating the points in each bin that exceed a prescribed distance threshold. For the experiments reported in this paper, this threshold is proportional to the real size of each bin in 3D space. (Here we solve a simplified visibility problem suitable for our building models; in future work we will implement a complete algorithm for the visibility computation.)

Fig. 4 shows the results of the visibility computation. Fig. 4a shows the viewing direction on a Ladybug image, and Fig. 4b the visible points in red without outlier removal. The white circles indicate the locations of anomalies (holes) caused by the outliers. This is corrected in Fig. 4c; notice that the side face of the building is invisible, as it should be. Fig. 4d is a top-down view of Fig. 4c, which shows that all the interior points are invisible.
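The coarse-to-fine visibility test described above can be sketched as follows, assuming the low-resolution depth map has already been built; the coarseDepth lookup, the LidarPoint type, and the scale factor k are illustrative assumptions, not names from the authors' code.

```cpp
#include <cmath>
#include <functional>
#include <vector>

struct LidarPoint { double x, y, z; };

// A point is visible if its range does not exceed the depth of its low-resolution
// QSC bin by more than a threshold proportional to the real 3D size of that bin.
// 'coarseDepth' looks up the low-resolution bin depth for a point; 'binWidth3d'
// is the bin's real size in 3D space, used to scale the threshold.
std::vector<bool> computeVisibility(const std::vector<LidarPoint>& pts,
                                    const std::function<double(const LidarPoint&)>& coarseDepth,
                                    double binWidth3d, double k = 1.0) {
    std::vector<bool> visible(pts.size(), false);
    for (std::size_t i = 0; i < pts.size(); ++i) {
        const LidarPoint& p = pts[i];
        const double range = std::sqrt(p.x * p.x + p.y * p.y + p.z * p.z);
        visible[i] = (range - coarseDepth(p)) <= k * binWidth3d;   // keep near-surface points
    }
    return visible;
}
```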

3.5. Interpolation

We use all the points visible with respect to a given viewpoint to interpolate LiDAR points, and present two methods: Ray-Plane and Ray-Triangle Intersection.

Ray-Plane Intersection
To incorporate color information from the camera images, we first process the image with an efficient graph-based segmentation algorithm [17] to identify the main surfaces of the scene. All points visible with respect to the viewpoint are then projected onto the segmented image. We group the projected points by segmented region and trace them back to 3D space. A robust RANSAC plane fitting algorithm [18] is then applied to each region's points to obtain a plane equation. The underlying assumption is that each segmented region corresponds to a plane in 3D space, which is reasonable for man-made objects such as buildings, or walls and hallways in an indoor environment. For each pixel in a segmented region, we compute a 3D ray passing through the camera center; the corresponding 3D point must lie along this ray and is computed as the intersection of the ray with the region's 3D plane, as sketched below. In addition, the point density of each region (number of 3D points / number of pixels) is computed and used to inhibit interpolation for small data samples.
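A minimal ray-plane intersection sketch is given below; the ray direction is assumed to be the back-projected pixel direction in the camera frame, and the names are illustrative rather than taken from the authors' code.

```cpp
#include <cmath>

struct Vec3 {
    double x, y, z;
    Vec3 operator*(double s) const { return {x * s, y * s, z * s}; }
};
inline double dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Intersect a ray from the camera center (the origin) with direction 'dir' against
// the plane n.p + d = 0 fitted by RANSAC. Returns false for rays (nearly) parallel
// to the plane or intersections behind the camera.
bool rayPlaneIntersect(const Vec3& dir, const Vec3& n, double d, Vec3& hit) {
    const double denom = dot(n, dir);
    if (std::fabs(denom) < 1e-9) return false;    // ray parallel to the plane
    const double t = -d / denom;                  // ray origin is (0, 0, 0)
    if (t <= 0.0) return false;                   // plane behind the camera
    hit = dir * t;                                // interpolated 3D point
    return true;
}
```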



Ray-Triangle Intersection
Here we first project all the visible points onto the image and record the pairing between every projected 2D point and its corresponding 3D point. A 2D Delaunay Triangulation (DT) is then constructed from all the projected points. Since the DT fills the convex hull of the points, redundant triangles must be eliminated before further processing. To do so, we compute the average perimeter of all the triangles using their corresponding 3D points instead of the projected 2D points; this avoids perspective distortion, and the resulting average perimeter is perspective invariant and well represents the real geometry. Triangles with perimeters larger than the average are eliminated. For each of the remaining 2D triangles, a bounding box is created, and a fast 2D point-in-triangle test is used to identify the image pixels inside the triangle. Then, for each such pixel, we compute a 3D ray passing through the camera center; the intersections of these rays with the corresponding 3D triangles are the interpolated LiDAR points (a sketch follows).
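One standard way to compute this ray-triangle intersection step is the Möller-Trumbore algorithm; the sketch below is illustrative and not necessarily the authors' implementation.

```cpp
struct Vec3 { double x, y, z; };
static Vec3 sub(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
static double dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Moller-Trumbore: intersect a ray from the camera center (origin) with direction
// 'dir' against triangle (v0, v1, v2); on success 'hit' is the interpolated 3D point.
bool rayTriangleIntersect(const Vec3& dir, const Vec3& v0, const Vec3& v1,
                          const Vec3& v2, Vec3& hit) {
    const Vec3 e1 = sub(v1, v0), e2 = sub(v2, v0);
    const Vec3 pvec = cross(dir, e2);
    const double det = dot(e1, pvec);
    if (det > -1e-12 && det < 1e-12) return false;   // ray parallel to triangle
    const double inv = 1.0 / det;
    const Vec3 tvec = {-v0.x, -v0.y, -v0.z};         // origin - v0, origin = (0, 0, 0)
    const double u = dot(tvec, pvec) * inv;
    if (u < 0.0 || u > 1.0) return false;
    const Vec3 qvec = cross(tvec, e1);
    const double v = dot(dir, qvec) * inv;
    if (v < 0.0 || u + v > 1.0) return false;
    const double t = dot(e2, qvec) * inv;
    if (t <= 0.0) return false;                      // intersection behind the camera
    hit = {dir.x * t, dir.y * t, dir.z * t};
    return true;
}
```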

4. Experiments and Discussion

4.1. Implementation and Experimental Setup

The implementation is in C++. All the LiDAR data and intensity images are in a binary format and are loaded using internal libraries. We use an efficient implementation of RANSAC from the Mobile Robot Programming Toolkit [19] and the GNU Scientific Library [20] for linear-algebra calculations. We use the Computational Geometry Algorithms Library (CGAL) [21] for the Delaunay Triangulation, and the code from [17] for the graph-based segmentation.

We evaluated our algorithm on four data sets. In Fig. 5, Data 1-3 are typical buildings in urban settings and Data 4 is a street scene. The original LiDAR data are subsampled either to 1.56% of the original set, by using the points from a single laser instead of the full 64-laser array (1/64 ≈ 1.56%), or to 10%, chosen so that the sparse LiDAR points still represent a certain level of detail of the objects. Another benefit of using the data from a single laser is that it avoids the intra-calibration errors of the Velodyne HDL-64E sensor. The original LiDAR data, excluding the subsampled points, are used as ground truth. The first row of Fig. 5a-5c shows the original LiDAR data, in which one can clearly see the "see-through" points. The second row shows the corresponding Ladybug images. Note that since the data are collected from a ground-based acquisition system, the viewing directions have to point from bottom to top to render a complete façade image.

The interpolated points may be good estimates of the actual LiDAR data or may deviate substantially from the true values, so a confidence measure for the correctness of the interpolation is desired. To evaluate the results quantitatively, we build a KD tree over the original LiDAR data, excluding the subsampled points. For each interpolated point, we query its nearest neighbor in the KD tree and compute the distance between them. We use proximity to the nearest LiDAR point as a confidence measure: if an interpolated point is close to a point where a real LiDAR measurement is available, we consider the interpolation more reliable.
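A brute-force version of this confidence measure is sketched below; the paper uses a KD tree for the nearest-neighbor query, and the linear scan here only keeps the example short.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>
#include <vector>

struct Vec3 { double x, y, z; };

// Confidence of each interpolated point = distance to its nearest ground-truth
// LiDAR point (smaller is better). A KD tree would replace the inner loop.
std::vector<double> nearestNeighborDistances(const std::vector<Vec3>& interpolated,
                                             const std::vector<Vec3>& groundTruth) {
    std::vector<double> conf(interpolated.size(), std::numeric_limits<double>::infinity());
    for (std::size_t i = 0; i < interpolated.size(); ++i) {
        for (const Vec3& g : groundTruth) {
            const double dx = interpolated[i].x - g.x;
            const double dy = interpolated[i].y - g.y;
            const double dz = interpolated[i].z - g.z;
            conf[i] = std::min(conf[i], std::sqrt(dx * dx + dy * dy + dz * dz));
        }
    }
    return conf;
}
```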

4.2. Experimental Results

Fig. 6 shows the experimental results. Fig. 6a shows the subsampled LiDAR points and the segmentation of the corresponding Ladybug images. Fig. 6b shows the interpolated results, Fig. 6c the visualization of the confidence measure, and Fig. 6d the synthesized views. Rows 1a, 2a, and 3a show the results of Ray-Plane Intersection (RP), and rows 1b, 2b, 3b, and 4 show the results of Ray-Triangle Intersection (RT). The RP interpolation uses color information from the segmented Ladybug images, while the RT interpolation does not. The results in Fig. 6 show that accurate geometric information is preserved in the points interpolated with the RP method; this is evident at the building corner in Fig. 6b (1a) and the flag pole on top of the building in Fig. 6b (3a). Some windows leave holes in the interpolated points, since the interpolation is constrained by the point density. The point density in window areas is tricky: windows with curtains return LiDAR points, while those without curtains often leave holes on the façade. The point density is also related to the segmentation results. Without the use of color information, the interpolation is constrained only by the perimeter of the triangles, and the areas of the building corner in Fig. 6b (1b) and the flag pole in Fig. 6b (3b) are erroneous. Fig. 6 (4) shows the interpolation of the street-scene data using the RT method.


Figure 5 Data sets. (a) Data 1. (b) Data 2. (c) Data 3. (d) Data 4

Figure 6 Experimental results. (a) Subsampled LiDAR data or segmentation of the corresponding Ladybug images. (b) Interpolated LiDAR points and corresponding close-ups of the scene. (c) Visualization of the confidence measure. (d) Synthesized views


Table 1 Results from the experiments. m: mean error (m); σ: standard deviation (m); the last four columns are the outlier ratios at distance thresholds of 0.2, 0.4, 0.6, and 0.8 m.

Subsampled data    Method   m       σ       0.2     0.4     0.6     0.8
1.56% of Data 1    RP       0.085   0.08    0.12    0.005   0.0001  0.0000
                   RT       0.089   0.71    0.09    0.03    0.02    0.01
1.56% of Data 2    RP       0.085   0.09    0.1     0.01    0.002   0.001
                   RT       0.13    3.59    0.19    0.07    0.02    0.004
10% of Data 3      RP       0.096   0.13    0.1     0.03    0.01    0.008
                   RT       0.101   0.25    0.12    0.03    0.01    0.008
10% of Data 4      RT       0.035   0.067   0.013   0.004   0.002   0.0009

For the confidence measure in Fig. 6c, dark points indicate interpolations that are close to the real LiDAR data, while bright points indicate interpolations that deviate from the actual data. Most deviations occur around window areas, where the interpolated points fill in windows that are mostly holes in the real LiDAR data. For all the interpolated points, a mean m and a standard deviation σ of the Euclidean distances between each interpolated point and its closest point in the original data are computed to measure the quality of the interpolation; smaller values indicate better interpolations. The ratios of outliers for different distance thresholds (0.2, 0.4, 0.6, and 0.8 meters) are also computed and listed in Table 1. In general, the RP method performs better than the RT method in scenes containing many planar structures, with smaller means and standard deviations.
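Given the per-point nearest-neighbor distances from Section 4.1, the quantities reported in Table 1 reduce to simple statistics; a sketch follows, with thresholds assumed to be in meters.

```cpp
#include <cmath>
#include <vector>

struct InterpolationStats { double mean, stddev, ratio02, ratio04, ratio06, ratio08; };

// Aggregate nearest-neighbor distances into the Table 1 statistics: mean m, standard
// deviation sigma, and the fraction of points farther than each distance threshold.
InterpolationStats summarize(const std::vector<double>& dist) {
    InterpolationStats s{0, 0, 0, 0, 0, 0};
    if (dist.empty()) return s;
    const double n = static_cast<double>(dist.size());
    for (double d : dist) s.mean += d;
    s.mean /= n;
    for (double d : dist) s.stddev += (d - s.mean) * (d - s.mean);
    s.stddev = std::sqrt(s.stddev / n);
    for (double d : dist) {
        if (d > 0.2) s.ratio02 += 1.0 / n;   // outlier ratio at 0.2 m
        if (d > 0.4) s.ratio04 += 1.0 / n;
        if (d > 0.6) s.ratio06 += 1.0 / n;
        if (d > 0.8) s.ratio08 += 1.0 / n;
    }
    return s;
}
```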

Figure 7 Gaps on the street in the textured interpolated points

4.3. Limitations

There are limitations in the current implementation of the approach.

Planar Structure Assumption
It is hard to have a flexible model with many degrees of freedom that is completely general. In the Ray-Plane Intersection method, we employ a planar structure assumption: each segmented region corresponds to a planar structure in 3D space. Since most urban objects, such as buildings, roads, and signs, are locally planar, this assumption is reasonable for urban environments. For buildings with complicated shapes, the planar structure assumption can still serve as a first-order approximation of an arbitrary surface. The RT method uses no such assumption and no color information; it achieves better interpolation accuracy when the input consists of relatively dense LiDAR points, and works better than the RP method when the planar structure assumption is violated.

Gaps in the Depth Map
Depth maps are generated in this paper using a Quadrilateralized Spherical Cube mapping that projects all LiDAR points onto six cube faces. Because the data are collected at street level, this mapping leads to gaps in horizontal objects such as roads in the depth map: the camera, which is the projection center in the depth map generation, sits only a couple of meters above the ground. The visibility computation is also based on this representation. Fig. 7 shows the textured interpolated points obtained with the RT method; note that the dark gaps along the street are curved strips because of the curvilinear projection, and would be straight strips under a perspective projection. Our method also requires an accurate registration between LiDAR and images; misalignment leads to artifacts and errors in the interpolated points.

5. Conclusions and Future Work

We present a new upsampling method for mobile LiDAR data. The input is a set of sparse LiDAR points and their corresponding spherical images, and the output is a dense point set. Although the algorithm is designed for mobile LiDAR data, it is sufficiently general to apply to more generic range data with similar constraints. The major contributions of this paper are: 1) the incorporation of visibility information and intensity data (where available) into the upsampling procedure, 2) the use of multi-resolution depth maps generated from a Quadrilateralized Spherical Cube mapping to compute point visibility, and 3) a comparison of two interpolation methods for upsampling sparse LiDAR points.


Our method is based on a deterministic formulation of the problem, and there are certainly aspects of the algorithm where improvements can be made. First, we solve the outlier-removal problem in the depth map with a simple model; ideally, the algorithm should intelligently differentiate foreground from background and identify whether points are outliers or foreground using statistical modeling, which would further improve the visibility computation. Second, our interpolation strategy is a uniform sampling in 2D image space but a non-uniform one in 3D space. We could improve the current results by sampling uniformly in 3D space: project each relevant region boundary into 3D space, form a 3D bounding box for each projected boundary, and then sample uniformly at an appropriate 3D resolution. Third, we can relax the planar structure restriction and assume that each segmented region corresponds to a smooth 3D surface, which should make the RP method applicable to more situations where the planar assumption is violated. We also plan to incorporate color information into the RT method: the idea is to use the segmentation boundaries to build a constrained 2D Delaunay Triangulation on the projected 2D LiDAR points. Upsampling based on a constrained Delaunay Triangulation would effectively remove erroneous interpolations such as those shown in Fig. 6b (1b) and Fig. 6b (3b). We hope that this will increase the accuracy of the RT method and eventually lead to more accurate building model generation.

Acknowledgement

The first author would like to thank Roman Ostrovskiy for his help and discussions.

References

[1] L. A. Torres-Méndez and G. Dudek. Range synthesis for 3D environment modeling. In IEEE Workshop on Applications of Computer Vision, pp. 231-236, Orlando, FL, USA, 2002.

[2] L. A. Torres-Méndez and G. Dudek. Reconstruction of 3D models from intensity images and partial depth. In Proceedings of the American Association for Artificial Intelligence (AAAI), 2004, pp. 476-481.

[3] J. Diebel and S. Thrun. An application of markov random fields to range sensing. In Proceedings of Conference on Neural Information Processing Systems (NIPS), Cambridge, MA, 2005. MIT Press.

[4] Zhaoyin Jia, Yao-Jen Chang, Tzung-Han Lin, and Tsuhan Chen, "Dense 3D-Point Estimation Based on Surface Fitting and Color Information," 2009 Western New York Image Processing Workshop, Henrietta, NY, USA, Sept. 25, 2009.

[5] Q. Yang, R. Yang, J. Davis, and D. Nistér. Spatial-depth super resolution for range images. In CVPR, 2007.

[6] H. Andreasson, R. Triebel, and A. J. Lilienthal. Noniterative Vision-based Interpolation of 3D Laser Scans, volume 76 of Studies in Computational Intelligence, pages 83–90. Springer, Germany, Aug 14 2007.

[7] J. Dolson, J. Baek, C. Plagemann, and S. Thrun. Upsampling range data in dynamic environments. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010.

[8] V. Garro, C. Dal Mutto, P. Zanuttigh, and G. M. Cortelazzo. A novel interpolation scheme for range data with side information. In Proc. of CVMP Conf., London, UK, November

[9] S. Farsiu, M. Robinson, M. Elad, and P. Milanfar. Fast and robust multiframe super resolution. IEEE Transactions on Image Processing, 13(10):1327–1344, Oct. 2004.

[10] S. Borman and R. L. Stevenson. Super-resolution from image sequences - a review. Proc. Midwest Symp. Circuits and Systems, 5, 1998.

[11] M. Irani and S. Peleg. Improving resolution by image registration. CVGIP: Graph. Models Image Process., 53(3):231-239, 1991.

[12] S. Schuon, C. Theobalt, J. Davis, and S. Thrun. LidarBoost: Depth superresolution for ToF 3D shape scanning. IEEE Conference on Computer Vision and Pattern Recognition, 2009.

[13] Y. Kil, B. Mederos, and N. Amenta. Laser scanner super-resolution. Eurographics Symposium on Point-Based Graphics, 2006.

[14] http://en.wikipedia.org/wiki/Geodetic_system
[15] P. Viola and W. Wells III. Alignment by maximization of mutual information. In Proceedings of IEEE International Conference on Computer Vision, pages 16-23, 1995.

[16] F. K. Chan and E. M. O'Neill. Feasibility Study of a Quadrilateralized Spherical Cube Earth Data Base. Computer Sciences Corp., EPRF Tech. Report 2-75, 1975. Prepared for the Environmental Prediction Research Facility, Monterey, California.

[17] P. Felzenszwalb and D. Huttenlocher. Efficient Graph-Based Image Segmentation. International Journal of Computer Vision, Vol. 59, No. 2, pages 167-181, September 2004.

[18] Martin A. Fischler and Robert C. Bolles 1981. "Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography". Comm. of the ACM 24: 381–395.

[19] http://www.mrpt.org/
[20] http://www.gnu.org/software/gsl/
[21] http://www.cgal.org/
[22] B. Curless and M. Levoy. A volumetric method for building complex models from range images. SIGGRAPH '96, pp. 303-312.

[23] S. Katz, A. Tal, and R. Basri. Direct visibility of point sets. ACM Trans. Graph., vol. 26, July 2007.

[24] R. Mehra, P. Tripathi, A. Sheffer, and N. J. Mitra. Visibility of noisy point cloud data. Comput. Graph., vol. 34, pp. 219-230, June 2010.