RGB-D Mapping: Using Depth Cameras for Dense 3D Modeling of Indoor Environments


Transcript of RGB-D Mapping: Using Depth Cameras for Dense 3D Modeling of Indoor Environments

1

RGB-D Mapping: Using Depth Cameras for Dense 3D Modeling of Indoor Environments

Peter Henry1, Michael Krainin1, Evan Herbst1, Xiaofeng Ren2, and Dieter Fox1,2

1 University of Washington Computer Science and Engineering
2 Intel Labs Seattle

2

RGB-D Data

3

4

Related Work

SLAM:
- [Davison et al., PAMI 2007] (monocular)
- [Konolige et al., IJR 2010] (stereo)
- [Pollefeys et al., IJCV 2007] (multi-view stereo)
- [Borrmann et al., RAS 2008] (3D laser)
- [May et al., JFR 2009] (ToF sensor)

Loop closure detection:
- [Nister et al., CVPR 2006]
- [Paul et al., ICRA 2010]

Photo collections:
- [Snavely et al., SIGGRAPH 2006]
- [Furukawa et al., ICCV 2009]

5

System Overview

1. Frame-to-frame alignment
2. Loop closure detection and global consistency
3. Map representation

6

Alignment (3D SIFT RANSAC)

SIFT features (from image) in 3D (from depth)
RANSAC alignment requires no initial estimate
Requires sufficient matched features
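The RANSAC stage can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, inlier threshold, and iteration count are invented for the example, and the matched 3D SIFT keypoints are assumed to be given as two corresponding point arrays.

```python
import numpy as np

def rigid_transform(src, dst):
    """Least-squares rigid transform (R, t) mapping src -> dst (Kabsch method)."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:        # avoid a reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cd - R @ cs

def ransac_align(src, dst, iters=500, thresh=0.03, rng=None):
    """Align matched 3D feature points src[i] <-> dst[i] with RANSAC.

    No initial estimate is needed: each iteration fits a rigid transform
    to a minimal sample of 3 matches and counts inliers.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(src), size=3, replace=False)
        R, t = rigid_transform(src[idx], dst[idx])
        err = np.linalg.norm(src @ R.T + t - dst, axis=1)
        inliers = err < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # refit on all inliers of the best model for the final estimate
    return rigid_transform(src[best_inliers], dst[best_inliers]), best_inliers
```

Note the failure mode this makes concrete: with too few correct matches, no minimal sample yields a consensus set, which is exactly the RANSAC failure case shown next.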

7

RANSAC Failure Case

8

Alignment (ICP)

Does not need visual features or color
Requires sufficient geometry and initial estimate
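A minimal point-to-point ICP loop makes the requirement for an initial estimate concrete: matching is by nearest neighbor, so a poor starting pose gives wrong correspondences. (The paper's dense term is point-to-plane; point-to-point is used here only for brevity, and all names and parameters are illustrative.)

```python
import numpy as np
from scipy.spatial import cKDTree

def icp(src, dst, R0=np.eye(3), t0=np.zeros(3), iters=30):
    """Basic point-to-point ICP: refine an initial guess (R0, t0) mapping src
    toward dst by alternating nearest-neighbor matching and Kabsch fits."""
    tree = cKDTree(dst)
    R, t = R0.copy(), t0.copy()
    for _ in range(iters):
        moved = src @ R.T + t
        _, nn = tree.query(moved)          # closest dst point for each src point
        matched = dst[nn]
        # rigid fit between the current correspondences
        cs, cd = src.mean(axis=0), matched.mean(axis=0)
        H = (src - cs).T @ (matched - cd)
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:           # avoid a reflection
            Vt[-1] *= -1
            R = Vt.T @ U.T
        t = cd - R @ cs
    return R, t
```

When the initial pose is close, the nearest-neighbor matches are correct and the loop converges quickly; when it is far off (or the geometry is degenerate, e.g. a flat wall), it locks onto wrong correspondences, which is the ICP failure case shown next.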

9

ICP Failure Case

10

Joint Optimization (RGBD-ICP)

Fixed feature matches (from the RANSAC stage)
Dense point-to-plane term (ICP)
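The joint objective combines the two terms as a weighted sum. The form below is a sketch in notation of our own choosing (A_f the fixed feature associations, A_d the dense associations, n the target surface normal, alpha the mixing weight that appears in the experiments); consult the paper for the exact formulation.

```latex
T^{*} = \operatorname*{argmin}_{T}\;
  \alpha \, \frac{1}{|A_f|} \sum_{i \in A_f}
      \bigl\| T(f_i^{s}) - f_i^{t} \bigr\|^{2}
  \;+\; (1-\alpha) \, \frac{1}{|A_d|} \sum_{j \in A_d}
      \bigl| \bigl( T(p_j^{s}) - p_j^{t} \bigr) \cdot n_j^{t} \bigr|^{2}
```

With alpha = 1 this reduces to the feature-only (RANSAC) alignment, with alpha = 0 to pure point-to-plane ICP; intermediate values blend the two, matching the alpha sweep in Experiment 1.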

11

Loop Closure

Sequential alignments accumulate error
Revisiting a previous location results in an inconsistent map

12

Loop Closure Detection

Detect by running RANSAC against previous frames
Pre-filter options (for efficiency):
- Only a subset of frames (keyframes)
  - Every nth frame
  - New keyframe when RANSAC fails directly against previous keyframe
- Only keyframes with similar estimated 3D pose
- Place recognition using vocabulary tree
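The pose-similarity pre-filter can be sketched as below: before running the expensive RANSAC check, discard keyframes whose estimated pose is far from the current one. The function name and both thresholds are invented for this example.

```python
import numpy as np

def candidate_keyframes(keyframes, pose, max_dist=3.0, max_angle=np.pi / 3):
    """Keep only keyframes whose estimated 3D pose is near the current one.

    keyframes: list of (R, t) world-from-camera pose estimates
    pose:      current (R, t) estimate
    Returns indices of keyframes worth testing with RANSAC.
    """
    R, t = pose
    out = []
    for k, (Rk, tk) in enumerate(keyframes):
        if np.linalg.norm(t - tk) > max_dist:
            continue
        # relative rotation angle recovered from the trace of R^T Rk
        cos_a = (np.trace(R.T @ Rk) - 1.0) / 2.0
        if np.arccos(np.clip(cos_a, -1.0, 1.0)) > max_angle:
            continue
        out.append(k)
    return out
```

This filter only works once pose estimates are roughly right, which is why it is listed as one option among several (every nth frame, vocabulary-tree place recognition) rather than the sole strategy.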

13

Loop Closure Correction

TORO [Grisetti 2007]:
- Constraints between camera locations in pose graph
- Maximum likelihood global camera poses
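TORO itself optimizes the pose graph with stochastic gradient descent on a tree parameterization; the sketch below instead uses a plain least-squares solve on a translation-only 2D graph, purely to illustrate the idea of constraints between camera locations being reconciled into globally consistent poses.

```python
import numpy as np
from scipy.optimize import least_squares

def optimize_pose_graph(n, edges):
    """Toy translation-only 2D pose graph.

    n:     number of poses x_0 .. x_{n-1} (x_0 is anchored at the origin)
    edges: list of (i, j, z) constraints meaning x_j - x_i should equal z
           (sequential odometry edges plus loop-closure edges)
    """
    def residuals(flat):
        x = np.vstack([np.zeros(2), flat.reshape(-1, 2)])   # prepend anchor
        return np.concatenate([x[j] - x[i] - z for i, j, z in edges])

    sol = least_squares(residuals, np.zeros(2 * (n - 1)))
    return np.vstack([np.zeros(2), sol.x.reshape(-1, 2)])
```

When odometry and loop-closure constraints disagree, the solver spreads the accumulated error over the whole trajectory instead of leaving it all at the revisited location, removing the map inconsistency described above.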

14

16

Map Representation: Surfels

Surface Elements [Pfister 2000, Weise 2009, Krainin 2010]

Circular surface patches
Accumulate color / orientation / size information
Incremental, independent updates
Incorporate occlusion reasoning
750 million points reduced to 9 million surfels
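A minimal surfel record shows how measurements accumulate incrementally and independently. This is a simplified sketch, not the cited implementations: the running-mean update and the keep-the-finest-radius rule are illustrative choices, and occlusion reasoning is omitted.

```python
import numpy as np

class Surfel:
    """Circular surface patch: position, normal, color, radius, and a
    confidence weight; each new measurement is folded into a running mean."""

    def __init__(self, pos, normal, color, radius):
        self.pos = np.asarray(pos, dtype=float)
        self.normal = np.asarray(normal, dtype=float)
        self.color = np.asarray(color, dtype=float)
        self.radius = float(radius)
        self.weight = 1.0

    def update(self, pos, normal, color, radius):
        """Fold one new observation into this surfel (independent of others)."""
        w = self.weight
        self.pos = (w * self.pos + pos) / (w + 1.0)
        n = w * self.normal + normal
        self.normal = n / np.linalg.norm(n)               # keep unit length
        self.color = (w * self.color + color) / (w + 1.0)
        self.radius = min(self.radius, radius)            # keep finest observed size
        self.weight = w + 1.0
```

Because each surfel summarizes many raw measurements in a handful of numbers, the map shrinks dramatically (750 million points to 9 million surfels in the slide above).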

17

18

Experiment 1: RGBD-ICP

16 locations marked in environment (~7 m apart)
Camera on tripod carried between and placed at markers
Error between ground-truth distance and the accumulated distance from frame-to-frame alignment

alpha weight   | 0.0 (ICP) | 0.2  | 0.5  | 0.8  | 1.0 (RANSAC)
Mean error (m) | 0.22      | 0.19 | 0.16 | 0.18 | 1.29

19

Experiment 2: Robot Arm

Camera held in WAM arm for ground truth pose

Method               | Mean Translational Error (m) | Mean Angular Error (deg)
RANSAC               | 0.168                        | 4.5
ICP                  | 0.144                        | 3.9
RGBD-ICP (alpha=0.5) | 0.123                        | 3.3
RGBD-ICP + TORO      | 0.069                        | 2.8

20

Map Accuracy: Architectural Drawing

21

Map Accuracy: 2D Laser Map

22

Conclusion

Kinect-style depth cameras have recently become available as consumer products
RGB-D Mapping can generate rich 3D maps using these cameras
RGBD-ICP combines visual and shape information for robust frame-to-frame alignment
Global consistency achieved via loop closure detection and optimization (RANSAC, TORO, SBA)
Surfels provide a compact map representation

23

Open Questions

Which are the best features to use?
How to find more loop closure constraints between frames?
What is the right representation (point clouds, surfels, meshes, volumetric, geometric primitives, objects)?
How to generate increasingly photorealistic maps?
Autonomous exploration for map completeness?
Can we use these rich 3D maps for semantic mapping?

24

Thank You!