Multistage SFM : Revisiting Incremental Structure from Motion
Rajvi Shah1, Aditya Deshpande1,2, and P J Narayanan1
Performance Evaluation
[Figure: Multistage SFM pipeline. 1. Coarse Model Reconstruction (top η% SIFTs): Coarse Reconstruction. 2. Add Cameras: K-cover Model Computation, Mean SIFT Computation, Direct 3D-2D Localization. 3. Add Points: Find Candidate Image Pairs, Guided Feature Matching, Triangulation & Merging. Full Reconstruction (remaining SIFTs).]
Goal: Estimate the pose of the cameras left unlocalized by the coarse reconstruction stage.
Cover-set: Compute a subset of 3D points that covers the cameras, and compute the mean SIFT of each selected point.
Create a Kd-tree of the image SIFTs and search it for the mean SIFT of each 3D point, one by one.
Pose-estimate the camera using the resulting 2D-3D matches.
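The cover-set and mean-SIFT steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a greedy selection rule, a coverage target `k` (the poster does not give a value), and a visibility map from point ids to camera ids. The function names are hypothetical.

```python
def greedy_k_cover(point_visibility, k):
    """Greedily pick 3D points until every camera sees at least k picked points.

    point_visibility: dict mapping point id -> set of camera ids that see it.
    Greedy selection and the value of k are assumptions made for this sketch.
    """
    need = {cam: k for cams in point_visibility.values() for cam in cams}
    remaining = dict(point_visibility)
    chosen = []

    def gain(point):
        # Number of still-uncovered cameras this point would help cover.
        return sum(1 for cam in remaining[point] if need[cam] > 0)

    while remaining and any(n > 0 for n in need.values()):
        # sorted() makes tie-breaking deterministic (smallest id wins).
        best = max(sorted(remaining), key=gain)
        if gain(best) == 0:
            break  # no remaining point improves coverage
        for cam in remaining.pop(best):
            if need[cam] > 0:
                need[cam] -= 1
        chosen.append(best)
    return chosen


def mean_descriptor(descriptors):
    """Element-wise mean of the descriptors observed for one 3D point."""
    n = len(descriptors)
    return [sum(column) / n for column in zip(*descriptors)]


visibility = {"a": {0, 1, 2}, "b": {0, 1}, "c": {2, 3}, "d": {3}}
cover = greedy_k_cover(visibility, k=1)
```

The mean descriptors of the selected points would then be queried against a Kd-tree of each unlocalized image's SIFTs; that search and the pose estimation are omitted here.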
Properties of Coarse Model
Fast but contains fewer points (Coarse).
Most cameras are localized (Global).
Stable due to incremental Bundle Adjustment.
Point co-visibility among cameras is known.
Epipolar geometry of localized cameras is known.
Analysis of reconstructed features by scales in models
[Figure: Traditional Pipeline for Incremental SFM. Extract Features, Feature Matching, Geometric Verification, Match-graph Construction, then Incremental Structure from Motion (SFM): 1. SFM for 2 images, 2. Pose-estimate Image, 3. Triangulate Features.]
[Figure: Multistage SFM with embarrassingly parallel point and camera addition stages.]
Sort SIFT features based on scales
Pairwise match high-scale η% Features
Reconstruct model using robust SFM
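The scale-based split in the steps above can be sketched as follows. This is a minimal illustration that assumes each feature is a `(scale, feature_id)` tuple and that the percentage is rounded up. Both choices, and the function name, are assumptions rather than details from the poster.

```python
import math


def split_by_scale(features, eta):
    """Split features into the top eta% by scale and the remainder.

    features: list of (scale, feature_id) tuples (hypothetical layout).
    eta: percentage of highest-scale features kept for the coarse stage.
    """
    ranked = sorted(features, key=lambda f: f[0], reverse=True)
    n_top = math.ceil(len(ranked) * eta / 100.0)
    return ranked[:n_top], ranked[n_top:]


features = [(1.2, "a"), (8.0, "b"), (3.5, "c"), (0.9, "d"), (5.1, "e")]
coarse, remaining = split_by_scale(features, eta=20)
```

Only the `coarse` features would be matched pairwise for the coarse model; the `remaining` features are held back for the later point-addition stage.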
Li et al. ECCV’10, Sattler et al. ICCV’11, ECCV’12, Choudhary et al. ECCV’12
Goal: Triangulate the features of localized images.
Instead of O(n²) pairwise matching, match each
image with only k candidate images (O(nk)), using
the co-visible 3D points between images.
Use epipolar geometry between the localized
cameras for guided matching and form tracks.
Merge the tracks using connected-components
algorithm and triangulate the points.
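The track-merging step above can be sketched with a union-find structure, a standard way to compute connected components. This is a minimal illustration: features are assumed to be `(image_id, feature_id)` tuples, and checks for inconsistent tracks (two features from the same image) are left out.

```python
def merge_tracks(matches):
    """Group pairwise feature matches into tracks via connected components.

    matches: iterable of (feature_a, feature_b), where each feature is an
    (image_id, feature_id) tuple (a hypothetical layout for this sketch).
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in matches:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    tracks = {}
    for node in list(parent):
        tracks.setdefault(find(node), []).append(node)
    return sorted(sorted(track) for track in tracks.values())


matches = [((0, 1), (1, 4)), ((1, 4), (2, 7)), ((0, 2), (3, 9))]
tracks = merge_tracks(matches)
```

Each resulting track lists the observations of one 3D point, which can then be triangulated from the known camera poses.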
Epipolar geometry guided search for fast feature matching
Shah et al. WACV’15
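The idea behind epipolar-guided search can be sketched as follows: once two cameras are localized, a feature in one image only needs to be compared against features lying near its epipolar line in the other. This is a minimal illustration and not the method of the cited paper. The fundamental matrix `F`, the pixel threshold `max_dist`, and the function names are all assumptions.

```python
import math


def epipolar_line(F, x):
    """Line l = F * (u, v, 1) in the second image, returned as (a, b, c)."""
    p = (x[0], x[1], 1.0)
    return tuple(sum(F[r][c] * p[c] for c in range(3)) for r in range(3))


def point_line_distance(line, x):
    """Perpendicular distance from 2D point x to the line a*u + b*v + c = 0."""
    a, b, c = line
    norm = math.hypot(a, b)
    if norm == 0.0:
        return math.inf  # degenerate line
    return abs(a * x[0] + b * x[1] + c) / norm


def guided_candidates(F, x, features, max_dist=2.0):
    """Indices of second-image features within max_dist pixels of x's epipolar line."""
    line = epipolar_line(F, x)
    return [i for i, f in enumerate(features)
            if point_line_distance(line, f) <= max_dist]


# Fundamental matrix of a rectified pair (pure horizontal translation),
# chosen only to keep the example easy to follow: epipolar lines are rows.
F_rect = [[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]]
candidates = guided_candidates(F_rect, (10.0, 5.0), [(3.0, 5.5), (40.0, 9.0), (7.0, 4.0)])
```

Descriptor comparison would then be restricted to the returned candidates instead of all features in the second image.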
                   Ours (η = 20%)      Bundler             VisualSFM + PM
Dataset            Cameras  Points     Cameras  Points     Cameras  Points
Pantheon Int.      538      241K       574      241K       466      52K
Pantheon Ext.      780      211K       782      211K       777      117K
St. Peters Int.    926      416K       950      416K       901      105K
St. Peters Ext.    1126     495K       1154     495K       1138     123K
                   VisualSFM    Our Multistage Approach           Bundler
Dataset            With PM      200-cores   8-cores   1-core      1-core
Pantheon Int.      19m          26m         69m       6h 48m      1d 12h
Pantheon Ext.      110m         60m         97m       12h 43m     6d 15h
St. Peters Int.    81m          51m         107m      15h 13m     5d 21h
St. Peters Ext.    -            121m        181m      1d 8h       12d 2h
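To compare the single-core columns of the timing table directly, the durations can be converted to minutes and divided. This is a minimal sketch; it assumes the last two columns are the multistage approach on 1 core and Bundler on 1 core, as the header suggests. The helper names are hypothetical.

```python
UNIT_MINUTES = {"d": 24 * 60, "h": 60, "m": 1}


def to_minutes(duration):
    """Convert a duration such as '1d 12h' or '6h 48m' to minutes."""
    return sum(int(token[:-1]) * UNIT_MINUTES[token[-1]] for token in duration.split())


def speedup(ours, baseline):
    """Ratio of baseline runtime to our runtime (larger means we are faster)."""
    return to_minutes(baseline) / to_minutes(ours)


# (multistage 1-core, Bundler 1-core), copied from the timing table.
single_core = {
    "Pantheon Int.": ("6h 48m", "1d 12h"),
    "Pantheon Ext.": ("12h 43m", "6d 15h"),
    "St. Peters Int.": ("15h 13m", "5d 21h"),
    "St. Peters Ext.": ("1d 8h", "12d 2h"),
}
ratios = {name: speedup(ours, bundler) for name, (ours, bundler) in single_core.items()}
```

Under that reading of the columns, the single-core ratios range from roughly 5x to 12.5x across the four datasets.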
Input Images
http://cvit.iiit.ac.in/projects/multistagesfm/
This work is supported by Google PhD Fellowship and IDH Project of DST.
[Figure: 3D point cloud of CGM/cover-set; Mean SIFTs of 3D Points; 2D Image Features; SIFTs of 2D Features; Kd-tree of Image Features; Establish 3D-2D correspondences; Pose-estimate the camera.]
1CVIT, IIIT Hyderabad, India
Camera Addition (Localization)
Point Addition (Triangulation)
Fraction of connected image pairs at
different stages of our pipeline vs.
VisualSFM with Preemptive Matching.
Advantages
Postponing the matching of the remaining (100 − η)% of features until after coarse modeling makes it possible to:
Use co-visibility to select fewer candidate images for matching.
Perform geometry-guided matching, which is both faster and produces denser correspondences than zero-knowledge feature matching.
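Candidate selection by co-visibility can be sketched as follows: for each image, count the 3D points it shares with every other image in the coarse model and keep only the `k` images with the most shared points. This is a minimal illustration. The visibility layout, the tie-breaking rule, and the function name are assumptions.

```python
from collections import Counter, defaultdict
from itertools import combinations


def candidate_pairs(point_visibility, k):
    """Select up to k candidate images per image, ranked by shared 3D points.

    point_visibility: dict mapping point id -> set of camera ids that see it.
    Returns a sorted list of unordered (cam_a, cam_b) pairs with cam_a < cam_b.
    """
    shared = defaultdict(Counter)
    for cams in point_visibility.values():
        for a, b in combinations(sorted(cams), 2):
            shared[a][b] += 1
            shared[b][a] += 1

    pairs = set()
    for cam, counts in shared.items():
        # Most shared points first; ties broken by the smaller camera id.
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        for other, _ in ranked[:k]:
            pairs.add((min(cam, other), max(cam, other)))
    return sorted(pairs)


visibility = {"p1": {0, 1, 2}, "p2": {0, 1}, "p3": {1, 2}, "p4": {2, 3}}
pairs = candidate_pairs(visibility, k=1)
```

With n images this yields at most n·k pairs to match, rather than the n(n−1)/2 pairs of exhaustive matching.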
2UIUC, Illinois, USA
Goal: To break the sequentiality of incremental SFM for faster reconstruction.
Achieved by first reconstructing a coarse model and enriching it in stages.
Coarse model is reconstructed quickly using only a few image features.
The coarse model is made dense by adding cameras and points in later stages.
More cameras are added to the model using direct 3D-2D localization.
More points are added to the model using geometry-guided matching.
The point and camera addition stages are fast, independent, and parallel.
As a result, our method produces denser point clouds in less time.
Coarse Global Model (CGM)
Colosseum : η = 20%
#Cam: 1657, #Pts:967K
Pantheon Ext. : η = 20%
#Cam: 780, #Pts:211K
St. Peters Int. : Coarse Model
#Cam: 800, #Pts:54K
St. Peters Int. : Full Model
#Cam: 889, #Pts:420K
4. Bundle Adjust & Repeat from 2.