Automated 3D Object Modeling from Aerial Video Imagery Prudhvi Krishna Gurram Ph.D. Student Chester F. Carlson Center for Imaging Science, Rochester Institute of Technology Army Research Laboratory, MD Date: 09/02/2009


Automated 3D Object Modeling from Aerial Video Imagery

Prudhvi Krishna Gurram, Ph.D. Student

Chester F. Carlson Center for Imaging Science,

Rochester Institute of Technology

Army Research Laboratory, MD

Date: 09/02/2009


Outline

• Introduction
• Motivation
• Research objectives
• Approach & Results
  • Pre-processing step
  • Building stereo mosaics
  • 3D object identification
  • 3D object modeling
• Summary and conclusions
• Future work recommendations

Prudhvi Gurram, Research Seminar, 04/19/23


Introduction

• Applications of physically realistic 3D scenes
  • Military applications include target area simulation and moving target detection
  • Civilian applications include damage assessment in case of natural disasters
  • Other applications include medical imaging and robotic vision
• Reconstructed 3D scenes must conform to real scenes in terms of
  • Geometry
  • Radiometry
• RIT has the Digital Imaging and Remote Sensing Image Generation (DIRSIG) tool, which can create spectrally accurate synthetic imagery by simulating different sensor types
• DIRSIG requires a pre-defined 3D geometrical scene with spectra assigned to each facet of the scene


Motivation

Figure: high-resolution video, Lidar data, and spectral imagery are combined into a spectrally accurate scene model

Rapidly construct radiometrically correct scene models based on multi-sensor data for use in DIRSIG synthetic scene generation


Motivation

• Objective
  • Extraction of the 3D geometry of a scene from aerial video over a large scene
• Possible approaches
  • Manual interpretation of stereo imagery (very labor-intensive and time-consuming for large areas, on the order of days or even months)
  • Automated processing of video frames to build stereo mosaics for the extraction of 3D geometry
• Combine this with information from Lidar to improve the accuracy of the 3D scene
• Combine the 3D coordinates with material properties from hyperspectral imaging to render a 3D scene that conforms both geometrically and radiometrically to the real world


Research Objectives

• Build stereo mosaics from video frames over large scenes
• Identify 3D objects like buildings and trees in the scene using stereo mosaics
• Accurately model 3D buildings in the scene
• Improve the accuracy of 3D object identification and modeling by fusing Lidar data with visual imagery


Approach

Processing pipeline (figure):

• Inputs: video frames; Exterior Orientation (EO) and Interior Orientation (IO) parameters
• Pre-processing of the video frames → orientation-corrected video frames
• Ray interpolation → stereo mosaics (intermediate output and visual aid)
• 3D object identification and modeling → 3D models (final output)


Initial Video Frames: Exterior and Interior Orientation Parameters

Figure: the video camera at frames 1, 2, ..., N, each with a rotation matrix R_i and a camera center (or viewpoint) T_i: (R_1, T_1), (R_2, T_2), ..., (R_N, T_N)

Interior orientation matrix:

$$K = \begin{bmatrix} f/d & 0 & x_p \\ 0 & f/d & y_p \\ 0 & 0 & 1 \end{bmatrix}$$


Pre-processing of Video Frames

• Correct the orientation of the frames so that all the frames have the same (nadir-looking) orientation
• The observed motion parallax of objects is then due to the translational motion of the camera only
• World coordinate system → camera coordinate system: a world point P_world can be expressed in the camera coordinate system, with rotation matrix R and camera center at T, as

$$P = R\,(P_{world} - T)$$


Pre-processing (contd…)

• Camera coordinate system → image coordinate system: the image coordinates of this point are given by the Interior Orientation parameters embedded in matrix K

$$P_{im} = \begin{bmatrix} kx \\ ky \\ k \end{bmatrix} = K R_i \,(P_{world} - T_i)$$

• The image coordinates in any frame i are transformed by matrix A so that only translational motion is observed in the sensor

$$P'_{im} = A\,P_{im} = K R_i^{-1} K^{-1} P_{im} = K\,(P_{world} - T_i)$$
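The orientation correction can be checked numerically. The following is a minimal sketch in plain Python, assuming a pinhole model with the interior orientation matrix K and a pure rotation R (so that the inverse of R is its transpose). The helper names and all numeric values are hypothetical and chosen only for illustration.

```python
import math

def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def transpose(m):
    return [[m[j][i] for j in range(3)] for i in range(3)]

def inverse_interior(K):
    """Closed-form inverse of K = [[a, 0, xp], [0, b, yp], [0, 0, 1]]."""
    a, b, xp, yp = K[0][0], K[1][1], K[0][2], K[1][2]
    return [[1 / a, 0.0, -xp / a], [0.0, 1 / b, -yp / b], [0.0, 0.0, 1.0]]

def project(K, R, T, P_world):
    """P_im = K R (P_world - T), returned as inhomogeneous (x, y)."""
    d = [P_world[i] - T[i] for i in range(3)]
    h = matvec(matmul(K, R), d)
    return h[0] / h[2], h[1] / h[2]

def correction_matrix(K, R):
    """A = K R^-1 K^-1; R is a rotation, so R^-1 is its transpose."""
    return matmul(matmul(K, transpose(R)), inverse_interior(K))

def apply_homography(A, xy):
    """Apply a 3x3 matrix to an image point and normalize."""
    h = matvec(A, [xy[0], xy[1], 1.0])
    return h[0] / h[2], h[1] / h[2]

# Hypothetical camera: small roll about x and heading change about z.
K = [[1000.0, 0.0, 320.0], [0.0, 1000.0, 240.0], [0.0, 0.0, 1.0]]
ax, az = math.radians(5.0), math.radians(10.0)
Rx = [[1.0, 0.0, 0.0],
      [0.0, math.cos(ax), -math.sin(ax)],
      [0.0, math.sin(ax), math.cos(ax)]]
Rz = [[math.cos(az), -math.sin(az), 0.0],
      [math.sin(az), math.cos(az), 0.0],
      [0.0, 0.0, 1.0]]
R = matmul(Rx, Rz)
T = [10.0, 20.0, 500.0]
P_world = [30.0, -40.0, 0.0]

observed = project(K, R, T, P_world)           # point in the rotated frame
corrected = apply_homography(correction_matrix(K, R), observed)
print(corrected)                               # matches K (P_world - T) after normalization
```

After the correction, the image position depends only on the camera center T, which is the property the mosaicing step relies on.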


Parallel Ray Interpolation

Why are we using Parallel Ray Interpolation?

• To convert the view from perspective view to parallel-perspective view
  • Simulating a linear pushbroom camera
• To use motion parallax information (while creating mosaics)
• To make the stereo mosaics seamless

Figure panels: perspective view; parallel view


Parallel and Perspective Views

Figure panels: perspective view; parallel view; parallel view from perspective view using fixed lines


Ray Interpolation

Figure: viewpoint 1, viewpoint 2, and the interpolated viewpoint above the image (mosaic) plane, showing the point in the image plane from viewpoint 1, from viewpoint 2, and from the interpolated viewpoint

Acknowledgement: Zhigang Zhu et al., City College of New York, New York City, NY
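As a rough illustration of why interpolation between two viewpoints is possible, the sketch below assumes orientation-corrected (nadir) frames and a camera translating parallel to the image plane. Under that assumption the image position of a scene point varies linearly with the camera position, so a matched point pair can be interpolated linearly. This is not the PRISM formulation itself; the function names and numbers are hypothetical.

```python
def image_x(focal, point_x, depth, camera_x):
    """Pinhole image coordinate of a point at the given depth below the camera."""
    return focal * (point_x - camera_x) / depth

def interpolate_point(p1, p2, t):
    """Position of a matched point as seen from an interpolated viewpoint.

    p1, p2: (x, y) positions of the same scene point in frames 1 and 2.
    t: fractional position of the interpolated viewpoint between
       viewpoint 1 (t = 0) and viewpoint 2 (t = 1).
    """
    return ((1 - t) * p1[0] + t * p2[0], (1 - t) * p1[1] + t * p2[1])

# Hypothetical values: the camera moves from x = 10 to x = 20 at constant height.
x1 = image_x(1000.0, 30.0, 500.0, 10.0)
x2 = image_x(1000.0, 30.0, 500.0, 20.0)
interpolated = interpolate_point((x1, 0.0), (x2, 0.0), 0.25)
direct = image_x(1000.0, 30.0, 500.0, 12.5)   # camera actually placed at t = 0.25
print(interpolated[0], direct)
```

Points at different depths move by different amounts between the two frames, which is the motion parallax the mosaicing has to handle.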


PRISM (Parallel Ray Interpolation for Stereo Mosaicing)

Frame 1 Frame 2


Fast PRISM

Figure labels: Frame 1 and Frame 2, each with its fixed line; the overlapped region; the fixed lines within the image frame


Fast PRISM

Figure labels: source triangles in Frame 1 and Frame 2; destination triangles in the left mosaic


Motion Parallax

Figure panels: Frame 1; Frame 2; interpolated frame (before triangulation)

Can happen with low-flying aircraft and high-rise buildings


Interpolation

Figure panels: Frame 1; Frame 2; overlay of Frames 1 and 2; interpolation


Interpolation

Figure panels: Frame 1; Frame 2; interpolation


Triangulation Problem

Figure panels: Frame 1; Frame 2; interpolated frame (after triangulation)


Modified PRISM

Make sure that none of the triangles include regions with different motion parallax

Find edges of different regions and align the sides of triangles with the edges

Use segmentation to obtain different planar surfaces

The inner boundary of each segment forms an edge of a region/object


Overlapped region

Frame 1:

Frame 2:


Segmented images

Segmented Frame 1:

Segmented Frame 2:


Frame k

Segments in Overlapped Region

One of the segments

Significant points using Convex Hull around the segment

Triangulation
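The step of picking significant points with a convex hull around a segment can be sketched as follows. This is a generic monotone-chain convex hull over the pixel coordinates of one segment, not the author's implementation; the segment used here is a hypothetical 3x3 block of pixels.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # Drop points that would make a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]

# Hypothetical segment: all pixels of a 3x3 block.
segment_pixels = [(x, y) for x in range(3) for y in range(3)]
significant_points = convex_hull(segment_pixels)
print(significant_points)
```

The hull vertices could then serve as the triangle vertices, so that triangle sides follow the outline of the segment.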


Frame k+1

Matching curve

The other part of the segment between matching curve and fixed line

Triangulation


On the Mosaic

Figure panels: “orphan” pixels; orphan pixels filled

• Using a constraint inherent in the Parallel-Perspective view

• Parallel view in dominant motion direction and Perspective view in the other direction

• Do not consider motion parallax along x direction

• In 3D translational case, use sequential linear interpolation to fill the orphan pixels


Results – Set 1

Figure panels: Frame 1 and Frame 2 (region with motion parallax marked); Fast PRISM result; Modified PRISM result


Results – Set 2

Figure panels: Fast PRISM result; Modified PRISM result

Publications:
1. P. Gurram, E. Saber, and H. Rhody, “A Novel Triangulation Method for Building Parallel-Perspective Stereo Mosaics”, Proceedings of Electronic Imaging Symposium, SPIE, San Jose, CA, January 2007.
2. P. Gurram, E. Saber, and H. Rhody, “Segment-based Mesh Design for Building Parallel-Perspective Stereo Mosaics”, accepted for publication in IEEE Transactions on Geoscience and Remote Sensing.


Stereo Mosaic – Modified PRISM


3D Object Identification and Modeling (Deterministic Approach)

• Build a nadir mosaic along with left and right stereo mosaics, with the fixed line looking in the nadir direction
• Use image segmentation to identify the various homogeneous surfaces in the mosaics
  • Manually set the segmentation input parameters
  • Each homogeneous surface is treated as a planar surface
• Use deterministic thresholds to identify polyhedral building surfaces based on the elevation map generated using stereo mosaics

Figure labels: viewpoints along the sensor motion; image plane; scene; left mosaic, nadir mosaic, right mosaic


Deterministic Approach (contd…)

Figure: processing flow of the deterministic approach

• Inputs: stereo pair (left mosaic, right mosaic) and nadir mosaic
• Segmentation of the nadir mosaic
• Tree regions
• Building surfaces extraction using height information
• For each surface: boundary of each surface; plane fit for each surface using disparity between mosaics
• Corners through Curvature Scale Space
• Edges through line fit between corners
• CAD model of each building
• DTM and Digital Elevation Model (DEM)
• Reconstructed scene

Problems in the 3D model of a building due to solar shadow and noise in the images

Fusing Lidar data with visual imagery

Figure panels:
• Video: building model affected by solar shadow and noise
• Lidar: raw Lidar; CAD model from Lidar (good elevation and planes)

There is no information in these cases, as one planar surface merges with a neighboring surface at a different height during segmentation


Problems with Deterministic Approach

• Had to set the segmentation input parameters manually
• Had to manually select Regions-Of-Interest (ROI) around buildings, which is tedious over large scenes
• Sparse depth/elevation map from stereo mosaics led to inaccurate 3D models
• Noise in the images led to problems with modeling 3D buildings (missing surfaces etc.)
• Deterministic thresholds led to the models being overfitted to a particular data set
• Some of the above problems can be avoided by fusing Lidar data with visual imagery

Publications:
1. Prudhvi Gurram, Eli Saber and Harvey Rhody, "Extraction of Digital Elevation Map from Parallel-Perspective Stereo Mosaics", IS&T/SPIE Electronic Imaging Symposium, San Jose, CA, Jan. 2008.
2. Prudhvi Gurram, Steve Lach, Eli Saber, Harvey Rhody and John Kerekes, "3D Scene Reconstruction through a Fusion of Passive Video and Lidar Imagery", IEEE Applied Imagery Pattern Recognition Workshop, Washington, DC, Oct. 2007.


3D Object Identification and Modeling

Two parts:

• Object identification (buildings, trees, terrain, etc.)
  • Using global statistics of various features in a Bayesian network for classification of surfaces
  • Features include the elevation map from visual imagery (stereo mosaics), elevation from Lidar data, color information, and edges and corners extracted from visual imagery
• Object modeling (3D buildings)
  • Identified building surfaces have inaccurate 3D geometry due to the sparse depth maps provided by stereo mosaics
  • Improving the accuracy of the building geometry measurements obtained from stereo mosaics using local optimization and individual video frames


3D Object Identification

Why Bayesian Networks?

• Useful in representing causal relationships between the features/nodes
• Specify conditional independence among the features
• Easier to combine prior knowledge (structure of the BN) with data
• Easier for an expert to intervene and predict the effects of such an intervention
• Avoid overfitting of models to a particular data set


Bayesian Network Semantics

• The features constitute the nodes of the BN
• If X is connected to Y (causal relationship), X is called the parent of Y
• Any node has a conditional probability distribution P(X|Parents(X)) → P(X|A,B)
• The probabilities associated with each node are called parameters
• A BN is defined by
  • Structure (causal relationships)
  • Parameters (probabilities, conditional or prior)

Figure: A and B are the parents of X; arrows denote causal relationships; Y and Z are conditionally independent given X
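For the five-node example in the figure, the semantics above imply a factorization of the joint distribution. The expression below is a sketch that assumes A and B have no parents of their own and that X is the only parent of both Y and Z, which is how the figure labels read:

```latex
P(A, B, X, Y, Z) = P(A)\,P(B)\,P(X \mid A, B)\,P(Y \mid X)\,P(Z \mid X)
```

Each factor is one node's parameter table, so the structure determines how many probabilities have to be learned.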


Bayesian Network Structure

Figure panels: BN structure to use visual imagery and Lidar data; BN structure to use visual imagery only. The queried node in both structures is Region.

Other features in the BN:
1. Elevation information from stereo mosaics
2. Elevation information from Lidar data
3. Corner information from nadir mosaic
4. Color information from nadir mosaic
5. Texture information from nadir mosaic
6. Area of regions

Classes: 1 – Buildings, 2 – Grass, 3 – Trees, 4 – Asphalt, 5 – Misc.


Stereo Mosaics

Figure panels: original video frames; left mosaic; nadir mosaic; right mosaic


Mosaics and Lidar Data

Figure panels: stereo mosaics; Lidar data (rasterized and registered to the nadir mosaic by flying a linear pushbroom camera over the Lidar point cloud)


Truth Map and Pseudo Truth Map

Figure panels: truth map; nadir mosaic; segment map; pseudo truth map

Generated using mean-shift image segmentation with spatial bandwidth h_s = 15 and color bandwidth h_c = 10

Classes: 1 – Buildings, 2 – Grass, 3 – Trees, 4 – Asphalt, 5 – Misc.


Feature Extraction

• Elevation information from stereo mosaics
  • Match corresponding points in the left, right, and nadir mosaics (along the boundaries of segments) using a correlation technique and the epipolar constraints of parallel-perspective stereo mosaics
  • Use the stereo geometry of parallel-perspective stereo mosaics to extract the depth/elevation map
  • Fit least squares planes for each segment in the nadir mosaic
  • Use the RANdom SAmple Consensus (RANSAC) algorithm to remove outliers (due to bad segmentation and bad point matches) during the plane fit
  • Use mean height, minimum height, maximum height, and number of inliers during the plane fit as features in the BN

Observe the noise in the elevation map: this is due to over-segmentation of tree regions and bad point matches
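The outlier-removal part of the plane fit can be sketched as below. This is a minimal RANSAC consensus search over three-point planes, with hypothetical data and threshold; the final least-squares refit over the surviving inliers is left out, and none of this is taken from the author's code.

```python
import random

def plane_from_points(p, q, r):
    """Plane a*x + b*y + c*z + d = 0 with unit normal, or None if collinear."""
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    a = uy * vz - uz * vy
    b = uz * vx - ux * vz
    c = ux * vy - uy * vx
    norm = (a * a + b * b + c * c) ** 0.5
    if norm < 1e-12:
        return None
    a, b, c = a / norm, b / norm, c / norm
    return a, b, c, -(a * p[0] + b * p[1] + c * p[2])

def ransac_plane(points, threshold, iterations=200, seed=0):
    """Return the plane with the largest consensus set and that set."""
    rng = random.Random(seed)
    best_plane, best_inliers = None, []
    for _ in range(iterations):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue  # degenerate (collinear) sample
        a, b, c, d = plane
        inliers = [pt for pt in points
                   if abs(a * pt[0] + b * pt[1] + c * pt[2] + d) <= threshold]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers

# Hypothetical segment: a sloped roof plus three badly matched points.
roof = [(float(x), float(y), 0.5 * x + 10.0) for x in range(5) for y in range(4)]
bad_matches = [(1.0, 1.0, 25.0), (3.0, 2.0, -4.0), (2.0, 0.0, 40.0)]
plane, inliers = ransac_plane(roof + bad_matches, threshold=0.05)
heights = [pt[2] for pt in inliers]
features = (sum(heights) / len(heights), min(heights), max(heights), len(inliers))
print(features)
```

The tuple at the end mirrors the features listed above (mean, minimum, and maximum height, and number of inliers).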


Feature Extraction

• Elevation information from Lidar data
  • Elevation map readily available from the Lidar point cloud
  • Fit least squares planes for each segment in the nadir mosaic
  • Use the RANdom SAmple Consensus (RANSAC) algorithm to remove outliers (due to bad segmentation) during the plane fit
  • Use mean height, minimum height, maximum height, and number of inliers during the plane fit as features in the BN
• Corner information from nadir mosaic
  • Extract 2D corners from each segment of the nadir mosaic
  • Orthorectify the corners using the initial elevation map from stereo mosaics
  • Use the total number of corners and the numbers of right-angle, 45-degree, and 135-degree corners as features in the BN
• Surface area of each segment in absolute units (m²)
• Mean values of hue and saturation of each segment


Feature Extraction


Visual entropy determined over 9x9 window on nadir mosaic

Lidar entropy determined over 9x9 window on rasterized Lidar data

Entropy measures the variability of the data over a window; it can be used to indicate the presence of texture in visual images and the presence of height changes in Lidar data
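A windowed entropy feature of this kind can be sketched as follows, assuming the input has already been quantized to discrete levels (for example 8-bit gray values or binned heights) and that the window is clipped at the image border. The function name and test images are hypothetical.

```python
import math
from collections import Counter

def window_entropy(image, row, col, size=9):
    """Shannon entropy (in bits) of the values in a size x size window.

    image: list of rows of quantized values; the window is centred at
    (row, col) and clipped where it extends past the image border.
    """
    half = size // 2
    values = [image[r][c]
              for r in range(max(0, row - half), min(len(image), row + half + 1))
              for c in range(max(0, col - half), min(len(image[0]), col + half + 1))]
    n = len(values)
    counts = Counter(values)
    return -sum((k / n) * math.log2(k / n) for k in counts.values())

# Hypothetical inputs: a flat roof patch and a patch with two equally common levels.
flat = [[7] * 9 for _ in range(9)]
two_level = [[0, 0, 1, 1] for _ in range(4)]
print(window_entropy(flat, 4, 4), window_entropy(two_level, 2, 2))
```

A uniform window yields zero entropy, while textured regions such as tree canopies, or Lidar windows that straddle a height change, yield higher values.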


Bayesian Network Learning and Inference

• Equal-frequency binning is used to discretize the features
• Known structure and incomplete data
  • Hidden nodes are introduced by an expert to make causal dependencies explicit
  • Use the Expectation-Maximization (EM) algorithm to learn the parameters of the nodes given all the training data
• During the testing phase, use the junction tree inference algorithm: evidence is entered at the observed nodes, the remaining nodes are marginalized out, and the posterior probabilities of the desired node (Region) are obtained
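Equal-frequency binning places the bin edges at sample quantiles so that each bin receives roughly the same number of training samples. The sketch below uses hypothetical function names and height values; it does not reproduce the number of bins used in the work, which the slides do not state.

```python
import bisect

def equal_frequency_edges(values, n_bins):
    """Interior bin edges chosen so each bin holds about len(values) / n_bins samples."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // n_bins] for i in range(1, n_bins)]

def discretize(value, edges):
    """Index of the bin that a value falls into (0 .. len(edges))."""
    return bisect.bisect_right(edges, value)

# Hypothetical mean heights (in meters) of twelve segments, heavily skewed.
heights = [0.1, 0.2, 0.3, 0.4, 5.0, 6.0, 7.0, 8.0, 50.0, 60.0, 70.0, 80.0]
edges = equal_frequency_edges(heights, 3)
states = [discretize(h, edges) for h in heights]
print(edges, states)
```

Compared with equal-width bins, this keeps every discrete state populated even when a feature is strongly skewed, which helps when estimating conditional probability tables.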


Decision Theory

Decision theory = ‘Probability theory’ + ‘Utility theory’

User-defined utilities determine how the posterior probabilities are used for making decisions

For region classification, the utilities are set in such a way that the class with the Maximum A Posteriori (MAP) probability is chosen
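The link between utilities and the MAP rule can be illustrated with a small sketch. It assumes a 0/1 utility (1 for a correct label, 0 otherwise), under which maximizing expected utility reduces to picking the class with the highest posterior. The posterior values are hypothetical.

```python
CLASSES = ["Buildings", "Grass", "Trees", "Asphalt", "Misc."]

def best_decision(posterior, utility):
    """Return the decision with the maximum expected utility.

    posterior: {class: P(class | evidence)}
    utility:   {decision: {true_class: utility of making that decision}}
    """
    def expected_utility(decision):
        return sum(posterior[c] * utility[decision][c] for c in posterior)
    return max(utility, key=expected_utility)

# 0/1 utility: a decision is worth 1 only when it matches the true class.
zero_one = {d: {c: 1.0 if c == d else 0.0 for c in CLASSES} for d in CLASSES}

# Hypothetical posterior for one region, as an inference step might return it.
posterior = {"Buildings": 0.55, "Grass": 0.05, "Trees": 0.30,
             "Asphalt": 0.08, "Misc.": 0.02}
print(best_decision(posterior, zero_one))
```

Other utility tables, for example ones that penalize missed buildings more than false alarms, would shift the decision away from plain MAP.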


Automated Choice of Best Segmentation Input Parameters

• Mean-shift image segmentation input parameters
  • Spatial bandwidth h_s is set at a constant level, since its variation does not considerably change the segmentation results
  • Color bandwidth h_c is varied from 2 to 20 in steps of 2, giving 10 sets of input parameters
• The best set of parameters is chosen based on the quality of the classification results
• Quality metric: weighted sum of the differences between True Positive Rate and False Positive Rate

$$h_c = \arg\max_{h_c^k} \sum_{i=1}^{N} W_i \left( TP_i^k - FP_i^k \right)$$

  where h_c^k represents one case of input parameter and i represents the class number
• A True Positive is a hit and a False Positive is a false alarm
• True positives and false positives are calculated in terms of pixels, not regions
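The selection rule can be sketched directly from the metric. The rates below are the Building and Trees columns of the visual-data-only results table in this talk; the function names and the dictionary layout are hypothetical.

```python
# (TP, FP) per class for each color bandwidth, visual data only.
RATES = {
    2:  {"Buildings": (0.8326, 0.0802), "Trees": (0.9033, 0.0860)},
    4:  {"Buildings": (0.8375, 0.0705), "Trees": (0.8711, 0.0764)},
    6:  {"Buildings": (0.8109, 0.0491), "Trees": (0.9099, 0.1034)},
    8:  {"Buildings": (0.7871, 0.0482), "Trees": (0.8951, 0.0806)},
    10: {"Buildings": (0.8474, 0.0774), "Trees": (0.9111, 0.0968)},
    12: {"Buildings": (0.7951, 0.0634), "Trees": (0.8841, 0.0810)},
    14: {"Buildings": (0.7735, 0.0594), "Trees": (0.8415, 0.0658)},
    16: {"Buildings": (0.8322, 0.0664), "Trees": (0.7879, 0.0646)},
    18: {"Buildings": (0.8131, 0.0829), "Trees": (0.8267, 0.0596)},
    20: {"Buildings": (0.8026, 0.0711), "Trees": (0.7648, 0.0523)},
}

def quality(rates, weights):
    """Weighted sum over classes of (TP - FP); classes without a weight count as 0."""
    return sum(w * (rates[c][0] - rates[c][1]) for c, w in weights.items())

def best_bandwidth(table, weights):
    """Color bandwidth whose classification result maximizes the quality metric."""
    return max(table, key=lambda h: quality(table[h], weights))

# All weight on the building class.
building_only = {"Buildings": 1.0}
print(best_bandwidth(RATES, building_only))
```

Changing the weights changes which bandwidth wins, so the weights act as the user's statement of which classes matter most.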


Results – Visual Data Only

         Building class               Trees class
h_c      TP      FP      TP − FP      TP      FP      TP − FP
2        0.8326  0.0802  0.7524       0.9033  0.086   0.8173
4        0.8375  0.0705  0.767        0.8711  0.0764  0.7947
6        0.8109  0.0491  0.7618       0.9099  0.1034  0.8066
8        0.7871  0.0482  0.7389       0.8951  0.0806  0.8145
10       0.8474  0.0774  0.77         0.9111  0.0968  0.8143
12       0.7951  0.0634  0.7318       0.8841  0.081   0.8031
14       0.7735  0.0594  0.7142       0.8415  0.0658  0.7757
16       0.8322  0.0664  0.7658       0.7879  0.0646  0.7233
18       0.8131  0.0829  0.7302       0.8267  0.0596  0.7672
20       0.8026  0.0711  0.7315       0.7648  0.0523  0.7125

Best parameter: h_c = 10

Weights used: W_1 = 1, W_2 = 0, W_3 = 0, W_4 = 0, W_5 = 0

Figure: classification map


Results – Visual Data Only

Best parameter: h_c = 2

Weights used: W_1 = 0.1, W_2 = 0.9, W_3 = 0, W_4 = 0, W_5 = 0

Figure: classification map


Results – Visual and Lidar fusion

         Building class               Trees class
h_c      TP      FP      TP − FP      TP      FP      TP − FP
2        0.9478  0.0273  0.9204       0.9552  0.0291  0.9261
4        0.9651  0.0499  0.9151       0.9516  0.0273  0.9243
6        0.9576  0.0346  0.923        0.9527  0.028   0.9248
8        0.9609  0.0476  0.9133       0.9512  0.0278  0.9234
10       0.9604  0.0151  0.9453       0.9504  0.0277  0.9227
12       0.959   0.0197  0.9393       0.9483  0.0252  0.9231
14       0.9424  0.0225  0.9199       0.942   0.0227  0.9193
16       0.9433  0.0481  0.8952       0.9471  0.0224  0.9248
18       0.9189  0.0996  0.8194       0.9457  0.0226  0.9231
20       0.9188  0.1236  0.7952       0.9419  0.0225  0.9195

Best parameter: h_c = 10

Weights used: W_1 = 1, W_2 = 0, W_3 = 0, W_4 = 0, W_5 = 0

Figure: classification map


Comparison of the Two Classifiers

Side-by-side comparison of the TP, FP, and TP − FP tables for the Building and Trees classes (h_c = 2 to 20): visual imagery only versus visual and Lidar fusion. The values are the same as in the two preceding results tables.

Figure panels (classification maps): visual imagery only; visual and Lidar fusion; truth map


3D Object Modeling

• Need for further optimization of 3D buildings
  • Sparse depth maps obtained from stereo mosaics
  • Height of each point quantized into levels defined by the view angle of the fixed line used for building the stereo mosaics
• Project initial 3D models onto individual video frames
• Minimize the distance between the projected corners of the building and the actual corners detected in the 2D images


For each surface s:

1. Initial estimate of orthorectified, georeferenced 3D corners (X, Y, Z)
2. Project the initial 3D corners onto the individual video frames in which the building is visible; the projected 3D corners (x̂, ŷ) are given by

$$\begin{bmatrix} k\hat{x} \\ k\hat{y} \\ k \end{bmatrix} = K R \left[\, I_{3 \times 3} \mid -T \,\right] \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$

3. Find corresponding point pairs between the projected 3D corners and the actual 2D corners (x, y) in the individual video frames
4. Optimize the 3D position of the corners in object space to reduce the sum of squared distances between each corner's projected 2D coordinates and the actual 2D coordinates of the points in the video frames (a nonlinear least squares problem, solved with the Levenberg-Marquardt algorithm)

$$\min_{(X, Y, Z)} \sum_{i=1}^{N} \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right]$$

5. Fit a RANSAC least squares plane through the optimized 3D corners of the surface to remove outliers due to bad point pairs
6. Recalculate the accurate 3D positions of the corners using the plane equation

Over all surfaces:

7. Combine all the corners of adjacent surfaces with common edges by applying appropriate constraints to build an accurate 3D model of the building
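The cost that the optimization minimizes can be written down compactly. The sketch below evaluates the sum of squared reprojection errors for a single 3D corner observed in several frames, under a pinhole model with hypothetical camera values. It only evaluates the objective; in practice the function would be handed to a nonlinear least-squares solver such as a Levenberg-Marquardt implementation.

```python
def project(K, R, T, corner):
    """Project a 3D corner (X, Y, Z) into a frame: k [x, y, 1]^T = K R (corner - T)."""
    d = [corner[i] - T[i] for i in range(3)]
    cam = [sum(R[i][k] * d[k] for k in range(3)) for i in range(3)]
    h = [sum(K[i][k] * cam[k] for k in range(3)) for i in range(3)]
    return h[0] / h[2], h[1] / h[2]

def reprojection_cost(corner, observations):
    """Sum of squared distances between projected and detected 2D corners.

    observations: list of (K, R, T, (x, y)), one entry per video frame
    in which the corner was matched to a detected 2D corner.
    """
    total = 0.0
    for K, R, T, (x, y) in observations:
        x_hat, y_hat = project(K, R, T, corner)
        total += (x - x_hat) ** 2 + (y - y_hat) ** 2
    return total

# Hypothetical setup: two nadir frames 50 units apart at a height of 500.
K = [[1000.0, 0.0, 320.0], [0.0, 1000.0, 240.0], [0.0, 0.0, 1.0]]
I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
cameras = [(K, I3, [0.0, 0.0, 500.0]), (K, I3, [50.0, 0.0, 500.0])]
true_corner = [10.0, 20.0, 30.0]
observations = [(k, r, t, project(k, r, t, true_corner)) for k, r, t in cameras]

initial_corner = [10.0, 20.0, 0.0]   # e.g. a corner with a badly quantized height
print(reprojection_cost(initial_corner, observations),
      reprojection_cost(true_corner, observations))
```

With two or more frames from different camera centers, an error in height changes the projections differently in each frame, which is what makes the 3D position recoverable.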


Results – 3D Object Modeling

Figure panels: building surfaces identified by the Bayesian Network; surfaces belonging to a single building identified using connected components


Summary and Conclusions

• Automatically built stereo mosaics from video frames over large scenes using ray interpolation
• Automatically identified 3D objects like buildings and trees in the scene using features from stereo mosaics
• Improved the accuracy of 3D object identification and modeling by fusing Lidar data with visual imagery
• Accurately modeled 3D buildings in the scene
• In summary, an automated system has been designed to model 3D buildings from aerial video


Original Contributions

• A segment-based mesh design for aerial triangulation without any prior 3D knowledge of the scene has been developed
  • This new design helps in avoiding visual artifacts in the parallel-perspective stereo mosaics that are built using ray interpolation
  • Consequently, the errors in the final 3D models of buildings are reduced
  • Publication: P. Gurram, E. Saber, and H. Rhody, “Segment-based Mesh Design for Building Parallel-Perspective Stereo Mosaics”, accepted for publication in IEEE Transactions on Geoscience and Remote Sensing
• A novel method to set the input parameters of vision algorithms like color segmentation using data-driven probabilistic inference in Bayesian networks has been designed
  • This method automates the 3D object identification process and precludes the need for manual intervention to set accurate input parameters for the best quality of the final 3D building models
  • Publication: P. Gurram, E. Saber, and H. Rhody, “An Automated System for Modeling 3D Buildings from Aerial Video”, to be submitted to ASPRS Photogrammetric Engineering and Remote Sensing Journal


Future Work

• Structural learning of Bayesian Networks
• Design of hierarchical Bayesian Networks
• Fusion of Lidar data with visual imagery during the 3D modeling step


Acknowledgements

• Dr. Eli Saber, RIT
• Dr. Harvey Rhody, RIT
• Dr. Ferat Sahin, RIT
• Major Steve Lach, USAF
• Jason Faulring, LIAS, RIT
• Matthew Montanaro
• Archana Devasia
• Mustafa Jaber