
Obstacle Avoidance for Unmanned Air Vehicles Using Optical Flow Probability Distributions

Paul Merrell, Dah-Jye Lee, and Randal Beard
Department of Electrical and Computer Engineering
Brigham Young University, 459 CB, Provo, Utah 84602

ABSTRACT

In order for an unmanned aerial vehicle (UAV) to safely fly close to the ground, it must be capable of detecting and avoiding obstacles in its flight path. From a single camera on the UAV, the 3D structure of its surrounding environment, including any obstacles, can be estimated from motion parallax using a technique called structure from motion. Most structure from motion algorithms attempt to reconstruct the 3D structure of the environment from a single optical flow value at each feature point. We present a novel method for calculating structure from motion that does not require a precise calculation of optical flow at each feature point. Due to the effects of image noise and the aperture problem, it may be impossible to accurately calculate a single optical flow value at each feature point. Instead, we may only be able to calculate a set of likely optical flow values and their associated probabilities, that is, an optical flow probability distribution. Using this probability distribution, a more robust method for calculating structure from motion is developed. This method is being developed for use on a UAV to detect obstacles, but it can be used on any vehicle where obstacle avoidance is needed.

Keywords: Obstacle Avoidance, Unmanned Air Vehicles, Optical Flow, Motion Parallax, Structure from Motion

1. INTRODUCTION

One of the fundamental problems in computer vision is how to reconstruct a 3D scene using a sequence of 2D images taken from a moving camera in the scene. A robust and accurate solution to this structure from motion problem would have many important applications for UAVs and other mobile robots. One important application is obstacle avoidance. With an accurate obstacle detection algorithm in place, a UAV would be capable of autonomous low-level flight through complex, cluttered environments.

The process of recovering 3D structure from motion is typically accomplished in two separate steps. First, optical flow values are calculated for a set of feature points using image data. From this set of optical flow values, the rotational and translational motion of the camera is estimated, as well as the depth of the objects in the scene.

Noise from many sources prevents us from calculating a completely accurate optical flow estimate. A better understanding of the noise could provide a better end result. Typically, a single optical flow value is calculated at each feature point without calculating the covariance of the optical flow. A covariance matrix is a way of quantifying the accuracy of the optical flow estimate along any direction. By knowing the accuracy of each feature point in any direction, we can devise a method that relies more heavily on the more accurate data. The challenge is to find the best way to use all of this information to our advantage.

This new method is particularly useful for edge points. An edge point is any point on the image around which there is a high spatial gradient in only one direction. Corner points are points around which there is a high spatial gradient in two directions. Corner points are more useful because they are easy to track along both directions, but edge points are more common in most images. Edge points have the property of being much more accurate in one direction than the other, and so a probability distribution is particularly useful for describing the accuracy of the optical flow estimate for an edge point.


2. RELATED WORK

A significant amount of work has been done on using vision for a variety of UAV applications, such as terrain following [1], navigation [2], and autonomous landing on a helicopter pad [3]. Vision-based techniques have also been used for obstacle avoidance [4,5] on land robots. We hope to provide a more accurate and robust vision system by using optical flow distributions.

Dellaert et al. [6] explore an idea similar to the one presented here. Their method is also based upon the principle that the exact optical flow or correspondence between feature points in two images is unknown. They attempt to calculate structure from motion without a known optical flow. Langer and Mann [7] discuss scenarios in which the exact optical flow is unknown, but the optical flow is known to be one of a 1D or 2D set of optical flow values. While these methods both discuss possible scenarios where multiple optical flow values could be correct, neither of them calculates the probability of each optical flow value, which is the advantage of the method we are presenting.

In this paper, we will make extensive use of three closely related papers. The method described by Simoncelli et al. [8] will be used to calculate the optical flow distributions. The optical flow distributions will then be incorporated into two methods for calculating structure from motion [9, 10] that originally did not use them.

3. METHODOLOGY

3.1. Optical Flow Probability Distribution

To calculate the optical flow probability distributions, the method described by Simoncelli et al. [8] will be used. In another paper [12], we described an alternative method for calculating the optical flow distributions. One of the key differences between the two methods is that the method of Simoncelli et al. only produces Gaussian probability distributions, whereas our method in [12] can produce much more complex distributions. The advantage of Gaussian distributions is that they make the problem much simpler, and so it will be necessary throughout the remainder of this paper to assume that the distributions are Gaussian. There is, however, a disadvantage to using only Gaussian distributions. In general, a Gaussian distribution is a good approximation to the kinds of distributions typically encountered using the method described in [12], but in certain unusual cases a Gaussian distribution may not be a good approximation.

Due to space limitations, the following method will be presented without a derivation. For a complete derivation, please see [8]. The first step is to construct the following matrix and vectors, based upon the spatial gradients, $f_x$ and $f_y$, along the x and y directions, and the temporal derivative $f_t$:

$$M = \begin{bmatrix} f_x^2 & f_x f_y \\ f_x f_y & f_y^2 \end{bmatrix}, \qquad \mathbf{b} = \begin{bmatrix} f_x f_t \\ f_y f_t \end{bmatrix}, \qquad \mathbf{f}_s = \begin{bmatrix} f_x \\ f_y \end{bmatrix}. \tag{1}$$

Each of these quantities is calculated at a single pixel. Instead of examining a single pixel, each of the pixels surrounding the position of interest can be used. Let $M_j$ and $\mathbf{b}_j$ be the values of $M$ and $\mathbf{b}$ at the j-th position, $(x_j, y_j)$, and let $\omega_j$ be the weight attached to each position, such that positions closer to the desired position are given more weight. These values are then used to calculate the covariance of the optical flow, $\Omega_u$, by the equation

$$\Omega_u = \left[ \sum_j \frac{\omega_j M_j}{\sigma_1^2 \left\| \mathbf{f}_s(x_j, y_j) \right\|^2 + \sigma_2^2} + \Omega_p^{-1} \right]^{-1} \tag{2}$$

with $\Omega_p$ being the covariance matrix of the prior distribution of the optical flow and with $\sigma_1$ and $\sigma_2$ being the variances associated with two different sources of noise: $\sigma_1$ describes the errors introduced by the failure of the planarity assumption, and $\sigma_2$ describes the errors introduced by an inaccurate temporal derivative. Each of these parameters may need to be adjusted based upon the quality of the image and the characteristics of the scene. In a typical image sequence, they have been found empirically to be approximately $\sigma_1 = 0.08$, $\sigma_2 = 1.0$, $\sigma_p = 2.0$.

The mean value of the optical flow is given as

$$\boldsymbol{\mu}_u = -\Omega_u \sum_j \frac{\omega_j \mathbf{b}_j}{\sigma_1^2 \left\| \mathbf{f}_s(x_j, y_j) \right\|^2 + \sigma_2^2}. \tag{3}$$
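To make equations (1)-(3) concrete, the following is a minimal numpy sketch of the per-feature-point computation. It is a sketch under our own naming and data-layout assumptions (1D arrays of derivatives and weights over the window, with the empirical noise values above as defaults), not the authors' implementation:

```python
import numpy as np

def flow_distribution(fx, fy, ft, weights, sigma1=0.08, sigma2=1.0, sigma_p=2.0):
    """Gaussian optical flow distribution (mean, covariance) for one feature
    point, following equations (1)-(3). fx, fy, ft, and weights are 1D arrays
    holding the derivatives and window weights at each pixel j in the window."""
    prior_cov_inv = np.eye(2) / sigma_p**2      # Omega_p^-1, isotropic prior
    A = np.zeros((2, 2))                        # accumulates the sum in (2)
    b_sum = np.zeros(2)                         # accumulates the sum in (3)
    for fxj, fyj, ftj, wj in zip(fx, fy, ft, weights):
        Mj = np.array([[fxj * fxj, fxj * fyj],
                       [fxj * fyj, fyj * fyj]])  # M at pixel j, eq. (1)
        bj = np.array([fxj * ftj, fyj * ftj])    # b at pixel j, eq. (1)
        grad_sq = fxj * fxj + fyj * fyj          # ||f_s(x_j, y_j)||^2
        denom = sigma1**2 * grad_sq + sigma2**2
        A += wj * Mj / denom
        b_sum += wj * bj / denom
    cov = np.linalg.inv(A + prior_cov_inv)       # Omega_u, eq. (2)
    mean = -cov @ b_sum                          # mu_u, eq. (3)
    return mean, cov
```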

3.2. Structure from Motion

We have chosen two different structure from motion methods, which we will improve in the following sections by incorporating optical flow distributions. There are certainly other structure from motion algorithms that could be improved in much the same way. These two methods were selected in particular because they have several advantages over other methods. The main advantage of the first method, called linear structure from motion [9], is that it is completely linear, and so it avoids the problems associated with non-linear optimization: the mathematics of linear estimation is better understood than that of non-linear estimation, and non-linear estimation may be unstable and computationally inefficient. One of the main disadvantages of the linear method is that the estimate is biased for images with a small field of view [9, 13]. The second method we will present is optimal structure from motion [10]. This method is designed to be the optimal solution in the presence of noise. It is non-linear, which is a disadvantage, but it is usually more accurate in the presence of noise.

The main difference between the two methods is that optimal structure from motion requires an iteration. The iteration requires an initial guess for the rotation, which is then used to estimate the translation. From this estimated translation, a new rotation vector is calculated, and from this new rotation vector, a new translation vector is calculated. The translation and rotation vectors are iteratively refined in this way several times. The linear structure from motion method cannot rely on any such iterations because iterations are inherently non-linear. Instead, the translation vector is calculated without an estimated rotation vector, after which the rotation is calculated from the estimated translation. Neither of these two steps is repeated.

Although the iterations in the optimal structure from motion method are nonlinear, the parts that compose each iteration are in fact linear operations. So, the same method that is being used to calculate the rotation from a known translation in the optimal structure from motion algorithm can also be used in the linear structure from motion algorithm without causing any problems. Similarly, the same method for calculating the depth at each feature point can be used in both algorithms.

3.3. Linear Structure from Motion

Due to space limitations, we will only be able to present the linear structure from motion method without deriving it. For a complete derivation please see [9]. This method and the next both use spherical projection to simplify the mathematics. In this spherical projection, the image obtained from the camera is projected onto a sphere whose center coincides with the optical center of the camera. If there are n feature points in the image and the i-th feature point is located at the position $(x_i', y_i')$ in image coordinates, then the i-th feature point is projected onto the position

$$\mathbf{p}_i = \begin{bmatrix} \dfrac{x_i'}{\sqrt{x_i'^2 + y_i'^2 + f_0^2}} & \dfrac{y_i'}{\sqrt{x_i'^2 + y_i'^2 + f_0^2}} & \dfrac{f_0}{\sqrt{x_i'^2 + y_i'^2 + f_0^2}} \end{bmatrix}^T \tag{4}$$

on a unit sphere, where $f_0$ is the focal length of the camera. The translational motion of the camera will be represented by a unit vector $\mathbf{v}$ pointed in the direction of translation. The rotation of the camera will be represented by a vector $\mathbf{r}$, where the direction of $\mathbf{r}$ is the axis around which the camera rotates and the magnitude of $\mathbf{r}$ is the angle through which it rotates. Let $\lambda_i$ be the magnitude of translation divided by the distance from the optical center of the camera to the i-th feature point, so that $1/\lambda_i$ is the desired depth multiplied by a scale factor. The optical flow of the i-th point is represented by a vector $\mathbf{f}_i$, which is tangent to the unit sphere at $\mathbf{p}_i$, and is given by

$$\mathbf{f}_i = \lambda_i \left( \mathbf{p}_i \times \mathbf{v} \right) \times \mathbf{p}_i + \mathbf{r} \times \mathbf{p}_i. \tag{5}$$
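The geometry of equations (4) and (5) can be sketched in a few lines of numpy. The function names are our own, and $\mathbf{v}$, $\mathbf{r}$, and $\lambda_i$ are assumed known; the signs follow equation (5) as reconstructed here:

```python
import numpy as np

def hat(a):
    """Cross-product matrix: hat(a) @ b == np.cross(a, b)."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

def project_to_sphere(x_img, y_img, f0):
    """Equation (4): project image point (x', y') onto the unit sphere."""
    p = np.array([x_img, y_img, f0])
    return p / np.sqrt(x_img**2 + y_img**2 + f0**2)

def model_flow(p, v, r, lam):
    """Equation (5): predicted optical flow of a unit-sphere point p for
    translation direction v, rotation vector r, and inverse depth lam."""
    return lam * np.cross(np.cross(p, v), p) + np.cross(r, p)
```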

To estimate the translation $\mathbf{v}$, a matrix is formed using the three components of the position vectors $\mathbf{p}_i = [x_i \; y_i \; z_i]^T$. We need to find all values of $\mathbf{w}$ that satisfy the equation

$$\mathbf{w}^T \begin{bmatrix} x_1^2 & y_1^2 & x_1 y_1 & x_1 z_1 & y_1 z_1 & z_1^2 \\ x_2^2 & y_2^2 & x_2 y_2 & x_2 z_2 & y_2 z_2 & z_2^2 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ x_n^2 & y_n^2 & x_n y_n & x_n z_n & y_n z_n & z_n^2 \end{bmatrix} = \mathbf{0}. \tag{6}$$

In general, there are $n - 6$ solutions to this linear system of equations. These solutions constitute the left nullspace of the above $n \times 6$ matrix. Let $\mathbf{w}_j$ be the j-th solution for $\mathbf{w}$ and $w_{ij}$ be the i-th component of the j-th solution. A new set of vectors $\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_{n-6}$ is formed using the equation

$$\mathbf{c}_j = \sum_{i=1}^{n} w_{ij} \left( \mathbf{f}_i \times \mathbf{p}_i \right). \tag{7}$$
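As an illustration of equations (6) and (7), here is a sketch that uses numpy's SVD to obtain the left nullspace; the function and variable names are our own:

```python
import numpy as np

def translation_constraint_vectors(P, F):
    """Equations (6)-(7). P is an n x 3 array of unit-sphere positions and
    F is an n x 3 array of optical flow vectors. Returns the c_j vectors
    as the rows of an (n-6) x 3 array."""
    x, y, z = P[:, 0], P[:, 1], P[:, 2]
    A = np.column_stack([x*x, y*y, x*y, x*z, y*z, z*z])  # n x 6 matrix of (6)
    # Left nullspace of A: the columns of U beyond the first six, whose
    # singular values are zero (up to numerical precision), since the
    # positions p_i are known exactly from the image coordinates.
    U, s, Vt = np.linalg.svd(A)
    W = U[:, 6:]                    # n x (n-6); columns are the w_j of (6)
    Y = np.cross(F, P)              # n x 3; row i is f_i x p_i
    C = W.T @ Y                     # (n-6) x 3; row j is c_j, eq. (7)
    return C
```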

For each of the $\mathbf{c}_j$ vectors, the following relationship exists with the translation vector $\mathbf{v}$:

$$\mathbf{v}^T \mathbf{c}_j + n_j = 0, \tag{8}$$

where $n_j$ is the noise in the j-th equation. A matrix $C$ can be formed by combining all of the $\mathbf{c}_j$ vectors: $C = [\mathbf{c}_1 \; \mathbf{c}_2 \; \cdots \; \mathbf{c}_{n-6}]^T$. If the noise is identically-distributed white Gaussian noise, our estimate of the translation $\mathbf{v}$ is the minimum-eigenvalue eigenvector of the matrix $C^T C$. This was the original approach for finding $\mathbf{v}$ as presented in [9]. From our analysis of the noise in the optical flow, we know that the noise in the $\mathbf{c}_j$ vectors is, in general, neither identically distributed nor white. In that case, a better estimate of $\mathbf{v}$ can be found using the covariance of the noise: if the noise vector $\mathbf{n} = [n_1 \; \cdots \; n_{n-6}]^T$ has a covariance matrix $\Omega = E[\mathbf{n}\mathbf{n}^T]$, then $\mathbf{v}$ is obtained by finding the minimum-eigenvalue eigenvector of the matrix $C^T \Omega^{-1} C$.

Unfortunately, the problem is more complex than this, because the noise appears within the $\mathbf{c}_j$ vectors themselves, not separate from them as equation (8) assumes:

$$\mathbf{v}^T \left( \mathbf{c}_j + \mathbf{n}_j' \right) = \mathbf{v}^T \mathbf{c}_j + \mathbf{v}^T \mathbf{n}_j' = 0. \tag{9}$$

So the noise in each equation depends upon the parameter we wish to estimate, $\mathbf{v}$, but our estimate of $\mathbf{v}$ depends upon the noise. This dilemma can be overcome through a simple iteration. First, an initial estimate of $\mathbf{v}$ is chosen. If $\mathbf{v}_k$ is our estimate of $\mathbf{v}$ on the k-th iteration, then the noise vector can be written as

$$\mathbf{n}_k = \begin{bmatrix} \mathbf{v}_k^T \mathbf{n}_1' & \cdots & \mathbf{v}_k^T \mathbf{n}_{n-6}' \end{bmatrix}^T \tag{10}$$


and has a covariance matrix $\Omega_k$. Our estimate for $\mathbf{v}_{k+1}$ is the minimum-eigenvalue eigenvector of the matrix $C^T \Omega_k^{-1} C$.
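This reweighted eigenvector step is easy to sketch in numpy, assuming the covariance $\Omega_k$ has been assembled as described next (equations (11)-(12)); the helper names are our own:

```python
import numpy as np

def min_eig_vector(A):
    """Unit eigenvector of symmetric A for its smallest eigenvalue."""
    w, V = np.linalg.eigh(A)        # eigenvalues in ascending order
    return V[:, 0]

def estimate_translation(C, Omega_k):
    """One iteration's estimate of v: the minimum-eigenvalue eigenvector
    of C^T Omega_k^-1 C, where Omega_k is the current noise covariance."""
    M = C.T @ np.linalg.solve(Omega_k, C)   # C^T Omega_k^-1 C, no explicit inverse
    return min_eig_vector(M)
```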

It is possible to construct a series of matrices that converts the $2n \times 1$ vector $\mathbf{u}$, containing all of the optical flow values, into the desired $3(n-6) \times 1$ vector containing all of the $\mathbf{c}_j$ vectors. In this series, there is first a matrix $A_{uf}$, which converts the optical flow values $\mathbf{u}$ into a $3n \times 1$ vector containing each of the $\mathbf{f}_i$ vectors. Then there is a matrix $A_{fy}$, which performs the cross product $\mathbf{f}_i \times \mathbf{p}_i$, after which a matrix $A_{yc}$ uses the $\mathbf{w}_j$ vectors according to equation (7) to produce the $3(n-6) \times 1$ vector $\mathbf{c}$ containing all of the $\mathbf{c}_j$ vectors, so that

$$\mathbf{c} = A_{yc} A_{fy} A_{uf} \mathbf{u}, \qquad E\!\left[\mathbf{n}'\mathbf{n}'^T\right] = A_{yc} A_{fy} A_{uf} \, \Omega_u \, A_{uf}^T A_{fy}^T A_{yc}^T. \tag{11}$$

Finally, if we construct a matrix $A_{cn}$, which performs the dot product of $\mathbf{v}_k$ with each of the $\mathbf{c}_j$ vectors according to equation (10), then the covariance of the noise is given by

$$\Omega_k = A_{cn} A_{yc} A_{fy} A_{uf} \, \Omega_u \, A_{uf}^T A_{fy}^T A_{yc}^T A_{cn}^T. \tag{12}$$
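Equations (11) and (12) are instances of the standard rule that a covariance propagates through a linear map $A$ as $A \Omega A^T$. As a sketch, with the stacked matrices assumed to be given:

```python
import numpy as np

def propagate_covariance(A_uf, A_fy, A_yc, A_cn, Omega_u):
    """Equations (11)-(12): push the optical flow covariance Omega_u through
    the linear maps u -> f -> f x p -> c -> v^T c to obtain the noise
    covariance Omega_k. Each A_* is the matrix for one stage, as in the text."""
    A = A_cn @ A_yc @ A_fy @ A_uf   # composite linear map applied to u
    return A @ Omega_u @ A.T        # cov(A u) = A cov(u) A^T
```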

The problem with this approach is that it uses an iteration, and iterations are nonlinear. Much of the appeal of this method comes from its being completely linear, so we cannot introduce any nonlinear steps. To remain linear, only a single iteration is allowed: an initial guess for $\mathbf{v}$ is chosen, and this initial value is used to estimate $\mathbf{v}$ with no further iterations. This requires a fairly good initial guess for $\mathbf{v}$. In our particular application, we have an excellent initial estimate: the UAV is only capable of flying in the forward or near-forward directions and cannot change direction suddenly, so the direction the UAV is heading in the current camera frame is likely to be very close to the direction it was heading in the previous frame. In other applications, obtaining this initial guess may be more difficult, but even with a poor initial estimate of $\mathbf{v}$, this method should still perform better than the original method.

It is possible to adapt the linear methods for estimating the rotation and depth from [9] in much the same way we modified the translation estimate, so that they too use optical flow probability distributions. We have tried this and achieved better results after doing so. However, in section 3.4 we will present optimal rotation and depth estimates that were modified from a different method. These estimates are more accurate than the estimates adapted from [9]. We suggest using the rotation estimate and the depth estimate described below in either a completely linear structure from motion algorithm or in a nonlinear optimal structure from motion algorithm.

3.4. Optimal Structure from Motion

We will introduce some new notation in order to be more consistent with [10]. The "hat" operator will be used to indicate the matrix that performs the cross product between two vectors: $\hat{\mathbf{p}} \mathbf{v} = \mathbf{p} \times \mathbf{v}$. We will also introduce a new vector $\mathbf{y}_i = \mathbf{f}_i \times \mathbf{p}_i$. Equation (5) can now be written as

$$\mathbf{y}_i + \lambda_i \hat{\mathbf{p}}_i \mathbf{v} - \hat{\mathbf{p}}_i^2 \mathbf{r} = 0. \tag{13}$$

We introduce a new matrix, $G_i$, which converts the optical flow value $\mathbf{u}_i$ into the vector $\mathbf{y}_i$, so that

$$G_i \mathbf{u}_i + \lambda_i \hat{\mathbf{p}}_i \mathbf{v} - \hat{\mathbf{p}}_i^2 \mathbf{r} = 0. \tag{14}$$

We now apply the left pseudo-inverse of $G_i$, which is $G_i^{\dagger} = \left( G_i^T G_i \right)^{-1} G_i^T$, so that


$$\mathbf{u}_i + \lambda_i G_i^{\dagger} \hat{\mathbf{p}}_i \mathbf{v} - G_i^{\dagger} \hat{\mathbf{p}}_i^2 \mathbf{r} = 0. \tag{15}$$

Our goal is to minimize the quantity

$$\sum_{i=1}^{n} \left\| \mathbf{u}_i + \lambda_i G_i^{\dagger} \hat{\mathbf{p}}_i \mathbf{v} - G_i^{\dagger} \hat{\mathbf{p}}_i^2 \mathbf{r} \right\|_{\Omega_i^{-1}}^2, \tag{16}$$

where $\Omega_i$ is the covariance matrix of $\mathbf{u}_i$ from equation (2) and the weighted norm $\| \cdot \|_{\Omega_i^{-1}}$ is given by $\| \mathbf{x} \|_{\Omega_i^{-1}}^2 = \mathbf{x}^T \Omega_i^{-1} \mathbf{x}$.

The norm in (16) is equivalent to a new weighted norm given by

$$\sum_{i=1}^{n} \left\| \mathbf{y}_i + \lambda_i \hat{\mathbf{p}}_i \mathbf{v} - \hat{\mathbf{p}}_i^2 \mathbf{r} \right\|_{W_i}^2, \tag{17}$$

where $W_i = G_i \Omega_i^{-1} G_i^T$. Our goal is to find the values of $\mathbf{v}$, $\mathbf{r}$, and $\lambda_i$ that minimize this weighted norm. Using a generalized least squares approximation, the optimal value of $\lambda_i$ for any given $\mathbf{v}$ and $\mathbf{r}$ is found to be

$$\lambda_i = \frac{-\mathbf{v}^T \hat{\mathbf{p}}_i W_i \left( \mathbf{y}_i - \hat{\mathbf{p}}_i^2 \mathbf{r} \right)}{\mathbf{v}^T \hat{\mathbf{p}}_i W_i \hat{\mathbf{p}}_i \mathbf{v}}. \tag{18}$$

This solution for $\lambda_i$ can now be placed back into equation (17), so that we are now trying to find the values of $\mathbf{v}$ and $\mathbf{r}$ that minimize the quantity

$$\sum_{i=1}^{n} \left\| \mathbf{y}_i - \hat{\mathbf{p}}_i^2 \mathbf{r} - \frac{\hat{\mathbf{p}}_i \mathbf{v} \, \mathbf{v}^T \hat{\mathbf{p}}_i W_i \left( \mathbf{y}_i - \hat{\mathbf{p}}_i^2 \mathbf{r} \right)}{\mathbf{v}^T \hat{\mathbf{p}}_i W_i \hat{\mathbf{p}}_i \mathbf{v}} \right\|_{W_i}^2. \tag{19}$$

At this point it becomes too difficult to solve for $\mathbf{v}$ and $\mathbf{r}$ simultaneously. Instead, we pick an initial value for $\mathbf{r}$ and then find the optimal value of $\mathbf{v}$ given that particular value of $\mathbf{r}$. Then we use the estimate for $\mathbf{v}$ to find a new estimate for $\mathbf{r}$. The estimates of $\mathbf{v}$ and $\mathbf{r}$ improve with each iteration. The estimates of $\mathbf{v}$ and $\mathbf{r}$ on the k-th iteration will be denoted by $\mathbf{v}_k$ and $\mathbf{r}_k$.

If we define the matrix $Q_i$ to be

$$Q_i = I - \frac{\hat{\mathbf{p}}_i \mathbf{v}_k \mathbf{v}_k^T \hat{\mathbf{p}}_i W_i}{\mathbf{v}_k^T \hat{\mathbf{p}}_i W_i \hat{\mathbf{p}}_i \mathbf{v}_k},$$

then the weighted norm (19) can be written as

$$\sum_{i=1}^{n} \left\| Q_i \left( \mathbf{y}_i - \hat{\mathbf{p}}_i^2 \mathbf{r}_{k+1} \right) \right\|_{W_i}^2 = \sum_{i=1}^{n} \left\| \mathbf{y}_i - \hat{\mathbf{p}}_i^2 \mathbf{r}_{k+1} \right\|_{Q_i^T W_i Q_i}^2. \tag{20}$$

The optimal value of $\mathbf{r}_{k+1}$ is found using a generalized least squares approximation as

$$\mathbf{r}_{k+1} = \left[ \sum_{i=1}^{n} \hat{\mathbf{p}}_i^2 Q_i^T W_i Q_i \hat{\mathbf{p}}_i^2 \right]^{-1} \sum_{i=1}^{n} \hat{\mathbf{p}}_i^2 Q_i^T W_i Q_i \mathbf{y}_i. \tag{21}$$

Unfortunately, we have not been able to find a closed-form solution for finding $\mathbf{v}$ from a given $\mathbf{r}$, but we can use the original solution for finding $\mathbf{v}$ in [10] and then modify it slightly. This original solution is no longer the optimal solution because it omits the $W_i$ terms in the estimation of $\lambda_i$ from equation (18), but it is a fairly good approximation.


The original solution for $\mathbf{v}$ is the minimum-eigenvalue eigenvector of the matrix $\sum_{i=1}^{n} \left( \mathbf{y}_i - \hat{\mathbf{p}}_i^2 \mathbf{r} \right) \left( \mathbf{y}_i - \hat{\mathbf{p}}_i^2 \mathbf{r} \right)^T$. This solution does not take into consideration the noise characteristics of $\mathbf{y}_i$. The noise characteristics depend on the parameter we are estimating, $\mathbf{v}$, so we apply the same argument outlined in section 3.3, which gives us an iterative method that requires an initial guess for $\mathbf{v}$ and then uses past estimates of $\mathbf{v}$. Our estimate for $\mathbf{v}_k$ is the minimum-eigenvalue eigenvector of the matrix

$$\sum_{i=1}^{n} \frac{\left( \mathbf{y}_i - \hat{\mathbf{p}}_i^2 \mathbf{r}_k \right) \left( \mathbf{y}_i - \hat{\mathbf{p}}_i^2 \mathbf{r}_k \right)^T}{\mathbf{v}_{k-1}^T G_i \Omega_i G_i^T \mathbf{v}_{k-1}}. \tag{22}$$
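Pulling equations (18) and (20)-(22) together, one iteration of the modified optimal structure from motion might be sketched as follows. This is our own arrangement of the update order and data layout, not the authors' code:

```python
import numpy as np

def hat(a):
    """Cross-product matrix: hat(a) @ b == np.cross(a, b)."""
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

def min_eig_vector(A):
    """Unit eigenvector of symmetric A for its smallest eigenvalue."""
    w, V = np.linalg.eigh(A)
    return V[:, 0]

def optimal_sfm_iteration(P, Y, G, Omega, v_prev):
    """One iteration of the modified optimal structure from motion,
    equations (18) and (20)-(22). P: n x 3 sphere points; Y: n x 3 vectors
    y_i = f_i x p_i; G, Omega: lists of per-point G_i (3x2) and flow
    covariances Omega_i (2x2). Returns the updated (v, r)."""
    n = len(P)
    I3 = np.eye(3)
    # Pass 1: rotation update, equation (21).
    A_r = np.zeros((3, 3))
    b_r = np.zeros(3)
    for i in range(n):
        ph = hat(P[i])
        ph2 = ph @ ph
        W = G[i] @ np.linalg.solve(Omega[i], G[i].T)   # W_i = G_i Omega_i^-1 G_i^T
        pv = ph @ v_prev                               # \hat{p}_i v_k
        Q = I3 - np.outer(pv, pv) @ W / (pv @ W @ pv)  # Q_i, defined before (20)
        QWQ = Q.T @ W @ Q
        A_r += ph2 @ QWQ @ ph2
        b_r += ph2 @ QWQ @ Y[i]
    r_new = np.linalg.solve(A_r, b_r)                  # eq. (21)
    # Pass 2: translation update, equation (22).
    M_v = np.zeros((3, 3))
    for i in range(n):
        ph2 = hat(P[i]) @ hat(P[i])
        resid = Y[i] - ph2 @ r_new                     # y_i - \hat{p}_i^2 r_k
        noise = v_prev @ (G[i] @ Omega[i] @ G[i].T) @ v_prev
        M_v += np.outer(resid, resid) / noise
    v_new = min_eig_vector(M_v)                        # eq. (22)
    return v_new, r_new
```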

3.5. Positive Depth Constraint

Thus far we have overlooked the fact that the depth of each point in the scene must be positive, so that $\lambda_i \geq 0$. After recognizing this, the optimal solution for $\lambda_i$ becomes

$$\lambda_i = \begin{cases} \dfrac{-\mathbf{v}^T \hat{\mathbf{p}}_i W_i \left( \mathbf{y}_i - \hat{\mathbf{p}}_i^2 \mathbf{r} \right)}{\mathbf{v}^T \hat{\mathbf{p}}_i W_i \hat{\mathbf{p}}_i \mathbf{v}}, & -\mathbf{v}^T \hat{\mathbf{p}}_i W_i \left( \mathbf{y}_i - \hat{\mathbf{p}}_i^2 \mathbf{r} \right) > 0 \\[2mm] 0, & -\mathbf{v}^T \hat{\mathbf{p}}_i W_i \left( \mathbf{y}_i - \hat{\mathbf{p}}_i^2 \mathbf{r} \right) \leq 0 \end{cases} \tag{23}$$
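A small sketch of this clamped inverse-depth estimate, combining equations (18) and (23); the function name is ours, and $W_i$, $\mathbf{y}_i$, $\mathbf{v}$, and $\mathbf{r}$ are assumed given:

```python
import numpy as np

def inverse_depth(p_hat, W, y, v, r):
    """Equations (18) and (23): generalized-least-squares inverse depth for
    one point, clamped to be non-negative. p_hat is the hat matrix of p_i."""
    pv = p_hat @ v                              # \hat{p}_i v
    resid = y - p_hat @ p_hat @ r               # y_i - \hat{p}_i^2 r
    lam = -(pv @ W @ resid) / (pv @ W @ pv)     # unconstrained optimum, eq. (18)
    return max(lam, 0.0)                        # positive depth constraint, eq. (23)
```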

Ideally, we would plug this solution back into equation (17) and solve for $\mathbf{v}$ and $\mathbf{r}$, but in doing so, it quickly becomes too difficult to solve in this way. Instead, as we go through each iteration, we keep track of which points are currently estimated to have $\lambda_i = 0$ and then treat those points slightly differently. For a point at which $\lambda_i = 0$, the norm to minimize becomes $\left\| \mathbf{y}_i - \hat{\mathbf{p}}_i^2 \mathbf{r} \right\|_{W_i}^2$. The translation vector $\mathbf{v}$ no longer appears in this norm. Consequently, points at which $\lambda_i = 0$ provide us with no information about the translation vector, and so they should be ignored in the translation calculation. The optimal value of $\mathbf{r}$ for points at which $\lambda_i = 0$ is given by

$$\mathbf{r} = \left[ \sum_{i=1}^{n} \hat{\mathbf{p}}_i^2 W_i \hat{\mathbf{p}}_i^2 \right]^{-1} \sum_{i=1}^{n} \hat{\mathbf{p}}_i^2 W_i \mathbf{y}_i. \tag{24}$$

Let us order the points such that $\lambda_i > 0$ for $1 \leq i \leq p$ and $\lambda_i = 0$ for $p+1 \leq i \leq n$; then the optimal value for $\mathbf{r}$ is given by

$$\mathbf{r} = \left[ \sum_{i=1}^{p} \hat{\mathbf{p}}_i^2 Q_i^T W_i Q_i \hat{\mathbf{p}}_i^2 + \sum_{i=p+1}^{n} \hat{\mathbf{p}}_i^2 W_i \hat{\mathbf{p}}_i^2 \right]^{-1} \left[ \sum_{i=1}^{p} \hat{\mathbf{p}}_i^2 Q_i^T W_i Q_i \mathbf{y}_i + \sum_{i=p+1}^{n} \hat{\mathbf{p}}_i^2 W_i \mathbf{y}_i \right] \tag{25}$$

and $\mathbf{v}_k$ is the minimum-eigenvalue eigenvector of the matrix

$$\sum_{i=1}^{p} \frac{\left( \mathbf{y}_i - \hat{\mathbf{p}}_i^2 \mathbf{r}_k \right) \left( \mathbf{y}_i - \hat{\mathbf{p}}_i^2 \mathbf{r}_k \right)^T}{\mathbf{v}_{k-1}^T G_i \Omega_i G_i^T \mathbf{v}_{k-1}}. \tag{26}$$

There is one additional concern we have overlooked until this point: there is a sign ambiguity in the estimate of $\mathbf{v}$. The norm in (17) is exactly the same if $\mathbf{v}$ is estimated as the exact opposite of its true value and the values of $\lambda_i$ are also estimated to be the exact opposites of their true values. However, this ambiguity can be resolved after recognizing that the majority of the $\lambda_i$ values should be positive. In the event that the majority are not positive, we know that our estimates of $\mathbf{v}$ and the $\lambda_i$ values are the opposite of their true values.
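The sign check can be sketched as follows (our function name; the $\lambda_i$ here are the unconstrained estimates from equation (18), before the clamp of equation (23) is applied):

```python
import numpy as np

def fix_sign_ambiguity(v, lams):
    """Resolve the sign ambiguity in v: the majority of the inverse depths
    should be positive; if not, flip both v and the lambda_i."""
    lams = np.asarray(lams)
    if np.sum(lams < 0) > len(lams) / 2:
        return -v, -lams
    return v, lams
```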


4. RESULTS

A simulation was created to directly compare the different methods. In the simulation, a random rotation and translation vector were selected for each trial. Fifty feature points were used, and each was assigned a random position and depth. Using this information, optical flow values were calculated for each feature point and then corrupted by noise. The noise on the optical flow values was not identically distributed: the noise on each optical flow value in each of the two dimensions was randomly assigned to be between 0.25 and 1.75 times the mean noise value. The optical flow values were then used in four methods: the original linear structure from motion, the modified linear structure from motion, the original optimal structure from motion, and the modified optimal structure from motion. The translation and rotation errors were calculated at different mean noise values and averaged over 100 trials.

The results of the simulation were plotted on a log-log scale. The translation errors are shown in Figure 1 and the rotation errors are shown in Figure 2. The results show that the modified versions of both the linear and the optimal structure from motion algorithms perform significantly better in both translation and rotation estimation, with the improvement more pronounced at higher noise levels. Of the four methods, the modified optimal structure from motion appears to work best. The modified versions of the two methods had, on average, a translation error of about two-thirds the original error. The modified optimal structure from motion has, on average, a rotation error about three-fourths of the original rotation error, while the modified linear structure from motion has, on average, a rotation error about three-fifths of the original rotation error.

Figure 1: Translation errors for different noise levels.
Figure 2: Rotation errors for different noise levels.

Several aspects of this simulation are known to be unrealistic. The simulation assumes that we know the exact covariance matrix of the noise. In practice, this too must be estimated using equation (2). Errors in the covariance estimate will cause the modified methods, which use it, to perform more poorly. It is also difficult to say how realistic the values for the covariance matrices are. In scenes that contain many edge points but few good corner points, the modified methods may perform much better than the simulation results would indicate, but in some scenes they may perform more poorly.

The methods were tested on two image sequences: one computer-generated sequence and one real image sequence taken from a camera onboard a UAV headed toward a real obstacle. Figure 3 shows one frame from the computer-generated sequence. Computer-generated images are useful because the true depth of the objects in the image is known precisely. Figure 4 shows the true inverse depth, $\lambda_i$, at each point in the image; darker areas indicate an object that is farther away and lighter areas indicate an object that is close. Figures 5 and 6 show the inverse depth recovered using the modified and the original optimal structure from motion algorithms. These methods only recover the depth at the feature points on the image; the depth at other points can only be found through interpolation, which is why the images have dots where the feature points are located. Both Figures 5 and 6 show that the methods are working well, with the modified optimal structure from motion working slightly better. The translation error is 0.8° in the modified method compared with 2.0° in the original method, and the mean error in the depth estimate is 10% of the true value in the modified method compared with 13% in the original method.

Figure 3: Computer-generated image used to test the methods.
Figure 4: True inverse depth of the image.

Figure 5: Recovered inverse depth using the modified optimal structure from motion algorithm.
Figure 6: Recovered inverse depth using the original optimal structure from motion algorithm.

Figure 7: One frame from a real video sequence from the UAV.
Figure 8: Recovered inverse depth using the modified optimal structure from motion algorithm.
Figure 9: Confidence levels for the recovered inverse depth.

Figure 7 shows one frame taken from a video that was recorded onboard a UAV headed directly into a tree. Figure 8 shows the recovered inverse depth, in which the tree has been detected as being close to the camera. One additional benefit of this method is that we know the accuracy of each feature point, so we can produce a confidence level for each point, shown in Figure 9. The logic behind this is that it is useful to know not only where the obstacles are in the image, but also how confident we are that an obstacle actually exists there. Notice that, since there are few good feature points in the sky, the recovered depth in the sky is very inaccurate; this inaccurate data is not a problem, however, because the confidence levels in Figure 9 tell us that it should be inaccurate.

5. CONCLUSIONS

We have modified two different structure from motion algorithms to use optical flow probability distributions. The modified optimal structure from motion algorithm is a nearly optimal solution in the presence of noise. We have demonstrated that the modified methods work significantly better than the original methods, and that the algorithms perform well on both computer-generated and real image sequences.

There is still room for improvement, especially in the optical flow estimate. There are dozens of different optical flow methods [14], but few of them compute the covariance matrix of the optical flow estimate. In the past there was little reason to do so, because the covariance matrix was not used in the structure from motion calculation. Having demonstrated that the covariance matrix can significantly improve the structure from motion calculation, we believe that a covariance estimate should be an essential part of any new optical flow method.

6. REFERENCES

1. T. Netter and N. Franceschini, "A robotic aircraft that follows terrain using a neuromorphic eye," Proc. Conf. Intelligent Robots and Systems, vol. 1, pp. 129-134, 2002.
2. B. Sinopoli, M. Micheli, G. Donato, and T. J. Koo, "Vision based navigation for an unmanned aerial vehicle," Proc. Conf. Robotics and Automation, pp. 1757-1764, 2001.
3. S. Saripalli, J. F. Montgomery, and G. S. Sukhatme, "Vision-based autonomous landing of an unmanned aerial vehicle," Proc. Conf. Robotics and Automation, vol. 3, pp. 2799-2804, 2002.
4. L. M. Lorigo, R. A. Brooks, and W. E. L. Grimson, "Visually-guided obstacle avoidance in unstructured environments," Proc. Conf. Intelligent Robots and Systems, vol. 1, pp. 373-379, 1997.
5. M. T. Chao, T. Braunl, and A. Zaknich, "Visually-guided obstacle avoidance," Proc. Conf. Neural Information Processing, vol. 2, pp. 650-655, 1999.
6. F. Dellaert, S. M. Seitz, C. E. Thorpe, and S. Thrun, "Structure from Motion without Correspondence," Proc. Conf. Computer Vision and Pattern Recognition, pp. 557-564, 2000.
7. M. S. Langer and R. Mann, "Dimensional analysis of image motion," Proc. Int. Conf. Computer Vision, pp. 155-162, 2001.
8. E. P. Simoncelli, E. H. Adelson, and D. J. Heeger, "Probability distributions of optical flow," Proc. Conf. Computer Vision and Pattern Recognition, pp. 310-315, 1991.
9. I. Thomas and E. Simoncelli, Linear Structure from Motion, Technical Report IRCS 94-26, University of Pennsylvania, 1994.
10. S. Soatto and R. Brockett, "Optimal Structure from Motion: Local Ambiguities and Global Estimates," Proc. Conf. Computer Vision and Pattern Recognition, pp. 282-288, 1998.
11. J. Weng, N. Ahuja, and T. Huang, "Motion and structure from two perspective views: algorithms, error analysis, and error estimation," IEEE Trans. Pattern Anal. Mach. Intell., 11(5): 451-476, 1989.
12. P. C. Merrell, D. J. Lee, and R. W. Beard, "Statistical Analysis of Multiple Optical Flow Values for Estimation of Unmanned Air Vehicles Height Above Ground," SPIE Optics East, Robotics Technologies and Architectures, Intelligent Robots and Computer Vision XXII, vol. 5608-28, Philadelphia, PA, USA, October 25-28, 2004.
13. A. Jepson and D. Heeger, "Linear subspace methods for recovering translational direction," Cambridge University Press, 1992.
14. J. Barron, D. J. Fleet, S. S. Beauchemin, and T. A. Burkitt, "Performance of optical flow techniques," Proc. IEEE CVPR, Champaign, IL, pp. 236-242, 1992.
