Monocular Vision Aided Autonomous UAV Navigation in Indoor ...thealphalab.org/papers/Monocular...

2377-3782 (c) 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TSUSC.2018.2810952, IEEETransactions on Sustainable Computing

1

Monocular Vision Aided Autonomous UAVNavigation in Indoor Corridor Environments

Ram Prasad Padhy, Feng Xia, Senior Member, IEEE, Suman Kumar Choudhury,Pankaj Kumar Sa, Member, IEEE, and Sambit Bakshi, Member, IEEE

Abstract—Deployment of autonomous Unmanned Aerial Vehicles (UAV) in various sectors such as disaster hit environments,industries, agriculture etc. not only improves productivity but also reduces human intervention resulting in sustainable benefits. In thisregard, we present a model for autonomous navigation and collision avoidance of UAVs in GPS-denied corridor environments. In thefirst stage, we suggest a fast procedure to estimate the set of parallel lines whose intersection would yield the position of the vanishingpoint (VP) inside the corridor. A suitable measure is then formulated based on the position of VP on the intersecting lines in referenceto any of the image boundary axes which helps safe navigation of the UAV avoiding collisions with side walls. Furthermore, the relativeEuclidean distance scale expansion of matched scale-invariant keypoints in a pair of frames is taken into account to estimate the depthof a frontal obstacle. However, turbulence in the UAV arising due to its rotors or external factors may intruduce uncertainty in depthestimation. It is rectified with the help of a constant velocity aided Kalman filter model. Necessary set of control commands are thengenerated to avoid the frontal collision. Exhaustive experiments advocate the efficacy of the proposed scheme.

Index Terms—Collision avoidance; Monocular vision; Vanishing point; UAV navigation; Scale-invariant features; Kalman filter.

✦

1 INTRODUCTION

SUSTAINABLE robotics is an emerging area of research,which deals with the deployment of energy efficient

robots in the fields of scientific research, geology, renewalenergy, agriculture, and many more. It poses an excitingchallenge, where researchers from around the world canequally contribute and benefit. Development of UnmannedAerial Vehicles (UAV), popularly known as drones, isturning out to be a great ally of sustainability. UAVs withon-board camera(s) can be used to analyze real-time videodata of disaster hit environments, such as floods, earthquakeaffected areas, forest fire etc. from a remote location. This inturn helps in taking reasonable decisions before hand, andthereby saves time and energy. Furthermore, UAVs withthe power of autonomy acts as an additive for attainingsustainability.

Autonomous navigation of UAVs with collisionavoidance has been a key area of research in thefield of robotics and computer vision. It has enormousapplications in both public and private sectors suchas, aerial surveillance, search and rescue movements,militarily applications, disaster management, intelligenttransportation, precision agriculture, and so forth. Also,UAVs can venture to places, where humans can’t and collectvideo data for further understanding. The data accumulatedby the UAVs are used for scientific studies and launchalerts whenever necessary, thus assists a great deal towardssustainability. However, a robust collision avoidance

• R. P. Padhy, S. K. Choudhury, P. K. Sa and S. Bakshi are with theDepartment of Computer Science and Engineering, National Institute ofTechnology, Rourkela 769008, India.E-mail: {ramprasad.nitr, sumanchoudhury.nitr, sambitbaksi}@gmail.com, [email protected]

• F. Xia is with the Department of Cyber Engineering, Dalian University ofTechnology, Dalian 116620, China.E-mail: [email protected]

Fig. 1: UAV is navigating autonomously inside aGPS-denied corridor environment. While navigation, it hasto avoid collision with the side walls and at the same time,it should cease its motion before colliding with the frontalwall present at the end of the corridor.

algorithm with autonomous navigation is essential to carryout the desired tasks with absolute aplomb.

In general, most of the UAVs are equipped within-built GPS technology, which acts as a reliable sensorin maintaining the precise pose of the UAV in unknownoutdoor environments [1, 2, 3, 4, 5]. However, in indoorscenarios, the GPS precision is very low or the GPSsignal is not present at all, and thereby cannot berelied upon for obtaining the precise pose. Hence, inthese scenarios, proximity sensors are utilized to carryoutthe task of autonomous navigation [6, 7, 8]. However,most of the proximity sensors, such as LIDARs are highpower-consuming and heavy-weight sensors, and hencethese are more suitable for ground-based vehicles ascompared to UAVs.

Autonomous UAV navigation with monocular



2

Fig. 2: The proposed System Architecture with two major components: UAV Module and a Vision Module (VM). The videofeed from the UAV front camera is transmitted on a frame-by-frame basis to the VM via Robot Operating System (ROS)interface for further processing. VM first tries to locate the position of the vanishing point in each frame with the proposedfast algorithm (Section 3.1). The relative position of the vanishing point on the image plane with respect to any of theimage boundary axes is then utilized to devise an algorithm for deriving the next set of control commands that will ensuresafe navigation of the UAV through the corridor without colliding with the side walls (Section 3.2). Once the vanishingpoint could not be located, our algorithm utilizes the scale invariant visual features to avoid the collision with the frontalobstacle; usually a wall at the end of the corridor (Section 3.3).

vision [9, 10, 11, 12, 13, 14, 15] is an emerging, yetmore challenging task. This paper presents a monocularvision aided method to navigate a UAV safely insidecorridor environments. The UAV is equipped with aforward facing static camera and the video extracted fromthis camera is processed to generate the next set of controlcommands for safe autonomous navigation. It may benoted that the forward facing camera of the UAV is the onlysensor that is used by our algorithm for accomplishing thetask. The whole method is a two stage process occurringsequentially —(a) avoid collision with the side walls(b) at the end of the corridor, detect and avoid the frontalobstacle, usually a wall.

The first task is achieved with the help of afast vanishing point algorithm, while for accomplishingthe second task, scale-invariant features are used. Thisobstacle avoidance algorithm can be used in manyapplications, such as intelligent transportation, navigationinside collapsed buildings for monitoring purpose etc.Autonomous navigation of a UAV inside a corridor is shownin Figure 1.

Overview of solution strategy: The proposed systemarchitecture, as shown in Figure 2, comprises of twomajor components: a UAV and a vision module (VM).Communication between the two components is achievedby means of wireless LAN, generated by the UAV. Thevideo from the UAV front camera is first extracted and thenprocessed by the VM on a frame-by-frame basis. In the firststage, VM tries to find the vanishing point and dependingon its relative position on the image plane with referenceto any of the image boundary axes, the next set of controlcommands are generated to avoid collision with the sidewalls. In the second stage, at the end of the corridor, VMestimates the depth of the frontal obstacle (usually a wall)based on the scale-invariant visual features extracted fromthe video frames. Noisy depth measurements arising due toturbulence in the UAV is rectified by means of a Kalmanfilter model. The corrected depth value is then utilized todesign the next set of control commands to avoid the frontalwall before collision. The internal PID controller of the UAVis responsible for modifying the control commands (changein yaw, roll and pitch values).

Contributions: The contributions of our proposed work canbe summarized as follows:



3

(a) We propose a monocular vision aided fast novelmethod to locate the vanishing point of the corridor on aframe-by-frame basis on the 2D image plane.(b) An algorithm is then designed, based on the knowledgeof vanishing point location alongside a formulated measure,to generate the required set of control commands for safeand secured navigation of the UAV without colliding theside walls.(c) At the end of the corridor, an algorithm based onscale-invariant features is applied to estimate the depthof the frontal wall. However, depth value might be noisyowing to the turbulence in the UAV, caused due to windand its rotors. A constant velocity aided 1D-Kalman filteris then employed to rectify noisy depth measurements.Depending on the rectified depth, next set of essentialcontrol commands are designed to avoid the frontal collisionwell before time.(d) Efficacy of the proposed algorithm is demonstrated bynavigating a UAV in GPS-denied indoor corridors withvarying dimensions.

Rest of the paper is organized as follows. Section 2presents an overview of the prior research. Section 3delineates the proposed methodology followed by adetailed elaboration on the proposed collision avoidancemethod. The experimental results are discussed in Section 4.Finally, Section 5 provides the concluding remarks.

2 RELATED WORK

Quite a significant amount of work on autonomous UAVnavigation in varying environments have been reported inthe recent past. The sensors utilized by different navigationmethods vary according to the application and availability.Hence, depending upon the use of sensors, the prior workin this field can broadly be categorized into two majorcategories: (a) Proximity sensors, such as LIDAR, and (b)Vision sensors, such as monocular or stereo cameras.Proximity sensors: These sensors usually employ theelectromagnetic radiations, such as infrared rays to detectthe obstacles in the environment. Few of the research workshave reported the use of LIDAR and ultrasonic sensors tomeasure the distance of the obstacle from the UAV [16, 17,18]. However, the constrained pay-load of the light-weightUAVs make it infeasible to install laser range-finders andover-whelming weight sensors. Cruz et al. applied theHarmonic functions to model the UAV goal position andthe obstacles with help of on-board proximity sensors,and thereby avoided the presence of local minima [19].Chen et al. recorded the data from several IMU readingsand ultrasonic sensors to create a SLAM system thatmaps the environment using an android application [20].Stubblebine et al. proposed a laser-light guided algorithmfor obstacle detection followed by the use of LIDAR tocombine the present data with the previous readings [21].Roberts et al. [22] used a Kalman filter model along withthe readings from an ultrasonic sensor to precisely estimatethe UAV pose in GPS-denied environments. In their work,at each feasible location, the UAV estimates a favorablepath based on the observation density, calculated from anexisting map. However, proximity sensors are not alwaysa good choice for obstacle detection, because of their

heavy weight and high power consumption, and also createmounting problem in UAVs.Vision sensors: These light-weight sensors are idealreplacements for obstacle detection because of the propertyof low power consumption. The modern UAVs usuallyemploy the vision sensors for obstacle avoidance andnavigation task. For instance, a UAV with a pair ofcameras, usually apply the parallax of stereo vision toobtain the depth information that governs the UAVmovement [23, 24, 25]. However, the computational costwith two cameras sometimes troubles the algorithm towork in real time. Subsequently, UAV navigation withmonocular vision (using one camera only) becomes anemerging, yet more challenging task. Most of the monocularvision based collision avoidance systems apply cues likeoptical flow, visual odometry to sense the obstacles inthe vicinity of a robot [9, 11, 12, 13, 14, 15]. He et al.applied the motion field data of the camera and other IMUreadings of the UAV to precisely assess the ego-motionparameters to fine-tune the motion estimates [26]. Achteliket al. employed the IMU measurements in co-ordinationwith the vision data of the front camera to subdue the effectof less-frequent vision updates, where the accurate scaleof the pose is estimated using a pressure sensor. [27]. Saet al. developed an autonomous PID controller algorithmusing Parallel Tracking and Mapping (PTAM) and Kalmanfilter [15]. Lioulemes et al. proposed a vanishing point basedalgorithm for autonomous navigation of UAV in indoorcorridors [28]. Engel et al. employed a monocular SLAMtechnique to approximate the UAV pose and integratedthe results with an extended Kalman filter to compensatethe communication delay. They further designed a PIDcontroller algorithm to autonomously navigate the UAVwithout any collision [29, 30]. In another work, Wang et al.introduced a learning based optical flow method for UAVnavigation [13]. Carranza et al. implemented a methodbased on visual odometry for controlling and localizing theUAV in GPS-denied environments [14]. Bills et al. proposedan autonomous UAV navigation algorithm for avoidingcollision in different indoor environments such as corridors,staircases and closed rooms using single image perspectivecues [31]. Martins et al. utilized the video from a staticmonocular camera to navigate the UAV in unknown indoorenvironments [32] with the help of door detection at theend of the corridor using image processing techniques.A group of researchers from Czech Technical Universityin Prague utilized the inputs from a monocular cameraand odometry to develop a navigation system based ondead-reckoning and histogram voting scheme [33]. Anothergroup of researchers utilized the visual features and theirscale change to estimate the distance from the frontalobstacle [11, 12, 34].

Our current work utilizes the video from the frontcamera to safely navigate the UAV in indoor corridorenvironments. Unlike existing methods, where multiplesensors are employed to accomplish the task of autonomousnavigation, our proposed method uses only a monocularstatic camera. Apart from using the front facing camera,other sensors are neither used nor any extra sensors areadded to the UAV for carrying out the task of obstacledetection and safe navigation.



4

3 PROPOSED METHODOLOGY

In this article, we present a collision avoidance algorithm forautonomous UAV navigation through a corridor. In otherwords, the UAV, in motion, should be able to navigatewithout any collision to the side wall, at the same time,detect and avoid the frontal obstacle at the end of thecorridor. The overall task, using the proposed approach, canbe achieved using the following steps.a. Prior knowledge: Estimating the position of vanishingpoint in each processed frame.b. Collision avoidance against side wall: An algorithm isdesigned to guide the appropriate direction for the flyingUAV based on the knowledge of vanishing point on the 2-Dimage plane and a formulated measure.c. Collision avoidance against front-wall: An algorithmis designed to estimate the depth of a frontal obstaclefrom the current position of the flying UAV, inspired bythe perspective vision principle, to cease the motion beforehitting the corridor end.

3.1 Vanishing point estimation

Vanishing point (VP) is a point, present at infinity, to whichall the parallel lines intersect. In other words, it can be saidthat the lines having same directional vector in 3D willconverge to the same VP. If the camera axes are parallelto the the 3D world co-ordinate system, the lines parallelto the axis which is cut by the camera plane will convergeat the infinity, i.e the VP. VP formation inside a corridor isdepicted with the help of Figure 3.

Earlier schemes usually apply the least-square paradigmthat requires matrix inversion to obtain the position ofVP [28], a time consuming process. In addition, lack ofsufficient input lines often create the rank-deficiency ill-poseproblem. In our work, we present a simple, yet fast methodto compute the position of VP on a 2-D image plane. Thesteps required for VP estimation are as follows:(a) Convert the input RGB image to its gray equivalent.(b) Reduce the noise content by applying Gaussian blur.(c) Extract the set of parallel lines intersecting at infinity:The suitable candidates for the set of required parallel lines,in a corridor environment, correspond to the intersectionsof the side walls with the floor and the roof as these lineshave a high probability of intersecting perpendicularly withthe image plane. To accomplish this task, we first apply thepopular Canny edge detector on the top one-third portionsof image as well as on the bottom one-third portions of theimage that yield a binary map with a set of edges. However,these edges are discontinuous and random. Therefore, theHough line transform is applied on the binary map toacquire a set of perspective lines. Hough line transform usesthe polar coordinate system (ρ, α) to represent a line onthe image plane, given by x cosα + y sinα = ρ, where ρrepresents the length of the perpendicular drawn from theorigin onto the line and α represents the angle made by theperpendicular with positive X-axis. Pictorial representationof Hough line transform is depicted in Figure 4. It canbe noticed from the figure that the direction of Y-axis onimage co-ordinate system differs by 180◦ from the standardCartesian co-ordinate system. The set of Hough lines arethen grouped into two classes: positive-slopped-lines (lines

Fig. 3: Formation of vanishing point inside a corridorenvironment. The blue point represents the vanishing point.It can be deduced that the lines in the 3D world co-ordinatesystem which are perpendicular to the image plane andhaving same direction cosines converge at the infinity andform the vanishing point on the image plane.

Fig. 4: Hough line representation on an image plane

with slope of the perpendicular > 0, i.e 0◦ ≤ α < 90◦), andnegative-slopped-lines (lines with slope of the perpendicular< 0, i.e 90◦ ≤ α < 180◦). However, all the extracted linesdo not correspond towards the estimation of VP. Hence,the outliers need to be removed for a better prediction.



5

The outlier lines are discarded based on the followingconditions:Condition 1: In particular, the horizontal and vertical linesdo not play any role in the estimation of VP as these lines areparallel to image co-ordinate axes. Accordingly, we consideronly those lines where α lies within the following range,decided by means of experimental analysis.

{

20◦ < α < 70◦ for positive-slopped-lines

110◦ < α < 160◦ for negative-slopped-lines

Condition 2: Furthermore, it has been observed that VPusually lies in the close proximity of the image center.Therefore, we include only those lines that pass acrossa rectangular window taken around the image centerhaving 〈 width × height 〉 = 1

4× 〈 input-image-width ×

input-image-height 〉. This rectangular window makes surethat the lines which contribute towards the formation of VPare only considered.(d) Estimation of VP location: VP lies at the minimumdistance from all the filtered out Hough lines. We proposea fast procedure named Mean line Method (MLM) toestimate the VP. ρ and α values of all the positive-slopped-linesare averaged out to yield a single positive-slopped-line,parameterized by

(

ρp, αp

)

. The same is also applied forall the negative-slopped-lines to yield one negative-slopped-line,parameterized by (ρn, αn). The equations for both the linesare given by —{

x cosαp + y sinαp = ρp for positive-slopped-lines

x cosαn + y sinαn = ρn for negative-slopped-lines

where{

αp =∑

αi

Np, ρp =

∑ρi

Npfor 20◦ < αi < 70◦

αn =∑

αi

Nn, ρn =

∑ρi

Nnfor 110◦ < αi < 160◦

The intersection of these two lines is estimated as theVanishing Point. Mathematically —

V P =

(

ρp sinαn − ρn sinαp

sin(αn − αp),ρn cosαp − ρp cosαn

sin(αn − αp)

)

(1)

where Np and Nn are the number of positive-slopped-linesand negative-slopped-lines respectively. As compared to theprevious method, which uses least-square paradigm [28]and matrix inversion to estimate the VP, proposed methodis very fast.

3.2 Autonomous navigation of UAV through corridor

The position of the UAV in the real world is directlyrelated to the location of VP on the image plane. VPlies close to the central vertical line of the image plane,whenever the underlying UAV resides approximately nearthe median axis of the two side walls. However, anysignificant deviation of VP from the central-vertical line onthe image plane implies that the UAV is out-of-track fromthe desired path that may collide against either of the sidewalls. A typical example with three possible scenarios ofVP location is depicted in Figure 5. It can be observed thatVP lies within the two blue vertical lines (safe region) inCase-(a), however resides outside in Case-(b) and (c). In thelatter cases, the vision module needs to modify the control

commands such that VP will move within the securedregion.

Besides the position of VP, another major factor,responsible for computing the required control commands,is the relative position of VP on each of the two intersectinglines with respect to a reference axis. Any of the imageboundary axis can be taken as the reference line. We havetaken the X-axis as the reference for determining the relativeposition of VP on the intersecting lines. This has been betterillustrated in Figure 6 with four possible scenarios. One endof the positive-slopped-line meets the X-axis at Q, whereas theother end meets the image plane at R. It has to be carefullyobserved that Q may lie either on the image plane (Case-cand case-d) or beyond the image plane (Case-a and case-b).In other words, one end of the intersecting line is allowedto go beyond the image boundary to meet the referenceaxis, if necessary. The same scenarios are also reflected forthe negative-slopped-line, where S corresponds to the pointof intersection of the line with the reference X-axis and Tcorresponds to the intersection of the line with image plane.

Let W and H be the width and height of the image planeand P(Vx, Vy) be the position of VP at any instant of time.We introduce three terms, Positive-slope-line-ratio (PSLR),Negative-slope-line-ratio (NSLR), and PSLR-to-NSLR-ratio(PTNR), defined below.

PSLR = PR

PQ, NSLR = PT

PS, PTNR = PSLR

NSLR

The vision module requires the knowledge of twoparameters for generating the next command: (a) positionof vanishing point P

(

Vx, Vy

)

, and (b) PTNR. Furthermore,the direction of UAV is decided only by the abscissa ofVP, i.e Vx. In the best case, VP should lie on the centralvertical line, given by X = W

2, and PTNR should be

1 (one). We consider a bounded region enclosed by twovertical lines X = W

2± δ, where δ defines the bandwidth

of safe region within which Vx should lie. In our case,we have taken δ = 0.04× Image-Width. Similarly, weconsider an acceptable range for PTNR, given by 1 ± ǫ;In our work, ǫ is experimentally set 0.02. When the UAVis too far from the median axis of the two side walls(possible collision scenarios), at least any of the followingconditions holds: (a) Vx > W

2+ δ, (b) Vx < W

2− δ,

(c) PTNR ≥ 1+ ǫ, (d) PTNR ≤ 1− ǫ. It may so happen thatwhile maneuvering, in some cases, VP may not be visible.In such cases, for a certain number of frames (f ), the UAVwill not change its direction. We have experimentally setf = 30. If the vanishing point continues to remain invisibleafter that, it can be inferred that the vision module could notextract the required lines for vanishing point formation asthe UAV is approaching towards the end of the corridor. Atthis instant of time, an obstacle detection algorithm shouldbe employed to measure the depth of the frontal obstacle(usually a wall) at the end of the corridor. The possiblescenarios of Vx and PTNR are taken into consideration toprepare a rule base that governs the generation of controlcommands to navigate the UAV within the corridor saferegion, enumerated in Algorithm 1.



6

(a) (b) (c)

Fig. 5: Various possible vanishing point locations. C is the center of the image plane and P is the vanishing point. δ is thebandwidth of the safe region, which is enclosed by two vertical blue lines.

(a) (b)

(c) (d)

Fig. 6: PSLR and NSLR measures with reference to X-axis

3.3 Frontal Collision Avoidance

Modeling: In perspective vision, whenever the relativedistance between an object and the center of projectiondecreases, the object size on the image plane increases.The relative scale of expansion of an object, being capturedin successive frames by a front facing camera of UAV, isformulated to approximate the depth of the obstacle fromthe current position. Consider a UAV that takes t unit oftime to traverse from one defined location to another witha constant velocity, and let the size of a head-on object,in the camera view, increases from h to s × h, where sbeing the scaling factor. The whole scenario is depicted inFigure 7. Height of the frontal obstacle AB is H0 in 3D worldcoordinates. The UAV navigates from position E to C witha constant velocity v in time t. At the same time, height ofthe frontal object increases from h to sh on the image plane(pixel-wise expansion). Let f be the focal length of the staticpin-hole camera of the UAV, installed at the front.

It may be observed from Figure 7 that ∆ABC ∼ ∆DPC

Fig. 7: Pin-hole camera model. Visualizing the relative scalechange of an object AB, when the UAV moves towards theobstacle with velocity v from point E to C

and ∆ABE ∼ ∆FGE. Therefore,

DC

DP=

AC

AB⇒

f

sh=

AC

Ho

and

FE

FG=

AE

AB⇒

f

h=

AE

Ho



7

Algorithm 1: Control command generation

Input: Vanishing point abscissa: Vx,PSLR-to-NSLR-ratio: PTNR

Output: UAV direction: [pitch, roll, yaw]Initialisation: Number of frames without vanishing

point N ← 0, f ←Number of thresholdframes to initiate obstacle avoidancealgorithm, UAV initial direction←Pitch Forward

1 if V anishing point not found then2 if N < f then3 Actuate UAV in previous direction4 N ← N + 15 else6 Jump to Algorithm 2 for frontal obstacle

avoidance

7 else8 if 1− ǫ < PTNR < 1 + ǫ then

9 if Vx < W2− δ then

10 Actuate UAV in Yaw Left

11 else if Vx > W2+ δ then

12 Actuate UAV in Yaw Right13 else14 Actuate UAV in Pitch Forward

15 else if PTNR ≤ 1− ǫ then16 Actuate UAV in Roll Left17 else if PTNR ≥ 1 + ǫ then18 Actuate UAV in Roll Right

19 N ← 0

20 return [pitch, roll, yaw]

From the above two equations, it can be deduced that

s =AE

AC⇒ sAC = AE ⇒ sAC = AC + CE

⇒ AC =CE

s− 1

⇒ AC =v × tEC

s− 1

⇒ AC =v × (tC − tE)

s− 1

The above equation represents the depth of the UAVfrom the current position C to the frontal obstacle stationedat point A. Hence, the final equation for depth can beexpressed as —

DC =v × t

s− 1, s > 1 (2)

where t = tC − tE , indicates the time required by the UAVto navigate from position C to A with a constant velocityv. However, The above expression assumes that the UAVmoves with a constant velocity on a straight path. In realisticsituations, we may encounter two major issues as describedbelow-.Issues:

(a) A pragmatic scenario could be the absence of anywell-shaped objects at the end of the corridor. In most ofthe indoor environments, the end of a corridor is usually a

wall. Hence it is implausible to detect the scale change forestimating depth.(b) Rotor turbulence and gusty wind may deviate the UAV’sflight path from a straight line, hence yielding noisy depthmeasurements.Solution:(a) First issue is resolved with the help of scale-invariantkeypoints detected on the frontal obstacle. Instead ofconsidering the scale change of an object on the imageplane, relative scale expansion of the Euclidean distance ofscale-invariant keypoints from their centroid in a number offrames is considered to estimate the depth (Figure 8).(b) To alleviate the second issue, well-established constantvelocity based 1D-Kalman filter is employed to remove thenoisy depth measurements.

Detailed explanation of above solutions is delineated infollowing sections.Depth Estimation with Scale-invariant Keypoints: Thepremise of our approach lies with the relative increase in theEuclidean distance of the matched scale-invariant keypointsfrom their centroid position in the respective frames. Inrecent years, scale invariant visual features such as SIFT [35],SURF [36], ORB [37] have gained huge popularity inaccomplishing various vision-centric tasks. However, in ourwork, we employ the SURF feature descriptor as it is fasterthan SIFT and more robust than ORB.

As the UAV approaches an obstacle, the Euclideandistance of the keypoints from their centroid position inrespective frames increases and so the ratio, which ispictorially depicted in Figure 8. If the frame-f occurs beforeframe-g with KPf and KPg denoting their SURF keypointlists respectively, the SURF match between two frames canbe expressed as —

Mfg = Match(KPf ,KPg)

Mfg contains the tuple of pair-wise scale invariantkeypoints, one from frame-f and other from frame-g.

The Euclidean scale for a matched pair of keypointsis defined as the ratio of the Euclidean distances of thekeypoints from the centroid of their respective frames.Mathematically, the Euclidean scale for the ith entry in Mfg

can be defined as —

si =egi

efi(3)

where egi and efi denote the Euclidean distances of theith matched keypoint from the centroid of the matchedkeypoints in respective frames.

Our algorithm requires only those matches in Mfg,where the Euclidean distance scale can be utilized toestimate a valid depth value. Hence, some of the matchesare discarded based on following condition.Condition: As the UAV is moving in the forward direction,the Euclidean distance of the matched keypoints from theircentroid in respective frames should increase subsequentlyand the Euclidean scale must be greater than 1. Thosematches, where the Euclidean scale is decreasing orremaining same will yield an erroneous depth value andhence, should be discarded. In other words, if si ≤ 1, ith

match is discarded from Mfg.



8

Fig. 8: Depth estimation at current frame-g using the SURF matches with frame-f . It may be noted than the frame-f isnot necessarily the immediate predecessor of frame-g, rather it is a frame among all the previously processed frames inlast 1 second and contains maximum number of SURF matches with the current frame. Blue keypoints represent the SURFmatches and the red point is the centroid of the matched keypoints in respective frames. It can be noticed that some of thekeypoints (green) are not matched. The Euclidean scale si =

egiefi

, where efi and egi denote the Euclidean distances of the

ith matched keypoint from the centroid in respective frames.

After incorporating the above condition and using onlythe valid Euclidean scale values, the depth equation in (2)can be redefined as —

Dfgi =v × |tg − tf |

si − 1(4)

where Dfgi denotes the depth of the frontal obstacle fromthe current frame g, considering only the ith matchedkeypoint from the SURF matches obtained between frame-fand frame-g. tf and tg denote the time-stamps for frame-fand frame-g respectively. Putting the value of si from (3) in(4), the depth equation can be rewritten as —

Dfgi =v × efi × |tg − tf |

egi − efi(5)

The representative depth is defined as the depth of thefrontal obstacle from the UAV current position consideringall the valid matched keypoints. Hence, it is averaged overall the depth values. Mathematically —

Dfg =1

|Mfg|

|Mfg |∑

i=1

Dfgi (6)

where Mfg contains only the valid matched keypoint pairs.

Algorithm 2: Depth Measurement at frame g

Input: The frame-f is a frame processed in last 1 secondand has the largest SURF match list with frame -g

Result: Depth at current frame-g: Dfg

1 Mfg ←Match(KPf , KPg);2 Calculate the Euclidean scale si (3) for each entry i

in Mfg ;3 Discard the entry i from Mfg , if si ≤ 1;4 for each match i in Mfg do5 Evaluate Dfgi by (5);

6 Compute the representative depth of all |Mfg| matchedkeypoints as in (6):

7 Dfg ←1

|Mfg|

∑|Mfg|i=1 Dfgi , |Mfg| 6= 0;

8 return Dfg

It may be noted that the current frame-g might haveSURF matches with one or more of the previous frames.

In our work, we consider only those frames, which wereprocessed in last 1 second. Out of those matches, thematch having the largest number of entries is consideredfor the final depth estimation. The algorithm for depthmeasurement is outlined in Algorithm 2.Depth Correction with 1D Kalman Filter: The depthestimated in the previous step is based on the assumptionthat the UAV is navigating on a straight line path withoutany deviation. However, in real scenarios, many externaland internal factors such as wind and rotor turbulencemight alter the direction slightly from a straight line path,giving rise to erroneous depth measurements. In our work,we have modeled a constant velocity aided Kalman filter torectify the noisy measurements.

We follow the principles of a 1D-Kalman filter as thereis only one state variable to predict. Let D0 be the initialdepth of the frontal obstacle. If the UAV is navigating witha constant velocity of v units, the depth after time δt can beiterated as —

Di = Di−1 − (v × δt)

However, the depth estimated by the above formula iserroneous because of the deviation in path, as discussedabove. Hence the generative estimate for depth can bewritten as a summation of the estimated depth and a whiteGaussian noise (w).

Di = Di−1 − (v × δt) + w

The expectation and variance of Di can be written as —

{

Di = Di−1 − (v × δt)

σ2i = σ2

i−1 + σ2w

(7)

From the above equation, it can be inferred that the depthcan be precisely obtained in future if there is no uncertainty(σw = 0). However, in real scenarios, σw > 0 and thepredicted depth becomes increasingly erroneous as theprocess continues. Hence, the depth measured by our visionalgorithm acts as a perfect catalyst for a better prediction.As the depth measured by our vision module is not also



9

accurate, it gives rise to another Gaussian noise assumption.The measured depth can be written as —

Xi = Di + y (8)

where y is the noise caused due to measurement.Equations 7 and 8 can be put into the Kalman filter updateequation [38] as follows:

κi =σ2

i

σ2

i +σ2y

D′

i = Di + κi(Xi − Di)

σ′2

i = (1− κi)σ2i

(9)

where κi is the Kalman gain for ith iteration. The aboveequation formulates the rectified depth by taking intoaccount both the predicted and measured depth values.When κi = 0, there is no trust on predicted data, howeverif κi = 1, there is complete trust.Control command generation: The VM designs a desiredcontrol command (change in yaw, pitch and roll values)based on the rectified depth value, and communicates withthe UAV. The UAV PID controller implements the modifiedcommand to navigate the UAV safely before it collides withthe front-wall.

4 EXPERIMENTS

The proposed algorithm is validated in a series of realworld indoor corridors having varying dimensions. In thissection, we first outline the system setup followed by theexperiments carried out for collision free indoor corridornavigation.

4.1 System Setup:

(a) The proposed algorithm is validated using a Parrot ARDrone power edition quadcopter having one front facingstatic camera, one down facing camera, and IMUs likeaccelerometer and gyroscope.(b) Apart from using the video from the front facing camera,our vision algorithm neither uses the input from any othersensor nor extra sensors are used to carry out the task ofautonomous navigation.(c) The frames from the front facing camera are recordedat 30 fps with a 92◦ (fisheye) FoV having 640 × 360 frameresolution. The captured fisheye frames are then convertedto its equivalent rectilinear form for further processing.(d) Once the UAV is powered on, it generates its own wifizone and the ground based VM is connected to it for furthercommunication.(e) The VM is executed on a ground station having Ubuntu14.04 OS, 4 GB RAM, and 2.4 GHz processor. We employedC++ with OpenCV libraries to carry out various computervision tasks. Communication between the UAV and the VMis performed via the Robot Operating System (ROS)1.(f) A UAV successfully completes an experiment, if thefollowing conditions are satisfied:

• It navigates through the corridor with no contactwith the side walls.

1. https://github.com/AutonomyLab/ardrone autonomy

• It successfully ceases the UAV motion at the end ofthe corridor without hitting the front-wall.

(g) It may be noted that, our vision based obstacle avoidancealgorithm will not work as desirable, when the VM exits thewifi zone created by the UAV. However, our proposed worksolely concentrates on vision based autonomous navigationrather than the inherent hardware and software propertiesof the UAV. Hence, while performing the experiments, wemade sure that the VM never exits the wifi zone created bythe UAV.(h) Although the experiments are carried out with aParrot AR Drone quadcopter, the proposed vision basedautonomous navigation algorithm will perform for anyUAV equipped with a forward facing static camera.(i) All experimental results including the tables,figures, some of the video outputs are uploaded athttps://sites.google.com/site/monoVisionUAV.

4.2 Implementation:

We let the Parrot AR Drone quadcopter fly autonomouslyfrom a start point inside indoor corridors having varyingdimensions. A set of corridors used in our experiments isdelineated in Figure 9.

The various steps of Vanishing Point estimation areshown in Figure 10(a)–(e). An instance of the observedcorridor view is shown in Figure 10(a). The steps carriedout for collision free corridor navigation are as follows:Step 1: Our proposed model helps in real-time autonomousUAV navigation inside corridor environments. The visionmodule first extracts the video from UAV front camera on aframe-by-frame basis through the ROS interface. One suchframe is shown in Figure 10(a).Step 2: The input RGB image is converted to its grayequivalent for further processing. A Guassian blur of kernel5 × 5 is applied to the gray image to remove noise in theinput. The parallel lines in 3D world coordinate system,which are perpendicular to the image plane, intersect atthe infinity to form the vanishing point. In a corridorenvironment, most of those lines appear due to the edgesformed by the intersection of the wall with the floor andthe roof. Hence, the binary map, as shown in Figure 10(b),is obtained by applying the Canny edge detector on the topand bottom one-third portions of the image, however noton the middle area.Step 3: The Hough line transform on the Canny edge-mapyields a set of lines (Figure 10(c)). The outlier lines, which donot play any role in the formation of the VP are discarded.The conditions for discarding the outliers are describedin Section 3.1. The lines obtained after filtering are shownin Figure 10(d). The central green box represents the areathrough which all the lines should pass to yield a valid VP.Step 4: After removing the outliers, all thepositive-slopped-lines and negative-slopped-lines are averagedrespectively to yield two resultant lines, whose intersectionis estimated as the vanishing point, represented by a yellowcircle in Figure 10(e). The area bounded by two verticalpink lines and the image plane defines the safe region,where the VP should ideally lie for safe navigation. We haveexperimentally chosen the bandwidth of the safe region asδ = 0.04× Image-Width. For our experiments, Image-Width



10

(a) (b) (c)

Fig. 9: various instances of indoor corridors

= 640 pixels, hence δ ≈ 26 pixels.Step 5: The VP location and the PTNR are responsiblefor navigating the UAV safely in the appropriate directionwithout colliding with the side walls (Algorithm 1). This hasbeen suitably depicted in Figure 10(f)—(l), and summarizedin Table 1; PTNR measures the ratio of PSLR to NSLR.Ideally, when the UAV navigates over the median axis ofthe corridor, PTNR = 1. We have experimentally chosenthe safe range for PTNR as [1− ǫ, 1 + ǫ] and ǫ = 0.02.Step 6: When the UAV approaches towards the end of thecorridor, the VP will not be visible. However, the invisibilityof the VP might be caused due to some erroneous inputs.Hence, in cases where the VP is not visible, the UAVwill move in the previous direction for 30 frames (chosenexperimentally). However, if the VP continues to remaininvisible after that, it can be concluded that the UAV isapproaching towards the end of the corridor and the frontalcollision avoidance algorithm should be employed to avoidhead-on collision.Step 7: SURF scale invariant visual features are utilized toestimate the depth of the frontal obstacle (Algorithm 2);usually a wall at the end of the corridor. The noise in depthestimation arising due to different factors such as wind androtor turbulence, is then rectified with the help of a constantvelocity aided 1D-Kalman filter.

TABLE 1: UAV direction based on vanishing point abscissa(Vx) and PTNR. (δ ≈ 26 pixels, ǫ = 0.02)

Figure 10 PSLR NSLR PTNR Vx Command

(f) 0.731 0.732 0.998 within safe region Pitch Forward(g) 0.462 0.903 0.511 left-side to safe region Roll Left(h) 1.062 0.933 1.137 within safe region Roll Right(i) 0.8593 0.8592 1.001 left-side to safe region Yaw Left(j) 0.905 0.901 1.003 right-side to safe

regionYaw Right

(k) nan nan nan no VP (only a singleline)

Previous Direction

(l) nan nan nan no VP (no line) Previous Direction

4.3 Results and Discussion:

We have conducted experiments on five different corridorenvironments with 50 trials across each environment. It maybe noted that there does not exist any publicly availablestandard dataset in this research area. Also, the UAV andsensors used by different researchers for the task of collisionavoidance are environment and application dependent.Hence the results are shown for our own real-time flightexperiments. We introduce three measures to check theefficacy of the proposed algorithm.(a) No Collision Rate (NCR): Measures the percentage oftimes the UAV successfully navigates without hitting theside walls over the total number of trials.(b) Full Flight Rate (FFRδ): Measures the percentage oftimes the UAV collides the side walls, yet successfullynavigates the corridor without ceasing its motion, over thetotal number of trials, depending on a threshold δ equals tonumber of collisions with the side walls for a single trial.Accordingly, we define the following performance metrics:(i) FFR2: The percentage of trials where δ ≤ 2(ii) FFR5: The percentage of trials where δ ≤ 5(iii) FFR5+: The percentage of trials where δ > 5(c) Obstacle Avoidance Rate (OAR): Measures thepercentage of times the UAV successfully detects thefront-wall at the end of the corridor and ceases its motionover the total number of trials.

The UAV and the vision senor used by our method, andthe methods proposed by Bills et al. [31] and Lioulemeset al. [28] are similar. Hence, we have compared the resultsby implementing their method in our real-world corridorexperiments. However, their method does not performcollision avoidance at the end of the corridor. The resultsof different performance measures in various environmentsare detailed in Table 2.



11

(a) Corridor (b) Canny edges (c) Hough lines before filtering (d) Hough lines after filtering

(e) Estimated vanishing point (f) UAV next direction: pitchforward

(g) UAV next direction: roll left (h) UAV next direction: roll right

(i) UAV next direction: yaw left (j) UAV next direction: yaw right (k) When there is only a single line (l) When there is no line

Fig. 10: (a–e) Various stages of VP estimation, (f–l) possible scenarios of UAV motion

TABLE 2: Various accuracy measures for the proposed method

Environment Corridor 1 Corridor 2 Corridor 3 Corridor 4 Corridor 5 TOTAL

No. of trials 50 50 50 50 50 250

NCRBills et al. [31] 37 (74%) 35 (70%) 33 (66%) 33 (66%) 35 (70%) 173 (69.2%)

Lioulemes et al. [28] 37 (74%) 36 (72%) 35 (70%) 34 (68%) 33 (66%) 175 (70.0%)Proposed 41 (82%) 40 (80%) 45 (90%) 33 (66%) 37 (74%) 196 (78.4%)

FFR2

Bills et al. [31] 40 (80%) 38 (76%) 37 (74%) 36 (72%) 36 (72%) 187 (74.8%)Lioulemes et al. [28] 40 (80%) 37 (74%) 36 (72%) 35 (70%) 35 (70%) 183 (73.2%)

Proposed 45 (90%) 45 (90%) 46 (92%) 37 (74%) 41 (82%) 214 (85.6%)

FFR5


Proposed 48 (96%) 47 (94%) 48 (96%) 39 (78%) 44 (88%) 226 (90.4%)

FFR5+


Proposed 50 (100%) 50 (100%) 50 (100%) 40 (80%) 45 (90%) 235 (94.0%)

OAR Proposed 44 (88%) 37 (74%) 43 (86%) 35 (70%) 34 (68%) 193 (77.2%)

5 CONCLUSION

In this work, a monocular vision assisted scheme forautonomously navigating a UAV in corridor environmentsis proposed. A simple and fast method of vanishing pointestimation is proposed. An algorithm, based on the positionof vanishing point and a formulated measure, is proposed tomodify the yaw, pitch, and roll to safely navigate the UAV inthe close proximity of median axis of the two side walls. Inaddition, a scale invariant feature based obstacle avoidancealgorithm is also proposed to avoid the obstacles/wallspresent at the end of a corridor. Our scheme has beentested under different corridor environments and is foundto be very robust in terms of No Collision Rate, Full FlightRate, and Obstacle Avoidance Rate. As the algorithm usesonly the video extracted from front camera, it is a goodcandidate for UAVs with power and weight constraints.

Also, the proposed model can be implemented in disasterhit indoor environments, where the human intervention isnot advisable, and thereby results in sustainability.



12

ABBREVIATIONS

FFR : Full Flight RateFoV : Field of ViewGPS : Global Positioning SystemIMU : Inertial Measurement UnitLIDAR : Light Detection And RangingNCR : No Collision RateNSLR : Negative-slope-line-ratioOAR : Obstacle Avoidance RateORB Features : Oriented FAST and rotated BRIEF featuresPID Controller : Proportional Integral Derivative ControllerPSLR : Positive-slope-line-ratioPTNR : PSLR-to-NSLR-ratioROS : Robot Operating SystemSIFT Features : Scale Invariant Feature Transform FeaturesSLAM : Simultaneous Localization and MappingSURF : Speeded up Robust FeaturesUAV : Unmanned Aerial VehicleVM : Vision ModuleVP : Vanishing Point

ACKNOWLEDGMENTS

This research work has been partially supported by thefollowing projects:1. Grant Number 1(1)/ISEA-II/PMU/2015 of InformationSecurity Education and Awareness (ISEA) Phase-II projectfunded by Ministry of Electronics and InformationTechnology (MeitY), Government of India.2. Grant Number SB/FTP/ETA-0059/2014 by Science andEngineering Research Board (SERB), Department of Scienceand Technology, Government of India.

REFERENCES

[1] J. Wendel, O. Meister, C. Schlaile, and G. F. Trommer,“An integrated GPS/MEMS-IMU navigation systemfor an autonomous helicopter,” Aerospace Science andTechnology, vol. 10, no. 6, pp. 527–533, 2006, doi:10.1016/j.ast.2006.04.002.

[2] X. Wang, N. Cui, and J. Guo, “INS/VisNav/GPSrelative navigation system for UAV,” Aerospace Scienceand Technology, vol. 28, no. 1, pp. 242–248, 2013, doi:10.1016/j.ast.2012.11.004.

[3] X. Yun, E. Bachmann, R. McGhee, R. Whalen,R. Roberts, R. Knapp, A. Healey, and M. Zyda, “Testingand evaluation of an integrated GPS/INS systemfor small AUV navigation,” IEEE Journal of OceanicEngineering, vol. 24, no. 3, pp. 396–404, 1999, doi:10.1109/48.775301.

[4] Y. Cui and S. S. Ge, “Autonomous vehicle positioningwith GPS in urban canyon environments,” IEEETransactions on Robotics and Automation, vol. 19, no. 1,pp. 15–25, 2003, doi: 10.1109/TRA.2002.807557.

[5] K. Gryte, J. M. Hansen, T. Johansen, and T. I. Fossen,“Robust navigation of UAV using inertial sensors aidedby UWB and RTK GPS,” in AIAA Guidance, Navigation,and Control Conference, Grapevine, Texas, 2017, pp.1035–1050, doi: 10.2514/6.2017-1035.

[6] S. Hening, C. A. Ippolito, K. S. Krishnakumar,V. Stepanyan, and M. Teodorescu, “3D LiDARSLAM integration with GPS/INS for UAVs in urbanGPS-degraded environments,” in AIAA InformationSystems-AIAA Infotech @ Aerospace. Grapevine, Texas:AIAA, 2017, pp. 448–457, doi: 10.2514/6.2017-0448.

[7] F. Amzajerdian, V. E. Roback, A. Bulyshev, P. F.Brewster, and G. D. Hines, “Imaging flash LIDARfor autonomous safe landing and spacecraft proximityoperation,” in AIAA SPACE. Long Beach, California:AIAA, 2016, pp. 5591–5602, doi: 10.2514/6.2016-5591.

[8] S. Ramasamy, R. Sabatini, A. Gardi, and J. Liu,“LIDAR obstacle warning and avoidance system forunmanned aerial vehicle sense-and-avoid,” AerospaceScience and Technology, vol. 55, pp. 344–358, 2016, doi:10.1016/j.ast.2016.05.020.

[9] S. Weiss, M. W. Achtelik, S. Lynen, M. Chli, andR. Siegwart, “Real-time onboard visual-inertial stateestimation and self-calibration of MAVs in unknownenvironments,” in IEEE International Conference onRobotics and Automation. Saint Paul, MN, USA: IEEE,2012, pp. 957–964, doi: 10.1109/ICRA.2012.6225147.

[10] C. Schlaile, O. Meister, N. Frietsch, C. Keßler, J. Wendel,and G. F. Trommer, “Using natural features for visionbased navigation of an indoor-VTOL MAV,” AerospaceScience and Technology, vol. 13, no. 7, pp. 349–357, 2009,doi: 10.1016/j.ast.2009.09.001.

[11] T. Mori and S. Scherer, “First results in detectingand avoiding frontal obstacles from a monocularcamera for micro unmanned aerial vehicles,” in IEEEInternational Conference on Robotics and Automation.Karlsruhe, Germany: IEEE, 2013, pp. 1750–1757, doi:10.1109/ICRA.2013.6630807.

[12] S. Saha, A. Natraj, and S. Waharte, “A real-timemonocular vision-based frontal obstacle detectionand avoidance for low cost UAVs in GPS deniedenvironment,” in IEEE International Conference onAerospace Electronics and Remote Sensing Technology.Yogyakarta, Indonesia: IEEE, 2014, pp. 189–195, doi:10.1109/ICARES.2014.7024382.

[13] C. Wang, W. Liu, and M. Q.-H. Meng, “Obstacleavoidance for quadrotor using improved method basedon optical flow,” in IEEE International Conference onInformation and Automation. Lijiang, China: IEEE, 2015,pp. 1674–1679, doi: 10.1109/ICInfA.2015.7279555.

[14] J. Martınez-Carranza, E. O. Garcia, H. J. Escalante,and W. Mayol-Cuevas, “Towards autonomous flightof low-cost MAVs by using a probabilistic visualodometry approach,” in Advances in ArtificialIntelligence and its Applications. Cuernavaca,Morelos, Mexico: Springer, 2015, pp. 560–573, doi:10.1007/978-3-319-27101-9 43.

[15] I. Sa, H. He, V. Huynh, and P. Corke, “Monocular visionbased autonomous navigation for a cost-effectiveMAV in GPS-denied environments,” in IEEE/ASMEInternational Conference on Advanced IntelligentMechatronics. Wollongong, NSW, Australia: IEEE,2013, pp. 1355–1360, doi: 10.1109/AIM.2013.6584283.

[16] S. Grzonka, G. Grisetti, and W. Burgard, “Autonomousindoor navigation using a small-size quadrotor,” inWorkshop Proceedings SIMPAR, 2008, pp. 455–463.

[17] A. Bachrach, S. Prentice, R. He, and N. Roy,“RANGE–Robust autonomous navigation inGPS-denied environments,” Journal of Field Robotics,vol. 28, no. 5, pp. 644–666, 2011, doi: 10.1002/rob.20400.

[18] N. Gageik, P. Benz, and S. Montenegro, “Obstacledetection and collision avoidance for a UAV with

10.1016/j.ast.2006.04.002

10.1016/j.ast.2012.11.004

10.1109/48.775301

10.1109/TRA.2002.807557

10.2514/6.2017-1035

10.2514/6.2017-0448

10.2514/6.2016-5591

10.1016/j.ast.2016.05.020

10.1109/ICRA.2012.6225147

10.1016/j.ast.2009.09.001

10.1109/ICRA.2013.6630807

10.1109/ICARES.2014.7024382

10.1109/ICInfA.2015.7279555

10.1007/978-3-319-27101-9_43

10.1109/AIM.2013.6584283

10.1002/rob.20400



13

complementary low-cost sensors,” IEEE Access, vol. 3,pp. 599–609, 2015, doi: 10.1109/ACCESS.2015.2432455.

[19] G. C. S. Cruz and P. M. M. Encarnacao, “Obstacleavoidance for unmanned aerial vehicles,” Journal ofIntelligent & Robotic Systems, vol. 65, no. 1-4, pp.203–217, 2012, doi: 10.1007/s10846-011-9587-z.

[20] M. Y. Chen, D. H. Edwards, E. L. Boehmer, N. M.Eller, J. T. Slack, C. R. Speck, S. M. Brown, H. G.Williams, S. H. Wilson, C. S. Gillum et al., “Designinga spatially aware and autonomous quadcopter,” inSystems and Information Engineering Design Symposium.Charlottesville, VA, USA: IEEE, 2013, pp. 213–218, doi:10.1109/SIEDS.2013.6549521.

[21] A. Stubblebine, B. Redmond, B. Feie, andE. Kivelevitch, “Laser-guided quadrotor obstacleavoidance,” in AIAA Infotech @ Aerospace.Kissimmee, Florida: AIAA, 2015, pp. 2026–2037,doi: 10.2514/6.2015-2026.

[22] J. F. Roberts, T. Stirling, J.-C. Zufferey, andD. Floreano, “Quadrotor using minimal sensing forautonomous indoor flight,” in European Micro AirVehicle Conference and Flight Competition (EMAV2007),no. LIS-CONF-2007-006, Toulouse, France, 2007, doi:10.1.1.540.4993.

[23] M. Achtelik, A. Bachrach, R. He, S. Prentice,and N. Roy, “Stereo vision and laser odometryfor autonomous helicopters in GPS-denied indoorenvironments,” in SPIE Defense, Security, and Sensing.Orlando, Florida, USA: International Society forOptics and Photonics, 2009, pp. 733 219–733 228, doi:10.1117/12.819082.

[24] K. McGuire, G. de Croon, C. De Wagter, K. Tuyls,and H. Kappen, “Efficient optical flow and stereovision for velocity estimation and obstacle avoidanceon an autonomous pocket drone,” IEEE Robotics andAutomation Letters, vol. 2, no. 2, pp. 1070–1076, 2017,doi: 10.1109/LRA.2017.2658940.

[25] H. Qin, Y. Bi, K. Z. Ang, K. Wang, J. Li, M. Lan, M. Shan,and F. Lin, “A stereo and rotating laser frameworkfor UAV navigation in GPS denied environment,”in IEEE Annual Conference of the Industrial ElectronicsSociety. Florence, Italy: IEEE, 2016, pp. 6061–6066, doi:10.1109/IECON.2016.7793246.

[26] Z. He, R. V. Iyer, and P. R. Chandler, “Vision-based UAVflight control and obstacle avoidance,” in AmericanControl Conference. Minneapolis, MN, USA: IEEE, 2006,pp. 2166–2170, doi: 10.1109/ACC.2006.1656540.

[27] M. Achtelik, M. Achtelik, S. Weiss, and R. Siegwart,“Onboard IMU and monocular vision based control forMAVs in unknown in-and outdoor environments,” inIEEE International Conference on Robotics and automation.Shanghai, China: IEEE, 2011, pp. 3056–3063, doi:10.1109/ICRA.2011.5980343.

[28] A. Lioulemes, G. Galatas, V. Metsis, G. L. Mariottini,and F. Makedon, “Safety challenges in using AR.droneto collaborate with humans in indoor environments,”in International Conference on Pervasive TechnologiesRelated to Assistive Environments. Rhodes, Greece:ACM, 2014, pp. 33–36, doi: 10.1145/2674396.2674457.

[29] J. Engel, J. Sturm, and D. Cremers, “Camera-basednavigation of a low-cost quadrocopter,” in IEEE/RSJ

International Conference on Intelligent Robots and Systems.Vilamoura, Portugal: IEEE, 2012, pp. 2815–2821, doi:10.1109/IROS.2012.6385458.

[30] ——, “Scale-aware navigation of a low-costquadrocopter with a monocular camera,” Roboticsand Autonomous Systems, vol. 62, no. 11, pp. 1646–1656,2014, doi: 10.1016/j.robot.2014.03.012.

[31] C. Bills, J. Chen, and A. Saxena, “Autonomous MAVflight in indoor environments using single imageperspective cues,” in IEEE international conference onRobotics and automation (ICRA). Shanghai, China: IEEE,2011, pp. 5776–5783, doi: 10.1109/ICRA.2011.5980136.

[32] F. de Babo Martins, L. F. Teixeira, and R. Nobrega,“Visual-inertial based autonomous navigation,”in Robot 2015: Second Iberian Robotics Conference.Switzerland: Springer, 2016, pp. 561–572, doi:10.1007/978-3-319-27149-1 43.

[33] A. S. Etienne, R. Maurer, J. Georgakopoulos, andA. Griffin, “Dead reckoning (path integration),landmarks, and representation of space in acomparative perspective,” Wayfinding behavior :cognitive mapping and other spatial processes, pp. 197–228,1999.

[34] A. Chavez and D. Gustafson, “Vision-based obstacleavoidance using SIFT features,” in Advances in VisualComputing. Las Vegas, NV, USA: Springer, 2009, pp.550–557, doi: 10.1007/978-3-642-10520-3 52.

[35] D. G. Lowe, “Distinctive image features fromscale-invariant keypoints,” International Journal ofComputer Vision, vol. 60, no. 2, pp. 91–110, 2004, doi:10.1023/B:VISI.0000029664.99615.94.

[36] H. Bay, A. Ess, T. Tuytelaars, and L. Van Gool,“Speeded-up robust features (SURF),” Computer Visionand Image Understanding, vol. 110, no. 3, pp. 346–359,2008, doi: 10.1016/j.cviu.2007.09.014.

[37] E. Rublee, V. Rabaud, K. Konolige, and G. Bradski,“ORB: An efficient alternative to SIFT or SURF,”in IEEE International Conference on Computer Vision.Barcelona, Spain: IEEE, 2011, pp. 2564–2571, doi:10.1109/ICCV.2011.6126544.

[38] R. E. Kalman et al., “A new approach to linear filteringand prediction problems,” Journal of basic Engineering,vol. 82, no. 1, pp. 35–45, 1960, doi: 10.1115/1.3662552.

Ram Prasad Padhy received his B.Techdegree in Computer Science from NationalInstitute of Technology Rourkela, India, in2011. He is currently pursuing the Ph.D.degree in the Department of ComputerScience and Engineering, National Instituteof Technology Rourkela, India. He has 3.5years of industrial research experience withSamsung Research Institute Delhi, India. Hisresearch interest includes autonomous robotics,machine learning, computer vision and visual

surveillance. He has worked on some Research and Developmentprojects that are funded by SERB, DST and MeitY.

10.1109/ACCESS.2015.2432455

10.1007/s10846-011-9587-z

10.1109/SIEDS.2013.6549521

10.2514/6.2015-2026

10.1.1.540.4993

10.1117/12.819082

10.1109/LRA.2017.2658940

10.1109/IECON.2016.7793246

10.1109/ACC.2006.1656540

10.1109/ICRA.2011.5980343

10.1145/2674396.2674457

10.1109/IROS.2012.6385458

10.1016/j.robot.2014.03.012

10.1109/ICRA.2011.5980136

10.1007/978-3-319-27149-1_43

10.1007/978-3-642-10520-3_52

10.1023/B:VISI.0000029664.99615.94

10.1016/j.cviu.2007.09.014

10.1109/ICCV.2011.6126544

10.1115/1.3662552



14

Feng Xia (M07-SM12) received the BScand PhD degrees from Zhejiang University,Hangzhou, China. He was a Research Fellow atQueensland University of Technology, Australia.He is currently a Full Professor in Schoolof Software, Dalian University of Technology,China. He is the (Guest) Editor of severalinternational journals. He serves as GeneralChair, PC Chair, Workshop Chair, or PublicityChair of a number of conferences. Dr. Xia haspublished 2 books and over 200 scientific papers

in international journals and conferences. His research interests includecomputational social science, big data, and mobile social networks. Heis a Senior Member of IEEE (Computer Society, SMC Society) and ACM(SIGWEB), and a Member of AAAS.

Suman Kumar Choudhury received hisM.Tech. degree from the National Institute ofTechnology Rourkela, India, in 2013. He iscurrently pursuing the Ph.D. degree in computervision with the Department of ComputerScience and Engineering, National Instituteof Technology Rourkela. His research interestincludes video surveillance, image processing,and pattern recognition.

Pankaj Kumar Sa received the Ph.D. degreein Computer Science in 2010. He is currentlyserving as an Assistant Professor withthe Department of Computer Science andEngineering, National Institute of TechnologyRourkela, India. His research interest includescomputer vision, biometrics, visual surveillance,and robotic perception. He has coauthoreda number of research articles in variousjournals, conferences, and book chapters.He has co-investigated some Research and

Development projects that are funded by SERB, DRDO-PXE, DeitY,and ISRO. He is the recipient of prestigious awards and honors forhis excellence in academics and research. Apart from research andteaching, he conceptualizes and engineers the process of institutionalautomation.

Sambit Bakshi received the Ph.D. degree incomputer science in 2015. He is currently withthe Centre for Computer Vision and PatternRecognition, National Institute of TechnologyRourkela, India. He also serves as an AssistantProfessor with the Department of ComputerScience and Engineering, National Instituteof Technology Rourkela. His research interestincludes visual surveillance and biometricsecurity. He serves as an Associate Editor ofIEEE Access (2016 -), Plos One (2017 -),

Innovations in Systems and Software Engineering - A NASA Journal(2016 -), and International Journal of Biometrics (2013 -). He is aTechnical Committee Member of the IEEE Computer Society TechnicalCommittee on Pattern Analysis and Machine Intelligence. He receivedthe Prestigious Innovative Student Projects Award- 2011 from IndianNational Academy of Engineering for his masters thesis. He has morethan 50 publications in journals, reports, and conferences.

Monocular Vision Aided Autonomous UAV Navigation in Indoor ...thealphalab.org/papers/Monocular...

Documents

Transcript of Monocular Vision Aided Autonomous UAV Navigation in Indoor ...thealphalab.org/papers/Monocular...