
FPGA-Embedded Driver Assistance System Based On Computer Vision

Ricardo Acevedo-Avila, Miguel Gonzalez-Mendoza, Andres-David Garcia-Garcia

Department of Electrical and Electronic Engineering

Instituto Tecnológico y de Estudios Superiores de Monterrey, Campus Estado de México.

Atizapán de Zaragoza, Estado de México, México

[email protected]; {mgonza, garcia.andres}@itesm.mx

Abstract — In this paper we present the design of a Driver Assistance System based on computer vision whose main purpose is to assist the driver by providing vital information on the traffic environment and the vehicle's actions. The design targets an embedded computation platform; due to its physical limitations in memory and computing power, existing image processing algorithms developed for PC platforms are not suitable for this application. Instead, simple and efficient image processing algorithms must be developed to fit our embedded architecture. The Driver Assistance System provides three main functionalities: Lane Detection, Obstacle Detection and Lane-Change Detection. The core algorithms have been developed and simplified using the Matlab design environment and then described as custom hardware components to be implemented on FPGA hardware.

Keywords — Intelligent Systems, Embedded Driver Assistance System, Image Processing, Computer Vision, Lane Detection, Lane-Change Detection, Obstacle Detection, Perspective Transformation, FPGA Hardware Development.

I. INTRODUCTION

Road traffic accidents are a serious socio-economic problem and one of the top ten causes of death [1]; the potential human and economic implications are large and cause continuous government and industry spending. Research in vehicle safety systems is an essential component of the solution to this problem.

Computer vision-based Driver Assistance Systems are technology designed and developed to improve traffic safety, using the existing road infrastructure and computing platforms such as personal computers to run many functionalities (road recognition, lane and vehicle detection, tracking, etc.) aimed at safeguarding the occupants of the vehicle.

The main objective of this paper is to demonstrate how to implement the functionalities of a full computer-based Driver Assistance System on FPGA hardware. An Altera Cyclone II EP2C35F672C6N FPGA [2] is used in this work, mainly because of its low cost, board-integrated components and design environment. Due to the complexity of image processing algorithms, it is necessary to use simplified models restricted to the resources offered by the development board.

As mentioned before, the proposed system is composed of three main image processing-based algorithms or modules:

1) Lane Detection.
2) Lane-Change Detection.
3) Obstacle Detection.

Lane detection is an important functionality of any Driver Assistance System. It is used to find lane and road boundaries in given images; it is also used to detect lane-change events that may occur inadvertently for the driver. Nissan has recently developed a lane departure warning system based on this technology [3].

Lane detection is especially difficult in urban environments due to parked and moving vehicles, people, trees, buildings, shadows and other noise sources in the scene [4]. We present a fast and efficient approach to lane detection restricted to highway environments; hardware-based image filters and algorithms are designed to take advantage of this simplified environment.

Obstacle detection is also a vital functionality for vehicle safety. To perform obstacle detection, snapshots of a video stream can be taken and used as input to a computer, which ultimately performs the actual detection [4]. Our main concern is that obstacle detection can often be quite slow for some real-time applications, so we need a simple yet effective method of road detection.

All the different modules of this system are designed as custom hardware components. The source image is smoothed by a chain of filters, including dilation-erosion (closing), averaging and thresholding operations.
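As a rough software model of this filter chain, the C sketch below applies 3x3 averaging, a fixed binarization threshold and a binary closing (dilation followed by erosion). The 320-pixel row width matches the design, but the 240-row frame height, the threshold value and the exact filter ordering are illustrative assumptions, not the hardware's exact parameters.

#include <stdint.h>

#define IMG_W 320   /* row width used by the design   */
#define IMG_H 240   /* frame height: an assumed value */

/* 3x3 neighborhood average of an 8-bit image (interior pixels only). */
static uint8_t avg3x3(const uint8_t *img, int x, int y) {
    int sum = 0;
    for (int dy = -1; dy <= 1; dy++)
        for (int dx = -1; dx <= 1; dx++)
            sum += img[(y + dy) * IMG_W + (x + dx)];
    return (uint8_t)(sum / 9);
}

/* Software model of the filter chain: averaging, thresholding, then a
 * binary closing (dilation followed by erosion). The threshold value
 * 128 is an illustrative assumption. */
void filter_chain(const uint8_t *src, uint8_t *bin, uint8_t *tmp, uint8_t *dst) {
    for (int y = 1; y < IMG_H - 1; y++)
        for (int x = 1; x < IMG_W - 1; x++)
            bin[y * IMG_W + x] = (avg3x3(src, x, y) > 128) ? 1 : 0;

    for (int y = 1; y < IMG_H - 1; y++)          /* dilation */
        for (int x = 1; x < IMG_W - 1; x++) {
            uint8_t v = 0;
            for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++)
                    v |= bin[(y + dy) * IMG_W + (x + dx)];
            tmp[y * IMG_W + x] = v;
        }

    for (int y = 1; y < IMG_H - 1; y++)          /* erosion */
        for (int x = 1; x < IMG_W - 1; x++) {
            uint8_t v = 1;
            for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++)
                    v &= tmp[(y + dy) * IMG_W + (x + dx)];
            dst[y * IMG_W + x] = v;
        }
}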

The core component of the system is a "perspective distortion corrector"; its main objective is to compute a top-view image of the actual scene, based on Inverse Perspective Mapping (IPM) theory [5]. By running a simple embedded software algorithm on the corrected image, we can then detect lane changes and obstacles.

The proposed solution is part of a first, experimental iteration of the system. To investigate the viability of this design, experimental tests have been carried out under real scenarios. Optimization is still needed, but the basic functionality of the core modules has been verified to work.

II. SYSTEM DESIGN

A. Lane Detection and Road Model Simplification

The Hough transform is a method commonly used for detecting lines and circles [6], and it is frequently applied in lane detection scenarios [7] due to its robustness to light variations and image noise. However, its complexity [8] demands significant computing power, well beyond the restrictions of this application's hardware platform.

It is possible to simplify the road model if we remove the perspective distortion introduced by the camera using Inverse Perspective Mapping. Figure 1 shows the image distorted by the camera lens, along with the true image that can be obtained if a perspective correction method is applied. It is important to note that this operation is only a pixel mapping between two image planes (the image plane and the world plane).

The method used for the perspective distortion correction requires four non-collinear points to create the mapping between the two planes. The four points correspond to the four corners of the trapezoid formed on the image plane and their equivalent positions on the world plane. These points generate a set of eight equations [9]. The equations can be solved offline, and a pair of matrices containing the world-plane pixel positions can be obtained.

The perspective distortion can be modeled by the following projective transformation, as written in equation (1), where $\mathbf{x}$ and $\mathbf{x}'$ are 3-vectors representing a point (in homogeneous coordinates) on the image plane and the world plane, respectively, and $H$ is a homogeneous, non-singular $3 \times 3$ matrix:

$$\mathbf{x}' = H\,\mathbf{x} \qquad (1)$$

With the four point correspondences discussed above, it is possible to extract eight equations to generate the matrix $H$, as shown in equation (2), taking $h_{33} = 1$:

$$
\begin{bmatrix}
x_1 & y_1 & 1 & 0 & 0 & 0 & -x_1 x'_1 & -y_1 x'_1 \\
0 & 0 & 0 & x_1 & y_1 & 1 & -x_1 y'_1 & -y_1 y'_1 \\
 & & & & \vdots & & & \\
x_4 & y_4 & 1 & 0 & 0 & 0 & -x_4 x'_4 & -y_4 x'_4 \\
0 & 0 & 0 & x_4 & y_4 & 1 & -x_4 y'_4 & -y_4 y'_4
\end{bmatrix}
\begin{bmatrix}
h_{11} \\ h_{12} \\ h_{13} \\ h_{21} \\ h_{22} \\ h_{23} \\ h_{31} \\ h_{32}
\end{bmatrix}
=
\begin{bmatrix}
x'_1 \\ y'_1 \\ \vdots \\ x'_4 \\ y'_4
\end{bmatrix}
\qquad (2)
$$

where $(x_i, y_i)$ is a point on the image plane and $(x'_i, y'_i)$ is its correspondence on the world plane.

It is possible to solve this set of linear equations using any mathematical software.
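As an offline illustration, the following C program builds and solves the 8x8 system of eq. (2) with naive Gaussian elimination, using the four image-plane points of Figure 2. The world-plane target rectangle (320x240) is an assumed choice for the sketch, not necessarily the mapping used in the actual design.

#include <stdio.h>
#include <math.h>

/* Build and solve the 8x8 system of eq. (2) for h11..h32 (h33 = 1).
 * Image-plane points are those of Figure 2; the world-plane rectangle
 * corners are illustrative assumptions. */
int main(void) {
    double ip[4][2] = {{244,194},{384,194},{17,344},{617,344}}; /* image plane */
    double wp[4][2] = {{0,0},{320,0},{0,240},{320,240}};        /* world plane (assumed) */
    double A[8][9];                                             /* augmented matrix [A | b] */

    for (int i = 0; i < 4; i++) {
        double x = ip[i][0], y = ip[i][1];
        double u = wp[i][0], v = wp[i][1];
        double r0[9] = { x, y, 1, 0, 0, 0, -x*u, -y*u, u };
        double r1[9] = { 0, 0, 0, x, y, 1, -x*v, -y*v, v };
        for (int j = 0; j < 9; j++) { A[2*i][j] = r0[j]; A[2*i+1][j] = r1[j]; }
    }

    /* Gaussian elimination with partial pivoting. */
    for (int c = 0; c < 8; c++) {
        int p = c;
        for (int r = c + 1; r < 8; r++)
            if (fabs(A[r][c]) > fabs(A[p][c])) p = r;
        for (int j = 0; j < 9; j++) { double t = A[c][j]; A[c][j] = A[p][j]; A[p][j] = t; }
        for (int r = c + 1; r < 8; r++) {
            double f = A[r][c] / A[c][c];
            for (int j = c; j < 9; j++) A[r][j] -= f * A[c][j];
        }
    }

    /* Back substitution. */
    double h[9];
    h[8] = 1.0;                                  /* h33 */
    for (int c = 7; c >= 0; c--) {
        double s = A[c][8];
        for (int j = c + 1; j < 8; j++) s -= A[c][j] * h[j];
        h[c] = s / A[c][c];
    }
    for (int i = 0; i < 9; i++)
        printf("h%d%d = %f\n", i/3 + 1, i%3 + 1, h[i]);
    return 0;
}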

Figure 1. Distortion of the world plane and the true world plane (panels: image plane, world plane).

Figure 2. (a) Camera image and (b) rectified image. Notice the four point correspondences on each image plane (Point 1: x = 244, y = 194; Point 2: x = 384, y = 194; Point 3: x = 17, y = 344; Point 4: x = 617, y = 344).


B. FPGA Implementation

Once the matrix H is obtained, the world-plane position of any image-plane pixel can be computed. This data can be stored in read-only memory; this means that, in our proposed FPGA architecture, each pixel composing the distorted image can be re-assigned to a new position according to the data stored in ROM, as shown in Figure 3.

Figure 3. Each incoming pixel is relocated to a new position according to the data stored in a ROM block. The ROM positions table holds one entry per pixel contained in the image plane (red trapezium).
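In software terms, the remapping of Figure 3 behaves like the loop below. This is only a behavioral C model: a flat positions table covering the full frame and a sentinel value for pixels outside the trapezium are modeling assumptions, since the real table only stores in-trapezium pixels.

#include <stdint.h>

#define IMG_W 320
#define IMG_H 240                 /* assumed frame height */
#define NPIX  (IMG_W * IMG_H)

/* rom_pos[i] holds the precomputed world-plane address of image-plane
 * pixel i; 0xFFFFFFFF marks pixels outside the trapezium (an assumed
 * sentinel value). */
extern const uint32_t rom_pos[NPIX];

void remap_frame(const uint8_t *image_plane, uint8_t *world_plane) {
    for (uint32_t i = 0; i < NPIX; i++) {
        uint32_t dst = rom_pos[i];
        if (dst != 0xFFFFFFFFu)   /* skip pixels with no mapping */
            world_plane[dst] = image_plane[i];
    }
}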

To fully implement this idea in FPGA hardware, three major components are needed: read-only memory to store the rectified position data, random-access memory to hold each incoming video frame, and a RAM/ROM controller to synchronize the data transmission between the two memory blocks.

The RAM/ROM controller is designed as a finite-state machine model with four main states (Figure 4):

State 0: Internal variables initialization.
State 1: Data request to ROM (world-plane position of the incoming image-plane pixel stream).
State 2: ROM data received; data request to RAM.
State 3: RAM data received and sent to the rectified image container (in this case, a VGA controller, so we can visualize the final image on a computer monitor).
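For clarity, the controller's behavior can be summarized by the following C model. The actual component is custom hardware; rom_read(), ram_read() and vga_write() are placeholder names standing in for the real memory and VGA handshakes.

/* Behavioral C model of the RAM/ROM controller of Figure 4. */
extern unsigned rom_read(unsigned pixel_index);   /* world-plane address */
extern unsigned ram_read(unsigned pixel_index);   /* pixel value         */
extern void     vga_write(unsigned addr, unsigned value);

typedef enum { S0_INIT, S1_ROM_REQ, S2_RAM_REQ, S3_VGA_OUT } state_t;

void controller_step(void) {
    static state_t  state = S0_INIT;
    static unsigned pixel, world_addr, pixel_val;

    switch (state) {
    case S0_INIT:        /* State 0: internal variables initialization */
        pixel = 0;
        state = S1_ROM_REQ;
        break;
    case S1_ROM_REQ:     /* State 1: request position data from ROM    */
        world_addr = rom_read(pixel);
        state = S2_RAM_REQ;
        break;
    case S2_RAM_REQ:     /* State 2: ROM data received, request RAM    */
        pixel_val = ram_read(pixel);
        state = S3_VGA_OUT;
        break;
    case S3_VGA_OUT:     /* State 3: send rectified pixel to VGA ctrl. */
        vga_write(world_addr, pixel_val);
        pixel++;
        state = S1_ROM_REQ;
        break;
    }
}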

C. Lane Coordinates Extraction

Once the image has been rectified, it is necessary to extract the coordinates of the two lanes along the horizontal axis. Figure 5 shows the ideal rectified image obtained after the perspective transformation.

Figure 5. Ideal rectified image. Each lane centroid is marked with a red rectangle.

Each of these white lanes can be represented as a perfect rectangle, with width $w$ and height $h$. If we consider the distance $v$ from the bottom-left corner of the image to the start of the white rectangle, then the center coordinate $c$ of this lane can simply be calculated as:

$$c = v + \frac{w}{2} \qquad (3)$$

For example, a lane starting $v = 40$ pixels from the left with width $w = 8$ pixels has its center at $c = 44$.

In a real image obtained after rectification (Figure 6), the two lanes hardly resemble perfect rectangles. It is possible to process each row of the image independently and then average the per-row values to obtain the final center coordinate of the white lane.

Figure 6. Real rectified image.


Figure 4. The finite-state machine model that synchronizes and controls the data transmission between the RAM and ROM blocks.


Consider that each row of the image is processed from left to right. First, we count the black pixels and store the final value in an accumulator variable $V_n$. The white pixels are also counted, and their sum is stored in another accumulator variable $W_n$.

Once a whole row (320 pixels) has been processed, the row counter variable $H$ is incremented by one. When the number of processed rows reaches a predefined bound on $H$, the processing is over and we compute the average of each accumulator variable:

For black pixels:

$$\bar{V} = \frac{V_n}{H} \qquad (5)$$

For white pixels:

$$\bar{W} = \frac{W_n}{H} \qquad (6)$$

The lane centroid along the horizontal axis can then be computed as:

$$c = \bar{V} + \frac{\bar{W}}{2} \qquad (7)$$

Remember that a pure hardware implementation is the final platform for this algorithm, so we need a way to deal with the divisions in eqs. (5), (6) and (7). If we process only 16 ($H = 2^4$) image rows, we can express the divisions in (5) and (6) as arithmetic right shifts of four positions; for eq. (7) we right shift by just one position.
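The C sketch below mirrors this procedure for one lane's search window. Stopping the black-pixel count at the first white pixel of each row (so that $V_n$ accumulates the distance $v$) is our reading of the procedure, and the binary row format is a modeling assumption.

#include <stdint.h>

#define ROWS  16     /* H = 2^4 processed rows */
#define ROW_W 320    /* pixels per row         */

/* Horizontal centroid of one white lane, following eqs. (5)-(7).
 * window[][] is the binary search window for this lane
 * (0 = black, 1 = white). */
uint16_t lane_centroid(const uint8_t window[ROWS][ROW_W]) {
    uint32_t Vn = 0, Wn = 0;          /* black / white accumulators */

    for (int h = 0; h < ROWS; h++) {
        int seen_white = 0;
        for (int x = 0; x < ROW_W; x++) {
            if (window[h][x]) {
                Wn++;                 /* white pixel: lane body      */
                seen_white = 1;
            } else if (!seen_white) {
                Vn++;                 /* black pixel before the lane */
            }
        }
    }
    uint32_t V = Vn >> 4;             /* eq. (5): Vn / 16 */
    uint32_t W = Wn >> 4;             /* eq. (6): Wn / 16 */
    return (uint16_t)(V + (W >> 1));  /* eq. (7): V + W/2 */
}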

D. Data Processing

After computing both centroids, we can finally determine whether a lane is present on the road, a lane change is taking place, or an obstacle is lying between these two points. To detect an obstacle, we can threshold the area between the two lanes; if an obstacle exists, it will show up as a blob of white pixels.

If the blob of pixels extends beyond a certain security distance, as shown in Figure 7, we can alert the driver that an obstacle lies directly in front of the vehicle.

Figure 7. Simple obstacle detection. The binary image shows a blob of white pixels indicating the presence of an obstacle.
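A simplified C rendering of this check is shown below; the security-zone row and the minimum blob size are illustrative parameters, not values from the actual design.

#include <stdint.h>

#define IMG_W 320
#define IMG_H 240          /* assumed frame height                      */
#define SECURITY_ROW 180   /* rows below this are "too close" (assumed) */
#define MIN_BLOB      50   /* minimum white-pixel count (assumed)       */

/* Flag an obstacle when a white blob appears between the two lane
 * centroids inside the security zone of the binary image. */
int obstacle_present(const uint8_t bin[IMG_H][IMG_W],
                     uint16_t left_c, uint16_t right_c) {
    uint32_t blob = 0;
    for (int y = SECURITY_ROW; y < IMG_H; y++)
        for (int x = left_c; x < right_c; x++)
            blob += bin[y][x];        /* 1 = white pixel */
    return blob >= MIN_BLOB;
}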

Detecting a lane change is a little trickier, as we need to track both lane centroids through time; this is accomplished by defining a "processing window". The processing window evaluates the changes presented in 10 video frames; then, a simple algorithm concludes whether the vehicle has, in fact, changed lanes.

The basic idea is simple: we divide the bottom of the image into two major areas, left and right. We further divide those areas into two halves, ending up with the four sub-areas shown in Figure 8. Each lane centroid will cross each region at different times, depending on the vehicle's direction.

Figure 8. Lane-change detection. Notice the four sub-areas at the bottom (left and right halves). The red vertical lines represent each of the lanes.


If the left centroid moves to the right and the right centroid disappears from the scene, a change to the left has probably occurred. Conversely, if the right centroid moves to the left and the left centroid eventually disappears, we can infer that a change to the right has just occurred. We still need a certain threshold value to be reasonably sure that the vehicle is really changing direction.

We define the minimum distance thresholds that both centroids have to cross as the two halves of the left and right areas in Figure 8 (red lines).

We previously defined a processing window of 10 frames. If the centroids are beyond the minimum distance threshold in at least 6 of the 10 frames, we conclude that a real lane change is taking place at that instant in time.
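The windowed decision logic can be summarized in C as follows. The threshold positions at one quarter and three quarters of the image width, and the zero-value convention for a centroid that has left the scene, are illustrative assumptions.

#include <stdint.h>

#define WINDOW     10   /* frames per processing window          */
#define MIN_VOTES   6   /* detections needed to confirm a change */
#define IMG_W     320
#define LEFT_THR  (IMG_W / 4)       /* red line of the left area  */
#define RIGHT_THR (3 * IMG_W / 4)   /* red line of the right area */

typedef enum { NO_CHANGE, CHANGE_LEFT, CHANGE_RIGHT } lane_event_t;

/* left_c[i] / right_c[i]: lane centroids for frame i of the window;
 * 0 marks a centroid that has left the scene (assumed convention). */
lane_event_t detect_lane_change(const uint16_t left_c[WINDOW],
                                const uint16_t right_c[WINDOW]) {
    int votes_left = 0, votes_right = 0;

    for (int i = 0; i < WINDOW; i++) {
        /* Left centroid crossed its red line and the right centroid
         * disappeared: evidence of a change to the left. */
        if (left_c[i] > LEFT_THR && right_c[i] == 0)
            votes_left++;
        /* Right centroid crossed its red line and the left centroid
         * disappeared: evidence of a change to the right. */
        if (right_c[i] != 0 && right_c[i] < RIGHT_THR && left_c[i] == 0)
            votes_right++;
    }
    if (votes_left  >= MIN_VOTES) return CHANGE_LEFT;
    if (votes_right >= MIN_VOTES) return CHANGE_RIGHT;
    return NO_CHANGE;
}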

This algorithm has been implemented as a simple program written in C running on a NIOS II soft-core CPU; the NIOS II is a 32-bit embedded-processor architecture designed specifically for Altera FPGAs. Its configuration is as follows:

Standard core.
20 KBytes of on-chip memory.
JTAG UART for host communication and debugging.

III. RESULTS

Tests in real-world scenarios were carried out, with the results evaluated by human reviewers. Each test involved the evaluation of 10 frames in a video sequence of varying length. It is important to note that the system yields a result after 10 frames of video are evaluated; if a detection shows up in at least 6 of the 10 frames, we conclude that the detection is positive.

The test vehicle changed lanes randomly throughout the whole video. Figure 9 shows the detection of a right lane change with the design discussed in this paper.

Figure 9. Lane-change detection test (camera image and rectified image; frames 1, 10 and 11).

The system actively detects and correctly identifies the lane and the lane change, provided that the lane markers are correctly painted on the road and the camera is correctly positioned. Tests performed on the lane detection and lane-change sequences resulted in 100% accuracy for each video sequence. One point to note, however, is that no occlusion occurred during the test sequences, so lower performance can be expected in occluded scenes.

The core hardware implementation of the system is shown in Figure 10.

We have included image pre-processing stages, composed of a closing filter and a binarization filter, before the actual perspective correction component to increase the system's robustness to light variations and environmental noise.

Figure 10. System core architecture (image filters, ROM, RAM, memory controller, centroid finder and NIOS II CPU).

Table 1 summarizes the resource consumption of the proposed design, which has been shown to work at a maximum frequency of 44.69 MHz.

TABLE I. DESIGN RESOURCE CONSUMPTION

FPGA Circuit    LEs       Memory bits    Fmax (MHz)
Proposed         2,282      192,384      44.69
EP2C35          33,216      483,840      44.69
EP2C70          68,416      1,125 K      44.69

The EP2C35 FPGA circuit from Altera has been used as the testing platform; however, the memory bits required by our design exceed those offered by this particular FPGA. The full design can be implemented on the EP2C70 circuit; as shown in Table 1, the memory block requirements are fully met by that circuit.

IV. CONCLUSIONS

In this work we proposed an implementation of a Driver Assistance System running on embedded hardware, which is a real and promising solution for improved traffic and road safety. A simplified model of road and lane detection using a perspective transformation was developed to take advantage of a hardware-configurable environment.

One of the crucial components of this system is the perspective corrector based on IPM. It is important to note that, as shown in eq. (2), this solution is completely independent of the intrinsic and extrinsic camera parameters, as long as the camera is correctly positioned on the vehicle.

This approach has proved to be feasible and reliable according to the experiments conducted. There is still room for code and resource optimizations; in the near future it will be possible to use the provided modules fully integrated into an actual on-board vehicle system. During development, a processing time of 0.008147 seconds per frame was achieved without full optimization of all components. The processing time is expected to decrease in future iterations.

ACKNOWLEDGMENT

The authors would like to express their gratitude to Dr. Cuauhtémoc Carbajal, Dr. Alfredo Santana and Dr. Sadegh Babaii for their continuous help and advice in the discussions leading up to this work.

REFERENCES

[1] World Health Organization, "The Top 10 Causes of Death", http://www.who.int/mediacentre/factsheets/fs310/en/index.html. Web. 07 May 2011.

[2] Cyclone II literature website by Altera, http://www.altera.com/literature/lit-cyc2.jsp. Web. 02 May 2011.

[3] Nissan's "All Around Collision Free" prototype to demonstrate advanced accident avoidance systems at ITS World Congress, http://www.nissannews.com/newsrelease.do;jsessionid=EBF6F1B4AEC92778E31524B6C91816E7?&id=626&mid=1. Web. 23 April 2011.

[4] Aly, M., "Real Time Detection of Lane Markers in Urban Streets", Computer Vision Lab, Electrical Engineering, California Institute of Technology, Pasadena, 2008.

[5] Bertozzi, M., "Stereo Inverse Perspective Mapping: Theory and Applications", Dipartimento di Ingegneria dell'Informazione, Università di Parma, Parma, Italy, 1998.

[6] Coifman, B., "A Real-Time Computer Vision System for Vehicle Tracking and Traffic Surveillance", Institute of Transportation Studies, University of California, Berkeley, California, 1998.

[7] McDonald, J., "Application of the Hough Transform to Lane Detection and Following on High Speed Roads", Signals & Systems Group, Department of Computer Science, National University of Ireland, Maynooth, Ireland, 2001.

[8] Yang, G., "Computer Vision Hough Transform", Department of Computing, Imperial College London, 2005.

[9] Hartley, R. and Zisserman, A., "Multiple View Geometry in Computer Vision", Second Edition, Cambridge University Press, Cambridge, U.K., 2003.
