2008 IEEE International Conference on Multimedia and Expo (ICME 2008), Hannover, Germany, June 23-26, 2008

VISION-BASED VEHICLE EVENT DETECTION THROUGH VISUAL RHYTHM ANALYSIS

1Chia-Hung Yeh, 3Jia-Chi Bai, 2Sun-Chen Wang, 2Po-Yi Sung, 2Ruey-Nan Yeh, and Maverick Shih

1Department of Electrical Engineering, National Sun Yat-Sen University, 804 Taiwan, R.O.C.

2Material and Electro-Optics Research Division, Chung-Shan Institute of Science and Technology, 325 Taiwan, R.O.C.

3Department of Computer Science and Information Engineering, National Dong-Hwa University, 974 Taiwan, R.O.C.

Email: [email protected]

ABSTRACT

In this paper, a simple and reliable on-road vehicle event detection algorithm is proposed to identify vehicle events. A virtual scan line placed at a fixed position in every frame is employed to extract the visual rhythm. The visual rhythm is a compact representation of a video: a coarse spatial sampling that captures the temporal evolution of the vehicle's status. By analyzing statistical characteristics of the visual rhythm, events such as safe distance, vehicle passing, and lane changing can be effectively detected. The proposed techniques can help prevent accidents and improve traffic safety by monitoring driver alertness and augmenting the driver's field of vision to prevent collisions. The proposed system is efficient in terms of both computational complexity and memory requirements. Experimental results show the efficiency and effectiveness of the proposed system for intelligent transport systems.

Keywords: vehicle event detection, intelligent transport system, visual rhythm

1. INTRODUCTION

Vision-based vehicle detection systems are widely used in intelligent transportation systems (ITS). The goal of vehicle detection is to prevent accidents and improve traffic safety. Several types of technologies, including advanced electronics, communications, and system-control techniques, are well developed and are often applied to vehicle production or driving services to improve driving safety. Vision-based analysis systems have become popular in transportation management because they can extract a wider variety of information than sensor-based systems. Vision-based systems have good potential for highway surveillance applications, where useful traffic information such as vehicle dimensions, lane changing, and other traffic-related information can be effectively extracted. However, it is challenging to maintain detection accuracy at all times, since vision-based processing is sensitive to environmental factors such as lighting, shadow, and weather conditions.

In this paper, a reliable and efficient algorithm for vehicle event detection is proposed. The proposed algorithm utilizes the "visual rhythm" to analyze vehicle behavior and detect events such as lane changing, safe distance, and vehicle passing. The visual rhythm records the pixel data along a scan line in each frame to capture temporal information [1]. The proposed system is efficient in terms of both computational complexity and memory requirements.

The rest of this paper is organized as follows. Relevant work on vision-based analysis systems is reviewed in Sec. 2. In Sec. 3, we introduce the concept of visual rhythm, which provides both reliable and efficient vehicle event detection. The detection of lane changing, safe distance, and vehicle passing is described in Sec. 4. Experimental results are shown in Sec. 5. Finally, concluding remarks and future research directions are given in Sec. 6.

2. BACKGROUND

Traffic is a complex, rapidly changing environment, but a driver's attention is limited and cannot be paid to every event that occurs while driving. Moreover, drivers may become fatigued during long drives and eventually begin to doze off. The goal of a vehicle event detection system is to assist the driver by alerting them to potential safety events and improving their situation awareness, thus preventing traffic accidents. Such safety-related events include lane changing, overtaking, safe following distance, pedestrians crossing, and so on. To detect these events and issue warnings, the system must operate in real time.

2.1. Vehicle Detection in the Driving Environment

By detecting and tracking surrounding vehicles, the system can monitor the driving environment and issue warnings when necessary. Various approaches have been proposed toward this goal [2]-[5]. They can be broadly categorized into three groups. The first group comprises hypothesis-based methods.


In general, vehicle detection can be divided into two steps: hypothesis generation (HG) locates possible vehicle patterns in the input video, while hypothesis verification (HV) examines the validity of each hypothesis. In the HG stage, potential vehicle-containing areas in an image are located using features such as edges, colors, textures, or even symmetry. However, vehicle detection that relies on such low-level features often suffers a high false-positive rate, which limits its feasibility.

The second group computes the motion field of the input video, known as "optical flow." By analyzing the optical flow, the system can detect the presence of a moving vehicle. However, owing to the motion-based nature of this method, optical flow cannot detect static obstacles on the road. Motion estimation is also time-consuming and difficult to implement in real-time applications.

The third group uses stereo vision to find vehicles. The main idea is to generate disparity maps from two or more cameras; vehicles can therefore be distinguished from the background according to the reconstructed depth information. However, such approaches are often disturbed by platform instability: vibrations that occur during driving may shift the cameras slightly out of position and introduce errors. After hypotheses are generated, they must be verified in the HV stage using methods such as templates and appearance models.

2.2. Vehicle Event Detection

Monitoring driving events is another important research issue in traffic safety. For example, when a driver dozes off, the vehicle may drift out of its lane or run into the car in front. By monitoring certain events, such as lane changing and safe distance, many accidents could be avoided. In addition, there are usually many blind spots around a vehicle; if another vehicle overtakes from a blind spot without being noticed by the driver, both vehicles are jeopardized. If the system can alert the driver in advance when another vehicle is overtaking, such risk can be greatly reduced.

Fang et al. proposed a scheme using a three-layer analysis model to detect driving-environment changes such as tunnel entrances, tunnel exits, lane changing, and overtaking [3]. The first layer is a preprocessing step that extracts moving objects in a video sequence. The second layer performs a perceptual analysis to extract features of the driving environment using a spatiotemporal attention (STA) neural network model. These features are then fed into the CART neural module in the third layer, which performs a conceptual analysis to determine the current status of the driving environment.

3. THE CONCEPT OF VISUAL RHYTHM

Conventional approaches operate on the whole image captured by the camera, which requires extensive computing power and greatly raises the cost of an intelligent transport system. We instead use the concept of "visual rhythm," extracted from a scan line at a given position in each frame of a video [1].

As shown in Fig. 1, a horizontal scan line is placed at a given position in the video. For each frame, the pixel data are sampled along this line and stored in a scan-line buffer. Fig. 2 shows the history of the scan lines listed in order. This time-variant data set, called the "visual rhythm," records the history of the vehicle's driving status. Through analyzing the characteristics of the visual rhythm, we can obtain meaningful information and detect important driving events using only a single camera. Furthermore, the visual rhythm extracts and processes only the information along the scan line, so the required computing power, and thus the cost of the system, can be significantly reduced.
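As a concrete illustration, the following minimal sketch (ours, not the authors' implementation) maintains such a rhythm buffer from a video stream. It assumes OpenCV and NumPy are available; SCAN_ROW and HISTORY are placeholder values that would be fixed by calibration.

    import cv2
    import numpy as np

    SCAN_ROW = 180   # vertical position of the scan line (placeholder; set by calibration)
    HISTORY = 240    # number of past scan lines kept in the rhythm buffer (placeholder)

    def update_visual_rhythm(rhythm, frame):
        # Sample one horizontal scan line from the grayscale frame and push it
        # onto the rhythm buffer; row 0 holds the most recent line, as in Fig. 2.
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        rhythm = np.roll(rhythm, 1, axis=0)   # age every stored line by one frame
        rhythm[0, :] = gray[SCAN_ROW, :]
        return rhythm

    # usage: rhythm = np.zeros((HISTORY, frame_width), np.uint8), updated once per frame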

Fig. 1. The positions of a horizontal scan line and the square detection blocks in the camera view.

Fig. 2. Visual rhythm extracted from scan lines. The top scan line holds the most recent data, while the bottom scan line holds the oldest.


4. VEHICLE EVENT DETECTION

4.1. Lane Changing Detection

To detect lane changing, we first obtain the positions of the lane marks. By tracking the horizontal offsets of the lane marks, we can detect lane-changing events according to the amount of horizontal drift of the vehicle. Each scan line is first converted to grayscale and then binarized with a given threshold, so that brighter regions, such as lane marks, are extracted from the road surface. However, many lane marks on the road surface are dashed lines, which complicates lane-mark tracking. To connect the dashes, each marked pixel is given an extended influence time of 30 frames. In this way, the dashed-line regions extend vertically and most of them connect to each other, as shown in Fig. 3. We then apply a one-dimensional morphological dilation to the scan line to merge nearby fragmentary segments produced during thresholding.
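This step might be sketched as follows (our illustration, not the authors' code; the binarization threshold and dilation radius are assumed values, and only the 30-frame influence time comes from the text):

    import numpy as np

    INFLUENCE = 30   # frames of extended influence per marked pixel (value from the text)

    def extract_lane_marks(scan_line, influence_ttl, threshold=200, radius=2):
        # Binarize one grayscale scan line: brighter pixels are candidate lane marks.
        marks = scan_line > threshold                 # threshold is illustrative
        # 1-D dilation: merge fragments within `radius` columns of a detected mark.
        kernel = np.ones(2 * radius + 1)
        marks = np.convolve(marks.astype(np.uint8), kernel, mode="same") > 0
        # Extended influence: refresh a per-column countdown wherever marks appear,
        # so dashed lines stay "on" for INFLUENCE frames and connect vertically.
        influence_ttl = np.maximum(influence_ttl - 1, 0)
        influence_ttl[marks] = INFLUENCE
        return influence_ttl > 0, influence_ttl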

The next step is to locate the position of each lane mark so that the offset between frames can be estimated. As can be seen in Fig. 4, the average position of each lane mark is calculated to obtain its center coordinate. The trajectory formed by the center coordinates across frames represents the history of the lane mark. The system then picks a lane mark and continuously tracks its position; by comparing the current center coordinate with the historic center coordinates on the lane trajectory, we obtain the lane offset. The lane offset gives a rough idea of how far the vehicle has drifted from its original path within a certain time interval. When a lane mark leaves the camera's view, or a dashed line cannot be fully connected through the extended influence so that further tracking becomes infeasible, the system automatically switches to the nearest lane mark and restarts the tracking process (see the sketch below).
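The center finding and nearest-mark re-locking can be sketched as follows (helper names and the max_jump tolerance are our assumptions):

    import numpy as np

    def lane_mark_centers(mark_mask):
        # Center column of each connected run of mark pixels on one scan line.
        padded = np.concatenate(([False], mark_mask, [False]))
        edges = np.flatnonzero(padded[1:] != padded[:-1])
        starts, ends = edges[0::2], edges[1::2]   # run start / one-past-end indices
        return (starts + ends - 1) / 2.0          # average position = center coordinate

    def track_lane_mark(prev_x, centers, max_jump=15):
        # Follow the mark closest to its previous position; return None to signal
        # that tracking was lost and the system should re-lock on the nearest mark.
        if len(centers) == 0:
            return None
        nearest = centers[np.argmin(np.abs(centers - prev_x))]
        return nearest if abs(nearest - prev_x) <= max_jump else None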

However, the lane offset obtained in this way cannot be applied directly to lane-changing detection. Vibrations that occur during driving, as well as minor adjustments of the vehicle's path, bend the lane trajectory and cause unnecessary false alarms. If we simply raise the detection threshold, the reaction time of the system increases and it may fail to alert the driver in time when dangerous events happen. To solve this problem, we need to distinguish lane-changing events from other minor changes in offset.

Here, we introduce a measurement called the lane baseline, used to distinguish lane changing from minor drift. Let X(i) be the horizontal position of the lane trajectory at frame i. We define the lane baseline position X_base(t) at frame t as follows:

X_{base}(t) = \frac{1}{t_1 - t_0} \sum_{i = t - t_1 + 1}^{t - t_0} X(i),    (1)

where the average is taken over a delayed time interval from frame t - t1 + 1 to frame t - t0. The constants t0 and t1 are time offsets in frames that mark the boundaries of the interval.

As shown in Fig. 4, the lane baseline is the average horizontal position of the lane trajectory within a certain time interval; here we choose t0 = 60 and t1 = 120. When the red line drifts away from the green line, the offset between the current lane trajectory and the lane baseline also increases, and when this offset exceeds a given threshold, a lane-changing event is declared. The lane baseline is in effect a smoothed variant of the lane offset: minor variations are eliminated through averaging, which significantly reduces false alarms, while the delayed time interval preserves sensitivity to genuine lane-changing events.
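Eq. (1) and the threshold test translate directly into code (a sketch using the interval chosen above; the drift threshold is an assumed value):

    def lane_baseline(trajectory, t, t0=60, t1=120):
        # Eq. (1): average trajectory position over the delayed window t-t1+1 .. t-t0.
        window = trajectory[t - t1 + 1 : t - t0 + 1]
        return sum(window) / (t1 - t0)

    def lane_changing(trajectory, t, threshold=25.0):
        # Declare a lane-changing event when the current position drifts far enough
        # from the baseline; the 25-pixel threshold is illustrative only.
        return abs(trajectory[t] - lane_baseline(trajectory, t)) > threshold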

4.2. Safe Distance and Vehicle Passing Detection

On the highway, when unexpected events happen to the vehicle ahead, such as a traffic accident or an emergency brake, maintaining a proper safe distance can be a life-saving matter. Using the visual rhythm concept, we propose an effective detection algorithm for safe distance and vehicle passing, so that the driver is notified when necessary.

Fig. 3. The thresholded visual rhythm with extended influence. White regions represent detected lane marks, while yellow regions represent their extended influence.

Fig. 4. An example of lane tracking. The red trajectory represents the currently tracked lane mark, the blue trajectory represents another candidate lane mark, and the green trajectory represents the lane baseline.



In Fig. 1, several square detection blocks are placed in the camera view through a calibration procedure. To distinguish the image characteristics of a vehicle from the ordinary road surface, we utilize both the mean and the standard deviation of pixel intensities within a square detection block. First, we acquire these features from the ordinary road surface as the basis of comparison. As can be seen in Fig. 4, since the lane-mark regions have already been extracted, the regions remaining afterward can be considered ordinary road surface. However, we do not have to compute the mean and standard deviation over the whole road surface; instead, we only process the region shown in Fig. 5. The reason is simple: if the vehicle maintains its current direction, the road-surface area that once fell in the square detection block will ultimately pass through the scan line and be stored in a narrow column of the visual rhythm buffer. Road surface lying outside this column may contain other objects, such as highway fences or shadows of vehicles passing in neighboring lanes, and would thus disturb the detection.
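Restricting the statistics to that narrow column might look like this (our sketch; rhythm and mark_history are the buffers maintained in the earlier sketches, and block_x/block_w locate the detection block horizontally):

    def road_surface_stats(rhythm, mark_history, block_x, block_w):
        # Mean/std of road-surface pixels in the rhythm column aligned with the
        # detection block, excluding pixels flagged as lane marks.
        column = rhythm[:, block_x : block_x + block_w].astype(float)
        road = column[~mark_history[:, block_x : block_x + block_w]]
        return road.mean(), road.std()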

We can then compare the features of the visual rhythm with those of the detection block. Let R_RS be the intensities of the limited road-surface region in the visual rhythm and R_DB be the intensities of a square detection block in the camera view. We define the detection function F as follows:

F(R_{RS}, R_{DB}) = \left( \overline{R_{RS}} - \overline{R_{DB}} \right)^2 + \left( \mathrm{Std}(R_{RS}) - \mathrm{Std}(R_{DB}) \right)^2    (2)

If F exceeds a given threshold, the features of the pixel content in the square detection block do not match those of the ordinary road surface; such a condition triggers a safe-distance event. Vehicle passing is detected in a similar manner; the major difference is that the square detection blocks are placed on the sides of the vehicle's path to detect vehicles passing in the neighboring lanes.
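With those statistics in hand, Eq. (2) reduces to a few lines (a sketch; the trigger threshold is an assumed value for illustration):

    def detection_score(road_mean, road_std, block):
        # Eq. (2): squared differences of mean and standard deviation between the
        # road-surface reference and the pixels inside the detection block.
        return (road_mean - block.mean()) ** 2 + (road_std - block.std()) ** 2

    def event_triggered(road_mean, road_std, block, threshold=400.0):
        # Raise a safe-distance (or vehicle-passing) event when F exceeds the
        # threshold; 400.0 is illustrative, not the authors' setting.
        return detection_score(road_mean, road_std, block) > threshold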

5. EXPERIMENTAL RESULTS

We gathered 10 hours of driving video containing sufficient events of each type to fully test the feasibility of the proposed algorithm. The algorithm is implemented on the DirectShow platform on a PC with a 3.4 GHz Intel dual-core CPU. The results are listed below:

Event Type         Accuracy
Lane Changing      96.8%
Safe Distance      95.6%
Vehicle Passing    95.2%

The frame size and frame rate of the test videos are 352×240 pixels and 30 fps, respectively. The implemented algorithm operates in real time with an average CPU usage of 7%. Experimental results show that most vehicle events can be efficiently detected by the proposed scheme.

6. CONCLUSIONS

In this paper, a reliable and efficient algorithm for vehicle event detection is proposed. The algorithm utilizes the concept of "visual rhythm," recording the pixel data along a scan line in each frame, to analyze vehicle behavior and detect events such as lane changing, safe distance, and vehicle passing. Experimental results show that the proposed algorithm not only meets reliability requirements but is also very efficient to implement in terms of both time and memory.

ACKNOWLEDGEMENTS

The authors would like to thank the National Science Council for financially supporting this research under Contract No. NSC-96-2628-E-259-009-MY2.

REFERENCES

[1] Hyeokman Kim, Jinho Lee, and S. Moon-Ho Song, "An efficient graphical shot verifier incorporating visual rhythm," in Proceedings of the IEEE International Conference on Multimedia Computing and Systems, vol. 1, pp. 827-834, June 1999.
[2] Zehang Sun, George Bebis, and Ronald Miller, "On-road vehicle detection: a review," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 5, pp. 694-711, May 2006.
[3] Chiung-Yao Fang, Sei-Wang Chen, and Chiou-Shann Fuh, "Automatic change detection of driving environments in a vision-based driver assistance system," IEEE Transactions on Neural Networks, vol. 14, no. 3, pp. 646-657, May 2003.
[4] T. Vaa, M. Penttinen, and I. Spyropoulou, "Intelligent transport systems and effects on road traffic accidents: state of the art," in Proceedings of Intelligent Transport Systems, vol. 1, pp. 81-88, June 2007.
[5] K. Thomas and H. Dia, "Comparative evaluation of freeway incident detection models using field data," in Proceedings of Intelligent Transport Systems, vol. 153, pp. 230-241, Sep. 2006.

Fig. 5. Only the region lying directly along the vehicle's direction of travel is computed as the basis of comparison against the detection block.
