Object detection and shadow removal from video stream

Realized by Paolo Pino [email protected], Pierluigi Sottile [email protected], Vittorio Minacori [email protected]

University of Catania - Project of Artificial Intelligence 2010/2011

Supervisors: Ing. Concetto Spampinato, Ing. Roberto Di Salvo



Abstract

This paper presents an algorithm for detecting moving objects from a static background scene that contains shading and shadows, using color images. We develop a robust and efficiently computed background subtraction algorithm that is able to cope with local illumination changes, such as shadows and highlights, as well as global illumination changes. We have applied this method to real image sequences of both simple and complex scenes, obtaining good results in both cases. The results, which demonstrate the system's performance, and some speed-up techniques we employed in our implementation are also shown.

1. INTRODUCTION

The focus of this work is to develop a general-purpose model for updating the background and for the segmentation of MVOs (Moving Visual Objects), taking into account the issues related to the presence of shadows.

MVO: a target object that can be obtained through a perfect segmentation, i.e. the set of points belonging to an object characterized by non-zero motion and a visual appearance different from the background.

Shadow: the shadow is detected along with the foreground object. Usually the shadow must be removed to reduce the risk of an incorrect segmentation, because it distorts the geometric properties of the identified object, which can cause problems in the successive processing stages.

The model for the correct classification of the moving object is represented in figure 1.

Figure 1: Object classification


2. ALGORITHM (figure 2)

The algorithm for the creation and update of the background model, and for salient foreground selection, is based on a work of Liyuan Li, Weimin Huang, Irene Y.H. Gu, and Qi Tian (ACM MM 2003, implemented in OpenCV). After its creation, the model is updated for each processed frame. Frames are grouped into lists and processed using the same background model (each list contains temporally close frames, so it is plausible that they share a common background). The number of frames per list therefore becomes one of the configuration parameters of the algorithm, trading off performance, generality and result quality according to the examined scene.

To improve the quality of the images, we decided to apply filters to the individual frames in order to have a better picture to analyze. These operations are carried out in the Camera Correction process. We analyzed several filters and finally chose the median filter, which reduces the local contrast of the image by deleting unnecessary detail (blurring) or detail related to the presence of noise (noise cleaning). The median filter does not degrade rising edges, but it removes peaks whose base is sufficiently smaller than the width of the mask.
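The median filtering step above can be sketched as follows. This is an illustrative pure-Python version, not the application's actual code (which works on OpenCV images); the 3x3 window and border replication are assumptions.

```python
# Sketch of the Camera Correction median filter: a 3x3 median filter removes
# impulsive noise (small peaks) while preserving rising edges better than an
# averaging filter would.

def median_filter(image, k=3):
    """Apply a k x k median filter to a 2D grayscale image (list of lists)."""
    h, w = len(image), len(image[0])
    r = k // 2
    out = [row[:] for row in image]
    for y in range(h):
        for x in range(w):
            window = []
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    # replicate border pixels at the image edges
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    window.append(image[yy][xx])
            window.sort()
            out[y][x] = window[len(window) // 2]
    return out

# A single bright impulse (255) in a flat region is removed entirely:
noisy = [[10, 10, 10],
         [10, 255, 10],
         [10, 10, 10]]
print(median_filter(noisy)[1][1])  # -> 10
```

Because the impulse's base (one pixel) is smaller than the mask width, the filter deletes it completely, which is exactly the behavior described above.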

For each frame in the lists, we compute a foreground mask by selecting foreground points through Background Suppression. These points are candidates to belong to MVOs because their values differ from the current background. In order to improve detection, the background suppression considers the point-to-point chromaticity of the image and not the grey levels only: we compute the difference between the background and the complete picture as the distance between each pixel in each of the RGB channels, i.e. |F_c(x, y) - B_c(x, y)|, where c represents the image channel, c ∈ {R, G, B}.

On the difference image, the selection of the initial set of foreground points is then carried out by thresholding with an adequately low threshold. In addition, some of the points obtained in this way are identified as noise and excluded from the foreground selection through the application of the OPENING morphological operator.
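The chain of per-channel difference, thresholding, and morphological opening can be sketched as below. This is an illustrative pure-Python version (the application uses OpenCV); the 3x3 structuring element and border replication are assumptions.

```python
# Sketch of foreground selection: a pixel is a candidate if any RGB channel
# differs from the background by more than the threshold; an OPENING
# (erosion followed by dilation) then drops isolated noise points.

def foreground_mask(frame, background, threshold):
    """Binary mask: 1 where any RGB channel differs by more than `threshold`."""
    h, w = len(frame), len(frame[0])
    return [[1 if any(abs(f - b) > threshold
                      for f, b in zip(frame[y][x], background[y][x])) else 0
             for x in range(w)] for y in range(h)]

def erode(mask):
    """3x3 erosion: a pixel survives only if its whole neighbourhood is set."""
    h, w = len(mask), len(mask[0])
    return [[1 if all(mask[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)) else 0
             for x in range(w)] for y in range(h)]

def dilate(mask):
    """3x3 dilation: a pixel is set if any neighbour is set."""
    h, w = len(mask), len(mask[0])
    return [[1 if any(mask[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)) else 0
             for x in range(w)] for y in range(h)]

def opening(mask):
    return dilate(erode(mask))

# An isolated noisy pixel passes the threshold but is removed by the opening:
bg = [[(0, 0, 0)] * 4 for _ in range(4)]
fr = [[(0, 0, 0)] * 4 for _ in range(4)]
fr[1][1] = (50, 0, 0)
m = foreground_mask(fr, bg, 20)
print(m[1][1], opening(m)[1][1])  # -> 1 0
```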

Then a region-based labeling is performed to compute the connected blobs of candidate moving points (by means of 8-connectivity), and the shadow detection process is applied to exclude shadow points from the set of foreground points. MVOs are validated with rules on area and saliency (if enabled). The salient foreground allows greater precision in the recognition of MVOs by mixing the results of two algorithms. The validation process can also be refined through a supervisioning process. Validated MVOs are excluded from the update of the list's background. The internal background update is carried out by re-shadowing: it simply consists in a pixel-by-pixel copy and paste of the non-validated blobs onto the background model. The results are then made available for further processing.
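The 8-connectivity labeling step can be sketched as follows. In the real application this is done by the cvBlob library; the flood-fill approach here is only an illustration of the same idea.

```python
# Sketch of region-based labeling: connected blobs of candidate moving
# points are grouped using 8-connectivity (diagonal neighbours count).

def label_blobs(mask):
    """Return (labels, count): 8-connected components of a binary mask."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    current = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not labels[y][x]:
                current += 1
                stack = [(y, x)]
                labels[y][x] = current
                while stack:
                    cy, cx = stack.pop()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = cy + dy, cx + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and mask[ny][nx] and not labels[ny][nx]):
                                labels[ny][nx] = current
                                stack.append((ny, nx))
    return labels, current

# Two diagonal pixels touch under 8-connectivity, so they form ONE blob:
mask = [[1, 0, 0],
        [0, 1, 0],
        [0, 0, 0]]
print(label_blobs(mask)[1])  # -> 1
```

Under 4-connectivity the two diagonal pixels would instead be two separate blobs; 8-connectivity is what keeps a diagonally moving object in a single blob.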


2.2 Shadow Detection

Detecting the shadow cast by an object is not an easy task. In many cases the two classes of points, those belonging to objects and those belonging to shadows, have a similar visual appearance, especially when working with grey levels only. We have verified that the discrimination between shadows and objects can be improved by adding color information, so we decided to use an approach based on the exploitation of the HSV space for a better differentiation of shadows from objects. To detect shadows, our algorithm is based on the following equation:

SP(x, y) = 1 if α ≤ F_V(x, y) / B_V(x, y) ≤ β and |F_H(x, y) - B_H(x, y)| ≤ Th and |F_S(x, y) - B_S(x, y)| ≤ Ts; 0 otherwise

where F_H, F_S, F_V and B_H, B_S, B_V represent the foreground and the background in the H, S, V channels respectively.

So, a point (x, y) is classified as shadow if it satisfies all three of the following properties:

1. the ratio of the V components (the lightness) of the foreground and the background respects both a lower and an upper bound;

Figure 2: Algorithm


2. the difference in the H space (the hue) between the foreground and the background is limited;

3. the difference in the S space (the saturation) between the foreground and the background is limited.

This equation comes from the observation that when an area is covered by a shadow, this often results in a significant change in lightness without a great modification of the color information.

Thus, we upper-bound the hue and saturation differences with one threshold each, and we require the lightness ratio to be bound by two thresholds α and β (with 0 < α < β < 1). The first analyzes the "power" of the shadow (a lower value indicates a darker shadow), while the second is used to increase the robustness to noise (the lightness of the current frame cannot be too similar to that of the background).
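The three-part shadow test can be sketched as below; this is an illustrative version, with the parameter names alfa, beta, Th, Ts borrowed from Appendix A and per-pixel HSV tuples as an assumed representation.

```python
# Sketch of the HSV shadow test: a point is shadow if the V ratio lies in
# [alfa, beta] and the H and S differences stay below the thresholds Th, Ts.

def is_shadow(fg_hsv, bg_hsv, alfa, beta, Th, Ts):
    fh, fs, fv = fg_hsv
    bh, bs, bv = bg_hsv
    if bv == 0:                     # guard against division by zero
        return False
    ratio = fv / bv
    return (alfa <= ratio <= beta
            and abs(fh - bh) <= Th
            and abs(fs - bs) <= Ts)

# Darker, but same hue and similar saturation -> shadow:
print(is_shadow((100, 80, 90), (100, 85, 180), 0.3, 0.9, 10, 20))  # -> True
# Darker AND hue changed a lot -> a real object, not shadow:
print(is_shadow((160, 80, 90), (100, 85, 180), 0.3, 0.9, 10, 20))  # -> False
```

Note that a full implementation should account for the circularity of the hue channel when computing the H difference; the plain absolute difference here is a simplification.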

2.3 Automatic Thresholding

We analyze the absolute difference between the foreground and background images to compute the median MED and the median absolute deviation MAD within the foreground selection mask Fm.

We determined that suitable values of thresholding in the H and S channels are:

In the V channel, we adopted a technique that allows us to optimize the computation of the lower and upper bounds. First we calculate the medians of the foreground and of the background within the foreground selection mask:

Then we compute α and β as follows:

where 1.4826 is a normalization factor.
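The exact threshold formulas did not survive in this transcript, so the sketch below is a plausible reconstruction: it assumes each threshold is set to MED + 1.4826 * MAD of the per-channel absolute differences inside Fm, since 1.4826 * MAD is the usual robust estimate of the standard deviation consistent with the normalization factor mentioned above.

```python
# Hypothetical reconstruction of the automatic thresholding: MED and MAD are
# computed over the absolute differences inside the foreground selection
# mask, and the threshold is MED + 1.4826 * MAD (robust sigma estimate).

def med_and_mad(values):
    """Median and median absolute deviation of a list of numbers."""
    s = sorted(values)
    med = s[len(s) // 2]
    dev = sorted(abs(v - med) for v in values)
    mad = dev[len(dev) // 2]
    return med, mad

def auto_threshold(differences):
    med, mad = med_and_mad(differences)
    return med + 1.4826 * mad

# The single outlier (40) barely moves the robust threshold:
diffs = [2, 3, 3, 4, 5, 3, 40]
print(round(auto_threshold(diffs), 2))  # -> 4.48
```

This robustness to outliers is exactly why MED/MAD are preferable to mean/standard deviation here: foreground pixels that leak into Fm do not inflate the threshold.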

2.4 Results

We have tested our application with both simple and complex video streams.

We observed that, depending on the type of video examined and on changes in the brightness of the scene, it was necessary to automatically update the thresholding parameters, because they are closely related to the brightness, saturation and hue of the processed frame.

Ex. 1 (default settings with a specific background)

The algorithm behaves well with complex scenes too, where the brightness, saturation and hue frequently change.

Ex. 2 (default settings with background creation)

...under all conditions...

As mentioned, the algorithm is robust even when the background changes over time. An example of this is visible in the images on the left. In fact, although the scene changes (dawn is breaking), the resulting background model permits good object recognition.

Figure 3: frame - Figure 4: background - Figure 5: detected MVO - Figure 6: object's shadow

Figure 7: analyzed frame - Figure 8: background - Figure 9: detected MVO - Figure 10: shadow

Figure 11: background 1 - Figure 12: background 2 - Figure 13: background 3

Figure 14: frame 1 - Figure 15: frame 2 - Figure 16: frame 3

Figure 17: object 1 - Figure 18: object 2 - Figure 19: object 3


ROC-CURVE

The true positive rate (sensitivity) is plotted as a function of the false positive rate (100 - specificity) for different cut-off points. Each point on the ROC curve represents a sensitivity/specificity pair corresponding to a different distance between the top-left corner of the real (rectangular) blob and the top-left corner of the detected (also rectangular) blob. The data in the table on the left refer to the entire set of cut-off points.

MedCalc Software bvba was used to create the ROC curve data.

3. SOMETHING ABOUT CODE

For the development of our application we used the Visual Studio IDE with the OpenCV (1), cvBlob (2), log4cxx (3) and Xerces-C++ (4) libraries. The algorithm has been parallelized on multiple threads; each thread processes a list of frames [see POOL - wait in Appendix A]. As mentioned, the number of frames in the lists is a configuration parameter of the algorithm (see THREAD_NUM in Appendix A).
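The parallelization scheme can be sketched as below. The real application uses C++ threads; this Python sketch only illustrates the idea of grouping frames into lists of THREAD_NUM frames (sharing one background model) and handing the lists to a pool of workers, with the names THREAD_NUM and POOL borrowed from Appendix A.

```python
# Sketch of the frame-list parallelisation: temporally close frames are
# grouped into lists that plausibly share a background, and a thread pool
# processes the lists concurrently.

from concurrent.futures import ThreadPoolExecutor

THREAD_NUM = 4   # frames per list (shared background model)
POOL = 2         # number of worker threads

def process_list(frame_list):
    """Placeholder for background suppression + shadow removal on one list."""
    return [f"processed-{f}" for f in frame_list]

def run(frames):
    # group consecutive (temporally close) frames into lists of THREAD_NUM
    lists = [frames[i:i + THREAD_NUM]
             for i in range(0, len(frames), THREAD_NUM)]
    with ThreadPoolExecutor(max_workers=POOL) as pool:
        # map preserves list order, so results stay temporally ordered
        return list(pool.map(process_list, lists))

print(run(list(range(6))))  # two lists: frames 0-3 and frames 4-5
```

A larger THREAD_NUM amortizes the cost of the shared background model over more frames but makes the "common background" assumption weaker, which is the performance/quality trade-off discussed in section 2.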

3.1 Background

To prevent including moving objects in the initial background model, we decided to analyze a number of frames established in the initialization parameters [see cicle_background in Appendix A] to obtain a starting background that is then updated in subsequent cycles. Depending on the examined scene, it is also necessary to set an adequate threshold for the creation of the foreground mask [see THRESHOLD in Appendix A].

1 (http://opencv.willowgarage.com/wiki/) Computer vision algorithms in OpenCV can be used, for example, to identify objects, classify human actions in videos, track camera movements, track moving objects, separate background and foreground, etc.

2 (http://code.google.com/p/cvblob/) cvBlob is a computer vision library to detect connected regions in binary digital images. cvBlob performs connected component analysis (also known as labeling) and feature extraction.

3 (http://logging.apache.org/log4cxx/index.html) Apache log4cxx is a logging framework for C++ patterned after Apache log4j. Almost every large application includes its own logging or tracing API. Inserting log statements into code is a low-tech method for debugging it. It may also be the only way, because debuggers are not always available or applicable, as is usually the case for multithreaded applications and distributed applications at large.

4 Xerces-C++: a validating XML parser written in a portable subset of C++.

3.2 Salient foreground

The Salient Foreground Selection is the foreground computed by the pure OpenCV library. It is computed on the frame corrected by the Camera Correction process.

If enabled, this process compares each object with the verity mask (the OpenCV foreground selection); if they have at least a fixed number of pixels in common the object is kept, otherwise it is discarded [see fitting in Appendix A].
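The fitting check can be sketched as below; this is an illustrative version, with pixel coordinate sets as an assumed representation of the blob and the verity mask, and the parameter name fitting taken from Appendix A.

```python
# Sketch of the fitting check: an object blob is kept only if it shares at
# least `fitting` pixels with the verity mask (OpenCV foreground selection).

def passes_fitting(blob_pixels, verity_mask_pixels, fitting):
    """blob_pixels / verity_mask_pixels: sets of (x, y) coordinates."""
    common = len(blob_pixels & verity_mask_pixels)
    return common >= fitting

blob = {(0, 0), (0, 1), (1, 0), (1, 1)}
verity = {(0, 1), (1, 1), (2, 2)}
print(passes_fitting(blob, verity, fitting=2))  # -> True  (2 common pixels)
print(passes_fitting(blob, verity, fitting=3))  # -> False
```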

3.3 Shadow

You can set automatic or manual shadow thresholding [see alfa - beta - Th - Ts in Appendix A].

3.4 Mvo's validation

As already mentioned, the validation is done through rules on the blob's area and through supervisioning (if enabled, the system asks the user to confirm whether certain objects are real objects) [see supervisioning - minArea - maxArea in Appendix A].

3.5 Results management

We decided to simply save the results as images (.jpg) in a folder named "detected". You can also configure some saving options yourself [see three - saveShadow in Appendix A]. It is also possible to perform additional processing on the results inside the code. By enabling the delivery service [see thread_saving in Appendix A], you can in fact perform ordered access to the data structures (FrameObject) containing the results of each frame.

4. EXTRA

The software is fully configurable through configuration.xml and is equipped with a console logger (also configurable). For more details, see the README file and the code documentation.
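Reading such a configuration file could be sketched as below. The actual layout of configuration.xml is not shown in this document, so the element names in the sample are assumptions for illustration only.

```python
# Sketch only: hypothetical configuration.xml layout; element names are
# illustrative assumptions, not the application's real schema.

import xml.etree.ElementTree as ET

SAMPLE = """
<configuration>
  <THRESHOLD>30</THRESHOLD>
  <supervisioning>false</supervisioning>
</configuration>
"""

def load_config(xml_text):
    """Return a dict of parameter name -> raw text value."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}

cfg = load_config(SAMPLE)
print(cfg["THRESHOLD"])  # -> 30
```

In the real application this role is played by the Xerces-C++ validating parser mentioned in section 3, which can additionally check the file against a schema.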


APPENDIX A

Initialization Parameters

When the application starts, we set the path of the video to be analyzed and the parameters required to run; if these parameters are not changed, they are set to their default values.

These are the parameters in details:

• int POOL: the total number of threads

• int THREAD_NUM: the number of frames processed by every single thread

• int cicle_background: the number of frames used for background learning

• bool thread_saving: enable/disable the delivery service

• float THRESHOLD: the threshold for background suppression

• double alfa, beta, Th, Ts: the thresholds for shadow detection

• int wait: the max number of concurrent threads

• bool three: enable/disable saving in a directory tree

• bool saveShadow: enable/disable saving the object's shadow

• bool supervisioning: enable/disable supervisioning for ghost detection

• int fitting: the number of frames to use in the verity mask for the fitting process

• int minArea, maxArea: the range of area within which an object is detected


REFERENCES

1. “Image difference threshold strategies and shadow detection”, Paul L. Rosin - Tim Ellis.

2. “A Statistical Approach for Real-time Robust Background Subtraction and Shadow Detection”, Thanarat Horprasert - David Harwood - Larry S. Davis (Computer Vision Laboratory, University of Maryland, College Park, MD 20742).

3. “Detecting Objects, Shadows and Ghosts in Video Streams by Exploiting Color and Motion Information”, R. Cucchiara, C. Grana, M. Piccardi, A. Prati (D.S.I. University of Modena and Reggio Emilia - Dip. Ingegneria University of Ferrara, Italy).

4. OpenCV (Open Source Computer Vision): a library of programming functions for real time computer vision http://opencv.willowgarage.com/wiki

5. cvBlob: a library for computer vision to detect connected regions in binary digital images http://code.google.com/p/cvblob/ .

6. “Image Enhancement and Spatial Filters”, Concetto Spampinato (University of Catania).

7. http://www.medcalc.org