A New Moving-Object Invariant Video Mosaicing based method for Remote Surveillance

Md. Mahbubur Rahman and Susumu Horiguchi

School of Information Science, Japan Advanced Institute of Science and Technology

E-mail: [email protected]

Abstract

We propose a moving-object invariant video mosaicing technique for detecting and monitoring activity over a wide area with an active camera. There is a large literature on image mosaicing; however, many of these algorithms focus on mosaicing static scenes using homographies among the frames. There has also been recent work on activity monitoring and detection, but such work generally assumes a non-moving camera. In contrast, we propose a model for activity monitoring and detection over a wide area in which a moving camera sweeps over the area of interest and is controlled through a network or the Internet. Our approach combines the proposed video mosaicing technique, which constructs the panoramic background and performs static registration, with an adaptive background subtraction method for moving object detection and tracking. Unlike the conventional approach, the proposed mosaicing method employs motion clustering and segmentation techniques to filter moving objects automatically from the resulting panorama, and it does not require any robust estimator for dominant motion estimation. We demonstrate the efficiency of our method with results obtained from real-life video data.

International Journal of The Computer, the Internet and Management Vol. 12#1 (January – April, 2004) pp 37 - 52

1 Introduction

Recently, there has been growing interest in the mosaic representation of video sequences. Mosaicing is the process of aligning multiple images into larger aggregates on a common reference plane. Applications of mosaicing include panoramic photography, virtual environments, video compression and video surveillance. Mosaicing consists of two main steps: image registration and image blending. A number of image registration techniques have been proposed in the literature [1-8]. These methods can be divided into two categories: the first relies on special feature correspondences, and the second is based on optimizing a transformation using pixel correlation. Because of its simplicity and reliability, the latter approach is used in most cases. For translational shifts, Kuglin and Hines [1] proposed a phase correlation method that works well even for images with little overlap. Barnea and Silverman [2] proposed a method of registration in the spatial domain.

Some other techniques rely on the user providing an initial registration as a set of corresponding points in the two images. Using these points, the techniques apply a linear transformation that is optimized to reduce the error in the overlapping region. Such methods were proposed by Irani and Anandan [5] and Szeliski [6], who used a planar manifold. A manifold is essentially a reference surface onto which all the input images are projected. Planar transformations are good for simple translations and for rotation about the optical axis, but severe distortions are introduced under more general camera motion. Instead of planar projection, Mann and Picard [3] and Herman [4] projected the images onto a cylindrical surface to reduce distortion. Peleg and Rousso [7] suggest that most restrictions on the motion of the camera can be eliminated by using a manifold whose shape is determined adaptively during the registration process.

In essence, such methods directly minimize an image-based misregistration measure to estimate the unknown structure and motion parameters, generally by a least squares method. The least squares estimation often converges to a local minimum, especially when the image motion between two consecutive frames is large or when there are many outliers. A good initialization of the unknown structure and motion parameters is therefore crucial. When the image motion between two consecutive frames is large, an image pyramid is often used: a subsampled, smaller image is registered first, and the estimated parameters serve as the initial values for registration at the next pyramid level. When a significant portion of outliers appears, a robust M-estimation method is used [8]. A drawback of the M-estimator compared with a non-robust approach is the additional computational complexity it introduces [9].

Among its many applications, video mosaicing can be used for wide area surveillance and monitoring. Automatic video object extraction has not proved promising in general situations, but in applications such as videoconferencing and surveillance it is relatively easy to pre-capture a background mosaic that helps automatic object extraction. There has been recent work on activity monitoring and detection using a pre-stored background mosaic; however, such work generally assumes a non-moving camera. A static camera restricts the foreground object to a very limited view, which is unsatisfactory in wide area surveillance applications. Employing an active camera solves this viewing problem, but conventional background analysis schemes cannot easily be applied to the images it captures. Only a few research works [10-12] have addressed object segmentation and tracking using an active camera, and they cannot segment the moving object at an arbitrary pan-tilt angle.

In this paper we present a new method for automatically generating a moving-object invariant background mosaic from the incoming video of an active camera, and we apply it to wide area surveillance and monitoring. The method uses an optical flow based technique together with motion clustering to eliminate moving objects from the panorama. The background panorama is subsequently used for adaptive background subtraction to detect the moving object in the scene. We use extended Kalman filtering to track the moving object in subsequent frames. The combination of these methods gives reliable detection and tracking of moving objects in a wide field of view and can be used for surveillance purposes. The proposed model uses a single camera instead of multiple cameras, which avoids the synchronization problem that arises when multiple cameras are used for wide area surveillance. Besides wide area surveillance and monitoring, the proposed mosaicing technique can be applied in other areas, e.g. video editing and synthetic panoramic scene generation.

This paper is organized as follows. Section 2 provides basic concepts of mosaicing techniques. Section 3 presents the proposed method for moving-object invariant mosaicing. Section 4 describes the application of the proposed mosaicing method. Section 5 shows experimental results and relevant discussions. Section 6 gives conclusions and future directions.


2 Image mosaicing

The main steps in mosaicing are i) image registration or warping and ii) image blending.

Image warping is in essence a transformation of overlapping images that changes the spatial configuration of an image so as to minimize the error in the overlapping region, i.e.,

$$I' = w(I), \qquad w : \Re^2 \to \Re^2 \tag{1}$$

Here $I$ and $I'$ are the input and warped images respectively, and $w$ is the warping function. If we represent the input and output images in homogeneous coordinates $(x, y, Z)$, we can describe the class of 2D planar projective transformations as:

$$\begin{bmatrix} x' \\ y' \\ Z' \end{bmatrix} = \begin{bmatrix} m_0 & m_1 & m_2 \\ m_3 & m_4 & m_5 \\ m_6 & m_7 & m_8 \end{bmatrix} \begin{bmatrix} x \\ y \\ Z \end{bmatrix} \tag{2}$$

i.e.

$$X' = MX \tag{3}$$

More complex transformations can be represented using bilinear or polynomial transformations. To determine the value of $M$, an initial estimate of $M$ is made and then successively modified to reduce the error in the overlapping region. Generally, nonlinear minimization techniques are employed to obtain the optimum value of $M$. The problem can be described as follows: given two images $g_1(x, y)$ and $g_2(x, y)$, minimize the error

$$\sum_{\forall x,y} \left( g_1(x,y) - g_2(x',y') \right)^2 = \sum_{x,y} e_{x,y}^2 \tag{4}$$

where $g_2(x', y')$ is the image transformed using the projective transformation matrix $M$; bilinear interpolation is then used to resample $g_2$ on the integer pixel grid.

After computing the optimal transformation, the images need to be merged so as to reduce any artifacts introduced at the edges of the two images. This process is referred to as image blending. A number of different techniques can be used for this purpose. Most mosaicing papers present their own solutions to this problem, from simple intensity averaging to more complex Voronoi weighting of the images and Gaussian spline interpolations.
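As an illustration of Eqs. (2)-(4), the following minimal sketch evaluates the registration error for a given matrix $M$. It is not the authors' implementation: the function names (`apply_projective`, `bilinear`, `registration_error`) are hypothetical, images are assumed to be lists of rows of grey values with at least 2 × 2 pixels, and pixels that map outside $g_2$ are simply skipped.

```python
def apply_projective(M, x, y):
    """Map (x, y) through the 3x3 matrix M of Eq. (2), taking Z = 1."""
    xp = M[0][0] * x + M[0][1] * y + M[0][2]
    yp = M[1][0] * x + M[1][1] * y + M[1][2]
    zp = M[2][0] * x + M[2][1] * y + M[2][2]
    return xp / zp, yp / zp


def bilinear(img, x, y):
    """Sample img at a real-valued (x, y); return None outside the image."""
    h, w = len(img), len(img[0])
    if not (0 <= x <= w - 1 and 0 <= y <= h - 1):
        return None
    x0, y0 = min(int(x), w - 2), min(int(y), h - 2)
    ax, ay = x - x0, y - y0
    top = (1 - ax) * img[y0][x0] + ax * img[y0][x0 + 1]
    bot = (1 - ax) * img[y0 + 1][x0] + ax * img[y0 + 1][x0 + 1]
    return (1 - ay) * top + ay * bot


def registration_error(g1, g2, M):
    """Sum of squared differences of Eq. (4) over the overlapping region."""
    err = 0.0
    for y in range(len(g1)):
        for x in range(len(g1[0])):
            xp, yp = apply_projective(M, x, y)
            val = bilinear(g2, xp, yp)
            if val is not None:          # pixel falls inside the overlap
                err += (g1[y][x] - val) ** 2
    return err
```

A nonlinear minimizer would call `registration_error` repeatedly while adjusting the entries of $M$; the choice of minimizer is left open here, as it is in the text.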


Figure 1 Problem of static mosaicing. (a) and (b) are two frames of the truck sequence. (c) is the mosaiced image. The truck is blurred in the mosaiced image due to overlapping.

In general, image mosaicing techniques assume a static scene. The presence of a moving object creates additional complexity in the registration process (Figure 1). First, such an object can throw off the process by providing incorrect information about how the images should be aligned, which results in a poor quality panoramic image. The second issue arises in the construction of the panoramic view: simply taking the average pixel value yields a final mosaic that contains traces of the moving object. Registration errors can be addressed through the use of robust statistical techniques, the so-called M-estimators. In an M-estimation based algorithm, the unknown parameters are estimated by minimizing an objective function of the residual errors, i.e.,

$$\min \sum_i \rho(r_i; \sigma) \tag{5}$$

where $r_i$ is the residual error and $\sigma$ is the scale factor. The problem with employing an M-estimator rather than a non-robust approach is the additional computational complexity it introduces.
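To make the extra cost of Eq. (5) concrete, here is a minimal sketch of an M-estimate of a single location parameter computed by iteratively reweighted least squares. The text does not specify $\rho$; the Huber function is assumed here purely as an example, and the names `huber_weight` and `robust_mean` are hypothetical.

```python
def huber_weight(r, sigma):
    """Reweighting factor of the Huber rho: 1 within the scale, sigma/|r| beyond it."""
    a = abs(r)
    return 1.0 if a <= sigma else sigma / a


def robust_mean(values, sigma, iters=50):
    """Minimise sum_i rho(v_i - m; sigma) over m by iterative reweighting."""
    m = sum(values) / len(values)            # non-robust starting point
    for _ in range(iters):
        w = [huber_weight(v - m, sigma) for v in values]
        m = sum(wi * vi for wi, vi in zip(w, values)) / sum(w)
    return m
```

For the samples `[1.0, 1.1, 0.9, 1.0, 50.0]` the ordinary mean is pulled to 10.8 by the single outlier, whereas `robust_mean(..., sigma=0.5)` settles near 1.125. The price is the repeated reweighting loop, which is the kind of overhead the proposed method avoids.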


3 Proposed mosaicing approach

In this section we propose an optical flow based approach for mosaicing video frames that is invariant to the presence of moving objects. The moving object (outlier) is identified in two stages. First, by clustering on the motion values and color components, we label the moving regions that are incoherent with the background. After that, affine motion parameters (motion layers) are estimated for the regions of the frame and compared with the regions identified in the previous step, and the moving object is extracted from the resulting background. Our proposed method therefore consists of the following steps (Figure 2):

1. Compute dense optical flow
2. Cluster the optical flow map using motion vectors and color values
3. Calculate the affine motion parameters
4. Merge the frames to obtain the resulting panorama

Figure 2 Process of determining projective parameter matrix. Each pair of consecutive frames (frame i-1, frame i, frame i+1, ..., frame n-1, frame n) passes through optical flow estimation, clustering and outlier labelling, and affine parameter estimation and outlier filtering; the results are combined into the projective parameter matrix.

3.1 Optical flow estimation

The fundamental assumption in differential based optical flow estimation is intensity conservation, from which the principal relationship between intensity derivatives and optical flow can be derived [14]. Let $I(x, y, t)$ be the intensity at time $t$ at the image point $(x, y)$. If $u(x, y)$ and $v(x, y)$ are the $x$ and $y$ components of the optical flow vector at that point, we expect the intensity to be the same at time $t + \delta t$ at the point $(x + \delta x, y + \delta y)$, where $\delta x = u\,\delta t$ and $\delta y = v\,\delta t$. That is,

$$I(x, y, t) = I(x + \delta x, y + \delta y, t + \delta t) \tag{6}$$

for a small time interval $\delta t$. Expanding the right hand side of the above equation in a Taylor series, we get

$$I(x, y, t) = I(x, y, t) + \delta x\, I_x(x, y, t) + \delta y\, I_y(x, y, t) + \delta t\, I_t(x, y, t) + \text{h.o.t.} \tag{7}$$

Ignoring the higher order terms, substituting $\delta x = u\,\delta t$ and $\delta y = v\,\delta t$, and dividing by $\delta t$, we get

$$I_x u + I_y v + I_t = 0 \tag{8}$$

or equivalently, in short,

$$\nabla I(X, t) \cdot V + I_t(X, t) = 0 \tag{9}$$

Here $I_t(X, t)$ denotes the partial time derivative of $I(X, t)$, $\nabla I(X, t) = (I_x(X, t), I_y(X, t))$, and $\nabla I \cdot V$ denotes the usual dot product. To estimate the motion components, further constraints are imposed. We assume here that the flow in a local window is constant, so the optical flow field is fitted to a constant model in each spatial neighborhood $\Omega$. Optical flow estimates are therefore computed by minimizing the weighted least squares fitting error

$$E(V) = \sum_{X \in \Omega} W^2(X) \left[ \nabla I(X, t) \cdot V + I_t(X, t) \right]^2 \tag{10}$$

where $W(X)$ denotes a window function that gives more influence to constraints at the center of the neighborhood than to those at the periphery. Minimizing this fitting error with respect to $V$ leads to the equation $\nabla E(V) = 0$, from which the optical flow field can be solved as

$$V = \begin{bmatrix} \sum W^2 I_x^2 & \sum W^2 I_x I_y \\ \sum W^2 I_x I_y & \sum W^2 I_y^2 \end{bmatrix}^{-1} \begin{bmatrix} -\sum W^2 I_x I_t \\ -\sum W^2 I_y I_t \end{bmatrix} \tag{11}$$

The local model method repeatedly solves this linear system and then integrates the computed values of $V$ to provide an estimate of the optical flow field over the aggregate time interval. The optical flow estimation is valid under the assumptions of constant intensity along the motion trajectory and of small motion. The small motion assumption, however, is often violated, so we adopted a multiresolution approach. In the multiresolution framework, a Gaussian pyramid of images is generated by iteratively filtering and subsampling by a factor of 2. To reduce the aperture effect, the local window size and the threshold on the smaller eigenvalue of the Hessian of the image are chosen dynamically. Figure 3 shows an experiment.
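The closed-form solution of Eq. (11) for one window can be sketched as follows. This is a hedged illustration rather than the authors' code: the function name `local_flow` is hypothetical, the derivative samples are assumed to be precomputed and passed as flat lists, and a near-zero determinant test stands in for the eigenvalue threshold mentioned in the text.

```python
def local_flow(Ix, Iy, It, W):
    """Solve Eq. (11) for V = (u, v) in one neighbourhood.

    Ix, Iy, It are the spatial and temporal derivatives at the pixels of
    the window and W the window weights, all as equal-length lists.
    Returns None when the 2x2 system is (nearly) singular.
    """
    a = sum(w * w * ix * ix for w, ix in zip(W, Ix))
    b = sum(w * w * ix * iy for w, ix, iy in zip(W, Ix, Iy))
    c = sum(w * w * iy * iy for w, iy in zip(W, Iy))
    p = -sum(w * w * ix * it for w, ix, it in zip(W, Ix, It))
    q = -sum(w * w * iy * it for w, iy, it in zip(W, Iy, It))
    det = a * c - b * b
    if abs(det) < 1e-12:                 # gradients all parallel: aperture problem
        return None
    u = (c * p - b * q) / det            # first row of the inverse times (p, q)
    v = (a * q - b * p) / det            # second row of the inverse times (p, q)
    return u, v
```

In a multiresolution scheme this solve would be applied at the coarsest pyramid level first, with the result used to pre-warp the next finer level.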

3.2 Clustering optical flow map

In this stage the optical flow map obtained from the previous stage is clustered using the motion vectors, pixel positions and color values of the pixels to determine approximately the moving object region. The clustered map without the moving object region is fed to the next stage. Because the moving object region gives incorrect registration information, it is wise not to use this region in the registration process. We use k-means clustering with a feature vector $F = [u, v, x, y, r, g, b]$, where $u$ and $v$ are the components of the motion vector, $(x, y)$ is the pixel position, and $r$, $g$, $b$ are the normalized color values of the pixel, i.e.,

$$r = \frac{R}{R + G + B} \tag{12}$$

$$g = \frac{G}{R + G + B} \tag{13}$$

$$b = \frac{B}{R + G + B} \tag{14}$$

Pixel intensity allows the algorithm to separate pixels of different objects, and the inclusion of image coordinates concentrates the pixels of each region. Initially the cluster means are chosen arbitrarily. The distance between pixel $P_i$ and cluster $C_j$ is measured using a Mahalanobis distance defined by

$$D(P_i, C_j) = (F_i - F_j)\, W_j\, (F_i - F_j)^T \tag{15}$$

where $F_i$ is the feature vector of pixel $P_i$, $F_j$ is the feature vector of cluster $C_j$, composed of the means of the component values of the cluster, and $W_j$ is a diagonal matrix with seven elements that weights the color intensities, since they are of a different scale from the rest of the components in the feature vector. One way to compute the components $w_i$ of the weight matrix $W_j$ of any cluster $C_j$ is as follows [15]:

$$w_i = \frac{c}{\sigma_i} \tag{16}$$

$$c = \left( \prod_{i=1}^{K} \sigma_i \right)^{1/K} \tag{17}$$

$$\sigma_i^2 = \frac{1}{N_p - 1} \sum_{P \in C_j} \left( f_i^p - \bar{f}_i \right)^2 \tag{18}$$

Here $N_p$ is the number of pixels in cluster $j$, $f_i^p$ is the i-th feature of pixel $P$, $\bar{f}_i$ is the i-th feature of cluster $C_j$, and $K$ is the total number of features in the feature vector. A minimum and a maximum region size, $\tau_{min}$ and $\tau_{max}$, are used to control the cluster refinement step. A cluster is divided if the following condition holds:

$$(N_i > \tau_{max}) \ \text{or} \ (N_i > \tau_{min} \ \text{and} \ \sigma_i^2 > \sigma_t) \tag{19}$$

where $N_i$ is the number of pixels in cluster $i$ and $\sigma_t$ is the variance threshold. We assume that the background occupies a larger region than the moving objects. First, the moving regions are roughly determined using the magnitude of the motion vectors, $E_{opf} = \sqrt{u^2 + v^2}$. After that, clustering is performed only on the moving region to obtain the moving object region. Because the clustering stage does not guarantee that all pixels are spatially connected, a connected component analysis is performed afterwards. The resulting regions are then labeled.

Figure 3 Process of motion clustering in car sequence. (a), (b) and (c), (d) are frames from the "car1" and "car2" sequences respectively. (e), (g) are their optical flows and (f), (h) are their corresponding clustering results.
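The per-cluster weighting of Eqs. (16)-(18) and the distance of Eq. (15) can be sketched as below. This is an illustrative reading, not the authors' code: the names `cluster_weights` and `distance` are hypothetical, the sample standard deviation is used for $\sigma_i$, and every feature is assumed to have non-zero spread within the cluster.

```python
import math


def cluster_weights(features):
    """Return (mean, weights) of one cluster, with w_i = c / sigma_i.

    features is a list of K-dimensional vectors such as [u, v, x, y, r, g, b].
    """
    n, k = len(features), len(features[0])
    mean = [sum(f[i] for f in features) / n for i in range(k)]
    sigma = [
        math.sqrt(sum((f[i] - mean[i]) ** 2 for f in features) / (n - 1))
        for i in range(k)
    ]
    c = math.prod(sigma) ** (1.0 / k)      # geometric mean of the sigmas, Eq. (17)
    return mean, [c / s for s in sigma]


def distance(f, mean, weights):
    """Weighted squared distance of Eq. (15) for a diagonal W_j."""
    return sum(w * (a - m) ** 2 for a, m, w in zip(f, mean, weights))
```

Because $c$ is the geometric mean of the $\sigma_i$, the weights multiply to one, so features with a large spread (pixel coordinates) are scaled down relative to those with a small spread (normalized colors) without changing the overall scale of the distance.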

3.3 Affine motion parameters estimation

In this stage an affine motion model is assigned to each region of the image, including the regions labeled in the previous stage. We employ the affine motion model, which has only six parameters and can describe the motions commonly encountered in video sequences. This transformation includes translation, rotation, zoom, shear and any linear combination of these. Affine motion can be described as:

$$u_a(x, y) = a_1 x + a_2 y + a_3 \tag{20}$$

$$v_a(x, y) = a_4 x + a_5 y + a_6 \tag{21}$$

where $(x, y)$ denotes the position of a pixel. The parameters $a_1, a_2, a_3, a_4, a_5, a_6$ are estimated by a linear least squares procedure that minimizes the residual error

$$\sigma^2 = \frac{1}{n} \sum_{\forall x,y} \left[ \left( u(x,y) - u_a(x,y) \right)^2 + \left( v(x,y) - v_a(x,y) \right)^2 \right] \tag{22}$$

where $n$ is the number of pixels in the image region. At the beginning, the image is segmented arbitrarily into regions that include the regions labeled beforehand. A split-merge procedure is then applied to refine the affine motion model estimation [16]. The final segmented image includes the moving objects and the background regions. To determine the outliers accurately, a region correspondence is performed. For each labelled region determined in the previous stage, if its corresponding region is found in the current stage, it is removed from the registration process in the next stage. Region correspondence is established if the intersection of the two regions is non-empty and

$$c = \left| R_l - R_i^r \right| < T; \quad l = 1..L, \; i = 1..N \tag{23}$$

where $R_l$ is the labeled region, $R_i^r$ is the region found in the current stage, $T$ is a region correspondence threshold, and $L$ and $N$ are the total numbers of regions found in the previous and current steps respectively.

3.4 Frame Merging

Merging of frames is performed by projecting the frames onto a common anchor frame. Any frame of the sequence can be selected arbitrarily as the anchor frame. The frames that precede the anchor frame in the sequence are forward projected to the anchor frame, and the frames that follow it are backward projected to the anchor frame. Suppose the $i$-th frame in the sequence is selected as the anchor frame. Then the $(i+1)$-th frame is backward projected to the $i$-th frame using the projection parameters computed in the previous stage. The transformation between non-contiguous frames can be obtained by multiplying the transformation matrices of the in-between frames. The transformation $P_{i,j}$ between frames $f_i$ and $f_j$ is computed as

$$P_{i,j} = \prod_{k=i}^{j-1} P_{k,k+1} \tag{24}$$

where $i < j$. The above equation can be used to compute the transformation $P_{i,anchor}$ between any sequence frame $f_i$ and the anchor frame $f_{anchor}$. The merging process is shown in Figure 4. Once the global transformations between frames are obtained, the frames are integrated into the final mosaic. The pixels in the region of overlap in the resulting panorama need to be assigned values that create a smooth transition from one frame to another. Among the different methods, we chose temporal median filtering. The benefit of using the median operator is that it removes temporal noise.

Figure 4 Frame merging process. Frames 1 to i-1 are forward projected to the anchor frame i. Frames i+1 to n are backward projected to the anchor frame.
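The two operations of Section 3.4 can be sketched as follows: chaining pairwise transformations as in Eq. (24), and blending aligned frames with a per-pixel temporal median. The names `matmul3`, `chain` and `temporal_median` are hypothetical. The matrices are multiplied in the index order written in Eq. (24); a different vector convention would reverse that order. Frames are assumed to be already warped onto the mosaic grid, with `None` marking pixels a frame does not cover.

```python
from statistics import median


def matmul3(A, B):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def chain(pairwise, i, j):
    """P_{i,j} of Eq. (24), where pairwise[k] holds P_{k,k+1} and i < j."""
    P = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    for k in range(i, j):
        P = matmul3(P, pairwise[k])
    return P


def temporal_median(stack):
    """Per-pixel median over a list of aligned frames of equal size."""
    h, w = len(stack[0]), len(stack[0][0])
    out = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [f[y][x] for f in stack if f[y][x] is not None]
            if vals:
                out[y][x] = median(vals)
    return out
```

With three aligned frames whose values at one pixel are 10, 12 and 200, the median keeps 12, so a value seen in only one frame does not leak into the mosaic the way it would under averaging.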

4 Wide area surveillance and monitoring

In this section we propose a model of a surveillance and monitoring system that uses the pre-stored panoramic background constructed offline by the method described above. The model uses a single active camera for object detection and tracking. The model is shown in Figure 5. It is composed of warping, motion detection, tracking and camera control modules. The template store is used to classify the moving object if needed. The camera is controlled by the system in real time, and the camera pan and tilt angles are also obtained in real time so that the sub-view can be roughly located in the mosaic image. Adaptive background subtraction with background update is used for moving object detection. A centroid tracking method is employed to further reduce computational complexity. We briefly describe the modules in the following subsections.

Figure 5 The proposed model of wide area surveillance and monitoring system. Blocks: camera, frame i, panorama (frame i-1), warping, update, motion detection, template store, tracking, camera control.


4.1 Registering frames to background mosaic

Our proposed system initially builds a background mosaic of the entire area to be monitored. During monitoring, as the camera rotates, the corresponding viewable subset of the background mosaic is indexed and updated to perform motion detection and tracking. Estimating the corresponding alignment parameters directly would require handling a large displacement. Therefore, to reduce computation, the pan and tilt angles and the corresponding affine warp parameters are stored for all the images in the sequence during mosaic construction. To register the current frame to the mosaic, we first coarsely index the mosaic from the given rotation angle of the camera using the stored affine parameters. This coarse registration aligns the current frame with the background mosaic up to a small error. After that, assuming only translation between the images, we find the alignment parameters using least squares estimation. Once the translational parameters are found, the current frame is correctly aligned with the background mosaic. Real-time implementation is therefore possible with this technique.

4.2 Moving object detection

We employed an adaptive background subtraction algorithm to detect the moving object. For each incoming frame $f_i$, the corresponding background portion $p_i$ in the mosaic is determined using warping. After that the moving region is identified by

$$v_i = |f_i - p_i| > T_u \tag{25}$$

where $T_u$ is a threshold. The background mosaic is updated using the non-moving portions of the incoming frames. If $f_i(S)$ is the non-moving part of the incoming frame, then the corresponding part of the mosaic $p_i$ is updated as follows:

$$p_{i+1} = \beta p_i + (1 - \beta) f_i(S) \tag{26}$$

where $0 \le \beta \le 1$. We update the threshold value dynamically in accordance with the changing environmental conditions, since too low a value will swamp the difference map with spurious changes, while too high a value will suppress significant changes. The threshold value is updated in step with the temporal change results. The following equation describes the updating process:

$$T_u^{n+1}(x) = \begin{cases} \alpha\, T_u^n(x) + (1 - \alpha) \left( c \times |f_n(x) - p_n(x)| \right), & x \notin M_n \\ T_u^n(x), & x \in M_n \end{cases} \tag{27}$$

where $M_n$ is the moving region and $c$ is a constant.
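One detection and update step of Eqs. (25)-(27) can be sketched as below. This is a simplified illustration under stated assumptions: the function name `detect_and_update` is hypothetical, the frame is assumed to be already warped onto the mosaic, images are flat lists of grey values, and the default values of `alpha`, `beta` and `c` are placeholders rather than values from the paper.

```python
def detect_and_update(frame, background, thresh, alpha=0.9, beta=0.9, c=3.0):
    """Return (moving_mask, new_background, new_thresh) for one frame.

    frame, background and thresh are equal-length lists holding, per
    pixel, the incoming value f, the mosaic value p and the threshold T_u.
    """
    moving, new_bg, new_t = [], [], []
    for f, p, t in zip(frame, background, thresh):
        diff = abs(f - p)
        is_moving = diff > t                       # Eq. (25)
        moving.append(is_moving)
        if is_moving:
            # Moving pixels leave the background model untouched.
            new_bg.append(p)
            new_t.append(t)
        else:
            new_bg.append(beta * p + (1 - beta) * f)          # Eq. (26)
            new_t.append(alpha * t + (1 - alpha) * c * diff)  # Eq. (27)
    return moving, new_bg, new_t
```

Updating only the non-moving pixels keeps a slowly changing scene (for example, gradual lighting changes) in the background model while preventing a detected object from being absorbed into it.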

4.3 Moving object Tracking

Detecting the exact position of a moving object is a noisy process. If the noise is assumed to be Gaussian, it can be handled by using an extended Kalman filter. For tracking, the centroid $C(x_c, y_c)$ of the moving object region $R$ is used. Here,

$$x_c = \frac{\sum_{(x,y) \in R} x}{A(R)} \tag{28}$$

$$y_c = \frac{\sum_{(x,y) \in R} y}{A(R)} \tag{29}$$

where $A(R)$ is the area of the moving object region. The trace of the centroid in the $x$ direction can be modeled as

$$x_{k+1} = x_k + v_{x,k} T + \frac{1}{2} a_{x,k} T^2 \tag{30}$$

$$v_{x,k+1} = v_{x,k} + a_{x,k} T \tag{31}$$

where $a_{x,k}$ is the random time-varying acceleration, $v_{x,k}$ is the corresponding velocity and $T$ is the time between steps $k$ and $k+1$. The system can therefore be modeled as:

$$S_{k+1} = \Phi S_k + \Gamma g_k \tag{32}$$

where $S_k = [x_k, v_{x,k}, y_k, v_{y,k}]^T$, $g_k$ is the system Gaussian noise representing the acceleration of the centroid, and

$$\Phi = \begin{bmatrix} 1 & T & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & T \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad \text{and} \quad \Gamma = \begin{bmatrix} T^2/2 & 0 \\ T & 0 \\ 0 & T^2/2 \\ 0 & T \end{bmatrix} \tag{33}$$

A measurement can be modeled as

$$z_k = H S_k + n_k \tag{34}$$

where $z_k$ is the measurement, $n_k$ is the Gaussian measurement noise and $H$ is the measurement transfer function. The covariance form of Kalman filtering [17] can be used to recursively update the prediction based on the innovation information at each step. The predicted variable at each step is used to control the tracking process.

4.4 Camera Control

In tracking mode the active camera is controlled so as to keep the moving object in the central region of the captured image according to the feature points. Once the feature point leaves the central region, the active camera is readjusted. The pan-tilt speed of the camera is determined by the distance between the predicted feature points. If the predicted centroid position is $\hat{C}_{k+1}$ and the centroid at step $k$ is $C_k$, then the camera speed can be determined by

$$C_v = \frac{w \left( \hat{C}_{k+1} - C_k \right)}{\Delta t}$$

where $w$ is a weighting factor and $\Delta t$ is the time interval between two consecutive measurements.
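The tracking model of Eqs. (30)-(34) and the camera speed rule can be sketched as follows. Because $\Phi$ and $\Gamma$ in Eq. (33) are block diagonal, the sketch filters one image axis at a time with a two-element state. Several points are assumptions rather than details from the text: the names `kalman_step` and `camera_speed` are hypothetical, only the centroid position is measured ($H = [1, 0]$ per axis), the noise variances `q` and `r` are placeholders, and since the model is linear an ordinary Kalman recursion is used.

```python
def kalman_step(s, P, z, T=1.0, q=1.0, r=1.0):
    """One predict/update cycle for a single axis.

    s = [position, velocity], P = 2x2 covariance, z = measured centroid
    coordinate, q and r = acceleration and measurement noise variances.
    """
    # Predict: s <- Phi s and P <- Phi P Phi^T + q Gamma Gamma^T
    x, v = s[0] + T * s[1], s[1]
    g0, g1 = 0.5 * T * T, T
    p00 = P[0][0] + T * (P[0][1] + P[1][0]) + T * T * P[1][1] + q * g0 * g0
    p01 = P[0][1] + T * P[1][1] + q * g0 * g1
    p10 = P[1][0] + T * P[1][1] + q * g0 * g1
    p11 = P[1][1] + q * g1 * g1
    # Update with the innovation z - H s, where H = [1, 0]
    innov = z - x
    S = p00 + r
    k0, k1 = p00 / S, p10 / S
    s_new = [x + k0 * innov, v + k1 * innov]
    P_new = [[(1 - k0) * p00, (1 - k0) * p01],
             [p10 - k1 * p00, p11 - k1 * p01]]
    return s_new, P_new


def camera_speed(c_pred, c_now, dt, w=1.0):
    """Pan or tilt speed from the predicted and current centroid coordinate."""
    return w * (c_pred - c_now) / dt
```

After each step, the one-step-ahead prediction `s[0] + T * s[1]` can serve as the predicted centroid coordinate passed to `camera_speed`, once per axis.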

5 Results and Discussion

This section presents experimental results obtained from video shots acquired with a commercial camcorder. The experiments were done using three video sequences of outdoor scenes. The first sequence contains a truck moving from left to right, and we call it the "truck" sequence. The other two clips contain a moving car, and we call them the "car1" and "car2" sequences respectively. The frame size used was 480 × 320 pixels. For the "truck", "car1" and "car2" sequences, 120, 64 and 32 frames respectively were used in the panorama construction process.

Figure 1 shows two frames from the truck sequence. Figure 3 shows frames from the "car1" and "car2" sequences. Figure 6 shows the panoramic backgrounds generated from the "car2", "car1" and "truck" sequences respectively. In the "car2" sequence, collected from [17], a fast moving car is always present in the scene. The resulting panorama size is 553×317. For the "car1" sequence the panorama obtained is 682×257. In the truck sequence, a truck is moving on a hilly road, and the scene is non-planar in nature. The obtained panorama (1038×312) shows a good result. In all of the cases the moving object is removed from the scene automatically without leaving any artifacts, and it does not disturb the construction of the panoramas. The obtained panoramas incur some resolution loss from the blending process, yet the result is satisfactory. In our experiments, the proposed method handled the moving object well whenever it had motion relative to the background. Figure 7 demonstrates the mosaic based foreground object detection. The extracted object has been post-processed to obtain an approximately correct shape. An object tracking simulation is presented in Figure 8. The centroid of the truck was calculated over 100 frames, and the corresponding predicted values were obtained using Kalman filtering. The figure suggests that a reliable surveillance system can be built using the proposed model.

6 Conclusions

This paper describes a new moving-object invariant video mosaicing approach and its application to wide area surveillance and monitoring. The method uses an optical flow based technique along with motion clustering and region thresholding to eliminate moving objects from the panorama. The method does not restrict the presence of multiple moving objects in the scene, and thus it is more robust than other methods in this respect.

The main advantage of this approach is that it does not employ any robust estimator (M-estimator), which conventional methods rely on. A model for wide area surveillance and monitoring is also presented, which uses the panorama generated by the proposed method. The constructed background panorama is subsequently used for adaptive background subtraction to detect the moving object in the scene. An extended Kalman filtering based technique is used to track the moving object in future frames. The combination of these methods results in reliable detection and tracking of moving objects in a wide field of view and can be used for surveillance purposes. The proposed model uses a single camera instead of multiple cameras; this avoids the synchronization problem that arises when multiple cameras are used for wide area surveillance, and hence reduces cost. Besides wide area surveillance and monitoring, the proposed mosaicing technique can be applied in other areas as well, e.g. video editing and synthetic panoramic scene generation. It remains difficult to determine the exact shape of a large moving object whose parts are in different motions. Temporal shape tracking may help solve this problem.


International Journal of The Computer, the Internet and Management Vol. 12#1 (January – April, 2004) pp 37 - 52 49

Figure 6 Mosaicing result. (a), (b) and (c) are mosaics generated from the “car2”, “car1” and “truck” sequences respectively. Moving objects have been eliminated by the mosaicing process.


Figure 7 Object detection using the mosaic created from the truck sequence. (i) Input frame. (ii) Trace of the foreground object. (iii) The input frame registered with the background mosaic.

Figure 8 Simulation of tracking of the moving object using the Kalman filter in the above experiment. (i) X-position of the centroid in different frames. (ii) Y-position of the centroid. Each panel plots the true and predicted positions against frame number (0 to 100). In both panels the predicted position is close to the real position.


References

[1] Kuglin, C. D., Hines, D. C., “The phase correlation image alignment method”, In Conference on Cybernetics and Society, pp. 163-165, 1975.

[2] Barnea, D., Silverman, H., “A class of algorithms for fast digital registration”, IEEE Transactions on Computers, C-21, pp. 179-186, 1972.

[3] Mann, S., Picard, R. W., “Virtual bellows: constructing high quality stills from video”, In IEEE International Conference on Image Processing, pp. 363-367, 1994.

[4] Peleg, S., Herman, J., “Panoramic mosaics by manifold projection”, In CVPR, pp. 338-343, 1997.

[5] Irani, M., Anandan, P., Hsu, S., “Mosaic based representations of video sequences and their applications”, In ICCV, pp. 605-611, 1995.

[6] Szeliski, R., “Video mosaics for virtual environments”, In IEEE WACV, pp. 22-30, 1996.

[7] Rav-Acha, A., Peleg, S., et al., “Mosaicing on adaptive manifolds”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(10):1144-1154, 2000.

[8] Feng, S., Lu, H., Ma, S., “Mosaic representations of video sequences based on slice image analysis”, Pattern Recognition Letters, 23 (2002) 513-521.

[9] Smolic, A., Ohm, J., “Robust global motion estimation using a simplified M-estimator approach”, In ICIP 2000, Canada, September 2000.

[10] Barcelo, L., and Binefa, X., “Bayesian video mosaicing with moving objects”, International Journal of Pattern Recognition and Artificial Intelligence, Vol. 16, No. 3 (2002) 341-348.

[11] Pan, J., Lin, C., Gu, C., et al., “A robust spatio-temporal video object segmentation scheme with pre stored background information”, in Proc. IEEE Int. Symp. Circuits and Systems, Arizona, USA, May 2000.

[12] Bhat, K. S., Saptharishi, M., and Khosla, P. K., “Motion Detection and Segmentation Using Image Mosaics”, in Proc. IEEE Int. Conf. Multimedia and Expo, pp. 1577-1580, NY, USA, Jul. 2000.

[13] Mittal, A., Huttenlocher, D., “Scene modeling for wide area surveillance and image synthesis”, In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Hilton Head, South Carolina, June 2000.

[14] Beauchemin, S. S., Barron, J. L., “The computation of optical flow”, ACM Computing Surveys, 27(3):433-467, 1995.

[15] Badenas, J., Bober, M., Pla, F., “Segmenting Traffic Scenes from Grey Level and Motion Information”, Pattern Analysis & Applications, Volume 4, Issue 1 (2001) pp. 28-38.

[16] Borshukov, G. D., Bozdagi, G., Altunbasak, Y., et al., “Motion Segmentation by Multistage Affine Classification”, IEEE Transactions on Image Processing, Vol. 6, No. 11, November 1997.

[17] Bishop, G., Welch, G., “An Introduction to the Kalman Filter”, ACM SIGGRAPH tutorial, Los Angeles, California, USA, August 2001.

[18] Odone, F., Fusiello, A., Trucco, E., “Layered representation of a video shot with mosaicing”, Pattern Analysis and Applications (2002) 5:296-305. www.cee.hw.ac.uk/~franci/
