
International Journal of Research in Advanced Technology - IJORAT Vol 2, Issue 3, MARCH 2016

All Rights Reserved © 2016 IJORAT 1

Automatic Target Tracking System Using Computer Vision

Prakash. P, Gowthami. P, Sathishkumar. V. S

Assistant Professor, Department of ECE, Surya Engineering College, Erode, India

Assistant Professor, Department of EEE, Surya Engineering College, Erode, India

Assistant Professor, Department of ECE, Nandha Engineering College, Erode, India

Abstract: This project is an automated tactical device that performs pre-programmed stealth-mission operations without human intervention and is most effective at night. An infra-red camera captures continuous video of the area to be monitored. A laser gun is used for assault, but the unit can be fitted with a range of different weapons. The captured digital images are continually processed by MATLAB programs; when the desired conditions are met, the output is sent to an Arduino Uno microcontroller via the COM port, and this data is then used to control the laser gun. The infra-red images contain the heat signatures of human targets. The target is fixed in the code at the point of highest infra-red intensity in the image, and the shooting angle is determined from it. The microcontroller interfaces the MATLAB output with the motor.

Keywords: Arduino; Target tracking; Capture Video; Digital Image Processing; MATLAB

I. INTRODUCTION

Human detection and tracking are important components of video analytics (VA) in multi-camera surveillance. This paper proposes a framework for achieving these tasks in a multi-camera network. The proposed system configuration differs from existing multi-camera surveillance systems, which utilize common image information extracted from overlapping fields of view (FOVs) to improve object detection and tracking performance. In practice, however, such a camera setup may not be easily achieved because of economic concerns, topology limitations, etc. We therefore focus on the non-overlapping multi-camera scenario in this paper, and our main objective is to develop reliable and robust object detection and tracking algorithms for such an environment.

Automatic object detection is usually the first task in a multi-camera surveillance system, and background modelling (BM) is commonly used to extract predefined information, such as an object's shape and geometry, for further processing. Pixel-based adaptive Gaussian mixture modelling (AGMM) is one of the most popular algorithms for BM; it formulates object detection as an independent per-pixel detection problem. It is invariant to gradual lighting changes, slightly moving backgrounds and fluttering objects. However, it usually yields unsatisfactory foreground information (the object mask) for object tracking due to sensor noise and an inappropriate GM update rate, which lead to holes, unclosed shapes and inaccurate boundaries in the extracted object. While sensor noise can be suppressed through appropriate filtering, it is difficult to find an optimum update rate for the model because different objects behave differently in the scene. Furthermore, important object information such as edges and shape is not utilized in this method, so the performance of subsequent operations such as object tracking and recognition is degraded. In this paper, a mean shift (MS)-based segmentation algorithm is proposed for improving the object mask obtained by AGMM. By using the segmentation information, holes within the mask can be significantly reduced through inpainting, and better alignment between the object boundary and that of the mask can be obtained.

Occlusion of moving objects is a major problem in multi-camera surveillance systems. In existing multi-camera surveillance systems, the occlusion problem is addressed by fusing the BM information obtained from the overlapped image information in adjacent cameras. These approaches, however, are not directly applicable to our non-overlapping setup. We therefore propose to use stereo cameras, which offer additional depth information to resolve the occlusion problem.

Developing intelligent driver assistance systems using thermal infrared (IR) imagery has been receiving a lot of attention. Research has mainly focused on recognizing pedestrians in IR images from their silhouettes. An intelligent recognition system based on the characteristics of thermal images is therefore presented in this paper. The proposed system works by employing new thermal-distribution features of IR images. The distribution of temperature within the human body yields a natural heat map, which indicates a high-temperature region at the location of the human heart. In contrast, non-living objects show no such variation of temperature. Furthermore, the average luminance within the human body provides an adequate discriminating element: the average luminance of the upper body is higher than that of the lower body, while the average luminance of the right and left sides is almost the same. This enables recognition even in the presence of obstacles. In particular, the proposed intelligent recognition system utilizes a feasible and low-cost approach compared to prior works.

Fig 1. Distribution of human temperature region

Figure 2. Description of high, medium and low temperature regions within selected regions

II. RELATED WORK

The proposed approach was evaluated using thermal images captured by a camera mounted on a test vehicle. The presented method applies real-time processing for use in a pedestrian protection system, so the computing time per image was also measured at the end of the process, along with the detection rate. Each area generated using the multiple-threshold method and labelling shows a unique correlation with the others. The centroid points of the thermal regions also demonstrate a relevant characteristic for this system: their x-axis coordinates are almost identical, while the y-axis coordinates show an increasing tendency as the temperature rises. The centroid points of the high, medium and low temperature regions are calculated accordingly, and the mentioned correlation is used to determine whether a selected region is human or not.

The extracted thermal regions are first segmented into 3 x 3 blocks in order to compute the average luminance of each block. The average luminance is related to the temperature distribution: average luminance and the y-axis coordinate are proportional to each other, while average luminance is almost the same along the x-axis coordinate. By employing this feature, pedestrian recognition is enabled even when the previous features are futile. The general definition of the average luminance Av is the sum of each pixel's intensity in the thermal region divided by the total number of pixels within the thermal region.
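The equation itself was lost in extraction; written out from the verbal definition above (a reconstruction, with I(x, y) denoting the intensity of a pixel in thermal region R and N the number of pixels in R), it reads:

```latex
A_v = \frac{1}{N} \sum_{(x,y) \in R} I(x,y)
```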


III. SCOPE OF RESEARCH

In order to overcome the imperfections of AGMM, an MS-based segmentation algorithm is proposed in this paper. We use the object mask obtained by AGMM to estimate an initial location of the object. The original mask is then improved using MS segmentation: holes are inpainted and the object boundary is better aligned with the true object boundary. Furthermore, we also use K-means clustering with depth information to separate occluded objects in the mask. Details of the algorithm are discussed as follows.

A. Object Segmentation Using Mean Shift (MS)

A colour-based MS segmentation is used for improving

the object mask by using the AGMM result as well as

colour information. This algorithm is called

AGMM+Seg. Algorithm subsequently. The detailed

steps are summarized as below:

1) Perform colour-based MS segmentation on the colour

Pixels which lie in a slightly larger bounding box (20%)

Than one that just covers the estimated mask. The

Segmentation result and different colours represent

different segments.

2) Filter out the small regions in the original object mask

Obtained by AGMM. More precisely, it is obtained by

Labelling all the connected components in the binary

Object mask and only keeping the component with

International Journal of Research in Advanced Technology - IJORAT Vol 2, Issue 3, MARCH 2016

All Rights Reserved © 2016 IJORAT 3

Largest number of pixels.

3) For each segment, if the ratio between the number of

The foreground pixels and the total number of pixels in

that segment extends a predefined threshold, we will

Consider it as a part of the foreground object. Otherwise,

That segment is considered to belong to the background.

An empirical threshold value 0.05 is used in thispaper.
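As a rough illustration (not the authors' code), steps 2 and 3 can be sketched with NumPy and SciPy; the 20% box expansion and the MS segmentation itself are assumed to be done elsewhere, so segment labels are simply passed in as an array:

```python
import numpy as np
from scipy import ndimage

def refine_mask(fg_mask, segments, ratio_thresh=0.05):
    """Refine an AGMM foreground mask using a segmentation map.

    fg_mask  : 2-D bool array, raw AGMM foreground mask
    segments : 2-D int array, segment label per pixel (e.g. from mean shift)
    """
    # Step 2: keep only the largest connected component of the mask.
    labels, n = ndimage.label(fg_mask)
    if n == 0:
        return np.zeros_like(fg_mask)
    sizes = ndimage.sum(fg_mask, labels, index=range(1, n + 1))
    largest = 1 + int(np.argmax(sizes))
    core = labels == largest

    # Step 3: a segment joins the foreground if enough of its pixels
    # overlap the core mask (threshold 0.05 as in the paper).
    refined = np.zeros_like(fg_mask)
    for seg_id in np.unique(segments):
        seg = segments == seg_id
        if core[seg].mean() > ratio_thresh:
            refined |= seg
    return refined
```

Because whole segments are switched on or off, holes inside a mostly-foreground segment are filled automatically, which is the inpainting effect described above.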

B. Occluded Object Segmentation Using Depth Map and K-Means Clustering

Here, we further propose a segmentation method for separating occluded objects in the mask, so that the objects can be individually identified if necessary. Suppose there are two objects in the mask. Our objective is then to separate the object mask into two independent masks, each containing one object. In general, this can be achieved by applying K-means clustering to the depth map, with the help of the MS segmentation. More precisely, different objects may have different depths in the scene, and we can use these depths to help separate the objects, especially occluded ones.
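A minimal sketch of this idea (assuming two occluded objects and a depth value per mask pixel; the values below are toy numbers, not from the paper) is a 1-D two-means split of the depth values:

```python
import numpy as np

def split_by_depth(depths, iters=20):
    """Split a 1-D array of per-pixel depth values into two clusters
    with a plain two-means iteration; returns a 0/1 label per pixel."""
    c = np.array([depths.min(), depths.max()], dtype=float)  # initial centres
    labels = np.zeros(len(depths), dtype=int)
    for _ in range(iters):
        # Assign each pixel to the nearest centre, then recompute centres.
        labels = np.abs(depths[:, None] - c[None, :]).argmin(axis=1)
        for k in (0, 1):
            if np.any(labels == k):
                c[k] = depths[labels == k].mean()
    return labels

# Example: pixels of a near object (~2 m) occluding a far one (~6 m).
depths = np.array([2.0, 2.1, 1.9, 6.0, 6.2, 5.9])
labels = split_by_depth(depths)
```

Applying the returned labels back to the pixel positions of the mask yields the two per-object masks.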

A real-time dynamic programming (DP) stereo matching method is employed here for depth estimation. In this method, the matching cost of each pixel is aggregated over a window; more precisely, a sum of absolute differences (SAD) aggregation of the aforementioned cost is used. DP is then used to pick the disparity with the globally minimum cost for each pixel. The depth map estimated by this efficient method is still noisy, so an edge-preserving bilateral filter is used for depth map smoothing. It should be noted that more advanced depth estimation algorithms are available; DP is used because of its low arithmetic complexity and reasonably good performance. To determine the number of objects in the mask from the refined depth map, we compute the histogram of depth values inside the mask. A non-parametric histogram segmentation method is then used to determine the object number.
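For illustration only (the paper gives no code), a windowed SAD cost volume for a rectified stereo pair can be sketched as follows; a simple winner-take-all argmin stands in for the DP optimization step:

```python
import numpy as np

def sad_disparity(left, right, max_disp=16, win=2):
    """Windowed SAD matching on rectified grayscale images.

    Builds a cost volume and returns the disparity with the minimum
    aggregated cost per pixel (winner-take-all; the paper instead runs
    DP over this cost volume before bilateral-filter smoothing).
    """
    h, w = left.shape
    cost = np.full((max_disp, h, w), np.inf)
    for d in range(max_disp):
        # Per-pixel absolute difference at disparity d.
        diff = np.abs(left[:, d:].astype(float) - right[:, : w - d].astype(float))
        # Aggregate the cost over a (2*win+1)^2 window.
        pad = np.pad(diff, win, mode="edge")
        agg = sum(
            pad[i : i + diff.shape[0], j : j + diff.shape[1]]
            for i in range(2 * win + 1)
            for j in range(2 * win + 1)
        )
        cost[d, :, d:] = agg
    return cost.argmin(axis=0)
```

On a synthetic pair where the left image is the right image shifted by 3 pixels, this recovers a disparity of 3 away from the image border.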

Figure 3. Processing steps of object detection:
(a) Image frame containing objects A and B,
(b) Object mask obtained by AGMM,
(c) Segmentation map,
(d) Refined object mask,
(e) Raw depth map,
(f) Refined depth map,
(g) and (h) Segmented objects A and B, and
(i) 1-D normalized histogram of depth values in (f).

IV. PROPOSED METHODOLOGY AND DISCUSSION

We now describe the proposed BKF-SGM-IMS algorithm. The principle of the Bayesian Kalman filter with Gaussian mixtures (BKF-GM) is outlined first, followed by the proposed BKF-SGM using direct GM simplification. Finally, an improved MS algorithm is incorporated into the BKF-SGM to yield the desired algorithm. As in a conventional KF, the MS provides measurements to the GMs in the BKF and yields the posterior density of the state, and hence the location of the object, while the state equation provides a prediction of the target location in the next frame in the form of a prior state density.

A. Bayesian Kalman Filter with Gaussian Mixture

The probabilistic object tracking problem can be modelled by the following discrete-time linear state-space model:

xk = Ak xk−1 + wk
zk = Ck xk + vk

where xk and zk denote respectively the state and observation (measurement) vectors at time k, Ak denotes the state transition matrix, and Ck constitutes the observation model which relates the measurement to the state. wk and vk denote respectively the process and observation noise vectors, and are assumed to be mutually independent. In this paper, the state vector consists of the bounding box parameters (xk, yk, Hxk, Hyk) together with their rates of change.

Here (xk, yk) is the centre and (Hxk, Hyk) are the half axes of the tracking bounding box, and the observation matrix is Ck = [I4×4 04×4]. zk is calculated by the improved MS. The process noise is modelled as a Gaussian mixture whose l-th component has mean ul and covariance Qkl. In the rest of the paper, the measurement noise is assumed to be zero-mean Gaussian with covariance matrix Rk, i.e., p(vk) = N(vk; 0, Rk), for simplicity.

Now suppose further that Zk−1 denotes the observations collected up to time instant k−1 and the a priori pdf p(xk|Zk−1) is modelled by a GM. The posterior then follows from Bayes' rule, where ∫ p(zk|xk) p(xk|Zk−1) dxk is a normalization constant. Note that the resulting expression consists of products of two Gaussian distributions; each of these products can be approximated by a Gaussian distribution, for which the KF for Gaussian processes is directly applicable. Similarly, the a priori density for the next time instant is

p(xk+1|Zk) = ∫ p(xk|Zk) p(xk+1|xk) dxk
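As a point of reference (this is the standard Kalman recursion applied to a single Gaussian component, not the authors' full BKF-GM), one predict/update step for the state-space model above looks like this:

```python
import numpy as np

def kf_step(x, P, z, A, C, Q, R):
    """One Kalman predict/update step for the model
    x_k = A x_{k-1} + w_k,  z_k = C x_k + v_k,
    with w ~ N(0, Q) and v ~ N(0, R)."""
    # Predict: propagate mean and covariance through the state equation.
    x_pred = A @ x
    P_pred = A @ P @ A.T + Q
    # Update: correct the prediction with the measurement via the Kalman gain.
    S = C @ P_pred @ C.T + R
    K = P_pred @ C.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (z - C @ x_pred)
    P_new = (np.eye(len(x)) - K @ C) @ P_pred
    return x_new, P_new
```

In the BKF-GM, this recursion runs once per pair of prior and noise components, which is exactly why the number of mixture components grows after each step.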

We can see that the number of mixture components, and hence the complexity, grows exponentially after each recursion. To keep the complexity constant, the GM model should be approximated, giving the BKF with simplified GM (BKF-SGM). Consider the following GM model with n components:

f(x) = Σj=1..n φj(x), φj(x) = αj N(x; uj, Hj)

In BKF-SGM, our goal is to approximate f(x) with a simplified mixture model with fewer components:

g(x) = Σi wi gi(x), gi(x) = N(x; ti, H̃i)

where wi, ti and H̃i are respectively the weight, centre and covariance matrix of the i-th component, and the weights must sum to one. Given a distance measure D(f(x), g(x)) between functions f(x) and g(x), the error of approximating f(x) by g(x) is

D(f(x), g(x)) = ∫ (f(x) − g(x))² dx
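To make the distance concrete, here is a small 1-D numerical check (illustrative component values only, not from the paper) of the integrated squared error between a three-component mixture f and a single moment-matched Gaussian g:

```python
import numpy as np

def gauss(x, u, h):
    """1-D Gaussian density N(x; u, h) with variance h."""
    return np.exp(-0.5 * (x - u) ** 2 / h) / np.sqrt(2 * np.pi * h)

x = np.linspace(-10, 10, 20001)
dx = x[1] - x[0]

# f: three close components; g: one merged, moment-matched component
# (mean 0, variance 0.3*1.25 + 0.4*1.0 + 0.3*1.25 = 1.15).
f = 0.3 * gauss(x, -0.5, 1.0) + 0.4 * gauss(x, 0.0, 1.0) + 0.3 * gauss(x, 0.5, 1.0)
g = gauss(x, 0.0, 1.15)

ise = np.sum((f - g) ** 2) * dx  # D(f, g) = integral of (f - g)^2
```

Because the three components nearly overlap, the error of replacing them with one Gaussian is tiny, which is the situation the simplification step exploits.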

Conventionally, the simplification is done by resampling followed by clustering using the K-means or EM algorithm. However, the complexity depends exponentially on the dimension of the state and hence soon becomes infeasible. Here, we adopt a two-step algorithm developed in the machine learning literature for model order reduction, which avoids the additional resampling process. At the k-th iteration, the component mixture is partitioned into m groups. For illustration, consider BKF-SGM in one dimension with ξk = 4 and ηk = 2: the GM of the prior density at time instant k−1 grows exponentially during prediction, and the order reduction keeps the complexity bounded. The major advantage of the proposed method over other conventional resampling methods is the use of an efficient mixture simplification method with a lower complexity. For example, let the number of components and the dimension of the state be n and d respectively. The complexity of the greedy EM algorithm is related to the number of candidates and the data size used in the algorithm, whereas the complexity of the two-step algorithm depends on the numbers of iterations T and L, which are usually small as suggested. Since the data size for the greedy EM algorithm is typically very large, say from 1000 to 5000, that method has a higher complexity.

B. BKF-SGM with Improved MS (BKF-SGM-IMS)

The observation vector zk defined above can be obtained by the MS tracker. It models the appearance of the tracked object by a weighted colour histogram, and the object centre (xk, yk) is calculated by iteratively maximizing the similarity between the tracked object and its candidate. Histogram similarity is defined in terms of the Bhattacharyya coefficient and distance. At each iteration, the mean shift vector is computed such that the histogram similarity is increased, and this process is repeated until convergence is achieved. Conventionally, (Hxk, Hyk) is calculated by scale adaptation, which modifies (Hxk−1, Hyk−1) by a certain fraction (±10%) and lets the MS converge again. In this paper, (Hxk, Hyk) is obtained directly from the object mask, which further increases the accuracy and efficiency of the MS.

Fig 4. Illustration of the convergence behaviour of the BKF-SGM-IMS algorithm. Fig 4(a) shows the tracked object (yellow) centred at a small area (red); Fig 4(b) shows the corresponding similarity surface (values of the Bhattacharyya coefficient) for the red rectangle marked in (a). The yellow triangles are two local maximum points, the orange rectangle is the estimated object centre of the GM, and the four black circles are the object centres of the GM components.
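The histogram similarity the tracker maximizes can be illustrated as follows (a generic sketch; the bin counts are toy values, not from the paper):

```python
import numpy as np

def bhattacharyya(p, q):
    """Bhattacharyya coefficient between two histograms (normalized
    internally); 1 means identical, 0 means disjoint. The distance
    used by the MS tracker is sqrt(1 - coefficient)."""
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(np.sqrt(p * q)))

target = np.array([4.0, 8.0, 2.0, 2.0])     # weighted colour histogram of target
candidate = np.array([3.0, 9.0, 2.0, 2.0])  # histogram at a candidate location
rho = bhattacharyya(target, candidate)
dist = np.sqrt(1.0 - rho)
```

Each MS iteration moves the candidate window so that rho increases (equivalently, dist decreases) until the shift falls below a threshold.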

V. EXPERIMENTAL RESULTS

This section is divided into three parts. In the first part, we evaluate the performance of the proposed BM. In the second part, we describe the implementation details of our tracking algorithms and compare them with highly related algorithms, such as the PF-based approach, the MS with KF prediction tracker and other state-of-the-art algorithms, using stereo-camera-captured surveillance videos and standard testing datasets. In the third part, a novel non-training-based object recognition algorithm is introduced. Comparisons with conventional non-training-based algorithms, such as the weighted colour histogram (WCH) and colour correlogram (CC), are then presented. Moreover, we also illustrate that both non-training-based and training-based object recognition algorithms can be improved by using our detection and tracking results as input on a standard testing dataset.

A. Background Modelling (BM) Comparison

In order to qualitatively evaluate the proposed BM method, we compare it with other state-of-the-art BM algorithms, such as the block-based classifier cascade with probabilistic decision integration (BCCPDI) and stochastic approximation for background modelling (SA). The standard datasets PETS2001 and PETS2009, and our own datasets DSP_LAB, SYS_RdI and SYS_RdII, are considered. Note that DSP_LAB is an indoor scenario, while the others cover various outdoor scenes, with waving trees, waving tapes, shadows, lighting changes and different kinds of objects. For each dataset, we randomly select ten image frames, and the corresponding ground-truth masks are labelled manually.

Figure 5. Comparison of F-measure values of various BM methods: the higher the F-measure value, the better the segmentation result.
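The F-measure used in Figure 5 is the standard harmonic mean of precision and recall over the mask pixels; a minimal version against a ground-truth mask might look like:

```python
import numpy as np

def f_measure(pred, gt):
    """F-measure between a predicted binary mask and the ground truth."""
    tp = np.logical_and(pred, gt).sum()   # true-positive pixels
    if tp == 0:
        return 0.0
    precision = tp / pred.sum()           # fraction of predicted pixels correct
    recall = tp / gt.sum()                # fraction of true pixels recovered
    return 2 * precision * recall / (precision + recall)

gt = np.array([[0, 1, 1], [0, 1, 0]], dtype=bool)
pred = np.array([[0, 1, 1], [1, 1, 0]], dtype=bool)
```

A perfect mask scores 1; over- and under-segmentation both pull the score down, which is why it is a convenient single number for comparing BM methods.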

B. Object Tracking Results and Comparison

Implementation details: The proposed visual object tracking algorithm is based on the state-space model above. The improved MS tracker determines the measurement vector zk, which is then fed back to the BKF-SGM-IMS state-space model in order to predict the new object location, velocity and bounding box region in the next frame. In conventional MS [3], (Hxk, Hyk) is calculated by scale adaptation, which modifies (Hxk−1, Hyk−1) by a certain fraction (usually ±10%) and lets the MS converge again. In contrast, our improved MS only calculates (xk, yk) from the MS iterations; (Hxk, Hyk) is scaled automatically by the refined object mask, so the complexity of the MS can be further reduced. In our method, the number of MS tracker iterations is limited to 20, and the average number of iterations is much smaller. The state noise wk is modelled as a GM with two components:

p(wk) = βN(wk; u1, Qk1) + (1 − β)N(wk; u2, Qk2) (10)

where β = 0.1, u1 = u2 = 0, Qk1 = 10I, and Qk2 = I. Note that, for a fair comparison, the number of GM components is fixed to six in our experiments, though the GM components could be adaptively selected to better represent the real pdf of the object without modifying other parts of the BKF-SGM algorithm. In all experiments, the initial state p(x0|Z−1) = p(x0) is a GM with six components having the same weight. The means of the GM components are initialized by the zeroth moment (centroid) of the refined object mask, and the covariance of each component is set to P. In order to handle situations such as long-term tracking, large occlusion and sudden appearance change, a static colour histogram is maintained, and tracking then consists of finding the maximum similarity between the target histogram mixture and its candidates.

To further assess the stability and effectiveness of our algorithm, we select and track two rigid objects and six non-rigid objects in the PETS2001 dataset. It can be seen that the proposed BKF-SGM-IMS achieves a low Lost Track rate and high precision among the competing algorithms. More importantly, it only loses 25% of objects on average during tracking, while the other algorithms lose 38% to 50% of objects, and some of them even fail at the very beginning. Together with the zero Lost Track rate in the previous testing sequences, the proposed algorithm offers the lowest Lost Track rate among all. The complete visual tracking results can be found in our project. Finally, we note that other, more sophisticated representations such as adaptive multi-component appearance models can also be incorporated into the proposed algorithm. It is expected that the tracking performance could be further improved at the expense of increased implementation complexity. Due to page limitations, this issue is not further pursued.

More precisely, the MS is performed by iterating the following two steps:

1) Shift the estimated target location such that the Bhattacharyya distance is reduced, while keeping the estimated mixture proportions fixed.

2) Minimize the Bhattacharyya distance with respect to the mixture proportions, while keeping the estimated target location fixed. The reduction of the distance in Step 2 is a convex problem and can be performed rapidly; interested readers are referred to the references for details. In practice, these two steps are repeated until the change in target location is smaller than half a pixel and each of the mixture proportions changes by less than 0.01.

3) Tracking Results and Comparison: The performance of the proposed object tracking algorithm is first tested on a simple non-overlapping surveillance camera network. Here, the network consists of four JVC 3D cameras, and the network topology is illustrated in Fig. 6(a). Four image frames captured by these cameras are shown in Fig. 6(b). The resolution of the captured stereo videos is 1920×1080 in side-by-side format.

Figure 6. Camera configuration of the multiple-stereo-camera dataset: (a) topology of the stereo camera network and (b) example images captured by cameras 1–4. The viewpoint angle between cameras 1 and 2 is 90 degrees; the viewpoint angle between cameras 3 and 4 is 30 degrees.

C. Object Recognition Application

Objects are detected and tracked independently in each camera. However, it is desirable to be able to recognize the same object in nearby cameras to achieve network-wide tracking. Therefore, a non-training-based object recognition algorithm is proposed below for object recognition and tracking over the camera network. This algorithm is extended from our hand gesture recognition algorithm and is based on a novel distance metric, the superpixel earth mover's distance (SP-EMD). In SP-EMD, the object matching problem is viewed as the problem of finding the optimal moving flows between two sets of superpixels that represent the objects; the superpixels are used to simplify the representation.


Figure 7. Examples of tracking results of various tracking algorithms:
(a) Surfer body sequence,
(b) BoBot_ball sequence,
(c) CAVIAR sequence.
For the sake of clarity, we only demonstrate the results of four selected trackers. Red and yellow lines show the results of BKF-SGM-IMS and SCM, respectively; blue and black lines show the results of AST and TLD, respectively.

To further demonstrate the usefulness and effectiveness of the proposed object detection and tracking algorithms, we also evaluate two object identification algorithms on a standard testing dataset called CAVIAR4REID. One is the non-training-based algorithm described above, and the other is a state-of-the-art training-based algorithm called ICT (short for Implicit Camera Transfer). The CAVIAR4REID dataset includes fifty pedestrians captured by two different cameras; for each person in each camera there are ten available appearances. We report results for two setups, demonstrating that both training-based and non-training-based recognition algorithms can be improved by using our object detection and tracking algorithm as a pre-processing step. The performance is measured in terms of the cumulative matching characteristic (CMC) curve, which represents the expectation of finding the correct match within the top n matches. Blue and black lines show the recognition results using object masks instead.

Figure 8. (a) Sample image pairs and their object masks from the CAVIAR4REID dataset. (b) The cumulative matching characteristic (CMC) curves of two object recognition methods.
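The CMC curve in Figure 8(b) can be computed generically as follows (toy distance matrix, not the paper's data): entry k is the fraction of probe identities whose correct gallery match appears within the top k ranked candidates.

```python
import numpy as np

def cmc_curve(dist):
    """CMC from a square distance matrix where dist[i, j] is the distance
    between probe identity i and gallery identity j, and the correct
    match for probe i is gallery i."""
    n = dist.shape[0]
    ranks = []
    for i in range(n):
        order = np.argsort(dist[i])                     # best match first
        ranks.append(int(np.where(order == i)[0][0]))   # rank of true match
    ranks = np.array(ranks)
    return np.array([(ranks < k).mean() for k in range(1, n + 1)])

# Toy 3-identity example: probes 0 and 2 match at rank 1, probe 1 at rank 2.
dist = np.array([[0.1, 0.9, 0.8],
                 [0.2, 0.3, 0.9],
                 [0.7, 0.8, 0.1]])
curve = cmc_curve(dist)
```

By construction the curve is non-decreasing and reaches 1 at rank n, which is why better re-identification methods lie above worse ones in plots like Figure 8(b).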

VI. CONCLUSION

New approaches for object detection and tracking in a camera network have been presented. A novel object detection algorithm using colour-based MS segmentation and depth information is first proposed for improving background modelling and segmentation of occluded objects. The segmented objects are then tracked by BKF-SGM-IMS. Finally, a non-training-based object recognition algorithm based on the SP-EMD distortion metric is presented for identification of similar objects extracted in nearby cameras to achieve network-based tracking. The usefulness of the proposed algorithms is illustrated by experimental results and comparison with conventional methods.
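The SP-EMD distortion metric builds on the earth mover's distance [15]. A minimal one-dimensional sketch of plain EMD between equal-mass histograms is shown below (the cumulative-difference form); this is a generic illustration of the underlying distance, not the paper's superpixel-based variant.

```python
def emd_1d(p, q):
    """Earth mover's distance between two 1-D histograms of equal total
    mass, computed as the sum of absolute cumulative differences."""
    assert abs(sum(p) - sum(q)) < 1e-9, "histograms must have equal mass"
    total, carry = 0.0, 0.0
    for a, b in zip(p, q):
        carry += a - b       # mass that must still be moved past this bin
        total += abs(carry)  # cost of moving it one bin further
    return total

# Moving 0.5 mass from bin 0 to bin 2 costs 0.5 * 2 = 1.0.
print(emd_1d([0.5, 0.5, 0.0], [0.0, 0.5, 0.5]))  # 1.0
```

Unlike a bin-wise distance, EMD accounts for how far mass must travel, which is what makes it robust to small shifts in colour or spatial signatures.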

VII. ACKNOWLEDGEMENT

We wish to express our heartfelt gratitude to the Head of the Department, Mr. S. SARAVANAN, M.E., for his effective leadership, encouragement and guidance during the course of this project. We wish to thank profoundly all our illustrious faculty members and the lab technicians of our department. We also extend our warm thanks to our beloved parents and friends for constantly encouraging us and providing us with their valuable ideas.

VIII. REFERENCES

1. Y. Wu, J. Lim, and M.-H. Yang, "Online object tracking: A benchmark," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2013, pp. 2411–2418.


2. P. Pérez, C. Hue, J. Vermaak, and M. Gangnet, "Color-based probabilistic tracking," in Proc. 7th Eur. Conf. Comput. Vis., 2002, pp. 661–675.
3. D. Comaniciu, V. Ramesh, and P. Meer, "Kernel-based object tracking," IEEE Trans. Pattern Anal. Mach. Intell., vol. 25, no. 5, pp. 564–577, May 2003.
4. S. C. Chan, B. Liao, and K. M. Tsui, "Bayesian Kalman filtering, regularization and compressed sampling," in Proc. IEEE 54th Int. Midwest Symp. Circuits Syst., Aug. 2011, pp. 1–4.
5. Y. C. Ho and R. Lee, "A Bayesian approach to problems in stochastic estimation and control," IEEE Trans. Autom. Control, vol. 9, no. 4, pp. 333–339, Oct. 1964.
6. K. Zhang and J. T. Kwok, "Simplifying mixture models through function approximation," IEEE Trans. Neural Netw., vol. 21, no. 4, pp. 644–658, Apr. 2010.
7. I. Bilik and J. Tabrikian, "MMSE-based filtering in presence of non-Gaussian system and measurement noise," IEEE Trans. Aerosp. Electron. Syst., vol. 46, no. 3, pp. 1153–1170, Jul. 2010.
8. PETS2001 and PETS2009 Datasets. [Online]. Available: http://ftp.pets.rdg.ac.uk/pub/PETS2001/ and http://cs.binghamton.edu/~mrldata/pets2009.html, accessed Mar. 1, 2014.
9. Surveillance Camera Network Project Page. [Online]. Available: http://www.eee.hku.hk/~h0995463/SCN/, accessed Aug. 15, 2014.
10. Z. Zivkovic and F. van der Heijden, "Efficient adaptive density estimation per image pixel for the task of background subtraction," Pattern Recognit. Lett., vol. 27, no. 7, pp. 773–780, May 2006.
11. D. Comaniciu and V. Ramesh, "Mean shift and optimal prediction for efficient object tracking," in Proc. Int. Conf. Image Process., 2000, pp. 70–73.
12. S. Zhang, S. C. Chan, R. D. Qiu, K. T. Ng, Y. S. Hung, and W. Liu, "On the design and implementation of a high definition multi-view intelligent video surveillance system," in Proc. IEEE Int. Conf. Signal Process., Commun., Comput., Aug. 2012, pp. 353–357.
13. O. Akman, A. Aydin Alatan, and T. Çiloglu, "Multi-camera visual surveillance for motion detection, occlusion handling, tracking and event recognition," presented at Workshop Multi-Camera Multi-Modal Sensor Fusion Algorithms Appl., 2008. [Online]. Available: https://hal.archives-ouvertes.fr/inria-00326780/document, accessed Aug. 15, 2014.
14. C. Wang, Z. Liu, and S. C. Chan, "Sparse-representation-based hand gesture recognition with Kinect depth camera," IEEE Trans. Multimedia, vol. 17, no. 1, pp. 29–39, Jan. 2015.
15. Y. Rubner, C. Tomasi, and L. J. Guibas, "The earth mover's distance as a metric for image retrieval," Int. J. Comput. Vis., vol. 40, no. 2, pp. 99–121, 2000.
16. G. B. Dantzig, "Application of the simplex method to a transportation problem," in Activity Analysis of Production and Allocation. New York, NY, USA: Wiley, 1951, pp. 359–373.
17. S. O. Shim and T.-S. Choi, "Image indexing by modified color co-occurrence matrix," in Proc. Int. Conf. Image Process., Sep. 2003, pp. III-493–III-496.
18. A. Adam, E. Rivlin, and I. Shimshoni, "Robust fragments-based tracking using the integral histogram," in Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., Jun. 2006, pp. 798–805.
19. Z. Kalal, K. Mikolajczyk, and J. Matas, "Tracking-learning-detection," IEEE Trans. Pattern Anal. Mach. Intell., vol. 32, no. 7, pp. 1409–1422, Jul. 2012.
20. B. Babenko, M.-H. Yang, and S. Belongie, "Robust object tracking with online multiple instance learning," IEEE Trans. Pattern Anal. Mach. Intell., vol. 33, no. 8, pp. 1619–1632, Aug. 2011.
21. W. Zhong, H. Lu, and M.-H. Yang, "Robust object tracking via sparsity-based collaborative model," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Jun. 2012, pp. 1838–1845.
22. B. Georgescu, D. Comaniciu, T. X. Han, and X. S. Zhou, "Multi-model component-based tracking using robust information fusion," in Proc. Int. Workshop Statist. Methods Video Process., 2004, pp. 61–70.
23. R. Zhan, W. Ouyang, and X. Wang, "Person re-identification by salience matching," in Proc. IEEE Int. Conf. Comput. Vis., Dec. 2013, pp. 2528–2535.
24. S. Forstmann, Y. Kanou, J. Ohya, S. Thuering, and A. Schmitt, "Real-time stereo by using dynamic programming," in Proc. IEEE Int. Conf. Comput. Vis. Pattern Recognit., Jun. 2004, pp. 29–36.
25. S. Zhang, C. Wang, S.-C. Chan, X. Wei, and C.-H. Ho, "New object detection, tracking and recognition approaches for video surveillance over camera network," May 2015.