Multi-camera Video Surveillance: Detection, Occlusion Handling, Tracking and Event Recognition
Oytun Akman
Overview
Surveillance Systems
Single Camera Configuration: Moving Object Detection, Tracking, Event Recognition
Multi-camera Configuration: Moving Object Detection, Occlusion Handling, Tracking, Event Recognition
Single Camera Configuration
Moving Object Detection (MOD), Tracking, Event Recognition
Single Camera Configuration
Moving Object Detection (MOD)
Input Image - Background Image = Foreground Mask
Single Camera Configuration
Moving Object Detection (MOD)
Frame Differencing (M. Piccardi, 1996)
Eigenbackground Subtraction (N. Oliver, 1999)
Parzen Window (KDE) Based MOD (A. Elgammal, 1999)
Mixture of Gaussians Based MOD (W. E. Grimson, 1999)
Single Camera Configuration
MOD – Frame Differencing
Foreground mask detection
Background model update
BD(x, y, t) = | I(x, y, t) − BM(x, y, t−1) |

FM(x, y, t) = 1 if BD(x, y, t) ≥ threshold, 0 otherwise

BM(x, y, t) = α · I(x, y, t) + (1 − α) · BM(x, y, t−1)
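The frame-differencing update can be sketched in a few lines of numpy, assuming grayscale frames; the threshold and learning rate α here are illustrative defaults, not values from the presentation:

```python
import numpy as np

def frame_difference_step(frame, bg_model, threshold=30.0, alpha=0.05):
    """One step of frame-differencing MOD.

    BD = |I - BM|; FM = 1 where BD >= threshold;
    BM is updated as a running average: BM <- alpha*I + (1-alpha)*BM.
    """
    diff = np.abs(frame.astype(float) - bg_model)      # BD(x, y, t)
    fg_mask = (diff >= threshold).astype(np.uint8)     # FM(x, y, t)
    new_bg = alpha * frame + (1.0 - alpha) * bg_model  # BM(x, y, t)
    return fg_mask, new_bg
```

A small α keeps the background model stable while slowly absorbing lighting changes; a large α lets stopped objects fade into the background quickly.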
Single Camera Configuration
MOD – Eigenbackground Subtraction
Principal Component Analysis (PCA): reduce the data dimension and capture the major variance; the reduced data represents the background model
(http://web.media.mit.edu/~tristan/phd/dissertation/chapter5.html)
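A minimal sketch of eigenbackground subtraction, computing the PCA basis with an SVD over flattened training frames; the component count and threshold are illustrative assumptions:

```python
import numpy as np

def eigenbackground(frames, n_components=2):
    """frames: (N, H*W) matrix of flattened training images.
    The top principal components capture the major variance of the
    background and form the reduced background model."""
    mean = frames.mean(axis=0)
    centered = frames - mean
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def eigen_foreground_mask(image, mean, basis, threshold=20.0):
    # project the new image onto the eigenbackground, reconstruct,
    # and flag pixels that the background model cannot explain
    coeffs = basis @ (image - mean)
    recon = mean + basis.T @ coeffs
    return (np.abs(image - recon) >= threshold).astype(np.uint8)
```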
Single Camera Configuration
MOD – Parzen Window Based
Nonparametrically estimating the probability of observing pixel intensity values, based on the sample intensities
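The nonparametric estimate can be sketched for a single pixel, assuming a Gaussian kernel over recent intensity samples; the bandwidth is an illustrative parameter:

```python
import numpy as np

def kde_background_prob(pixel_value, samples, bandwidth=10.0):
    """Parzen-window (Gaussian kernel) estimate of P(pixel | background),
    averaging one kernel per stored background sample."""
    diff = (pixel_value - np.asarray(samples, dtype=float)) / bandwidth
    kernels = np.exp(-0.5 * diff ** 2) / (bandwidth * np.sqrt(2 * np.pi))
    return kernels.mean()
```

A pixel is declared foreground when this probability falls below a threshold; values far from all stored samples score near zero.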
Single Camera Configuration
MOD – Mixture of Gaussians Based
Based on modeling each pixel by mixture of K Gaussian distributions
Probability of observing pixel value x_N at time N:

p(x_N) = ∑_{k=1}^{K} w_k · (2π)^{−D/2} |Σ_k|^{−1/2} · exp( −(1/2) (x_N − μ_k)^T Σ_k^{−1} (x_N − μ_k) )

where Σ_k = σ_k² I (assuming that the R, G, B channels are independent)
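The mixture density above can be evaluated directly; a sketch assuming Σ_k = σ_k² I, so each component stores only a scalar variance:

```python
import numpy as np

def mog_prob(x, weights, means, variances):
    """p(x) = sum_k w_k * N(x; mu_k, sigma_k^2 I) for a D-dim pixel value,
    with Sigma_k = sigma_k^2 * I (R, G, B assumed independent)."""
    x = np.asarray(x, dtype=float)
    d = x.size
    p = 0.0
    for w, mu, var in zip(weights, means, variances):
        norm = (2.0 * np.pi * var) ** (d / 2.0)   # (2*pi)^(D/2) * |Sigma|^(1/2)
        p += w * np.exp(-0.5 * np.sum((x - mu) ** 2) / var) / norm
    return p
```

In the full algorithm, components with large weight and small variance are labeled background; a new pixel matching none of them is foreground.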
Single Camera Configuration
MOD - Simulation Results
Single Camera Configuration
Tracking
Object Association
Mean-shift Tracker (D. Comaniciu, 2003)
Cam-shift Tracker (G. R. Bradski, 1998)
Pyramidal Kanade-Lucas-Tomasi Tracker (KLT) (J. Y. Bouguet, 1999)
(A constant velocity Kalman filter is associated with each tracker)
Single Camera Configuration
Tracking – Object Association
O_i(t) = O_j(t+1) if the bounding boxes overlap and D(O_i(t), O_j(t+1)) < threshold_md,
where D(·, ·) is a distance metric between the color histograms of the objects:

Symmetric Kullback-Leibler divergence: D(h1, h2) = ∑ h1 log(h1 / h2) + ∑ h2 log(h2 / h1)

Bhattacharyya-coefficient-based distance: D(h1, h2) = √( 1 − ∑ √(h1 · h2) )

[Figure: Object1 and Object2 at frames t and t+1]
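Both histogram distances can be sketched as follows, assuming normalized histograms; the small eps guarding the logarithms is an implementation detail, not part of the original formulation:

```python
import numpy as np

def bhattacharyya_distance(h1, h2):
    """D(h1, h2) = sqrt(1 - sum_i sqrt(h1_i * h2_i)) for normalized histograms."""
    h1, h2 = np.asarray(h1, float), np.asarray(h2, float)
    bc = np.sum(np.sqrt(h1 * h2))          # Bhattacharyya coefficient
    return np.sqrt(max(0.0, 1.0 - bc))

def symmetric_kl(h1, h2, eps=1e-12):
    """D(h1, h2) = sum_i h1_i log(h1_i/h2_i) + sum_i h2_i log(h2_i/h1_i)."""
    h1 = np.asarray(h1, float) + eps
    h2 = np.asarray(h2, float) + eps
    return float(np.sum(h1 * np.log(h1 / h2)) + np.sum(h2 * np.log(h2 / h1)))
```

Both distances are zero for identical histograms and grow as the color distributions diverge, which is what the association threshold exploits.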
Single Camera Configuration
Tracking – Mean-shift Tracker
Similarity function between the target model q and the candidate model p(y):

d(y) = √(1 − ρ(y)),  where ρ(y) = ∑_{u=1}^{m} √( p_u(y) · q_u ) is the Bhattacharyya coefficient

and p and q are m-bin color histograms
(http://www.lisif.jussieu.fr/~belaroussi/face_track/CamshiftApproach.htm)
Single Camera Configuration
Tracking - Mean-shift Tracker - Simulation Result
Single Camera Configuration
Tracking – Cam-shift Tracker
The backprojection image (probability distribution image) is calculated
The mean-shift algorithm is used to find the mode of the probability distribution image around the previous target location
Single Camera Configuration
Tracking – Cam-shift Tracker - Simulation Result
Single Camera Configuration
Tracking – Pyramidal KLT
The optical flow d = [d_x, d_y] of a good feature point (corner) is found by minimizing the error function

ε(d) = ∑_{x=u_x−w_x}^{u_x+w_x} ∑_{y=u_y−w_y}^{u_y+w_y} ( I(x, y) − J(x + d_x, y + d_y) )²
(http://www.suri.it.okayama-u.ac.jp/research/2001/s-takahashi/s-takahashi.html)
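The error function above can be sketched as a plain sum of squared differences over an integer window; the full tracker adds subpixel interpolation and the image pyramid, which this illustrative version omits:

```python
import numpy as np

def klt_ssd_error(I, J, corner, d, w=1):
    """KLT matching error for integer displacement d = (dx, dy):
    sum over a (2w+1)x(2w+1) window centered at `corner` of
    (I(x, y) - J(x+dx, y+dy))^2.  Arrays are indexed as [row, col]."""
    ux, uy = corner
    dx, dy = d
    err = 0.0
    for x in range(ux - w, ux + w + 1):
        for y in range(uy - w, uy + w + 1):
            err += (float(I[y, x]) - float(J[y + dy, x + dx])) ** 2
    return err
```

Minimizing this error over candidate displacements (coarse-to-fine on the pyramid) yields the flow vector of the corner.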
Single Camera Configuration
Tracking - Pyramidal KLT - Simulation Results
Single Camera Configuration
Event Recognition - Hidden Markov Models (HMM)
GM-HMMs, trained on proper object trajectories, are used to model the traffic flow (F. Porikli, 2004) (F. Bashir, 2005).

tr_i = [ (x^i_m, y^i_m), (x^i_{m+1}, y^i_{m+1}), …, (x^i_n, y^i_n) ]

m: starting frame number, in which the object enters the FOV
n: end frame number, in which the object leaves the FOV
Single Camera Configuration
Event Recognition – Simulation Result
Multi-camera Configuration
Background Modeling, Occlusion Handling, Tracking, Event Recognition
Multi-camera Configuration
Background Modeling: three background modeling algorithms
Foreground Detection by Unanimity
Foreground Detection by Weighted Voting
Mixture of Multivariate Gaussians Background Model
Multi-camera Configuration
Background Modeling
A common field of view must be defined to specify the region in which the cameras will cooperate
Multi-camera Configuration
Background Modeling - Unanimity
A pixel is labeled foreground only if both views agree: if (x is foreground) && (x′ is foreground) → foreground
Multi-camera Configuration
Background Modeling – Weighted Voting
BD(x₁, x₂, t) = α | I₁(x₁, t) − BM₁(x₁, t−1) | + β | I₂(x₂, t) − BM₂(x₂, t−1) |

FM(x, y, t) = 1 if BD(x₁, x₂, t) ≥ threshold, 0 otherwise

α and β are the coefficients that adjust the contributions of the cameras. Generally, the contribution of the first camera (the reference camera, with better positioning) is larger than that of the second one, so α > β with α + β = 1.
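The weighted-voting rule can be sketched as follows, assuming the two views are already registered on the common field of view; α = 0.7, β = 0.3 and the threshold are illustrative values, not the presentation's:

```python
import numpy as np

def weighted_vote_mask(I1, BM1, I2, BM2, alpha=0.7, beta=0.3, threshold=30.0):
    """BD = alpha*|I1 - BM1| + beta*|I2 - BM2| on the common field of view;
    FM = 1 where BD >= threshold.  alpha + beta = 1, with alpha favoring
    the reference camera."""
    bd = alpha * np.abs(I1 - BM1) + beta * np.abs(I2 - BM2)
    return (bd >= threshold).astype(np.uint8)
```

Unlike unanimity, a single failed camera only attenuates the vote instead of zeroing the final mask.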
Multi-camera Configuration
Background Modeling – Weighted Voting
Multi-camera Configuration
Background Modeling – Mixture of Multivariate Gaussians
Each pixel is modeled by a mixture of K multivariate Gaussian distributions:

p(X_N) = ∑_{k=1}^{K} w_k · (2π)^{−D/2} |Σ_k|^{−1/2} · exp( −(1/2) (X_N − μ_k)^T Σ_k^{−1} (X_N − μ_k) )

where X_N = [x¹_N, x²_N]^T stacks the corresponding pixel observations from the two views, with x²_N = H₁₂ x¹_N (H₁₂ being the homography from view 1 to view 2)
Multi-camera Configuration
Background Modeling – Mixture of Multivariate Gaussians
[Figure: input image; mixture-of-multivariate-Gaussians result; single-camera MOG result]
Multi-camera Configuration
Background Modeling - Conclusions
Projection errors due to the planar-object assumption → erroneous foreground masks → false segmentation results
Cameras must be mounted at high altitudes relative to object heights
Background modeling by unanimity: false segmented regions are eliminated, but any camera failure → failure in the final mask; solved by weighted voting
With the multivariate MOG method, vehicles missed by the single-camera MOG method can be segmented
Multi-camera Configuration
Occlusion Handling
Occlusion is a primary issue for surveillance systems:
False foreground segmentation results, tracking failures
Difficult to solve using a single-camera configuration
Occlusion-free view generation using multiple cameras: utilization of 3D information, presence of different points of view
Multi-camera Configuration
Occlusion Handling – Block Diagram
[Block diagram: for each camera, input image + background model → background subtraction → foreground mask → segmentation module → segments. Segments from the two views are related by epipolar matching; matched segment centers are transferred to top-view points via the trifocal tensor, and graph-based clustering groups the top-view points into individual objects.]
Multi-camera Configuration
Occlusion Handling - Background Subtraction
Foreground masks are obtained using background subtraction
Multi-camera Configuration
Occlusion Handling – Oversegmentation
Foreground mask is oversegmented using “Recursive Shortest Spanning Tree” (RSST) and K-means algorithms
[Figure: RSST and K-means oversegmentation results]
Multi-camera Configuration
Occlusion Handling – Top-view Generation
Multi-camera Configuration
Occlusion Handling – Top-view Generation
Corresponding match of a segment is found by comparing the color histograms of the target segment and candidate segments on the epipolar line
[Figure: epipolar matching results for RSST and K-means segmentations]
Multi-camera Configuration
Occlusion Handling – Clustering
Segments are grouped using the "shortest spanning tree" algorithm with the weight function

w_ij = H_diff + D_diff

combining the color-histogram difference H_diff and the spatial distance D_diff between segments i and j
[Figure: RSST and K-means clustering results]
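The grouping step can be sketched with a union-find over edges sorted by weight; joining only edges with weight at or below the threshold yields the same components as building a minimum spanning tree and cutting its heavy edges. The edge list and threshold here are illustrative:

```python
def cluster_segments(n, edges, threshold):
    """Group n segments given weighted edges (i, j, w_ij).
    Equivalent to cutting spanning-tree edges whose weight exceeds
    the threshold; returns a cluster label per segment."""
    parent = list(range(n))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # path compression
            a = parent[a]
        return a

    for i, j, w in sorted(edges, key=lambda e: e[2]):
        if w <= threshold:
            parent[find(i)] = find(j)

    labels = [find(i) for i in range(n)]
    remap = {}                              # relabel to consecutive ids
    return [remap.setdefault(l, len(remap)) for l in labels]
```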
Multi-camera Configuration
Occlusion Handling – Clustering
After cutting the edges with weights greater than a certain threshold
[Figure: RSST and K-means clustering results after edge cutting]
Multi-camera Configuration
Occlusion Handling – Conclusions
Successful results for partially occluded objects
Under strong occlusion: epipolar matching fails, and objects are oversegmented or undersegmented; the problem is solved if one of the cameras can see the object without occlusion
RSST and K-means have similar results; K-means has better real-time performance
Multi-camera Configuration
Tracking – Kalman Filters
Advantage: continuous and correct tracking as long as one of the cameras is able to view the object
Tracking is performed in both of the views by using Kalman filters
2D state model: X_i = [x, y, v_x, v_y]^T

State transition model:

A_i = | 1 0 1 0 |
      | 0 1 0 1 |
      | 0 0 1 0 |
      | 0 0 0 1 |

Observation model:

O_i = | 1 0 0 0 |
      | 0 1 0 0 |
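One predict/update step of the constant-velocity Kalman filter can be sketched with these matrices; the process and measurement noise covariances Q and R are illustrative assumptions, since the presentation does not give them:

```python
import numpy as np

# Constant-velocity model with state X = [x, y, vx, vy]^T
A = np.array([[1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)   # state transition
O = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)   # observe position only

def kalman_step(x, P, z, Q=None, R=None):
    """Predict with the constant-velocity model, then update with the
    position measurement z = (x_meas, y_meas)."""
    Q = np.eye(4) * 0.01 if Q is None else Q
    R = np.eye(2) * 1.0 if R is None else R
    x_pred = A @ x                           # predict state
    P_pred = A @ P @ A.T + Q                 # predict covariance
    S = O @ P_pred @ O.T + R                 # innovation covariance
    K = P_pred @ O.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x_pred + K @ (z - O @ x_pred)
    P_new = (np.eye(4) - K @ O) @ P_pred
    return x_new, P_new
```

Fed with consistent position measurements, the filter infers the unobserved velocity components, which is what keeps tracks alive through short detection gaps.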
Multi-camera Configuration
Tracking – Object Matching
Objects in different views are related to each other via a homography H:

x′ = H x,  x = H⁻¹ x′

Two objects are matched when the transfer distances d = ‖x′ − H x‖ and d′ = ‖x − H⁻¹ x′‖ are both small.

[Figure: camera view 1 and camera view 2, with transfer distances d and d′]
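Homography-based matching can be sketched as follows, assuming a known 3×3 homography H and points in homogeneous coordinates:

```python
import numpy as np

def transfer(H, pt):
    """Map an image point through homography H (homogeneous coordinates)."""
    p = H @ np.array([pt[0], pt[1], 1.0])
    return p[:2] / p[2]

def match_distances(H, x, x_prime):
    """d = ||x' - H x|| and d' = ||x - H^-1 x'||; both are small
    when x and x' are views of the same ground-plane point."""
    d = np.linalg.norm(np.asarray(x_prime, float) - transfer(H, x))
    d_prime = np.linalg.norm(
        np.asarray(x, float) - transfer(np.linalg.inv(H), x_prime))
    return d, d_prime
```

Checking the distance in both directions makes the match symmetric, so neither view is privileged.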
Multi-camera Configuration
Tracking – Example
[Figure: objects a, b, c in camera 1 and a′, b′, c′ in camera 2 at positions P1 and P2, related by homographic projection]
Multi-camera Configuration
Tracking – Example (continued)
[Figure: objects b, c in camera 1 and b′, c′ in camera 2 at positions P3 and P4]
Multi-camera Configuration
Tracking – Simulation Results
Multi-camera Tracking
Multi-camera Configuration
Tracking – Simulation Results
Single-camera Tracking (view 1 and view 2)
Multi-camera Configuration
Tracking – Simulation Results
Multi-camera Tracking
Multi-camera Configuration
Tracking – Simulation Results
Single-camera Tracking (view 1 and view 2)
Multi-camera Configuration
Event Recognition - Trajectories
Extracted trajectories from both of the views are concatenated to obtain a multi-view trajectory:

tr¹_i = [ (x¹_m, y¹_m), …, (x¹_n, y¹_n) ]

tr²_i = [ (x²_m, y²_m), …, (x²_n, y²_n) ]

tr³_i = [ tr¹_i | tr²_i ] = [ (x¹_m, y¹_m, x²_m, y²_m), …, (x¹_n, y¹_n, x²_n, y²_n) ]
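The concatenation can be sketched in numpy, assuming both trajectories are sampled on the same frames; trimming to the shorter trajectory is an illustrative choice for frames missing in one view:

```python
import numpy as np

def concat_trajectories(tr1, tr2):
    """Concatenate per-frame (x, y) trajectories from the two views into a
    4-D multi-view trajectory: tr3[k] = [x1_k, y1_k, x2_k, y2_k]."""
    tr1 = np.asarray(tr1, dtype=float)
    tr2 = np.asarray(tr2, dtype=float)
    n = min(len(tr1), len(tr2))      # keep frames present in both views
    return np.hstack([tr1[:n], tr2[:n]])
```

The resulting 4-D observation sequence is what GM_HMM_1+2 is trained on.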
Multi-camera Configuration
Event Recognition – Training
Multi-camera Configuration
Event Recognition – Viterbi Distances of Training Samples
Object ID   GM_HMM_1   GM_HMM_2   GM_HMM_1+2
1 10.0797 9.90183 19.7285
2 10.2908 10.1049 20.1867
3 10.2266 10.1233 20.1006
4 10.577 10.6716 21.2304
5 9.99018 9.84572 19.6763
6 10.0584 9.85901 19.6572
7 10.0608 9.88434 19.7496
8 10.2821 10.2472 20.3949
9 10.0773 9.8764 19.7181
10 10.3629 10.2508 20.3399
11 10.0322 9.86696 19.6382
12 10.0695 9.92222 19.7072
13 10.1321 9.95447 19.7818
14 10.2666 10.139 20.2119
15 10.2661 10.0629 20.0147
16 10.038 9.92932 19.6548
17 10.126 9.98202 19.7991
18 10.2134 10.108 19.9983
19 10.8046 10.5149 21.3008
20 10.4454 10.2919 20.333
21 10.111 9.90018 19.6983
22 10.1791 9.9294 19.9025
23 10.0511 10.0658 20.1564
24 10.1007 10.2248 20.248
25 10.3782 9.8986 19.9865
26 10.0308 9.9264 19.8682
27 10.0816 10.2139 20.1286
Multi-camera Configuration
Event Recognition – Simulation Results with Abnormal Data
Average distance to GM_HMM_1: 10.20; to GM_HMM_2: 10.06; to GM_HMM_1+2: 20.04
Object ID   GM_HMM_1   GM_HMM_2   GM_HMM_1+2
28 20.4058 19.9481 45.1818
29 21.2409 19.7736 45.034
30 26.9917 24.7016 55.2278
31 10.7213 10.5773 21.2099
32 10.4648 10.5105 22.1852
33 10.1611 9.97222 19.7785
Multi-camera Configuration
Tracking & Event Recognition - Conclusions
Tracking: successful results given a correct initial segmentation; other tracker algorithms can also be used
Event recognition: GM_HMM_1+2 classifies the test data better
Thank you...
Summary - Surveillance
Single camera configuration
Moving object detection: frame differencing, eigenbackground, Parzen window (KDE), mixture of Gaussians
Tracking: object association, mean-shift tracker, cam-shift tracker, pyramidal Kanade-Lucas-Tomasi tracker (KLT)
Event recognition
Multi-camera configuration
Background modeling: foreground detection by unanimity, foreground detection by weighted voting, mixture of multivariate Gaussian distributions
Occlusion handling, tracking, event recognition