Detecting Dominant Motion Flows In Unstructured/structured Crowd
Scenes
Ovgu Ozturk Toshihiko Yamasaki Kiyoharu Aizawa
The University of Tokyo
{ovgu, yamasaki, aizawa}@hal.t.u-tokyo.ac.jp
Abstract
Detecting dominant motion flows in crowd scenes
is one of the major problems in video surveillance.
This is particularly difficult in unstructured crowd
scenes, where the participants move randomly in
various directions. This paper presents a novel
method which utilizes SIFT feature flow vectors to
calculate the dominant motion flows in both
unstructured and structured crowd scenes. SIFT
features can represent the characteristic parts of
objects, allowing robust tracking under non-rigid
motion. First, flow vectors of SIFT features are
calculated at certain intervals to form a motion flow
map of the video. Next, this map is divided into
equally sized square regions and in each region
dominant motion flows are estimated by clustering
the flow vectors. Then, local dominant motion flows
are combined to obtain the global dominant motion
flows. Experimental results demonstrate the
successful application of the proposed method to
challenging real-world scenes.
1. Introduction
Dominant motion patterns in videos carry
significant information with a wide range of
applications. Since motion patterns are formed by
the individual or interacting motions of crowds,
they help to analyze the social behavior in the
environment shown in the video. They are also useful
for public place design and for activity analysis
carried out for security reasons.
Over the years, many studies have tried to find
motion patterns by using individual object
tracking and trajectory classification methods.
However, in real-world situations, high-density
crowds are the most common case, and it is not
always possible to track individual objects. Crowd
scenes can be divided into two groups, unstructured
and structured scenes, as in Figure 1. Structured
crowds are those where the main motion tracks are
defined by environmental conditions, such as
elevators, crosswalks, etc. Unstructured crowds are
those where objects can move freely in any
direction, following any path. So far, very few
researchers have attempted to address the
complexity of the crowd scenes that are structured.
Detecting dominant motion flows in unstructured
crowds remains a challenging task.
To solve the problem of calculating the dominant
motion flows in both unstructured and structured
crowds, we propose a new approach which makes two
distinctive contributions. First, our approach utilizes
motion flows of the SIFT features in a scene. Unlike
the corner-based features commonly used in other
studies, SIFT features can represent characteristic
parts of the objects. Therefore, their tracking
consistency and accuracy are higher during complex
motions. Second, we propose a hierarchical
clustering framework to deal with the complexity of
unstructured motion flows.
(a) Structured Crowd Scenes
(b) Unstructured Crowd Scenes
Figure 1. Unstructured/structured Crowds.
2010 International Conference on Pattern Recognition
1051-4651/10 $26.00 2010 IEEE
DOI 10.1109/ICPR.2010.862
3521
The entire scene is divided into equally sized local
regions. In each local region, flow vectors are
classified into groups based on their orientation.
Then, location-based classification is applied to find
the spatial accumulation of the vectors. Finally, local
dominant motion flows are connected to obtain
global dominant motion flows.
1.1. Related Work
Tracking individual objects and constructing their
trajectories is a common approach to finding the
global motion flows, as in [1, 6]. However, for crowd
videos, continuous tracking of individual objects is
not possible because of occlusion or tracking
failures. Another approach is to employ instantaneous
flow vectors of image features in the entire image
[3-5, 11]. These methods use corner-based features,
which are not reliable under non-rigid motion, affine
transformation or noise. Hence, they consider only
structured motions and do not work for unstructured
crowds. The method in [4] uses neighborhood
information, but it fails when a region contains flows
in multiple directions that cancel each other. Floor
fields, which are applicable to structured crowds,
are proposed in [7]. Only the work in [2] considers
unstructured crowd scenes, where the aim is to track
individual targets.
2. Generating SIFT Feature Flows
In this paper, SIFT features are used to calculate
the motion flows. SIFT features are known to be
among the features most robust under various
transformations. They can be used to continuously
track the foreground objects over many frames. Thus,
instead of calculating the motion flows at each frame,
we track the features at certain intervals. This
provides two advantages. First, it reduces the noise
coming from the background and unstable points.
Second, the computed motion flow vectors can be used
directly without any pre- or post-processing.
Each video is segmented into intervals of length
d. SIFT features extracted in a frame are matched to
the corresponding features in the frame that follows
after the interval d. The displacement vectors of the
features that exceed a certain threshold are defined
as flow vectors. Figure 2 depicts the flow vectors. A
flow vector is represented as F(x, y, θ, t, L), where:
x, y : center of mass
θ : orientation
L : length
t : frame number
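As an illustration of this representation, the following minimal sketch builds flow vectors from feature positions that have already been matched between frame t and frame t + d. It is not the authors' implementation: the names FlowVector and flow_vectors, the default threshold min_length, and the use of the displacement midpoint as the "center of mass" are all assumptions made for the example.

```python
import math
from typing import NamedTuple

class FlowVector(NamedTuple):
    x: float      # x of the vector center (assumed: midpoint of the displacement)
    y: float      # y of the vector center
    theta: float  # orientation in radians, wrapped with modulo 2*pi
    t: int        # frame number at which the feature was extracted
    L: float      # length of the displacement in pixels

def flow_vectors(matches, t, min_length=2.0):
    """Turn matched feature positions into flow vectors.

    matches: iterable of ((x0, y0), (x1, y1)), the position of one feature
             at frame t and at frame t + d (matching itself is not shown).
    min_length: hypothetical threshold; shorter displacements are discarded.
    """
    flows = []
    for (x0, y0), (x1, y1) in matches:
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy)
        if length <= min_length:
            continue  # treated as background or an unstable point
        theta = math.atan2(dy, dx) % (2 * math.pi)
        flows.append(FlowVector((x0 + x1) / 2, (y0 + y1) / 2, theta, t, length))
    return flows
```

In this sketch a feature that moves from (0, 0) to (3, 4) yields a vector of length 5 centered at (1.5, 2.0), while a feature that moves by half a pixel is dropped.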
Figure 3(a) shows an unstructured crowd scene. The
motion flow map of the region in the white square is
depicted in Figure 3(b). Motion flows are calculated
for 400 frames with an interval length of 3.
Accumulation of flow vectors can be seen in certain
orientations. However, as the variety of orientations
in the region increases, the flow map becomes very
complicated. When the entire scene is considered, the
amount of data and the complexity are even higher. In
this case, common clustering methods in the
literature [3] will not work effectively. We introduce
a hierarchical clustering method to detect the
dominant motion flows in the region, which is
explained in the next section.
3. Calculating Dominant Motion Flows
Detecting dominant motion flows is defined as
finding the orientation and spatial distribution of the
most frequently followed paths in a scene during a
given period. If the motion of the objects in a video
is organized, then one type of orientation can be
assigned to each location. However, in crowd videos,
especially of unstructured crowds, participants move
in various directions at different times. Each spatial
location holds more than one orientation type
depending on the time. It is not possible to find the
dominant flows with existing methods [3, 4, 11].
In this work, the entire scene is divided into smaller
regions, in which flow vectors are easier to separate
into meaningful groups. Then, the flow vectors in
each region are clustered with a two-step hierarchical
approach to find the local dominant motion flows.
Figure 4 shows the hierarchical clustering steps.
Figure 2. SIFT motion flow vectors.
(a) (b)
Figure 3. (a) Unstructured Crowd Scene. (b) SIFT
flows in the marked region for 400 frames.
Finally, local dominant motion flows are connected
to compute the global dominant motion flows.
3.1. Hierarchical Clustering of Flow Vectors
Orientation information is the most significant
information for classifying the flow vectors. In
each local region, flow vectors are first classified
into one of the four main orientation groups. Figure 4
shows the grouping of orientations. To achieve this,
an orientation histogram is calculated and the major
groups are chosen to represent the region. For
example, in Figure 5(b), there are two orientation
groups, depicted in blue and green. The second step
is spatial clustering. Flow vectors in each
orientation group are clustered based on their
location. Hence, accumulations of the vectors in the
region are detected, as in Figure 5(c). For this step,
the Self-Tuning Spectral Clustering method [12] is
applied, considering the evaluation results in [3].
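The first, orientation-based step can be sketched as follows. The exact angular boundaries of the four groups are defined in Figure 4, which is not reproduced here, so the layout below (four 90-degree sectors centered on the image axes, with II and III horizontal and I and IV vertical) is an assumption, as are the function names and the min_share cut-off used to pick the major groups. The spatial clustering step is not shown.

```python
import math
from collections import Counter

# Assumed sector layout: sector 0 is centered on angle 0, sector 1 on pi/2,
# sector 2 on pi, sector 3 on 3*pi/2. The group labels are illustrative.
SECTOR_TO_GROUP = {0: "II", 1: "I", 2: "III", 3: "IV"}

def orientation_group(theta):
    """Map an angle in radians to one of the four assumed orientation groups."""
    shifted = (theta + math.pi / 4) % (2 * math.pi)
    sector = int(shifted // (math.pi / 2)) % 4
    return SECTOR_TO_GROUP[sector]

def major_groups(thetas, min_share=0.2):
    """Build the orientation histogram of one local region and return the
    groups holding at least min_share of the vectors, most populated first.
    min_share is a hypothetical parameter, not a value from the paper."""
    hist = Counter(orientation_group(th) for th in thetas)
    total = sum(hist.values())
    return [g for g, n in hist.most_common() if n / total >= min_share]
```

With six vectors, three pointing near angle 0, two near π and one near π/2, this sketch keeps the two horizontal groups and discards the single vertical vector as minor.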
After clustering, local dominant motion flows are
calculated by computing the average location,
average orientation and total number of the flow
vectors in each group. The local dominant motion
flow for each group is thus described by L(x, y, w, θ),
where w is the number of vectors and is depicted by
the width of the flow vector. Figure 5(d) shows
three dominant motion flows calculated in the region.
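A minimal sketch of this summary step is given below. The paper only states that the orientation is averaged; the circular mean used here is an assumption, chosen so that angles on either side of the 0/2π wrap-around average correctly. The function name is hypothetical.

```python
import math

def local_dominant_flow(cluster):
    """Summarize one spatial cluster of flow vectors as (x, y, w, theta).

    cluster: non-empty list of (x, y, theta) tuples, theta in radians.
    """
    w = len(cluster)  # number of vectors, drawn as the flow width
    x = sum(v[0] for v in cluster) / w
    y = sum(v[1] for v in cluster) / w
    # Circular mean of the orientations (assumption, see lead-in).
    s = sum(math.sin(v[2]) for v in cluster)
    c = sum(math.cos(v[2]) for v in cluster)
    theta = math.atan2(s, c) % (2 * math.pi)
    return x, y, w, theta
```

For two vectors with orientations 0.1 and 2π - 0.1, the circular mean lies at the wrap-around point (angle 0), whereas a plain arithmetic mean would give π, the opposite direction.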
3.2. Combining Local Dominant Flows
Once the main flows in the local regions are detected,
the next question is how to combine them to obtain the
global motion flows. The basic idea is to start from
one side of the scene, follow the local flows and
connect them to the most probable neighbor flows until
the end of the scene. In other words, the entire
scene is first scanned horizontally to connect the
horizontal flows. After this, it is scanned vertically
to connect the vertical flows. Orientation groups II
and III are treated as horizontal flows, whereas
groups I and IV are treated as vertical flows. The
algorithm is as follows:
While scanning, for each local motion flow,
1. Determine the neighbor cells, Ns.
2. In each N, search for the motion flows that
are in the same orientation group.
3. Choose the closest one in the neighborhood
and connect it with the current flow.
4. If there are no motion flows of the same
orientation group in the neighbor cells and
next neighbor cells, choose the motion flow
that is the closest.
Neighbor cells are defined as the two regions that
are in the direction of the current flow. For example,
in Figure 6(a), for the horizontal vector, the
neighborhood cells are c, e and next neighbor cells
are c, e. In Figure 6, the vectors shown with A are
in orientation group II. A1 is connected to A3 and A2,
A3 are connected to A4. Hence, they form the global
flow shown with the bold gray line. If there are no
vectors of the same group in the neighbor and next
neighbor cells, the flow is connected to the closest
vector to keep the continuity. This case indicates a
dominant abrupt change of motion orientation in that
region. For example, if A4 were absent, A3 would be
connected to B1.
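Steps 2 to 4 of the connection rule can be sketched for a single local flow as follows. This is one possible reading of the algorithm, not the authors' code: it assumes that the flows in the neighbor cells and in the next neighbor cells have already been collected (step 1), that neighbor cells are searched before next neighbor cells, and that the fallback picks the closest flow of any group from those same cells. The names LocalFlow and pick_successor and the coordinates in the usage note are hypothetical.

```python
import math
from typing import NamedTuple

class LocalFlow(NamedTuple):
    name: str
    x: float
    y: float
    group: str  # orientation group label, "I" to "IV"

def closest(current, candidates):
    """Return the candidate nearest to current, or None if there is none."""
    return min(
        candidates,
        key=lambda f: math.hypot(f.x - current.x, f.y - current.y),
        default=None,
    )

def pick_successor(current, neighbor_flows, next_neighbor_flows):
    """Choose the local flow to connect to the current one."""
    # Steps 2 and 3: prefer the closest flow of the same orientation group,
    # looking in the neighbor cells first, then in the next neighbor cells.
    for pool in (neighbor_flows, next_neighbor_flows):
        same = [f for f in pool if f.group == current.group]
        if same:
            return closest(current, same)
    # Step 4: no flow of the same group was found, so fall back to the
    # closest flow of any group (assumed to come from the same cells).
    return closest(current, list(neighbor_flows) + list(next_neighbor_flows))
```

With made-up positions, a flow A3 in group II that has both A4 (group II) and B1 (group I) in its neighbor cells is connected to A4; if A4 is removed from the list, the fallback connects A3 to B1, which mirrors the example in the text.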
(a) (b) orientation-based clustering (c) spatial clustering (d) local motion flows
Figure 5. Hierarchical clustering.
(a)
(b)
Figure 6. Connecting the local flows.
(a) (b)
Figure 4. Hierarchical Clustering.
4. Experimental Results and Discussion
In our experiments, crowd data sets are taken
from the datasets of the University of Central Florida
[4] to allow a comparison with related work. In
Figure 7, (a) shows the input scenes and SIFT flows,
and (b) shows the results of our method with detailed
lines. (c) shows the results in thick lines after
combining the groups and generating one group for
each global flow. (d) shows the ground truth, which is
drawn from the average result of a user study. The
image size for both sets is 360x480 pixels. Local
regions are 60x60 pixels in size, giving 48 (6x8)
local regions in total.
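The region count follows directly from the sizes given above, as the short sketch below shows. It assumes that 360 is the image height and 480 the width, which the text does not state explicitly; the function name region_index is hypothetical.

```python
def region_index(x, y, cell=60, width=480, height=360):
    """Return (row, col) of the local region that contains pixel (x, y)."""
    if not (0 <= x < width and 0 <= y < height):
        raise ValueError("point outside the image")
    return int(y // cell), int(x // cell)

# Grid dimensions for the experimental setting: 6 rows by 8 columns.
rows, cols = 360 // 60, 480 // 60
total_regions = rows * cols
```

A flow vector centered at pixel (61, 10) falls into region (0, 1) under this indexing, and total_regions evaluates to the 48 regions reported in the text.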
The set at the top is from an escalator
neighborhood, which is a structured crowd example.
The video is analyzed between frames 100 and 460
with an interval of three. Most of the people move on
the escalators and the people on the far end of the
escalators walk freely. The proposed method can
successfully detect the global motion flows in free
motion regions as well as the flows through the
escalators.
The set at the bottom is from a street, which is an
unstructured crowd example with high complexity.
The video is analyzed between frames 140 and 460 with
an interval length of three. In 7(b), the local regions
and the connections of the local motion flows can be
seen. For the street scene, our system captures the
parallelism in the upper half of the scene. The
crossing of the motion flows is also detected in the
lower part. In addition, three main flows of vertical
motion are detected, shown in purple in 7(b). With
the proposed approach, dominant motion flows can be
detected at various levels. General dominant flow
maps can be provided as in (c), or, if necessary, a
local analysis of the flows can be obtained as in (b).
5. Conclusions
In this work, we have presented a new approach
to the problem of calculating dominant motion
flows in various crowd scenes. By using SIFT feature
flows and a hierarchical clustering approach, it
becomes possible to analyze the motion flows for
both unstructured and structured crowds. The
proposed approach can detect global motion flows and,
at the same time, provide information about the local
characteristics of the motion flows.
References
[1] F. M. Porikli, Trajectory Pattern Detection by HMM
Parameter Space Features and Eigenvector Clustering,
ECCV, 2004.
[2] M. Rodriguez, S. Ali and T. Kanade, Tracking In
Unstructured Crowded Scenes, ICCV, 2009.
[3] G. Eibl, N. Brandle, Evaluation of Clustering Methods
for Finding Dominant Optical Flow Fields in Crowded
Scenes, ICPR, 2008.
[4] M. Hu, S. Ali and M. Shah, Detecting Global Motion
Patterns in Complex Videos, ICPR, 2008.
[5] G. Brostow, R. Cipolla, Unsupervised Bayesian
Detection of Independent Motion in Crowds, CVPR, 2006.
[6] X. Wang et al., Learning Semantic Scene Models by
Trajectory Analysis, ECCV, 2006.
[7] S. Ali, M. Shah, Floor Fields for Tracking in High
Density Crowd Scenes, ECCV, 2008.
[8] B. D. Lucas and T. Kanade, An Iterative Image
Registration Technique with an Application to Stereo
Vision, IJCAI, 1981.
[9] D. Lowe, Distinctive Image Features from
Scale-Invariant Keypoints, Intl. J. of Computer Vision,
60(2):91-110, 2004.
[10] Y. Tsuduki, H. Fujiyoshi, A Method for Visualizing
Pedestrian Traffic Flow using SIFT, PSIVT, 2009.
[11] N. Ihaddadene, C. Djeraba, Real-time Crowd Motion
Analysis, ICPR, 2008.
[12] L. Zelnik-Manor, P. Perona, Self-Tuning Spectral
Clustering, In Adv. Neur. Inf. Proc. Sys.: 1601-1608,
2004.
(a) (b) (c) (d)
Figure 7. Experimental results.