
Detecting Dominant Motion Flows in Unstructured/Structured Crowd Scenes

Ovgu Ozturk, Toshihiko Yamasaki, Kiyoharu Aizawa

    The University of Tokyo

    {ovgu, yamasaki, aizawa}@hal.t.u-tokyo.ac.jp

    Abstract

Detecting dominant motion flows in crowd scenes is one of the major problems in video surveillance. It is particularly difficult in unstructured crowd scenes, where the participants move randomly in various directions. This paper presents a novel method that uses SIFT feature flow vectors to calculate the dominant motion flows in both unstructured and structured crowd scenes. SIFT features can represent the characteristic parts of objects, allowing robust tracking under non-rigid motion. First, flow vectors of SIFT features are calculated at certain intervals to form a motion flow map of the video. Next, this map is divided into equally sized square regions, and in each region dominant motion flows are estimated by clustering the flow vectors. Then, the local dominant motion flows are combined to obtain the global dominant motion flows. Experimental results demonstrate the successful application of the proposed method to challenging real-world scenes.

    1. Introduction

Dominant motion patterns in videos provide significant information with a wide range of applications. Since motion patterns are formed by the individual or interacting motions of crowd members, they help to analyze the social behavior in the environment captured by the video. They are also useful for public-space design and for activity analysis for security purposes.

Over the years, there have been many studies that try to find motion patterns by tracking individual objects and classifying their trajectories. However, in real-world situations, high-density crowds are the most common case, and it is not always possible to track individual objects. Crowd scenes can be divided into two groups, unstructured and structured scenes, as in Figure 1. Structured crowds are those where the main motion tracks are defined by environmental conditions, such as elevators, crosswalks, etc. Unstructured crowds are those where objects can move freely in any direction, following any path. So far, only a few researchers have attempted to handle the complexity of crowd scenes that are structured, and detecting dominant motion flows in unstructured crowds still remains a challenging task.

To solve the problem of calculating the dominant motion flows in both unstructured and structured crowds, we propose a new approach with two distinctive contributions. First, our approach utilizes motion flows of the SIFT features in a scene. Unlike the corner-based features commonly used in other work, SIFT features can represent the characteristic parts of objects; therefore, their tracking consistency and accuracy are higher during complex motions. Second, we propose a hierarchical clustering framework to deal with the complexity of unstructured motion flows.

Figure 1. Examples of (a) structured and (b) unstructured crowd scenes.


The entire scene is divided into equally sized local regions. In each local region, flow vectors are classified into groups based on their orientation. Then, location-based classification is applied to find the spatial accumulation of the vectors. Finally, the local dominant motion flows are connected to obtain the global dominant motion flows.
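As a minimal sketch (the function name and interface below are illustrative, not those of an existing implementation), the partitioning of the flow map into equally sized square regions could look as follows in Python; the default 60x60-pixel region size is taken from the experiments in Section 4.

from collections import defaultdict

# Minimal sketch: bin flow vectors (x, y, theta, t, L) into equally sized square
# regions so that each region can be clustered independently.  The 60x60-pixel
# default follows the experimental setup described in Section 4.
def bin_flows_into_regions(flow_vectors, region_size=60):
    """Map each (row, col) grid cell to the flow vectors whose center falls inside it."""
    cells = defaultdict(list)
    for x, y, theta, t, length in flow_vectors:
        cells[(int(y // region_size), int(x // region_size))].append((x, y, theta, t, length))
    return cells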

1.1. Related Work

Tracking individual objects and constructing their trajectories is a common approach to finding global motion flows, as in [1, 6]. However, for crowd videos, continuous tracking of individual objects is not possible because of occlusions and tracking failures. Another approach is to employ instantaneous flow vectors of image features over the entire image [3-5, 11]. These methods use corner-based features, but such features are not reliable under non-rigid motion, affine transformations, or noise. Hence, these studies consider only structured motions and do not work for unstructured crowds. In [4], neighborhood information is used, but it fails when a region contains flows in multiple directions that cancel each other out. In [7], floor fields are proposed, which are applicable to structured crowds. Only the work in [2] considers unstructured crowd scenes, where individual targets are tracked.

    2. Generating SIFT Feature Flows

In this paper, SIFT features are used to calculate the motion flows. SIFT features are known to be among the features most robust to various transformations, and they can be used to track foreground objects continuously over many frames. Thus, instead of calculating the motion flows at each frame, we track the features at certain intervals. This provides two advantages. First, it reduces the noise coming from the background and from unstable points. Second, the computed motion flow vectors can be used directly without any pre- or post-processing.

Each video is segmented into intervals of length d. SIFT features extracted in a frame are matched to the corresponding features in the frame that follows after the interval d. The displacement vectors of the features that exceed a certain threshold are defined as flow vectors. Figure 2 depicts the flow vectors. A flow vector is represented as F(x, y, θ, t, L), where:

x, y : center of mass
θ : orientation
L : length
t : frame number
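Although the paper does not give an implementation, the flow-vector extraction could be sketched with OpenCV's SIFT as follows; the brute-force matcher, the ratio test, and the length threshold value are assumptions made only for illustration.

import cv2
import numpy as np

sift = cv2.SIFT_create()
matcher = cv2.BFMatcher(cv2.NORM_L2)

def sift_flow_vectors(frame_a, frame_b, t, min_length=2.0, ratio=0.75):
    """Return flow vectors F = (x, y, theta, t, L) between two frames an interval d apart."""
    kp_a, des_a = sift.detectAndCompute(frame_a, None)
    kp_b, des_b = sift.detectAndCompute(frame_b, None)
    flows = []
    if des_a is None or des_b is None:
        return flows
    for pair in matcher.knnMatch(des_a, des_b, k=2):
        if len(pair) < 2:
            continue
        m, n = pair
        if m.distance > ratio * n.distance:      # Lowe's ratio test (assumed, not stated in the paper)
            continue
        xa, ya = kp_a[m.queryIdx].pt
        xb, yb = kp_b[m.trainIdx].pt
        dx, dy = xb - xa, yb - ya
        length = np.hypot(dx, dy)
        if length < min_length:                  # keep only displacements above the threshold
            continue
        x, y = (xa + xb) / 2.0, (ya + yb) / 2.0  # center of mass of the displacement
        theta = np.arctan2(dy, dx)               # orientation
        flows.append((x, y, theta, t, length))
    return flows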

Figure 3(a) shows an unstructured crowd scene, and the motion flow map of the region inside the white square is depicted in Figure 3(b). Motion flows are calculated over 400 frames with an interval length of 3. Accumulations of flow vectors can be seen along certain orientations. However, as the variety of orientations in a region increases, the flow map becomes very complicated, and when the entire scene is considered, the amount of data and the complexity grow further. In this case, the common clustering methods in the literature [3] do not work effectively. We therefore introduce a hierarchical clustering method to detect the dominant motion flows in each region, which is explained in the next section.

    3. Calculating Dominant Motion Flows

Detecting dominant motion flows is defined as finding the orientation and spatial distribution of the most frequently followed paths in a scene during a given period. If the motion of the objects in a video is organized, then a single orientation can be assigned to each location. However, in crowd videos, especially unstructured crowds, participants move in various directions at different times, so each spatial location holds more than one orientation type depending on the time. It is not possible to find the dominant flows with existing methods [3, 4, 11]. In this work, the entire scene is divided into smaller regions, in which the flow vectors are easier to separate into meaningful groups. Then, the flow vectors in each region are clustered with a two-step hierarchical approach to find the local dominant motion flows. Figure 4 shows the hierarchical clustering steps.

    Figure 2. SIFT motion flow vectors.

Figure 3. (a) An unstructured crowd scene. (b) SIFT flows in the marked region over 400 frames.



Finally, the local dominant motion flows are connected to compute the global dominant motion flows.

3.1. Hierarchical Clustering of Flow Vectors

Orientation information is the most significant cue for classifying the flow vectors. In each local region, the flow vectors are first classified into one of four main orientation groups; Figure 4 shows the grouping of orientations. To achieve this, an orientation histogram is calculated and the major groups are chosen to represent the region. For example, in Figure 5(b) there are two orientation groups, depicted in blue and green. The second step is spatial clustering: the flow vectors in each orientation group are clustered based on their location, so that accumulations of vectors within the region are detected, as in Figure 5(c). For this step, the Self-Tuning Spectral Clustering method is applied, following the evaluation results in [3].

After clustering, the local dominant motion flows are obtained by computing the average location, the average orientation, and the total number of flow vectors in each group. The local dominant motion flow for each group is thus described as L(x, y, w, θ), where w is the number of vectors and is depicted by the width of the flow vector. Figure 5(d) shows three dominant motion flows calculated in the region.
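A minimal Python sketch of this two-step clustering within one region is given below. The paper applies the Self-Tuning Spectral Clustering of [12] for the spatial step; here scikit-learn's SpectralClustering with a fixed cluster count stands in for it, and the orientation-group boundaries, minimum vote count, and cluster count are illustrative assumptions.

import numpy as np
from sklearn.cluster import SpectralClustering

def orientation_group(theta):
    """Map an angle in radians to one of four 90-degree orientation groups
    (an approximation of the grouping shown in Figure 4)."""
    return int(((theta + np.pi) // (np.pi / 2)) % 4)

def local_dominant_flows(flow_vectors, n_spatial=2, min_votes=5):
    """flow_vectors: rows (x, y, theta, t, L) inside one region.
    Returns local dominant flows L = (x, y, w, theta)."""
    F = np.asarray(flow_vectors, dtype=float)
    groups = np.array([orientation_group(a) for a in F[:, 2]])
    dominant = []
    for g in range(4):
        Fg = F[groups == g]
        if len(Fg) < min_votes:                      # skip weakly supported orientation groups
            continue
        # spatial clustering on (x, y); stand-in for Self-Tuning Spectral Clustering [12]
        labels = SpectralClustering(n_clusters=n_spatial,
                                    affinity="nearest_neighbors",
                                    n_neighbors=min(10, len(Fg) - 1),
                                    assign_labels="discretize").fit_predict(Fg[:, :2])
        for c in np.unique(labels):
            Fc = Fg[labels == c]
            x, y = Fc[:, 0].mean(), Fc[:, 1].mean()          # average location
            theta = np.arctan2(np.sin(Fc[:, 2]).mean(),
                               np.cos(Fc[:, 2]).mean())      # circular mean orientation
            dominant.append((x, y, len(Fc), theta))          # L(x, y, w, theta)
    return dominant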

    3.2. Combining Local Dominant Flows

Once the main flows in the local regions are detected, the next question is how to combine them to obtain the global motion flows. The basic idea is to start from one side of the scene, follow the local flows, and connect each one to the most probable neighboring flow until the end of the scene. In other words, the entire scene is first scanned horizontally to connect the horizontal flows, and then scanned vertically to connect the vertical flows. Orientation groups II and III are treated as horizontal flows, whereas groups I and IV are vertical flows. The algorithm is as follows:

While scanning, for each local motion flow:

1. Determine the neighbor cells, Ns.
2. In each N, search for the motion flows that are in the same orientation group.
3. Choose the closest one in the neighborhood and connect it with the current flow.
4. If there are no motion flows of the same orientation group in the neighbor cells or the next neighbor cells, choose the motion flow that is the closest.

Neighbor cells are defined as the two regions that lie in the direction of the current flow. For example, in Figure 6(a), for the horizontal vector, the neighbor cells are c, e and the next neighbor cells are c, e. In Figure 6, the vectors labeled A are in orientation group II: A1 is connected to A3, and A2 and A3 are connected to A4, so together they form the global flow shown with the bold gray line. If there are no vectors in the neighbor and next neighbor cells, the flow is connected to the closest vector to preserve continuity; this indicates an abrupt change of the dominant motion orientation in that region. For example, if A4 were absent, A3 would be connected to B1.
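A concrete sketch of this scan is given below (Python); the per-cell data layout, the choice of candidate cells, and the distance measure are our own simplifications for illustration, not the paper's exact definitions.

import numpy as np

def connect_flows(cell_flows, horizontal=True):
    """cell_flows: dict mapping (row, col) -> list of local flows (x, y, w, theta, group).
    Returns links as pairs ((cell_a, index_a), (cell_b, index_b))."""
    dr, dc = (0, 1) if horizontal else (1, 0)            # scan direction
    links = []
    for (r, c) in sorted(cell_flows):
        # candidate cells: one step and two steps ahead in the scan direction (simplified neighborhood)
        candidates = [(r + dr, c + dc), (r + 2 * dr, c + 2 * dc)]
        for i, (x, y, w, theta, group) in enumerate(cell_flows[(r, c)]):
            best, best_dist, best_same = None, np.inf, False
            for cell in candidates:
                for j, (x2, y2, w2, theta2, group2) in enumerate(cell_flows.get(cell, [])):
                    dist = np.hypot(x2 - x, y2 - y)
                    same = (group2 == group)
                    # prefer flows of the same orientation group; among equals keep the closest
                    if (same and not best_same) or (same == best_same and dist < best_dist):
                        best, best_dist, best_same = (cell, j), dist, same
            if best is not None:
                links.append((((r, c), i), best))
    return links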

Figure 4. Hierarchical clustering.

Figure 5. Hierarchical clustering: (b) orientation-based clustering, (c) spatial clustering, (d) local motion flows.

Figure 6. Connecting the local flows.


    4. Experimental Results and Discussion

In our experiments, crowd data sets are taken from the datasets of the University of Central Florida [4] to allow a comparison with related work. Figure 7(a) shows the input scenes and the SIFT flows, Figure 7(b) shows the results of our method as detailed lines, Figure 7(c) shows the results as thick lines after combining the groups into one group per global flow, and Figure 7(d) shows the ground truth, drawn from the averaged result of a user study. The image size for the two sets is 360x480 pixels, and the local regions are 60x60 pixels, giving 48 (6x8) local regions in total.

The set at the top is from the neighborhood of an escalator, which is a structured crowd example. The video is analyzed between frames 100 and 460 with an interval of three. Most of the people move on the escalators, while the people at the far end of the escalators walk freely. The proposed method successfully detects the global motion flows in the free-motion regions as well as the flows along the escalators.

The set at the bottom is from a street, which is an unstructured crowd example of high complexity. The video is analyzed between frames 140 and 460 with an interval length of three. In Figure 7(b), the local regions and the connections of the local motion flows can be seen. For the street scene, our system captures the parallelism in the upper half of the scene, and the crossing of the motion flows is also detected in the lower part. In addition, three main vertical motion flows are detected, shown in purple in Figure 7(b). With the proposed approach, dominant motion flows can be detected at various levels: general dominant flow maps can be provided as in Figure 7(c), or, if necessary, a local analysis of the flows can be obtained as in Figure 7(b).

5. Conclusions

In this work, we have presented a new approach to the problem of calculating dominant motion flows in various crowd scenes. By using SIFT feature flows and a hierarchical clustering approach, it becomes possible to analyze the motion flows of both unstructured and structured crowds. The proposed approach can detect global motion flows and, at the same time, provide information about the local characteristics of the motion flows.

References

[1] F. M. Porikli, "Trajectory Pattern Detection by HMM Parameter Space Features and Eigenvector Clustering," ECCV, 2004.

[2] M. Rodriguez, S. Ali, and T. Kanade, "Tracking in Unstructured Crowded Scenes," ICCV, 2009.

[3] G. Eibl and N. Brandle, "Evaluation of Clustering Methods for Finding Dominant Optical Flow Fields in Crowded Scenes," ICPR, 2008.

[4] M. Hu, S. Ali, and M. Shah, "Detecting Global Motion Patterns in Complex Videos," ICPR, 2008.

[5] G. Brostow and R. Cipolla, "Unsupervised Bayesian Detection of Independent Motion in Crowds," CVPR, 2006.

[6] X. Wang et al., "Learning Semantic Scene Models by Trajectory Analysis," ECCV, 2006.

[7] S. Ali and M. Shah, "Floor Fields for Tracking in High Density Crowd Scenes," ECCV, 2008.

[8] B. D. Lucas and T. Kanade, "An Iterative Image Registration Technique with an Application to Stereo Vision," IJCAI, 1981.

[9] D. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints," Intl. J. of Computer Vision, 60(2):91-110, 2004.

[10] Y. Tsuduki and H. Fujiyoshi, "A Method for Visualizing Pedestrian Traffic Flow Using SIFT," PSIVT, 2009.

[11] N. Ihaddadene and C. Djeraba, "Real-time Crowd Motion Analysis," ICPR, 2008.

[12] L. Zelnik-Manor and P. Perona, "Self-Tuning Spectral Clustering," Adv. Neur. Inf. Proc. Sys., pp. 1601-1608, 2004.

Figure 7. Experimental results: (a) input scenes and SIFT flows; (b) local dominant flows and their connections; (c) global dominant flows; (d) ground truth.
