
Detecting Dominant Motion Flows in Unstructured/Structured Crowd Scenes

Ovgu Ozturk, Toshihiko Yamasaki, Kiyoharu Aizawa

    The University of Tokyo

    {ovgu, yamasaki, aizawa}@hal.t.u-tokyo.ac.jp

    Abstract

Detecting dominant motion flows in crowd scenes is one of the major problems in video surveillance. It is particularly difficult in unstructured crowd scenes, where the participants move randomly in various directions. This paper presents a novel method that uses SIFT feature flow vectors to calculate the dominant motion flows in both unstructured and structured crowd scenes. SIFT features can represent the characteristic parts of objects, allowing robust tracking under non-rigid motion. First, flow vectors of SIFT features are calculated at certain intervals to form a motion flow map of the video. Next, this map is divided into equally sized square regions, and in each region dominant motion flows are estimated by clustering the flow vectors. Then, the local dominant motion flows are combined to obtain the global dominant motion flows. Experimental results demonstrate the successful application of the proposed method to challenging real-world scenes.

    1. Introduction

Dominant motion patterns in videos provide significant information with a wide range of applications. Since motion patterns are formed by the individual or interacting motions of crowd members, they help to analyze the social behavior in the environment captured by the video. They are also useful for public-space design and for activity analysis for security purposes.

Over the years, there have been many studies that try to find motion patterns by tracking individual objects and classifying their trajectories. However, in real-world situations, high-density crowds are the most common case, and it is not always possible to track individual objects. Crowd scenes can be divided into two groups, unstructured and structured scenes, as in Figure 1. Structured crowds are those where the main motion tracks are defined by environmental conditions, such as elevators, crosswalks, etc. Unstructured crowds are those where objects can move freely in any direction, following any path. So far, only a few researchers have attempted to handle the complexity of crowd scenes that are structured, and detecting dominant motion flows in unstructured crowds still remains a challenging task.

To solve the problem of calculating the dominant motion flows in both unstructured and structured crowds, we propose a new approach with two distinctive contributions. First, our approach utilizes motion flows of the SIFT features in a scene. Unlike the corner-based features commonly used in other work, SIFT features can represent the characteristic parts of objects; therefore, their tracking consistency and accuracy are higher during complex motions. Second, we propose a hierarchical clustering framework to deal with the complexity of unstructured motion flows.

Figure 1. Examples of (a) structured and (b) unstructured crowd scenes.


The entire scene is divided into equally sized local regions. In each local region, flow vectors are classified into groups based on their orientation. Then, location-based classification is applied to find the spatial accumulation of the vectors. Finally, the local dominant motion flows are connected to obtain the global dominant motion flows.
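As a minimal sketch (the function name and interface below are illustrative, not those of an existing implementation), the partitioning of the flow map into equally sized square regions could look as follows in Python; the default 60x60-pixel region size is taken from the experiments in Section 4.

from collections import defaultdict

# Minimal sketch: bin flow vectors (x, y, theta, t, L) into equally sized square
# regions so that each region can be clustered independently.  The 60x60-pixel
# default follows the experimental setup described in Section 4.
def bin_flows_into_regions(flow_vectors, region_size=60):
    """Map each (row, col) grid cell to the flow vectors whose center falls inside it."""
    cells = defaultdict(list)
    for x, y, theta, t, length in flow_vectors:
        cells[(int(y // region_size), int(x // region_size))].append((x, y, theta, t, length))
    return cells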

1.1. Related Work

Tracking individual objects and constructing their trajectories is a common approach to finding global motion flows, as in [1, 6]. However, for crowd videos, continuous tracking of individual objects is not possible because of occlusions and tracking failures. Another approach is to employ instantaneous flow vectors of image features over the entire image [3-5, 11]. These methods use corner-based features, but such features are not reliable under non-rigid motion, affine transformations, or noise. Hence, these studies consider only structured motions and do not work for unstructured crowds. In [4], neighborhood information is used, but it fails when a region contains flows in multiple directions that cancel each other out. In [7], floor fields are proposed, which are applicable to structured crowds. Only the work in [2] considers unstructured crowd scenes, where individual targets are tracked.

    2. Generating SIFT Feature Flows

In this paper, SIFT features are used to calculate the motion flows. SIFT features are known to be among the features most robust to various transformations, and they can be used to track foreground objects continuously over many frames. Thus, instead of calculating the motion flows at each frame, we track the features at certain intervals. This provides two advantages. First, it reduces the noise coming from the background and from unstable points. Second, the computed motion flow vectors can be used directly without any pre- or post-processing.

Each video is segmented into intervals of length d. SIFT features extracted in a frame are matched to the corresponding features in the frame that follows after the interval d. The displacement vectors of the features that exceed a certain threshold are defined as flow vectors. Figure 2 depicts the flow vectors. A flow vector is represented as F(x, y, θ, t, L), where:

x, y : center of mass
θ : orientation
L : length
t : frame number
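Although the paper does not give an implementation, the flow-vector extraction could be sketched with OpenCV's SIFT as follows; the brute-force matcher, the ratio test, and the length threshold value are assumptions made only for illustration.

import cv2
import numpy as np

sift = cv2.SIFT_create()
matcher = cv2.BFMatcher(cv2.NORM_L2)

def sift_flow_vectors(frame_a, frame_b, t, min_length=2.0, ratio=0.75):
    """Return flow vectors F = (x, y, theta, t, L) between two frames an interval d apart."""
    kp_a, des_a = sift.detectAndCompute(frame_a, None)
    kp_b, des_b = sift.detectAndCompute(frame_b, None)
    flows = []
    if des_a is None or des_b is None:
        return flows
    for pair in matcher.knnMatch(des_a, des_b, k=2):
        if len(pair) < 2:
            continue
        m, n = pair
        if m.distance > ratio * n.distance:      # Lowe's ratio test (assumed, not stated in the paper)
            continue
        xa, ya = kp_a[m.queryIdx].pt
        xb, yb = kp_b[m.trainIdx].pt
        dx, dy = xb - xa, yb - ya
        length = np.hypot(dx, dy)
        if length < min_length:                  # keep only displacements above the threshold
            continue
        x, y = (xa + xb) / 2.0, (ya + yb) / 2.0  # center of mass of the displacement
        theta = np.arctan2(dy, dx)               # orientation
        flows.append((x, y, theta, t, length))
    return flows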

Figure 3(a) shows an unstructured crowd scene, and the motion flow map of the region inside the white square is depicted in Figure 3(b). Motion flows are calculated over 400 frames with an interval length of 3. Accumulations of flow vectors can be seen along certain orientations. However, as the variety of orientations in a region increases, the flow map becomes very complicated, and when the entire scene is considered, the amount of data and the complexity grow further. In this case, the common clustering methods in the literature [3] do not work effectively. We therefore introduce a hierarchical clustering method to detect the dominant motion flows in each region, which is explained in the next section.

    3. Calculating Dominant Motion Flows

Detecting dominant motion flows is defined as finding the orientation and spatial distribution of the most frequently followed paths in a scene during a given period. If the motion of the objects in a video is organized, then a single orientation can be assigned to each location. However, in crowd videos, especially unstructured crowds, participants move in various directions at different times, so each spatial location holds more than one orientation type depending on the time. It is not possible to find the dominant flows with existing methods [3, 4, 11]. In this work, the entire scene is divided into smaller regions, in which the flow vectors are easier to separate into meaningful groups. Then, the flow vectors in each region are clustered with a two-step hierarchical approach to find the local dominant motion flows. Figure 4 shows the hierarchical clustering steps.

    Figure 2. SIFT motion flow vectors.

Figure 3. (a) An unstructured crowd scene. (b) SIFT flows in the marked region over 400 frames.



Finally, the local dominant motion flows are connected to compute the global dominant motion flows.

3.1. Hierarchical Clustering of Flow Vectors

Orientation information is the most significant cue for classifying the flow vectors. In each local region, the flow vectors are first classified into one of four main orientation groups; Figure 4 shows the grouping of orientations. To achieve this, an orientation histogram is calculated and the major groups are chosen to represent the region. For example, in Figure 5(b) there are two orientation groups, depicted in blue and green. The second step is spatial clustering: the flow vectors in each orientation group are clustered based on their location, so that accumulations of vectors within the region are detected, as in Figure 5(c). For this step, the Self-Tuning Spectral Clustering method is applied, following the evaluation results in [3].

After clustering, the local dominant motion flows are obtained by computing the average location, the average orientation, and the total number of flow vectors in each group. The local dominant motion flow for each group is thus described as L(x, y, w, θ), where w is the number of vectors and is depicted by the width of the flow vector. Figure 5(d) shows three dominant motion flows calculated in the region.
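A minimal Python sketch of this two-step clustering within one region is given below. The paper applies the Self-Tuning Spectral Clustering of [12] for the spatial step; here scikit-learn's SpectralClustering with a fixed cluster count stands in for it, and the orientation-group boundaries, minimum vote count, and cluster count are illustrative assumptions.

import numpy as np
from sklearn.cluster import SpectralClustering

def orientation_group(theta):
    """Map an angle in radians to one of four 90-degree orientation groups
    (an approximation of the grouping shown in Figure 4)."""
    return int(((theta + np.pi) // (np.pi / 2)) % 4)

def local_dominant_flows(flow_vectors, n_spatial=2, min_votes=5):
    """flow_vectors: rows (x, y, theta, t, L) inside one region.
    Returns local dominant flows L = (x, y, w, theta)."""
    F = np.asarray(flow_vectors, dtype=float)
    groups = np.array([orientation_group(a) for a in F[:, 2]])
    dominant = []
    for g in range(4):
        Fg = F[groups == g]
        if len(Fg) < min_votes:                      # skip weakly supported orientation groups
            continue
        # spatial clustering on (x, y); stand-in for Self-Tuning Spectral Clustering [12]
        labels = SpectralClustering(n_clusters=n_spatial,
                                    affinity="nearest_neighbors",
                                    n_neighbors=min(10, len(Fg) - 1),
                                    assign_labels="discretize").fit_predict(Fg[:, :2])
        for c in np.unique(labels):
            Fc = Fg[labels == c]
            x, y = Fc[:, 0].mean(), Fc[:, 1].mean()          # average location
            theta = np.arctan2(np.sin(Fc[:, 2]).mean(),
                               np.cos(Fc[:, 2]).mean())      # circular mean orientation
            dominant.append((x, y, len(Fc), theta))          # L(x, y, w, theta)
    return dominant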

    3.2. Combining Local Dominant Flows

Once the main flows in the local regions are detected, the next question is how to combine them to obtain the global motion flows. The basic idea is to start from one side of the scene, follow the local flows, and connect each one to the most probable neighboring flow until the end of the scene. In other words, the entire scene is first scanned horizontally to connect the horizontal flows, and then scanned vertically to connect the vertical flows. Orientation groups II and III are treated as horizontal flows, whereas groups I and IV are vertical flows. The algorithm is as follows:

While scanning, for each local motion flow:

1. Determine the neighbor cells, Ns.
2. In each N, search for the motion flows that are in the same orientation group.
3. Choose the closest one in the neighborhood and connect it with the current flow.
4. If there are no motion flows of the same orientation group in the neighbor cells or the next neighbor cells, choose the motion flow that is the closest.

Neighbor cells are defined as the two regions that lie in the direction of the current flow. For example, in Figure 6(a), for the horizontal vector, the neighbor cells are c, e and the next neighbor cells are c, e. In Figure 6, the vectors labeled A are in orientation group II: A1 is connected to A3, and A2 and A3 are connected to A4, so together they form the global flow shown with the bold gray line. If there are no vectors in the neighbor and next neighbor cells, the flow is connected to the closest vector to preserve continuity; this indicates an abrupt change of the dominant motion orientation in that region. For example, if A4 were absent, A3 would be connected to B1.
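A concrete sketch of this scan is given below (Python); the per-cell data layout, the choice of candidate cells, and the distance measure are our own simplifications for illustration, not the paper's exact definitions.

import numpy as np

def connect_flows(cell_flows, horizontal=True):
    """cell_flows: dict mapping (row, col) -> list of local flows (x, y, w, theta, group).
    Returns links as pairs ((cell_a, index_a), (cell_b, index_b))."""
    dr, dc = (0, 1) if horizontal else (1, 0)            # scan direction
    links = []
    for (r, c) in sorted(cell_flows):
        # candidate cells: one step and two steps ahead in the scan direction (simplified neighborhood)
        candidates = [(r + dr, c + dc), (r + 2 * dr, c + 2 * dc)]
        for i, (x, y, w, theta, group) in enumerate(cell_flows[(r, c)]):
            best, best_dist, best_same = None, np.inf, False
            for cell in candidates:
                for j, (x2, y2, w2, theta2, group2) in enumerate(cell_flows.get(cell, [])):
                    dist = np.hypot(x2 - x, y2 - y)
                    same = (group2 == group)
                    # prefer flows of the same orientation group; among equals keep the closest
                    if (same and not best_same) or (same == best_same and dist < best_dist):
                        best, best_dist, best_same = (cell, j), dist, same
            if best is not None:
                links.append((((r, c), i), best))
    return links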

Figure 4. Hierarchical clustering.

Figure 5. Hierarchical clustering: (b) orientation-based clustering, (c) spatial clustering, (d) local motion flows.

Figure 6. Connecting the local flows.


    4. Experimental Results and Discussion

In our experiments, crowd data sets are taken from the datasets of the University of Central Florida [4] to allow a comparison with related work. Figure 7(a) shows the input scenes and the SIFT flows, Figure 7(b) shows the results of our method as detailed lines, Figure 7(c) shows the results as thick lines after combining the groups into one group per global flow, and Figure 7(d) shows the ground truth, drawn from the averaged result of a user study. The image size for the two sets is 360x480 pixels, and the local regions are 60x60 pixels, giving 48 (6x8) local regions in total.

The set at the top is from the neighborhood of an escalator, which is a structured crowd example. The video is analyzed between frames 100 and 460 with an interval of three. Most of the people move on the escalators, while the people at the far end of the escalators walk freely. The proposed method successfully detects the global motion flows in the free-motion regions as well as the flows along the escalators.

The set at the bottom is from a street, which is an unstructured crowd example of high complexity. The video is analyzed between frames 140 and 460 with an interval length of three. In Figure 7(b), the local regions and the connections of the local motion flows can be seen. For the street scene, our system captures the parallelism in the upper half of the scene, and the crossing of the motion flows is also detected in the lower part. In addition, three main vertical motion flows are detected, shown in purple in Figure 7(b). With the proposed approach, dominant motion flows can be detected at various levels: general dominant flow maps can be provided as in Figure 7(c), or, if necessary, a local analysis of the flows can be obtained as in Figure 7(b).

5. Conclusions

In this work, we have presented a new approach to the problem of calculating dominant motion flows in various crowd scenes. By using SIFT feature flows and a hierarchical clustering approach, it becomes possible to analyze the motion flows of both unstructured and structured crowds. The proposed approach can detect global motion flows and, at the same time, provide information about the local characteristics of the motion flows.

References

[1] F. M. Porikli, "Trajectory Pattern Detection by HMM Parameter Space Features and Eigenvector Clustering," ECCV, 2004.

[2] M. Rodriguez, S. Ali, and T. Kanade, "Tracking in Unstructured Crowded Scenes," ICCV, 2009.

[3] G. Eibl and N. Brandle, "Evaluation of Clustering Methods for Finding Dominant Optical Flow Fields in Crowded Scenes," ICPR, 2008.

[4] M. Hu, S. Ali, and M. Shah, "Detecting Global Motion Patterns in Complex Videos," ICPR, 2008.

[5] G. Brostow and R. Cipolla, "Unsupervised Bayesian Detection of Independent Motion in Crowds," CVPR, 2006.

[6] X. Wang et al., "Learning Semantic Scene Models by Trajectory Analysis," ECCV, 2006.

[7] S. Ali and M. Shah, "Floor Fields for Tracking in High Density Crowd Scenes," ECCV, 2008.

[8] B. D. Lucas and T. Kanade, "An Iterative Image Registration Technique with an Application to Stereo Vision," IJCAI, 1981.

[9] D. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints," Intl. J. of Computer Vision, 60(2):91-110, 2004.

[10] Y. Tsuduki and H. Fujiyoshi, "A Method for Visualizing Pedestrian Traffic Flow Using SIFT," PSIVT, 2009.

[11] N. Ihaddadene and C. Djeraba, "Real-time Crowd Motion Analysis," ICPR, 2008.

[12] L. Zelnik-Manor and P. Perona, "Self-Tuning Spectral Clustering," Adv. Neur. Inf. Proc. Sys., pp. 1601-1608, 2004.

Figure 7. Experimental results: (a) input scenes and SIFT flows; (b) local dominant flows and their connections; (c) global dominant flows; (d) ground truth.
