Tracking Pedestrians Using Local Spatio-Temporal Motion Patterns in Extremely Crowded Scenes


Louis Kratz and Ko Nishino

IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012

Outline

- Motivation
- Introduction
- Proposed method
- Experimental results
- Conclusion

Motivation

Goal: track single or multiple pedestrians in crowded scenes

Solve conventional tracking problems
- Occlusion
- Pedestrians moving in different directions
- Appearance change

Introduction (1)

Observe a phenomenon

Observation

Instantaneous motions within a small area tend to repeat

- Temporally
- Spatially

Introduction (2)

Spatio-temporal motion pattern

- Describes crowd motion
- Build a spatial and temporal statistical model
- Use it to predict the movement of individuals

Spatio-temporal motion pattern

[Figure: spatio-temporal cuboid with axes x, y, and t]

Spatio-temporal motion pattern

3D gradient vector:

Calculate the mean motion vector or build a statistical model at each cuboid
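A minimal sketch of this step, assuming a cuboid is stored as a NumPy array of shape (T, H, W) with grayscale intensities. The function name `cuboid_motion_model` and the synthetic test cuboid are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def cuboid_motion_model(cuboid):
    """Return the mean and covariance of the per-pixel 3D gradients of a cuboid.

    cuboid: array of shape (T, H, W), i.e. axes ordered (t, y, x).
    """
    # np.gradient returns one derivative array per axis, in axis order (t, y, x).
    dt, dy, dx = np.gradient(cuboid.astype(float))
    # One 3D gradient vector [I_x, I_y, I_t] per pixel, stacked as rows.
    grads = np.stack([dx.ravel(), dy.ravel(), dt.ravel()], axis=1)
    mean = grads.mean(axis=0)          # mean motion vector of the cuboid
    cov = np.cov(grads, rowvar=False)  # 3x3 covariance of the gradients
    return mean, cov

# Synthetic 10x10x10 cuboid whose intensity is a linear ramp in x, y, and t,
# so every pixel has the same gradient.
t, y, x = np.meshgrid(np.arange(10), np.arange(10), np.arange(10), indexing="ij")
cuboid = 2.0 * x + 3.0 * y + 5.0 * t
mean, cov = cuboid_motion_model(cuboid)
```

The mean and covariance together give a 3D Gaussian summary of the motion inside one cuboid.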

Introduction (3)

Hidden Markov Model:

- States are not directly visible
- Comprises three components

① observation probabilities
② transition probabilities
③ initial probabilities

Introduction (4)

Posterior distribution:

given the observed evidence X, find the probability of the parameters
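For reference, the posterior follows from Bayes' rule. The symbol θ for the parameters is an assumed label; the slide does not name it.

```latex
p(\theta \mid X) = \frac{p(X \mid \theta)\, p(\theta)}{p(X)} \;\propto\; p(X \mid \theta)\, p(\theta)
```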

Introduction (5)

Particle filter: a filter that can be used to predict the next state

- Different from the Kalman filter: robust to nonlinear systems and can handle non-Gaussian noise

- Measurement:

Proposed method

Flow chart

(a) Divide the training video into spatio-temporal cuboids, calculate motion vectors, and build a statistical model for each motion pattern

(b) Train a collection of hidden Markov models

(c) Use observed local motion patterns to predict the motion patterns at each location

(d) Use the predicted motion patterns to track individuals

Step (a) - statistical model for motion patterns

1. First, calculate the motion vector at each pixel using the 3D gradient vector

2. Next, build a statistical model using a 3D Gaussian distribution

3. Define the local spatio-temporal motion pattern at location n and frame t

Step (b) - train hidden Markov models

1. Using a clustering algorithm, divide the motion patterns into S clusters
2. Define states {s = 1, …, S}, where S is the number of clusters
3. For a specific hidden state s, the probability of an observed motion pattern is:

Calculate the divergence between the two distributions
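One common way to compare two Gaussian distributions is the Kullback-Leibler (KL) divergence. The sketch below assumes each motion pattern is a 3D Gaussian given by a mean and covariance. The choice of KL divergence and the function name `kl_gaussian` are assumptions here, since the slide does not show the formula.

```python
import numpy as np

def kl_gaussian(mu0, cov0, mu1, cov1):
    """KL divergence KL(N(mu0, cov0) || N(mu1, cov1)) between two Gaussians."""
    k = mu0.shape[0]
    cov1_inv = np.linalg.inv(cov1)
    diff = mu1 - mu0
    # slogdet is numerically safer than log(det(...)) for small covariances.
    _, logdet0 = np.linalg.slogdet(cov0)
    _, logdet1 = np.linalg.slogdet(cov1)
    return 0.5 * (np.trace(cov1_inv @ cov0)
                  + diff @ cov1_inv @ diff
                  - k
                  + logdet1 - logdet0)

# Two unit-covariance patterns whose means differ by one unit along x.
observed_mean = np.zeros(3)
state_mean = np.array([1.0, 0.0, 0.0])
identity = np.eye(3)
same = kl_gaussian(observed_mean, identity, observed_mean, identity)
shifted = kl_gaussian(observed_mean, identity, state_mean, identity)
```

A small divergence means the observed pattern is close to the pattern represented by the hidden state. KL divergence is not symmetric, so the argument order matters.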

Step (c) - predict motion patterns

Take the expected value of the predictive distribution:

Solve using the forwards-backwards algorithm

Reference:

[23] L. Rabiner, "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition," Proc. IEEE, vol. 77, no. 2, pp. 257-286, Feb. 1989.
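The prediction step can be sketched with the forward recursion of an HMM. This simplified sketch uses only the forward pass, not the full forwards-backwards algorithm, and it assumes the emission probabilities have already been computed for each observation. The function name `predict_next_state` and the toy numbers are illustrative assumptions.

```python
import numpy as np

def predict_next_state(initial, transition, emission_probs):
    """Predict the distribution over the next hidden state.

    initial:        (S,) initial state probabilities
    transition:     (S, S), transition[i, j] = P(next state j | current state i)
    emission_probs: (T, S), probability of each observation under each state
    """
    # Forward recursion, normalized at each step to avoid underflow.
    alpha = initial * emission_probs[0]
    alpha = alpha / alpha.sum()
    for obs in emission_probs[1:]:
        alpha = (alpha @ transition) * obs
        alpha = alpha / alpha.sum()
    # One more transition gives the predictive distribution for the next step.
    return alpha @ transition

initial = np.array([0.5, 0.5])
transition = np.array([[0.9, 0.1],
                       [0.2, 0.8]])
# Two observations that can only be explained by state 0.
emission_probs = np.array([[1.0, 0.0],
                           [1.0, 0.0]])
next_state = predict_next_state(initial, transition, emission_probs)
# A predicted motion vector is the expectation over per-state mean vectors
# (hypothetical values for illustration).
state_means = np.array([[1.0, 0.0, 1.0],
                        [0.0, 1.0, 1.0]])
predicted_mean = next_state @ state_means
```

Taking the expectation over states, as in the last line, is one way to turn the predictive distribution into a single predicted motion pattern.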

Step (d) - track individuals

Use a particle filter to maximize the posterior distribution:

Compare to:

posterior ∝ likelihood × priors

P(x_f)

x_{f-1} = [x, y, w, h]^T in frame f-1

Figure: the state vector x_{f-1} defines a target window at frame f-1

Past and current measurements:

z_f is the frame at time f

Priors

We use the motion pattern at the center of the tracked target to estimate priors on the distribution of the next state x_f

Transition distribution

P(x_f | x_{f-1}) is the transition distribution

We model it with a normal distribution:

Parameters of the normal distribution:
- the 2D optical flow vector from the predicted motion pattern [27]
- the covariance matrix from the predicted motion pattern distribution

Reference:

[27] J. Wright and R. Pless, "Analysis of Persistent Motion Patterns Using the 3D Structure Tensor," Proc. IEEE Workshop Motion and Video Computing, pp. 14-19, 2005.
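A minimal sketch of one particle filter step that uses the optical flow vector and covariance as the transition prior. It assumes particles are rows [x, y, w, h], that only the window position moves, and that a likelihood function is supplied by the caller. The name `particle_filter_step` and the toy likelihood are hypothetical.

```python
import numpy as np

def particle_filter_step(particles, flow, flow_cov, likelihood, rng):
    """Propagate, weight, and resample particles for one frame.

    particles: (N, 4) float array, each row a state [x, y, w, h]
    flow:      (2,) predicted 2D optical flow vector
    flow_cov:  (2, 2) covariance of the predicted flow
    likelihood: function mapping one state to a positive score
    """
    n = len(particles)
    # Predict: shift each window position by a sample from N(flow, flow_cov).
    # Window size is kept fixed in this sketch.
    proposed = particles.copy()
    proposed[:, :2] += rng.multivariate_normal(flow, flow_cov, size=n)
    # Weight each proposed state by its likelihood.
    weights = np.array([likelihood(p) for p in proposed], dtype=float)
    weights /= weights.sum()
    # Weighted mean as the state estimate, then resample by weight.
    estimate = weights @ proposed
    idx = rng.choice(n, size=n, p=weights)
    return proposed[idx], estimate

rng = np.random.default_rng(0)
particles = np.tile(np.array([50.0, 50.0, 20.0, 40.0]), (200, 1))
flow = np.array([2.0, 0.0])
flow_cov = 0.25 * np.eye(2)

def toy_likelihood(state):
    # Hypothetical appearance score peaking at position (52, 50).
    return np.exp(-0.5 * ((state[0] - 52.0) ** 2 + (state[1] - 50.0) ** 2))

new_particles, estimate = particle_filter_step(
    particles, flow, flow_cov, toy_likelihood, rng)
```

Because samples are drawn around the predicted flow, particles concentrate where the crowd motion says the target is likely to go, and the likelihood then picks among them.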

Likelihood distribution

T: template of the human object
R: region of the bounding box at frame f
Z: constant
Variance: with respect to appearance change

Define a distance measure:

t_i: template gradient vector
r_i: region gradient vector
M: number of pixels in the template

If the distance is large, the likelihood is small

If the distance is small, the likelihood is large

Add weights to adapt to appearance change

Error accounting for appearance change

- Pixels in an occluded region have a large angle between t and r, so the error E_i is large
- When E_i is large, the weight becomes small
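The weighting idea can be sketched as follows. The slides do not show the exact distance or weight formulas, so everything here is an assumption: the per-pixel error is the angle between template and region gradient vectors, the weights decay as a Gaussian of that error, and the likelihood is a Gaussian of the weighted distance with the normalizing constant Z omitted. Names such as `weighted_likelihood` and `error_scale` are hypothetical.

```python
import numpy as np

def angular_errors(template_grads, region_grads, eps=1e-12):
    """Per-pixel angle in radians between two sets of gradient vectors, shape (M, 3)."""
    dots = np.sum(template_grads * region_grads, axis=1)
    norms = (np.linalg.norm(template_grads, axis=1)
             * np.linalg.norm(region_grads, axis=1))
    cos = np.clip(dots / (norms + eps), -1.0, 1.0)
    return np.arccos(cos)

def weighted_likelihood(template_grads, region_grads, sigma=0.5, error_scale=0.5):
    """Unnormalized likelihood: a small weighted distance gives a large likelihood."""
    errors = angular_errors(template_grads, region_grads)
    # Assumed weighting: a large error (e.g. an occluded pixel) gets a small weight.
    weights = np.exp(-errors ** 2 / (2.0 * error_scale ** 2))
    weights /= weights.sum()
    distance = np.sum(weights * errors)
    return np.exp(-distance ** 2 / (2.0 * sigma ** 2))

rng = np.random.default_rng(1)
template = rng.normal(size=(100, 3))
matching = weighted_likelihood(template, template)
opposite = weighted_likelihood(template, -template)
```

With identical gradients the likelihood is close to 1, and with reversed gradients it is close to 0, which matches the stated behavior of the distance measure.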

Experimental results

Implementation:

- Intel Xeon X5355 2.66 GHz processor
- 10 frames per second
- Cuboid size: 10 × 10 × 10

Datasets

From the UCF Crowd data set: 300, 350, 300, and 120 frames, respectively

(a) Train station concourse
(b) Ticket gate
(c) Sidewalk
(d) Intersection

Experiment 1

- White indicates high error
- Errors indicate areas with little texture or noise
- Errors in the intersection scene are due to the small amount of training data

Experiment 2

When occlusion is severe, the variance of the likelihood increases (at frames 56, 112, and 201)

Experiment 3

Experiment 4

Errors are caused by the initial states not containing this direction

Experiment 5

Experiment 6

Conclusion

We proposed an efficient method for tracking individuals in crowded scenes

We address the errors caused by occlusion, appearance change, and movement in different directions
