Bilinear models for action and identity recognition
Oxford Brookes Vision Group
26/01/2009
Fabio Cuzzolin
Bilinear models for invariant gaitID
The identity recognition problem
View-invariance in gaitID
Bilinear models
HMMs and a three-layer model
Four experiments on the Mobo database
Identity recognition from gait
biometrics increasingly popular
cooperative methods: face recognition, retinal analysis
surveillance context: non-cooperative users
the problem: recognizing the identity of humans from their gait
methods: dimensionality reduction, silhouette analysis
issues: nuisance factors, viewpoint dependence
A brief review
gait signatures: silhouettes [Collins 02, Wang 03], optical flow, velocity moments, shape symmetry, static body parameters
“baseline” algorithm [Sarkar 05]: computes similarity scores between a probe sequence and each gallery (training) sequence by pairwise frame correlation
methodologies: mostly pattern recognition after dimensionality reduction
eigenspaces [Abdelkader 01], PCA/MDA [Tolliver 03, Han 04]
stochastic models (HMMs): [Kale 02, Debrunner 00]
KL-divergence between Markov models
The view-invariance issue
many different nuisance factors are involved: viewpoint, illumination, clothes, shoes, carried objects, trajectory
big issue: view-invariance
possible approaches: 3D tracking, virtual view reconstruction, static body parameters
Approaches to view-invariant gait ID
[Cunado 99]: “Evidence gathering” technique: coupled oscillators, Fourier description, inclination of thigh and leg
[Urtasun,Fua 04]: fitting 3D temporal motion models to synchronized video sequences
Motion parameters: coefficients of the singular value decomposition of the estimated model angles
[Bhanu,Han 02] matching a 3D kinematic model to 2D silhouettes
extracting a number of feature angles from the fitted model
[Kale 03]: synthetic side-view of the moving person using a single camera
[Shakhnarovich 01]: view-normalization from volumetric intersection of the visual hulls
[Johnson, Bobick 01]: static body parameters recovered across multiple views
Bilinear models
From view-invariance to “style” invariance
motions usually possess several labels: action, identity, viewpoint, emotional state, etc.
Bilinear models (Tenenbaum) can be used to separate the influence of two of those factors, called “style” and “content” (the label to classify)
$y^{SC} = A^S b^C$
$y^{SC}$ is a training set of k-dimensional observations with labels S and C
$b^C$ is a parameter vector representing content, while $A^S$ is a style-specific linear map mapping the content space onto the observation space
Bilinear models
the “content” (identity, action) of an observation can be thought of as a vector $b^C$ in an abstract “content space” of some dimension J
observations $y^{SC}$ are then derived from the content vector linearly, through a map $A^S$ which depends on the “style” parameter S
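As a rough illustration of the generative side of the model, the sketch below builds an observation as the product of a style-specific map and a content vector. The dimensions, the random parameters, and the helper name `render` are all assumptions made for the example; they do not come from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumed): K-dimensional observations, J-dimensional content space.
K, J = 6, 3
n_styles, n_contents = 2, 4

# One K x J linear map A^S per style, one J-vector b^C per content class.
A = rng.normal(size=(n_styles, K, J))
b = rng.normal(size=(n_contents, J))

def render(style, content):
    """Return the observation y^{SC} = A^S b^C for the given labels (hypothetical helper)."""
    return A[style] @ b[content]

y = render(style=1, content=2)
```

Changing `style` while keeping `content` fixed changes only the linear map, which is the sense in which the two factors are separated.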
Learning an asymmetric bilinear model
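given an observation sequence $y^{SC}$ ...
... an asymmetric bilinear model can be fitted to the data through the SVD $Y = U S V'$ of a stacked observation matrix (here the S in the SVD is the diagonal matrix of singular values)
$$Y = \begin{bmatrix} y^{11} & \cdots & y^{1C} \\ \vdots & \ddots & \vdots \\ y^{S1} & \cdots & y^{SC} \end{bmatrix}$$
the asymmetric model can be written as $Y = AB$, where $A = [A^1, \dots, A^S]'$ (the style maps stacked vertically) and $B = [b^1, \dots, b^C]$
least-squares optimal style and content parameters are
$$A = [US]_{\mathrm{col}=1,\dots,J}, \qquad B = [V']_{\mathrm{row}=1,\dots,J}$$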
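The SVD fit described on this slide can be sketched in a few lines of NumPy. This is a minimal sketch rather than the authors' code: the function name `fit_asymmetric_bilinear`, the synthetic data, and the chosen sizes are assumptions, and the stacked matrix is taken to have one block of K rows per style and one column per content class.

```python
import numpy as np

def fit_asymmetric_bilinear(Y, J):
    """Fit Y ~ A B by truncated SVD (hypothetical helper).

    Y : (S*K, C) stacked observation matrix, one K-row block per style.
    J : dimension of the content space.
    Returns A (S*K, J), the first J columns of U S, and B (J, C), the first J rows of V'.
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    A = U[:, :J] * s[:J]   # scale each of the first J columns of U by its singular value
    B = Vt[:J, :]
    return A, B

# Synthetic check on data that is exactly rank J (assumed sizes).
rng = np.random.default_rng(0)
S, K, C, J = 3, 5, 4, 2
Y = rng.normal(size=(S * K, J)) @ rng.normal(size=(J, C))

A, B = fit_asymmetric_bilinear(Y, J)
A_styles = A.reshape(S, K, J)   # A_styles[i] is the K x J map for style i
```

For real data Y is generally not exactly rank J, and the truncated product `A @ B` is then the best rank-J approximation in the least-squares sense.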
Content classification of unknown style
consider a training set in which persons (content=ID) are seen walking from different viewpoints (style=viewpoint)
when new motions are acquired in which a known person is walking from a different viewpoint (unknown style) ...
... an iterative EM procedure can be set up to classify the content (identity)
E step -> estimation of p(c|s), the prob. of the content given the current estimate s of the style
M step -> estimation of the linear map for the unknown style s
$$p(y \mid \tilde{s}, c) \propto \exp\left(-\frac{\| y - A^{\tilde{s}} b^{c} \|^2}{2\sigma^2}\right)$$
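The EM loop can be sketched as follows, under several assumptions that the slides do not state: a uniform prior over content classes, a fixed variance `sigma2`, an initial guess `A_init` for the new style's map (for instance an average of the training maps), and the function name `classify_unknown_style`. The M step is the weighted least-squares solution for the map given the current soft assignments.

```python
import numpy as np

def classify_unknown_style(Y_new, B, A_init, sigma2=1.0, n_iter=20):
    """EM sketch for content classification in an unknown style (hypothetical helper).

    Y_new  : (N, K) observations, all acquired in the same unknown style.
    B      : (J, C) content vectors learned on the training styles.
    A_init : (K, J) initial guess for the unknown style's linear map.
    Returns the most probable content label per observation and the estimated map.
    """
    A = A_init.copy()
    for _ in range(n_iter):
        # E step: responsibilities p(c | y, current map), uniform prior over c.
        pred = (A @ B).T                                            # (C, K)
        d2 = ((Y_new[:, None, :] - pred[None, :, :]) ** 2).sum(axis=2)  # (N, C)
        log_r = -d2 / (2.0 * sigma2)
        log_r -= log_r.max(axis=1, keepdims=True)                   # numerical stability
        R = np.exp(log_r)
        R /= R.sum(axis=1, keepdims=True)
        # M step: weighted least-squares update of the style map.
        M = Y_new.T @ R                 # (K, C), column c is sum_i R[i, c] * y_i
        n = R.sum(axis=0)               # (C,), effective count per content class
        G = (B * n) @ B.T               # (J, J), sum_c n_c b_c b_c'
        A = (M @ B.T) @ np.linalg.pinv(G)
    return R.argmax(axis=1), A

# Toy usage with noiseless synthetic data (all values assumed).
rng = np.random.default_rng(0)
K = 8
B = 5.0 * np.array([[1.0, 0.0, 0.0, 1.0],
                    [0.0, 1.0, 0.0, 1.0],
                    [0.0, 0.0, 1.0, 1.0]])
A_true = rng.normal(size=(K, 3))
labels = np.array([0, 1, 2, 3, 0, 1, 2, 3])
Y_new = (A_true @ B[:, labels]).T
A_init = A_true + 0.05 * rng.normal(size=(K, 3))

pred_labels, A_est = classify_unknown_style(Y_new, B, A_init)
```

The quality of the result depends on the initial map; a poor initial guess can lead EM to a wrong local optimum.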
Hidden Markov models
finite-state representation of an observation process
state process {Xk} is a Markov chain
given a sequence of observations (feature matrix) ...
... EM algorithm for parameter learning (Moore)
A -> transition probabilities (motion dynamics)
C -> means of state-output distributions (poses)
Motions as stacked HMMs
interpretation of the C matrix: columns of C are means of the output distributions associated with the states of the model
in gaitID (cyclic motions) the dynamics is the same for all sequences (A neglected)
a sequence can then be represented as a collection of poses: stacked columns of the C matrix
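Stacking the columns of C into a single vector is a one-line operation. The sketch below assumes C is stored with one column per HMM state; the helper name `hmm_to_observation` and the toy values are illustrative only.

```python
import numpy as np

def hmm_to_observation(C):
    """Stack the columns of the state-output mean matrix C (d x n_states) into one vector."""
    return np.asarray(C).reshape(-1, order="F")   # column-major: column 0 first, then column 1, ...

# Toy C matrix with two states (columns) and two-dimensional poses.
C = np.array([[1.0, 3.0],
              [2.0, 4.0]])
v = hmm_to_observation(C)   # columns [1, 2] and [3, 4] stacked
```

For such vectors to be comparable across sequences, the states would presumably need a consistent ordering from one sequence to the next; this is an assumption, as the slides do not say how the ordering is fixed.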
Three-layer model
First layer (feature representation): projection of the contour of the silhouette on a sheaf of lines passing through the center
Second layer: each sequence is encoded as a Markov model, its C matrix is stacked in an observation vector, and a bilinear model is trained over those vectors
Third layer: bilinear model of HMMs
MOBO database
Mobo database: 25 people performing 4 different walking actions, from 6 cameras
each sequence has three labels: action, id, view
Four experiments
we can then set up four experiments in which one label is chosen as content, another one as style, and the remaining one is considered as a nuisance factor

experiment                           content  style   nuisance
view-invariant action recognition    action   view    ID
ID-invariant action recognition      action   ID      view
action-invariant gaitID              ID       action  view
view-invariant gaitID                ID       view    action
Results – ID versus VIEW
Compared performances with the “baseline” algorithm and straight k-NN on sequence HMMs
Results – ID versus action
performance of the bilinear classifier in the ID vs action experiment as a function of the nuisance (view=1:5), averaged over all the possible choices of the test action
the average best-match performance of the bilinear classifier is shown in solid red (minimum and maximum in magenta); best-3 matches ratio in dotted red
Feature extraction
Type 1: projection of the contour of the silhouette on a sheaf of lines passing through the center
Type 2: size functions [Frosini 90]
Type 3: Lee’s moments
Results - influence of features
Left: ID-invariant action recognition using the bilinear classifier. The entire dataset is considered, regardless of the viewpoint. The correct classification percentage is shown as a function of the test identity in black (for models using Lee's features) and red (contour projections). Related mean levels are drawn as dotted lines.
Right: View-invariant action recognition.
Conclusions
covariance factors of paramount importance in gaitID
bilinear-multilinear models provide a way to separate different factors
we proposed a three-layer model in which sequences are represented through HMMs
some approaches to view-invariance are expensive and sensitive
experiments on the Mobo database show how effective separating factors is for motion classification
future: multilinear models, testing on more realistic setups (many factors, USF database)