
ORIGINAL ARTICLE

Gait recognition and micro-expression recognition based on maximum margin projection with tensor representation

Xianye Ben 1,2 • Peng Zhang 1,2 • Rui Yan 3 • Mingqiang Yang 1 • Guodong Ge 1

Received: 29 August 2014 / Accepted: 11 August 2015 / Published online: 4 September 2015

© The Natural Computing Applications Forum 2015

Abstract In this paper we design a novel algorithm called maximum margin projection with tensor representation (MMPTR). This algorithm is able to recognize gait and micro-expressions represented as third-order tensors. By maximizing the inter-class Laplacian scatter and minimizing the intra-class Laplacian scatter, MMPTR seeks a tensor-to-tensor projection that directly extracts discriminative and geometry-preserving features from the original tensorial data. We show the validity of MMPTR through extensive experiments on the CASIA(B) gait database, the TUM GAID gait database, and the CASME micro-expression database. The proposed MMPTR generally obtains higher accuracy than MPCA, GTDA, and the state-of-the-art DTSA algorithm. The experimental results included in this paper suggest that MMPTR is especially effective in such tensorial object recognition tasks.

Keywords Maximum margin projection with tensor representation (MMPTR) · Dimensionality reduction · Gait recognition · Micro-expression recognition

1 Introduction

In machine learning and statistics, dimensionality reduction (DR) is the process of reducing the number of random variables under consideration. DR is commonly defined as the process of mapping high-dimensional data to a lower-dimensional embedding [1]. Engel et al. [1] divided the basic approaches into two classes: projection-based methods and manifold learning methods. Projection-based methods rely on linear inner-product transformations, while manifold learning methods can capture certain distance relationships in a nonlinear data structure along a manifold. Figure 1 provides a schematic diagram of DR techniques. Methods that model the data as a graph, using graph theory to optimize and learn the distances in the data space, can be viewed as graph-based methods. All projection-based methods can learn the embedding of metric distances; manifold learning methods, except the graph-based ones, can also learn the embedding of nonmetric distances. From the data-processing point of view, projection-based methods, other than the graph-based ones, deal with linear data; nonlinear data, on the contrary, should be handled by manifold learning methods.

In image processing, most traditional DR algorithms, such as principal component analysis (PCA), linear discriminant analysis (LDA), multidimensional scaling (MDS) [2], Isomap [3], and locally linear embedding (LLE) [4], as well as recent works such as covariance operator inverse regression (COIR) [5], collaborative representation-based projections (CRP) [6], and maximal linear embedding (MLE) [7], treat an input image or sequence as a vector before embedding. This vectorization seriously destroys the intrinsic tensor structure of high-order data and, at the same time, may exceed the computational processing capability of the computing device.

Correspondence: Xianye Ben ([email protected])

1 School of Information Science and Engineering, Shandong University, No. 27, Shanda South Road, Jinan 250100, People's Republic of China
2 Key Laboratory of Intelligent Perception and Systems for High-Dimensional Information, Ministry of Education, Nanjing University of Science and Technology, Nanjing 210094, People's Republic of China
3 Computer Science Department, Rensselaer Polytechnic Institute, Troy, NY 12180, USA

Neural Comput & Applic (2016) 27:2629–2646. DOI 10.1007/s00521-015-2031-8

To address these problems, He et al. [8] put forward tensor subspace analysis (TSA) to perform second-order tensor dimensionality reduction with the help of the intrinsic local geometrical structure of the second-order tensor space. Vasilescu et al. [9] proposed the multilinear ICA (MICA) model to learn the statistically independent components of multiple factors. Since then, multilinear DR methods that operate directly on tensor samples have been emerging. Raj et al. [10] proposed a fast MICA. Motivated by PCA, Lu et al. [11] introduced multilinear principal component analysis (MPCA) to capture most of the variation of the original tensorial input. Nonnegative multilinear principal component analysis (NMPCA) [12] was proposed for dimensionality reduction of tensors by maximizing the total tensor scatter while preserving the nonnegativity of auditory representations. MICA, MPCA, and NMPCA are unsupervised methods, since they learn purely from the dataset without knowing any class information. Multilinear discriminant analysis (MLDA) approaches [13] may be more reliable, since they are supervised. Among them, discriminant analysis with tensor representation (DATER) [14] maximized the ratio of the inter-class scatter to the intra-class scatter; general tensor discriminant analysis (GTDA) [15] maximized the difference between the inter-class scatter and the weighted intra-class scatter; compound rank-k projections for bilinear analysis [16] utilized multiple rank-k mappings to increase monotonicity while preserving the correlations within the matrix; and tensor discriminative locality alignment [17] can preserve the discriminative locality for classification.

Uncorrelated features are desirable in recognition tasks since they contain minimum redundancy and ensure independence of features. Lu et al. improved MPCA and MLDA: uncorrelated multilinear principal component analysis (UMPCA) [18] can produce uncorrelated features while capturing most of the variation in the original tensorial input, and, instead of uncorrelated features, orthogonal multilinear discriminant analysis (OMDA) [19] can extract orthogonal discriminative features.

Some graph-based approaches have been proposed for multilinear feature extraction and dimensionality reduction in various pattern classification tasks. To preserve the structural information of the original tensor data, Lu et al. [20] proposed uncorrelated multilinear geometry-preserving projections (UMGPP) to obtain uncorrelated projection directions, and Li et al. [21] proposed discriminant locally linear embedding/tensorization (DLLE/T) to learn the projection directions by maximizing the margins between point pairs from different classes. Tensor locality-preserving projections (TLPP) [22] exploited the intrinsic local geometric and topological properties of the manifold. Tensor neighborhood-preserving discriminant projections (TNPDP) [23] encouraged instances from the same class to be close and instances from different classes to be far apart; it considered locality and discriminative information simultaneously. Lu et al. [24] proposed a multilinear locality-preserving canonical correlation analysis (MLPCCA), which sought multiple sets of pairwise projection bases by maximizing the correlation of two image sets. Wang et al. [25] introduced the discriminant tensor subspace analysis (DTSA) algorithm, which maximizes the quotient of the between-class scatter and the within-class scatter. Han et al. [26] presented multilinear supervised neighborhood embedding (MSNE), which directly deals with the local descriptor tensor to extract discriminant and compact features. Liu et al. [27] proposed a multilinear locality-preserved maximum information embedding (MLPMIE) algorithm to preserve the local geometry and maximize the global discrimination simultaneously. Other graph-based approaches preserve the global geometry, for example multilinear isometric embedding (MIE) [28]. In this paper, we concentrate on learning multilinear maximum margin projections under the guidance of locality-preserving and discriminant analysis.

Multilinear DR has various applications in both feature extraction and classification. Liu et al. [29] gave the 3D X-ray transform within a multilinear framework and proposed a multilinear X-ray transform feature representation. Feng et al. [30] generated a multilinear active appearance model (MAAM) from an incomplete training tensor with missing values to achieve face recognition under viewpoint, illumination, and expression variations.

Fig. 1 Schematic diagram of DR techniques

This paper focuses on the maximum margin projection with tensor representation (MMPTR) applied to gait recognition and micro-expression recognition, because these recognition tasks have some common ground: (1) the beginning and final frames must be labeled when a single recognized sample is defined; thus, a single sample can be viewed as a third-order tensor containing all the frames between the beginning and final frames; (2) a dimensionality reduction approach for these tensorial data is required to reduce excess redundancy and extract discriminant features for the recognition task.

Gait recognition has attracted significant attention because of its wide range of applications in visual surveillance for security-sensitive environments, such as banks, airports, and parking lots [31]. Ben et al. [32] surveyed gait recognition methods according to their data forms: anthropometry, spatio-temporal data, kinematics, kinetics, and video streams. Some representative feature expression methods are key frames [33], time normalization [34], time series [35], outer silhouettes [36], moments [37], modeling [38], projection methods [39], template energy images [40], fusion [41], and tensor-based methods [15]. However, the aforementioned gait expression techniques, except the tensor-based method and time normalization, lose the dynamic information of gait, which is significant for gait recognition. Because of the complicated adjustment process of time normalization, this paper focuses on tensor-based gait recognition. A gait silhouette sequence can be viewed as a third-order tensor with column, row, and time modes.

A micro-expression is a fast facial movement that usually lasts for 1/25 to 1/5 s; it reveals a real emotion that people try to suppress and conceal. Micro-expressions are differentiated from ordinary expressions by their short duration, so that one can scarcely notice them. Unlike regular expressions, few people can fake a micro-expression, because it is an expression of spontaneous movement. A micro-expression may include all or part of the facial muscle movements of regular expressions. One technique to detect lies is the identification of facial micro-expressions [42], and a wide range of applications, such as assisting judicial departments in settling lawsuits, business negotiation, psychological counseling, and other fields, resort to micro-expressions. Researchers in computer vision have tried to develop micro-expression detection algorithms, but genuine recognition algorithms are still lacking. Pfister et al. [43] accurately detected these very short expressions using a high-speed camera. Only Fu and Wang [25] achieved micro-expression recognition; they also viewed the micro-expression sequence as a tensor sample and proposed the DTSA algorithm for distinguishing tense, repression, disgust, and surprise.

Motivated by the discussions above and by the maximum margin criterion [44], this paper aims to develop an MMPTR that extracts locality-preserving and discriminant features. The criterion of MMPTR is designed to seek a series of transformation matrices by maximizing the difference between the inter-class Laplacian scatter and the intra-class Laplacian scatter. The solution is iterative, based on the alternating projection method. Then, two classification methods, namely direct classification and classification after tensor vectorization, can be used; the latter can be adopted to enhance the recognition performance and compress the features. The effectiveness of the proposed method has been strictly evaluated on the CASIA(B) and TUM GAID gait databases for gait recognition, as well as the CASME database for micro-expression recognition.

More specifically, our contributions are as follows. First, we propose a novel criterion for multilinear DR, which maximizes the difference between the inter-class Laplacian scatter and the intra-class Laplacian scatter. Second, classification after tensor vectorization can further enhance the recognition performance and compress the features. Third, we develop a micro-expression recognizer by representing the micro-expression samples as tensors and extracting locality-preserving and discriminant features from them.

The rest of the paper is organized as follows: Sect. 2 briefly introduces tensor algebra. In Sect. 3, we propose MMPTR with an algorithm derived as an iterative process; this section also discusses the classification of MMPTR features, initialization, convergence, termination, connections to other tensorial subspace methods, and computational complexity. Section 4 evaluates the effectiveness of MMPTR in gait recognition and micro-expression recognition tasks by comparing its performance against MPCA, GTDA, and DTSA. Finally, Sect. 5 draws the conclusions.

2 Tensor algebra

2.1 n-mode unfolding of a third-order tensor

A data point $\mathcal{A}$ in the tensor space $\mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ denotes an $N$th-order tensor, where a tensor is represented using a calligraphic ("Euclid Math One") font. $\mathcal{A}$ has $N$ indices $i_n$, $n = 1, \ldots, N$, where $i_n$ addresses the $n$-mode of $\mathcal{A}$. $\mathcal{A}$ can be unfolded along $N$ kinds of directions, each unfolding arranging its rank-1 $n$-mode vectors into a matrix. $A_{(n)} \in \mathbb{R}^{I_n \times (I_1 \times I_2 \times \cdots \times I_{n-1} \times I_{n+1} \times \cdots \times I_N)}$ denotes the $n$-mode unfolding of $\mathcal{A}$. Taking a third-order tensor as an example, Fig. 2(1)–(3) illustrates the 1-mode, 2-mode, and 3-mode unfoldings.
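To make the unfolding operation concrete, the following sketch shows one common way to compute the $n$-mode unfolding of a third-order tensor with NumPy; the function name `unfold` and the ordering of the non-unfolded modes are illustrative assumptions, not notation from this paper.

```python
import numpy as np

def unfold(tensor, mode):
    """Return the mode-n unfolding A_(n): rows are indexed by `mode`,
    columns by all remaining modes (one common ordering convention)."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

A = np.arange(4 * 5 * 3).reshape(4, 5, 3)  # a third-order tensor in R^{4x5x3}
print(unfold(A, 0).shape)  # (4, 15)  1-mode unfolding
print(unfold(A, 1).shape)  # (5, 12)  2-mode unfolding
print(unfold(A, 2).shape)  # (3, 20)  3-mode unfolding
```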


2.2 n-mode projection

The projection (also called the product) of $\mathcal{A}$ by a matrix $U \in \mathbb{R}^{J_n \times I_n}$ is defined as

$$(\mathcal{A} \times_n U)(i_1, \ldots, i_{n-1}, j_n, i_{n+1}, \ldots, i_N) = \sum_{i_n} \mathcal{A}(i_1, \ldots, i_N) \cdot U(j_n, i_n) \quad (1)$$

Based on multilinear algebra theory, any tensor $\mathcal{B}$ can be expressed as another tensor $\mathcal{A}$ projected in each mode ($n = 1, \ldots, N$) by matrices $u^{(1)}, u^{(2)}, \ldots, u^{(N)}$:

$$\mathcal{B} = \mathcal{A} \times_1 u^{(1)T} \times_2 u^{(2)T} \times_3 \cdots \times_N u^{(N)T} \quad (2)$$

Figure 3 provides a visual illustration of a tensor produced by other matrices. In the 1-mode projection, a third-order tensor $\mathcal{A} \in \mathbb{R}^{4 \times 5 \times 3}$ is projected in the 1-mode vector space by a transformation matrix $u^{(1)} \in \mathbb{R}^{4 \times 2}$, obtaining a new tensor $\mathcal{A} \times_1 u^{(1)T} \in \mathbb{R}^{2 \times 5 \times 3}$; therefore, the length of each 1-mode vector of $\mathcal{A}$ is reduced from 4 to 2. Then, the new tensor is projected by a transformation matrix $u^{(2)} \in \mathbb{R}^{5 \times 2}$ in the 2-mode projection, resulting in $\mathcal{A} \times_1 u^{(1)T} \times_2 u^{(2)T} \in \mathbb{R}^{2 \times 2 \times 3}$; therefore, the length of each 2-mode vector is reduced from 5 to 2. Finally, in the 3-mode projection, $\mathcal{A} \times_1 u^{(1)T} \times_2 u^{(2)T} \times_3 u^{(3)T} \in \mathbb{R}^{2 \times 2 \times 2}$; therefore, the length of each 3-mode vector of $\mathcal{A}$ is reduced from 3 to 2.

3 Maximum margin projection with tensor representation

In this section, an MMPTR solution to the problem of tensor-based dimensionality reduction is introduced, investigated, and analyzed.

3.1 Algorithm

Maximum margin projection with tensor representation (MMPTR) aims to find a multilinear transformation from the original high-order space $\mathbb{R}^{I_1} \otimes \mathbb{R}^{I_2} \otimes \cdots \otimes \mathbb{R}^{I_N}$ (where $\otimes$ denotes the Kronecker product) to the reduced-dimensional space $\mathbb{R}^{P_1} \otimes \mathbb{R}^{P_2} \otimes \cdots \otimes \mathbb{R}^{P_N}$ (with $P_n < I_n$ for $n = 1, 2, \ldots, N$):

$$\mathcal{X} \rightarrow \mathcal{Y}: \quad \mathcal{Y} = \mathcal{X} \times_1 \tilde{U}^{(1)T} \times_2 \tilde{U}^{(2)T} \times_3 \cdots \times_N \tilde{U}^{(N)T} \quad (3)$$

Fig. 2 Visual illustration of n-mode unfolding of a third-order tensor: (1) 1-mode unfolding, (2) 2-mode unfolding, (3) 3-mode unfolding

The input data of MMPTR are a set of training samples $\{\mathcal{X}_m, m = 1, \ldots, M\}$, where $M$ is the total number of samples. The criterion of MMPTR is designed to strive for $N$ transformation matrices $\tilde{U}^{(n)} \in \mathbb{R}^{I_n \times P_n}$, which maximize the inter-class Laplacian scatter $\Phi_b^{(n)}$ and meanwhile minimize the intra-class Laplacian scatter $\Phi_w^{(n)}$. Based on this point, we have the following optimizations over the $N$ transformation matrices $\tilde{U}^{(n)}$: $\arg\max_{\tilde{U}^{(n)}} \Phi_b^{(n)}$ and $\arg\min_{\tilde{U}^{(n)}} \Phi_w^{(n)}$. If we define the total Laplacian scatter $\Phi_t^{(n)}$, we have $\Phi_b^{(n)} = \Phi_t^{(n)} - \Phi_w^{(n)}$, in which

$$\Phi_t^{(n)} = \frac{1}{M}\sum_{m=1}^{M}\left(X_{m(n)} - \bar{X}_{(n)}\right)\tilde{U}_{\Phi^{(n)}}\left(L \otimes I_{P_n}\right)\tilde{U}_{\Phi^{(n)}}^{T}\left(X_{m(n)} - \bar{X}_{(n)}\right)^{T} = \frac{1}{2M^2}\sum_{m=1}^{M}\sum_{i=1}^{M}\left(X_{m(n)} - X_{i(n)}\right)\tilde{U}_{\Phi^{(n)}}\left(L \otimes I_{P_n}\right)\tilde{U}_{\Phi^{(n)}}^{T}\left(X_{m(n)} - X_{i(n)}\right)^{T} \quad (4)$$

$$\Phi_w^{(n)} = \frac{1}{M}\sum_{i=1}^{c}\sum_{j=1}^{M_i}\left(X_{j(n)}^{(i)} - \bar{X}_{(n)}^{(i)}\right)\tilde{U}_{\Phi^{(n)}}\left(L_w \otimes I_{P_n}\right)\tilde{U}_{\Phi^{(n)}}^{T}\left(X_{j(n)}^{(i)} - \bar{X}_{(n)}^{(i)}\right)^{T} = \frac{1}{M}\sum_{i=1}^{c}\frac{1}{2M_i}\sum_{j=1}^{M_i}\sum_{k=1}^{M_i}\left(X_{j(n)}^{(i)} - X_{k(n)}^{(i)}\right)\tilde{U}_{\Phi^{(n)}}\left(L_w \otimes I_{P_n}\right)\tilde{U}_{\Phi^{(n)}}^{T}\left(X_{j(n)}^{(i)} - X_{k(n)}^{(i)}\right)^{T} \quad (5)$$

$$\tilde{U}_{\Phi^{(n)}} = \tilde{U}^{(n+1)} \otimes \tilde{U}^{(n+2)} \otimes \cdots \otimes \tilde{U}^{(N)} \otimes \tilde{U}^{(1)} \otimes \tilde{U}^{(2)} \otimes \cdots \otimes \tilde{U}^{(n-1)} \quad (6)$$

where the superscript $(n)$ denotes the $n$-mode, $I_{P_n}$ is an identity matrix of size $P_n \times P_n$, $c$ is the number of classes, and $M_i$ $(i = 1, \ldots, c)$ is the number of samples in class $i$. $\bar{X}_{(n)}$ is the total average $n$-mode matrix of all the samples, which can be expressed as

$$\bar{X}_{(n)} = (1/M)\sum_{m=1}^{M} X_{m(n)} \quad (7)$$

where $X_{m(n)}$ is the $n$-mode matrix of sample $m$. $\bar{X}^{(i)}_{(n)}$ is the average $n$-mode matrix of the samples belonging to class $i$, which can be expressed as

$$\bar{X}^{(i)}_{(n)} = (1/M_i)\sum_{j=1}^{M_i} X^{(i)}_{j(n)} \quad (8)$$

where $X^{(i)}_{j(n)}$ is the $n$-mode matrix of sample $j$ from class $i$.

Fig. 3 Visual illustration of a tensor produced by other matrices

$L = D - W$ and $L_w = \mathrm{diag}\!\left(\frac{D^{(1)} - W^{(1)}}{M_1}, \frac{D^{(2)} - W^{(2)}}{M_2}, \ldots, \frac{D^{(c)} - W^{(c)}}{M_c}\right)$ are two kinds of graph Laplacian matrices used to maximally preserve certain local nonlinear geometry of the tensor data, where $W$ is the Gaussian similarity matrix with entries $w_{ij}$, and $D$ is a diagonal matrix whose entries are the column (or row) sums of $W$, $d_{ii} = \sum_j w_{ij}$:

$$w_{ij} = e^{-\frac{\|\mathcal{X}_i - \mathcal{X}_j\|^2}{2\sigma^2}} \quad (9)$$

where $\sigma$ is a heat kernel parameter. If $\mathcal{X}_i$ and $\mathcal{X}_j$ belong to the same class, the value of $\|\mathcal{X}_i - \mathcal{X}_j\|^2$ is computed; otherwise, $\|\mathcal{X}_i - \mathcal{X}_j\|^2$ is set to $+\infty$. $W^{(i)}$ is the Gaussian similarity matrix with entries $w^{(i)}_{kl}$ for class $i$, and $D^{(i)}$ is a diagonal matrix for class $i$ whose entries are the column (or row) sums of $W^{(i)}$, $d_{kk} = \sum_l w^{(i)}_{kl}$:

$$w^{(i)}_{kl} = e^{-\frac{\|\mathcal{X}_k - \mathcal{X}_l\|^2}{2\sigma^2}} \quad (10)$$
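As an illustration of how the two graph Laplacians could be built from vectorized training samples, the sketch below computes the Gaussian similarity matrix of Eq. (9), its degree matrix, and $L = D - W$, plus a per-class block version in the spirit of $L_w$; the function name, the choice of sigma, and the block placement (the paper assumes samples ordered by class) are assumptions made for illustration only.

```python
import numpy as np

def gaussian_laplacian(X, labels, sigma):
    """X: (M, d) vectorized samples; labels: (M,) class labels.
    Returns the global Laplacian L = D - W (Eq. (9)) and a
    block-diagonal within-class Laplacian in the spirit of Lw."""
    labels = np.asarray(labels)
    M = X.shape[0]
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    # Eq. (9): distances between samples of different classes are treated as
    # infinite, so their similarity becomes zero.
    same_class = labels[:, None] == labels[None, :]
    W = np.where(same_class, W, 0.0)
    L = np.diag(W.sum(axis=1)) - W

    # Within-class Laplacian: one block (D^(i) - W^(i)) / M_i per class,
    # placed on the indices of that class (blocks are contiguous only if the
    # samples are already grouped by class, as the paper assumes).
    Lw = np.zeros((M, M))
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        Wc = W[np.ix_(idx, idx)]
        Lw[np.ix_(idx, idx)] = (np.diag(Wc.sum(axis=1)) - Wc) / len(idx)
    return L, Lw
```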

We unify $\arg\max_{\tilde{U}^{(n)}} \Phi_b^{(n)}$ and $\arg\min_{\tilde{U}^{(n)}} \Phi_w^{(n)}$ into

$$\left\{\tilde{U}^{(n)}, n = 1, \ldots, N\right\} = \arg\max_{\tilde{U}^{(1)}, \ldots, \tilde{U}^{(N)}} \left(\Phi_b^{(n)} - \Phi_w^{(n)}\right) = \arg\max_{\tilde{U}^{(1)}, \ldots, \tilde{U}^{(N)}} \left(\Phi_t^{(n)} - 2\Phi_w^{(n)}\right) \quad (11)$$

constrained by $\tilde{U}^{(n)T}\tilde{U}^{(n)} = I$.

There is no known optimal solution that allows the simultaneous optimization of the $N$ transformation matrices, so an iterative procedure is used to solve Eq. (11). Assuming $\tilde{U}^{(n)}$, $n = 1, 2, \ldots, k-1, k, \ldots, N$ are known, we optimize $\tilde{U}^{(k)}$, obtained by assembling the eigenvectors associated with the largest $P_n$ eigenvalues of the scatter matrix $\Phi_t^{(n)} - 2\Phi_w^{(n)}$. Then, we iteratively optimize $\tilde{U}^{(k+1)}$ by updating the original $\tilde{U}^{(k)}$ to the latest optimized result and fixing the other $N-2$ transformation matrices, and so forth.

Generally, $P_n$ for $n = 1, \ldots, N$ mainly depends on experience. Let $\lambda^{(n)*}_{i^{(n)}}$ be the $i^{(n)}$th full-projection eigenvalue for the $n$-mode. $P_n$ can be determined by the defined $\mathrm{testQ}^{(n)}$ ($n = 1, \ldots, N$) as follows:

$$\mathrm{testQ}^{(n)} = \frac{\sum_{i^{(n)}=1}^{P_n} \lambda^{(n)*}_{i^{(n)}}}{\sum_{i^{(n)}=1}^{I_n} \lambda^{(n)*}_{i^{(n)}}} \quad (12)$$

where $\sum_{i^{(n)}=1}^{P_n} \lambda^{(n)*}_{i^{(n)}}$ denotes the sum of the largest $P_n$ eigenvalues after truncating the $n$-mode eigenvectors beyond the $P_n$th, and $\sum_{i^{(n)}=1}^{I_n} \lambda^{(n)*}_{i^{(n)}}$ denotes the sum of all the eigenvalues before truncation. To simplify the selection of $\mathrm{testQ}^{(n)}$ ($n = 1, \ldots, N$), we set $\mathrm{testQ} = \mathrm{testQ}^{(1)} = \mathrm{testQ}^{(2)} = \cdots = \mathrm{testQ}^{(N)}$. Thus, the dimensionality of each mode can be selected by $\mathrm{testQ}^{(n)}$ ($n = 1, \ldots, N$).

Table 1 summarizes the aforementioned optimization procedure.

Table 1 Procedure of MMPTR

Input: the training sample set $\{\mathcal{X}_m, m = 1, \ldots, M\}$
Output: the projected training sample set $\{\mathcal{Y}_m = \mathcal{X}_m \times_1 \tilde{U}^{(1)T} \times_2 \tilde{U}^{(2)T} \times_3 \tilde{U}^{(3)T}, m = 1, \ldots, M\}$ and the set of transformation matrices $\{\tilde{U}^{(n)} \in \mathbb{R}^{I_n \times P_n}, n = 1, \ldots, N\}$

Procedure:
(1) Preprocessing: center the training sample set, $\mathcal{X}_m \leftarrow \mathcal{X}_m - \bar{\mathcal{X}}$, $m = 1, \ldots, M$, where $\bar{\mathcal{X}}$ is the mean of all the training samples.
(2) Initialization: discussed in Sect. 3.3.
(3) Optimization:
    Iteration cycle: for k = 1 to K {
        Mode cycle: for n = 1 to N {
            Calculate the total Laplacian scatter by Eq. (4)
            Calculate the intra-class Laplacian scatter by Eq. (5)
            Optimize $\tilde{U}^{(n)}$, $n = 1, \ldots, N$, by Eq. (11)
        }
        Judge convergence: discussed in Sect. 3.3.
    }
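A compact sketch of the alternating optimization in Table 1 is given below, assuming the helper routines `unfold` and `mode_product` sketched in Sect. 2. It is an illustrative reading of Eqs. (4), (5), and (11), not the authors' reference implementation: for brevity it uses the pairwise Laplacian-weighted form of the scatters on mode-n unfoldings and omits the Kronecker-structured weighting by the other-mode projections.

```python
import numpy as np

def project_all_but(X, Us, skip):
    """Project tensor X by every current U^(m).T except mode `skip`."""
    for m, U in enumerate(Us):
        if m != skip:
            X = mode_product(X, U.T, m)
    return X

def mmptr_fit(samples, labels, ranks, sigma=1.0, n_iters=10):
    """Simplified MMPTR-style alternating optimization (illustrative only).
    samples: list of equally sized ndarrays; ranks: target P_n per mode."""
    labels = np.asarray(labels)
    N, M = samples[0].ndim, len(samples)
    mean = sum(samples) / M
    Xs = [X - mean for X in samples]                              # step (1): centering
    Us = [np.ones((Xs[0].shape[n], ranks[n])) for n in range(N)]  # step (2): all-ones init

    # Gaussian affinities in the spirit of Eqs. (9) and (10), computed on
    # vectorized centered samples; cross-class similarities are set to zero.
    V = np.stack([X.ravel() for X in Xs])
    d2 = np.sum((V[:, None] - V[None, :]) ** 2, axis=2)
    same = labels[:, None] == labels[None, :]
    W = np.where(same, np.exp(-d2 / (2 * sigma ** 2)), 0.0)
    class_size = {c: int(np.sum(labels == c)) for c in np.unique(labels)}

    for _ in range(n_iters):                                      # step (3): iteration cycle
        for n in range(N):                                        # mode cycle
            Z = [unfold(project_all_but(X, Us, n), n) for X in Xs]
            St = np.zeros((Z[0].shape[0],) * 2)
            Sw = np.zeros_like(St)
            for i in range(M):
                for j in range(M):
                    D = Z[i] - Z[j]
                    St += W[i, j] * (D @ D.T) / (2 * M ** 2)      # total scatter (Eq. 4 spirit)
                    if same[i, j]:
                        Sw += W[i, j] * (D @ D.T) / (2 * M * class_size[labels[i]])  # Eq. 5 spirit
            vals, vecs = np.linalg.eigh(St - 2 * Sw)              # maximize Eq. (11)
            Us[n] = vecs[:, np.argsort(vals)[::-1][:ranks[n]]]
    return Us, mean
```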


3.2 Classification of MMPTR features

To classify a test tensor sample $\mathcal{X}_0$, we can adopt two classification proposals, namely direct classification and classification after tensor vectorization.

3.2.1 Direct classification

$\mathcal{X}_0$ is projected to $\mathcal{Y}_0 = \mathcal{X}_0 \times_1 \tilde{U}^{(1)T} \times_2 \tilde{U}^{(2)T} \times_3 \cdots \times_N \tilde{U}^{(N)T}$ using MMPTR. Then, we calculate the Euclidean distance Dis between the test sample and each candidate training sample, and the test sample is assigned the label of the class of $\mathcal{X}_m$:

$$\mathrm{Dis}(\mathcal{X}_m, \mathcal{X}_0) = \arg\min_i \mathrm{Dis}(\mathcal{Y}_i, \mathcal{Y}_0) = \left\| \mathcal{X}_m \times_1 \tilde{U}^{(1)T} \times_2 \tilde{U}^{(2)T} \times_3 \cdots \times_N \tilde{U}^{(N)T} - \mathcal{X}_0 \times_1 \tilde{U}^{(1)T} \times_2 \tilde{U}^{(2)T} \times_3 \cdots \times_N \tilde{U}^{(N)T} \right\|_F \quad (13)$$

where $\mathcal{Y}_i$ $(i = 1, \ldots, M)$ is the projected result of the training sample $\mathcal{X}_i$ $(i = 1, \ldots, M)$ obtained by Eq. (3), and $\|\cdot\|_F$ denotes the Frobenius norm.
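A minimal nearest-neighbor classifier in the projected tensor space, as in Eq. (13), might look like the following; it reuses the hypothetical `mode_product` helper from Sect. 2 and assumes transformation matrices Us returned by a training routine.

```python
import numpy as np

def mmptr_project(X, Us):
    """Apply Eq. (3): project X by every transformation matrix U^(n).T."""
    for n, U in enumerate(Us):
        X = mode_product(X, U.T, n)
    return X

def classify_direct(test_tensor, train_tensors, train_labels, Us):
    """Direct classification of Eq. (13): nearest neighbor under the
    Frobenius norm between projected tensors."""
    Y0 = mmptr_project(test_tensor, Us)
    dists = [np.linalg.norm(mmptr_project(Xm, Us) - Y0) for Xm in train_tensors]
    return train_labels[int(np.argmin(dists))]
```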

3.2.2 Classification after tensor vectorization

The tensor data processed by MMPTR still contain a large number of redundant variables, and the so-called tensor vectorization can further remove this redundancy. The entries of the new tensor $\mathcal{Y}_m$ are rearranged into a vector $\mathbf{y}_m$, ordered according to the class discriminability $C_{p_1 p_2 \cdots p_N}$ in descending order:

$$C_{p_1 p_2 \cdots p_N} = \frac{\sum_{i=1}^{c} N_i \left[ \bar{\mathcal{Y}}_i(p_1, p_2, \ldots, p_N) - \bar{\mathcal{Y}}(p_1, p_2, \ldots, p_N) \right]^2}{\sum_{m=1}^{M} \left[ \mathcal{Y}_m(p_1, p_2, \ldots, p_N) - \bar{\mathcal{Y}}_{c_m}(p_1, p_2, \ldots, p_N) \right]^2} \quad (14)$$

where $\bar{\mathcal{Y}}_{c_m}$, $\bar{\mathcal{Y}}_i$, and $\bar{\mathcal{Y}}$ are the class mean feature tensor of $\mathcal{X}_m$, the mean feature tensor of class $i$, and the total mean feature tensor in the projected tensor subspace, respectively.

In classification, the class of the test sample is determined by the nearest-neighbor classifier using the Euclidean distance Dis between the vectorization of the test sample $\mathcal{X}_0$ and the vectorization of each candidate training sample:

$$\mathrm{Dis}(\mathcal{X}_m, \mathcal{X}_0) = \arg\min_i \mathrm{Dis}\left(\mathrm{vec}(\mathcal{Y}_i), \mathrm{vec}(\mathcal{Y}_0)\right) = \left\| \mathrm{vec}\!\left(\mathcal{X}_m \prod_{n=1}^{N} \times_n \tilde{U}^{(n)T}\right) - \mathrm{vec}\!\left(\mathcal{X}_0 \prod_{n=1}^{N} \times_n \tilde{U}^{(n)T}\right) \right\|_F \quad (15)$$

where $\mathrm{vec}(\cdot)$ denotes tensor vectorization with the preserved optimal dimension, which can be determined experimentally.
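The sketch below illustrates one way to rank projected-tensor entries by the class discriminability of Eq. (14) and keep only the leading ones before nearest-neighbor matching as in Eq. (15); the function names and the epsilon guard are assumptions added for this illustration.

```python
import numpy as np

def discriminability_order(Ys, labels, eps=1e-12):
    """Ys: (M, P1, ..., PN) projected training tensors stacked along axis 0.
    Returns entry indices sorted by the class discriminability of Eq. (14)."""
    labels = np.asarray(labels)
    grand_mean = Ys.mean(axis=0)
    between = np.zeros(Ys.shape[1:])
    within = np.zeros(Ys.shape[1:])
    for c in np.unique(labels):
        Yc = Ys[labels == c]
        class_mean = Yc.mean(axis=0)
        between += len(Yc) * (class_mean - grand_mean) ** 2       # numerator of Eq. (14)
        within += ((Yc - class_mean) ** 2).sum(axis=0)            # denominator of Eq. (14)
    C = between / (within + eps)
    return np.argsort(C.ravel())[::-1]                            # descending discriminability

def vectorize(Y, order, keep):
    """Rearrange a projected tensor into a vector and keep the first `keep`
    entries in discriminability order (the feature vector used in Eq. (15))."""
    return Y.ravel()[order[:keep]]
```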

3.3 Initialization, convergence, and termination

This section discusses MMPTR design issues, such as its initialization, convergence, and termination conditions.

3.3.1 Initialization

Due to the tensorial nature of the proposed method, solving the projection matrix in one mode requires the projection matrices in all the other (N − 1) modes. The experimental results of Ref. [18] indicated that initializing each mode projection vector to the all-ones vector gives stable results. Therefore, this paper employs this uniform initialization.

3.3.2 Convergence

Given each mode projection vector initialized to the all-ones vector, the alternating projection generates a sequence $\{\tilde{U}^{(n)}_k, k = 1, \ldots, K\}$ by maximizing $g(\tilde{U}^{(1)}_k, \tilde{U}^{(2)}_k, \ldots, \tilde{U}^{(N)}_k) = \Phi_t - 2\Phi_w$. The objective function is nondecreasing, i.e.,

$$g(\tilde{U}^{(1)}_k, \tilde{U}^{(2)}_k, \ldots, \tilde{U}^{(N)}_k) \le g(\tilde{U}^{(1)}_{k+1}, \tilde{U}^{(2)}_k, \ldots, \tilde{U}^{(N)}_k) \le g(\tilde{U}^{(1)}_{k+1}, \tilde{U}^{(2)}_{k+1}, \ldots, \tilde{U}^{(N)}_k) \le \cdots \le g(\tilde{U}^{(1)}_{k+1}, \tilde{U}^{(2)}_{k+1}, \ldots, \tilde{U}^{(N)}_{k+1}).$$

We test the convergence of MMPTR on the CASIA(B) gait database via $\|\tilde{U}^{(n)}_k - \tilde{U}^{(n)}_{k-1}\|_F$, where $\tilde{U}^{(n)}_k$ and $\tilde{U}^{(n)}_{k-1}$ are the $n$-mode transformation matrices at the $k$th and $(k-1)$th iterations, respectively. The change in $\tilde{U}^{(n)}_k$ between two successive iterations converges to zero for each $n$-mode after eight iterations, as can be seen in Fig. 4.

3.3.3 Termination

The termination criterion is determined by the objective function $\Phi_t - 2\Phi_w$. The iterative optimization procedure terminates if $\|\tilde{U}^{(n)}_k - \tilde{U}^{(n)}_{k-1}\|_F < \eta^{(n)}$, $k = 1, \ldots, K$, where $\eta^{(n)}$ is a small predefined threshold for the $n$-mode. Alternatively, the termination criterion can simply be a maximum number of iterations. In this paper, we set K = 10.

Fig. 4 Convergence test


3.4 Connections to other tensorial subspace methods

In this section, we analyze MMPTR's relation with MPCA and GTDA.

3.4.1 Relation to MPCA

MPCA can be thought of as revealing the internal structure of the data in a way that best explains the variance in the tensorial data. Its objective is the determination of the $N$ projection matrices $\{\tilde{U}^{(n)}, n = 1, \ldots, N\}$ that maximize the total tensor scatter $\sum_{i=1}^{M} \|\mathcal{Y}_i - \bar{\mathcal{Y}}\|_F^2$, where $\bar{\mathcal{Y}} = (1/M)\sum_{i=1}^{M} \mathcal{Y}_i$. When $w_{ij} = 1$ for all $i$ and $j$, and $w^{(i)}_{kl} = 1$ for all $k$ and $l$, MMPTR reduces to MPCA. Compared with MMPTR, MPCA builds a global graph in which each tensorial data point is connected to all the remaining points. Therefore, MPCA preserves the global structural information of the dataset.

3.4.2 Relation to GTDA

GTDA preserves the discriminative information in the training tensors. The objective of GTDA is the determination of the $N$ projection matrices $\{\tilde{U}^{(n)}, n = 1, \ldots, N\}$ that maximize the differential scatter discriminant criterion $\sum_{i=1}^{c} M_i \|\bar{\mathcal{Y}}_i - \bar{\mathcal{Y}}\|_F^2 - \zeta \sum_{j=1}^{M} \|\mathcal{Y}_j - \bar{\mathcal{Y}}_{i,j}\|_F^2$, where $\zeta$ is the Lagrange multiplier, $\bar{\mathcal{Y}}_i$ is the average tensor of the embedded samples belonging to class $i$, and $\bar{\mathcal{Y}}_{i,j}$ is the average tensor of the class to which embedded sample $j$ belongs. Supposing (1) $w_{ij} = 1/M_h$ if and only if $i$ and $j$ belong to class $h$, and $w_{ij} = 0$ otherwise, and (2) $w^{(i)}_{kl} = 1/M_i$, then MMPTR reduces to GTDA. When all of the tensorial data points are used as vertices of the graph, $W$ assigns similar weights to all connections in this graph for GTDA. Therefore, GTDA preserves the global structural information and discriminant information of the dataset.

3.5 Computational complexity

For simplicity, we assume that $I_1 = I_2 = \cdots = I_N = I$ and that $M$ is the number of training samples. From previous work, we know that the computational complexity mainly lies in computing the tensor projections, and the time complexity is $O(KNI^3)$, where $K$ is the number of loops needed for the optimization procedure of MMPTR to converge.

4 Experimental results

In this section, we first briefly describe the CASIA(B) gait database, the TUM GAID gait database, and the CASME micro-expression database, then explore the performance of the proposed MMPTR for direct classification and classification after tensor vectorization, and lastly compare it with MPCA, GTDA, and DTSA for gait recognition and micro-expression recognition.

4.1 Experimental data

The experimental analysis was conducted on gait recognition and micro-expression recognition using the following three public databases.

The first database is the CASIA(B) gait database [45], which includes a total of 124 individuals. There are six normal gait sequences, recorded at a resolution of 640 × 480 pixels with a frame rate of 25 fps. The gait period was detected based on the dual-ellipse fitting approach [46]. Each image from the gait sequences was resized to 64 × 64 pixels, and the silhouette was centered. Sample images of one individual are shown in Fig. 5.

Fig. 5 Sample images of one individual from CASIA(B) database

The second database is the TUM GAID gait database [47, 48], which contains RGB video, depth, and audio. There are 305 people in total, and this gait database is one of the largest to date. For the recording, the Microsoft Kinect sensor was used; this sensor provides a video stream, a depth stream, and four-channel audio. Six normal gait sequences per person are recorded at a resolution of 640 × 480 pixels and a frame rate of approximately 30 fps (slightly varying) for both video and depth. The depth resolution is on the order of 1 cm. For depth acquisition, the sensor sends beams of infrared light and infers the depth from reflections off the objects; therefore, placing the sensor outdoors is not possible, since infrared light from the sun can interfere with the depth sensor. The four-channel audio is sampled with 24 bits at 16 kHz. The gait period was detected according to the layered coding of depth information [49]. To be specific, the grayscale value of each depth image frame was extracted after background subtraction. Then, grayscale layered processing of the depth image was performed according to predefined thresholds, and all layer information was quantized and coded uniformly. For the gait fluctuation, we constructed a new signal, the sum of coded pixel points over time. Finally, the gait period was detected based on the points of minimum value of the smoothed signal. Sample images of one individual are shown in Fig. 6.

The third database is the CASME database [50], which contains 195 micro-expressions recorded at resolutions of 640 × 480 and 1280 × 720 pixels at 60 fps. These samples were selected from more than 1500 elicited facial movements. There is no need to separate single independent emotion sequences, because the onset, apex, and offset frames, along with marked action units (AUs), of the micro-expression sequences have already been coded in the database. The emotions of disgust, repression, surprise, and tense are labeled. Sample images of one micro-expression sequence are shown in Fig. 7.

In terms of the recognition problem, gait recognition and micro-expression recognition share some common features. For example, both can be represented as a tensor sample with three modes, namely row, column, and time, and their recognition performance is affected by the varying number of frames per sample. For gait, differences in walking speed lead to variance in the frame number. Figure 8 shows a case where the frames numbered "1", "2", ..., "6" are known but the frame numbered "?" is unknown. It also shows the interpolated images and the difference images between the images interpolated from two adjacent frames of "?" (such as "1" and "2", or "3", "4", ..., "6") and their true images. From the total mean (TM) grayscale value of the difference image, we can see that the estimation error is minimal when using the two nearest neighbor frames. Therefore, the intermediate frames are estimated from their two nearest neighbor frames. In addition, the beginning and final frames of a gait tensor are the same as the starting and ending frames extracted from the original gait period.

Fig. 6 Sample images of one individual from TUM GAID database
Fig. 7 Sample images of one micro-expression sequence from CASME database

For micro-expression, Shen et al. [51] have investigated the effects of expression duration on micro-expression recognition. This paper therefore presents the following frame number normalization approach:

Given a sequence $\{S_i \in \mathbb{R}^{m \times n}, i = 1, \ldots, P\}$ and its normalized sequence $\{S'_j \in \mathbb{R}^{m \times n}, j = 1, \ldots, Q\}$, where $P$ is the original number of frames, $Q$ is the normalized number of frames, and $m$ and $n$ denote the rows and columns of each image, respectively, the linear interpolation compression rate $rate$ can be defined as the ratio of the frame interval after interpolation to the one before:

$$rate = (Q - 1)/(P - 1) \quad (16)$$

The normalized first frame and last frame are stipulated as

$$S'_1 = S_1, \quad S'_Q = S_P \quad (17)$$

If the relationships $(i - 1) \times rate \le j$ and $i \times rate > j$ are met for a normalized frame $j$ and an original frame $i$, the interpolation coefficients $a$ and $b$ can be computed as

$$a = \left| (i - 1) \times rate - (j - 1) \right| / rate \quad (18)$$

$$b = \left| (i - 2) \times rate - (j - 1) \right| / rate \quad (19)$$

Thus, the image of normalized frame $j$ can be expressed as

$$S'_j = a \times S_{i-1} + b \times S_i \quad (20)$$

Through a large number of experiments assessing recognition performance under various frame numbers, Figs. 9, 10, and 11 show the optimal results, interpolated to 23 frames, 26 frames, and 64 frames for the sequences corresponding to Figs. 5, 6, and 7, respectively.
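The following sketch is one possible reading of Eqs. (16)–(20): it linearly interpolates a silhouette sequence of P frames to Q frames while pinning the first and last frames as in Eq. (17). The helper name `normalize_frames` and the 0-based index handling are assumptions made for illustration, not the paper's exact bookkeeping.

```python
import numpy as np

def normalize_frames(seq, Q):
    """seq: array of shape (P, m, n). Returns an interpolated sequence of
    shape (Q, m, n), keeping the first and last frames fixed (Eq. (17))."""
    P = seq.shape[0]
    rate = (Q - 1) / (P - 1)                  # Eq. (16)
    out = np.empty((Q,) + seq.shape[1:], dtype=float)
    out[0], out[-1] = seq[0], seq[-1]
    for j in range(1, Q - 1):
        pos = j / rate                        # position in the original index space
        i = int(np.floor(pos))
        a = (i + 1) - pos                     # weight of the earlier frame
        b = pos - i                           # weight of the later frame
        out[j] = a * seq[i] + b * seq[i + 1]  # linear blend of the two nearest frames (cf. Eq. (20))
    return out

# Example: normalize a 30-frame gait sequence of 64 x 64 silhouettes to 23 frames.
gait = np.random.rand(30, 64, 64)
print(normalize_frames(gait, 23).shape)   # -> (23, 64, 64)
```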

4.2 Experiments on the CASIA(B) gait database

All the gait tensor samples can be expressed as tensors of size 64 × 64 × 23 through the proposed frame number normalization approach. Three samples of each individual are selected randomly and used for training, and the remainder are used for testing. Experiments are conducted to test the average recognition rate (ARR) across 30 random realizations of the training set.

4.2.1 Preprocessing step

Because a large number of interrelated variables exist in the tensor sample sets, we first employ MPCA to reduce the dimensionality while retaining most of the original data variation. Figure 12 shows the ARR and average feature dimension (Dim) used for classification with the testQ for MPCA in each mode ranging from 40 % to 98 %. Results at testQ = 90 % are significantly better than the results at other testQs, indicating that MPCA with testQ = 90 % can be followed by the proposed MMPTR. Thus, all the gait tensor samples can be reduced to about 8 % ((19 × 27 × 15)/(64 × 64 × 23) ≈ 8 %) of the original size, i.e., P1 = 19, P2 = 27, and P3 = 15 by MPCA, which is a very important preprocessing step for the CASIA(B) gait database.

4.2.2 Direct classification versus classification after tensor vectorization

For direct classification using Eq. (13), the proposed MMPTR can yield a best ARR of 77.7 %. However, the proposed MMPTR following MPCA can obtain more encouraging results. The ARR for classification after tensor vectorization of the proposed MMPTR with various testQs, following MPCA for further data reduction, is depicted in Fig. 13. In particular, the first features added are useful, while beyond testQ = 92 % the performance varies very slowly with increasing testQ. We observe that the best ARR of 94.9 % appears when testQ = 96 %.

Fig. 8 Example of error discussion
Fig. 9 Optimal interpolated result with 23 frames for CASIA(B) database
Fig. 10 Optimal interpolated result with 26 frames for TUM GAID database
Fig. 11 Optimal interpolated result with 64 frames for CASME database
Fig. 12 ARR and average feature dimension (Dim) by using MPCA on CASIA(B) database
Fig. 13 ARR by using MPCA followed by the proposed MMPTR on CASIA(B) database

4.2.3 Comparison with existing methods

In this subsection, we compare our method with MPCA [11], MPCA + GTDA [15], and MPCA + DTSA [25]. Table 2 presents the ARRs of our MPCA + MMPTR method compared against MPCA, MPCA + GTDA, and MPCA + DTSA in both direct classification and classification after tensor vectorization for the tensor DR problem. We can see that classification after tensor vectorization improves on the performance of direct classification significantly. By reordering and selecting the preserved features, the feature size for classification after tensor vectorization is also decreased considerably compared with direct classification. Since MPCA can be seen as the preprocessing step, the tensor size for MPCA is larger than for MPCA + GTDA, MPCA + DTSA, and MPCA + MMPTR in direct classification.

Figure 14 shows the plot of the ARR versus the number of features used, ranging from 10 to 370, for MPCA, MPCA + GTDA, MPCA + DTSA, and our method in classification after tensor vectorization. As can be seen, our MPCA + MMPTR method achieves an ARR of 94.9 % and outperforms the other three methods over the larger dimensions ranging from 205 to 260. MPCA performs second best, with more preserved features. MPCA + DTSA is slightly inferior to MPCA, but needs fewer preserved features.

Then, we use rank order statistics to evaluate the proposed method. The rank order statistic is defined as the cumulative probability that the actual class of a test measurement is among its k top matches, where k is called the rank. These performance statistics are reported as cumulative match scores (CMS), which effectively characterize the features' filtering capability. Using the numbers of preserved features selected as in Table 2, Fig. 15 shows the CMS for ranks up to 10 of our MPCA + MMPTR method compared with the aforementioned three methods. As can be seen, our MPCA + MMPTR method outperforms the other three.

Table 2 Recognition results on the CASIA(B) gait database

Method           Direct classification            Classification after tensor vectorization
                 Tensor size       ARR (%)        ARR (%)      Number of features preserved
MPCA             19 × 27 × 15      74.2           92.7         330
MPCA + GTDA      12 × 14 × 3       77.7           90.6         100
MPCA + DTSA      15 × 22 × 12      68.8           92.2         30
MPCA + MMPTR     18 × 24 × 14      77.7           94.9         245

Bold values indicate the best results

Fig. 14 ARR versus the number of features used on CASIA(B) database
Fig. 15 Recognition performance in terms of rank order statistics on CASIA(B) database

Finally, we design and conduct an experiment to check the actual running time of each method. The experimental platform is a workstation equipped with 16 GB of RAM and a hexa-core 3.47-GHz Intel(R) Xeon(R) CPU. Table 3 shows the results, which indicate that the testing time consumed by each method is almost the same. Nonetheless, MPCA + DTSA and MPCA + MMPTR need more training time than MPCA and MPCA + GTDA. The main reason is threefold: first, MPCA combined with other algorithms requires optimizing N more transformation matrices; second, the Laplacian scatters in both MMPTR and DTSA take much longer to compute than the non-Laplacian scatters in MPCA and GTDA; third, the testing time hinges on both the projection calculation and the size of the final feature used for classification. We can also observe that, compared with the direct classification proposal, a little more time is spent during the training process for the classification after tensor vectorization proposal. This is because extra time must be spent rearranging the tensor into a vector according to the class discriminability, which adds a small computational load for the workstation. Though our proposed MPCA + MMPTR method in general trades training speed for precision, its recognition decision time is quite short.

Table 3 Time consumed (s) during the whole training and testing processing on the CASIA(B) gait database

Method           Direct classification         Classification after tensor vectorization
                 Training     Testing          Training     Testing
MPCA             34.0         1.3              41.1         0.9
MPCA + GTDA      78.3         1.2              86.3         0.7
MPCA + DTSA      1410.4       1.4              1415.9       0.6
MPCA + MMPTR     1337.6       1.5              1344.3       0.8

Fig. 16 ARR and average feature dimension (Dim) by using MPCA on TUM GAID database
Fig. 17 ARR by using MPCA followed by the proposed MMPTR on TUM GAID database
Fig. 18 ARR versus the number of features used on TUM GAID database
Fig. 19 Recognition performance in terms of rank order statistics on the TUM GAID database

4.3 Experiments on the TUM GAID gait database

In the experiments on this gait database, all the gait tensor samples are normalized to 64 × 44 × 26 pixels. For each individual, four samples are randomly selected for training and the rest are used for testing. We report the results, including the ARR and average feature dimension (Dim) used for classification, over 30 random splits. We also provide comparisons with MPCA, MPCA + GTDA, and MPCA + DTSA.

4.3.1 Preprocessing step

We use MPCA to reduce the dimensionality. Figure 16 shows the ARR and average feature dimension (Dim) used for classification with the testQ for MPCA in each mode ranging from 40 % to 98 %. We can see that both testQ = 89 % and testQ = 97 % give better results than the other testQs. It is worth noting that a larger testQ produces a larger feature dimension, which may benefit classification; therefore, MPCA with testQ = 97 % is chosen as the preprocessing step for the proposed MMPTR. Thus, all the gait tensor samples can be reduced to P1 = 41, P2 = 23, and P3 = 21 by MPCA. In Fig. 17, we show the ARR results obtained by using MPCA followed by the proposed MMPTR, where the horizontal axis indicates the testQ of MMPTR; the top ARR of 94.3 % appears when testQ = 99 %.

Table 4 Recognition results on the TUM GAID database

Method           Direct classification            Classification after tensor vectorization
                 Tensor size       ARR (%)        ARR (%)      Number of features preserved
MPCA             41 × 23 × 21      68.0           93.3         80
MPCA + GTDA      23 × 15 × 14      68.3           91.3         40
MPCA + DTSA      40 × 22 × 21      55.3           81.3         50
MPCA + MMPTR     39 × 20 × 20      74.3           94.3         70

Bold values indicate the best results

Table 5 Time consumed (s) during the whole training and testing processing on the TUM GAID database

Method           Direct classification         Classification after tensor vectorization
                 Training     Testing          Training     Testing
MPCA             61.2         42.2             75.5         0.8
MPCA + GTDA      168.9        73.2             170.0        1.2
MPCA + DTSA      1752.6       82.5             1754.5       1.3
MPCA + MMPTR     1643.7       81.7             1645.2       1.4

Fig. 20 ARR and average feature dimension (Dim) by using MPCA on CASME database
Fig. 21 ARR by using the proposed MMPTR on CASME database

4.3.2 Direct classification versus classification after tensor vectorization

Our MPCA + MMPTR method based on direct classification with 39 × 20 × 20-dimensional features and on classification after tensor vectorization with 70-dimensional features achieves ARRs of 74.3 % and 94.3 %, respectively. From the results, we see that classification after tensor vectorization outperforms direct classification.

4.3.3 Comparison with existing methods

We compare our method with MPCA, MPCA + GTDA, and MPCA + DTSA. In Fig. 18, the ARR with different numbers of features is plotted based on classification after tensor vectorization. We can see that our MPCA + MMPTR method outperforms the other three methods over several dimensions ranging from 50 to 400. The ARR achieved by MPCA with 80-dimensional features and MPCA + GTDA with 40-dimensional features is 93.3 % and 91.3 %, respectively. The ARR of MPCA + DTSA with 50-dimensional features is only 81.3 %, much lower than MPCA. We present the results obtained based on direct classification and classification after tensor vectorization in Table 4.

Besides, using the numbers of preserved features selected as in Table 4, we plot the CMS results for ranks up to 10 of the different methods in Fig. 19. From the results, we can see that our method outperforms MPCA, MPCA + GTDA, and MPCA + DTSA. In addition, the CPU time consumed by each method on the TUM GAID gait database is tested and given in Table 5. The results suggest that the comparative time consumption is similar to that on the CASIA(B) gait database; however, the TUM GAID gait database needs more training and testing time than the CASIA(B) one, except for MPCA's testing under the classification after tensor vectorization proposal. The TUM GAID gait database is larger than the CASIA(B) one, and as a result, most methods dealing with the TUM GAID gait database require more time. For MPCA's testing under the classification after tensor vectorization proposal, the final feature used for classification is shorter for the TUM GAID gait database than for the CASIA(B) gait database; therefore, MPCA's testing time is less (Table 5).

Table 6 Recognition results on the CASME database

Method    Direct classification            Classification after tensor vectorization
          Tensor size       ARR (%)        ARR (%)      Number of features preserved
MPCA      42 × 42 × 43      42.5           47.6         60
GTDA      59 × 59 × 60      43.3           79.3         20
DTSA      59 × 61 × 53      48.6           52.0         100
MMPTR     64 × 64 × 64      49.5           80.2         25

Bold values indicate the best results

Fig. 22 ARR versus the number of features used on the CASME database
Fig. 23 Recognition performance in terms of rank order statistics on the CASME database

4.4 Experiments on the CASME micro-expression database

All the micro-expression samples are normalized to a third-order tensor of 64 × 64 × 64 in this experiment. For each micro-expression, 15 samples are randomly selected for training and the rest are used for testing. We report the results, including the ARR and average feature dimension (Dim) used for classification, over 30 random splits.

4.4.1 Preprocessing step

In this subsection, we first examine whether MPCA works as a preprocessing step. Figure 20 shows the plots of the ARR and average feature dimension (Dim) versus testQ for MPCA. As can be seen, the performance of MPCA varies with testQ, and the highest ARR is achieved when testQ = 100 %; that is to say, there is no benefit in taking MPCA as a preprocessing step for the micro-expression task. The reason could be that the detailed and tiny features also play a role in recognizing micro-expressions. Therefore, we compare MMPTR with the three tensor-based algorithms MPCA, GTDA, and DTSA in the subsequent subsections.

4.4.2 Direct classification versus classification after tensor vectorization

The ARR for MMPTR is enhanced from 49.45 % to 80.18 % when direct classification is replaced with classification after tensor vectorization. In Fig. 21, we show the ARR results of the proposed MMPTR based on classification after tensor vectorization, where the horizontal axis indicates the testQ of MMPTR; the top ARR of 80.2 % appears when testQ = 100 %. The feature dimensions for the MMPTR method based on direct classification and on classification after tensor vectorization are 64 × 64 × 64 and 25, respectively. From the results, we can observe that classification after tensor vectorization outperforms direct classification (Table 6).

4.4.3 Comparison with existing methods

We compare our method with MPCA, GTDA, and DTSA. Both the ARR and the CMS are reported in Figs. 22 and 23, and the tensor size and ARR for direct classification, as well as the number of features preserved for classification after tensor vectorization, are listed in Table 6. Figure 22 demonstrates that, starting from 10 features, MMPTR gives better recognition performance than the other three algorithms. From Fig. 23, we see that MMPTR again outperforms MPCA, GTDA, and DTSA; furthermore, when a list of the top-10 possible identifications of the tested sample is produced, MMPTR achieves 98.7 % accuracy. We also compare the CPU time consumption of these four methods; Table 7 lists the results, and they are similar to those on the CASIA(B) and TUM GAID gait databases. To the best of our knowledge, the proposed MMPTR is thus the state-of-the-art method for micro-expression recognition in terms of accuracy and speed of identification.

Table 7 Time consumed (s) during the whole training and testing processing on the CASME database

Method    Direct classification         Classification after tensor vectorization
          Training     Testing          Training     Testing
MPCA      2.6          4.7              5.7          0.5
GTDA      23.0         21.5             23.2         0.9
DTSA      302.1        20.6             304.1        1.2
MMPTR     203.1        22.3             204.2        0.9

5 Conclusion

This work proposes a novel tensor subspace analysis algorithm, named MMPTR, for gait recognition and micro-expression recognition. By finding N transformation matrices for N-order tensor data through maximizing the inter-class Laplacian scatter and meanwhile minimizing the intra-class Laplacian scatter, we can extract discriminative and geometry-preserving features for recognition. Our results show that the proposed MMPTR using these features produces better recognition performance than MPCA, GTDA, and DTSA.

Acknowledgments We sincerely thank the Institute of Automation, Chinese Academy of Sciences for granting us permission to use the CASIA(B) gait database, the Institute for Human–Machine Communication, Technische Universität München for granting us permission to use the TUM GAID database, and the Institute of Psychology, Chinese Academy of Sciences for granting us permission to use the CASME database. This project is supported by the Natural Science Foundation of China (Grant Nos. 61201370, 61571275, and 61571274), the Specialized Research Fund for the Doctoral Program of Higher Education of China (Grant No. 20120131120030), the Independent Innovation Foundation for Post-doctoral Scientists of Shandong Province (Grant No. 201303100), the Special Financial Program of China Post-doctoral Science Foundation (Grant No. 2014T70636), the Key Laboratory of Intelligent Perception and Systems for High-Dimensional Information, Ministry of Education (Grant No. 30920140122006), the Shandong Provincial Natural Science Foundation, China (Grant Nos. ZR2014FM030 and ZR2013FM32), and the Young Scholars Program of Shandong University (Grant No. 2015WLJH39).
