Principal Component Analysis for Tensor Analysis and EEG classification


Tensor Analysis for EEG data

Tatsuya Yokota

Tokyo Institute of Technology

February 2, 2012

February 2, 2012 1/26

Outline

1. Introduction

2. Principal Component Analysis

3. Experiments

4. Summary


Brain Computer Interface

A brain-computer interface (BCI) is a direct communication pathway between the brain and an external device. BCIs are often aimed at assisting, augmenting, or repairing human cognitive or sensory-motor functions. BCIs can be separated into three approaches as follows:

Invasive BCIs

Partially-invasive BCIs (ECoG)

Non-invasive BCIs (EEG, MEG, MRI, fMRI)

Invasive and partially-invasive BCIs are accurate. However, they carry risks of infection and damage, and they require an operation to place the electrodes in the head. On the other hand, non-invasive BCIs are less accurate than invasive BCIs, but their costs and risks are very low. In particular, EEG is the most studied potential non-invasive interface, mainly due to its fine temporal resolution, ease of use, portability, and low set-up cost.


Electroencephalogram (EEG)

EEG is the recording of electrical activity along the scalp. EEG measures voltage fluctuations resulting from ionic current flows within the neurons of the brain.

(a) Electrodes (32 channels) (from Wikipedia)

(b) EEG data (from Wikipedia)

In this research, we analyze EEG signals to extract their important features.


Overview of EEG Analysis

EEG analysis involves several steps. Here, we consider the following three. First, we record EEG signals from the electrodes. Next, the EEG signals are transformed into a sparse representation; in this step, the data become a tensor. Finally, we apply a tensor decomposition technique to extract important features.


Wavelet Transform for Sparse Representation [Goupillaud et al., 1984]

First, we introduce the wavelet transform (WT) as one approach to sparse representation. The wavelet transform is given by

$$W(b, a) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} f(t)\, \psi\!\left(\frac{t - b}{a}\right) dt, \tag{1}$$

where $f(t)$ is a signal and $\psi(t)$ is a wavelet function. There are many kinds of wavelets, such as the Haar wavelet, Meyer wavelet, Mexican Hat wavelet, and Morlet wavelet. In this research, we use the complex Morlet wavelet (CMOR), which is given by

$$\psi_{f_b, f_c}(t) = \frac{1}{\sqrt{\pi f_b}}\, e^{i 2 \pi f_c t - (t^2 / f_b)}. \tag{2}$$
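As a rough illustration of Eqs. (1) and (2), the following sketch approximates the integral by a Riemann sum on a uniform time grid. The function names, the toy 10 Hz signal, and the scale-to-frequency mapping a = fc / frequency are assumptions made for this example and are not taken from the slides; fb = 6 and fc = 1 mirror the "CMOR 6-1" setting used in the experiments.

```python
import numpy as np

def cmor(t, fb=6.0, fc=1.0):
    """Complex Morlet wavelet of Eq. (2): bandwidth fb, center frequency fc."""
    return np.exp(2j * np.pi * fc * t - t**2 / fb) / np.sqrt(np.pi * fb)

def cwt_cmor(f, t, freqs, fb=6.0, fc=1.0):
    """Riemann-sum approximation of Eq. (1) on a uniform time grid.

    Returns W with shape (len(freqs), len(t)). Row k uses the scale
    a = fc / freqs[k] (an assumed mapping); column m uses the shift b = t[m].
    """
    dt = t[1] - t[0]
    W = np.empty((len(freqs), len(t)), dtype=complex)
    for k, freq in enumerate(freqs):
        a = fc / freq
        # kernel[m, l] = psi((t[l] - b[m]) / a) with b[m] = t[m]
        kernel = cmor((t[None, :] - t[:, None]) / a, fb, fc)
        W[k] = kernel @ f * dt / np.sqrt(a)
    return W

# Toy signal (hypothetical): a 10 Hz sinusoid sampled at 100 Hz for 3.5 s
t = np.arange(0.0, 3.5, 0.01)
signal = np.cos(2 * np.pi * 10.0 * t)
freqs = np.arange(8, 31)            # 8..30 Hz, the band used in the experiments
W = cwt_cmor(signal, t, freqs)      # time-frequency map of one channel
peak_freq = freqs[np.argmax(np.abs(W[:, len(t) // 2]))]
```

Stacking such time-frequency maps over channels and trials would give the 4-way array (time × frequency × channels × samples) that the later slides analyze.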


What Is a Tensor?

Tensor is a general name for multi-way array data. For example, a 1d-tensor is a vector, a 2d-tensor is a matrix, and a 3d-tensor is a cube. We can imagine a 4d-tensor as a vector of cubes. Similarly, a 5d-tensor is a matrix of cubes, and a 6d-tensor is a cube of cubes.


Tensor Calculation

We introduce some important calculations for tensor algebra. A tensor is described as

$$\mathcal{Y} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}. \tag{3}$$

Each element of $\mathcal{Y}$ is denoted $y_{i_1, i_2, \ldots, i_N}$.

Mode-n tensor-matrix product

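$$\mathcal{Y} = \mathcal{G} \times_n \mathbf{A}, \tag{4}$$

$$y_{i_1, \ldots, j, \ldots, i_N} = \sum_{i_n = 1}^{I_n} g_{i_1, \ldots, i_n, \ldots, i_N}\, a_{j, i_n}, \tag{5}$$

where $\mathcal{Y} \in \mathbb{R}^{I_1 \times \cdots \times J \times \cdots \times I_N}$, $\mathcal{G} \in \mathbb{R}^{I_1 \times \cdots \times I_N}$, and $\mathbf{A} \in \mathbb{R}^{J \times I_n}$. We have the following calculation rules (with $m \neq n$ in (6)):

$$(\mathcal{G} \times_n \mathbf{A}) \times_m \mathbf{B} = (\mathcal{G} \times_m \mathbf{B}) \times_n \mathbf{A} = \mathcal{G} \times_n \mathbf{A} \times_m \mathbf{B}, \tag{6}$$

$$(\mathcal{G} \times_n \mathbf{A}) \times_n \mathbf{B} = \mathcal{G} \times_n (\mathbf{B}\mathbf{A}). \tag{7}$$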
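The mode-n product and the two rules can be checked numerically. The sketch below is one possible NumPy implementation, assuming the matrix has shape (J, I_n), which is the convention under which rule (7) holds as written. The helper name `mode_n_product` and the 0-based mode index are choices made for this example.

```python
import numpy as np

def mode_n_product(G, A, n):
    """Mode-n product G x_n A for a tensor G and a matrix A of shape (J, I_n).

    Axis n of G (length I_n) is contracted with the columns of A, and the new
    axis of length J is moved back to position n (n is 0-based here).
    """
    out = np.tensordot(A, G, axes=(1, n))   # the new axis of length J comes first
    return np.moveaxis(out, 0, n)

rng = np.random.default_rng(0)
G = rng.standard_normal((3, 4, 5))
A = rng.standard_normal((6, 4))    # acts on the second mode (axis 1)
B = rng.standard_normal((2, 5))    # acts on the third mode (axis 2)
B2 = rng.standard_normal((7, 6))   # acts on the second mode after A

# Rule (6): products along different modes commute
lhs6 = mode_n_product(mode_n_product(G, A, 1), B, 2)
rhs6 = mode_n_product(mode_n_product(G, B, 2), A, 1)

# Rule (7): two products along the same mode collapse into one
lhs7 = mode_n_product(mode_n_product(G, A, 1), B2, 1)
rhs7 = mode_n_product(G, B2 @ A, 1)
```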


Outer product and Kronecker product

The outer product of vectors is given by

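$$\mathbf{A} = \mathbf{a} \circ \mathbf{b} = \mathbf{a}\mathbf{b}^T \in \mathbb{R}^{I \times J}, \tag{8}$$

$$\mathcal{Z} = \mathbf{a} \circ \mathbf{b} \circ \mathbf{c} \in \mathbb{R}^{I \times J \times K}, \tag{9}$$

$$\mathcal{Y} = \mathbf{a}^{(1)} \circ \cdots \circ \mathbf{a}^{(N)} \in \mathbb{R}^{I_1 \times \cdots \times I_N}. \tag{10}$$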

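The Kronecker product of two matrices $\mathbf{A} \in \mathbb{R}^{I \times J}$ and $\mathbf{B} \in \mathbb{R}^{T \times R}$ is a matrix denoted as

$$\mathbf{A} \otimes \mathbf{B} \in \mathbb{R}^{IT \times JR} \tag{11}$$

and defined as

$$\mathbf{A} \otimes \mathbf{B} = \begin{pmatrix} a_{11}\mathbf{B} & a_{12}\mathbf{B} & \cdots & a_{1J}\mathbf{B} \\ a_{21}\mathbf{B} & a_{22}\mathbf{B} & \cdots & a_{2J}\mathbf{B} \\ \vdots & \vdots & \ddots & \vdots \\ a_{I1}\mathbf{B} & a_{I2}\mathbf{B} & \cdots & a_{IJ}\mathbf{B} \end{pmatrix}. \tag{12}$$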
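Both products are available in NumPy. The following small example, with arbitrarily chosen vectors and matrices, shows the shapes stated in Eqs. (8), (9), and (11) and the block structure of Eq. (12).

```python
import numpy as np

a = np.array([1.0, 2.0])               # length I = 2
b = np.array([3.0, 4.0, 5.0])          # length J = 3
c = np.array([6.0, 7.0, 8.0, 9.0])     # length K = 4

A = np.outer(a, b)                     # Eq. (8): a o b = a b^T, shape (I, J)
Z = np.einsum('i,j,k->ijk', a, b, c)   # Eq. (9): z_ijk = a_i * b_j * c_k

M = np.array([[1.0, 2.0],
              [3.0, 4.0]])             # plays the role of A in Eq. (11), 2 x 2
N = np.array([[0.0, 1.0, 2.0],
              [3.0, 4.0, 5.0]])        # plays the role of B in Eq. (11), 2 x 3
K = np.kron(M, N)                      # Eq. (12): blocks m_ij * N, shape (4, 6)
```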


Unfolding Tensor (Matricization)

Unfolding is a very important technique in tensor analysis. $\mathbf{Y}_{(n)}$ denotes the mode-n unfolded matrix of $\mathcal{Y}$.

Unfolding

Let $\mathcal{Y} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ be an Nd-tensor. Its mode-n unfolded matrix is

$$\mathbf{Y}_{(n)} \in \mathbb{R}^{I_n \times (I_1 \cdots I_{n-1} I_{n+1} \cdots I_N)}. \tag{13}$$

Figure: Unfolding Image of 4d-tensor
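A minimal NumPy sketch of unfolding and its inverse is given below. Eq. (13) only fixes the shape of the unfolded matrix; the column ordering used here (the lowest remaining mode varies fastest) is an assumption, and the helper names `unfold` and `fold` are illustrative.

```python
import numpy as np

def unfold(Y, n):
    """Mode-n unfolding: rows index mode n, columns enumerate all other modes.

    Assumed column ordering: the lowest remaining mode varies fastest.
    """
    return np.reshape(np.moveaxis(Y, n, 0), (Y.shape[n], -1), order='F')

def fold(Yn, n, shape):
    """Inverse of unfold for a tensor with the given full shape."""
    rest = [s for i, s in enumerate(shape) if i != n]
    Y = np.reshape(Yn, [shape[n]] + rest, order='F')
    return np.moveaxis(Y, 0, n)

Y = np.arange(2 * 3 * 4 * 5, dtype=float).reshape(2, 3, 4, 5)   # a 4d-tensor
shapes = [unfold(Y, n).shape for n in range(Y.ndim)]
restored = fold(unfold(Y, 2), 2, Y.shape)
```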


Tucker3 model

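The Tucker3 model is given by

$$\mathcal{Z} = \mathcal{C} \times_1 \mathbf{G} \times_2 \mathbf{H} \times_3 \mathbf{E} \tag{14}$$

$$= \sum_{r=1}^{R} \sum_{s=1}^{S} \sum_{t=1}^{T} c_{rst}\, \mathbf{g}_r \circ \mathbf{h}_s \circ \mathbf{e}_t. \tag{15}$$

Using unfolding, it can also be described as

$$\mathbf{Z}_{(1)} = \mathbf{G}\mathbf{C}_{(1)}(\mathbf{E}^T \otimes \mathbf{H}^T), \tag{16}$$

$$\mathbf{Z}_{(2)} = \mathbf{H}\mathbf{C}_{(2)}(\mathbf{G}^T \otimes \mathbf{E}^T), \tag{17}$$

$$\mathbf{Z}_{(3)} = \mathbf{E}\mathbf{C}_{(3)}(\mathbf{H}^T \otimes \mathbf{G}^T). \tag{18}$$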
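The three unfolded forms can be verified numerically on a random Tucker3 model. The sketch below assumes a cyclic unfolding (modes ordered n, n+1, n+2 with the first remaining mode varying fastest), chosen because it reproduces the Kronecker factors of Eqs. (16) to (18). The helper name `unfold3` and the sizes are illustrative.

```python
import numpy as np

def unfold3(Z, n):
    """Cyclic mode-n unfolding of a 3-way tensor (n = 0, 1, 2).

    Modes are ordered n, n+1, n+2 (mod 3), and the first remaining mode varies
    fastest along the columns (an assumed convention matching Eqs. (16)-(18)).
    """
    order = (n, (n + 1) % 3, (n + 2) % 3)
    return np.reshape(np.transpose(Z, order), (Z.shape[n], -1), order='F')

rng = np.random.default_rng(0)
I, J, K, R, S, T = 5, 4, 3, 3, 2, 2
C = rng.standard_normal((R, S, T))   # core tensor
G = rng.standard_normal((I, R))
H = rng.standard_normal((J, S))
E = rng.standard_normal((K, T))

# Eq. (15): z_ijk = sum over r, s, t of c_rst * g_ir * h_js * e_kt
Z = np.einsum('rst,ir,js,kt->ijk', C, G, H, E)

Z1 = G @ unfold3(C, 0) @ np.kron(E.T, H.T)   # Eq. (16)
Z2 = H @ unfold3(C, 1) @ np.kron(G.T, E.T)   # Eq. (17)
Z3 = E @ unfold3(C, 2) @ np.kron(H.T, G.T)   # Eq. (18)
```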


Tucker Decomposition (general formula)

The Tucker model is a well-known and general model of tensor decomposition. A given tensor $\mathcal{Y}$ is decomposed into a set of matrices $\{\mathbf{A}^{(n)}\}_{n=1}^{N}$ and one small core tensor $\mathcal{G}$.

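Tucker Model

$$\mathcal{Y} = \mathcal{G} \times_1 \mathbf{A}^{(1)} \times_2 \mathbf{A}^{(2)} \cdots \times_N \mathbf{A}^{(N)} \tag{19}$$

$$= \sum_{j_1=1}^{J_1} \cdots \sum_{j_N=1}^{J_N} g_{j_1, \ldots, j_N}\, \mathbf{a}^{(1)}_{j_1} \circ \cdots \circ \mathbf{a}^{(N)}_{j_N} \tag{20}$$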

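Furthermore, it can be described as follows by using unfolding.

Unfolded Tucker Model

$$\mathbf{Y}_{(n)} = \mathbf{A}^{(n)} \mathbf{G}_{(n)} \left(\mathbf{A}^{(N)} \otimes \cdots \otimes \mathbf{A}^{(n+1)} \otimes \mathbf{A}^{(n-1)} \otimes \cdots \otimes \mathbf{A}^{(1)}\right)^T \tag{21}$$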
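Eq. (21) can be checked for a random 4-way Tucker model. The sketch assumes that each factor has shape (I_n, J_n) and that the unfolding orders its columns with the lowest remaining mode varying fastest. All helper names and sizes are illustrative.

```python
import numpy as np
from functools import reduce

def unfold(Y, n):
    """Mode-n unfolding; the lowest remaining mode varies fastest (assumed)."""
    return np.reshape(np.moveaxis(Y, n, 0), (Y.shape[n], -1), order='F')

def mode_n_product(G, A, n):
    """Mode-n product G x_n A with A of shape (I_n, J_n); n is 0-based."""
    return np.moveaxis(np.tensordot(A, G, axes=(1, n)), 0, n)

rng = np.random.default_rng(1)
core_shape, full_shape = (2, 3, 2, 2), (4, 5, 3, 6)
G = rng.standard_normal(core_shape)
factors = [rng.standard_normal((I, J)) for I, J in zip(full_shape, core_shape)]

# Eq. (19): apply one factor matrix per mode
Y = G
for n, A in enumerate(factors):
    Y = mode_n_product(Y, A, n)

def unfolded_tucker(G, factors, n):
    """Right-hand side of Eq. (21) for mode n (0-based)."""
    # Kronecker product of all other factors, from the last mode down to the first
    others = [factors[m] for m in reversed(range(len(factors))) if m != n]
    return factors[n] @ unfold(G, n) @ reduce(np.kron, others).T
```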


Kinds of Tensor Decomposition [Cichocki et al., 2009]

Tensor decomposition has many degrees of freedom, so there are many decomposition methods. The kind of tensor decomposition depends on the constraints. For example, if we constrain the matrices $\{\mathbf{A}^{(n)}\}_{n=1}^{N}$ and the core tensor $\mathcal{G}$ to be non-negative, the method is non-negative tensor factorization (NTF). If we impose an independence constraint, it is independent component analysis (ICA). If we impose sparsity constraints, it is sparse component analysis (SCA). If we impose orthogonality constraints, it is principal component analysis (PCA).


Principal Component Analysis [Kroonenberg and de Leeuw, 1980] [Henrion, 1994]

Principal Component Analysis (PCA) is a very typical method for signal analysis. In this slide, we explain PCA for the case of 3d-tensor decomposition. The tensor decomposition model is given by

$$\mathcal{Z} = \mathcal{C} \times_1 \mathbf{G} \times_2 \mathbf{H} \times_3 \mathbf{E}. \tag{22}$$

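The criterion of PCA is given by

Criterion for PCA

$$\text{minimize} \quad \|\mathcal{Z} - \mathcal{C} \times_1 \mathbf{G} \times_2 \mathbf{H} \times_3 \mathbf{E}\|_F^2 \tag{23}$$

$$\text{subject to} \quad \mathbf{G}^T\mathbf{G} = \mathbf{I}, \quad \mathbf{H}^T\mathbf{H} = \mathbf{I}, \quad \mathbf{E}^T\mathbf{E} = \mathbf{I}. \tag{24}$$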

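The goal of this criterion is to minimize the error of the decomposed model, subject to the matrices $\mathbf{G}$, $\mathbf{H}$, and $\mathbf{E}$ having orthonormal columns. Using unfolding, (23) can also be written as

$$\min \|\mathbf{Z}_{(1)} - \mathbf{G}\mathbf{C}_{(1)}(\mathbf{E} \otimes \mathbf{H})^T\|_F^2. \tag{25}$$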


Criterion for 3-way PCA

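The criterion for 3-way PCA is given by

$$\text{minimize} \quad f := \|\mathbf{Z}_{(1)} - \mathbf{G}\mathbf{C}_{(1)}(\mathbf{E}^T \otimes \mathbf{H}^T)\|_F^2 \tag{26}$$

$$\text{subject to} \quad \mathbf{G}^T\mathbf{G} = \mathbf{I}, \quad \mathbf{E}^T\mathbf{E} = \mathbf{I}, \quad \mathbf{H}^T\mathbf{H} = \mathbf{I}. \tag{27}$$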

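For fixed $\mathbf{G}$, $\mathbf{H}$, and $\mathbf{E}$ satisfying (27), the core that minimizes $f$ is

$$\mathbf{C}_{(1)} = \mathbf{G}^T\mathbf{Z}_{(1)}(\mathbf{E} \otimes \mathbf{H}). \tag{28}$$

Substituting (28) into $f$,

$$f = \|\mathbf{Z}_{(1)} - \mathbf{G}\mathbf{G}^T\mathbf{Z}_{(1)}(\mathbf{E} \otimes \mathbf{H})(\mathbf{E}^T \otimes \mathbf{H}^T)\|_F^2 \tag{29}$$

$$= \mathrm{tr}(\mathbf{Z}_{(1)}\mathbf{Z}_{(1)}^T) - \mathrm{tr}(\mathbf{G}^T\mathbf{Z}_{(1)}(\mathbf{E}\mathbf{E}^T \otimes \mathbf{H}\mathbf{H}^T)\mathbf{Z}_{(1)}^T\mathbf{G}). \tag{30}$$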

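Since $\mathrm{tr}(\mathbf{Z}_{(1)}\mathbf{Z}_{(1)}^T)$ is constant, the criterion can be rewritten as

$$\text{maximize} \quad \mathrm{tr}(\mathbf{G}^T\mathbf{Z}_{(1)}(\mathbf{E}\mathbf{E}^T \otimes \mathbf{H}\mathbf{H}^T)\mathbf{Z}_{(1)}^T\mathbf{G}) \tag{31}$$

$$\text{subject to} \quad \mathbf{G}^T\mathbf{G} = \mathbf{I}, \quad \mathbf{E}^T\mathbf{E} = \mathbf{I}, \quad \mathbf{H}^T\mathbf{H} = \mathbf{I}. \tag{32}$$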
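The step from (29) to (30) can be confirmed numerically. In the sketch below, a random matrix stands in for the unfolded data, and random matrices with orthonormal columns play the roles of the three factors; all sizes and names are illustrative. It also checks that moving the core away from Eq. (28) increases the error.

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K, R, S, T = 6, 5, 4, 3, 2, 2

def random_orthonormal(rows, cols):
    """Matrix with orthonormal columns, from the QR factorization of a random matrix."""
    Q, _ = np.linalg.qr(rng.standard_normal((rows, cols)))
    return Q

G = random_orthonormal(I, R)
H = random_orthonormal(J, S)
E = random_orthonormal(K, T)
Z1 = rng.standard_normal((I, J * K))     # stands in for the unfolded data Z_(1)

EH = np.kron(E, H)                       # Kronecker product of E and H
C1 = G.T @ Z1 @ EH                       # Eq. (28)
f = np.linalg.norm(Z1 - G @ C1 @ EH.T, 'fro') ** 2              # Eq. (29)
p = np.trace(G.T @ Z1 @ np.kron(E @ E.T, H @ H.T) @ Z1.T @ G)   # trace term of (30)
f_from_trace = np.trace(Z1 @ Z1.T) - p                          # Eq. (30)

# Any other core, here C1 shifted by a constant, gives a larger error
f_perturbed = np.linalg.norm(Z1 - G @ (C1 + 0.1) @ EH.T, 'fro') ** 2
```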


Solution Algorithm

Note that

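$$p(\mathbf{G}, \mathbf{H}, \mathbf{E}) := \mathrm{tr}(\mathbf{G}^T\mathbf{Z}_{(1)}(\mathbf{E}\mathbf{E}^T \otimes \mathbf{H}\mathbf{H}^T)\mathbf{Z}_{(1)}^T\mathbf{G}) \tag{33}$$

$$= \mathrm{tr}(\mathbf{H}^T\mathbf{Z}_{(2)}(\mathbf{G}\mathbf{G}^T \otimes \mathbf{E}\mathbf{E}^T)\mathbf{Z}_{(2)}^T\mathbf{H}) \tag{34}$$

$$= \mathrm{tr}(\mathbf{E}^T\mathbf{Z}_{(3)}(\mathbf{H}\mathbf{H}^T \otimes \mathbf{G}\mathbf{G}^T)\mathbf{Z}_{(3)}^T\mathbf{E}). \tag{35}$$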

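Each of the three forms isolates one factor matrix, so $\mathbf{G}$, $\mathbf{H}$, and $\mathbf{E}$ can be updated in turn while the other two are held fixed. The solution algorithm is illustrated as follows:

Figure: Alternating Least Squares (ALS) Algorithm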
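One common way to realize such an alternating scheme is sketched below: with two factors fixed, the third is set to the leading left singular vectors of the corresponding projected unfolding, which maximizes the matching form of p. This is a sketch under stated assumptions, not necessarily the exact procedure in the figure; the function names, the random initialization, the fixed iteration count, and the cyclic unfolding are choices made for this example.

```python
import numpy as np

def unfold3(Z, n):
    """Cyclic mode-n unfolding of a 3-way tensor (assumed convention)."""
    order = (n, (n + 1) % 3, (n + 2) % 3)
    return np.reshape(np.transpose(Z, order), (Z.shape[n], -1), order='F')

def leading_left_singular_vectors(M, r):
    """Orthonormal basis of the r dominant left singular directions of M."""
    U, _, _ = np.linalg.svd(M, full_matrices=False)
    return U[:, :r]

def pca3_als(Z, ranks, n_iter=20, seed=0):
    """Alternating updates of G, H, E for 3-way PCA with ranks (R, S, T)."""
    R, S, T = ranks
    rng = np.random.default_rng(seed)
    Z1, Z2, Z3 = (unfold3(Z, n) for n in range(3))
    # Random orthonormal starting points for H and E (G is computed first)
    H = np.linalg.qr(rng.standard_normal((Z.shape[1], S)))[0]
    E = np.linalg.qr(rng.standard_normal((Z.shape[2], T)))[0]
    history = []
    for _ in range(n_iter):
        G = leading_left_singular_vectors(Z1 @ np.kron(E, H), R)   # form (33)
        H = leading_left_singular_vectors(Z2 @ np.kron(G, E), S)   # form (34)
        E = leading_left_singular_vectors(Z3 @ np.kron(H, G), T)   # form (35)
        # p(G, H, E) equals the squared Frobenius norm of the projected core
        history.append(np.linalg.norm(G.T @ Z1 @ np.kron(E, H)) ** 2)
    C1 = G.T @ Z1 @ np.kron(E, H)        # unfolded core, as in Eq. (28)
    return G, H, E, C1, history

rng = np.random.default_rng(1)
Z = rng.standard_normal((8, 7, 6))       # hypothetical data tensor
G, H, E, C1, history = pca3_als(Z, ranks=(3, 2, 2))
```

Because each update maximizes p over one factor with the others fixed, the recorded values of p should never decrease from one sweep to the next.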


Experiments: Data sets [Blankertz, 2005]

BCI Competition III: IVa

EEG Motor Imagery Classification data set (right, foot)

There are 5 subjects (aa, al, av, aw, ay)

Each imagery trial lasts 3.5 s

118 EEG channels

Table: Number of Samples

      #train  #test
aa    168     112
al    224     56
av    84      196
aw    56      224
ay    28      252

Figure: Layout of the 118 numbered EEG channels (both axes range from -1 to 1)


Experiments: Procedure

1. Transformation into CMOR domain (time × frequency × channels × samples)

   CMOR 6-1 (case 1, case 2, case 3)

2. Applying Dimensionality Reduction

   Unused
   PCA (6-6-6, 4-4-4, 2-2-2, 1-1-1)

3. Classification

   K-Nearest Neighbor method
   Least Squares Regression (Kernel regression)

        time frame        frequency  channels   samples  # of elements
case 1  35 (0:0.1:3.5)    23 (8:30)  118        280      26597200
case 2  350 (0:0.01:3.5)  23 (8:30)  7 (51:57)  280      15778000
case 3  35 (0:0.1:3.5)    23 (8:30)  7 (51:57)  280      1577800
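The classification step can be sketched on synthetic data. The example below implements a k-nearest-neighbor vote (k = 3, as in the kNN-3 result tables) and, as a simplifying assumption, a plain linear least squares regression with a small ridge term rather than the kernel variant named in the procedure. The feature vectors are random stand-ins for vectorized (optionally PCA-reduced) trials, and all names are hypothetical.

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k=3):
    """k-nearest-neighbor vote with Euclidean distance; labels are +1 / -1."""
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    return np.sign(y_train[nearest].sum(axis=1))   # an odd k avoids ties

def lsr_fit(X_train, y_train, ridge=1e-3):
    """Linear least squares regression onto +1 / -1 labels (small ridge term)."""
    Xb = np.hstack([X_train, np.ones((len(X_train), 1))])   # append a bias column
    return np.linalg.solve(Xb.T @ Xb + ridge * np.eye(Xb.shape[1]), Xb.T @ y_train)

def lsr_predict(w, X_test):
    """Classify by the sign of the regression output."""
    Xb = np.hstack([X_test, np.ones((len(X_test), 1))])
    return np.sign(Xb @ w)

rng = np.random.default_rng(0)

def make_split(n_per_class):
    """Two well-separated Gaussian classes in 5 dimensions (synthetic stand-in)."""
    X = np.vstack([rng.standard_normal((n_per_class, 5)) + 3.0,
                   rng.standard_normal((n_per_class, 5)) - 3.0])
    y = np.concatenate([np.ones(n_per_class), -np.ones(n_per_class)])
    return X, y

X_train, y_train = make_split(40)
X_test, y_test = make_split(20)
acc_knn = np.mean(knn_predict(X_train, y_train, X_test) == y_test)
acc_lsr = np.mean(lsr_predict(lsr_fit(X_train, y_train), X_test) == y_test)
```

On real EEG features the classes overlap far more than in this toy setting, so the accuracies here say nothing about the results reported in the following tables.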


Experiments: Results I

Table: case 1, kNN-3

      Unused  (6-6-6)  (4-4-4)  (2-2-2)  (1-1-1)
aa    50.00   49.10    44.64    50.00    58.92
al    62.50   62.50    62.50    58.92    46.42
av    54.08   51.53    53.06    53.06    55.10
aw    54.91   50.00    53.12    49.10    54.01
ay    51.98   47.22    44.84    41.66    44.84
Ave.  54.69   52.07    51.63    50.55    51.86

Table: case 1, LSR

      Unused  (6-6-6)  (4-4-4)  (2-2-2)  (1-1-1)
aa    58.92   60.71    59.82    55.35    58.92
al    76.78   71.42    76.78    75.00    57.14
av    58.16   57.65    54.08    55.61    56.12
aw    63.83   64.28    56.69    62.50    62.50
ay    48.81   51.19    50.79    48.41    48.80
Ave.  61.31   61.05    59.63    59.37    56.69


Experiments: Results II

Table: case 2, kNN-3

      Unused  (6-6-6)  (4-4-4)  (2-2-2)  (1-1-1)
aa    56.25   44.64    52.67    47.32    48.21
al    78.57   80.35    78.57    80.35    57.14
av    60.20   52.04    57.65    55.61    46.42
aw    58.48   57.58    54.46    57.58    52.67
ay    57.93   59.92    59.92    58.33    44.84
Ave.  62.28   58.90    60.65    59.83    47.65

Table: case 2, LSR

      Unused  (6-6-6)  (4-4-4)  (2-2-2)  (1-1-1)
aa    62.50   58.92    58.03    58.92    53.57
al    87.50   87.50    85.71    83.92    64.28
av    61.22   56.12    56.12    52.04    55.61
aw    66.07   64.73    61.60    62.94    60.71
ay    57.53   62.69    70.23    73.80    48.41
Ave.  66.96   65.99    66.33    66.32    56.51


Experiments: Results III

Table: case 3, kNN-3

      Unused  (6-6-6)  (4-4-4)  (2-2-2)  (1-1-1)
aa    55.35   50.89    50.00    46.42    44.64
al    78.57   78.57    78.57    78.57    57.14
av    60.20   53.57    55.61    51.02    50.00
aw    57.14   55.35    54.01    57.58    51.78
ay    58.33   58.33    59.52    56.34    44.64
Ave.  61.91   59.34    59.54    57.98    49.64

Table: case 3, LSR

      Unused  (6-6-6)  (4-4-4)  (2-2-2)  (1-1-1)
aa    57.14   60.71    55.35    56.25    51.78
al    87.50   85.71    85.71    83.92    60.71
av    59.18   55.61    54.08    54.08    53.06
aw    66.07   63.39    58.03    62.50    60.26
ay    57.53   57.93    61.11    63.88    48.41
Ave.  65.48   64.67    62.85    64.12    54.84


BCI Competition: IVa Ranking

    contributor      ave.   aa    al    av    aw    ay
1   Yijun Wang       94.74  95.5  100   80.6  100   97.6
2   Yuanqing Li      87.40  89.3  98.2  76.5  92.4  80.6
3   Liu Yang         84.54  82.1  94.6  70.4  87.5  88.1
4   Zhou Zongtan     77.24  83.9  100   63.3  50.9  88.1
5   Michael Bensch   74.14  73.2  96.4  70.4  79.9  50.8
6   Codric Simon     73.28  83.0  91.1  50.0  87.9  54.4
7   Elly Gysels      72.36  69.6  96.4  64.3  69.6  61.9
8   Carmen Viduarre  69.62  66.1  100   63.3  64.3  54.4
9   Le Song          69.00  66.1  92.9  67.3  68.3  50.4
10  Ehsan Arbabi     68.26  70.5  94.6  56.1  63.8  56.3
11  Cyrus Shahabi    61.98  57.1  76.8  57.7  64.3  54.0
12  Kiyoung Yang     59.02  52.7  85.7  61.2  51.8  43.7
13  Hyunjin Yoon     53.76  50.0  67.9  52.6  52.7  45.6
14  Wang Feng        52.26  50.9  53.6  54.6  56.2  46.0

My best result is an average accuracy of 66.96% (case 2, LSR, without PCA).


Summary

N-way PCA is very efficient for dimensionality reduction.

High-dimensional data can be reduced easily.
In these results, the accuracies were almost maintained after reduction.
However, the accuracies did not improve.

EEG classification is a very difficult problem.

Feature extraction and preprocessing are considered important.
In particular, channel selection might be very important.


Bibliography I

[Blankertz, 2005] Blankertz, B. (2005). BCI Competition III. http://www.bbci.de/competition/iii/.

[Cichocki et al., 2009] Cichocki, A., Zdunek, R., Phan, A. H., and Amari, S. (2009). Nonnegative Matrix and Tensor Factorizations: Applications to Exploratory Multi-way Data Analysis. Wiley.

[Goupillaud et al., 1984] Goupillaud, P., Grossmann, A., and Morlet, J. (1984). Cycle-octave and related transforms in seismic signal analysis. Geoexploration, 23(1):85–102.

[Henrion, 1994] Henrion, R. (1994). N-way principal component analysis theory, algorithms and applications. Chemometrics and Intelligent Laboratory Systems, 25:1–23.

[Hyvarinen et al., 2001] Hyvarinen, A., Karhunen, J., and Oja, E. (2001). Independent Component Analysis. Wiley.


Bibliography II

[Kroonenberg and de Leeuw, 1980] Kroonenberg, P. and de Leeuw, J. (1980). Principal component analysis of three-mode data by means of alternating least squares algorithms. Psychometrika, 45:69–97.


Thank you for listening
