L1-Penalized N-way PLS for subset of electrodes selection in BCI experiments



2012 J Neural Eng 9 045010

(http://iopscience.iop.org/1741-2552/9/4/045010)


IOP PUBLISHING JOURNAL OF NEURAL ENGINEERING

J. Neural Eng. 9 (2012) 045010 (14pp) doi:10.1088/1741-2560/9/4/045010

Andrey Eliseyev1,2, Cecile Moro2, Jean Faber1,2, Alexander Wyss2, Napoleon Torres2, Corinne Mestais2, Alim Louis Benabid2,3 and Tetiana Aksenova1,2,4

1 Foundation Nanosciences, Grenoble, France
2 Clinatec/LETI/CEA, Grenoble, France
3 Joseph Fourier University of Grenoble, France

E-mail: tetiana.aksenova@cea.fr

Received 11 November 2011
Accepted for publication 18 April 2012
Published 25 July 2012
Online at stacks.iop.org/JNE/9/045010

Abstract

Recently, the N-way partial least squares (NPLS) approach was reported as an effective tool for neuronal signal decoding and brain–computer interface (BCI) system calibration. This method simultaneously analyzes data in several domains. It combines the projection of a data tensor to a low-dimensional space with linear regression. In this paper, the L1-Penalized NPLS is proposed for sparse BCI system calibration; it unites the projection technique with an effective selection of a subset of features. The L1-Penalized NPLS was applied to the calibration of a binary self-paced BCI system, providing the selection of a subset of electrodes. Our BCI system is designed for animal research, in particular for research in non-human primates.

(Some figures may appear in colour only in the online journal)

Introduction

Multi-way (tensor-based) analysis was recently reported as an effective tool for neuronal signal processing (Martínez-Montes et al 2004, Nazarpour et al 2006, Acar et al 2007, Mørup et al 2006, 2008, Zhao et al 2009). The advantage of this approach is the simultaneous treatment of data in several domains (modalities, or ways of analysis) to improve information extraction. Spatial, frequency and temporal modalities are mostly considered in neuronal signal processing (Pfurtscheller et al 2003, Schalk et al 2007, Vidaurre et al 2009). For multi-way data analysis, observations are represented in the form of multi-way arrays (tensors). To map the neuronal recordings to the spatial–frequency–temporal space, the wavelet transform is mainly used.

Recently, multi-way analysis was reported as a tool for neuronal signal decoding in brain–computer interface (BCI) studies (Nazarpour et al 2006, Zhao et al 2009, Chao et al 2010, Phan et al 2010, Li et al 2009, Li and Zhang 2010, Eliseyev et al 2011). BCI aims to provide an alternative non-muscular communication pathway to send commands to the external world by means of analysis of recorded brain neuronal activity. Tensor-based approaches have been applied to decode electroencephalograms (EEG) (Zhao et al 2009, Phan et al 2010, Li et al 2009, Li and Zhang 2010) and electrocorticograms (ECoG) (Eliseyev et al 2011, Chao et al 2010) associated with cue-paced (Nazarpour et al 2006, Phan et al 2010) and self-paced (Eliseyev et al 2011, Chao et al 2010) BCI paradigms. The cue-paced (synchronized) control strategy uses external cues to drive the interaction between subjects and the BCI system. Thus, the users are supposed to generate commands only during specific periods. In contrast to cue-paced systems, no stimulus is used in self-paced BCI systems. As users control them intentionally, self-paced BCI systems provide more freedom and control flexibility. However, they are based on continuous monitoring of neuronal activity and are more difficult to realize. As a result, most reported BCIs are synchronized (see, for instance, Wolpaw et al 2002, Schalk et al 2008). Even when self-paced tasks were carried out, only selected time intervals (trials) corresponding to task preparation and execution were classified (e.g. Ball et al 2009). Nevertheless, several groups have recently designed self-paced BCI systems, introducing a 'class zero' to discriminate mental events from basic neuronal activity (e.g. Bashashati et al 2007, Leeb et al 2007, Scherer et al 2008, Fatourechi et al 2008, Müller-Putz et al 2010, Qian et al 2010).

4 Author to whom any correspondence should be addressed.

1741-2560/12/045010+14$33.00 © 2012 IOP Publishing Ltd Printed in the UK & the USA

In general, the first stage of a BCI study consists of calibration (learning) of the BCI system. The model identified at this stage allows an external effector to be controlled by means of neuronal activity at the next stage of online experiments. The model estimates the effector state (output variable) depending on neuronal activity (input variables). Tensor input and scalar output variables are often chosen in tensor-based BCIs (Zhao et al 2009, Eliseyev et al 2011). In cue-paced BCIs, the binary output can be used to characterize different types of control (e.g. left- versus right-hand activity, see Zhao et al 2009). In self-paced BCIs, binary output can be used to distinguish predefined events from general activity (e.g. pedal pressing versus non-pressing event, see Eliseyev et al 2011). Binary output can be predicted by classifiers (Phan et al 2010, Zhao et al 2009, Li et al 2009, Li and Zhang 2010) or by regression of binary output from observation tensors (Eliseyev et al 2011). All these studies have in common that tensor factorizations are carried out to project the data to a low-dimensional space.

The particular goal of our study is to adapt self-paced BCI to real-life applications, including long-term BCI experiments in a natural environment. Recently (Eliseyev et al 2011), self-paced BCI was shown to work well in an almost natural environment. A freely moving animal (rat) was trained to push a pedal to activate a food dispenser without any external stimulus. Neuronal activity was monitored and intentional control patterns were recognized by the BCI system. To identify the predictive model, multi-way analysis, namely the N-way partial least squares (NPLS) approach (Bro 1996), was applied to project the feature tensor into a low-dimensional feature space of latent variables and to estimate the regression model for the intentional control prediction. NPLS uses class information to perform the tensor decomposition, which significantly increases the efficiency of modeling. As NPLS works without any prior knowledge, it can be efficiently applied to automatically generate a model predicting BCI events from recordings of neuronal brain activity. Therefore, this method was chosen as the basic approach in this study.

Note that in BCI experiments, neuronal signals of the brain are processed in real time. Thus, the computational efficiency of BCI systems is of crucial importance. Selecting an appropriate subset of features allows optimization of computational efficiency and improves the quality of control. In BCIs, the selection of electrode subsets in particular is crucial (see, for example, Lal et al 2004). To solve this problem, approaches based on classification accuracy (Sannelli et al 2010), spatial filters (Wang et al 2005, Lv and Liu 2008), mutual information (Lan et al 2005), correlation (Barachant et al 2008) and Riemannian geometry (Barachant and Bonnet 2011) were proposed. In all of them, electrode selection and model calibration are performed separately. Electrode selection is either carried out during pre-processing (e.g. Wang et al 2005) or based on the classification accuracy of models built on different subsets of electrodes (e.g. Lal et al 2004). The first method entails a loss of information, whereas the second is time and labor consuming.

Generic NPLS involves all variables in generating the final model, which is therefore a linear combination of all features. In this paper, we propose the L1-Penalized NPLS algorithm to include feature selection directly in the modeling process. The L1-Penalized NPLS provides a sparse solution for different directions of analysis (e.g. spatial, frequency or temporal modalities). In this study, the L1-Penalized NPLS was used for the calibration of a binary self-paced BCI system and the selection of a subset of electrodes. Real datasets of one non-human primate, collected in a series of experiments lasting up to 30 min, were processed with the L1-Penalized version as well as with the generic NPLS, and the results were compared. The animal was trained to push a pedal to activate a food dispenser without any external stimulus. Calibrations based on recordings from training sessions for different positions of the pedal were used to establish predictive models. The models were tested for their generalization abilities. For this purpose, recordings from training sessions were evaluated offline by playing back the corresponding datasets. Computational experiments confirmed the high performance of the system.

Methods

Generic NPLS

The N-way PLS algorithm combines the projection of data to a low-dimensional feature space (the space of latent variables) with the estimation of a linear regression model. This method was introduced by Bro (1996) as a generalization of ordinary partial least squares (PLS) (Geladi and Kowalski 1986) to multi-way data sets (tensors). PLS regression models a linear relationship between a vector of output variables and a vector of input variables on the basis of observation matrices $\mathbf{X}$ and $\mathbf{Y}$. To build the model, the observations are projected into low-dimensional spaces in such a way that the maximum variances of $\mathbf{X}$ and $\mathbf{Y}$ are explained simultaneously. At the first iteration, the matrices $\mathbf{X}$ and $\mathbf{Y}$ are represented as

$$\mathbf{X} = \mathbf{t}_1 \mathbf{p}_1^T + \mathbf{E}_1, \qquad \mathbf{Y} = \mathbf{u}_1 \mathbf{q}_1^T + \mathbf{F}_1,$$

where $\mathbf{t}_1$ and $\mathbf{u}_1$ are the latent variables (score vectors), whereas $\mathbf{p}_1$ and $\mathbf{q}_1$ are the loading vectors; $\mathbf{E}_1$ and $\mathbf{F}_1$ are the matrices of residuals. The score vectors are calculated to maximize the covariance between $\mathbf{t}_1$ and $\mathbf{u}_1$ (Geladi and Kowalski 1986). The coefficient $b_1$ of the regression $\mathbf{u}_1 = b_1 \mathbf{t}_1 + \mathbf{r}_1$ is calculated to minimize the norm of the residuals $\mathbf{r}_1$. Then the procedure is applied iteratively to the residual matrices.
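As an illustration only, the first PLS iteration described above can be sketched in Python (NumPy). The function name `pls_first_component`, the NIPALS-style alternating iteration and the stopping rule are assumptions of this sketch and are not taken from the paper; X and Y are assumed to be column-centered.

```python
import numpy as np

def pls_first_component(X, Y, n_iter=100, tol=1e-10):
    """One PLS component for centered X (n x p) and Y (n x m).

    NIPALS-style sketch (an assumed implementation, not the authors' code).
    Returns scores t, u, loadings p, q, inner coefficient b and residual E.
    """
    u = Y[:, [0]].copy()                      # start from the first column of Y
    for _ in range(n_iter):
        w = X.T @ u
        w /= np.linalg.norm(w)                # unit-norm weight vector for X
        t = X @ w                             # score vector t1
        q = Y.T @ t
        q /= np.linalg.norm(q)                # unit-norm loading vector q1
        u_new = Y @ q                         # score vector u1
        converged = np.linalg.norm(u_new - u) < tol
        u = u_new
        if converged:
            break
    tt = (t.T @ t).item()
    p = X.T @ t / tt                          # loading vector p1
    b = (t.T @ u).item() / tt                 # least-squares slope of u1 = b1 t1 + r1
    E = X - t @ p.T                           # residual matrix E1
    return t, u, p, q, b, E
```

By construction, the residual E is orthogonal to t, and the regression residual u − b·t is orthogonal to t as well, which is a quick way to check such a sketch on synthetic data.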

Similarly to PLS, NPLS projects the tensor of data into the space of latent variables. Tensors (multi-way arrays) are a higher-order generalization of vectors and matrices. Elements of a tensor $\mathbf{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ are denoted as $x_{i_1, i_2, \ldots, i_N}$. Here $N$ is the order of the tensor, i.e. the number of dimensions (ways or modes). The number of variables $I_i$ in mode $i$ gives the dimensionality of this mode (Kolda and Bader 2007). For example, in a fourth-order tensor of observations $\mathbf{X} \in \mathbb{R}^{n \times I_1 \times I_2 \times I_3}$, which contains $n$ samples $\mathbf{x}_i \in \mathbb{R}^{I_1 \times I_2 \times I_3}$, $i = 1, \ldots, n$, each sample $\mathbf{x}_i$ is a third-order (cube) tensor. In other words, this corresponds to a simultaneous analysis of neuronal activity in three domains. In this study, each cube represents the projection of one data epoch to the spatial–temporal–frequency space. A vector $\mathbf{y} \in \mathbb{R}^n$ of $n$ observations of a scalar variable is considered as the output. The particular case of binary output, $y_i \in \{0, 1\}$, corresponds to the binary self-paced BCI experiments.

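First, the NPLS method decomposes the tensor $\mathbf{X}$ as indicated below:

$$\mathbf{X} = \mathbf{t}_1 \circ \mathbf{w}_1 \circ \mathbf{w}_2 \circ \mathbf{w}_3 + \mathbf{E}_1,$$

where the operation '$\circ$' is the outer product (see Kolda and Bader 2007). The latent variable $\mathbf{t}_1 \in \mathbb{R}^n$ is extracted from the first mode of the tensor $\mathbf{X}$, providing a maximum of covariance between $\mathbf{t}_1$ and $\mathbf{y}$. In parallel, the algorithm forms the factor, i.e. the set of projectors $\mathbf{w}_1 \in \mathbb{R}^{I_1}$, $\mathbf{w}_2 \in \mathbb{R}^{I_2}$, $\mathbf{w}_3 \in \mathbb{R}^{I_3}$, $\|\mathbf{w}_i\| = 1$, $i = 1, 2, 3$, related to the second, third and fourth modes of $\mathbf{X}$, respectively, in such a way that the projection of the tensor $\mathbf{X}$ on these vectors results in $\mathbf{t}_1$. The projectors correspond to the modalities of analysis (e.g. spatial, frequency and temporal). To build the projectors, a tensor of correlation $\mathbf{Z} = \mathbf{X} \times_1 \mathbf{y}$ is calculated ($\times_1$ is the first-mode vector product of the tensor $\mathbf{X}$ and the vector $\mathbf{y}$). Then the vectors $\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3$ are estimated by decomposing the tensor $\mathbf{Z}$:

$$\mathbf{Z} = \mathbf{w}_1 \circ \mathbf{w}_2 \circ \mathbf{w}_3 + \mathbf{E}, \qquad \left\| \mathbf{Z} - \mathbf{w}_1 \circ \mathbf{w}_2 \circ \mathbf{w}_3 \right\|_F \to \min, \qquad (1)$$

where $\|\cdot\|_F$ is the Frobenius norm, which is the generalization of the Euclidean norm to tensors (Kolda and Bader 2007).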
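The construction of the correlation tensor and its rank-one decomposition can be sketched as follows. This is a minimal illustration under stated assumptions: the function names `rank_one_als` and `first_npls_factor`, the constant starting vectors and the explicit scale factor (needed because the projectors are kept at unit norm) are choices of this sketch, not details given in the paper.

```python
import numpy as np

def rank_one_als(Z, n_iter=200, tol=1e-12):
    """Rank-one approximation Z ~ scale * (w1 o w2 o w3) by alternating
    least squares; w1, w2, w3 are returned with unit norm."""
    I1, I2, I3 = Z.shape
    w2 = np.ones(I2) / np.sqrt(I2)            # assumed (constant) starting vectors
    w3 = np.ones(I3) / np.sqrt(I3)
    scale_old = 0.0
    for _ in range(n_iter):
        # Each update fixes two projectors and solves for the third one.
        w1 = np.einsum('abc,b,c->a', Z, w2, w3)
        w1 /= np.linalg.norm(w1)
        w2 = np.einsum('abc,a,c->b', Z, w1, w3)
        w2 /= np.linalg.norm(w2)
        w3 = np.einsum('abc,a,b->c', Z, w1, w2)
        scale = np.linalg.norm(w3)
        w3 /= scale
        if abs(scale - scale_old) < tol:
            break
        scale_old = scale
    return w1, w2, w3, scale

def first_npls_factor(X, y):
    """First latent variable of a 4th-order tensor X (n x I1 x I2 x I3)."""
    Z = np.einsum('n,nabc->abc', y, X)               # Z = X x_1 y
    w1, w2, w3, _ = rank_one_als(Z)
    t = np.einsum('nabc,a,b,c->n', X, w1, w2, w3)    # projection of X on the factor
    return t, (w1, w2, w3)
```

For an exactly rank-one tensor with positive entries, the sketch recovers the generating vectors after a single sweep, which is a convenient sanity check.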

To solve the optimization problem (1), the alternating least squares (ALS) algorithm (Yates 1933) can be applied (see appendix A). ALS is an iterative procedure. It fixes all the projectors except one, which is estimated in a least-squares sense. After the projectors $\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3$ and the latent variable $\mathbf{t}_1$ are defined, the coefficient $b_1$ of the regression $\mathbf{y} = b_1 \mathbf{t}_1 + \mathbf{f}_1$ is estimated using the minimal least squares (MLS) approach. The next factors are calculated by decomposing the residuals. After $F$ iterations, all the particular regressions $\mathbf{y}_f = \mathbf{T}_f \mathbf{b}_f$, $f = 1, \ldots, F$, are summed into the final model $\mathbf{y} = \sum_{f=1}^{F} \mathbf{T}_f \mathbf{b}_f = \mathbf{T}\mathbf{b}$. The vector $\mathbf{b}$ represents the regression coefficients for the whole set of latent variables $\mathbf{T} = [\mathbf{t}_1 | \ldots | \mathbf{t}_F]$.
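The factor-by-factor procedure can be sketched as a short loop. The deflation used here (subtracting the rank-one term from the data tensor and regressing the output on all scores extracted so far) follows a common formulation of N-way PLS and is an assumption of this sketch; the paper's exact update may differ. The names `npls_fit` and `_rank_one` are hypothetical, and X and y are assumed to be centered.

```python
import numpy as np

def _rank_one(Z, n_iter=100):
    """Unit-norm rank-one ALS projectors of a third-order tensor."""
    w2 = np.ones(Z.shape[1]) / np.sqrt(Z.shape[1])
    w3 = np.ones(Z.shape[2]) / np.sqrt(Z.shape[2])
    for _ in range(n_iter):
        w1 = np.einsum('abc,b,c->a', Z, w2, w3)
        w1 /= np.linalg.norm(w1)
        w2 = np.einsum('abc,a,c->b', Z, w1, w3)
        w2 /= np.linalg.norm(w2)
        w3 = np.einsum('abc,a,b->c', Z, w1, w2)
        w3 /= np.linalg.norm(w3)
    return w1, w2, w3

def npls_fit(X, y, n_factors):
    """Extract n_factors latent variables from X (n x I1 x I2 x I3) and y (n)."""
    y = np.asarray(y, dtype=float)
    Xr = np.array(X, dtype=float)             # residual tensor (copy of X)
    r = y.copy()                              # residual of the output
    scores, projectors = [], []
    for _ in range(n_factors):
        Z = np.einsum('n,nabc->abc', r, Xr)   # correlation tensor of the residuals
        w1, w2, w3 = _rank_one(Z)
        t = np.einsum('nabc,a,b,c->n', Xr, w1, w2, w3)
        scores.append(t)
        projectors.append((w1, w2, w3))
        T = np.column_stack(scores)           # T = [t_1 | ... | t_f]
        b, *_ = np.linalg.lstsq(T, y, rcond=None)
        Xr = Xr - np.einsum('n,a,b,c->nabc', t, w1, w2, w3)   # deflate X
        r = y - T @ b                         # deflate y
    return T, b, projectors
```

Because the score matrices are nested, the residual norm of the fitted output cannot increase when factors are added, which gives a simple consistency check for the loop.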

L1-Penalized NPLS algorithm

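The NPLS approach can be generalized to include additional feature selection capabilities. For this purpose, the ALS algorithm can be replaced by its penalized version for the decomposition of the tensor $\mathbf{Z} = \mathbf{X} \times_1 \mathbf{y}$:

$$\left\| \mathbf{Z} - \mathbf{w}_1 \circ \mathbf{w}_2 \circ \mathbf{w}_3 \right\|_F^2 + P(\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3) \to \min.$$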

Here $P(\cdot)$ is a penalization term. Penalization is widely used for solving various optimization tasks (Tychonoff and Arsenin 1977). For different applications, various penalization operators have been considered: the least absolute shrinkage and selection operator (LASSO), $P(\mathbf{w}) = \lambda \|\mathbf{w}\|_1$ (Tibshirani 1996); the fused lasso, $P(\mathbf{w}) = \lambda \|D\mathbf{w}\|_1$, where $D$ is a difference operator (Land and Friedman 1996); the elastic net (Enet) (Zou and Hastie 2005) (weighted 1-norm and 2-norm), etc. Here $\lambda$ is a non-negative penalization parameter. To obtain a sparse solution, the 1-norm penalty (LASSO) is often applied. LASSO can be implemented easily and provides a sufficient level of selectivity. Alternating penalized least squares, proposed in Martínez-Montes et al (2008), combines L1 and L2 penalties. In this study, the 1-norm penalty was integrated into the ALS algorithm by considering $P(\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3) = \lambda_1 \|\mathbf{w}_1\|_1 + \lambda_2 \|\mathbf{w}_2\|_1 + \lambda_3 \|\mathbf{w}_3\|_1$. At each step of the algorithm, all projectors are fixed except one. That leads to the optimization task

$$\mathbf{w}_i = \arg\min_{\mathbf{w}_i} \left( \left\| \mathbf{Z} - \mathbf{w}_1 \circ \mathbf{w}_2 \circ \mathbf{w}_3 \right\|_F^2 + \lambda_i \|\mathbf{w}_i\|_1 \right), \qquad i = 1, 2, 3.$$

The penalized decomposition of the tensor $\mathbf{Z} = \mathbf{X} \times_1 \mathbf{y}$ results in the factor $\{\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3\}$ (see appendix B).
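A single penalized update can be illustrated for the first projector. When the two fixed projectors have unit norm, the objective separates element by element and its minimizer is the soft-thresholded least-squares update with threshold λ/2. The sketch below rests on that derivation; the function names, the random test tensor and the definition of the maximal penalty (the smallest value that zeroes every element) are assumptions and may differ from appendix B.

```python
import numpy as np

def soft_threshold(v, thr):
    """Element-wise soft-thresholding: sign(v) * max(|v| - thr, 0)."""
    return np.sign(v) * np.maximum(np.abs(v) - thr, 0.0)

def l1_update_mode1(Z, w2, w3, lam):
    """Minimize ||Z - w1 o w2 o w3||_F^2 + lam * ||w1||_1 over w1,
    for fixed unit-norm w2 and w3."""
    v = np.einsum('abc,b,c->a', Z, w2, w3)    # unpenalized least-squares update
    return soft_threshold(v, lam / 2.0)

def lam_max_mode1(Z, w2, w3):
    """Smallest penalty that sets every element of w1 to zero
    (an assumed definition of the maximal penalty)."""
    v = np.einsum('abc,b,c->a', Z, w2, w3)
    return 2.0 * np.max(np.abs(v))

# Hypothetical example: 32 "electrodes", 5 frequencies, 4 time shifts.
rng = np.random.default_rng(0)
Z = rng.standard_normal((32, 5, 4))
w2 = np.ones(5) / np.sqrt(5)
w3 = np.ones(4) / np.sqrt(4)
lam = 0.9 * lam_max_mode1(Z, w2, w3)
w1 = l1_update_mode1(Z, w2, w3, lam)
n_selected = int(np.count_nonzero(w1))        # elements that survive the penalty
```

With a penalty close to its maximal value, only the elements whose unpenalized weight is within 10% of the largest one remain non-zero, which is how a strong penalty on the spatial mode yields a small subset of electrodes.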

To select the optimal values of the penalization parameters, different approaches can be used, e.g. cross-validation (Devijver and Kittler 1982), Akaike's information criterion (Akaike 1974), Schwartz's Bayesian information criterion (Schwartz 1978), etc.

The algorithm results in a regression model predicting the output $y_i$ from the observations $\mathbf{x}_i \in \mathbb{R}^{I_1 \times I_2 \times I_3}$.

The L1-Penalized NPLS algorithm combines computational simplicity and moderate memory consumption with sufficient selectivity. This method was applied to the calibration of the binary self-paced BCI system and to the selection of a subset of electrodes in the context of BCI experiments in non-human primates. Estimated at the calibration stage, the predictive model is intended to send commands to the effector during execution in real-time applications.

Influence analysis

The elements of the input data have an implicit impact on the NPLS regression model through the latent variables. The modality influence (MI) analysis (Cook and Weisberg 1982) allows the relative importance of the elements of each mode for the final model to be estimated.

The MI analysis can be applied to estimate the importance of electrodes, frequency bands and time intervals related to control events (Eliseyev 2011).

Application

Data description

Data were collected in behavioral experiments in non-human primates based on a simple reward-oriented task. During the experiment, the monkey sits in a custom-made primate chair, minimally restrained; its neck collar is hooked to the chair. The monkey has to push a pedal, which can be mounted in four different positions (denoted below as 'left', 'right', 'up' and 'down') on a vertical panel facing the monkey. Every correct pushing event activates a food dispenser. We did not use any cue or conditioning stimulus to trigger the pressing intention of the monkey. A set of ECoG recordings was collected from 32 surface electrodes chronically implanted on the cortex of the monkey (figure 1(C)). Simultaneously, information about the state of the pedal was stored. The ECoG signals were obtained at a sampling rate of 1 kHz using the Micromed® system (Micromed SD64, Micromed, Italy). The signals were band-pass filtered between 0.5 and 500 Hz.

Figure 1. (A) BCI system calibration results in a BCI model and a decision rule. y(t) characterizes the position of the pedal at the moment t; s(t) contains the signal from the brain recorded during the experiment; X is the tensor representation of the brain's signals in the system. (B) Time epochs of the multi-channel ECoG recording mapped with the continuous wavelet transform to the temporal–frequency–spatial feature space. (C) Scheme of implantation of electrodes on the cerebral surface of the monkey. (D) Formation of the observation tensor X and the response vector y.

BCI system calibration

One recording for each position was used to calibrate the BCI system for this position (figure 1(A)). Training data sets included all event-related epochs (at least 50 trials) and 1000 'non-event' epochs randomly selected over the whole experiment.

To form a tensor of observations, the brain activity signals were mapped to the temporal–frequency–spatial space. For each epoch $j$ (determined by its final moment $t$), electrode $c$, frequency $f$ and time shift $\tau$, the elements $x_{j,\tau,f,c}$ of the tensor $\mathbf{X}$ were calculated as the norm of the continuous wavelet transform (CWT) of the ECoG signal (see figures 1(B) and (D)). Frequency bands from 10 to 300 Hz with step $\delta f = 2$ Hz and sliding windows $[t - \tau, t]$, $\tau = 0.5$ s, with step $\delta\tau = 0.01$ s were considered for all electrodes $c = 1, 2, \ldots, 32$. The resulting dimension of a point is (146 × 51 × 32).
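The stated dimension of one observation can be checked by counting grid points. The short sketch below assumes that both end points of the frequency range and of the 0.5 s window are included, and the variable names are illustrative only.

```python
import numpy as np

freqs = np.arange(10, 300 + 1, 2)      # 10, 12, ..., 300 Hz
shifts = np.arange(0, 51) * 0.01       # 0.00, 0.01, ..., 0.50 s
n_electrodes = 32

# Size of one observation, in the order the sizes are quoted in the text.
point_shape = (len(freqs), len(shifts), n_electrodes)
n_features = int(np.prod(point_shape))
```

Under these assumptions the grid has 146 frequencies and 51 time shifts, matching the quoted (146 × 51 × 32), i.e. 238 272 features per epoch before any selection.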

Figure 2. Computation time for online prediction (0.5 s buffer) as a function of the number of analyzed electrodes.

After comparing a set of mother wavelets, the Meyer wavelet was chosen for signal decomposition (see appendix C).

The binary dependent variable was set to one, $y_j = 1$, if the pedal was pressed at the moment $t$; otherwise it was set to zero, $y_j = 0$. The resulting tensors of observations and the binary vector indicating the pedal position were used for modeling.

Five factors and the corresponding latent variables $\mathbf{t}_i$, $i = 1, 2, \ldots, 5$, were extracted by the Penalized NPLS algorithm for each pedal position. To find the subset of electrodes with the greatest impact on the final model, the L1-penalized version of the NPLS algorithm was applied to the spatial modality ($\lambda_1 = 0.9(\lambda_1)_{\max}$, $\lambda_2 = \lambda_3 = 0$, where $(\lambda_1)_{\max}$ is defined by the L1-penalized algorithm, see appendix B, figure A1). The penalization parameter $\lambda_1$ was chosen taking into account the real-time computational restrictions imposed by the system. Figure 2 illustrates the time consumption for online prediction (Intel Dual Core, 3.16 GHz) as a function of the number of electrodes. The calculation time is estimated for a signal acquisition system using a 0.5 s buffer. Taking into account data collection, transfer, visualization of results, hard drive access time, operating system needs, etc, recordings of about ten electrodes can be processed in real time with our BCI system. The parameter $\lambda_1$ was chosen to provide a high level of selectivity, i.e. fewer than ten electrodes with non-zero weights in the final model for all positions of the pedal.

To transform the prediction of the output variable $y$ into the BCI system decision ('event' or 'non-event'), a simple decision rule based on binarization of the predicted value was applied. A scalar threshold was chosen so as to maximize the overall performance (OP) criterion (Eliseyev 2011) in the offline playbacks of the recordings of BCI experiments (see details below).

BCI evaluation

The variety of experimental paradigms and evaluation criteria (Leeb et al 2007, Scherer et al 2008, Fatourechi et al 2008, Müller-Putz et al 2010, Qian et al 2010) makes comparison of BCI systems complicated. The accuracy of classification (ACC), also referred to as decoding accuracy (DA) (Ball et al 2009) or decoding power (DP) (Rickert et al 2005), is a commonly used evaluation criterion in BCI research. It shows the percentage of correctly classified samples. However, while efficient for cue-paced BCIs, ACC, as well as the error rate (ERR = 1 − ACC), fails to characterize the performance of self-paced BCIs because of highly unbalanced classes (Schlögl et al 2007). ACC depends on the class ratio, which varies substantially even within the same series of experiments. For example, in the given set of experiments the class 'zero' is represented by about 90% of buffers. That is why other criteria are applied to characterize the efficiency of self-paced BCIs. The true positive rate (TPR = TP/(TP + FN)) and the false positive rate (FPR = FP/(FP + TN)) are widely used to evaluate the performance of self-paced BCIs. They characterize the individual classes, namely the relative amount of successfully detected events (TPR) and the relative amount of false activations (FPR). Nevertheless, FPR is also influenced by the decision rate and the ratio of the classes. Additional criteria characterizing false activations of self-paced BCIs were proposed: the number of false activations per minute (Mason and Birch 2000) and the positive predictive value (PPV = TP/(TP + FP)) (Müller-Putz et al 2010). While TPR shows the percentage of successfully detected events, PPV corresponds to the percentage of correct detections. However, simultaneous comparison of several criteria is complicated. Since standard ACC (DA) fails to evaluate the performance of self-paced BCIs, numerous attempts were made to introduce a single metric: weighted ACC (Zhu and Yao 2004), HF-difference (Huggins et al 1999), F1-criterion (Rijsbergen 1979), TPR at a fixed false positive rate (Borisoff et al 2004), the ratio TPR/FPR (Fatourechi et al 2007) and others. In our study, the overall performance characteristic balancing the FP and FN types of errors, OP = (TPR + PPV)/2, was chosen.
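The criteria listed above follow directly from the four confusion counts. The sketch below collects them in one hypothetical helper, `bci_metrics`; the counts in the example are invented for illustration and are not taken from the experiments.

```python
def bci_metrics(tp, fp, fn, tn, duration_s):
    """Evaluation criteria from confusion counts and recording duration."""
    tpr = tp / (tp + fn)                      # true positive rate
    ppv = tp / (tp + fp)                      # positive predictive value
    fpr = fp / (fp + tn)                      # false positive rate
    acc = (tp + tn) / (tp + fp + fn + tn)     # accuracy of classification
    return {
        "TPR": tpr,
        "PPV": ppv,
        "FPR": fpr,
        "ACC": acc,
        "ERR": 1.0 - acc,
        "OP": (tpr + ppv) / 2.0,              # overall performance
        "FP_per_min": fp / (duration_s / 60.0),
    }

# Invented example: 100 events in a 500 s recording with 1000 decisions.
example = bci_metrics(tp=60, fp=30, fn=40, tn=870, duration_s=500.0)
```

In this invented case ACC is 0.93 although only 60% of the events are detected, which illustrates why accuracy is a poor criterion for unbalanced classes and why OP (here about 0.63) is more informative.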

Introducing the chance level (CL) can help to relate and compare systems based on different paradigms. In synchronized BCIs, the Bernoulli scheme is applied, explicitly or implicitly, to estimate the level of decoding by chance according to the probabilities of the classes. It results in a CL of 50% for classification accuracy in the case of two classes of equal probability, 25% for four equiprobable classes, etc. The self-paced BCI performs continuous monitoring of neuronal activity. In this case, the assumption of the Bernoulli scheme for a sequence of independent and identically distributed random variables can lead to misestimates. That is why we decided to estimate the CL with computational experiments for the given datasets. Namely, for each recording, random detection was performed with the class probabilities estimated from the recordings. The computational experiments are explained below.

Simulation of BCI experiments

To study the generalization ability of the predictive model, a set of offline playbacks of the BCI experiments was carried out for each position of the pedal. The decision ('event' or 'non-event') was made by analyzing the entire recordings buffer by buffer. Buffers were 0.5 s long, in accordance with the real-time data acquisition system of the CLINATEC/CEA BCI experimental platform. The predictors were calculated every 0.125 s, i.e. four times per buffer. The buffer was considered as containing the 'event' if at least one of these predictors passed the binarization threshold. After any detection, the system was blocked for 1.5 s to prevent multiple activations. The real event was considered as detected (TP) if the time interval between the real event and its detection did not exceed the TP interval. To optimize the TP interval, the delay distribution was analyzed for all training recordings, taking into account the detection events within a 1.5 s interval around the real event (Fatourechi et al 2008). Then the 95% confidence interval was calculated and rounded to the nearest whole buffer. The resulting TP interval was ±0.5 s (figure 3).

Figure 3. (A) The system is blocked for 1.5 s after any detection. (B) The real event is considered as detected (TP) if the distance between the event and the moment of detection is less than 0.5 s.
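The decision rule of the playback can be sketched as two small functions. For brevity the sketch works at the level of individual predictors rather than whole buffers, and it matches each real event to at most one detection; both simplifications, as well as the function names, are assumptions of this illustration.

```python
def detect(times, predictions, threshold, block_s=1.5):
    """Return detection times; the system is blocked for block_s seconds
    after every detection to prevent multiple activations."""
    detections = []
    blocked_until = float("-inf")
    for t, p in zip(times, predictions):
        if t >= blocked_until and p > threshold:
            detections.append(t)
            blocked_until = t + block_s
    return detections

def count_outcomes(detections, events, tp_interval=0.5):
    """Count TP, FP and FN; an event is detected if an unused detection
    lies within tp_interval seconds of it."""
    used = set()
    tp = 0
    for e in events:
        for i, d in enumerate(detections):
            if i not in used and abs(d - e) <= tp_interval:
                used.add(i)
                tp += 1
                break
    return tp, len(detections) - tp, len(events) - tp

# Invented example: predictors every 0.125 s over 5 s of recording.
times = [0.125 * k for k in range(40)]
predictions = [0.0] * 40
for k in (8, 9, 24):                          # supra-threshold predictors
    predictions[k] = 1.0
detections = detect(times, predictions, threshold=0.5)
tp, fp, fn = count_outcomes(detections, events=[1.25, 4.5])
```

In the invented example the second supra-threshold predictor falls inside the blocked interval and is ignored, so two detections remain; one of them matches a real event and the other counts as a false activation.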

The same experiments were carried out to estimate the CL. The detections were randomly generated with the probability of the class 'event' in the given recording. The same true positive and silent intervals (0.5 s and 1.5 s) were applied.
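A random detector of this kind can be sketched as follows. The sketch assumes that a random decision is drawn at every 0.125 s prediction step; the paper does not state the granularity at which random detections were generated, and the function name and seed handling are illustrative.

```python
import numpy as np

def random_detections(n_steps, p_event, step_s=0.125, block_s=1.5, seed=0):
    """Random detection times with per-step probability p_event, using the
    same silent interval as the real detector."""
    rng = np.random.default_rng(seed)
    fire = rng.random(n_steps) < p_event      # independent random decisions
    detections = []
    blocked_until = -np.inf
    for k in np.flatnonzero(fire):
        t = k * step_s
        if t >= blocked_until:                # respect the silent interval
            detections.append(t)
            blocked_until = t + block_s
    return np.array(detections)
```

The resulting detection times can then be scored against the real events with the same ±0.5 s true positive interval to obtain chance-level values of TPR, PPV and OP for each recording.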

Results

Results of calibration

The BCI system was calibrated for each pedal position: 'left', 'right', 'up' and 'down'. Namely, the signal of the corresponding training dataset was mapped to the frequency–spatial–temporal space. Then five factors and the corresponding latent variables $\mathbf{t}_i$ were extracted by L1-Penalized NPLS. The number of factors was determined by the cross-validation procedure. The coefficients $b_i^*$ of the normalized predictive model $y = \sum_{i=1}^{5} t_i^* b_i^*$ correspond to the weights of the related factors in the final decomposition:

Position   Factor 1   Factor 2   Factor 3   Factor 4   Factor 5
'left'     0.346      0.273      0.232      0.111      0.038
'right'    0.346      0.217      0.195      0.138      0.104
'up'       0.383      0.263      0.158      0.151      0.045
'down'     0.278      0.210      0.194      0.182      0.138

The resulting predictive models are based on subsets of a few electrodes: 6, 6, 7 and 9 for the respective positions 'left', 'right', 'up' and 'down' of the pedal. The corresponding most influential factors are shown in figure A2. MI analysis revealed the summarized influence of the elements of each modality (figure A3).

Applied to the spatial modality, the MI analysis revealed that electrode 22, located in the primary motor cortex, has the highest impact on the decision rule (46%, 71%, 56% and 37% of extracted information for the respective positions 'left', 'right', 'up' and 'down' of the pedal). High frequencies (> 100 Hz) are crucial for decisions. However, the influence of the lower frequencies (< 100 Hz) can be important too, especially for the position 'left' of the pedal. In the time domain, the interval (−0.2, 0) s before the event is the most significant for all positions of the lever.

Comparison of L1-Penalized NPLS to generic NPLS

To compare the L1-Penalized NPLS method with the generic algorithm, the BCI system was calibrated with ordinary NPLS (figure A4) using the same training tensors. Figure 4 shows the weights of the different electrodes in the prediction rules obtained by the generic NPLS and L1-Penalized NPLS approaches for different positions of the pedal. The models were compared on the test recordings for each pedal position and for different numbers of factors. The computational experiment revealed that the L1-Penalized NPLS provides comparable results (RMSE) or outperforms the generic NPLS (figure 5). At the same time, the penalized algorithm uses a significantly restricted subset of electrodes.

Single electrode calibration

The single-electrode calibration was the next step. Five factors were extracted by NPLS for electrode 22, which is the best in all cases (figure 4). The relative weights of these factors in the final decomposition are as follows:

Position   Factor 1   Factor 2   Factor 3   Factor 4   Factor 5
'left'     0.419      0.234      0.116      0.092      0.089
'right'    0.371      0.228      0.137      0.122      0.079
'up'       0.313      0.201      0.182      0.137      0.092
'down'     0.232      0.212      0.167      0.121      0.106

The leverages of the elements of each modality for the respective best electrode, according to the MI analysis, are shown in figure A5. The results of the calibration procedure are the single-electrode predictor of the pedal-pressing events and the threshold-based decision rule.

Validation of the generalization ability of predictive models

To study the generalization ability, the NPLS, the L1-Penalized NPLS and the single-electrode predictive models were tested in a set of offline playbacks of previously collected data. For this purpose, the recordings of four series of behavioral experiments, 6 × 4 = 24 experiments in total, were used. The experiments lasted 4 to 20 min, about 8 min on average. Thresholds of binarization were adjusted to each recording. Playbacks of BCI experiments were carried out offline to estimate the BCI performance. Part of one experiment (about 15 min) and the corresponding time-delay histograms of detections for the single-electrode calibration are represented in figure 6. A summary of the results of the performance evaluation is shown in table 1 and figure 7.

To compare the results to the CL, the random detection procedure was applied to the test recordings.

Figure 4. The weights of the different electrodes in the prediction models obtained by the generic NPLS and L1-Penalized NPLS for different positions of the pedal.

Figure 5. Comparison of prediction errors (root mean squared error, RMSE) of the NPLS and the L1-Penalized NPLS algorithms on the test sets for different numbers of factors and different positions of the pedal.

Figure 6. 15 min length fragment of the experiment and its time-delay histograms of detection.

Table 1. Performance of offline playbacks of BCI experiments.

Position  Method     No of events  Time (s)   TPR (%)      PPV (%)       FPR (%)      ERR (%)       OP (%)       FP (min^-1)
Left      NPLS       81 ± 33       493 ± 197  61.0 ± 7.1   62.5 ± 6.7    3.24 ± 1.12  6.10 ± 2.06   61.7 ± 6.8   3.53 ± 1.15
          L1-PNPLS                            63.7 ± 10.2  64.3 ± 5.6    3.16 ± 1.23  5.70 ± 1.84   64.0 ± 7.8   3.44 ± 1.25
          NPLS(1el)                           60.8 ± 5.7   61.9 ± 6.5    3.28 ± 1.07  6.16 ± 1.97   61.3 ± 18.7  3.57 ± 1.07
          Random                              6.46 ± 2.13  8.03 ± 2.45   6.53 ± 2.07  13.71 ± 4.40  7.25 ± 2.24  7.13 ± 2.10
Right     NPLS       71 ± 25       437 ± 137  59.5 ± 23.3  58.0 ± 21.0   4.50 ± 1.39  7.61 ± 1.38   58.8 ± 21.9  4.78 ± 1.42
          L1-PNPLS                            62.2 ± 21.4  64.2 ± 19.0   3.67 ± 1.85  6.68 ± 2.14   63.2 ± 20.0  3.90 ± 1.91
          NPLS(1el)                           62.7 ± 6.0   63.2 ± 9.3    3.19 ± 1.00  5.97 ± 1.43   62.7 ± 7.3   3.49 ± 1.04
          Random                              7.62 ± 2.68  10.77 ± 4.71  8.02 ± 2.62  17.39 ± 6.37  9.19 ± 3.59  8.42 ± 2.47
Up        NPLS       52 ± 11       462 ± 329  63.1 ± 12.4  62.1 ± 11.9   3.41 ± 1.34  5.92 ± 1.53   62.6 ± 11.8  3.71 ± 1.35
          L1-PNPLS                            72.5 ± 6.8   75.4 ± 8.3    2.21 ± 0.94  4.20 ± 1.26   73.9 ± 6.4   2.40 ± 0.97
          NPLS(1el)                           63.7 ± 20.6  63.0 ± 20.9   3.90 ± 1.40  6.75 ± 2.17   63.4 ± 20.7  4.15 ± 1.46
          Random                              5.15 ± 3.42  6.67 ± 4.57   6.80 ± 2.04  14.15 ± 4.35  5.91 ± 3.98  7.41 ± 1.98
Down      NPLS       59 ± 18       486 ± 186  47.8 ± 15.2  48.0 ± 15.8   3.31 ± 0.44  6.20 ± 0.90   47.9 ± 15.5  3.71 ± 0.43
          L1-PNPLS                            55.6 ± 15.0  52.3 ± 16.1   3.20 ± 0.46  5.59 ± 0.49   53.9 ± 15.5  3.59 ± 0.52
          NPLS(1el)                           52.6 ± 18.9  51.9 ± 18.6   3.04 ± 0.59  5.58 ± 0.78   52.3 ± 18.7  3.41 ± 0.61
          Random                              3.10 ± 5.13  3.90 ± 6.06   5.21 ± 1.82  11.04 ± 3.45  3.50 ± 5.59  5.81 ± 1.89

Figure 7. Comparison of the overall performance in the set of computational experiments for the single-electrode NPLS, the generic NPLS, the L1-Penalized NPLS, as well as the random detection procedure (chance level).

Discussion

The self-paced BCI requires a high level of selectivity for the identification and discrimination of specific neuronal activity against background brain activity during continuous monitoring. To achieve the necessary level of selectivity, multi-way analysis was chosen, since it provides simultaneous signal processing in several domains. To extract knowledge from the experimental data, NPLS was applied as the basic approach for BCI system calibration. This method requires neither an exhaustive search of the model nor regularization of the task. The disadvantage of this projection-based method is that the final model includes all available variables. While the main goal of the study was to discriminate the specific neuronal pattern related to the control action, an additional goal was to make the decision using a few electrodes or even only one electrode. To overcome this problem, L1-penalization was incorporated into the NPLS method, providing sparse modeling. Penalized NPLS is introduced to improve the NPLS capacity by integrating feature selection into the generic algorithm.

Dimension reduction in general, and the selection of effective feature subsets in particular, are important problems in BCI studies. Traditional approaches use either projections or sorting procedures. L1-penalization integrated into the projection-based NPLS algorithm provides sparse projections and allows integrated feature selection. Applied to the spatial modality, it results in the selection of a subset of electrodes. Let us stress that tensor data representation and processing allow the introduction of different penalties in different directions according to the particular problem (feature selection with an L1 penalty in one modality, a robust solution with an L2 penalty in another, etc). Moreover, additional restrictions can be introduced into the optimization in particular directions (e.g. non-negativity, unimodality, etc).



To achieve the main goal, namely fully autonomous self-paced BCI functioning in a natural environment, we have analyzed the recordings of a series of behavioral experiments in which a non-human primate controlled a food dispenser by pushing a lever. The duration of the experiments varied from 4 to 20 min. The factors extracted by Penalized NPLS can be interpreted by taking into account their influence on the final model with the aid of the MI analysis. In our research, we found that electrodes located in the primary motor cortex (especially electrode 22) have the strongest influence on the decision. Additional experiments will allow a better study of the stability of the signal's localization and its evolution with time, the brain's plasticity, etc. In all series of experiments, 6–9 electrodes out of 32 were involved in the final models. In the other modalities, the most significant influence on the decision comes from the increase of signal energy at high frequencies (>100 Hz) and from the time interval of 0.2 s before the pushing event. The frequency and temporal modalities of the most influential factor show high gamma synchronization and beta desynchronization in the time interval 0.1 s before the pushing event (figure A2).

The variety of experimental paradigms and evaluation criteria complicates the comparison of BCI systems. In this paper we use the OP criterion due to its efficiency for self-paced BCIs. The OP for the respective pedal positions (64.0% for 'left', 63.2% for 'right', 73.9% for 'up' and 53.9% for 'down') is achieved with a restricted set of electrodes (from 6 to 9). For the single-electrode case, the OP equals 61.3% for the 'left', 62.7% for the 'right', 63.4% for the 'up' and 52.3% for the 'down' positions of the pedal. The relatively low level of correct detections for the position 'down' can be explained by the experimental setup (the monkey did not make a sufficient effort to reach this pedal). At the same time, L1-Penalized NPLS outperforms the generic NPLS. This can be explained by the exclusion of non-informative electrodes from the predictive model by the penalized version of the algorithm.

The CL can be considered as the basis for comparing BCIs of different paradigms that use different criteria. The CL for the OP criterion is about 6.5% for our experiments, which corresponds to the CL of a synchronized BCI with 15 equiprobable classes. For comparison, in previous studies of a BCI with eight equiprobable classes (CL = 12.5%), a decoding accuracy (DA) ≈ 25% was achieved using one electrode, whereas DA ≈ 50% was achieved using eight electrodes (Rickert et al 2005). DA ≈ 50% was demonstrated with ten electrodes for local field potential recordings (Mehring et al 2003) and up to 60% for ECoG (Ball et al 2009). Finally, only the intervals preceding the events were used to create the predictive model, while in most studies the intervals before and after the movement are analyzed as well (e.g. Ball et al 2009).
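As a quick check of the class-count comparison in the preceding paragraph, the relation can be written out under the assumption that the chance level of a synchronized BCI with equiprobable classes is simply the reciprocal of the number of classes. The symbol $K$ for the number of classes is introduced here for illustration only.

\[
\mathrm{CL}_K = \frac{1}{K}, \qquad \mathrm{CL}_{8} = \frac{1}{8} = 12.5\%, \qquad \mathrm{CL}_{15} = \frac{1}{15} \approx 6.7\% .
\]

Under this assumption, a chance level of about 6.5% for the OP criterion lies close to the 15-class value, and well below the 12.5% of the eight-class case.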

Figure A1. Maximum of correlation between wavelet coefficients and the signal of the pedal for all recordings (A) and grouped according to the pedal position (B). Mean values are shown as solid lines.

We consider the presented method a promising machine-learning approach for BCI studies. It does not depend on any prior neurophysiological knowledge and allows fully automatic system calibration and feature selection according to the particular modality of analysis. The computational efficiency of the algorithm is sufficient for real-time BCI systems based on standard portable computers. Additional real-time closed-loop BCI experiments will allow us to study long-term robustness, the stability of predictive models, brain plasticity, etc.

Acknowledgments

This work was partially supported by project CEICoBI, Nanosciences Foundation, RTRA, the Edmond J Safra Philanthropic Foundation, and Fondation de l'Avenir, France.

Appendix A. Alternating least squares

ALS is an iterative procedure. The vectors $w^1$, $w^2$, $w^3$ are initialized to 1. Then ALS fixes all the projectors except one, which is estimated in a least-squares sense:

\[ w^1 = \arg\min_{w^1} \left\| Z - w^1 \circ w^2 \circ w^3 \right\|_F^2, \quad w^2, w^3 \text{ are fixed}, \]

\[ w^2 = \arg\min_{w^2} \left\| Z - w^1 \circ w^2 \circ w^3 \right\|_F^2, \quad w^1, w^3 \text{ are fixed}, \]

\[ w^3 = \arg\min_{w^3} \left\| Z - w^1 \circ w^2 \circ w^3 \right\|_F^2, \quad w^1, w^2 \text{ are fixed}. \]

Figure A2. The first factor: frequency, temporal and spatial projections for every position of the pedal. The values of the elements of the spatial projectors are shown in colors according to the color bar; the electrode positions are indicated by their numbers.

As a matrix:

\[ w^1 = \arg\min_{w^1} \left\| Z_{(1)} - w^1 w_{23}^T \right\|_2^2, \]

\[ w^2 = \arg\min_{w^2} \left\| Z_{(2)} - w^2 w_{13}^T \right\|_2^2, \]

\[ w^3 = \arg\min_{w^3} \left\| Z_{(3)} - w^3 w_{12}^T \right\|_2^2, \]

where the matrix $Z_{(i)}$ is the unfolding of the tensor $Z$ along the modality $i$, and $w_{i_1 i_2} = \mathrm{vect}\left(w^{i_1} \circ w^{i_2}\right)$. The solutions of the optimization problems are

\[ w^1 = Z_{(1)} w_{23} \left( w_{23}^T w_{23} \right)^{-1}, \quad w^2 = Z_{(2)} w_{13} \left( w_{13}^T w_{13} \right)^{-1}, \quad w^3 = Z_{(3)} w_{12} \left( w_{12}^T w_{12} \right)^{-1}. \]

The ALS procedure is repeated until convergence.
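The ALS updates of this appendix can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function names `unfold` and `als_rank_one`, the stopping rule based on the change of the residual, and the parameters `n_iter` and `tol` are assumptions introduced here.

```python
import numpy as np

def unfold(Z, mode):
    """Unfold a 3-way array so that rows index the chosen modality."""
    return np.moveaxis(Z, mode, 0).reshape(Z.shape[mode], -1)

def als_rank_one(Z, n_iter=100, tol=1e-10):
    """Approximate Z by w1 o w2 o w3 using alternating least squares."""
    w = [np.ones(n) for n in Z.shape]            # projectors initialized to 1
    prev = np.inf
    for _ in range(n_iter):
        for i in range(3):
            # the two projectors that are kept fixed, in axis order
            a, b = [w[j] for j in range(3) if j != i]
            w_rest = np.outer(a, b).ravel()      # vect(w^{i1} o w^{i2})
            # closed-form update: Z_(i) w_rest (w_rest^T w_rest)^{-1}
            w[i] = unfold(Z, i) @ w_rest / (w_rest @ w_rest)
        resid = np.linalg.norm(Z - np.einsum('i,j,k->ijk', *w))
        if abs(prev - resid) < tol:              # assumed convergence test
            break
        prev = resid
    return w
```

With the all-ones initialization, an update can return a zero vector if the fixed projectors happen to be orthogonal to the data, so a practical version would probably need a guard for that case.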

Appendix B. L1-Penalized alternating least squares

L1-Penalized alternating least squares is an iterative procedure. The vectors $w^1$, $w^2$, $w^3$ are initialized to 1. Then all projectors except one are fixed, which leads to the task

\[ w^i = \arg\min_{w^i} \left( \left\| Z - w^1 \circ w^2 \circ w^3 \right\|_F^2 + \lambda_i \left\| w^i \right\|_1 \right), \quad i = 1, 2, 3. \]

Figure A3. L1-Penalized NPLS calibration: impact (weights) of components of different modalities on the predictive model for each pedal position (according to the MI analysis). Spatial modalities are represented by the graphs and corresponding color maps.

Consider the particular case where $i = 1$:

\[ w^1 = \arg\min_{w^1} \left( \left\| Z - w^1 \circ w^2 \circ w^3 \right\|_F^2 + \lambda_1 \left\| w^1 \right\|_1 \right), \]

or, as a matrix,

\[ w^1 = \arg\min_{w^1} \left( \left\| Z_{(1)} - w^1 w_{23}^T \right\|_2^2 + \lambda_1 \left\| w^1 \right\|_1 \right). \qquad \text{(B1)} \]

The optimization problem related to the 1-norm penalization can be solved using various approaches (see Schmidt 2005). In this study we applied an algorithm proposed in Shevade and Keerthi (2003) and Schmidt (2005). The advantages of this algorithm are its simplicity and its low iteration cost, as well as its low memory consumption. We have applied this algorithm to solve the optimization task (B1). Namely, the anti-gradient of the functional

\[ \left\| Z_{(1)} - w^1 w_{23}^T \right\|_F^2 + \lambda_1 \left\| w^1 \right\|_1 \]

was calculated:

\[ -G(w^1) = 2 w_{23}^T \left( Z_{(1)}^T - w_{23} (w^1)^T \right) - \lambda_1 \operatorname{sign}(w^1). \]

For the first iteration, $w^1$ is set equal to zero: $-G(0^+) = 2 w_{23}^T Z_{(1)}^T - \lambda_1 \mathbf{1}$. Then the elements of $w^1$ with the largest magnitude of the anti-gradient are added to a set of 'free' variables. These 'free' variables are optimized in a 'one at a time' way. For details, see Shevade and Keerthi (2003).

Figure A4. Generic NPLS calibration: impact (weights) of components of different modalities on the predictive model for each pedal position (according to the MI analysis). Spatial modalities are represented by the graphs and corresponding color maps.

Note that if $\lambda_1 \geq \lambda_{\max} = \max\left( 2 w_{23}^T Z_{(1)}^T \right)$, the method returns $w^1 = 0$ as a solution.
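The penalized subproblem (B1) can be illustrated with a short NumPy sketch. Note that this is not the Shevade and Keerthi (2003) algorithm used in the study: because the fitted term is rank one, the problem separates over the entries of the projector, so the sketch swaps in the closed-form soft-thresholding solution instead. All function names and the parameters `lams` and `n_iter` are assumptions introduced for illustration, and the projectors are not normalized here, which a practical implementation would probably need.

```python
import numpy as np

def unfold(Z, mode):
    """Unfold a 3-way array so that rows index the chosen modality."""
    return np.moveaxis(Z, mode, 0).reshape(Z.shape[mode], -1)

def soft_threshold(a, t):
    """Shrink each entry of `a` towards zero by `t` (zero if |a| <= t)."""
    return np.sign(a) * np.maximum(np.abs(a) - t, 0.0)

def l1_update(Z_i, w_rest, lam):
    """Minimize ||Z_i - w w_rest^T||_F^2 + lam * ||w||_1 over w.

    Each entry of w has a closed-form soft-threshold solution
    (assumption: w_rest is not the zero vector).
    """
    corr = Z_i @ w_rest
    return soft_threshold(corr, lam / 2.0) / (w_rest @ w_rest)

def lambda_max(Z_i, w_rest):
    """Smallest penalty for which l1_update returns the zero vector.

    The absolute value is an addition of this sketch to cover negative entries.
    """
    return np.max(np.abs(2.0 * (Z_i @ w_rest)))

def l1_penalized_als(Z, lams, n_iter=50):
    """Alternate penalized updates; lams[i] = 0 gives the plain ALS update."""
    w = [np.ones(n) for n in Z.shape]            # projectors initialized to 1
    for _ in range(n_iter):
        for i in range(3):
            a, b = [w[j] for j in range(3) if j != i]
            w_rest = np.outer(a, b).ravel()      # vect(w^{i1} o w^{i2})
            if not np.any(w_rest):               # a projector was shrunk to zero
                return w
            w[i] = l1_update(unfold(Z, i), w_rest, lams[i])
    return w
```

Setting a nonzero weight only for the spatial modality, for example `lams=(lam, 0.0, 0.0)` when the first axis indexes electrodes, mimics the use case of the paper: entries of the spatial projector whose correlation with the fixed projectors falls below `lam / 2` become exactly zero.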

Appendix C. Mother wavelet selection

Several mother wavelets $\psi$ (Meyer, Morlet, Symlet '7' and '8', 2nd and 10th order Daubechies, Coiflets '5' and Haar) were compared for use in signal decomposition. Results were evaluated with respect to the maximum level of Pearson correlation between the absolute value of the wavelet coefficients $C_\psi(t - \tau, s)$ and the signal of the pedal $y(t)$:

\[ R_\psi = \max_{s, \tau} \operatorname{corr}_t \left( \left| C_\psi(t - \tau, s) \right|, \; y(t) \right), \]

using all recordings representing all series of experiments. Pearson correlation coefficients were calculated for scale factors $s$ corresponding to the frequencies of the band [10, 300] Hz and time shifts $\tau \in [0, 0.5]$ s; $y(t) \in \{0, 1\}$ represents the position of the pedal at the moment $t$. The comparison shows that the Haar mother wavelet leads to an unstable and relatively low level of correlation, whereas the performances of all other wavelets are comparable (figure A1). Considering computational efficiency (Sherwood and Derakhshani 2009), the Meyer wavelet was chosen as the mother wavelet to form the tensor of observation X.
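The selection criterion can be sketched as follows, assuming the wavelet coefficients have already been computed by some wavelet library and stored as a matrix with one row per scale. The function name `max_abs_coef_correlation`, the array layout and the expression of the time shift in samples rather than seconds are assumptions made for this illustration.

```python
import numpy as np

def max_abs_coef_correlation(C, y, max_shift):
    """Maximum Pearson correlation between |C(t - tau, s)| and y(t).

    C         : array (n_scales, n_samples) of wavelet coefficients
    y         : array (n_samples,) with the binary pedal signal
    max_shift : largest time shift tau, in samples (assumed unit)
    """
    A = np.abs(C)
    n = y.size
    best = -np.inf
    for tau in range(max_shift + 1):
        y_seg = y[tau:]                      # y(t) for t = tau, ..., n - 1
        if y_seg.std() == 0:                 # correlation undefined
            continue
        for s in range(A.shape[0]):
            a_seg = A[s, :n - tau]           # |C(t - tau, s)| for the same t
            if a_seg.std() == 0:
                continue
            r = np.corrcoef(a_seg, y_seg)[0, 1]
            best = max(best, r)
    return best
```

Computing this value for each candidate mother wavelet and comparing the results across recordings would reproduce the kind of comparison described in this appendix.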

Figure A5. Single-electrode calibration: impact on the predictive model of the components of the different modalities, according to the MI analysis, for each pedal position.

References

Acar E, Aykut-Bingol C, Bingol H, Bro R and Yener B 2007 Multiway analysis of epilepsy tensors Bioinformatics 23 i10–8

Akaike H 1974 A new look at the statistical model identification IEEE Trans. Autom. Control 19 716–23

Ball T, Schulze-Bonhage A, Aertsen A and Mehring C 2009 Differential representation of arm movement direction in relation to cortical anatomy and function J. Neural Eng. 6 016006

Barachant A, Aksenova T and Bonnet S 2008 Filtrage spatial robuste à partir d'un sous-ensemble optimal d'électrodes en BCI EEG 22nd Coll. GRETSI (Dijon, France) 2009

Barachant A and Bonnet S 2011 Channel selection procedure using Riemannian distance for BCI applications 5th Int. IEEE/EMBS Conf. on Neural Engineering (NER) 2011 pp 348–51

Bashashati A, Fatourechi M, Ward R K and Birch G E 2007 A survey of signal processing algorithms in brain–computer interfaces based on electrical brain signals J. Neural Eng. 4 R32–57

Borisoff J F, Mason S G, Bashashati A and Birch G E 2004 Brain–computer interface design for asynchronous control applications: improvements to the LF-ASD asynchronous brain switch IEEE Trans. Biomed. Eng. 51 985–92

Bro R 1996 Multiway calibration. Multi-linear PLS J. Chemom. 10 47–61

Chao Z C, Nagasaka Y and Fujii N 2010 Long-term asynchronous decoding of arm motion using electrocorticographic signals in monkeys Front. Neuroeng. 3 1–10

Cook R D and Weisberg S 1982 Residuals and Influence in Regression (London: Chapman and Hall)

Devijver P A and Kittler J 1982 Pattern Recognition: A Statistical Approach (London: Prentice-Hall)

Eliseyev A, Moro C, Costecalde T, Torres N, Gharbi S, Mestais C, Benabid A L and Aksenova T 2011 Iterative N-way PLS for self-paced BCI in freely moving animals J. Neural Eng. 8 046012

Fatourechi M, Birch G E and Ward R K 2007 A self-paced brain interface system that uses movement related potentials and changes in the power of brain rhythms J. Comput. Neurosci. 23 21–37

Fatourechi M, Ward R K and Birch G E 2008 A self-paced brain–computer interface system with a low false positive rate J. Neural Eng. 5 9–23

Geladi P and Kowalski B R 1986 Partial least-squares regression: a tutorial Anal. Chim. Acta 185 1–17

Huggins J, Levine S P, BeMent S L, Kushwaha R K, Schuh L A, Passaro E A, Rohde M M, Ross D A, Elisevich K V and Smith B J 1999 Detection of event-related potentials for development of a direct brain interface J. Clin. Neurophysiol. 16 448

Kolda T G and Bader B W 2007 Tensor decompositions and applications Sandia Report SAND2007-6702

Lal T N, Schröder M, Hinterberger T, Weston J, Bogdan M, Birbaumer N and Schölkopf B 2004 Support vector channel selection in BCI IEEE Trans. Biomed. Eng. 51 1003–10

Lan T, Erdogmus D, Adami A, Pavel M and Mathan S 2005 Salient EEG channel selection in brain computer interfaces by mutual information maximization 27th Annu. Int. Conf. of Engineering in Medicine and Biology Society (IEEE-EMBS 2005) pp 7064–7

Land S and Friedman J 1996 Variable fusion: a new method of adaptive signal regression Technical Report Department of Statistics, Stanford University, Stanford

Leeb R, Settgast V, Fellner D W and Pfurtscheller G 2007 Self-paced exploring of the Austrian National Library through thoughts Int. J. Bioelectromagnetism 9 237–44

Li J and Zhang L 2010 Regularized tensor discriminant analysis for single trial EEG classification in BCI Pattern Recognit. Lett. 31 619–28

Li J, Zhang L, Tao D, Sun H and Zhao Q 2009 A prior neurophysiologic knowledge free tensor-based scheme for single trial EEG classification IEEE Trans. Neural Syst. Rehabil. Eng. 17 107–15

Lv J and Liu M 2008 Common spatial pattern and particle swarm optimization for channel selection in BCI 3rd Int. Conf. on Innovative Computing Information and Control (ICICIC '08) p 457

Martínez-Montes E, Sánchez-Bornot J M and Valdés-Sosa P A 2008 Penalized PARAFAC analysis of spontaneous EEG recordings Statistica Sinica 18 1449–64

Martínez-Montes E, Valdés-Sosa P A, Miwakeichi F, Goldman R I and Cohen M S 2004 Concurrent EEG/fMRI analysis by multiway partial least squares NeuroImage 22 1023–34

Mason S G and Birch G E 2000 A brain-controlled switch for asynchronous control applications IEEE Trans. Biomed. Eng. 47 1297–307

Mehring C, Rickert J, Vaadia E, Cardosa de O S, Aertsen A and Rotter S 2003 Inference of hand movements from local field potentials in monkey motor cortex Nat. Neurosci. 6 1253–4

Mørup M, Hansen L K and Arnfred S M 2008 Algorithms for sparse nonnegative Tucker decomposition Neural Comput. 20 2112–31

Mørup M, Hansen L K, Herrmann C S, Parnas J and Arnfred S M 2006 Parallel factor analysis as an exploratory tool for wavelet transformed event-related EEG NeuroImage 29 938–47

Müller-Putz G R, Kaiser V, Solis-Escalante T and Pfurtscheller G 2010 Fast set-up asynchronous brain-switch based on detection of foot motor imagery in 1-channel EEG Med. Biol. Eng. Comput. 48 229–33

Nazarpour K, Sanei S, Shoker L and Chambers J A 2006 Parallel space–time–frequency decomposition of EEG signals for brain computer interfacing 14th Eur. Signal Processing Conf. (EUSIPCO 2006) (Florence, Italy, 4–8 Sept. 2006)

Pfurtscheller G, Graimann B, Huggins J E, Levine S P and Schuh L A 2003 Spatiotemporal patterns of beta desynchronization and gamma synchronization in corticographic data during self-paced movement Clin. Neurophysiol. 114 1226–36

Phan A H, Cichocki A and Vu-Dinh T 2010 A tensorial approach to single trial recognition for brain computer interface Int. Conf. on Advanced Technologies for Communications 2010 pp 138–41

Qian K, Nikolov P, Huang D, Fei D Y, Chen X and Bai O 2010 A motor imagery-based online interactive brain–controlled switch: paradigm development and preliminary test Clin. Neurophysiol. 121 1303–13

Rickert J, Oliveira S C, Vaadia E, Aertsen A, Rotter S and Mehring C 2005 Encoding of movement direction in different frequency ranges of motor cortical local field potentials J. Neurosci. 25 8815–24

Rijsbergen C J 1979 Information retrieval, available at http://www.dcs.gla.ac.uk/~iain/keith

Sannelli C, Dickhaus T, Halder S, Hammer E M, Müller K R and Blankertz B 2010 On optimal channel configurations for SMR-based brain–computer interfaces Brain Topogr. 23 186–93

Schalk G, Kubanek G, Miller K J, Anderson N R, Leuthardt E C, Ojemann J G, Limbrick D, Moran D, Gerhardt L A and Wolpaw J R 2007 Decoding two-dimensional movement trajectories using electrocorticographic signals in humans J. Neural Eng. 4 264–75

Schalk G, Miller K J, Anderson N R, Wilson J A, Smyth M D, Ojemann J G, Moran D W, Wolpaw J R and Leuthardt E C 2008 Two-dimensional movement control using electrocorticographic signals in humans J. Neural Eng. 5 75–84

Scherer R, Lee F, Schlögl A, Leeb R, Bischof H and Pfurtscheller G 2008 Toward self-paced brain–computer communication: navigation through virtual worlds IEEE Trans. Biomed. Eng. 55 675–82

Schlögl A, Kronegg J, Huggins J and Mason S G 2007 Evaluation criteria in BCI research Towards Brain-Computer Interfacing ed G Dornhege, J R Millán, T Hinterberger, D McFarland and K R Müller (Cambridge, MA: MIT Press)

Schmidt M 2005 Least squares optimization with L1-norm regularization Project Report University of British Columbia

Schwartz G 1978 Estimating the dimension of a model Ann. Stat. 6 461–4

Sherwood J and Derakhshani R 2009 On classifiability of wavelet features for EEG-based brain–computer interfaces Proc. Int. Joint Conf. on Neural Networks pp 2508–15

Shevade S K and Keerthi S S 2003 A simple and efficient algorithm for gene selection using sparse logistic regression Bioinformatics 19 2246–53

Tibshirani R 1996 Regression shrinkage and variable selection via the lasso J. R. Stat. Soc. B 58 267–88

Tychonoff A N and Arsenin V Y 1977 Solution of Ill-posed Problems (Washington, DC: Winston & Sons)

Vidaurre C, Kramer N, Blankertz B and Schlögl A 2009 Time domain parameters as a feature for EEG-based brain computer interfaces Neural Netw. 22 1313–9

Wang Y, Gao S and Gao X 2005 Common spatial pattern method for channel selection in motor imagery based brain–computer interface 27th Annu. Int. Conf. of the Engineering in Medicine and Biology Society (IEEE-EMBS 2005) pp 5392–5

Wolpaw J R, Birbaumer N, McFarland D J, Pfurtscheller G and Vaughan T M 2002 Brain–computer interfaces for communication and control Clin. Neurophysiol. 113 767–91

Yates F 1933 The analysis of replicated experiments when the field results are incomplete Empire J. Exp. Agric. 1 129

Zhao Q, Caiafa C F, Cichocki A, Zhang L and Phan A H 2009 Slice oriented tensor decomposition of EEG data for feature extraction in space, frequency and time domains Lecture Notes in Computer Science vol 5863 (Berlin: Springer) pp 221–8

Zhu J and Yao T 2004 An evaluation of statistical spam filtering techniques ACM Trans. Asian Lang. Inf. Proc. (TALIP) 3 243–69

Zou H and Hastie T 2005 Regularization and variable selection via the elastic net J. R. Stat. Soc. B 67 301–20

Methods

Generic NPLS





i1 i2 iN Here


2




X = t1 w1 w2 w3 + E1











wi = arg minwi


)

i = 1 2 3


(see appendix B)




Influence analysis



Application

Data description


3


(A)

(B)

(C)(D)







4







BCI evaluation






5


(A)

(B)




Results


















6





7











Discussion



8





(A)

(B)



Acknowledgments




w1 = arg minw1

(Z minus w1 w2 w32F

) w2 w3 are fixed

9



w2 = arg minw2

(Z minus w1 w2 w32F

) w1 w3 are fixed

w3 = arg minw3

(Z minus w1 w2 w32F

) w1 w2


w1 = arg minw1

(Z(1) minus w1wT232

2

)

w2 = arg minw2

(Z(2) minus w2wT132

2

)

w3 = arg minw3

(Z(3) minus w3wT122

2

)


(wi1 wi2

) The solutions of


(wT

23w23

)minus1

w2 = Z(2)w13

(wT

13w13


(wT

12w12

)minus1




wi = arg minwi


)

i = 1 2 3

10




w1 = arg minw1


)

or as a matrix

w1 = arg minw1

(Z(1) minus w1wT232

2 + λ1w11)

(B1)




∥∥2

F+ λ1

∥∥w1∥∥

1

)

was calculated

minus G(w1) = 2wT23

(ZT

(1) minus w23(w1)T

) minus λ1sign(w1)




11




(2wT

23ZT(1)


solution w1 = 0



Rψ = maxsτ

corr

t




12



References






























13




























14

Introduction

Methods

Generic NPLS


Influence analysis

Application

Data description


BCI evaluation


Results





Discussion

Acknowledgments




References








Methods

Generic NPLS





i1 i2 iN Here


2




X = t1 w1 w2 w3 + E1











wi = arg minwi


)

i = 1 2 3


(see appendix B)




Influence analysis



Application

Data description


3


(A)

(B)

(C)(D)







4







BCI evaluation






5


(A)

(B)




Results


















6





7











Discussion



8





(A)

(B)



Acknowledgments




w1 = arg minw1

(Z minus w1 w2 w32F

) w2 w3 are fixed

9



w2 = arg minw2

(Z minus w1 w2 w32F

) w1 w3 are fixed

w3 = arg minw3

(Z minus w1 w2 w32F

) w1 w2


w1 = arg minw1

(Z(1) minus w1wT232

2

)

w2 = arg minw2

(Z(2) minus w2wT132

2

)

w3 = arg minw3

(Z(3) minus w3wT122

2

)


(wi1 wi2

) The solutions of


(wT

23w23

)minus1

w2 = Z(2)w13

(wT

13w13


(wT

12w12

)minus1




wi = arg minwi


)

i = 1 2 3

10




w1 = arg minw1


)

or as a matrix

w1 = arg minw1

(Z(1) minus w1wT232

2 + λ1w11)

(B1)




∥∥2

F+ λ1

∥∥w1∥∥

1

)

was calculated

minus G(w1) = 2wT23

(ZT

(1) minus w23(w1)T

) minus λ1sign(w1)




11




(2wT

23ZT(1)


solution w1 = 0



Rψ = maxsτ

corr

t




12



References






























13




























14

Introduction

Methods

Generic NPLS


Influence analysis

Application

Data description


BCI evaluation


Results





Discussion

Acknowledgments




References




X = t1 w1 w2 w3 + E1











wi = arg minwi


)

i = 1 2 3


(see appendix B)




Influence analysis



Application

Data description


3


(A)

(B)

(C)(D)







4







BCI evaluation






5


(A)

(B)




Results


















6





7











Discussion



8





(A)

(B)



Acknowledgments




w1 = arg minw1

(Z minus w1 w2 w32F

) w2 w3 are fixed

9



w2 = arg minw2

(Z minus w1 w2 w32F

) w1 w3 are fixed

w3 = arg minw3

(Z minus w1 w2 w32F

) w1 w2


w1 = arg minw1

(Z(1) minus w1wT232

2

)

w2 = arg minw2

(Z(2) minus w2wT132

2

)

w3 = arg minw3

(Z(3) minus w3wT122

2

)


(wi1 wi2

) The solutions of


(wT

23w23

)minus1

w2 = Z(2)w13

(wT

13w13


(wT

12w12

)minus1




wi = arg minwi


)

i = 1 2 3

10




w1 = arg minw1


)

or as a matrix

w1 = arg minw1

(Z(1) minus w1wT232

2 + λ1w11)

(B1)




∥∥2

F+ λ1

∥∥w1∥∥

1

)

was calculated

minus G(w1) = 2wT23

(ZT

(1) minus w23(w1)T

) minus λ1sign(w1)




11




(2wT

23ZT(1)


solution w1 = 0



Rψ = maxsτ

corr

t




12



References






























13




























14

Introduction

Methods

Generic NPLS


Influence analysis

Application

Data description


BCI evaluation


Results





Discussion

Acknowledgments




References


(A)

(B)

(C)(D)







4







BCI evaluation






5


(A)

(B)




Results


















6





7











Discussion



8





(A)

(B)



Acknowledgments




w1 = arg minw1

(Z minus w1 w2 w32F

) w2 w3 are fixed

9



w2 = arg minw2

(Z minus w1 w2 w32F

) w1 w3 are fixed

w3 = arg minw3

(Z minus w1 w2 w32F

) w1 w2


w1 = arg minw1

(Z(1) minus w1wT232

2

)

w2 = arg minw2

(Z(2) minus w2wT132

2

)

w3 = arg minw3

(Z(3) minus w3wT122

2

)


(wi1 wi2

) The solutions of


(wT

23w23

)minus1

w2 = Z(2)w13

(wT

13w13


(wT

12w12

)minus1




wi = arg minwi


)

i = 1 2 3

10




w1 = arg minw1


)

or as a matrix

w1 = arg minw1

(Z(1) minus w1wT232

2 + λ1w11)

(B1)




∥∥2

F+ λ1

∥∥w1∥∥

1

)

was calculated

minus G(w1) = 2wT23

(ZT

(1) minus w23(w1)T

) minus λ1sign(w1)




11




(2wT

23ZT(1)


solution w1 = 0



Rψ = maxsτ

corr

t




12



References






























13




























14

Introduction

Methods

Generic NPLS


Influence analysis

Application

Data description


BCI evaluation


Results





Discussion

Acknowledgments




References







BCI evaluation






5


(A)

(B)




Results


















6





7











Discussion



8





(A)

(B)



Acknowledgments




w1 = arg minw1

(Z minus w1 w2 w32F

) w2 w3 are fixed

9



w2 = arg minw2

(Z minus w1 w2 w32F

) w1 w3 are fixed

w3 = arg minw3

(Z minus w1 w2 w32F

) w1 w2


w1 = arg minw1

(Z(1) minus w1wT232

2

)

w2 = arg minw2

(Z(2) minus w2wT132

2

)

w3 = arg minw3

(Z(3) minus w3wT122

2

)


(wi1 wi2

) The solutions of


(wT

23w23

)minus1

w2 = Z(2)w13

(wT

13w13


(wT

12w12

)minus1




wi = arg minwi


)

i = 1 2 3

10




w1 = arg minw1


)

or as a matrix

w1 = arg minw1

(Z(1) minus w1wT232

2 + λ1w11)

(B1)




∥∥2

F+ λ1

∥∥w1∥∥

1

)

was calculated

minus G(w1) = 2wT23

(ZT

(1) minus w23(w1)T

) minus λ1sign(w1)




11




(2wT

23ZT(1)


solution w1 = 0



Rψ = maxsτ

corr

t




12



References






























13




























14

Introduction

Methods

Generic NPLS


Influence analysis

Application

Data description


BCI evaluation


Results





Discussion

Acknowledgments




References


(A)

(B)




Results


















6





7











Discussion



8





(A)

(B)



Acknowledgments




w1 = arg minw1

(Z minus w1 w2 w32F

) w2 w3 are fixed

9



w2 = arg minw2

(Z minus w1 w2 w32F

) w1 w3 are fixed

w3 = arg minw3

(Z minus w1 w2 w32F

) w1 w2


w1 = arg minw1

(Z(1) minus w1wT232

2

)

w2 = arg minw2

(Z(2) minus w2wT132

2

)

w3 = arg minw3

(Z(3) minus w3wT122

2

)


(wi1 wi2

) The solutions of


(wT

23w23

)minus1

w2 = Z(2)w13

(wT

13w13


(wT

12w12

)minus1




wi = arg minwi


)

i = 1 2 3

10




w1 = arg minw1


)

or as a matrix

w1 = arg minw1

(Z(1) minus w1wT232

2 + λ1w11)

(B1)




∥∥2

F+ λ1

∥∥w1∥∥

1

)

was calculated

minus G(w1) = 2wT23

(ZT

(1) minus w23(w1)T

) minus λ1sign(w1)




11




(2wT

23ZT(1)


solution w1 = 0



Rψ = maxsτ

corr

t




12



References






























13




























14

Introduction

Methods

Generic NPLS


Influence analysis

Application

Data description


BCI evaluation


Results





Discussion

Acknowledgments




References





7











Discussion



8





(A)

(B)



Acknowledgments




w1 = arg minw1

(Z minus w1 w2 w32F

) w2 w3 are fixed

9



w2 = arg minw2

(Z minus w1 w2 w32F

) w1 w3 are fixed

w3 = arg minw3

(Z minus w1 w2 w32F

) w1 w2


w1 = arg minw1

(Z(1) minus w1wT232

2

)

w2 = arg minw2

(Z(2) minus w2wT132

2

)

w3 = arg minw3

(Z(3) minus w3wT122

2

)


(wi1 wi2

) The solutions of


(wT

23w23

)minus1

w2 = Z(2)w13

(wT

13w13


(wT

12w12

)minus1




wi = arg minwi


)

i = 1 2 3

10




w1 = arg minw1


)

or as a matrix

w1 = arg minw1

(Z(1) minus w1wT232

2 + λ1w11)

(B1)




∥∥2

F+ λ1

∥∥w1∥∥

1

)

was calculated

minus G(w1) = 2wT23

(ZT

(1) minus w23(w1)T

) minus λ1sign(w1)




11




(2wT

23ZT(1)


solution w1 = 0



Rψ = maxsτ

corr

t




12



References






























13




























14

Introduction

Methods

Generic NPLS


Influence analysis

Application

Data description


BCI evaluation


Results





Discussion

Acknowledgments




References











Discussion



8





(A)

(B)



Acknowledgments




w1 = arg minw1

(Z minus w1 w2 w32F

) w2 w3 are fixed

9



w2 = arg minw2

(Z minus w1 w2 w32F

) w1 w3 are fixed

w3 = arg minw3

(Z minus w1 w2 w32F

) w1 w2


w1 = arg minw1

(Z(1) minus w1wT232

2

)

w2 = arg minw2

(Z(2) minus w2wT132

2

)

w3 = arg minw3

(Z(3) minus w3wT122

2

)


(wi1 wi2

) The solutions of


(wT

23w23

)minus1

w2 = Z(2)w13

(wT

13w13


(wT

12w12

)minus1




wi = arg minwi


)

i = 1 2 3

10




w1 = arg minw1


)

or as a matrix

w1 = arg minw1

(Z(1) minus w1wT232

2 + λ1w11)

(B1)




∥∥2

F+ λ1

∥∥w1∥∥

1

)

was calculated

minus G(w1) = 2wT23

(ZT

(1) minus w23(w1)T

) minus λ1sign(w1)




11




(2wT

23ZT(1)


solution w1 = 0



Rψ = maxsτ

corr

t




12



References






























13




























14

Introduction

Methods

Generic NPLS


Influence analysis

Application

Data description


BCI evaluation


Results





Discussion

Acknowledgments




References





(A)

(B)



Acknowledgments




w1 = arg minw1

(Z minus w1 w2 w32F

) w2 w3 are fixed

9



w2 = arg minw2

(Z minus w1 w2 w32F

) w1 w3 are fixed

w3 = arg minw3

(Z minus w1 w2 w32F

) w1 w2


w1 = arg minw1

(Z(1) minus w1wT232

2

)

w2 = arg minw2

(Z(2) minus w2wT132

2

)

w3 = arg minw3

(Z(3) minus w3wT122

2

)


(wi1 wi2

) The solutions of


(wT

23w23

)minus1

w2 = Z(2)w13

(wT

13w13


(wT

12w12

)minus1




wi = arg minwi


)

i = 1 2 3

10




w1 = arg minw1


)

or as a matrix

w1 = arg minw1

(Z(1) minus w1wT232

2 + λ1w11)

(B1)




∥∥2

F+ λ1

∥∥w1∥∥

1

)

was calculated

minus G(w1) = 2wT23

(ZT

(1) minus w23(w1)T

) minus λ1sign(w1)




11




(2wT

23ZT(1)


solution w1 = 0



Rψ = maxsτ

corr

t




12



References






























13




























14

Introduction

Methods

Generic NPLS


Influence analysis

Application

Data description


BCI evaluation


Results





Discussion

Acknowledgments




References


