HDLSS Asy’s: Geometrical Represent’n
Assume $Z_1, \ldots, Z_n \sim N_d(0, I_d)$, let $d \to \infty$
Study Subspace Generated by Data:
Hyperplane through 0, of dimension $n$
Points are “nearly equidistant to 0”, & dist $\sqrt{d}$
Within plane, can “rotate towards $\sqrt{d} \times$ Unit Simplex”
All Gaussian data sets are “near Unit Simplex Vertices”
“Randomness” appears only in rotation of simplex
Hall, Marron & Neeman (2005)
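The geometric representation above can be checked numerically. The sketch below is only an illustration: the sample size `n = 5`, dimension `d = 20000` and the function name `simplex_geometry` are assumptions, not values from the slides. It uses the standard facts that, for standard Gaussian vectors in high dimension, norms are close to $\sqrt{d}$, pairwise distances are close to $\sqrt{2d}$, and pairwise angles are close to $90^\circ$.

```python
import numpy as np

def simplex_geometry(n=5, d=20000, seed=0):
    """Return norms / sqrt(d), pairwise distances / sqrt(2d), pairwise angles (degrees)."""
    rng = np.random.default_rng(seed)
    Z = rng.standard_normal((n, d))            # rows are Z_1, ..., Z_n ~ N_d(0, I_d)
    norms = np.linalg.norm(Z, axis=1)
    i, j = np.triu_indices(n, k=1)             # all pairs i < j
    dists = np.linalg.norm(Z[i] - Z[j], axis=1)
    cosines = np.sum(Z[i] * Z[j], axis=1) / (norms[i] * norms[j])
    angles = np.degrees(np.arccos(cosines))
    return norms / np.sqrt(d), dists / np.sqrt(2 * d), angles

scaled_norms, scaled_dists, angles = simplex_geometry()
print(scaled_norms.round(3), scaled_dists.round(3), angles.round(1))
```

All scaled norms and distances come out close to 1, so the points sit near the vertices of a regular simplex; only its orientation changes with the random seed.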
HDLSS Asy’s: Geometrical Represen’tion
Explanation of Observed (Simulation) Behavior:
“everything similar for very high d”
• 2 popn’s are 2 simplices (i.e. regular n-hedrons)
• All are same distance from the other class
• i.e. everything is a support vector
• i.e. all sensible directions show “data piling”
• so “sensible methods are all nearly the same”
2nd Paper on HDLSS Asymptotics
Notes on Kent’s Normal Scale Mixture:
$X_i \sim 0.5\, N_d(0, I_d) + 0.5\, N_d(0, 100\, I_d)$
• Data Vectors are indep’dent of each other
• But entries of each have strong depend’ce
• However, can show entries have cov = 0
• Recall statistical folklore:
Covariance = 0 $\Rightarrow$ Independence (true for jointly Gaussian variables, false in general)
0 Covariance is not independence
Simple Example: choose $c$ to make $\mathrm{cov}(X, Y) = 0$
Result:
• Joint distribution of $X$ and $Y$:
– Has Gaussian marginals
– Has $\mathrm{cov}(X, Y) = 0$
– Yet strong dependence of $X$ and $Y$
– Thus not multivariate Gaussian
Shows Multivariate Gaussian means more than Gaussian Marginals
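The transcript does not preserve how $Y$ is built from $X$, so the sketch below assumes the classical construction: $X \sim N(0,1)$, with $Y = X$ when $|X| \le c$ and $Y = -X$ otherwise. Under that assumption $\mathrm{cov}(X, Y) = 2\,E[X^2; |X| \le c] - 1$, and the code solves for the $c$ that makes it zero. The function names are illustrative.

```python
import math
import numpy as np

def truncated_second_moment(c):
    """E[X^2 ; |X| <= c] for X ~ N(0, 1), in closed form."""
    Phi = 0.5 * (1.0 + math.erf(c / math.sqrt(2.0)))
    phi = math.exp(-0.5 * c * c) / math.sqrt(2.0 * math.pi)
    return (2.0 * Phi - 1.0) - 2.0 * c * phi

def solve_c(lo=0.0, hi=5.0, tol=1e-12):
    """Bisection for the c with cov(X, Y) = 2 E[X^2; |X| <= c] - 1 = 0."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if truncated_second_moment(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

c = solve_c()
rng = np.random.default_rng(0)
X = rng.standard_normal(200_000)
Y = np.where(np.abs(X) <= c, X, -X)      # assumed construction: flip sign outside [-c, c]
sample_cov = np.cov(X, Y)[0, 1]
print(round(c, 3), round(sample_cov, 3))
```

By symmetry $Y$ is again standard Gaussian, and $|Y| = |X|$ always, so the dependence is as strong as it can be even though the covariance vanishes.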
HDLSS Asy’s: Geometrical Represen’tion
Further Consequences of Geometric Represen’tion
1. DWD more stable than SVM (based on deeper limiting distributions)
(reflects intuitive idea: feeling sampling variation) (something like mean vs. median)
Hall, Marron & Neeman (2005)
2. 1-NN rule inefficiency is quantified: Hall, Marron & Neeman (2005)
3. Inefficiency of DWD for uneven sample size (motivates weighted version)
Qiao et al. (2010)
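HDLSS Math Stat of PCA
Consistency & Strong Inconsistency:
Spike Covariance Model, Paul (2007)
For Eigenvalues: $\lambda_{1,d} = d^{\alpha},\ \lambda_{2,d} = \cdots = \lambda_{d,d} = 1$
1st Eigenvector: $u_1$
How Good are Empirical Versions, $\hat{\lambda}_{1,d}, \ldots, \hat{\lambda}_{d,d}, \hat{u}_1$, as Estimates?
Consistency (big enough spike):
For $\alpha > 1$, $\mathrm{Angle}(u_1, \hat{u}_1) \to 0$
Strong Inconsistency (spike not big enough):
For $\alpha < 1$, $\mathrm{Angle}(u_1, \hat{u}_1) \to 90^\circ$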
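A small simulation can make the two regimes concrete. The sketch below is only an illustration under assumed settings: `n = 20`, `d = 2000`, the two exponents `1.5` and `0.3`, and the choice $u_1 = e_1$ are not from the slides. It draws Gaussian data from the spike model and measures the angle between $u_1$ and the first sample eigenvector.

```python
import numpy as np

def first_pc_angle(alpha, d=2000, n=20, seed=0):
    """Angle (degrees) between u_1 = e_1 and the first sample eigenvector."""
    rng = np.random.default_rng(seed)
    sd = np.ones(d)
    sd[0] = np.sqrt(d ** alpha)               # lambda_1 = d^alpha, all other eigenvalues 1
    X = rng.standard_normal((n, d)) * sd      # rows are observations
    X = X - X.mean(axis=0)
    # right singular vectors of the centered data are the sample eigenvectors
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    cos = abs(Vt[0, 0])                       # |<u_1, u_1-hat>| when u_1 = e_1
    return np.degrees(np.arccos(min(cos, 1.0)))

angle_big_spike = first_pc_angle(alpha=1.5)    # alpha > 1
angle_small_spike = first_pc_angle(alpha=0.3)  # alpha < 1
print(angle_big_spike, angle_small_spike)
```

With the larger exponent the angle is small; with the smaller exponent the estimated direction is far from $u_1$, in line with the strong inconsistency described above.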
HDLSS Math Stat of PCA
PC Scores (i.e. projections)
Not Consistent
So how can PCA find Useful Signals in Data?
Key is “Proportional Errors”: $\hat{s}_{i,j} / s_{i,j} \approx R_j$ (with $R_j \neq 1$ in general)
Axes have Inconsistent Scales
But Relationships are Still Useful
HDLSS Asymptotics & Kernel Methods
Interesting Question:
Behavior in Very High Dimension?
Answer: El Karoui (2010)
• In Random Matrix Limit
• Kernel Embedded Classifiers ~ Linear Classifiers
HDLSS Asymptotics & Kernel Methods
Interesting Question:
Behavior in Very High Dimension?
Implications for DWD:
Recall Main Advantage is for High d
So not Clear Embedding Helps
Thus not yet Implemented in DWD
HDLSS Additional Results
Batch Adjustment: Xuxin Liu
Recall Intuition from above:
Key is sizes of biological subtypes
Differing ratio trips up mean
But DWD more robust
Mathematics behind this?
Liu: Twiddle ratios of subtypes
HDLSS Data Combo Mathematics
Xuxin Liu Dissertation Results:
Simple Unbalanced Cluster Model
Growing at rate $d^{\alpha}$ as $d \to \infty$
Answers depend on $\alpha$
Visualization of setting…
HDLSS Data Combo Mathematics
Asymptotic Results (as $d \to \infty$):
Let $r$ denote ratio between subgroup sizes
HDLSS Data Combo Mathematics
Asymptotic Results (as $d \to \infty$):
For growth exponent $\alpha > \frac{1}{2}$, PAM Inconsistent:
Angle(PAM, Truth) $\to C_r > 0$
For $\alpha < \frac{1}{2}$, PAM Strongly Inconsistent:
Angle(PAM, Truth) $\to 90^\circ$
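HDLSS Data Combo Mathematics
Asymptotic Results (as $d \to \infty$):
For growth exponent $\alpha > \frac{1}{2}$, DWD Inconsistent:
Angle(DWD, Truth) $\to C_r > 0$ (a DWD-specific limit)
For $\alpha < \frac{1}{2}$, DWD Strongly Inconsistent:
Angle(DWD, Truth) $\to 90^\circ$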
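HDLSS Data Combo Mathematics
Value of the limit $C_r$ for PAM and for DWD, for sample size ratio $r$:
Both limits equal 0 only when $r = 1$
Otherwise, for $r \neq 1$, both are Inconsistent
[Formulas: each limit is an inverse cosine, $\cos^{-1}(\cdot)$, of an expression in $r$; the PAM and DWD expressions differ]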
HDLSS Data Combo Mathematics
Comparison between PAM and DWD?
I.e. between the PAM and DWD values of $C_r$?
Shows Strong Difference
Explains Above Empirical Observation
[Figure: comparison of the two limits]
HDLSS Asymptotics
Personal Observations:
HDLSS world is…
Surprising (many times)
[Think I’ve got it, and then …]
Mathematically Beautiful (?)
Practically Relevant
The Future of HDLSS Asymptotics
• “Contiguity” in Hypo Testing?
• Rates of Convergence?
• Improvements of DWD?
(e.g. other functions of distance than inverse)
• Many Others?
It is still early days …
State of HDLSS Research?
Development Of Methods
Mathematical Assessment
…
(thanks to: defiant.corban.edu/gtipton/net-fun/iceberg.html)
Independent Component Analysis
Personal Viewpoint:
Directions (e.g. PCA, DWD) that maximize independence
Motivating Context: Signal Processing
“Blind Source Separation”
Independent Component Analysis
References:
• Cardoso (1993) (1st paper)
• Lee (1998) (early book, not recommended)
• Hyvärinen & Oja (1998) (excellent short tutorial)
• Hyvärinen, Karhunen & Oja (2001) (detailed monograph)
ICA Motivating Example
“Cocktail party problem”
• Hear several simultaneous conversations
• Would like to separate them
Model for “conversations”: time series $s_1(t)$ and $s_2(t)$
[Figure: model for “conversations”, two time series]
ICA Motivating Example
Mixed version of signals:
$x_1(t) = a_{11} s_1(t) + a_{12} s_2(t)$
And also a 2nd mixture:
$x_2(t) = a_{21} s_1(t) + a_{22} s_2(t)$
[Figure: mixed version of signals]
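ICA Motivating Example
Goal: Recover signal $s(t) = \begin{pmatrix} s_1(t) \\ s_2(t) \end{pmatrix}$
From data $x(t) = \begin{pmatrix} x_1(t) \\ x_2(t) \end{pmatrix}$
For unknown mixture matrix $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$
where $x = A s$ for all $t$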
ICA Motivating Example
Goal is to find separating weights $W$
so that $s = W x$ for all $t$
Problem: $W = A^{-1}$ would be fine,
but $A$ is unknown
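The mixing model and the “oracle” solution $W = A^{-1}$ can be sketched in a few lines. The two signals (a sine wave and a sawtooth) and the entries of `A` are hypothetical choices for illustration, not the ones behind the figures on the slides.

```python
import numpy as np

n = 1000
t = np.arange(n)
# two hypothetical "conversations": a sine wave and a sawtooth
S = np.vstack([np.sin(2 * np.pi * t / 50), (t % 37) / 37.0 - 0.5])
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])          # hypothetical mixing matrix
X = A @ S                           # x(t) = A s(t); columns are time points
W = np.linalg.inv(A)                # oracle separating weights, usable only if A is known
S_recovered = W @ X
max_error = np.max(np.abs(S_recovered - S))
print(max_error)
```

In practice only `X` is observed, which is why ICA has to estimate the separating weights from the data alone.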
ICA Motivating Example
Solutions for Cocktail Party Problem:
1. PCA (on population of 2-d vectors), $W$ = matrix of eigenvectors
[Figure: PCA result, panels labeled Maximal Variance and Minimal Variance]
[direction of maximal variation doesn’t solve this problem]
2. ICA (will describe method later)
[Figure: ICA result]
[Independent Components do solve it]
[modulo sign changes and identification]
ICA Motivating Example
Recall original time series
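ICA Motivating Example
Relation to OODA: recall data matrix
$X = \begin{pmatrix} X_1 & \cdots & X_n \end{pmatrix} = \begin{pmatrix} X_{11} & \cdots & X_{1n} \\ \vdots & \ddots & \vdots \\ X_{d1} & \cdots & X_{dn} \end{pmatrix}$
Signal Processing: focus on rows
($d$ time series, indexed by $t = 1, \ldots, n$)
OODA: focus on columns as data objects
($n$ data vectors)
Note: 2 viewpoints like “duals” for PCA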
ICA Motivating Example
Scatterplot View (signal processing): Study
• Signals $\{(s_1(t), s_2(t)) : t = 1, \ldots, n\}$
• Corresponding Scatterplot
• Data $\{(x_1(t), x_2(t)) : t = 1, \ldots, n\}$
• Corresponding Scatterplot
[Figures: signals, data, and the corresponding scatterplot for each]
ICA Motivating Example
Scatterplot View (signal processing): Plot
• Signals $\{(s_1(t), s_2(t)) : t = 1, \ldots, n\}$ & Scat’plot
• Data $\{(x_1(t), x_2(t)) : t = 1, \ldots, n\}$ & Scat’plot
• Scatterplots give hint how ICA is possible
• Affine transformation $x = A s$
Stretches indep’t signals into dep’t
• “Inversion” is key to ICA
(even w/ $A$ unknown)
ICA Motivating Example
Signals - Corresponding Scatterplot
Note: Independent, Since Known Value of $s_1$ Does Not Change Distribution of $s_2$
ICA Motivating Example
Data - Corresponding Scatterplot
Note: Dependent, Since Known Value of $x_1$ Changes Distribution of $x_2$
ICA Motivating Example
Why not PCA?
Finds direction of greatest variation
Which is wrong for signal separation
[Figures: PCA finds direction of greatest variation; wrong for signal separation]
ICA Algorithm
ICA, Step 1:
• “sphere the data” (shown on right in scatterplot view)
• i.e. find linear transf’n to make mean = $0$, cov = $I$
• i.e. work with $Z = \hat{\Sigma}^{-1/2} X$
• requires $X_{d \times n}$ of full rank (at least $n \ge d$, i.e. no HDLSS)
• search for independent, beyond linear and quadratic structure
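Step 1 can be sketched as follows. This is a minimal version, assuming a $d \times n$ data matrix with columns as observations; the explicit centering step, the rank check and the example data are additions for illustration. The inverse square root of $\hat{\Sigma}$ is taken through its eigendecomposition.

```python
import numpy as np

def sphere(X):
    """Sphere a d x n data matrix: center each row, then apply Sigma-hat^{-1/2}."""
    Xc = X - X.mean(axis=1, keepdims=True)
    sigma_hat = Xc @ Xc.T / X.shape[1]
    evals, evecs = np.linalg.eigh(sigma_hat)
    if evals.min() <= 1e-12 * evals.max():
        raise ValueError("covariance is not full rank (e.g. HDLSS data with d > n)")
    root_inv = (evecs / np.sqrt(evals)) @ evecs.T   # symmetric inverse square root
    return root_inv @ Xc

rng = np.random.default_rng(1)
# hypothetical correlated 2-d data with non-zero mean
X = np.array([[2.0, 1.0], [0.5, 1.5]]) @ rng.uniform(-1, 1, size=(2, 2000)) + 3.0
Z = sphere(X)
cov_Z = Z @ Z.T / Z.shape[1]
print(cov_Z.round(6))
```

After sphering, the sample mean is 0 and the sample covariance is the identity, so any remaining structure lies beyond first and second moments.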
ICA Algorithm
ICA, Step 2:
• Find dir’ns that make (sphered) data independent as possible
• Recall “independence” means: joint distribution is product of marginals
• In cocktail party example:
– Happens only when support parallel to axes
– Otherwise have blank areas, but marginals are non-zero
[Figure: ICA Step 2, cocktail party example]
ICA Algorithm
Parallel Idea (and key to algorithm):
• Find directions that maximize non-Gaussianity
• Based on assumption: starting from independent coordinates
• Note: “most” projections are Gaussian (since projection is “linear combo”)
• Mathematics behind this: Diaconis and Freedman (1984)
ICA Algorithm
Worst case for ICA: Gaussian
• Then sphered data are independent
• So have independence in all (ortho) dir’ns
• Thus can’t find useful directions
• Gaussian distribution is characterized by:
Independent & spherically symmetric
ICA Algorithm
Criteria for non-Gaussianity / independence:
• Kurtosis $E X^4 - 3\,(E X^2)^2$ (4th order cumulant)
• Negative Entropy
• Mutual Information
• Nonparametric Maximum Likelihood
• “Infomax” in Neural Networks
• Interesting connections between these
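As a quick illustration of the kurtosis criterion, the sketch below evaluates the sample version of $E X^4 - 3 (E X^2)^2$ on three unit-variance distributions. The distributions and the sample size are illustrative choices; the population values are 0 for the Gaussian, $-1.2$ for the uniform and $3$ for the Laplace.

```python
import numpy as np

def kurtosis(x):
    """Sample version of E X^4 - 3 (E X^2)^2 for centered x."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return np.mean(x ** 4) - 3.0 * np.mean(x ** 2) ** 2

rng = np.random.default_rng(2)
n = 200_000
k_gauss = kurtosis(rng.standard_normal(n))
k_unif = kurtosis(rng.uniform(-np.sqrt(3), np.sqrt(3), n))          # unit variance
k_laplace = kurtosis(rng.laplace(scale=1 / np.sqrt(2), size=n))     # unit variance
print(k_gauss, k_unif, k_laplace)
```

Kurtosis is zero for Gaussian data and moves away from zero in either direction for non-Gaussian data, which is why its absolute value can serve as a measure of non-Gaussianity.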
ICA Algorithm
Matlab Algorithm (optimizing any of above): “FastICA”
• Numerical gradient search method
• Can find directions iteratively
• Or by simultaneous optimization
(note: PCA does both, but not ICA)
• Appears fast, with good defaults
• Careful about local optima
(Recommend several random restarts)
ICA Algorithm
FastICA Notational Summary:
1. First sphere data: $Z = \hat{\Sigma}^{-1/2} X$
2. Apply ICA: Find $W_S$ to make rows of $S_S = W_S Z$ “indep’t”
3. Can transform back to original data scale: $S = \hat{\Sigma}^{1/2} S_S$
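Steps 1 and 2 of the summary can be sketched with plain NumPy. This is not the Matlab FastICA package itself. It is a simplified stand-in that uses the kurtosis-based fixed-point update $w \leftarrow E[z\,(w^t z)^3] - 3w$ with deflation. The sources, mixing matrix and function names are hypothetical.

```python
import numpy as np

def sphere(X):
    """Step 1: center the d x n data and apply Sigma-hat^{-1/2}."""
    Xc = X - X.mean(axis=1, keepdims=True)
    evals, evecs = np.linalg.eigh(Xc @ Xc.T / X.shape[1])
    return (evecs / np.sqrt(evals)) @ evecs.T @ Xc

def kurtosis_ica(Z, n_iter=200, tol=1e-10, seed=0):
    """Step 2: find orthonormal rows of W_S one at a time (deflation)."""
    rng = np.random.default_rng(seed)
    d = Z.shape[0]
    W = np.zeros((d, d))
    for k in range(d):
        w = rng.standard_normal(d)
        w -= W[:k].T @ (W[:k] @ w)                   # start orthogonal to earlier rows
        w /= np.linalg.norm(w)
        for _ in range(n_iter):
            w_new = (Z * (w @ Z) ** 3).mean(axis=1) - 3.0 * w
            w_new -= W[:k].T @ (W[:k] @ w_new)       # stay orthogonal to earlier rows
            w_new /= np.linalg.norm(w_new)
            done = abs(abs(w_new @ w) - 1.0) < tol   # converged, up to sign
            w = w_new
            if done:
                break
        W[k] = w
    return W

rng = np.random.default_rng(3)
n = 5000
S = np.vstack([rng.uniform(-1, 1, n), rng.laplace(size=n)])  # independent, non-Gaussian
A = np.array([[1.0, 0.6], [0.4, 1.0]])                        # hypothetical mixing matrix
X = A @ S
Z = sphere(X)
W_S = kurtosis_ica(Z)
S_S = W_S @ Z
# absolute correlations between recovered rows (first index) and true sources (second)
corr = np.abs(np.corrcoef(np.vstack([S_S, S]))[:2, 2:])
print(corr.round(3))
```

Each recovered row matches one true source up to sign, scale and order. A single random start is used here; in line with the warning about local optima, a more careful version would try several seeds.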
ICA Algorithm
Careful look at identifiability:
• Seen already in above example
• Could rotate “square of data” in several ways, etc.
[Figure: Identifiability: Swap, Flips]
ICA Algorithm
Identifiability problem 1:
Generally can’t order rows of $S_S$ (& $S$)
(seen as swap above)
Since for a “permutation matrix” $P$
(pre-multiplication by $P$ “swaps rows”)
(post-multiplication by $P$ “swaps columns”)
For each column, $z = A_S s_S$, i.e. $s_S = A_S^{-1} z$,
i.e. $P s_S = P A_S^{-1} z = P W_S z$
So $P S_S$ and $P W_S$ are also ICA solut’ns
(i.e. $P S_S = P W_S Z$)
FastICA appears to order in terms of “how non-Gaussian”
ICA Algorithm
Identifiability problem 2:
Can’t find scale of elements of $s$
(seen as flips above)
Since for a (full rank) diagonal matrix $D$
(pre-multipl’n by $D$ is scalar mult’n of rows)
(post-multipl’n by $D$ is scalar mult’n of col’s)
Again for each col’n, $s_S = A_S^{-1} z$,
i.e. $D s_S = D A_S^{-1} z = D W_S z$
So $D S_S$ and $D W_S$ are also ICA solutions
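Both identifiability problems can be seen in one small check: permuting and rescaling the sources, while compensating in the mixing matrix, reproduces exactly the same data. The matrices `A`, `P`, `D` and the sources below are hypothetical examples.

```python
import numpy as np

rng = np.random.default_rng(4)
A = np.array([[1.0, 0.6], [0.4, 1.0]])      # hypothetical mixing matrix
S = rng.uniform(-1, 1, size=(2, 100))       # hypothetical independent sources
X = A @ S

P = np.array([[0.0, 1.0], [1.0, 0.0]])      # permutation: swaps the two rows
D = np.diag([-1.0, 2.5])                    # diagonal: flips first row, rescales second

# relabeled sources D P S, with compensating mixing matrix A P^{-1} D^{-1}
S_alt = D @ P @ S
A_alt = A @ np.linalg.inv(P) @ np.linalg.inv(D)
same_data = np.allclose(A_alt @ S_alt, X)
print(same_data)
```

Since `S_alt` still has independent rows, the data alone cannot distinguish `(A, S)` from `(A_alt, S_alt)`; order, sign and scale have to be fixed by convention.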
ICA Algorithm
Signal Processing Scale identification:
(Hyvärinen and Oja, 1999)
Choose scale so each signal $s_i(t)$
has “unit average energy”: $\sum_t s_i(t)^2 = 1$
(preserves energy along rows of data matrix)
Explains “same scales” in
Cocktail Party Example
ICA & Non-Gaussianity
Explore main ICA principle:
For indep., non-Gaussian, standardized r.v.’s $X = (X_1, \ldots, X_d)^t$
Projections farther from coordinate axes are more Gaussian
For the dir’n vector $u_k = (u_{k,1}, \ldots, u_{k,d})^t$,
where $u_{k,j} = k^{-1/2}$ for $j = 1, \ldots, k$ and $u_{k,j} = 0$ for $j = k+1, \ldots, d$
(thus $\|u_k\| = 1$)
have $u_k^t X \approx N(0, 1)$ in distribution, for large $d$ and $k$
ICA & Non-Gaussianity
Illustrative examples ($d = 100$, $n = 500$):
a. Uniform Marginals
b. Exponential Marginals
c. Bimodal Marginals
Study over range of values of $k$
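The uniform-marginal case can be sketched as follows, taking $u_k$ to be the unit vector with equal weights $k^{-1/2}$ on the first $k$ coordinates and 0 elsewhere. The code uses $d = 100$ as on the slide but a larger sample size (`n = 20000`, an assumption made here to reduce noise), and tracks the excess kurtosis of the projection $u_k^t X$, which in theory equals $-1.2/k$.

```python
import numpy as np

def excess_kurtosis(x):
    """Standardized 4th cumulant: E(x - mean)^4 / var^2 - 3."""
    x = x - x.mean()
    return np.mean(x ** 4) / np.mean(x ** 2) ** 2 - 3.0

rng = np.random.default_rng(5)
d, n = 100, 20_000                       # n is larger than on the slide, to cut noise
X = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(d, n))   # indep., standardized, uniform

kurt_by_k = {}
for k in (1, 2, 5, 20, 100):
    u_k = np.zeros(d)
    u_k[:k] = 1.0 / np.sqrt(k)           # unit vector, equal weight on first k coordinates
    kurt_by_k[k] = excess_kurtosis(u_k @ X)
print(kurt_by_k)
```

The projection along a coordinate axis ($k = 1$) is clearly non-Gaussian, while projections with larger $k$ have excess kurtosis near 0, which is the sense in which directions far from the axes look Gaussian.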
ICA amp Non-Gaussianity
Illustrative example - Uniform marginals
HDLSS Asyrsquos Geometrical Represenrsquotion
Explanation of Observed (Simulation) Behavior
ldquoeverything similar for very high d rdquo
bull 2 popnrsquos are 2 simplices (ie regular n-hedrons)bull All are same distance from the other classbull ie everything is a support vectorbull ie all sensible directions show ldquodata pilingrdquobull so ldquosensible methods are all nearly the samerdquo
2nd Paper on HDLSS Asymptotics
Notes on Kentrsquos Normal Scale Mixture
bull Data Vectors are indeprsquodent of each other
bull But entries of each have strong dependrsquoce
bull However can show entries have cov = 0
bull Recall statistical folklore
Covariance = 0 Independence
ddddiININX 10050050~
0 Covariance is not independence
Simple Example c to make cov(XY) = 0
0 Covariance is not independence
Result
bull Joint distribution of and ndash Has Gaussian marginals
ndash Has
ndash Yet strong dependence of and
ndash Thus not multivariate Gaussian
Shows Multivariate Gaussian means more
than Gaussian Marginals
YX
0cov YX
X Y
HDLSS Asyrsquos Geometrical RepresenrsquotionFurther Consequences of Geometric Represenrsquotion
1 DWD more stable than SVM(based on deeper limiting distributions)
(reflects intuitive idea feeling sampling variation)(something like mean vs median)
Hall Marron Neeman (2005)
2 1-NN rule inefficiency is quantified Hall Marron Neeman (2005)
3 Inefficiency of DWD for uneven sample size(motivates weighted version)
Qiao et al (2010)
HDLSS Math Stat of PCA
Consistency amp Strong Inconsistency
Spike Covariance Model Paul (2007)
For Eigenvalues
1st Eigenvector
How Good are Empirical Versions
as Estimates
11 21 dddd d
1u
11 ˆˆˆ uddd
Consistency (big enough spike)
For
Strong Inconsistency (spike not big enough)
For
1
0ˆ 11 uuAngle
1
011 90ˆ uuAngle
HDLSS Math Stat of PCA
PC Scores (ie projections)
Not Consistent
So how can PCA find Useful Signals in Data
Key is ldquoProportional Errorsrdquo
Axes have Inconsistent Scales
But Relationships are Still Useful
HDLSS Math Stat of PCA
1ˆ
jji
ji Rs
s
Interesting Question
Behavior in Very High Dimension
Answer El Karoui (2010)
bull In Random Matrix Limit
bull Kernel Embedded Classifiers ~
~ Linear Classifiers
HDLSS Asymptotics amp Kernel Methods
Interesting Question
Behavior in Very High Dimension
Implications for DWD
Recall Main Advantage is for High d
HDLSS Asymptotics amp Kernel Methods
Interesting Question
Behavior in Very High Dimension
Implications for DWD
Recall Main Advantage is for High d
So not Clear Embedding Helps
Thus not yet Implemented in DWD
HDLSS Asymptotics amp Kernel Methods
HDLSS Additional Results
Batch Adjustment Xuxin Liu
Recall Intuition from above
Key is sizes of biological subtypes
Differing ratio trips up mean
But DWD more robust
Mathematics behind this
Liu Twiddle ratios of subtypes
HDLSS Data Combo Mathematics
Xuxin Liu Dissertation Results
Simple Unbalanced Cluster Model
Growing at rate as
Answers depend on
Visualization of settinghellip
d d
HDLSS Data Combo Mathematics
HDLSS Data Combo Mathematics
HDLSS Data Combo Mathematics
Asymptotic Results (as )
Let denote ratio between subgroup sizes
d
r
HDLSS Data Combo Mathematics
Asymptotic Results (as )
For PAM Inconsistent
Angle(PAMTruth)
For PAM Strongly Inconsistent
Angle(PAMTruth)
d
2
1
2
1
0 rC
090
HDLSS Data Combo Mathematics
Asymptotic Results (as )
For DWD Inconsistent
Angle(DWDTruth)
For DWD Strongly Inconsistent
Angle(DWDTruth)
d
2
1
2
1
090
0 rC
HDLSS Data Combo Mathematics
Value of and for sample size ratio
only when
Otherwise for both are Inconsistent
rC
22
1cos
2
1
r
rCr
0 rr CC
r
1r
1r
rC
22
1cos
32
31
r
rCr
HDLSS Data Combo Mathematics
Comparison between PAM and DWD
Ie between and rCrC
HDLSS Data Combo Mathematics
Comparison between PAM and DWD
HDLSS Data Combo Mathematics
Comparison between PAM and DWD
Ie between and
Shows Strong Difference
Explains Above Empirical Observation
rCrC
Personal Observations
HDLSS world ishellip
HDLSS Asymptotics
Personal Observations
HDLSS world ishellip
Surprising (many times)
[Think Irsquove got it and then hellip]
HDLSS Asymptotics
Personal Observations
HDLSS world ishellip
Surprising (many times)
[Think Irsquove got it and then hellip]
Mathematically Beautiful ()
HDLSS Asymptotics
Personal Observations
HDLSS world ishellip
Surprising (many times)
[Think Irsquove got it and then hellip]
Mathematically Beautiful ()
Practically Relevant
HDLSS Asymptotics
The Future of HDLSS Asymptotics
bull ldquoContiguityrdquo in Hypo Testing
bull Rates of Convergence
bull Improvements of DWD
(eg other functions of distance than inverse)
bull Many Others
It is still early days hellip
State of HDLSS Research
DevelopmentOf Methods
MathematicalAssessment
hellip
(thanks todefiantcorbanedugtiptonnet-funiceberghtml)
Independent Component Analysis
Personal Viewpoint
Directions
(eg PCA DWD)
Independent Component Analysis
Personal Viewpoint
Directions that maximize independence
Independent Component Analysis
Personal Viewpoint
Directions that maximize independence
Motivating Context Signal Processing
ldquoBlind Source Separationrdquo
Independent Component Analysis
References
bull Cardoso (1993) (1st paper)
bull Lee (1998) (early book not reccorsquoed)
bull Hyvaumlrinen amp Oja (1998)
(excellent short tutorial)
bull Hyvaumlrinen Karhunen amp Oja (2001)
(detailed monograph)
ICA Motivating Example
ldquoCocktail party problemrdquo
ICA Motivating Example
ldquoCocktail party problemrdquo
bull Hear several simultaneous conversations
bull Would like to
separate them
ICA Motivating Example
ldquoCocktail party problemrdquo
bull Hear several simultaneous conversations
bull Would like to
separate them
Model for ldquoconversationsrdquo time series
and ts1 ts2
ICA Motivating Example
Model for ldquoconversationsrdquo time series
ICA Motivating Example
Mixed version of signals
And also a 2nd mixture
tsatsatx 2121111
tsatsatx 2221212
ICA Motivating Example
Mixed version of signals
ICA Motivating Example
Goal Recover signal
From data
For unknown mixture matrix
where for all
tx
txtx
2
1
ts
tsts
2
1
2221
1211
aa
aaA
sAx t
ICA Motivating Example
Goal is to find separating weights
so that for allxWs
W
t
ICA Motivating Example
Goal is to find separating weights
so that for all
Problem would be fine
but is unknown
xWs
W
1AW
t
A
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
= matrix of eigenvectorsW
ICA Motivating Example
1 PCA (on population of 2-d vectors)
Maximal
Variance
Minimal
Variance
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
[direction of maximal variation
doesnrsquot solve this problem]
2ICA (will describe method later)
ICA Motivating Example
2 ICA (will describe method later)
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
[direction of maximal variation
doesnrsquot solve this problem]
2ICA (will describe method later)
[Independent Components do solve it]
[modulo sign changes and identification]
ICA Motivating Example
Recall original time series
ICA Motivating ExampleRelation to OODA recall data matrix
dnd
n
n
XX
XX
XXX
1
111
1
ICA Motivating ExampleRelation to OODA recall data matrix
Signal Processing focus on rows
( time series indexed by )
OODA focus on columns as data objects
( data vectors)
dnd
n
n
XX
XX
XXX
1
111
1
d nt 1
n
ICA Motivating ExampleRelation to OODA recall data matrix
Signal Processing focus on rows
( time series indexed by )
OODA focus on columns as data objects
( data vectors)
Note 2 viewpoints like ldquodualsrdquo for PCA
dnd
n
n
XX
XX
XXX
1
111
1
d nt 1
n
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals nttsts 1)()( 21
ICA Motivating Example
Study Signals nttsts 1)()( 21
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals
bull Corresponding Scatterplot
nttsts 1)()( 21
ICA Motivating Example
Signals - Corresponding Scatterplot
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals
bull Corresponding Scatterplot
bull Data
nttsts 1)()( 21
nttxtx 1)()( 21
ICA Motivating Example
Study Data nttxtx 1)()( 21
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals
bull Corresponding Scatterplot
bull Data
bull Corresponding Scatterplot
nttsts 1)()( 21
nttxtx 1)()( 21
ICA Motivating Example
Data - Corresponding Scatterplot
ICA Motivating Example
Scatterplot View (signal processing) Plot
bull Signals amp Scatrsquoplot
bull Data amp Scatrsquoplot
bull Scatterplots give hint how ICA is possible
bull Affine transformation
Stretches indeprsquot signals into deprsquot
bull Inversion is key to ICA
(even w unknown)
nttsts 1)()( 21
nttxtx 1)()( 21
A
sAx
ICA Motivating Example
Signals - Corresponding Scatterplot
Note Independent
Since Known Value
Of s1 Does Not
Change Distribution
of s2
ICA Motivating Example
Data - Corresponding Scatterplot
Note Dependent
Since Known Value
Of s1 Changes
Distribution of s2
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
ICA Motivating Example
PCA - Finds direction of greatest variation
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
Which is wrong for signal separation
ICA Motivating Example
PCA - Wrong for signal separation
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
Which is wrong for signal separation
ICA AlgorithmICA Step 1bull sphere the data
(shown on right in scatterplot view)
ICA AlgorithmICA Step 1 sphere the data
ICA AlgorithmICA Step 1bull ldquosphere the datardquo
(shown on right in scatterplot view)bull ie find linear transfrsquon to make
mean = cov = bull ie work with
0 I ˆ 21 XZ
ICA AlgorithmICA Step 1bull ldquosphere the datardquo
(shown on right in scatterplot view)bull ie find linear transfrsquon to make
mean = cov = bull ie work withbull requires of full rank
(at least ie no HDLSS)
0 I ˆ 21 XZ
dnX
ICA AlgorithmICA Step 1bull ldquosphere the datardquo
(shown on right in scatterplot view)bull ie find linear transfrsquon to make
mean = cov = bull ie work withbull requires of full rank
(at least ie no HDLSS)bull search for independent beyond
linear and quadratic structure
0 I ˆ 21 XZ
dnX
ICA AlgorithmICA Step 2bull Find dirrsquons that make (sphered) data
independent as possible
ICA AlgorithmICA Step 2bull Find dirrsquons that make (sphered) data
independent as possiblebull Recall ldquoindependencerdquo means
joint distribution is product of marginalsbull In cocktail party example
ICA AlgorithmICA Step 2 Cocktail party example
ICA AlgorithmICA Step 2bull Find dirrsquons that make (sphered) data
independent as possiblebull Recall ldquoindependencerdquo means
joint distribution is product of marginalsbull In cocktail party example
ndash Happens only when support parallel to axesndash Otherwise have blank areas
but marginals are non-zero
ICA AlgorithmParallel Idea (and key to algorithm) bull Find directions that maximize
non-Gaussianity
ICA AlgorithmParallel Idea (and key to algorithm) bull Find directions that maximize
non-Gaussianitybull Based on assumption starting from
independent coordinatesbull Note ldquomostrdquo projections are Gaussian
(since projection is ldquolinear combordquo)
ICA AlgorithmParallel Idea (and key to algorithm) bull Find directions that maximize
non-Gaussianitybull Based on assumption starting from
independent coordinatesbull Note ldquomostrdquo projections are Gaussian
(since projection is ldquolinear combordquo)bull Mathematics behind this
Diaconis and Freedman (1984)
ICA Algorithm
Worst case for ICA Gaussian
bull Then sphered data are independent
bull So have independence in all (ortho) dirrsquons
bull Thus canrsquot find useful directions
bull Gaussian distribution is characterized by
Independent amp spherically symmetric
ICA AlgorithmCriteria for non-Gaussianity
independence
bull Kurtosis
(4th order cumulant)bull Negative Entropybull Mutual Informationbull Nonparametric Maximum Likelihoodbull ldquoInfomaxrdquo in Neural Networksbull Interesting connections between these
224 3 EXEX
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iteratively
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iterativelybull Or by simultaneous optimization
(note PCA does both but not ICA)
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iterativelybull Or by simultaneous optimization
(note PCA does both but not ICA)bull Appears fast with good defaultsbull Careful about local optima
(Recco several random restarts)
ICA AlgorithmFastICA Notational Summary
1First sphere data
2Apply ICA Find to make
rows of ldquoindeprsquotrdquo
3Can transform back to
original data scale
ˆ 21 XZ
SW
ZWS SS
SSS 21ˆˆ
ICA AlgorithmCareful look at identifiability
bull Seen already in above example
bull Could rotate ldquosquare of datardquo in several ways etc
ICA AlgorithmCareful look at identifiability
ICA AlgorithmIdentifiability Swap Flips
ICA AlgorithmIdentifiability problem 1
Generally canrsquot order rows of (amp )
(seen as swap above)
Since for a ldquopermutation matrixrdquo
(pre-multiplication by ldquoswaps rowsrdquo)
(post-multiplication by ldquoswaps columnsrdquo)
For each column ie
ie
SS S
P
PP
SS sAz
zPWzPAsP SSS 1
zAs SS
1
ICA AlgorithmIdentifiability problem 1
Generally canrsquot order rows of (amp )
Since for a ldquopermutation matrixrdquo
For each column
So and are also ICA solutrsquons
(ie )
FastICA appears to order in terms of ldquohow non-Gaussianrdquo
SS S
PzPWzPAsP SSS
1
SPS SPW
ZPWPS SS
ICA AlgorithmIdentifiability problem 2
Canrsquot find scale of elements of
(seen as flips above)
Since for a (full rank) diagonal matrix
(pre-multiplrsquon by is scalar multrsquon of rows)
(post-multiplrsquon by is scalar multrsquon of colrsquos)
Again for each colrsquon
ie
So and are also ICA solutions
D
s
SDW
D
D
SDSzDWzDAsD SSs
1
zAs SS
1
ICA AlgorithmSignal Processing Scale identification
(Hyvaumlrinen and Oja 1999)
Choose scale so each signal
has ldquounit average energyrdquo
(preserves energy along rows of data matrix)
Explains ldquosame scalesrdquo in
Cocktail Party Example
)(tsi1)( 2
ti ts
ICA amp Non-Gaussianity
Explore main ICA principle
For indep non-Gaussian
standardized rvrsquos
Projections farther from coordinate axes
are more Gaussian
dX
X
X 1
ICA amp Non-Gaussianity
Explore main ICA principle
Projections farther from coordinate axes
are more Gaussian
For the dirrsquon vector
where
(thus )
have for large and
kd
k
k
u
u
u
1
dkj
kjku ki
10
121
)10(NXud
t
k k d
1ku
ICA amp Non-Gaussianity
Illustrative examples (d = 100 n = 500)
aUniform Marginals
bExponential Marginals
cBimodal Marginals
Study over range of values of k
ICA amp Non-Gaussianity
Illustrative example - Uniform marginals
2nd Paper on HDLSS Asymptotics
Notes on Kentrsquos Normal Scale Mixture
bull Data Vectors are indeprsquodent of each other
bull But entries of each have strong dependrsquoce
bull However can show entries have cov = 0
bull Recall statistical folklore
Covariance = 0 Independence
ddddiININX 10050050~
0 Covariance is not independence
Simple Example c to make cov(XY) = 0
0 Covariance is not independence
Result
bull Joint distribution of and ndash Has Gaussian marginals
ndash Has
ndash Yet strong dependence of and
ndash Thus not multivariate Gaussian
Shows Multivariate Gaussian means more
than Gaussian Marginals
YX
0cov YX
X Y
HDLSS Asyrsquos Geometrical RepresenrsquotionFurther Consequences of Geometric Represenrsquotion
1 DWD more stable than SVM(based on deeper limiting distributions)
(reflects intuitive idea feeling sampling variation)(something like mean vs median)
Hall Marron Neeman (2005)
2 1-NN rule inefficiency is quantified Hall Marron Neeman (2005)
3 Inefficiency of DWD for uneven sample size(motivates weighted version)
Qiao et al (2010)
HDLSS Math Stat of PCA
Consistency amp Strong Inconsistency
Spike Covariance Model Paul (2007)
For Eigenvalues
1st Eigenvector
How Good are Empirical Versions
as Estimates
11 21 dddd d
1u
11 ˆˆˆ uddd
Consistency (big enough spike)
For
Strong Inconsistency (spike not big enough)
For
1
0ˆ 11 uuAngle
1
011 90ˆ uuAngle
HDLSS Math Stat of PCA
PC Scores (i.e. projections)
Not Consistent
So how can PCA find Useful Signals in Data?
Key is "Proportional Errors"
Axes have Inconsistent Scales,
But Relationships are Still Useful
$\hat{s}_{i,j} / s_{i,j} \approx R_j$
HDLSS Asymptotics & Kernel Methods
Interesting Question:
Behavior in Very High Dimension?
Answer: El Karoui (2010)
• In Random Matrix Limit
• Kernel Embedded Classifiers ~ Linear Classifiers
Implications for DWD:
Recall Main Advantage is for High d
So not Clear Embedding Helps
Thus not yet Implemented in DWD
HDLSS Additional Results
Batch Adjustment: Xuxin Liu
Recall Intuition from above:
Key is sizes of biological subtypes
Differing ratio trips up mean
But DWD more robust
Mathematics behind this?
Liu: Twiddle ratios of subtypes
HDLSS Data Combo Mathematics
Xuxin Liu Dissertation Results
Simple Unbalanced Cluster Model
Growing at rate $d^\alpha$ as $d \to \infty$
Answers depend on $\alpha$
Visualization of setting…
HDLSS Data Combo Mathematics
Asymptotic Results (as $d \to \infty$)
Let $r$ denote ratio between subgroup sizes
HDLSS Data Combo Mathematics
Asymptotic Results (as $d \to \infty$):
For $\alpha > 1/2$, PAM Inconsistent:
Angle(PAM, Truth) $\to C_r$
For $\alpha < 1/2$, PAM Strongly Inconsistent:
Angle(PAM, Truth) $\to 90^\circ$
HDLSS Data Combo Mathematics
Asymptotic Results (as $d \to \infty$):
For $\alpha > 1/2$, DWD Inconsistent:
Angle(DWD, Truth) $\to$ DWD analog of $C_r$
For $\alpha < 1/2$, DWD Strongly Inconsistent:
Angle(DWD, Truth) $\to 90^\circ$
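HDLSS Data Combo Mathematics
Value of the two limiting angles (PAM and DWD) for sample size ratio $r$:
[Formulas not recoverable from this transcript: each limiting angle is an inverse cosine, $\cos^{-1}(\cdot)$, of an expression in $r$]
Both $= 0$ only when $r = 1$
Otherwise, for $r \ne 1$, both are Inconsistent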
HDLSS Data Combo Mathematics
Comparison between PAM and DWD?
I.e. between $C_r$ for PAM and its DWD analog?
Shows Strong Difference
Explains Above Empirical Observation
HDLSS Asymptotics
Personal Observations:
HDLSS world is…
Surprising (many times)
[Think I've got it, and then …]
Mathematically Beautiful ()
Practically Relevant
The Future of HDLSS Asymptotics
• "Contiguity" in Hypo Testing
• Rates of Convergence
• Improvements of DWD
(e.g. other functions of distance than inverse)
• Many Others
It is still early days …
State of HDLSS Research
Development Of Methods
Mathematical Assessment
…
(thanks to: defiant.corban.edu/gtipton/net-fun/iceberg.html)
Independent Component Analysis
Personal Viewpoint:
Directions (e.g. PCA, DWD)
Independent Component Analysis
Personal Viewpoint:
Directions that maximize independence
Motivating Context: Signal Processing
"Blind Source Separation"
Independent Component Analysis
References:
• Cardoso (1993) (1st paper)
• Lee (1998) (early book, not recco'ed)
• Hyvärinen & Oja (1998)
(excellent short tutorial)
• Hyvärinen, Karhunen & Oja (2001)
(detailed monograph)
ICA Motivating Example
"Cocktail party problem":
• Hear several simultaneous conversations
• Would like to separate them
Model for "conversations": time series
$s_1(t)$ and $s_2(t)$
[Figure: model for "conversations", time series]
ICA Motivating Example
Mixed version of signals:
$x_1(t) = a_{11} s_1(t) + a_{12} s_2(t)$
And also a 2nd mixture:
$x_2(t) = a_{21} s_1(t) + a_{22} s_2(t)$
[Figure: mixed version of signals]
ICA Motivating Example
Goal: Recover signal $s(t) = (s_1(t), s_2(t))^t$
From data $x(t) = (x_1(t), x_2(t))^t$
For unknown mixture matrix $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$
where $x = A s$, for all $t$
ICA Motivating Example
Goal is to find separating weights, $W$,
so that $s = W x$, for all $t$
Problem: $W = A^{-1}$ would be fine,
but $A$ is unknown
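The mixing model and the ideal (but unavailable) separation can be sketched in a few lines. The signals and the mixing matrix below are hypothetical choices for illustration, not the data shown in the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
t = np.arange(n)

# Two illustrative source signals (hypothetical, not the slides' data).
s = np.vstack([np.sin(2 * np.pi * t / 50.0),      # s_1(t): sine wave
               rng.uniform(-1.0, 1.0, size=n)])   # s_2(t): uniform noise

A = np.array([[1.0, 0.5],                         # hypothetical mixing matrix
              [0.3, 1.0]])

x = A @ s                                         # x(t) = A s(t), all t at once
W = np.linalg.inv(A)                              # only available if A is known
s_recovered = W @ x

print(np.allclose(s_recovered, s))
```

With $A$ known, separation is just a matrix inverse. ICA addresses the case where only $x$ is observed.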
ICA Motivating Example
Solutions for Cocktail Party Problem:
1. PCA (on population of 2-d vectors)
$W$ = matrix of eigenvectors
[direction of maximal variation
doesn't solve this problem]
[Figure: 1. PCA (on population of 2-d vectors), directions labeled Maximal Variance and Minimal Variance]
2. ICA (will describe method later)
[Independent Components do solve it]
[modulo sign changes and identification]
[Figure: 2. ICA (will describe method later)]
ICA Motivating Example
Recall original time series
ICA Motivating Example
Relation to OODA: recall data matrix
$X = (X_1 \cdots X_n) = \begin{pmatrix} X_{11} & \cdots & X_{1n} \\ \vdots & \ddots & \vdots \\ X_{d1} & \cdots & X_{dn} \end{pmatrix}$
Signal Processing: focus on rows
($d$ time series, indexed by $t = 1, \ldots, n$)
OODA: focus on columns as data objects
($n$ data vectors)
Note: 2 viewpoints like "duals" for PCA
ICA Motivating Example
Scatterplot View (signal processing): Study
• Signals $\{(s_1(t), s_2(t)) : t = 1, \ldots, n\}$
• Corresponding Scatterplot
• Data $\{(x_1(t), x_2(t)) : t = 1, \ldots, n\}$
• Corresponding Scatterplot
[Figures: Signals and corresponding scatterplot; Data and corresponding scatterplot]
ICA Motivating Example
Scatterplot View (signal processing): Plot
• Signals & Scat'plot $\{(s_1(t), s_2(t)) : t = 1, \ldots, n\}$
• Data & Scat'plot $\{(x_1(t), x_2(t)) : t = 1, \ldots, n\}$
• Scatterplots give hint how ICA is possible
• Affine transformation $x = A s$
Stretches indep't signals into dep't
• Inversion is key to ICA
(even w/ $A$ unknown)
ICA Motivating Example
Signals - Corresponding Scatterplot
Note: Independent,
Since Known Value
Of $s_1$ Does Not
Change Distribution
of $s_2$
ICA Motivating Example
Data - Corresponding Scatterplot
Note: Dependent,
Since Known Value
Of $x_1$ Changes
Distribution of $x_2$
ICA Motivating Example
Why not PCA?
Finds direction of greatest variation
[Figure: PCA - Finds direction of greatest variation]
Which is wrong for signal separation
[Figure: PCA - Wrong for signal separation]
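A small numerical illustration of why maximal-variance directions need not separate signals. This sketch assumes two independent uniform sources and a hypothetical mixing matrix (rotate by 45 degrees, then stretch one axis); none of these choices come from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
# Independent, standardized uniform sources (mean 0, variance 1).
s = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(2, n))

theta = np.pi / 4.0
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
A = np.diag([2.0, 1.0]) @ R        # hypothetical mixing: rotate, then stretch
x = A @ s

# PCA: project centered data on eigenvectors of the sample covariance.
xc = x - x.mean(axis=1, keepdims=True)
evals, evecs = np.linalg.eigh(xc @ xc.T / n)
scores = evecs.T @ xc              # rows are PC scores

corr_scores = np.corrcoef(scores)[0, 1]          # near 0 by construction
corr_squares = np.corrcoef(scores ** 2)[0, 1]    # far from 0: still dependent
# Largest |correlation| between any PC score and any true source.
max_corr_with_sources = np.abs(np.corrcoef(np.vstack([scores, s]))[:2, 2:]).max()
print(round(corr_scores, 3), round(corr_squares, 3), round(max_corr_with_sources, 3))
```

The PC scores are uncorrelated, but their squares are clearly correlated and neither score lines up with a single source, so PCA has removed only the linear dependence.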
ICA Algorithm
ICA Step 1:
• "sphere the data"
(shown on right in scatterplot view)
• i.e. find linear transf'n to make
mean = $0$, cov = $I$
• i.e. work with $Z = \hat{\Sigma}^{-1/2} X$ (for mean-centered $X$)
• requires $X_{d \times n}$ of full rank
(at least $n \ge d$, i.e. no HDLSS)
• search for independence, beyond
linear and quadratic structure
[Figure: ICA Step 1, sphere the data]
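Sphering can be sketched as follows. This is a minimal version assuming a d x n data matrix with n >= d and a full-rank sample covariance; the function name `sphere` and the toy data are illustrative only.

```python
import numpy as np

def sphere(X):
    """Return Z = Sigma_hat^{-1/2} (X - row means) for a d x n data matrix."""
    Xc = X - X.mean(axis=1, keepdims=True)
    sigma_hat = Xc @ Xc.T / Xc.shape[1]            # d x d sample covariance
    evals, evecs = np.linalg.eigh(sigma_hat)       # requires full rank (n >= d)
    inv_sqrt = (evecs / np.sqrt(evals)) @ evecs.T  # Sigma_hat^{-1/2}
    return inv_sqrt @ Xc

rng = np.random.default_rng(0)
X = np.array([[2.0, 0.0], [1.0, 1.0]]) @ rng.uniform(-1, 1, size=(2, 500)) + 5.0
Z = sphere(X)
print(np.round(Z @ Z.T / Z.shape[1], 6))           # sample covariance of Z
```

The symmetric inverse square root is one of several valid sphering transforms; any further rotation of Z is also sphered, which is exactly the freedom ICA Step 2 uses.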
ICA Algorithm
ICA Step 2:
• Find dir'ns that make (sphered) data
as independent as possible
• Recall "independence" means:
joint distribution is product of marginals
• In cocktail party example:
– Happens only when support parallel to axes
– Otherwise have blank areas,
but marginals are non-zero
[Figure: ICA Step 2, Cocktail party example]
ICA Algorithm
Parallel Idea (and key to algorithm):
• Find directions that maximize
non-Gaussianity
• Based on assumption: starting from
independent coordinates
• Note: "most" projections are Gaussian
(since projection is "linear combo")
• Mathematics behind this:
Diaconis and Freedman (1984)
ICA Algorithm
Worst case for ICA: Gaussian
• Then sphered data are independent
• So have independence in all (ortho) dir'ns
• Thus can't find useful directions
• Gaussian distribution is characterized by:
Independent & spherically symmetric
ICA Algorithm
Criteria for non-Gaussianity / independence:
• Kurtosis $E X^4 - 3 (E X^2)^2$
(4th order cumulant)
• Negative Entropy
• Mutual Information
• Nonparametric Maximum Likelihood
• "Infomax" in Neural Networks
• Interesting connections between these
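As a quick illustration of the kurtosis criterion, the sketch below evaluates the sample version of $E X^4 - 3 (E X^2)^2$ on three unit-variance distributions. The choice of distributions and the function name are assumptions made for the example.

```python
import numpy as np

def kurtosis_cumulant(x):
    """Sample version of E X^4 - 3 (E X^2)^2 for centered x."""
    x = x - x.mean()
    return float(np.mean(x ** 4) - 3.0 * np.mean(x ** 2) ** 2)

rng = np.random.default_rng(0)
n = 200_000
samples = {
    "gaussian": rng.standard_normal(n),                       # population value 0
    "uniform": rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), n),   # population value -1.2
    "laplace": rng.laplace(0.0, 1.0 / np.sqrt(2.0), n),       # population value +3
}
results = {name: kurtosis_cumulant(x) for name, x in samples.items()}
for name, value in results.items():
    print(name, round(value, 2))
```

The sign separates light-tailed from heavy-tailed marginals, and the Gaussian sits at zero, which is why the magnitude of this quantity can serve as a non-Gaussianity score.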
ICA Algorithm
Matlab Algorithm (optimizing any of above):
"FastICA"
• Numerical gradient search method
• Can find directions iteratively
• Or by simultaneous optimization
(note: PCA does both, but not ICA)
• Appears fast, with good defaults
• Careful about local optima
(Recco: several random restarts)
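To make the "find directions iteratively" variant concrete, here is a rough NumPy sketch of a kurtosis-based one-unit fixed-point iteration with deflation. It is not the Matlab FastICA package, only an illustration of the idea under simple assumptions (two uniform sources, a hypothetical mixing matrix, a single random start). All names are illustrative.

```python
import numpy as np

def sphere(X):
    """Center and whiten a d x n data matrix (needs full-rank covariance)."""
    Xc = X - X.mean(axis=1, keepdims=True)
    evals, evecs = np.linalg.eigh(Xc @ Xc.T / Xc.shape[1])
    return (evecs / np.sqrt(evals)) @ evecs.T @ Xc

def fastica_kurtosis(Z, n_iter=200, tol=1e-10, seed=0):
    """Rows of the returned W are unit directions found one at a time."""
    rng = np.random.default_rng(seed)
    d = Z.shape[0]
    W = np.zeros((d, d))
    for i in range(d):
        w = rng.standard_normal(d)
        w /= np.linalg.norm(w)
        for _ in range(n_iter):
            y = w @ Z
            # Kurtosis-based fixed-point update: E[z (w^t z)^3] - 3 w.
            w_new = (Z * y ** 3).mean(axis=1) - 3.0 * w
            # Deflation: stay orthogonal to directions already found.
            w_new -= W[:i].T @ (W[:i] @ w_new)
            w_new /= np.linalg.norm(w_new)
            done = abs(abs(w_new @ w) - 1.0) < tol   # converged up to sign
            w = w_new
            if done:
                break
        W[i] = w
    return W

rng = np.random.default_rng(1)
s = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(2, 5000))
A = np.array([[1.0, 0.5], [0.3, 1.0]])       # hypothetical mixing matrix
Z = sphere(A @ s)
W_S = fastica_kurtosis(Z)
S_S = W_S @ Z
# |correlation| between each recovered row and each true source.
match = np.abs(np.corrcoef(np.vstack([S_S, s]))[:2, 2:])
print(np.round(match, 3))
```

Each recovered row should correlate strongly with exactly one source, in some order and with some sign. For harder problems, several random restarts (different `seed` values) are a cheap guard against poor local optima.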
ICA Algorithm
FastICA Notational Summary:
1. First sphere data: $Z = \hat{\Sigma}^{-1/2} X$
2. Apply ICA: Find $W_S$ to make
rows of $S_S = W_S Z$ "indep't"
3. Can transform back to
original data scale: $S = \hat{\Sigma}^{1/2} S_S$
ICA Algorithm
Careful look at identifiability
• Seen already in above example
• Could rotate "square of data" in several ways, etc.
[Figure: Identifiability, Swap and Flips]
ICA Algorithm
Identifiability problem 1:
Generally can't order rows of $S_S$ (& $S$)
(seen as swap above)
Since for a "permutation matrix" $P$
(pre-multiplication by $P$ "swaps rows")
(post-multiplication by $P$ "swaps columns")
For each column, $z = A_S s_S$, i.e. $s_S = A_S^{-1} z$,
i.e. $P s_S = P A_S^{-1} z = P W_S z$
So $P S_S$ and $P W_S$ are also ICA solut'ns
(i.e. $P S_S = P W_S Z$)
FastICA: appears to order in terms of "how non-Gaussian"
ICA Algorithm
Identifiability problem 2:
Can't find scale of elements of $s$
(seen as flips above)
Since for a (full rank) diagonal matrix $D$
(pre-multipl'n by $D$ is scalar mult'n of rows)
(post-multipl'n by $D$ is scalar mult'n of col's)
Again for each col'n, $s_S = A_S^{-1} z$,
i.e. $D s_S = D A_S^{-1} z = D W_S z$
So $D S_S$ and $D W_S$ are also ICA solutions
ICA Algorithm
Signal Processing Scale identification:
(Hyvärinen and Oja, 1999)
Choose scale so each signal $s_i(t)$
has "unit average energy": $\sum_t s_i(t)^2 = 1$
(preserves energy along rows of data matrix)
Explains "same scales" in
Cocktail Party Example
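The two identifiability problems and the scale convention can be checked numerically. The sketch below uses hypothetical matrices $A_S$, $P$ and $D$ and a random stand-in for the signal matrix. The normalization follows the sum-of-squares form as read from the slide, which is itself an assumption about the intended constant.

```python
import numpy as np

rng = np.random.default_rng(0)
S_S = rng.uniform(-1.0, 1.0, size=(2, 400))     # stand-in "independent" rows
A_S = np.array([[1.0, 0.5], [0.3, 1.0]])        # hypothetical mixing matrix
Z = A_S @ S_S
W_S = np.linalg.inv(A_S)

P = np.array([[0.0, 1.0], [1.0, 0.0]])          # permutation: swaps the two rows
D = np.diag([-2.0, 0.5])                        # full-rank diagonal: flip and rescale

# P W_S and D W_S reproduce permuted / rescaled signals from the same data Z.
swap_ok = np.allclose((P @ W_S) @ Z, P @ S_S)
scale_ok = np.allclose((D @ W_S) @ Z, D @ S_S)

def unit_energy(S):
    """Rescale each row so that sum_t s_i(t)^2 = 1."""
    return S / np.sqrt((S ** 2).sum(axis=1, keepdims=True))

# After normalization only the sign of each row remains undetermined.
same_up_to_sign = np.allclose(np.abs(unit_energy(D @ S_S)),
                              np.abs(unit_energy(S_S)))
print(swap_ok, scale_ok, same_up_to_sign)
```

Fixing the energy of each row removes the scale ambiguity, while the row order and the sign of each row still have to be settled by convention.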
ICA & Non-Gaussianity
Explore main ICA principle:
For indep., non-Gaussian,
standardized, r.v.'s $X = (X_1, \ldots, X_d)^t$:
Projections farther from coordinate axes
are more Gaussian
0 Covariance is not independence
Result
bull Joint distribution of and ndash Has Gaussian marginals
ndash Has
ndash Yet strong dependence of and
ndash Thus not multivariate Gaussian
Shows Multivariate Gaussian means more
than Gaussian Marginals
YX
0cov YX
X Y
HDLSS Asyrsquos Geometrical RepresenrsquotionFurther Consequences of Geometric Represenrsquotion
1 DWD more stable than SVM(based on deeper limiting distributions)
(reflects intuitive idea feeling sampling variation)(something like mean vs median)
Hall Marron Neeman (2005)
2 1-NN rule inefficiency is quantified Hall Marron Neeman (2005)
3 Inefficiency of DWD for uneven sample size(motivates weighted version)
Qiao et al (2010)
HDLSS Math Stat of PCA
Consistency amp Strong Inconsistency
Spike Covariance Model Paul (2007)
For Eigenvalues
1st Eigenvector
How Good are Empirical Versions
as Estimates
11 21 dddd d
1u
11 ˆˆˆ uddd
Consistency (big enough spike)
For
Strong Inconsistency (spike not big enough)
For
1
0ˆ 11 uuAngle
1
011 90ˆ uuAngle
HDLSS Math Stat of PCA
PC Scores (ie projections)
Not Consistent
So how can PCA find Useful Signals in Data
Key is ldquoProportional Errorsrdquo
Axes have Inconsistent Scales
But Relationships are Still Useful
HDLSS Math Stat of PCA
1ˆ
jji
ji Rs
s
Interesting Question
Behavior in Very High Dimension
Answer El Karoui (2010)
bull In Random Matrix Limit
bull Kernel Embedded Classifiers ~
~ Linear Classifiers
HDLSS Asymptotics amp Kernel Methods
Interesting Question
Behavior in Very High Dimension
Implications for DWD
Recall Main Advantage is for High d
HDLSS Asymptotics amp Kernel Methods
Interesting Question
Behavior in Very High Dimension
Implications for DWD
Recall Main Advantage is for High d
So not Clear Embedding Helps
Thus not yet Implemented in DWD
HDLSS Asymptotics amp Kernel Methods
HDLSS Additional Results
Batch Adjustment Xuxin Liu
Recall Intuition from above
Key is sizes of biological subtypes
Differing ratio trips up mean
But DWD more robust
Mathematics behind this
Liu Twiddle ratios of subtypes
HDLSS Data Combo Mathematics
Xuxin Liu Dissertation Results
Simple Unbalanced Cluster Model
Growing at rate as
Answers depend on
Visualization of settinghellip
d d
HDLSS Data Combo Mathematics
HDLSS Data Combo Mathematics
HDLSS Data Combo Mathematics
Asymptotic Results (as )
Let denote ratio between subgroup sizes
d
r
HDLSS Data Combo Mathematics
Asymptotic Results (as )
For PAM Inconsistent
Angle(PAMTruth)
For PAM Strongly Inconsistent
Angle(PAMTruth)
d
2
1
2
1
0 rC
090
HDLSS Data Combo Mathematics
Asymptotic Results (as )
For DWD Inconsistent
Angle(DWDTruth)
For DWD Strongly Inconsistent
Angle(DWDTruth)
d
2
1
2
1
090
0 rC
HDLSS Data Combo Mathematics
Value of and for sample size ratio
only when
Otherwise for both are Inconsistent
rC
22
1cos
2
1
r
rCr
0 rr CC
r
1r
1r
rC
22
1cos
32
31
r
rCr
HDLSS Data Combo Mathematics
Comparison between PAM and DWD
Ie between and rCrC
HDLSS Data Combo Mathematics
Comparison between PAM and DWD
HDLSS Data Combo Mathematics
Comparison between PAM and DWD
Ie between and
Shows Strong Difference
Explains Above Empirical Observation
rCrC
Personal Observations
HDLSS world ishellip
HDLSS Asymptotics
Personal Observations
HDLSS world ishellip
Surprising (many times)
[Think Irsquove got it and then hellip]
HDLSS Asymptotics
Personal Observations
HDLSS world ishellip
Surprising (many times)
[Think Irsquove got it and then hellip]
Mathematically Beautiful ()
HDLSS Asymptotics
Personal Observations
HDLSS world ishellip
Surprising (many times)
[Think Irsquove got it and then hellip]
Mathematically Beautiful ()
Practically Relevant
HDLSS Asymptotics
The Future of HDLSS Asymptotics
bull ldquoContiguityrdquo in Hypo Testing
bull Rates of Convergence
bull Improvements of DWD
(eg other functions of distance than inverse)
bull Many Others
It is still early days hellip
State of HDLSS Research
DevelopmentOf Methods
MathematicalAssessment
hellip
(thanks todefiantcorbanedugtiptonnet-funiceberghtml)
Independent Component Analysis
Personal Viewpoint
Directions
(eg PCA DWD)
Independent Component Analysis
Personal Viewpoint
Directions that maximize independence
Independent Component Analysis
Personal Viewpoint
Directions that maximize independence
Motivating Context Signal Processing
ldquoBlind Source Separationrdquo
Independent Component Analysis
References
bull Cardoso (1993) (1st paper)
bull Lee (1998) (early book not reccorsquoed)
bull Hyvaumlrinen amp Oja (1998)
(excellent short tutorial)
bull Hyvaumlrinen Karhunen amp Oja (2001)
(detailed monograph)
ICA, Motivating Example
“Cocktail party problem”:
• Hear several simultaneous conversations
• Would like to separate them
Model for “conversations”: time series $s_1(t)$ and $s_2(t)$
[Figure: model for “conversations”, the two time series]
ICA, Motivating Example
Mixed version of signals:
$x_1(t) = a_{11} s_1(t) + a_{12} s_2(t)$
And also a 2nd mixture:
$x_2(t) = a_{21} s_1(t) + a_{22} s_2(t)$
[Figure: mixed version of signals]
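ICA, Motivating Example
Goal: Recover signal $s(t) = \begin{pmatrix} s_1(t) \\ s_2(t) \end{pmatrix}$
From data $x(t) = \begin{pmatrix} x_1(t) \\ x_2(t) \end{pmatrix}$
For unknown mixture matrix $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$
where $x = A s$, for all $t$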
ICA, Motivating Example
Goal is to find separating weights $W$
so that $s = W x$, for all $t$
Problem: $W = A^{-1}$ would be fine,
but $A$ is unknown
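The mixing model and the "oracle" separation above can be sketched numerically. This is only an illustration: the uniform sources, the entries of `A`, and the variable names are assumptions made here, not taken from the slides, and in practice $A$ (hence $W = A^{-1}$) is not available.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Two hypothetical independent source signals (rows of s);
# uniform noise stands in for the "conversations".
s = rng.uniform(-1.0, 1.0, size=(2, n))

# Hypothetical mixing matrix A (unknown in a real problem).
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])

x = A @ s                 # observed mixtures: x(t) = A s(t) for every t
W = np.linalg.inv(A)      # oracle separating weights: W = A^{-1}
s_recovered = W @ x       # s(t) = W x(t)

print(np.allclose(s_recovered, s))
```

The point of ICA is to find a usable $W$ from `x` alone, without ever seeing `A`.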
ICA, Motivating Example
Solutions for Cocktail Party Problem:
1. PCA (on population of 2-d vectors)
   $W$ = matrix of eigenvectors
   [Figure: PCA result, directions of maximal and minimal variance]
   [direction of maximal variation doesn’t solve this problem]
2. ICA (will describe method later)
   [Figure: ICA result]
   [Independent Components do solve it]
   [modulo sign changes and identification]
ICA, Motivating Example
Recall original time series
ICA, Motivating Example
Relation to OODA: recall data matrix
$X = \begin{pmatrix} X_1 & \cdots & X_n \end{pmatrix} = \begin{pmatrix} X_{11} & \cdots & X_{1n} \\ \vdots & \ddots & \vdots \\ X_{d1} & \cdots & X_{dn} \end{pmatrix}$
Signal Processing: focus on rows
($d$ time series, indexed by $t = 1, \ldots, n$)
OODA: focus on columns as data objects
($n$ data vectors)
Note: 2 viewpoints like “duals” for PCA
ICA, Motivating Example
Scatterplot View (signal processing): Plot
• Signals $\{(s_1(t), s_2(t)) : t = 1, \ldots, n\}$ & Scatterplot
  [Figure: signals and corresponding scatterplot]
• Data $\{(x_1(t), x_2(t)) : t = 1, \ldots, n\}$ & Scatterplot
  [Figure: data and corresponding scatterplot]
• Scatterplots give hint how ICA is possible
• Affine transformation $x = A s$
  stretches independent signals into dependent
• Inversion is key to ICA
  (even with $A$ unknown)
ICA, Motivating Example
Signals - Corresponding Scatterplot
Note: Independent,
since known value of $s_1$ does not change distribution of $s_2$
ICA, Motivating Example
Data - Corresponding Scatterplot
Note: Dependent,
since known value of $x_1$ changes distribution of $x_2$
ICA, Motivating Example
Why not PCA?
Finds direction of greatest variation
[Figure: PCA finds direction of greatest variation]
Which is wrong for signal separation
[Figure: PCA is wrong for signal separation]
ICA, Algorithm
ICA Step 1:
• “sphere the data” (shown on right in scatterplot view)
• i.e. find linear transformation to make mean = $0$, cov = $I$
• i.e. work with $Z = \hat{\Sigma}^{-1/2} X$
• requires $X$ of full rank (at least $n \ge d$, i.e. no HDLSS)
• search for independence beyond linear and quadratic structure
[Figure: ICA Step 1, sphere the data]
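A minimal sketch of the sphering step, assuming a $d \times n$ data matrix with $n \ge d$. The function name `sphere` and the example mixing matrix are hypothetical; the mean is subtracted explicitly here so that the sphered data have mean 0 as well as identity covariance.

```python
import numpy as np

def sphere(X):
    """Return Z = Sigma_hat^{-1/2} (X - row means) for a d x n data matrix X."""
    Xc = X - X.mean(axis=1, keepdims=True)        # center each row (variable)
    Sigma_hat = Xc @ Xc.T / (X.shape[1] - 1)      # d x d sample covariance
    evals, evecs = np.linalg.eigh(Sigma_hat)      # symmetric eigen-decomposition
    # Symmetric inverse square root; needs all eigenvalues > 0 (full rank).
    root_inv = evecs @ np.diag(evals ** -0.5) @ evecs.T
    return root_inv @ Xc

rng = np.random.default_rng(1)
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])                        # hypothetical mixing matrix
X = A @ rng.uniform(-1.0, 1.0, size=(2, 2000))    # mixed, dependent data
Z = sphere(X)

print(np.round(np.cov(Z), 6))                     # sample covariance of Z
```

In the HDLSS case ($d > n$) some eigenvalues of $\hat{\Sigma}$ are zero, so the inverse square root above does not exist, which is the full rank requirement noted on the slide.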
ICA, Algorithm
ICA Step 2:
• Find directions that make (sphered) data as independent as possible
• Recall “independence” means:
  joint distribution is product of marginals
• In cocktail party example:
  – Happens only when support parallel to axes
  – Otherwise have blank areas, but marginals are non-zero
[Figure: ICA Step 2, cocktail party example]
ICA, Algorithm
Parallel Idea (and key to algorithm):
• Find directions that maximize non-Gaussianity
• Based on assumption: starting from independent coordinates
• Note: “most” projections are Gaussian
  (since projection is “linear combo”)
• Mathematics behind this:
  Diaconis and Freedman (1984)
ICA, Algorithm
Worst case for ICA: Gaussian
• Then sphered data are independent
• So have independence in all (ortho) directions
• Thus can’t find useful directions
• Gaussian distribution is characterized by:
  Independent & spherically symmetric
ICA, Algorithm
Criteria for non-Gaussianity / independence:
• Kurtosis $E X^4 - 3 (E X^2)^2$ (4th order cumulant)
• Negative Entropy
• Mutual Information
• Nonparametric Maximum Likelihood
• “Infomax” in Neural Networks
• Interesting connections between these
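As a rough numerical check of the kurtosis criterion $E X^4 - 3 (E X^2)^2$, the sketch below estimates it for three unit-variance samples. The helper name `kurtosis`, the sample size, and the choice of distributions are assumptions for illustration; the sample is centered first, since the formula presumes mean 0.

```python
import numpy as np

def kurtosis(x):
    """Sample version of E X^4 - 3 (E X^2)^2, after centering x."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return np.mean(x ** 4) - 3.0 * np.mean(x ** 2) ** 2

rng = np.random.default_rng(2)
n = 200_000

# All three samples are scaled to variance 1.
k_gauss = kurtosis(rng.standard_normal(n))                      # population value 0
k_unif = kurtosis(rng.uniform(-np.sqrt(3), np.sqrt(3), n))      # population value -1.2
k_laplace = kurtosis(rng.laplace(scale=1 / np.sqrt(2), size=n)) # population value 3

print(k_gauss, k_unif, k_laplace)
```

The Gaussian sample is the only one whose value is near zero, which is why the size of the kurtosis can serve as a measure of non-Gaussianity.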
ICA, Algorithm
Matlab Algorithm (optimizing any of above): “FastICA”
• Numerical gradient search method
• Can find directions iteratively
• Or by simultaneous optimization
  (note: PCA does both, but not ICA)
• Appears fast, with good defaults
• Careful about local optima
  (Recommend: several random restarts)
ICA, Algorithm
FastICA Notational Summary:
1. First sphere data: $Z = \hat{\Sigma}^{-1/2} X$
2. Apply ICA: Find $W_S$ to make rows of $S_S = W_S Z$ “independent”
3. Can transform back to original data scale: $S = \hat{\Sigma}^{1/2} S_S$
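The three steps of this summary can be sketched for the 2-d case as follows. This is not FastICA: as a simple stand-in, step 2 is a grid search over rotation angles that maximizes the summed absolute kurtosis of the rows. The function names, the grid size, and the example sources and mixing matrix are all assumptions made for illustration.

```python
import numpy as np

def sphere(X):
    """Step 1: return Z = Sigma_hat^{-1/2} (X - mean) and Sigma_hat^{1/2}."""
    Xc = X - X.mean(axis=1, keepdims=True)
    evals, evecs = np.linalg.eigh(np.cov(Xc))
    root = evecs @ np.diag(np.sqrt(evals)) @ evecs.T
    root_inv = evecs @ np.diag(1.0 / np.sqrt(evals)) @ evecs.T
    return root_inv @ Xc, root

def kurtosis(x):
    x = x - x.mean()
    return np.mean(x ** 4) - 3.0 * np.mean(x ** 2) ** 2

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def separate_2d(X, n_angles=180):
    Z, root = sphere(X)
    # Step 2 (stand-in for FastICA): pick the rotation W_S whose output
    # rows are most non-Gaussian, measured by summed |kurtosis|.
    thetas = np.linspace(0.0, np.pi / 2, n_angles, endpoint=False)
    scores = [sum(abs(kurtosis(row)) for row in rotation(t) @ Z) for t in thetas]
    W_S = rotation(thetas[int(np.argmax(scores))])
    S_S = W_S @ Z
    # Step 3: transform back to the original data scale.
    S = root @ S_S
    return W_S, S_S, S

rng = np.random.default_rng(6)
sources = rng.uniform(-1.0, 1.0, size=(2, 5000))   # hypothetical independent signals
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])                         # hypothetical mixing matrix
W_S, S_S, S = separate_2d(A @ sources)

# Absolute correlations between recovered rows and true sources;
# recovery is only expected up to row order and sign.
match = np.abs(np.corrcoef(S_S, sources)[:2, 2:])
print(np.round(match, 3))
```

Searching only over angles in $[0, \pi/2)$ is enough here because, after sphering, the remaining unknown is an orthogonal matrix and solutions are only defined up to row swaps and sign flips.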
ICA, Algorithm
Careful look at identifiability:
• Seen already in above example
• Could rotate “square of data” in several ways, etc.
[Figure: identifiability, swap and flips]
ICA, Algorithm
Identifiability problem 1:
Generally can’t order rows of $S_S$ (& $S$)
(seen as “swap” above)
Since for a “permutation matrix” $P$
(pre-multiplication by $P$ “swaps rows”)
(post-multiplication by $P$ “swaps columns”)
For each column, $z = A_S s_S$, i.e. $s_S = A_S^{-1} z$,
i.e. $P s_S = P A_S^{-1} z = P W_S z$
So $P S_S$ and $P W_S$ are also ICA solutions
(i.e. $P S_S = P W_S Z$)
FastICA appears to order in terms of “how non-Gaussian”
ICA, Algorithm
Identifiability problem 2:
Can’t find scale of elements of $s$
(seen as “flips” above)
Since for a (full rank) diagonal matrix $D$
(pre-multiplication by $D$ is scalar multiplication of rows)
(post-multiplication by $D$ is scalar multiplication of columns)
Again for each column, $s_S = A_S^{-1} z$,
i.e. $D s_S = D A_S^{-1} z = D W_S z$
So $D S_S$ and $D W_S$ are also ICA solutions
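Both identifiability problems can be checked numerically in a small sketch. The sources, the orthogonal matrix playing the role of $A_S$, and the particular $P$ and $D$ are hypothetical choices; the scale-free kurtosis used to compare rows is an assumption made here so that rescaled rows can be compared directly.

```python
import numpy as np

def normalized_kurtosis(x):
    """Scale-free kurtosis: E X^4 / (E X^2)^2 - 3, after centering."""
    x = x - x.mean()
    return np.mean(x ** 4) / np.mean(x ** 2) ** 2 - 3.0

rng = np.random.default_rng(4)
n = 5000

# Hypothetical independent sources: one uniform row, one Laplace row.
S_true = np.vstack([rng.uniform(-1.0, 1.0, n), rng.laplace(size=n)])

theta = 0.7
A_S = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])   # stand-in for A_S
Z = A_S @ S_true                                    # columns z = A_S s_S
W_S = np.linalg.inv(A_S)                            # W_S = A_S^{-1}
S_S = W_S @ Z

P = np.array([[0.0, 1.0],
              [1.0, 0.0]])                          # permutation: swaps rows
D = np.diag([-1.0, 2.0])                            # diagonal: flips and rescales rows

S_perm = (P @ W_S) @ Z                              # equals P S_S
S_scaled = (D @ W_S) @ Z                            # equals D S_S

k_orig = np.array([normalized_kurtosis(r) for r in S_S])
k_perm = np.array([normalized_kurtosis(r) for r in S_perm])
k_scaled = np.array([normalized_kurtosis(r) for r in S_scaled])
print(k_orig, k_perm, k_scaled)
```

The rows of `S_perm` and `S_scaled` are exactly as independent and as non-Gaussian as those of `S_S`, so no criterion of this type can prefer one ordering, sign, or scale over another.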
ICA, Algorithm
Signal Processing Scale identification
(Hyvärinen and Oja, 1999):
Choose scale so each signal $s_i(t)$
has “unit average energy”: $\sum_t s_i(t)^2 = 1$
(preserves energy along rows of data matrix)
Explains “same scales” in
Cocktail Party Example
ICA & Non-Gaussianity
Explore main ICA principle:
For independent, non-Gaussian, standardized r.v.’s $X = (X_1, \ldots, X_d)^t$,
Projections farther from coordinate axes are more Gaussian
For the direction vector $u_k = (u_{k,1}, \ldots, u_{k,d})^t$,
where $u_{k,j} = k^{-1/2}$ for $j = 1, \ldots, k$ and $u_{k,j} = 0$ for $j = k+1, \ldots, d$
(thus $\|u_k\| = 1$),
have $u_k^t X \approx N(0,1)$, for large $d$ and $k$
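The principle can be illustrated with a small simulation for uniform marginals. This is a sketch only: $d = 100$ follows the illustrative examples in the slides, but the larger sample size $n = 5000$, the helper names, and the use of a scale-free kurtosis as the measure of non-Gaussianity are assumptions made here for numerical stability.

```python
import numpy as np

def normalized_kurtosis(x):
    """Scale-free kurtosis: E X^4 / (E X^2)^2 - 3, after centering."""
    x = x - x.mean()
    return np.mean(x ** 4) / np.mean(x ** 2) ** 2 - 3.0

def u_k(k, d):
    """Unit vector with first k entries k^{-1/2} and the rest 0."""
    u = np.zeros(d)
    u[:k] = k ** -0.5
    return u

rng = np.random.default_rng(5)
d, n = 100, 5000

# Independent, standardized (mean 0, variance 1) uniform coordinates.
X = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(d, n))

# Kurtosis of the projection u_k^t X for a range of k.
kurt_by_k = {k: normalized_kurtosis(u_k(k, d) @ X) for k in (1, 2, 5, 20, 100)}
for k, val in kurt_by_k.items():
    print(k, round(val, 3))
```

For independent uniform coordinates the population kurtosis of $u_k^t X$ is $-1.2 / k$, so the projection along a coordinate axis ($k = 1$) is clearly non-Gaussian while the projection with $k = d$ is close to Gaussian.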
ICA & Non-Gaussianity
Illustrative examples (d = 100, n = 500):
a. Uniform Marginals
b. Exponential Marginals
c. Bimodal Marginals
Study over range of values of k
[Figure: illustrative example, Uniform marginals]
HDLSS Asy’s: Geometrical Representation
Further Consequences of Geometric Representation:
1. DWD more stable than SVM
   (based on deeper limiting distributions)
   (reflects intuitive idea “feeling sampling variation”)
   (something like mean vs. median)
   Hall, Marron & Neeman (2005)
2. 1-NN rule inefficiency is quantified
   Hall, Marron & Neeman (2005)
3. Inefficiency of DWD for uneven sample size
   (motivates weighted version)
   Qiao et al. (2010)
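HDLSS Math Stat of PCA
Consistency & Strong Inconsistency:
Spike Covariance Model, Paul (2007)
For Eigenvalues: $\lambda_{1,d} = d^{\alpha}$, $\lambda_{2,d} = \cdots = \lambda_{d,d} = 1$
1st Eigenvector: $u_1$
How Good are Empirical Versions
$\hat{\lambda}_{1,d}, \ldots, \hat{\lambda}_{d,d}, \hat{u}_1$ as Estimates?
Consistency (big enough spike):
For $\alpha > 1$, $\mathrm{Angle}(u_1, \hat{u}_1) \to 0$
Strong Inconsistency (spike not big enough):
For $\alpha < 1$, $\mathrm{Angle}(u_1, \hat{u}_1) \to 90^{\circ}$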
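The contrast between the two regimes can be seen in a small simulation. This sketch assumes the spike model has first eigenvalue $d^{\alpha}$ and all other eigenvalues 1 (the exponent symbol is an assumption about the garbled formula), takes Gaussian data with known mean 0, and uses hypothetical values $d = 2000$, $n = 20$, $\alpha = 1.5$ and $\alpha = 0.5$. At finite $d$ the angles only move toward the limits of 0 and 90 degrees.

```python
import numpy as np

def first_pc_angle(d, n, alpha, rng):
    """Angle (degrees) between the true and sample first eigenvectors
    when the first population eigenvalue is d**alpha and the rest are 1."""
    X = rng.standard_normal((d, n))
    X[0, :] *= np.sqrt(d ** alpha)          # true first eigenvector is e_1
    # Left singular vectors of X are the eigenvectors of X X^t
    # (uncentered sample covariance, since the mean is known to be 0).
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    cos = min(1.0, abs(U[0, 0]))            # |<u_1_hat, e_1>|, sign is arbitrary
    return np.degrees(np.arccos(cos))

rng = np.random.default_rng(3)
d, n = 2000, 20
angle_big_spike = first_pc_angle(d, n, 1.5, rng)    # exponent above 1
angle_small_spike = first_pc_angle(d, n, 0.5, rng)  # exponent below 1

print(angle_big_spike, angle_small_spike)
```

With the larger exponent the sample eigenvector lies within a few degrees of the truth, while with the smaller exponent it is far from it, in line with the consistency and strong inconsistency statements.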
HDLSS Math Stat of PCA
PC Scores (i.e. projections):
Not Consistent
So how can PCA find Useful Signals in Data?
Key is “Proportional Errors”:
[Equation: ratio of empirical to true scores, $\hat{s}_{i,j} / s_{i,j}$, in terms of $R_j$]
Axes have Inconsistent Scales,
But Relationships are Still Useful
HDLSS Asymptotics & Kernel Methods
Interesting Question:
Behavior in Very High Dimension?
Answer: El Karoui (2010)
• In Random Matrix Limit
• Kernel Embedded Classifiers ~ Linear Classifiers
Implications for DWD:
Recall Main Advantage is for High d
So not Clear Embedding Helps
Thus not yet Implemented in DWD
HDLSS Additional Results
Batch Adjustment: Xuxin Liu
Recall Intuition from above:
Key is sizes of biological subtypes
Differing ratio trips up mean
But DWD more robust
Mathematics behind this?
Liu: Twiddle ratios of subtypes
HDLSS Data Combo Mathematics
Xuxin Liu Dissertation Results:
Simple Unbalanced Cluster Model
Growing at rate [power of $d$] as $d \to \infty$
Answers depend on [the exponent]
Visualization of setting…
[Figures: visualization of the setting]
HDLSS Data Combo Mathematics
Asymptotic Results (as $d \to \infty$):
Let $r$ denote ratio between subgroup sizes
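HDLSS Data Combo Mathematics
Asymptotic Results (as $d \to \infty$):
For rate exponent $> 1/2$: PAM Inconsistent,
Angle(PAM, Truth) $\to C_r > 0$
For rate exponent $< 1/2$: PAM Strongly Inconsistent,
Angle(PAM, Truth) $\to 90^{\circ}$
HDLSS Data Combo Mathematics
Asymptotic Results (as $d \to \infty$):
For rate exponent $> 1/2$: DWD Inconsistent,
Angle(DWD, Truth) $\to C_r > 0$ (DWD version of the constant)
For rate exponent $< 1/2$: DWD Strongly Inconsistent,
Angle(DWD, Truth) $\to 90^{\circ}$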
HDLSS Data Combo Mathematics
Value of $C_r$ for PAM and for DWD, for sample size ratio $r$:
[Equations: each constant is an inverse cosine of a function of $r$]
Both constants $= 0$ only when $r = 1$
Otherwise, for $r \ne 1$, both are Inconsistent
(note PCA does both but not ICA)
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iterativelybull Or by simultaneous optimization
(note PCA does both but not ICA)bull Appears fast with good defaultsbull Careful about local optima
(Recco several random restarts)
ICA AlgorithmFastICA Notational Summary
1First sphere data
2Apply ICA Find to make
rows of ldquoindeprsquotrdquo
3Can transform back to
original data scale
ˆ 21 XZ
SW
ZWS SS
SSS 21ˆˆ
ICA AlgorithmCareful look at identifiability
bull Seen already in above example
bull Could rotate ldquosquare of datardquo in several ways etc
ICA AlgorithmCareful look at identifiability
ICA AlgorithmIdentifiability Swap Flips
ICA AlgorithmIdentifiability problem 1
Generally canrsquot order rows of (amp )
(seen as swap above)
Since for a ldquopermutation matrixrdquo
(pre-multiplication by ldquoswaps rowsrdquo)
(post-multiplication by ldquoswaps columnsrdquo)
For each column ie
ie
SS S
P
PP
SS sAz
zPWzPAsP SSS 1
zAs SS
1
ICA AlgorithmIdentifiability problem 1
Generally canrsquot order rows of (amp )
Since for a ldquopermutation matrixrdquo
For each column
So and are also ICA solutrsquons
(ie )
FastICA appears to order in terms of ldquohow non-Gaussianrdquo
SS S
PzPWzPAsP SSS
1
SPS SPW
ZPWPS SS
ICA AlgorithmIdentifiability problem 2
Canrsquot find scale of elements of
(seen as flips above)
Since for a (full rank) diagonal matrix
(pre-multiplrsquon by is scalar multrsquon of rows)
(post-multiplrsquon by is scalar multrsquon of colrsquos)
Again for each colrsquon
ie
So and are also ICA solutions
D
s
SDW
D
D
SDSzDWzDAsD SSs
1
zAs SS
1
ICA AlgorithmSignal Processing Scale identification
(Hyvaumlrinen and Oja 1999)
Choose scale so each signal
has ldquounit average energyrdquo
(preserves energy along rows of data matrix)
Explains ldquosame scalesrdquo in
Cocktail Party Example
)(tsi1)( 2
ti ts
ICA amp Non-Gaussianity
Explore main ICA principle
For indep non-Gaussian
standardized rvrsquos
Projections farther from coordinate axes
are more Gaussian
dX
X
X 1
ICA amp Non-Gaussianity
Explore main ICA principle
Projections farther from coordinate axes
are more Gaussian
For the dirrsquon vector
where
(thus )
have for large and
kd
k
k
u
u
u
1
dkj
kjku ki
10
121
)10(NXud
t
k k d
1ku
ICA amp Non-Gaussianity
Illustrative examples (d = 100 n = 500)
aUniform Marginals
bExponential Marginals
cBimodal Marginals
Study over range of values of k
ICA amp Non-Gaussianity
Illustrative example - Uniform marginals
HDLSS Math Stat of PCA
Consistency (big enough spike):
For $\alpha > 1$, $\mathrm{Angle}(u_1, \hat{u}_1) \to 0$
Strong Inconsistency (spike not big enough):
For $\alpha < 1$, $\mathrm{Angle}(u_1, \hat{u}_1) \to 90^\circ$
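A small simulation can make the "big enough spike" contrast concrete. This is only a sketch under assumptions that are not in the slides: a single-spike covariance diag(d^alpha, 1, ..., 1), mean-zero Gaussian data (so no centering), and a hypothetical helper `first_pc_angle`. At finite d the small-spike angle is merely large; it approaches 90 degrees only as d grows.

```python
import math
import random

def first_pc_angle(d, n, alpha, rng):
    """Angle (degrees) between the first sample PC direction and the true spike direction e_1."""
    lam = d ** alpha  # assumed spike eigenvalue d^alpha
    # n data vectors in R^d: unit-variance noise, with the first coordinate scaled to variance lam
    data = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        x[0] *= math.sqrt(lam)
        data.append(x)
    # Dual (n x n) Gram matrix, cheap because n is small
    gram = [[sum(a * b for a, b in zip(xi, xj)) for xj in data] for xi in data]
    # Power iteration for the leading eigenvector of the Gram matrix
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(2000):
        w = [sum(g * vj for g, vj in zip(row, v)) for row in gram]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    # Map back to R^d: the sample direction is proportional to sum_i v_i x_i
    u_hat = [sum(v[i] * data[i][j] for i in range(n)) for j in range(d)]
    norm = math.sqrt(sum(c * c for c in u_hat))
    cos_angle = min(1.0, abs(u_hat[0]) / norm)
    return math.degrees(math.acos(cos_angle))

rng = random.Random(0)
angle_big_spike = first_pc_angle(d=1000, n=10, alpha=1.5, rng=rng)
angle_small_spike = first_pc_angle(d=1000, n=10, alpha=0.5, rng=rng)
print(angle_big_spike, angle_small_spike)
```

Rerunning with larger d (and the same n) is a simple way to watch the two angles drift further apart.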
PC Scores (i.e. projections)
Not Consistent
So how can PCA find Useful Signals in Data?
Key is "Proportional Errors"
Axes have Inconsistent Scales
But Relationships are Still Useful
HDLSS Math Stat of PCA
1ˆ
jji
ji Rs
s
HDLSS Asymptotics & Kernel Methods
Interesting Question:
Behavior in Very High Dimension?
Answer: El Karoui (2010)
• In Random Matrix Limit
• Kernel Embedded Classifiers ~ Linear Classifiers
Implications for DWD:
Recall Main Advantage is for High d
So not Clear Embedding Helps
Thus not yet Implemented in DWD
HDLSS Additional Results
Batch Adjustment Xuxin Liu
Recall Intuition from above
Key is sizes of biological subtypes
Differing ratio trips up mean
But DWD more robust
Mathematics behind this
Liu Twiddle ratios of subtypes
HDLSS Data Combo Mathematics
Xuxin Liu Dissertation Results
Simple Unbalanced Cluster Model
Growing at rate $d^{\alpha}$ as $d \to \infty$
Answers depend on $\alpha$
Visualization of setting...
HDLSS Data Combo Mathematics
Asymptotic Results (as $d \to \infty$):
Let $r$ denote ratio between subgroup sizes
Asymptotic Results (as )
For PAM Inconsistent
Angle(PAMTruth)
For PAM Strongly Inconsistent
Angle(PAMTruth)
d
2
1
2
1
0 rC
090
HDLSS Data Combo Mathematics
Asymptotic Results (as )
For DWD Inconsistent
Angle(DWDTruth)
For DWD Strongly Inconsistent
Angle(DWDTruth)
d
2
1
2
1
090
0 rC
HDLSS Data Combo Mathematics
Value of and for sample size ratio
only when
Otherwise for both are Inconsistent
rC
22
1cos
2
1
r
rCr
0 rr CC
r
1r
1r
rC
22
1cos
32
31
r
rCr
HDLSS Data Combo Mathematics
Comparison between PAM and DWD
Ie between and
Shows Strong Difference
Explains Above Empirical Observation
rCrC
HDLSS Asymptotics
Personal Observations:
HDLSS world is...
Surprising (many times)
[Think I've got it, and then ...]
Mathematically Beautiful (?)
Practically Relevant
The Future of HDLSS Asymptotics
• "Contiguity" in Hypo Testing
• Rates of Convergence
• Improvements of DWD (e.g. other functions of distance than inverse)
• Many Others
It is still early days ...
State of HDLSS Research
Development Of Methods
Mathematical Assessment
...
(thanks to: defiant.corban.edu/gtipton/net-fun/iceberg.html)
Independent Component Analysis
Personal Viewpoint:
Directions (e.g. PCA, DWD) that maximize independence
Motivating Context: Signal Processing
"Blind Source Separation"
Independent Component Analysis
References:
• Cardoso (1993) (1st paper)
• Lee (1998) (early book, not recommended)
• Hyvärinen & Oja (1998) (excellent short tutorial)
• Hyvärinen, Karhunen & Oja (2001) (detailed monograph)
ICA Motivating Example
"Cocktail party problem"
• Hear several simultaneous conversations
• Would like to separate them
Model for "conversations": time series $s_1(t)$ and $s_2(t)$
ICA Motivating Example
Mixed version of signals:
$x_1(t) = a_{11} s_1(t) + a_{12} s_2(t)$
And also a 2nd mixture:
$x_2(t) = a_{21} s_1(t) + a_{22} s_2(t)$
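ICA Motivating Example
Goal: Recover signal $s(t) = \begin{pmatrix} s_1(t) \\ s_2(t) \end{pmatrix}$
From data $x(t) = \begin{pmatrix} x_1(t) \\ x_2(t) \end{pmatrix}$
For unknown mixture matrix $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$
where $x = A s$ for all $t$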
ICA Motivating Example
Goal is to find separating weights $W$
so that $s = W x$ for all $t$
Problem: $W = A^{-1}$ would be fine,
but $A$ is unknown
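The mixing model and the "W = A^{-1} would be fine" remark can be checked on toy signals. The sketch below is illustrative only: the sine and sawtooth "conversations", the entries of `A`, and the helper names are all assumptions, and it uses the known `A`, which is exactly what is unavailable in the real problem.

```python
import math

def apply_2x2(M, v):
    """Multiply a 2x2 matrix M by a 2-vector v."""
    return (M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1])

def inverse_2x2(M):
    """Inverse of a 2x2 matrix; fails if M is singular."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    if det == 0:
        raise ValueError("mixing matrix is singular")
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

n = 200
# Hypothetical signals s(t) = (s1(t), s2(t)): a sine wave and a sawtooth
signals = [(math.sin(2 * math.pi * t / 40), 2 * ((t / 25) % 1) - 1)
           for t in range(1, n + 1)]
A = [[1.0, 0.5], [0.3, 1.0]]                  # illustrative mixing matrix
data = [apply_2x2(A, s) for s in signals]     # x = A s for all t
W = inverse_2x2(A)                            # ideal separating weights
recovered = [apply_2x2(W, x) for x in data]   # s = W x for all t
max_err = max(abs(r[i] - s[i])
              for r, s in zip(recovered, signals) for i in range(2))
print(max_err)
```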
ICA Motivating Example
Solutions for Cocktail Party Problem:
1. PCA (on population of 2-d vectors): $W$ = matrix of eigenvectors
   (figure labels: Maximal Variance, Minimal Variance)
   [direction of maximal variation doesn't solve this problem]
2. ICA (will describe method later)
   [Independent Components do solve it]
   [modulo sign changes and identification]
ICA Motivating Example
Recall original time series
ICA Motivating Example
Relation to OODA: recall data matrix
$X = (X_1 \cdots X_n) = \begin{pmatrix} X_{11} & \cdots & X_{1n} \\ \vdots & & \vdots \\ X_{d1} & \cdots & X_{dn} \end{pmatrix}$
Signal Processing: focus on rows
($d$ time series, indexed by $t = 1, \ldots, n$)
OODA: focus on columns as data objects
($n$ data vectors)
Note: 2 viewpoints like "duals" for PCA
ICA Motivating Example
Scatterplot View (signal processing): Study
• Signals $\{(s_1(t), s_2(t)) : t = 1, \ldots, n\}$
• Corresponding Scatterplot
• Data $\{(x_1(t), x_2(t)) : t = 1, \ldots, n\}$
• Corresponding Scatterplot
ICA Motivating Example
Scatterplot View (signal processing): Plot
• Signals & Scatterplot $\{(s_1(t), s_2(t)) : t = 1, \ldots, n\}$
• Data & Scatterplot $\{(x_1(t), x_2(t)) : t = 1, \ldots, n\}$
• Scatterplots give hint how ICA is possible
• Affine transformation $x = A s$
  Stretches independent signals into dependent
• "Inversion" is key to ICA (even with $A$ unknown)
ICA Motivating Example
Signals - Corresponding Scatterplot
Note: Independent, since known value of $s_1$ does not change distribution of $s_2$
ICA Motivating Example
Data - Corresponding Scatterplot
Note: Dependent, since known value of $x_1$ changes distribution of $x_2$
ICA Motivating Example
Why not PCA?
Finds direction of greatest variation
Which is wrong for signal separation
ICA Algorithm
ICA Step 1:
• "sphere the data" (shown on right in scatterplot view)
• i.e. find linear transformation to make mean = $0$, cov = $I$
• i.e. work with $Z = \hat{\Sigma}^{-1/2} X$
• requires $X$ of full rank (at least $n \ge d$, i.e. no HDLSS)
• search for independence beyond linear and quadratic structure
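Sphering can be written out directly in two dimensions. The sketch below makes several assumptions of its own: the data are centered first, the covariance uses the 1/n convention, the 2x2 inverse square root is taken through a closed-form eigen-decomposition, and the helper names `mean_and_cov_2d` and `sphere_2d` are hypothetical. The full-rank requirement appears as the check on the smaller eigenvalue.

```python
import math
import random

def mean_and_cov_2d(points):
    """Sample mean and (1/n) covariance entries (c00, c01, c11) of 2-d points."""
    n = len(points)
    m0 = sum(a for a, _ in points) / n
    m1 = sum(b for _, b in points) / n
    c00 = sum((a - m0) ** 2 for a, _ in points) / n
    c11 = sum((b - m1) ** 2 for _, b in points) / n
    c01 = sum((a - m0) * (b - m1) for a, b in points) / n
    return (m0, m1), (c00, c01, c11)

def sphere_2d(points):
    """Center the points and multiply by the inverse square root of their covariance."""
    (m0, m1), (c00, c01, c11) = mean_and_cov_2d(points)
    # Eigenvalues and principal-axis angle of the symmetric 2x2 covariance
    root = math.hypot(c00 - c11, 2.0 * c01)
    lam1 = (c00 + c11 + root) / 2.0
    lam2 = (c00 + c11 - root) / 2.0
    if lam2 <= 0.0:
        raise ValueError("covariance is not of full rank; cannot sphere")
    theta = 0.5 * math.atan2(2.0 * c01, c00 - c11)
    c, s = math.cos(theta), math.sin(theta)
    a1, a2 = 1.0 / math.sqrt(lam1), 1.0 / math.sqrt(lam2)
    # Sigma^{-1/2} = a1 * v1 v1^T + a2 * v2 v2^T with v1 = (c, s), v2 = (-s, c)
    m00 = a1 * c * c + a2 * s * s
    m01 = (a1 - a2) * c * s
    m11 = a1 * s * s + a2 * c * c
    return [(m00 * (a - m0) + m01 * (b - m1),
             m01 * (a - m0) + m11 * (b - m1)) for a, b in points]

rng = random.Random(0)
sources = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(500)]
data = [(a + 0.5 * b, 0.3 * a + b) for a, b in sources]   # illustrative mixing
z = sphere_2d(data)
z_mean, z_cov = mean_and_cov_2d(z)
print(z_mean, z_cov)
```

After sphering, the sample mean is zero and the sample covariance is the identity up to rounding, so any remaining structure is beyond first and second moments.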
ICA Algorithm
ICA Step 2:
• Find directions that make (sphered) data as independent as possible
• Recall "independence" means: joint distribution is product of marginals
• In cocktail party example:
  – Happens only when support parallel to axes
  – Otherwise have blank areas, but marginals are non-zero
ICA Algorithm
Parallel Idea (and key to algorithm):
• Find directions that maximize non-Gaussianity
• Based on assumption: starting from independent coordinates
• Note: "most" projections are Gaussian (since projection is "linear combo")
• Mathematics behind this: Diaconis and Freedman (1984)
ICA Algorithm
Worst case for ICA: Gaussian
• Then sphered data are independent
• So have independence in all (ortho) directions
• Thus can't find useful directions
• Gaussian distribution is characterized by: Independent & spherically symmetric
ICA Algorithm
Criteria for non-Gaussianity / independence:
• Kurtosis $E X^4 - 3 (E X^2)^2$ (4th order cumulant)
• Negative Entropy
• Mutual Information
• Nonparametric Maximum Likelihood
• "Infomax" in Neural Networks
• Interesting connections between these
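The kurtosis criterion is easy to evaluate on samples. The sketch below uses a plug-in version of $E X^4 - 3 (E X^2)^2$ on centered data; the function name `kurtosis`, the sample size, and the two test distributions are choices made here for illustration. A Gaussian sample should give a value near 0, while a variance-one uniform sample should be clearly negative (the population value is -1.2).

```python
import math
import random

def kurtosis(xs):
    """Plug-in estimate of E X^4 - 3 (E X^2)^2 after centering the sample."""
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m4 = sum((x - m) ** 4 for x in xs) / n
    return m4 - 3.0 * m2 * m2

rng = random.Random(0)
n = 20000
gaussian = [rng.gauss(0.0, 1.0) for _ in range(n)]
half_width = math.sqrt(3.0)   # uniform on [-sqrt(3), sqrt(3)] has variance 1
uniform = [rng.uniform(-half_width, half_width) for _ in range(n)]
k_gauss = kurtosis(gaussian)
k_unif = kurtosis(uniform)
print(k_gauss, k_unif)
```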
ICA Algorithm
Matlab Algorithm (optimizing any of above): "FastICA"
• Numerical gradient search method
• Can find directions iteratively
• Or by simultaneous optimization (note: PCA does both, but not ICA)
• Appears fast, with good defaults
• Careful about local optima (Recommend several random restarts)
ICA Algorithm
FastICA Notational Summary
1. First sphere data: $Z = \hat{\Sigma}^{-1/2} X$
2. Apply ICA: Find $W_S$ to make rows of $S_S = W_S Z$ "independent"
3. Can transform back to original data scale: $\hat{S} = \hat{\Sigma}^{1/2} S_S$
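Steps 1 and 2 of the summary can be mimicked in two dimensions. The sketch below is not FastICA: it is a simplified stand-in that replaces the gradient search with a brute-force search over rotation angles, scoring each rotation by the summed absolute kurtosis of the two rotated coordinates. The uniform sources, the mixing coefficients, and every function name are assumptions made for illustration.

```python
import math
import random

def sphere_2d(points):
    """Step 1: center, then multiply by the inverse square root of the covariance."""
    n = len(points)
    m0 = sum(a for a, _ in points) / n
    m1 = sum(b for _, b in points) / n
    cen = [(a - m0, b - m1) for a, b in points]
    c00 = sum(a * a for a, _ in cen) / n
    c11 = sum(b * b for _, b in cen) / n
    c01 = sum(a * b for a, b in cen) / n
    root = math.hypot(c00 - c11, 2.0 * c01)
    lam1, lam2 = (c00 + c11 + root) / 2.0, (c00 + c11 - root) / 2.0
    th = 0.5 * math.atan2(2.0 * c01, c00 - c11)
    c, s = math.cos(th), math.sin(th)
    a1, a2 = lam1 ** -0.5, lam2 ** -0.5
    m00, m01, m11 = a1 * c * c + a2 * s * s, (a1 - a2) * c * s, a1 * s * s + a2 * c * c
    return [(m00 * a + m01 * b, m01 * a + m11 * b) for a, b in cen]

def kurt(xs):
    """Plug-in kurtosis E X^4 - 3 (E X^2)^2 of a centered sample."""
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m4 = sum((x - m) ** 4 for x in xs) / n
    return m4 - 3.0 * m2 * m2

def rotate(z, theta):
    """Rows of W_S Z for the rotation W_S with angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [c * a + s * b for a, b in z], [-s * a + c * b for a, b in z]

def find_rotation(z, steps=90):
    """Step 2 (stand-in): grid search on [0, pi/2) for maximal non-Gaussianity."""
    def score(i):
        r1, r2 = rotate(z, (math.pi / 2) * i / steps)
        return abs(kurt(r1)) + abs(kurt(r2))
    return (math.pi / 2) * max(range(steps), key=score) / steps

def abs_corr(xs, ys):
    """Absolute sample correlation, used only to check the recovery."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return abs(sxy) / math.sqrt(sxx * syy)

rng = random.Random(1)
n = 2000
s1 = [rng.uniform(-1, 1) for _ in range(n)]
s2 = [rng.uniform(-1, 1) for _ in range(n)]
x = [(a + 0.5 * b, 0.3 * a + b) for a, b in zip(s1, s2)]   # illustrative mixing
z = sphere_2d(x)
est1, est2 = rotate(z, find_rotation(z))
# Recovery is only up to order and sign, so try both matchings
match = max(min(abs_corr(est1, s1), abs_corr(est2, s2)),
            min(abs_corr(est1, s2), abs_corr(est2, s1)))
print(match)
```

Searching only over [0, pi/2) is enough here because rotating by a further quarter turn merely swaps the two recovered rows and flips a sign.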
ICA Algorithm
Careful look at identifiability:
• Seen already in above example
• Could rotate "square of data" in several ways, etc.
Identifiability: Swap, Flips
ICA Algorithm
Identifiability problem 1:
Generally can't order rows of $S_S$ (& $S$) (seen as swap above)
Since for a "permutation matrix" $P$
(pre-multiplication by $P$ "swaps rows")
(post-multiplication by $P$ "swaps columns")
For each column $z$: $z = A_S s_S$, i.e. $s_S = A_S^{-1} z$,
i.e. $P s_S = P A_S^{-1} z = P W_S z$
So $P S_S$ and $P W_S$ are also ICA solutions
(i.e. $P S_S = P W_S Z$)
FastICA appears to order in terms of "how non-Gaussian"
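The permutation argument can be verified numerically on a tiny example. The matrices below are made up for illustration (they are not from the slides), and `matmul` is a plain-Python helper defined here.

```python
def matmul(A, B):
    """Plain-Python matrix product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

W_S = [[2.0, -1.0], [0.5, 1.5]]           # hypothetical separating matrix
Z = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]    # hypothetical sphered data, one column per time point
S_S = matmul(W_S, Z)                      # recovered signals, one per row

P = [[0.0, 1.0], [1.0, 0.0]]              # permutation matrix swapping the two rows
swapped_signals = matmul(P, S_S)          # P S_S
swapped_weights = matmul(P, W_S)          # P W_S
via_weights = matmul(swapped_weights, Z)  # (P W_S) Z

same = all(abs(a - b) < 1e-12
           for row_a, row_b in zip(swapped_signals, via_weights)
           for a, b in zip(row_a, row_b))
print(same)
```

Since the swapped rows are exactly as independent as the original ones, nothing in the independence criterion can prefer one ordering over the other.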
ICA Algorithm
Identifiability problem 2:
Can't find scale of elements of $s$ (seen as flips above)
Since for a (full rank) diagonal matrix $D$
(pre-multiplication by $D$ is scalar multiplication of rows)
(post-multiplication by $D$ is scalar multiplication of columns)
Again for each column: $s_S = A_S^{-1} z$,
i.e. $D s_S = D A_S^{-1} z = D W_S z$
So $D S_S$ and $D W_S$ are also ICA solutions
ICA Algorithm
Signal Processing Scale identification (Hyvärinen and Oja, 1999):
Choose scale so each signal $s_i(t)$ has "unit average energy": $\sum_t s_i(t)^2 = 1$
(preserves energy along rows of data matrix)
Explains "same scales" in Cocktail Party Example
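The scale convention amounts to a one-line rescaling of each recovered row. In the sketch below, "average" is taken to mean the mean of $s_i(t)^2$ over $t = 1, \ldots, n$, which is an assumption; normalizing the plain sum instead would differ only by the constant factor $n$. The function name `normalize_energy` is hypothetical. Note that the sign of each signal is still not determined.

```python
import math

def normalize_energy(signal):
    """Rescale a signal so that the mean of s(t)^2 over t equals 1."""
    n = len(signal)
    energy = sum(v * v for v in signal) / n
    if energy == 0.0:
        raise ValueError("signal is identically zero; scale is undefined")
    scale = 1.0 / math.sqrt(energy)
    return [v * scale for v in signal]

raw = [0.2 * math.sin(2 * math.pi * t / 40) for t in range(1, 201)]
unit = normalize_energy(raw)
unit_from_flipped = normalize_energy([-5.0 * v for v in raw])   # a rescaled, sign-flipped copy
mean_energy = sum(v * v for v in unit) / len(unit)
print(mean_energy)
```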
ICA & Non-Gaussianity
Explore main ICA principle:
For indep., non-Gaussian, standardized r.v.'s $X = (X_1, \ldots, X_d)^t$,
Projections farther from coordinate axes are more Gaussian
For the direction vector $u_k = (u_{k,1}, \ldots, u_{k,d})^t$,
where $u_{k,i} = k^{-1/2}$ for $i = 1, \ldots, k$ and $u_{k,i} = 0$ for $i = k+1, \ldots, d$
(thus $\|u_k\| = 1$),
have $u_k^t X \approx N(0,1)$ in distribution, for large $k$ and $d$
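The principle can be checked by simulation with uniform coordinates. The sketch below relies on a standard fact not stated on the slide: for independent standardized coordinates with common kurtosis $\kappa$, the projection $u_k^t X$ has kurtosis $\kappa / k$. Only the first $k$ coordinates enter the projection, so $d$ does not need to be simulated explicitly. The sample size and function names are assumptions.

```python
import math
import random

def kurt(xs):
    """Plug-in kurtosis E X^4 - 3 (E X^2)^2 of a centered sample."""
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m4 = sum((x - m) ** 4 for x in xs) / n
    return m4 - 3.0 * m2 * m2

def projection_sample(k, n, rng):
    """n draws of u_k^t X, where X has independent variance-one uniform coordinates."""
    half_width = math.sqrt(3.0)   # uniform on [-sqrt(3), sqrt(3)] has variance 1
    return [sum(rng.uniform(-half_width, half_width) for _ in range(k)) / math.sqrt(k)
            for _ in range(n)]

rng = random.Random(2)
results = {k: kurt(projection_sample(k, 20000, rng)) for k in (1, 4, 25)}
for k, value in results.items():
    print(k, value, -1.2 / k)   # estimate next to the theoretical value kappa / k
```

Larger $k$ means a direction farther from the coordinate axes, and the estimated kurtosis shrinks toward the Gaussian value 0.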
ICA & Non-Gaussianity
Illustrative examples (d = 100, n = 500):
a. Uniform Marginals
b. Exponential Marginals
c. Bimodal Marginals
Study over range of values of k
ICA & Non-Gaussianity
Illustrative example - Uniform marginals
Interesting Question
Behavior in Very High Dimension
Answer El Karoui (2010)
bull In Random Matrix Limit
bull Kernel Embedded Classifiers ~
~ Linear Classifiers
HDLSS Asymptotics amp Kernel Methods
Interesting Question
Behavior in Very High Dimension
Implications for DWD
Recall Main Advantage is for High d
HDLSS Asymptotics amp Kernel Methods
Interesting Question
Behavior in Very High Dimension
Implications for DWD
Recall Main Advantage is for High d
So not Clear Embedding Helps
Thus not yet Implemented in DWD
HDLSS Asymptotics amp Kernel Methods
HDLSS Additional Results
Batch Adjustment Xuxin Liu
Recall Intuition from above
Key is sizes of biological subtypes
Differing ratio trips up mean
But DWD more robust
Mathematics behind this
Liu Twiddle ratios of subtypes
HDLSS Data Combo Mathematics
Xuxin Liu Dissertation Results:
Simple Unbalanced Cluster Model
Growing at rate $d^{\alpha}$ as $d \to \infty$
Answers depend on $\alpha$
Visualization of setting…
Asymptotic Results (as $d \to \infty$):
Let $r$ denote ratio between subgroup sizes
HDLSS Data Combo Mathematics
Asymptotic Results (as $d \to \infty$):
For $\alpha > \frac{1}{2}$, PAM Inconsistent:
Angle(PAM, Truth) $\to C_r^{\mathrm{PAM}} \neq 0$
For $\alpha < \frac{1}{2}$, PAM Strongly Inconsistent:
Angle(PAM, Truth) $\to 90^\circ$
HDLSS Data Combo Mathematics
Asymptotic Results (as $d \to \infty$):
For $\alpha > \frac{1}{2}$, DWD Inconsistent:
Angle(DWD, Truth) $\to C_r^{\mathrm{DWD}} \neq 0$
For $\alpha < \frac{1}{2}$, DWD Strongly Inconsistent:
Angle(DWD, Truth) $\to 90^\circ$
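HDLSS Data Combo Mathematics
Value of $C_r^{\mathrm{PAM}}$ and $C_r^{\mathrm{DWD}}$ for sample size ratio $r$:
(each is an inverse-cosine expression in $r$; the two formulas are not recoverable from this transcript)
$C_r^{\mathrm{PAM}} = C_r^{\mathrm{DWD}} = 0$ only when $r = 1$
Otherwise, for $r \neq 1$, both are Inconsistent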
HDLSS Data Combo Mathematics
Comparison between PAM and DWD
I.e. between $C_r^{\mathrm{PAM}}$ and $C_r^{\mathrm{DWD}}$
Shows Strong Difference
Explains Above Empirical Observation
HDLSS Asymptotics
Personal Observations
HDLSS world is…
Surprising (many times)
[Think I’ve got it, and then …]
Mathematically Beautiful (?)
Practically Relevant
The Future of HDLSS Asymptotics
• “Contiguity” in Hypothesis Testing
• Rates of Convergence
• Improvements of DWD
(e.g. other functions of distance than inverse)
• Many Others
It is still early days …
State of HDLSS Research
Development Of Methods
Mathematical Assessment
…
(thanks to: defiant.corban.edu/gtipton/net-fun/iceberg.html)
Independent Component Analysis
Personal Viewpoint:
Another way of finding Directions (compare e.g. PCA, DWD):
Directions that maximize independence
Motivating Context: Signal Processing
“Blind Source Separation”
Independent Component Analysis
References:
• Cardoso (1993) (1st paper)
• Lee (1998) (early book, not recommended)
• Hyvärinen & Oja (1998) (excellent short tutorial)
• Hyvärinen, Karhunen & Oja (2001) (detailed monograph)
ICA Motivating Example
“Cocktail party problem”:
• Hear several simultaneous conversations
• Would like to separate them
Model for “conversations”: time series
$s_1(t)$ and $s_2(t)$
ICA Motivating Example
Mixed version of signals:
$x_1(t) = a_{11} s_1(t) + a_{12} s_2(t)$
And also a 2nd mixture:
$x_2(t) = a_{21} s_1(t) + a_{22} s_2(t)$
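ICA Motivating Example
Goal: Recover signal $s(t) = \begin{pmatrix} s_1(t) \\ s_2(t) \end{pmatrix}$
From data $x(t) = \begin{pmatrix} x_1(t) \\ x_2(t) \end{pmatrix}$
For unknown mixture matrix $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$
where $x = A s$ for all $t$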
ICA Motivating Example
Goal is to find separating weights $W$
so that $s = W x$ for all $t$
Problem: $W = A^{-1}$ would be fine,
but $A$ is unknown
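The mixing model and the role of $W = A^{-1}$ can be sketched in a few lines of NumPy. The two “conversations” and the entries of $A$ below are made-up illustrations, not the signals or matrix used in the lecture figures.

```python
import numpy as np

n = 1000
t = np.arange(n)
# Two hypothetical "conversations" (illustrative choices only)
s1 = np.sin(2 * np.pi * t / 50)
s2 = 2 * ((t / 37) % 1) - 1            # sawtooth with values in [-1, 1)
S = np.vstack([s1, s2])                # 2 x n, rows are time series

A = np.array([[1.0, 0.5],
              [0.4, 1.0]])             # hypothetical mixing matrix
X = A @ S                              # column t is x(t) = A s(t)

# If A were known, W = A^{-1} would recover the signals exactly (up to rounding)
W = np.linalg.inv(A)
S_recovered = W @ X
```

In practice only `X` is observed, which is why a separating $W$ has to be estimated from the data alone.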
ICA Motivating Example
Solutions for Cocktail Party Problem:
1. PCA (on population of 2-d vectors)
$W$ = matrix of eigenvectors
(Figure: PCA on population of 2-d vectors; panels “Maximal Variance” and “Minimal Variance”)
[direction of maximal variation
doesn’t solve this problem]
2. ICA (will describe method later)
(Figure: ICA result)
[Independent Components do solve it]
[modulo sign changes and identification]
ICA Motivating Example
Recall original time series
ICA Motivating Example
Relation to OODA: recall data matrix
$X = (X_1 \cdots X_n) = \begin{pmatrix} X_{11} & \cdots & X_{1n} \\ \vdots & \ddots & \vdots \\ X_{d1} & \cdots & X_{dn} \end{pmatrix}$
Signal Processing: focus on rows
($d$ time series, indexed by $t = 1, \ldots, n$)
OODA: focus on columns as data objects
($n$ data vectors)
Note: 2 viewpoints like “duals” for PCA
ICA Motivating Example
Scatterplot View (signal processing): Study
• Signals $(s_1(t), s_2(t))$, $t = 1, \ldots, n$
• Corresponding Scatterplot
• Data $(x_1(t), x_2(t))$, $t = 1, \ldots, n$
• Corresponding Scatterplot
(Figures: signals and data, each with its corresponding scatterplot)
ICA Motivating Example
Scatterplot View (signal processing): Plot
• Signals $(s_1(t), s_2(t))$, $t = 1, \ldots, n$, & Scatterplot
• Data $(x_1(t), x_2(t))$, $t = 1, \ldots, n$, & Scatterplot
• Scatterplots give hint how ICA is possible
• Affine transformation $x = A s$
Stretches independent signals into dependent
• “Inversion” is key to ICA
(even with $A$ unknown)
ICA Motivating Example
Signals - Corresponding Scatterplot
Note: Independent,
since known value of $s_1$ does not
change distribution of $s_2$
ICA Motivating Example
Data - Corresponding Scatterplot
Note: Dependent,
since known value of $x_1$ changes
distribution of $x_2$
ICA Motivating Example
Why not PCA?
Finds direction of greatest variation
Which is wrong for signal separation
(Figure: PCA direction of greatest variation, wrong for signal separation)
ICA Algorithm
ICA Step 1:
• “sphere the data”
(shown on right in scatterplot view)
• i.e. find linear transformation to make
mean = $0$, cov = $I$
• i.e. work with $Z = \hat{\Sigma}^{-1/2} X$
• requires $X$ of full rank
(at least $n \geq d$, i.e. no HDLSS)
• search for independence beyond
linear and quadratic structure
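Step 1 can be sketched as follows, assuming a $d \times n$ data matrix with objects in columns and the sample covariance taken with divisor $n$. The function name, the rank tolerance and the example data are illustrative assumptions. The inverse square root is built from the eigen-decomposition of $\hat{\Sigma}$, which is one common choice among several.

```python
import numpy as np

def sphere_data(X):
    """Return (Z, root) with Z = Sigma_hat^{-1/2} (X - row means) and root = Sigma_hat^{1/2}.

    X is d x n with objects in columns. Raises if Sigma_hat is not of full rank,
    which always happens in HDLSS settings (n < d).
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=1, keepdims=True)        # make the mean 0
    sigma_hat = Xc @ Xc.T / X.shape[1]            # sample covariance (divisor n)
    evals, evecs = np.linalg.eigh(sigma_hat)
    if evals.min() <= 1e-12 * evals.max():
        raise ValueError("covariance is not full rank, cannot sphere")
    inv_root = evecs @ np.diag(evals ** -0.5) @ evecs.T
    root = evecs @ np.diag(evals ** 0.5) @ evecs.T
    return inv_root @ Xc, root

rng = np.random.default_rng(0)
mix = np.array([[2.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.5, 0.3, 1.0]])
X = mix @ rng.standard_normal((3, 200)) + 5.0     # made-up correlated data
Z, root = sphere_data(X)
cov_Z = Z @ Z.T / Z.shape[1]                      # identity matrix up to rounding
```

Multiplying by `root` undoes the sphering, which is how results can be moved back to the original data scale.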
ICA Algorithm
ICA Step 2:
• Find directions that make (sphered) data
as independent as possible
• Recall “independence” means:
joint distribution is product of marginals
• In cocktail party example:
– Happens only when support parallel to axes
– Otherwise have blank areas,
but marginals are non-zero
(Figure: ICA Step 2, cocktail party example)
ICA Algorithm
Parallel Idea (and key to algorithm):
• Find directions that maximize
non-Gaussianity
• Based on assumption: starting from
independent coordinates
• Note: “most” projections are Gaussian
(since projection is “linear combo”)
• Mathematics behind this:
Diaconis and Freedman (1984)
ICA Algorithm
Worst case for ICA: Gaussian
• Then sphered data are independent
• So have independence in all (ortho) directions
• Thus can’t find useful directions
• Gaussian distribution is characterized by:
Independent & spherically symmetric
ICA Algorithm
Criteria for non-Gaussianity / independence:
• Kurtosis $E X^4 - 3 (E X^2)^2$
(4th order cumulant)
• Negative Entropy
• Mutual Information
• Nonparametric Maximum Likelihood
• “Infomax” in Neural Networks
• Interesting connections between these
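As a small illustration of the first criterion, the sketch below evaluates the sample version of $E X^4 - 3 (E X^2)^2$ on three standardized distributions. The function name, seed and sample size are illustrative assumptions. The population values are 0 for the Gaussian, $-1.2$ for the uniform and $3$ for the Laplace (all with variance 1), so the sign and size of the cumulant indicate how far a distribution is from Gaussian.

```python
import numpy as np

def kurtosis_cumulant(x):
    """Sample version of E X^4 - 3 (E X^2)^2 for centered x (0 for a Gaussian)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return np.mean(x**4) - 3.0 * np.mean(x**2) ** 2

rng = np.random.default_rng(1)
n = 200_000
samples = {
    "gaussian": rng.standard_normal(n),
    "uniform": rng.uniform(-np.sqrt(3), np.sqrt(3), n),      # variance 1
    "laplace": rng.laplace(scale=1 / np.sqrt(2), size=n),    # variance 1
}
cumulants = {name: kurtosis_cumulant(x) for name, x in samples.items()}
```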
ICA Algorithm
Matlab Algorithm (optimizing any of above):
“FastICA”
• Numerical gradient search method
• Can find directions iteratively
• Or by simultaneous optimization
(note: PCA does both, but not ICA)
• Appears fast, with good defaults
• Careful about local optima
(Recommend several random restarts)
ICA Algorithm
FastICA Notational Summary:
1. First sphere data: $Z = \hat{\Sigma}^{-1/2} X$
2. Apply ICA: Find $W_S$ to make
rows of $S_S = W_S Z$ “independent”
3. Can transform back to
original data scale: $S = \hat{\Sigma}^{1/2} S_S$
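Steps 1 and 2 can be sketched end to end. This is not the FastICA Matlab package: it is a minimal kurtosis-based fixed-point iteration in the same spirit (update $w \leftarrow E[z\,(w^t z)^3] - 3w$, then renormalize), finding directions one at a time. The function names, the sources and the mixing matrix are illustrative assumptions.

```python
import numpy as np

def sphere(X):
    """Z = Sigma_hat^{-1/2} (X - row means); X is d x n, full rank, n >= d."""
    Xc = X - X.mean(axis=1, keepdims=True)
    sigma_hat = Xc @ Xc.T / X.shape[1]
    evals, evecs = np.linalg.eigh(sigma_hat)
    return evecs @ np.diag(evals ** -0.5) @ evecs.T @ Xc

def ica_kurtosis(Z, n_iter=200, seed=0):
    """Find directions one at a time by a kurtosis fixed-point iteration.

    Returns W_S with orthonormal rows, so that S_S = W_S @ Z.
    """
    rng = np.random.default_rng(seed)
    d = Z.shape[0]
    W = np.zeros((d, d))
    for i in range(d):
        w = rng.standard_normal(d)
        w /= np.linalg.norm(w)
        for _ in range(n_iter):
            y = w @ Z
            w_new = (Z * y**3).mean(axis=1) - 3.0 * w
            w_new -= W[:i].T @ (W[:i] @ w_new)    # stay orthogonal to earlier rows
            w = w_new / np.linalg.norm(w_new)
        W[i] = w
    return W

# Hypothetical example: two independent non-Gaussian sources and a made-up A
rng = np.random.default_rng(42)
n = 5000
S_true = np.vstack([rng.uniform(-1, 1, n), rng.laplace(size=n)])
A = np.array([[1.0, 0.5], [0.4, 1.0]])
X = A @ S_true

Z = sphere(X)             # step 1: sphere the data
W_S = ica_kurtosis(Z)     # step 2: find separating directions
S_S = W_S @ Z             # rows are the estimated independent components
```

The rows of `S_S` match the true sources only up to order, sign and scale. A different random start (the `seed` argument) may change the order and signs, and in harder problems restarts help guard against local optima.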
ICA Algorithm
Careful look at identifiability:
• Seen already in above example
• Could rotate “square of data” in several ways, etc.
(Figure: Identifiability, Swap and Flips)
ICA Algorithm
Identifiability problem 1:
Generally can’t order rows of $S_S$ (& $S$)
(seen as “swap” above)
Since for a “permutation matrix” $P$
(pre-multiplication by $P$ “swaps rows”)
(post-multiplication by $P$ “swaps columns”)
For each column, $z = A_S s_S$, i.e. $s_S = A_S^{-1} z$,
i.e. $P s_S = P A_S^{-1} z = P W_S z$
So $P S_S$ and $P W_S$ are also ICA solutions
(i.e. $P S_S = P W_S Z$)
FastICA appears to order in terms of “how non-Gaussian”
ICA Algorithm
Identifiability problem 2:
Can’t find scale of elements of $s$
(seen as “flips” above)
Since for a (full rank) diagonal matrix $D$
(pre-multiplication by $D$ is scalar multiplication of rows)
(post-multiplication by $D$ is scalar multiplication of columns)
Again for each column, $s_S = A_S^{-1} z$,
i.e. $D s_S = D A_S^{-1} z = D W_S z$
So $D S_S$ and $D W_S$ are also ICA solutions
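Both identifiability problems can be checked numerically. The sketch below uses made-up matrices: a random orthogonal matrix standing in for $W_S$, a hypothetical permutation $P$, and a diagonal $D$ whose negative entry plays the role of a “flip”.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 3, 8
W_S = np.linalg.qr(rng.standard_normal((d, d)))[0]   # hypothetical separating matrix
Z = rng.standard_normal((d, n))                      # stand-in for sphered data
S_S = W_S @ Z

perm = [2, 0, 1]
P = np.eye(d)[perm]                # permutation matrix: row i of P is e_{perm[i]}
D = np.diag([2.0, -1.0, 0.5])      # full-rank diagonal, the -1 is a sign flip

swapped = P @ S_S                  # pre-multiplication by P swaps rows
scaled = D @ S_S                   # pre-multiplication by D rescales rows

rows_reordered = np.allclose(swapped, S_S[perm])
swap_is_solution = np.allclose(swapped, (P @ W_S) @ Z)    # P S_S = (P W_S) Z
scale_is_solution = np.allclose(scaled, (D @ W_S) @ Z)    # D S_S = (D W_S) Z
```

Reordering or rescaling rows does not affect their independence, so the data alone cannot distinguish $S_S$ from $P S_S$ or $D S_S$.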
Independent Component Analysis
Personal Viewpoint
Directions
(eg PCA DWD)
Independent Component Analysis
Personal Viewpoint
Directions that maximize independence
Independent Component Analysis
Personal Viewpoint
Directions that maximize independence
Motivating Context Signal Processing
ldquoBlind Source Separationrdquo
Independent Component Analysis
References
bull Cardoso (1993) (1st paper)
bull Lee (1998) (early book not reccorsquoed)
bull Hyvaumlrinen amp Oja (1998)
(excellent short tutorial)
bull Hyvaumlrinen Karhunen amp Oja (2001)
(detailed monograph)
ICA Motivating Example
ldquoCocktail party problemrdquo
ICA Motivating Example
ldquoCocktail party problemrdquo
bull Hear several simultaneous conversations
bull Would like to
separate them
ICA Motivating Example
ldquoCocktail party problemrdquo
bull Hear several simultaneous conversations
bull Would like to
separate them
Model for ldquoconversationsrdquo time series
and ts1 ts2
ICA Motivating Example
Model for ldquoconversationsrdquo time series
ICA Motivating Example
Mixed version of signals
And also a 2nd mixture
tsatsatx 2121111
tsatsatx 2221212
ICA Motivating Example
Mixed version of signals
ICA Motivating Example
Goal Recover signal
From data
For unknown mixture matrix
where for all
tx
txtx
2
1
ts
tsts
2
1
2221
1211
aa
aaA
sAx t
ICA Motivating Example
Goal is to find separating weights
so that for allxWs
W
t
ICA Motivating Example
Goal is to find separating weights
so that for all
Problem would be fine
but is unknown
xWs
W
1AW
t
A
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
= matrix of eigenvectorsW
ICA Motivating Example
1 PCA (on population of 2-d vectors)
Maximal
Variance
Minimal
Variance
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
[direction of maximal variation
doesnrsquot solve this problem]
2ICA (will describe method later)
ICA Motivating Example
2 ICA (will describe method later)
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
[direction of maximal variation
doesnrsquot solve this problem]
2ICA (will describe method later)
[Independent Components do solve it]
[modulo sign changes and identification]
ICA Motivating Example
Recall original time series
ICA Motivating ExampleRelation to OODA recall data matrix
dnd
n
n
XX
XX
XXX
1
111
1
ICA Motivating ExampleRelation to OODA recall data matrix
Signal Processing focus on rows
( time series indexed by )
OODA focus on columns as data objects
( data vectors)
dnd
n
n
XX
XX
XXX
1
111
1
d nt 1
n
ICA Motivating ExampleRelation to OODA recall data matrix
Signal Processing focus on rows
( time series indexed by )
OODA focus on columns as data objects
( data vectors)
Note 2 viewpoints like ldquodualsrdquo for PCA
dnd
n
n
XX
XX
XXX
1
111
1
d nt 1
n
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals nttsts 1)()( 21
ICA Motivating Example
Study Signals nttsts 1)()( 21
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals
bull Corresponding Scatterplot
nttsts 1)()( 21
ICA Motivating Example
Signals - Corresponding Scatterplot
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals
bull Corresponding Scatterplot
bull Data
nttsts 1)()( 21
nttxtx 1)()( 21
ICA Motivating Example
Study Data nttxtx 1)()( 21
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals
bull Corresponding Scatterplot
bull Data
bull Corresponding Scatterplot
nttsts 1)()( 21
nttxtx 1)()( 21
ICA Motivating Example
Data - Corresponding Scatterplot
ICA Motivating Example
Scatterplot View (signal processing) Plot
bull Signals amp Scatrsquoplot
bull Data amp Scatrsquoplot
bull Scatterplots give hint how ICA is possible
bull Affine transformation
Stretches indeprsquot signals into deprsquot
bull Inversion is key to ICA
(even w unknown)
nttsts 1)()( 21
nttxtx 1)()( 21
A
sAx
ICA Motivating Example
Signals - Corresponding Scatterplot
Note Independent
Since Known Value
Of s1 Does Not
Change Distribution
of s2
ICA Motivating Example
Data - Corresponding Scatterplot
Note Dependent
Since Known Value
Of s1 Changes
Distribution of s2
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
ICA Motivating Example
PCA - Finds direction of greatest variation
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
Which is wrong for signal separation
ICA Motivating Example
PCA - Wrong for signal separation
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
Which is wrong for signal separation
ICA Algorithm
ICA Step 1:
• "sphere the data" (shown on right in scatterplot view)
• i.e. find linear transf'n to make mean = $0$, cov = $I$
• i.e. work with $Z = \hat\Sigma^{-1/2} X$
• requires $X$ of full rank (at least $n \ge d$, i.e. no HDLSS)
• search for independent beyond linear and quadratic structure
[Figure: ICA Step 1, sphere the data]
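The sphering step can be sketched for the 2-d cocktail party setting. This is a minimal illustration, not the course's code: the function name `sphere_2d`, the example mixing weights and the random seed are assumptions, and the data are centered explicitly before applying $\hat\Sigma^{-1/2}$ so that the sphered mean is 0.

```python
import math
import random

def sphere_2d(x1, x2):
    """Return Z = Sigma^{-1/2} (X - mean) for 2-d data given as two rows."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    c1 = [v - m1 for v in x1]
    c2 = [v - m2 for v in x2]
    # Sample covariance matrix [[a, b], [b, c]] (divisor n).
    a = sum(v * v for v in c1) / n
    b = sum(u * v for u, v in zip(c1, c2)) / n
    c = sum(v * v for v in c2) / n
    # Closed-form eigen-decomposition of a symmetric 2 x 2 matrix.
    theta = 0.5 * math.atan2(2 * b, a - c)
    ct, st = math.cos(theta), math.sin(theta)
    lam1 = a * ct * ct + 2 * b * ct * st + c * st * st
    lam2 = a * st * st - 2 * b * ct * st + c * ct * ct
    if min(lam1, lam2) <= 0:
        raise ValueError("covariance must have full rank")
    # Sigma^{-1/2} = V diag(lam^{-1/2}) V^t with V = [[ct, -st], [st, ct]].
    i1, i2 = 1 / math.sqrt(lam1), 1 / math.sqrt(lam2)
    w11 = i1 * ct * ct + i2 * st * st
    w12 = (i1 - i2) * ct * st
    w22 = i1 * st * st + i2 * ct * ct
    z1 = [w11 * u + w12 * v for u, v in zip(c1, c2)]
    z2 = [w12 * u + w22 * v for u, v in zip(c1, c2)]
    return z1, z2

# Hypothetical example data: two uniform signals mixed by assumed weights.
rng = random.Random(0)
s1 = [rng.uniform(-1, 1) for _ in range(500)]
s2 = [rng.uniform(-1, 1) for _ in range(500)]
x1 = [1.0 * u + 0.5 * v for u, v in zip(s1, s2)]
x2 = [0.3 * u + 1.0 * v for u, v in zip(s1, s2)]

z1, z2 = sphere_2d(x1, x2)
n = len(z1)
cov11 = sum(v * v for v in z1) / n               # variance of first sphered row
cov12 = sum(u * v for u, v in zip(z1, z2)) / n   # covariance of the two rows
cov22 = sum(v * v for v in z2) / n               # variance of second sphered row
```

After sphering, the sample covariance of the rows of Z is the identity up to rounding, so any remaining structure is beyond second order.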
ICA Algorithm
ICA Step 2:
• Find dir'ns that make (sphered) data as independent as possible
• Recall "independence" means: joint distribution is product of marginals
• In cocktail party example:
  – Happens only when support parallel to axes
  – Otherwise have blank areas, but marginals are non-zero
[Figure: ICA Step 2, cocktail party example]
ICA Algorithm
Parallel Idea (and key to algorithm):
• Find directions that maximize non-Gaussianity
• Based on assumption: starting from independent coordinates
• Note: "most" projections are Gaussian (since projection is "linear combo")
• Mathematics behind this: Diaconis and Freedman (1984)
ICA Algorithm
Worst case for ICA: Gaussian
• Then sphered data are independent
• So have independence in all (ortho) dir'ns
• Thus can't find useful directions
• Gaussian distribution is characterized by: Independent & spherically symmetric
ICA Algorithm
Criteria for non-Gaussianity / independence:
• Kurtosis $E X^4 - 3 (E X^2)^2$ (4th order cumulant)
• Negative Entropy
• Mutual Information
• Nonparametric Maximum Likelihood
• "Infomax" in Neural Networks
• Interesting connections between these
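As an illustration of the first criterion, the kurtosis can be estimated from a sample. This is a rough sketch; the function name `excess_kurtosis`, the sample sizes and the seed are assumptions chosen for the example. The sample is standardized first, so the second moment equals 1.

```python
import random

def excess_kurtosis(x):
    """Sample version of E X^4 - 3 (E X^2)^2, computed after standardizing x."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    z = [(v - mean) / var ** 0.5 for v in x]
    m2 = sum(v ** 2 for v in z) / n      # equals 1 after standardizing
    m4 = sum(v ** 4 for v in z) / n
    return m4 - 3 * m2 ** 2

rng = random.Random(1)
gauss = [rng.gauss(0, 1) for _ in range(20000)]
unif = [rng.uniform(-1, 1) for _ in range(20000)]
expo = [rng.expovariate(1.0) for _ in range(20000)]

k_gauss = excess_kurtosis(gauss)   # theoretical value 0
k_unif = excess_kurtosis(unif)     # theoretical value -1.2
k_expo = excess_kurtosis(expo)     # theoretical value 6
```

Values far from 0, in either direction, indicate non-Gaussianity, which is why criteria of this kind usually work with the absolute value or the square.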
ICA Algorithm
Matlab Algorithm (optimizing any of above): "FastICA"
• Numerical gradient search method
• Can find directions iteratively
• Or by simultaneous optimization (note: PCA does both, but not ICA)
• Appears fast, with good defaults
• Careful about local optima (Recommend several random restarts)
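To make the search for non-Gaussian directions concrete, here is a toy 2-d sketch. It is not FastICA: instead of a numerical gradient search it does a plain grid search over the rotation angle, which also sidesteps local optima in this small case. The function names, the 30 degree angle, the sample size and the seed are assumptions, and the data are taken to be already sphered.

```python
import math
import random

def kurt(y):
    """Sample E Y^4 - 3 (E Y^2)^2 (y assumed centered)."""
    n = len(y)
    m2 = sum(v * v for v in y) / n
    m4 = sum(v ** 4 for v in y) / n
    return m4 - 3 * m2 * m2

def rotate(z1, z2, theta):
    """Rows of W Z for the rotation matrix W = [[c, s], [-s, c]]."""
    c, s = math.cos(theta), math.sin(theta)
    y1 = [c * u + s * v for u, v in zip(z1, z2)]
    y2 = [-s * u + c * v for u, v in zip(z1, z2)]
    return y1, y2

def best_rotation(z1, z2, steps=90):
    """Grid search on [0, pi/2) for the angle maximizing total |kurtosis|."""
    best_theta, best_score = 0.0, -1.0
    for i in range(steps):
        theta = (math.pi / 2) * i / steps
        y1, y2 = rotate(z1, z2, theta)
        score = abs(kurt(y1)) + abs(kurt(y2))
        if score > best_score:
            best_theta, best_score = theta, score
    return best_theta

# Already-sphered toy data: independent unit-variance uniform sources,
# rotated by a known (assumed) angle of 30 degrees.
rng = random.Random(2)
n = 4000
h = math.sqrt(3.0)                      # Uniform(-h, h) has variance 1
s1 = [rng.uniform(-h, h) for _ in range(n)]
s2 = [rng.uniform(-h, h) for _ in range(n)]
true_theta = math.pi / 6
z1, z2 = rotate(s1, s2, -true_theta)

theta_hat = best_rotation(z1, z2)       # should land near true_theta
y1, y2 = rotate(z1, z2, theta_hat)      # estimated independent components
```

The search range is only a quarter turn because rotating by a further 90 degrees just swaps the components and flips a sign, which is the identifiability issue discussed next.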
ICA Algorithm
FastICA Notational Summary:
1. First sphere data: $Z = \hat\Sigma^{-1/2} X$
2. Apply ICA: Find $W_S$ to make rows of $S_S = W_S Z$ "indep't"
3. Can transform back to original data scale: $\hat S = \hat\Sigma^{1/2} S_S$
ICA Algorithm
Careful look at identifiability:
• Seen already in above example
• Could rotate "square of data" in several ways, etc.
[Figure: Identifiability: Swap, Flips]
ICA Algorithm
Identifiability problem 1:
Generally can't order rows of $S_S$ (& $S$) (seen as swap above)
Since for a "permutation matrix" $P$
(pre-multiplication by $P$ "swaps rows")
(post-multiplication by $P$ "swaps columns")
For each column, $z = A_S s_S$, i.e. $s_S = A_S^{-1} z$,
i.e. $P s_S = P A_S^{-1} z = P W_S z$
So $P S_S$ and $P W_S$ are also ICA solut'ns (i.e. $P S_S = P W_S Z$)
FastICA appears to order in terms of "how non-Gaussian"
ICA Algorithm
Identifiability problem 2:
Can't find scale of elements of $s$ (seen as flips above)
Since for a (full rank) diagonal matrix $D$
(pre-multipl'n by $D$ is scalar mult'n of rows)
(post-multipl'n by $D$ is scalar mult'n of col's)
Again for each col'n, $s_S = A_S^{-1} z$,
i.e. $D s_S = D A_S^{-1} z = D W_S z$
So $D S_S$ and $D W_S$ are also ICA solutions
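Both identifiability problems can be checked numerically. In this small sketch the matrices `A`, `P`, `D` and the toy signal values are made-up assumptions: swapping, rescaling and flipping the signals, while compensating in the mixing matrix, reproduces exactly the same data, so the data alone cannot tell the two solutions apart.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1.0, 0.5], [0.3, 1.0]]              # assumed mixing matrix
S = [[0.2, -1.0, 0.7, 0.1],               # toy signal s_1(t)
     [1.5, 0.4, -0.3, -0.9]]              # toy signal s_2(t)
X = matmul(A, S)                          # observed data

P = [[0.0, 1.0], [1.0, 0.0]]              # permutation: swaps the two rows
D = [[2.0, 0.0], [0.0, -1.0]]             # rescales row 1, flips row 2
D_inv = [[0.5, 0.0], [0.0, -1.0]]
P_inv = [[0.0, 1.0], [1.0, 0.0]]          # a swap is its own inverse

S_alt = matmul(D, matmul(P, S))           # swapped, rescaled, flipped signals
A_alt = matmul(A, matmul(P_inv, D_inv))   # compensating mixing matrix
X_alt = matmul(A_alt, S_alt)              # same data from different signals

max_diff = max(abs(u - v) for ru, rv in zip(X, X_alt) for u, v in zip(ru, rv))
```

Here `max_diff` is zero up to rounding, since $A P^{-1} D^{-1} D P S = A S$.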
ICA Algorithm
Signal Processing Scale identification: (Hyvärinen and Oja 1999)
Choose scale so each signal $s_i(t)$ has "unit average energy": $\sum_t s_i(t)^2 = 1$
(preserves energy along rows of data matrix)
Explains "same scales" in Cocktail Party Example
ICA & Non-Gaussianity
Explore main ICA principle:
For indep., non-Gaussian, standardized r.v.'s $X = (X_1, \ldots, X_d)^t$
Projections farther from coordinate axes are more Gaussian
For the dir'n vector $u_k = (u_{k,1}, \ldots, u_{k,d})^t$, where
$u_{k,j} = k^{-1/2}$ for $j = 1, \ldots, k$ and $u_{k,j} = 0$ for $j = k+1, \ldots, d$
(thus $\|u_k\| = 1$),
have $u_k^t X \overset{d}{\approx} N(0,1)$ for large $d$ and $k$
ICA & Non-Gaussianity
Illustrative examples (d = 100, n = 500):
a. Uniform Marginals
b. Exponential Marginals
c. Bimodal Marginals
Study over range of values of k
[Figure: Illustrative example, Uniform marginals]
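The uniform-marginals example can be imitated with a short simulation. This is a sketch under assumptions: the function names, the seed and the chosen values of k are illustrative, and kurtosis is used here as the measure of non-Gaussianity (the slides' figures may show something different). The projection direction has k entries equal to $k^{-1/2}$ and the rest 0.

```python
import random

def excess_kurtosis(y):
    """Sample E Y^4 - 3 (E Y^2)^2 after standardizing y."""
    n = len(y)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / n
    z = [(v - mean) / var ** 0.5 for v in y]
    return sum(v ** 4 for v in z) / n - 3.0

def projection_kurtosis(k, d=100, n=500, seed=0):
    """Kurtosis of u_k^t X for n draws of X with d indep. uniform entries."""
    rng = random.Random(seed)
    half_width = 3 ** 0.5       # Uniform(-sqrt(3), sqrt(3)): mean 0, variance 1
    proj = []
    for _ in range(n):
        x = [rng.uniform(-half_width, half_width) for _ in range(d)]
        proj.append(sum(x[:k]) / k ** 0.5)    # u_k^t x
    return excess_kurtosis(proj)

# Theoretical kurtosis of the projection is -1.2 / k, shrinking toward 0.
results = {k: projection_kurtosis(k) for k in (1, 2, 4, 10, 100)}
```

For k = 1 the projection is a single uniform coordinate (kurtosis near -1.2), while for k = 100 it is close to Gaussian, in line with the principle that projections farther from the coordinate axes are more Gaussian.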
HDLSS Additional Results
Batch Adjustment Xuxin Liu
Recall Intuition from above
Key is sizes of biological subtypes
Differing ratio trips up mean
But DWD more robust
Mathematics behind this
Liu Twiddle ratios of subtypes
HDLSS Data Combo Mathematics
Xuxin Liu Dissertation Results
Simple Unbalanced Cluster Model
Growing at rate as
Answers depend on
Visualization of settinghellip
d d
HDLSS Data Combo Mathematics
Asymptotic Results (as $d \to \infty$)
Let $r$ denote ratio between subgroup sizes
HDLSS Data Combo Mathematics
Asymptotic Results (as )
For PAM Inconsistent
Angle(PAMTruth)
For PAM Strongly Inconsistent
Angle(PAMTruth)
d
2
1
2
1
0 rC
090
HDLSS Data Combo Mathematics
Asymptotic Results (as )
For DWD Inconsistent
Angle(DWDTruth)
For DWD Strongly Inconsistent
Angle(DWDTruth)
d
2
1
2
1
090
0 rC
HDLSS Data Combo Mathematics
Value of and for sample size ratio
only when
Otherwise for both are Inconsistent
rC
22
1cos
2
1
r
rCr
0 rr CC
r
1r
1r
rC
22
1cos
32
31
r
rCr
HDLSS Data Combo Mathematics
Comparison between PAM and DWD
Ie between and rCrC
HDLSS Data Combo Mathematics
Comparison between PAM and DWD
HDLSS Data Combo Mathematics
Comparison between PAM and DWD
Ie between and
Shows Strong Difference
Explains Above Empirical Observation
rCrC
HDLSS Asymptotics
Personal Observations:
HDLSS world is…
• Surprising (many times)
  [Think I've got it, and then …]
• Mathematically Beautiful (?)
• Practically Relevant
The Future of HDLSS Asymptotics:
• "Contiguity" in Hypo Testing
• Rates of Convergence
• Improvements of DWD
  (e.g. other functions of distance than inverse)
• Many Others
It is still early days …
State of HDLSS Research
Development of Methods
Mathematical Assessment
…
(thanks todefiantcorbanedugtiptonnet-funiceberghtml)
Independent Component Analysis
Personal Viewpoint:
Directions that maximize independence (cf. other directions, e.g. PCA, DWD)
Motivating Context: Signal Processing
"Blind Source Separation"
Independent Component Analysis
References:
• Cardoso (1993) (1st paper)
• Lee (1998) (early book, not recco'ed)
• Hyvärinen & Oja (1998) (excellent short tutorial)
• Hyvärinen, Karhunen & Oja (2001) (detailed monograph)
ICA Motivating Example
"Cocktail party problem":
• Hear several simultaneous conversations
• Would like to separate them
Model for "conversations": time series $s_1(t)$ and $s_2(t)$
[Figure: time series for the two "conversations"]
ICA Motivating Example
Mixed version of signals: $x_1(t) = a_{11} s_1(t) + a_{12} s_2(t)$
And also a 2nd mixture: $x_2(t) = a_{21} s_1(t) + a_{22} s_2(t)$
[Figure: mixed version of signals]
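ICA Motivating Example
Goal: Recover signal $s(t) = \begin{pmatrix} s_1(t) \\ s_2(t) \end{pmatrix}$
From data $x(t) = \begin{pmatrix} x_1(t) \\ x_2(t) \end{pmatrix}$
For unknown mixture matrix $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$
where $x = A s$ for all $t$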
ICA Motivating Example
Goal is to find separating weights $W$ so that $s = W x$ for all $t$
Problem: $W = A^{-1}$ would be fine, but $A$ is unknown
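The mixing model and the ideal separating weights can be written out directly. The sketch below uses made-up signals (a sine wave and a sawtooth) and an assumed mixing matrix; none of these values come from the slides. It only shows that $W = A^{-1}$ recovers the signals when $A$ is known, which is the case ICA cannot rely on.

```python
import math

def mix(A, s1, s2):
    """Apply the 2 x 2 matrix A to the column (s1(t), s2(t)) at every t."""
    x1 = [A[0][0] * u + A[0][1] * v for u, v in zip(s1, s2)]
    x2 = [A[1][0] * u + A[1][1] * v for u, v in zip(s1, s2)]
    return x1, x2

def inverse_2x2(A):
    """Closed-form inverse of a 2 x 2 matrix."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("mixing matrix must be invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

n = 200
# Hypothetical source signals: a sine wave and a sawtooth.
s1 = [math.sin(2 * math.pi * t / 25) for t in range(n)]
s2 = [(t % 40) / 20 - 1 for t in range(n)]

A = [[1.0, 0.5], [0.3, 1.0]]      # illustrative mixing matrix (assumed)
x1, x2 = mix(A, s1, s2)           # observed data x = A s

W = inverse_2x2(A)                # separating weights W = A^{-1}
r1, r2 = mix(W, x1, x2)           # recovered signals s = W x

max_error = max(max(abs(a - b) for a, b in zip(r1, s1)),
                max(abs(a - b) for a, b in zip(r2, s2)))
```

With $A$ unknown, $W$ has to be estimated from the data alone, which is what the PCA and ICA solutions that follow attempt.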
ICA Motivating Example
Solutions for Cocktail Party Problem:
1. PCA (on population of 2-d vectors), $W$ = matrix of eigenvectors
   [direction of maximal variation doesn't solve this problem]
2. ICA (will describe method later)
   [Independent Components do solve it]
   [modulo sign changes and identification]
[Figure: 1. PCA (on population of 2-d vectors): Maximal Variance, Minimal Variance]
[Figure: 2. ICA (will describe method later)]
ICA Motivating Example
Recall original time series
ICA Motivating Example
Relation to OODA: recall data matrix
$X = (X_1 \ \cdots \ X_n) = \begin{pmatrix} X_{11} & \cdots & X_{1n} \\ \vdots & \ddots & \vdots \\ X_{d1} & \cdots & X_{dn} \end{pmatrix}$
Signal Processing: focus on rows ($d$ time series, indexed by $t = 1, \ldots, n$)
OODA: focus on columns as data objects ($n$ data vectors)
Note: 2 viewpoints like "duals" for PCA
HDLSS Data Combo Mathematics
Xuxin Liu Dissertation Results
Simple Unbalanced Cluster Model
Growing at rate as
Answers depend on
Visualization of settinghellip
d d
HDLSS Data Combo Mathematics
HDLSS Data Combo Mathematics
HDLSS Data Combo Mathematics
Asymptotic Results (as )
Let denote ratio between subgroup sizes
d
r
HDLSS Data Combo Mathematics
Asymptotic Results (as )
For PAM Inconsistent
Angle(PAMTruth)
For PAM Strongly Inconsistent
Angle(PAMTruth)
d
2
1
2
1
0 rC
090
HDLSS Data Combo Mathematics
Asymptotic Results (as )
For DWD Inconsistent
Angle(DWDTruth)
For DWD Strongly Inconsistent
Angle(DWDTruth)
d
2
1
2
1
090
0 rC
HDLSS Data Combo Mathematics
Value of and for sample size ratio
only when
Otherwise for both are Inconsistent
rC
22
1cos
2
1
r
rCr
0 rr CC
r
1r
1r
rC
22
1cos
32
31
r
rCr
HDLSS Data Combo Mathematics
Comparison between PAM and DWD
Ie between and rCrC
HDLSS Data Combo Mathematics
Comparison between PAM and DWD
HDLSS Data Combo Mathematics
Comparison between PAM and DWD
Ie between and
Shows Strong Difference
Explains Above Empirical Observation
rCrC
Personal Observations
HDLSS world ishellip
HDLSS Asymptotics
Personal Observations
HDLSS world ishellip
Surprising (many times)
[Think Irsquove got it and then hellip]
HDLSS Asymptotics
Personal Observations
HDLSS world ishellip
Surprising (many times)
[Think Irsquove got it and then hellip]
Mathematically Beautiful ()
HDLSS Asymptotics
Personal Observations
HDLSS world ishellip
Surprising (many times)
[Think Irsquove got it and then hellip]
Mathematically Beautiful ()
Practically Relevant
HDLSS Asymptotics
The Future of HDLSS Asymptotics
bull ldquoContiguityrdquo in Hypo Testing
bull Rates of Convergence
bull Improvements of DWD
(eg other functions of distance than inverse)
bull Many Others
It is still early days hellip
State of HDLSS Research
DevelopmentOf Methods
MathematicalAssessment
hellip
(thanks todefiantcorbanedugtiptonnet-funiceberghtml)
Independent Component Analysis
Personal Viewpoint
Directions
(eg PCA DWD)
Independent Component Analysis
Personal Viewpoint
Directions that maximize independence
Independent Component Analysis
Personal Viewpoint
Directions that maximize independence
Motivating Context Signal Processing
ldquoBlind Source Separationrdquo
Independent Component Analysis
References:
• Cardoso (1993) (1st paper)
• Lee (1998) (early book, not recommended)
• Hyvärinen & Oja (1998) (excellent short tutorial)
• Hyvärinen, Karhunen & Oja (2001) (detailed monograph)
ICA Motivating Example
"Cocktail party problem":
• Hear several simultaneous conversations
• Would like to separate them
Model for "conversations": time series $s_1(t)$ and $s_2(t)$
[Figure: Model for "conversations": time series]
ICA Motivating Example
Mixed version of signals:
$x_1(t) = a_{11} s_1(t) + a_{12} s_2(t)$
And also a 2nd mixture:
$x_2(t) = a_{21} s_1(t) + a_{22} s_2(t)$
[Figure: Mixed version of signals]
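ICA Motivating Example
Goal: Recover signal $s(t) = \begin{pmatrix} s_1(t) \\ s_2(t) \end{pmatrix}$
From data $x(t) = \begin{pmatrix} x_1(t) \\ x_2(t) \end{pmatrix}$
For unknown mixture matrix $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$
where $x = A s$ for all $t$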
ICA Motivating Example
Goal is to find separating weights $W$
so that $s = W x$ for all $t$
Problem: $W = A^{-1}$ would be fine,
but $A$ is unknown
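The mixing model and the "oracle" separating weights can be sketched numerically. This is a minimal illustration only: the two source signals and the entries of `A` below are made-up assumptions (they are not the series plotted in the slides), and in a real problem $A$ is unknown, so $W = A^{-1}$ is not available.

```python
import numpy as np

n = 1000
t = np.arange(n)

# Hypothetical source signals (assumed for illustration only).
s1 = np.sin(2 * np.pi * t / 50)        # smooth "conversation"
s2 = 2 * ((t / 37) % 1) - 1            # sawtooth "conversation"
S = np.vstack([s1, s2])                # 2 x n, rows are s_1(t), s_2(t)

# Assumed invertible mixing matrix A (unknown in practice).
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])

X = A @ S                              # x(t) = A s(t) for all t
W = np.linalg.inv(A)                   # oracle separating weights W = A^{-1}
S_recovered = W @ X                    # s(t) = W x(t)
```

With $A$ known the recovery is exact up to rounding; the point of ICA is to find a usable $W$ without access to $A$.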
ICA Motivating Example
Solutions for Cocktail Party Problem:
1. PCA (on population of 2-d vectors)
   $W$ = matrix of eigenvectors
   [direction of maximal variation doesn't solve this problem]
   [Figure: 1. PCA (on population of 2-d vectors), showing the Maximal Variance and Minimal Variance components]
2. ICA (will describe method later)
   [Independent Components do solve it]
   [modulo sign changes and identification]
   [Figure: 2. ICA (will describe method later)]
ICA Motivating Example
Recall original time series
ICA Motivating Example
Relation to OODA: recall data matrix
$X = \begin{pmatrix} X_1 & \cdots & X_n \end{pmatrix} = \begin{pmatrix} X_{11} & \cdots & X_{1n} \\ \vdots & \ddots & \vdots \\ X_{d1} & \cdots & X_{dn} \end{pmatrix}$
Signal Processing: focus on rows
($d$ time series, indexed by $t = 1, \ldots, n$)
OODA: focus on columns as data objects
($n$ data vectors)
Note: 2 viewpoints like "duals" for PCA
ICA Motivating Example
Scatterplot View (signal processing): Study
• Signals $\{(s_1(t), s_2(t)) : t = 1, \ldots, n\}$
• Corresponding Scatterplot
• Data $\{(x_1(t), x_2(t)) : t = 1, \ldots, n\}$
• Corresponding Scatterplot
[Figures: Signals with Corresponding Scatterplot; Data with Corresponding Scatterplot]
Scatterplot View (signal processing) Plot
bull Signals amp Scatrsquoplot
bull Data amp Scatrsquoplot
bull Scatterplots give hint how ICA is possible
bull Affine transformation
Stretches indeprsquot signals into deprsquot
bull Inversion is key to ICA
(even w unknown)
nttsts 1)()( 21
nttxtx 1)()( 21
A
sAx
ICA Motivating Example
[Figure: Signals - Corresponding Scatterplot]
Note: Independent, since known value of $s_1$ does not change distribution of $s_2$
ICA Motivating Example
[Figure: Data - Corresponding Scatterplot]
Note: Dependent, since known value of $x_1$ changes distribution of $x_2$
ICA Motivating Example
Why not PCA?
Finds direction of greatest variation,
which is wrong for signal separation
[Figure: PCA finds direction of greatest variation, which is wrong for signal separation]
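One way to see why PCA fails here is a small population-level check. This is a sketch under assumptions: the mixing matrix `A` is made up, and the sources are taken to be uncorrelated with unit variance, so that $\mathrm{Cov}(x) = A A^t$. A matrix $W$ separates the signals only if $W A$ has a single nonzero entry in each row.

```python
import numpy as np

# Assumed mixing matrix (illustrative only); its columns are not orthogonal.
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])

# With uncorrelated unit-variance sources, Cov(x) = A A^T.
cov_x = A @ A.T
evals, V = np.linalg.eigh(cov_x)       # columns of V are the PCA directions
W_pca = V.T                            # PCA "separating weights"

# A separating W makes W A a scaled permutation (one nonzero per row).
mix_after_pca = W_pca @ A              # every entry is far from zero
mix_after_inverse = np.linalg.inv(A) @ A   # identity: perfect separation
```

Every entry of `mix_after_pca` is clearly nonzero, so each PCA component is still a blend of both sources. PCA would separate only if the columns of $A$ happened to be orthogonal.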
ICA Algorithm
ICA, Step 1:
• "sphere the data"
  (shown on right in scatterplot view)
• i.e. find linear transformation to make mean $= 0$, cov $= I$
• i.e. work with $Z = \hat{\Sigma}^{-1/2} X$
• requires $X$ of full rank
  (at least $n \geq d$, i.e. no HDLSS)
• search for independence beyond linear and quadratic structure
[Figure: ICA Step 1, sphere the data]
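Sphering can be sketched as follows. This is a minimal version under stated assumptions: `sphere` is a hypothetical helper name, the mean is subtracted explicitly (the slide's "mean $= 0$"), $\hat{\Sigma}$ is taken to be the usual sample covariance, and the symmetric inverse square root is computed from an eigendecomposition. It needs $\hat{\Sigma}$ to be invertible, which is the full rank requirement above.

```python
import numpy as np

def sphere(X):
    """Sphere a d x n data matrix: row means 0, sample covariance I.

    Returns the sphered data Z and the matrix Sigma_hat^{-1/2}.
    Assumes the sample covariance is full rank (no HDLSS).
    """
    Xc = X - X.mean(axis=1, keepdims=True)        # center each row
    sigma_hat = np.cov(Xc)                        # d x d sample covariance
    evals, evecs = np.linalg.eigh(sigma_hat)      # symmetric eigendecomposition
    sigma_inv_half = (evecs / np.sqrt(evals)) @ evecs.T
    return sigma_inv_half @ Xc, sigma_inv_half

# Illustrative data: mixed uniform signals (assumed, not from the slides).
rng = np.random.default_rng(0)
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])
X = A @ rng.uniform(-1.0, 1.0, size=(2, 500))
Z, sigma_inv_half = sphere(X)
```

After this step `Z` has identity sample covariance, so what remains to be found is a rotation.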
ICA Algorithm
ICA, Step 2:
• Find directions that make (sphered) data as independent as possible
• Recall "independence" means:
  joint distribution is product of marginals
• In cocktail party example:
  – Happens only when support is parallel to axes
  – Otherwise have blank areas, but marginals are non-zero
[Figure: ICA Step 2, cocktail party example]
ICA Algorithm
Parallel Idea (and key to algorithm):
• Find directions that maximize non-Gaussianity
• Based on assumption: starting from independent coordinates
• Note: "most" projections are Gaussian
  (since projection is "linear combo")
• Mathematics behind this:
  Diaconis and Freedman (1984)
ICA Algorithm
Worst case for ICA: Gaussian
• Then sphered data are independent
• So have independence in all (orthogonal) directions
• Thus can't find useful directions
• Gaussian distribution is characterized by:
  Independent & spherically symmetric
ICA Algorithm
Criteria for non-Gaussianity / independence:
• Kurtosis $E X^4 - 3 (E X^2)^2$
  (4th order cumulant)
• Negative Entropy
• Mutual Information
• Nonparametric Maximum Likelihood
• "Infomax" in Neural Networks
• Interesting connections between these
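The kurtosis criterion is easy to compute from a sample. The sketch below uses a hypothetical helper `fourth_cumulant` that replaces the expectations in $E X^4 - 3 (E X^2)^2$ by sample averages after centering. The value is 0 for a Gaussian, and $-1.2$ for a uniform distribution with unit variance.

```python
import numpy as np

def fourth_cumulant(x):
    """Sample version of E X^4 - 3 (E X^2)^2 for centered data."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return np.mean(x**4) - 3.0 * np.mean(x**2) ** 2

rng = np.random.default_rng(0)
n = 200_000                                   # assumed sample size
k_gauss = fourth_cumulant(rng.standard_normal(n))                  # near 0
k_unif = fourth_cumulant(rng.uniform(-np.sqrt(3), np.sqrt(3), n))  # near -1.2
k_signs = fourth_cumulant([-1.0, 1.0, -1.0, 1.0])                  # exactly -2
```

Larger absolute values indicate directions that are further from Gaussian, which is what the ICA search rewards.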
ICA Algorithm
Matlab Algorithm (optimizing any of above): "FastICA"
• Numerical gradient search method
• Can find directions iteratively
• Or by simultaneous optimization
  (note: PCA does both, but not ICA)
• Appears fast, with good defaults
• Careful about local optima
  (Recommend several random restarts)
ICA Algorithm
FastICA Notational Summary:
1. First sphere data: $Z = \hat{\Sigma}^{-1/2} X$
2. Apply ICA: Find $W_S$ to make rows of $S_S = W_S Z$ "independent"
3. Can transform back to original data scale: $\hat{S} = \hat{\Sigma}^{1/2} S_S$
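Steps 1 and 2 can be sketched with a small kurtosis-based fixed-point iteration. To be clear about what this is: it is not the FastICA Matlab package, only an illustrative stand-in that uses the kurtosis update $w \leftarrow \mathrm{ave}\{ z (w^t z)^3 \} - 3w$ and finds directions one at a time (deflation). The names `sphere` and `kurtosis_ica`, the iteration limit, the tolerance, and the test signals are all assumptions.

```python
import numpy as np

def sphere(X):
    """Step 1: center and sphere a d x n data matrix."""
    Xc = X - X.mean(axis=1, keepdims=True)
    evals, evecs = np.linalg.eigh(np.cov(Xc))
    return (evecs / np.sqrt(evals)) @ evecs.T @ Xc

def kurtosis_ica(Z, n_iter=200, seed=0):
    """Step 2: find orthonormal rows of W_S, one direction at a time."""
    rng = np.random.default_rng(seed)
    d, _ = Z.shape
    W = np.zeros((d, d))
    for i in range(d):
        w = rng.standard_normal(d)
        w /= np.linalg.norm(w)
        for _ in range(n_iter):
            y = w @ Z                                   # current projection
            w_new = (Z * y**3).mean(axis=1) - 3.0 * w   # kurtosis fixed-point step
            w_new -= W[:i].T @ (W[:i] @ w_new)          # stay orthogonal to earlier rows
            w_new /= np.linalg.norm(w_new)
            converged = abs(abs(w_new @ w) - 1.0) < 1e-10   # direction stable up to sign
            w = w_new
            if converged:
                break
        W[i] = w
    return W

# Illustrative run on mixed uniform signals (assumed data).
rng = np.random.default_rng(1)
S = rng.uniform(-1.0, 1.0, size=(2, 5000))
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])
Z = sphere(A @ S)
W_S = kurtosis_ica(Z)
S_S = W_S @ Z            # recovered signals, up to order, sign and scale
```

The rows of `S_S` should match the original signals only up to ordering and sign, which is the identifiability issue discussed next. As the slide advises, several random restarts (different `seed` values) are a sensible guard against local optima.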
ICA Algorithm
Careful look at identifiability:
• Seen already in above example
• Could rotate "square of data" in several ways, etc.
[Figure: Identifiability: Swap, Flips]
ICA Algorithm
Identifiability problem 1:
Generally can't order rows of $S_S$ (& $S$)
(seen as swap above)
Since for a "permutation matrix" $P$
(pre-multiplication by $P$ "swaps rows")
(post-multiplication by $P$ "swaps columns")
For each column, $z = A_S s_S$, i.e. $s_S = A_S^{-1} z$,
i.e. $P s_S = P A_S^{-1} z = P W_S z$
So $P S_S$ and $P W_S$ are also ICA solutions
(i.e. $P S_S = P W_S Z$)
FastICA appears to order in terms of "how non-Gaussian"
ICA Algorithm
Identifiability problem 2:
Can't find scale of elements of $s$
(seen as flips above)
Since for a (full rank) diagonal matrix $D$
(pre-multiplication by $D$ is scalar multiplication of rows)
(post-multiplication by $D$ is scalar multiplication of columns)
Again for each column, $z = A_S s_S$, i.e. $s_S = A_S^{-1} z$,
i.e. $D s_S = D A_S^{-1} z = D W_S z$
So $D S_S$ and $D W_S$ are also ICA solutions
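Both identifiability problems can be checked numerically. The sketch below is illustrative only: the sources, the mixing matrix, and the particular $P$ and $D$ are assumptions, and it works with the mixed data $x$ and $W = A^{-1}$ directly rather than with the sphered $z$ and $W_S$.

```python
import numpy as np

rng = np.random.default_rng(0)
S = rng.uniform(-1.0, 1.0, size=(2, 200))   # hypothetical independent signals
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])                  # assumed mixing matrix
X = A @ S
W = np.linalg.inv(A)                        # one valid set of separating weights

P = np.array([[0.0, 1.0],
              [1.0, 0.0]])                  # permutation matrix: swaps rows
D = np.diag([-1.0, 2.0])                    # full rank diagonal: flips and rescales

S_swapped = (P @ W) @ X                     # rows of S in the other order
S_rescaled = (D @ W) @ X                    # rows of S multiplied by -1 and 2
```

Swapped, flipped, or rescaled rows are still independent of each other, so $P W$ and $D W$ are just as valid as $W$. No independence criterion can tell them apart.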
ICA Algorithm
Signal Processing Scale identification
(Hyvärinen and Oja 1999):
Choose scale so each signal $s_i(t)$
has "unit average energy": $\sum_t s_i(t)^2 = 1$
(preserves energy along rows of data matrix)
Explains "same scales" in
Cocktail Party Example
ICA & Non-Gaussianity
Explore main ICA principle:
For independent, non-Gaussian, standardized r.v.'s $X = (X_1, \ldots, X_d)^t$,
projections farther from coordinate axes are more Gaussian
For the direction vector $u_k = (u_{k,1}, \ldots, u_{k,d})^t$,
where $u_{k,j} = k^{-1/2}$ for $j = 1, \ldots, k$ and $u_{k,j} = 0$ for $j = k+1, \ldots, d$
(thus $\| u_k \| = 1$),
have $u_k^t X \approx N(0, 1)$ in distribution, for large $k$ and $d$
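A short calculation makes this principle concrete. Cumulants add over independent summands and the 4th cumulant scales as $c^4$, so if each $X_j$ has 4th cumulant $\kappa$, then $u_k^t X$ has 4th cumulant $\kappa \sum_j u_{k,j}^4 = \kappa / k$, which shrinks toward the Gaussian value 0 as $k$ grows. The sketch below checks this by simulation for uniform marginals ($\kappa = -1.2$). The helper names and the sample size $n = 20000$ are assumptions, chosen larger than the slides' $n = 500$ to keep sampling noise small.

```python
import numpy as np

def direction(k, d):
    """u_k: first k entries equal k^{-1/2}, remaining entries 0."""
    u = np.zeros(d)
    u[:k] = k ** -0.5
    return u

def fourth_cumulant(y):
    """Sample version of E Y^4 - 3 (E Y^2)^2 for centered data."""
    y = y - y.mean()
    return np.mean(y**4) - 3.0 * np.mean(y**2) ** 2

d, n = 100, 20000                                  # n is an assumed value
rng = np.random.default_rng(0)
# Independent standardized uniform marginals (mean 0, variance 1).
X = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(d, n))

# 4th cumulant of the projection u_k^t X for several k; theory gives -1.2 / k.
cumulants = {k: fourth_cumulant(direction(k, d) @ X) for k in (1, 10, 100)}
```

At $k = 1$ the projection is a single coordinate and is as non-Gaussian as the marginal. At $k = d$ the direction is as far as possible from every axis, and the projection is close to Gaussian.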
ICA & Non-Gaussianity
Illustrative examples ($d = 100$, $n = 500$):
a. Uniform Marginals
b. Exponential Marginals
c. Bimodal Marginals
Study over range of values of $k$
[Figure: Illustrative example - Uniform marginals]
HDLSS Data Combo Mathematics
HDLSS Data Combo Mathematics
Asymptotic Results (as )
Let denote ratio between subgroup sizes
d
r
HDLSS Data Combo Mathematics
Asymptotic Results (as )
For PAM Inconsistent
Angle(PAMTruth)
For PAM Strongly Inconsistent
Angle(PAMTruth)
d
2
1
2
1
0 rC
090
HDLSS Data Combo Mathematics
Asymptotic Results (as )
For DWD Inconsistent
Angle(DWDTruth)
For DWD Strongly Inconsistent
Angle(DWDTruth)
d
2
1
2
1
090
0 rC
HDLSS Data Combo Mathematics
Value of and for sample size ratio
only when
Otherwise for both are Inconsistent
rC
22
1cos
2
1
r
rCr
0 rr CC
r
1r
1r
rC
22
1cos
32
31
r
rCr
HDLSS Data Combo Mathematics
Comparison between PAM and DWD
Ie between and rCrC
HDLSS Data Combo Mathematics
Comparison between PAM and DWD
HDLSS Data Combo Mathematics
Comparison between PAM and DWD
Ie between and
Shows Strong Difference
Explains Above Empirical Observation
rCrC
Personal Observations
HDLSS world ishellip
HDLSS Asymptotics
Personal Observations
HDLSS world ishellip
Surprising (many times)
[Think Irsquove got it and then hellip]
HDLSS Asymptotics
Personal Observations
HDLSS world ishellip
Surprising (many times)
[Think Irsquove got it and then hellip]
Mathematically Beautiful ()
HDLSS Asymptotics
Personal Observations
HDLSS world ishellip
Surprising (many times)
[Think Irsquove got it and then hellip]
Mathematically Beautiful ()
Practically Relevant
HDLSS Asymptotics
The Future of HDLSS Asymptotics
bull ldquoContiguityrdquo in Hypo Testing
bull Rates of Convergence
bull Improvements of DWD
(eg other functions of distance than inverse)
bull Many Others
It is still early days hellip
State of HDLSS Research
DevelopmentOf Methods
MathematicalAssessment
hellip
(thanks todefiantcorbanedugtiptonnet-funiceberghtml)
Independent Component Analysis
Personal Viewpoint
Directions
(eg PCA DWD)
Independent Component Analysis
Personal Viewpoint
Directions that maximize independence
Independent Component Analysis
Personal Viewpoint
Directions that maximize independence
Motivating Context Signal Processing
ldquoBlind Source Separationrdquo
Independent Component Analysis
References
bull Cardoso (1993) (1st paper)
bull Lee (1998) (early book not reccorsquoed)
bull Hyvaumlrinen amp Oja (1998)
(excellent short tutorial)
bull Hyvaumlrinen Karhunen amp Oja (2001)
(detailed monograph)
ICA Motivating Example
ldquoCocktail party problemrdquo
ICA Motivating Example
ldquoCocktail party problemrdquo
bull Hear several simultaneous conversations
bull Would like to
separate them
ICA Motivating Example
ldquoCocktail party problemrdquo
bull Hear several simultaneous conversations
bull Would like to
separate them
Model for ldquoconversationsrdquo time series
and ts1 ts2
ICA Motivating Example
Model for ldquoconversationsrdquo time series
ICA Motivating Example
Mixed version of signals
And also a 2nd mixture
tsatsatx 2121111
tsatsatx 2221212
ICA Motivating Example
Mixed version of signals
ICA Motivating Example
Goal Recover signal
From data
For unknown mixture matrix
where for all
tx
txtx
2
1
ts
tsts
2
1
2221
1211
aa
aaA
sAx t
ICA Motivating Example
Goal is to find separating weights
so that for allxWs
W
t
ICA Motivating Example
Goal is to find separating weights
so that for all
Problem would be fine
but is unknown
xWs
W
1AW
t
A
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
= matrix of eigenvectorsW
ICA Motivating Example
1 PCA (on population of 2-d vectors)
Maximal
Variance
Minimal
Variance
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
[direction of maximal variation
doesnrsquot solve this problem]
2ICA (will describe method later)
ICA Motivating Example
2 ICA (will describe method later)
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
[direction of maximal variation
doesnrsquot solve this problem]
2ICA (will describe method later)
[Independent Components do solve it]
[modulo sign changes and identification]
ICA Motivating Example
Recall original time series
ICA Motivating ExampleRelation to OODA recall data matrix
dnd
n
n
XX
XX
XXX
1
111
1
ICA Motivating ExampleRelation to OODA recall data matrix
Signal Processing focus on rows
( time series indexed by )
OODA focus on columns as data objects
( data vectors)
dnd
n
n
XX
XX
XXX
1
111
1
d nt 1
n
ICA Motivating ExampleRelation to OODA recall data matrix
Signal Processing focus on rows
( time series indexed by )
OODA focus on columns as data objects
( data vectors)
Note 2 viewpoints like ldquodualsrdquo for PCA
dnd
n
n
XX
XX
XXX
1
111
1
d nt 1
n
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals nttsts 1)()( 21
ICA Motivating Example
Study Signals nttsts 1)()( 21
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals
bull Corresponding Scatterplot
nttsts 1)()( 21
ICA Motivating Example
Signals - Corresponding Scatterplot
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals
bull Corresponding Scatterplot
bull Data
nttsts 1)()( 21
nttxtx 1)()( 21
ICA Motivating Example
Study Data nttxtx 1)()( 21
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals
bull Corresponding Scatterplot
bull Data
bull Corresponding Scatterplot
nttsts 1)()( 21
nttxtx 1)()( 21
ICA Motivating Example
Data - Corresponding Scatterplot
ICA Motivating Example
Scatterplot View (signal processing) Plot
bull Signals amp Scatrsquoplot
bull Data amp Scatrsquoplot
bull Scatterplots give hint how ICA is possible
bull Affine transformation
Stretches indeprsquot signals into deprsquot
bull Inversion is key to ICA
(even w unknown)
nttsts 1)()( 21
nttxtx 1)()( 21
A
sAx
ICA Motivating Example
Signals - Corresponding Scatterplot
Note Independent
Since Known Value
Of s1 Does Not
Change Distribution
of s2
ICA Motivating Example
Data - Corresponding Scatterplot
Note Dependent
Since Known Value
Of s1 Changes
Distribution of s2
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
ICA Motivating Example
PCA - Finds direction of greatest variation
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
Which is wrong for signal separation
ICA Motivating Example
PCA - Wrong for signal separation
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
Which is wrong for signal separation
ICA AlgorithmICA Step 1bull sphere the data
(shown on right in scatterplot view)
ICA AlgorithmICA Step 1 sphere the data
ICA AlgorithmICA Step 1bull ldquosphere the datardquo
(shown on right in scatterplot view)bull ie find linear transfrsquon to make
mean = cov = bull ie work with
0 I ˆ 21 XZ
ICA AlgorithmICA Step 1bull ldquosphere the datardquo
(shown on right in scatterplot view)bull ie find linear transfrsquon to make
mean = cov = bull ie work withbull requires of full rank
(at least ie no HDLSS)
0 I ˆ 21 XZ
dnX
ICA AlgorithmICA Step 1bull ldquosphere the datardquo
(shown on right in scatterplot view)bull ie find linear transfrsquon to make
mean = cov = bull ie work withbull requires of full rank
(at least ie no HDLSS)bull search for independent beyond
linear and quadratic structure
0 I ˆ 21 XZ
dnX
ICA AlgorithmICA Step 2bull Find dirrsquons that make (sphered) data
independent as possible
ICA AlgorithmICA Step 2bull Find dirrsquons that make (sphered) data
independent as possiblebull Recall ldquoindependencerdquo means
joint distribution is product of marginalsbull In cocktail party example
ICA AlgorithmICA Step 2 Cocktail party example
ICA AlgorithmICA Step 2bull Find dirrsquons that make (sphered) data
independent as possiblebull Recall ldquoindependencerdquo means
joint distribution is product of marginalsbull In cocktail party example
ndash Happens only when support parallel to axesndash Otherwise have blank areas
but marginals are non-zero
ICA AlgorithmParallel Idea (and key to algorithm) bull Find directions that maximize
non-Gaussianity
ICA AlgorithmParallel Idea (and key to algorithm) bull Find directions that maximize
non-Gaussianitybull Based on assumption starting from
independent coordinatesbull Note ldquomostrdquo projections are Gaussian
(since projection is ldquolinear combordquo)
ICA AlgorithmParallel Idea (and key to algorithm) bull Find directions that maximize
non-Gaussianitybull Based on assumption starting from
independent coordinatesbull Note ldquomostrdquo projections are Gaussian
(since projection is ldquolinear combordquo)bull Mathematics behind this
Diaconis and Freedman (1984)
ICA Algorithm
Worst case for ICA Gaussian
bull Then sphered data are independent
bull So have independence in all (ortho) dirrsquons
bull Thus canrsquot find useful directions
bull Gaussian distribution is characterized by
Independent amp spherically symmetric
ICA AlgorithmCriteria for non-Gaussianity
independence
bull Kurtosis
(4th order cumulant)bull Negative Entropybull Mutual Informationbull Nonparametric Maximum Likelihoodbull ldquoInfomaxrdquo in Neural Networksbull Interesting connections between these
224 3 EXEX
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iteratively
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iterativelybull Or by simultaneous optimization
(note PCA does both but not ICA)
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iterativelybull Or by simultaneous optimization
(note PCA does both but not ICA)bull Appears fast with good defaultsbull Careful about local optima
(Recco several random restarts)
ICA AlgorithmFastICA Notational Summary
1First sphere data
2Apply ICA Find to make
rows of ldquoindeprsquotrdquo
3Can transform back to
original data scale
ˆ 21 XZ
SW
ZWS SS
SSS 21ˆˆ
ICA AlgorithmCareful look at identifiability
bull Seen already in above example
bull Could rotate ldquosquare of datardquo in several ways etc
ICA AlgorithmCareful look at identifiability
ICA AlgorithmIdentifiability Swap Flips
ICA Algorithm
Identifiability problem 1: Generally can't order rows of $S_S$ (& $S$) (seen as swap above)
Since for a "permutation matrix" $P$
(pre-multiplication by $P$ "swaps rows")
(post-multiplication by $P$ "swaps columns")
For each column, $z = A_S s_S$, i.e. $s_S = A_S^{-1} z$, i.e. $P s_S = P A_S^{-1} z = P W_S z$
So $P S_S$ and $P W_S$ are also ICA solut'ns (i.e. $P S_S = P W_S Z$)
FastICA appears to order in terms of "how non-Gaussian"
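The swap ambiguity can be checked numerically: pre-multiplying the separating matrix by a permutation matrix just reorders the recovered rows. A small sketch, where the matrices `W`, `Z` and the helper `matmul` are made-up illustrations rather than anything from a real data set:

```python
def matmul(a, b):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

W = [[2.0, -1.0], [0.5, 3.0]]           # hypothetical separating matrix W_S
Z = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # hypothetical sphered data, one column per time point
P = [[0.0, 1.0], [1.0, 0.0]]            # permutation matrix that swaps the two rows

S = matmul(W, Z)                        # S_S = W_S Z
S_swapped = matmul(matmul(P, W), Z)     # (P W_S) Z

print(S)
print(S_swapped)  # same rows as S, in swapped order
```

Any criterion that looks only at the independence of the rows gives the same answer for `S` and `S_swapped`, so the order cannot be recovered from the data.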
ICA Algorithm
Identifiability problem 2: Can't find scale of elements of $s$ (seen as flips above)
Since for a (full rank) diagonal matrix $D$
(pre-multipl'n by $D$ is scalar mult'n of rows)
(post-multipl'n by $D$ is scalar mult'n of col's)
Again for each col'n, $s_S = A_S^{-1} z$, i.e. $D s_S = D A_S^{-1} z = D W_S z$
So $D S_S$ and $D W_S$ are also ICA solutions
ICA Algorithm
Signal Processing Scale identification (Hyvärinen and Oja 1999):
Choose scale so each signal $s_i(t)$ has "unit average energy": $\sum_t s_i(t)^2 = 1$
(preserves energy along rows of data matrix)
Explains "same scales" in Cocktail Party Example
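A minimal sketch of the scale convention. It reads "unit average energy" as mean square equal to 1, which is an assumption made here; normalising the sum of squares instead only changes a constant factor. The function name and the numbers are illustrative.

```python
import math

def unit_energy(row):
    """Rescale a signal so that its mean square (average energy) equals 1."""
    energy = sum(v * v for v in row) / len(row)
    return [v / math.sqrt(energy) for v in row]

s = [0.5, -1.0, 2.0, -0.5]             # a made-up signal
scaled = [7.0 * v for v in s]          # same signal at an arbitrary positive scale
flipped = [-v for v in s]              # same signal with its sign flipped

a = unit_energy(s)
b = unit_energy(scaled)
c = unit_energy(flipped)
print(a)
print(b)  # agrees with a: the positive scale factor has been removed
print(c)  # equals -a: the sign flip is still not identified
```

The convention fixes the size of each diagonal entry of $D$ but not its sign, which matches the "flips" seen in the example.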
ICA & Non-Gaussianity
Explore main ICA principle:
For indep., non-Gaussian, standardized r.v.'s $X = (X_1, \dots, X_d)^t$,
projections farther from coordinate axes are more Gaussian
For the dir'n vector $u_k = (u_{k,1}, \dots, u_{k,d})^t$,
where $u_{k,j} = k^{-1/2}$ for $j = 1, \dots, k$ and $u_{k,j} = 0$ for $j = k+1, \dots, d$
(thus $\|u_k\| = 1$),
have $u_k^t X \approx N(0,1)$ in distribution, for large $d$ and $k$
ICA & Non-Gaussianity
Illustrative examples (d = 100, n = 500):
a. Uniform Marginals
b. Exponential Marginals
c. Bimodal Marginals
Study over range of values of k
ICA & Non-Gaussianity
Illustrative example - Uniform marginals
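The uniform case can be simulated directly. The sketch below draws standardized uniform coordinates with d = 100 and n = 500 and projects onto a direction with $k$ equal entries $k^{-1/2}$ followed by zeros. The seed, the chosen values of $k$ and the function names are assumptions. Because cumulants of independent terms add, the theoretical fourth cumulant of the projection is $-1.2/k$, so it shrinks toward the Gaussian value 0 as $k$ grows.

```python
import math
import random

def fourth_cumulant(ys):
    """Sample version of E Y^4 - 3 (E Y^2)^2 after centring."""
    n = len(ys)
    mean = sum(ys) / n
    m2 = sum((y - mean) ** 2 for y in ys) / n
    m4 = sum((y - mean) ** 4 for y in ys) / n
    return m4 - 3.0 * m2 ** 2

rng = random.Random(1)
d, n = 100, 500
half_width = math.sqrt(3.0)  # Uniform(-sqrt 3, sqrt 3) has mean 0 and variance 1
data = [[rng.uniform(-half_width, half_width) for _ in range(d)] for _ in range(n)]

def project(x, k):
    """u_k' x, where u_k has k entries equal to k^(-1/2) followed by zeros."""
    return sum(x[:k]) / math.sqrt(k)

results = {k: fourth_cumulant([project(x, k) for x in data]) for k in (1, 4, 25, 100)}
for k, value in results.items():
    print(k, round(value, 3), "theory:", round(-1.2 / k, 3))
```

With only n = 500 points the estimates are noisy, so for large $k$ the printed values scatter around 0 rather than matching the theory closely.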
HDLSS Data Combo Mathematics
Asymptotic Results (as $d \to \infty$):
Let $r$ denote ratio between subgroup sizes
• For … 1/2, PAM Inconsistent: Angle(PAM, Truth) $\to C_r^{\mathrm{PAM}} \neq 0$
• For … 1/2, PAM Strongly Inconsistent: Angle(PAM, Truth) $\to 90^\circ$
• For … 1/2, DWD Inconsistent: Angle(DWD, Truth) $\to C_r^{\mathrm{DWD}} \neq 0$
• For … 1/2, DWD Strongly Inconsistent: Angle(DWD, Truth) $\to 90^\circ$
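HDLSS Data Combo Mathematics
Value of $C_r^{\mathrm{PAM}}$ and $C_r^{\mathrm{DWD}}$, for sample size ratio $r$:
$C_r^{\mathrm{PAM}} = \cos^{-1}(\dots)$, $C_r^{\mathrm{DWD}} = \cos^{-1}(\dots)$
$C_r^{\mathrm{PAM}} = C_r^{\mathrm{DWD}} = 0$ only when $r = 1$
Otherwise, for $r \neq 1$, both are Inconsistent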
HDLSS Data Combo Mathematics
Comparison between PAM and DWD, i.e. between $C_r^{\mathrm{PAM}}$ and $C_r^{\mathrm{DWD}}$:
• Shows Strong Difference
• Explains Above Empirical Observation
HDLSS Asymptotics
Personal Observations: HDLSS world is…
• Surprising (many times) [Think I've got it, and then …]
• Mathematically Beautiful ()
• Practically Relevant
The Future of HDLSS Asymptotics
• "Contiguity" in Hypo Testing
• Rates of Convergence
• Improvements of DWD (e.g. other functions of distance than inverse)
• Many Others
It is still early days …
State of HDLSS Research
Development Of Methods
Mathematical Assessment
…
(thanks to: defiant.corban.edu/gtipton/net-fun/iceberg.html)
Independent Component Analysis
Personal Viewpoint: Directions (e.g. PCA, DWD)
Directions that maximize independence
Motivating Context: Signal Processing, "Blind Source Separation"
Independent Component Analysis
References:
• Cardoso (1993) (1st paper)
• Lee (1998) (early book, not recommended)
• Hyvärinen & Oja (1998) (excellent short tutorial)
• Hyvärinen, Karhunen & Oja (2001) (detailed monograph)
ICA Motivating Example
"Cocktail party problem":
• Hear several simultaneous conversations
• Would like to separate them
Model for "conversations": time series $s_1(t)$ and $s_2(t)$
ICA Motivating Example
Model for "conversations": time series
ICA Motivating Example
Mixed version of signals: $x_1(t) = a_{11} s_1(t) + a_{12} s_2(t)$
And also a 2nd mixture: $x_2(t) = a_{21} s_1(t) + a_{22} s_2(t)$
ICA Motivating Example
Mixed version of signals
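ICA Motivating Example
Goal: Recover signal $s(t) = \begin{pmatrix} s_1(t) \\ s_2(t) \end{pmatrix}$
From data $x(t) = \begin{pmatrix} x_1(t) \\ x_2(t) \end{pmatrix}$
For unknown mixture matrix $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$
where $x = A s$, for all $t$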
ICA Motivating Example
Goal is to find separating weights $W$, so that $s = W x$, for all $t$
Problem: $W = A^{-1}$ would be fine, but $A$ is unknown
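To make the mixing model concrete, the sketch below builds two made-up signals, mixes them with a hypothetical matrix `A`, and recovers them with $W = A^{-1}$. This only works here because `A` is known in the simulation. The signal shapes, the entries of `A` and the helper names are all assumptions for illustration.

```python
import math

A = [[1.0, 0.5], [0.3, 1.0]]  # hypothetical mixing matrix (unknown in practice)

def apply(m, v):
    """Matrix-vector product for a 2x2 matrix m and a pair v."""
    return (m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1])

def inverse_2x2(m):
    """Inverse of a 2x2 matrix with non-zero determinant."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

n = 200
# Two made-up "conversations": a sine wave and a sawtooth.
s = [(math.sin(t / 5.0), (t % 20) / 10.0 - 1.0) for t in range(1, n + 1)]
x = [apply(A, st) for st in s]            # observed mixtures x(t) = A s(t)

W = inverse_2x2(A)                        # separating weights, available only because A is known
recovered = [apply(W, xt) for xt in x]    # s(t) = W x(t)

max_error = max(abs(r[i] - st[i]) for r, st in zip(recovered, s) for i in (0, 1))
print(max_error)  # tiny: limited only by floating-point rounding
```

ICA has to find a usable $W$ from `x` alone, without access to `A` or `s`.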
ICA Motivating Example
Solutions for Cocktail Party Problem:
1. PCA (on population of 2-d vectors): $W$ = matrix of eigenvectors
   (figure panels: Maximal Variance, Minimal Variance)
   [direction of maximal variation doesn't solve this problem]
2. ICA (will describe method later)
   [Independent Components do solve it]
   [modulo sign changes and identification]
ICA Motivating Example
Recall original time series
ICA Motivating Example
Relation to OODA: recall data matrix
$X = \begin{pmatrix} X_1 & \cdots & X_n \end{pmatrix} = \begin{pmatrix} X_{11} & \cdots & X_{1n} \\ \vdots & \ddots & \vdots \\ X_{d1} & \cdots & X_{dn} \end{pmatrix}$
Signal Processing: focus on rows ($d$ time series, indexed by $t = 1, \dots, n$)
OODA: focus on columns as data objects ($n$ data vectors)
Note: 2 viewpoints like "duals" for PCA
ICA Motivating Example
Scatterplot View (signal processing): Study
• Signals $\{ s_1(t), s_2(t) : t = 1, \dots, n \}$
• Corresponding Scatterplot
• Data $\{ x_1(t), x_2(t) : t = 1, \dots, n \}$
• Corresponding Scatterplot
ICA Motivating Example
Study Signals $\{ s_1(t), s_2(t) : t = 1, \dots, n \}$
ICA Motivating Example
Signals - Corresponding Scatterplot
ICA Motivating Example
Study Data $\{ x_1(t), x_2(t) : t = 1, \dots, n \}$
ICA Motivating Example
Data - Corresponding Scatterplot
ICA Motivating Example
Scatterplot View (signal processing): Plot
• Signals & Scat'plot: $\{ s_1(t), s_2(t) : t = 1, \dots, n \}$
• Data & Scat'plot: $\{ x_1(t), x_2(t) : t = 1, \dots, n \}$
• Scatterplots give hint how ICA is possible
• Affine transformation $x = A s$ stretches indep't signals into dep't
• "Inversion" is key to ICA (even w/ $A$ unknown)
ICA Motivating Example
Signals - Corresponding Scatterplot
Note: Independent, since known value of $s_1$ does not change distribution of $s_2$
ICA Motivating Example
Data - Corresponding Scatterplot
Note: Dependent, since known value of $x_1$ changes distribution of $x_2$
ICA Motivating Example
Why not PCA?
• Finds direction of greatest variation
• Which is wrong for signal separation
ICA Motivating Example
PCA - Finds direction of greatest variation
ICA Motivating Example
PCA - Wrong for signal separation
ICA Algorithm
ICA Step 1:
• "sphere the data" (shown on right in scatterplot view)
• i.e. find linear transf'n to make mean = $0$, cov = $I$
• i.e. work with $Z = \hat\Sigma^{-1/2} X$
• requires $X$ of full rank (at least $n \ge d$, i.e. no HDLSS)
• search for independence, beyond linear and quadratic structure
ICA Algorithm
ICA Step 1: sphere the data
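Sphering can be sketched for the 2-d case with nothing but the sample mean and covariance. The code below centres the data explicitly before applying $\hat\Sigma^{-1/2}$, so that mean 0 holds; that centring step, the closed-form 2x2 eigen-decomposition, the mixture coefficients and the function name `sphere_2d` are choices made for this illustration.

```python
import math
import random

def sphere_2d(points):
    """Return Sigma^(-1/2) (x - mean) for 2-d points, using the sample covariance."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    centred = [(p[0] - mx, p[1] - my) for p in points]
    a = sum(u * u for u, _ in centred) / n
    b = sum(u * v for u, v in centred) / n
    c = sum(v * v for _, v in centred) / n
    # Eigen-decomposition of the symmetric matrix [[a, b], [b, c]] via a rotation angle.
    phi = 0.5 * math.atan2(2.0 * b, a - c)
    co, si = math.cos(phi), math.sin(phi)
    l1 = a * co * co + 2.0 * b * co * si + c * si * si
    l2 = a * si * si - 2.0 * b * co * si + c * co * co
    # Needs full rank: both eigenvalues must be positive.
    r1, r2 = 1.0 / math.sqrt(l1), 1.0 / math.sqrt(l2)
    m11 = r1 * co * co + r2 * si * si
    m12 = (r1 - r2) * co * si
    m22 = r1 * si * si + r2 * co * co
    return [(m11 * u + m12 * v, m12 * u + m22 * v) for u, v in centred]

rng = random.Random(2)
x = []
for _ in range(1000):
    s1, s2 = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
    # Hypothetical mixture of two independent uniform signals, plus an offset.
    x.append((1.0 * s1 + 0.5 * s2 + 3.0, 0.3 * s1 + 1.0 * s2 - 1.0))

z = sphere_2d(x)
n = len(z)
print(sum(p[0] for p in z) / n, sum(p[1] for p in z) / n)  # sample means, both near 0
print(sum(p[0] * p[1] for p in z) / n)                     # sample covariance, near 0
```

After this step only a rotation remains to be found, which is what Step 2 searches for.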
ICA Algorithm
ICA Step 2:
• Find dir'ns that make (sphered) data as independent as possible
• Recall "independence" means: joint distribution is product of marginals
• In cocktail party example:
  – Happens only when support parallel to axes
  – Otherwise have blank areas, but marginals are non-zero
ICA Algorithm
ICA Step 2: Cocktail party example
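The "blank areas" remark can be illustrated with a uniform square and a copy of it rotated by 45 degrees. For the axis-parallel square, the fraction of points in a corner region is close to the product of the two marginal fractions. For the rotated square, the corner region is empty while both marginals still put mass there. The seed, sample size, cut-offs and the helper name `corner_check` are assumptions for this sketch.

```python
import math
import random

rng = random.Random(3)
n = 2000
# Independent coordinates: uniform on the square [-1, 1] x [-1, 1].
s = [(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)) for _ in range(n)]
r = 1.0 / math.sqrt(2.0)
# Rotation by 45 degrees: the support becomes a diamond, not parallel to the axes.
x = [(r * (a - b), r * (a + b)) for a, b in s]

def corner_check(points, cut):
    """Joint fraction with both coordinates above cut, and the product of marginal fractions."""
    total = len(points)
    joint = sum(1 for u, v in points if u > cut and v > cut) / total
    p_u = sum(1 for u, _ in points if u > cut) / total
    p_v = sum(1 for _, v in points if v > cut) / total
    return joint, p_u * p_v

print(corner_check(s, 0.5))  # joint fraction and product of marginals are close
print(corner_check(x, 1.0))  # joint fraction is 0 (blank corner), product is positive
```

A gap between the two numbers signals dependence, even though the rotated coordinates are uncorrelated.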
ICA Algorithm
Parallel Idea (and key to algorithm):
• Find directions that maximize non-Gaussianity
• Based on assumption: starting from independent coordinates
• Note: "most" projections are Gaussian (since projection is "linear combo")
• Mathematics behind this: Diaconis and Freedman (1984)
ICA Algorithm
Worst case for ICA Gaussian
bull Then sphered data are independent
bull So have independence in all (ortho) dirrsquons
bull Thus canrsquot find useful directions
bull Gaussian distribution is characterized by
Independent amp spherically symmetric
ICA AlgorithmCriteria for non-Gaussianity
independence
bull Kurtosis
(4th order cumulant)bull Negative Entropybull Mutual Informationbull Nonparametric Maximum Likelihoodbull ldquoInfomaxrdquo in Neural Networksbull Interesting connections between these
224 3 EXEX
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iteratively
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iterativelybull Or by simultaneous optimization
(note PCA does both but not ICA)
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iterativelybull Or by simultaneous optimization
(note PCA does both but not ICA)bull Appears fast with good defaultsbull Careful about local optima
(Recco several random restarts)
ICA AlgorithmFastICA Notational Summary
1First sphere data
2Apply ICA Find to make
rows of ldquoindeprsquotrdquo
3Can transform back to
original data scale
ˆ 21 XZ
SW
ZWS SS
SSS 21ˆˆ
ICA AlgorithmCareful look at identifiability
bull Seen already in above example
bull Could rotate ldquosquare of datardquo in several ways etc
ICA AlgorithmCareful look at identifiability
ICA AlgorithmIdentifiability Swap Flips
ICA AlgorithmIdentifiability problem 1
Generally canrsquot order rows of (amp )
(seen as swap above)
Since for a ldquopermutation matrixrdquo
(pre-multiplication by ldquoswaps rowsrdquo)
(post-multiplication by ldquoswaps columnsrdquo)
For each column ie
ie
SS S
P
PP
SS sAz
zPWzPAsP SSS 1
zAs SS
1
ICA AlgorithmIdentifiability problem 1
Generally canrsquot order rows of (amp )
Since for a ldquopermutation matrixrdquo
For each column
So and are also ICA solutrsquons
(ie )
FastICA appears to order in terms of ldquohow non-Gaussianrdquo
SS S
PzPWzPAsP SSS
1
SPS SPW
ZPWPS SS
ICA AlgorithmIdentifiability problem 2
Canrsquot find scale of elements of
(seen as flips above)
Since for a (full rank) diagonal matrix
(pre-multiplrsquon by is scalar multrsquon of rows)
(post-multiplrsquon by is scalar multrsquon of colrsquos)
Again for each colrsquon
ie
So and are also ICA solutions
D
s
SDW
D
D
SDSzDWzDAsD SSs
1
zAs SS
1
ICA AlgorithmSignal Processing Scale identification
(Hyvaumlrinen and Oja 1999)
Choose scale so each signal
has ldquounit average energyrdquo
(preserves energy along rows of data matrix)
Explains ldquosame scalesrdquo in
Cocktail Party Example
)(tsi1)( 2
ti ts
ICA amp Non-Gaussianity
Explore main ICA principle
For indep non-Gaussian
standardized rvrsquos
Projections farther from coordinate axes
are more Gaussian
dX
X
X 1
ICA amp Non-Gaussianity
Explore main ICA principle
Projections farther from coordinate axes
are more Gaussian
For the dirrsquon vector
where
(thus )
have for large and
kd
k
k
u
u
u
1
dkj
kjku ki
10
121
)10(NXud
t
k k d
1ku
ICA amp Non-Gaussianity
Illustrative examples (d = 100 n = 500)
aUniform Marginals
bExponential Marginals
cBimodal Marginals
Study over range of values of k
ICA amp Non-Gaussianity
Illustrative example - Uniform marginals
HDLSS Data Combo Mathematics
Asymptotic Results (as )
For DWD Inconsistent
Angle(DWDTruth)
For DWD Strongly Inconsistent
Angle(DWDTruth)
d
2
1
2
1
090
0 rC
HDLSS Data Combo Mathematics
Value of and for sample size ratio
only when
Otherwise for both are Inconsistent
rC
22
1cos
2
1
r
rCr
0 rr CC
r
1r
1r
rC
22
1cos
32
31
r
rCr
HDLSS Data Combo Mathematics
Comparison between PAM and DWD
Ie between and rCrC
HDLSS Data Combo Mathematics
Comparison between PAM and DWD
HDLSS Data Combo Mathematics
Comparison between PAM and DWD
Ie between and
Shows Strong Difference
Explains Above Empirical Observation
rCrC
Personal Observations
HDLSS world ishellip
HDLSS Asymptotics
Personal Observations
HDLSS world ishellip
Surprising (many times)
[Think Irsquove got it and then hellip]
HDLSS Asymptotics
Personal Observations
HDLSS world ishellip
Surprising (many times)
[Think Irsquove got it and then hellip]
Mathematically Beautiful ()
HDLSS Asymptotics
Personal Observations
HDLSS world ishellip
Surprising (many times)
[Think Irsquove got it and then hellip]
Mathematically Beautiful ()
Practically Relevant
HDLSS Asymptotics
The Future of HDLSS Asymptotics
bull ldquoContiguityrdquo in Hypo Testing
bull Rates of Convergence
bull Improvements of DWD
(eg other functions of distance than inverse)
bull Many Others
It is still early days hellip
State of HDLSS Research
DevelopmentOf Methods
MathematicalAssessment
hellip
(thanks todefiantcorbanedugtiptonnet-funiceberghtml)
Independent Component Analysis
Personal Viewpoint
Directions
(eg PCA DWD)
Independent Component Analysis
Personal Viewpoint
Directions that maximize independence
Independent Component Analysis
Personal Viewpoint
Directions that maximize independence
Motivating Context Signal Processing
ldquoBlind Source Separationrdquo
Independent Component Analysis
References
bull Cardoso (1993) (1st paper)
bull Lee (1998) (early book not reccorsquoed)
bull Hyvaumlrinen amp Oja (1998)
(excellent short tutorial)
bull Hyvaumlrinen Karhunen amp Oja (2001)
(detailed monograph)
ICA Motivating Example
ldquoCocktail party problemrdquo
ICA Motivating Example
ldquoCocktail party problemrdquo
bull Hear several simultaneous conversations
bull Would like to
separate them
ICA Motivating Example
ldquoCocktail party problemrdquo
bull Hear several simultaneous conversations
bull Would like to
separate them
Model for ldquoconversationsrdquo time series
and ts1 ts2
ICA Motivating Example
Model for ldquoconversationsrdquo time series
ICA Motivating Example
Mixed version of signals
And also a 2nd mixture
tsatsatx 2121111
tsatsatx 2221212
ICA Motivating Example
Mixed version of signals
ICA Motivating Example
Goal Recover signal
From data
For unknown mixture matrix
where for all
tx
txtx
2
1
ts
tsts
2
1
2221
1211
aa
aaA
sAx t
ICA Motivating Example
Goal is to find separating weights
so that for allxWs
W
t
ICA Motivating Example
Goal is to find separating weights
so that for all
Problem would be fine
but is unknown
xWs
W
1AW
t
A
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
= matrix of eigenvectorsW
ICA Motivating Example
1 PCA (on population of 2-d vectors)
Maximal
Variance
Minimal
Variance
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
[direction of maximal variation
doesnrsquot solve this problem]
2ICA (will describe method later)
ICA Motivating Example
2 ICA (will describe method later)
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
[direction of maximal variation
doesnrsquot solve this problem]
2ICA (will describe method later)
[Independent Components do solve it]
[modulo sign changes and identification]
ICA Motivating Example
Recall original time series
ICA Motivating ExampleRelation to OODA recall data matrix
dnd
n
n
XX
XX
XXX
1
111
1
ICA Motivating ExampleRelation to OODA recall data matrix
Signal Processing focus on rows
( time series indexed by )
OODA focus on columns as data objects
( data vectors)
dnd
n
n
XX
XX
XXX
1
111
1
d nt 1
n
ICA Motivating ExampleRelation to OODA recall data matrix
Signal Processing focus on rows
( time series indexed by )
OODA focus on columns as data objects
( data vectors)
Note 2 viewpoints like ldquodualsrdquo for PCA
dnd
n
n
XX
XX
XXX
1
111
1
d nt 1
n
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals nttsts 1)()( 21
ICA Motivating Example
Study Signals nttsts 1)()( 21
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals
bull Corresponding Scatterplot
nttsts 1)()( 21
ICA Motivating Example
Signals - Corresponding Scatterplot
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals
bull Corresponding Scatterplot
bull Data
nttsts 1)()( 21
nttxtx 1)()( 21
ICA Motivating Example
Study Data nttxtx 1)()( 21
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals
bull Corresponding Scatterplot
bull Data
bull Corresponding Scatterplot
nttsts 1)()( 21
nttxtx 1)()( 21
ICA Motivating Example
Data - Corresponding Scatterplot
ICA Motivating Example
Scatterplot View (signal processing) Plot
bull Signals amp Scatrsquoplot
bull Data amp Scatrsquoplot
bull Scatterplots give hint how ICA is possible
bull Affine transformation
Stretches indeprsquot signals into deprsquot
bull Inversion is key to ICA
(even w unknown)
nttsts 1)()( 21
nttxtx 1)()( 21
A
sAx
ICA Motivating Example
Signals - Corresponding Scatterplot
Note Independent
Since Known Value
Of s1 Does Not
Change Distribution
of s2
ICA Motivating Example
Data - Corresponding Scatterplot
Note Dependent
Since Known Value
Of s1 Changes
Distribution of s2
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
ICA Motivating Example
PCA - Finds direction of greatest variation
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
Which is wrong for signal separation
ICA Motivating Example
PCA - Wrong for signal separation
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
Which is wrong for signal separation
ICA Algorithm
ICA Step 1:
• "Sphere the data" (shown on right in scatterplot view)
• i.e. find linear transformation to make mean = $0$, cov = $I$
• i.e. work with $Z = \hat{\Sigma}^{-1/2} (X - \bar{X})$
• requires $X$ ($d \times n$) of full rank (at least $n \ge d$, i.e. no HDLSS)
• then search for independence beyond linear and quadratic structure
ICA Algorithm
ICA Step 1: sphere the data
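The sphering step can be sketched in a few lines of NumPy. This is only an illustration under assumptions: the function name `sphere`, the rank tolerance, and the toy mixing matrix are hypothetical and not from the slides, and the symmetric inverse square root of the sample covariance is one of several valid choices.

```python
import numpy as np

def sphere(X):
    """Sphere a d x n data matrix X (columns are data vectors).

    Returns Z with sample mean 0 and sample covariance I, plus the
    matrix Sigma_hat^{-1/2} that was applied to the centered data.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=1, keepdims=True)           # subtract the mean vector
    Sigma_hat = np.atleast_2d(np.cov(Xc))            # d x d sample covariance
    evals, evecs = np.linalg.eigh(Sigma_hat)
    # Assumed tolerance: refuse (near-)singular covariance, e.g. HDLSS data
    if evals.min() <= 1e-12 * evals.max():
        raise ValueError("sample covariance is not full rank (e.g. HDLSS)")
    Sigma_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    return Sigma_inv_sqrt @ Xc, Sigma_inv_sqrt

# Toy example: independent uniform signals, mixed by a hypothetical matrix A
rng = np.random.default_rng(0)
A = np.array([[2.0, 1.0], [1.0, 1.5]])
X = A @ rng.uniform(-1.0, 1.0, size=(2, 1000))
Z, _ = sphere(X)
```

After this step only a rotation of `Z` remains to be found, which is what ICA Step 2 searches for.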
ICA Algorithm
ICA Step 2:
• Find directions that make (sphered) data as independent as possible
• Recall "independence" means: joint distribution is product of marginals
• In cocktail party example:
  – Happens only when support is parallel to axes
  – Otherwise have blank areas, but marginals are non-zero
ICA Algorithm
ICA Step 2: Cocktail party example
ICA Algorithm
Parallel Idea (and key to algorithm):
• Find directions that maximize non-Gaussianity
• Based on assumption: starting from independent coordinates
• Note: "most" projections are Gaussian (since projection is "linear combo")
• Mathematics behind this: Diaconis and Freedman (1984)
ICA Algorithm
Worst case for ICA: Gaussian
• Then sphered data are independent
• So have independence in all (orthogonal) directions
• Thus can't find useful directions
• Gaussian distribution is characterized by: independent & spherically symmetric
ICA Algorithm
Criteria for non-Gaussianity / independence:
• Kurtosis: $E X^4 - 3 (E X^2)^2$ (4th order cumulant)
• Negative Entropy
• Mutual Information
• Nonparametric Maximum Likelihood
• "Infomax" in Neural Networks
• Interesting connections between these
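As a rough illustration of the kurtosis criterion, the sketch below estimates the 4th order cumulant from a sample after centering. The function name `fourth_cumulant` and the three test distributions are assumptions chosen for illustration; for standardized variables the value is 0 for a Gaussian, negative for a uniform, and positive for a heavy-tailed Laplace distribution.

```python
import numpy as np

def fourth_cumulant(x):
    """Sample version of E X^4 - 3 (E X^2)^2, computed after centering."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()
    return np.mean(xc ** 4) - 3.0 * np.mean(xc ** 2) ** 2

rng = np.random.default_rng(0)
n = 200_000
samples = {
    "gaussian": rng.standard_normal(n),                      # variance 1
    "uniform": rng.uniform(-np.sqrt(3), np.sqrt(3), n),      # variance 1
    "laplace": rng.laplace(0.0, 1.0 / np.sqrt(2), n),        # variance 1
}
cumulants = {name: fourth_cumulant(x) for name, x in samples.items()}
```

The sign and size of the cumulant is what a kurtosis-based ICA search tries to push away from zero.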
ICA Algorithm
Matlab Algorithm (optimizing any of above): "FastICA"
• Numerical gradient search method
• Can find directions iteratively
• Or by simultaneous optimization
  (note: PCA does both, but not ICA)
• Appears fast, with good defaults
• Careful about local optima
  (Recommend several random restarts)
ICA Algorithm
FastICA Notational Summary:
1. First sphere data: $Z = \hat{\Sigma}^{-1/2} (X - \bar{X})$
2. Apply ICA: find $W_S$ to make rows of $S_S = W_S Z$ "independent"
3. Can transform back to original data scale: $\hat{S} = \hat{\Sigma}^{1/2} S_S$
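The first two steps of the summary can be sketched in NumPy. This is not the Matlab FastICA package the slides refer to: it is a minimal deflation scheme using the kurtosis-based fixed-point update, and the function names, defaults, toy sources, and mixing matrix are all assumptions made for illustration. Several random restarts are used per direction, following the advice about local optima.

```python
import numpy as np

def sphere(X):
    """Step 1: center the d x n matrix X and apply Sigma_hat^{-1/2}."""
    Xc = X - X.mean(axis=1, keepdims=True)
    evals, evecs = np.linalg.eigh(np.atleast_2d(np.cov(Xc)))
    return evecs @ np.diag(evals ** -0.5) @ evecs.T @ Xc

def fastica_kurtosis(Z, n_restarts=5, n_iter=200, tol=1e-10, seed=0):
    """Step 2: find orthonormal rows of W_S one at a time (deflation)."""
    rng = np.random.default_rng(seed)
    d = Z.shape[0]
    W = np.zeros((d, d))
    for i in range(d):
        best_w, best_score = None, -np.inf
        for _ in range(n_restarts):
            w = rng.standard_normal(d)
            w -= W[:i].T @ (W[:i] @ w)          # stay orthogonal to found rows
            w /= np.linalg.norm(w)
            for _ in range(n_iter):
                y = w @ Z
                w_new = (Z * y ** 3).mean(axis=1) - 3.0 * w   # kurtosis update
                w_new -= W[:i].T @ (W[:i] @ w_new)
                w_new /= np.linalg.norm(w_new)
                done = abs(abs(w_new @ w) - 1.0) < tol        # sign may flip
                w = w_new
                if done:
                    break
            score = abs(np.mean((w @ Z) ** 4) - 3.0)          # |kurtosis|
            if score > best_score:
                best_w, best_score = w, score
        W[i] = best_w
    return W

# Toy cocktail party: two independent sources, hypothetical mixing matrix A
rng = np.random.default_rng(0)
n = 5000
S = np.vstack([rng.uniform(-np.sqrt(3), np.sqrt(3), n),
               rng.laplace(0.0, 1.0 / np.sqrt(2), n)])
A = np.array([[1.0, 0.5], [0.3, 1.0]])
Z = sphere(A @ S)
W_S = fastica_kurtosis(Z)
S_S = W_S @ Z          # rows are the estimated independent components
```

The recovered rows of `S_S` should match the true sources only up to order and sign, which is the identifiability issue discussed next.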
ICA Algorithm
Careful look at identifiability:
• Seen already in above example
• Could rotate "square of data" in several ways, etc.
ICA Algorithm
Identifiability: Swap, Flips
ICA Algorithm
Identifiability problem 1:
Generally can't order rows of $S_S$ (& $S$)
(seen as swap above)
Since for a "permutation matrix" $P$
(pre-multiplication by $P$ "swaps rows")
(post-multiplication by $P$ "swaps columns")
For each column, $z = A_S s_S$, i.e. $s_S = A_S^{-1} z$
i.e. $P s_S = P A_S^{-1} z = P W_S z$
So $P S_S$ and $P W_S$ are also ICA solutions
(i.e. $P S_S = P W_S Z$)
FastICA appears to order in terms of "how non-Gaussian"
ICA Algorithm
Identifiability problem 2:
Can't find scale of elements of $s$
(seen as flips above)
Since for a (full rank) diagonal matrix $D$
(pre-multiplication by $D$ is scalar multiplication of rows)
(post-multiplication by $D$ is scalar multiplication of columns)
Again for each column, $s_S = A_S^{-1} z$
i.e. $D s_S = D A_S^{-1} z = D W_S z$
So $D S_S$ and $D W_S$ are also ICA solutions
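Both identifiability problems can be checked numerically. The sketch below uses hypothetical values for $A_S$, $P$, and $D$ (none are from the slides) and confirms that a permuted or rescaled pair still satisfies the same relation as the original solution.

```python
import numpy as np

rng = np.random.default_rng(1)
S_S = rng.uniform(-1.0, 1.0, size=(2, 200))    # hypothetical independent rows
A_S = np.array([[0.6, -0.8], [0.8, 0.6]])      # hypothetical (orthogonal) mixing
Z = A_S @ S_S                                  # sphered data, z = A_S s_S
W_S = np.linalg.inv(A_S)                       # separating matrix, W_S = A_S^{-1}

P = np.array([[0.0, 1.0], [1.0, 0.0]])         # permutation: swaps the two rows
D = np.diag([-1.0, 2.0])                       # full rank diagonal: flip and rescale

# Problem 1: (P S_S, P W_S) is a solution whenever (S_S, W_S) is
swap_ok = np.allclose(P @ S_S, (P @ W_S) @ Z)
# Problem 2: (D S_S, D W_S) is a solution as well
scale_ok = np.allclose(D @ S_S, (D @ W_S) @ Z)
```

Since the rows of `P @ S_S` and `D @ S_S` are just as independent as those of `S_S`, the data alone cannot distinguish these solutions.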
ICA Algorithm
Signal Processing Scale identification:
(Hyvärinen and Oja 1999)
Choose scale so each signal $s_i(t)$
has "unit average energy": $\sum_t s_i(t)^2 = 1$
(preserves energy along rows of data matrix)
Explains "same scales" in
Cocktail Party Example
ICA & Non-Gaussianity
Explore main ICA principle:
For independent, non-Gaussian, standardized random variables $X = (X_1, \dots, X_d)^t$
Projections farther from coordinate axes are more Gaussian
For the direction vector $u_k = (u_{k,1}, \dots, u_{k,d})^t$
where $u_{k,i} = k^{-1/2}$ for $i = 1, \dots, k$ and $u_{k,i} = 0$ for $i = k+1, \dots, d$
(thus $\|u_k\| = 1$)
have $u_k^t X \approx N(0,1)$ in distribution, for large $d$ and $k$
ICA & Non-Gaussianity
Illustrative examples ($d = 100$, $n = 500$):
a. Uniform Marginals
b. Exponential Marginals
c. Bimodal Marginals
Study over range of values of $k$
ICA & Non-Gaussianity
Illustrative example - Uniform marginals
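A small simulation in the spirit of the uniform-marginals example: project data with independent standardized uniform entries onto $u_k$ and measure non-Gaussianity through the 4th order cumulant. The helper names, the seed, and the particular values of $k$ are assumptions for illustration; for independent standardized coordinates the cumulant of $u_k^t X$ is the marginal cumulant divided by $k$, so it should shrink toward 0 as $k$ grows.

```python
import numpy as np

def u_k(k, d):
    """Direction with k entries equal to k^{-1/2} and d - k zeros (unit length)."""
    u = np.zeros(d)
    u[:k] = k ** -0.5
    return u

def fourth_cumulant(y):
    """Sample version of E Y^4 - 3 (E Y^2)^2 after centering."""
    yc = y - y.mean()
    return np.mean(yc ** 4) - 3.0 * np.mean(yc ** 2) ** 2

rng = np.random.default_rng(2)
d, n = 100, 500                                   # sizes used in the slides
# Uniform on [-sqrt(3), sqrt(3)]: mean 0, variance 1, 4th cumulant -1.2
X = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(d, n))

# Cumulant of the projection u_k^t X for a range of k (theory: -1.2 / k)
results = {k: fourth_cumulant(u_k(k, d) @ X) for k in (1, 2, 5, 20, 100)}
```

With only $n = 500$ points the estimates are noisy, so the trend across $k$ is more informative than any single value.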
HDLSS Data Combo Mathematics
Value of and for sample size ratio
only when
Otherwise for both are Inconsistent
rC
22
1cos
2
1
r
rCr
0 rr CC
r
1r
1r
rC
22
1cos
32
31
r
rCr
HDLSS Data Combo Mathematics
Comparison between PAM and DWD
I.e. between the two values of $C_r$ (for PAM and for DWD)
Shows Strong Difference
Explains Above Empirical Observation
HDLSS Asymptotics
Personal Observations:
HDLSS world is…
• Surprising (many times)
  [Think I've got it, and then …]
• Mathematically Beautiful ()
• Practically Relevant
HDLSS Asymptotics
The Future of HDLSS Asymptotics:
• "Contiguity" in Hypothesis Testing
• Rates of Convergence
• Improvements of DWD
  (e.g. other functions of distance than inverse)
• Many Others
It is still early days …
State of HDLSS Research
Development of Methods
Mathematical Assessment
…
(thanks to: defiant.corban.edu/gtipton/net-fun/iceberg.html)
Independent Component Analysis
Personal Viewpoint: Directions (e.g. PCA, DWD)
Directions that maximize independence
Motivating Context: Signal Processing
"Blind Source Separation"
Independent Component Analysis
References:
• Cardoso (1993) (1st paper)
• Lee (1998) (early book, not recommended)
• Hyvärinen & Oja (1998) (excellent short tutorial)
• Hyvärinen, Karhunen & Oja (2001) (detailed monograph)
ICA Motivating Example
"Cocktail party problem":
• Hear several simultaneous conversations
• Would like to separate them
Model for "conversations": time series $s_1(t)$ and $s_2(t)$
ICA Motivating Example
Model for "conversations": time series
ICA Motivating Example
Mixed version of signals:
$x_1(t) = a_{11} s_1(t) + a_{12} s_2(t)$
And also a 2nd mixture:
$x_2(t) = a_{21} s_1(t) + a_{22} s_2(t)$
ICA Motivating Example
Goal: Recover signal $s(t) = (s_1(t), s_2(t))^t$
From data $x(t) = (x_1(t), x_2(t))^t$
For unknown mixture matrix $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$
where $x = A s$ for all $t$
ICA Motivating Example
Goal is to find separating weights $W$
so that $s = W x$ for all $t$
Problem: $W = A^{-1}$ would be fine,
but $A$ is unknown
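The mixing model can be made concrete with a toy simulation. The two signals (a sine wave and a sawtooth) and the entries of $A$ are hypothetical choices, not the ones shown in the slides; the point is only that $W = A^{-1}$ recovers the signals exactly when $A$ is known, which is the information ICA does not have.

```python
import numpy as np

n = 500
t = np.arange(1, n + 1)

# Two hypothetical "conversations": a sine wave and a sawtooth
s = np.vstack([np.sin(2.0 * np.pi * t / 50.0),
               (t % 40) / 20.0 - 1.0])

A = np.array([[1.0, 0.5],
              [0.3, 1.0]])          # hypothetical mixture matrix
x = A @ s                           # observed data: x(t) = A s(t) for all t

W = np.linalg.inv(A)                # separating weights if A were known
s_recovered = W @ x
```

In practice only `x` is observed, so `W` has to be estimated from the data alone.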
ICA Motivating Example
Solutions for Cocktail Party Problem:
1. PCA (on population of 2-d vectors)
   $W$ = matrix of eigenvectors
   [direction of maximal variation doesn't solve this problem]
2. ICA (will describe method later)
   [Independent Components do solve it]
   [modulo sign changes and identification]
ICA Motivating Example
1. PCA (on population of 2-d vectors): Maximal Variance, Minimal Variance
ICA Motivating Example
2. ICA (will describe method later)
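To illustrate why the PCA choice of $W$ does not separate the signals, the sketch below applies eigenvector weights to toy mixed data. The uniform signals and the mixing matrix are assumptions for illustration. The PCA outputs are uncorrelated, yet the matrix `M = W_pca @ A` shows that each output is still a combination of both true signals.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2000
s = rng.uniform(-1.0, 1.0, size=(2, n))      # independent toy signals
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])                   # hypothetical mixture matrix
x = A @ s

# PCA on the population of 2-d vectors: W = matrix of eigenvectors
xc = x - x.mean(axis=1, keepdims=True)
evals, evecs = np.linalg.eigh(np.cov(xc))
W_pca = evecs[:, ::-1].T                     # rows: max variance, then min variance
y = W_pca @ xc                               # PCA "separated" outputs

# If PCA had separated the signals, M would have one nonzero entry per row
M = W_pca @ A
```

Uncorrelated outputs are not independent outputs here, which is the gap ICA is meant to close.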
ICA Motivating Example
Recall original time series
ICA Motivating Example
Relation to OODA: recall data matrix
$X = \begin{pmatrix} X_{11} & \cdots & X_{1n} \\ \vdots & \ddots & \vdots \\ X_{d1} & \cdots & X_{dn} \end{pmatrix}$
Signal Processing: focus on rows
($d$ time series, indexed by $t = 1, \dots, n$)
OODA: focus on columns as data objects
($n$ data vectors)
Note: 2 viewpoints like "duals" for PCA
ICA Motivating Example
Scatterplot View (signal processing): Study
• Signals $s_1(t), s_2(t)$, $t = 1, \dots, n$
• Corresponding Scatterplot
• Data $x_1(t), x_2(t)$, $t = 1, \dots, n$
ICA Motivating Example
Study Signals: $s_1(t), s_2(t)$, $t = 1, \dots, n$
ICA Motivating Example
Signals - Corresponding Scatterplot
Independent Component Analysis
Personal Viewpoint
Directions
(eg PCA DWD)
Independent Component Analysis
Personal Viewpoint
Directions that maximize independence
Independent Component Analysis
Personal Viewpoint
Directions that maximize independence
Motivating Context Signal Processing
ldquoBlind Source Separationrdquo
Independent Component Analysis
References
bull Cardoso (1993) (1st paper)
bull Lee (1998) (early book not reccorsquoed)
bull Hyvaumlrinen amp Oja (1998)
(excellent short tutorial)
bull Hyvaumlrinen Karhunen amp Oja (2001)
(detailed monograph)
ICA Motivating Example
ldquoCocktail party problemrdquo
ICA Motivating Example
ldquoCocktail party problemrdquo
bull Hear several simultaneous conversations
bull Would like to
separate them
ICA Motivating Example
ldquoCocktail party problemrdquo
bull Hear several simultaneous conversations
bull Would like to
separate them
Model for ldquoconversationsrdquo time series
and ts1 ts2
ICA Motivating Example
Model for ldquoconversationsrdquo time series
ICA Motivating Example
Mixed version of signals
And also a 2nd mixture
tsatsatx 2121111
tsatsatx 2221212
ICA Motivating Example
Mixed version of signals
ICA Motivating Example
Goal Recover signal
From data
For unknown mixture matrix
where for all
tx
txtx
2
1
ts
tsts
2
1
2221
1211
aa
aaA
sAx t
ICA Motivating Example
Goal is to find separating weights
so that for allxWs
W
t
ICA Motivating Example
Goal is to find separating weights
so that for all
Problem would be fine
but is unknown
xWs
W
1AW
t
A
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
= matrix of eigenvectorsW
ICA Motivating Example
1 PCA (on population of 2-d vectors)
Maximal
Variance
Minimal
Variance
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
[direction of maximal variation
doesnrsquot solve this problem]
2ICA (will describe method later)
ICA Motivating Example
2 ICA (will describe method later)
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
[direction of maximal variation
doesnrsquot solve this problem]
2ICA (will describe method later)
[Independent Components do solve it]
[modulo sign changes and identification]
ICA Motivating Example
Recall original time series
ICA Motivating ExampleRelation to OODA recall data matrix
dnd
n
n
XX
XX
XXX
1
111
1
ICA Motivating ExampleRelation to OODA recall data matrix
Signal Processing focus on rows
( time series indexed by )
OODA focus on columns as data objects
( data vectors)
dnd
n
n
XX
XX
XXX
1
111
1
d nt 1
n
ICA Motivating ExampleRelation to OODA recall data matrix
Signal Processing focus on rows
( time series indexed by )
OODA focus on columns as data objects
( data vectors)
Note 2 viewpoints like ldquodualsrdquo for PCA
dnd
n
n
XX
XX
XXX
1
111
1
d nt 1
n
ICA Motivating Example
Scatterplot View (signal processing): Study
• Signals $(s_1(t), s_2(t))$, $t = 1, \dots, n$
• Corresponding Scatterplot
• Data $(x_1(t), x_2(t))$, $t = 1, \dots, n$
• Corresponding Scatterplot
[Figures: signals and data as time series, each with its corresponding scatterplot]
ICA Motivating Example
Scatterplot View (signal processing): Plot
• Signals & Scat’plot: $(s_1(t), s_2(t))$, $t = 1, \dots, n$
• Data & Scat’plot: $(x_1(t), x_2(t))$, $t = 1, \dots, n$
• Scatterplots give hint how ICA is possible
• Affine transformation $x = A s$
Stretches indep’t signals into dep’t
• “Inversion” is key to ICA
(even w/ $A$ unknown)
ICA Motivating Example
Signals - Corresponding Scatterplot
Note: independent, since a known value of $s_1$ does not change the distribution of $s_2$
ICA Motivating Example
Data - Corresponding Scatterplot
Note: dependent, since a known value of $x_1$ changes the distribution of $x_2$
ICA Motivating Example
Why not PCA?
Finds direction of greatest variation
[Figure: PCA finds direction of greatest variation]
Which is wrong for signal separation
[Figure: PCA direction is wrong for signal separation]
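A small numerical check of this point, as a sketch under assumed inputs: the uniform sources and the mixing matrix `A` below are hypothetical choices, not the data behind the slides. The first principal component of the mixed data turns out to be correlated with both sources, so it is itself still a mixture.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Hypothetical independent, non-Gaussian sources with mean 0 and variance 1.
S = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, n))

# Hypothetical mixing matrix (an assumption for illustration).
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])
X = A @ S

# PCA: eigenvectors of the sample covariance (eigh sorts eigenvalues ascending).
evals, evecs = np.linalg.eigh(np.cov(X))
v1 = evecs[:, -1]                                # direction of greatest variation
pc1 = v1 @ (X - X.mean(axis=1, keepdims=True))   # first PC scores

# Absolute correlation of the first PC with each true source.
corr_with_s1 = abs(np.corrcoef(pc1, S[0])[0, 1])
corr_with_s2 = abs(np.corrcoef(pc1, S[1])[0, 1])
print(corr_with_s1, corr_with_s2)
```

A separating direction would give one correlation near 1 and the other near 0; here neither happens.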
ICA Algorithm
ICA Step 1:
• “sphere the data”
(shown on right in scatterplot view)
• i.e. find linear transf’n to make
mean = $0$, cov = $I$
• i.e. work with $Z = \hat{\Sigma}^{-1/2} X$
• requires $X$ of full rank
(at least $n \ge d$, i.e. no HDLSS)
• search for independence beyond
linear and quadratic structure
[Figure: sphered data in scatterplot view]
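The sphering step can be sketched in a few lines of NumPy. This is one possible implementation, not the code used for the slides: the function name `sphere` is made up here, the data are centered first so that the mean is 0 (an added assumption), and $\hat{\Sigma}^{-1/2}$ is taken to be the symmetric inverse square root from an eigendecomposition.

```python
import numpy as np

def sphere(X):
    """Sphere a d x n data matrix X whose columns are the data vectors.

    Returns Z with sample mean 0 and sample covariance I, plus the
    matrix Sigma_hat^{-1/2} that was applied.
    """
    Xc = X - X.mean(axis=1, keepdims=True)       # center each row (variable)
    sigma_hat = np.cov(Xc)                       # d x d sample covariance
    evals, evecs = np.linalg.eigh(sigma_hat)     # needs all evals > 0 (full rank)
    root_inv = evecs @ np.diag(evals ** -0.5) @ evecs.T
    return root_inv @ Xc, root_inv

# Hypothetical example data: two mixed uniform signals with a nonzero mean.
rng = np.random.default_rng(0)
S = rng.uniform(-1.0, 1.0, size=(2, 2000))
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])
X = A @ S + np.array([[2.0], [-1.0]])

Z, root_inv = sphere(X)
print(np.round(np.cov(Z), 6))
```

In the HDLSS case ($d > n$) some eigenvalues of $\hat{\Sigma}$ are zero and `evals ** -0.5` breaks down, which is the full-rank requirement in the list above.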
ICA Algorithm
ICA Step 2:
• Find dir’ns that make (sphered) data
as independent as possible
• Recall “independence” means:
joint distribution is product of marginals
• In cocktail party example:
– Happens only when support parallel to axes
– Otherwise have blank areas,
but marginals are non-zero
[Figure: cocktail party example]
ICA Algorithm
Parallel Idea (and key to algorithm):
• Find directions that maximize
non-Gaussianity
• Based on assumption: starting from
independent coordinates
• Note: “most” projections are Gaussian
(since projection is “linear combo”)
• Mathematics behind this:
Diaconis and Freedman (1984)
ICA Algorithm
Worst case for ICA: Gaussian
• Then sphered data are independent
• So have independence in all (ortho) dir’ns
• Thus can’t find useful directions
• Gaussian distribution is characterized by:
Independent & spherically symmetric
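The reason Gaussian data defeat ICA fits in one line. As a sketch (the orthogonal matrix $U$ is notation introduced here, not taken from the slides): if the sphered data are standard Gaussian, then so is every rotation of them.

```latex
\[
  Z \sim N_d(0, I_d), \quad U^T U = I_d
  \;\Longrightarrow\;
  U Z \sim N_d\!\left(0,\; U I_d U^T\right) = N_d(0, I_d).
\]
```

Every rotated coordinate system therefore has independent standard Gaussian coordinates, so no rotation is preferred over any other.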
ICA Algorithm
Criteria for non-Gaussianity / independence:
• Kurtosis $E X^4 - 3\,(E X^2)^2$
(4th order cumulant)
• Negative Entropy
• Mutual Information
• Nonparametric Maximum Likelihood
• “Infomax” in Neural Networks
• Interesting connections between these
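As an illustration of the first criterion, the sample version of $E X^4 - 3\,(E X^2)^2$ can be computed directly. This sketch centers the data first, since the formula as written assumes mean zero; the three test distributions and the name `kurtosis` are choices made here, not from the slides.

```python
import numpy as np

def kurtosis(x):
    """Sample 4th-order cumulant E X^4 - 3 (E X^2)^2 of the centered data."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return np.mean(x ** 4) - 3.0 * np.mean(x ** 2) ** 2

rng = np.random.default_rng(0)
n = 200_000

k_gauss = kurtosis(rng.standard_normal(n))                      # theory: 0
k_unif = kurtosis(rng.uniform(-np.sqrt(3), np.sqrt(3), n))      # theory: -1.2
k_expo = kurtosis(rng.exponential(1.0, n))                      # theory: 6

print(k_gauss, k_unif, k_expo)
```

The sign distinguishes flatter-than-Gaussian distributions (negative) from heavier-tailed ones (positive); only the Gaussian sits at zero, which is why the absolute value is a usable measure of non-Gaussianity.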
ICA Algorithm
Matlab Algorithm (optimizing any of above):
“FastICA”
• Numerical gradient search method
• Can find directions iteratively
• Or by simultaneous optimization
(note: PCA does both, but not ICA)
• Appears fast, with good defaults
• Careful about local optima
(Recommend several random restarts)
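The iterative search can be sketched as follows. This is not the FastICA package's code: it is a simplified kurtosis-based fixed-point iteration that finds one direction at a time, with each new direction kept orthogonal to the earlier ones, plus the random restarts recommended above. The names `sphere` and `kurtosis_ica`, the defaults, and the test data are all assumptions made for this illustration.

```python
import numpy as np

def sphere(X):
    """Center the d x n data and multiply by Sigma_hat^{-1/2}."""
    Xc = X - X.mean(axis=1, keepdims=True)
    evals, evecs = np.linalg.eigh(np.cov(Xc))
    return evecs @ np.diag(evals ** -0.5) @ evecs.T @ Xc

def kurtosis_ica(Z, n_restarts=5, n_iter=200, tol=1e-10, seed=0):
    """Find an orthogonal W, one row at a time, maximizing |kurtosis| of W Z."""
    rng = np.random.default_rng(seed)
    d = Z.shape[0]
    W = np.zeros((d, d))
    for i in range(d):
        best_w, best_score = None, -np.inf
        for _ in range(n_restarts):                  # random restarts
            w = rng.standard_normal(d)
            w -= W[:i].T @ (W[:i] @ w)               # orthogonal to earlier rows
            w /= np.linalg.norm(w)
            for _ in range(n_iter):
                y = w @ Z
                w_new = (Z * y ** 3).mean(axis=1) - 3.0 * w   # fixed-point step
                w_new -= W[:i].T @ (W[:i] @ w_new)
                w_new /= np.linalg.norm(w_new)
                converged = abs(abs(w_new @ w) - 1.0) < tol   # same up to sign
                w = w_new
                if converged:
                    break
            score = abs(np.mean((w @ Z) ** 4) - 3.0)  # |kurtosis|, unit variance
            if score > best_score:
                best_w, best_score = w, score
        W[i] = best_w
    return W

# Hypothetical test case: two independent uniform sources, mixed by A.
rng = np.random.default_rng(4)
S = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, 5000))
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])
Z = sphere(A @ S)
W = kurtosis_ica(Z)
S_hat = W @ Z

# Absolute correlations between recovered rows and true sources.
C = np.abs(np.corrcoef(np.vstack([S_hat, S]))[:2, 2:])
print(np.round(C, 3))
```

Each recovered row should match one true source up to sign and ordering, so `C` is close to a permutation matrix rather than the identity.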
ICA Algorithm
FastICA Notational Summary:
1. First sphere data: $Z = \hat{\Sigma}^{-1/2} X$
2. Apply ICA: Find $W_S$ to make
rows of $S_S = W_S Z$ “indep’t”
3. Can transform back to
original data scale: $S = \hat{\Sigma}^{1/2} S_S$
ICA Algorithm
Careful look at identifiability
• Seen already in above example
• Could rotate “square of data” in several ways, etc.
[Figure: Identifiability: Swap, Flips]
ICA Algorithm
Identifiability problem 1:
Generally can’t order rows of $S_S$ (& $S$)
(seen as “swap” above)
Since for a “permutation matrix” $P$
(pre-multiplication by $P$ “swaps rows”)
(post-multiplication by $P$ “swaps columns”)
For each column, $z = A_S s_S$, i.e. $s_S = A_S^{-1} z$,
i.e. $P s_S = P A_S^{-1} z = P W_S z$
So $P S_S$ and $P W_S$ are also ICA solut’ns
(i.e. $P S_S = P W_S Z$)
FastICA appears to order in terms of “how non-Gaussian”
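The permutation argument is easy to verify numerically. In this sketch `W_S` and `Z` are random stand-ins (not an actual ICA fit), and `P` is the 2 x 2 permutation matrix that swaps the two rows.

```python
import numpy as np

rng = np.random.default_rng(2)
d, n = 2, 6

W_S = rng.standard_normal((d, d))   # stand-in for an ICA weight matrix (hypothetical)
Z = rng.standard_normal((d, n))     # stand-in for sphered data (hypothetical)
S_S = W_S @ Z

P = np.array([[0.0, 1.0],
              [1.0, 0.0]])          # permutation matrix: swaps the two rows

swapped = P @ S_S
same_solution = np.allclose(swapped, (P @ W_S) @ Z)   # P S_S = (P W_S) Z
rows_swapped = np.allclose(swapped, S_S[::-1])        # pre-multiplying swaps rows
print(same_solution, rows_swapped)
```

Since the rows of `P @ S_S` are exactly as independent as the rows of `S_S`, no independence criterion can tell the two orderings apart.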
ICA Algorithm
Identifiability problem 2:
Can’t find scale of elements of $s$
(seen as “flips” above)
Since for a (full rank) diagonal matrix $D$
(pre-multipl’n by $D$ is scalar mult’n of rows)
(post-multipl’n by $D$ is scalar mult’n of col’s)
Again for each col’n, $s_S = A_S^{-1} z$,
i.e. $D s_S = D A_S^{-1} z = D W_S z$
So $D S_S$ and $D W_S$ are also ICA solutions
ICA Algorithm
Signal Processing Scale identification:
(Hyvärinen and Oja 1999)
Choose scale so each signal $s_i(t)$
has “unit average energy”: $\sum_t s_i(t)^2 = 1$
(preserves energy along rows of data matrix)
Explains “same scales” in
Cocktail Party Example
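The scale convention amounts to picking one particular diagonal matrix $D$. A minimal sketch, assuming the constraint is $\sum_t s_i(t)^2 = 1$ for each row $i$; the function name `unit_energy` and the random stand-in inputs are hypothetical.

```python
import numpy as np

def unit_energy(S_S, W_S):
    """Rescale each row of S_S (and of W_S) so that sum_t s_i(t)^2 = 1."""
    energy = np.sum(S_S ** 2, axis=1)        # energy of each row
    D = np.diag(1.0 / np.sqrt(energy))       # full-rank diagonal rescaling
    return D @ S_S, D @ W_S, D

rng = np.random.default_rng(5)
W_S = rng.standard_normal((2, 2))            # stand-in weight matrix (hypothetical)
Z = rng.standard_normal((2, 400))            # stand-in sphered data (hypothetical)
S_S = W_S @ Z

S_unit, W_unit, D = unit_energy(S_S, W_S)
row_energy = np.sum(S_unit ** 2, axis=1)
print(row_energy)
```

The rescaled pair still satisfies `S_unit = W_unit @ Z`, so it is another ICA solution. The sign of each row is not fixed by this convention, which is the “flip” ambiguity.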
ICA & Non-Gaussianity
Explore main ICA principle:
For indep., non-Gaussian,
standardized r.v.’s $X = (X_1, \dots, X_d)^T$
Projections farther from coordinate axes
are more Gaussian
For the dir’n vector $u_k = (u_{k,1}, \dots, u_{k,d})^T$,
where $u_{k,j} = k^{-1/2}$ for $j = 1, \dots, k$ and $u_{k,j} = 0$ for $j = k+1, \dots, d$
(thus $\|u_k\| = 1$)
have $u_k^T X \approx N(0,1)$ for large $d$ and $k$
ICA & Non-Gaussianity
Illustrative examples (d = 100, n = 500)
a. Uniform Marginals
b. Exponential Marginals
c. Bimodal Marginals
Study over range of values of k
ICA & Non-Gaussianity
Illustrative example - Uniform marginals
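The uniform-marginals example can be imitated numerically. This sketch uses d = 100 and n = 500 as above, but the random seed, the chosen values of k, and the use of sample kurtosis as the measure of Gaussianity are assumptions made here rather than details of the original figures.

```python
import numpy as np

def sample_kurtosis(y):
    """Kurtosis of the standardized sample (0 for a Gaussian)."""
    y = y - y.mean()
    y = y / y.std()
    return np.mean(y ** 4) - 3.0

def direction(k, d):
    """Unit vector u_k with k^{-1/2} in the first k entries and 0 elsewhere."""
    u = np.zeros(d)
    u[:k] = k ** -0.5
    return u

rng = np.random.default_rng(3)
d, n = 100, 500

# Independent standardized uniform marginals (mean 0, variance 1); columns are data vectors.
X = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(d, n))

kurt = {k: sample_kurtosis(direction(k, d) @ X) for k in (1, 2, 4, 10, 100)}
for k, val in kurt.items():
    print(f"k = {k:3d}: sample kurtosis of u_k^T X = {val:+.3f}")
```

For equal weights the population kurtosis of $u_k^T X$ is the marginal kurtosis divided by $k$, so the k = 1 projection (a coordinate axis) stays clearly non-Gaussian, while large k gives values near zero up to sampling noise.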
HDLSS Data Combo Mathematics
Comparison between PAM and DWD
Ie between and
Shows Strong Difference
Explains Above Empirical Observation
rCrC
HDLSS Asymptotics
Personal Observations:
HDLSS world is…
• Surprising (many times)
[Think I’ve got it, and then …]
• Mathematically Beautiful
• Practically Relevant
HDLSS Asymptotics
The Future of HDLSS Asymptotics:
• “Contiguity” in Hypo Testing
• Rates of Convergence
• Improvements of DWD
(e.g. other functions of distance than inverse)
• Many Others
It is still early days …
State of HDLSS Research
Development Of Methods
Mathematical Assessment
…
(thanks to: defiant.corban.edu/gtipton/net-fun/iceberg.html)
Independent Component Analysis
Personal Viewpoint:
Directions (e.g. PCA, DWD)
Independent Component Analysis
Personal Viewpoint:
Directions that maximize independence
Motivating Context: Signal Processing
“Blind Source Separation”
Independent Component Analysis
References:
• Cardoso (1993) (1st paper)
• Lee (1998) (early book, not recco’ed)
• Hyvärinen & Oja (1998)
(excellent short tutorial)
• Hyvärinen, Karhunen & Oja (2001)
(detailed monograph)
ICA Motivating Example
“Cocktail party problem”:
• Hear several simultaneous conversations
• Would like to
separate them
Model for “conversations”: time series
$s_1(t)$ and $s_2(t)$
[Figure: time series model for the “conversations”]
ICA Motivating Example
Mixed version of signals:
$x_1(t) = a_{11} s_1(t) + a_{12} s_2(t)$
And also a 2nd mixture:
$x_2(t) = a_{21} s_1(t) + a_{22} s_2(t)$
[Figure: mixed version of signals]
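ICA Motivating Example
Goal: Recover signal $s(t) = \begin{pmatrix} s_1(t) \\ s_2(t) \end{pmatrix}$
From data $x(t) = \begin{pmatrix} x_1(t) \\ x_2(t) \end{pmatrix}$
For unknown mixture matrix $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$
where $x(t) = A\,s(t)$ for all $t$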
Personal Observations
HDLSS world ishellip
Surprising (many times)
[Think Irsquove got it and then hellip]
HDLSS Asymptotics
Personal Observations
HDLSS world ishellip
Surprising (many times)
[Think Irsquove got it and then hellip]
Mathematically Beautiful ()
HDLSS Asymptotics
Personal Observations
HDLSS world ishellip
Surprising (many times)
[Think Irsquove got it and then hellip]
Mathematically Beautiful ()
Practically Relevant
HDLSS Asymptotics
The Future of HDLSS Asymptotics
bull ldquoContiguityrdquo in Hypo Testing
bull Rates of Convergence
bull Improvements of DWD
(eg other functions of distance than inverse)
bull Many Others
It is still early days hellip
State of HDLSS Research
DevelopmentOf Methods
MathematicalAssessment
hellip
(thanks todefiantcorbanedugtiptonnet-funiceberghtml)
Independent Component Analysis
Personal Viewpoint
Directions
(eg PCA DWD)
Independent Component Analysis
Personal Viewpoint
Directions that maximize independence
Independent Component Analysis
Personal Viewpoint
Directions that maximize independence
Motivating Context Signal Processing
ldquoBlind Source Separationrdquo
Independent Component Analysis
References
bull Cardoso (1993) (1st paper)
bull Lee (1998) (early book not reccorsquoed)
bull Hyvaumlrinen amp Oja (1998)
(excellent short tutorial)
bull Hyvaumlrinen Karhunen amp Oja (2001)
(detailed monograph)
ICA Motivating Example
ldquoCocktail party problemrdquo
ICA Motivating Example
ldquoCocktail party problemrdquo
bull Hear several simultaneous conversations
bull Would like to
separate them
ICA Motivating Example
ldquoCocktail party problemrdquo
bull Hear several simultaneous conversations
bull Would like to
separate them
Model for ldquoconversationsrdquo time series
and ts1 ts2
ICA Motivating Example
Model for ldquoconversationsrdquo time series
ICA Motivating Example
Mixed version of signals
And also a 2nd mixture
tsatsatx 2121111
tsatsatx 2221212
ICA Motivating Example
Mixed version of signals
ICA Motivating Example
Goal Recover signal
From data
For unknown mixture matrix
where for all
tx
txtx
2
1
ts
tsts
2
1
2221
1211
aa
aaA
sAx t
ICA Motivating Example
Goal is to find separating weights
so that for allxWs
W
t
ICA Motivating Example
Goal is to find separating weights
so that for all
Problem would be fine
but is unknown
xWs
W
1AW
t
A
ICA Motivating Example
Solutions for Cocktail Party Problem:
1. PCA (on population of 2-d vectors)
   $W$ = matrix of eigenvectors
   [direction of maximal variation doesn’t solve this problem]
2. ICA (will describe method later)
   [Independent Components do solve it]
   [modulo sign changes and identification]
[Figure: 1. PCA (on population of 2-d vectors), labeled “Maximal Variance” and “Minimal Variance”]
[Figure: 2. ICA]
ICA Motivating Example
Recall original time series
ICA Motivating Example
Relation to OODA: recall data matrix
$X = \begin{pmatrix} X_1 & \cdots & X_n \end{pmatrix} = \begin{pmatrix} X_{11} & \cdots & X_{1n} \\ \vdots & \ddots & \vdots \\ X_{d1} & \cdots & X_{dn} \end{pmatrix}$
Signal Processing: focus on rows
($d$ time series, indexed by $t = 1, \ldots, n$)
OODA: focus on columns as data objects
($n$ data vectors)
Note: 2 viewpoints like “duals” for PCA
ICA Motivating Example
Scatterplot View (signal processing): Plot
• Signals $s_1(t), s_2(t)$, $t = 1, \ldots, n$, & Scat’plot
• Data $x_1(t), x_2(t)$, $t = 1, \ldots, n$, & Scat’plot
• Scatterplots give hint how ICA is possible
• Affine transformation $x = A s$
  Stretches indep’t signals into dep’t
• Inversion is key to ICA (even w/ $A$ unknown)
ICA Motivating Example
Signals - Corresponding Scatterplot
Note: Independent, since known value of $s_1$ does not change distribution of $s_2$
ICA Motivating Example
Data - Corresponding Scatterplot
Note: Dependent, since known value of $x_1$ changes distribution of $x_2$
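The two notes above can be checked numerically by conditioning on the first coordinate and seeing whether the second one shifts. The sketch below assumes uniform sources on $[-1, 1]$ and a hypothetical mixing matrix; the seed, sample size and threshold are arbitrary illustration choices.

```python
import random

def conditional_mean(first, second, threshold):
    """Mean of `second` over the time points where `first` exceeds `threshold`."""
    picked = [b for a, b in zip(first, second) if a > threshold]
    return sum(picked) / len(picked)

rng = random.Random(0)
n = 5000
# Hypothetical independent sources, uniform on [-1, 1].
s1 = [rng.uniform(-1, 1) for _ in range(n)]
s2 = [rng.uniform(-1, 1) for _ in range(n)]

# Assumed mixing matrix A = [[1, 0.5], [0.5, 1]], so x = A s.
x1 = [a + 0.5 * b for a, b in zip(s1, s2)]
x2 = [0.5 * a + b for a, b in zip(s1, s2)]

# Unconditionally, both s2 and x2 have mean close to 0.
shift_signals = conditional_mean(s1, s2, 0.5)   # stays near 0: independent
shift_data = conditional_mean(x1, x2, 0.5)      # moves away from 0: dependent
print(f"mean of s2 given s1 > 0.5: {shift_signals:+.3f}")
print(f"mean of x2 given x1 > 0.5: {shift_data:+.3f}")
```

A shifted conditional mean is just one visible symptom of dependence; independence itself is the stronger statement that the whole conditional distribution is unchanged.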
ICA Motivating Example
Why not PCA?
Finds direction of greatest variation
Which is wrong for signal separation
[Figure: PCA, finds direction of greatest variation, wrong for signal separation]
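A small simulation illustrates why the direction of greatest variation does not separate signals. This is a sketch under assumptions: the uniform sources and the symmetric mixing matrix are hypothetical, and the 2×2 eigenvector is obtained from the standard closed-form angle rather than a general PCA routine.

```python
import math
import random

def corr(u, v):
    """Sample correlation between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def first_pc_angle(x1, x2):
    """Angle of the leading eigenvector of the 2x2 sample covariance matrix."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    a = sum((p - m1) ** 2 for p in x1) / n
    b = sum((q - m2) ** 2 for q in x2) / n
    c = sum((p - m1) * (q - m2) for p, q in zip(x1, x2)) / n
    # For [[a, c], [c, b]] the major axis satisfies tan(2 theta) = 2c / (a - b).
    return 0.5 * math.atan2(2 * c, a - b)

rng = random.Random(1)
n = 5000
s1 = [rng.uniform(-1, 1) for _ in range(n)]   # hypothetical independent sources
s2 = [rng.uniform(-1, 1) for _ in range(n)]
x1 = [a + 0.5 * b for a, b in zip(s1, s2)]    # assumed mixing, x = A s
x2 = [0.5 * a + b for a, b in zip(s1, s2)]

theta = first_pc_angle(x1, x2)
pc1 = [math.cos(theta) * p + math.sin(theta) * q for p, q in zip(x1, x2)]

c1, c2 = corr(pc1, s1), corr(pc1, s2)
print(f"corr(PC1, s1) = {c1:.2f}, corr(PC1, s2) = {c2:.2f}")
```

With this mixing, the first principal component is roughly proportional to $s_1 + s_2$, so it correlates about equally with both sources instead of isolating one of them.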
ICA Algorithm
ICA Step 1:
• “sphere the data” (shown on right in scatterplot view)
• i.e. find linear transf’n to make mean = $0$, cov = $I$
• i.e. work with $Z = \hat{\Sigma}^{-1/2} X$
• requires $X$ of full rank (at least $n \ge d$, i.e. no HDLSS)
• search for independence beyond linear and quadratic structure
[Figure: ICA Step 1, sphere the data]
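Sphering can be sketched for the 2-d case without any linear algebra library. The code below assumes the data are centered before $\hat{\Sigma}^{-1/2}$ is applied, uses divisor $n$ in the sample covariance, and relies on the closed form for the square root of a 2×2 positive definite matrix; the helper names and the example data are hypothetical.

```python
import math
import random

def sample_cov(x1, x2):
    """Return the means and the entries (a, b, c) of the covariance [[a, c], [c, b]]."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    a = sum((p - m1) ** 2 for p in x1) / n
    b = sum((q - m2) ** 2 for q in x2) / n
    c = sum((p - m1) * (q - m2) for p, q in zip(x1, x2)) / n
    return m1, m2, a, b, c

def sphere(x1, x2):
    """Center the data, then apply Sigma_hat^{-1/2} (2x2 closed form)."""
    m1, m2, a, b, c = sample_cov(x1, x2)
    root_det = math.sqrt(a * b - c * c)      # sqrt(det Sigma_hat); needs full rank
    t = math.sqrt(a + b + 2 * root_det)      # trace of Sigma_hat^{1/2}
    # Sigma_hat^{1/2} = (Sigma_hat + root_det * I) / t, by Cayley-Hamilton.
    r11, r22, r12 = (a + root_det) / t, (b + root_det) / t, c / t
    # Invert the symmetric 2x2 root; its determinant equals root_det.
    i11, i22, i12 = r22 / root_det, r11 / root_det, -r12 / root_det
    z1 = [i11 * (p - m1) + i12 * (q - m2) for p, q in zip(x1, x2)]
    z2 = [i12 * (p - m1) + i22 * (q - m2) for p, q in zip(x1, x2)]
    return z1, z2

rng = random.Random(2)
n = 1000
s1 = [rng.uniform(-1, 1) for _ in range(n)]
s2 = [rng.uniform(-1, 1) for _ in range(n)]
x1 = [a + 0.5 * b + 3.0 for a, b in zip(s1, s2)]   # hypothetical mixed data
x2 = [0.5 * a + b - 1.0 for a, b in zip(s1, s2)]

z1, z2 = sphere(x1, x2)
mz1, mz2, va, vb, vc = sample_cov(z1, z2)
print(f"mean = ({mz1:.1e}, {mz2:.1e})")
print(f"cov  = [[{va:.3f}, {vc:.3f}], [{vc:.3f}, {vb:.3f}]]")
```

If the sample covariance were singular, `root_det` would be zero and the division would fail, which mirrors the full-rank requirement on the slide.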
ICA Algorithm
ICA Step 2:
• Find dir’ns that make (sphered) data as independent as possible
• Recall “independence” means: joint distribution is product of marginals
• In cocktail party example:
  – Happens only when support parallel to axes
  – Otherwise have blank areas, but marginals are non-zero
[Figure: ICA Step 2, cocktail party example]
ICA Algorithm
Parallel Idea (and key to algorithm):
• Find directions that maximize non-Gaussianity
• Based on assumption: starting from independent coordinates
• Note: “most” projections are Gaussian (since projection is “linear combo”)
• Mathematics behind this: Diaconis and Freedman (1984)
ICA Algorithm
Worst case for ICA: Gaussian
• Then sphered data are independent
• So have independence in all (ortho) dir’ns
• Thus can’t find useful directions
• Gaussian distribution is characterized by: Independent & spherically symmetric
ICA Algorithm
Criteria for non-Gaussianity / independence:
• Kurtosis $E X^4 - 3 (E X^2)^2$ (4th order cumulant)
• Negative Entropy
• Mutual Information
• Nonparametric Maximum Likelihood
• “Infomax” in Neural Networks
• Interesting connections between these
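The kurtosis criterion is the easiest of these to try out. The sketch below computes the sample version of $E X^4 - 3 (E X^2)^2$ after centering (an added assumption, since the slide formula presumes mean zero) for three unit-variance distributions chosen purely for illustration; their population values are $-1.2$ (uniform), $0$ (Gaussian) and $3$ (Laplace).

```python
import math
import random

def kurtosis(x):
    """Sample version of E X^4 - 3 (E X^2)^2, computed after centering x."""
    n = len(x)
    m = sum(x) / n
    m2 = sum((v - m) ** 2 for v in x) / n
    m4 = sum((v - m) ** 4 for v in x) / n
    return m4 - 3 * m2 ** 2

rng = random.Random(3)
n = 100_000
root3 = math.sqrt(3)
samples = {
    # Uniform on (-sqrt 3, sqrt 3) has mean 0 and variance 1.
    "uniform": [rng.uniform(-root3, root3) for _ in range(n)],
    "gaussian": [rng.gauss(0, 1) for _ in range(n)],
    # Difference of two unit exponentials is Laplace with variance 2.
    "laplace": [(rng.expovariate(1) - rng.expovariate(1)) / math.sqrt(2)
                for _ in range(n)],
}
results = {name: kurtosis(x) for name, x in samples.items()}
for name, k in results.items():
    print(f"{name:9s} sample kurtosis: {k:+.2f}")
```

Values far from zero in either direction indicate non-Gaussianity, which is why the absolute value (or square) of the kurtosis is the natural thing to maximize.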
ICA Algorithm
Matlab Algorithm (optimizing any of above): “FastICA”
• Numerical gradient search method
• Can find directions iteratively
• Or by simultaneous optimization (note: PCA does both, but not ICA)
• Appears fast, with good defaults
• Careful about local optima (Recco: several random restarts)
ICA Algorithm
FastICA Notational Summary:
1. First sphere data: $Z = \hat{\Sigma}^{-1/2} X$
2. Apply ICA: Find $W_S$ to make rows of $S_S = W_S Z$ “indep’t”
3. Can transform back to original data scale: $S = \hat{\Sigma}^{1/2} S_S$
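Steps 1 and 2 of the summary can be sketched end to end in two dimensions. To be clear about assumptions: this is not the FastICA algorithm. It swaps in a brute-force grid search over rotation angles, with total absolute kurtosis as the non-Gaussianity criterion, and the sources, mixing matrix and helper names are hypothetical. Step 3 (transforming back) is omitted.

```python
import math
import random

def kurtosis(x):
    """Sample E X^4 - 3 (E X^2)^2 for (already centered) x."""
    n = len(x)
    m2 = sum(v * v for v in x) / n
    m4 = sum(v ** 4 for v in x) / n
    return m4 - 3 * m2 ** 2

def sphere(x1, x2):
    """Step 1: Z = Sigma_hat^{-1/2} (X - mean), closed form for d = 2."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    a = sum((p - m1) ** 2 for p in x1) / n
    b = sum((q - m2) ** 2 for q in x2) / n
    c = sum((p - m1) * (q - m2) for p, q in zip(x1, x2)) / n
    s = math.sqrt(a * b - c * c)             # sqrt(det Sigma_hat)
    t = math.sqrt(a + b + 2 * s)             # trace of Sigma_hat^{1/2}
    i11, i22, i12 = (b + s) / (t * s), (a + s) / (t * s), -c / (t * s)
    z1 = [i11 * (p - m1) + i12 * (q - m2) for p, q in zip(x1, x2)]
    z2 = [i12 * (p - m1) + i22 * (q - m2) for p, q in zip(x1, x2)]
    return z1, z2

def rotate(z1, z2, theta):
    """Rows of W_S Z for the rotation W_S = [[cos, sin], [-sin, cos]]."""
    ct, st = math.cos(theta), math.sin(theta)
    r1 = [ct * p + st * q for p, q in zip(z1, z2)]
    r2 = [-st * p + ct * q for p, q in zip(z1, z2)]
    return r1, r2

def best_rotation(z1, z2, steps=90):
    """Step 2: grid search on [0, pi/2) for the angle maximizing total |kurtosis|."""
    best_theta, best_score = 0.0, -1.0
    for i in range(steps):
        theta = (math.pi / 2) * i / steps
        r1, r2 = rotate(z1, z2, theta)
        score = abs(kurtosis(r1)) + abs(kurtosis(r2))
        if score > best_score:
            best_theta, best_score = theta, score
    return best_theta

def abs_corr(u, v):
    """Absolute sample correlation (sign is not identifiable)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return abs(suv) / math.sqrt(suu * svv)

rng = random.Random(4)
n = 2000
s1 = [rng.uniform(-1, 1) for _ in range(n)]   # hypothetical sources
s2 = [rng.uniform(-1, 1) for _ in range(n)]
x1 = [a + 0.5 * b for a, b in zip(s1, s2)]    # assumed mixing matrix
x2 = [0.3 * a + b for a, b in zip(s1, s2)]

z1, z2 = sphere(x1, x2)
theta = best_rotation(z1, z2)
e1, e2 = rotate(z1, z2, theta)                # rows of S_S = W_S Z

# Each estimated row should match one true source, up to order and sign.
match1 = max(abs_corr(e1, s1), abs_corr(e1, s2))
match2 = max(abs_corr(e2, s1), abs_corr(e2, s2))
print(f"theta = {theta:.3f}, best |corr| per row: {match1:.3f}, {match2:.3f}")
```

Searching only over $[0, \pi/2)$ is enough here because a further quarter turn merely swaps the two rows and flips a sign. A grid search like this does not scale beyond a couple of dimensions, which is where gradient-type methods become necessary.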
ICA Algorithm
Careful look at identifiability:
• Seen already in above example
• Could rotate “square of data” in several ways, etc.
[Figure: identifiability, swaps and flips]
ICA Algorithm
Identifiability problem 1:
Generally can’t order rows of $S_S$ (& $S$)
(seen as swap above)
Since for a “permutation matrix” $P$
(pre-multiplication by $P$ “swaps rows”)
(post-multiplication by $P$ “swaps columns”)
For each column, $z = A_S s_S$, i.e. $s_S = A_S^{-1} z$,
i.e. $P s_S = P A_S^{-1} z = P W_S z$
So $P S_S$ and $P W_S$ are also ICA solut’ns
(i.e. $P S_S = P W_S Z$)
FastICA appears to order in terms of “how non-Gaussian”
ICA Algorithm
Identifiability problem 2:
Can’t find scale of elements of $s$
(seen as flips above)
Since for a (full rank) diagonal matrix $D$
(pre-multipl’n by $D$ is scalar mult’n of rows)
(post-multipl’n by $D$ is scalar mult’n of col’s)
Again for each col’n, $s_S = A_S^{-1} z$,
i.e. $D s_S = D A_S^{-1} z = D W_S z$
So $D S_S$ and $D W_S$ are also ICA solutions
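Both identifiability problems can be seen in a tiny numerical check: relabelling and rescaling the sources, while compensating in the mixing matrix, leaves the observed data unchanged. The matrices and the source vector below are hypothetical values chosen so that the arithmetic is exact.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(M, v):
    """Matrix-vector product for a 2x2 matrix and a length-2 vector."""
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

A = [[1.0, 0.5], [0.25, 2.0]]       # hypothetical mixing matrix
s = [0.5, -2.0]                     # one column of sources at a fixed time t

P = [[0.0, 1.0], [1.0, 0.0]]        # permutation: swaps the two sources
D = [[-1.0, 0.0], [0.0, 4.0]]       # diagonal: flips a sign and rescales
D_inv = [[-1.0, 0.0], [0.0, 0.25]]
P_inv = P                           # a 2x2 swap is its own inverse

s_alt = apply(matmul(D, P), s)              # relabelled sources D P s
A_alt = matmul(A, matmul(P_inv, D_inv))     # compensating mixing A P^{-1} D^{-1}

x = apply(A, s)
x_alt = apply(A_alt, s_alt)
print("x from (A, s):        ", x)
print("x from (A_alt, s_alt):", x_alt)
```

Since the data cannot distinguish `(A, s)` from `(A_alt, s_alt)`, any ICA output is determined only up to such a permutation and diagonal scaling.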
ICA Algorithm
Signal Processing Scale identification:
(Hyvärinen and Oja 1999)
Choose scale so each signal $s_i(t)$
has “unit average energy”: $\sum_t s_i(t)^2 = 1$
(preserves energy along rows of data matrix)
Explains “same scales” in
Cocktail Party Example
ICA & Non-Gaussianity
Explore main ICA principle:
For indep., non-Gaussian, standardized r.v.’s $X = (X_1, \ldots, X_d)^t$
Projections farther from coordinate axes are more Gaussian
For the dir’n vector $u_k = (u_{k,1}, \ldots, u_{k,d})^t$, where
$u_{k,j} = k^{-1/2}$ for $j = 1, \ldots, k$ and $u_{k,j} = 0$ for $j = k+1, \ldots, d$
(thus $\|u_k\| = 1$)
have $u_k^t X \approx N(0,1)$ in distribution, for large $d$ and $k$
ICA & Non-Gaussianity
Illustrative examples ($d = 100$, $n = 500$):
a. Uniform Marginals
b. Exponential Marginals
c. Bimodal Marginals
Study over range of values of $k$
[Figure: illustrative example, Uniform marginals]
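The uniform-marginals example can be imitated with a short simulation. The sketch below uses the slide's $d = 100$ and $n = 500$ but otherwise rests on assumptions: kurtosis is used as the measure of non-Gaussianity, the seed and the grid of $k$ values are arbitrary, and the function name is hypothetical. Because the fourth cumulant is additive over independent terms, the population kurtosis of $u_k^t X$ here is $-1.2 / k$.

```python
import math
import random

def kurtosis(x):
    """Sample E X^4 - 3 (E X^2)^2, computed after centering."""
    n = len(x)
    m = sum(x) / n
    m2 = sum((v - m) ** 2 for v in x) / n
    m4 = sum((v - m) ** 4 for v in x) / n
    return m4 - 3 * m2 ** 2

def projection_kurtosis(k, d=100, n=500, seed=0):
    """Kurtosis of u_k^t X over n draws of X with d indep. uniform entries."""
    rng = random.Random(seed)
    half_width = math.sqrt(3)       # Uniform(-sqrt 3, sqrt 3): mean 0, variance 1
    weight = 1 / math.sqrt(k)       # first k entries of u_k; the rest are 0
    projections = []
    for _ in range(n):
        x = [rng.uniform(-half_width, half_width) for _ in range(d)]
        projections.append(weight * sum(x[:k]))
    return kurtosis(projections)

for k in (1, 2, 4, 10, 25, 100):
    print(f"k = {k:3d}: kurtosis of projection: {projection_kurtosis(k):+.2f}")
```

With only $n = 500$ draws the estimates are noisy, so the trend toward zero as $k$ grows is clearer with a larger `n`.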
HDLSS Asymptotics
Personal Observations:
HDLSS world is…
Surprising (many times)
[Think I’ve got it, and then …]
Mathematically Beautiful (?)
Practically Relevant
The Future of HDLSS Asymptotics:
• “Contiguity” in Hypo Testing
• Rates of Convergence
• Improvements of DWD
  (e.g. other functions of distance than inverse)
• Many Others
It is still early days …
State of HDLSS Research:
Development Of Methods
Mathematical Assessment
…
(thanks to: defiant.corban.edu/gtipton/net-fun/iceberg.html)
Independent Component Analysis
Personal Viewpoint
Directions
(eg PCA DWD)
Independent Component Analysis
Personal Viewpoint
Directions that maximize independence
Independent Component Analysis
Personal Viewpoint
Directions that maximize independence
Motivating Context Signal Processing
ldquoBlind Source Separationrdquo
Independent Component Analysis
References
bull Cardoso (1993) (1st paper)
bull Lee (1998) (early book not reccorsquoed)
bull Hyvaumlrinen amp Oja (1998)
(excellent short tutorial)
bull Hyvaumlrinen Karhunen amp Oja (2001)
(detailed monograph)
ICA Motivating Example
ldquoCocktail party problemrdquo
ICA Motivating Example
ldquoCocktail party problemrdquo
bull Hear several simultaneous conversations
bull Would like to
separate them
ICA Motivating Example
ldquoCocktail party problemrdquo
bull Hear several simultaneous conversations
bull Would like to
separate them
Model for ldquoconversationsrdquo time series
and ts1 ts2
ICA Motivating Example
Model for ldquoconversationsrdquo time series
ICA Motivating Example
Mixed version of signals
And also a 2nd mixture
tsatsatx 2121111
tsatsatx 2221212
ICA Motivating Example
Mixed version of signals
ICA Motivating Example
Goal Recover signal
From data
For unknown mixture matrix
where for all
tx
txtx
2
1
ts
tsts
2
1
2221
1211
aa
aaA
sAx t
ICA Motivating Example
Goal is to find separating weights
so that for allxWs
W
t
ICA Motivating Example
Goal is to find separating weights
so that for all
Problem would be fine
but is unknown
xWs
W
1AW
t
A
ICA Motivating Example
Solutions for Cocktail Party Problem:
1. PCA (on population of 2-d vectors)
   $W$ = matrix of eigenvectors
   [direction of maximal variation doesn’t solve this problem]
   [Figure: 1. PCA, maximal variance and minimal variance components]
2. ICA (will describe method later)
   [Independent Components do solve it]
   [modulo sign changes and identification]
   [Figure: 2. ICA]
ICA Motivating Example
Recall original time series
ICA Motivating Example
Relation to OODA: recall data matrix
$X = (X_1 \cdots X_n) = \begin{pmatrix} X_{11} & \cdots & X_{1n} \\ \vdots & & \vdots \\ X_{d1} & \cdots & X_{dn} \end{pmatrix}$
Signal Processing: focus on rows
($d$ time series, indexed by $t = 1, \dots, n$)
OODA: focus on columns as data objects
($n$ data vectors)
Note: 2 viewpoints like “duals” for PCA
ICA Motivating Example
Scatterplot View (signal processing): Study
• Signals $\{(s_1(t), s_2(t)) : t = 1, \dots, n\}$
• Corresponding Scatterplot
• Data $\{(x_1(t), x_2(t)) : t = 1, \dots, n\}$
• Corresponding Scatterplot
[Figures: signals; signals with corresponding scatterplot; data; data with corresponding scatterplot]
ICA Motivating Example
Scatterplot View (signal processing): Plot
• Signals $\{(s_1(t), s_2(t)) : t = 1, \dots, n\}$ & Scatterplot
• Data $\{(x_1(t), x_2(t)) : t = 1, \dots, n\}$ & Scatterplot
• Scatterplots give hint how ICA is possible
• Affine transformation $x = A s$
  stretches independent signals into dependent
• “Inversion” is key to ICA (even with $A$ unknown)
ICA Motivating Example
Signals - Corresponding Scatterplot
Note: Independent, since known value of $s_1$ does not change distribution of $s_2$
ICA Motivating Example
Data - Corresponding Scatterplot
Note: Dependent, since known value of $x_1$ changes distribution of $x_2$
ICA Motivating Example
Why not PCA?
Finds direction of greatest variation
Which is wrong for signal separation
[Figure: PCA finds direction of greatest variation, wrong for signal separation]
ICA Algorithm
ICA Step 1:
• “sphere the data” (shown on right in scatterplot view)
• i.e. find linear transformation to make mean = $0$, cov = $I$
• i.e. work with $Z = \hat{\Sigma}^{-1/2} X$ (for mean-centered $X$)
• requires $X$ of full rank (at least $n \ge d$, i.e. no HDLSS)
• search for independence beyond linear and quadratic structure
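A minimal sketch of the sphering step for 2-d data, in plain Python. It assumes the closed form for the square root of a 2x2 symmetric positive definite matrix; the function names `sphere_2d` and `sample_cov`, the simulated sources, and the mixing coefficients are all hypothetical choices for illustration.

```python
import math
import random


def sphere_2d(x1, x2):
    """Center 2-d data, then multiply by the inverse square root of its sample covariance."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    c1 = [v - m1 for v in x1]
    c2 = [v - m2 for v in x2]
    # Sample covariance entries (divisor n - 1).
    a = sum(v * v for v in c1) / (n - 1)
    b = sum(u * v for u, v in zip(c1, c2)) / (n - 1)
    d = sum(v * v for v in c2) / (n - 1)
    det = a * d - b * b
    if det <= 0:
        raise ValueError("covariance is not of full rank")
    # Closed form for a 2x2 symmetric positive definite matrix M:
    # sqrt(M) = (M + sqrt(det) I) / sqrt(trace + 2 sqrt(det)).
    s = math.sqrt(det)
    t = math.sqrt(a + d + 2 * s)
    r11, r12, r22 = (a + s) / t, b / t, (d + s) / t
    # Invert the square root to get Sigma^{-1/2}.
    rdet = r11 * r22 - r12 * r12
    i11, i12, i22 = r22 / rdet, -r12 / rdet, r11 / rdet
    z1 = [i11 * u + i12 * v for u, v in zip(c1, c2)]
    z2 = [i12 * u + i22 * v for u, v in zip(c1, c2)]
    return z1, z2


def sample_cov(u, v):
    """Sample covariance of two equally long sequences (divisor n - 1)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((p - mu) * (q - mv) for p, q in zip(u, v)) / (n - 1)


# Hypothetical mixed data: two independent uniform sources, mixed and shifted.
random.seed(0)
s1 = [random.uniform(-1, 1) for _ in range(500)]
s2 = [random.uniform(-1, 1) for _ in range(500)]
x1 = [1.0 * u + 0.5 * v + 3.0 for u, v in zip(s1, s2)]
x2 = [0.3 * u + 1.0 * v - 2.0 for u, v in zip(s1, s2)]

z1, z2 = sphere_2d(x1, x2)
# The sphered data should have mean 0 and sample covariance equal to the identity.
print(sample_cov(z1, z1), sample_cov(z1, z2), sample_cov(z2, z2))
```

The `ValueError` branch corresponds to the full-rank requirement: if the sample covariance is singular, the inverse square root does not exist.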
ICA Algorithm
ICA Step 2:
• Find directions that make (sphered) data as independent as possible
• Recall “independence” means: joint distribution is product of marginals
• In cocktail party example:
  – Happens only when support parallel to axes
  – Otherwise have blank areas, but marginals are non-zero
[Figure: ICA Step 2, cocktail party example]
ICA Algorithm
Parallel Idea (and key to algorithm):
• Find directions that maximize non-Gaussianity
• Based on assumption: starting from independent coordinates
• Note: “most” projections are Gaussian (since projection is “linear combo”)
• Mathematics behind this: Diaconis and Freedman (1984)
ICA Algorithm
Worst case for ICA: Gaussian
• Then sphered data are independent
• So have independence in all (orthogonal) directions
• Thus can’t find useful directions
• Gaussian distribution is characterized by:
  Independent & spherically symmetric
ICA Algorithm
Criteria for non-Gaussianity / independence:
• Kurtosis $E X^4 - 3 (E X^2)^2$ (4th order cumulant)
• Negative Entropy
• Mutual Information
• Nonparametric Maximum Likelihood
• “Infomax” in Neural Networks
• Interesting connections between these
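As a small illustration of the first criterion, the sample version of $E X^4 - 3 (E X^2)^2$ can be computed as below. The function name `excess_kurtosis` and the simulated samples are assumptions for the example; the data are centered first, since the formula is the 4th cumulant only for mean-zero variables.

```python
import random


def excess_kurtosis(values):
    """Sample version of E[X^4] - 3 (E[X^2])^2, computed after centering the data."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    m2 = sum(c ** 2 for c in centered) / n
    m4 = sum(c ** 4 for c in centered) / n
    return m4 - 3 * m2 ** 2


random.seed(1)
gaussian = [random.gauss(0, 1) for _ in range(20000)]
uniform = [random.uniform(-1, 1) for _ in range(20000)]

# Near zero for Gaussian data; negative for the flat-topped uniform distribution
# (the population value for Uniform(-1, 1) is 1/5 - 3 * (1/3)^2 = -2/15).
print(excess_kurtosis(gaussian))
print(excess_kurtosis(uniform))
```

A value far from zero in either direction signals non-Gaussianity, which is why absolute kurtosis is a natural quantity to maximize.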
ICA Algorithm
Matlab Algorithm (optimizing any of above): “FastICA”
• Numerical gradient search method
• Can find directions iteratively
• Or by simultaneous optimization
  (note: PCA does both, but not ICA)
• Appears fast, with good defaults
• Careful about local optima
  (recommend several random restarts)
ICA Algorithm
FastICA Notational Summary:
1. First sphere data: $Z = \hat{\Sigma}^{-1/2} X$
2. Apply ICA: find $W_S$ to make rows of $S_S = W_S Z$ “independent”
3. Can transform back to original data scale: $\hat{S} = \hat{\Sigma}^{1/2} S_S$
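The two steps can be sketched end to end for 2-d data. Note that this is not FastICA: in place of its gradient search, the sketch uses a brute-force grid over rotation angles of the sphered data and keeps the angle that maximizes total absolute kurtosis. All names (`kurt`, `sphere_2d`, `rotate`, `ica_2d`, `corr`), the uniform sources, and the mixing coefficients are hypothetical.

```python
import math
import random


def kurt(v):
    """Sample E[X^4] - 3 (E[X^2])^2 after centering."""
    n = len(v)
    m = sum(v) / n
    m2 = sum((a - m) ** 2 for a in v) / n
    m4 = sum((a - m) ** 4 for a in v) / n
    return m4 - 3 * m2 ** 2


def sphere_2d(x1, x2):
    """Step 1: center, then apply the inverse square root of the sample covariance."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    c1, c2 = [a - m1 for a in x1], [b - m2 for b in x2]
    a = sum(u * u for u in c1) / (n - 1)
    b = sum(u * v for u, v in zip(c1, c2)) / (n - 1)
    d = sum(v * v for v in c2) / (n - 1)
    s = math.sqrt(a * d - b * b)
    t = math.sqrt(a + d + 2 * s)
    # Entries of Sigma^{1/2} (2x2 closed form), then its inverse.
    r11, r12, r22 = (a + s) / t, b / t, (d + s) / t
    rdet = r11 * r22 - r12 * r12
    i11, i12, i22 = r22 / rdet, -r12 / rdet, r11 / rdet
    return ([i11 * u + i12 * v for u, v in zip(c1, c2)],
            [i12 * u + i22 * v for u, v in zip(c1, c2)])


def rotate(z1, z2, theta):
    """Apply the rotation W_S(theta) to the sphered data."""
    c, s = math.cos(theta), math.sin(theta)
    return ([c * u + s * v for u, v in zip(z1, z2)],
            [-s * u + c * v for u, v in zip(z1, z2)])


def ica_2d(x1, x2, steps=180):
    """Step 2: grid search over [0, pi/2) for the rotation maximizing total |kurtosis|."""
    z1, z2 = sphere_2d(x1, x2)
    best_theta, best_score = 0.0, -1.0
    for i in range(steps):
        theta = (math.pi / 2) * i / steps
        y1, y2 = rotate(z1, z2, theta)
        score = abs(kurt(y1)) + abs(kurt(y2))
        if score > best_score:
            best_theta, best_score = theta, score
    return rotate(z1, z2, best_theta)


def corr(u, v):
    """Sample correlation of two equally long sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


# Hypothetical independent uniform sources and mixing coefficients.
random.seed(2)
n = 2000
s1 = [random.uniform(-1, 1) for _ in range(n)]
s2 = [random.uniform(-1, 1) for _ in range(n)]
x1 = [1.0 * a + 0.5 * b for a, b in zip(s1, s2)]
x2 = [0.3 * a + 1.0 * b for a, b in zip(s1, s2)]

y1, y2 = ica_2d(x1, x2)
# Each recovered component should line up with one source, up to order and sign.
match1 = max(abs(corr(y1, s1)), abs(corr(y1, s2)))
match2 = max(abs(corr(y2, s1)), abs(corr(y2, s2)))
print(match1, match2)
```

Searching only over $[0, \pi/2)$ is enough because a further rotation by $\pi/2$ merely swaps the two components and flips a sign, so the recovered components are compared with the sources by absolute correlation.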
ICA Algorithm
Careful look at identifiability
• Seen already in above example
• Could rotate “square of data” in several ways, etc.
[Figure: identifiability, swaps and flips]
ICA Algorithm
Identifiability problem 1:
Generally can’t order rows of $S_S$ (& $S$) (seen as swap above)
Since for a “permutation matrix” $P$
(pre-multiplication by $P$ “swaps rows”)
(post-multiplication by $P$ “swaps columns”)
For each column, $z = A_S s_S$, i.e. $s_S = A_S^{-1} z$,
i.e. $P s_S = P A_S^{-1} z = P W_S z$
So $P S_S$ and $P W_S$ are also ICA solutions
(i.e. $P S_S = P W_S Z$)
FastICA appears to order in terms of “how non-Gaussian”
ICA Algorithm
Identifiability problem 2:
Can’t find scale of elements of $s$ (seen as flips above)
Since for a (full rank) diagonal matrix $D$
(pre-multiplication by $D$ is scalar multiplication of rows)
(post-multiplication by $D$ is scalar multiplication of columns)
Again for each column, $s_S = A_S^{-1} z$,
i.e. $D s_S = D A_S^{-1} z = D W_S z$
So $D S_S$ and $D W_S$ are also ICA solutions
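Both identifiability problems can be checked numerically: swapping and rescaling the sources, while compensating in the mixing matrix, reproduces exactly the same data. The matrices below are illustrative values chosen for this sketch, and `matmul` is a hypothetical helper.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


# Hypothetical mixing matrix and a few source columns (illustrative values).
A = [[1.0, 0.5],
     [0.3, 1.0]]
S = [[0.2, -1.0, 0.7],    # row 1: signal s_1(t)
     [1.5, 0.4, -0.3]]    # row 2: signal s_2(t)

P = [[0.0, 1.0],          # permutation matrix: pre-multiplication swaps the two rows
     [1.0, 0.0]]
D = [[-1.0, 0.0],         # diagonal matrix: flips the sign of row 1, doubles row 2
     [0.0, 2.0]]
D_inv = [[-1.0, 0.0],
         [0.0, 0.5]]

X = matmul(A, S)

# Alternative sources D P S, with alternative mixing matrix A P^T D^{-1}.
# For this swap matrix, P^T = P, so A P^T D^{-1} = A P D^{-1}.
S_alt = matmul(D, matmul(P, S))
A_alt = matmul(A, matmul(P, D_inv))
X_alt = matmul(A_alt, S_alt)

# Largest entrywise difference between the two versions of the data.
max_diff = max(abs(p - q) for row_x, row_y in zip(X, X_alt)
               for p, q in zip(row_x, row_y))
print(max_diff)
```

Since the data cannot distinguish `(A, S)` from `(A_alt, S_alt)`, no method working from the data alone can recover the order, sign, or scale of the sources.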
ICA Algorithm
Signal Processing Scale identification
(Hyvärinen and Oja 1999):
Choose scale so each signal $s_i(t)$
has “unit average energy”: $\sum_t s_i(t)^2 = 1$
(preserves energy along rows of data matrix)
Explains “same scales” in
Cocktail Party Example
ICA & Non-Gaussianity
Explore main ICA principle:
For independent, non-Gaussian, standardized random variables $X = (X_1, \dots, X_d)^t$
Projections farther from coordinate axes are more Gaussian
For the direction vector $u_k = (u_{k,1}, \dots, u_{k,d})^t$, where
$u_{k,j} = k^{-1/2}$ for $j = 1, \dots, k$ and $u_{k,j} = 0$ for $j = k+1, \dots, d$
(thus $\|u_k\| = 1$),
have $u_k^t X \approx N(0,1)$ in distribution, for large $k$ and $d$
ICA & Non-Gaussianity
Illustrative examples ($d = 100$, $n = 500$):
a. Uniform Marginals
b. Exponential Marginals
c. Bimodal Marginals
Study over range of values of $k$
[Figure: illustrative example, Uniform marginals]
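The uniform-marginals case can be imitated with a short simulation: draw $n = 500$ vectors with $d = 100$ independent standardized uniform entries, project onto $u_k$, and watch the kurtosis of the projection move toward the Gaussian value 0 as $k$ grows. The function names, the seed, and the particular values of $k$ are assumptions for this sketch; the slides’ own plots are not reproduced here.

```python
import math
import random


def kurt(v):
    """Sample E[X^4] - 3 (E[X^2])^2 after centering."""
    n = len(v)
    m = sum(v) / n
    m2 = sum((a - m) ** 2 for a in v) / n
    m4 = sum((a - m) ** 4 for a in v) / n
    return m4 - 3 * m2 ** 2


def projections(k, d=100, n=500):
    """Project n vectors with d independent standardized uniform entries onto u_k."""
    half_width = math.sqrt(3.0)  # Uniform(-sqrt(3), sqrt(3)) has mean 0 and variance 1
    out = []
    for _ in range(n):
        x = [random.uniform(-half_width, half_width) for _ in range(d)]
        # u_k has k entries equal to k^{-1/2} followed by d - k zeros,
        # so u_k^t x is the sum of the first k entries divided by sqrt(k).
        out.append(sum(x[:k]) / math.sqrt(k))
    return out


random.seed(3)
results = {k: kurt(projections(k)) for k in (1, 2, 10, 100)}
for k, value in results.items():
    # In the population, the kurtosis of u_k^t X is -1.2 / k for uniform marginals.
    print(k, value)
```

With $k = 1$ the projection is a single uniform coordinate (clearly non-Gaussian); with $k = d$ it is an equally weighted average of all coordinates, the direction farthest from every axis.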
State of HDLSS Research
Development of Methods
Mathematical Assessment
…
(thanks to: defiant.corban.edu/gtipton/net-fun/iceberg.html)
Independent Component Analysis
Personal Viewpoint
Directions that maximize independence
Independent Component Analysis
Personal Viewpoint
Directions that maximize independence
Motivating Context Signal Processing
ldquoBlind Source Separationrdquo
Independent Component Analysis
References
bull Cardoso (1993) (1st paper)
bull Lee (1998) (early book not reccorsquoed)
bull Hyvaumlrinen amp Oja (1998)
(excellent short tutorial)
bull Hyvaumlrinen Karhunen amp Oja (2001)
(detailed monograph)
ICA Motivating Example
ldquoCocktail party problemrdquo
ICA Motivating Example
ldquoCocktail party problemrdquo
bull Hear several simultaneous conversations
bull Would like to
separate them
ICA Motivating Example
ldquoCocktail party problemrdquo
bull Hear several simultaneous conversations
bull Would like to
separate them
Model for ldquoconversationsrdquo time series
and ts1 ts2
ICA Motivating Example
Model for ldquoconversationsrdquo time series
ICA Motivating Example
Mixed version of signals
And also a 2nd mixture
tsatsatx 2121111
tsatsatx 2221212
ICA Motivating Example
Mixed version of signals
ICA Motivating Example
Goal Recover signal
From data
For unknown mixture matrix
where for all
tx
txtx
2
1
ts
tsts
2
1
2221
1211
aa
aaA
sAx t
ICA Motivating Example
Goal is to find separating weights
so that for allxWs
W
t
ICA Motivating Example
Goal is to find separating weights
so that for all
Problem would be fine
but is unknown
xWs
W
1AW
t
A
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
= matrix of eigenvectorsW
ICA Motivating Example
1 PCA (on population of 2-d vectors)
Maximal
Variance
Minimal
Variance
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
[direction of maximal variation
doesnrsquot solve this problem]
2ICA (will describe method later)
ICA Motivating Example
2 ICA (will describe method later)
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
[direction of maximal variation
doesnrsquot solve this problem]
2ICA (will describe method later)
[Independent Components do solve it]
[modulo sign changes and identification]
ICA Motivating Example
Recall original time series
ICA Motivating ExampleRelation to OODA recall data matrix
dnd
n
n
XX
XX
XXX
1
111
1
ICA Motivating ExampleRelation to OODA recall data matrix
Signal Processing focus on rows
( time series indexed by )
OODA focus on columns as data objects
( data vectors)
dnd
n
n
XX
XX
XXX
1
111
1
d nt 1
n
ICA Motivating ExampleRelation to OODA recall data matrix
Signal Processing focus on rows
( time series indexed by )
OODA focus on columns as data objects
( data vectors)
Note 2 viewpoints like ldquodualsrdquo for PCA
dnd
n
n
XX
XX
XXX
1
111
1
d nt 1
n
ICA Motivating Example
Scatterplot View (signal processing): Study
• Signals $\{(s_1(t), s_2(t)) : t = 1, \dots, n\}$
• Corresponding Scatterplot
• Data $\{(x_1(t), x_2(t)) : t = 1, \dots, n\}$
• Corresponding Scatterplot
(Figures: the signals and their scatterplot; the data and their scatterplot)
ICA Motivating Example
Scatterplot View (signal processing): Plot
• Signals & Scatterplot, $\{(s_1(t), s_2(t)) : t = 1, \dots, n\}$
• Data & Scatterplot, $\{(x_1(t), x_2(t)) : t = 1, \dots, n\}$
• Scatterplots give hint how ICA is possible
• Affine transformation $x = A s$ stretches independent signals into dependent
• "Inversion" is key to ICA (even with $A$ unknown)
ICA Motivating Example
Signals - Corresponding Scatterplot
Note: independent, since a known value of $s_1$ does not change the distribution of $s_2$
ICA Motivating Example
Data - Corresponding Scatterplot
Note: dependent, since a known value of $x_1$ changes the distribution of $x_2$
ICA Motivating Example
Why not PCA?
• Finds direction of greatest variation
• Which is wrong for signal separation
(Figure: PCA finds the direction of greatest variation, which is wrong for signal separation)
ICA Algorithm
ICA Step 1:
• "sphere the data" (shown on right in scatterplot view)
• i.e. find linear transformation to make mean = $0$, cov = $I$
• i.e. work with $Z = \hat\Sigma^{-1/2} X$ (for mean-centered $X$)
• requires $X$ ($d \times n$) of full rank (at least $n \ge d$, i.e. no HDLSS)
• then search for independence beyond linear and quadratic structure
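The sphering step can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `sphere`, the toy mixing matrix `A` and the uniform toy data are not from the slides, and the inverse square root is taken through an eigendecomposition of the sample covariance.

```python
import numpy as np

def sphere(X):
    """Sphere a d x n data matrix whose columns are data vectors."""
    Xc = X - X.mean(axis=1, keepdims=True)        # center each row (variable)
    Sigma = Xc @ Xc.T / (X.shape[1] - 1)          # d x d sample covariance
    evals, evecs = np.linalg.eigh(Sigma)          # Sigma is symmetric
    if evals.min() <= 1e-12 * evals.max():
        raise ValueError("covariance is not full rank (e.g. HDLSS data)")
    Sigma_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    return Sigma_inv_sqrt @ Xc, Sigma

rng = np.random.default_rng(0)
A = np.array([[1.0, 0.5], [0.3, 1.0]])            # hypothetical mixing matrix
X = A @ rng.uniform(-1, 1, size=(2, 1000))        # toy 2 x n data
Z, Sigma_hat = sphere(X)
cov_Z = np.cov(Z)                                 # identity, up to rounding
```

The rank check mirrors the full-rank requirement: with fewer data vectors than dimensions, the sample covariance has zero eigenvalues and the inverse square root does not exist.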
ICA Algorithm
ICA Step 2:
• Find directions that make (sphered) data as independent as possible
• Recall "independence" means: joint distribution is product of marginals
• In cocktail party example:
  - Happens only when support parallel to axes
  - Otherwise have blank areas, but marginals are non-zero
(Figure: ICA Step 2, cocktail party example)
ICA Algorithm
Parallel Idea (and key to algorithm):
• Find directions that maximize non-Gaussianity
• Based on assumption: starting from independent coordinates
• Note: "most" projections are Gaussian (since projection is "linear combo")
• Mathematics behind this: Diaconis and Freedman (1984)
ICA Algorithm
Worst case for ICA: Gaussian
• Then sphered data are independent
• So have independence in all (orthogonal) directions
• Thus can't find useful directions
• Gaussian distribution is characterized by: independent & spherically symmetric
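One way to see the worst case, assuming sphered Gaussian data $Z \sim N_d(0, I)$ and an arbitrary orthogonal matrix $U$ (a symbol introduced here only for illustration):

$$U Z \sim N_d(U 0, \; U I U^t) = N_d(0, I)$$

So every rotated coordinate system again has independent standard Gaussian coordinates, and no rotation stands out as more independent than another.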
ICA Algorithm
Criteria for non-Gaussianity / independence:
• Kurtosis $E X^4 - 3 (E X^2)^2$ (4th order cumulant)
• Negative Entropy
• Mutual Information
• Nonparametric Maximum Likelihood
• "Infomax" in Neural Networks
• Interesting connections between these
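As a small illustration of the kurtosis criterion, the sketch below computes the sample version of $E X^4 - 3 (E X^2)^2$ for three unit-variance samples. The function name `kurtosis`, the sample size and the choice of distributions are assumptions made for this example. The theoretical values are 0 for the Gaussian, -1.2 for the uniform and 6 for the exponential.

```python
import numpy as np

def kurtosis(x):
    """Sample version of E X^4 - 3 (E X^2)^2, after centering x at its mean."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return np.mean(x ** 4) - 3.0 * np.mean(x ** 2) ** 2

rng = np.random.default_rng(1)
n = 200_000
gauss = rng.standard_normal(n)                       # variance 1
unif = rng.uniform(-np.sqrt(3), np.sqrt(3), n)       # variance 1
expo = rng.exponential(1.0, n)                       # variance 1

k_gauss, k_unif, k_expo = kurtosis(gauss), kurtosis(unif), kurtosis(expo)
```

Negative values indicate lighter tails than the Gaussian, positive values heavier tails. Either sign counts as non-Gaussian, so a search would typically work with the absolute value.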
ICA Algorithm
Matlab Algorithm (optimizing any of above): "FastICA"
• Numerical gradient search method
• Can find directions iteratively
• Or by simultaneous optimization
  (note: PCA does both, but not ICA)
• Appears fast, with good defaults
• Careful about local optima
  (Recommended: several random restarts)
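To make the iterative search concrete, here is a minimal NumPy sketch of a kurtosis-based fixed-point iteration with deflation (one direction at a time). It is not the Matlab FastICA package and makes no claim to match its defaults. The function names, the toy uniform sources and the mixing matrix `A` are assumptions for illustration.

```python
import numpy as np

def sphere(X):
    """Center the rows of X and transform to identity sample covariance."""
    Xc = X - X.mean(axis=1, keepdims=True)
    evals, evecs = np.linalg.eigh(Xc @ Xc.T / Xc.shape[1])
    return evecs @ np.diag(evals ** -0.5) @ evecs.T @ Xc

def ica_kurtosis(Z, n_iter=200, tol=1e-10, seed=0):
    """Find orthonormal rows of W one at a time on sphered data Z (d x n)."""
    rng = np.random.default_rng(seed)
    d = Z.shape[0]
    W = np.zeros((d, d))
    for i in range(d):
        w = rng.standard_normal(d)                 # random start
        w /= np.linalg.norm(w)
        for _ in range(n_iter):
            y = w @ Z                              # current projection
            w_new = (Z * y ** 3).mean(axis=1) - 3.0 * w   # kurtosis fixed-point step
            w_new -= W[:i].T @ (W[:i] @ w_new)     # stay orthogonal to earlier rows
            w_new /= np.linalg.norm(w_new)
            converged = 1.0 - abs(w_new @ w) < tol # sign flips are allowed
            w = w_new
            if converged:
                break
        W[i] = w
    return W

rng = np.random.default_rng(3)
S = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, 5000))  # toy independent sources
A = np.array([[1.0, 0.5], [0.3, 1.0]])                    # hypothetical mixing matrix
Z = sphere(A @ S)
W_S = ica_kurtosis(Z)
S_S = W_S @ Z
corr = np.abs(np.corrcoef(S_S, S)[:2, 2:])   # |correlation| of recovered vs true rows
```

Changing `seed` gives a different random start, which is one simple way to try the several random restarts suggested above.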
ICA Algorithm
FastICA Notational Summary:
1. First sphere data: $Z = \hat\Sigma^{-1/2} X$
2. Apply ICA: Find $W_S$ to make rows of $S_S = W_S Z$ "independent"
3. Can transform back to original data scale: $\hat S = \hat\Sigma^{1/2} S_S$
ICA Algorithm
Careful look at identifiability:
• Seen already in above example
• Could rotate "square of data" in several ways, etc.
(Figure: identifiability, swaps and flips)
ICA Algorithm
Identifiability problem 1:
Generally can't order rows of $S_S$ (& $\hat S$)
(seen as "swap" above)
Since for a "permutation matrix" $P$
(pre-multiplication by $P$ "swaps rows")
(post-multiplication by $P$ "swaps columns")
For each column, $z = A_S s_S$, i.e. $s_S = A_S^{-1} z$,
i.e. $P s_S = P A_S^{-1} z = P W_S z$
So $P S_S$ and $P W_S$ are also ICA solutions
(i.e. $P S_S = P W_S Z$)
FastICA appears to order in terms of "how non-Gaussian"
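The permutation argument can be checked numerically. In the sketch below the matrices `A_S`, `S_S` and `P` are toy values chosen for illustration, and `W_S` is simply the inverse of the known `A_S`.

```python
import numpy as np

rng = np.random.default_rng(2)
S_S = rng.uniform(-1, 1, size=(2, 5))        # toy source matrix, rows are signals
A_S = np.array([[2.0, 1.0], [1.0, 3.0]])     # hypothetical mixing matrix
Z = A_S @ S_S                                # each column: z = A_S s_S
W_S = np.linalg.inv(A_S)

P = np.array([[0.0, 1.0], [1.0, 0.0]])       # 2 x 2 permutation matrix
swapped_rows = P @ S_S                       # pre-multiplication swaps rows
swapped_cols = A_S @ P                       # post-multiplication swaps columns
alt_S = (P @ W_S) @ Z                        # P W_S also "unmixes", giving P S_S
alt_A = A_S @ P.T                            # matching mixing matrix
```

Both `(W_S, S_S)` and `(P @ W_S, P @ S_S)` reproduce the same `Z`, so the data alone cannot fix the order of the rows.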
ICA Algorithm
Identifiability problem 2:
Can't find scale of elements of $s$
(seen as "flips" above)
Since for a (full rank) diagonal matrix $D$
(pre-multiplication by $D$ is scalar multiplication of rows)
(post-multiplication by $D$ is scalar multiplication of columns)
Again for each column, $s_S = A_S^{-1} z$,
i.e. $D s_S = D A_S^{-1} z = D W_S z$
So $D S_S$ and $D W_S$ are also ICA solutions
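The same kind of check works for the scale ambiguity. All matrices below are toy values chosen for illustration. The diagonal entry -1 produces a "flip" of the first signal and 2.5 rescales the second.

```python
import numpy as np

rng = np.random.default_rng(4)
S_S = rng.uniform(-1, 1, size=(2, 6))        # toy source matrix, rows are signals
A_S = np.array([[2.0, 1.0], [1.0, 3.0]])     # hypothetical mixing matrix
Z = A_S @ S_S
W_S = np.linalg.inv(A_S)

D = np.diag([-1.0, 2.5])                     # full-rank diagonal matrix
alt_S = (D @ W_S) @ Z                        # equals D S_S: rows rescaled
alt_A = A_S @ np.linalg.inv(D)               # columns of A_S rescaled to compensate
```

Since `alt_A @ alt_S` again equals `Z`, the sign and scale of each recovered signal have to be fixed by a convention rather than by the data.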
ICA Algorithm
Signal Processing Scale identification:
(Hyvärinen and Oja 1999)
Choose scale so each signal $s_i(t)$
has "unit average energy": $\sum_t s_i(t)^2 = 1$
(preserves energy along rows of data matrix)
Explains "same scales" in
Cocktail Party Example
ICA & Non-Gaussianity
Explore main ICA principle:
For independent, non-Gaussian, standardized r.v.'s $X = (X_1, \dots, X_d)^t$,
projections farther from coordinate axes are more Gaussian
For the direction vector $u_k = (u_{k,1}, \dots, u_{k,d})^t$,
where $u_{k,j} = k^{-1/2}$ for $j = 1, \dots, k$ and $u_{k,j} = 0$ for $j = k+1, \dots, d$
(thus $\|u_k\| = 1$),
have $u_k^t X \approx N(0,1)$ (in distribution) for large $k$ and $d$
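A rough way to quantify "more Gaussian" is through the kurtosis. This adds the assumption, not stated on the slide, that the $X_j$ are identically distributed with finite fourth moment and common kurtosis (4th cumulant) $\kappa$. Cumulants of independent summands add, and the 4th cumulant of $aX$ is $a^4$ times that of $X$, so

$$\kappa(u_k^t X) = \sum_{j=1}^{k} \left(k^{-1/2}\right)^4 \kappa(X_j) = \frac{k \, \kappa}{k^2} = \frac{\kappa}{k}$$

Under these assumptions the kurtosis of the projection shrinks toward the Gaussian value 0 at rate $1/k$.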
ICA & Non-Gaussianity
Illustrative examples ($d = 100$, $n = 500$):
a. Uniform Marginals
b. Exponential Marginals
c. Bimodal Marginals
Study over range of values of $k$
(Figure: illustrative example, uniform marginals)
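The uniform-marginals case can be simulated along these lines. This is a sketch, not the code behind the slides: the function names, the seed and the particular values of $k$ are assumptions, and sample kurtosis is used here as the measure of non-Gaussianity.

```python
import numpy as np

def kurtosis(y):
    """Sample version of E Y^4 - 3 (E Y^2)^2, after centering y at its mean."""
    y = y - y.mean()
    return np.mean(y ** 4) - 3.0 * np.mean(y ** 2) ** 2

def projection_kurtosis(k, d=100, n=500, seed=0):
    """Kurtosis of u_k^t X for independent standardized uniform coordinates."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(d, n))   # variance 1 per entry
    u = np.zeros(d)
    u[:k] = k ** -0.5                                       # first k entries, unit length
    return kurtosis(u @ X)

# d = 100, n = 500 as on the slide; the list of k values is an illustrative choice.
results = {k: projection_kurtosis(k) for k in (1, 2, 5, 10, 50, 100)}
```

With only $n = 500$ points the individual estimates are noisy, so the trend toward 0 as $k$ grows is easier to see with a larger `n`.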
Independent Component Analysis
Personal Viewpoint:
Directions that maximize independence
Motivating Context: Signal Processing
"Blind Source Separation"
Independent Component Analysis
References:
• Cardoso (1993) (1st paper)
• Lee (1998) (early book, not recommended)
• Hyvärinen & Oja (1998) (excellent short tutorial)
• Hyvärinen, Karhunen & Oja (2001) (detailed monograph)
ICA Motivating Example
"Cocktail party problem":
• Hear several simultaneous conversations
• Would like to separate them
Model for "conversations": time series $s_1(t)$ and $s_2(t)$
(Figure: model for "conversations" time series)
ICA Motivating Example
Mixed version of signals: $x_1(t) = a_{11} s_1(t) + a_{12} s_2(t)$
And also a 2nd mixture: $x_2(t) = a_{21} s_1(t) + a_{22} s_2(t)$
(Figure: mixed version of signals)
ICA Motivating Example
Goal: Recover signal $s(t) = \begin{pmatrix} s_1(t) \\ s_2(t) \end{pmatrix}$
From data $x(t) = \begin{pmatrix} x_1(t) \\ x_2(t) \end{pmatrix}$
For unknown mixture matrix $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$
where $x = A s$ for all $t$
ICA Motivating Example
Goal is to find separating weights $W$
so that $s = W x$ for all $t$
Problem: $W = A^{-1}$ would be fine,
but $A$ is unknown
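The mixing model and the ideal (but unavailable) solution can be sketched as follows. The two toy signals and the entries of `A` are assumptions for illustration. In the real problem `A` is unknown, so the last two lines are exactly what cannot be done directly.

```python
import numpy as np

n = 1000
t = np.arange(1, n + 1)

# Two toy "conversations": a sine wave and a sawtooth (illustrative choices).
s1 = np.sin(2 * np.pi * t / 50)
s2 = 2 * ((t / 37) % 1) - 1
S = np.vstack([s1, s2])                  # 2 x n, rows are time series

A = np.array([[1.0, 0.5],                # hypothetical mixing matrix
              [0.3, 1.0]])
X = A @ S                                # x(t) = A s(t) for all t

W = np.linalg.inv(A)                     # only possible because A is known here
S_recovered = W @ X                      # s(t) = W x(t)
```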
ICA Motivating Example
ldquoCocktail party problemrdquo
ICA Motivating Example
ldquoCocktail party problemrdquo
bull Hear several simultaneous conversations
bull Would like to
separate them
ICA Motivating Example
ldquoCocktail party problemrdquo
bull Hear several simultaneous conversations
bull Would like to
separate them
Model for ldquoconversationsrdquo time series
and ts1 ts2
ICA Motivating Example
Model for ldquoconversationsrdquo time series
ICA Motivating Example
Mixed version of signals
And also a 2nd mixture
tsatsatx 2121111
tsatsatx 2221212
ICA Motivating Example
Mixed version of signals
ICA Motivating Example
Goal Recover signal
From data
For unknown mixture matrix
where for all
tx
txtx
2
1
ts
tsts
2
1
2221
1211
aa
aaA
sAx t
ICA Motivating Example
Goal is to find separating weights
so that for allxWs
W
t
ICA Motivating Example
Goal is to find separating weights
so that for all
Problem would be fine
but is unknown
xWs
W
1AW
t
A
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
= matrix of eigenvectorsW
ICA Motivating Example
1 PCA (on population of 2-d vectors)
Maximal
Variance
Minimal
Variance
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
[direction of maximal variation
doesnrsquot solve this problem]
2ICA (will describe method later)
ICA Motivating Example
2 ICA (will describe method later)
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
[direction of maximal variation
doesnrsquot solve this problem]
2ICA (will describe method later)
[Independent Components do solve it]
[modulo sign changes and identification]
ICA Motivating Example
Recall original time series
ICA Motivating ExampleRelation to OODA recall data matrix
dnd
n
n
XX
XX
XXX
1
111
1
ICA Motivating ExampleRelation to OODA recall data matrix
Signal Processing focus on rows
( time series indexed by )
OODA focus on columns as data objects
( data vectors)
dnd
n
n
XX
XX
XXX
1
111
1
d nt 1
n
ICA Motivating ExampleRelation to OODA recall data matrix
Signal Processing focus on rows
( time series indexed by )
OODA focus on columns as data objects
( data vectors)
Note 2 viewpoints like ldquodualsrdquo for PCA
dnd
n
n
XX
XX
XXX
1
111
1
d nt 1
n
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals nttsts 1)()( 21
ICA Motivating Example
Study Signals nttsts 1)()( 21
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals
bull Corresponding Scatterplot
nttsts 1)()( 21
ICA Motivating Example
Signals - Corresponding Scatterplot
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals
bull Corresponding Scatterplot
bull Data
nttsts 1)()( 21
nttxtx 1)()( 21
ICA Motivating Example
Study Data nttxtx 1)()( 21
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals
bull Corresponding Scatterplot
bull Data
bull Corresponding Scatterplot
nttsts 1)()( 21
nttxtx 1)()( 21
ICA Motivating Example
Data - Corresponding Scatterplot
ICA Motivating Example
Scatterplot View (signal processing) Plot
bull Signals amp Scatrsquoplot
bull Data amp Scatrsquoplot
bull Scatterplots give hint how ICA is possible
bull Affine transformation
Stretches indeprsquot signals into deprsquot
bull Inversion is key to ICA
(even w unknown)
nttsts 1)()( 21
nttxtx 1)()( 21
A
sAx
ICA Motivating Example
Signals - Corresponding Scatterplot
Note Independent
Since Known Value
Of s1 Does Not
Change Distribution
of s2
ICA Motivating Example
Data - Corresponding Scatterplot
Note Dependent
Since Known Value
Of s1 Changes
Distribution of s2
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
ICA Motivating Example
PCA - Finds direction of greatest variation
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
Which is wrong for signal separation
ICA Motivating Example
PCA - Wrong for signal separation
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
Which is wrong for signal separation
ICA AlgorithmICA Step 1bull sphere the data
(shown on right in scatterplot view)
ICA AlgorithmICA Step 1 sphere the data
ICA AlgorithmICA Step 1bull ldquosphere the datardquo
(shown on right in scatterplot view)bull ie find linear transfrsquon to make
mean = cov = bull ie work with
0 I ˆ 21 XZ
ICA AlgorithmICA Step 1bull ldquosphere the datardquo
(shown on right in scatterplot view)bull ie find linear transfrsquon to make
mean = cov = bull ie work withbull requires of full rank
(at least ie no HDLSS)
0 I ˆ 21 XZ
dnX
ICA AlgorithmICA Step 1bull ldquosphere the datardquo
(shown on right in scatterplot view)bull ie find linear transfrsquon to make
mean = cov = bull ie work withbull requires of full rank
(at least ie no HDLSS)bull search for independent beyond
linear and quadratic structure
0 I ˆ 21 XZ
dnX
ICA Algorithm
ICA Step 2:
• Find directions that make (sphered) data as independent as possible
• Recall "independence" means: joint distribution is product of marginals
• In cocktail party example:
  - Happens only when support is parallel to axes
  - Otherwise have blank areas, but marginals are non-zero
[Figure: ICA Step 2, cocktail party example]
ICA Algorithm
Parallel Idea (and key to algorithm):
• Find directions that maximize non-Gaussianity
• Based on assumption: starting from independent coordinates
• Note: "most" projections are Gaussian (since projection is a "linear combination")
• Mathematics behind this: Diaconis and Freedman (1984)
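A rough sketch of the "maximize non-Gaussianity" idea in 2-d, assuming absolute kurtosis as the criterion and a plain one-degree grid search over directions. This is not the FastICA algorithm; the rotation angle, sample size and names are illustrative assumptions. Two independent, unit-variance uniform signals are rotated by a known angle (so the data are already approximately sphered), and the search looks for the direction whose projection is least Gaussian.

```python
import math
import random

def kurtosis(sample):
    """Fourth cumulant E X^4 - 3 (E X^2)^2 of a sample, after mean-centering."""
    n = len(sample)
    m = sum(sample) / n
    m2 = sum((v - m) ** 2 for v in sample) / n
    m4 = sum((v - m) ** 4 for v in sample) / n
    return m4 - 3.0 * m2 ** 2

random.seed(2)
n = 10_000
r3 = math.sqrt(3.0)                      # uniform on [-sqrt(3), sqrt(3)] has variance 1
s1 = [random.uniform(-r3, r3) for _ in range(n)]
s2 = [random.uniform(-r3, r3) for _ in range(n)]

# Rotate the independent signals by a known (hypothetical) angle.
true_deg = 30
cr, sr = math.cos(math.radians(true_deg)), math.sin(math.radians(true_deg))
z1 = [cr * a - sr * b for a, b in zip(s1, s2)]
z2 = [sr * a + cr * b for a, b in zip(s1, s2)]

def abs_kurt_at(deg):
    """Absolute kurtosis of the data projected on the direction at angle deg."""
    cw, sw = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return abs(kurtosis([cw * a + sw * b for a, b in zip(z1, z2)]))

# Directions are only identifiable modulo 90 degrees (swaps and flips).
best_deg = max(range(90), key=abs_kurt_at)
err_deg = min((best_deg - true_deg) % 90, (true_deg - best_deg) % 90)
```

The least Gaussian direction should fall near the rotation angle, modulo 90 degrees; projections at 45 degrees to the signal axes are the most Gaussian.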
ICA Algorithm
Worst case for ICA: Gaussian
• Then sphered data are independent
• So have independence in all (orthogonal) directions
• Thus can't find useful directions
• Gaussian distribution is characterized by: independent & spherically symmetric
ICA Algorithm
Criteria for non-Gaussianity / independence:
• Kurtosis $E X^4 - 3 (E X^2)^2$ (4th order cumulant)
• Negative Entropy
• Mutual Information
• Nonparametric Maximum Likelihood
• "Infomax" in Neural Networks
• Interesting connections between these
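As a small illustration of the kurtosis criterion, the sketch below estimates $E X^4 - 3 (E X^2)^2$ from simulated samples. The three distributions, the sample size and the function name are illustrative assumptions; the population values are 0 for the standard Gaussian, -1.2 for the unit-variance uniform, and 6 for the exponential with rate 1.

```python
import random

def kurtosis(sample):
    """Fourth cumulant E X^4 - 3 (E X^2)^2 of a sample, after mean-centering."""
    n = len(sample)
    m = sum(sample) / n
    centered = [v - m for v in sample]
    m2 = sum(v ** 2 for v in centered) / n
    m4 = sum(v ** 4 for v in centered) / n
    return m4 - 3.0 * m2 ** 2

random.seed(1)
n = 100_000
k_gauss = kurtosis([random.gauss(0.0, 1.0) for _ in range(n)])              # near 0
k_unif = kurtosis([random.uniform(-3 ** 0.5, 3 ** 0.5) for _ in range(n)])  # negative (flat)
k_expo = kurtosis([random.expovariate(1.0) for _ in range(n)])              # positive (heavy tail)
```

Kurtosis near zero is what makes Gaussian directions uninformative; both signs of departure from zero count as non-Gaussian.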
ICA Algorithm
Matlab Algorithm (optimizing any of above): "FastICA"
• Numerical gradient search method
• Can find directions iteratively
• Or by simultaneous optimization (note: PCA does both, but not ICA)
• Appears fast, with good defaults
• Careful about local optima (recommend several random restarts)
ICA Algorithm
FastICA Notational Summary:
1. First sphere data: $Z = \hat{\Sigma}^{-1/2} X$
2. Apply ICA: find $W_S$ to make rows of $S_S = W_S Z$ "independent"
3. Can transform back to original data scale: $\hat{S} = \hat{\Sigma}^{1/2} S_S$
ICA Algorithm
Careful look at identifiability:
• Seen already in above example
• Could rotate "square of data" in several ways, etc.
[Figure: identifiability, swaps and flips]
ICA Algorithm
Identifiability problem 1:
Generally can't order rows of $S_S$ (& $S$) (seen as swap above)
Since for a "permutation matrix" $P$
(pre-multiplication by $P$ "swaps rows")
(post-multiplication by $P$ "swaps columns")
For each column, $z = A_S s_S$, i.e. $s_S = A_S^{-1} z$,
i.e. $P s_S = P A_S^{-1} z = P W_S z$
So $P S_S$ and $P W_S$ are also ICA solutions (i.e. $P S_S = P W_S Z$)
FastICA appears to order in terms of "how non-Gaussian"
ICA Algorithm
Identifiability problem 2:
Can't find scale of elements of $s$ (seen as flips above)
Since for a (full rank) diagonal matrix $D$
(pre-multiplication by $D$ is scalar multiplication of rows)
(post-multiplication by $D$ is scalar multiplication of columns)
Again for each column, $s_S = A_S^{-1} z$,
i.e. $D s_S = D A_S^{-1} z = D W_S z$
So $D S_S$ and $D W_S$ are also ICA solutions
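Both identifiability problems can be checked numerically. The sketch below uses a hypothetical 2 x 2 mixing matrix and a tiny signal matrix (all values are illustrative assumptions) to show that pre-multiplying the separating matrix by a permutation matrix or a full-rank diagonal matrix returns the same signals, only swapped or rescaled and flipped.

```python
def matmul(a, b):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inv2(m):
    """Inverse of a 2 x 2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2.0, 1.0], [1.0, 3.0]]                 # hypothetical mixing matrix
S = [[1.0, -1.0, 0.5], [0.0, 2.0, -1.5]]     # rows are signals, columns are time points
Z = matmul(A, S)                             # observed mixtures, column by column z = A s
W = inv2(A)                                  # separating matrix, so that W Z = S

P = [[0.0, 1.0], [1.0, 0.0]]                 # permutation: swaps the two rows
D = [[-1.0, 0.0], [0.0, 4.0]]                # diagonal: flips row 1, rescales row 2

swapped = matmul(matmul(P, W), Z)            # equals P S: same signals, other order
scaled = matmul(matmul(D, W), Z)             # equals D S: same signals, other sign and scale
```

Rows of `swapped` and `scaled` are exactly as independent as the rows of `S`, which is why no independence criterion can distinguish these solutions.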
ICA Algorithm
Signal Processing Scale identification (Hyvärinen and Oja 1999):
Choose scale so each signal $s_i(t)$ has "unit average energy": $\sum_t s_i(t)^2 = 1$
(preserves energy along rows of data matrix)
Explains "same scales" in Cocktail Party Example
ICA & Non-Gaussianity
Explore main ICA principle:
For independent, non-Gaussian, standardized random variables $X = (X_1, \ldots, X_d)^t$,
projections farther from coordinate axes are more Gaussian
For the direction vector $u_k = (u_{k,1}, \ldots, u_{k,d})^t$, where
$u_{k,j} = k^{-1/2}$ for $j = 1, \ldots, k$ and $u_{k,j} = 0$ for $j = k+1, \ldots, d$
(thus $\|u_k\| = 1$)
have $u_k^t X \approx N(0,1)$ in distribution, for large $d$ and $k$
ICA & Non-Gaussianity
Illustrative examples (d = 100, n = 500):
a. Uniform Marginals
b. Exponential Marginals
c. Bimodal Marginals
Study over range of values of k
[Figure: illustrative example, uniform marginals]
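The uniform-marginals case can be sketched numerically. The code below, under stated assumptions, projects independent unit-variance uniform coordinates on the direction with k equal entries $k^{-1/2}$ and reports the kurtosis of the projection for a few values of k. Since the direction is zero beyond the first k coordinates, only those k coordinates are simulated; the sample size (larger than the n = 500 of the slides, to reduce noise), the values of k and the function names are illustrative choices. In population terms the fourth cumulant of the projection is -1.2 / k.

```python
import math
import random

def kurtosis(sample):
    """Fourth cumulant E X^4 - 3 (E X^2)^2 of a sample, after mean-centering."""
    n = len(sample)
    m = sum(sample) / n
    m2 = sum((v - m) ** 2 for v in sample) / n
    m4 = sum((v - m) ** 4 for v in sample) / n
    return m4 - 3.0 * m2 ** 2

def projected_kurtosis(k, n, rng):
    """Kurtosis of the projection of n vectors with independent uniform coordinates."""
    r3 = math.sqrt(3.0)                  # uniform on [-sqrt(3), sqrt(3)] is standardized
    scale = 1.0 / math.sqrt(k)           # each non-zero entry of the direction vector
    proj = [scale * sum(rng.uniform(-r3, r3) for _ in range(k)) for _ in range(n)]
    return kurtosis(proj)

rng = random.Random(3)
n = 20_000
kurt_by_k = {k: projected_kurtosis(k, n, rng) for k in (1, 4, 25)}
```

For k = 1 the projection is a coordinate axis and keeps the full uniform kurtosis; as k grows the estimate shrinks toward the Gaussian value 0.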
ICA Motivating Example
"Cocktail party problem":
• Hear several simultaneous conversations
• Would like to separate them
Model for "conversations": time series $s_1(t)$ and $s_2(t)$
[Figure: model for "conversations", time series]
ICA Motivating Example
Mixed version of signals:
$x_1(t) = a_{11} s_1(t) + a_{12} s_2(t)$
And also a 2nd mixture:
$x_2(t) = a_{21} s_1(t) + a_{22} s_2(t)$
[Figure: mixed version of signals]
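ICA Motivating Example
Goal: Recover signal $s(t) = \begin{pmatrix} s_1(t) \\ s_2(t) \end{pmatrix}$
From data $x(t) = \begin{pmatrix} x_1(t) \\ x_2(t) \end{pmatrix}$
For unknown mixture matrix $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$
where $x = A s$, for all $t$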
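A small sketch of the mixing model $x = A s$, assuming a sine wave and a sawtooth as stand-ins for the two "conversations" and a hypothetical mixing matrix (none of these values come from the slides). Correlation is used only as a crude symptom of dependence; it is not a test of independence.

```python
import math

def pearson(u, v):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cuv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    cuu = sum((a - mu) ** 2 for a in u)
    cvv = sum((b - mv) ** 2 for b in v)
    return cuv / math.sqrt(cuu * cvv)

# Periods 50 and 37 are coprime, so over n = 50 * 37 time points every pair of
# phases occurs exactly once: the empirical joint distribution of (s1, s2) is
# the product of its marginals.
n = 1850
s1 = [math.sin(2.0 * math.pi * t / 50.0) for t in range(n)]   # sine "conversation"
s2 = [2.0 * (t % 37) / 37.0 - 1.0 for t in range(n)]          # sawtooth "conversation"

A = [[1.0, 0.5], [0.5, 1.0]]                                  # hypothetical mixing matrix
x1 = [A[0][0] * a + A[0][1] * b for a, b in zip(s1, s2)]      # first mixture
x2 = [A[1][0] * a + A[1][1] * b for a, b in zip(s1, s2)]      # second mixture

corr_s = pearson(s1, s2)   # essentially zero for the original signals
corr_x = pearson(x1, x2)   # clearly positive for the mixed data
```

The mixtures are visibly dependent even though the underlying signals are not, which is the structure ICA tries to undo.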
ICA Motivating Example
Goal is to find separating weights $W$, so that $s = W x$, for all $t$
Problem: $W = A^{-1}$ would be fine, but $A$ is unknown
ICA Motivating Example
Solutions for Cocktail Party Problem:
1. PCA (on population of 2-d vectors), $W$ = matrix of eigenvectors
   [direction of maximal variation doesn't solve this problem]
2. ICA (will describe method later)
   [Independent Components do solve it]
   [modulo sign changes and identification]
[Figure: PCA result, directions of maximal and minimal variance]
[Figure: ICA result]
ICA Motivating Example
Recall original time series
ICA Motivating Example
Relation to OODA: recall data matrix
$X = (X_1 \cdots X_n) = \begin{pmatrix} X_{11} & \cdots & X_{1n} \\ \vdots & & \vdots \\ X_{d1} & \cdots & X_{dn} \end{pmatrix}$
Signal Processing: focus on rows ($d$ time series, indexed by $t = 1, \ldots, n$)
OODA: focus on columns as data objects ($n$ data vectors)
Note: 2 viewpoints like "duals" for PCA
ICA Motivating Example
Scatterplot View (signal processing): Study
• Signals $\{(s_1(t), s_2(t)) : t = 1, \ldots, n\}$
• Corresponding Scatterplot
• Data $\{(x_1(t), x_2(t)) : t = 1, \ldots, n\}$
• Corresponding Scatterplot
[Figures: signals with corresponding scatterplot; data with corresponding scatterplot]
Scatterplot View (signal processing) Plot
bull Signals amp Scatrsquoplot
bull Data amp Scatrsquoplot
bull Scatterplots give hint how ICA is possible
bull Affine transformation
Stretches indeprsquot signals into deprsquot
bull Inversion is key to ICA
(even w unknown)
nttsts 1)()( 21
nttxtx 1)()( 21
A
sAx
ICA Motivating Example
Model for ldquoconversationsrdquo time series
ICA Motivating Example
Mixed version of signals
And also a 2nd mixture
tsatsatx 2121111
tsatsatx 2221212
ICA Motivating Example
Mixed version of signals
ICA Motivating Example
Goal Recover signal
From data
For unknown mixture matrix
where for all
tx
txtx
2
1
ts
tsts
2
1
2221
1211
aa
aaA
sAx t
ICA Motivating Example
Goal is to find separating weights
so that for allxWs
W
t
ICA Motivating Example
Goal is to find separating weights
so that for all
Problem would be fine
but is unknown
xWs
W
1AW
t
A
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
= matrix of eigenvectorsW
ICA Motivating Example
1 PCA (on population of 2-d vectors)
Maximal
Variance
Minimal
Variance
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
[direction of maximal variation
doesnrsquot solve this problem]
2ICA (will describe method later)
ICA Motivating Example
2 ICA (will describe method later)
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
[direction of maximal variation
doesnrsquot solve this problem]
2ICA (will describe method later)
[Independent Components do solve it]
[modulo sign changes and identification]
ICA Motivating Example
Recall original time series
ICA Motivating ExampleRelation to OODA recall data matrix
dnd
n
n
XX
XX
XXX
1
111
1
ICA Motivating ExampleRelation to OODA recall data matrix
Signal Processing focus on rows
( time series indexed by )
OODA focus on columns as data objects
( data vectors)
dnd
n
n
XX
XX
XXX
1
111
1
d nt 1
n
ICA Motivating ExampleRelation to OODA recall data matrix
Signal Processing focus on rows
( time series indexed by )
OODA focus on columns as data objects
( data vectors)
Note 2 viewpoints like ldquodualsrdquo for PCA
dnd
n
n
XX
XX
XXX
1
111
1
d nt 1
n
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals nttsts 1)()( 21
ICA Motivating Example
Study Signals nttsts 1)()( 21
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals
bull Corresponding Scatterplot
nttsts 1)()( 21
ICA Motivating Example
Signals - Corresponding Scatterplot
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals
bull Corresponding Scatterplot
bull Data
nttsts 1)()( 21
nttxtx 1)()( 21
ICA Motivating Example
Study Data nttxtx 1)()( 21
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals
bull Corresponding Scatterplot
bull Data
bull Corresponding Scatterplot
nttsts 1)()( 21
nttxtx 1)()( 21
ICA Motivating Example
Data - Corresponding Scatterplot
ICA Motivating Example
Scatterplot View (signal processing) Plot
bull Signals amp Scatrsquoplot
bull Data amp Scatrsquoplot
bull Scatterplots give hint how ICA is possible
bull Affine transformation
Stretches indeprsquot signals into deprsquot
bull Inversion is key to ICA
(even w unknown)
nttsts 1)()( 21
nttxtx 1)()( 21
A
sAx
ICA Motivating Example
Signals - Corresponding Scatterplot
Note Independent
Since Known Value
Of s1 Does Not
Change Distribution
of s2
ICA Motivating Example
Data - Corresponding Scatterplot
Note Dependent
Since Known Value
Of s1 Changes
Distribution of s2
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
ICA Motivating Example
PCA - Finds direction of greatest variation
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
Which is wrong for signal separation
ICA Motivating Example
PCA - Wrong for signal separation
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
Which is wrong for signal separation
ICA AlgorithmICA Step 1bull sphere the data
(shown on right in scatterplot view)
ICA AlgorithmICA Step 1 sphere the data
ICA AlgorithmICA Step 1bull ldquosphere the datardquo
(shown on right in scatterplot view)bull ie find linear transfrsquon to make
mean = cov = bull ie work with
0 I ˆ 21 XZ
ICA AlgorithmICA Step 1bull ldquosphere the datardquo
(shown on right in scatterplot view)bull ie find linear transfrsquon to make
mean = cov = bull ie work withbull requires of full rank
(at least ie no HDLSS)
0 I ˆ 21 XZ
dnX
ICA AlgorithmICA Step 1bull ldquosphere the datardquo
(shown on right in scatterplot view)bull ie find linear transfrsquon to make
mean = cov = bull ie work withbull requires of full rank
(at least ie no HDLSS)bull search for independent beyond
linear and quadratic structure
0 I ˆ 21 XZ
dnX
ICA AlgorithmICA Step 2bull Find dirrsquons that make (sphered) data
independent as possible
ICA AlgorithmICA Step 2bull Find dirrsquons that make (sphered) data
independent as possiblebull Recall ldquoindependencerdquo means
joint distribution is product of marginalsbull In cocktail party example
ICA AlgorithmICA Step 2 Cocktail party example
ICA AlgorithmICA Step 2bull Find dirrsquons that make (sphered) data
independent as possiblebull Recall ldquoindependencerdquo means
joint distribution is product of marginalsbull In cocktail party example
ndash Happens only when support parallel to axesndash Otherwise have blank areas
but marginals are non-zero
ICA AlgorithmParallel Idea (and key to algorithm) bull Find directions that maximize
non-Gaussianity
ICA AlgorithmParallel Idea (and key to algorithm) bull Find directions that maximize
non-Gaussianitybull Based on assumption starting from
independent coordinatesbull Note ldquomostrdquo projections are Gaussian
(since projection is ldquolinear combordquo)
ICA AlgorithmParallel Idea (and key to algorithm) bull Find directions that maximize
non-Gaussianitybull Based on assumption starting from
independent coordinatesbull Note ldquomostrdquo projections are Gaussian
(since projection is ldquolinear combordquo)bull Mathematics behind this
Diaconis and Freedman (1984)
ICA Algorithm
Worst case for ICA Gaussian
bull Then sphered data are independent
bull So have independence in all (ortho) dirrsquons
bull Thus canrsquot find useful directions
bull Gaussian distribution is characterized by
Independent amp spherically symmetric
ICA AlgorithmCriteria for non-Gaussianity
independence
bull Kurtosis
(4th order cumulant)bull Negative Entropybull Mutual Informationbull Nonparametric Maximum Likelihoodbull ldquoInfomaxrdquo in Neural Networksbull Interesting connections between these
224 3 EXEX
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iteratively
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iterativelybull Or by simultaneous optimization
(note PCA does both but not ICA)
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iterativelybull Or by simultaneous optimization
(note PCA does both but not ICA)bull Appears fast with good defaultsbull Careful about local optima
(Recco several random restarts)
ICA AlgorithmFastICA Notational Summary
1First sphere data
2Apply ICA Find to make
rows of ldquoindeprsquotrdquo
3Can transform back to
original data scale
ˆ 21 XZ
SW
ZWS SS
SSS 21ˆˆ
ICA AlgorithmCareful look at identifiability
bull Seen already in above example
bull Could rotate ldquosquare of datardquo in several ways etc
ICA AlgorithmCareful look at identifiability
ICA AlgorithmIdentifiability Swap Flips
ICA AlgorithmIdentifiability problem 1
Generally canrsquot order rows of (amp )
(seen as swap above)
Since for a ldquopermutation matrixrdquo
(pre-multiplication by ldquoswaps rowsrdquo)
(post-multiplication by ldquoswaps columnsrdquo)
For each column ie
ie
SS S
P
PP
SS sAz
zPWzPAsP SSS 1
zAs SS
1
ICA AlgorithmIdentifiability problem 1
Generally canrsquot order rows of (amp )
Since for a ldquopermutation matrixrdquo
For each column
So and are also ICA solutrsquons
(ie )
FastICA appears to order in terms of ldquohow non-Gaussianrdquo
SS S
PzPWzPAsP SSS
1
SPS SPW
ZPWPS SS
ICA AlgorithmIdentifiability problem 2
Canrsquot find scale of elements of
(seen as flips above)
Since for a (full rank) diagonal matrix
(pre-multiplrsquon by is scalar multrsquon of rows)
(post-multiplrsquon by is scalar multrsquon of colrsquos)
Again for each colrsquon
ie
So and are also ICA solutions
D
s
SDW
D
D
SDSzDWzDAsD SSs
1
zAs SS
1
ICA AlgorithmSignal Processing Scale identification
(Hyvaumlrinen and Oja 1999)
Choose scale so each signal
has ldquounit average energyrdquo
(preserves energy along rows of data matrix)
Explains ldquosame scalesrdquo in
Cocktail Party Example
)(tsi1)( 2
ti ts
ICA & Non-Gaussianity
Explore main ICA principle:
For independent, non-Gaussian, standardized random variables $X = (X_1, \dots, X_d)^t$
Projections farther from coordinate axes are more Gaussian
For the direction vector $u_k = (u_{1,k}, \dots, u_{d,k})^t$, where
$u_{j,k} = k^{-1/2}$ for $j = 1, \dots, k$, and $u_{j,k} = 0$ for $j = k+1, \dots, d$
(thus $\|u_k\| = 1$)
have $u_k^t X \approx N(0,1)$ in distribution, for large $d$ and $k$
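A short worked calculation shows why this direction vector behaves as claimed. It assumes, beyond what the slide states, that the $X_j$ share a common excess kurtosis $\kappa$ (the fourth cumulant of a standardized variable); the last identity uses additivity of cumulants for independent summands.

```latex
\|u_k\|^2 = \sum_{j=1}^{k} \left(k^{-1/2}\right)^2 = k \cdot \frac{1}{k} = 1,
\qquad
\operatorname{Var}\!\left(u_k^t X\right) = \sum_{j=1}^{k} \frac{1}{k}\,\operatorname{Var}(X_j) = 1,
\qquad
\operatorname{kurt}\!\left(u_k^t X\right) = \sum_{j=1}^{k} \left(k^{-1/2}\right)^4 \kappa = \frac{\kappa}{k}.
```

So the projection stays standardized for every $k$, while its fourth cumulant shrinks like $1/k$, consistent with projections far from the coordinate axes looking more Gaussian.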
ICA & Non-Gaussianity
Illustrative examples ($d = 100$, $n = 500$):
a. Uniform marginals
b. Exponential marginals
c. Bimodal marginals
Study over range of values of $k$
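The uniform-marginals case can be imitated with a short simulation. This is a sketch under assumptions: it measures non-Gaussianity by sample excess kurtosis (one possible choice; the slides show plots instead), and the seed and the grid of $k$ values are arbitrary.

```python
import numpy as np

def excess_kurtosis(y):
    """Sample version of E y^4 - 3 (E y^2)^2 for a centered sample."""
    y = y - y.mean()
    return np.mean(y ** 4) - 3.0 * np.mean(y ** 2) ** 2

def projection_kurtosis(X, k):
    """Excess kurtosis of u_k^t X, where u_k has k entries equal to k^(-1/2)."""
    d = X.shape[0]
    u_k = np.zeros(d)
    u_k[:k] = k ** -0.5
    return excess_kurtosis(u_k @ X)

rng = np.random.default_rng(0)
d, n = 100, 500
# Uniform marginals, standardized to mean 0 and variance 1.
X = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(d, n))

for k in (1, 2, 5, 20, 100):
    print(k, round(projection_kurtosis(X, k), 3))
```

For $k = 1$ the projection is a single uniform coordinate, whose population excess kurtosis is $-1.2$; as $k$ grows the printed values should drift toward 0, up to sampling noise at $n = 500$.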
[Figure: illustrative example, uniform marginals]
ICA Motivating Example
Mixed version of signals: $x_1(t) = a_{11} s_1(t) + a_{12} s_2(t)$
And also a 2nd mixture: $x_2(t) = a_{21} s_1(t) + a_{22} s_2(t)$
[Figure: mixed version of signals]
ICA Motivating Example
Goal: recover signal $s(t) = (s_1(t), s_2(t))^t$
from data $x(t) = (x_1(t), x_2(t))^t$
for unknown mixture matrix $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$
where $x(t) = A s(t)$ for all $t$
ICA Motivating Example
Goal is to find separating weights $W$, so that $s = W x$ for all $t$
Problem: $W = A^{-1}$ would be fine, but $A$ is unknown
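To make the setup concrete, here is a toy version of the mixing model in which $A$ is known, so that $W = A^{-1}$ can be applied directly. The two source signals and the entries of `A` are invented for illustration; real ICA has to work without access to `A`.

```python
import numpy as np

t = np.arange(500)
# Two hypothetical source signals s_1(t), s_2(t), stacked as rows of s.
s = np.vstack([np.sin(2 * np.pi * t / 50.0),       # sinusoid
               ((t % 37) / 37.0) * 2.0 - 1.0])     # sawtooth

# An arbitrary (assumed) mixing matrix A; in practice it is unknown.
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])

x = A @ s                   # x(t) = A s(t) for all t
W = np.linalg.inv(A)        # the ideal separating weights W = A^{-1}
s_recovered = W @ x         # s = W x
```

With `A` in hand the recovery is exact up to rounding; the rest of the algorithm is about finding a usable `W` from `x` alone.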
ICA Motivating Example
Solutions for Cocktail Party Problem
1. PCA (on population of 2-d vectors): $W$ = matrix of eigenvectors
[Figure: PCA, directions of maximal and minimal variance]
[direction of maximal variation doesn't solve this problem]
2. ICA (will describe method later)
[Figure: ICA solution]
[Independent Components do solve it]
[modulo sign changes and identification]
ICA Motivating Example
Recall original time series
[Figure: original time series]
ICA Motivating Example
Relation to OODA: recall data matrix
$X = (X_1 \cdots X_n) = \begin{pmatrix} X_{11} & \cdots & X_{1n} \\ \vdots & & \vdots \\ X_{d1} & \cdots & X_{dn} \end{pmatrix}$
Signal Processing: focus on rows
($d$ time series, indexed by $t = 1, \dots, n$)
OODA: focus on columns as data objects
($n$ data vectors)
Note: 2 viewpoints like "duals" for PCA
ICA Motivating Example
Scatterplot View (signal processing): Study
• Signals $\{(s_1(t), s_2(t)) : t = 1, \dots, n\}$
• Corresponding Scatterplot
• Data $\{(x_1(t), x_2(t)) : t = 1, \dots, n\}$
• Corresponding Scatterplot
[Figures: signals and data as time series, each with its corresponding scatterplot]
ICA Motivating Example
Scatterplot View (signal processing): Plot
• Signals $\{(s_1(t), s_2(t)) : t = 1, \dots, n\}$ & scatterplot
• Data $\{(x_1(t), x_2(t)) : t = 1, \dots, n\}$ & scatterplot
• Scatterplots give hint how ICA is possible
• Affine transformation $x = A s$ stretches independent signals into dependent
• Inversion is key to ICA (even with $A$ unknown)
ICA Motivating Example
Signals - Corresponding Scatterplot
Note: independent, since a known value of $s_1$ does not change the distribution of $s_2$
ICA Motivating Example
Data - Corresponding Scatterplot
Note: dependent, since a known value of $x_1$ changes the distribution of $x_2$
ICA Motivating Example
Why not PCA?
Finds direction of greatest variation
Which is wrong for signal separation
[Figure: PCA direction of greatest variation, wrong for signal separation]
ICA Algorithm
ICA Step 1
• "sphere the data" (shown on right in scatterplot view)
• i.e. find linear transformation to make mean = $0$, cov = $I$
• i.e. work with $Z = \hat{\Sigma}^{-1/2} X$
• requires $X$ of full rank (at least $n \ge d$, i.e. no HDLSS)
• search for independence, beyond linear and quadratic structure
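Step 1 might be sketched as below. Several choices are assumptions of this sketch rather than the slides' prescription: the function name `sphere`, centering the rows explicitly before applying $\hat{\Sigma}^{-1/2}$, dividing by $n$ (not $n - 1$) in the covariance, and using the symmetric square root from an eigendecomposition.

```python
import numpy as np

def sphere(X):
    """Return Z = Sigma_hat^(-1/2) (X - mean), plus Sigma_hat^(1/2) for undoing it.

    X is d x n with data vectors as columns; needs a full-rank covariance.
    """
    Xc = X - X.mean(axis=1, keepdims=True)        # center each row
    sigma_hat = Xc @ Xc.T / X.shape[1]            # d x d sample covariance
    evals, evecs = np.linalg.eigh(sigma_hat)      # symmetric eigendecomposition
    if evals.min() <= 1e-12 * evals.max():
        raise ValueError("covariance is not full rank; cannot sphere")
    root_inv = evecs @ np.diag(evals ** -0.5) @ evecs.T   # Sigma_hat^(-1/2)
    root = evecs @ np.diag(evals ** 0.5) @ evecs.T        # Sigma_hat^(1/2)
    return root_inv @ Xc, root

rng = np.random.default_rng(1)
A = np.array([[1.0, 0.5], [0.3, 1.0]])            # made-up mixing matrix
X = A @ rng.uniform(-1.0, 1.0, size=(2, 1000))
Z, root = sphere(X)
```

The rank check is where the HDLSS restriction shows up: with $d > n$ the sample covariance has zero eigenvalues, so $\hat{\Sigma}^{-1/2}$ does not exist and the sketch raises an error.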
ICA Algorithm
ICA Step 2
• Find directions that make (sphered) data as independent as possible
• Recall "independence" means: joint distribution is product of marginals
• In cocktail party example:
  – Happens only when support parallel to axes
  – Otherwise have blank areas, but marginals are non-zero
[Figure: cocktail party example]
ICA Algorithm
Parallel Idea (and key to algorithm)
• Find directions that maximize non-Gaussianity
• Based on assumption: starting from independent coordinates
• Note: "most" projections are Gaussian (since projection is "linear combo")
• Mathematics behind this: Diaconis and Freedman (1984)
ICA Algorithm
Worst case for ICA: Gaussian
• Then sphered data are independent
• So have independence in all (orthogonal) directions
• Thus can't find useful directions
• Gaussian distribution is characterized by: independent & spherically symmetric
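The reason the Gaussian case gives no preferred direction can be written in one line. Here $U$ denotes an arbitrary orthogonal matrix, a symbol introduced only for this illustration.

```latex
Z \sim N_d(0, I), \quad U U^t = I
\;\Longrightarrow\;
U Z \sim N_d\!\left(U 0,\; U I U^t\right) = N_d(0, I).
```

Every rotation of sphered Gaussian data has the same distribution, with independent coordinates, so no criterion based on independence or non-Gaussianity can tell one rotation from another.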
ICA Algorithm
Criteria for non-Gaussianity / independence
• Kurtosis: $E X^4 - 3 (E X^2)^2$ (4th order cumulant)
• Negative Entropy
• Mutual Information
• Nonparametric Maximum Likelihood
• "Infomax" in Neural Networks
• Interesting connections between these
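As an illustration of the first criterion, a sample version of the kurtosis formula can be computed directly. The function name and the three test distributions are choices made for this sketch; the samples are scaled to unit variance so the values are comparable.

```python
import numpy as np

def kurtosis_criterion(x):
    """Sample version of E X^4 - 3 (E X^2)^2, computed after centering x."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return np.mean(x ** 4) - 3.0 * np.mean(x ** 2) ** 2

rng = np.random.default_rng(0)
n = 100_000
samples = {
    "gaussian": rng.normal(size=n),                           # criterion near 0
    "uniform": rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), n),   # negative
    "laplace": rng.laplace(scale=np.sqrt(0.5), size=n),       # positive
}
values = {name: kurtosis_criterion(x) for name, x in samples.items()}

two_point = np.array([-1.0, 1.0] * 50)     # symmetric two-point sample
```

The criterion is zero for a Gaussian, so both clearly negative values (flat distributions such as the uniform) and clearly positive values (heavy tails such as the Laplace) signal non-Gaussianity; in practice the absolute value or the square is maximized.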
ICA Algorithm
Matlab Algorithm (optimizing any of above)
"FastICA"
• Numerical gradient search method
• Can find directions iteratively
• Or by simultaneous optimization (note: PCA does both, but not ICA)
• Appears fast, with good defaults
• Careful about local optima (recommend several random restarts)
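The flavor of such a search, including the random restarts, can be conveyed with a kurtosis-based fixed-point iteration for a single direction, in the style of the one-unit scheme described by Hyvärinen and Oja. This is not the FastICA Matlab package and makes no claim to match its defaults; the function name, restart count, tolerance and the toy data are all assumptions of the sketch.

```python
import numpy as np

def kurtosis(y):
    return np.mean(y ** 4) - 3.0 * np.mean(y ** 2) ** 2

def one_unit_direction(Z, rng, n_restarts=5, n_iter=100, tol=1e-10):
    """Search for one unit vector w that makes |kurtosis(w^t Z)| large.

    Z must already be sphered (mean 0, covariance I), with shape d x n.
    """
    d = Z.shape[0]
    best_w, best_score = None, -np.inf
    for _restart in range(n_restarts):            # several random restarts
        w = rng.normal(size=d)
        w /= np.linalg.norm(w)
        for _step in range(n_iter):
            y = w @ Z
            w_new = (Z * y ** 3).mean(axis=1) - 3.0 * w   # kurtosis fixed-point step
            w_new /= np.linalg.norm(w_new)
            converged = 1.0 - abs(w_new @ w) < tol        # sign flips are harmless
            w = w_new
            if converged:
                break
        score = abs(kurtosis(w @ Z))
        if score > best_score:                    # keep the best restart
            best_w, best_score = w, score
    return best_w

rng = np.random.default_rng(0)
S = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(2, 2000))   # independent sources
X = np.array([[1.0, 0.5], [0.3, 1.0]]) @ S                   # mixed data
Xc = X - X.mean(axis=1, keepdims=True)
evals, evecs = np.linalg.eigh(Xc @ Xc.T / Xc.shape[1])
Z = evecs @ np.diag(evals ** -0.5) @ evecs.T @ Xc             # sphered data

w = one_unit_direction(Z, rng)
y = w @ Z
corr = [abs(np.corrcoef(y, S[i])[0, 1]) for i in range(2)]
```

The recovered component `y` should line up with one of the two sources, in either sign; which source it matches is not determined, in line with the identifiability discussion.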
ICA Motivating Example
Goal is to find separating weights
so that for allxWs
W
t
ICA Motivating Example
Goal is to find separating weights
so that for all
Problem would be fine
but is unknown
xWs
W
1AW
t
A
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
= matrix of eigenvectorsW
ICA Motivating Example
1 PCA (on population of 2-d vectors)
Maximal
Variance
Minimal
Variance
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
[direction of maximal variation
doesnrsquot solve this problem]
2ICA (will describe method later)
ICA Motivating Example
2 ICA (will describe method later)
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
[direction of maximal variation
doesnrsquot solve this problem]
2ICA (will describe method later)
[Independent Components do solve it]
[modulo sign changes and identification]
ICA Motivating Example
Recall original time series
ICA Motivating ExampleRelation to OODA recall data matrix
dnd
n
n
XX
XX
XXX
1
111
1
ICA Motivating ExampleRelation to OODA recall data matrix
Signal Processing focus on rows
( time series indexed by )
OODA focus on columns as data objects
( data vectors)
dnd
n
n
XX
XX
XXX
1
111
1
d nt 1
n
ICA Motivating ExampleRelation to OODA recall data matrix
Signal Processing focus on rows
( time series indexed by )
OODA focus on columns as data objects
( data vectors)
Note 2 viewpoints like ldquodualsrdquo for PCA
dnd
n
n
XX
XX
XXX
1
111
1
d nt 1
n
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals nttsts 1)()( 21
ICA Motivating Example
Study Signals nttsts 1)()( 21
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals
bull Corresponding Scatterplot
nttsts 1)()( 21
ICA Motivating Example
Signals - Corresponding Scatterplot
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals
bull Corresponding Scatterplot
bull Data
nttsts 1)()( 21
nttxtx 1)()( 21
ICA Motivating Example
Study Data nttxtx 1)()( 21
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals
bull Corresponding Scatterplot
bull Data
bull Corresponding Scatterplot
nttsts 1)()( 21
nttxtx 1)()( 21
ICA Motivating Example
Data - Corresponding Scatterplot
ICA Motivating Example
Scatterplot View (signal processing) Plot
bull Signals amp Scatrsquoplot
bull Data amp Scatrsquoplot
bull Scatterplots give hint how ICA is possible
bull Affine transformation
Stretches indeprsquot signals into deprsquot
bull Inversion is key to ICA
(even w unknown)
nttsts 1)()( 21
nttxtx 1)()( 21
A
sAx
ICA Motivating Example
Signals - Corresponding Scatterplot
Note Independent
Since Known Value
Of s1 Does Not
Change Distribution
of s2
ICA Motivating Example
Data - Corresponding Scatterplot
Note Dependent
Since Known Value
Of s1 Changes
Distribution of s2
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
ICA Motivating Example
PCA - Finds direction of greatest variation
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
Which is wrong for signal separation
ICA Motivating Example
PCA - Wrong for signal separation
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
Which is wrong for signal separation
ICA AlgorithmICA Step 1bull sphere the data
(shown on right in scatterplot view)
ICA AlgorithmICA Step 1 sphere the data
ICA AlgorithmICA Step 1bull ldquosphere the datardquo
(shown on right in scatterplot view)bull ie find linear transfrsquon to make
mean = cov = bull ie work with
0 I ˆ 21 XZ
ICA AlgorithmICA Step 1bull ldquosphere the datardquo
(shown on right in scatterplot view)bull ie find linear transfrsquon to make
mean = cov = bull ie work withbull requires of full rank
(at least ie no HDLSS)
0 I ˆ 21 XZ
dnX
ICA AlgorithmICA Step 1bull ldquosphere the datardquo
(shown on right in scatterplot view)bull ie find linear transfrsquon to make
mean = cov = bull ie work withbull requires of full rank
(at least ie no HDLSS)bull search for independent beyond
linear and quadratic structure
0 I ˆ 21 XZ
dnX
ICA Algorithm
ICA Step 2:
• Find directions that make the (sphered) data as independent as possible
• Recall "independence" means: joint distribution is product of marginals
• In cocktail party example:
  - Happens only when support is parallel to axes
  - Otherwise have blank areas, but marginals are non-zero
ICA Algorithm
ICA Step 2: Cocktail party example
ICA Algorithm
Parallel Idea (and key to algorithm):
• Find directions that maximize non-Gaussianity
• Based on the assumption of starting from independent coordinates
• Note: "most" projections are Gaussian (since a projection is a "linear combination")
• Mathematics behind this: Diaconis and Freedman (1984)
ICA Algorithm
Worst case for ICA: Gaussian
• Then sphered data are independent
• So have independence in all (orthogonal) directions
• Thus can't find useful directions
• Gaussian distribution is characterized by: independent & spherically symmetric
ICA Algorithm
Criteria for non-Gaussianity / independence:
• Kurtosis, $E X^4 - 3 (E X^2)^2$ (4th order cumulant)
• Negative Entropy
• Mutual Information
• Nonparametric Maximum Likelihood
• "Infomax" in Neural Networks
• Interesting connections between these
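As a rough illustration of the kurtosis criterion, the sketch below evaluates the sample version of $E X^4 - 3 (E X^2)^2$ on three unit-variance distributions. The choice of distributions and the sample size are assumptions for illustration.

```python
import numpy as np

def kurtosis(x):
    """Sample version of E X^4 - 3 (E X^2)^2, computed on the centered sample."""
    xc = x - x.mean()
    return np.mean(xc ** 4) - 3.0 * np.mean(xc ** 2) ** 2

rng = np.random.default_rng(0)
n = 200000

k_gauss = kurtosis(rng.standard_normal(n))                          # Gaussian
k_unif = kurtosis(rng.uniform(-np.sqrt(3), np.sqrt(3), n))          # lighter tails
k_laplace = kurtosis(rng.laplace(scale=1 / np.sqrt(2), size=n))     # heavier tails
print(k_gauss, k_unif, k_laplace)
```

The criterion is zero for a Gaussian, negative for the uniform and positive for the Laplace, so its absolute value serves as a simple measure of non-Gaussianity.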
ICA Algorithm
Matlab Algorithm (optimizing any of above): "FastICA"
• Numerical gradient search method
• Can find directions iteratively
• Or by simultaneous optimization (note: PCA does both, but not ICA)
• Appears fast, with good defaults
• Careful about local optima (recommend several random restarts)
ICA Algorithm
FastICA Notational Summary:
1. First sphere the data: $Z = \hat{\Sigma}^{-1/2} X$
2. Apply ICA: find $W_S$ to make rows of $S_S = W_S Z$ "independent"
3. Can transform back to original data scale: $\hat{S} = \hat{\Sigma}^{1/2} S_S$
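The three steps can be sketched in a few lines of NumPy. This is not the FastICA Matlab package; it is a minimal sketch assuming the classical kurtosis-based fixed-point update $w \leftarrow E[z (w^t z)^3] - 3w$ with one direction found at a time. The function names, the mixing matrix and the uniform sources are all hypothetical.

```python
import numpy as np

def sphere(X):
    """Step 1: center the d x n matrix X and whiten it; also return Sigma_hat^{1/2}."""
    Xc = X - X.mean(axis=1, keepdims=True)
    evals, evecs = np.linalg.eigh(np.cov(Xc))
    inv_root = evecs @ np.diag(evals ** -0.5) @ evecs.T
    root = evecs @ np.diag(evals ** 0.5) @ evecs.T
    return inv_root @ Xc, root

def ica_kurtosis(Z, n_iter=200, seed=0):
    """Step 2: find the rows of W_S one at a time with the kurtosis fixed-point update."""
    rng = np.random.default_rng(seed)
    d = Z.shape[0]
    W = np.zeros((d, d))
    for i in range(d):
        w = rng.standard_normal(d)
        w /= np.linalg.norm(w)
        for _ in range(n_iter):
            w_new = (Z * (w @ Z) ** 3).mean(axis=1) - 3.0 * w
            w_new -= W[:i].T @ (W[:i] @ w_new)   # keep orthogonal to earlier rows
            w_new /= np.linalg.norm(w_new)
            # The sign may alternate between iterations, so compare up to sign.
            converged = abs(w_new @ w) > 1 - 1e-10
            w = w_new
            if converged:
                break
        W[i] = w
    return W

rng = np.random.default_rng(1)
n = 5000
S = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, n))   # hypothetical sources
A = np.array([[1.0, 0.5], [0.5, 1.0]])                  # hypothetical mixing matrix
X = A @ S

Z, root = sphere(X)          # step 1
W_S = ica_kurtosis(Z)        # step 2
S_S = W_S @ Z
S_hat = root @ S_S           # step 3, as written in the summary above

# Absolute correlations between recovered rows (rows) and true sources (columns).
match = np.abs(np.corrcoef(np.vstack([S_S, S]))[:2, 2:])
print(np.round(match, 3))
```

Each recovered row should match exactly one true source with correlation near 1 in absolute value; which row matches which source, and with what sign, is not determined.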
ICA Algorithm
Careful look at identifiability:
• Seen already in above example
• Could rotate "square of data" in several ways, etc.
ICA Algorithm
Identifiability: Swap, Flips
ICA Algorithm
Identifiability problem 1:
Generally can't order rows of $S_S$ (& $S$) (seen as swap above)
Since for a "permutation matrix" $P$
(pre-multiplication by $P$ "swaps rows")
(post-multiplication by $P$ "swaps columns")
For each column, $z = A_S s_S$, i.e. $s_S = A_S^{-1} z$,
i.e. $P s_S = P A_S^{-1} z = P W_S z$
So $P S_S$ and $P W_S$ are also ICA solutions (i.e. $P S_S = P W_S Z$)
FastICA appears to order in terms of "how non-Gaussian"
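The permutation argument can be verified on a toy example. The matrices below are made-up stand-ins, not output of any ICA run.

```python
import numpy as np

rng = np.random.default_rng(0)
Z = rng.standard_normal((2, 5))        # stand-in for sphered data
W_S = np.array([[0.6, -0.8],
                [0.8,  0.6]])          # hypothetical separating matrix
S_S = W_S @ Z

# Permutation matrix: pre-multiplication swaps the two rows.
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])

swapped_signals = P @ S_S
swapped_weights = P @ W_S

# The swapped signals are produced by the swapped weights: P S_S = (P W_S) Z.
print(np.allclose(swapped_signals, swapped_weights @ Z))
```

Since the rows of the swapped signals are just as independent as before, nothing in the data distinguishes the two orderings.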
ICA Algorithm
Identifiability problem 2:
Can't find scale of elements of $s$ (seen as flips above)
Since for a (full rank) diagonal matrix $D$
(pre-multiplication by $D$ is scalar multiplication of rows)
(post-multiplication by $D$ is scalar multiplication of columns)
Again for each column, $s_S = A_S^{-1} z$,
i.e. $D s_S = D A_S^{-1} z = D W_S z$
So $D S_S$ and $D W_S$ are also ICA solutions
ICA Algorithm
Signal Processing Scale identification: (Hyvärinen and Oja 1999)
Choose scale so each signal $s_i(t)$ has "unit average energy": $\sum_t s_i(t)^2 = 1$
(preserves energy along rows of data matrix)
Explains "same scales" in Cocktail Party Example
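A sketch of how fixing the energy removes the scale ambiguity but not the sign. Here "unit average energy" is taken to mean a mean square of 1, which is an assumption (a sum-of-squares convention differs only by a constant factor); the diagonal matrix `D` and the function name are hypothetical.

```python
import numpy as np

def unit_energy(S):
    """Rescale each row of S so that the mean of its squared entries is 1."""
    energy = np.mean(S ** 2, axis=1, keepdims=True)
    return S / np.sqrt(energy)

rng = np.random.default_rng(0)
S_S = rng.uniform(-1.0, 1.0, size=(2, 1000))   # stand-in for recovered signals

# Another equally valid ICA solution: rows flipped and rescaled by a diagonal D.
D = np.diag([-1.0, 2.5])
rescaled = D @ S_S

a = unit_energy(S_S)
b = unit_energy(rescaled)

# After normalization the two solutions agree up to the sign of each row.
print(np.allclose(np.abs(a), np.abs(b)))
```

The remaining sign flip is the part of the ambiguity that no energy convention can resolve.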
ICA & Non-Gaussianity
Explore main ICA principle:
For independent, non-Gaussian, standardized r.v.'s $X = (X_1, \dots, X_d)^t$,
projections farther from coordinate axes are more Gaussian
For the direction vector $u_k = (u_{k,1}, \dots, u_{k,d})^t$, where
$u_{k,j} = k^{-1/2}$ for $j = 1, \dots, k$ and $u_{k,j} = 0$ for $j = k+1, \dots, d$
(thus $\|u_k\| = 1$),
have $u_k^t X \approx N(0, 1)$ for large $k$ and $d$
ICA & Non-Gaussianity
Illustrative examples (d = 100, n = 500):
a. Uniform Marginals
b. Exponential Marginals
c. Bimodal Marginals
Study over range of values of k
ICA & Non-Gaussianity
Illustrative example - Uniform marginals
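The uniform-marginals case can be reproduced roughly as below, using excess kurtosis as a stand-in measure of non-Gaussianity. This is a sketch under assumptions: the slides do not say which summary was plotted, and the helper names and the grid of $k$ values are hypothetical.

```python
import numpy as np

def direction(k, d):
    """Unit vector with entries k^(-1/2) in the first k coordinates and 0 elsewhere."""
    u = np.zeros(d)
    u[:k] = k ** -0.5
    return u

def excess_kurtosis(x):
    """Fourth moment of the standardized sample, minus 3 (zero for a Gaussian)."""
    z = (x - x.mean()) / x.std()
    return np.mean(z ** 4) - 3.0

rng = np.random.default_rng(0)
d, n = 100, 500

# Independent, standardized, uniform coordinates (d x n data matrix).
X = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(d, n))

kurt_by_k = {k: excess_kurtosis(direction(k, d) @ X) for k in (1, 2, 5, 25, 100)}
for k, value in kurt_by_k.items():
    print(k, round(value, 3))
```

For $k = 1$ the projection is a single uniform coordinate, with excess kurtosis near $-1.2$; as $k$ grows the projection averages more coordinates and the value drifts toward the Gaussian value of 0.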
ICA Motivating Example
Goal is to find separating weights $W$, so that $s = W x$ for all $t$
Problem: $W = A^{-1}$ would be fine, but $A$ is unknown
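If the mixing matrix were known, separation would be a plain matrix inversion. A minimal sketch with a made-up $A$ (in practice $A$ is not available, which is the whole problem):

```python
import numpy as np

rng = np.random.default_rng(0)
S = rng.uniform(-1.0, 1.0, size=(2, 6))   # true signals, one column per time t

# Hypothetical mixing matrix, treated as known only for this illustration.
A = np.array([[1.0, 0.5],
              [0.5, 1.0]])
X = A @ S                                  # observed data, x = A s

W = np.linalg.inv(A)                       # separating weights W = A^{-1}
recovered = W @ X
print(np.allclose(recovered, S))
```

ICA has to find a usable $W$ from `X` alone.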
ICA Motivating Example
Solutions for Cocktail Party Problem:
1. PCA (on population of 2-d vectors): $W$ = matrix of eigenvectors
   [direction of maximal variation doesn't solve this problem]
2. ICA (will describe method later)
   [Independent Components do solve it]
   [modulo sign changes and identification]
ICA Motivating Example
1. PCA (on population of 2-d vectors): directions of maximal variance and minimal variance
ICA Motivating Example
2. ICA (will describe method later)
ICA Motivating Example
Recall original time series
ICA Motivating Example
Relation to OODA: recall data matrix
$$X = \begin{pmatrix} X_{11} & \cdots & X_{1n} \\ \vdots & \ddots & \vdots \\ X_{d1} & \cdots & X_{dn} \end{pmatrix}$$
Signal Processing: focus on rows ($d$ time series, indexed by $t = 1, \dots, n$)
OODA: focus on columns as data objects ($n$ data vectors)
Note: 2 viewpoints like "duals" for PCA
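The two viewpoints correspond to two ways of slicing the same array. A tiny sketch with made-up dimensions:

```python
import numpy as np

d, n = 2, 5
X = np.arange(d * n, dtype=float).reshape(d, n)   # d x n data matrix

# Signal processing view: each row is a time series indexed by t = 1, ..., n.
time_series = [X[i, :] for i in range(d)]

# OODA view: each column is one data object, a vector in d dimensions.
data_objects = [X[:, t] for t in range(n)]

print(len(time_series), time_series[0].shape)
print(len(data_objects), data_objects[0].shape)
```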
ICA Motivating Example
Scatterplot View (signal processing): Study
• Signals $\{(s_1(t), s_2(t)) : t = 1, \dots, n\}$
• Corresponding Scatterplot
• Data $\{(x_1(t), x_2(t)) : t = 1, \dots, n\}$
• Corresponding Scatterplot
ICA Motivating Example
Study Signals $\{(s_1(t), s_2(t)) : t = 1, \dots, n\}$
ICA Motivating Example
Signals - Corresponding Scatterplot
ICA Motivating Example
Study Data $\{(x_1(t), x_2(t)) : t = 1, \dots, n\}$
ICA Motivating Example
Data - Corresponding Scatterplot
ICA Motivating Example
Signals - Corresponding Scatterplot
Note Independent
Since Known Value
Of s1 Does Not
Change Distribution
of s2
ICA Motivating Example
Data - Corresponding Scatterplot
Note Dependent
Since Known Value
Of s1 Changes
Distribution of s2
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
ICA Motivating Example
PCA - Finds direction of greatest variation
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
Which is wrong for signal separation
ICA Motivating Example
PCA - Wrong for signal separation
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
Which is wrong for signal separation
ICA AlgorithmICA Step 1bull sphere the data
(shown on right in scatterplot view)
ICA AlgorithmICA Step 1 sphere the data
ICA AlgorithmICA Step 1bull ldquosphere the datardquo
(shown on right in scatterplot view)bull ie find linear transfrsquon to make
mean = cov = bull ie work with
0 I ˆ 21 XZ
ICA AlgorithmICA Step 1bull ldquosphere the datardquo
(shown on right in scatterplot view)bull ie find linear transfrsquon to make
mean = cov = bull ie work withbull requires of full rank
(at least ie no HDLSS)
0 I ˆ 21 XZ
dnX
ICA AlgorithmICA Step 1bull ldquosphere the datardquo
(shown on right in scatterplot view)bull ie find linear transfrsquon to make
mean = cov = bull ie work withbull requires of full rank
(at least ie no HDLSS)bull search for independent beyond
linear and quadratic structure
0 I ˆ 21 XZ
dnX
ICA AlgorithmICA Step 2bull Find dirrsquons that make (sphered) data
independent as possible
ICA AlgorithmICA Step 2bull Find dirrsquons that make (sphered) data
independent as possiblebull Recall ldquoindependencerdquo means
joint distribution is product of marginalsbull In cocktail party example
ICA AlgorithmICA Step 2 Cocktail party example
ICA AlgorithmICA Step 2bull Find dirrsquons that make (sphered) data
independent as possiblebull Recall ldquoindependencerdquo means
joint distribution is product of marginalsbull In cocktail party example
ndash Happens only when support parallel to axesndash Otherwise have blank areas
but marginals are non-zero
ICA AlgorithmParallel Idea (and key to algorithm) bull Find directions that maximize
non-Gaussianity
ICA AlgorithmParallel Idea (and key to algorithm) bull Find directions that maximize
non-Gaussianitybull Based on assumption starting from
independent coordinatesbull Note ldquomostrdquo projections are Gaussian
(since projection is ldquolinear combordquo)
ICA AlgorithmParallel Idea (and key to algorithm) bull Find directions that maximize
non-Gaussianitybull Based on assumption starting from
independent coordinatesbull Note ldquomostrdquo projections are Gaussian
(since projection is ldquolinear combordquo)bull Mathematics behind this
Diaconis and Freedman (1984)
ICA Algorithm
Worst case for ICA Gaussian
bull Then sphered data are independent
bull So have independence in all (ortho) dirrsquons
bull Thus canrsquot find useful directions
bull Gaussian distribution is characterized by
Independent amp spherically symmetric
ICA AlgorithmCriteria for non-Gaussianity
independence
bull Kurtosis
(4th order cumulant)bull Negative Entropybull Mutual Informationbull Nonparametric Maximum Likelihoodbull ldquoInfomaxrdquo in Neural Networksbull Interesting connections between these
224 3 EXEX
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iteratively
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iterativelybull Or by simultaneous optimization
(note PCA does both but not ICA)
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iterativelybull Or by simultaneous optimization
(note PCA does both but not ICA)bull Appears fast with good defaultsbull Careful about local optima
(Recco several random restarts)
ICA AlgorithmFastICA Notational Summary
1First sphere data
2Apply ICA Find to make
rows of ldquoindeprsquotrdquo
3Can transform back to
original data scale
ˆ 21 XZ
SW
ZWS SS
SSS 21ˆˆ
ICA AlgorithmCareful look at identifiability
bull Seen already in above example
bull Could rotate ldquosquare of datardquo in several ways etc
ICA AlgorithmCareful look at identifiability
ICA AlgorithmIdentifiability Swap Flips
ICA AlgorithmIdentifiability problem 1
Generally canrsquot order rows of (amp )
(seen as swap above)
Since for a ldquopermutation matrixrdquo
(pre-multiplication by ldquoswaps rowsrdquo)
(post-multiplication by ldquoswaps columnsrdquo)
For each column ie
ie
SS S
P
PP
SS sAz
zPWzPAsP SSS 1
zAs SS
1
ICA AlgorithmIdentifiability problem 1
Generally canrsquot order rows of (amp )
Since for a ldquopermutation matrixrdquo
For each column
So and are also ICA solutrsquons
(ie )
FastICA appears to order in terms of ldquohow non-Gaussianrdquo
SS S
PzPWzPAsP SSS
1
SPS SPW
ZPWPS SS
ICA AlgorithmIdentifiability problem 2
Canrsquot find scale of elements of
(seen as flips above)
Since for a (full rank) diagonal matrix
(pre-multiplrsquon by is scalar multrsquon of rows)
(post-multiplrsquon by is scalar multrsquon of colrsquos)
Again for each colrsquon
ie
So and are also ICA solutions
D
s
SDW
D
D
SDSzDWzDAsD SSs
1
zAs SS
1
ICA AlgorithmSignal Processing Scale identification
(Hyvaumlrinen and Oja 1999)
Choose scale so each signal
has ldquounit average energyrdquo
(preserves energy along rows of data matrix)
Explains ldquosame scalesrdquo in
Cocktail Party Example
)(tsi1)( 2
ti ts
ICA amp Non-Gaussianity
Explore main ICA principle
For indep non-Gaussian
standardized rvrsquos
Projections farther from coordinate axes
are more Gaussian
dX
X
X 1
ICA amp Non-Gaussianity
Explore main ICA principle
Projections farther from coordinate axes
are more Gaussian
For the dirrsquon vector
where
(thus )
have for large and
kd
k
k
u
u
u
1
dkj
kjku ki
10
121
)10(NXud
t
k k d
1ku
ICA amp Non-Gaussianity
Illustrative examples (d = 100 n = 500)
aUniform Marginals
bExponential Marginals
cBimodal Marginals
Study over range of values of k
ICA amp Non-Gaussianity
Illustrative example - Uniform marginals
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
= matrix of eigenvectorsW
ICA Motivating Example
1 PCA (on population of 2-d vectors)
Maximal
Variance
Minimal
Variance
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
[direction of maximal variation
doesnrsquot solve this problem]
2ICA (will describe method later)
ICA Motivating Example
2 ICA (will describe method later)
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
[direction of maximal variation
doesnrsquot solve this problem]
2ICA (will describe method later)
[Independent Components do solve it]
[modulo sign changes and identification]
ICA Motivating Example
Recall original time series
ICA Motivating ExampleRelation to OODA recall data matrix
dnd
n
n
XX
XX
XXX
1
111
1
ICA Motivating ExampleRelation to OODA recall data matrix
Signal Processing focus on rows
( time series indexed by )
OODA focus on columns as data objects
( data vectors)
dnd
n
n
XX
XX
XXX
1
111
1
d nt 1
n
ICA Motivating ExampleRelation to OODA recall data matrix
Signal Processing focus on rows
( time series indexed by )
OODA focus on columns as data objects
( data vectors)
Note 2 viewpoints like ldquodualsrdquo for PCA
dnd
n
n
XX
XX
XXX
1
111
1
d nt 1
n
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals nttsts 1)()( 21
ICA Motivating Example
Study Signals nttsts 1)()( 21
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals
bull Corresponding Scatterplot
nttsts 1)()( 21
ICA Motivating Example
Signals - Corresponding Scatterplot
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals
bull Corresponding Scatterplot
bull Data
nttsts 1)()( 21
nttxtx 1)()( 21
ICA Motivating Example
Study Data nttxtx 1)()( 21
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals
bull Corresponding Scatterplot
bull Data
bull Corresponding Scatterplot
nttsts 1)()( 21
nttxtx 1)()( 21
ICA Motivating Example
Data - Corresponding Scatterplot
ICA Motivating Example
Scatterplot View (signal processing) Plot
bull Signals amp Scatrsquoplot
bull Data amp Scatrsquoplot
bull Scatterplots give hint how ICA is possible
bull Affine transformation
Stretches indeprsquot signals into deprsquot
bull Inversion is key to ICA
(even w unknown)
nttsts 1)()( 21
nttxtx 1)()( 21
A
sAx
ICA Motivating Example
Signals - Corresponding Scatterplot
Note Independent
Since Known Value
Of s1 Does Not
Change Distribution
of s2
ICA Motivating Example
Data - Corresponding Scatterplot
Note Dependent
Since Known Value
Of s1 Changes
Distribution of s2
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
ICA Motivating Example
PCA - Finds direction of greatest variation
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
Which is wrong for signal separation
ICA Motivating Example
PCA - Wrong for signal separation
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
Which is wrong for signal separation
ICA AlgorithmICA Step 1bull sphere the data
(shown on right in scatterplot view)
ICA AlgorithmICA Step 1 sphere the data
ICA AlgorithmICA Step 1bull ldquosphere the datardquo
(shown on right in scatterplot view)bull ie find linear transfrsquon to make
mean = cov = bull ie work with
0 I ˆ 21 XZ
ICA AlgorithmICA Step 1bull ldquosphere the datardquo
(shown on right in scatterplot view)bull ie find linear transfrsquon to make
mean = cov = bull ie work withbull requires of full rank
(at least ie no HDLSS)
0 I ˆ 21 XZ
dnX
ICA AlgorithmICA Step 1bull ldquosphere the datardquo
(shown on right in scatterplot view)bull ie find linear transfrsquon to make
mean = cov = bull ie work withbull requires of full rank
(at least ie no HDLSS)bull search for independent beyond
linear and quadratic structure
0 I ˆ 21 XZ
dnX
ICA AlgorithmICA Step 2bull Find dirrsquons that make (sphered) data
independent as possible
ICA AlgorithmICA Step 2bull Find dirrsquons that make (sphered) data
independent as possiblebull Recall ldquoindependencerdquo means
joint distribution is product of marginalsbull In cocktail party example
ICA AlgorithmICA Step 2 Cocktail party example
ICA AlgorithmICA Step 2bull Find dirrsquons that make (sphered) data
independent as possiblebull Recall ldquoindependencerdquo means
joint distribution is product of marginalsbull In cocktail party example
ndash Happens only when support parallel to axesndash Otherwise have blank areas
but marginals are non-zero
ICA AlgorithmParallel Idea (and key to algorithm) bull Find directions that maximize
non-Gaussianity
ICA AlgorithmParallel Idea (and key to algorithm) bull Find directions that maximize
non-Gaussianitybull Based on assumption starting from
independent coordinatesbull Note ldquomostrdquo projections are Gaussian
(since projection is ldquolinear combordquo)
ICA AlgorithmParallel Idea (and key to algorithm) bull Find directions that maximize
non-Gaussianitybull Based on assumption starting from
independent coordinatesbull Note ldquomostrdquo projections are Gaussian
(since projection is ldquolinear combordquo)bull Mathematics behind this
Diaconis and Freedman (1984)
ICA Algorithm
Worst case for ICA Gaussian
bull Then sphered data are independent
bull So have independence in all (ortho) dirrsquons
bull Thus canrsquot find useful directions
bull Gaussian distribution is characterized by
Independent amp spherically symmetric
ICA AlgorithmCriteria for non-Gaussianity
independence
bull Kurtosis
(4th order cumulant)bull Negative Entropybull Mutual Informationbull Nonparametric Maximum Likelihoodbull ldquoInfomaxrdquo in Neural Networksbull Interesting connections between these
224 3 EXEX
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iteratively
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iterativelybull Or by simultaneous optimization
(note PCA does both but not ICA)
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iterativelybull Or by simultaneous optimization
(note PCA does both but not ICA)bull Appears fast with good defaultsbull Careful about local optima
(Recco several random restarts)
ICA AlgorithmFastICA Notational Summary
1First sphere data
2Apply ICA Find to make
rows of ldquoindeprsquotrdquo
3Can transform back to
original data scale
ˆ 21 XZ
SW
ZWS SS
SSS 21ˆˆ
ICA AlgorithmCareful look at identifiability
bull Seen already in above example
bull Could rotate ldquosquare of datardquo in several ways etc
ICA AlgorithmCareful look at identifiability
ICA AlgorithmIdentifiability Swap Flips
ICA AlgorithmIdentifiability problem 1
Generally canrsquot order rows of (amp )
(seen as swap above)
Since for a ldquopermutation matrixrdquo
(pre-multiplication by ldquoswaps rowsrdquo)
(post-multiplication by ldquoswaps columnsrdquo)
For each column ie
ie
SS S
P
PP
SS sAz
zPWzPAsP SSS 1
zAs SS
1
ICA AlgorithmIdentifiability problem 1
Generally canrsquot order rows of (amp )
Since for a ldquopermutation matrixrdquo
For each column
So and are also ICA solutrsquons
(ie )
FastICA appears to order in terms of ldquohow non-Gaussianrdquo
SS S
PzPWzPAsP SSS
1
SPS SPW
ZPWPS SS
ICA AlgorithmIdentifiability problem 2
Canrsquot find scale of elements of
(seen as flips above)
Since for a (full rank) diagonal matrix
(pre-multiplrsquon by is scalar multrsquon of rows)
(post-multiplrsquon by is scalar multrsquon of colrsquos)
Again for each colrsquon
ie
So and are also ICA solutions
D
s
SDW
D
D
SDSzDWzDAsD SSs
1
zAs SS
1
ICA AlgorithmSignal Processing Scale identification
(Hyvaumlrinen and Oja 1999)
Choose scale so each signal
has ldquounit average energyrdquo
(preserves energy along rows of data matrix)
Explains ldquosame scalesrdquo in
Cocktail Party Example
)(tsi1)( 2
ti ts
ICA amp Non-Gaussianity
Explore main ICA principle
For indep non-Gaussian
standardized rvrsquos
Projections farther from coordinate axes
are more Gaussian
dX
X
X 1
ICA amp Non-Gaussianity
Explore main ICA principle
Projections farther from coordinate axes
are more Gaussian
For the dirrsquon vector
where
(thus )
have for large and
kd
k
k
u
u
u
1
dkj
kjku ki
10
121
)10(NXud
t
k k d
1ku
ICA amp Non-Gaussianity
Illustrative examples (d = 100 n = 500)
aUniform Marginals
bExponential Marginals
cBimodal Marginals
Study over range of values of k
ICA amp Non-Gaussianity
Illustrative example - Uniform marginals
ICA Motivating Example
1 PCA (on population of 2-d vectors)
Maximal
Variance
Minimal
Variance
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
[direction of maximal variation
doesnrsquot solve this problem]
2ICA (will describe method later)
ICA Motivating Example
2 ICA (will describe method later)
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
[direction of maximal variation
doesnrsquot solve this problem]
2ICA (will describe method later)
[Independent Components do solve it]
[modulo sign changes and identification]
ICA Motivating Example
Recall original time series
ICA Motivating ExampleRelation to OODA recall data matrix
dnd
n
n
XX
XX
XXX
1
111
1
ICA Motivating ExampleRelation to OODA recall data matrix
Signal Processing focus on rows
( time series indexed by )
OODA focus on columns as data objects
( data vectors)
dnd
n
n
XX
XX
XXX
1
111
1
d nt 1
n
ICA Motivating ExampleRelation to OODA recall data matrix
Signal Processing focus on rows
( time series indexed by )
OODA focus on columns as data objects
( data vectors)
Note 2 viewpoints like ldquodualsrdquo for PCA
dnd
n
n
XX
XX
XXX
1
111
1
d nt 1
n
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals nttsts 1)()( 21
ICA Motivating Example
Study Signals nttsts 1)()( 21
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals
bull Corresponding Scatterplot
nttsts 1)()( 21
ICA Motivating Example
Signals - Corresponding Scatterplot
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals
bull Corresponding Scatterplot
bull Data
nttsts 1)()( 21
nttxtx 1)()( 21
ICA Motivating Example
Study Data nttxtx 1)()( 21
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals
bull Corresponding Scatterplot
bull Data
bull Corresponding Scatterplot
nttsts 1)()( 21
nttxtx 1)()( 21
ICA Motivating Example
Data - Corresponding Scatterplot
ICA Motivating Example
Scatterplot View (signal processing) Plot
bull Signals amp Scatrsquoplot
bull Data amp Scatrsquoplot
bull Scatterplots give hint how ICA is possible
bull Affine transformation
Stretches indeprsquot signals into deprsquot
bull Inversion is key to ICA
(even w unknown)
nttsts 1)()( 21
nttxtx 1)()( 21
A
sAx
ICA Motivating Example
Signals - Corresponding Scatterplot
Note Independent
Since Known Value
Of s1 Does Not
Change Distribution
of s2
ICA Motivating Example
Data - Corresponding Scatterplot
Note Dependent
Since Known Value
Of s1 Changes
Distribution of s2
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
ICA Motivating Example
PCA - Finds direction of greatest variation
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
Which is wrong for signal separation
ICA Motivating Example
PCA - Wrong for signal separation
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
Which is wrong for signal separation
ICA AlgorithmICA Step 1bull sphere the data
(shown on right in scatterplot view)
ICA AlgorithmICA Step 1 sphere the data
ICA AlgorithmICA Step 1bull ldquosphere the datardquo
(shown on right in scatterplot view)bull ie find linear transfrsquon to make
mean = cov = bull ie work with
0 I ˆ 21 XZ
ICA AlgorithmICA Step 1bull ldquosphere the datardquo
(shown on right in scatterplot view)bull ie find linear transfrsquon to make
mean = cov = bull ie work withbull requires of full rank
(at least ie no HDLSS)
0 I ˆ 21 XZ
dnX
ICA AlgorithmICA Step 1bull ldquosphere the datardquo
(shown on right in scatterplot view)bull ie find linear transfrsquon to make
mean = cov = bull ie work withbull requires of full rank
(at least ie no HDLSS)bull search for independent beyond
linear and quadratic structure
0 I ˆ 21 XZ
dnX
ICA AlgorithmICA Step 2bull Find dirrsquons that make (sphered) data
independent as possible
ICA AlgorithmICA Step 2bull Find dirrsquons that make (sphered) data
independent as possiblebull Recall ldquoindependencerdquo means
joint distribution is product of marginalsbull In cocktail party example
ICA AlgorithmICA Step 2 Cocktail party example
ICA AlgorithmICA Step 2bull Find dirrsquons that make (sphered) data
independent as possiblebull Recall ldquoindependencerdquo means
joint distribution is product of marginalsbull In cocktail party example
ndash Happens only when support parallel to axesndash Otherwise have blank areas
but marginals are non-zero
ICA AlgorithmParallel Idea (and key to algorithm) bull Find directions that maximize
non-Gaussianity
ICA AlgorithmParallel Idea (and key to algorithm) bull Find directions that maximize
non-Gaussianitybull Based on assumption starting from
independent coordinatesbull Note ldquomostrdquo projections are Gaussian
(since projection is ldquolinear combordquo)
ICA AlgorithmParallel Idea (and key to algorithm) bull Find directions that maximize
non-Gaussianitybull Based on assumption starting from
independent coordinatesbull Note ldquomostrdquo projections are Gaussian
(since projection is ldquolinear combordquo)bull Mathematics behind this
Diaconis and Freedman (1984)
ICA Algorithm
Worst case for ICA Gaussian
bull Then sphered data are independent
bull So have independence in all (ortho) dirrsquons
bull Thus canrsquot find useful directions
bull Gaussian distribution is characterized by
Independent amp spherically symmetric
ICA AlgorithmCriteria for non-Gaussianity
independence
bull Kurtosis
(4th order cumulant)bull Negative Entropybull Mutual Informationbull Nonparametric Maximum Likelihoodbull ldquoInfomaxrdquo in Neural Networksbull Interesting connections between these
224 3 EXEX
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iteratively
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iterativelybull Or by simultaneous optimization
(note PCA does both but not ICA)
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iterativelybull Or by simultaneous optimization
(note PCA does both but not ICA)bull Appears fast with good defaultsbull Careful about local optima
(Recco several random restarts)
ICA AlgorithmFastICA Notational Summary
1First sphere data
2Apply ICA Find to make
rows of ldquoindeprsquotrdquo
3Can transform back to
original data scale
ˆ 21 XZ
SW
ZWS SS
SSS 21ˆˆ
ICA AlgorithmCareful look at identifiability
bull Seen already in above example
bull Could rotate ldquosquare of datardquo in several ways etc
ICA AlgorithmCareful look at identifiability
ICA AlgorithmIdentifiability Swap Flips
ICA AlgorithmIdentifiability problem 1
Generally canrsquot order rows of (amp )
(seen as swap above)
Since for a ldquopermutation matrixrdquo
(pre-multiplication by ldquoswaps rowsrdquo)
(post-multiplication by ldquoswaps columnsrdquo)
For each column ie
ie
SS S
P
PP
SS sAz
zPWzPAsP SSS 1
zAs SS
1
ICA AlgorithmIdentifiability problem 1
Generally canrsquot order rows of (amp )
Since for a ldquopermutation matrixrdquo
For each column
So and are also ICA solutrsquons
(ie )
FastICA appears to order in terms of ldquohow non-Gaussianrdquo
SS S
PzPWzPAsP SSS
1
SPS SPW
ZPWPS SS
ICA AlgorithmIdentifiability problem 2
Canrsquot find scale of elements of
(seen as flips above)
Since for a (full rank) diagonal matrix
(pre-multiplrsquon by is scalar multrsquon of rows)
(post-multiplrsquon by is scalar multrsquon of colrsquos)
Again for each colrsquon
ie
So and are also ICA solutions
D
s
SDW
D
D
SDSzDWzDAsD SSs
1
zAs SS
1
ICA AlgorithmSignal Processing Scale identification
(Hyvaumlrinen and Oja 1999)
Choose scale so each signal
has ldquounit average energyrdquo
(preserves energy along rows of data matrix)
Explains ldquosame scalesrdquo in
Cocktail Party Example
)(tsi1)( 2
ti ts
ICA amp Non-Gaussianity
Explore main ICA principle
For indep non-Gaussian
standardized rvrsquos
Projections farther from coordinate axes
are more Gaussian
dX
X
X 1
ICA amp Non-Gaussianity
Explore main ICA principle
Projections farther from coordinate axes
are more Gaussian
For the dirrsquon vector
where
(thus )
have for large and
kd
k
k
u
u
u
1
dkj
kjku ki
10
121
)10(NXud
t
k k d
1ku
ICA amp Non-Gaussianity
Illustrative examples (d = 100 n = 500)
aUniform Marginals
bExponential Marginals
cBimodal Marginals
Study over range of values of k
ICA amp Non-Gaussianity
Illustrative example - Uniform marginals
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
[direction of maximal variation
doesnrsquot solve this problem]
2ICA (will describe method later)
ICA Motivating Example
2 ICA (will describe method later)
ICA Motivating Example
Solutions for Cocktail Party Problem
1PCA (on population of 2-d vectors)
[direction of maximal variation
doesnrsquot solve this problem]
2ICA (will describe method later)
[Independent Components do solve it]
[modulo sign changes and identification]
ICA Motivating Example
Recall original time series
ICA Motivating ExampleRelation to OODA recall data matrix
dnd
n
n
XX
XX
XXX
1
111
1
ICA Motivating ExampleRelation to OODA recall data matrix
Signal Processing focus on rows
( time series indexed by )
OODA focus on columns as data objects
( data vectors)
dnd
n
n
XX
XX
XXX
1
111
1
d nt 1
n
ICA Motivating ExampleRelation to OODA recall data matrix
Signal Processing focus on rows
( time series indexed by )
OODA focus on columns as data objects
( data vectors)
Note 2 viewpoints like ldquodualsrdquo for PCA
dnd
n
n
XX
XX
XXX
1
111
1
d nt 1
n
ICA Motivating Example
Scatterplot View (signal processing): Study
• Signals $\{(s_1(t), s_2(t)) : t = 1, \ldots, n\}$
• Corresponding Scatterplot
• Data $\{(x_1(t), x_2(t)) : t = 1, \ldots, n\}$
• Corresponding Scatterplot
ICA Motivating Example
Scatterplot View (signal processing): Plot
• Signals & Scat'plot
• Data & Scat'plot
• Scatterplots give hint how ICA is possible
• Affine transformation $x = A s$ stretches indep't signals into dep't
• Inversion is key to ICA (even w/ $A$ unknown)
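A small sketch of the mixing step $x = A s$, assuming NumPy. The uniform signals, the seed, and the mixing matrix `A` are hypothetical stand-ins for the cocktail party signals, which are not given numerically in the slides.

```python
import numpy as np

rng = np.random.default_rng(1)            # arbitrary seed
n = 5000
# Two independent, standardized, non-Gaussian signals (uniform is an assumption).
s = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, n))
A = np.array([[1.0, 0.5],
              [0.5, 1.0]])                # hypothetical mixing matrix
x = A @ s                                 # data: each column is x(t) = A s(t)

corr_s = np.corrcoef(s)[0, 1]             # independent signals: close to 0
corr_x = np.corrcoef(x)[0, 1]             # mixed data: clearly dependent
s_back = np.linalg.solve(A, x)            # inversion recovers s when A is known
print(corr_s, corr_x, np.allclose(s_back, s))
```

With this particular `A` the population correlation of the two data rows is 0.8. The inversion here uses the known `A`; the point of ICA is to find such an inverse when `A` is unknown.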
ICA Motivating Example
Signals - Corresponding Scatterplot
Note: Independent, since known value of s1 does not change distribution of s2
ICA Motivating Example
Data - Corresponding Scatterplot
Note: Dependent, since known value of x1 changes distribution of x2
ICA Motivating Example
Why not PCA?
• Finds direction of greatest variation
• Which is wrong for signal separation
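One way to see the PCA failure numerically, as a sketch under assumed inputs: the uniform signals and the mixing matrix `A` are hypothetical, and PCA is done by a plain eigen-decomposition of the sample covariance.

```python
import numpy as np

rng = np.random.default_rng(2)            # arbitrary seed
n = 5000
s = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, n))   # independent signals
A = np.array([[1.0, 0.5],
              [0.5, 1.0]])                # hypothetical mixing matrix
x = A @ s

xc = x - x.mean(axis=1, keepdims=True)
evals, evecs = np.linalg.eigh(xc @ xc.T / n)   # eigh sorts eigenvalues ascending
pc1 = evecs[:, -1]                        # direction of greatest variation
scores = pc1 @ xc                         # PC1 scores over time

# Correlation of the PC1 scores with each true signal (sign is arbitrary).
corr1 = abs(np.corrcoef(scores, s[0])[0, 1])
corr2 = abs(np.corrcoef(scores, s[1])[0, 1])
print(round(corr1, 2), round(corr2, 2))
```

For this `A` the first principal direction is close to $(1, 1)/\sqrt{2}$, so the PC1 scores blend both signals about equally instead of isolating either one.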
ICA Algorithm
ICA Step 1:
• "sphere the data" (shown on right in scatterplot view)
• i.e. find linear transf'n to make mean = $0$, cov = $I$
• i.e. work with $Z = \hat{\Sigma}^{-1/2} X$
• requires $X$ of full rank (at least $n \ge d$, i.e. no HDLSS)
• search for independence, beyond linear and quadratic structure
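A possible implementation of the sphering step, assuming NumPy. The function name `sphere`, the explicit mean-centering, the singularity threshold, and the test data are assumptions added for illustration; the slides only give $Z = \hat{\Sigma}^{-1/2} X$.

```python
import numpy as np

def sphere(X):
    """Return Z = Sigma_hat^{-1/2} (X - row means) for a d x n data matrix X."""
    Xc = X - X.mean(axis=1, keepdims=True)            # make mean = 0
    sigma_hat = Xc @ Xc.T / (X.shape[1] - 1)          # sample covariance
    evals, evecs = np.linalg.eigh(sigma_hat)
    if evals.min() <= 1e-12 * evals.max():            # hypothetical tolerance
        raise ValueError("covariance is singular; sphering needs full rank")
    root_inv = evecs @ np.diag(evals ** -0.5) @ evecs.T   # Sigma_hat^{-1/2}
    return root_inv @ Xc

rng = np.random.default_rng(3)                        # arbitrary seed
A = np.array([[1.0, 0.5], [0.5, 1.0]])                # hypothetical mixing
X = A @ rng.uniform(-1, 1, size=(2, 1000)) + np.array([[2.0], [-1.0]])
Z = sphere(X)
print(np.round(Z.mean(axis=1), 6))
print(np.round(np.cov(Z), 6))
```

The rank check is where the "no HDLSS" restriction shows up: with fewer data vectors than dimensions the sample covariance has zero eigenvalues and $\hat{\Sigma}^{-1/2}$ does not exist.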
ICA Algorithm
ICA Step 2:
• Find dir'ns that make (sphered) data independent as possible
• Recall "independence" means: joint distribution is product of marginals
• In cocktail party example:
  - Happens only when support parallel to axes
  - Otherwise have blank areas, but marginals are non-zero
ICA Algorithm
Parallel Idea (and key to algorithm):
• Find directions that maximize non-Gaussianity
• Based on assumption: starting from independent coordinates
• Note: "most" projections are Gaussian (since projection is "linear combo")
• Mathematics behind this: Diaconis and Freedman (1984)
ICA Algorithm
Worst case for ICA: Gaussian
• Then sphered data are independent
• So have independence in all (ortho) dir'ns
• Thus can't find useful directions
• Gaussian distribution is characterized by: Independent & spherically symmetric
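The Gaussian worst case can be illustrated with a short sketch, assuming NumPy and using the fourth cumulant as the non-Gaussianity measure. The helper names, the seed, the sample size, and the grid of angles are illustrative assumptions.

```python
import numpy as np

def fourth_cumulant(y):
    """Sample version of E y^4 - 3 (E y^2)^2 for centered y."""
    y = y - y.mean()
    return np.mean(y ** 4) - 3.0 * np.mean(y ** 2) ** 2

def cumulant_range(Z, n_angles=36):
    """Spread (max - min) of the fourth cumulant over directions (cos a, sin a)."""
    angles = np.linspace(0.0, np.pi, n_angles, endpoint=False)
    vals = [fourth_cumulant(np.cos(a) * Z[0] + np.sin(a) * Z[1]) for a in angles]
    return max(vals) - min(vals)

rng = np.random.default_rng(4)            # arbitrary seed
n = 20000
gauss = rng.standard_normal((2, n))                          # sphered Gaussian
unif = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, n))     # sphered uniform

print(round(cumulant_range(gauss), 3), round(cumulant_range(unif), 3))
```

For the Gaussian data every direction looks alike, so the spread is only sampling noise and no direction stands out. For the uniform data the cumulant changes with the angle, which is what a direction search can exploit.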
ICA Algorithm
Criteria for non-Gaussianity / independence:
• Kurtosis $E X^4 - 3 (E X^2)^2$ (4th order cumulant)
• Negative Entropy
• Mutual Information
• Nonparametric Maximum Likelihood
• "Infomax" in Neural Networks
• Interesting connections between these
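As a worked illustration of the kurtosis criterion (the two distributions are chosen here as examples, both standardized to mean 0 and variance 1):

```latex
\begin{align*}
\text{Gaussian } N(0,1):\quad
  & E X^4 = 3,
  && E X^4 - 3 (E X^2)^2 = 3 - 3 = 0 \\
\text{Uniform}(-\sqrt{3}, \sqrt{3}):\quad
  & E X^4 = \frac{1}{2\sqrt{3}} \int_{-\sqrt{3}}^{\sqrt{3}} x^4 \, dx = \frac{9}{5},
  && E X^4 - 3 (E X^2)^2 = \frac{9}{5} - 3 = -\frac{6}{5}
\end{align*}
```

So the criterion is exactly 0 for the Gaussian and moves away from 0 for a non-Gaussian marginal, here in the negative direction.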
ICA Algorithm
Matlab Algorithm (optimizing any of above): "FastICA"
• Numerical gradient search method
• Can find directions iteratively
• Or by simultaneous optimization (note: PCA does both, but not ICA)
• Appears fast, with good defaults
• Careful about local optima (Recco: several random restarts)
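The following is a rough Python sketch of a one-direction search with random restarts, not the Matlab FastICA package itself. It assumes the kurtosis-based fixed-point update $w \leftarrow E\{z (w^t z)^3\} - 3w$ followed by renormalization; the function names, the number of restarts, the tolerance, and the test data are all hypothetical.

```python
import numpy as np

def sphere(x):
    """Center the d x n matrix x and multiply by Sigma_hat^{-1/2}."""
    xc = x - x.mean(axis=1, keepdims=True)
    evals, evecs = np.linalg.eigh(xc @ xc.T / xc.shape[1])
    return evecs @ np.diag(evals ** -0.5) @ evecs.T @ xc

def one_direction(z, rng, n_restarts=5, n_iter=200, tol=1e-10):
    """Search for one unit vector w making w^t z as non-Gaussian as possible."""
    best_w, best_score = None, -np.inf
    for restart in range(n_restarts):              # several random restarts
        w = rng.standard_normal(z.shape[0])
        w /= np.linalg.norm(w)
        for step in range(n_iter):
            y = w @ z
            w_new = (z * y ** 3).mean(axis=1) - 3.0 * w   # fixed-point step
            w_new /= np.linalg.norm(w_new)
            done = 1.0 - abs(w_new @ w) < tol      # converged, up to sign
            w = w_new
            if done:
                break
        score = abs(np.mean((w @ z) ** 4) - 3.0)   # |kurtosis| of projection
        if score > best_score:
            best_w, best_score = w, score
    return best_w

rng = np.random.default_rng(5)                     # arbitrary seed
s = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, 5000))   # hidden signals
A = np.array([[1.0, 0.5], [0.5, 1.0]])             # hypothetical mixing matrix
z = sphere(A @ s)
w = one_direction(z, rng)
recovered = w @ z
match = max(abs(np.corrcoef(recovered, s[i])[0, 1]) for i in range(2))
print(round(match, 3))
```

Keeping the restart with the largest absolute kurtosis is one simple way to guard against poor local optima. Which signal is recovered, and with which sign, depends on the random start.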
ICA Algorithm
FastICA Notational Summary:
1. First sphere data: $Z = \hat{\Sigma}^{-1/2} X$
2. Apply ICA: Find $W_S$ to make rows of $S_S = W_S Z$ "indep't"
3. Can transform back to original data scale: $S = \hat{\Sigma}^{1/2} S_S$
ICA Algorithm
Careful look at identifiability
• Seen already in above example
• Could rotate "square of data" in several ways, etc.
ICA Algorithm
Identifiability: Swap, Flips
ICA Algorithm
Identifiability problem 1:
Generally can't order rows of $S_S$ (& $S$) (seen as swap above)
Since for a "permutation matrix" $P$
(pre-multiplication by $P$ "swaps rows")
(post-multiplication by $P$ "swaps columns")
For each column, $z = A_S s_S$, i.e. $s_S = A_S^{-1} z$,
i.e. $P s_S = P A_S^{-1} z = P W_S z$
So $P S_S$ and $P W_S$ are also ICA solut'ns (i.e. $P S_S = P W_S Z$)
FastICA appears to order in terms of "how non-Gaussian"
ICA Algorithm
Identifiability problem 2:
Can't find scale of elements of $s$ (seen as flips above)
Since for a (full rank) diagonal matrix $D$
(pre-multipl'n by $D$ is scalar mult'n of rows)
(post-multipl'n by $D$ is scalar mult'n of col's)
Again for each col'n, $s_S = A_S^{-1} z$,
i.e. $D s_S = D A_S^{-1} z = D W_S z$
So $D S_S$ and $D W_S$ are also ICA solutions
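Both identifiability problems come down to simple matrix identities, which can be checked numerically. This is a sketch assuming NumPy; the random `Z` and `W_S` are stand-ins (not an actual ICA fit), and the particular `P` and `D` are arbitrary examples.

```python
import numpy as np

rng = np.random.default_rng(6)        # arbitrary seed
Z = rng.standard_normal((3, 8))       # stand-in for sphered data (d = 3, n = 8)
W_S = rng.standard_normal((3, 3))     # stand-in for an unmixing matrix
S_S = W_S @ Z

P = np.eye(3)[[2, 0, 1]]              # permutation matrix: reorders rows
D = np.diag([1.0, -1.0, 2.0])         # full-rank diagonal: flips / rescales rows

swap_ok = np.allclose(P @ S_S, (P @ W_S) @ Z)    # P S_S = (P W_S) Z
scale_ok = np.allclose(D @ S_S, (D @ W_S) @ Z)   # D S_S = (D W_S) Z
row_moved = np.allclose((P @ S_S)[0], S_S[2])    # pre-multiplication swaps rows
row_flipped = np.allclose((D @ S_S)[1], -S_S[1]) # a -1 entry flips a row
print(swap_ok, scale_ok, row_moved, row_flipped)
```

Since the permuted or rescaled rows are still independent whenever the original rows are, the data alone cannot distinguish $(S_S, W_S)$ from $(P S_S, P W_S)$ or $(D S_S, D W_S)$.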
ICA Algorithm
Signal Processing Scale identification (Hyvärinen and Oja 1999):
Choose scale so each signal $s_i(t)$ has "unit average energy": $\sum_t s_i(t)^2 = 1$
(preserves energy along rows of data matrix)
Explains "same scales" in Cocktail Party Example
ICA & Non-Gaussianity
Explore main ICA principle:
For indep., non-Gaussian, standardized r.v.'s $X = (X_1, \ldots, X_d)^t$
Projections farther from coordinate axes are more Gaussian
ICA Motivating ExampleRelation to OODA recall data matrix
dnd
n
n
XX
XX
XXX
1
111
1
ICA Motivating ExampleRelation to OODA recall data matrix
Signal Processing focus on rows
( time series indexed by )
OODA focus on columns as data objects
( data vectors)
dnd
n
n
XX
XX
XXX
1
111
1
d nt 1
n
ICA Motivating ExampleRelation to OODA recall data matrix
Signal Processing focus on rows
( time series indexed by )
OODA focus on columns as data objects
( data vectors)
Note 2 viewpoints like ldquodualsrdquo for PCA
dnd
n
n
XX
XX
XXX
1
111
1
d nt 1
n
ICA Motivating Example
Scatterplot View (signal processing): Study
• Signals $s_1(t), s_2(t)$, $t = 1, \dots, n$
• Corresponding Scatterplot
• Data $x_1(t), x_2(t)$, $t = 1, \dots, n$
• Corresponding Scatterplot
ICA Motivating Example
Scatterplot View (signal processing): Plot
• Signals & Scatterplot
• Data & Scatterplot
• Scatterplots give hint how ICA is possible
• Affine transformation $x = A s$
  stretches independent signals into dependent
• Inversion is key to ICA
  (even with $A$ unknown)
ICA Motivating Example
Signals - Corresponding Scatterplot
Note: Independent, since known value of $s_1$ does not change distribution of $s_2$
ICA Motivating Example
Data - Corresponding Scatterplot
Note: Dependent, since known value of $x_1$ changes distribution of $x_2$
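A small sketch of how mixing turns independent signals into dependent data. The uniform signals and the matrix `A` are hypothetical stand-ins chosen here (the slides use their own cocktail-party signals), and correlation is used only as a crude symptom of dependence.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000

# Two independent, non-Gaussian toy signals (rows are time series).
s = rng.uniform(-1.0, 1.0, size=(2, n))

A = np.array([[1.0, 0.6],
              [0.4, 1.0]])         # hypothetical mixing matrix
x = A @ s                          # data: x = A s, column by column

corr_signals = np.corrcoef(s)[0, 1]   # near 0 for independent signals
corr_data = np.corrcoef(x)[0, 1]      # clearly positive after mixing

# With A known, inversion recovers the signals exactly;
# ICA aims at the same inversion without knowing A.
s_recovered = np.linalg.solve(A, x)

print(round(corr_signals, 3), round(corr_data, 3))
```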
ICA Motivating Example
Why not PCA?
Finds direction of greatest variation
Which is wrong for signal separation
ICA Algorithm
ICA Step 1:
• "sphere the data"
  (shown on right in scatterplot view)
• i.e. find linear transformation to make
  mean = 0, cov = I
• i.e. work with $Z = \hat{\Sigma}^{-1/2} X$
• requires $X$ of full rank
  (at least $n \ge d$, i.e. no HDLSS)
• search for independence beyond
  linear and quadratic structure
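Step 1 can be sketched as follows. Assumptions made here, not stated on the slide: rows are mean-centered before applying $\hat{\Sigma}^{-1/2}$ (needed to get mean = 0), the sample covariance uses divisor n - 1, and the inverse square root comes from an eigendecomposition. The function name `sphere` and the toy data are illustrative only.

```python
import numpy as np

def sphere(X):
    """Return Z = Sigma_hat^{-1/2} (X - row means) for a d x n data matrix X."""
    Xc = X - X.mean(axis=1, keepdims=True)
    sigma_hat = Xc @ Xc.T / (X.shape[1] - 1)
    evals, evecs = np.linalg.eigh(sigma_hat)
    if evals.min() <= 1e-12 * evals.max():
        # Rank-deficient covariance, as happens for HDLSS data (d > n).
        raise ValueError("covariance is not full rank; cannot sphere")
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    return inv_sqrt @ Xc

rng = np.random.default_rng(2)
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])                       # hypothetical mixing matrix
X = A @ rng.uniform(-1.0, 1.0, size=(2, 1000)) + np.array([[3.0], [-1.0]])
Z = sphere(X)

print(np.round(Z.mean(axis=1), 6))
print(np.round(np.cov(Z), 6))
```

After sphering, all linear (mean) and quadratic (covariance) structure is removed, so whatever dependence remains is what Step 2 searches for.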
ICA Algorithm
ICA Step 2:
• Find directions that make (sphered) data
  independent as possible
• Recall "independence" means:
  joint distribution is product of marginals
• In cocktail party example:
  – Happens only when support parallel to axes
  – Otherwise have blank areas,
    but marginals are non-zero
ICA Algorithm
Parallel Idea (and key to algorithm):
• Find directions that maximize
  non-Gaussianity
• Based on assumption: starting from
  independent coordinates
• Note: "most" projections are Gaussian
  (since projection is "linear combo")
• Mathematics behind this:
  Diaconis and Freedman (1984)
ICA Algorithm
Worst case for ICA: Gaussian
• Then sphered data are independent
• So have independence in all (ortho) directions
• Thus can't find useful directions
• Gaussian distribution is characterized by:
  Independent & spherically symmetric
ICA Algorithm
Criteria for non-Gaussianity / independence:
• Kurtosis $E X^4 - 3 (E X^2)^2$
  (4th order cumulant)
• Negative Entropy
• Mutual Information
• Nonparametric Maximum Likelihood
• "Infomax" in Neural Networks
• Interesting connections between these
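The kurtosis criterion is the easiest of these to compute. A minimal sketch, assuming a mean-centered sample version of $E X^4 - 3 (E X^2)^2$; the function name and the three test distributions are choices made here for illustration.

```python
import numpy as np

def kurtosis_cumulant(x):
    """Sample analogue of E X^4 - 3 (E X^2)^2 for a mean-centered sample."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float(np.mean(x ** 4) - 3.0 * np.mean(x ** 2) ** 2)

rng = np.random.default_rng(3)
n = 100_000

# All three samples are scaled to variance 1.
k_gauss = kurtosis_cumulant(rng.standard_normal(n))
k_unif = kurtosis_cumulant(rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), n))
k_laplace = kurtosis_cumulant(rng.laplace(0.0, 1.0 / np.sqrt(2.0), n))

print(round(k_gauss, 3), round(k_unif, 3), round(k_laplace, 3))
```

The Gaussian sample should give a value near 0, the uniform a negative value, and the Laplace a positive value, so the absolute value of this quantity is one way to score "how non-Gaussian" a projection is.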
ICA Algorithm
Matlab Algorithm (optimizing any of above):
"FastICA"
• Numerical gradient search method
• Can find directions iteratively
• Or by simultaneous optimization
  (note: PCA does both, but not ICA)
• Appears fast, with good defaults
• Careful about local optima
  (Recommend: several random restarts)
ICA Algorithm
FastICA Notational Summary
1. First sphere data: $Z = \hat{\Sigma}^{-1/2} X$
2. Apply ICA: Find $W_S$ to make
   rows of $S_S = W_S Z$ "independent"
3. Can transform back to
   original data scale: $S = \hat{\Sigma}^{1/2} S_S$
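Steps 1 and 2 of the summary can be sketched end to end. This is not the Matlab FastICA package: it swaps in a simple kurtosis-based fixed-point update, finds the rows of $W_S$ one at a time, and keeps the best of a few random restarts. All function names, the restart count, and the toy data are assumptions made here.

```python
import numpy as np

def sphere(X):
    """Step 1: Z = Sigma_hat^{-1/2} (X - row means), for d x n data X."""
    Xc = X - X.mean(axis=1, keepdims=True)
    evals, evecs = np.linalg.eigh(Xc @ Xc.T / (X.shape[1] - 1))
    return evecs @ np.diag(evals ** -0.5) @ evecs.T @ Xc

def one_direction(Z, W_found, rng, n_iter=200, tol=1e-10):
    """One unit vector w, orthogonal to rows of W_found, with extreme kurtosis."""
    w = rng.standard_normal(Z.shape[0])
    for _ in range(n_iter):
        w = w - W_found.T @ (W_found @ w)        # stay orthogonal to earlier rows
        w = w / np.linalg.norm(w)
        w_new = (Z * (w @ Z) ** 3).mean(axis=1) - 3.0 * w   # kurtosis update
        w_new = w_new - W_found.T @ (W_found @ w_new)
        w_new = w_new / np.linalg.norm(w_new)
        converged = abs(abs(w_new @ w) - 1.0) < tol          # same up to sign
        w = w_new
        if converged:
            break
    y = w @ Z
    return w, abs(np.mean(y ** 4) - 3.0)         # score: |excess kurtosis|

def ica_by_kurtosis(Z, n_restarts=5, seed=0):
    """Step 2: build W_S row by row, keeping the best of several restarts."""
    rng = np.random.default_rng(seed)
    d = Z.shape[0]
    W_S = np.zeros((0, d))
    for _ in range(d):
        candidates = [one_direction(Z, W_S, rng) for _ in range(n_restarts)]
        best_w, _ = max(candidates, key=lambda c: c[1])
        W_S = np.vstack([W_S, best_w])
    return W_S

# Toy check on hypothetical data: two uniform sources, assumed mixing matrix.
rng = np.random.default_rng(4)
s_true = rng.uniform(-1.0, 1.0, size=(2, 2000))
X = np.array([[1.0, 0.6], [0.4, 1.0]]) @ s_true
Z = sphere(X)
W_S = ica_by_kurtosis(Z)
S_S = W_S @ Z                                    # rows are estimated sources
```

The rows of `S_S` should match the rows of `s_true` only up to order, sign, and scale, which is why a comparison has to use absolute correlations rather than direct equality.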
ICA Algorithm
Careful look at identifiability
• Seen already in above example
• Could rotate "square of data" in several ways, etc.
ICA Algorithm
Identifiability: Swap, Flips
ICA Algorithm
Identifiability problem 1
Generally can't order rows of $S_S$ (& $S$)
(seen as swap above)
Since for a "permutation matrix" $P$
(pre-multiplication by $P$ "swaps rows")
(post-multiplication by $P$ "swaps columns")
For each column $z = A_S s_S$, i.e. $s_S = A_S^{-1} z$
i.e. $P s_S = P A_S^{-1} z = P W_S z$
So $P S_S$ and $P W_S$ are also ICA solutions
(i.e. $P S_S = P W_S Z$)
FastICA appears to order in terms of "how non-Gaussian"
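The ordering problem can be checked numerically. A sketch on hypothetical inputs: three uniform sources, an assumed full-rank `A_S`, and one particular permutation; none of these come from the slides.

```python
import numpy as np

rng = np.random.default_rng(5)

S_S = rng.uniform(-1.0, 1.0, size=(3, 500))      # toy independent sources
A_S = np.array([[2.0, 1.0, 0.0],
                [0.0, 1.0, 1.0],
                [1.0, 0.0, 3.0]])                # assumed full-rank mixing matrix
Z = A_S @ S_S
W_S = np.linalg.inv(A_S)

perm = [2, 0, 1]
P = np.eye(3)[perm]                              # permutation matrix

rows_swapped = np.allclose(P @ S_S, S_S[perm])           # pre-multiply: swaps rows
cols_swapped = np.allclose(W_S @ P.T, W_S[:, perm])      # post-multiply: swaps columns
same_fit = np.allclose(P @ S_S, (P @ W_S) @ Z)           # P S_S = (P W_S) Z

print(rows_swapped, cols_swapped, same_fit)
```

Since the permuted rows are still independent, nothing in the data distinguishes $(P S_S, P W_S)$ from $(S_S, W_S)$.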
ICA Motivating Example
Study Signals nttsts 1)()( 21
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals
bull Corresponding Scatterplot
nttsts 1)()( 21
ICA Motivating Example
Signals - Corresponding Scatterplot
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals
bull Corresponding Scatterplot
bull Data
nttsts 1)()( 21
nttxtx 1)()( 21
ICA Motivating Example
Study Data nttxtx 1)()( 21
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals
bull Corresponding Scatterplot
bull Data
bull Corresponding Scatterplot
nttsts 1)()( 21
nttxtx 1)()( 21
ICA Motivating Example
Data - Corresponding Scatterplot
ICA Motivating Example
Scatterplot View (signal processing) Plot
bull Signals amp Scatrsquoplot
bull Data amp Scatrsquoplot
bull Scatterplots give hint how ICA is possible
bull Affine transformation
Stretches indeprsquot signals into deprsquot
bull Inversion is key to ICA
(even w unknown)
nttsts 1)()( 21
nttxtx 1)()( 21
A
sAx
ICA Motivating Example
Signals - Corresponding Scatterplot
Note Independent
Since Known Value
Of s1 Does Not
Change Distribution
of s2
ICA Motivating Example
Data - Corresponding Scatterplot
Note Dependent
Since Known Value
Of s1 Changes
Distribution of s2
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
ICA Motivating Example
PCA - Finds direction of greatest variation
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
Which is wrong for signal separation
ICA Motivating Example
PCA - Wrong for signal separation
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
Which is wrong for signal separation
ICA Algorithm
ICA Step 1:
• "Sphere the data"
  (shown on right in scatterplot view)
• i.e. find linear transformation to make
  mean = $0$, cov = $I$
• i.e. work with $Z = \hat{\Sigma}^{-1/2} X$
• requires $X$ of full rank
  (at least $d \le n$, i.e. no HDLSS)
• search for independence beyond
  linear and quadratic structure
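Sphering can be sketched in two dimensions with the standard library alone. This is an illustrative sketch under stated assumptions: the data are made up, the data are centered explicitly before the transformation, and the inverse square root uses a closed form that holds only for 2x2 positive definite matrices (in general one would use an eigen-decomposition). The names `covariance` and `sphere` are hypothetical helpers.

```python
import math
import random

def covariance(pairs):
    """Return the means and the entries (a, b, c) of the covariance [[a, b], [b, c]]."""
    n = len(pairs)
    m1 = sum(x for x, _ in pairs) / n
    m2 = sum(y for _, y in pairs) / n
    a = sum((x - m1) ** 2 for x, _ in pairs) / n
    b = sum((x - m1) * (y - m2) for x, y in pairs) / n
    c = sum((y - m2) ** 2 for _, y in pairs) / n
    return (m1, m2), (a, b, c)

def sphere(pairs):
    """Center the data and multiply by the inverse square root of its covariance."""
    (m1, m2), (a, b, c) = covariance(pairs)
    s = math.sqrt(a * c - b * b)          # square root of the determinant
    t = math.sqrt(a + c + 2.0 * s)
    # Symmetric square root of the covariance: (Sigma + s I) / t (2x2 case only).
    r11, r12, r22 = (a + s) / t, b / t, (c + s) / t
    det = r11 * r22 - r12 * r12
    i11, i12, i22 = r22 / det, -r12 / det, r11 / det   # inverse of the square root
    return [(i11 * (x - m1) + i12 * (y - m2),
             i12 * (x - m1) + i22 * (y - m2)) for x, y in pairs]

rng = random.Random(0)
raw = []
for _ in range(1000):
    s1, s2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
    raw.append((s1 + 0.5 * s2, 0.5 * s1 + s2))   # made-up correlated data

z = sphere(raw)
mean_z, cov_z = covariance(z)
print([round(v, 6) for v in mean_z], [round(v, 6) for v in cov_z])
```

After sphering, the sample mean is 0 and the sample covariance is the identity, so any remaining structure lies beyond first and second moments.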
ICA Algorithm
ICA Step 2:
• Find directions that make (sphered) data
  as independent as possible
• Recall "independence" means:
  joint distribution is product of marginals
• In cocktail party example:
  - Happens only when support is parallel to axes
  - Otherwise have blank areas,
    but marginals are non-zero
ICA Algorithm
Parallel Idea (and key to algorithm):
• Find directions that maximize
  non-Gaussianity
• Based on assumption: starting from
  independent coordinates
• Note: "most" projections are Gaussian
  (since projection is "linear combination")
• Mathematics behind this:
  Diaconis and Freedman (1984)
ICA Algorithm
Worst case for ICA: Gaussian
• Then sphered data are independent
• So have independence in all (orthogonal) directions
• Thus can't find useful directions
• Gaussian distribution is characterized by:
  Independent & spherically symmetric
ICA Algorithm
Criteria for non-Gaussianity / independence:
• Kurtosis $E X^4 - 3 (E X^2)^2$
  (4th order cumulant)
• Negative Entropy
• Mutual Information
• Nonparametric Maximum Likelihood
• "Infomax" in Neural Networks
• Interesting connections between these
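The kurtosis criterion E X^4 - 3 (E X^2)^2 is easy to estimate from a sample. The sketch below is an illustration under assumptions: the data are centered at the sample mean first, and the two test distributions and the sample size are arbitrary choices. The value is near 0 for Gaussian data and clearly negative for uniform data.

```python
import random

def kurtosis(values):
    """Sample version of E X^4 - 3 (E X^2)^2, after centering at the sample mean."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m4 = sum((v - m) ** 4 for v in values) / n
    return m4 - 3.0 * m2 ** 2

rng = random.Random(0)
n = 20000
gaussian = [rng.gauss(0.0, 1.0) for _ in range(n)]
uniform = [rng.uniform(-1.0, 1.0) for _ in range(n)]

k_gauss = kurtosis(gaussian)
k_unif = kurtosis(uniform)
print(round(k_gauss, 3), round(k_unif, 3))
```

For the uniform distribution on [-1, 1] the population value is 1/5 - 3 (1/3)^2 = -2/15, so the second number should sit near -0.133.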
ICA Algorithm
Matlab Algorithm (optimizing any of above): "FastICA"
• Numerical gradient search method
• Can find directions iteratively
• Or by simultaneous optimization
  (note: PCA does both, but not ICA)
• Appears fast, with good defaults
• Careful about local optima
  (Recommend several random restarts)
ICA Algorithm
FastICA Notational Summary:
1. First sphere data: $Z = \hat{\Sigma}^{-1/2} X$
2. Apply ICA: find $W_S$ to make
   rows of $S_S = W_S Z$ "independent"
3. Can transform back to
   original data scale: $\hat{S} = \hat{\Sigma}^{1/2} S_S$
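The first two steps of this summary can be sketched end to end in two dimensions. This is not FastICA itself: in place of its numerical search, the sketch uses a brute-force grid over rotation angles and keeps the rotation whose two projections have the largest total absolute kurtosis. The sources, the mixing matrix A and all function names are assumptions made for illustration.

```python
import math
import random

def sphere(pairs):
    """Step 1: center 2-d data and apply the inverse square root of its covariance."""
    n = len(pairs)
    m1 = sum(x for x, _ in pairs) / n
    m2 = sum(y for _, y in pairs) / n
    cen = [(x - m1, y - m2) for x, y in pairs]
    a = sum(x * x for x, _ in cen) / n
    b = sum(x * y for x, y in cen) / n
    c = sum(y * y for _, y in cen) / n
    s = math.sqrt(a * c - b * b)
    t = math.sqrt(a + c + 2.0 * s)
    r11, r12, r22 = (a + s) / t, b / t, (c + s) / t   # square root (2x2 closed form)
    det = r11 * r22 - r12 * r12
    i11, i12, i22 = r22 / det, -r12 / det, r11 / det   # its inverse
    return [(i11 * x + i12 * y, i12 * x + i22 * y) for x, y in cen]

def kurtosis(values):
    """Sample version of E X^4 - 3 (E X^2)^2 for centered values."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m4 = sum((v - m) ** 4 for v in values) / n
    return m4 - 3.0 * m2 ** 2

def rotate(z, theta):
    """Rows of W z, where W is the rotation by angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [c * x + s * y for x, y in z], [-s * x + c * y for x, y in z]

def best_angle(z, steps=90):
    """Step 2 stand-in: grid search on [0, pi/2) for the most non-Gaussian rotation."""
    scored = []
    for i in range(steps):
        theta = (math.pi / 2.0) * i / steps
        p1, p2 = rotate(z, theta)
        scored.append((abs(kurtosis(p1)) + abs(kurtosis(p2)), theta))
    return max(scored)[1]

def abs_corr(u, v):
    """Absolute sample correlation between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cuv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    cuu = sum((a - mu) ** 2 for a in u)
    cvv = sum((b - mv) ** 2 for b in v)
    return abs(cuv) / math.sqrt(cuu * cvv)

rng = random.Random(2)
n = 1000
s1 = [rng.uniform(-1, 1) for _ in range(n)]   # hypothetical source signals
s2 = [rng.uniform(-1, 1) for _ in range(n)]
A = [[1.0, 0.5], [0.2, 1.0]]                  # assumed mixing matrix
x = [(A[0][0] * a + A[0][1] * b, A[1][0] * a + A[1][1] * b) for a, b in zip(s1, s2)]

z = sphere(x)
rec1, rec2 = rotate(z, best_angle(z))
match = [max(abs_corr(r, s1), abs_corr(r, s2)) for r in (rec1, rec2)]
print([round(m, 3) for m in match])
```

Restricting the angle to [0, pi/2) is enough because a further quarter turn only swaps the two recovered rows and flips a sign. The absolute correlations are used for the same reason: each recovered row should match one source up to order and sign.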
ICA Algorithm
Careful look at identifiability
• Seen already in above example
• Could rotate "square of data" in several ways, etc.
ICA Algorithm
Identifiability: Swap, Flips
ICA Algorithm
Identifiability problem 1:
Generally can't order rows of $S_S$ (& $\hat{S}$)
(seen as swap above)
Since for a "permutation matrix" $P$
(pre-multiplication by $P$ "swaps rows")
(post-multiplication by $P$ "swaps columns")
For each column, $z = A_S s_S$, i.e. $s_S = A_S^{-1} z$,
i.e. $P s_S = P A_S^{-1} z = P W_S z$
So $P S_S$ and $P W_S$ are also ICA solutions
(i.e. $P S_S = P W_S Z$)
FastICA appears to order in terms of "how non-Gaussian"
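A tiny numerical check of the permutation argument. The matrices and the vector are made-up integers, and `matvec` and `matmul` are hypothetical helpers. If W recovers s from z, then P W recovers the same entries in swapped order, so both are equally valid unmixing matrices.

```python
def matvec(M, v):
    """Multiply a 2x2 matrix (list of rows) by a length-2 vector."""
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

def matmul(M, N):
    """Multiply two 2x2 matrices given as lists of rows."""
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[2, 1], [1, 1]]      # made-up mixing matrix
W = [[1, -1], [-1, 2]]    # its inverse, so W z recovers s
P = [[0, 1], [1, 0]]      # permutation matrix that swaps the two rows

s = [3, 5]
z = matvec(A, s)                      # observed column: [11, 8]
recovered = matvec(W, z)              # [3, 5]
swapped = matvec(matmul(P, W), z)     # [5, 3]
print(recovered, swapped)
```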
ICA Algorithm
Identifiability problem 2:
Can't find scale of elements of $s$
(seen as flips above)
Since for a (full rank) diagonal matrix $D$
(pre-multiplication by $D$ is scalar multiplication of rows)
(post-multiplication by $D$ is scalar multiplication of columns)
Again for each column, $s_S = A_S^{-1} z$,
i.e. $D s_S = D A_S^{-1} z = D W_S z$
So $D S_S$ and $D W_S$ are also ICA solutions
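The scale ambiguity can be seen from the mixing side as well. In this sketch (made-up numbers, hypothetical helper `matvec`), rescaling the sources by a diagonal D and compensating with A D^{-1} reproduces exactly the same observation z, so the data cannot distinguish s from D s. A negative diagonal entry gives the sign flip.

```python
def matvec(M, v):
    """Multiply a 2x2 matrix (list of rows) by a length-2 vector."""
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

A = [[2.0, 1.0], [1.0, 1.0]]   # made-up mixing matrix
D = [-1.0, 2.0]                # diagonal of D: one sign flip, one rescaling
s = [3.0, 5.0]

z = matvec(A, s)                                   # observed column
s_alt = [D[i] * s[i] for i in range(2)]            # D s
A_alt = [[A[i][j] / D[j] for j in range(2)] for i in range(2)]   # A D^{-1}
z_alt = matvec(A_alt, s_alt)                       # same observation

print(z, s_alt, z_alt)
```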
ICA Algorithm
Signal Processing Scale identification:
(Hyvärinen and Oja 1999)
Choose scale so each signal $s_i(t)$
has "unit average energy": $\sum_t s_i(t)^2 = 1$
(preserves energy along rows of data matrix)
Explains "same scales" in
Cocktail Party Example
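A minimal sketch of this rescaling, under a stated assumption: "unit average energy" is read here as the mean of the squared signal values being 1. The function name `rescale_unit_energy`, its `average` flag and the sample values are hypothetical. Passing `average=False` normalizes the plain sum of squares instead.

```python
import math

def rescale_unit_energy(signal, average=True):
    """Rescale a signal so its mean (or total) squared value equals 1."""
    energy = sum(v * v for v in signal)
    if average:
        energy /= len(signal)
    scale = math.sqrt(energy)
    return [v / scale for v in signal]

s = [3.0, -4.0, 0.0, 5.0]   # made-up signal values
u = rescale_unit_energy(s)
mean_energy = sum(v * v for v in u) / len(u)
total_energy = sum(v * v for v in rescale_unit_energy(s, average=False))
print(round(mean_energy, 12), round(total_energy, 12))
```

The rescaling divides by a positive number, so it fixes the size of each signal but leaves its sign undetermined.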
ICA & Non-Gaussianity
Explore main ICA principle:
For independent, non-Gaussian,
standardized random variables $X = (X_1, \dots, X_d)^t$
Projections farther from coordinate axes
are more Gaussian
For the direction vector $u_k = (u_{k,1}, \dots, u_{k,d})^t$
where $u_{k,j} = k^{-1/2}$ for $j = 1, \dots, k$ and $u_{k,j} = 0$ for $j = k+1, \dots, d$
(thus $\|u_k\| = 1$)
have $u_k^t X \approx N(0, 1)$ in distribution, for large $d$ and $k$
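This principle can be checked by simulation. The sketch below is illustrative only: it assumes standardized uniform marginals, borrows d = 100 and n = 500 as sizes, and picks the values of k arbitrarily. The sample kurtosis of the projection on u_k is far from 0 for k = 1 and shrinks toward the Gaussian value 0 as k grows.

```python
import math
import random

def kurtosis(values):
    """Sample version of E X^4 - 3 (E X^2)^2, after centering at the sample mean."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m4 = sum((v - m) ** 4 for v in values) / n
    return m4 - 3.0 * m2 ** 2

def project(row, k):
    """u_k^t x, where u_k has k leading entries k**-0.5 and zeros elsewhere."""
    return sum(row[:k]) / math.sqrt(k)

rng = random.Random(3)
d, n = 100, 500
half_width = math.sqrt(3.0)   # uniform on [-sqrt(3), sqrt(3)]: mean 0, variance 1
X = [[rng.uniform(-half_width, half_width) for _ in range(d)] for _ in range(n)]

kurt = {k: kurtosis([project(row, k) for row in X]) for k in (1, 4, 25, 100)}
for k, value in kurt.items():
    print(k, round(value, 2))
```

For these marginals the population kurtosis of the projection is -1.2 / k, so the printed values should move from about -1.2 toward 0, up to sampling noise.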
ICA & Non-Gaussianity
Illustrative examples (d = 100, n = 500):
a. Uniform Marginals
b. Exponential Marginals
c. Bimodal Marginals
Study over range of values of k
ICA & Non-Gaussianity
Illustrative example - Uniform marginals
ICA Motivating Example
Study Data nttxtx 1)()( 21
ICA Motivating Example
Scatterplot View (signal processing) Study
bull Signals
bull Corresponding Scatterplot
bull Data
bull Corresponding Scatterplot
nttsts 1)()( 21
nttxtx 1)()( 21
ICA Motivating Example
Data - Corresponding Scatterplot
ICA Motivating Example
Scatterplot View (signal processing) Plot
bull Signals amp Scatrsquoplot
bull Data amp Scatrsquoplot
bull Scatterplots give hint how ICA is possible
bull Affine transformation
Stretches indeprsquot signals into deprsquot
bull Inversion is key to ICA
(even w unknown)
nttsts 1)()( 21
nttxtx 1)()( 21
A
sAx
ICA Motivating Example
Signals - Corresponding Scatterplot
Note Independent
Since Known Value
Of s1 Does Not
Change Distribution
of s2
ICA Motivating Example
Data - Corresponding Scatterplot
Note Dependent
Since Known Value
Of s1 Changes
Distribution of s2
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
ICA Motivating Example
PCA - Finds direction of greatest variation
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
Which is wrong for signal separation
ICA Motivating Example
PCA - Wrong for signal separation
ICA Motivating Example
Why not PCA
Finds direction of greatest variation
Which is wrong for signal separation
ICA AlgorithmICA Step 1bull sphere the data
(shown on right in scatterplot view)
ICA AlgorithmICA Step 1 sphere the data
ICA AlgorithmICA Step 1bull ldquosphere the datardquo
(shown on right in scatterplot view)bull ie find linear transfrsquon to make
mean = cov = bull ie work with
0 I ˆ 21 XZ
ICA AlgorithmICA Step 1bull ldquosphere the datardquo
(shown on right in scatterplot view)bull ie find linear transfrsquon to make
mean = cov = bull ie work withbull requires of full rank
(at least ie no HDLSS)
0 I ˆ 21 XZ
dnX
ICA AlgorithmICA Step 1bull ldquosphere the datardquo
(shown on right in scatterplot view)bull ie find linear transfrsquon to make
mean = cov = bull ie work withbull requires of full rank
(at least ie no HDLSS)bull search for independent beyond
linear and quadratic structure
0 I ˆ 21 XZ
dnX
ICA AlgorithmICA Step 2bull Find dirrsquons that make (sphered) data
independent as possible
ICA AlgorithmICA Step 2bull Find dirrsquons that make (sphered) data
independent as possiblebull Recall ldquoindependencerdquo means
joint distribution is product of marginalsbull In cocktail party example
ICA AlgorithmICA Step 2 Cocktail party example
ICA AlgorithmICA Step 2bull Find dirrsquons that make (sphered) data
independent as possiblebull Recall ldquoindependencerdquo means
joint distribution is product of marginalsbull In cocktail party example
ndash Happens only when support parallel to axesndash Otherwise have blank areas
but marginals are non-zero
ICA AlgorithmParallel Idea (and key to algorithm) bull Find directions that maximize
non-Gaussianity
ICA AlgorithmParallel Idea (and key to algorithm) bull Find directions that maximize
non-Gaussianitybull Based on assumption starting from
independent coordinatesbull Note ldquomostrdquo projections are Gaussian
(since projection is ldquolinear combordquo)
ICA AlgorithmParallel Idea (and key to algorithm) bull Find directions that maximize
non-Gaussianitybull Based on assumption starting from
independent coordinatesbull Note ldquomostrdquo projections are Gaussian
(since projection is ldquolinear combordquo)bull Mathematics behind this
Diaconis and Freedman (1984)
ICA Algorithm
Worst case for ICA Gaussian
bull Then sphered data are independent
bull So have independence in all (ortho) dirrsquons
bull Thus canrsquot find useful directions
bull Gaussian distribution is characterized by
Independent amp spherically symmetric
ICA AlgorithmCriteria for non-Gaussianity
independence
bull Kurtosis
(4th order cumulant)bull Negative Entropybull Mutual Informationbull Nonparametric Maximum Likelihoodbull ldquoInfomaxrdquo in Neural Networksbull Interesting connections between these
224 3 EXEX
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iteratively
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iterativelybull Or by simultaneous optimization
(note PCA does both but not ICA)
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iterativelybull Or by simultaneous optimization
(note PCA does both but not ICA)bull Appears fast with good defaultsbull Careful about local optima
(Recco several random restarts)
ICA AlgorithmFastICA Notational Summary
1First sphere data
2Apply ICA Find to make
rows of ldquoindeprsquotrdquo
3Can transform back to
original data scale
ˆ 21 XZ
SW
ZWS SS
SSS 21ˆˆ
ICA AlgorithmCareful look at identifiability
bull Seen already in above example
bull Could rotate ldquosquare of datardquo in several ways etc
ICA AlgorithmCareful look at identifiability
ICA AlgorithmIdentifiability Swap Flips
ICA AlgorithmIdentifiability problem 1
Generally canrsquot order rows of (amp )
(seen as swap above)
Since for a ldquopermutation matrixrdquo
(pre-multiplication by ldquoswaps rowsrdquo)
(post-multiplication by ldquoswaps columnsrdquo)
For each column ie
ie
SS S
P
PP
SS sAz
zPWzPAsP SSS 1
zAs SS
1
ICA AlgorithmIdentifiability problem 1
Generally canrsquot order rows of (amp )
Since for a ldquopermutation matrixrdquo
For each column
So and are also ICA solutrsquons
(ie )
FastICA appears to order in terms of ldquohow non-Gaussianrdquo
SS S
PzPWzPAsP SSS
1
SPS SPW
ZPWPS SS
ICA Algorithm
Identifiability problem 2:
Can’t find scale of elements of $s$
(seen as flips above)
Since for a (full rank) diagonal matrix $D$
(pre-multipl’n by $D$ is scalar mult’n of rows)
(post-multipl’n by $D$ is scalar mult’n of col’s)
Again for each col’n, $s_S = A_S^{-1} z$,
i.e. $D s_S = D A_S^{-1} z = D W_S z$
So $D S_S$ and $D W_S$ are also ICA solutions
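The same kind of check for the scale ambiguity, again only a sketch with hypothetical matrices. The diagonal entries of `D` are chosen so that one row is flipped in sign and the other is rescaled.

```python
import numpy as np

rng = np.random.default_rng(3)
S_S = rng.uniform(-1, 1, size=(2, 6))      # hypothetical sources, rows = signals
A_S = np.array([[1.0, 0.5],
                [0.3, 1.0]])               # hypothetical invertible mixing matrix
Z = A_S @ S_S
W_S = np.linalg.inv(A_S)

D = np.diag([-1.0, 2.5])                   # full rank diagonal: flip row 1, rescale row 2

# Pre-multiplication by D multiplies each row by a scalar.
rows_scaled = np.allclose(D @ S_S, np.vstack([-S_S[0], 2.5 * S_S[1]]))
# The rescaled pair is still a solution: D S_S = (D W_S) Z.
still_solution = np.allclose(D @ S_S, (D @ W_S) @ Z)
# The data are unchanged when the columns of A_S absorb the inverse scaling.
same_data = np.allclose(Z, (A_S @ np.linalg.inv(D)) @ (D @ S_S))
print(rows_scaled, still_solution, same_data)
```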
ICA Algorithm
Signal Processing Scale identification:
(Hyvärinen and Oja, 1999)
Choose scale so each signal $s_i(t)$ has “unit average energy”: $\sum_t s_i(t)^2 = 1$
(preserves energy along rows of data matrix)
Explains “same scales” in Cocktail Party Example
ICA & Non-Gaussianity
Explore main ICA principle:
For indep., non-Gaussian, standardized r.v.’s $X = (X_1, \ldots, X_d)^t$:
Projections farther from coordinate axes are more Gaussian
For the dir’n vector $u_k = (u_{k,1}, \ldots, u_{k,d})^t$,
where $u_{k,i} = k^{-1/2}$ for $i = 1, \ldots, k$ and $u_{k,i} = 0$ for $i = k+1, \ldots, d$
(thus $\|u_k\| = 1$)
have $u_k^t X \approx N(0,1)$ for large $k$ and $d$
ICA & Non-Gaussianity
Illustrative examples (d = 100, n = 500):
a. Uniform Marginals
b. Exponential Marginals
c. Bimodal Marginals
Study over range of values of k
[Figure: illustrative example, Uniform marginals]
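A rough numerical version of the uniform-marginals example, assuming the direction $u_k$ has $k$ equal entries $k^{-1/2}$ followed by zeros. For independent standardized coordinates the fourth cumulant of $u_k^t X$ is that of one coordinate divided by $k$, so for uniform marginals the excess kurtosis should be near $-1.2/k$. The helper `excess_kurtosis` and the particular values of k are choices made here for illustration.

```python
import numpy as np

def excess_kurtosis(y):
    """Sample excess kurtosis of a 1-D array (about 0 for Gaussian data)."""
    y = (y - y.mean()) / y.std()
    return np.mean(y ** 4) - 3.0

rng = np.random.default_rng(4)
d, n = 100, 500
# Independent uniform coordinates, standardized to mean 0 and variance 1.
X = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(d, n))

kurt = {}
for k in (1, 2, 5, 20, 100):
    u_k = np.zeros(d)
    u_k[:k] = 1.0 / np.sqrt(k)             # unit vector with k equal nonzero entries
    kurt[k] = excess_kurtosis(u_k @ X)
    print(f"k = {k:3d}   excess kurtosis = {kurt[k]: .3f}   (theory {-1.2 / k: .3f})")
```

With n = 500 the sample kurtosis is noisy for large k, so only the overall shrinkage toward 0 should be read from the output.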
ICA Motivating Example
Scatterplot View (signal processing): Study
• Signals $\{(s_1(t), s_2(t)) : t = 1, \ldots, n\}$
• Corresponding Scatterplot
• Data $\{(x_1(t), x_2(t)) : t = 1, \ldots, n\}$
• Corresponding Scatterplot
[Figure: Data, corresponding scatterplot]
ICA Motivating Example
Scatterplot View (signal processing): Plot
• Signals $\{(s_1(t), s_2(t)) : t = 1, \ldots, n\}$ & Scat’plot
• Data $\{(x_1(t), x_2(t)) : t = 1, \ldots, n\}$ & Scat’plot
• Scatterplots give hint how ICA is possible
• Affine transformation $x = A s$ stretches indep’t signals into dep’t
• “Inversion” is key to ICA (even w/ $A$ unknown)
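A small simulation of the relation $x = A s$, as a sketch: the uniform signals and the matrix `A` are hypothetical stand-ins for the signals in the slides. It shows independent signals becoming dependent data, and that inversion recovers the signals when $A$ is known (ICA has to manage without knowing $A$).

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1000
# Rows are s_1(t), s_2(t): independent, variance 1.
S = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, n))
A = np.array([[1.0, 0.5],
              [0.5, 1.0]])                 # hypothetical mixing matrix
X = A @ S                                  # rows are x_1(t), x_2(t)

corr_s = np.corrcoef(S)[0, 1]              # near 0 for the signals
corr_x = np.corrcoef(X)[0, 1]              # clearly positive for the data
S_rec = np.linalg.solve(A, X)              # inversion, possible here because A is known

print(f"corr(s1, s2) = {corr_s: .3f}")
print(f"corr(x1, x2) = {corr_x: .3f}")
print("signals recovered by inversion:", np.allclose(S_rec, S))
```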
ICA Motivating Example
Signals - Corresponding Scatterplot
Note: Independent, since known value of $s_1$ does not change distribution of $s_2$
ICA Motivating Example
Data - Corresponding Scatterplot
Note: Dependent, since known value of $x_1$ changes distribution of $x_2$
ICA Motivating Example
Why not PCA?
Finds direction of greatest variation
[Figure: PCA finds direction of greatest variation]
Which is wrong for signal separation
[Figure: PCA, wrong for signal separation]
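To make the PCA point concrete, here is a sketch with hypothetical uniform signals and a hypothetical symmetric mixing matrix. The first principal component direction has the greatest variation, but the projection onto it blends both signals instead of isolating one.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 2000
S = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, n))   # independent signals
A = np.array([[1.0, 0.5],
              [0.5, 1.0]])                              # hypothetical mixing matrix
X = A @ S

Xc = X - X.mean(axis=1, keepdims=True)
evals, evecs = np.linalg.eigh(np.cov(Xc))
pc1 = evecs[:, -1]                 # eigh returns eigenvalues in ascending order
score = pc1 @ Xc                   # projection on direction of greatest variation

corr = [abs(np.corrcoef(score, S[i])[0, 1]) for i in range(2)]
print("PC1 direction:", np.round(pc1, 3))
print("|corr| of PC1 scores with each signal:", np.round(corr, 3))
```

For this choice of `A` both correlations come out near 0.7, so the first principal component is about an equal mix of the two signals.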
ICA Algorithm
ICA Step 1:
• “sphere the data”
  (shown on right in scatterplot view)
• i.e. find linear transf’n to make mean = $0$, cov = $I$
• i.e. work with $Z = \hat\Sigma^{-1/2} X$
• requires $X$ of full rank
  (at least $n \ge d$, i.e. no HDLSS)
• search for independence, beyond linear and quadratic structure
[Figure: ICA Step 1, sphere the data]
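The full rank requirement can be checked numerically. In this sketch (dimensions chosen arbitrarily, Gaussian data used only as filler) the sample covariance of centered data has rank at most n − 1, so in an HDLSS setting it is singular and $\hat\Sigma^{-1/2}$ does not exist.

```python
import numpy as np

rng = np.random.default_rng(7)

def cov_rank(d, n):
    """Rank of the d x d sample covariance of a random d x n data matrix."""
    X = rng.standard_normal((d, n))
    return np.linalg.matrix_rank(np.cov(X))

rank_ok = cov_rank(d=3, n=50)       # n >= d: covariance is full rank
rank_hdlss = cov_rank(d=10, n=5)    # HDLSS, d > n: covariance is rank deficient

print("d = 3,  n = 50: rank", rank_ok, "of 3")
print("d = 10, n = 5:  rank", rank_hdlss, "of 10")
```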
ICA Algorithm
ICA Step 2:
• Find dir’ns that make (sphered) data independent as possible
• Recall “independence” means: joint distribution is product of marginals
• In cocktail party example:
  – Happens only when support parallel to axes
  – Otherwise have blank areas, but marginals are non-zero
[Figure: ICA Step 2, cocktail party example]
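The “blank areas” remark can be illustrated with a uniform distribution on a square, used here as a hypothetical stand-in for the cocktail party data. With the support parallel to the axes, a corner cell has joint probability equal to the product of the marginal probabilities; after a 45 degree rotation the corner cell is empty although both marginal probabilities are non-zero.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 20000
S = rng.uniform(-1, 1, size=(2, n))        # support: square parallel to the axes
c45 = np.sqrt(0.5)
R = np.array([[c45, -c45],
              [c45,  c45]])                # 45 degree rotation
X = R @ S                                  # support: diamond

def joint_vs_product(Y, cut):
    """Empirical P(Y1 > cut, Y2 > cut) and the product of the two marginals."""
    a = Y[0] > cut
    b = Y[1] > cut
    return np.mean(a & b), np.mean(a) * np.mean(b)

joint_s, prod_s = joint_vs_product(S, 0.5)
joint_x, prod_x = joint_vs_product(X, 0.8)

print(f"square:  joint = {joint_s:.4f}, product of marginals = {prod_s:.4f}")
print(f"diamond: joint = {joint_x:.4f}, product of marginals = {prod_x:.4f}")
```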
ICA Algorithm
Parallel Idea (and key to algorithm):
• Find directions that maximize non-Gaussianity
• Based on assumption: starting from independent coordinates
• Note: “most” projections are Gaussian
  (since projection is “linear combo”)
• Mathematics behind this: Diaconis and Freedman (1984)
ICA Algorithm
Worst case for ICA: Gaussian
• Then sphered data are independent
• So have independence in all (ortho) dir’ns
• Thus can’t find useful directions
• Gaussian distribution is characterized by: Independent & spherically symmetric
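A sketch of why the Gaussian case gives the search nothing to work with: over a grid of projection directions, the excess kurtosis of sphered Gaussian data stays near 0 everywhere, while for sphered uniform data it varies clearly with direction. The sample size, the angle grid, and the helper names are choices made here for illustration.

```python
import numpy as np

def excess_kurtosis(y):
    """Sample excess kurtosis of a 1-D array (about 0 for Gaussian data)."""
    y = (y - y.mean()) / y.std()
    return np.mean(y ** 4) - 3.0

rng = np.random.default_rng(9)
n = 20000
Z_gauss = rng.standard_normal((2, n))
Z_unif = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, n))

angles = np.linspace(0.0, np.pi, 19)       # projection directions, 10 degrees apart

def kurtosis_profile(Z):
    """Excess kurtosis of the projection of Z onto each direction."""
    return np.array([excess_kurtosis(np.cos(a) * Z[0] + np.sin(a) * Z[1])
                     for a in angles])

prof_gauss = kurtosis_profile(Z_gauss)
prof_unif = kurtosis_profile(Z_unif)
spread_gauss = prof_gauss.max() - prof_gauss.min()
spread_unif = prof_unif.max() - prof_unif.min()

print(f"Gaussian: kurtosis range over directions = {spread_gauss:.3f}")
print(f"Uniform:  kurtosis range over directions = {spread_unif:.3f}")
```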
ICA Algorithm
Criteria for non-Gaussianity / independence:
• Kurtosis: $E X^4 - 3 (E X^2)^2$ (4th order cumulant)
• Negative Entropy
• Mutual Information
• Nonparametric Maximum Likelihood
• “Infomax” in Neural Networks
• Interesting connections between these
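The kurtosis criterion $E X^4 - 3 (E X^2)^2$ can be estimated by replacing expectations with sample means of centered data. The sketch below, with a hypothetical helper `kurtosis_cumulant` and three unit-variance test distributions chosen here, gives values near 0 for Gaussian data, negative for uniform data, and positive for Laplace data.

```python
import numpy as np

def kurtosis_cumulant(x):
    """Sample version of E X^4 - 3 (E X^2)^2 after centering x."""
    x = x - x.mean()
    return np.mean(x ** 4) - 3.0 * np.mean(x ** 2) ** 2

rng = np.random.default_rng(10)
n = 100000
# All three samples have mean 0 and variance 1.
k_gauss = kurtosis_cumulant(rng.standard_normal(n))
k_unif = kurtosis_cumulant(rng.uniform(-np.sqrt(3), np.sqrt(3), n))
k_laplace = kurtosis_cumulant(rng.laplace(0.0, np.sqrt(0.5), n))

print(f"Gaussian: {k_gauss: .3f}   (theory  0)")
print(f"Uniform:  {k_unif: .3f}   (theory -1.2)")
print(f"Laplace:  {k_laplace: .3f}   (theory  3)")
```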
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iteratively
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iterativelybull Or by simultaneous optimization
(note PCA does both but not ICA)
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iterativelybull Or by simultaneous optimization
(note PCA does both but not ICA)bull Appears fast with good defaultsbull Careful about local optima
(Recco several random restarts)
ICA AlgorithmFastICA Notational Summary
1First sphere data
2Apply ICA Find to make
rows of ldquoindeprsquotrdquo
3Can transform back to
original data scale
ˆ 21 XZ
SW
ZWS SS
SSS 21ˆˆ
ICA AlgorithmCareful look at identifiability
bull Seen already in above example
bull Could rotate ldquosquare of datardquo in several ways etc
ICA AlgorithmCareful look at identifiability
ICA AlgorithmIdentifiability Swap Flips
ICA Algorithm
Identifiability problem 1:
Generally can't order rows of $S_S$ (& $S$)
(seen as swap above)
Since for a "permutation matrix" $P$
(pre-multiplication by $P$ "swaps rows")
(post-multiplication by $P$ "swaps columns")
For each column, $z = A_S s_S$, i.e. $s_S = A_S^{-1} z$,
i.e. $P s_S = P A_S^{-1} z = P W_S z$
So $P S_S$ and $P W_S$ are also ICA solut'ns
(i.e. $P S_S = P W_S Z$)
FastICA appears to order in terms of "how non-Gaussian"
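The permutation argument can be checked numerically. This is a small sketch with made-up stand-ins: `Z`, `W_S` and the ordering `perm` are random or arbitrary choices for illustration, not output of any ICA run.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 3, 6
Z = rng.standard_normal((d, n))                      # stand-in for sphered data
W_S = np.linalg.qr(rng.standard_normal((d, d)))[0]   # stand-in unmixing matrix
S_S = W_S @ Z

perm = [2, 0, 1]
P = np.eye(d)[perm]                # permutation matrix: rows of I reordered

# Pre-multiplication by P "swaps rows".
rows_swapped = np.allclose(P @ S_S, S_S[perm])
# Post-multiplication by P "swaps columns" (by the inverse ordering).
cols_swapped = np.allclose(W_S @ P, W_S[:, np.argsort(perm)])
# The permuted pair satisfies the same relation: P S_S = (P W_S) Z.
still_solution = np.allclose(P @ S_S, (P @ W_S) @ Z)
print(rows_swapped, cols_swapped, still_solution)
```

Since the rows of $P S_S$ are the same signals in a different order, no independence criterion can prefer one ordering over another.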
ICA Algorithm
Identifiability problem 2:
Can't find scale of elements of $s$
(seen as flips above)
Since for a (full rank) diagonal matrix $D$
(pre-multipl'n by $D$ is scalar mult'n of rows)
(post-multipl'n by $D$ is scalar mult'n of col's)
Again for each col'n, $s_S = A_S^{-1} z$,
i.e. $D s_S = D A_S^{-1} z = D W_S z$
So $D S_S$ and $D W_S$ are also ICA solutions
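The scaling argument can be illustrated the same way. In this sketch the diagonal entries of `D`, the stand-in matrices and the helper `std_kurtosis` are assumptions chosen for illustration. It also checks that a standardized kurtosis criterion cannot see the rescaling or the sign flip.

```python
import numpy as np

def std_kurtosis(y):
    """Kurtosis of y after standardizing to mean 0 and variance 1."""
    y = (y - y.mean()) / y.std()
    return np.mean(y ** 4) - 3.0

rng = np.random.default_rng(2)
d, n = 3, 2000
Z = rng.uniform(-1.0, 1.0, size=(d, n))              # stand-in for sphered data
W_S = np.linalg.qr(rng.standard_normal((d, d)))[0]   # stand-in unmixing matrix
S_S = W_S @ Z

scales = np.array([2.0, -1.0, 0.5])   # full rank; the -1 is a pure "flip"
D = np.diag(scales)

rescaled = D @ S_S
# Pre-multiplication by D multiplies each row by a scalar.
rows_scaled = np.allclose(rescaled, scales[:, None] * S_S)
# Post-multiplication by D multiplies each column by a scalar.
cols_scaled = np.allclose(W_S @ D, W_S * scales)
# The rescaled pair satisfies the same relation: D S_S = (D W_S) Z.
still_solution = np.allclose(rescaled, (D @ W_S) @ Z)

kurt_before = np.array([std_kurtosis(row) for row in S_S])
kurt_after = np.array([std_kurtosis(row) for row in rescaled])
print(rows_scaled, cols_scaled, still_solution)
print(np.round(kurt_before, 3), np.round(kurt_after, 3))
```

The two kurtosis vectors agree, which is one way to see why scale and sign have to be fixed by a separate convention.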
ICA Algorithm
Signal Processing Scale identification:
(Hyvärinen and Oja 1999)
Choose scale so each signal $s_i(t)$
has "unit average energy": $\sum_t s_i(t)^2 = 1$
(preserves energy along rows of data matrix)
Explains "same scales" in
Cocktail Party Example
ICA & Non-Gaussianity
Explore main ICA principle:
For indep., non-Gaussian, standardized r.v.'s $X = (X_1, \ldots, X_d)^t$:
Projections farther from coordinate axes
are more Gaussian
For the dir'n vector $u_k = (u_{1,k}, \ldots, u_{d,k})^t$,
where $u_{j,k} = k^{-1/2}$ for $j = 1, \ldots, k$ and $u_{j,k} = 0$ for $j = k+1, \ldots, d$
(thus $\|u_k\| = 1$),
have $u_k^t X \approx N(0,1)$ (in distribution) for large $d$ and $k$
ICA & Non-Gaussianity
Illustrative examples (d = 100, n = 500):
a. Uniform Marginals
b. Exponential Marginals
c. Bimodal Marginals
Study over range of values of k
ICA & Non-Gaussianity
Illustrative example - Uniform marginals
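The uniform-marginals example can be re-created roughly as follows. This is a sketch under assumptions, not the code behind the original figures: the seed, the particular values of k, and the use of sample kurtosis as the measure of non-Gaussianity are choices made here. The direction $u_k$ has $k^{-1/2}$ in its first k entries and 0 elsewhere.

```python
import numpy as np

def excess_kurtosis(y):
    """Sample kurtosis of y after standardizing to mean 0 and variance 1."""
    y = (y - y.mean()) / y.std()
    return np.mean(y ** 4) - 3.0

def u(k, d):
    """Direction vector with k^(-1/2) in the first k entries, 0 elsewhere."""
    vec = np.zeros(d)
    vec[:k] = k ** -0.5
    return vec

rng = np.random.default_rng(0)
d, n = 100, 500
# Independent uniform marginals, standardized to mean 0 and variance 1.
X = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(d, n))

results = {}
for k in (1, 2, 4, 10, 25, 100):
    proj = u(k, d) @ X                 # projection of each data vector on u_k
    results[k] = excess_kurtosis(proj)
    print(f"k = {k:3d}   kurtosis = {results[k]:+.2f}")
```

For independent uniform coordinates the theoretical kurtosis of the projection is $-1.2/k$, so the values should move from about -1.2 at k = 1 toward 0 as k grows, up to sampling noise at n = 500.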
ICA Motivating Example
Why not PCA?
Finds direction of greatest variation
Which is wrong for signal separation
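A toy version of this point, offered as a sketch only: the sources, the mixing matrix `A` and the seed are hypothetical and are not the cocktail party data from the slides. The first principal component of the mixed data is computed and compared with each source.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
# Two independent sources with unit variance.
S = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(2, n))
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])            # hypothetical mixing matrix
X = A @ S

Xc = X - X.mean(axis=1, keepdims=True)
evals, evecs = np.linalg.eigh(np.cov(Xc))
pc1 = evecs[:, -1]                    # eigh returns eigenvalues in ascending order
scores = pc1 @ Xc                     # projections on direction of greatest variation

corr = [abs(np.corrcoef(scores, S[i])[0, 1]) for i in range(2)]
print(np.round(corr, 2))
```

Both correlations come out well away from 0 and from 1, so the direction of greatest variation blends the two sources instead of isolating either one.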
ICA Algorithm
ICA Step 1:
• "sphere the data"
    (shown on right in scatterplot view)
• i.e. find linear transf'n to make
    mean = 0, cov = I
• i.e. work with $Z = \hat\Sigma^{-1/2} X$
• requires $X$ of full rank
    (at least $n \ge d$, i.e. no HDLSS)
• search for independence beyond
    linear and quadratic structure
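Sphering can be sketched in a few lines. The following assumes a d x n data matrix with data vectors as columns and centers the rows first, so that the mean is 0. The function name `sphere`, the rank tolerance and the toy data are assumptions for illustration.

```python
import numpy as np

def sphere(X):
    """Return Z = Sigma_hat^(-1/2) (X - row means) for a d x n matrix X,
    together with the matrix Sigma_hat^(-1/2)."""
    Xc = X - X.mean(axis=1, keepdims=True)      # make the mean 0
    sigma_hat = np.cov(Xc)                      # d x d sample covariance
    evals, evecs = np.linalg.eigh(sigma_hat)    # symmetric eigendecomposition
    if evals.min() <= 1e-12 * evals.max():
        raise ValueError("covariance not of full rank (e.g. HDLSS, d > n)")
    root_inv = evecs @ np.diag(evals ** -0.5) @ evecs.T
    return root_inv @ Xc, root_inv

rng = np.random.default_rng(0)
d, n = 3, 1000
A = rng.standard_normal((d, d))                 # arbitrary mixing, illustration only
X = A @ rng.uniform(-1.0, 1.0, size=(d, n)) + 5.0
Z, root_inv = sphere(X)
print(np.round(np.cov(Z), 3))                   # close to the identity matrix
```

The eigendecomposition route gives the symmetric inverse square root. The rank check mirrors the full rank requirement: with d > n some eigenvalues are zero and the inverse square root does not exist.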
ICA Algorithm
ICA Step 2:
• Find dir'ns that make (sphered) data
    as independent as possible
• Recall "independence" means:
    joint distribution is product of marginals
• In cocktail party example:
    – Happens only when support parallel to axes
    – Otherwise have blank areas,
        but marginals are non-zero
ICA Algorithm
Parallel Idea (and key to algorithm):
• Find directions that maximize
    non-Gaussianity
• Based on assumption: starting from
    independent coordinates
• Note: "most" projections are Gaussian
    (since projection is "linear combo")
• Mathematics behind this:
    Diaconis and Freedman (1984)
ICA Algorithm
Worst case for ICA: Gaussian
• Then sphered data are independent
• So have independence in all (ortho) dir'ns
• Thus can't find useful directions
• Gaussian distribution is characterized by:
    Independent & spherically symmetric
ICA AlgorithmCriteria for non-Gaussianity
independence
bull Kurtosis
(4th order cumulant)bull Negative Entropybull Mutual Informationbull Nonparametric Maximum Likelihoodbull ldquoInfomaxrdquo in Neural Networksbull Interesting connections between these
224 3 EXEX
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iteratively
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iterativelybull Or by simultaneous optimization
(note PCA does both but not ICA)
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iterativelybull Or by simultaneous optimization
(note PCA does both but not ICA)bull Appears fast with good defaultsbull Careful about local optima
(Recco several random restarts)
ICA AlgorithmFastICA Notational Summary
1First sphere data
2Apply ICA Find to make
rows of ldquoindeprsquotrdquo
3Can transform back to
original data scale
ˆ 21 XZ
SW
ZWS SS
SSS 21ˆˆ
ICA AlgorithmCareful look at identifiability
bull Seen already in above example
bull Could rotate ldquosquare of datardquo in several ways etc
ICA AlgorithmCareful look at identifiability
ICA AlgorithmIdentifiability Swap Flips
ICA AlgorithmIdentifiability problem 1
Generally canrsquot order rows of (amp )
(seen as swap above)
Since for a ldquopermutation matrixrdquo
(pre-multiplication by ldquoswaps rowsrdquo)
(post-multiplication by ldquoswaps columnsrdquo)
For each column ie
ie
SS S
P
PP
SS sAz
zPWzPAsP SSS 1
zAs SS
1
ICA AlgorithmIdentifiability problem 1
Generally canrsquot order rows of (amp )
Since for a ldquopermutation matrixrdquo
For each column
So and are also ICA solutrsquons
(ie )
FastICA appears to order in terms of ldquohow non-Gaussianrdquo
SS S
PzPWzPAsP SSS
1
SPS SPW
ZPWPS SS
ICA AlgorithmIdentifiability problem 2
Canrsquot find scale of elements of
(seen as flips above)
Since for a (full rank) diagonal matrix
(pre-multiplrsquon by is scalar multrsquon of rows)
(post-multiplrsquon by is scalar multrsquon of colrsquos)
Again for each colrsquon
ie
So and are also ICA solutions
D
s
SDW
D
D
SDSzDWzDAsD SSs
1
zAs SS
1
ICA AlgorithmSignal Processing Scale identification
(Hyvaumlrinen and Oja 1999)
Choose scale so each signal
has ldquounit average energyrdquo
(preserves energy along rows of data matrix)
Explains ldquosame scalesrdquo in
Cocktail Party Example
)(tsi1)( 2
ti ts
ICA amp Non-Gaussianity
Explore main ICA principle
For indep non-Gaussian
standardized rvrsquos
Projections farther from coordinate axes
are more Gaussian
dX
X
X 1
ICA amp Non-Gaussianity
Explore main ICA principle
Projections farther from coordinate axes
are more Gaussian
For the dirrsquon vector
where
(thus )
have for large and
kd
k
k
u
u
u
1
dkj
kjku ki
10
121
)10(NXud
t
k k d
1ku
ICA amp Non-Gaussianity
Illustrative examples (d = 100 n = 500)
aUniform Marginals
bExponential Marginals
cBimodal Marginals
Study over range of values of k
ICA amp Non-Gaussianity
Illustrative example - Uniform marginals
ICA AlgorithmICA Step 1bull sphere the data
(shown on right in scatterplot view)
ICA AlgorithmICA Step 1 sphere the data
ICA AlgorithmICA Step 1bull ldquosphere the datardquo
(shown on right in scatterplot view)bull ie find linear transfrsquon to make
mean = cov = bull ie work with
0 I ˆ 21 XZ
ICA AlgorithmICA Step 1bull ldquosphere the datardquo
(shown on right in scatterplot view)bull ie find linear transfrsquon to make
mean = cov = bull ie work withbull requires of full rank
(at least ie no HDLSS)
0 I ˆ 21 XZ
dnX
ICA AlgorithmICA Step 1bull ldquosphere the datardquo
(shown on right in scatterplot view)bull ie find linear transfrsquon to make
mean = cov = bull ie work withbull requires of full rank
(at least ie no HDLSS)bull search for independent beyond
linear and quadratic structure
0 I ˆ 21 XZ
dnX
ICA AlgorithmICA Step 2bull Find dirrsquons that make (sphered) data
independent as possible
ICA AlgorithmICA Step 2bull Find dirrsquons that make (sphered) data
independent as possiblebull Recall ldquoindependencerdquo means
joint distribution is product of marginalsbull In cocktail party example
ICA AlgorithmICA Step 2 Cocktail party example
ICA AlgorithmICA Step 2bull Find dirrsquons that make (sphered) data
independent as possiblebull Recall ldquoindependencerdquo means
joint distribution is product of marginalsbull In cocktail party example
ndash Happens only when support parallel to axesndash Otherwise have blank areas
but marginals are non-zero
ICA Algorithm
Parallel Idea (and key to algorithm):
• Find directions that maximize non-Gaussianity
• Based on assumption: starting from independent coordinates
• Note: "most" projections are Gaussian (since projection is "linear combo")
• Mathematics behind this: Diaconis and Freedman (1984)
ICA Algorithm
Worst case for ICA: Gaussian
• Then sphered data are independent
• So have independence in all (ortho) dir'ns
• Thus can't find useful directions
• Gaussian distribution is characterized by: independent & spherically symmetric
ICA Algorithm
Criteria for non-Gaussianity / independence:
• Kurtosis: $E X^4 - 3 (E X^2)^2$ (4th order cumulant)
• Negative Entropy
• Mutual Information
• Nonparametric Maximum Likelihood
• "Infomax" in Neural Networks
• Interesting connections between these
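As a rough illustration of the kurtosis criterion, the sketch below scans directions in the plane and records the absolute sample kurtosis of each projection. Everything here is an assumption made for the example: two unit-variance uniform sources, an orthogonal mixing by 30 degrees (so no separate sphering step is needed), a brute-force grid in place of a real optimizer, and the names `kurtosis` and `scan_directions`. A Gaussian sample is scanned as well, to show the worst case where no direction stands out.

```python
import math
import random

def kurtosis(values):
    """Sample version of E X^4 - 3 (E X^2)^2, computed after centering."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 - 3.0 * m2 ** 2

def scan_directions(points, step_deg=1):
    """Map each angle in degrees (0..179) to |kurtosis| of the projection onto that direction."""
    result = {}
    for deg in range(0, 180, step_deg):
        th = math.radians(deg)
        proj = [math.cos(th) * x + math.sin(th) * y for x, y in points]
        result[deg] = abs(kurtosis(proj))
    return result

rng = random.Random(2)
n = 5000
half_width = math.sqrt(3.0)  # uniform on [-sqrt(3), sqrt(3)] has variance 1
th0 = math.radians(30)       # hypothetical mixing angle
mixed = []
for _ in range(n):
    s1 = rng.uniform(-half_width, half_width)
    s2 = rng.uniform(-half_width, half_width)
    # Orthogonal mixing, so the data stay (approximately) sphered.
    mixed.append((math.cos(th0) * s1 - math.sin(th0) * s2,
                  math.sin(th0) * s1 + math.cos(th0) * s2))
gauss = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]

scan_mixed = scan_directions(mixed)
scan_gauss = scan_directions(gauss)
best_deg = max(scan_mixed, key=scan_mixed.get)
print("most non-Gaussian direction (deg):", best_deg)
print("largest |kurtosis| for Gaussian data:", max(scan_gauss.values()))
```

With these settings the source directions sit at 30 and 120 degrees, and the scan should peak near one of them, while the Gaussian scan stays close to zero in every direction.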
ICA Algorithm
Matlab Algorithm (optimizing any of above): "FastICA"
• Numerical gradient search method
• Can find directions iteratively
• Or by simultaneous optimization (note: PCA does both, but not ICA)
• Appears fast, with good defaults
• Careful about local optima (recommend several random restarts)
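The random-restart advice can be tried on a small example. The sketch below uses the kurtosis-based one-unit fixed-point update $w \leftarrow E[z (w^t z)^3] - 3w$ (followed by normalization) from the FastICA literature; it is not claimed to match the defaults of the Matlab package. The toy data (two uniform sources mixed by a 30 degree rotation), the function names, the number of restarts and the fixed iteration count are assumptions for illustration.

```python
import math
import random

def one_unit_update(points, w):
    """One kurtosis-based fixed-point step: w <- E[z (w'z)^3] - 3 w, then renormalize."""
    n = len(points)
    g0 = g1 = 0.0
    for x, y in points:
        p3 = (w[0] * x + w[1] * y) ** 3
        g0 += x * p3
        g1 += y * p3
    new = (g0 / n - 3.0 * w[0], g1 / n - 3.0 * w[1])
    norm = math.hypot(new[0], new[1])
    return (new[0] / norm, new[1] / norm)

def find_direction(points, rng, iterations=50):
    """Run the fixed-point iteration from a random unit start vector."""
    ang = rng.uniform(0.0, 2.0 * math.pi)
    w = (math.cos(ang), math.sin(ang))
    for _ in range(iterations):
        w = one_unit_update(points, w)
    return w

rng = random.Random(3)
half_width = math.sqrt(3.0)  # unit-variance uniform sources
th0 = math.radians(30)       # hypothetical mixing angle
mixed = []
for _ in range(5000):
    s1 = rng.uniform(-half_width, half_width)
    s2 = rng.uniform(-half_width, half_width)
    mixed.append((math.cos(th0) * s1 - math.sin(th0) * s2,
                  math.sin(th0) * s1 + math.cos(th0) * s2))

# Several random restarts; directions are only defined up to sign,
# so angles are reported modulo 180 degrees.
found = [find_direction(mixed, rng) for _ in range(5)]
found_deg = sorted(round(math.degrees(math.atan2(w[1], w[0])) % 180.0, 1) for w in found)
print(found_deg)
```

Each restart should settle near 30 or 120 degrees, and which one is reached depends on the start vector. Comparing several restarts is a cheap way to see whether the search is landing on different optima.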
ICA Algorithm
FastICA Notational Summary:
1. First sphere data: $Z = \hat{\Sigma}^{-1/2} X$
2. Apply ICA: find $W_S$ to make rows of $S_S = W_S Z$ "indep't"
3. Can transform back to original data scale: $S = \hat{\Sigma}^{1/2} S_S$
ICA Algorithm
Careful look at identifiability:
• Seen already in above example
• Could rotate "square of data" in several ways, etc.
ICA Algorithm
Identifiability: Swap, Flips
ICA Algorithm
Identifiability problem 1:
Generally can't order rows of $S_S$ (& $S$) (seen as swap above)
Since for a "permutation matrix" $P$
(pre-multiplication by $P$ "swaps rows")
(post-multiplication by $P$ "swaps columns")
For each column, $z = A_S s_S$, i.e. $s_S = A_S^{-1} z$,
i.e. $P s_S = P A_S^{-1} z = P W_S z$
So $P S_S$ and $P W_S$ are also ICA solut'ns
(i.e. $P S_S = P W_S Z$)
FastICA appears to order in terms of "how non-Gaussian"
ICA Algorithm
Identifiability problem 2:
Can't find scale of elements of $s$ (seen as flips above)
Since for a (full rank) diagonal matrix $D$
(pre-multipl'n by $D$ is scalar mult'n of rows)
(post-multipl'n by $D$ is scalar mult'n of col's)
Again for each col'n, $s_S = A_S^{-1} z$,
i.e. $D s_S = D A_S^{-1} z = D W_S z$
So $D S_S$ and $D W_S$ are also ICA solutions
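Both identifiability problems can be seen with a few small matrices. In the sketch below the mixing matrix `A`, the source rows `S`, the permutation `P` and the diagonal `D` are hypothetical values picked for illustration, and `matmul` is a throwaway helper. Swapping or rescaling the source rows, and compensating in the mixing matrix, reproduces exactly the same data.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

A = [[2.0, 1.0], [1.0, 3.0]]               # hypothetical mixing matrix
S = [[1.0, -1.0, 0.5], [0.0, 2.0, -2.0]]   # hypothetical sources: 2 signals, 3 time points
Z = matmul(A, S)                           # observed data

P = [[0.0, 1.0], [1.0, 0.0]]               # permutation matrix: swaps the two rows
P_inv = [[0.0, 1.0], [1.0, 0.0]]           # a swap is its own inverse
D = [[-1.0, 0.0], [0.0, 4.0]]              # diagonal matrix: flips row 1, rescales row 2
D_inv = [[-1.0, 0.0], [0.0, 0.25]]

# Problem 1: (A P^{-1}) (P S) gives the same data with the source rows swapped.
Z_perm = matmul(matmul(A, P_inv), matmul(P, S))
# Problem 2: (A D^{-1}) (D S) gives the same data with flipped / rescaled source rows.
Z_scaled = matmul(matmul(A, D_inv), matmul(D, S))

print(Z == Z_perm, Z == Z_scaled)
```

Since the data alone cannot distinguish these factorizations, the order, sign and scale of the recovered signals have to be fixed by convention.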
ICA Algorithm
Signal Processing Scale identification:
(Hyvärinen and Oja 1999)
Choose scale so each signal $s_i(t)$ has "unit average energy": $\sum_t s_i(t)^2 = 1$
(preserves energy along rows of data matrix)
Explains "same scales" in Cocktail Party Example
ICA & Non-Gaussianity
Explore main ICA principle:
For indep., non-Gaussian, standardized r.v.'s $X = (X_1, \dots, X_d)^t$:
Projections farther from coordinate axes are more Gaussian
For the dir'n vector $u_k = (u_{k,1}, \dots, u_{k,d})^t$,
where $u_{k,i} = k^{-1/2}$ for $i = 1, \dots, k$ and $u_{k,i} = 0$ for $i = k+1, \dots, d$
(thus $\|u_k\| = 1$),
have $u_k^t X \approx N(0,1)$ for large $d$ and $k$
ICA & Non-Gaussianity
Illustrative examples (d = 100, n = 500):
a. Uniform Marginals
b. Exponential Marginals
c. Bimodal Marginals
Study over range of values of k
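A small simulation in the spirit of example (a) shows the principle numerically. It is a sketch under stated assumptions: independent uniform marginals scaled to unit variance, the direction that averages the first k coordinates with weight $k^{-1/2}$, and the sample kurtosis $E X^4 - 3 (E X^2)^2$ as the measure of non-Gaussianity. The seed, the chosen values of k and the helper names are illustrative.

```python
import math
import random

def kurtosis(values):
    """Sample version of E X^4 - 3 (E X^2)^2, computed after centering."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 - 3.0 * m2 ** 2

d, n = 100, 500
rng = random.Random(4)
half_width = math.sqrt(3.0)  # uniform on [-sqrt(3), sqrt(3)]: mean 0, variance 1
data = [[rng.uniform(-half_width, half_width) for _ in range(d)] for _ in range(n)]

def project(row, k):
    """Projection onto the unit vector with k leading entries k^{-1/2} and zeros elsewhere."""
    return sum(row[:k]) / math.sqrt(k)

kurt_by_k = {k: kurtosis([project(row, k) for row in data]) for k in (1, 2, 4, 10, 25, 100)}
for k, val in kurt_by_k.items():
    print(k, round(val, 3))
```

For independent unit-variance uniforms the population kurtosis of the projection is $-1.2 / k$, so the printed values should drift from about $-1.2$ at $k = 1$ toward 0 as $k$ grows, up to sampling noise at $n = 500$.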
ICA & Non-Gaussianity
Illustrative example - Uniform marginals
ICA AlgorithmICA Step 2bull Find dirrsquons that make (sphered) data
independent as possiblebull Recall ldquoindependencerdquo means
joint distribution is product of marginalsbull In cocktail party example
ICA AlgorithmICA Step 2 Cocktail party example
ICA AlgorithmICA Step 2bull Find dirrsquons that make (sphered) data
independent as possiblebull Recall ldquoindependencerdquo means
joint distribution is product of marginalsbull In cocktail party example
ndash Happens only when support parallel to axesndash Otherwise have blank areas
but marginals are non-zero
ICA AlgorithmParallel Idea (and key to algorithm) bull Find directions that maximize
non-Gaussianity
ICA AlgorithmParallel Idea (and key to algorithm) bull Find directions that maximize
non-Gaussianitybull Based on assumption starting from
independent coordinatesbull Note ldquomostrdquo projections are Gaussian
(since projection is ldquolinear combordquo)
ICA AlgorithmParallel Idea (and key to algorithm) bull Find directions that maximize
non-Gaussianitybull Based on assumption starting from
independent coordinatesbull Note ldquomostrdquo projections are Gaussian
(since projection is ldquolinear combordquo)bull Mathematics behind this
Diaconis and Freedman (1984)
ICA Algorithm
Worst case for ICA Gaussian
bull Then sphered data are independent
bull So have independence in all (ortho) dirrsquons
bull Thus canrsquot find useful directions
bull Gaussian distribution is characterized by
Independent amp spherically symmetric
ICA AlgorithmCriteria for non-Gaussianity
independence
bull Kurtosis
(4th order cumulant)bull Negative Entropybull Mutual Informationbull Nonparametric Maximum Likelihoodbull ldquoInfomaxrdquo in Neural Networksbull Interesting connections between these
224 3 EXEX
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iteratively
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iterativelybull Or by simultaneous optimization
(note PCA does both but not ICA)
ICA Algorithm
Matlab Algorithm (optimizing any of above): "FastICA"
• Numerical gradient search method
• Can find directions iteratively
• Or by simultaneous optimization (note PCA does both, but not ICA)
• Appears fast, with good defaults
• Careful about local optima (recommend several random restarts)
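To give a feel for the "find directions iteratively" option, here is a toy one-direction search in two dimensions. It is not the Matlab FastICA package: it is a sketch of the kurtosis-based fixed-point update w <- E[z (w'z)^3] - 3w followed by renormalization, and the function name, the test data and the starting vector are all hypothetical. It assumes the data are already sphered.

```python
import math
import random

def one_unit_fastica(z, w, iters=50):
    """Kurtosis-based fixed-point updates for one unit direction w (2-d, sphered z)."""
    n = len(z)
    for _ in range(iters):
        g0 = g1 = 0.0
        for z0, z1 in z:
            p3 = (w[0] * z0 + w[1] * z1) ** 3   # cubed projection onto w
            g0 += z0 * p3
            g1 += z1 * p3
        w_new = (g0 / n - 3.0 * w[0], g1 / n - 3.0 * w[1])
        norm = math.hypot(w_new[0], w_new[1])
        w = (w_new[0] / norm, w_new[1] / norm)  # keep w on the unit circle
    return w

rng = random.Random(1)
r3 = math.sqrt(3.0)
# Two independent uniform sources with variance 1 (hypothetical test data).
s = [(rng.uniform(-r3, r3), rng.uniform(-r3, r3)) for _ in range(5000)]
theta = math.pi / 6
c, sn = math.cos(theta), math.sin(theta)
# Rotating independent unit-variance sources keeps the data sphered.
z = [(c * a - sn * b, sn * a + c * b) for a, b in s]

w = one_unit_fastica(z, (1.0, 0.0))   # ends near +/- (cos(theta), sin(theta))
```

The sign of w is arbitrary and can flip between iterations. With other data or a different starting vector the search may settle on a different direction, which is one reason several random restarts are worth trying.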
ICA Algorithm
FastICA Notational Summary
1. First sphere data: $Z = \hat{\Sigma}^{-1/2} X$
2. Apply ICA: Find $W_S$ to make rows of $S_S = W_S Z$ "indep't"
3. Can transform back to original data scale: $\hat{S} = \hat{\Sigma}^{1/2} S_S$
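Step 1 can be sketched for two-dimensional data without any matrix library. The code below is an illustration under assumptions: the helper names are hypothetical, the data are centered before sphering, each pair plays the role of one column of the data matrix, and the inverse square root uses the closed form for a 2x2 positive definite matrix, sqrt(M) = (M + sqrt(det M) I) / sqrt(trace M + 2 sqrt(det M)).

```python
import math
import random

def cov2(x):
    """Sample mean and 2x2 covariance (divisor n) of a list of (x0, x1) pairs."""
    n = len(x)
    m0 = sum(p[0] for p in x) / n
    m1 = sum(p[1] for p in x) / n
    a = sum((p[0] - m0) ** 2 for p in x) / n
    b = sum((p[0] - m0) * (p[1] - m1) for p in x) / n
    c = sum((p[1] - m1) ** 2 for p in x) / n
    return (m0, m1), (a, b, c)

def inv_sqrt2(a, b, c):
    """Inverse symmetric square root of [[a, b], [b, c]] (positive definite)."""
    s = math.sqrt(a * c - b * b)                    # square root of the determinant
    t = math.sqrt(a + c + 2.0 * s)                  # normalizer of the square root
    ra, rb, rc = (a + s) / t, b / t, (c + s) / t    # root R = (M + s I) / t
    det = ra * rc - rb * rb
    return rc / det, -rb / det, ra / det            # entries of R^{-1}, also symmetric

def sphere(x):
    """Center the pairs and multiply each by the inverse root of the covariance."""
    (m0, m1), (a, b, c) = cov2(x)
    ia, ib, ic = inv_sqrt2(a, b, c)
    return [(ia * (p[0] - m0) + ib * (p[1] - m1),
             ib * (p[0] - m0) + ic * (p[1] - m1)) for p in x]

rng = random.Random(2)
# Hypothetical correlated data: a linear mix of two independent uniforms.
x = []
for _ in range(2000):
    a, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    x.append((2.0 * a + b, a + 3.0 * b))

z = sphere(x)
_, (va, vb, vc) = cov2(z)   # covariance of the sphered data: identity up to rounding
```

After this step the remaining freedom is a rotation, which is what step 2 searches over.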
ICA Algorithm
Careful look at identifiability
• Seen already in above example
• Could rotate "square of data" in several ways, etc.
ICA Algorithm
Identifiability: Swap, Flips
ICA Algorithm
Identifiability problem 1
Generally can't order rows of $S_S$ (& $S$)
(seen as swap above)
Since for a "permutation matrix" $P$
(pre-multiplication by $P$ "swaps rows")
(post-multiplication by $P$ "swaps columns")
For each column, $z = A_S s_S$, i.e. $s_S = A_S^{-1} z$,
i.e. $P s_S = P A_S^{-1} z = P W_S z$
So $P S_S$ and $P W_S$ are also ICA solutions
(i.e. $P S_S = P W_S Z$)
FastICA appears to order in terms of "how non-Gaussian"
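The permutation argument can be verified on a small made-up case. Everything numeric below is hypothetical (the matrices `W`, `Z` and `P` are chosen only so the arithmetic is easy to follow), and `matmul` is a small helper written for this sketch.

```python
def matmul(a, b):
    """Plain-list matrix product of a (m x p) and b (p x q)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

W = [[1.0, 2.0], [3.0, 1.0]]               # hypothetical unmixing matrix W_S
Z = [[1.0, 0.0, -1.0], [0.5, 2.0, 1.0]]    # hypothetical sphered data, one column per time point
P = [[0.0, 1.0], [1.0, 0.0]]               # permutation matrix that swaps the two rows

S = matmul(W, Z)                  # S_S = W_S Z
PS = matmul(P, S)                 # P S_S: the rows of S_S in swapped order
PW_Z = matmul(matmul(P, W), Z)    # (P W_S) Z
```

Here `PS` and `PW_Z` agree entry by entry, so the pair (P W_S, P S_S) fits the data exactly as well as (W_S, S_S), and the rows of `PS` are just as independent as those of `S`.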
ICA Algorithm
Identifiability problem 2
Can't find scale of elements of $s$
(seen as flips above)
Since for a (full rank) diagonal matrix $D$
(pre-multiplication by $D$ is scalar multiplication of rows)
(post-multiplication by $D$ is scalar multiplication of columns)
Again for each column, $s_S = A_S^{-1} z$,
i.e. $D s_S = D A_S^{-1} z = D W_S z$
So $D S_S$ and $D W_S$ are also ICA solutions
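The scale ambiguity can be seen from the mixing side as well: rescaling the signals by $D$ while rescaling the columns of the mixing matrix by $D^{-1}$ leaves the data unchanged. The sketch below uses hypothetical matrices, and `matmul` is a small helper written for this illustration.

```python
def matmul(a, b):
    """Plain-list matrix product of a (m x p) and b (p x q)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

A = [[2.0, 1.0], [1.0, 3.0]]                # hypothetical mixing matrix A_S
S = [[1.0, -1.0, 0.5], [0.0, 2.0, -2.0]]    # hypothetical signals, one row per signal
D = [[-1.0, 0.0], [0.0, 2.0]]               # full rank diagonal: flip row 1, double row 2
D_inv = [[-1.0, 0.0], [0.0, 0.5]]           # inverse of D

Z = matmul(A, S)                                   # data from the original signals
DS = matmul(D, S)                                  # rescaled (and flipped) signals
Z_again = matmul(matmul(A, D_inv), DS)             # (A_S D^{-1}) (D S_S)
```

Since `Z_again` equals `Z`, the data alone cannot distinguish `S` from `DS`; the negative entry of `D` is the "flip" seen earlier.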
ICA Algorithm
Signal Processing Scale identification
(Hyvärinen and Oja 1999)
Choose scale so each signal $s_i(t)$ has "unit average energy": $\sum_t s_i(t)^2 = 1$
(preserves energy along rows of data matrix)
Explains "same scales" in Cocktail Party Example
ICA & Non-Gaussianity
Explore main ICA principle:
For indep., non-Gaussian, standardized r.v.'s $X = (X_1, \ldots, X_d)^t$
Projections farther from coordinate axes are more Gaussian
For the direction vector $u_k = (u_{k,1}, \ldots, u_{k,d})^t$,
where $u_{k,i} = k^{-1/2}$ for $i = 1, \ldots, k$ and $u_{k,i} = 0$ for $i = k+1, \ldots, d$
(thus $\|u_k\| = 1$)
have $u_k^t X \approx N(0,1)$ (in distribution), for large $d$ and $k$
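A small simulation illustrates this principle for uniform marginals. It is a sketch under assumptions: the function names are hypothetical, the sample size is larger than the n = 500 of the slides to keep the estimates stable, and the fourth cumulant $E X^4 - 3 (E X^2)^2$ is used as the measure of non-Gaussianity. For independent standardized coordinates, the fourth cumulant of $u_k^t X$ is the common coordinate value divided by k, so it shrinks toward the Gaussian value 0 as k grows.

```python
import math
import random

def fourth_cumulant(xs):
    """Sample version of E[X^4] - 3 (E[X^2])^2, computed after centering xs."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 - 3.0 * m2 ** 2

def projected_sample(k, n, rng):
    """n draws of u_k^t X, where u_k has k entries 1/sqrt(k) and zeros elsewhere."""
    r3 = math.sqrt(3.0)
    scale = 1.0 / math.sqrt(k)
    # Only the first k coordinates of X carry non-zero weight, so d does not matter here.
    return [scale * sum(rng.uniform(-r3, r3) for _ in range(k)) for _ in range(n)]

rng = random.Random(3)
# Population values are -1.2 / k, i.e. -1.2, -0.3 and -0.048.
kurt_by_k = {k: fourth_cumulant(projected_sample(k, 20000, rng)) for k in (1, 4, 25)}
```

The case k = 1 is a coordinate axis, where the projection is as non-Gaussian as a single uniform; by k = 25 the projection is already hard to tell from a Gaussian by this measure.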
ICA & Non-Gaussianity
Illustrative examples (d = 100, n = 500)
a. Uniform Marginals
b. Exponential Marginals
c. Bimodal Marginals
Study over range of values of k
ICA & Non-Gaussianity
Illustrative example - Uniform marginals
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iteratively
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iterativelybull Or by simultaneous optimization
(note PCA does both but not ICA)
ICA AlgorithmMatlab Algorithm (optimizing any of above)
ldquoFastICArdquobull Numerical gradient search methodbull Can find directions iterativelybull Or by simultaneous optimization
(note PCA does both but not ICA)bull Appears fast with good defaultsbull Careful about local optima
(Recco several random restarts)
ICA AlgorithmFastICA Notational Summary
1First sphere data
2Apply ICA Find to make
rows of ldquoindeprsquotrdquo
3Can transform back to
original data scale
ˆ 21 XZ
SW
ZWS SS
SSS 21ˆˆ
ICA AlgorithmCareful look at identifiability
bull Seen already in above example
bull Could rotate ldquosquare of datardquo in several ways etc
ICA AlgorithmCareful look at identifiability
ICA AlgorithmIdentifiability Swap Flips
ICA AlgorithmIdentifiability problem 1
Generally canrsquot order rows of (amp )
(seen as swap above)
Since for a ldquopermutation matrixrdquo
(pre-multiplication by ldquoswaps rowsrdquo)
(post-multiplication by ldquoswaps columnsrdquo)
For each column ie
ie
SS S
P
PP
SS sAz
zPWzPAsP SSS 1
zAs SS
1
ICA AlgorithmIdentifiability problem 1
Generally canrsquot order rows of (amp )
Since for a ldquopermutation matrixrdquo
For each column
So and are also ICA solutrsquons
(ie )
FastICA appears to order in terms of ldquohow non-Gaussianrdquo
SS S
PzPWzPAsP SSS
1
SPS SPW
ZPWPS SS
ICA AlgorithmIdentifiability problem 2
Canrsquot find scale of elements of
(seen as flips above)
Since for a (full rank) diagonal matrix
(pre-multiplrsquon by is scalar multrsquon of rows)
(post-multiplrsquon by is scalar multrsquon of colrsquos)
Again for each colrsquon
ie
So and are also ICA solutions
D
s
SDW
D
D
SDSzDWzDAsD SSs
1
zAs SS
1
ICA AlgorithmSignal Processing Scale identification
(Hyvaumlrinen and Oja 1999)
Choose scale so each signal
has ldquounit average energyrdquo
(preserves energy along rows of data matrix)
Explains ldquosame scalesrdquo in
Cocktail Party Example
)(tsi1)( 2
ti ts
ICA amp Non-Gaussianity
Explore main ICA principle
For indep non-Gaussian
standardized rvrsquos
Projections farther from coordinate axes
are more Gaussian
dX
X
X 1
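ICA & Non-Gaussianity
Explore main ICA principle:
Projections farther from coordinate axes
are more Gaussian
For the dir'n vector $u_k = (u_{k,1}, \ldots, u_{k,d})^t$,
where $u_{k,j} = k^{-1/2}$ for $j = 1, \ldots, k$ and $u_{k,j} = 0$ for $j = k+1, \ldots, d$
(thus $\|u_k\| = 1$),
have $u_k^t X \approx N(0,1)$ for large $k$ and $d$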
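The principle can be explored with a small simulation. The sketch below assumes independent uniform entries standardized to mean 0 and variance 1, and measures non-Gaussianity by sample excess kurtosis (the function name and this choice of measure are assumptions for illustration, not taken from the slides). For standardized uniform entries the projection is a scaled sum of k i.i.d. terms, so its theoretical excess kurtosis is -1.2/k, which shrinks toward the Gaussian value 0 as k grows.

```python
import numpy as np

def projection_excess_kurtosis(k, d=100, n=500, seed=0):
    """Sample excess kurtosis of u_k^t X for independent standardized uniform entries."""
    rng = np.random.default_rng(seed)
    # Uniform on [-sqrt(3), sqrt(3)] has mean 0 and variance 1.
    X = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(d, n))
    u_k = np.zeros(d)
    u_k[:k] = k ** -0.5                  # first k entries k^{-1/2}, the rest 0
    proj = u_k @ X                       # n projected values
    centered = proj - proj.mean()
    return np.mean(centered ** 4) / np.mean(centered ** 2) ** 2 - 3.0

for k in (1, 2, 4, 6, 25):
    print(k, round(projection_excess_kurtosis(k), 3))
```

With n = 500 the printed values carry noticeable sampling variation, so small differences from 0 at larger k should not be over-interpreted.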
ICA & Non-Gaussianity
Illustrative examples (d = 100, n = 500)
a. Uniform Marginals
b. Exponential Marginals
c. Bimodal Marginals
Study over range of values of k
ICA & Non-Gaussianity
Illustrative example - Uniform marginals
ICA & Non-Gaussianity
Illustrative examples (d = 100, n = 500)
a. Uniform Marginals
• k = 1: very poor fit
(Uniform far from Gaussian)
• k = 2: much closer
(Triangular closer to Gaussian)
• k = 4: very close,
but still have stat'ly sig't difference
• k >= 6: all diff's could be sampling var'n
ICA & Non-Gaussianity
Illustrative example - Exponential Marginals
ICA & Non-Gaussianity
Illustrative examples (d = 100, n = 500)
b. Exponential Marginals
• still have convergence to Gaussian,
but slower
("skewness" has stronger
impact than "kurtosis")
• now need k >= 25 to see no difference
ICA & Non-Gaussianity
Illustrative example - Bimodal Marginals
ICA & Non-Gaussianity
Illustrative examples (d = 100, n = 500)
c. Bimodal Marginals
• Convergence to Gaussian
surprisingly fast
• Quite close for k = 9
ICA & Non-Gaussianity
Summary:
For indep., non-Gaussian, standardized r.v.'s $X = (X_1, \ldots, X_d)^t$,
projections farther from coordinate axes
are more Gaussian
ICA & Non-Gaussianity
Projections farther from coordinate axes
are more Gaussian
Conclusions:
I. Expect most proj'ns are Gaussian
II. Non-G'n proj'ns (ICA target) are special
III. Is a given sample really "random"?
(could test)
IV. High dim'al space is a strange place
More ICA Examples
Two Sine Waves – Original Signals
More ICA Examples
Two Sine Waves – Original Scatterplot
Far From Indep't
More ICA Examples
Two Sine Waves – Mixed Input Data
More ICA Examples
Two Sine Waves – Scatterplot & PCA
Clearly Wrong Recovery
More ICA Examples
Two Sine Waves – Scatterplot for ICA
Looks Very Good
More ICA Examples
Two Sine Waves – ICA Reconstruction
Excellent, Despite Non-Indep't Scatterplot
More ICA Examples
Sine and Gaussian
Try Another Pair of Signals
More Like "Signal + Noise"
More ICA Examples
Sine and Gaussian – Original Signals
More ICA Examples
Sine and Gaussian – Original Scatterplot
Well Set For ICA
More ICA Examples
Sine and Gaussian – Mixed Input Data
More ICA Examples
Sine and Gaussian – Scatterplot & PCA
More ICA Examples
Sine and Gaussian – PCA Reconstruction
Got Sine Wave + Noise
More ICA Examples
Sine and Gaussian – Scatterplot ICA
More ICA Examples
Sine and Gaussian – ICA Reconstruction
More ICA Examples
Both Gaussian
Try Another Pair of Signals
Understand Assumption of
One Not Gaussian
More ICA Examples
Both Gaussian – Original Signals
More ICA Examples
Both Gaussian – Original Scatterplot
Caution: Indep't In All Directions
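The caution above can be made precise with a standard fact about the Gaussian distribution. This is stated as a sketch: the symbols $Z$, $R$ and $I_d$ are notation assumed for this note, with $Z$ a sphered Gaussian vector and $R$ any rotation (orthogonal matrix).

```latex
Z \sim N_d(0, I_d), \quad R R^t = I_d
\;\Longrightarrow\;
R Z \sim N_d(0, R I_d R^t) = N_d(0, I_d)
```

Every rotation of sphered Gaussian data therefore again has independent standard Gaussian entries, so no rotation is more "independent" than any other and ICA has no clear target.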
More ICA Examples
Both Gaussian – Mixed Input Data
More ICA Examples
Both Gaussian – PCA Scatterplot
Exploits Variation To Give Good Directions
More ICA Examples
Both Gaussian – PCA Reconstruction
Looks Good
More ICA Examples
Both Gaussian – Original Signals
Check Against Original
More ICA Examples
Both Gaussian – ICA Scatterplot
No Clear Good Rotation
More ICA Examples
Both Gaussian – ICA Reconstruction
Is It Bad?
More ICA Examples
Both Gaussian – Original Signals
Check Against Original
More ICA Examples
Now Try FDA examples – Recall Parabolas
Curves As Data Objects
More ICA Examples
Now Try FDA examples – Parabolas
PCA Gives Interpretable Decomposition
More ICA Examples
Now Try FDA examples – Parabolas
Sphering Loses Structure
ICA Finds Outliers
More ICA Examples
FDA example – Parabolas w/ 2 Outliers
Same Curves plus "Outliers"
More ICA Examples
FDA example – Parabolas w/ 2 Outliers
Impact all of:
• Mean
• PC1 (slightly)
• PC2 (dominant)
• PC3 (tilt)
More ICA Examples
FDA example – Parabolas w/ 2 Outliers
ICA Misses Main Directions
But Finds Outliers (non-Gaussian)
More ICA Examples
FDA example – Parabolas Up and Down
Recall 2 Clear Clusters
More ICA Examples
FDA example – Parabolas Up and Down
PCA: Clusters & Other Structure
More ICA Examples
FDA example – Parabolas Up and Down
ICA Does Not Find Clusters
Reason: Random Start
More ICA Examples
FDA example – Parabolas Up and Down
ICA Scary Issue:
Local Minima in Optimization
More ICA Examples
FDA example – Parabolas Up and Down
ICA Solution 1: Use PCA to Start
Worked Here, But Not Always
More ICA Examples
FDA example – Parabolas Up and Down
ICA Solution 2:
Use Multiple Random Starts
• Shows When Have Multiple Minima
• Range Should Turn Up Good Directions
• More to Look At / Interpret
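The multiple random starts idea can be sketched as follows. This is an illustration under stated assumptions, not the Matlab FastICA package used for the slides: it runs a kurtosis-based one-unit fixed-point iteration (the update w <- E[z (w^t z)^3] - 3w, followed by normalization) on sphered data from several random starting directions and ranks the results by absolute excess kurtosis. All function names, the toy sources and the mixing matrix are hypothetical.

```python
import numpy as np

def one_unit_fastica(Z, w0, n_iter=200, tol=1e-10):
    """Kurtosis-based one-unit fixed-point iteration on sphered data Z (d x n)."""
    w = w0 / np.linalg.norm(w0)
    for _ in range(n_iter):
        y = w @ Z                                    # current projection
        w_new = (Z * y ** 3).mean(axis=1) - 3.0 * w  # E[z (w^t z)^3] - 3 w
        w_new /= np.linalg.norm(w_new)
        converged = abs(abs(w_new @ w) - 1.0) < tol  # same direction up to sign
        w = w_new
        if converged:
            break
    return w

def excess_kurtosis(y):
    y = y - y.mean()
    return np.mean(y ** 4) / np.mean(y ** 2) ** 2 - 3.0

def multi_start(Z, n_starts=10, seed=0):
    """Run the iteration from several random starts; return (|kurtosis|, w) pairs."""
    rng = np.random.default_rng(seed)
    results = []
    for _ in range(n_starts):
        w = one_unit_fastica(Z, rng.standard_normal(Z.shape[0]))
        results.append((abs(excess_kurtosis(w @ Z)), w))
    return sorted(results, key=lambda r: -r[0])      # most non-Gaussian first

# Toy data: two independent non-Gaussian sources, mixed and then sphered.
rng = np.random.default_rng(1)
S = np.vstack([rng.uniform(-1, 1, 2000), rng.laplace(size=2000)])
X = np.array([[1.0, 0.5], [0.3, 1.0]]) @ S
Xc = X - X.mean(axis=1, keepdims=True)
evals, evecs = np.linalg.eigh(Xc @ Xc.T / (Xc.shape[1] - 1))
Z = evecs @ np.diag(evals ** -0.5) @ evecs.T @ Xc
runs = multi_start(Z)
best_kurt, best_w = runs[0]
```

Comparing the directions in `runs` shows whether different starts settle on different optima. When they do, there is more output to look at and interpret, which is the cost noted in the last bullet.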
More ICA Examples
FDA example – Parabolas Up and Down
ICA Solution 2: Multiple Random Starts
3rd IC Dir'n Looks Good
More ICA Examples
FDA example – Parabolas Up and Down
ICA Solution 2: Multiple Random Starts
2nd Looks Good
More ICA Examples
FDA example – Parabolas Up and Down
ICA Solution 2: Multiple Random Starts
Never Finds Clusters
ICA Overview
Interesting Method, has Potential
Great for Directions of Non-Gaussianity
E.g. Finding Outliers
Common Application Area: FMRI
Has Its Costs:
Slippery Optimization
Interpretation Challenges
ICA amp Non-GaussianitySummary
For indep non-Gaussian
standardized rvrsquos
Projections farther from coordinate axes
are more Gaussian
dX
X
X 1
ICA amp Non-GaussianityProjections farther from coordinate axes
are more Gaussian
Conclusions
I Expect most projrsquons are Gaussian
IINon-Grsquon projrsquons (ICA target) are special
IIIIs a given sample really ldquorandomrdquo
(could test)
IVHigh dimrsquoal space is a strange place
More ICA Examples
Two Sine Waves ndash Original Signals
More ICA Examples
Two Sine Waves ndash Original Scatterplot
Far
From
Indeprsquot
More ICA Examples
Two Sine Waves ndash Mixed Input Data
More ICA Examples
Two Sine Waves ndash Scatterplot amp PCA
Clearly
Wrong
Recovery
More ICA Examples
Two Sine Waves ndash Scatterplot for ICA
Looks
Very
Good
More ICA Examples
Two Sine Waves ndash ICA Reconstruction
Excellent
Despite
Non-Indeprsquot
Scatteplot
More ICA Examples
Sine and Gaussian
Try Another Pair of Signals
More Like ldquoSignal + Noiserdquo
More ICA Examples
Sine and Gaussian ndash Original Signals
More ICA Examples
Sine and Gaussian ndash Original Scatterplot
Well
Set
For
ICA
More ICA Examples
Sine and Gaussian ndash Mixed Input Data
More ICA Examples
Sine and Gaussian ndash Scatterplot amp PCA
More ICA Examples
Sine and Gaussian ndash PCA Reconstruction
Got Sine
Wave
+
Noise
More ICA Examples
Sine and Gaussian ndash Scatterplot ICA
More ICA Examples
Sine and Gaussian ndash ICA Reconstruction
More ICA Examples
Both Gaussian
Try Another Pair of Signals
Understand Assumption of
One Not Gaussian
More ICA Examples
Both Gaussian ndash Original Signals
More ICA Examples
Both Gaussian ndash Original Scatterplot
Caution
Indeprsquot
In All
Directions
More ICA Examples
Both Gaussian ndash Mixed Input Data
More ICA Examples
Both Gaussian ndash PCA Scatterplot
Exploits
Variation
To Give
Good
Diredtions
More ICA Examples
Both Gaussian ndash PCA Reconstruction
Looks
Good
More ICA Examples
Both Gaussian ndash Original Signals
Check
Against
Original
More ICA Examples
Both Gaussian ndash ICA Scatterplot
No Clear
Good
Rotation
More ICA Examples
Both Gaussian ndash ICA Reconstruction
Is It
Bad
More ICA Examples
Both Gaussian ndash Original Signals
Check
Against
Original
More ICA Examples
Now Try FDA examples ndash Recall Parabolas
Curves As
Data Objects
More ICA Examples
Now Try FDA examples ndash Parabolas
PCA Gives
Interpretable
Decomposition
More ICA Examples
Now Try FDA examples ndash Parabolas
Sphering
Loses
Structure
ICA Finds
Outliers
More ICA Examples
FDA example ndash Parabolas w 2 Outliers
Same Curves
plus ldquoOutliersrdquo
More ICA Examples
FDA example ndash Parabolas w 2 Outliers
Impact all of
bull Mean
bull PC1 (slightly)
bull PC2 (dominant)
bull PC3 (tilt)
More ICA Examples
FDA example ndash Parabolas w 2 Outliers
ICA
Misses Main
Directions
But Finds
Outliers
(non-Gaussian)
More ICA Examples
FDA example – Parabolas Up and Down
Recall 2 Clear Clusters
More ICA Examples
FDA example – Parabolas Up and Down
PCA: Clusters & Other Structure
More ICA Examples
FDA example – Parabolas Up and Down
ICA Does Not Find Clusters
Reason: Random Start
More ICA Examples
FDA example – Parabolas Up and Down
ICA Scary Issue: Local Minima in Optimization
More ICA Examples
FDA example – Parabolas Up and Down
ICA Solution 1: Use PCA to Start
Worked Here, But Not Always
More ICA Examples
FDA example – Parabolas Up and Down
ICA Solution 2: Use Multiple Random Starts
• Shows When Have Multiple Minima
• Range Should Turn Up Good Directions
• More to Look At / Interpret
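The multiple-random-starts idea can be sketched as follows. This is a minimal illustration, not the procedure used for the slides: it assumes a kurtosis-based fixed-point update in the style of FastICA, three hypothetical sources (uniform, Laplace, sine), a made-up mixing matrix, and the helper names `one_unit_ica` and `multi_start_ica`. Each start converges to some local optimum, and collecting the results shows how many distinct optima exist.

```python
import numpy as np

def whiten(x):
    """Sphere the rows of x (shape d x n): zero mean, identity covariance."""
    centered = x - x.mean(axis=1, keepdims=True)
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered))
    return (eigvecs / np.sqrt(eigvals)) @ eigvecs.T @ centered

def one_unit_ica(z, w0, max_iter=200, tol=1e-8):
    """Kurtosis-based fixed-point iteration for one direction on sphered data z."""
    w = w0 / np.linalg.norm(w0)
    for _ in range(max_iter):
        proj = w @ z
        w_new = (z * proj ** 3).mean(axis=1) - 3.0 * w
        w_new /= np.linalg.norm(w_new)
        converged = abs(w_new @ w) > 1.0 - tol   # a sign flip counts as converged
        w = w_new
        if converged:
            break
    return w

def multi_start_ica(z, n_starts, rng):
    """Run the iteration from several random starts; return (direction, |kurtosis|) pairs."""
    results = []
    for _ in range(n_starts):
        w = one_unit_ica(z, rng.standard_normal(z.shape[0]))
        proj = w @ z
        results.append((w, abs(np.mean(proj ** 4) - 3.0)))  # var(proj) is about 1
    return results

rng = np.random.default_rng(3)
n = 5000                                         # illustrative sample size
sources = np.vstack([
    rng.uniform(-1.0, 1.0, n),
    rng.laplace(size=n),
    np.sin(np.linspace(0.0, 60.0 * np.pi, n)),
])
mixing = np.array([[1.0, 0.5, 0.3],              # hypothetical mixing matrix
                   [0.4, 1.0, 0.2],
                   [0.3, 0.6, 1.0]])
z = whiten(mixing @ sources)

results = multi_start_ica(z, n_starts=20, rng=rng)
# |correlation| of each converged projection with each of the three sources
match = [np.abs(np.corrcoef(np.vstack([sources, w @ z]))[3, :3]) for w, _ in results]
landed = [int(np.argmax(m)) for m in match]
print("sources reached across starts:", sorted(set(landed)))
```

Different starts landing on different sources is the "multiple minima" behavior in miniature; the price is that there are more candidate directions to inspect and interpret.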
More ICA Examples
FDA example – Parabolas Up and Down
ICA Solution 2: Multiple Random Starts
3rd IC Dir'n Looks Good
More ICA Examples
FDA example – Parabolas Up and Down
ICA Solution 2: Multiple Random Starts
2nd Looks Good
More ICA Examples
FDA example – Parabolas Up and Down
ICA Solution 2: Multiple Random Starts
Never Finds Clusters
ICA Overview
Interesting Method, has Potential
Great for Directions of Non-Gaussianity
E.g. Finding Outliers
Common Application Area: FMRI
Has Its Costs:
Slippery Optimization
Interpretation Challenges
ICA & Non-Gaussianity
Projections farther from coordinate axes are more Gaussian
Conclusions:
I. Expect most proj'ns are Gaussian
II. Non-G'n proj'ns (ICA target) are special
III. Is a given sample really "random"? (could test)
IV. High dim'al space is a strange place
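A quick numerical check of the claim that projections far from the coordinate axes look more Gaussian. This is a sketch under assumptions: the entries are taken to be independent uniform variables, and the dimension, sample size, and variable names are illustrative rather than taken from the slides. For independent entries with equal variance and common excess kurtosis kappa, a unit direction u gives projection kurtosis kappa times the sum of u_i^4, which is kappa on an axis and kappa/d on the main diagonal.

```python
import numpy as np

def excess_kurtosis(x):
    """Sample excess kurtosis: 0 for a Gaussian, -1.2 for a uniform."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    variance = np.mean(centered ** 2)
    return np.mean(centered ** 4) / variance ** 2 - 3.0

rng = np.random.default_rng(0)
n, d = 20000, 50                               # illustrative sizes
data = rng.uniform(-1.0, 1.0, size=(n, d))     # independent non-Gaussian entries

axis_dir = np.zeros(d)
axis_dir[0] = 1.0                              # a coordinate axis
diag_dir = np.ones(d) / np.sqrt(d)             # as far from every axis as possible

kurt_axis = excess_kurtosis(data @ axis_dir)   # theory: -1.2
kurt_diag = excess_kurtosis(data @ diag_dir)   # theory: -1.2 / d
print(f"axis: {kurt_axis:.3f}, diagonal: {kurt_diag:.3f}")
```

The axis projection keeps the full non-Gaussianity of a single entry, while the diagonal projection averages many entries and is close to Gaussian, in line with conclusions I and II.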
More ICA Examples
Two Sine Waves – Original Signals
More ICA Examples
Two Sine Waves – Original Scatterplot
Far From Indep't
More ICA Examples
Two Sine Waves – Mixed Input Data
More ICA Examples
Two Sine Waves – Scatterplot & PCA
Clearly Wrong Recovery
More ICA Examples
Two Sine Waves – Scatterplot for ICA
Looks Very Good
More ICA Examples
Two Sine Waves – ICA Reconstruction
Excellent Despite Non-Indep't Scatterplot
- HDLSS Asy’s: Geometrical Represent’n
- HDLSS Asy’s: Geometrical Represen’tion
- 2nd Paper on HDLSS Asymptotics
- 0 Covariance is not independence
- 0 Covariance is not independence (2)
- HDLSS Asy’s: Geometrical Represen’tion (2)
- HDLSS Math Stat of PCA
- HDLSS Math Stat of PCA (2)
- HDLSS Math Stat of PCA (3)
- HDLSS Asymptotics & Kernel Methods
- HDLSS Asymptotics & Kernel Methods (2)
- HDLSS Asymptotics & Kernel Methods (3)
- HDLSS Additional Results
- Liu Twiddle ratios of subtypes
- HDLSS Data Combo Mathematics
- HDLSS Data Combo Mathematics (2)
- HDLSS Data Combo Mathematics (3)
- HDLSS Data Combo Mathematics (4)
- HDLSS Data Combo Mathematics (5)
- HDLSS Data Combo Mathematics (6)
- HDLSS Data Combo Mathematics (7)
- HDLSS Data Combo Mathematics (8)
- HDLSS Data Combo Mathematics (9)
- HDLSS Data Combo Mathematics (10)
- HDLSS Asymptotics
- HDLSS Asymptotics (2)
- HDLSS Asymptotics (3)
- HDLSS Asymptotics (4)
- The Future of HDLSS Asymptotics
- State of HDLSS Research
- Independent Component Analysis
- Independent Component Analysis (2)
- Independent Component Analysis (3)
- Independent Component Analysis (4)
- ICA Motivating Example
- ICA Motivating Example (2)
- ICA Motivating Example (3)
- ICA Motivating Example (4)
- ICA Motivating Example (5)
- ICA Motivating Example (6)
- ICA Motivating Example (7)
- ICA Motivating Example (8)
- ICA Motivating Example (9)
- ICA Motivating Example (10)
- ICA Motivating Example (11)
- ICA Motivating Example (12)
- ICA Motivating Example (13)
- ICA Motivating Example (14)
- ICA Motivating Example (15)
- ICA Motivating Example (16)
- ICA Motivating Example (17)
- ICA Motivating Example (18)
- ICA Motivating Example (19)
- ICA Motivating Example (20)
- ICA Motivating Example (21)
- ICA Motivating Example (22)
- ICA Motivating Example (23)
- ICA Motivating Example (24)
- ICA Motivating Example (25)
- ICA Motivating Example (26)
- ICA Motivating Example (27)
- ICA Motivating Example (28)
- ICA Motivating Example (29)
- ICA Motivating Example (30)
- ICA Motivating Example (31)
- ICA Motivating Example (32)
- ICA Motivating Example (33)
- ICA Motivating Example (34)
- ICA Algorithm
- ICA Algorithm (2)
- ICA Algorithm (3)
- ICA Algorithm (4)
- ICA Algorithm (5)
- ICA Algorithm (6)
- ICA Algorithm (7)
- ICA Algorithm (8)
- ICA Algorithm (9)
- ICA Algorithm (10)
- ICA Algorithm (11)
- ICA Algorithm (12)
- ICA Algorithm (13)
- ICA Algorithm (14)
- ICA Algorithm (15)
- ICA Algorithm (16)
- ICA Algorithm (17)
- ICA Algorithm (18)
- ICA Algorithm (19)
- ICA Algorithm (20)
- ICA Algorithm (21)
- ICA Algorithm (22)
- ICA Algorithm (23)
- ICA Algorithm (24)
- ICA Algorithm (25)
- ICA & Non-Gaussianity
- ICA & Non-Gaussianity (2)
- ICA & Non-Gaussianity (3)
- ICA & Non-Gaussianity (4)
- ICA & Non-Gaussianity (5)
- ICA & Non-Gaussianity (6)
- ICA & Non-Gaussianity (7)
- ICA & Non-Gaussianity (8)
- ICA & Non-Gaussianity (9)
- ICA & Non-Gaussianity (10)
- ICA & Non-Gaussianity (11)
- More ICA Examples
- More ICA Examples (2)
- More ICA Examples (3)
- More ICA Examples (4)
- More ICA Examples (5)
- More ICA Examples (6)
- More ICA Examples (7)
- More ICA Examples (8)
- More ICA Examples (9)
- More ICA Examples (10)
- More ICA Examples (11)
- More ICA Examples (12)
- More ICA Examples (13)
- More ICA Examples (14)
- More ICA Examples (15)
- More ICA Examples (16)
- More ICA Examples (17)
- More ICA Examples (18)
- More ICA Examples (19)
- More ICA Examples (20)
- More ICA Examples (21)
- More ICA Examples (22)
- More ICA Examples (23)
- More ICA Examples (24)
- More ICA Examples (25)
- More ICA Examples (26)
- More ICA Examples (27)
- More ICA Examples (28)
- More ICA Examples (29)
- More ICA Examples (30)
- More ICA Examples (31)
- More ICA Examples (32)
- More ICA Examples (33)
- More ICA Examples (34)
- More ICA Examples (35)
- More ICA Examples (36)
- More ICA Examples (37)
- More ICA Examples (38)
- More ICA Examples (39)
- ICA Overview
More ICA Examples
Two Sine Waves – Scatterplot & PCA
Clearly Wrong Recovery
More ICA Examples
Two Sine Waves – Scatterplot for ICA
Looks Very Good
More ICA Examples
Two Sine Waves – ICA Reconstruction
Excellent Despite Non-Indep’t Scatterplot
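A rough sketch of the two-sine-wave comparison. The frequencies (3 and 7), the mixing matrix `A`, and the grid search over rotation angles are all assumptions made for illustration; the slides do not give these details, and the rotation search is a simple 2-d stand-in for a full ICA algorithm.

```python
import numpy as np

def excess_kurtosis(y):
    """Sample excess kurtosis; roughly 0 for Gaussian data."""
    y = (y - y.mean()) / y.std()
    return (y ** 4).mean() - 3.0

def rotation(theta):
    """2x2 rotation matrix for angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def abs_corr(est, src):
    """|correlation| between each source (rows) and each estimate (columns)."""
    return np.abs(np.corrcoef(src.T, est.T)[:2, 2:])

# Hypothetical signals: two sine waves, frequencies 3 and 7 (assumed values).
t = np.arange(2000) / 2000.0
S = np.column_stack([np.sin(2 * np.pi * 3 * t), np.sin(2 * np.pi * 7 * t)])
A = np.array([[1.0, 0.5], [0.5, 1.0]])  # assumed mixing matrix
X = S @ A.T                               # observed mixed signals

# PCA: project centered data onto the covariance eigenvectors.
Xc = X - X.mean(axis=0)
evals, evecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
pca_scores = Xc @ evecs

# ICA sketch: sphere, then pick the rotation whose coordinates are least Gaussian.
Z = pca_scores / np.sqrt(evals)
thetas = np.linspace(0.0, np.pi / 2, 1800, endpoint=False)
contrast = [
    sum(excess_kurtosis(y) ** 2 for y in (Z @ rotation(th)).T) for th in thetas
]
ica_signals = Z @ rotation(thetas[int(np.argmax(contrast))])

print("PCA |corr| with sources:\n", abs_corr(pca_scores, S).round(2))
print("ICA |corr| with sources:\n", abs_corr(ica_signals, S).round(2))
```

With this symmetric mixing, each PCA score is a blend of both sine waves, while the kurtosis-based rotation should line up with the individual sources up to sign, order, and scale.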
More ICA Examples
Sine and Gaussian
Try Another Pair of Signals
More Like “Signal + Noise”
More ICA Examples
Sine and Gaussian – Original Signals
More ICA Examples
Sine and Gaussian – Original Scatterplot
Well Set For ICA
More ICA Examples
Sine and Gaussian – Mixed Input Data
More ICA Examples
Sine and Gaussian – Scatterplot & PCA
More ICA Examples
Sine and Gaussian – PCA Reconstruction
Got Sine Wave + Noise
More ICA Examples
Sine and Gaussian – Scatterplot ICA
More ICA Examples
Sine and Gaussian – ICA Reconstruction
More ICA Examples
Both Gaussian
Try Another Pair of Signals
Understand Assumption of One Not Gaussian
More ICA Examples
Both Gaussian – Original Signals
More ICA Examples
Both Gaussian – Original Scatterplot
Caution: Indep’t In All Directions
More ICA Examples
Both Gaussian – Mixed Input Data
More ICA Examples
Both Gaussian – PCA Scatterplot
Exploits Variation To Give Good Directions
More ICA Examples
Both Gaussian – PCA Reconstruction
Looks Good
More ICA Examples
Both Gaussian – Original Signals
Check Against Original
More ICA Examples
Both Gaussian – ICA Scatterplot
No Clear Good Rotation
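One way to see why no rotation stands out when both signals are Gaussian is sketched below. It uses excess kurtosis as the non-Gaussianity measure, with simulated unit-variance Gaussian sources and, for contrast, uniform sources; the sample size, the uniform comparison, and the helper names are assumptions for illustration.

```python
import numpy as np

def excess_kurtosis(y):
    """Sample excess kurtosis; roughly 0 for Gaussian data."""
    y = (y - y.mean()) / y.std()
    return (y ** 4).mean() - 3.0

def kurtosis_by_angle(S, angles):
    """Excess kurtosis of the projection of 2-d data S onto each direction."""
    return np.array([
        excess_kurtosis(S @ np.array([np.cos(a), np.sin(a)])) for a in angles
    ])

rng = np.random.default_rng(0)
n = 20000
angles = np.deg2rad(np.arange(0.0, 180.0, 1.0))

gauss = rng.standard_normal((n, 2))         # both sources Gaussian
unif = rng.uniform(-1.0, 1.0, size=(n, 2))  # both sources non-Gaussian

k_gauss = kurtosis_by_angle(gauss, angles)
k_unif = kurtosis_by_angle(unif, angles)

print("Gaussian: max |kurtosis| over directions:", np.abs(k_gauss).max())
print("Uniform:  kurtosis range over directions:", k_unif.max() - k_unif.min())
```

For the Gaussian pair the measure stays near zero in every direction, so a search over rotations has nothing to lock onto; for the uniform pair it varies clearly with angle, which is what lets ICA pick out the source directions.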
More ICA Examples
Both Gaussian – ICA Reconstruction
Is It Bad?
More ICA Examples
Both Gaussian – Original Signals
Check Against Original
More ICA Examples
Sine and Gaussian – Original Signals
More ICA Examples
Sine and Gaussian – Original Scatterplot
Well Set For ICA
More ICA Examples
Sine and Gaussian – Mixed Input Data
More ICA Examples
Sine and Gaussian – Scatterplot & PCA
More ICA Examples
Sine and Gaussian – PCA Reconstruction
Got Sine Wave + Noise
More ICA Examples
Sine and Gaussian – Scatterplot ICA
More ICA Examples
Sine and Gaussian – ICA Reconstruction
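The sine-and-Gaussian example can be mimicked with a small sketch. It is not the code behind the slides: the mixing coefficients, sample size, and helper names (`kurtosis`, `corr`, `sphere`, `best_rotation`) are assumptions chosen for illustration, and absolute excess kurtosis stands in for whatever non-Gaussianity measure was actually used. The sketch mixes the two signals, spheres the mixture, then scans rotation angles for the most non-Gaussian projection, which should line up with the sine wave.

```python
import math
import random

def kurtosis(xs):
    """Excess kurtosis: about 0 for Gaussian data, -1.5 for a sine wave."""
    n = len(xs)
    m = sum(xs) / n
    v = sum((x - m) ** 2 for x in xs) / n
    return sum((x - m) ** 4 for x in xs) / (n * v * v) - 3.0

def corr(a, b):
    """Sample correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)

def sphere(x, y):
    """Center, rotate onto the principal axes, and rescale to unit variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    x = [a - mx for a in x]
    y = [b - my for b in y]
    sxx = sum(a * a for a in x) / n
    syy = sum(b * b for b in y) / n
    sxy = sum(a * b for a, b in zip(x, y)) / n
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)  # angle of the first PC
    c, s = math.cos(theta), math.sin(theta)
    p1 = [c * a + s * b for a, b in zip(x, y)]
    p2 = [-s * a + c * b for a, b in zip(x, y)]
    s1 = math.sqrt(sum(a * a for a in p1) / n)
    s2 = math.sqrt(sum(b * b for b in p2) / n)
    return [a / s1 for a in p1], [b / s2 for b in p2]

def best_rotation(z1, z2, steps=180):
    """Grid search over [0, pi) for the projection with the largest |kurtosis|."""
    best_k, best_proj = -1.0, None
    for k in range(steps):
        ang = math.pi * k / steps
        c, s = math.cos(ang), math.sin(ang)
        proj = [c * a + s * b for a, b in zip(z1, z2)]
        val = abs(kurtosis(proj))
        if val > best_k:
            best_k, best_proj = val, proj
    return best_k, best_proj

random.seed(0)
n = 2000
sine = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]
gauss = [random.gauss(0.0, 1.0) for _ in range(n)]

# Hypothetical mixing: each observed signal is a blend of both sources.
x1 = [0.6 * s + 0.4 * g for s, g in zip(sine, gauss)]
x2 = [0.4 * s - 0.7 * g for s, g in zip(sine, gauss)]

z1, z2 = sphere(x1, x2)
best_k, recovered = best_rotation(z1, z2)
print("largest |kurtosis|:", round(best_k, 2))
print("|corr(recovered, sine)|:", round(abs(corr(recovered, sine)), 3))
```

The sign and scale of the recovered signal are arbitrary, which is why the check uses the absolute correlation with the original sine wave.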
More ICA Examples
Both Gaussian
Try Another Pair of Signals
Understand Assumption of One Not Gaussian
More ICA Examples
Both Gaussian – Original Signals
More ICA Examples
Both Gaussian – Original Scatterplot
Caution: Independent In All Directions
More ICA Examples
Both Gaussian – Mixed Input Data
More ICA Examples
Both Gaussian – PCA Scatterplot
Exploits Variation To Give Good Directions
More ICA Examples
Both Gaussian – PCA Reconstruction
Looks Good
More ICA Examples
Both Gaussian – Original Signals
Check Against Original
More ICA Examples
Both Gaussian – ICA Scatterplot
No Clear Good Rotation
More ICA Examples
Both Gaussian – ICA Reconstruction
Is It Bad?
More ICA Examples
Both Gaussian – Original Signals
Check Against Original
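One way to see why no rotation stands out when both signals are Gaussian: the excess kurtosis of every projection stays near zero, so a non-Gaussianity search has nothing to lock onto. The sketch below assumes excess kurtosis as the non-Gaussianity measure; the sample size, seed, and the helper name `kurtosis` are illustrative choices, not taken from the slides.

```python
import math
import random

def kurtosis(xs):
    """Excess kurtosis: close to 0 when the sample is Gaussian."""
    n = len(xs)
    m = sum(xs) / n
    v = sum((x - m) ** 2 for x in xs) / n
    return sum((x - m) ** 4 for x in xs) / (n * v * v) - 3.0

random.seed(1)
n = 2000
g1 = [random.gauss(0.0, 1.0) for _ in range(n)]
g2 = [random.gauss(0.0, 1.0) for _ in range(n)]

# Independent standard Gaussians are already (approximately) sphered,
# so the sketch goes straight to scanning rotation angles.
kurt_by_angle = []
for step in range(180):
    ang = math.pi * step / 180
    c, s = math.cos(ang), math.sin(ang)
    proj = [c * a + s * b for a, b in zip(g1, g2)]
    kurt_by_angle.append(kurtosis(proj))

flattest = min(abs(v) for v in kurt_by_angle)
peak = max(abs(v) for v in kurt_by_angle)
print("smallest |kurtosis| over angles:", round(flattest, 3))
print("largest  |kurtosis| over angles:", round(peak, 3))
```

Any peak here is sampling noise, so the angle a kurtosis-based search settles on is essentially arbitrary, consistent with the slide noting no clear good rotation.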
More ICA Examples
Now Try FDA examples – Recall Parabolas
Curves As Data Objects
More ICA Examples
Now Try FDA examples – Parabolas
PCA Gives Interpretable Decomposition
More ICA Examples
Now Try FDA examples – Parabolas
Sphering Loses Structure
ICA Finds Outliers
More ICA Examples
FDA example – Parabolas w/ 2 Outliers
Same Curves plus "Outliers"
More ICA Examples
FDA example – Parabolas w/ 2 Outliers
Impact all of:
• Mean
• PC1 (slightly)
• PC2 (dominant)
• PC3 (tilt)
More ICA Examples
FDA example – Parabolas w/ 2 Outliers
ICA Misses Main Directions
But Finds Outliers (non-Gaussian)
More ICA Examples
FDA example – Parabolas Up and Down
Recall 2 Clear Clusters
More ICA Examples
FDA example – Parabolas Up and Down
PCA: Clusters & Other Structure
More ICA Examples
FDA example – Parabolas Up and Down
ICA Does Not Find Clusters
Reason: Random Start
More ICA Examples
FDA example – Parabolas Up and Down
ICA Scary Issue: Local Minima in Optimization
More ICA Examples
FDA example – Parabolas Up and Down
ICA Solution 1: Use PCA to Start
Worked Here, But Not Always
More ICA Examples
FDA example – Parabolas Up and Down
ICA Solution 2: Use Multiple Random Starts
• Shows When Have Multiple Minima
• Range Should Turn Up Good Directions
• More to Look At / Interpret
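A toy version of the multiple-random-starts idea, under loudly stated assumptions: two independent sources (uniform and Laplace, chosen only because their kurtoses have opposite signs), absolute excess kurtosis as the objective, and a simple step-halving hill climb named `climb`. None of this is from the slides, and the sketch maximizes its objective, so the slides' "minima" appear here as local maxima. The objective has two separate local optima over the rotation angle, and which one a run reaches depends on where it starts.

```python
import math
import random

def kurtosis(xs):
    """Excess kurtosis of a sample."""
    n = len(xs)
    m = sum(xs) / n
    v = sum((x - m) ** 2 for x in xs) / n
    return sum((x - m) ** 4 for x in xs) / (n * v * v) - 3.0

random.seed(2)
n = 3000
# Unit-variance sources: uniform (kurtosis < 0) and Laplace (kurtosis > 0).
uni = [random.uniform(-math.sqrt(3), math.sqrt(3)) for _ in range(n)]
lap = [random.choice((-1, 1)) * random.expovariate(1.0) / math.sqrt(2)
       for _ in range(n)]

def objective(ang):
    """|excess kurtosis| of the data projected onto direction `ang`."""
    c, s = math.cos(ang), math.sin(ang)
    return abs(kurtosis([c * a + s * b for a, b in zip(uni, lap)]))

def climb(ang, step=0.1, tol=1e-3):
    """Local hill climb: step uphill, halve the step when neither side improves."""
    best = objective(ang)
    while step > tol:
        for cand in (ang + step, ang - step):
            val = objective(cand)
            if val > best:
                ang, best = cand, val
                break
        else:
            step /= 2  # no neighbor improved, so refine the search
    return ang % math.pi, best

# Several random starts; each run may land in a different local optimum.
results = [climb(random.uniform(0.0, math.pi)) for _ in range(20)]
for ang, val in sorted(results):
    print(f"angle = {ang:.2f} rad, |kurtosis| = {val:.2f}")
```

Seeing more than one distinct end angle across runs is the signal that the optimization has multiple local optima, and the collection of end points gives more directions to inspect and interpret.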
More ICA Examples
FDA example – Parabolas Up and Down
ICA Solution 2: Multiple Random Starts
3rd IC Direction Looks Good
More ICA Examples
FDA example – Parabolas Up and Down
ICA Solution 2: Multiple Random Starts
2nd Looks Good
More ICA Examples
FDA example – Parabolas Up and Down
ICA Solution 2: Multiple Random Starts
Never Finds Clusters
ICA Overview
Interesting Method, has Potential
Great for Directions of Non-Gaussianity
E.g. Finding Outliers
Common Application Area: fMRI
Has Its Costs:
Slippery Optimization
Interpretation Challenges
- HDLSS Asyrsquos Geometrical Representrsquon
- HDLSS Asyrsquos Geometrical Represenrsquotion
- 2nd Paper on HDLSS Asymptotics
- 0 Covariance is not independence
- 0 Covariance is not independence (2)
- HDLSS Asyrsquos Geometrical Represenrsquotion (2)
- HDLSS Math Stat of PCA
- HDLSS Math Stat of PCA (2)
- HDLSS Math Stat of PCA (3)
- HDLSS Asymptotics amp Kernel Methods
- HDLSS Asymptotics amp Kernel Methods (2)
- HDLSS Asymptotics amp Kernel Methods (3)
- HDLSS Additional Results
- Liu Twiddle ratios of subtypes
- HDLSS Data Combo Mathematics
- HDLSS Data Combo Mathematics (2)
- HDLSS Data Combo Mathematics (3)
- HDLSS Data Combo Mathematics (4)
- HDLSS Data Combo Mathematics (5)
- HDLSS Data Combo Mathematics (6)
- HDLSS Data Combo Mathematics (7)
- HDLSS Data Combo Mathematics (8)
- HDLSS Data Combo Mathematics (9)
- HDLSS Data Combo Mathematics (10)
- HDLSS Asymptotics
- HDLSS Asymptotics (2)
- HDLSS Asymptotics (3)
- HDLSS Asymptotics (4)
- The Future of HDLSS Asymptotics
- State of HDLSS Research
- Independent Component Analysis
- Independent Component Analysis (2)
- Independent Component Analysis (3)
- Independent Component Analysis (4)
- ICA Motivating Example
- ICA Motivating Example (2)
- ICA Motivating Example (3)
- ICA Motivating Example (4)
- ICA Motivating Example (5)
- ICA Motivating Example (6)
- ICA Motivating Example (7)
- ICA Motivating Example (8)
- ICA Motivating Example (9)
- ICA Motivating Example (10)
- ICA Motivating Example (11)
- ICA Motivating Example (12)
- ICA Motivating Example (13)
- ICA Motivating Example (14)
- ICA Motivating Example (15)
- ICA Motivating Example (16)
- ICA Motivating Example (17)
- ICA Motivating Example (18)
- ICA Motivating Example (19)
- ICA Motivating Example (20)
- ICA Motivating Example (21)
- ICA Motivating Example (22)
- ICA Motivating Example (23)
- ICA Motivating Example (24)
- ICA Motivating Example (25)
- ICA Motivating Example (26)
- ICA Motivating Example (27)
- ICA Motivating Example (28)
- ICA Motivating Example (29)
- ICA Motivating Example (30)
- ICA Motivating Example (31)
- ICA Motivating Example (32)
- ICA Motivating Example (33)
- ICA Motivating Example (34)
- ICA Algorithm
- ICA Algorithm (2)
- ICA Algorithm (3)
- ICA Algorithm (4)
- ICA Algorithm (5)
- ICA Algorithm (6)
- ICA Algorithm (7)
- ICA Algorithm (8)
- ICA Algorithm (9)
- ICA Algorithm (10)
- ICA Algorithm (11)
- ICA Algorithm (12)
- ICA Algorithm (13)
- ICA Algorithm (14)
- ICA Algorithm (15)
- ICA Algorithm (16)
- ICA Algorithm (17)
- ICA Algorithm (18)
- ICA Algorithm (19)
- ICA Algorithm (20)
- ICA Algorithm (21)
- ICA Algorithm (22)
- ICA Algorithm (23)
- ICA Algorithm (24)
- ICA Algorithm (25)
- ICA amp Non-Gaussianity
- ICA amp Non-Gaussianity (2)
- ICA amp Non-Gaussianity (3)
- ICA amp Non-Gaussianity (4)
- ICA amp Non-Gaussianity (5)
- ICA amp Non-Gaussianity (6)
- ICA amp Non-Gaussianity (7)
- ICA amp Non-Gaussianity (8)
- ICA amp Non-Gaussianity (9)
- ICA amp Non-Gaussianity (10)
- ICA amp Non-Gaussianity (11)
- More ICA Examples
- More ICA Examples (2)
- More ICA Examples (3)
- More ICA Examples (4)
- More ICA Examples (5)
- More ICA Examples (6)
- More ICA Examples (7)
- More ICA Examples (8)
- More ICA Examples (9)
- More ICA Examples (10)
- More ICA Examples (11)
- More ICA Examples (12)
- More ICA Examples (13)
- More ICA Examples (14)
- More ICA Examples (15)
- More ICA Examples (16)
- More ICA Examples (17)
- More ICA Examples (18)
- More ICA Examples (19)
- More ICA Examples (20)
- More ICA Examples (21)
- More ICA Examples (22)
- More ICA Examples (23)
- More ICA Examples (24)
- More ICA Examples (25)
- More ICA Examples (26)
- More ICA Examples (27)
- More ICA Examples (28)
- More ICA Examples (29)
- More ICA Examples (30)
- More ICA Examples (31)
- More ICA Examples (32)
- More ICA Examples (33)
- More ICA Examples (34)
- More ICA Examples (35)
- More ICA Examples (36)
- More ICA Examples (37)
- More ICA Examples (38)
- More ICA Examples (39)
- ICA Overview
-
More ICA Examples
Sine and Gaussian ndash Original Scatterplot
Well
Set
For
ICA
More ICA Examples
Sine and Gaussian ndash Mixed Input Data
More ICA Examples
Sine and Gaussian ndash Scatterplot amp PCA
More ICA Examples
Sine and Gaussian ndash PCA Reconstruction
Got Sine
Wave
+
Noise
More ICA Examples
Sine and Gaussian ndash Scatterplot ICA
More ICA Examples
Sine and Gaussian ndash ICA Reconstruction
More ICA Examples
Both Gaussian
Try Another Pair of Signals
Understand Assumption of
One Not Gaussian
More ICA Examples
Both Gaussian ndash Original Signals
More ICA Examples
Both Gaussian ndash Original Scatterplot
Caution
Indeprsquot
In All
Directions
More ICA Examples
Both Gaussian ndash Mixed Input Data
More ICA Examples
Both Gaussian ndash PCA Scatterplot
Exploits
Variation
To Give
Good
Diredtions
More ICA Examples
Both Gaussian ndash PCA Reconstruction
Looks
Good
More ICA Examples
Both Gaussian ndash Original Signals
Check
Against
Original
More ICA Examples
Both Gaussian ndash ICA Scatterplot
No Clear
Good
Rotation
More ICA Examples
Both Gaussian ndash ICA Reconstruction
Is It
Bad
More ICA Examples
Both Gaussian ndash Original Signals
Check
Against
Original
More ICA Examples
Now Try FDA examples ndash Recall Parabolas
Curves As
Data Objects
More ICA Examples
Now Try FDA examples ndash Parabolas
PCA Gives
Interpretable
Decomposition
More ICA Examples
Now Try FDA examples ndash Parabolas
Sphering
Loses
Structure
ICA Finds
Outliers
More ICA Examples
FDA example ndash Parabolas w 2 Outliers
Same Curves
plus ldquoOutliersrdquo
More ICA Examples
FDA example ndash Parabolas w 2 Outliers
Impact all of
bull Mean
bull PC1 (slightly)
bull PC2 (dominant)
bull PC3 (tilt)
More ICA Examples
FDA example ndash Parabolas w 2 Outliers
ICA
Misses Main
Directions
But Finds
Outliers
(non-Gaussian)
More ICA Examples
FDA example ndash Parabolas Up and Down
Recall 2 Clear
Clusters
More ICA Examples
FDA example ndash Parabolas Up and Down
PCA
Clusters
amp Other
Structure
More ICA Examples
FDA example ndash Parabolas Up and Down
ICA
Does Not
Find Clusters
Reason
Random Start
More ICA Examples
FDA example ndash Parabolas Up and Down
ICA Scary Issue
Local Minima in Optimization
More ICA Examples
FDA example ndash Parabolas Up and Down
ICA Solution 1 Use PCA to Start
Worked Here
But Not
Always
More ICA Examples
FDA example ndash Parabolas Up and Down
ICA Solution 2
Use Multiple Random Starts
bull Shows When Have Multiple Minima
bull Range Should Turn Up Good Directions
bull More to Look At Interpret
More ICA Examples
FDA example ndash Parabolas Up and Down
ICA Solution 2 Multiple Random Starts
3rd IC Dirrsquon
Looks Good
More ICA Examples
FDA example ndash Parabolas Up and Down
ICA Solution 2 Multiple Random Starts
2nd
Looks
Good
More ICA Examples
FDA example ndash Parabolas Up and Down
ICA Solution 2 Multiple Random Starts
Never
Finds
Clusters
ICA Overview
Interesting Method has Potential
Great for Directions of Non-Gaussianity
Eg Finding Outliers
Common Application Area FMRI
Has Its Costs
Slippery Optimization
Interpetation Challenges
- HDLSS Asyrsquos Geometrical Representrsquon
- HDLSS Asyrsquos Geometrical Represenrsquotion
- 2nd Paper on HDLSS Asymptotics
- 0 Covariance is not independence
- 0 Covariance is not independence (2)
- HDLSS Asyrsquos Geometrical Represenrsquotion (2)
- HDLSS Math Stat of PCA
- HDLSS Math Stat of PCA (2)
- HDLSS Math Stat of PCA (3)
- HDLSS Asymptotics amp Kernel Methods
- HDLSS Asymptotics amp Kernel Methods (2)
- HDLSS Asymptotics amp Kernel Methods (3)
- HDLSS Additional Results
- Liu Twiddle ratios of subtypes
- HDLSS Data Combo Mathematics
- HDLSS Data Combo Mathematics (2)
- HDLSS Data Combo Mathematics (3)
- HDLSS Data Combo Mathematics (4)
- HDLSS Data Combo Mathematics (5)
- HDLSS Data Combo Mathematics (6)
- HDLSS Data Combo Mathematics (7)
- HDLSS Data Combo Mathematics (8)
- HDLSS Data Combo Mathematics (9)
- HDLSS Data Combo Mathematics (10)
- HDLSS Asymptotics
- HDLSS Asymptotics (2)
- HDLSS Asymptotics (3)
- HDLSS Asymptotics (4)
- The Future of HDLSS Asymptotics
- State of HDLSS Research
- Independent Component Analysis
- Independent Component Analysis (2)
- Independent Component Analysis (3)
- Independent Component Analysis (4)
- ICA Motivating Example
- ICA Motivating Example (2)
- ICA Motivating Example (3)
- ICA Motivating Example (4)
- ICA Motivating Example (5)
- ICA Motivating Example (6)
- ICA Motivating Example (7)
- ICA Motivating Example (8)
- ICA Motivating Example (9)
- ICA Motivating Example (10)
- ICA Motivating Example (11)
- ICA Motivating Example (12)
- ICA Motivating Example (13)
- ICA Motivating Example (14)
- ICA Motivating Example (15)
- ICA Motivating Example (16)
- ICA Motivating Example (17)
- ICA Motivating Example (18)
- ICA Motivating Example (19)
- ICA Motivating Example (20)
- ICA Motivating Example (21)
- ICA Motivating Example (22)
- ICA Motivating Example (23)
- ICA Motivating Example (24)
- ICA Motivating Example (25)
- ICA Motivating Example (26)
- ICA Motivating Example (27)
- ICA Motivating Example (28)
- ICA Motivating Example (29)
- ICA Motivating Example (30)
- ICA Motivating Example (31)
- ICA Motivating Example (32)
- ICA Motivating Example (33)
- ICA Motivating Example (34)
- ICA Algorithm
- ICA Algorithm (2)
- ICA Algorithm (3)
- ICA Algorithm (4)
- ICA Algorithm (5)
- ICA Algorithm (6)
- ICA Algorithm (7)
- ICA Algorithm (8)
- ICA Algorithm (9)
- ICA Algorithm (10)
- ICA Algorithm (11)
- ICA Algorithm (12)
- ICA Algorithm (13)
- ICA Algorithm (14)
- ICA Algorithm (15)
- ICA Algorithm (16)
- ICA Algorithm (17)
- ICA Algorithm (18)
- ICA Algorithm (19)
- ICA Algorithm (20)
- ICA Algorithm (21)
- ICA Algorithm (22)
- ICA Algorithm (23)
- ICA Algorithm (24)
- ICA Algorithm (25)
- ICA amp Non-Gaussianity
- ICA amp Non-Gaussianity (2)
- ICA amp Non-Gaussianity (3)
- ICA amp Non-Gaussianity (4)
- ICA amp Non-Gaussianity (5)
- ICA amp Non-Gaussianity (6)
- ICA amp Non-Gaussianity (7)
- ICA amp Non-Gaussianity (8)
- ICA amp Non-Gaussianity (9)
- ICA amp Non-Gaussianity (10)
- ICA amp Non-Gaussianity (11)
- More ICA Examples
- More ICA Examples (2)
- More ICA Examples (3)
- More ICA Examples (4)
- More ICA Examples (5)
- More ICA Examples (6)
- More ICA Examples (7)
- More ICA Examples (8)
- More ICA Examples (9)
- More ICA Examples (10)
- More ICA Examples (11)
- More ICA Examples (12)
- More ICA Examples (13)
- More ICA Examples (14)
- More ICA Examples (15)
- More ICA Examples (16)
- More ICA Examples (17)
- More ICA Examples (18)
- More ICA Examples (19)
- More ICA Examples (20)
- More ICA Examples (21)
- More ICA Examples (22)
- More ICA Examples (23)
- More ICA Examples (24)
- More ICA Examples (25)
- More ICA Examples (26)
- More ICA Examples (27)
- More ICA Examples (28)
- More ICA Examples (29)
- More ICA Examples (30)
- More ICA Examples (31)
- More ICA Examples (32)
- More ICA Examples (33)
- More ICA Examples (34)
- More ICA Examples (35)
- More ICA Examples (36)
- More ICA Examples (37)
- More ICA Examples (38)
- More ICA Examples (39)
- ICA Overview
-
More ICA Examples
Sine and Gaussian ndash Mixed Input Data
More ICA Examples
Sine and Gaussian ndash Scatterplot amp PCA
More ICA Examples
Sine and Gaussian ndash PCA Reconstruction
Got Sine
Wave
+
Noise
More ICA Examples
Sine and Gaussian ndash Scatterplot ICA
More ICA Examples
Sine and Gaussian ndash ICA Reconstruction
More ICA Examples
Both Gaussian
Try Another Pair of Signals
Understand Assumption of
One Not Gaussian
More ICA Examples
Both Gaussian ndash Original Signals
More ICA Examples
Both Gaussian ndash Original Scatterplot
Caution
Indeprsquot
In All
Directions
More ICA Examples
Both Gaussian ndash Mixed Input Data
More ICA Examples
Both Gaussian ndash PCA Scatterplot
Exploits
Variation
To Give
Good
Diredtions
More ICA Examples
Both Gaussian ndash PCA Reconstruction
Looks
Good
More ICA Examples
Both Gaussian ndash Original Signals
Check
Against
Original
More ICA Examples
Both Gaussian ndash ICA Scatterplot
No Clear
Good
Rotation
More ICA Examples
Both Gaussian ndash ICA Reconstruction
Is It
Bad
More ICA Examples
Both Gaussian ndash Original Signals
Check
Against
Original
More ICA Examples
Now Try FDA examples ndash Recall Parabolas
Curves As
Data Objects
More ICA Examples
Now Try FDA examples ndash Parabolas
PCA Gives
Interpretable
Decomposition
More ICA Examples
Now Try FDA examples ndash Parabolas
Sphering
Loses
Structure
ICA Finds
Outliers
More ICA Examples
FDA example ndash Parabolas w 2 Outliers
Same Curves
plus ldquoOutliersrdquo
More ICA Examples
FDA example ndash Parabolas w 2 Outliers
Impact all of
bull Mean
bull PC1 (slightly)
bull PC2 (dominant)
bull PC3 (tilt)
More ICA Examples
FDA example ndash Parabolas w 2 Outliers
ICA
Misses Main
Directions
But Finds
Outliers
(non-Gaussian)
More ICA Examples
FDA example ndash Parabolas Up and Down
Recall 2 Clear
Clusters
More ICA Examples
FDA example ndash Parabolas Up and Down
PCA
Clusters
amp Other
Structure
More ICA Examples
FDA example ndash Parabolas Up and Down
ICA
Does Not
Find Clusters
Reason
Random Start
More ICA Examples
FDA example ndash Parabolas Up and Down
ICA Scary Issue
Local Minima in Optimization
More ICA Examples
FDA example ndash Parabolas Up and Down
ICA Solution 1 Use PCA to Start
Worked Here
But Not
Always
More ICA Examples
FDA example ndash Parabolas Up and Down
ICA Solution 2
Use Multiple Random Starts
bull Shows When Have Multiple Minima
bull Range Should Turn Up Good Directions
bull More to Look At Interpret
More ICA Examples
FDA example ndash Parabolas Up and Down
ICA Solution 2 Multiple Random Starts
3rd IC Dirrsquon
Looks Good
More ICA Examples
FDA example ndash Parabolas Up and Down
ICA Solution 2 Multiple Random Starts
2nd
Looks
Good
More ICA Examples
FDA example ndash Parabolas Up and Down
ICA Solution 2 Multiple Random Starts
Never
Finds
Clusters
ICA Overview
Interesting Method has Potential
Great for Directions of Non-Gaussianity
Eg Finding Outliers
Common Application Area FMRI
Has Its Costs
Slippery Optimization
Interpetation Challenges
- HDLSS Asyrsquos Geometrical Representrsquon
- HDLSS Asyrsquos Geometrical Represenrsquotion
- 2nd Paper on HDLSS Asymptotics
- 0 Covariance is not independence
- 0 Covariance is not independence (2)
- HDLSS Asyrsquos Geometrical Represenrsquotion (2)
- HDLSS Math Stat of PCA
- HDLSS Math Stat of PCA (2)
- HDLSS Math Stat of PCA (3)
- HDLSS Asymptotics amp Kernel Methods
- HDLSS Asymptotics amp Kernel Methods (2)
- HDLSS Asymptotics amp Kernel Methods (3)
- HDLSS Additional Results
- Liu Twiddle ratios of subtypes
- HDLSS Data Combo Mathematics
- HDLSS Data Combo Mathematics (2)
- HDLSS Data Combo Mathematics (3)
- HDLSS Data Combo Mathematics (4)
- HDLSS Data Combo Mathematics (5)
- HDLSS Data Combo Mathematics (6)
- HDLSS Data Combo Mathematics (7)
- HDLSS Data Combo Mathematics (8)
- HDLSS Data Combo Mathematics (9)
- HDLSS Data Combo Mathematics (10)
- HDLSS Asymptotics
- HDLSS Asymptotics (2)
- HDLSS Asymptotics (3)
- HDLSS Asymptotics (4)
- The Future of HDLSS Asymptotics
- State of HDLSS Research
- Independent Component Analysis
- Independent Component Analysis (2)
- Independent Component Analysis (3)
- Independent Component Analysis (4)
- ICA Motivating Example
- ICA Motivating Example (2)
- ICA Motivating Example (3)
- ICA Motivating Example (4)
- ICA Motivating Example (5)
- ICA Motivating Example (6)
- ICA Motivating Example (7)
- ICA Motivating Example (8)
- ICA Motivating Example (9)
- ICA Motivating Example (10)
- ICA Motivating Example (11)
- ICA Motivating Example (12)
- ICA Motivating Example (13)
- ICA Motivating Example (14)
- ICA Motivating Example (15)
- ICA Motivating Example (16)
- ICA Motivating Example (17)
- ICA Motivating Example (18)
- ICA Motivating Example (19)
- ICA Motivating Example (20)
- ICA Motivating Example (21)
- ICA Motivating Example (22)
- ICA Motivating Example (23)
- ICA Motivating Example (24)
- ICA Motivating Example (25)
- ICA Motivating Example (26)
- ICA Motivating Example (27)
- ICA Motivating Example (28)
- ICA Motivating Example (29)
- ICA Motivating Example (30)
- ICA Motivating Example (31)
- ICA Motivating Example (32)
- ICA Motivating Example (33)
- ICA Motivating Example (34)
- ICA Algorithm
- ICA Algorithm (2)
- ICA Algorithm (3)
- ICA Algorithm (4)
- ICA Algorithm (5)
- ICA Algorithm (6)
- ICA Algorithm (7)
- ICA Algorithm (8)
- ICA Algorithm (9)
- ICA Algorithm (10)
- ICA Algorithm (11)
- ICA Algorithm (12)
- ICA Algorithm (13)
- ICA Algorithm (14)
- ICA Algorithm (15)
- ICA Algorithm (16)
- ICA Algorithm (17)
- ICA Algorithm (18)
- ICA Algorithm (19)
- ICA Algorithm (20)
- ICA Algorithm (21)
- ICA Algorithm (22)
- ICA Algorithm (23)
- ICA Algorithm (24)
- ICA Algorithm (25)
- ICA amp Non-Gaussianity
- ICA amp Non-Gaussianity (2)
- ICA amp Non-Gaussianity (3)
- ICA amp Non-Gaussianity (4)
- ICA amp Non-Gaussianity (5)
- ICA amp Non-Gaussianity (6)
- ICA amp Non-Gaussianity (7)
- ICA amp Non-Gaussianity (8)
- ICA amp Non-Gaussianity (9)
- ICA amp Non-Gaussianity (10)
- ICA amp Non-Gaussianity (11)
- More ICA Examples
- More ICA Examples (2)
- More ICA Examples (3)
- More ICA Examples (4)
- More ICA Examples (5)
- More ICA Examples (6)
- More ICA Examples (7)
- More ICA Examples (8)
- More ICA Examples (9)
- More ICA Examples (10)
- More ICA Examples (11)
- More ICA Examples (12)
- More ICA Examples (13)
- More ICA Examples (14)
- More ICA Examples (15)
- More ICA Examples (16)
- More ICA Examples (17)
- More ICA Examples (18)
- More ICA Examples (19)
- More ICA Examples (20)
- More ICA Examples (21)
- More ICA Examples (22)
- More ICA Examples (23)
- More ICA Examples (24)
- More ICA Examples (25)
- More ICA Examples (26)
- More ICA Examples (27)
- More ICA Examples (28)
- More ICA Examples (29)
- More ICA Examples (30)
- More ICA Examples (31)
- More ICA Examples (32)
- More ICA Examples (33)
- More ICA Examples (34)
- More ICA Examples (35)
- More ICA Examples (36)
- More ICA Examples (37)
- More ICA Examples (38)
- More ICA Examples (39)
- ICA Overview
-
More ICA Examples
Sine and Gaussian ndash Scatterplot amp PCA
More ICA Examples
Sine and Gaussian ndash PCA Reconstruction
Got Sine
Wave
+
Noise
More ICA Examples
Sine and Gaussian ndash Scatterplot ICA
More ICA Examples
Sine and Gaussian ndash ICA Reconstruction
More ICA Examples
Both Gaussian
Try Another Pair of Signals
Understand Assumption of
One Not Gaussian
More ICA Examples
Both Gaussian ndash Original Signals
More ICA Examples
Both Gaussian ndash Original Scatterplot
Caution
Indeprsquot
In All
Directions
More ICA Examples
Both Gaussian ndash Mixed Input Data
More ICA Examples
Both Gaussian ndash PCA Scatterplot
Exploits
Variation
To Give
Good
Diredtions
More ICA Examples
Both Gaussian ndash PCA Reconstruction
Looks
Good
More ICA Examples
Both Gaussian ndash Original Signals
Check
Against
Original
More ICA Examples
Both Gaussian ndash ICA Scatterplot
No Clear
Good
Rotation
More ICA Examples
Both Gaussian ndash ICA Reconstruction
Is It
Bad
More ICA Examples
Both Gaussian ndash Original Signals
Check
Against
Original
More ICA Examples
Now Try FDA examples ndash Recall Parabolas
Curves As
Data Objects
More ICA Examples
Now Try FDA examples ndash Parabolas
PCA Gives
Interpretable
Decomposition
More ICA Examples
Now Try FDA examples ndash Parabolas
Sphering
Loses
Structure
ICA Finds
Outliers
More ICA Examples
FDA example ndash Parabolas w 2 Outliers
Same Curves
plus ldquoOutliersrdquo
More ICA Examples
FDA example ndash Parabolas w 2 Outliers
Impact all of
bull Mean
bull PC1 (slightly)
bull PC2 (dominant)
bull PC3 (tilt)
More ICA Examples
FDA example ndash Parabolas w 2 Outliers
ICA
Misses Main
Directions
But Finds
Outliers
(non-Gaussian)
More ICA Examples
FDA example ndash Parabolas Up and Down
Recall 2 Clear
Clusters
More ICA Examples
FDA example ndash Parabolas Up and Down
PCA
Clusters
amp Other
Structure
More ICA Examples
FDA example ndash Parabolas Up and Down
ICA
Does Not
Find Clusters
Reason
Random Start
More ICA Examples
FDA example ndash Parabolas Up and Down
ICA Scary Issue
Local Minima in Optimization
More ICA Examples
FDA example ndash Parabolas Up and Down
ICA Solution 1 Use PCA to Start
Worked Here
But Not
Always
More ICA Examples
FDA example ndash Parabolas Up and Down
ICA Solution 2
Use Multiple Random Starts
bull Shows When Have Multiple Minima
bull Range Should Turn Up Good Directions
bull More to Look At Interpret
More ICA Examples
FDA example ndash Parabolas Up and Down
ICA Solution 2 Multiple Random Starts
3rd IC Dirrsquon
Looks Good
More ICA Examples
FDA example ndash Parabolas Up and Down
ICA Solution 2 Multiple Random Starts
2nd
Looks
Good
More ICA Examples
FDA example ndash Parabolas Up and Down
ICA Solution 2 Multiple Random Starts
Never
Finds
Clusters
ICA Overview
Interesting Method has Potential
Great for Directions of Non-Gaussianity
Eg Finding Outliers
Common Application Area FMRI
Has Its Costs
Slippery Optimization
Interpetation Challenges
- HDLSS Asyrsquos Geometrical Representrsquon
- HDLSS Asyrsquos Geometrical Represenrsquotion
- 2nd Paper on HDLSS Asymptotics
- 0 Covariance is not independence
- 0 Covariance is not independence (2)
- HDLSS Asyrsquos Geometrical Represenrsquotion (2)
- HDLSS Math Stat of PCA
- HDLSS Math Stat of PCA (2)
- HDLSS Math Stat of PCA (3)
- HDLSS Asymptotics amp Kernel Methods
- HDLSS Asymptotics amp Kernel Methods (2)
- HDLSS Asymptotics amp Kernel Methods (3)
- HDLSS Additional Results
- Liu Twiddle ratios of subtypes
- HDLSS Data Combo Mathematics
- HDLSS Data Combo Mathematics (2)
- HDLSS Data Combo Mathematics (3)
- HDLSS Data Combo Mathematics (4)
- HDLSS Data Combo Mathematics (5)
- HDLSS Data Combo Mathematics (6)
- HDLSS Data Combo Mathematics (7)
- HDLSS Data Combo Mathematics (8)
- HDLSS Data Combo Mathematics (9)
- HDLSS Data Combo Mathematics (10)
- HDLSS Asymptotics
- HDLSS Asymptotics (2)
- HDLSS Asymptotics (3)
- HDLSS Asymptotics (4)
- The Future of HDLSS Asymptotics
- State of HDLSS Research
- Independent Component Analysis
- Independent Component Analysis (2)
- Independent Component Analysis (3)
- Independent Component Analysis (4)
- ICA Motivating Example
- ICA Motivating Example (2)
- ICA Motivating Example (3)
- ICA Motivating Example (4)
- ICA Motivating Example (5)
- ICA Motivating Example (6)
- ICA Motivating Example (7)
- ICA Motivating Example (8)
- ICA Motivating Example (9)
- ICA Motivating Example (10)
- ICA Motivating Example (11)
- ICA Motivating Example (12)
- ICA Motivating Example (13)
- ICA Motivating Example (14)
- ICA Motivating Example (15)
- ICA Motivating Example (16)
- ICA Motivating Example (17)
- ICA Motivating Example (18)
- ICA Motivating Example (19)
- ICA Motivating Example (20)
- ICA Motivating Example (21)
- ICA Motivating Example (22)
- ICA Motivating Example (23)
- ICA Motivating Example (24)
- ICA Motivating Example (25)
- ICA Motivating Example (26)
- ICA Motivating Example (27)
- ICA Motivating Example (28)
- ICA Motivating Example (29)
- ICA Motivating Example (30)
- ICA Motivating Example (31)
- ICA Motivating Example (32)
- ICA Motivating Example (33)
- ICA Motivating Example (34)
- ICA Algorithm
- ICA Algorithm (2)
- ICA Algorithm (3)
- ICA Algorithm (4)
- ICA Algorithm (5)
- ICA Algorithm (6)
- ICA Algorithm (7)
- ICA Algorithm (8)
- ICA Algorithm (9)
- ICA Algorithm (10)
- ICA Algorithm (11)
- ICA Algorithm (12)
- ICA Algorithm (13)
- ICA Algorithm (14)
- ICA Algorithm (15)
- ICA Algorithm (16)
- ICA Algorithm (17)
- ICA Algorithm (18)
- ICA Algorithm (19)
- ICA Algorithm (20)
- ICA Algorithm (21)
- ICA Algorithm (22)
- ICA Algorithm (23)
- ICA Algorithm (24)
- ICA Algorithm (25)
- ICA amp Non-Gaussianity
- ICA amp Non-Gaussianity (2)
- ICA amp Non-Gaussianity (3)
- ICA amp Non-Gaussianity (4)
- ICA amp Non-Gaussianity (5)
- ICA amp Non-Gaussianity (6)
- ICA amp Non-Gaussianity (7)
- ICA amp Non-Gaussianity (8)
- ICA amp Non-Gaussianity (9)
- ICA amp Non-Gaussianity (10)
- ICA amp Non-Gaussianity (11)
- More ICA Examples
- More ICA Examples (2)
- More ICA Examples (3)
- More ICA Examples (4)
- More ICA Examples (5)
- More ICA Examples (6)
- More ICA Examples (7)
- More ICA Examples (8)
- More ICA Examples (9)
- More ICA Examples (10)
- More ICA Examples (11)
- More ICA Examples (12)
- More ICA Examples (13)
- More ICA Examples (14)
- More ICA Examples (15)
- More ICA Examples (16)
- More ICA Examples (17)
- More ICA Examples (18)
- More ICA Examples (19)
- More ICA Examples (20)
- More ICA Examples (21)
- More ICA Examples (22)
- More ICA Examples (23)
- More ICA Examples (24)
- More ICA Examples (25)
- More ICA Examples (26)
- More ICA Examples (27)
- More ICA Examples (28)
- More ICA Examples (29)
- More ICA Examples (30)
- More ICA Examples (31)
- More ICA Examples (32)
- More ICA Examples (33)
- More ICA Examples (34)
- More ICA Examples (35)
- More ICA Examples (36)
- More ICA Examples (37)
- More ICA Examples (38)
- More ICA Examples (39)
- ICA Overview
-
More ICA Examples
Sine and Gaussian ndash PCA Reconstruction
Got Sine
Wave
+
Noise
More ICA Examples
Sine and Gaussian ndash Scatterplot ICA
More ICA Examples
Sine and Gaussian ndash ICA Reconstruction
More ICA Examples
Both Gaussian
Try Another Pair of Signals
Understand Assumption of
One Not Gaussian
More ICA Examples
Both Gaussian ndash Original Signals
More ICA Examples
Both Gaussian ndash Original Scatterplot
Caution
Indeprsquot
In All
Directions
More ICA Examples
Both Gaussian ndash Mixed Input Data
More ICA Examples
Both Gaussian ndash PCA Scatterplot
Exploits
Variation
To Give
Good
Diredtions
More ICA Examples
Both Gaussian ndash PCA Reconstruction
More ICA Examples
Sine and Gaussian – ICA Scatterplot
More ICA Examples
Sine and Gaussian – ICA Reconstruction
More ICA Examples
Both Gaussian
Try Another Pair of Signals
Understand Assumption of One Not Gaussian
More ICA Examples
Both Gaussian – Original Signals
More ICA Examples
Both Gaussian – Original Scatterplot
Caution: Independent in All Directions
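The caution above can be made concrete with a short derivation. This is a sketch, assuming the two original signals are independent Gaussians scaled to unit variance; the symbols $s$, $R$, $A$, and $x$ are notation introduced here, not taken from the slides.

```latex
s \sim N(0, I_2), \qquad R^\top R = I_2
\;\Longrightarrow\;
R s \sim N\!\left(0,\; R I_2 R^\top\right) = N(0, I_2).
```

Under these assumptions the rotated coordinates $Rs$ are again independent standard Gaussians, so the mixtures $x = As$ and $x = (AR^\top)(Rs)$ have the same distribution and no rotation is singled out.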
More ICA Examples
Both Gaussian – Mixed Input Data
More ICA Examples
Both Gaussian – PCA Scatterplot
Exploits Variation to Give Good Directions
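How PCA "exploits variation" can be sketched in two dimensions. The example below is only an illustration, not the code behind the slides: the function name `pca_2d`, the toy data, and the 45-degree mixing are all assumptions made here. It uses the closed-form eigen-decomposition of a 2x2 sample covariance matrix.

```python
import math
import random


def pca_2d(points):
    """Return (eigenvalues, first PC direction) for a list of (x, y) pairs."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Sample covariance entries (divisor n - 1) of [[a, b], [b, c]].
    a = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    c = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    b = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    # Closed-form eigen-decomposition of a symmetric 2x2 matrix.
    mid = (a + c) / 2.0
    rad = math.hypot((a - c) / 2.0, b)
    theta = 0.5 * math.atan2(2.0 * b, a - c)  # angle of the major axis
    return (mid + rad, mid - rad), (math.cos(theta), math.sin(theta))


# Hypothetical toy data: two independent Gaussians with different spreads
# (standard deviations 3 and 1), mixed by a 45-degree rotation.
rng = random.Random(0)
mixed = []
for _ in range(2000):
    s1, s2 = rng.gauss(0.0, 3.0), rng.gauss(0.0, 1.0)
    mixed.append(((s1 - s2) / math.sqrt(2), (s1 + s2) / math.sqrt(2)))

eigvals, pc1 = pca_2d(mixed)
print(eigvals, pc1)
```

Because the two toy signals have different variances, the first PC direction lines up with the high-variance signal. With equal variances this would not work, since every direction would carry the same variation.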
More ICA Examples
Both Gaussian – PCA Reconstruction
Looks Good
More ICA Examples
Both Gaussian – Original Signals
Check Against Original
More ICA Examples
Both Gaussian – ICA Scatterplot
No Clear Good Rotation
More ICA Examples
Both Gaussian – ICA Reconstruction
Is It Bad?
More ICA Examples
Both Gaussian – Original Signals
Check Against Original
More ICA Examples
Now Try FDA examples – Recall Parabolas
Curves As Data Objects
More ICA Examples
Now Try FDA examples – Parabolas
PCA Gives Interpretable Decomposition
More ICA Examples
Now Try FDA examples – Parabolas
Sphering Loses Structure
ICA Finds Outliers
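Why sphering "loses structure" can be illustrated in two dimensions. This is a minimal sketch under stated assumptions: `cov_2d` and `sphere_2d` are hypothetical helper names, the toy data are invented here, and the covariance is assumed non-degenerate (both eigenvalues positive). After sphering, every direction has variance 1, so the variance ordering that PCA relies on is gone.

```python
import math
import random


def cov_2d(points):
    """Sample mean and covariance entries (a, b, c) of [[a, b], [b, c]]."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    a = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    c = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    b = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (a, b, c)


def sphere_2d(points):
    """Center, rotate onto the PCA axes, and rescale each axis to variance 1."""
    (mx, my), (a, b, c) = cov_2d(points)
    mid, rad = (a + c) / 2.0, math.hypot((a - c) / 2.0, b)
    lam1, lam2 = mid + rad, mid - rad         # eigenvalues, lam1 >= lam2 > 0
    theta = 0.5 * math.atan2(2.0 * b, a - c)  # major-axis angle
    ct, st = math.cos(theta), math.sin(theta)
    out = []
    for x, y in points:
        u, v = x - mx, y - my
        out.append(((ct * u + st * v) / math.sqrt(lam1),
                    (-st * u + ct * v) / math.sqrt(lam2)))
    return out


# Hypothetical correlated toy data.
rng = random.Random(3)
raw = []
for _ in range(500):
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    raw.append((2.0 * z1 + 1.0, z1 + 0.5 * z2 - 2.0))

_, (sa, sb, sc) = cov_2d(sphere_2d(raw))
print(sa, sb, sc)  # sample covariance of the sphered data
```

The sphered data have identity sample covariance, so any further search for directions has to use something other than variance, for example a non-Gaussianity measure.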
More ICA Examples
FDA example – Parabolas w/ 2 Outliers
Same Curves plus “Outliers”
More ICA Examples
FDA example – Parabolas w/ 2 Outliers
Impact all of:
• Mean
• PC1 (slightly)
• PC2 (dominant)
• PC3 (tilt)
More ICA Examples
FDA example – Parabolas w/ 2 Outliers
ICA Misses Main Directions
But Finds Outliers (non-Gaussian)
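One common way to quantify "non-Gaussian" is excess kurtosis, which is near 0 for Gaussian data. The sketch below is an illustration only: kurtosis is an assumed choice of measure (the slides do not say which one is used here), and the sample and the two outlying values are hypothetical. It shows how just two outliers make a projection look strongly non-Gaussian.

```python
import random


def excess_kurtosis(values):
    """Sample excess kurtosis: m4 / m2**2 - 3 (about 0 for Gaussian data)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 * m2) - 3.0


rng = random.Random(1)
clean = [rng.gauss(0.0, 1.0) for _ in range(2000)]
with_outliers = clean + [8.0, 8.0]  # two hypothetical outlying scores

k_clean = excess_kurtosis(clean)
k_out = excess_kurtosis(with_outliers)
print(k_clean, k_out)
```

A direction that isolates the outliers therefore scores high on this kind of measure, which is consistent with ICA finding the outliers while missing the main directions of variation.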
More ICA Examples
FDA example – Parabolas Up and Down
Recall 2 Clear Clusters
More ICA Examples
FDA example – Parabolas Up and Down
PCA: Clusters & Other Structure
More ICA Examples
FDA example – Parabolas Up and Down
ICA Does Not Find Clusters
Reason: Random Start
More ICA Examples
FDA example – Parabolas Up and Down
ICA Scary Issue: Local Minima in Optimization
More ICA Examples
FDA example – Parabolas Up and Down
ICA Solution 1: Use PCA to Start
Worked Here, But Not Always
More ICA Examples
FDA example – Parabolas Up and Down
ICA Solution 2: Use Multiple Random Starts
• Shows When Have Multiple Minima
• Range Should Turn Up Good Directions
• More to Look At, Interpret
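The multiple-starts idea can be sketched on a toy problem. This is not the ICA algorithm from the slides: it is a simple hill climb over a direction angle in 2-d, maximizing |excess kurtosis| of the projection (an assumed criterion). The function names, step sizes, and uniform toy sources are all hypothetical. The search has two distinct local optima, and different starts end at different ones.

```python
import math
import random


def abs_kurtosis_at(points, theta):
    """|excess kurtosis| of the data projected onto direction (cos, sin)."""
    ct, st = math.cos(theta), math.sin(theta)
    proj = [ct * x + st * y for x, y in points]
    n = len(proj)
    mean = sum(proj) / n
    m2 = sum((p - mean) ** 2 for p in proj) / n
    m4 = sum((p - mean) ** 4 for p in proj) / n
    return abs(m4 / (m2 * m2) - 3.0)


def hill_climb(points, theta, step=0.1, tol=1e-3):
    """Local search: move while a neighbor improves, else halve the step."""
    best = abs_kurtosis_at(points, theta)
    while step > tol:
        moved = False
        for cand in (theta + step, theta - step):
            val = abs_kurtosis_at(points, cand)
            if val > best:
                theta, best, moved = cand, val, True
                break
        if not moved:
            step /= 2.0
    return theta % math.pi, best  # directions are defined only up to sign


def multi_start(points, starts):
    """Run the local search from every start angle; return all end points."""
    return [hill_climb(points, t) for t in starts]


# Hypothetical toy data: two independent unit-variance uniform sources
# (already sphered), so the independent directions are the two axes.
rng = random.Random(2)
r3 = math.sqrt(3.0)
data = [(rng.uniform(-r3, r3), rng.uniform(-r3, r3)) for _ in range(4000)]

random_starts = [rng.uniform(0.0, math.pi) for _ in range(6)]
for angle, score in multi_start(data, random_starts):
    print(round(angle, 2), round(score, 2))
```

Each printed angle lies near one of the two axes (0 or pi, and pi/2), depending on where the search started. A PCA-based start, as in Solution 1, would be one more entry in `starts`.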
More ICA Examples
FDA example – Parabolas Up and Down
ICA Solution 2: Multiple Random Starts
3rd IC Direction Looks Good
More ICA Examples
FDA example – Parabolas Up and Down
ICA Solution 2: Multiple Random Starts
2nd Looks Good
More ICA Examples
FDA example – Parabolas Up and Down
ICA Solution 2: Multiple Random Starts
Never Finds Clusters
ICA Overview
Interesting Method, Has Potential
Great for Directions of Non-Gaussianity
E.g. Finding Outliers
Common Application Area: fMRI
Has Its Costs
Slippery Optimization
Interpretation Challenges
More ICA Examples
Both Gaussian ndash Original Scatterplot
Caution
Indeprsquot
In All
Directions
More ICA Examples
Both Gaussian ndash Mixed Input Data
More ICA Examples
Both Gaussian ndash PCA Scatterplot
Exploits
Variation
To Give
Good
Diredtions
More ICA Examples
Both Gaussian ndash PCA Reconstruction
Looks
Good
More ICA Examples
Both Gaussian ndash Original Signals
Check
Against
Original
More ICA Examples
Both Gaussian ndash ICA Scatterplot
No Clear
Good
Rotation
More ICA Examples
Both Gaussian ndash ICA Reconstruction
Is It
Bad
More ICA Examples
Both Gaussian ndash Original Signals
Check
Against
Original
More ICA Examples
Now Try FDA examples - Recall Parabolas
Curves As Data Objects
More ICA Examples
Now Try FDA examples - Parabolas
PCA Gives Interpretable Decomposition
More ICA Examples
Now Try FDA examples - Parabolas
Sphering Loses Structure
ICA Finds Outliers
More ICA Examples
FDA example - Parabolas w/ 2 Outliers
Same Curves plus "Outliers"
More ICA Examples
FDA example - Parabolas w/ 2 Outliers
Impact all of:
• Mean
• PC1 (slightly)
• PC2 (dominant)
• PC3 (tilt)
More ICA Examples
FDA example - Parabolas w/ 2 Outliers
ICA Misses Main Directions
But Finds Outliers (non-Gaussian)
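A toy sketch of the contrast above between variance-driven and non-Gaussianity-driven directions. It is not the parabola data: it assumes a hypothetical 2-d Gaussian cloud with two planted outliers, and uses excess kurtosis as a stand-in for whatever criterion the ICA run on the slides used.

```python
import numpy as np

def excess_kurtosis(y):
    """Sample excess kurtosis (approximately 0 for Gaussian data)."""
    y = (y - y.mean()) / y.std()
    return np.mean(y ** 4) - 3.0

rng = np.random.default_rng(1)
n = 200

# Hypothetical data: large Gaussian spread along axis 0, small along axis 1.
x = np.column_stack([3.0 * rng.standard_normal(n), rng.standard_normal(n)])
# Plant two outliers along axis 1 (values chosen for illustration only).
x[:2, 1] = 8.0

main_dir = np.array([1.0, 0.0])      # direction of largest variance
outlier_dir = np.array([0.0, 1.0])   # direction containing the outliers

var_main = np.var(x @ main_dir)
var_out = np.var(x @ outlier_dir)
kurt_main = excess_kurtosis(x @ main_dir)
kurt_out = excess_kurtosis(x @ outlier_dir)

print("variance: main", var_main, "outlier", var_out)
print("kurtosis: main", kurt_main, "outlier", kurt_out)
```

Here variance favors the main direction, while kurtosis is far larger along the outlier direction, which is the sense in which a non-Gaussianity search can miss main directions yet find outliers.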
More ICA Examples
FDA example - Parabolas Up and Down
Recall 2 Clear Clusters
More ICA Examples
FDA example - Parabolas Up and Down
PCA: Clusters & Other Structure
More ICA Examples
FDA example - Parabolas Up and Down
ICA Does Not Find Clusters
Reason: Random Start
More ICA Examples
FDA example - Parabolas Up and Down
ICA Scary Issue:
Local Minima in Optimization
More ICA Examples
FDA example - Parabolas Up and Down
ICA Solution 1: Use PCA to Start
Worked Here But Not Always
More ICA Examples
FDA example - Parabolas Up and Down
ICA Solution 2:
Use Multiple Random Starts
• Shows When Have Multiple Minima
• Range Should Turn Up Good Directions
• More to Look At / Interpret
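The multiple-random-starts idea can be sketched as follows. This is an illustration under stated assumptions, not the software used for the slides: it uses a kurtosis-based one-unit fixed-point iteration on sphered data as a stand-in ICA search, and the function names (`whiten`, `one_unit_search`, `multi_start`), the toy sources, and the mixing matrix are all hypothetical.

```python
import numpy as np

def whiten(x):
    """Center and sphere the rows of x (n samples by d variables)."""
    xc = x - x.mean(axis=0)
    evals, evecs = np.linalg.eigh(np.cov(xc, rowvar=False))
    return xc @ evecs @ np.diag(evals ** -0.5) @ evecs.T

def excess_kurtosis(y):
    """Sample excess kurtosis (approximately 0 for Gaussian data)."""
    y = (y - y.mean()) / y.std()
    return np.mean(y ** 4) - 3.0

def one_unit_search(z, w0, n_iter=200, tol=1e-10):
    """Kurtosis-based fixed-point iteration started from direction w0."""
    w = w0 / np.linalg.norm(w0)
    for _ in range(n_iter):
        y = z @ w
        w_new = (z * (y ** 3)[:, None]).mean(axis=0) - 3.0 * w
        w_new /= np.linalg.norm(w_new)
        # Directions are compared up to sign.
        converged = 1.0 - abs(w_new @ w) < tol
        w = w_new
        if converged:
            break
    return w

def multi_start(z, n_starts, rng):
    """Run the search from several random starts; sort by |kurtosis|."""
    results = []
    for _ in range(n_starts):
        w = one_unit_search(z, rng.standard_normal(z.shape[1]))
        results.append((abs(excess_kurtosis(z @ w)), w))
    results.sort(key=lambda r: -r[0])
    return results

rng = np.random.default_rng(0)
n = 5000
# Hypothetical independent non-Gaussian sources and mixing matrix.
sources = np.column_stack([rng.uniform(-1.0, 1.0, n), rng.laplace(size=n)])
mixing = np.array([[1.0, 0.5], [0.3, 1.0]])
z = whiten(sources @ mixing.T)

results = multi_start(z, n_starts=10, rng=rng)
for score, w in results:
    print(round(score, 3), np.round(w, 3))
```

Listing the converged directions with their scores shows whether different starts end at different optima, and gives a range of candidate directions to inspect.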
More ICA Examples
FDA example - Parabolas Up and Down
ICA Solution 2: Multiple Random Starts
3rd IC Dir'n Looks Good
More ICA Examples
FDA example - Parabolas Up and Down
ICA Solution 2: Multiple Random Starts
2nd Looks Good
More ICA Examples
FDA example - Parabolas Up and Down
ICA Solution 2: Multiple Random Starts
Never Finds Clusters
ICA Overview
Interesting Method, has Potential
Great for Directions of Non-Gaussianity
E.g. Finding Outliers
Common Application Area: FMRI
Has Its Costs
Slippery Optimization
Interpretation Challenges
- HDLSS Asyrsquos Geometrical Representrsquon
- HDLSS Asyrsquos Geometrical Represenrsquotion
- 2nd Paper on HDLSS Asymptotics
- 0 Covariance is not independence
- 0 Covariance is not independence (2)
- HDLSS Asyrsquos Geometrical Represenrsquotion (2)
- HDLSS Math Stat of PCA
- HDLSS Math Stat of PCA (2)
- HDLSS Math Stat of PCA (3)
- HDLSS Asymptotics amp Kernel Methods
- HDLSS Asymptotics amp Kernel Methods (2)
- HDLSS Asymptotics amp Kernel Methods (3)
- HDLSS Additional Results
- Liu Twiddle ratios of subtypes
- HDLSS Data Combo Mathematics
- HDLSS Data Combo Mathematics (2)
- HDLSS Data Combo Mathematics (3)
- HDLSS Data Combo Mathematics (4)
- HDLSS Data Combo Mathematics (5)
- HDLSS Data Combo Mathematics (6)
- HDLSS Data Combo Mathematics (7)
- HDLSS Data Combo Mathematics (8)
- HDLSS Data Combo Mathematics (9)
- HDLSS Data Combo Mathematics (10)
- HDLSS Asymptotics
- HDLSS Asymptotics (2)
- HDLSS Asymptotics (3)
- HDLSS Asymptotics (4)
- The Future of HDLSS Asymptotics
- State of HDLSS Research
- Independent Component Analysis
- Independent Component Analysis (2)
- Independent Component Analysis (3)
- Independent Component Analysis (4)
- ICA Motivating Example
- ICA Motivating Example (2)
- ICA Motivating Example (3)
- ICA Motivating Example (4)
- ICA Motivating Example (5)
- ICA Motivating Example (6)
- ICA Motivating Example (7)
- ICA Motivating Example (8)
- ICA Motivating Example (9)
- ICA Motivating Example (10)
- ICA Motivating Example (11)
- ICA Motivating Example (12)
- ICA Motivating Example (13)
- ICA Motivating Example (14)
- ICA Motivating Example (15)
- ICA Motivating Example (16)
- ICA Motivating Example (17)
- ICA Motivating Example (18)
- ICA Motivating Example (19)
- ICA Motivating Example (20)
- ICA Motivating Example (21)
- ICA Motivating Example (22)
- ICA Motivating Example (23)
- ICA Motivating Example (24)
- ICA Motivating Example (25)
- ICA Motivating Example (26)
- ICA Motivating Example (27)
- ICA Motivating Example (28)
- ICA Motivating Example (29)
- ICA Motivating Example (30)
- ICA Motivating Example (31)
- ICA Motivating Example (32)
- ICA Motivating Example (33)
- ICA Motivating Example (34)
- ICA Algorithm
- ICA Algorithm (2)
- ICA Algorithm (3)
- ICA Algorithm (4)
- ICA Algorithm (5)
- ICA Algorithm (6)
- ICA Algorithm (7)
- ICA Algorithm (8)
- ICA Algorithm (9)
- ICA Algorithm (10)
- ICA Algorithm (11)
- ICA Algorithm (12)
- ICA Algorithm (13)
- ICA Algorithm (14)
- ICA Algorithm (15)
- ICA Algorithm (16)
- ICA Algorithm (17)
- ICA Algorithm (18)
- ICA Algorithm (19)
- ICA Algorithm (20)
- ICA Algorithm (21)
- ICA Algorithm (22)
- ICA Algorithm (23)
- ICA Algorithm (24)
- ICA Algorithm (25)
- ICA amp Non-Gaussianity
- ICA amp Non-Gaussianity (2)
- ICA amp Non-Gaussianity (3)
- ICA amp Non-Gaussianity (4)
- ICA amp Non-Gaussianity (5)
- ICA amp Non-Gaussianity (6)
- ICA amp Non-Gaussianity (7)
- ICA amp Non-Gaussianity (8)
- ICA amp Non-Gaussianity (9)
- ICA amp Non-Gaussianity (10)
- ICA amp Non-Gaussianity (11)
- More ICA Examples
- More ICA Examples (2)
- More ICA Examples (3)
- More ICA Examples (4)
- More ICA Examples (5)
- More ICA Examples (6)
- More ICA Examples (7)
- More ICA Examples (8)
- More ICA Examples (9)
- More ICA Examples (10)
- More ICA Examples (11)
- More ICA Examples (12)
- More ICA Examples (13)
- More ICA Examples (14)
- More ICA Examples (15)
- More ICA Examples (16)
- More ICA Examples (17)
- More ICA Examples (18)
- More ICA Examples (19)
- More ICA Examples (20)
- More ICA Examples (21)
- More ICA Examples (22)
- More ICA Examples (23)
- More ICA Examples (24)
- More ICA Examples (25)
- More ICA Examples (26)
- More ICA Examples (27)
- More ICA Examples (28)
- More ICA Examples (29)
- More ICA Examples (30)
- More ICA Examples (31)
- More ICA Examples (32)
- More ICA Examples (33)
- More ICA Examples (34)
- More ICA Examples (35)
- More ICA Examples (36)
- More ICA Examples (37)
- More ICA Examples (38)
- More ICA Examples (39)
- ICA Overview
-
More ICA Examples
Both Gaussian ndash ICA Scatterplot
No Clear
Good
Rotation
More ICA Examples
Both Gaussian ndash ICA Reconstruction
Is It
Bad
More ICA Examples
Both Gaussian ndash Original Signals
Check
Against
Original
More ICA Examples
Now Try FDA examples ndash Recall Parabolas
Curves As
Data Objects
More ICA Examples
Now Try FDA examples ndash Parabolas
PCA Gives
Interpretable
Decomposition
More ICA Examples
Now Try FDA examples ndash Parabolas
Sphering
Loses
Structure
ICA Finds
Outliers
More ICA Examples
FDA example ndash Parabolas w 2 Outliers
Same Curves
plus ldquoOutliersrdquo
More ICA Examples
FDA example ndash Parabolas w 2 Outliers
Impact all of
bull Mean
bull PC1 (slightly)
bull PC2 (dominant)
bull PC3 (tilt)
More ICA Examples
FDA example ndash Parabolas w 2 Outliers
ICA
Misses Main
Directions
But Finds
Outliers
(non-Gaussian)
More ICA Examples
FDA example – Parabolas Up and Down
Recall 2 Clear Clusters
More ICA Examples
FDA example – Parabolas Up and Down
PCA: Clusters & Other Structure
More ICA Examples
FDA example – Parabolas Up and Down
ICA Does Not Find Clusters
Reason: Random Start
More ICA Examples
FDA example – Parabolas Up and Down
ICA Scary Issue: Local Minima in Optimization
More ICA Examples
FDA example – Parabolas Up and Down
ICA Solution 1: Use PCA to Start
Worked Here, But Not Always
More ICA Examples
FDA example – Parabolas Up and Down
ICA Solution 2: Use Multiple Random Starts
• Shows When Have Multiple Minima
• Range Should Turn Up Good Directions
• More to Look At / Interpret
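The multiple-random-starts idea can be sketched on a toy problem. Everything here is an assumption for illustration, not the method behind the slides: the data are 2-d and already sphered, built from one uniform source and one two-cluster (plus or minus 1) source rotated by 20 degrees; the contrast is the absolute excess kurtosis of a projection; and `local_search` is a simple hill climber standing in for whatever optimizer an ICA implementation uses.

```python
import math
import random


def abs_excess_kurtosis(xs, ys, angle):
    """Contrast: |excess kurtosis| of the data projected onto direction `angle`."""
    c, s = math.cos(angle), math.sin(angle)
    proj = [c * x + s * y for x, y in zip(xs, ys)]
    n = len(proj)
    mean = sum(proj) / n
    m2 = sum((p - mean) ** 2 for p in proj) / n
    m4 = sum((p - mean) ** 4 for p in proj) / n
    return abs(m4 / (m2 * m2) - 3.0)


def local_search(contrast, start, step=0.2, tol=1e-3, max_iter=1000):
    """Hill climbing: move while the contrast improves, otherwise halve the step."""
    angle, value = start, contrast(start)
    for _ in range(max_iter):
        if step < tol:
            break
        best_value, best_angle = max(
            (contrast(angle + d), angle + d) for d in (step, -step))
        if best_value > value:
            angle, value = best_angle, best_value
        else:
            step /= 2.0
    return angle % math.pi, value  # directions are only defined up to sign


# Hypothetical unit-variance sources: uniform, and a two-cluster (+/-1) source.
rng = random.Random(1)
n = 2000
half_width = math.sqrt(3.0)
uniform = [rng.uniform(-half_width, half_width) for _ in range(n)]
clusters = [rng.choice((-1.0, 1.0)) for _ in range(n)]

# Rotate by 20 degrees; an orthogonal mix keeps the data sphered.
rot = math.radians(20.0)
xs = [math.cos(rot) * u - math.sin(rot) * v for u, v in zip(uniform, clusters)]
ys = [math.sin(rot) * u + math.cos(rot) * v for u, v in zip(uniform, clusters)]


def contrast(angle):
    return abs_excess_kurtosis(xs, ys, angle)


n_starts = 10
starts = [rng.uniform(0.0, math.pi) for _ in range(n_starts)]
results = [local_search(contrast, s) for s in starts]
values = [v for _, v in results]
best_angle, best_value = max(results, key=lambda r: r[1])

for s, (a, v) in zip(starts, results):
    print(f"start {math.degrees(s):6.1f} deg -> {math.degrees(a):6.1f} deg, "
          f"contrast {v:.2f}")
print(f"best of {n_starts} starts: {math.degrees(best_angle):.1f} deg, "
      f"contrast {best_value:.2f}")
```

In this construction the contrast has two local optima: one near the uniform direction (contrast about 1.2) and one near the two-cluster direction (contrast about 2.0). A single start can settle on either, which mirrors the "does not find clusters" behavior on a small scale. Listing the results of all starts shows when several optima exist, at the price of more output to look at and interpret.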
More ICA Examples
FDA example – Parabolas Up and Down
ICA Solution 2: Multiple Random Starts
3rd IC Dir’n Looks Good
More ICA Examples
FDA example – Parabolas Up and Down
ICA Solution 2: Multiple Random Starts
2nd Looks Good
More ICA Examples
FDA example – Parabolas Up and Down
ICA Solution 2: Multiple Random Starts
Never Finds Clusters
ICA Overview
Interesting Method, has Potential
Great for Directions of Non-Gaussianity
E.g. Finding Outliers
Common Application Area: fMRI
Has Its Costs:
Slippery Optimization
Interpretation Challenges
- HDLSS Asyrsquos Geometrical Representrsquon
More ICA Examples
FDA example - Parabolas w/ 2 Outliers
Same Curves plus "Outliers"
More ICA Examples
FDA example - Parabolas w/ 2 Outliers
Outliers Impact All of:
• Mean
• PC1 (slightly)
• PC2 (dominant)
• PC3 (tilt)
More ICA Examples
FDA example - Parabolas w/ 2 Outliers
ICA Misses Main Directions, But Finds Outliers (non-Gaussian)
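A minimal sketch of why a non-Gaussianity criterion picks out outliers while a variance criterion does not. It uses a hypothetical 2-D toy data set (not the parabola curves from the slide): a high-variance Gaussian "main direction" along x, and a low-variance direction along y that carries two outliers. Excess kurtosis stands in here for whatever non-Gaussianity measure the ICA implementation actually uses; all names and values are assumptions for illustration.

```python
import random

def variance(values):
    """Population variance of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n

def excess_kurtosis(values):
    """Fourth standardized moment minus 3 (about 0 for Gaussian data)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0

rng = random.Random(0)

# Hypothetical toy data: 198 Gaussian points, wide in x and narrow in y.
bulk = [(rng.gauss(0.0, 3.0), rng.gauss(0.0, 1.0)) for _ in range(198)]
# Two outliers, far out along y only.
outliers = [(rng.gauss(0.0, 3.0), 10.0), (rng.gauss(0.0, 3.0), 12.0)]
data = bulk + outliers

xs = [p[0] for p in data]
ys = [p[1] for p in data]

var_x, var_y = variance(xs), variance(ys)
kurt_x, kurt_y = excess_kurtosis(xs), excess_kurtosis(ys)

# Variance (the PCA criterion) favors x; kurtosis (non-Gaussianity) favors y.
print(f"variance:        x = {var_x:.2f}, y = {var_y:.2f}")
print(f"excess kurtosis: x = {kurt_x:.2f}, y = {kurt_y:.2f}")
```

In this toy setting the x direction wins on variance, while the y direction, which is driven by the two outliers, wins on kurtosis. That is the behavior the slide describes, under the stated assumptions.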
More ICA Examples
FDA example - Parabolas Up and Down
Recall: 2 Clear Clusters
More ICA Examples
FDA example - Parabolas Up and Down
PCA: Clusters & Other Structure
More ICA Examples
FDA example - Parabolas Up and Down
ICA Does Not Find Clusters
Reason: Random Start
More ICA Examples
FDA example - Parabolas Up and Down
ICA Scary Issue: Local Minima in Optimization
More ICA Examples
FDA example - Parabolas Up and Down
ICA Solution 1: Use PCA to Start
Worked Here, But Not Always
More ICA Examples
FDA example - Parabolas Up and Down
ICA Solution 2: Use Multiple Random Starts
• Shows When Have Multiple Minima
• Range Should Turn Up Good Directions
• More to Look At / Interpret
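The multiple-random-starts idea can be sketched on a small example. The code below is an illustration under stated assumptions, not the ICA algorithm used for the slides. It builds hypothetical 2-D data with two clusters along x and rare spikes along y, and scores a direction by the absolute excess kurtosis of the projection. It then runs a simple hill climb on the projection angle from several random starts. Each run stops at whichever local optimum is nearest, so different starts can report different directions. The data are roughly uncorrelated with comparable scales, so whitening is skipped.

```python
import math
import random

def excess_kurtosis(values):
    """Fourth standardized moment minus 3 (about 0 for Gaussian data)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0

def make_toy_data(n, rng):
    """Hypothetical data: two clusters along x, rare spikes along y."""
    data = []
    for _ in range(n):
        x = rng.choice((-1.0, 1.0)) + rng.gauss(0.0, 0.3)  # cluster centers at -1, +1
        if rng.random() < 0.1:
            y = rng.choice((-3.0, 3.0))                     # rare large spike
        else:
            y = rng.gauss(0.0, 0.3)
        data.append((x, y))
    return data

def non_gaussianity(theta, data):
    """|excess kurtosis| of the data projected onto angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return abs(excess_kurtosis([c * x + s * y for x, y in data]))

def local_search(theta, data, step=0.2, tol=1e-3):
    """Hill climb on the angle; halve the step when neither neighbor improves."""
    best = non_gaussianity(theta, data)
    while step > tol:
        for candidate in (theta + step, theta - step):
            value = non_gaussianity(candidate, data)
            if value > best:
                theta, best = candidate, value
                break
        else:
            step /= 2.0
    return theta % math.pi, best

rng = random.Random(1)
data = make_toy_data(1000, rng)

# Several random starting angles; each run may end in a different local optimum.
starts = [rng.uniform(0.0, math.pi) for _ in range(8)]
results = [local_search(t, data) for t in starts]
for start, (theta, value) in zip(starts, results):
    print(f"start {start:.2f} -> angle {theta:.2f}, |excess kurtosis| {value:.2f}")

best_theta, best_value = max(results, key=lambda r: r[1])
print(f"best over all starts: angle {best_theta:.2f}, value {best_value:.2f}")
```

Printing every run, not only the best, follows the bullets above: the spread of end points shows when there are multiple optima, and it gives more directions to look at and interpret.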
More ICA Examples
FDA example - Parabolas Up and Down
ICA Solution 2: Multiple Random Starts
3rd IC Dir'n Looks Good
More ICA Examples
FDA example - Parabolas Up and Down
ICA Solution 2: Multiple Random Starts
2nd Looks Good
More ICA Examples
FDA example - Parabolas Up and Down
ICA Solution 2: Multiple Random Starts
Never Finds Clusters
ICA Overview
Interesting Method, Has Potential
Great for Directions of Non-Gaussianity
E.g. Finding Outliers
Common Application Area: fMRI
Has Its Costs:
Slippery Optimization
Interpretation Challenges
- HDLSS Asyrsquos Geometrical Representrsquon
- HDLSS Asyrsquos Geometrical Represenrsquotion
- 2nd Paper on HDLSS Asymptotics
- 0 Covariance is not independence
- 0 Covariance is not independence (2)
- HDLSS Asyrsquos Geometrical Represenrsquotion (2)
- HDLSS Math Stat of PCA
- HDLSS Math Stat of PCA (2)
- HDLSS Math Stat of PCA (3)
- HDLSS Asymptotics amp Kernel Methods
- HDLSS Asymptotics amp Kernel Methods (2)
- HDLSS Asymptotics amp Kernel Methods (3)
- HDLSS Additional Results
- Liu Twiddle ratios of subtypes
- HDLSS Data Combo Mathematics
- HDLSS Data Combo Mathematics (2)
- HDLSS Data Combo Mathematics (3)
- HDLSS Data Combo Mathematics (4)
- HDLSS Data Combo Mathematics (5)
- HDLSS Data Combo Mathematics (6)
- HDLSS Data Combo Mathematics (7)
- HDLSS Data Combo Mathematics (8)
- HDLSS Data Combo Mathematics (9)
- HDLSS Data Combo Mathematics (10)
- HDLSS Asymptotics
- HDLSS Asymptotics (2)
- HDLSS Asymptotics (3)
- HDLSS Asymptotics (4)
- The Future of HDLSS Asymptotics
- State of HDLSS Research
- Independent Component Analysis
- Independent Component Analysis (2)
- Independent Component Analysis (3)
- Independent Component Analysis (4)
- ICA Motivating Example
- ICA Motivating Example (2)
- ICA Motivating Example (3)
- ICA Motivating Example (4)
- ICA Motivating Example (5)
- ICA Motivating Example (6)
- ICA Motivating Example (7)
- ICA Motivating Example (8)
- ICA Motivating Example (9)
- ICA Motivating Example (10)
- ICA Motivating Example (11)
- ICA Motivating Example (12)
- ICA Motivating Example (13)
- ICA Motivating Example (14)
- ICA Motivating Example (15)
- ICA Motivating Example (16)
- ICA Motivating Example (17)
- ICA Motivating Example (18)
- ICA Motivating Example (19)
- ICA Motivating Example (20)
- ICA Motivating Example (21)
- ICA Motivating Example (22)
- ICA Motivating Example (23)
- ICA Motivating Example (24)
- ICA Motivating Example (25)
- ICA Motivating Example (26)
- ICA Motivating Example (27)
- ICA Motivating Example (28)
- ICA Motivating Example (29)
- ICA Motivating Example (30)
- ICA Motivating Example (31)
- ICA Motivating Example (32)
- ICA Motivating Example (33)
- ICA Motivating Example (34)
- ICA Algorithm
- ICA Algorithm (2)
- ICA Algorithm (3)
- ICA Algorithm (4)
- ICA Algorithm (5)
- ICA Algorithm (6)
- ICA Algorithm (7)
- ICA Algorithm (8)
- ICA Algorithm (9)
- ICA Algorithm (10)
- ICA Algorithm (11)
- ICA Algorithm (12)
- ICA Algorithm (13)
- ICA Algorithm (14)
- ICA Algorithm (15)
- ICA Algorithm (16)
- ICA Algorithm (17)
- ICA Algorithm (18)
- ICA Algorithm (19)
- ICA Algorithm (20)
- ICA Algorithm (21)
- ICA Algorithm (22)
- ICA Algorithm (23)
- ICA Algorithm (24)
- ICA Algorithm (25)
- ICA & Non-Gaussianity
- ICA & Non-Gaussianity (2)
- ICA & Non-Gaussianity (3)
- ICA & Non-Gaussianity (4)
- ICA & Non-Gaussianity (5)
- ICA & Non-Gaussianity (6)
- ICA & Non-Gaussianity (7)
- ICA & Non-Gaussianity (8)
- ICA & Non-Gaussianity (9)
- ICA & Non-Gaussianity (10)
- ICA & Non-Gaussianity (11)
- More ICA Examples
- More ICA Examples (2)
- More ICA Examples (3)
- More ICA Examples (4)
- More ICA Examples (5)
- More ICA Examples (6)
- More ICA Examples (7)
- More ICA Examples (8)
- More ICA Examples (9)
- More ICA Examples (10)
- More ICA Examples (11)
- More ICA Examples (12)
- More ICA Examples (13)
- More ICA Examples (14)
- More ICA Examples (15)
- More ICA Examples (16)
- More ICA Examples (17)
- More ICA Examples (18)
- More ICA Examples (19)
- More ICA Examples (20)
- More ICA Examples (21)
- More ICA Examples (22)
- More ICA Examples (23)
- More ICA Examples (24)
- More ICA Examples (25)
- More ICA Examples (26)
- More ICA Examples (27)
- More ICA Examples (28)
- More ICA Examples (29)
- More ICA Examples (30)
- More ICA Examples (31)
- More ICA Examples (32)
- More ICA Examples (33)
- More ICA Examples (34)
- More ICA Examples (35)
- More ICA Examples (36)
- More ICA Examples (37)
- More ICA Examples (38)
- More ICA Examples (39)
- ICA Overview
-
More ICA Examples
FDA example – Parabolas Up and Down
ICA Scary Issue: Local Minima in Optimization
More ICA Examples
FDA example – Parabolas Up and Down
ICA Solution 1: Use PCA to Start
Worked Here, But Not Always
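A minimal Python sketch of the PCA start, restricted to two dimensions where the leading principal direction has a closed-form angle. The function name `pca_start_angle` and the toy inputs are assumptions for illustration and do not come from the slides; the returned angle would be passed as the initial direction to whatever ICA optimizer is in use.

```python
import math


def pca_start_angle(points):
    """Angle in radians, in [-pi/2, pi/2], of the leading principal axis of 2-D points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Entries of the 2x2 sample covariance matrix.
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # Closed-form orientation of the major axis of a 2x2 covariance matrix.
    return 0.5 * math.atan2(2.0 * sxy, sxx - syy)


if __name__ == "__main__":
    # Hypothetical toy data lying along the diagonal y = x.
    diagonal = [(float(t), float(t)) for t in range(-5, 6)]
    print(pca_start_angle(diagonal))
```

The leading PCA direction maximizes variance rather than non-Gaussianity, so this start can still land in the wrong basin of the ICA objective, which is consistent with "Worked Here, But Not Always".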
More ICA Examples
FDA example – Parabolas Up and Down
ICA Solution 2: Use Multiple Random Starts
• Shows When Have Multiple Minima
• Range Should Turn Up Good Directions
• More to Look At / Interpret
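The multiple-random-starts idea can be sketched in Python on a toy 2-D problem. Everything here is an assumption for illustration: the non-Gaussianity measure (absolute excess kurtosis), the crude step-halving hill climb, the toy data (one uniform and one Laplace source, already unmixed), and all function names. It is not the algorithm used in the slides. Each start returns the direction it converged to, so distinct local optima show up as distinct angles.

```python
import math
import random


def excess_kurtosis(values):
    """Sample excess kurtosis (fourth standardized moment minus 3)."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    var = sum(c * c for c in centered) / n
    fourth = sum(c ** 4 for c in centered) / n
    return fourth / (var * var) - 3.0


def score(data, theta):
    """Absolute excess kurtosis of the data projected onto angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return abs(excess_kurtosis([c * x + s * y for x, y in data]))


def local_search(data, theta, step=0.2, tol=1e-3):
    """Hill-climb on the score from theta, halving the step when neither neighbor improves."""
    best = score(data, theta)
    while step > tol:
        for cand in (theta + step, theta - step):
            val = score(data, cand)
            if val > best:
                theta, best = cand, val
                break
        else:
            step /= 2.0
    # Directions are defined up to sign, so report the angle modulo pi.
    return theta % math.pi, best


def multi_start(data, n_starts=8, seed=0):
    """Run local_search from random angles; return results sorted by score, best first."""
    rng = random.Random(seed)
    results = [local_search(data, rng.uniform(0.0, math.pi)) for _ in range(n_starts)]
    return sorted(results, key=lambda r: -r[1])


def make_toy_data(n=4000, seed=1):
    """Hypothetical toy data: unit-variance uniform source on x, Laplace source on y."""
    rng = random.Random(seed)
    rate = math.sqrt(2.0)  # difference of two Exp(rate) draws is Laplace with variance 1
    return [
        (rng.uniform(-math.sqrt(3.0), math.sqrt(3.0)),
         rng.expovariate(rate) - rng.expovariate(rate))
        for _ in range(n)
    ]


if __name__ == "__main__":
    for theta, value in multi_start(make_toy_data()):
        print(f"angle = {theta:.3f} rad, |excess kurtosis| = {value:.3f}")
```

In this toy setting the score has one local optimum near the uniform axis (angle 0 modulo pi) and another near the Laplace axis (angle pi/2), so the spread of returned angles shows when multiple optima are present and gives several candidate directions to inspect.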
More ICA Examples
FDA example – Parabolas Up and Down
ICA Solution 2: Multiple Random Starts
3rd IC Dir'n Looks Good
More ICA Examples
FDA example – Parabolas Up and Down
ICA Solution 2: Multiple Random Starts
2nd Looks Good
More ICA Examples
FDA example – Parabolas Up and Down
ICA Solution 2: Multiple Random Starts
Never Finds Clusters
ICA Overview
Interesting Method, has Potential
Great for Directions of Non-Gaussianity
E.g. Finding Outliers
Common Application Area: FMRI
Has Its Costs:
Slippery Optimization
Interpretation Challenges
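To illustrate why a non-Gaussianity measure is sensitive to outliers, here is a small Python sketch comparing the excess kurtosis of a projection with and without one extreme point. Using excess kurtosis as the measure, and the toy values themselves, are assumptions for illustration only.

```python
def excess_kurtosis(values):
    """Sample excess kurtosis (fourth standardized moment minus 3)."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    var = sum(c * c for c in centered) / n
    fourth = sum(c ** 4 for c in centered) / n
    return fourth / (var * var) - 3.0


if __name__ == "__main__":
    # Hypothetical projected data: evenly spread values, then the same plus one outlier.
    clean = [i / 10.0 for i in range(-20, 21)]
    contaminated = clean + [15.0]
    print("clean:       ", excess_kurtosis(clean))
    print("with outlier:", excess_kurtosis(contaminated))
```

A single extreme point dominates the fourth moment, so a direction along which an outlier sticks out scores as strongly non-Gaussian; that is the sense in which such directions are good at finding outliers.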