Principal Component Analysis for SPAT PG course Joanna D. Haigh.
Principal Component Analysis
for SPAT PG course
Joanna D. Haigh
PCA also known as…
• Empirical Orthogonal Function (EOF) Analysis
• Singular Value Decomposition
• Hotelling Transform
• Karhunen-Loève Transform
11 Nov 2013
Purpose/applications
• To identify internal structure in a dataset (e.g. “modes of variability”)
• Data compression – by identifying redundancy, reducing dimensionality
• Noise reduction
• Feature identification, classification…
Basic approach
Data measured as function of two variables
• E.g. surface pressure (space, time)
• If measurements at two points in space are highly correlated in time then we only need one measure (not two) as a function of time to identify their behaviour.
• How many measures we need overall depends on correlations between each point and every other.
Correlations
[Figure: value at point 2 plotted against value at point 1, with the PC1 and PC2 axes marked]
• measurements at point 1 and point 2 highly correlated
• main (average) signal is the measure in the direction of PC1
• deviations (the interesting bit?) are in PC2
• to calculate PCs we need to rotate axes
• with M points just rotate in M dimensions
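The two-point picture can be sketched numerically. The snippet below is only an illustration under stated assumptions: the data are synthetic (a shared signal plus small independent noise, not from the course), and the names `point1`, `point2` and `angle_deg` are made up for the example. It rotates the axes using the eigenvectors of the 2 x 2 covariance matrix.

```python
import numpy as np

# Synthetic data (an assumption for illustration): two points whose values
# share a common signal plus small independent noise.
rng = np.random.default_rng(0)
n_times = 500
signal = rng.normal(size=n_times)
point1 = signal + 0.2 * rng.normal(size=n_times)
point2 = signal + 0.2 * rng.normal(size=n_times)

# Stack as a 2 x N matrix and remove the time mean at each point.
data = np.vstack([point1, point2])
anomalies = data - data.mean(axis=1, keepdims=True)

# 2 x 2 covariance matrix and its eigen-decomposition.
cov = anomalies @ anomalies.T / (n_times - 1)
eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]            # largest variance first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Rotating the axes: project the anomalies onto the new directions.
pcs = eigvecs.T @ anomalies                  # row 0 is PC1, row 1 is PC2

# Angle of the PC1 axis relative to the "value at point 1" axis,
# folded into [0, 180) because the sign of an eigenvector is arbitrary.
angle_deg = np.degrees(np.arctan2(eigvecs[1, 0], eigvecs[0, 0])) % 180.0

# Covariance of the rotated series: off-diagonal terms vanish.
pc_cov = np.cov(pcs)
```

Because both points carry the same signal with similar variance, the PC1 axis should lie near the diagonal of the scatter plot, and the rotated series are uncorrelated with most of the variance in PC1.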
Approach
E.g. data measured N times at M spatial points
In M-dimensional space
i. Find axis of greatest correlation, i.e. main variability, this is PC1.
ii. Find axis orthogonal to this of next highest variability, this is PC2.
iii. Continue until M new axes, i.e. M PCs.
Each PC is composed of a weighted average of the original axes. The weightings are the EOFs.
Concept
• Often it is possible to identify a particular mode/feature with an EOF.
• Each PC indicates the variation with time (in our example) of the mode identified with its EOF.
• Once EOFs established can project other datasets (e.g. different time periods) onto them to compare behaviours.
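Projecting a second dataset onto established EOFs might look like the sketch below. Everything in it is an assumption made for illustration: the two "periods" are synthetic, they share one spatial pattern, and the second period has twice the amplitude. The second period is referenced to the first period's mean so that both sets of PCs are measured on the same axes.

```python
import numpy as np

rng = np.random.default_rng(3)
n_points, n_times = 10, 200

# Hypothetical spatial pattern shared by both periods (unit length).
pattern = np.sin(np.linspace(0.0, np.pi, n_points))
pattern /= np.linalg.norm(pattern)
t = np.arange(n_times)

# Period A defines the EOFs; period B has the same mode at twice the amplitude.
noise_a = 0.1 * rng.normal(size=(n_points, n_times))
noise_b = 0.1 * rng.normal(size=(n_points, n_times))
period_a = np.outer(pattern, np.cos(2 * np.pi * t / 40)) + noise_a
period_b = np.outer(pattern, 2.0 * np.cos(2 * np.pi * t / 40)) + noise_b

# EOFs from period A only.
mean_a = period_a.mean(axis=1, keepdims=True)
anom_a = period_a - mean_a
eigvals, eigvecs = np.linalg.eigh(anom_a @ anom_a.T / (n_times - 1))
eofs = eigvecs[:, ::-1]          # eigh returns ascending order; reverse it

# Project both periods onto the same EOFs.
pcs_a = eofs.T @ anom_a
pcs_b = eofs.T @ (period_b - mean_a)

# Compare the behaviour of the leading mode in the two periods.
var_a = pcs_a[0].var(ddof=1)
var_b = pcs_b[0].var(ddof=1)
```

Here `var_b` comes out several times larger than `var_a`, reflecting the stronger mode in the second period.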
ENSO as EOF1 of SST data
• EOF1 of tropical Pacific SSTs: 576 monthly anomalies Jan 1950 - Dec 1997
• EOF1 explains 45% of the total SST variance over this domain.
http://www.esrl.noaa.gov/psd/enso/impacts/currentclimo.html
Maths
• Calculate M×M covariance matrix
• Find eigenvectors and eigenvalues
• EOFs are the M eigenvectors, ranked in order of decreasing eigenvalue
• Eigenvalues give measure of variance
• PCs from decomposition of data onto EOFs.
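The steps above can be sketched with NumPy. This is a minimal illustration, not the course's own code: the function name `eof_analysis` and the random test matrix are assumptions, and the PCs here are computed from the mean-removed data, which is one common convention.

```python
import numpy as np

def eof_analysis(data):
    """Return (eigenvalues, eofs, pcs) for an M x N data matrix.

    Rows are the M variables (e.g. spatial points); columns are the
    N samples (e.g. times). EOFs are the columns of `eofs`.
    """
    data = np.asarray(data, dtype=float)
    n_samples = data.shape[1]

    # Remove the mean of each variable, then form the M x M covariance matrix.
    anomalies = data - data.mean(axis=1, keepdims=True)
    cov = anomalies @ anomalies.T / (n_samples - 1)

    # eigh suits symmetric matrices; it returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(cov)

    # Rank the eigenvectors in order of decreasing eigenvalue.
    order = np.argsort(eigvals)[::-1]
    eigvals = eigvals[order]
    eofs = eigvecs[:, order]

    # PCs: decomposition of the (mean-removed) data onto the EOFs.
    pcs = eofs.T @ anomalies
    return eigvals, eofs, pcs

# Hypothetical example: 4 variables, 200 samples, two of them correlated.
rng = np.random.default_rng(1)
example = rng.normal(size=(4, 200))
example[1] += 2.0 * example[0]

eigvals, eofs, pcs = eof_analysis(example)
explained = eigvals / eigvals.sum()   # fraction of variance per EOF
```

The eigenvalues sum to the total variance of the data, so `explained` gives the share of variance carried by each EOF.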
Examples of applications
| Application | M | N | Visualise data | EOFs: weightings of | PCs |
|---|---|---|---|---|---|
| Meteorology | space | time | Time series at each place (or map at each time) | places (maps) | Time series of EOFs maps |
| Earth obs (e.g. land cover) | spectral bands | space | Map in each wavelength band | bands | Maps of band combos |
| Earth obs (e.g. cloud) | cases | wavelength | Spectrum for each case | cases | Spectra of case combos |
| Polarity of IMF | Solar longitude | time | IMF polarity f(longitude) at each time | longitudes | Time series of lon. distbn |
High cloud E. Asia
Kang et al (1997)
Southern Annular Mode
geopotential height of 1000hPa surface
Landsat Thematic Mapper (Wageningen)
[Figure: image panels for the bands at 0.5, 0.6, 0.7, 0.8, 1.6 and 2.2 µm]
example of TM EOFs (unnormalised)
[NB not for Wageningen images]
Bands (µm): 0.5, 0.6, 0.7, 0.8, 1.6, 2.2, 11.5
EOF: 1 2 3 4 5 6 7
eigenvalues: 1011 131 38 7 4 2 1
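Since the eigenvalues measure variance, the share of variance carried by each EOF follows from dividing by their sum. The short sketch below uses the seven eigenvalues quoted above; the variable names are illustrative only.

```python
# Eigenvalues quoted on the slide for the seven TM EOFs.
eigenvalues = [1011, 131, 38, 7, 4, 2, 1]

total = sum(eigenvalues)
fractions = [ev / total for ev in eigenvalues]

# Running total of the variance explained by the first k EOFs.
cumulative = []
running = 0.0
for frac in fractions:
    running += frac
    cumulative.append(running)

for k, (frac, cum) in enumerate(zip(fractions, cumulative), start=1):
    print(f"EOF {k}: {100 * frac:5.1f}% of variance, {100 * cum:5.1f}% cumulative")
```

For these values the first EOF carries about 85% of the total variance and the first three together about 99%, which is why the higher-order components can often be dropped.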
Modelled IR spectra of cirrus cloud
Bantges et al (1999)
PC0: Average
PC1: Ice water path
PC2: Effective radius
PC3: Aspect ratio
Bantges et al (1999)
Polarity of Interplanetary Magnetic Field
Cadavid et al (2007)
Maths – a little more detail
Represent data by M×N matrix $D$
M×M covariance matrix is $C = (D - \bar{D})(D - \bar{D})^T$, where $\bar{D}$ is the mean of $D$ over the N samples
Calculate $i = 1, \dots, M$ eigenvalues $\lambda_i$ & eigenvectors $v_i$
EOFs in M×M matrix of eigenvectors $E$
M×N matrix of PCs $P = E^T D$
NB can rewrite $D = (E^T)^{-1} P = E P$ ($E$ orthogonal, because $C$ is symmetric), i.e. PCs give weighting of EOFs in data
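These identities can be checked numerically. The sketch below is an illustration on a small random matrix (an assumption, not course data) and follows the slide's notation with the unnormalised covariance matrix. It also shows the link to Singular Value Decomposition: the left singular vectors of the mean-removed data match the EOFs up to sign, and the squared singular values match the eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(4)
M, N = 3, 50

# Hypothetical M x N data matrix with a non-zero mean in each row.
D = rng.normal(size=(M, N)) + np.array([[1.0], [2.0], [3.0]])

# Unnormalised covariance matrix, as written on the slide.
A = D - D.mean(axis=1, keepdims=True)
C = A @ A.T

# Eigenvalues and eigenvectors, reordered so the largest comes first.
lam, E = np.linalg.eigh(C)
lam, E = lam[::-1], E[:, ::-1]

# PCs and the reconstruction of the data from them.
P = E.T @ D
D_rebuilt = E @ P            # equals D because E is orthogonal

# Same decomposition via SVD of the mean-removed data.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
```

The match between `U` and `E` holds column by column only up to an arbitrary sign, and assumes the eigenvalues are distinct.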
Data reduction/noise removal
• Higher order PCs are composed of lowest correlations so uncorrelated noise lies in these.
• Can reconstruct data omitting higher order EOFs to reduce noise.
• Can reduce data by keeping only PCs of lowest order EOFs.
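A truncated reconstruction might be sketched as follows. The data are synthetic and purely illustrative: two smooth spatial modes with known time series, plus uncorrelated noise, so that the reconstruction can be compared against the known clean signal. The choice of `n_keep = 2` is an assumption that matches how the example was built; with real data the cut-off has to be judged from the eigenvalue spectrum.

```python
import numpy as np

rng = np.random.default_rng(2)
n_points, n_times = 20, 300
x = np.linspace(0.0, 1.0, n_points)
t = np.arange(n_times)

# Hypothetical clean signal: two spatial modes with different time behaviour.
clean = (np.outer(np.sin(np.pi * x), 3.0 * np.cos(2 * np.pi * t / 50))
         + np.outer(np.sin(2 * np.pi * x), 1.5 * np.sin(2 * np.pi * t / 17)))
noisy = clean + 0.5 * rng.normal(size=clean.shape)

# EOFs of the noisy data, ranked by decreasing eigenvalue.
mean = noisy.mean(axis=1, keepdims=True)
anomalies = noisy - mean
cov = anomalies @ anomalies.T / (n_times - 1)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eofs = eigvecs[:, order]

# Keep only the PCs of the lowest-order EOFs (data reduction) ...
n_keep = 2
leading = eofs[:, :n_keep]
pcs_kept = leading.T @ anomalies          # n_keep x N instead of M x N

# ... and rebuild the data from them (noise reduction).
reconstructed = mean + leading @ pcs_kept

def rms(a):
    """Root-mean-square value of an array."""
    return float(np.sqrt(np.mean(a ** 2)))

err_noisy = rms(noisy - clean)
err_recon = rms(reconstructed - clean)
```

In this setup `err_recon` is smaller than `err_noisy`, since most of the uncorrelated noise sits in the discarded higher-order components, while the stored data shrink from an M×N matrix to `n_keep` PCs plus the retained EOFs and the mean.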
Books
R W Preisendorfer (1988) Principal Component Analysis in Meteorology and Oceanography, Elsevier
I T Jolliffe (2002) Principal Component Analysis, Springer