Signal Processing and Analysis: Image analysis III

Benny Thörnberg, Associate professor in electronics

Copyright (c) Benny Thörnberg


Outline

•Shape and texture of objects

•Computation of image gradients

•EHD – Edge Histogram Descriptor

•HOG – Histogram of Oriented Gradients

•Optical Character Recognition

•Fundamental steps of OCR

•Training sets

•Minimum distance classifier

•Extension of feature vector to improve OCR

•Summary of performance for OCR

•Principal Component Analysis


Shape and texture of objects

Shape and texture seem to capture enough information for a human to distinguish between the different kinds of objects present in an image.

Is there a method that computes a descriptor that is compact, yet still provides enough information for a computerized classifier to identify objects in pictures?


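Computation of image gradients

Convolving image I with the Sobel matrices gives the gradient vector $G = (G_X, G_Y)$:

$$G_X = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix} * I, \qquad G_Y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix} * I$$

Gradient magnitude (the second form is the approximation used for the Sobel operator):

$$\mathrm{mag}(G) = |G| = \sqrt{G_X^2 + G_Y^2} \approx |G_X| + |G_Y|$$

Orientation (angle):

$$\theta = \cos^{-1}\left(\frac{G_X}{\mathrm{mag}(G)}\right)$$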
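The gradient computation can be sketched in Python with NumPy. This is a minimal illustration, not the lecture's implementation: the function names `filter3x3` and `sobel_gradient` are hypothetical, image borders are handled by replicating edge pixels (an assumption), and the kernel is applied without flipping, which differs from a strict convolution only in the sign of the result for these kernels.

```python
import numpy as np

# Sobel matrices for the horizontal and vertical gradient components
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=float)


def filter3x3(image, kernel):
    """Slide a 3x3 kernel over the image; border pixels are replicated."""
    image = np.asarray(image, dtype=float)
    rows, cols = image.shape
    padded = np.pad(image, 1, mode="edge")
    out = np.zeros((rows, cols))
    for dr in range(3):
        for dc in range(3):
            out += kernel[dr, dc] * padded[dr:dr + rows, dc:dc + cols]
    return out


def sobel_gradient(image):
    """Return G_X, G_Y, exact magnitude, approximate magnitude, angle in degrees."""
    gx = filter3x3(image, SOBEL_X)
    gy = filter3x3(image, SOBEL_Y)
    mag = np.sqrt(gx ** 2 + gy ** 2)
    mag_approx = np.abs(gx) + np.abs(gy)
    # theta = arccos(G_X / mag(G)); the angle is set to 0 where the gradient vanishes
    safe = np.where(mag > 0, mag, 1.0)
    theta = np.degrees(np.arccos(np.clip(gx / safe, -1.0, 1.0)))
    theta[mag == 0] = 0.0
    return gx, gy, mag, mag_approx, theta
```

Because the angle comes from an inverse cosine, it falls in the range 0 to 180 degrees.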


EHD – Edge Histogram Descriptor

If the statistical distribution of all gradient vectors in a neighborhood is collected into a histogram, we have created a descriptor that captures the local salient texture of the image.

But how can we create a descriptor that also captures the shape of an object, such as the woman in the picture?


EHD – Edge Histogram Descriptor

[Figure: the image is divided into 32 blocks, numbered 1 to 32 in 8 rows of 4. The 32 block histograms are concatenated in the same order into one long vector.]

Divide the image into sub-image blocks. Compute a histogram for each block and append all histograms into one large vector. This creates a spatial coding, so that the combined histograms can capture information about the global salient shapes of objects.
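The block-wise construction can be sketched as follows. The function name `blockwise_histogram`, the 8 x 4 grid, the 8 bins and the 0 to 180 degree range are illustrative assumptions, and each block histogram is normalized to sum to one, which the slide does not specify.

```python
import numpy as np


def blockwise_histogram(angles_deg, grid=(8, 4), bins=8):
    """Concatenate one orientation histogram per block into a single vector."""
    angles_deg = np.asarray(angles_deg, dtype=float)
    rows, cols = angles_deg.shape
    # Block boundaries along rows and columns
    row_edges = np.linspace(0, rows, grid[0] + 1).astype(int)
    col_edges = np.linspace(0, cols, grid[1] + 1).astype(int)
    parts = []
    for r in range(grid[0]):
        for c in range(grid[1]):
            block = angles_deg[row_edges[r]:row_edges[r + 1],
                               col_edges[c]:col_edges[c + 1]]
            hist, _ = np.histogram(block, bins=bins, range=(0.0, 180.0))
            # Normalize so that blocks of different size are comparable
            parts.append(hist / max(hist.sum(), 1))
    return np.concatenate(parts)
```

With these assumed settings the descriptor has 32 x 8 = 256 elements, and the position of each histogram within the vector encodes where in the image it was measured.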


EHD – Edge Histogram Descriptor

[Figure: the concatenated histograms of blocks 1 to 32 are fed to a classifier that outputs the type of object (human, giraffe, bird, …). Large datasets of images are used for training.]

L. Touil, A.B. Abdelali and M. Abdelatif, “A hardware acceleration of real time video processing”, Proc. of 16th IEEE Mediterranean Electrotechnical Conference, 28 March 2012, Yasmine Hammamet, Tunisia.

H. Ayad, S.N.H.S. Abdulah and A. Abdullah, “Visual Object Categorization based on Orientation Descriptor”, Proc. of 6th Asia Modelling Symposium (AMS 2012), 29-31 May 2012, Bali, Indonesia.


Histograms of Oriented Gradients - HOG

Reference: Navneet Dalal and Bill Triggs, “Histograms of Oriented Gradients for Human Detection”

[Figure: a detection window divided into cells, which are grouped into overlapping blocks.]

Processing chain:

•Input image

•Normalize gamma and colour

•Compute gradient vectors

•Weighted vote into histograms of gradient orientations, one histogram for each cell

•Contrast normalize cells within overlapping blocks

•Collect HOGs into a descriptive vector for the whole detection window

•Linear classifier (SVM)

•Output: human detected or no human


Histograms of Oriented Gradients - HOG

Reference: Navneet Dalal and Bill Triggs, “Histograms of Oriented Gradients for Human Detection”

Vector v is a normalized block histogram, computed from the histogram $V_k$ that is built from all cell histograms belonging to a single block:

$$v = \frac{V_k}{\lVert V_k \rVert}$$

Finally, a long feature vector for the whole detection window is built by appending all normalized block histograms.

[Figure: the HOG processing chain, illustrated with the following sizes.]

Cell = 5x5 pixels

Block = 3x3 cells

Detection window = 128x64 pixels
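Block normalization and concatenation can be sketched as below. This is an illustration under stated assumptions: the function name `hog_descriptor` is hypothetical, blocks slide one cell at a time, and an L2 norm with a small constant `eps` (to avoid division by zero) is assumed as the norm.

```python
import numpy as np


def hog_descriptor(cell_hists, block=(2, 2), eps=1e-6):
    """Normalize overlapping blocks of cell histograms and concatenate them.

    cell_hists has shape (cell rows, cell columns, bins).
    """
    cell_hists = np.asarray(cell_hists, dtype=float)
    n_rows, n_cols, _ = cell_hists.shape
    normalized = []
    for r in range(n_rows - block[0] + 1):
        for c in range(n_cols - block[1] + 1):
            # V_k: all cell histograms belonging to this block
            v_k = cell_hists[r:r + block[0], c:c + block[1], :].ravel()
            # v = V_k / ||V_k|| (L2 norm assumed here)
            normalized.append(v_k / (np.linalg.norm(v_k) + eps))
    return np.concatenate(normalized)
```

Because neighbouring blocks overlap, each cell histogram appears several times in the final vector, each time normalized against a different neighbourhood.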


Histograms of Oriented Gradients - HOG

Examples from a training set of 924 images made available by MIT (Massachusetts Institute of Technology)

Ref: http://cbcl.mit.edu/software-datasets/PedestrianData.html

This “large” collection of images can be used to train a classifier to recognise pedestrians in images based on computed image descriptors. It provides a test bench for researchers to compare performance of different methods for pedestrian detection.


Histograms of Oriented Gradients - HOG

Reference: Navneet Dalal and Bill Triggs, “Histograms of Oriented Gradients for Human Detection”

Dalal and Triggs evaluated performance and achieved a miss rate of 10% at $10^{-4}$ false positives per window.

Cell size was 4x4 pixels and block size was 2x2 cells, block stride = 8 pixels and detection window = 64 x 128 pixels

Voting for 9 bins per local HOG using weights linearly proportional to gradient strength
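The weighted vote for a single cell can be sketched as follows. The function name `vote_cell` is hypothetical, and the sketch assumes angles in the range 0 to 180 degrees with each pixel voting only for the bin its angle falls into (no interpolation between neighbouring bins).

```python
import numpy as np


def vote_cell(angles_deg, magnitude, bins=9):
    """Build one cell histogram where each pixel votes with its gradient magnitude."""
    angles = np.asarray(angles_deg, dtype=float).ravel()
    weights = np.asarray(magnitude, dtype=float).ravel()
    width = 180.0 / bins
    # Bin index per pixel; an angle of exactly 180 degrees goes into the last bin
    index = np.minimum((angles // width).astype(int), bins - 1)
    return np.bincount(index, weights=weights, minlength=bins)
```

A strong edge therefore contributes more to its orientation bin than a weak one, in proportion to its gradient strength.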


Optical Character Recognition - OCR

We will investigate and show how scanned copies of printed letters can automatically be recognised as a string of letters.


Fundamental steps of the OCR system

Image acquisition

Preprocessing

Segmentation

Feature extraction

Classification

Labeling

Image acquisition: a scanner captures images of paper sheets on which sequences of letters are printed.

Preprocessing: the gradient vector is computed using the Sobel matrices.

Segmentation: the gradient image is segmented into a binary image by thresholding the gradient magnitude.

Labeling: labeling identifies each letter as a single image component.
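Segmentation and labeling can be sketched together. The names `segment` and `label_components` are hypothetical, the threshold value is left to the caller, and 4-connectivity with a breadth-first flood fill is an assumed choice; the slides do not state which labeling algorithm is used.

```python
from collections import deque

import numpy as np


def segment(magnitude, threshold):
    """Binary image: True where the gradient magnitude exceeds the threshold."""
    return np.asarray(magnitude) > threshold


def label_components(binary):
    """Give every 4-connected component of True pixels its own integer label."""
    rows, cols = binary.shape
    labels = np.zeros((rows, cols), dtype=int)
    current = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r, c] and labels[r, c] == 0:
                current += 1
                labels[r, c] = current
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny, nx] and labels[ny, nx] == 0):
                            labels[ny, nx] = current
                            queue.append((ny, nx))
    return labels, current
```

Label 0 marks the background, and each letter ends up with one label as long as its thresholded outline is connected.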

Feature extraction: histograms of the gradient orientations (angles) are built from the gradient image produced at preprocessing, using only pixels that belong to an image component. This step generates an Edge Histogram Descriptor (EHD) for each segmented letter.
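A per-letter EHD can be sketched as a histogram over the pixels of one labeled component. The function name `component_ehd` and the default of 9 bins over 0 to 180 degrees are assumptions; the histogram is normalized to sum to one so that it reads as a probability distribution.

```python
import numpy as np


def component_ehd(angles_deg, labels, label, bins=9):
    """Orientation histogram over the pixels of one labeled image component."""
    mask = np.asarray(labels) == label
    hist, _ = np.histogram(np.asarray(angles_deg, dtype=float)[mask],
                           bins=bins, range=(0.0, 180.0))
    # Normalize to probabilities so letters of different size are comparable
    return hist / max(hist.sum(), 1)
```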


EHD from training sets for letters R and B

[Figure: two plots titled “Histogram Of Gradients”, one per letter. X-axis: gradient angle [degrees], 0 to 180. Y-axis: probability.]

EHD from training sets for letters T and S

[Figure: two plots titled “Histogram Of Gradients”, one per letter. X-axis: gradient angle [degrees], 0 to 180. Y-axis: probability.]

EHD from training sets for letter O

[Figure: plot titled “Histogram Of Gradients”. X-axis: gradient angle [degrees], 0 to 180. Y-axis: probability.]

Feature vectors for letters R,B,T,S and O

[Figure: five plots titled “Histogram Of Gradients”, one for each of the letters R, B, T, S and O. X-axis: gradient angle [degrees], 0 to 180. Y-axis: probability.]

These graphs are generated for all letters within a training set. The width of each line reveals the statistical distribution among letters belonging to the same class.


Classification

From the EHD feature vectors and their statistical distribution over a training set of images, we can classify each feature vector as belonging to a letter (class) with an estimated probability of correctness.


Classification

The graph shows a three-dimensional feature space with five clearly separable classes.


Minimum Distance Classifier

H. Lin and A.N. Venetsanopoulos, “A Weighted Minimum Distance Classifier for Pattern Recognition”, Canadian Conference on Electrical and Computer Engineering, vol. 2, pp. 904-907, 1993.

Compute the Euclidean distances from the feature vector to the mean vectors of all classes, then select the class giving the shortest distance.
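A minimum distance classifier can be sketched in a few lines. The names `train_class_means` and `classify` are hypothetical, and this is the plain unweighted variant: each class is represented only by the mean of its training feature vectors.

```python
import numpy as np


def train_class_means(features, labels):
    """Compute one mean feature vector per class from a training set."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    classes = sorted(set(labels.tolist()))
    means = np.array([features[labels == c].mean(axis=0) for c in classes])
    return classes, means


def classify(x, classes, means):
    """Select the class whose mean vector is closest to x (Euclidean distance)."""
    distances = np.linalg.norm(means - np.asarray(x, dtype=float), axis=1)
    return classes[int(np.argmin(distances))]
```

A possible usage: call `train_class_means` once on the training set, then call `classify` for the feature vector of every segmented letter.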


Classification of a string of letters

[Figure: an input image of letters and the recognised output string SRBRTSTTRROOBBOOSS.]

In experiments, extending the EHD feature vector to 15 bins covering 360 degrees gave an estimated classification success rate of 70 percent for handwritten capital letters A to Z.

Reference: Bala Subramanyam and Kassahun Frew, “Hardware Centric Original Character Recognition”


EHD feature for machine printed letters

If machine printed letters are used instead, the classification success rate improves to 87 percent. Letters such as R and B are still hard to tell apart.

Reference: Bala Subramanyam and Kassahun Frew, “Hardware Centric Original Character Recognition”


Extension of the feature vector

How can the EHD feature vector be extended with additional features to make it easier to distinguish between letters such as R and B?

One such feature is the displacement vector between the centre of gravity and the centre of the bounding box. It adds two elements to the feature vector.

If this vector is given as a fraction of the bounding box side length, the feature becomes scale invariant.
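The displacement feature can be sketched as follows. The function name `displacement_feature` is hypothetical, the input is assumed to be a binary mask of one letter, and each component is divided by the bounding box width or height respectively (one possible reading of "side length").

```python
import numpy as np


def displacement_feature(mask):
    """Offset of the centre of gravity from the bounding box centre,
    expressed as a fraction of the bounding box width and height."""
    ys, xs = np.nonzero(mask)
    top, bottom = ys.min(), ys.max()
    left, right = xs.min(), xs.max()
    width = right - left + 1
    height = bottom - top + 1
    box_cx = (left + right) / 2.0
    box_cy = (top + bottom) / 2.0
    # Centre of gravity is the mean pixel position of the letter
    return np.array([(xs.mean() - box_cx) / width,
                     (ys.mean() - box_cy) / height])
```

Doubling the size of the letter leaves both elements unchanged, which is the scale invariance described above.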

Reference: Bala Subramanyam and Kassahun Frew, “Hardware Centric Original Character Recognition”


Extension of feature vector

A four-bin zonal histogram is built from the probabilities of finding the letter within each of the four indicated areas. These zones are defined from the bounding box parameters.

This feature has four dimensions and thus adds four additional elements to the feature vector.
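A zonal histogram can be sketched as below. The layout of the four zones is an assumption here: the bounding box is split into four quadrants, and each bin holds the fraction of letter pixels that fall into one quadrant. The function name `zonal_histogram` is hypothetical.

```python
import numpy as np


def zonal_histogram(mask):
    """Fraction of letter pixels in each quadrant of the bounding box
    (order: top-left, top-right, bottom-left, bottom-right)."""
    mask = np.asarray(mask).astype(bool)
    ys, xs = np.nonzero(mask)
    box = mask[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    height, width = box.shape
    mid_r, mid_c = (height + 1) // 2, (width + 1) // 2
    zones = [box[:mid_r, :mid_c], box[:mid_r, mid_c:],
             box[mid_r:, :mid_c], box[mid_r:, mid_c:]]
    counts = np.array([zone.sum() for zone in zones], dtype=float)
    return counts / counts.sum()
```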

Reference: Bala Subramanyam and Kassahun Frew, “Hardware Centric Original Character Recognition”


After extension of the feature vector

After adding the displacement vector and the zonal histogram, the classification success rate becomes 86 percent for handwritten letters and close to 100 percent for machine printed capital letters.

Reference: Bala Subramanyam and Kassahun Frew, “Hardware Centric Original Character Recognition”


Summary of performance for OCR

Reference: Bala Subramanyam and Kassahun Frew, “Hardware Centric Original Character Recognition”

Classification success rate in percent:

Letters            EHD    EHD + Geometrical
Hand written       70     86
Machine printed    87     ~100


Overview of feature space

[Figure: 3D scatter plot with axes PCA comp 1, PCA comp 2 and PCA comp 3, showing data clusters for the letters B, R, S and T.]

How can we get an overview of a feature space that has 21 dimensions, as in the previous OCR example (15 EHD bins, 2 displacement elements and 4 zonal histogram bins)?

Typically, there is correlation between the variables in multi-dimensional data used as input to classification. If so, a large part of the variance can be described by projecting, for example, 21-dimensional data onto two or three new variables. We call the new variables principal components.

The 3D example graph shows data clusters corresponding to the letters B, R, S and T. This graph still describes 58% of the variance of the original data.

Principal Component Analysis: Input data

[Figure: a data vector plotted in a space with three coordinate axes.]

$x_n$ is a data vector having K dimensions (only three dimensions are shown in the graph):

$$x_n = \{x_{n,k}\}, \quad k \in 1 \ldots K$$

The 2D data matrix $X$ is a set of N data vectors, where each vector $x_n$ represents a statistical observation:

$$X = \{x_{n,k}\}, \quad n \in 1 \ldots N \wedge k \in 1 \ldots K$$

Principal Component Analysis: Mean vector

Each point is a data vector of length K. This means that the input data matrix X represents a swarm of N points in a K-dimensional space.

A mean vector $\bar{x}$ is computed as

$$\bar{x}_k = \frac{1}{N} \sum_{n=1}^{N} x_{n,k}, \quad \forall k \in 1 \ldots K$$

Principal Component Analysis: Centering of data

Subtract the mean vector $\bar{x}$ from all data vectors. All data vectors are thus translated equally in the K-dimensional space, such that the center of the point cloud is relocated to the origin.

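Principal Component Analysis: Scaling of data

Divide all centered data vectors, element by element, by the standard deviation vector $s$, whose elements are the square roots of the variances

$$s_k^2 = \frac{1}{N} \sum_{n=1}^{N} \left(x_{n,k} - \bar{x}_k\right)^2, \quad \forall k \in 1 \ldots K$$

This means that after scaling, all K dimensions have unit variance. For the scaled data,

$$\frac{1}{N} \sum_{n=1}^{N} \left(x_{n,k} - \bar{x}_k\right)^2 = 1, \quad \forall k \in 1 \ldots K$$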
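Centering and scaling can be sketched with NumPy as follows. The function name `standardize` is hypothetical, rows of `X` are taken to be observations and columns variables, and every variable is assumed to have non-zero variance.

```python
import numpy as np


def standardize(X):
    """Center each column of X on zero and scale it to unit variance."""
    X = np.asarray(X, dtype=float)
    x_mean = X.mean(axis=0)                      # mean vector, one value per dimension
    centered = X - x_mean                        # move the point cloud to the origin
    s = np.sqrt((centered ** 2).mean(axis=0))    # standard deviation per dimension
    return centered / s, x_mean, s
```

Dividing by the standard deviation rather than by the variance itself is what gives every column a variance of exactly one.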

Principal Component Analysis: First component

[Figure: point cloud with the first principal component, PC1, drawn through the origin.]

• A principal component (PC) is a vector in K-dimensional X-space that passes through the origin

• Scores are projections of data vectors onto the PC (blue point in the figure)

• The PC is oriented such that the scores approximate the original data as well as possible

Principal Component Analysis: Second component

[Figure: point cloud with PC1 and PC2 drawn through the origin.]

• The second PC also passes through the origin

• It is oriented to improve the approximation of the original data as much as possible, under the constraint that the second PC is orthogonal to the first one

• The two PCs define a plane, which we can think of as a window into X-space

Principal Component Analysis: Second component (continued)

[Figure: a data point projected onto the plane defined by PC1 and PC2.]

• The blue point is a score computed as a projection onto the plane defined by PC1 and PC2

• Scores are approximations of their corresponding data points

• The model used for approximation in this case is a plane

• If more PCs are included, the model becomes a hyperplane

• More PCs will gradually improve the approximation of the data

X-space is illustrated here with three axes. Remember that real-world X-spaces can have hundreds or thousands of dimensions.
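The slides do not say how the PCs are computed. One common way, used here as an assumption, is a singular value decomposition of the centered data matrix. The sketch below (hypothetical function name `pca`) returns scores, loadings, the residual, and the fraction of variance described by the chosen number of PCs.

```python
import numpy as np


def pca(X, n_components):
    """Project centered data onto its first principal components via SVD."""
    X = np.asarray(X, dtype=float)
    x_mean = X.mean(axis=0)
    centered = X - x_mean
    # Rows of vt are orthonormal directions, ordered by described variance
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    loadings = vt[:n_components].T               # one column per PC
    scores = centered @ loadings                 # coordinates in the (hyper)plane
    residual = centered - scores @ loadings.T    # what the model does not describe
    explained = 1.0 - (residual ** 2).sum() / (centered ** 2).sum()
    return x_mean, scores, loadings, residual, explained
```

Calling `pca` with an increasing number of components gives a non-decreasing `explained` value, which matches the statement that more PCs gradually improve the approximation.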


Principal Component Analysis: Loadings

[Figure: PC1 drawn in X-space together with its angles to the three coordinate axes.]

• The direction of each PC is given by its angles (cosines) to all K axes in X-space

• In this example, the direction of PC1 is given by the cosines of the angles $\alpha_1$, $\alpha_2$ and $\alpha_3$

• When scores are computed from input data, these cosines define how much each dimension (variable) in X-space contributes to the score values of the PC. See the formula for projection:

$$\mathbf{a} \cdot \mathbf{b} = |\mathbf{a}| \, |\mathbf{b}| \cos\theta$$

• For that reason, the cosines of the directions are called “loadings”
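As a short worked illustration of why direction cosines act as weights, assume (symbols introduced here, not taken from the slides) that $p$ is the unit-length direction vector of a PC, $e_k$ is the unit vector along axis $k$ of X-space, and $t_n$ is the score of data vector $x_n$. Applying the projection formula to $p$ and $e_k$ gives element $k$ of the loading vector:

$$p_k = p \cdot e_k = |p| \, |e_k| \cos\alpha_k = \cos\alpha_k, \qquad \sum_{k=1}^{K} \cos^2\alpha_k = 1$$

The score is the projection of the centered data vector onto the PC, so each variable contributes in proportion to its loading:

$$t_n = (x_n - \bar{x}) \cdot p = \sum_{k=1}^{K} \left(x_{n,k} - \bar{x}_k\right) \cos\alpha_k$$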


Principal Component Analysis: Loading plot

• A loading plot summarizes how each X-variable “loads” on each PC

• Points far away from the origin have a larger impact on the model than points closer to the origin

• If two variable loadings are located very close to each other in this plot, the variables are positively correlated

• If two variable loadings are located on opposite sides of the origin, in diagonally opposed quadrants, the variables are negatively correlated

• It is the correlation between variables that makes it possible to summarize hundreds of variables in a few PCs

[Figure: 3D loading plot with axes Component 1, Component 2 and Component 3, each ranging from about -0.1 to 0.1. Plotted variables: G1 to G15, EX1, EX2 and Z1 to Z4.]

Principal Component Analysis: Summary

$$X = \underbrace{\bar{x}}_{\text{centering of data}} + \underbrace{T P'}_{\text{structure of data}} + \underbrace{E}_{\text{residuals contain noise}}$$

• T: Scores are coordinates in a hyperplane that are used to approximate the data

• P: A set of loading vectors that together define the orientation of the hyperplane. Each loading vector defines the orientation of a PC

• E: A residual term that captures the variation of the data that cannot be described by the model

• $\bar{x}$: The center of the X-data is typically moved to the origin prior to computation of the PCs