Linear Discriminative Image Processing Operator Analysis
Toru Tamaki, Bingzhi Yuan, Kengo Harada, Bisser Raytchev, Kazufumi Kaneda
Most discriminative image processing operators (IPOs)
[Overview diagram. Labels: generating matrices (image processing operators), feature space, LDA classifier, recognition.]
Goal: Find the most discriminative set of image processing operations for LDA.
Motivation: For small sample size problems, many studies increase the number of training samples by synthetically generating new ones. But HOW? Usually in an ad-hoc way… here, discriminatively!
Contribution: Simultaneous estimation of both the LDA feature space and a set of discriminative generating matrices.
Linear IPO + LDA = LDA with increased samples
An increased sample: $x_j = G_j x$, where $x$ is an original sample and $G_j$ is a generating matrix (an image processing operator).

Mean of class $i$ for increased samples:
$$m'_i = \frac{1}{J n_i} \sum_{j=1}^{J} \sum_{x \in X_i} G_j x = \frac{1}{J} \sum_{j=1}^{J} G_j m_i = \bar{G} m_i,$$
where $\bar{G} = \frac{1}{J} \sum_{j=1}^{J} G_j$ is the average of the image processing operators. Mean of all increased samples: $m' = \bar{G} m$.

Scatter matrix of class $i$ for increased samples:
$$S'_i = \bar{G} (S_i - R_i) \bar{G}^T + \frac{1}{J} \sum_{j=1}^{J} G_j R_i G_j^T$$

Within-class scatter matrix for increased samples:
$$S'_W = \bar{G} (S_W - R_W) \bar{G}^T + \frac{1}{J} \sum_{j=1}^{J} G_j R_W G_j^T$$

Between-class scatter matrix for increased samples:
$$S'_B = \bar{G} S_B \bar{G}^T$$

Here $R_i = \frac{1}{n_i} \sum_{x \in X_i} x x^T$, $R_W = \sum_i^c R_i$, and $S_W$, $S_B$ are the scatter matrices of the original (non-increased) samples.

Covariance matrix for increased samples (with $R_{all}$ the analogue of $R_i$ over all training samples):
$$X' = \bar{G} (X - R_{all}) \bar{G}^T + \frac{1}{J} \sum_{j=1}^{J} G_j R_{all} G_j^T$$

After dimensionality reduction with the PCA projection matrix $P$ and the LDA matrix $A$:
$$\tilde{S}'_i = A^T P^T S'_i P A, \qquad \tilde{S}'_W = A^T P^T S'_W P A, \qquad \tilde{S}'_B = A^T P^T S'_B P A,$$
$$y_j = A^T P^T x_j = A^T P^T G_j x, \qquad \tilde{m}'_i = A^T P^T m'_i = A^T P^T \bar{G} m_i, \qquad \tilde{m}' = A^T P^T m' = A^T P^T \bar{G} m.$$

LDA maximizes the Rayleigh quotient $\operatorname{tr}(\tilde{S}'_B) / \operatorname{tr}(\tilde{S}'_W)$, which leads to the generalized eigenvalue problem of $(P^T S'_W P)^{-1} P^T S'_B P$.
[Pipeline diagram: training samples → scatter matrices $S_W, S_B$ → covariance matrix → PCA projection matrix $P$ (dimensionality reduction) → LDA feature space $A$, via the Rayleigh quotient / generalized eigenvalue problem.]
Given $\{G_j\}$, we do not need to actually generate the increased training samples. But we need more memory to store the generating matrices…
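To make this concrete, here is a minimal NumPy sketch (my own illustration, not the authors' code; toy dimensions and random operators are assumptions) that checks the closed-form scatter matrices above against brute-force generation of the increased samples:

```python
import numpy as np

rng = np.random.default_rng(0)
d, J, c, n = 16, 5, 3, 10                     # dims, #operators, #classes, samples/class
Gs = [np.eye(d) + 0.1 * rng.standard_normal((d, d)) for _ in range(J)]  # toy generating matrices
X = [rng.standard_normal((n, d)) for _ in range(c)]                     # toy class samples

Gbar = sum(Gs) / J                            # average operator: (1/J) sum_j G_j
m_i = [Xi.mean(axis=0) for Xi in X]
R_i = [Xi.T @ Xi / n for Xi in X]             # R_i = (1/n_i) sum_x x x^T
S_i = [(Xi - mi).T @ (Xi - mi) / n for Xi, mi in zip(X, m_i)]  # original class scatter

# Closed form: S'_i = Gbar (S_i - R_i) Gbar^T + (1/J) sum_j G_j R_i G_j^T
Sp_i = [Gbar @ (Si - Ri) @ Gbar.T + sum(G @ Ri @ G.T for G in Gs) / J
        for Si, Ri in zip(S_i, R_i)]

# Brute force: actually generate the J*n increased samples per class
for i, Xi in enumerate(X):
    Z = np.vstack([Xi @ G.T for G in Gs])     # x_j = G_j x for every sample and operator
    mz = Z.mean(axis=0)
    Sz = (Z - mz).T @ (Z - mz) / (J * n)
    assert np.allclose(Sz, Sp_i[i])
print("closed-form scatter matrices match brute-force sample generation")
```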
Analysis of IPO: the spectral decomposition
Definition 1. Let $f(x), g(x) \in L^2(\mathbb{R}^2)$ be complex-valued 2D functions, where $x \in \mathbb{R}^2$. The inner product is defined as
$$(f, g) \equiv \int_{\mathbb{R}^2} f(x) \overline{g(x)}\, dx,$$
where $\bar{g}$ is the complex conjugate of $g$. An operator $G : f \mapsto g$ is linear if it satisfies $G(af + bg) = aG(f) + bG(g)$ for all $a, b \in \mathbb{R}$. $G^*$ is the adjoint operator of $G$ if it satisfies $(Gf, g) = (f, G^* g)$.
Corollary 1. Filtering and geometric transformation operators $G$ are normal operators, which satisfy $G^* G = G G^*$.
$$G = \sum_i \lambda_i P_i$$
A normal operator can be decomposed into projection operators!
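As a sanity check on Corollary 1, a small sketch (assumed illustration, not from the paper): a periodic filtering matrix built from a symmetric Gaussian kernel is normal, and its eigendecomposition gives the projection-operator expansion $G = \sum_i \lambda_i P_i$.

```python
import numpy as np

n = 64
taps = np.exp(-0.5 * (np.arange(-3, 4) / 1.5) ** 2)
taps /= taps.sum()                             # symmetric Gaussian filter kernel

# Circulant filtering matrix: row r applies the kernel around pixel r.
G = np.zeros((n, n))
for r in range(n):
    for k, w in zip(range(-3, 4), taps):
        G[r, (r + k) % n] = w

print("normal:", np.allclose(G @ G.T, G.T @ G))          # G*G = GG* holds

lam, V = np.linalg.eigh(G)                               # real spectrum (G symmetric)
G_rec = sum(l * np.outer(v, v) for l, v in zip(lam, V.T))  # sum_i lambda_i P_i
print("max reconstruction error:", np.abs(G - G_rec).max())
```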
[Figure: (a) $x$, (b) $Gx$, (c) $G^T G x$, (d) $G^T x$; responses of the eigenoperators, $E_1 x, E_2 x, \ldots, E_6 x$.]
[Plots: eigenvalue vs. index spectra of the Hermitian parts (a) $H_{11}, H_{21}$, (b) $H_{12}, H_{22}$, (c) $H_{13}, H_{23}$, with the corresponding eigenprojection images $P_{11i}x, P_{21i}x, P_{12i}x, P_{22i}x, P_{13i}x, P_{23i}x$.]
But is this feasible for a generating matrix? Yes!
Is a filtering matrix Hermitian? $\|G - G^T\| < 10^{-6}$: almost symmetric.
Is a geometric transformation unitary? Its transpose is approximately its inverse.
$$G = H_1 + iH_2, \qquad H_1 = \frac{G + G^T}{2}, \qquad H_2 = \frac{G - G^T}{2i}, \qquad i = \sqrt{-1}$$
Are the eigenvalues complex? Use the Hermitian decomposition above. So, a two-step approximation: an operator → two Hermitian operators (which have real eigenvalues) → weighted eigenprojections:
$$G \simeq \sum_j a_j E_j = \sum_j a_j (H_{1j} + i H_{2j}) \simeq \sum_j a_j \sum_i (\lambda_{1ji} P_{1ji} + i \lambda_{2ji} P_{2ji})$$
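A minimal sketch of the decomposition (my illustration on a generic random matrix, not one of the paper's operators): both Hermitian parts have real spectra, and together they recover $G$ exactly.

```python
import numpy as np

rng = np.random.default_rng(1)
G = rng.standard_normal((50, 50))        # generic real generating matrix

H1 = (G + G.T) / 2                       # Hermitian part (real symmetric)
H2 = (G - G.T) / 2j                      # also Hermitian: H2 = H2^*
print("H2 Hermitian:", np.allclose(H2, H2.conj().T))

lam1 = np.linalg.eigvalsh(H1)            # real eigenvalues
lam2 = np.linalg.eigvalsh(H2)            # real eigenvalues
print("spectra real:", lam1.dtype, lam2.dtype)

print("G recovered:", np.allclose(G, H1 + 1j * H2))   # G = H1 + i H2
```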
Examples
The real eigenvalues can be small in magnitude, so we can compress the representation. Eigenprojections of the eigenoperators transform images into… wavelets? Eigenoperators transform images into variants.
Q: To reduce the memory cost of generating matrices, can we use a decomposition for operators just like for images?
A: Yes.
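A sketch of how such a compression could work (an assumed illustration, not the paper's exact procedure; the `truncate` helper and toy $G$ are mine): keep only the largest-magnitude eigenpairs of each Hermitian part and measure the approximation error.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
G = np.eye(n) + 0.1 * rng.standard_normal((n, n))   # toy generating matrix

H1 = (G + G.T) / 2
H2 = (G - G.T) / 2j

def truncate(H, m):
    """Keep the m largest-|lambda| eigenpairs: a low-rank spectral approximation."""
    lam, V = np.linalg.eigh(H)
    top = np.argsort(-np.abs(lam))[:m]
    return (V[:, top] * lam[top]) @ V[:, top].conj().T

m = 30                   # m eigenpairs per part (2m vectors total) instead of n*n entries
G_hat = truncate(H1, m) + 1j * truncate(H2, m)
err = np.linalg.norm(G - G_hat) / np.linalg.norm(G)
print(f"relative error with {m} of {n} eigenpairs kept per part: {err:.3f}")
```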
LDA + IPO = LDIPOA: find a set of discriminative IPOs
$$\bar{G}^{(k)} = \frac{1}{k+1} \sum_{l=0}^{k} G^{(l)}, \qquad D = PA$$
$$\tilde{S}'^{(k)}_W = D^T \bar{G}^{(k)} (S_W - R_W) \bar{G}^{(k)T} D + \frac{1}{k+1} \sum_{l=0}^{k} D^T G^{(l)} R_W G^{(l)T} D$$
$$\tilde{S}'^{(k)}_B = D^T \bar{G}^{(k)} S_B \bar{G}^{(k)T} D$$
$$X'^{(k)} = \bar{G}^{(k)} (X - R_{all}) \bar{G}^{(k)T} + \frac{1}{k+1} \sum_{l=0}^{k} G^{(l)} R_{all} G^{(l)T}$$
$$S'^{(k)}_W = \bar{G}^{(k)} (S_W - R_W) \bar{G}^{(k)T} + \frac{1}{k+1} \sum_{l=0}^{k} G^{(l)} R_W G^{(l)T}$$
$$S'^{(k)}_B = \bar{G}^{(k)} S_B \bar{G}^{(k)T}$$
Algorithm 1 LDIPOA
1: Compute PCA $P$ and LDA $A$. $G^{(0)} \leftarrow I$.
2: for $k = 1, 2, \ldots$ do
3:   repeat
4:     α step: $\alpha^{(k)} = \arg\max_\alpha E(A, P, \alpha)$
5:     PCA step: compute $P$ with $\alpha^{(k)}$
6:     LDA step: $A = \arg\max_A E(A, P, \alpha^{(k)})$
7:   until $E$ converges
8: end for
[Diagram: the three steps cycle: α step → PCA step → LDA step.]
At each step $k$, estimate a single generating matrix represented as a linear combination of the initial generating matrices:
$$G^{(k)} = \sum_{j}^{J} \alpha_j^{(k)} G_j, \qquad \alpha^{(k)} = (\alpha_1^{(k)}, \alpha_2^{(k)}, \ldots, \alpha_J^{(k)})^T$$
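Below is a heavily simplified, runnable skeleton of this alternating scheme (assumptions throughout: toy data, a naive random-search α step instead of the authors' maximization, explicit sample generation instead of the closed-form scatter matrices, a ridge term for stability, and a single outer step $k$):

```python
import numpy as np

rng = np.random.default_rng(3)
d, J, c, n = 20, 4, 5, 8
Gs = np.stack([np.eye(d) + 0.05 * rng.standard_normal((d, d)) for _ in range(J)])
X = rng.standard_normal((c, n, d)) + 2.0 * rng.standard_normal((c, 1, d))  # c toy classes

def scatters(Z):
    mi = Z.mean(axis=1)                              # class means
    m = Z.reshape(-1, d).mean(axis=0)                # global mean
    SW = sum((Zi - mi_).T @ (Zi - mi_) for Zi, mi_ in zip(Z, mi)) / (c * n)
    SB = sum(np.outer(mi_ - m, mi_ - m) for mi_ in mi) / c
    return SW, SB

def objective(alpha, dim_p=10):
    G = np.tensordot(alpha, Gs, axes=1)              # G(alpha) = sum_j alpha_j G_j
    Z = X @ G.T                                      # "increased" samples G x
    SW, SB = scatters(Z)
    P = np.linalg.eigh(SW + SB)[1][:, -dim_p:]       # PCA step: top eigenvectors
    Sw, Sb = P.T @ SW @ P, P.T @ SB @ P
    # LDA step: top eigenvectors of (P^T S'_W P)^{-1} P^T S'_B P
    w, V = np.linalg.eig(np.linalg.solve(Sw + 1e-6 * np.eye(dim_p), Sb))
    A = np.real(V[:, np.argsort(-np.real(w))[:c - 1]])
    return np.trace(A.T @ Sb @ A) / np.trace(A.T @ Sw @ A)   # Rayleigh quotient E

alpha = np.ones(J) / J                               # start from the average operator
best = objective(alpha)
for _ in range(200):                                 # naive alpha step
    cand = alpha + 0.1 * rng.standard_normal(J)
    val = objective(cand)
    if val > best:
        alpha, best = cand, val
print("alpha:", np.round(alpha, 3), "objective E:", round(best, 3))
```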
Experiments with FERET dataset
The proposed algorithm iteratively estimates α (the coefficients of the generating matrices), P (PCA), and A (LDA) at the same time.
[Plot: the Rayleigh quotient vs. $k$, the number of estimated generating matrices. Legend: 10 generating matrices increase the dataset 11-fold; 1 generating matrix doubles the dataset; no generating matrices (normal LDA).]
$x_j = G_j x$
Proposition 1. A filtering is defined as
$$Gf(x) = \int G(x, y) f(y)\, dy,$$
where the kernel is symmetric, $G(x, y) = G(y, x)$, and real-valued. $G$ is a Hermitian operator, which satisfies $G^* = G$.

Proposition 2. A geometric (affine) transformation $G$ is defined as
$$Gf(x) = |A|^{1/2} f(Ax + t),$$
where $|A| \neq 0$. $G$ is a unitary operator, which satisfies $G^* G = I$.
Experimental setup:
- Size of images: 32×32
- Size of generating matrices: 1024×1024
- Number of classes: 1001 (fa)
- Training images per class: 1 (fa)
- Test images per class: 1 (fb)
- Eigen-generating matrices: 96
- Initial generating matrices: 567 (3 scalings × 7 rotations × 3 Gaussian blurs × 9 motion blurs)
- Classifier: nearest neighbor
- PCA rates: 80% and 95%, for the eigen-generating matrices (G-PCA) and for the PCA step (LDA-PCA)
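For reference, a sketch of how a generating matrix can be built from an image operation (an assumption about the construction; the poster does not spell it out, and the `gaussian_filter`/`rotate` operators here merely resemble the paper's set): apply the operation to each impulse (basis) image and stack the responses as columns.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, rotate

w = 32                                           # 32x32 images -> 1024x1024 matrices

def generating_matrix(op, w):
    """Matrix of a linear image operation: column p is the response to impulse p."""
    G = np.zeros((w * w, w * w))
    for p in range(w * w):
        e = np.zeros((w, w))
        e.flat[p] = 1.0                          # impulse (basis) image
        G[:, p] = op(e).ravel()
    return G

G_blur = generating_matrix(lambda im: gaussian_filter(im, sigma=1.0), w)
G_rot = generating_matrix(lambda im: rotate(im, 5.0, reshape=False, order=1), w)
print(G_blur.shape, "blur asymmetry:", np.linalg.norm(G_blur - G_blur.T))
```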
The objective is maximized in a few steps. A few generating matrices are enough to improve the performance. A bad approximation of the generating matrices does not lead to any improvement…