Yan Cui 2013.1.16
A novel supervised feature extraction and classification framework for land cover
recognition of the off-land scenario
Yan Cui
2013.1.16
1. The related work
2. The integration algorithm framework
3. Experiments
The related work
Locally linear embedding
Sparse representation-based classifier
K-SVD dictionary learning
Locally linear embedding
LLE is an unsupervised learning algorithm
that computes low-dimensional,
neighborhood-preserving embeddings of
high-dimensional inputs.
Specifically, we expect each data point and
its neighbors to lie on or close to a
locally linear patch of the manifold, and
the local reconstruction errors of these
patches are measured by
$$e(w) = \sum_i \Big\| x_i - \sum_{j=1}^{k} w_{ij} x_j \Big\|_2^2 \tag{1}$$
$$e(y) = \sum_i \Big\| y_i - \sum_{j=1}^{k} w_{ij} y_j \Big\|_2^2 \tag{2}$$
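As a rough illustration of the reconstruction step, the sketch below computes the weights of a single point from its neighbors by solving the local Gram system, assuming the usual LLE sum-to-one constraint on the weights. The function names `lle_weights` and `reconstruction_error` and the regularization parameter `reg` are illustrative assumptions, not part of the slides.

```python
import numpy as np

def lle_weights(x, neighbors, reg=1e-3):
    """Weights that reconstruct x from its k neighbors (rows of `neighbors`).

    Minimizes ||x - sum_j w_j * neighbors[j]||^2 subject to sum_j w_j = 1.
    `reg` is an assumed regularization strength, not taken from the slides.
    """
    Z = neighbors - x                       # shift neighbors so x is the origin
    G = Z @ Z.T                             # local k x k Gram matrix
    # Regularize: G is singular when k exceeds the input dimension.
    G = G + reg * np.trace(G) * np.eye(len(G))
    w = np.linalg.solve(G, np.ones(len(G)))
    return w / w.sum()                      # enforce the sum-to-one constraint

def reconstruction_error(x, neighbors, w):
    """Squared reconstruction error of one point, one term of the sum over i."""
    return float(np.sum((x - w @ neighbors) ** 2))

# Toy data: x is the centre of a square formed by its four neighbors.
x = np.array([0.5, 0.5, 0.0])
neighbors = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0]], dtype=float)
w = lle_weights(x, neighbors)
err = reconstruction_error(x, neighbors, w)
print(w, err)
```

In this toy case the four neighbors are placed symmetrically around the point, so the weights come out equal. The same weights would then be held fixed while the low-dimensional coordinates are chosen.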
Sparse representation-based classifier
The sparse representation-based classifier (SRC)
can be considered a generalization of
nearest neighbor (NN) and nearest
subspace (NS): it adaptively chooses the
minimal number of training samples
needed to represent each test sample.
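$$A = [A_1, A_2, \ldots, A_c] = [x_{11}, x_{12}, \ldots, x_{1n_1}, \ldots, x_{i1}, x_{i2}, \ldots, x_{in_i}, \ldots, x_{c1}, x_{c2}, \ldots, x_{cn_c}] \in R^{m \times n}$$
$$y = A\alpha \in R^m \tag{3}$$
$$(L^0):\quad \hat{\alpha}_0 = \arg\min \|\alpha\|_0, \quad \text{s.t. } A\alpha = y \tag{4}$$
$$(L^1):\quad \hat{\alpha}_1 = \arg\min \|\alpha\|_1, \quad \text{s.t. } A\alpha = y \tag{5}$$
$$\hat{y}_i = A\,\delta_i(\hat{\alpha}_1)$$
$$\min_i r_i(y) = \|y - \hat{y}_i\|_2^2 = \|y - A\,\delta_i(\hat{\alpha}_1)\|_2^2 \tag{6}$$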
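The classification rule can be sketched as follows. This is a minimal illustration, not the presenter's implementation: it swaps in greedy orthogonal matching pursuit (OMP) as an approximate solver for the sparse coding problem in Eq. (4) instead of an exact l1 solver for Eq. (5), then assigns the test sample to the class whose coefficients give the smallest residual, as in Eq. (6). The names `omp`, `src_classify` and the parameter `n_nonzero` are assumptions made for this example.

```python
import numpy as np

def omp(A, y, n_nonzero):
    """Greedy orthogonal matching pursuit (an approximate sparse coder)."""
    residual = y.copy()
    support = []
    coef = np.zeros(A.shape[1])
    for _ in range(n_nonzero):
        # Pick the column most correlated with the current residual.
        j = int(np.argmax(np.abs(A.T @ residual)))
        if j in support:
            break
        support.append(j)
        # Least-squares refit on all selected columns.
        sol, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        coef[:] = 0.0
        coef[support] = sol
        residual = y - A @ coef
        if np.linalg.norm(residual) < 1e-10:
            break
    return coef

def src_classify(A, labels, y, n_nonzero=3):
    """Return the class with the smallest class-wise residual, and all residuals."""
    A = A / np.linalg.norm(A, axis=0)               # unit-norm columns
    alpha = omp(A, y, n_nonzero)
    residuals = {}
    for c in np.unique(labels):
        delta = np.where(labels == c, alpha, 0.0)   # keep only class-c coefficients
        residuals[int(c)] = float(np.sum((y - A @ delta) ** 2))
    return min(residuals, key=residuals.get), residuals

# Toy data: two classes with two training samples each (columns of A).
A = np.array([[1.0, 0.9, 0.0, 0.1],
              [0.0, 0.1, 1.0, 0.9],
              [0.0, 0.0, 0.0, 0.0]])
labels = np.array([0, 0, 1, 1])
y = np.array([0.95, 0.05, 0.0])
pred, residuals = src_classify(A, labels, y)
print(pred, residuals)
```

The test sample lies close to the class-0 training samples, so the class-0 residual is the smaller one.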
K-SVD dictionary learning
The original training samples contain much redundancy, as well as noise and trivial information that can hurt recognition.
If the training set is large, computing the sparse representation (SR) is time-consuming, so an optimal dictionary is needed for sparse representation and classification.
The K-SVD algorithm
$$\min_{D,\alpha} \sum_i \|x_i - D\alpha_i\|_2^2 \quad \text{s.t. } \|\alpha_i\|_0 \le T_0 \quad (i = 1, 2, \ldots, n)$$
The dictionary update stage:
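As a hedged sketch of what this stage does in the standard K-SVD algorithm, one atom and its coefficients are updated together through a rank-1 SVD of the residual restricted to the samples that use that atom. The function name `update_atom` and the variable names are assumptions for illustration. The slides' own formulation of this step did not survive in the transcript.

```python
import numpy as np

def update_atom(X, D, codes, k):
    """Update atom k of D and its coefficients via a rank-1 SVD.

    X: (m, n) samples as columns; D: (m, K) dictionary;
    codes: (K, n) sparse coefficients. Returns updated copies.
    """
    users = np.flatnonzero(codes[k])        # samples that currently use atom k
    if users.size == 0:
        return D, codes                     # unused atom: left unchanged in this sketch
    D = D.copy()
    codes = codes.copy()
    codes[k, users] = 0.0
    # Residual of those samples with atom k's contribution removed.
    E = X[:, users] - D @ codes[:, users]
    U, s, Vt = np.linalg.svd(E, full_matrices=False)
    D[:, k] = U[:, 0]                       # new unit-norm atom
    codes[k, users] = s[0] * Vt[0]          # new coefficients on the same support
    return D, codes

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 20))
D = rng.standard_normal((5, 8))
D /= np.linalg.norm(D, axis=0)
codes = rng.standard_normal((8, 20)) * (rng.random((8, 20)) < 0.3)

before = np.linalg.norm(X - D @ codes)
D2, codes2 = update_atom(X, D, codes, k=0)
after = np.linalg.norm(X - D2 @ codes2)
print(before, after)
```

Because the rank-1 SVD is the best rank-1 fit to the restricted residual, the overall reconstruction error cannot increase, and the support of the coefficients is not enlarged.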
The integration algorithm for supervised learning
Let $B = [B_1, B_2, \ldots, B_c] \in R^{m \times n}$ be the training data matrix, where $B_i = [x_{i1}, x_{i2}, \ldots, x_{in_i}] \in R^{m \times n_i}$ $(i = 1, 2, \ldots, c)$ is the matrix of training samples of the $i$-th class. A test sample $y \in R^m$ can be well approximated by a linear combination of the training data, i.e.
$$y = \sum_{i=1}^{n} \alpha_i x_i$$
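Let $\delta_i(\alpha)$ be the representation coefficient vector with respect to the $i$-th class. For SRC to perform well on all training samples, we want the within-class residual to be minimized and the between-class residual to be maximized simultaneously. We therefore define the following optimization problem:
[Eq. (15), garbled in this transcript: a minimization that combines the within-class residual $\|y - B\,\delta_i(\alpha)\|_2^2$ with the between-class residuals $\|y - B\,\delta_j(\alpha)\|_2^2$, $j \neq i$. How the terms are combined is not recoverable here.]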
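To make the two quantities concrete, the sketch below evaluates a within-class residual and a summed between-class residual for a given coefficient vector. It assumes the between-class term is a sum of squared residuals over all classes other than the chosen one. It does not reproduce the full objective, whose exact form is not legible in the transcript. The function name `class_residuals` and the toy data are illustrative assumptions.

```python
import numpy as np

def class_residuals(y, B, labels, alpha, i):
    """Within-class and summed between-class residuals for class i.

    B: (m, n) training samples as columns; labels: class of each column;
    alpha: (n,) coefficient vector.
    """
    def residual(c):
        delta = np.where(labels == c, alpha, 0.0)   # keep only class-c coefficients
        return float(np.sum((y - B @ delta) ** 2))

    within = residual(i)
    between = sum(residual(c) for c in np.unique(labels) if c != i)
    return within, between

# Toy data: three classes, one training sample per class.
B = np.eye(3)
labels = np.array([0, 1, 2])
y = np.array([1.0, 0.0, 0.0])
alpha = np.array([1.0, 0.0, 0.0])
within, between = class_residuals(y, B, labels, alpha, i=0)
print(within, between)
```

A coefficient vector that serves classification well makes the first value small and the second large.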
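[Eq. (16), garbled in this transcript: an objective of the same form as Eq. (15) with the dictionary $D$ in place of the training matrix $B$, built from the residuals $\|y - D\,\delta_k(\alpha)\|_2^2$ for $k = i$ and $k = j$, $j \neq i$. How the terms are combined is not recoverable here.]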
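Let $\delta_i(\alpha)$ be the representation coefficient vector with respect to the $i$-th class, so the optimization problem in Eq. (16) becomes
[Eq. (17), garbled in this transcript: an objective in two residual terms of the form $\|y - D(\cdot)\|_2^2$, both indexed by class $i$. The exact form is not recoverable here.]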
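To obtain the sparse representation coefficients, we want to learn an embedding map $W = [w_1, w_2, \ldots, w_d] \in R^{m \times d}$ that reduces the dimensionality of the data and preserves the sparse reconstruction. The optimization problem in Eq. (17) then becomes
[Equation garbled in this transcript: a minimization over $W$ and the coefficients, in two residual terms of the form $\|W^T y - W^T D(\cdot)\|_2^2$ indexed by class $i$. The exact form is not recoverable here.]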
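For a given test set $U = \{y_1, y_2, \ldots, y_l\}$, we can adaptively learn the embedding map, the optimal dictionary and the sparse reconstruction coefficients by solving the following optimization problem:
[Equation garbled in this transcript: a minimization over $W$, $D$ and the coefficients, in two residual terms of the form $\|W^T U - W^T D(\cdot)\|_F^2$. The exact form is not recoverable here.]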
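The building block of this problem, a Frobenius-norm residual measured after projecting with the embedding map, can be evaluated as in the sketch below. This only illustrates the residual term, not the joint optimization. The coefficient matrix name `X`, the function name `projected_residual`, and the choice of an orthonormal `W` obtained by QR are assumptions for the example.

```python
import numpy as np

def projected_residual(W, U, D, X):
    """Squared Frobenius norm of W^T U - W^T D X.

    W: (m, d) embedding map; U: (m, l) test samples as columns;
    D: (m, K) dictionary; X: (K, l) coefficient matrix (assumed name).
    """
    R = W.T @ U - W.T @ D @ X
    return float(np.sum(R ** 2))

rng = np.random.default_rng(0)
m, d, K, l = 6, 2, 4, 5
U = rng.standard_normal((m, l))
D = rng.standard_normal((m, K))
X = rng.standard_normal((K, l))
W, _ = np.linalg.qr(rng.standard_normal((m, d)))   # orthonormal columns, shape (m, d)

full = float(np.sum((U - D @ X) ** 2))             # residual in the original space
proj = projected_residual(W, U, D, X)              # residual in the embedded space
print(full, proj)
```

With orthonormal columns in `W`, the projected residual can never exceed the residual in the original space, which is one reason the embedding and the coefficients have to be learned together and not one after the other.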
The feature extraction and classification algorithm
Experiments for unsupervised learning
The effect of dictionary selection
Compare with pure feature extraction
Databases descriptions
UCI databases: the Gas Sensor Array Drift Data Set and the Synthetic Control Chart Time Series Data Set.
Experiments
The effect of dictionary selection
Compare with pure classification
Compare with pure feature extraction
Databases descriptions
Thanks!
Questions & suggestions?