
Exploring Intrinsic Structures from Samples: Supervised, Unsupervised, and Semisupervised Frameworks

Huan Wang

Multimedia Laboratory

Department of Information Engineering

The Chinese University of Hong Kong

Supervised by Prof. Xiaoou Tang & Prof. Jianzhuang Liu

Outline

• Notations & introductions: dimensionality reduction

• Trace Ratio Optimization & Tensor Subspace Learning

► Preserve sample feature structures

• Correspondence Propagation

► Explore the geometric structures and feature domain relations concurrently

Concept. Tensor

• Tensor: multi-dimensional (or multi-way) arrays of components


Concept. Tensor

• Real-world data are affected by multifarious factors. For person identification, for example, we may have facial images under different

► views and poses

► lighting conditions

► expressions

• The observed data evolve differently along the variation of different factors

► image columns and rows


Concept. Tensor

• It is desirable to dig out the intrinsic connections among the different factors affecting the data.

• Tensor provides a concise and effective representation.

[Figure: facial images arranged as a tensor, with modes for illumination, pose, expression, image rows, and image columns]


Concept. Dimensionality Reduction

• Preserve sample feature structures

• Enhance classification capability

• Reduce the computational complexity

Trace Ratio Optimization. Definition

Given two positive semidefinite matrices $A, B$, find

$$W^* = \arg\max_W \frac{\mathrm{Tr}(W^T A W)}{\mathrm{Tr}(W^T B W)} \quad \text{w.r.t. } W^T W = I$$

• Positive semidefinite: $A \succeq 0$, $B \succeq 0$

• Homogeneous property: $J(WQ) = \dfrac{\mathrm{Tr}(Q^T W^T A W Q)}{\mathrm{Tr}(Q^T W^T B W Q)} = J(W)$ for any orthogonal $Q$ ($Q Q^T = I$)

• Special case, when $W$ is a vector $w$: the Generalized Rayleigh Quotient

$$w^* = \arg\max_w \frac{w^T A w}{w^T B w} \qquad \text{GEVD: } A w = \lambda B w$$

• Orthogonality constraint:

$$W^* = \arg\max_{W^T W = I} \frac{\mathrm{Tr}(W^T A W)}{\mathrm{Tr}(W^T B W)}$$

Optimization over the Grassmann manifold.
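To make the homogeneous property concrete, here is a minimal numpy sketch (mine, not the thesis code; the matrix sizes and the function name are illustrative assumptions): the objective is unchanged when $W$ is rotated by any orthogonal $Q$, so only the subspace spanned by $W$ matters.

```python
import numpy as np

def trace_ratio(W, A, B):
    """J(W) = Tr(W^T A W) / Tr(W^T B W)."""
    return np.trace(W.T @ A @ W) / np.trace(W.T @ B @ W)

rng = np.random.default_rng(0)
M = rng.standard_normal((10, 10))
A = M @ M.T                                  # positive semidefinite numerator
M = rng.standard_normal((10, 10))
B = M @ M.T + np.eye(10)                     # keep the denominator positive
W, _ = np.linalg.qr(rng.standard_normal((10, 3)))   # column orthogonal: W^T W = I
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))    # orthogonal rotation
assert np.isclose(trace_ratio(W, A, B), trace_ratio(W @ Q, A, B))
```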

Trace Ratio Formulation

• Linear Discriminant Analysis

$$W^* = \arg\max_W \frac{\sum_{c=1}^{N_c} n_c \|W^T \bar{x}_c - W^T \bar{x}\|^2}{\sum_{i=1}^{N} \|W^T x_i - W^T \bar{x}_{c_i}\|^2} = \arg\max_W \frac{\mathrm{Tr}(W^T S_b W)}{\mathrm{Tr}(W^T S_w W)}$$

$$S_b = \sum_{c=1}^{N_c} n_c (\bar{x}_c - \bar{x})(\bar{x}_c - \bar{x})^T$$

$$S_w = \sum_{i=1}^{N} (x_i - \bar{x}_{c_i})(x_i - \bar{x}_{c_i})^T$$
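A hedged numpy sketch of these two scatter matrices (the function name and the column-per-sample data layout are my assumptions, not the thesis interface):

```python
import numpy as np

def lda_scatters(X, y):
    """S_b, S_w for data columns X[:, i] with class labels y[i]."""
    d, _ = X.shape
    mean_all = X.mean(axis=1, keepdims=True)
    S_b = np.zeros((d, d))
    S_w = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[:, y == c]
        mean_c = Xc.mean(axis=1, keepdims=True)
        diff = mean_c - mean_all
        S_b += Xc.shape[1] * (diff @ diff.T)   # n_c (x̄_c − x̄)(x̄_c − x̄)^T
        centered = Xc - mean_c
        S_w += centered @ centered.T           # Σ_i (x_i − x̄_{c_i})(x_i − x̄_{c_i})^T
    return S_b, S_w
```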

Trace Ratio Formulation

• Kernel Discriminant Analysis

$$J(A) = \arg\max_A \frac{\mathrm{Tr}\!\left(A^T K \left(\sum_c \frac{1}{n_c} e^c e^{cT} - \frac{1}{N} e e^T\right) K A\right)}{\mathrm{Tr}\!\left(A^T K \left(I - \sum_c \frac{1}{n_c} e^c e^{cT}\right) K A\right)} \quad \text{w.r.t. } W^T W = I \;\Leftrightarrow\; A^T K A = I, \quad W = \phi(X) A$$

Writing the numerator and denominator as kernel Laplacian quadratic forms, $J(A) = \arg\max_A \dfrac{\mathrm{Tr}(A^T K L^p K A)}{\mathrm{Tr}(A^T K L K A)}$.

Decompose $K = K_d K_d^T$ and let $\tilde{W} = K_d^T A$, so the constraint becomes $\tilde{W}^T \tilde{W} = A^T K A = I$ and

$$\mathrm{Tr}(A^T K L^p K A) = \mathrm{Tr}(\tilde{W}^T K_d^T L^p K_d \tilde{W}), \qquad \mathrm{Tr}(A^T K L K A) = \mathrm{Tr}(\tilde{W}^T K_d^T L K_d \tilde{W})$$

$$J(\tilde{W}) = \arg\max_{\tilde{W}} \frac{\mathrm{Tr}(\tilde{W}^T K_d^T L^p K_d \tilde{W})}{\mathrm{Tr}(\tilde{W}^T K_d^T L K_d \tilde{W})} \quad \text{w.r.t. } \tilde{W}^T \tilde{W} = I$$

Trace Ratio Formulation

• Marginal Fisher Analysis

Intra-class graph (Intrinsic graph)

Inter-class graph (Penalty graph)

$$W^* = \arg\max_W \frac{\sum_{i,j} \|W^T x_i - W^T x_j\|^2 W^m_{ij}}{\sum_{i,j} \|W^T x_i - W^T x_j\|^2 W^c_{ij}} = \arg\max_W \frac{\mathrm{Tr}(W^T X (D^m - W^m) X^T W)}{\mathrm{Tr}(W^T X (D^c - W^c) X^T W)} = \arg\max_W \frac{\mathrm{Tr}(W^T X L^m X^T W)}{\mathrm{Tr}(W^T X L^c X^T W)}$$

where $W^c$ is the adjacency matrix of the intra-class (intrinsic) graph, $W^m$ that of the inter-class (penalty) graph, and $L = D - W$ the corresponding graph Laplacians.
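The graph form above reduces to a Laplacian scatter; a small sketch (dense inputs assumed) of the identity $\frac{1}{2}\sum_{i,j}\|W^T x_i - W^T x_j\|^2\,\mathrm{adj}_{ij} = \mathrm{Tr}(W^T X L X^T W)$ for a symmetric adjacency:

```python
import numpy as np

def graph_scatter(X, adj):
    """Return X L X^T with L = D - adj; plug in W^m or W^c for MFA."""
    L = np.diag(adj.sum(axis=1)) - adj   # graph Laplacian
    return X @ L @ X.T
```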

Trace Ratio Formulation

• Kernel Marginal Fisher Analysis

$$J(A) = \arg\max_A \frac{\mathrm{Tr}(A^T K L^p K A)}{\mathrm{Tr}(A^T K L K A)} \quad \text{w.r.t. } W^T W = I \;\Leftrightarrow\; A^T K A = I, \quad W = \phi(X) A$$

Decompose $K = K_d K_d^T$ and let $\tilde{W} = K_d^T A$; then

$$\mathrm{Tr}(A^T K L^p K A) = \mathrm{Tr}(\tilde{W}^T K_d^T L^p K_d \tilde{W}), \qquad \mathrm{Tr}(A^T K L K A) = \mathrm{Tr}(\tilde{W}^T K_d^T L K_d \tilde{W})$$

$$J(\tilde{W}) = \arg\max_{\tilde{W}} \frac{\mathrm{Tr}(\tilde{W}^T K_d^T L^p K_d \tilde{W})}{\mathrm{Tr}(\tilde{W}^T K_d^T L K_d \tilde{W})} \quad \text{w.r.t. } \tilde{W}^T \tilde{W} = I$$

where $L$ and $L^p$ are now the intrinsic and penalty graph Laplacians of MFA.

Trace Ratio Formulation

• 2-D Linear Discriminant Analysis

With bilinear projections $y_i = L^T x_i R$:

$$\arg\max_{L,R} \frac{\sum_{c=1}^{N_c} n_c \|L^T \bar{x}_c R - L^T \bar{x} R\|^2}{\sum_{i=1}^{N} \|L^T x_i R - L^T \bar{x}_{c_i} R\|^2} = \arg\max_{L,R} \frac{\mathrm{Tr}\!\left(L^T \big(\sum_{c=1}^{N_c} n_c (\bar{x}_c - \bar{x}) R R^T (\bar{x}_c - \bar{x})^T\big) L\right)}{\mathrm{Tr}\!\left(L^T \big(\sum_{i=1}^{N} (x_i - \bar{x}_{c_i}) R R^T (x_i - \bar{x}_{c_i})^T\big) L\right)}$$

$$= \arg\max_{L,R} \frac{\mathrm{Tr}\!\left(R^T \big(\sum_{c=1}^{N_c} n_c (\bar{x}_c - \bar{x})^T L L^T (\bar{x}_c - \bar{x})\big) R\right)}{\mathrm{Tr}\!\left(R^T \big(\sum_{i=1}^{N} (x_i - \bar{x}_{c_i})^T L L^T (x_i - \bar{x}_{c_i})\big) R\right)}$$

Left Projection & Right Projection

Fix one projection matrix & optimize the other

• Discriminant Analysis with Tensor Representation

$$\arg\max_{U^k|_{k=1}^n} \frac{\sum_{c=1}^{N_c} n_c \big\| (\bar{\mathcal{X}}_c - \bar{\mathcal{X}}) \times_1 U^1 \cdots \times_n U^n \big\|^2}{\sum_{i=1}^{N} \big\| (\mathcal{X}_i - \bar{\mathcal{X}}_{c_i}) \times_1 U^1 \cdots \times_n U^n \big\|^2} = \arg\max_{U^k} \frac{\mathrm{Tr}(U^{kT} S^k_b U^k)}{\mathrm{Tr}(U^{kT} S^k_w U^k)}$$

Trace Ratio Formulation

• Tensor Subspace Analysis

$$\arg\min_{U,V} \frac{\frac{1}{2} \sum_{i,j} \|U^T X_i V - U^T X_j V\|^2 S_{ij}}{\sum_i \|U^T X_i V\|^2 D_{ii}}$$

Fixing $V$:

$$\arg\min_{U} \frac{\mathrm{Tr}\!\left(U^T \big(\sum_i D_{ii}\, X_i V V^T X_i^T - \sum_{i,j} S_{ij}\, X_i V V^T X_j^T\big) U\right)}{\mathrm{Tr}\!\left(U^T \big(\sum_i D_{ii}\, X_i V V^T X_i^T\big) U\right)}$$

and, symmetrically, fixing $U$:

$$\arg\min_{V} \frac{\mathrm{Tr}\!\left(V^T \big(\sum_i D_{ii}\, X_i^T U U^T X_i - \sum_{i,j} S_{ij}\, X_i^T U U^T X_j\big) V\right)}{\mathrm{Tr}\!\left(V^T \big(\sum_i D_{ii}\, X_i^T U U^T X_i\big) V\right)}$$

Trace Ratio Formulation

$$W^* = \arg\max_W \frac{\mathrm{Tr}(W^T S_b W)}{\mathrm{Tr}(W^T S_w W)}$$

Conventional Solution (ratio trace):

$$W^* = \arg\max_W \frac{|W^T S_b W|}{|W^T S_w W|} = \arg\max_W \mathrm{Tr}\!\left((W^T S_w W)^{-1} (W^T S_b W)\right) \qquad \text{GEVD: } S_b w = \lambda S_w w$$

Singularity problem of $S_w$ → Nullspace LDA, Dualspace LDA
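For reference, the conventional ratio-trace solution is one generalized symmetric eigenproblem; a hedged scipy sketch (the small ridge term is my workaround for the singularity of $S_w$ noted above, not the thesis recipe):

```python
import numpy as np
from scipy.linalg import eigh

def ratio_trace_lda(S_b, S_w, m):
    """Solve S_b w = lambda S_w w and keep the m leading eigenvectors."""
    reg = 1e-8 * np.eye(S_w.shape[0])   # guard against singular S_w
    vals, vecs = eigh(S_b, S_w + reg)   # generalized symmetric EVD
    return vecs[:, np.argsort(vals)[::-1][:m]]
```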

from Trace Ratio to Trace Difference

Preprocessing

$$\arg\max_W \frac{\mathrm{Tr}(W^T S^p W)}{\mathrm{Tr}(W^T S^l W)} \quad \Longleftrightarrow \quad \arg\max_W \frac{\mathrm{Tr}(W^T S^p W)}{\mathrm{Tr}(W^T S^t W)}, \qquad S^t = S^l + S^p$$

Remove the null space of $S^t$ with Principal Component Analysis; afterwards

$$0 \le \frac{\mathrm{Tr}(W^T S^p W)}{\mathrm{Tr}(W^T S^t W)} \le 1$$

What will we do? From Trace Ratio to Trace Difference

Objective:

$$U^* = \arg\max_U \frac{\mathrm{Tr}(U^T S^p U)}{\mathrm{Tr}(U^T S^t U)}$$

Define

$$\lambda_t = \frac{\mathrm{Tr}(U_t^T S^p U_t)}{\mathrm{Tr}(U_t^T S^t U_t)}$$

Then

$$\mathrm{Tr}\!\left(U_t^T (S^p - \lambda_t S^t) U_t\right) = 0$$

Trace Ratio → Trace Difference: define the trace difference function

$$g(\lambda, U) = \mathrm{Tr}\!\left(U^T (S^p - \lambda S^t) U\right)$$

Find $U_{t+1}$ such that $g(\lambda_t, U_{t+1}) \ge g(\lambda_t, U_t)$, so that

$$\mathrm{Tr}\!\left(U_{t+1}^T (S^p - \lambda_t S^t) U_{t+1}\right) = g(\lambda_t, U_{t+1}) \ge 0$$


Constraint: $U^T U = I$.

Let $U_{t+1} = [u_1, u_2, \ldots, u_{m'}]$, where $u_1, u_2, \ldots, u_{m'}$ are the leading eigenvectors of $(S^p - \lambda_t S^t)$.

We have $g(\lambda_t, U_{t+1}) \ge g(\lambda_t, U_t) = 0$.

Thus

$$\lambda_{t+1} = \frac{\mathrm{Tr}(U_{t+1}^T S^p U_{t+1})}{\mathrm{Tr}(U_{t+1}^T S^t U_{t+1})} \ge \lambda_t$$

The objective rises monotonically!

Main Algorithm

1: Initialization. Initialize $U_0$ as an arbitrary column-orthogonal matrix.

2: Iterative optimization.

For t=1, 2, . . . , Tmax, Do

1. Set $\lambda_t = \dfrac{\mathrm{Tr}(U_{t-1}^T S^p U_{t-1})}{\mathrm{Tr}(U_{t-1}^T S^t U_{t-1})}$.

2. Conduct eigenvalue decomposition: $(S^p - \lambda_t S^t)\, v_j = \lambda^j v_j$.

3. Reshape the projection directions: $U_t = [u_1, u_2, \ldots, u_{m'}]$, the leading eigenvectors.

4. If $U_t$ has converged, break.

3: Output the projection matrix $U$.
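A minimal numpy sketch of this iteration (the names and the convergence test are my choices; the eigen-step per iteration is the procedure above):

```python
import numpy as np

def iterative_trace_ratio(S_p, S_t, m, t_max=100, tol=1e-10):
    """Maximize Tr(U^T S_p U) / Tr(U^T S_t U) s.t. U^T U = I."""
    d = S_t.shape[0]
    U, _ = np.linalg.qr(np.random.randn(d, m))    # arbitrary column-orthogonal init
    lam = np.trace(U.T @ S_p @ U) / np.trace(U.T @ S_t @ U)
    for _ in range(t_max):
        # leading m eigenvectors of the trace difference matrix S_p - lam * S_t
        vals, vecs = np.linalg.eigh(S_p - lam * S_t)
        U = vecs[:, np.argsort(vals)[::-1][:m]]
        new_lam = np.trace(U.T @ S_p @ U) / np.trace(U.T @ S_t @ U)
        if new_lam - lam < tol:                   # objective rises monotonically
            break
        lam = new_lam
    return U
```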

Tensor Subspace Learning algorithms

Traditional Tensor Discriminant algorithms

• Tensor Subspace Analysis (He et al.)

• Two-dimensional Linear Discriminant Analysis (Ye et al.)

• Discriminant Analysis with Tensor Representation (Yan et al.)

• project the tensor along different dimensions or ways

• projection matrices for different dimensions are derived iteratively

• solve a trace ratio optimization problem

• DO NOT CONVERGE!

Discriminant Analysis Objective

Solve the projection matrices iteratively: leave one projection matrix as the variable while keeping the others constant.

$$\arg\max_{U^k|_{k=1}^n} \frac{\sum_{i,j} \big\| (\mathcal{X}_i - \mathcal{X}_j) \times_1 U^1 \cdots \times_n U^n \big\|^2 W^p_{ij}}{\sum_{i,j} \big\| (\mathcal{X}_i - \mathcal{X}_j) \times_1 U^1 \cdots \times_n U^n \big\|^2 W_{ij}}$$

• No closed-form solution

Mode-k unfolding of the tensor

$$\arg\max_{U^k} \frac{\sum_{i,j} \|U^{kT} Y^k_i - U^{kT} Y^k_j\|^2 W^p_{ij}}{\sum_{i,j} \|U^{kT} Y^k_i - U^{kT} Y^k_j\|^2 W_{ij}}$$

where $Y^k_i$ is the mode-$k$ unfolding of the partially projected tensor

$$\mathcal{Y}_i = \mathcal{X}_i \times_1 U^1 \cdots \times_{k-1} U^{k-1} \times_{k+1} U^{k+1} \cdots \times_n U^n$$
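A hedged numpy sketch of the two tensor operations this relies on (one common unfolding convention; the thesis may order the remaining modes differently):

```python
import numpy as np

def unfold(X, k):
    """Mode-k unfolding: mode-k fibers become the columns."""
    return np.moveaxis(X, k, 0).reshape(X.shape[k], -1)

def mode_k_product(X, U, k):
    """Mode-k product of tensor X with U^T, for U of shape (dim_k, m)."""
    Y = np.tensordot(X, U, axes=(k, 0))   # contracted mode moves to the end
    return np.moveaxis(Y, -1, k)          # put the projected mode back at k
```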

Objective Deduction

Discriminant Analysis Objective

Trace Ratio: General Formulation for the objectives of the Discriminant Analysis based Algorithms.

$$U^{k*} = \arg\max_{U^k} \frac{\mathrm{Tr}(U^{kT} S^p_k U^k)}{\mathrm{Tr}(U^{kT} S_k U^k)}$$

$$S_k = \sum_{i,j} W_{ij}\, (Y^k_i - Y^k_j)(Y^k_i - Y^k_j)^T, \qquad S^p_k = \sum_{i,j} W^p_{ij}\, (Y^k_i - Y^k_j)(Y^k_i - Y^k_j)^T$$

DATER: $S_k$ is the within-class scatter and $S^p_k$ the between-class scatter of the unfolded data.

TSA: the weights $W_{ij}$ are constructed from the image manifold, with $W^p$ the corresponding diagonal weight matrix.

Disagreement between the Objective and the Optimization Process

Why do previous algorithms not converge?

$$\arg\max_{U^1} \frac{\mathrm{Tr}(U^{1T} S^p_1 U^1)}{\mathrm{Tr}(U^{1T} S_1 U^1)} \;\Rightarrow\; \arg\max_{U^1} \mathrm{Tr}\!\left((U^{1T} S_1 U^1)^{-1} (U^{1T} S^p_1 U^1)\right) \qquad \text{GEVD}$$

$$\arg\max_{U^2} \frac{\mathrm{Tr}(U^{2T} S^p_2 U^2)}{\mathrm{Tr}(U^{2T} S_2 U^2)} \;\Rightarrow\; \arg\max_{U^2} \mathrm{Tr}\!\left((U^{2T} S_2 U^2)^{-1} (U^{2T} S^p_2 U^2)\right)$$

but

$$\frac{\mathrm{Tr}(A)}{\mathrm{Tr}(B)} \ne \mathrm{Tr}(B^{-1} A)$$

The conversion from Trace Ratio to Ratio Trace induces an inconsistency among the objectives of different dimensions!

What will we do? From Trace Ratio to Trace Difference

Objective:

$$U^{k*} = \arg\max_{U^k} \frac{\mathrm{Tr}(U^{kT} S^p_k U^k)}{\mathrm{Tr}(U^{kT} S_k U^k)}$$

Define

$$\lambda_t = \frac{\mathrm{Tr}(U_t^{kT} S^p_k U_t^k)}{\mathrm{Tr}(U_t^{kT} S_k U_t^k)}$$

Then

$$\mathrm{Tr}\!\left(U_t^{kT} (S^p_k - \lambda_t S_k) U_t^k\right) = 0$$

Trace Ratio → Trace Difference: define the trace difference function

$$g(\lambda, U^k) = \mathrm{Tr}\!\left(U^{kT} (S^p_k - \lambda S_k) U^k\right)$$

Find $U_{t+1}^k$ such that $g(\lambda_t, U_{t+1}^k) \ge g(\lambda_t, U_t^k)$, so that

$$\mathrm{Tr}\!\left(U_{t+1}^{kT} (S^p_k - \lambda_t S_k) U_{t+1}^k\right) = g(\lambda_t, U_{t+1}^k) \ge 0$$


Constraint: $U^{kT} U^k = I$.

Let $U_{t+1}^k = [u_1, u_2, \ldots, u_{m'_k}]$, where $u_1, u_2, \ldots, u_{m'_k}$ are the leading eigenvectors of $(S^p_k - \lambda_t S_k)$.

We have $g(\lambda_t, U_{t+1}^k) \ge g(\lambda_t, U_t^k) = 0$.

Thus

$$\lambda_{t+1} = \frac{\mathrm{Tr}(U_{t+1}^{kT} S^p_k U_{t+1}^k)}{\mathrm{Tr}(U_{t+1}^{kT} S_k U_{t+1}^k)} \ge \lambda_t$$

The objective rises monotonically!

Projection matrices of different dimensions share the same objective.

Main Algorithm

1: Initialization. Initialize $U^1_0, U^2_0, \ldots, U^n_0$ as arbitrary column-orthogonal matrices.

2: Iterative optimization.

For t=1, 2, . . . , Tmax, Do

For k=1, 2, . . . , n, Do

1. Set $\lambda_t^k = \dfrac{\sum_{i,j} \big\| (\mathcal{X}_i - \mathcal{X}_j) \times_1 U^1_t \cdots \times_{k-1} U^{k-1}_t \times_{k+1} U^{k+1}_{t-1} \cdots \times_n U^n_{t-1} \big\|^2 W^p_{ij}}{\sum_{i,j} \big\| (\mathcal{X}_i - \mathcal{X}_j) \times_1 U^1_t \cdots \times_{k-1} U^{k-1}_t \times_{k+1} U^{k+1}_{t-1} \cdots \times_n U^n_{t-1} \big\|^2 W_{ij}}$.

2. Compute $S_k$ and $S^p_k$.

3. Conduct eigenvalue decomposition: $(S^p_k - \lambda_t^k S_k)\, v_j = \lambda^j v_j$.

4. Reshape the projection directions: $U^k_t = [u_1, u_2, \ldots, u_{m'_k}]$, the leading eigenvectors.

5. If all $U^k_t$ have converged, break.

3: Output the projection matrices $U^k$, $k = 1, \ldots, n$.
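One pass of the inner loop, as a hedged sketch reusing the `unfold` and `mode_k_product` helpers sketched earlier (the quadratic pair loop is for clarity, not speed; `W` and `W_p` are the graph weight matrices from the objective):

```python
import numpy as np

def mode_k_trace_ratio_step(Xs, Us, W, W_p, k):
    """Update U^k: project along all other modes, unfold, one eigen-step."""
    Ys = []
    for X in Xs:
        Y = X
        for o, U in enumerate(Us):
            if o != k:
                Y = mode_k_product(Y, U, o)   # leave mode k unprojected
        Ys.append(unfold(Y, k))
    d = Ys[0].shape[0]
    S, S_p = np.zeros((d, d)), np.zeros((d, d))
    for i in range(len(Ys)):
        for j in range(len(Ys)):
            D = Ys[i] - Ys[j]
            S += W[i, j] * (D @ D.T)
            S_p += W_p[i, j] * (D @ D.T)
    lam = np.trace(Us[k].T @ S_p @ Us[k]) / np.trace(Us[k].T @ S @ Us[k])
    vals, vecs = np.linalg.eigh(S_p - lam * S)
    return vecs[:, np.argsort(vals)[::-1][:Us[k].shape[1]]]
```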

Highlights of the Trace Ratio Based Algorithm

• The objective value is guaranteed to increase monotonically, and the multiple projection matrices are proved to converge.

• Only eigenvalue decomposition is required in each iteration, which makes the algorithm extremely efficient.

• Enhanced potential classification capability of the low-dimensional representations derived by the subspace learning algorithms.

• The first work to give a convergent solution to the general tensor-based subspace learning.

Projection Visualization

Experimental Results

Visualization of the projection matrix W of PCA, ratio trace based LDA, and trace ratio based LDA (ITR) on the FERET database.

Face Recognition Results. Linear

Experimental Results

Comparison: Trace Ratio Based LDA vs. the Ratio Trace based LDA (PCA+LDA)

Comparison: Trace Ratio Based MFA vs. the Ratio Trace based MFA (PCA+MFA)

Face Recognition Results. Kernelization

Experimental Results

Trace Ratio Based KDA vs. the Ratio Trace based KDA

Trace Ratio Based KMFA vs. the Ratio Trace based KMFA

Results on UCI Dataset

Experimental Results

Testing classification errors on three UCI databases for both linear and kernel-based algorithms. Results are obtained from 100 realizations of randomly generated 70/30 splits of data.

Monotonicity of the Objective & Projection Matrix Convergence

Experimental Results

Face Recognition Results

Experimental Results

1. TMFA TR mostly outperforms all the other methods concerned in this work, with only one exception for the case G5P5 on the CMU PIE database.

2. For vector-based algorithms, the trace ratio based formulation is consistently superior to the ratio trace based one for subspace learning.

3. Tensor representation has the potential to improve the classification performance for both trace ratio and ratio trace formulations of subspace learning.

Correspondence Propagation

Geometric Structures & Feature Structures

Explore the geometric structures and feature domain consistency for object registration

Objective

Aim

• Exploit the geometric structures of sample features

• Introduce human interaction for correspondence guidance

• Seek a mapping of features from sets of different cardinalities

• Objects are represented as sets of feature points

Graph Construction

Spatial Graph Similarity Graph

From Spatial Graph to Categorical Product Graph

Suppose graph $G_1$ has vertices $\{v^1_1, v^1_2, \ldots, v^1_{N_1}\}$ and graph $G_2$ has vertices $\{v^2_1, v^2_2, \ldots, v^2_{N_2}\}$.

Assignment Neighborhood Definition

Definition: Two assignments $m_{i_1 i_2} = \{v^1_{i_1}, v^2_{i_2}\}$ and $m_{j_1 j_2} = \{v^1_{j_1}, v^2_{j_2}\}$ are neighbors iff both pairs $\{v^1_{i_1}, v^1_{j_1}\}$ and $\{v^2_{i_2}, v^2_{j_2}\}$ are neighbors in $G_1$ and $G_2$ respectively, namely,

$$m_{i_1 i_2} \sim m_{j_1 j_2} \iff v^1_{i_1} \sim v^1_{j_1} \;\text{ and }\; v^2_{i_2} \sim v^2_{j_2},$$

where $a \sim b$ means $a$ and $b$ are neighbors.

Example: $A = \{a_1, a_2, a_3, a_4, a_5, a_6\}$, $B = \{b_1, b_2, b_3\}$,

$$A \times B = \{(a_1, b_1), (a_1, b_2), (a_1, b_3), (a_2, b_1), \ldots, (a_6, b_3)\}$$

From Spatial Graph to Categorical Product Graph

$$G^a = G_1 \otimes G_2$$

The adjacency matrix $W^a$ of $G^a$ can be derived from

$$W^a = W^2 \otimes W^1,$$

where $\otimes$ is the matrix Kronecker product operator.

Smoothness along the spatial distribution:

$$\mathcal{L}(M) = \frac{1}{2} \sum_{i,j} w^a_{ij}\, (m_{v_i} - m_{v_j})^2 = M^T L^a M$$
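A tiny numpy check of the product-graph construction (toy graphs of my own choosing): assuming assignments are vectorized with the $G_1$ index varying fastest, consistent with $W^a = W^2 \otimes W^1$, the assignment-graph adjacency is exactly the Kronecker product.

```python
import numpy as np

W1 = np.array([[0, 1], [1, 0]])                    # G_1: two adjacent points
W2 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])   # G_2: a 3-point path
W_a = np.kron(W2, W1)                              # W^a = W^2 ⊗ W^1
assert W_a.shape == (6, 6)                         # N_1 * N_2 assignments
```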

Feature Domain Consistency & Soft Constraints

Similarity measure: $\sum (S \circ M)$, where $\circ$ is the matrix Hadamard product and $\sum(T)$ returns the sum of all elements in $T$.

One-to-one correspondence penalty:

$$\mathrm{Tr}\!\left((A_1^T M - e_{N_1})(A_1^T M - e_{N_1})^T\right) + \mathrm{Tr}\!\left((A_2^T M - e_{N_2})(A_2^T M - e_{N_2})^T\right)$$

where $A_1 = e_{N_2} \otimes I_{N_1}$ and $A_2 = I_{N_2} \otimes e_{N_1}$.

Assignment Labeling

Labeled assignments: reliable correspondences & inhomogeneous pairs.

Inhomogeneous Pair Labeling: assign zeros to those pairs with extremely low similarity scores, $M_{i + (j-1) N_1} = 0$.

Reliable Pair Labeling: assign ones to those reliable pairs, $M_{i + (j-1) N_1} = 1$.

Arrangement

Reliable Correspondence Propagation

Arrangement: reorder the assignment variables so that the labeled entries come first,

$$M^* = \begin{bmatrix} M^l \\ M^u \end{bmatrix}$$

Coefficient matrices:

$$A_1^* = [A_1^l, A_1^u], \qquad A_2^* = [A_2^l, A_2^u], \qquad S^* = \begin{bmatrix} S^l \\ S^u \end{bmatrix}$$

Spatial adjacency matrices:

$$W^{a*} = \begin{bmatrix} W^a_{ll} & W^a_{lu} \\ W^a_{ul} & W^a_{uu} \end{bmatrix}, \qquad L^{a*} = \begin{bmatrix} L^a_{ll} & L^a_{lu} \\ L^a_{ul} & L^a_{uu} \end{bmatrix}$$

Objective

Reliable Correspondence Propagation

Objective:

$$\min_{M^*} \; -S^{*T} M^* + M^{*T} L^{a*} M^* + \mathrm{Tr}\!\left((A_1^{*T} M^* - e_{N_1})(A_1^{*T} M^* - e_{N_1})^T\right) + \mathrm{Tr}\!\left((A_2^{*T} M^* - e_{N_2})(A_2^{*T} M^* - e_{N_2})^T\right)$$

Feature domain agreement: $S^{*T} M^*$

Geometric smoothness regularization: $M^{*T} L^{a*} M^*$

One-to-one correspondence penalty: $\mathrm{Tr}\!\left((A_1^{*T} M^* - e_{N_1})(A_1^{*T} M^* - e_{N_1})^T\right) + \mathrm{Tr}\!\left((A_2^{*T} M^* - e_{N_2})(A_2^{*T} M^* - e_{N_2})^T\right)$

Solution

Reliable Correspondence Propagation

Relax $M^*$ to the real domain; the minimizer has the closed form

$$M^u = (C_{uu})^{-1} (B^u - C_{ul} M^l)$$

where

$$C = A_1^* A_1^{*T} + A_2^* A_2^{*T} + L^{a*} = \begin{bmatrix} C_{ll} & C_{lu} \\ C_{ul} & C_{uu} \end{bmatrix}$$

and

$$B = \begin{bmatrix} B^l \\ B^u \end{bmatrix} = A_1^* e_{N_1} + A_2^* e_{N_2} + S^*$$
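A hedged numpy sketch of this propagation step (the block slicing assumes the labeled assignments were arranged first, as above):

```python
import numpy as np

def propagate(C, B, M_l):
    """M^u = C_uu^{-1} (B^u - C_ul M^l) via one linear solve."""
    n_l = len(M_l)             # number of labeled assignments
    C_ul = C[n_l:, :n_l]
    C_uu = C[n_l:, n_l:]
    return np.linalg.solve(C_uu, B[n_l:] - C_ul @ M_l)
```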


Rearrangement and Discretization

Inverse process of the element arrangement: $M^* \to M$.

Reshape the assignment vector $M$ into the assignment matrix.

Thresholding: assignments larger than a threshold are regarded as correspondences.

Eliciting: sequentially pick the assignments with the largest assignment scores.

Semisupervised & Automatic Systems

Semi-supervised & Unsupervised Frameworks

Exact pairwise correspondence labeling:

Users give exact correspondence guidance

Obscure correspondence guidance:

Rough correspondence of image parts

Experimental Results. Demonstration

Experiment. Dataset

Experimental Results. Details

Automatic feature matching scores on the Oxford real-image transformation dataset. The transformations include viewpoint change ((a) Graffiti and (b) Wall sequences), image blur ((c) Bikes and (d) Trees sequences), zoom and rotation ((e) Bark and (f) Boat sequences), illumination variation ((g) Leuven), and JPEG compression ((h) UBC).

Summary

Future Works

• From point-to-point correspondence to set-to-set correspondence.

• Multi-scale correspondence searching.

• Combine the object segmentation and registration.

Publications

[1] Huan Wang, Shuicheng Yan, Thomas Huang and Xiaoou Tang, 'A Convergent Solution to Tensor Subspace Learning', International Joint Conference on Artificial Intelligence (IJCAI 07, regular paper), Jan. 2007.

[2] Huan Wang, Shuicheng Yan, Thomas Huang and Xiaoou Tang, 'Trace Ratio vs. Ratio Trace for Dimensionality Reduction', IEEE Conference on Computer Vision and Pattern Recognition (CVPR 07), Jun. 2007.

[3] Huan Wang, Shuicheng Yan, Thomas Huang, Jianzhuang Liu and Xiaoou Tang, 'Transductive Regression Piloted by Inter-Manifold Relations', International Conference on Machine Learning (ICML 07), Jun. 2007.

[4] Huan Wang, Shuicheng Yan, Thomas Huang and Xiaoou Tang, 'Maximum Unfolded Embedding: Formulation, Solution, and Application for Image Clustering', ACM International Conference on Multimedia (ACM MM 06), Oct. 2006.

[5] Shuicheng Yan, Huan Wang, Thomas Huang and Xiaoou Tang, 'Ranking with Uncertain Labels', IEEE International Conference on Multimedia & Expo (ICME 07), May 2007.

[6] Shuicheng Yan, Huan Wang, Xiaoou Tang and Thomas Huang, 'Exploring Feature Descriptors for Face Recognition', IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 07, oral), Apr. 2007.

Thank You!

Transductive Regression on Multi-Class Data

Explore the intrinsic feature structures w.r.t. different classes for regression

Regression Algorithms. Reviews

Exploit the manifold structures to guide the regression

Belkin et al., Regularization and semi-supervised learning on large graphs

transduces the function values from the labeled data to the unlabeled ones utilizing local neighborhood relations,

Global optimization for a robust prediction.

Cortes et al., On transductive regression.

Tikhonov Regularization on the Reproducing Kernel Hilbert Space (RKHS)

Classification problem can be regarded as a special version of regression

Fei Wang et al., Label Propagation Through Linear Neighborhoods

An iterative procedure is deduced to propagate the class labels within local neighborhoods and has been proved convergent.

Regression values are constrained to 0 and 1 (binary): samples belonging to the corresponding class → 1; otherwise → 0.

The convergence point can be deduced from the regularization framework

$$f^* = \arg\min_{f \in \mathcal{H}_K} \frac{1}{n} \sum_{i=1}^{n} V(f(x_i), y_i) + \gamma \|f\|_K^2$$

$$f^* = \arg\min_{f \in \mathcal{H}_K} \frac{1}{l} \sum_{i=1}^{l} V(f(x_i), y_i) + \gamma_A \|f\|_K^2 + \frac{\gamma_I}{(u+l)^2} f^T L f$$

The Problem We Are Facing

Age estimation w.r.t. different genders (FG-NET Aging Database).

Pose estimation w.r.t. different genders, illuminations, expressions, and persons (CMU-PIE dataset).

The Problem We Are Facing

• All samples are considered as in the same class

• Samples close in the data space X are assumed to have similar function values (smoothness along the manifold)

• For the incoming sample, no class information is given.

• Utilize class information in the training process to boost the performance

Regression on Multi-Class Samples.


Traditional Algorithms

• The class information is easy to obtain for the training data

TRIM. Intra-Manifold Regularization

• Respective intrinsic graphs are built for different sample classes

• Correspondingly, the intra-manifold regularization terms for different classes are calculated separately

intrinsic graph

• The Regularization: $f^T L^p f$

when $p = 1$:

$$f^T L f = \frac{1}{2} \sum_{i,j} (f_i - f_j)^2 W_{ij}$$

when $p = 2$:

$$f^T L^T L f = \sum_i \Big\| f_i - \sum_{j: j \sim i} w_{ij} f_j \Big\|^2 \quad \text{w.r.t. } \sum_j w_{ij} = 1, \; w_{ij} \ge 0$$

• It may not be proper to preserve smoothness between samples from different classes.
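A small numpy sketch of the two regularizers just defined (dense inputs assumed; for $p = 2$ the rows of $W$ are taken to be normalized, matching the constraint above, so that $L = I - W$):

```python
import numpy as np

def laplacian_penalty(f, W, p=1):
    if p == 1:
        L = np.diag(W.sum(axis=1)) - W
        return f @ L @ f              # = 1/2 sum_ij (f_i - f_j)^2 W_ij
    L = np.eye(len(f)) - W            # rows of W sum to one
    return f @ L.T @ L @ f            # = sum_i (f_i - sum_j w_ij f_j)^2
```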


The algorithm

TRIM. Inter-Manifold Regularization

• Assumptions

Samples with similar labels lie generally in similar relative positions on the corresponding sub-manifolds.

• Motivation

1. Align the sub-manifolds of different classes according to the labeled points and graph structures.

2. Derive the correspondence in the aligned space using the nearest neighbor technique.

The algorithm

TRIM. Manifold Alignment

• Minimize the correspondence error on the landmark points

• Hold the intra-manifold structures

The alignment objective combines, for each class $k$, the correspondence error on the labeled landmark points, the intra-manifold smoothness $f^{kT} L^p_k f^k$ of the embedding, and a global compactness regularization $f^T L^a f$, where $L^a$ is the Laplacian matrix of $W^a$ with

$$w^a_{ij} = \begin{cases} 1 & \text{if } x_i \text{ and } x_j \text{ are of different classes} \\ 0 & \text{otherwise.} \end{cases}$$

TRIM. Inter-Manifold Regularization

• Concatenate the derived inter-manifold graphs to form $W^r$:

$$W^r = \begin{bmatrix} O & W^{12} & \cdots & W^{1M} \\ W^{21} & O & \cdots & W^{2M} \\ \vdots & \vdots & \ddots & \vdots \\ W^{M1} & W^{M2} & \cdots & O \end{bmatrix}$$

• Laplacian Regularization: $f^T L^r f$

Objective Deduction

TRIM. Objective

$$f^* = \arg\min_{f \in \mathcal{H}_K} \sum_k \frac{1}{l_k} \sum_{x^k_i \in X^k_l} \|f(x^k_i) - y^k_i\|^2 + \gamma_A \|f\|_K^2 + \gamma_I \sum_k \frac{1}{(N^k)^2} f^{kT} L^p_k f^k + \frac{\gamma_r}{N^2} f^T L^r f$$

• Fitness term: $\sum_k \frac{1}{l_k} \sum_{x^k_i \in X^k_l} \|f(x^k_i) - y^k_i\|^2$

• RKHS norm: $\|f\|_K^2$

• Intra-manifold regularization: $\sum_k \frac{1}{(N^k)^2} f^{kT} L^p_k f^k$

• Inter-manifold regularization: $f^T L^r f$

Solution

TRIM. Solution

• The solution to the minimization of the objective admits an expansion (generalized Representer theorem):

$$f^*(x) = \sum_{i=1}^{N_{l+u}} \alpha_i K(x, x_i)$$

Thus the minimization over the Hilbert space boils down to minimizing over the coefficient vector

$$\alpha = [\alpha^1_1, \ldots, \alpha^1_{l_1}, \ldots, \alpha^1_{u_1}, \ldots, \alpha^M_1, \ldots, \alpha^M_{l_M}, \ldots, \alpha^M_{u_M}]^T \in \mathbb{R}^N$$

Substituting the expansion into the objective turns it into a regularized least squares problem in $\alpha$; the minimizer is obtained from a linear system assembled from the labeled-sample selection matrices $S^k$, the intra-manifold Laplacians $L^p_k$, the inter-manifold Laplacian $L^r$, and $K$, the $N \times N$ Gram matrix of labeled and unlabeled points over all the sample classes.
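To show the shape of that linear system, here is a hedged sketch of the single-class special case, which reduces to Belkin et al.'s Laplacian regularized least squares; the full TRIM minimizer adds the per-class and inter-manifold Laplacian terms to the same kind of system.

```python
import numpy as np

def laplacian_rls(K, L, y_l, n_l, gamma_A, gamma_I):
    """K: (N,N) Gram matrix; L: (N,N) Laplacian; first n_l points labeled."""
    N = K.shape[0]
    J = np.zeros((N, N))
    J[:n_l, :n_l] = np.eye(n_l)            # selects the labeled samples
    Y = np.zeros(N)
    Y[:n_l] = y_l
    M = J @ K + gamma_A * n_l * np.eye(N) + (gamma_I * n_l / N**2) * (L @ K)
    return np.linalg.solve(M, Y)           # alpha: f(x) = sum_i alpha_i K(x, x_i)
```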

Solution

TRIM. Generalization

• For out-of-sample data, the label can be estimated using

$$y_{new} = \sum_{i=1}^{N_{l+u}} \alpha_i K(x_i, x_{new})$$

Note that in this framework the class information of the incoming sample is not required in the prediction stage.

Original version without the kernel:

$$f^* = \arg\min_f \sum_k \frac{1}{l_k} \sum_{x^k_i \in X^k_l} \|f^k_i - y^k_i\|^2 + \gamma_I \sum_k \frac{1}{(N^k)^2} f^{kT} L^p_k f^k + \frac{\gamma_r}{N^2} f^T L^r f$$

Two Moons

Experiments

YAMAHA Dataset

Experiments. Age Dataset

TRIM vs traditional graph Laplacian regularized regression for the training set evaluation on YAMAHA database.

Open-set evaluation for the kernelized regression on the YAMAHA database. (left) Regression on the training set. (right) Regression on out-of-sample data.