Learning of Data Collections in High-dimensional Spaces Without Supervision
Djemel Ziou
NSERC/Bell Canada Chair in Personal Imaging
Computer Science Dept., Université de Sherbrooke
Quebec, Canada
Content
- Visual collection management
- Machine learning
- Image segmentation
- Content-based image suggestion
Visual collection management
Motivations
NSF 2007, B. Efron 2002.
Reactive Access to Collections
- Text-based image retrieval. Text: keywords extracted from Web pages containing the image, figure captions, …
- Short-term need: the user queries an information retrieval system.
- Content-based image retrieval. Visual appearance: color, shape, texture, regions of interest, …
- Limitations: query, features, similarity, indexing, …
Proactive Access To Collections
Suggestion
1. Collaboration: users' conformity to groups; opinions of other users.
2. Content: conformity to the user's own history; items with the same tags (keywords).
Suggestion rules predict the buyer's needs.
Machine Learning
Introduction
• Representation of the stimulus: the feature space
Introduction
• Data: labelled set $L = \{(x_1, c_1), \ldots, (x_m, c_m)\}$ and unlabelled set $U = \{x_{m+1}, \ldots, x_n\}$
• Generative learning: classify with $c^* = \arg\max_c p(x, c \mid \Theta^*)$, where $\Theta^* = \arg\max_\Theta \prod_{i \in L} p(x_i, c_i \mid \Theta) \prod_{i \in U} p(x_i \mid \Theta)$
Under certain assumptions (structural, MAP): $p(x, c) = \int p(\Theta)\, p(x, c \mid \Theta)\, d\Theta$
• Discriminative learning: $p(c \mid x) = \int p(\Theta)\, p(c \mid x, \Theta)\, d\Theta$
Unlike generative learning, it 1) provides no information about $x$ ($p(x \mid \Theta) = p(x)$); 2) cannot be used with unlabelled data ($C$ must be observed).
(Figure: graphical models relating $X$ and $C$ for the generative and discriminative cases.)
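To make the generative route concrete, here is a minimal sketch with two toy 1-D Gaussian classes (all parameters are hypothetical): it classifies through the joint $p(x, c)$ and recovers the discriminative quantity $p(c \mid x)$ by Bayes' rule.

```python
import numpy as np

# Toy generative classifier: two 1-D Gaussian classes.
# Priors, means, and std are illustrative values only.
priors = np.array([0.5, 0.5])   # p(c)
means = np.array([0.0, 3.0])    # class-conditional means
std = 1.0

def joint(x):
    """p(x, c) = p(c) * p(x | c) for each class c."""
    lik = np.exp(-0.5 * ((x - means) / std) ** 2) / (std * np.sqrt(2 * np.pi))
    return priors * lik

def classify(x):
    """Generative decision: c* = argmax_c p(x, c)."""
    return int(np.argmax(joint(x)))

def posterior(x):
    """Discriminative quantity p(c | x), obtained from the joint by Bayes' rule."""
    j = joint(x)
    return j / j.sum()

print(classify(0.2))   # point near class 0's mean
print(classify(2.9))   # point near class 1's mean
```

The same joint density also scores unlabelled points via $p(x) = \sum_c p(x, c)$, which is exactly what the discriminative route cannot do.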
Discriminative Learning: Bayesian Logistic Regression
Ksantini, Ziou, Colin, Dubeau. IEEE Trans. on PAMI, 2008
Maximizing the conditional log-likelihood
$l(L, w) = \sum_{i=1}^{m} \log p(s_i \mid x_i, w)$,
where $p(s = 1 \mid x, w_0, w) = \left(1 + \exp(-(w_0 + w^T x))\right)^{-1}$.
There are several drawbacks (high dimension, separability, …), hence the Bayesian formulation.
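The conditional log-likelihood being maximized can be sketched numerically as follows (the two data points and the candidate weights are made up for illustration):

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def cond_log_likelihood(w0, w, X, s):
    """l(L, w) = sum_i log p(s_i | x_i, w), for labels s_i in {0, 1}."""
    p1 = sigmoid(w0 + X @ w)   # p(s = 1 | x, w0, w)
    return np.sum(s * np.log(p1) + (1 - s) * np.log(1 - p1))

# Hypothetical toy data: two well-separated 1-D points.
X = np.array([[2.0], [-2.0]])
s = np.array([1, 0])

good = cond_log_likelihood(0.0, np.array([1.0]), X, s)
bad = cond_log_likelihood(0.0, np.array([-1.0]), X, s)
print(good > bad)  # weights aligned with the labels score higher
```

Maximizing this objective directly is where the drawbacks mentioned above appear (e.g. the weights diverge on separable data), which motivates placing a prior on $w$ in the Bayesian formulation.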
Variational approximation and Jensen’s inequality lead to:
Generative learning: case of finite mixture of pdfs
Finite mixture model:
$p(Y \mid \Theta) = \sum_{j=1}^{M} p(Y \mid j, \theta_j)\, p(j)$
Problems: pdf, estimation, model selection, …
(Figure: third-moment vs. fourth-moment plane, in which the Gaussian occupies a point, the Gamma a line, and the Beta an area; a mixture of different pdfs for SAR images.)
El Zaart and Ziou, Int. J. Remote Sensing, 2007
Which pdf? Gaussian, Gamma, …; the same or different pdfs for the populations?
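The finite mixture density is straightforward to evaluate even when the components come from different families, as in the SAR example; a sketch with one Gaussian and one Gamma component (weights and parameters are illustrative):

```python
from scipy.stats import norm, gamma
from scipy.integrate import quad

# Mixture with components from *different* families.
# Weights and parameters are made up for illustration.
weights = [0.6, 0.4]
components = [norm(loc=2.0, scale=0.5), gamma(a=3.0, scale=1.0)]

def mixture_pdf(y):
    """p(y | Theta) = sum_j p(j) * p(y | j, theta_j)."""
    return sum(w * c.pdf(y) for w, c in zip(weights, components))

# Sanity check: the mixture density integrates to ~1 over its support.
total, _ = quad(mixture_pdf, 0.0, 50.0)
print(round(total, 3))
```

Choosing *which* families to mix (and how many components) is exactly the model-selection problem raised on this slide.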
The Generalized Dirichlet Distribution
Generalized Dirichlet distribution (GDD):
$p(Y_1, \ldots, Y_D) = \prod_{i=1}^{D} \frac{\Gamma(\alpha_i + \beta_i)}{\Gamma(\alpha_i)\,\Gamma(\beta_i)}\, Y_i^{\alpha_i - 1} \left(1 - \sum_{j=1}^{i} Y_j\right)^{\gamma_i}$
where $\gamma_i = \beta_i - \alpha_{i+1} - \beta_{i+1}$ for $i = 1, \ldots, D - 1$, and $\gamma_D = \beta_D - 1$.
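The GDD density can be evaluated directly from its product form; a sketch (parameter values are illustrative). For $D = 1$ the GDD reduces to the Beta distribution, which the sanity check exploits:

```python
import math
from scipy.stats import beta as beta_dist

def gdd_pdf(y, alpha, beta):
    """Generalized Dirichlet density; y, alpha, beta are equal-length lists,
    with y_i > 0 and sum(y) < 1."""
    D = len(y)
    val, cum = 1.0, 0.0
    for i in range(D):
        coef = math.gamma(alpha[i] + beta[i]) / (math.gamma(alpha[i]) * math.gamma(beta[i]))
        # gamma_i = beta_i - alpha_{i+1} - beta_{i+1} for i < D, beta_D - 1 for i = D
        gam = beta[i] - alpha[i + 1] - beta[i + 1] if i < D - 1 else beta[i] - 1.0
        cum += y[i]
        val *= coef * y[i] ** (alpha[i] - 1.0) * (1.0 - cum) ** gam
    return val

# Sanity check: for D = 1 the GDD is exactly Beta(alpha_1, beta_1).
print(abs(gdd_pdf([0.3], [2.0], [5.0]) - beta_dist(2.0, 5.0).pdf(0.3)) < 1e-9)
```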
Multi-dimensionality is Omnipresent
Multidimensional data
- Image descriptors: 128,000 features (128 SIFT features × 1,000 interest points)
- Faces: 128 × 128 pixels = 16,384 features/face
- Text: number of terms in a corpus ≈ 10,000
High-Dimensional data
If $Y_i = (Y_{i1}, \ldots, Y_{iD})$ is GDD, the transformation
$X_{i1} = Y_{i1}$ for $d = 1$, and $X_{id} = \dfrac{Y_{id}}{1 - \sum_{j=1}^{d-1} Y_{ij}}$ for $d = 2, \ldots, D$,
yields independent variables: each $X_{id}$ follows a Beta distribution,
$p_{\mathrm{Beta}}(X_{id} \mid \alpha_{jd}, \beta_{jd}) = \frac{\Gamma(\alpha_{jd} + \beta_{jd})}{\Gamma(\alpha_{jd})\,\Gamma(\beta_{jd})}\, X_{id}^{\alpha_{jd} - 1} (1 - X_{id})^{\beta_{jd} - 1}$,
so each GDD mixture component factorizes as $\prod_{d=1}^{D} p_{\mathrm{Beta}}(X_{id} \mid \alpha_{jd}, \beta_{jd})$.
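The transformation to independent Beta variables is a single pass over the vector; a minimal sketch (the input vector is illustrative):

```python
def gdd_to_beta(y):
    """Map a GDD-distributed vector to independent Beta variables:
    x_1 = y_1, and x_d = y_d / (1 - y_1 - ... - y_{d-1}) for d >= 2."""
    x, cum = [], 0.0
    for yd in y:
        x.append(yd / (1.0 - cum))
        cum += yd
    return x

y = [0.2, 0.3, 0.1]
x = gdd_to_beta(y)
print([round(v, 4) for v in x])  # [0.2, 0.375, 0.2]
```

Because the $X_{id}$ are independent, the high-dimensional GDD likelihood becomes a product of 1-D Beta terms, which is what makes estimation tractable at this scale.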
Bouguila and Ziou. IEEE Trans. On PAMI, 2007
Boutemedjet, Bouguila and Ziou. IEEE Trans. On PAMI, 2009
Feature Selection
Mixture model before and after transformation:
Feature Selection Model
Relevance criterion: marginal independence of X_l from the class label Z.
Label X_l with a hidden Bernoulli variable φ_l, such that φ_l = 0 when X_l ~ ξ_l.
General definition: ξ_l is a mixture of K components ξ_kl, e.g. the distribution of the background in object images. Label X_l within the mixture ξ_l by a hidden multinomial variable.
Approximation:
New mixture model: Generalized Dirichlet (GD) with selection of independent features.
Boutemedjet, Bouguila and Ziou. IEEE Trans. On PAMI, 2009
Unsupervised Learning using the MML Principle
$\mathrm{MessLen} = -\log(p(\Theta)) - \log(p(\mathcal{X} \mid \Theta)) + 0.5 \log(|I(\Theta)|) - 0.5\, N_p \log(12) + 0.5\, N_p$
• $N_p$ is the number of parameters being estimated, equal to $M(2D + 1)$.
• $p(\Theta)$ is the prior probability.
• $I(\Theta)$ is the Fisher information (determinant of the Hessian matrix).
• Problems: what to choose for $p(\Theta)$ and $I(\Theta)$?
Paradigm: encode → send → decode. What is the minimum message length?
Bouguila and Ziou. IEEE Trans. On TKDE, 2007
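Since the message length is a sum of scalar terms, model selection reduces to comparing candidates; a sketch (all numeric inputs below are invented for illustration, not taken from the papers):

```python
import math

def message_length(log_prior, log_likelihood, log_det_fisher, n_params):
    """MessLen = -log p(Theta) - log p(X | Theta) + 0.5 log |I(Theta)|
                 - 0.5 * Np * log(12) + 0.5 * Np."""
    return (-log_prior - log_likelihood + 0.5 * log_det_fisher
            - 0.5 * n_params * math.log(12.0) + 0.5 * n_params)

# Compare two hypothetical candidate models: the one with the
# shorter message is preferred under the MML principle.
simple = message_length(-2.0, -100.0, 4.0, 5)
complex_ = message_length(-3.0, -99.0, 20.0, 11)
print(simple < complex_)  # the shorter message wins
```

Note how the fit term ($-\log p(\mathcal{X} \mid \Theta)$) rewards the richer model while the Fisher-information term penalizes its extra parameters, which is the trade-off the slide's "Problems" line refers to.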
Unsupervised Learning MML
Fisher Information: E.g.
Prior distribution: E.g.
Message Length of the data set
Boutemedjet, Bouguila and Ziou. IEEE Trans. On PAMI, 2009
Optimization of MML
Expectation-Maximization (EM) algorithm
E-step: expected posterior probabilities
M-step: (2×2 matrix)
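The E- and M-steps have the same structure for any mixture; as a stand-in for the GDD mixture, here is one EM iteration for a 1-D Gaussian mixture (data and initial parameters are illustrative):

```python
import numpy as np

def em_step(x, weights, means, stds):
    """One EM iteration for a 1-D Gaussian mixture."""
    # E-step: expected posterior probabilities (responsibilities)
    lik = np.exp(-0.5 * ((x[:, None] - means) / stds) ** 2) / (stds * np.sqrt(2 * np.pi))
    resp = weights * lik
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: re-estimate parameters from the responsibilities
    nk = resp.sum(axis=0)
    weights = nk / len(x)
    means = (resp * x[:, None]).sum(axis=0) / nk
    stds = np.sqrt((resp * (x[:, None] - means) ** 2).sum(axis=0) / nk)
    return weights, means, stds

x = np.array([-2.1, -1.9, -2.0, 2.0, 1.9, 2.1])
w, m, s = em_step(x, np.array([0.5, 0.5]), np.array([-1.0, 1.0]), np.array([1.0, 1.0]))
print(np.round(m, 2))  # means move toward the two data clusters
```

In the talk's setting the M-step additionally optimizes the MML objective rather than the plain likelihood, so the number of components can shrink during learning.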
Object image categorization
Goal: identify categories and irrelevant features.
Challenge: intra-class variability + inter-class similarity (a challenging problem in computer vision).
Existing: supervised, K-NN with Euclidean distance.
Collection: 2,688 images, 8 classes.
Features:
- Scale Invariant Feature Transform (SIFT): ≈ 1.5 × 10^6 descriptors, 128-D (2 GB)
- Visual vocabulary: 700 “visual words”
- Probabilistic Latent Semantic Indexing (pLSI): P(z|I), hidden aspects defined on a simplex (non-Euclidean)
Results
Feature Selection improves the accuracy of image categorization
Image segmentation and object tracking
M. S. Allili and D. Ziou, Int. J. of Computer Mathematics, 2007.
M. S. Allili and D. Ziou, Neurocomputing, 2008.
Active contour based approach
(Figure: initial contour evolving to the final contour.)
Variational formulation
Problem formulation of segmentation
Contrast estimation
Proposed approach: statistical model selection + energy functional
Mixture model: $p(U \mid \hat{\Theta}) = \sum_{i=1}^{K} \hat{p}_i\, p(U \mid \hat{\theta}_i)$
Minimize over the regions $(R_1, \ldots, R_K)$ and parameters $(\theta_1, \ldots, \theta_K)$ the energy functional
$E(R_1, \ldots, R_K, \theta_1, \ldots, \theta_K) = \sum_{i=1}^{K} \oint_{\partial R_i} ds - \sum_{i=1}^{K} \iint_{R_i} \log\big(p(U(x, y) \mid \theta_i)\big)\, dx\, dy$
solved via the Euler–Lagrange PDE.
Topology change (Level sets)
Experimental results
Object tracking in video
Content-Based Image Suggestion (CBIS) as a Model Selection Problem
Boutemedjet and Ziou, IEEE Trans. on Multimedia, 2008.
Suggestion Criteria
Data
Users: U = {u_1, u_2, …, u_{N_u}}; Contexts: E = {e_1, e_2, …, e_{N_e}}; Images: X = {x_1, x_2, …, x_{N_x}}
Ratings of users on images: D = {(u(i), e(i), x(i), r(i)), i = 1, …, N}
Data modeling principle: similar users prefer visually and semantically similar products.
Suggestion: consumers need highly rated and less redundant products.
Data model: p(u,e,x,r)
Rating model
- Each quadruplet (u, e, v, x) is a random vector.
- Discover user/image classes (z, c): label (u, e, v, x) with two hidden variables, z (user class) and c (image class).
- All variables except x are discrete (multinomial distributions); x ~ GD.
- Parameters Θ. Diversity: penalize predicted ratings for consumed images X_ue.
- Consumed images become irrelevant: N_ue = {(u, e, x_t^ue, r^-), t = 1, …, N_ue}.
- Update Θ from N_ue; new data are handled.
Algorithm
Results: Mean Absolute Error (MAE)
Baselines:
- PCC: Pearson Correlation Coefficients (P. Resnick et al., CSCW 1994)
- Aspect Model (T. Hofmann, ACM TOIS 2004)
- FMM: Flexible Mixture Model (L. Si & R. Jin, ICML 2003)
- URP: User Rating Profile (B. Marlin, NIPS 2004)
- V-FMM: no contextual information, E = singleton
- V-GD-FMM: no feature selection
Feature Selection improves the rating prediction accuracy
Method            PCC    Aspect  FMM    URP    V-FMM  V-GD-FMM  I-VCC  D-VCC
Avg. MAE          1.327  1.201   1.145  1.116  0.890  0.754     0.712  0.645
Std. Deviation    0.040  0.051   0.036  0.042  0.038  0.027     0.022  0.014
Improvement (%)   0.00   9.49    13.71  15.90  32.94  43.18     51.62  55.84
Thank you
References
M. S. Allili, D. Ziou. Object tracking in videos using adaptive mixture models and active contours. Neurocomputing 7, pp. 2001-2011, 2008.
M. S. Allili, D. Ziou: Automatic colour-texture image segmentation using active contours. Int. J. Comput. Math. 84(9): 1325-1338, 2007.
S. Boutemedjet, D. Ziou. A Graphical Model for Context-Aware Visual Content Recommendation. IEEE Trans. on Multimedia 10, pp. 52-62, 2008.
S. Boutemedjet, N. Bouguila, and D. Ziou (In press). A Hybrid Feature Extraction Selection Approach for High-Dimensional Non-Gaussian Data Clustering. IEEE Trans. on Pattern Analysis and Machine Intelligence, 2009.
N. Bouguila and D. Ziou: High-Dimensional Unsupervised Selection and Estimation of a Finite Generalized Dirichlet Mixture Model Based on Minimum Message Length. IEEE Trans. on Pattern Analysis and Machine Intelligence, 2007.
R. Ksantini, D. Ziou, B. Colin, F. Dubeau. Weighted Pseudometric Discriminatory Power Improvement Using a Bayesian Logistic Regression Model Based on a Variational Method. IEEE Trans. Pattern Anal. Mach. Intell. 30(2): 253-266, 2008.
D. Ziou, T. Hamri, S. Boutemedjet. A hybrid probabilistic framework for content-based image retrieval with feature weighting. Pattern Recognition 42(7): 1511-1519, 2009.
M. L. Kherfi, D. Ziou. Relevance feedback for CBIR: a new approach based on probabilistic feature weighting with positive and negative examples. IEEE Trans. on Image Processing 15(4): 1017-1030 2006.
M.-F. Auclair-Fortier, D. Ziou. A Global Approach for Solving Evolutive Heat Transfer for Image Denoising and Inpainting. IEEE Trans. Image Processing, 15:2558-2574, 2006.
A. F. El Ouafdi, D. Ziou, and H. Krim. A smart stochastic approach for manifolds smoothing. Comput. Graphic Forum 27, pp. 1357-1364, 2008.