Learning of Data Collections in High-dimensional Spaces Without Supervision
Djemel Ziou
NSERC/Bell Canada Chair in Personal Imaging
Computer Science Dept., Université de Sherbrooke
Quebec, Canada
Content
- Visual collection management
- Machine learning
- Image segmentation
- Content-based image suggestion
Visual collection management
Motivations
NSF 2007, B. Efron 2002.
Reactive Access to Collections
- Text-based image retrieval. Text: keywords extracted from Web pages containing the image, figure captions, …
- Short-term need: the user queries an information retrieval system.
- Content-based image retrieval. Visual appearance: color, shape, texture, regions of interest, …
- Limitations: query, features, similarity, indexing, …
Proactive Access To Collections
Suggestion
1. Collaboration: users' conformity to groups; opinions of other users.
2. Content: conformity to the user's own history; items with the same tags (keywords).
Suggestion rules predict the buyer's needs.
Machine Learning
Introduction
• Representation of the stimulus: the feature space
Introduction
• Data: labelled set $L = \{(x_1, c_1), \ldots, (x_m, c_m)\}$ and unlabelled set $U = \{x_{m+1}, \ldots, x_n\}$
• Generative learning: classify with $c^* = \arg\max_c p(x, c \mid \Theta^*)$, where $\Theta^* = \arg\max_\Theta \prod_{i \in L} p(x_i, c_i \mid \Theta) \prod_{i \in U} p(x_i \mid \Theta)$
Under certain assumptions (structural, MAP): $p(x, c) = \int p(\Theta)\, p(x, c \mid \Theta)\, d\Theta$
• Discriminative learning: $p(c \mid x) = \int p(\Theta)\, p(c \mid x, \Theta)\, d\Theta$
Unlike generative learning, it 1) provides no information about $x$ ($p(x \mid \Theta) = p(x)$); 2) cannot be used with unlabelled data ($C$ must be observed).
(Figure: graphical models relating $X$ and $C$ for the generative and discriminative cases.)
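To make the generative route concrete, here is a minimal sketch with two toy 1-D Gaussian classes (all parameters are hypothetical): it classifies through the joint $p(x, c)$ and recovers the discriminative quantity $p(c \mid x)$ by Bayes' rule.

```python
import numpy as np

# Toy generative classifier: two 1-D Gaussian classes.
# Priors, means, and std are illustrative values only.
priors = np.array([0.5, 0.5])   # p(c)
means = np.array([0.0, 3.0])    # class-conditional means
std = 1.0

def joint(x):
    """p(x, c) = p(c) * p(x | c) for each class c."""
    lik = np.exp(-0.5 * ((x - means) / std) ** 2) / (std * np.sqrt(2 * np.pi))
    return priors * lik

def classify(x):
    """Generative decision: c* = argmax_c p(x, c)."""
    return int(np.argmax(joint(x)))

def posterior(x):
    """Discriminative quantity p(c | x), obtained from the joint by Bayes' rule."""
    j = joint(x)
    return j / j.sum()

print(classify(0.2))   # point near class 0's mean
print(classify(2.9))   # point near class 1's mean
```

The same joint density also scores unlabelled points via $p(x) = \sum_c p(x, c)$, which is exactly what the discriminative route cannot do.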
Discriminative Learning: Bayesian Logistic Regression
Ksantini, Ziou, Colin, Dubeau. IEEE Trans. on PAMI, 2008
Maximizing the conditional log-likelihood
$l(L, w) = \sum_{i=1}^{m} \log p(s_i \mid x_i, w)$,
where $p(s = 1 \mid x, w_0, w) = \left(1 + \exp(-(w_0 + w^T x))\right)^{-1}$.
There are several drawbacks (high dimension, separability, …), hence the Bayesian formulation.
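The conditional log-likelihood being maximized can be sketched numerically as follows (the two data points and the candidate weights are made up for illustration):

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def cond_log_likelihood(w0, w, X, s):
    """l(L, w) = sum_i log p(s_i | x_i, w), for labels s_i in {0, 1}."""
    p1 = sigmoid(w0 + X @ w)   # p(s = 1 | x, w0, w)
    return np.sum(s * np.log(p1) + (1 - s) * np.log(1 - p1))

# Hypothetical toy data: two well-separated 1-D points.
X = np.array([[2.0], [-2.0]])
s = np.array([1, 0])

good = cond_log_likelihood(0.0, np.array([1.0]), X, s)
bad = cond_log_likelihood(0.0, np.array([-1.0]), X, s)
print(good > bad)  # weights aligned with the labels score higher
```

Maximizing this objective directly is where the drawbacks mentioned above appear (e.g. the weights diverge on separable data), which motivates placing a prior on $w$ in the Bayesian formulation.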
Variational approximation and Jensen’s inequality lead to:
Generative learning: case of finite mixture of pdfs
Finite mixture model:
$p(Y \mid \Theta) = \sum_{j=1}^{M} p(Y \mid j, \theta_j)\, p(j)$
Problems: pdf, estimation, model selection, …
(Figure: third-moment vs. fourth-moment plane, in which the Gaussian occupies a point, the Gamma a line, and the Beta an area; a mixture of different pdfs for SAR images.)
El Zaart and Ziou, Int. J. Remote Sensing, 2007
Which pdf? Gaussian, Gamma, …; the same or different pdfs for the populations?
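The finite mixture density is straightforward to evaluate even when the components come from different families, as in the SAR example; a sketch with one Gaussian and one Gamma component (weights and parameters are illustrative):

```python
from scipy.stats import norm, gamma
from scipy.integrate import quad

# Mixture with components from *different* families.
# Weights and parameters are made up for illustration.
weights = [0.6, 0.4]
components = [norm(loc=2.0, scale=0.5), gamma(a=3.0, scale=1.0)]

def mixture_pdf(y):
    """p(y | Theta) = sum_j p(j) * p(y | j, theta_j)."""
    return sum(w * c.pdf(y) for w, c in zip(weights, components))

# Sanity check: the mixture density integrates to ~1 over its support.
total, _ = quad(mixture_pdf, 0.0, 50.0)
print(round(total, 3))
```

Choosing *which* families to mix (and how many components) is exactly the model-selection problem raised on this slide.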
The Generalized Dirichlet Distribution
Generalized Dirichlet distribution (GDD):
$p(Y_1, \ldots, Y_D) = \prod_{i=1}^{D} \frac{\Gamma(\alpha_i + \beta_i)}{\Gamma(\alpha_i)\,\Gamma(\beta_i)}\, Y_i^{\alpha_i - 1} \left(1 - \sum_{j=1}^{i} Y_j\right)^{\gamma_i}$
where $\gamma_i = \beta_i - \alpha_{i+1} - \beta_{i+1}$ for $i = 1, \ldots, D - 1$, and $\gamma_D = \beta_D - 1$.
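The GDD density can be evaluated directly from its product form; a sketch (parameter values are illustrative). For $D = 1$ the GDD reduces to the Beta distribution, which the sanity check exploits:

```python
import math
from scipy.stats import beta as beta_dist

def gdd_pdf(y, alpha, beta):
    """Generalized Dirichlet density; y, alpha, beta are equal-length lists,
    with y_i > 0 and sum(y) < 1."""
    D = len(y)
    val, cum = 1.0, 0.0
    for i in range(D):
        coef = math.gamma(alpha[i] + beta[i]) / (math.gamma(alpha[i]) * math.gamma(beta[i]))
        # gamma_i = beta_i - alpha_{i+1} - beta_{i+1} for i < D, beta_D - 1 for i = D
        gam = beta[i] - alpha[i + 1] - beta[i + 1] if i < D - 1 else beta[i] - 1.0
        cum += y[i]
        val *= coef * y[i] ** (alpha[i] - 1.0) * (1.0 - cum) ** gam
    return val

# Sanity check: for D = 1 the GDD is exactly Beta(alpha_1, beta_1).
print(abs(gdd_pdf([0.3], [2.0], [5.0]) - beta_dist(2.0, 5.0).pdf(0.3)) < 1e-9)
```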
Multi-dimensionality is Omnipresent
Multidimensional data
- Image descriptors: 128,000 features (128 SIFT features × 1,000 interest points)
- Faces: 128 × 128 pixels = 16,384 features/face
- Text: number of terms in a corpus ≈ 10,000
High-Dimensional data
If $Y_i = (Y_{i1}, \ldots, Y_{iD})$ is GDD, the transformation
$X_{i1} = Y_{i1}$ for $d = 1$, and $X_{id} = \dfrac{Y_{id}}{1 - \sum_{j=1}^{d-1} Y_{ij}}$ for $d = 2, \ldots, D$,
yields independent variables: each $X_{id}$ follows a Beta distribution,
$p_{\mathrm{Beta}}(X_{id} \mid \alpha_{jd}, \beta_{jd}) = \frac{\Gamma(\alpha_{jd} + \beta_{jd})}{\Gamma(\alpha_{jd})\,\Gamma(\beta_{jd})}\, X_{id}^{\alpha_{jd} - 1} (1 - X_{id})^{\beta_{jd} - 1}$,
so each GDD mixture component factorizes as $\prod_{d=1}^{D} p_{\mathrm{Beta}}(X_{id} \mid \alpha_{jd}, \beta_{jd})$.
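The transformation to independent Beta variables is a single pass over the vector; a minimal sketch (the input vector is illustrative):

```python
def gdd_to_beta(y):
    """Map a GDD-distributed vector to independent Beta variables:
    x_1 = y_1, and x_d = y_d / (1 - y_1 - ... - y_{d-1}) for d >= 2."""
    x, cum = [], 0.0
    for yd in y:
        x.append(yd / (1.0 - cum))
        cum += yd
    return x

y = [0.2, 0.3, 0.1]
x = gdd_to_beta(y)
print([round(v, 4) for v in x])  # [0.2, 0.375, 0.2]
```

Because the $X_{id}$ are independent, the high-dimensional GDD likelihood becomes a product of 1-D Beta terms, which is what makes estimation tractable at this scale.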
Bouguila and Ziou. IEEE Trans. On PAMI, 2007
Boutemedjet, Bouguila and Ziou. IEEE Trans. On PAMI, 2009
Feature Selection
Mixture model before and after transformation:
Feature Selection Model
Relevance criterion: marginal independence of X_l from the class label Z.
Label X_l with a hidden Bernoulli variable φ_l, such that φ_l = 0 when X_l ~ ξ_l.
General definition: ξ_l is a mixture of K components ξ_kl, e.g. the distribution of the background in object images. Label X_l within the mixture ξ_l by a hidden multinomial variable.
Approximation:
New mixture model: Generalized Dirichlet (GD) with selection of independent features.
Boutemedjet, Bouguila and Ziou. IEEE Trans. On PAMI, 2009
Unsupervised Learning using the MML Principle
$\mathrm{MessLen} = -\log(p(\Theta)) - \log(p(\mathcal{X} \mid \Theta)) + 0.5 \log(|I(\Theta)|) - 0.5\, N_p \log(12) + 0.5\, N_p$
• $N_p$ is the number of parameters being estimated, equal to $M(2D + 1)$.
• $p(\Theta)$ is the prior probability.
• $I(\Theta)$ is the Fisher information (determinant of the Hessian matrix).
• Problems: what to choose for $p(\Theta)$ and $I(\Theta)$?
Paradigm: encode → send → decode. What is the minimum message length?
Bouguila and Ziou. IEEE Trans. On TKDE, 2007
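Since the message length is a sum of scalar terms, model selection reduces to comparing candidates; a sketch (all numeric inputs below are invented for illustration, not taken from the papers):

```python
import math

def message_length(log_prior, log_likelihood, log_det_fisher, n_params):
    """MessLen = -log p(Theta) - log p(X | Theta) + 0.5 log |I(Theta)|
                 - 0.5 * Np * log(12) + 0.5 * Np."""
    return (-log_prior - log_likelihood + 0.5 * log_det_fisher
            - 0.5 * n_params * math.log(12.0) + 0.5 * n_params)

# Compare two hypothetical candidate models: the one with the
# shorter message is preferred under the MML principle.
simple = message_length(-2.0, -100.0, 4.0, 5)
complex_ = message_length(-3.0, -99.0, 20.0, 11)
print(simple < complex_)  # the shorter message wins
```

Note how the fit term ($-\log p(\mathcal{X} \mid \Theta)$) rewards the richer model while the Fisher-information term penalizes its extra parameters, which is the trade-off the slide's "Problems" line refers to.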
Unsupervised Learning MML
Fisher Information: E.g.
Prior distribution: E.g.
Message Length of the data set
Boutemedjet, Bouguila and Ziou. IEEE Trans. On PAMI, 2009
Optimization of MML
Expectation-Maximization (EM) algorithm
E-step: expected posterior probabilities
M-step: (2×2 matrix)
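The E- and M-steps have the same structure for any mixture; as a stand-in for the GDD mixture, here is one EM iteration for a 1-D Gaussian mixture (data and initial parameters are illustrative):

```python
import numpy as np

def em_step(x, weights, means, stds):
    """One EM iteration for a 1-D Gaussian mixture."""
    # E-step: expected posterior probabilities (responsibilities)
    lik = np.exp(-0.5 * ((x[:, None] - means) / stds) ** 2) / (stds * np.sqrt(2 * np.pi))
    resp = weights * lik
    resp /= resp.sum(axis=1, keepdims=True)
    # M-step: re-estimate parameters from the responsibilities
    nk = resp.sum(axis=0)
    weights = nk / len(x)
    means = (resp * x[:, None]).sum(axis=0) / nk
    stds = np.sqrt((resp * (x[:, None] - means) ** 2).sum(axis=0) / nk)
    return weights, means, stds

x = np.array([-2.1, -1.9, -2.0, 2.0, 1.9, 2.1])
w, m, s = em_step(x, np.array([0.5, 0.5]), np.array([-1.0, 1.0]), np.array([1.0, 1.0]))
print(np.round(m, 2))  # means move toward the two data clusters
```

In the talk's setting the M-step additionally optimizes the MML objective rather than the plain likelihood, so the number of components can shrink during learning.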
Object image categorization
Goal: identify categories and irrelevant features.
Challenge: intra-class variability + inter-class similarity (a challenging problem in computer vision).
Existing: supervised, K-NN with Euclidean distance.
Collection: 2,688 images, 8 classes.
Features:
- Scale Invariant Feature Transform (SIFT): ≈ 1.5 × 10^6 descriptors, 128-D (2 GB)
- Visual vocabulary: 700 “visual words”
- Probabilistic Latent Semantic Indexing (pLSI): P(z|I), hidden aspects defined on a simplex (non-Euclidean)
Results
Feature Selection improves the accuracy of image categorization
Image segmentation and object tracking
M. S. Allili and D. Ziou, Int. J. of Computer Mathematics, 2007.
M. S. Allili and D. Ziou, Neurocomputing, 2008.
Active contour based approach
(Figure: initial contour evolving to the final contour.)
Variational formulation
Problem formulation of segmentation
Contrast estimation
Proposed approach: statistical model selection + energy functional
Mixture model: $p(U \mid \hat{\Theta}) = \sum_{i=1}^{K} \hat{p}_i\, p(U \mid \hat{\theta}_i)$
Minimize over the regions $(R_1, \ldots, R_K)$ and parameters $(\theta_1, \ldots, \theta_K)$ the energy functional
$E(R_1, \ldots, R_K, \theta_1, \ldots, \theta_K) = \sum_{i=1}^{K} \oint_{\partial R_i} ds - \sum_{i=1}^{K} \iint_{R_i} \log\big(p(U(x, y) \mid \theta_i)\big)\, dx\, dy$
solved via the Euler–Lagrange PDE.
Topology change (Level sets)
Experimental results
Object tracking in video
Content-Based Image Suggestion (CBIS) as a Model Selection Problem
Boutemedjet and Ziou, IEEE Trans. on Multimedia, 2008.
Suggestion Criteria
Data
Users: U = {u_1, u_2, …, u_{N_u}}; Contexts: E = {e_1, e_2, …, e_{N_e}}; Images: X = {x_1, x_2, …, x_{N_x}}
Ratings of users on images: D = {(u(i), e(i), x(i), r(i)), i = 1, …, N}
Data modeling principle: similar users prefer visually and semantically similar products.
Suggestion: consumers need highly rated and less redundant products.
Data model: p(u,e,x,r)
Rating model
- Each quadruplet (u, e, v, x) is a random vector.
- Discover user/image classes (z, c): label (u, e, v, x) with two hidden variables, z (user class) and c (image class).
- All variables except x are discrete (multinomial distributions); x ~ GD.
- Parameters Θ. Diversity: penalize predicted ratings for consumed images X_ue.
- Consumed images become irrelevant: N_ue = {(u, e, x_t^ue, r^-), t = 1, …, N_ue}.
- Update Θ from N_ue; new data are handled.
Algorithm
Results: Mean Absolute Error (MAE)
Baselines:
- PCC: Pearson Correlation Coefficients (P. Resnick et al., CSCW 1994)
- Aspect Model (T. Hofmann, ACM TOIS 2004)
- FMM: Flexible Mixture Model (L. Si & R. Jin, ICML 2003)
- URP: User Rating Profile (B. Marlin, NIPS 2004)
- V-FMM: no contextual information, E = singleton
- V-GD-FMM: no feature selection
Feature Selection improves the rating prediction accuracy
Method            PCC    Aspect  FMM    URP    V-FMM  V-GD-FMM  I-VCC  D-VCC
Avg. MAE          1.327  1.201   1.145  1.116  0.890  0.754     0.712  0.645
Std. Deviation    0.040  0.051   0.036  0.042  0.038  0.027     0.022  0.014
Improvement (%)   0.00   9.49    13.71  15.90  32.94  43.18     51.62  55.84
Thank you
References
M. S. Allili, D. Ziou. Object tracking in videos using adaptive mixture models and active contours. Neurocomputing 7, pp. 2001-2011, 2008.
M. S. Allili, D. Ziou: Automatic colour-texture image segmentation using active contours. Int. J. Comput. Math. 84(9): 1325-1338, 2007.
S. Boutemedjet, D. Ziou. A Graphical Model for Context-Aware Visual Content Recommendation. IEEE Trans. on Multimedia 10, pp. 52-62, 2008.
S. Boutemedjet, N. Bouguila, and D. Ziou (In press). A Hybrid Feature Extraction Selection Approach for High-Dimensional Non-Gaussian Data Clustering. IEEE Trans. on Pattern Analysis and Machine Intelligence, 2009.
N. Bouguila and D. Ziou: High-Dimensional Unsupervised Selection and Estimation of a Finite Generalized Dirichlet Mixture Model Based on Minimum Message Length. IEEE Trans. on Pattern Analysis and Machine Intelligence, 2007.
R. Ksantini, D. Ziou, B. Colin, F. Dubeau. Weighted Pseudometric Discriminatory Power Improvement Using a Bayesian Logistic Regression Model Based on a Variational Method. IEEE Trans. Pattern Anal. Mach. Intell. 30(2): 253-266, 2008.
D. Ziou, T. Hamri, S. Boutemedjet. A hybrid probabilistic framework for content-based image retrieval with feature weighting. Pattern Recognition 42(7): 1511-1519, 2009.
M. L. Kherfi, D. Ziou. Relevance feedback for CBIR: a new approach based on probabilistic feature weighting with positive and negative examples. IEEE Trans. on Image Processing 15(4): 1017-1030 2006.
M.-F. Auclair-Fortier, D. Ziou. A Global Approach for Solving Evolutive Heat Transfer for Image Denoising and Inpainting. IEEE Trans. Image Processing, 15:2558-2574, 2006.
A. F. El Ouafdi, D. Ziou, and H. Krim. A smart stochastic approach for manifolds smoothing. Comput. Graphic Forum 27, pp. 1357-1364, 2008.