Unsupervised Learning of Prototypes and Attribute Weights
Intelligent Database Systems Lab
National Yunlin University of Science and Technology
Advisor : Dr. Hsu
Presenter : Yu Cheng Chen
Author: Hichem Frigui, Olfa Nasraoui
Unsupervised Learning of Prototypes and Attribute Weights
Pattern Recognition, 2004, pp. 567-581
Outline
Motivation
Objective
Introduction
Background
Simultaneous clustering and attribute discrimination
Application
Conclusions
Personal Opinion
Motivation
The selection and weighting of attributes can affect learning algorithms significantly.
Several methods have been proposed for feature selection and weighting, but they
─ assume that feature relevance is invariant across clusters, or
─ are only appropriate for binary weighting.
No method exists for assigning different feature weights to distinct clusters of a data set prior to clustering.
Objective
Propose a method that performs clustering and feature weighting simultaneously.
Each cluster is assigned its own set of feature weights.
Introduction
Illustrate the need for different sets of feature weights for different clusters.
(Example image containing two "Dirt" regions.)
Background
Prototype-based clustering: Fuzzy C-Means (FCM)
─ X = {x_j | j = 1,…,N} is a set of N feature vectors.
─ B = (B_1,…,B_C) represents the prototype set of C clusters.
─ u_ij is the membership of point x_j in cluster B_i.
─ Minimize the objective in Eq. (2).
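The objective referred to as Eq. (2) appeared as an image in the slides; it is the standard FCM objective, reconstructed here from the definitions above:

```latex
J(B, U; X) = \sum_{i=1}^{C} \sum_{j=1}^{N} (u_{ij})^{m} \, d^{2}(\mathbf{x}_j, B_i),
\qquad \text{subject to } \sum_{i=1}^{C} u_{ij} = 1 \;\; \forall j,
```

where m > 1 is the fuzzifier and d²(x_j, B_i) is the squared distance from x_j to prototype B_i.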
Prototype-based clustering: Fuzzy C-Means ─ membership and prototype update equations.
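The update equations on this slide were shown as an image; they are the standard FCM alternating-optimization steps, reconstructed here:

```latex
u_{ij} = \frac{1}{\sum_{t=1}^{C} \left( d^{2}(\mathbf{x}_j, B_i) \,/\, d^{2}(\mathbf{x}_j, B_t) \right)^{1/(m-1)}},
\qquad
\mathbf{c}_i = \frac{\sum_{j=1}^{N} (u_{ij})^{m} \, \mathbf{x}_j}{\sum_{j=1}^{N} (u_{ij})^{m}}.
```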
Fuzzy C-Means
─ cannot automatically determine the optimal number of clusters; C has to be specified a priori.
CA (Competitive Agglomeration) addresses this limitation.
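The CA objective (from Frigui and Krishnapuram's Competitive Agglomeration algorithm) augments the FCM objective with a term that rewards large clusters, so spurious clusters shrink and are discarded; a sketch of its standard form:

```latex
J_{CA} = \sum_{i=1}^{C} \sum_{j=1}^{N} (u_{ij})^{2} \, d^{2}(\mathbf{x}_j, B_i)
\;-\; \alpha \sum_{i=1}^{C} \left[ \sum_{j=1}^{N} u_{ij} \right]^{2}.
```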
Simultaneous clustering and attribute discrimination
Search for the optimal prototype parameters, B, and the optimal set of feature weights, V, simultaneously.
SCAD1 & SCAD2
─ v_ik represents the relevance weight of feature k in cluster i.
─ d_ijk = |x_jk − c_ik| is the distance between x_j and the center c_i along feature k.
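With these definitions, the SCAD1 objective (Eq. (5) in the paper, shown as an image in the slides) couples the weighted within-cluster distances with a quadratic regularizer on the weights; reconstructed:

```latex
J_1 = \sum_{i=1}^{C} \sum_{j=1}^{N} (u_{ij})^{m} \sum_{k=1}^{n} v_{ik} \, d_{ijk}^{2}
\;+\; \sum_{i=1}^{C} \delta_i \sum_{k=1}^{n} v_{ik}^{2},
\qquad \text{subject to } \sum_{k=1}^{n} v_{ik} = 1 .
```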
To optimize J1, with respect to V, we use the Lagrange multiplier technique.
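Setting the gradient of the Lagrangian (J_1 plus a multiplier for each constraint Σ_k v_ik = 1) to zero yields a closed-form weight update; reconstructed to match what the paper calls Eq. (9):

```latex
v_{ik} = \frac{1}{n} \;+\; \frac{1}{2\delta_i} \sum_{j=1}^{N} (u_{ij})^{m}
\left[ \frac{\sum_{t=1}^{n} d_{ijt}^{2}}{n} - d_{ijk}^{2} \right].
```

The first term gives every feature a default weight of 1/n; the second term raises the weight of features along which the cluster is more compact than average.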
The choice of δ_i in Eq. (9) is important.
─ If δ_i is too small, the first term dominates: only one feature in cluster i is maximally relevant and assigned a weight of 1, while the remaining features get weight 0.
─ Conversely, if δ_i is too large, the second term dominates and all features are assigned nearly equal weights.
SCAD1 update equations for the memberships u_ij and the centers c_ik.
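These two updates were shown as images; they take the usual FCM form, with the per-feature weights entering through the weighted distance. Reconstructed:

```latex
u_{ij} = \frac{1}{\sum_{t=1}^{C} \left( \tilde d_{ij}^{2} \,/\, \tilde d_{tj}^{2} \right)^{1/(m-1)}},
\quad \text{where } \tilde d_{ij}^{2} = \sum_{k=1}^{n} v_{ik} \, d_{ijk}^{2},
\qquad
c_{ik} = \frac{\sum_{j=1}^{N} (u_{ij})^{m} \, x_{jk}}{\sum_{j=1}^{N} (u_{ij})^{m}}.
```

Note that the weights v_ik cancel in the center update, which therefore looks identical to the FCM center update.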
SCAD2 ─ q is referred to as the "discrimination exponent".
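SCAD2 drops the additive regularizer of SCAD1 and instead raises the weights to the exponent q; the objective (Eq. (21) in the paper) and the weight update that minimizes it, reconstructed:

```latex
J_2 = \sum_{i=1}^{C} \sum_{j=1}^{N} (u_{ij})^{m} \sum_{k=1}^{n} v_{ik}^{\,q} \, d_{ijk}^{2},
\qquad \text{subject to } \sum_{k=1}^{n} v_{ik} = 1,

v_{ik} = \left[ \sum_{t=1}^{n} \left( \frac{\sum_{j=1}^{N} (u_{ij})^{m} \, d_{ijk}^{2}}
{\sum_{j=1}^{N} (u_{ij})^{m} \, d_{ijt}^{2}} \right)^{1/(q-1)} \right]^{-1}.
```

Larger q pushes the weights toward the uniform 1/n; q close to 1 concentrates the weight on the most compact features.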
SCAD2 ─ updated membership u_ij.
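The alternating SCAD2 updates can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: `scad2` is a hypothetical helper name, m is the fuzzifier, q the discrimination exponent, and a small epsilon guards the divisions.

```python
import numpy as np

def scad2(X, C=2, m=2.0, q=2.0, n_iter=50, seed=0):
    """Minimal SCAD2-style sketch (hypothetical helper, not the paper's code):
    alternately update memberships u, centers c, and per-cluster feature
    weights v using the closed-form expressions above."""
    rng = np.random.default_rng(seed)
    N, n = X.shape
    c = X[rng.choice(N, size=C, replace=False)]   # init centers from the data
    v = np.full((C, n), 1.0 / n)                  # start with equal feature weights
    eps = 1e-12
    for _ in range(n_iter):
        # per-feature squared distances, shape (C, N, n)
        d2 = (X[None, :, :] - c[:, None, :]) ** 2
        # aggregated weighted distance per cluster/point, shape (C, N)
        D2 = np.einsum('ck,cjk->cj', v ** q, d2) + eps
        # FCM-style membership update using the weighted distance
        u = 1.0 / np.sum((D2[:, None, :] / D2[None, :, :]) ** (1.0 / (m - 1)), axis=1)
        um = u ** m
        # center update: the feature weights cancel for this distance
        c = (um @ X) / um.sum(axis=1, keepdims=True)
        # weight update: v_ik proportional to (sum_j u^m d2_ijk)^(-1/(q-1))
        Dk = np.einsum('cj,cjk->ck', um, d2) + eps
        v = Dk ** (-1.0 / (q - 1))
        v /= v.sum(axis=1, keepdims=True)
    return u, c, v
```

Run on data where the two clusters are compact along different features, the learned v rows should differ, which is exactly the per-cluster relevance the paper is after.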
Simultaneous clustering and attribute discrimination: unknown number of clusters
The objective functions in (5) and (21) complement each other, and can easily be combined into one objective function.
This algorithm is called SCAD2-CA.
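Combining the two objectives amounts to adding the CA cardinality term to J_2; a sketch of the combined objective, with m fixed to 2 as in CA (reconstructed, not copied from the paper):

```latex
J_{SCAD2\text{-}CA} = \sum_{i=1}^{C} \sum_{j=1}^{N} (u_{ij})^{2} \sum_{k=1}^{n} v_{ik}^{\,q} \, d_{ijk}^{2}
\;-\; \alpha \sum_{i=1}^{C} \left[ \sum_{j=1}^{N} u_{ij} \right]^{2}.
```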
Application 1: color image segmentation
We illustrate the ability of SCAD2 on real color images.
Feature data extraction:
─ Texture features: 3 attributes
─ Color features: 2 attributes
─ Position features: 2 attributes (x and y)
(Segmentation example: the image is partitioned into "Dirt" and "Grass" regions.)
Application 2: supervised classification
We use SCAD2-CA for supervised classification on four benchmark data sets:
─ the Iris data set
─ the Wisconsin Breast Cancer data set
─ the Pima Indians Diabetes data set
─ the Heart Disease data set
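For the supervised setting, one natural way to use the learned prototypes and weights is nearest-prototype classification under the per-cluster weighted distance. A minimal sketch; the function name and the label-assignment scheme are assumptions for illustration, not the paper's exact procedure:

```python
import numpy as np

def classify(X, prototypes, weights, proto_labels, q=2.0):
    """Hypothetical sketch: label each sample by its nearest prototype
    under the per-cluster weighted distance sum_k v_ik^q * d_ijk^2."""
    diff2 = (X[:, None, :] - prototypes[None, :, :]) ** 2   # (N, C, n)
    d2 = np.einsum('ck,nck->nc', weights ** q, diff2)        # (N, C)
    return proto_labels[np.argmin(d2, axis=1)]
```

Each cluster found on the training data would first be labeled, e.g. by the majority class of its members; test points then inherit the label of their closest prototype.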
Conclusions
We have proposed a new approach to perform clustering and feature weighting simultaneously.
SCAD2-CA can determine the “optimal” number of clusters automatically.
Personal Opinion
Advantages
─ Takes different feature weights into account for different clusters.
─ Performs clustering and feature weighting simultaneously.
─ Well-written paper.
Application
─ The idea could be applied in our own clustering algorithms.
Limitations
─ Only suitable for numeric data.
Discussion
─ Clustering techniques are very hard to improve.