Unsupervised Learning of Prototypes and Attribute Weights
Intelligent Database Systems Lab
National Yunlin University of Science and Technology
Advisor : Dr. Hsu
Presenter : Yu Cheng Chen
Author: Hichem Frigui, Olfa Nasraoui
Unsupervised Learning of Prototypes and Attribute Weights
Pattern Recognition, 2004, pp. 567-581
Outline
Motivation
Objective
Introduction
Background
Simultaneous clustering and attribute discrimination
Application
Conclusions
Personal Opinion
Motivation
The selection and weighting of attributes can affect learning algorithms significantly.
Several methods have been proposed for feature selection and weighting, but they
─ assume that feature relevance is invariant across clusters, or
─ are only appropriate for binary weighting.
No method exists for assigning different feature weights to distinct clusters of a data set prior to clustering.
Objective
Propose a method that performs clustering and feature weighting simultaneously.
Each cluster is assigned its own set of feature weights.
Introduction
Illustrate the need for different sets of feature weights for different clusters.
(Example image containing two "Dirt" regions.)
Background
Prototype-based clustering: Fuzzy C-Means (FCM)
─ X = {x_j | j = 1,…,N} is a set of N feature vectors.
─ B = (B_1,…,B_C) represents the prototype set of C clusters.
─ u_ij is the membership of point x_j in cluster B_i.
─ Minimize the objective in Eq. (2).
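The objective referred to as Eq. (2) appeared as an image in the slides; it is the standard FCM objective, reconstructed here from the definitions above:

```latex
J(B, U; X) = \sum_{i=1}^{C} \sum_{j=1}^{N} (u_{ij})^{m} \, d^{2}(\mathbf{x}_j, B_i),
\qquad \text{subject to } \sum_{i=1}^{C} u_{ij} = 1 \;\; \forall j,
```

where m > 1 is the fuzzifier and d²(x_j, B_i) is the squared distance from x_j to prototype B_i.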
Prototype-based clustering: Fuzzy C-Means ─ membership and prototype update equations.
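The update equations on this slide were shown as an image; they are the standard FCM alternating-optimization steps, reconstructed here:

```latex
u_{ij} = \frac{1}{\sum_{t=1}^{C} \left( d^{2}(\mathbf{x}_j, B_i) \,/\, d^{2}(\mathbf{x}_j, B_t) \right)^{1/(m-1)}},
\qquad
\mathbf{c}_i = \frac{\sum_{j=1}^{N} (u_{ij})^{m} \, \mathbf{x}_j}{\sum_{j=1}^{N} (u_{ij})^{m}}.
```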
Fuzzy C-Means
─ cannot automatically determine the optimal number of clusters; C has to be specified a priori.
CA (Competitive Agglomeration) addresses this limitation.
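The CA objective (from Frigui and Krishnapuram's Competitive Agglomeration algorithm) augments the FCM objective with a term that rewards large clusters, so spurious clusters shrink and are discarded; a sketch of its standard form:

```latex
J_{CA} = \sum_{i=1}^{C} \sum_{j=1}^{N} (u_{ij})^{2} \, d^{2}(\mathbf{x}_j, B_i)
\;-\; \alpha \sum_{i=1}^{C} \left[ \sum_{j=1}^{N} u_{ij} \right]^{2}.
```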
Simultaneous clustering and attribute discrimination
Search for the optimal prototype parameters, B, and the optimal set of feature weights, V, simultaneously.
SCAD1 & SCAD2
─ v_ik represents the relevance weight of feature k in cluster i.
─ d_ijk = |x_jk − c_ik| is the distance between x_j and the center c_i along feature k.
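With these definitions, the SCAD1 objective (Eq. (5) in the paper, shown as an image in the slides) couples the weighted within-cluster distances with a quadratic regularizer on the weights; reconstructed:

```latex
J_1 = \sum_{i=1}^{C} \sum_{j=1}^{N} (u_{ij})^{m} \sum_{k=1}^{n} v_{ik} \, d_{ijk}^{2}
\;+\; \sum_{i=1}^{C} \delta_i \sum_{k=1}^{n} v_{ik}^{2},
\qquad \text{subject to } \sum_{k=1}^{n} v_{ik} = 1 .
```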
To optimize J1, with respect to V, we use the Lagrange multiplier technique.
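Setting the gradient of the Lagrangian (J_1 plus a multiplier for each constraint Σ_k v_ik = 1) to zero yields a closed-form weight update; reconstructed to match what the paper calls Eq. (9):

```latex
v_{ik} = \frac{1}{n} \;+\; \frac{1}{2\delta_i} \sum_{j=1}^{N} (u_{ij})^{m}
\left[ \frac{\sum_{t=1}^{n} d_{ijt}^{2}}{n} - d_{ijk}^{2} \right].
```

The first term gives every feature a default weight of 1/n; the second term raises the weight of features along which the cluster is more compact than average.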
The choice of δ_i in Eq. (9) is important.
─ If δ_i is too small, the first term dominates: only one feature in cluster i is maximally relevant and assigned a weight of 1, while the remaining features get weight 0.
─ Conversely, if δ_i is too large, the second term dominates and all features are assigned nearly equal weights.
SCAD1 update equations for the memberships u_ij and the centers c_ik.
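These two updates were shown as images; they take the usual FCM form, with the per-feature weights entering through the weighted distance. Reconstructed:

```latex
u_{ij} = \frac{1}{\sum_{t=1}^{C} \left( \tilde d_{ij}^{2} \,/\, \tilde d_{tj}^{2} \right)^{1/(m-1)}},
\quad \text{where } \tilde d_{ij}^{2} = \sum_{k=1}^{n} v_{ik} \, d_{ijk}^{2},
\qquad
c_{ik} = \frac{\sum_{j=1}^{N} (u_{ij})^{m} \, x_{jk}}{\sum_{j=1}^{N} (u_{ij})^{m}}.
```

Note that the weights v_ik cancel in the center update, which therefore looks identical to the FCM center update.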
SCAD2 ─ q is referred to as the "discrimination exponent".
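SCAD2 drops the additive regularizer of SCAD1 and instead raises the weights to the exponent q; the objective (Eq. (21) in the paper) and the weight update that minimizes it, reconstructed:

```latex
J_2 = \sum_{i=1}^{C} \sum_{j=1}^{N} (u_{ij})^{m} \sum_{k=1}^{n} v_{ik}^{\,q} \, d_{ijk}^{2},
\qquad \text{subject to } \sum_{k=1}^{n} v_{ik} = 1,

v_{ik} = \left[ \sum_{t=1}^{n} \left( \frac{\sum_{j=1}^{N} (u_{ij})^{m} \, d_{ijk}^{2}}
{\sum_{j=1}^{N} (u_{ij})^{m} \, d_{ijt}^{2}} \right)^{1/(q-1)} \right]^{-1}.
```

Larger q pushes the weights toward the uniform 1/n; q close to 1 concentrates the weight on the most compact features.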
SCAD2 ─ updated membership u_ij.
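The alternating SCAD2 updates can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: `scad2` is a hypothetical helper name, m is the fuzzifier, q the discrimination exponent, and a small epsilon guards the divisions.

```python
import numpy as np

def scad2(X, C=2, m=2.0, q=2.0, n_iter=50, seed=0):
    """Minimal SCAD2-style sketch (hypothetical helper, not the paper's code):
    alternately update memberships u, centers c, and per-cluster feature
    weights v using the closed-form expressions above."""
    rng = np.random.default_rng(seed)
    N, n = X.shape
    c = X[rng.choice(N, size=C, replace=False)]   # init centers from the data
    v = np.full((C, n), 1.0 / n)                  # start with equal feature weights
    eps = 1e-12
    for _ in range(n_iter):
        # per-feature squared distances, shape (C, N, n)
        d2 = (X[None, :, :] - c[:, None, :]) ** 2
        # aggregated weighted distance per cluster/point, shape (C, N)
        D2 = np.einsum('ck,cjk->cj', v ** q, d2) + eps
        # FCM-style membership update using the weighted distance
        u = 1.0 / np.sum((D2[:, None, :] / D2[None, :, :]) ** (1.0 / (m - 1)), axis=1)
        um = u ** m
        # center update: the feature weights cancel for this distance
        c = (um @ X) / um.sum(axis=1, keepdims=True)
        # weight update: v_ik proportional to (sum_j u^m d2_ijk)^(-1/(q-1))
        Dk = np.einsum('cj,cjk->ck', um, d2) + eps
        v = Dk ** (-1.0 / (q - 1))
        v /= v.sum(axis=1, keepdims=True)
    return u, c, v
```

Run on data where the two clusters are compact along different features, the learned v rows should differ, which is exactly the per-cluster relevance the paper is after.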
Simultaneous clustering and attribute discrimination: unknown number of clusters
The objective functions in (5) and (21) complement each other, and can easily be combined into one objective function.
This algorithm is called SCAD2-CA.
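Combining the two objectives amounts to adding the CA cardinality term to J_2; a sketch of the combined objective, with m fixed to 2 as in CA (reconstructed, not copied from the paper):

```latex
J_{SCAD2\text{-}CA} = \sum_{i=1}^{C} \sum_{j=1}^{N} (u_{ij})^{2} \sum_{k=1}^{n} v_{ik}^{\,q} \, d_{ijk}^{2}
\;-\; \alpha \sum_{i=1}^{C} \left[ \sum_{j=1}^{N} u_{ij} \right]^{2}.
```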
Application 1: color image segmentation
We illustrate the ability of SCAD2 on real color images.
Feature data extraction:
─ Texture features: 3 attributes
─ Color features: 2 attributes
─ Position features: 2 attributes (x and y)
(Segmentation example: the image is partitioned into "Dirt" and "Grass" regions.)
Application 2: supervised classification
We use SCAD2-CA for supervised classification on four benchmark data sets:
─ the Iris data set
─ the Wisconsin Breast Cancer data set
─ the Pima Indians Diabetes data set
─ the Heart Disease data set
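For the supervised setting, one natural way to use the learned prototypes and weights is nearest-prototype classification under the per-cluster weighted distance. A minimal sketch; the function name and the label-assignment scheme are assumptions for illustration, not the paper's exact procedure:

```python
import numpy as np

def classify(X, prototypes, weights, proto_labels, q=2.0):
    """Hypothetical sketch: label each sample by its nearest prototype
    under the per-cluster weighted distance sum_k v_ik^q * d_ijk^2."""
    diff2 = (X[:, None, :] - prototypes[None, :, :]) ** 2   # (N, C, n)
    d2 = np.einsum('ck,nck->nc', weights ** q, diff2)        # (N, C)
    return proto_labels[np.argmin(d2, axis=1)]
```

Each cluster found on the training data would first be labeled, e.g. by the majority class of its members; test points then inherit the label of their closest prototype.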
Conclusions
We have proposed a new approach to perform clustering and feature weighting simultaneously.
SCAD2-CA can determine the “optimal” number of clusters automatically.
Personal Opinion
Advantages
─ Takes different feature weights into account for different clusters.
─ Performs clustering and feature weighting simultaneously.
─ Well-written paper.
Application
─ The idea could be applied in our own clustering algorithms.
Limitations
─ Only suitable for numeric data.
Discussion
─ Clustering techniques are very hard to improve.