Clustering with Self Organizing Maps
Vahid Moosavi
Supervisor: Prof. Ludger Hovestadt
September 2011
Outline
• SOM Clustering Approaches
• U*C clustering (1)
– Basic Definitions
– The Algorithm
– Results
(1): Alfred Ultsch: U*C: Self-organized Clustering with Emergent Feature Maps. LWA 2005: 240-244
The Learning Algorithm
[Diagram: Competition, Cooperation and Adaptation → Representation]
SOM Clustering
• One-Stage Clustering: for maps with a small number of nodes, each node represents one cluster
• Two-Stage Clustering (for large maps)(1):
– First train the SOM
– Then apply any clustering algorithm to the nodes instead of the original data
• Partitional clustering algorithms
• Hierarchical clustering algorithms
• U*C Clustering Algorithm (2)
(1): Vesanto and Alhoniemi: Clustering of the Self-Organizing Map, IEEE Transactions on Neural Networks, Vol. 11, No. 3, May 2000
(2): Alfred Ultsch: U*C: Self-organized Clustering with Emergent Feature Maps. LWA 2005: 240-244
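A minimal NumPy sketch of the two-stage idea: a small from-scratch SOM is trained first, then plain k-means runs on its codebook vectors instead of the original data. All map sizes, learning-rate and σ schedules here are illustrative choices, not values from the cited papers:

```python
import numpy as np

def train_som(data, rows=6, cols=6, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a minimal rectangular SOM; returns a (rows*cols, d) codebook."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=(rows * cols, data.shape[1]))
    gy, gx = np.divmod(np.arange(rows * cols), cols)
    grid = np.stack([gy, gx], axis=1).astype(float)  # grid position of each node
    step, total = 0, epochs * len(data)
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / total
            lr, sigma = lr0 * (1 - frac), sigma0 * (1 - frac) + 0.5
            bmu = np.argmin(((w - x) ** 2).sum(axis=1))        # best-matching unit
            # Gaussian neighbourhood on the grid: cooperation and adaptation
            h = np.exp(-((grid - grid[bmu]) ** 2).sum(axis=1) / (2 * sigma ** 2))
            w += lr * h[:, None] * (x - w)
            step += 1
    return w

def kmeans(points, k, iters=50, seed=0):
    """Plain k-means on the codebook vectors (the second stage)."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), k, replace=False)]
    labels = np.zeros(len(points), dtype=int)
    for _ in range(iters):
        labels = np.argmin(((points[:, None] - centers) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels

# Toy data: two well-separated Gaussian blobs.
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0, 0.3, (100, 2)), rng.normal(5, 0.3, (100, 2))])
codebook = train_som(data)
node_labels = kmeans(codebook, k=2)
# Each data point inherits the cluster label of its best-matching node.
bmus = np.argmin(((data[:, None] - codebook) ** 2).sum(-1), axis=1)
point_labels = node_labels[bmus]
```

Because k-means runs on a few dozen codebook vectors rather than on all data points, the second stage is cheap even for large data sets, which is the main motivation given by Vesanto and Alhoniemi.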
Weaknesses of Existing Clustering Algorithms
• Weaknesses of other algorithms (e.g. K-Means, GK, Hierarchical Clustering):
– The number of clusters must be known in advance.
– The algorithms rest on geometrical assumptions (Euclidean distance, ellipsoidal or spherical cluster shapes, …)
– …
• U*C clustering addresses all of the issues above.
U*C Clustering (Basic Definitions)
• Component Planes
• U Matrix (1990)
• P Matrix (2003)
• U* Matrix
Presentation and Visualization (Component Plane)
Presentation and Visualization (U-Matrix, 1990)
Neuron i: n_i, with weight vector w_i
Neighborhood neurons of n_i on the grid: N(i)
U-height of n_i: u(n_i) = Σ_{n_j ∈ N(i)} d(w_i, w_j), the summed distance between w_i and the weights of its grid neighbors (often normalized by |N(i)|)
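A U-height computation along these lines can be sketched as follows, using the mean distance to the 4-connected grid neighbors (sum vs. mean and the neighborhood shape vary between implementations):

```python
import numpy as np

def u_matrix(weights):
    """Compute U-heights for a SOM codebook laid out as a (rows, cols, d) grid.

    The U-height of node (r, c) is the average distance between its weight
    vector and those of its 4-connected grid neighbours (Ultsch, 1990).
    """
    rows, cols, _ = weights.shape
    u = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            dists = []
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    dists.append(np.linalg.norm(weights[r, c] - weights[nr, nc]))
            u[r, c] = np.mean(dists)  # high value = large jump in input space
    return u
```

Nodes whose neighbors sit close in input space get small U-heights (cluster interiors); nodes at the edge of a jump in input space get large U-heights (cluster borders).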
Presentation and Visualization (U-Matrix, 1990)
A display of all U-heights on top of the grid is called a U-Matrix (Ultsch, 1990).
The U-Matrix visually reveals the clusters hidden in the data set.
Presentation and Visualization (U-Matrix, 1990)
[Figure: the original data set and its U-Matrix; watersheds mark the cluster borders, basins mark the clusters]
Presentation and Visualization (P-Matrix, 2003)
• In some cases the U-Matrix is not enough; we use a measure of density in addition to distance.
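A P-height is a local density estimate at each node. The sketch below counts data points inside a fixed-radius ball around each node's weight vector; the fixed radius is an illustrative simplification, since Ultsch's P-Matrix determines this radius from the data (the "Pareto radius"):

```python
import numpy as np

def p_matrix(weights, data, radius):
    """P-height of each node: the number of data points inside a ball of
    the given radius around the node's weight vector.

    NOTE: a fixed, user-chosen radius is an assumption of this sketch;
    the original P-Matrix derives the radius from the data itself.
    """
    rows, cols, dim = weights.shape
    flat = weights.reshape(rows * cols, dim)
    # pairwise distances between every node and every data point
    dists = np.linalg.norm(data[None, :, :] - flat[:, None, :], axis=2)
    return (dists <= radius).sum(axis=1).reshape(rows, cols)
```

Nodes lying in dense regions of the input space receive large P-heights (cluster centers); nodes in sparse regions receive small ones (borders and outliers).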
Presentation and Visualization (U*-Matrix)
• As the TwoDiamonds data set shows, a combination of distance relationships and density relationships is necessary for an appropriate clustering. The combination of a U-Matrix and a P-Matrix is called the U*-Matrix.
• The main idea: the U*-Matrix exhibits the local data distances as heights where the data density is low (cluster borders). Where the data density is high (cluster centers), the distances are scaled down to zero.
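One simple way to realize this density-dependent scaling is a linear factor that is 1 at the lowest density and 0 at the highest; this is a hedged stand-in for Ultsch's published formula, which scales using percentiles of the P-heights:

```python
import numpy as np

def u_star_matrix(u, p):
    """Combine U- and P-heights: keep the U-height where density is low
    (cluster borders) and scale it toward zero where density is high
    (cluster interiors).

    The linear min-max scaling below is one simple illustrative choice,
    not the exact scaling of the original U*-Matrix paper.
    """
    scale = (p.max() - p) / max(p.max() - p.min(), 1e-12)  # 1 at low, 0 at high density
    return u * scale
```

The effect is that large distances inside dense regions (which do not indicate a border) are suppressed, while large distances in sparse regions survive as visible walls.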
Presentation and Visualization (U*-Matrix)
U*C Clustering (Main Ideas)
First main idea: the U-height in the center of a cluster is smaller than the U-height on the border of the cluster in the U-Matrix.
Second main idea: the P-height in the center of a cluster is larger than the P-height on the border of the cluster in the P-Matrix. At cluster borders, the local density of the points decreases substantially.
U*C Clustering (Main Ideas)
A movement from one position n_i to another position n_j, such that w_j lies more within a cluster C than w_i, is called immersive.
• Sometimes, immersion can be found on the U-Matrix (by gradient descent).
• Sometimes, immersion can be found on the P-Matrix (by gradient ascent).
• The algorithm:
1. Gradient descent on the U-Matrix: start from a node n and move through its neighborhood toward the minimum U-height (distance); the point U reached is probably a node within a cluster.
2. Gradient ascent on the P-Matrix: start from point U and move through its neighborhood toward the maximum P-height; the immersion point reached is probably the center of a cluster.
3. Calculate the watersheds on the U*-Matrix with any existing watershed algorithm.
4. Partition the immersion points into cluster centers C_1, …, C_c using these watersheds.
5. Assign each data point to a cluster via the immersion point of its corresponding SOM unit.
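Steps 1-2 above, the immersion search, can be sketched as a simple hill-climb over the two height maps. The watershed step (3) is omitted here; any off-the-shelf watershed routine can fill it in. The greedy neighbor-by-neighbor walk is an assumption of this sketch, chosen as the most direct reading of "gradient descent/ascent" on a discrete grid:

```python
import numpy as np

def hill_walk(height, start, maximize=False):
    """Follow the steepest downhill (or uphill) 8-connected neighbour on a
    2-D height map until no neighbour improves; return the final position."""
    rows, cols = height.shape
    r, c = start
    sign = -1.0 if maximize else 1.0  # negate to turn ascent into descent
    while True:
        best, best_h = (r, c), sign * height[r, c]
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and sign * height[nr, nc] < best_h:
                    best, best_h = (nr, nc), sign * height[nr, nc]
        if best == (r, c):       # local optimum reached
            return best
        r, c = best

def immersion_point(u, p, start):
    """Steps 1-2 of U*C: gradient descent on the U-Matrix moves off the
    cluster border, then gradient ascent on the P-Matrix moves to the
    local density maximum (the immersion point)."""
    inside = hill_walk(u, start)
    return hill_walk(p, inside, maximize=True)
```

Starting every SOM node through `immersion_point` makes nodes of the same cluster converge to the same few immersion points, which the watershed partition then groups into the cluster centers C_1, …, C_c.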
U*C Clustering (2005)
U*C Clustering (2005): Some Experimental Results
[Figures: experimental results on several benchmark data sets]
Conclusion
• SOM can transform high-dimensional data sets into a two-dimensional representation; then, just by analyzing the distances and densities of the transformed data, we can find the natural clusters hidden in the original data.
[Diagram: High-Dimensional Data Set → SOM Modeling → Two-Dimensional Representation → U-Matrix, P-Matrix, … → Clustering the Data Sets → Classification and Prediction for future experiments]
Conclusion
• An alternative way
[Diagram: High-Dimensional Data Set → Feature Selection and Extraction → Transformed (reduced) Data Set → SOM Modeling → Two-Dimensional Representation → U-Matrix, P-Matrix, … → Clustering the Data Sets → Classification and Prediction for future experiments]
THANKS