Self Organizing Map

Self Organizing Maps (SOM)

Nguyen Van Chuc- [email protected]

Outline

• Motivation
• Introduction to SOM
• SOM’s Algorithm
• An Example
• Application: Using SOM to cluster data

Kohonen’s Neural Network, or Self Organizing Maps (SOM)

Motivation

The problem is how to discover semantic relationships within a large body of information without manual labor.

• How do I know where to put my new data if I know nothing about the topology of the information?

• When I have a topic, how can I get all the information about it if I don't know where to search for it?

Self-Organizing Maps : Origins

Teuvo Kohonen

• Ideas first introduced by C. von der Malsburg (1973), developed and refined by T. Kohonen (Finland) (1982).
• Neural network algorithm using unsupervised competitive learning.
• Primarily used for organization and visualization of complex data.
• Neurons are arranged on a flat grid (map, lattice, 2-dimensional array).
• There is no hidden layer, only an input and an output layer.
• Each neuron on the grid is an output neuron.
• Topological relationships within the training set are maintained.

Self-Organizing Maps : The Basic Idea

• Make a two dimensional array, or map, and randomize it.

• Present training data to the map and let the cells on the map compete to win in some way (Euclidean distance is usually used)

• Stimulate the winner and its neighbours in the “neighborhood” (update the weight matrix).

• Repeat this many times.

• The result is a 2-dimensional “weight” map.

Self-Organizing Maps : Introduction


The neurons in the output layer are arranged on a map

[Figure: a 2-D array of neurons. A set of input signals x1, x2, x3, ..., xn is connected to all neurons in the lattice through weighted synapses; neuron j has weights wj1, wj2, wj3, ..., wjn.]


The figure shows a very small Kohonen network of 4 x 4 nodes connected to the input layer (shown in green), which represents a two-dimensional vector. Each node has a specific topological position (an x, y coordinate in the lattice) and contains a vector of weights of the same dimension as the input vectors. If the training data consists of vectors V of n dimensions, V1, V2, V3, ..., Vn, then each node will contain a corresponding weight vector W of n dimensions: W1, W2, W3, ..., Wn.
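The node layout described above can be sketched as a small data structure. This is only an illustration: the names `Node` and `make_lattice`, the seeded random initialisation, and the row-by-row ordering are assumptions, not something the slides specify.

```python
import random
from dataclasses import dataclass


@dataclass
class Node:
    x: int          # column position in the lattice
    y: int          # row position in the lattice
    weights: list   # weight vector, same dimension as the input vectors


def make_lattice(width, height, dim, seed=0):
    """Build width x height nodes, each with `dim` random weights in [0, 1)."""
    rng = random.Random(seed)
    return [Node(x, y, [rng.random() for _ in range(dim)])
            for y in range(height) for x in range(width)]


# A 4 x 4 network for two-dimensional inputs, as in the figure.
lattice = make_lattice(4, 4, dim=2)
print(len(lattice), len(lattice[0].weights))  # 16 2
```

Keeping the lattice position separate from the weight vector reflects the two spaces involved: the fixed grid the node sits on, and the input space its weights move through during training.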

Kohonen Network Architecture

Example Self-Organizing Maps

SOM – Result Example

‘Poverty map’ based on 39 indicators from World Bank statistics (1992)

World Poverty Map: a SOM has been used to classify statistical data describing various quality-of-life factors such as state of health, nutrition and educational services. Countries with similar quality-of-life factors end up clustered together. The countries with better quality of life are situated toward the upper left, and the most poverty-stricken countries toward the lower right.

Self-Organizing Maps : Algorithm

Step  Action

0  Initialize the weights w_ij. Set the maximum value for the neighbourhood radius R and set the learning rate α.

1  While the stopping condition is false, do steps 2 to 8.

2  For each input vector x, do steps 3 to 5.

3  For each neuron j, compute the (squared) Euclidean distance D(j) = Σ_i (w_ij − x_i)^2.

4  Find the index J such that D(J) is a minimum.

5  For all neurons j within a specified neighbourhood of J, and for all i, update the weights: w_ij(new) = w_ij(old) + α [x_i − w_ij(old)].

6  Update the learning rate α. It is a decreasing function of the number of epochs.

7  Reduce the radius R of the topological neighbourhood at specified times.

8  Test the stopping condition. Typically this is a value of the learning rate so small that the weight updates are insignificant.
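The steps above can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not a reference implementation: the function name `train_som`, the square neighbourhood on a 2-D grid, the multiplicative learning-rate decay, the "shrink the radius every few epochs" schedule, and all default parameter values are illustrative choices that the slides do not specify.

```python
import random


def sq_dist(w, x):
    """Squared Euclidean distance between weight vector w and input x (step 3)."""
    return sum((wi - xi) ** 2 for wi, xi in zip(w, x))


def train_som(data, rows, cols, alpha=0.5, radius=1, decay=0.9,
              shrink_every=10, min_alpha=0.01, seed=0):
    rng = random.Random(seed)
    dim = len(data[0])
    # Step 0: one random weight vector per grid cell (assumed initialisation).
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    epoch = 0
    while alpha >= min_alpha:                 # steps 1 and 8: stop when alpha is tiny
        for x in data:                        # step 2
            # Steps 3-4: the winner J minimises D(j).
            J = min(weights, key=lambda j: sq_dist(weights[j], x))
            # Step 5: move every neuron within `radius` of J toward x.
            for (r, c), w in weights.items():
                if max(abs(r - J[0]), abs(c - J[1])) <= radius:
                    for i in range(dim):
                        w[i] += alpha * (x[i] - w[i])
        epoch += 1
        alpha *= decay                        # step 6 (assumed geometric decay)
        if radius > 0 and epoch % shrink_every == 0:
            radius -= 1                       # step 7 (assumed schedule)
    return weights


som = train_som([[0.0, 0.0], [1.0, 1.0]], rows=2, cols=2)
print(len(som))  # 4
```

Because each update is a weighted average of the old weight and the input, weights that start in [0, 1] stay in [0, 1] when the inputs do.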

Example

To make the problem very simple, suppose that there are only two neurons in the output layer as shown below:

Consider a simple example in which there are only 4 input training patterns.

Following the algorithm presented in the previous section:

For vector 0001: D(1) = 0.66, D(2) = 2.2768. Hence J = 1.

For vector 1000: D(1) = 1.8656, D(2) = 0.6768. Hence J = 2.

For vector 0011: D(1) = 0.7056, D(2) = 2.724. Hence J = 1.

Likewise for the remaining training patterns.
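The quoted distances can be checked with a short trace. The starting weights, the learning rate of 0.6, a neighbourhood radius of 0 (only the winner is updated), and the presentation order 1100, 0001, 1000, 0011 are all assumptions: the slides that stated them are not part of this text. They are used here because they reproduce the numbers above.

```python
# Assumed starting point (not stated in the surviving text).
w = [[0.2, 0.6, 0.5, 0.9],   # weights of neuron 1
     [0.8, 0.4, 0.7, 0.3]]   # weights of neuron 2
alpha = 0.6                  # assumed initial learning rate
patterns = [[1, 1, 0, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 0, 1, 1]]


def sq_dist(wj, x):
    """Squared Euclidean distance between weight vector wj and input x."""
    return sum((wi - xi) ** 2 for wi, xi in zip(wj, x))


trace = []
for x in patterns:
    d = [sq_dist(wj, x) for wj in w]      # step 3: D(1), D(2)
    J = d.index(min(d))                   # step 4: winner (0-based)
    trace.append((x, d, J + 1))           # store the 1-based neuron index
    # Step 5 with radius 0: only the winner moves toward x.
    w[J] = [wi + alpha * (xi - wi) for wi, xi in zip(w[J], x)]

for x, d, J in trace:
    print(x, [round(v, 4) for v in d], "J =", J)
```

Under these assumptions the trace gives D(1) = 0.66 and D(2) = 2.2768 for 0001, D(1) = 1.8656 and D(2) = 0.6768 for 1000, and D(1) = 0.7056 and D(2) ≈ 2.7243 for 0011. Each distance depends on the updates made for the earlier patterns, so the order of presentation matters.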

Now reduce the learning rate (step 6):

It can be shown that after 100 presentations of all the input vectors, the final weight matrix is

This matrix seems to converge to

Test network

Suppose the input pattern is 1100. Then

Thus neuron 2 is the "winner", and is the localized active region of the SOM. We may therefore label this input pattern as belonging to cluster 2.

For all the other patterns, we find the clusters are as listed below.

Pattern    Cluster
1 1 0 0    2
0 0 0 1    1
1 0 0 0    2
0 0 1 1    1
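Testing the network only needs the winner search, with no weight updates. The sketch below illustrates that step. The weight values are hypothetical, chosen for illustration (they are not the final weight matrix from the slides, which is not reproduced in this text), and `cluster_of` is an assumed helper name.

```python
def sq_dist(w, x):
    """Squared Euclidean distance between weight vector w and input x."""
    return sum((wi - xi) ** 2 for wi, xi in zip(w, x))


def cluster_of(weights, x):
    """Return the 1-based index of the neuron closest to x; weights are unchanged."""
    d = [sq_dist(w, x) for w in weights]
    return d.index(min(d)) + 1


# Hypothetical trained weights for the two output neurons (illustration only).
weights = [[0.032, 0.096, 0.680, 0.984],   # neuron 1
           [0.968, 0.304, 0.112, 0.048]]   # neuron 2
patterns = [[1, 1, 0, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 0, 1, 1]]

clusters = [cluster_of(weights, x) for x in patterns]
print(clusters)  # [2, 1, 2, 1]
```

With these weights the assignments agree with the table: 1100 and 1000 fall in cluster 2, while 0001 and 0011 fall in cluster 1.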

Application

Using SOM to cluster data: the IRIS dataset

IRIS is a classical data set used by statisticians to check classification methods.

It is composed of 150 samples of flowers divided into 3 classes (50 setosa, 50 versicolor, 50 virginica) and described by 4 variables (petal length, petal width, sepal length, sepal width).


Using SOM to cluster data: graphical results

We can look at the class assignment of each neuron (i.e. each neuron is coloured on the basis of its assigned class).
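The slides do not say how a class is assigned to a neuron. One common choice, assumed here, is a majority vote over the labelled samples that each neuron wins. The function name `label_neurons` and the sample winners below are hypothetical.

```python
from collections import Counter, defaultdict


def label_neurons(winners, labels):
    """Map each neuron to the most common class among the samples it won."""
    votes = defaultdict(Counter)
    for neuron, label in zip(winners, labels):
        votes[neuron][label] += 1
    return {neuron: counts.most_common(1)[0][0]
            for neuron, counts in votes.items()}


# Hypothetical winning grid cells for six labelled samples.
winners = [(0, 0), (0, 0), (0, 1), (1, 1), (1, 1), (1, 1)]
labels = ["setosa", "setosa", "versicolor",
          "virginica", "versicolor", "virginica"]

neuron_class = label_neurons(winners, labels)
print(neuron_class)
```

Neurons that win no sample get no entry in the result and would stay uncoloured on the map.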


We can also look at the weights of a specific neuron and at the labels of the samples placed in that neuron.

Conclusions

• SOM is an algorithm that projects high-dimensional data onto a two-dimensional map.
• The projection preserves the topology of the data, so that similar data items are mapped to nearby locations on the map.
• SOMs still have many practical applications in pattern recognition, speech analysis, industrial and medical diagnostics, and data mining.
• A large quantity of good-quality, representative training data is required.

Thank you for your attention !
