Self Organizing Map

Self Organizing Maps (SOM) Nguyen Van Chuc- [email protected]

description

Implemented by Nguyen Van Chuc - Danang University of Economics, Danang- Vietnam

Transcript of Self Organizing Map

Page 1: Self Organizing Map

Self Organizing Maps (SOM)

Nguyen Van Chuc- [email protected]

Page 2: Self Organizing Map

Outline

• Motivation

• Introduction to SOM

• SOM’s Algorithm

• An Example

• Application: Using SOM to cluster data

Kohonen’s Neural Network or Self Organizing Maps (SOM)

Page 3: Self Organizing Map

Motivation

The problem is how to discover semantic relationships among large amounts of information without manual labor.

• How do I know where to put my new data if I know nothing about the information's topology?

• When I have a topic, how can I get all the information about it if I don't know where to search for it?

Page 4: Self Organizing Map

Self-Organizing Maps : Origins

Teuvo Kohonen

• Ideas first introduced by C. von der Malsburg (1973), developed and refined by T. Kohonen (Finland) (1982)

• Neural network algorithm using unsupervised competitive learning

• Primarily used for organization and visualization of complex data

• Neurons are arranged on a flat grid (map, lattice, 2 dimensional array).

• There is no hidden layer, only an input layer and an output layer.

• Each neuron on the grid is an output neuron

• Topological relationships within the training set are maintained

Page 5: Self Organizing Map

Self-Organizing Maps : The Basic Idea

• Make a two dimensional array, or map, and randomize it.

• Present training data to the map and let the cells on the map compete to win in some way (Euclidean distance is usually used)

• Stimulate the winner and some friends in the “neighborhood”. (update weight matrix)

• Do this a bunch of times.

• The result is a 2 dimensional “weight” map.
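The competition step can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the map size (3 x 3), the input dimension (2), the random seed, and the helper names `squared_distance` and `find_winner` are all assumptions made for the example.

```python
import random

random.seed(0)  # assumed seed, only to make the sketch repeatable

ROWS, COLS, DIM = 3, 3, 2  # assumed map size and input dimension

# "Make a two dimensional array, or map, and randomize it."
weights = [[[random.random() for _ in range(DIM)] for _ in range(COLS)]
           for _ in range(ROWS)]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def find_winner(weights, x):
    """Return the (row, col) of the cell whose weight vector is closest to x."""
    cells = [(r, c) for r in range(len(weights)) for c in range(len(weights[0]))]
    return min(cells, key=lambda rc: squared_distance(weights[rc[0]][rc[1]], x))


winner = find_winner(weights, [0.9, 0.1])
print("winning cell:", winner)
```

Comparing squared distances picks the same winner as comparing true Euclidean distances, so the square root is skipped.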

Page 6: Self Organizing Map

Self-Organizing Maps : Introduction

Page 7: Self Organizing Map

Self-Organizing Maps : Introduction

The neurons in the output layer are arranged on a map.

[Figure: a 2-D array of neurons; a set of input signals x1, x2, x3, ..., xn connected to all neurons in the lattice; weighted synapses wj1, wj2, wj3, ..., wjn leading to neuron j]

Page 8: Self Organizing Map

Self-Organizing Maps : Introduction

The figure shows a very small Kohonen network of 4 x 4 nodes connected to the input layer (shown in green) representing a two-dimensional vector. Each node has a specific topological position (an x, y coordinate in the lattice) and contains a vector of weights of the same dimension as the input vectors. If the training data consists of vectors V of n dimensions (V1, V2, V3, ..., Vn), then each node will contain a corresponding weight vector W of n dimensions (W1, W2, W3, ..., Wn).

Kohonen Network Architecture
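One way to hold such a network in memory is sketched below, under assumptions: each node is addressed by its lattice coordinate and stores one weight vector of the input dimension. The 4 x 4 size and two-dimensional input follow the figure description; the zero initial weights, the helper name `neighbours`, and the choice of a square neighbourhood are illustrative only.

```python
ROWS, COLS, DIM = 4, 4, 2  # 4 x 4 lattice, two-dimensional inputs

# Map each lattice position (x, y) to a weight vector of length DIM.
# Zeros are a placeholder; a real map would start from random values.
nodes = {(x, y): [0.0] * DIM for x in range(COLS) for y in range(ROWS)}


def neighbours(centre, radius):
    """Lattice positions within `radius` steps of `centre` (square neighbourhood)."""
    cx, cy = centre
    return [(x, y) for (x, y) in nodes
            if max(abs(x - cx), abs(y - cy)) <= radius]


print(len(nodes))             # number of nodes in the lattice
print(neighbours((0, 0), 1))  # a corner node together with its adjacent nodes
```

Keeping the lattice position separate from the weight vector matters because the neighbourhood is defined on the grid, not in the input space.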

Page 9: Self Organizing Map

Example Self-Organizing Maps

SOM – Result Example

‘Poverty map’ based on 39 indicators from World Bank statistics (1992)

World Poverty Map

A SOM has been used to classify statistical data describing various quality-of-life factors such as state of health, nutrition, educational services, etc. Countries with similar quality-of-life factors end up clustered together. The countries with better quality of life are situated toward the upper left and the most poverty-stricken countries are toward the lower right.

Page 10: Self Organizing Map

Self-Organizing Maps : Algorithm

Step Action

0 Initialize weights. Set the maximum value for the neighbourhood radius R and set the learning rate α.

1 While the stopping condition is false, do steps 2 to 8.

2 For each input vector x, do steps 3 to 5.

3 For each neuron j, compute the squared Euclidean distance D(j) = Σ_i (w_ij − x_i)².

4 Find the index J such that D(J) is a minimum.

5 For all neurons j within a specified neighbourhood of J and for all i, update the weights: w_ij(new) = w_ij(old) + α [x_i − w_ij(old)].

6 Update the learning rate α. It is a decreasing function of the number of epochs.

7 Reduce the radius R of the topological neighbourhood at specified times.

8 Test the stopping condition. Typically this is a small value of the learning rate, at which the weight updates become insignificant.
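The steps above can be sketched in Python as follows. This is a sketch under stated assumptions rather than a reference implementation: neurons are laid out in a one-dimensional line so that the neighbourhood of J is simply the indices within the radius of J, the learning rate is halved after each epoch, the radius shrinks by one every `shrink_every` epochs, and training stops once the learning rate falls below `min_alpha`. None of these schedules is specified on the slide, and the function and parameter names are hypothetical. Step 0 (initializing the weights) is left to the caller.

```python
def train_som(data, weights, alpha=0.5, radius=1,
              min_alpha=0.01, shrink_every=5):
    """Train a 1-D SOM in place; weights[j] is the weight vector of neuron j."""
    epoch = 0
    while alpha >= min_alpha:                      # steps 1 and 8
        for x in data:                             # step 2
            # Step 3: squared Euclidean distance from x to every neuron.
            dist = [sum((w_i - x_i) ** 2 for w_i, x_i in zip(w, x))
                    for w in weights]
            # Step 4: index of the winning neuron.
            winner = dist.index(min(dist))
            # Step 5: move the winner and its neighbours towards x.
            lo = max(0, winner - radius)
            hi = min(len(weights) - 1, winner + radius)
            for j in range(lo, hi + 1):
                weights[j] = [w_i + alpha * (x_i - w_i)
                              for w_i, x_i in zip(weights[j], x)]
        epoch += 1
        alpha *= 0.5                               # step 6 (assumed schedule)
        if radius > 0 and epoch % shrink_every == 0:
            radius -= 1                            # step 7 (assumed schedule)
    return weights


# Toy usage with made-up data: two well-separated groups of 2-D points.
data = [[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9]]
weights = [[0.2, 0.2], [0.5, 0.5], [0.8, 0.8]]
train_som(data, weights, alpha=0.5, radius=0)
print(weights)
```

With radius 0 only the winner moves, so in this toy run the middle neuron never wins and keeps its initial weights while the outer two drift towards the two groups.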

Page 11: Self Organizing Map

Example

To make the problem very simple, suppose that there are only two neurons in the output layer as shown below:

Page 12: Self Organizing Map

Example

Consider a simple example in which there are only 4 input training patterns.

Following the algorithm presented earlier:

Page 13: Self Organizing Map

Example

Page 14: Self Organizing Map

Example

For vector 0011: D(1) = 0.7056, D(2) = 2.724. Hence J = 1.

For vector 0001: D(1) = 0.66, D(2) = 2.2768. Hence J = 1.

For vector 1000: D(1) = 1.8656, D(2) = 0.6768. Hence J = 2.

Likewise for the remaining training patterns.
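The distances quoted above can be checked with a short script. The transcript does not preserve the initial weights or the learning rate, so the values below are assumptions: initial weight vectors (0.2, 0.6, 0.5, 0.9) and (0.8, 0.4, 0.7, 0.3), learning rate 0.6, neighbourhood radius 0, and presentation order 1100, 0001, 1000, 0011. They were chosen because, with D taken as the squared Euclidean distance, they reproduce the figures quoted for 0001, 1000 and 0011.

```python
# Assumed starting point (not preserved in the transcript).
weights = [[0.2, 0.6, 0.5, 0.9],   # neuron 1
           [0.8, 0.4, 0.7, 0.3]]   # neuron 2
alpha = 0.6                         # assumed learning rate
patterns = [[1, 1, 0, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 0, 1, 1]]

results = []
for x in patterns:
    # Squared Euclidean distance from x to each neuron's weight vector.
    d = [sum((w_i - x_i) ** 2 for w_i, x_i in zip(w, x)) for w in weights]
    winner = d.index(min(d))        # 0-based index of the winning neuron
    results.append((d[0], d[1], winner + 1))
    # Radius 0: only the winner moves towards x.
    weights[winner] = [w_i + alpha * (x_i - w_i)
                       for w_i, x_i in zip(weights[winner], x)]

for x, (d1, d2, j) in zip(patterns, results):
    label = "".join(str(v) for v in x)
    print(f"{label}: D(1) = {d1:.4f}, D(2) = {d2:.4f}, J = {j}")
```

Because each update changes the winner's weights, the distances depend on the order in which the patterns are presented; a different order would give different numbers.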

Page 15: Self Organizing Map

Example

Now reduce learning rate (step 6):

It can be shown that after 100 presentations of all the input vectors, the final weight matrix is

This matrix seems to converge to

Page 16: Self Organizing Map

Example

TEST NETWORK

Suppose the input pattern is 1100. Then

Thus neuron 2 is the "winner", and is the localized active region of the SOM. Notice that we may label this input pattern as belonging to cluster 2.

For all the other patterns, we find the clusters are as listed below.

Pattern    Cluster
1 1 0 0    2
0 0 0 1    1
1 0 0 0    2
0 0 1 1    1
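Assigning a pattern to a cluster is just the competition step without the weight update. The sketch below illustrates this with hypothetical weight vectors, since the trained weight matrix is not preserved in the transcript; the numbers and the function name `cluster_of` are assumptions chosen only so that each neuron sits near one pair of patterns.

```python
# Hypothetical trained weights (the slide's matrix is not in the transcript).
weights = [[0.03, 0.10, 0.68, 0.98],   # neuron 1
           [0.97, 0.30, 0.11, 0.05]]   # neuron 2


def cluster_of(x, weights):
    """Return the 1-based index of the neuron closest to pattern x."""
    d = [sum((w_i - x_i) ** 2 for w_i, x_i in zip(w, x)) for w in weights]
    return d.index(min(d)) + 1


patterns = [[1, 1, 0, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 0, 1, 1]]
clusters = [cluster_of(x, weights) for x in patterns]
for x, c in zip(patterns, clusters):
    print(x, "-> cluster", c)
```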


Page 17: Self Organizing Map

Application

Using SOM to cluster Data

The IRIS dataset

IRIS is a classical data set used by statisticians to check classification methods.

It consists of 150 flower samples divided into 3 classes (50 setosa, 50 versicolor, 50 virginica), each described by 4 variables (petal length, petal width, sepal length, sepal width).

Page 18: Self Organizing Map

Application

Using SOM to cluster Data

Graphical Results

We can look at the class assigned to each neuron (i.e. each neuron is coloured according to its assigned class).
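One common way to obtain such a colouring is to label each neuron with the majority class of the samples it wins. The sketch below assumes this majority-vote rule, which the slides do not spell out; the (neuron, label) pairs are made-up placeholders, not results from the IRIS experiment.

```python
from collections import Counter, defaultdict

# Made-up (winning neuron, class label) pairs for a handful of samples.
hits = [
    ((0, 0), "setosa"), ((0, 0), "setosa"),
    ((2, 1), "versicolor"), ((2, 1), "virginica"), ((2, 1), "versicolor"),
    ((3, 3), "virginica"),
]

# Group the labels of the samples placed in each neuron.
labels_by_neuron = defaultdict(list)
for neuron, label in hits:
    labels_by_neuron[neuron].append(label)

# Assign each neuron the most frequent label among its samples.
neuron_class = {neuron: Counter(labels).most_common(1)[0][0]
                for neuron, labels in labels_by_neuron.items()}

for neuron in sorted(neuron_class):
    print(neuron, neuron_class[neuron], labels_by_neuron[neuron])
```

Neurons that win no samples get no entry here; a plot would typically leave them uncoloured.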

Page 19: Self Organizing Map

Application

Using SOM to cluster Data

Graphical Results

We can look at the weights of a specific neuron and at the labels of the samples placed in that neuron.

Page 20: Self Organizing Map

Conclusions

• SOM is an algorithm that projects high-dimensional data onto a two-dimensional map.

• The projection preserves the topology of the data so that similar data items will be mapped to nearby locations on the map.

• SOM still has many practical applications in pattern recognition, speech analysis, industrial and medical diagnostics, and data mining.

• A large quantity of good-quality, representative training data is required.

Page 21: Self Organizing Map

Thank you for your attention !