Neural Networks: Self Organizing Maps (cancelli/retineu11_12/SOM.pdf)


NN 4 2.09.98


Neural Networks

Self Organizing Maps

NN 4 1

Unsupervised Learning

Neural networks for unsupervised learning attempt to discover interesting structure in the data, without making use of information about the class of an example.

NN 4 2


K-means clustering
- Initialize K weight vectors, e.g. to randomly chosen examples. Each weight vector represents a cluster.
- Assign each input example x to the cluster c(x) with the nearest corresponding weight vector:

      c(x) = arg min_j || x − w_j(n) ||

- Update the weights:

      w_j(n+1) = ( Σ_{x : c(x) = j} x ) / P_j

  with P_j the number of examples assigned to cluster j.
- Repeat until no noticeable changes of the weight vectors occur.

NN 4 3
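The K-means procedure on this slide can be sketched directly in Python. This is a minimal illustration, not the course's reference implementation; the function name and parameters are my own.

```python
import random

def kmeans(examples, k, max_iters=100, seed=0):
    """Minimal K-means sketch: initialize K weight vectors to randomly
    chosen examples, then alternate assignment and update steps."""
    rng = random.Random(seed)
    w = [list(v) for v in rng.sample(examples, k)]  # K weight vectors
    for _ in range(max_iters):
        # Assignment: c(x) = arg min_j || x - w_j ||
        clusters = [[] for _ in range(k)]
        for x in examples:
            j = min(range(k),
                    key=lambda j: sum((a - b) ** 2 for a, b in zip(x, w[j])))
            clusters[j].append(x)
        # Update: w_j = (sum of assigned examples) / P_j
        new_w = []
        for j in range(k):
            if clusters[j]:
                p = len(clusters[j])  # P_j, examples assigned to cluster j
                new_w.append([sum(col) / p for col in zip(*clusters[j])])
            else:
                new_w.append(w[j])    # keep empty clusters unchanged
        if new_w == w:                # no noticeable change -> stop
            break
        w = new_w
    return w

centers = kmeans([(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)], k=2)
```

On this toy data the two weight vectors converge to the means of the two obvious groups regardless of which examples are chosen as seeds.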

Example I

NN 4 4

[Figure: initial data and seeds (left); final clustering (right)]


Example II

NN 4 5

[Figure: initial data and seeds (left); final clustering (right)]

Problems
- How many clusters? Use a given parameter K.
- What similarity measure? Euclidean distance, correlation coefficient, or an ad-hoc similarity measure.
- How to assess the quality of a clustering? Compact and well separated clusters are better; many different quality measures have been introduced.

NN 4 6


Good Clustering
A good clustering method will produce high quality clusters with
- high intra-class similarity
- low inter-class similarity

The quality of a clustering result depends on both the similarity measure used by the method and its implementation. The quality of a clustering method is also measured by its ability to discover hidden structures.

NN 4 7
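The intra-class / inter-class idea above can be made concrete with one toy quality measure (one of the "many different quality measures" the slides mention, not a canonical one): mean distance of points to their cluster centroid versus mean distance between centroids.

```python
def clustering_quality(clusters):
    """Toy quality check: low mean intra-cluster distance and high mean
    inter-centroid distance indicate compact, well separated clusters."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    def centroid(pts):
        return [sum(c) / len(pts) for c in zip(*pts)]

    cents = [centroid(c) for c in clusters]
    # intra-class: distance of each point to its own centroid
    intra = [dist(p, cents[i]) for i, c in enumerate(clusters) for p in c]
    # inter-class: pairwise distances between cluster centroids
    inter = [dist(cents[i], cents[j])
             for i in range(len(cents)) for j in range(i + 1, len(cents))]
    return sum(intra) / len(intra), sum(inter) / len(inter)

intra, inter = clustering_quality([[(0, 0), (0, 1)], [(5, 5), (5, 6)]])
```

Here the two tight, far-apart clusters give a small intra value (0.5) and a large inter value (about 7.07).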

Self Organizing Maps (SOM)

SOM is an unsupervised neural network that approximates an unlimited number of input data by a finite set of nodes arranged in a grid, where neighbor nodes correspond to more similar input data.

The model is produced by a learning algorithm that automatically orders the inputs on a one or two‐dimensional grid according to their mutual similarity. 

NN 4 8


Biological Motivation 

NN 4 9

Nearby areas of the cortex correspond to related brain functions.

Brain's self-organization: the brain maps the external multidimensional representation of the world into a similar 1- or 2-D internal representation.

That is, the brain processes the external signals in a topology-preserving way, and our computational system should be able to do the same.

NN 4 10


SOM: the idea

[Figure: data points (o) and network parameter vectors W(i) (') in the feature space, next to a 2-D grid of local processors with weights W assigned to the input neurons x, y, z]

NN 4 11

- Data: vectors X = (X¹, X², X³) from 3-dimensional space.
- SOM: grid (lattice) of nodes, with a local processor (called neuron) in each node; each local processor j has d = 3 adaptive parameters W_j ≡ (w_j1, w_j2, w_j3).
- Goal: change W_j to recover data clusters in X space.

SOM: the idea

[Figure: input space / input layer mapped onto a reduced feature space / map layer; cluster centers (code vectors) on the left, place of these code vectors in the reduced space on the right]

NN 4 12

Clustering and ordering of the cluster centers in a two-dimensional grid.


SOM Formalization
Every input data component is connected with each neuron of the lattice. The topology of the lattice makes it possible to define a neighborhood structure on the neurons, like those illustrated below.

[Figure: a 2-dimensional topology with two possible neighborhoods, and a 1-dimensional topology with a small neighborhood]

NN 4 13

A 2-dimensional SOM with 3D inputs

[Figure: map lattice connected to a layer of input nodes]

NN 4 14


SOM: interpretation

Each SOM neuron can be considered as representative of a cluster containing all the input examples which are mapped to that neuron. For a given input, the output of SOM is the neuronwhose weight vector is most similar to that input. 

NN 4 15
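The interpretation above, where the output is the neuron whose weight vector is most similar to the input, is a nearest-neighbor lookup. A minimal sketch (function name and example values are illustrative):

```python
def winner(x, weights):
    """Return the index of the neuron whose weight vector is most
    similar (smallest Euclidean distance) to input x."""
    return min(range(len(weights)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(x, weights[j])))

# hypothetical map of three neurons with 2-D weight vectors
w = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
bmu = winner((0.9, 0.1), w)  # the input's cluster representative
```

The input (0.9, 0.1) is mapped to the second neuron, whose weight vector (1.0, 0.0) lies closest to it.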

SOM: learning
Through repeated presentations of the training examples, the weight vectors of the neurons are adapted and tend to follow the distribution of the examples.
This results in a topological ordering of the neurons, where neurons adjacent to each other in the map tend to have similar weight vectors.
The input space of patterns is mapped into a discrete output space of neurons.

NN 4 16


Learning
- Initialization: n = 0. Choose random small values for the weight vector components.
- Sampling: select a pattern x from the input examples.
- Similarity matching: find the winning neuron i(x) at iteration n:

      i(x) = arg min_j || x(n) − w_j(n) ||

- Updating: adjust the weight vectors of all neurons using the following rule:

      w_j(n+1) = w_j(n) + η(n) · h_{i,j}(n) · ( x(n) − w_j(n) )

- Continuation: n = n + 1. Go to the Sampling step until no noticeable changes in the weights are observed.

NN 4 17
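One iteration of this algorithm, combining the similarity-matching and updating steps with a Gaussian neighborhood, can be sketched as follows. The function name, the `grid` argument (lattice positions of the neurons) and the toy values are assumptions for illustration:

```python
import math

def som_step(x, w, grid, eta, sigma):
    """One SOM learning iteration (a sketch): find the winning neuron
    i(x), then move every neuron toward x, weighted by the Gaussian
    neighborhood h_{i,j}. grid[j] is the lattice position of neuron j."""
    # similarity matching: i(x) = arg min_j || x - w_j ||
    i = min(range(len(w)),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(x, w[j])))
    # updating: w_j += eta * h_{i,j} * (x - w_j) for all neurons j
    for j in range(len(w)):
        d2 = sum((p - q) ** 2 for p, q in zip(grid[i], grid[j]))
        h = math.exp(-d2 / (2.0 * sigma ** 2))   # neighborhood h_{i,j}
        w[j] = [wj + eta * h * (xk - wj) for wj, xk in zip(w[j], x)]
    return i

# one step on a tiny 1-D lattice of three neurons
w = [[0.0], [0.5], [1.0]]
grid = [(0,), (1,), (2,)]
win = som_step([1.0], w, grid, eta=0.5, sigma=1.0)
```

The third neuron wins; it stays put (its weight already equals x), while its lattice neighbors are pulled toward x with strength decreasing in lattice distance.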

Neighborhood Function
Gaussian neighborhood function:

      h_{i,j} = exp( − || r_i − r_j ||² / (2σ²) )

where r_i and r_j are the lattice positions of neurons i and j; in a 1-dimensional lattice || r_i − r_j || = | j − i |.

σ measures the degree of cooperation in the learning process of the excited neurons in the vicinity of the winning neuron. In the learning phase σ is updated at each iteration; during the ordering phase it is updated using the following exponential decay update rule:

      σ(n) = σ₀ · exp( − n / T₁ )

with parameters σ₀ and T₁.

NN 4 18


Cooperation

[Figure: Gaussian neighborhood function h_{i,j} plotted against lattice distance for σ = 50, 30, 20, 10; larger σ gives a wider, flatter neighborhood]

NN 4 19

UPDATE RULE

      w_j(n+1) = w_j(n) + η(n) · h_{i,j}(n) · ( x − w_j(n) )

Also the learning rate parameter has an exponential decay update:

      η(n) = η₀ · exp( − n / T₂ )

NN 4 20
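The two exponential decay schedules, for the neighborhood spread σ(n) and the learning rate η(n), can be written directly. The concrete values of σ₀, η₀, T₁ and T₂ below are illustrative assumptions, not values prescribed by the slides:

```python
import math

def sigma(n, sigma0=10.0, T1=1000.0):
    """Neighborhood spread schedule: sigma(n) = sigma0 * exp(-n / T1)."""
    return sigma0 * math.exp(-n / T1)

def eta(n, eta0=0.1, T2=1000.0):
    """Learning rate schedule: eta(n) = eta0 * exp(-n / T2)."""
    return eta0 * math.exp(-n / T2)

# both shrink monotonically as learning proceeds
schedule = [(sigma(n), eta(n)) for n in (0, 1000, 2000)]
```

Both parameters start at their initial values and decay smoothly, so cooperation narrows and updates become gentler as the map orders itself.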


Weight update

[Figure: geometric view of the update: w_j(n) is moved toward x by the vector η(n) h_{i,j}(n) ( x − w_j(n) ), giving w_j(n+1)]

NN 4 21

Two-phase learning approach

1. Self-organizing or ordering phase. The learning rate and the spread of the Gaussian neighborhood function are adapted during the execution of SOM, using for instance the exponential decay update rule.

2. Convergence phase. The learning rate and Gaussian spread have small fixed values during the execution of SOM.

NN 4 22


Training: Ordering Phase
Self-organizing or ordering phase:
- There is a topological ordering of the weight vectors.
- It may take 1000 or more iterations of the SOM algorithm.
- The choice of the parameter values is important.
- With a proper initial setting of the parameters, the neighborhood of the winning neuron includes almost all neurons in the network; it then shrinks slowly with time.

NN 4 23

Training: Convergence Phase
Convergence phase:
- Fine tune the weight vectors.
- The number of iterations must be at least 500 times the number of neurons in the network ⇒ thousands or tens of thousands of iterations.

Choice of parameter values:
- η(n) maintained on the order of 0.01.
- Neighborhood function such that the neighborhood of the winning neuron contains only the nearest neighbors; it eventually reduces to one or zero neighboring neurons.

NN 4 24
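Putting the two phases together, a complete training loop can be sketched on a 1-D lattice. All concrete parameter values here (σ₀, η₀, T₁, T₂, the fixed phase-2 values, the iteration counts) are illustrative assumptions, not the course's prescribed settings:

```python
import math
import random

def train_som(data, n_neurons, ordering_iters=1000, convergence_iters=2000,
              seed=0):
    """Two-phase SOM training sketch on a 1-D lattice."""
    rng = random.Random(seed)
    dim = len(data[0])
    # initialization: random small values for the weight components
    w = [[rng.uniform(-0.05, 0.05) for _ in range(dim)]
         for _ in range(n_neurons)]
    sigma0, eta0 = n_neurons / 2.0, 0.1
    T1, T2 = ordering_iters / 4.0, float(ordering_iters)

    def step(x, eta, sigma):
        # winner selection + Gaussian-neighborhood update
        i = min(range(n_neurons),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(x, w[j])))
        for j in range(n_neurons):
            h = math.exp(-((i - j) ** 2) / (2.0 * sigma ** 2))
            w[j] = [wj + eta * h * (xk - wj) for wj, xk in zip(w[j], x)]

    # Phase 1 - ordering: eta and sigma decay exponentially
    for n in range(ordering_iters):
        step(rng.choice(data),
             eta0 * math.exp(-n / T2),
             max(sigma0 * math.exp(-n / T1), 0.5))
    # Phase 2 - convergence: small fixed values (eta ~ 0.01,
    # neighborhood reduced to the nearest neighbors)
    for _ in range(convergence_iters):
        step(rng.choice(data), 0.01, 0.5)
    return w

# toy run: 5 neurons following a 1-D uniform distribution on (-1, 1)
rng = random.Random(1)
data = [(rng.uniform(-1.0, 1.0),) for _ in range(200)]
weights = train_som(data, n_neurons=5)
```

Since every update moves a weight by a convex step toward a data point, the trained weights stay inside the data range and spread out to follow the uniform distribution.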


Visualization
Neurons are visualized as changing positions in the weight space as learning takes place. Each neuron is described by the corresponding weight vector, consisting of the weights of the links from the input layer to that neuron. Two neurons are connected by an edge if they are direct neighbors in the NN lattice.

NN 4 25

Example 1
A two-dimensional lattice driven by a two-dimensional distribution:
- 100 neurons arranged in a 2D lattice of 10 × 10 nodes.
- Input is bidimensional: x = (x1, x2) from a uniform distribution in the region { (−1 < x1 < +1); (−1 < x2 < +1) }.

Weights are initialized with small random values.

NN 4 26


Example 1 

NN 4 27

Initial h function (Example 1)

NN 4 28


Example 2 

A one-dimensional lattice driven by a two-dimensional distribution:
- 100 neurons arranged in a one-dimensional lattice.
- Input space is the same as in Example 1.

Weights are initialized with random values (again like in example 1).

NN 4 29

Example 2 

NN 4 30


Example 2 

NN 4 31

Application: Italian olive oil
- 572 samples of olive oil were collected from 9 Italian provinces.
- Content of 8 fats was determined for each oil.
- SOM: 20 × 20 network, mapping 8D => 2D.

NN 4 32

Note that topographical relations are preserved.


Other real-life applications
The Helsinki University of Technology web site http://www.cis.hut.fi/research/refs/ has a list of > 7000 papers on SOM and its applications!

- Brain research: creation of various topographical maps in motor, auditory and visual areas.
- AI and robotics: analysis of data from sensors, control of robot's movement (motor maps), spatial orientation maps.
- Information retrieval and text categorization.
- Bioinformatics: clusterization of genes, protein properties, chemical compounds.
- Business: economical, business and financial data processing.
- Data compression (images and audio), information filtering.
- Medical diagnosis.

NN 4 33

And some more ...
- Natural language processing: linguistic analysis, parsing, learning languages, hyphenation patterns.
- Optimization: configuration of telephone connections, VLSI design, time series prediction, scheduling algorithms.
- Signal processing: adaptive filters, real-time signal analysis, radar, sonar, seismic, EEG and other medical signals ...
- Image recognition and processing: segmentation, object recognition, texture recognition ...
- Content-based retrieval.

NN 4 34