Download - Practical Tools for Self-Organizing Maps … learned results as a 2D map. T. Kohonen, Self-organizing maps, 3rd ed. Berlin; New York: Springer, 2001. Self-Organization and Learning

Transcript
Page 1

Practical Tools for Self-Organizing Maps

Lutz Hamel, Dept. of Computer Science & Statistics, URI

Chris Brown, Dept. of Chemistry, URI

Page 2

Self-Organizing Maps (SOMs)

A neural network approach to unsupervised machine learning.

Appealing visual presentation of learned results as a 2D map.

T. Kohonen, Self-organizing maps, 3rd ed. Berlin ; New York: Springer, 2001.

Page 3

Self-Organization and Learning

Self-organization refers to a process in which the internal organization of a system increases automatically without being guided or managed by an outside source.

This process is due to local interaction with simple rules.

Local interaction gives rise to global structure.

Complexity: Life at the Edge of Chaos, Roger Lewin, University of Chicago Press; 2nd edition, 2000

We can interpret emerging global structures as learned structures.

Learned structures appear as clusters of similar objects.

Page 4

Self-Organizing Maps

Data Table

“Grid of Neurons”

Algorithm:

Repeat until Done
    For each row in Data Table Do
        Find the neuron that best describes the row.
        Make that neuron look more like the row.
        Smooth the immediate neighborhood of that neuron.
    End For
End Repeat
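
The algorithm above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the names `train_som` and `best_matching_unit`, the fixed epoch count standing in for "until Done", the constant learning rate `lr`, and the Gaussian neighborhood of width `radius` are all assumptions. Practical SOM implementations usually shrink the learning rate and the neighborhood over time.

```python
import math
import random


def best_matching_unit(grid, row):
    """Return the grid position whose weight vector is closest to the row."""
    return min(grid, key=lambda p: sum((w - v) ** 2 for w, v in zip(grid[p], row)))


def train_som(rows, grid_w, grid_h, epochs=20, lr=0.5, radius=1.0, seed=0):
    """Train a small SOM; returns a dict mapping (x, y) to a weight vector."""
    rng = random.Random(seed)
    dim = len(rows[0])
    # Every neuron starts as a random vector with the same length as a data row.
    grid = {(x, y): [rng.random() for _ in range(dim)]
            for x in range(grid_w) for y in range(grid_h)}
    for _ in range(epochs):              # "Repeat until Done" (assumed: fixed epochs)
        for row in rows:                 # "For each row in Data Table"
            best = best_matching_unit(grid, row)
            for pos, weights in grid.items():
                d2 = (pos[0] - best[0]) ** 2 + (pos[1] - best[1]) ** 2
                # Neighborhood weight: 1 at the best neuron, fading with grid distance.
                h = math.exp(-d2 / (2 * radius ** 2))
                # Pull the best neuron, and more weakly its neighbors, toward the row.
                for i in range(dim):
                    weights[i] += lr * h * (row[i] - weights[i])
    return grid


rows = [[0.0, 0.0], [0.0, 0.1], [1.0, 1.0], [1.0, 0.9]]
som = train_som(rows, grid_w=3, grid_h=3)
print(best_matching_unit(som, rows[0]))
```

Updating the whole grid with a distance-dependent weight covers both "make that neuron look more like the row" and "smooth the immediate neighborhood" in a single step.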

Visualization

Page 5

Feature Vector Construction

In order to use SOMs we need to describe our objects as feature vectors.

Small  Medium  Big  Two legs  Four legs  Hair  Hooves  Mane  Feathers  Hunt  Run  Fly  Swim
1      0       0    1         0          0     0       0     1         0     0    0    1
0      0       1    0         1          1     1       0     0         0     1    0    0
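
As a small illustration of this encoding, the sketch below turns a set of traits into a 0/1 feature vector in the column order used on the slide. The helper name `to_feature_vector` is hypothetical; the two trait sets are read off the two vectors above.

```python
# Column order as given on the slide.
FEATURES = ["small", "medium", "big", "two legs", "four legs", "hair", "hooves",
            "mane", "feathers", "hunt", "run", "fly", "swim"]


def to_feature_vector(traits):
    """Encode a set of trait names as a list of 0/1 values, one per feature."""
    unknown = set(traits) - set(FEATURES)
    if unknown:
        raise ValueError(f"unknown traits: {sorted(unknown)}")
    return [1 if name in traits else 0 for name in FEATURES]


first = to_feature_vector({"small", "two legs", "feathers", "swim"})
second = to_feature_vector({"big", "four legs", "hair", "hooves", "run"})
print(first)   # [1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1]
print(second)  # [0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0]
```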

Page 6

Training a SOM

[Figure: table of feature vectors, “Grid of Neurons”, visualization]

Page 7

Applications of SOM

Infrared Spectroscopy

Goal: to find out whether compounds are chemically related without performing an expensive chemical analysis.

Each compound is tested for light absorbance in the infrared spectrum.

Specific chemical structures absorb specific ranges of the infrared spectrum.

This means that each compound has a specific “spectral signature”.

We can use SOMs to investigate similarity.

Page 8

Training SOM with Spectra

[Figure: grid of neurons, random-number spectra, and spectral library; spectral axis ticks at 3000, 2000, and 1000]

Page 9

Self-Organizing Map: MIR Spectra

Page 10

MIR SOM: Functional Groups

[Figure: MIR SOM with regions labeled Alkanes, Alcohols, Acids, Ketones/Aldehydes, CxHy-NH2, Esters, Aromatics, NH2-Ν, Clx-Ν, Ester-Ν, CxHy-Ν]

Page 11

MIR Centroid Spectra

[Figure: centroid spectra; x-axis: wavenumber, cm⁻¹, from 4000 to 500]

Page 12

MIR Significance Spectrum

[Figure: significance vs. wavenumber, cm⁻¹ (4000 to 1000); peak labels: -OH, -CH3, -CH2-, -C=O, -C-O-, ΝΝΝH]

Page 13

NIR SOM

Page 14

NIR SOM: Functional Groups

[Figure: NIR SOM with regions labeled Aromatics, Alcohols, Carbonyls, Acids, Alkanes, Amines, NH2-Ν, Clx-Ν, Ester-Ν, CH3-Ν, Poly-Ν, CxHy-Ν]

Page 15

NIR Centroid Spectra

[Figure: centroid spectra; x-axis: wavenumber, cm⁻¹, ticks at 4000, 5000, 6000, and 7000]

Page 16

NIR Significance Spectrum

[Figure: significance vs. wavenumber, cm⁻¹ (ticks at 4000, 5000, 6000, and 7000); peak labels: -OH, -CH3, -CH2-, Ν, Ν-H]

Page 17

SOM: Bacterium b-cereus on Different Agars

[Figure: SOM with regions labeled by agar: Mannitol, Nutrient, Blood, Chocolate Blood]

“You are what you eat!”

Page 18

Significance Spectrum: b-cereus on Different Agars

[Figure: absorbance and significance vs. wavenumber, cm⁻¹ (2000 to 800); curves labeled Significance, Chocolate blood, Nutrient, Mannitol, Blood]

Page 19

SOM: Bacteria Spectra

[Figure: SOM of spore and vegetative bacteria spectra; region labels: b-subtilis, b-cereus, b-anthracis, b-thur]

Page 20

Significance Spectrum vs. b-subtilis 1st Derivative Spectra

[Figure: absorbance and significance vs. wavenumber, cm⁻¹ (2000 to 800); curves labeled vegetative, spores, Significance]

Page 21

Applications of SOM

Genome Clustering

Goal: to understand the phylogenetic relationships between different genomes.

Compute the bootstrap support of individual genomes for different phylogenetic tree topologies, then cluster based on the topology support.

Joint work with Prof. Gogarten, Dept. Molecular Biology, Univ. of Connecticut

Page 22

Phylogenetic Visualization with SOMs

[Figure: phylogenetic SOM; surviving labels: 1211, 11/44, 15, 9]

Page 23

Applications of SOM

Clustering proteins based on the architecture of their activation loops.

Align the proteins under investigation.

Extract the functional centers.

Turn the 3D representation into 1D feature vectors.

Cluster based on the feature vectors.

Joint work with Dr. Gongqin Sun, Dept. of Cell and Molecular Biology, URI

Page 24

Structural Classification of GTPases

Can we structurally distinguish between the Ras and Rho subfamilies?

Ras: 121P, 1CTQ, and 1QRA

Rho: 1A2B and 1OW3

F = p-loop, r = 10 Å

[Figure: SOM with Ras and Rho regions; labeled nodes include 1A2B and 1CTQ]

Page 25: Practical Tools for Self-Organizing Maps€¦ · learned results as a 2D map. T. Kohonen, Self-organizing maps, 3rd ed. Berlin ; New York: Springer, 2001. Self-Organization and Learning

Two Central Questions

Which features are the most importantones for clustering?

How good is the map?

Page 26

Variance Matters!

Features with large variance have a higher probability of showing structure than features with small variance.

Therefore, features with large variance tend to be more significant to the clustering process than features with small variance.

[Figure: plot with axes labeled “Small Variance” and “Large Variance”]

Page 27

Bayes Theorem

Using Bayes' theorem we turn the observed variances (observed significances) into significance probabilities:

\[
\begin{aligned}
P(A_i \mid +) &\equiv \text{observed significance of feature } A_i \\
P(+ \mid A_i) &\equiv \text{probability that feature } A_i \text{ is significant (significance)} \\
P(A_i) &\equiv \text{prior of feature } A_i
\end{aligned}
\]

\[
P(+ \mid A_k) = \frac{P(A_k \mid +)\, P(A_k)}{\sum_i P(A_i \mid +)\, P(A_i)}
\qquad \text{(significance of feature } A_k\text{)}
\]
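
A minimal sketch of this computation follows. Several choices here are assumptions and do not come from the slides: each feature's population variance is used directly as its observed significance P(A_i | +), the prior P(A_i) defaults to uniform, and the function name `significance_probabilities` is hypothetical.

```python
from statistics import pvariance


def significance_probabilities(rows, priors=None):
    """Return P(+ | A_k) for every feature (column) of the data table."""
    columns = list(zip(*rows))
    # Assumption: the variance of a column is its observed significance P(A_i | +).
    observed = [pvariance(col) for col in columns]
    if priors is None:
        # Assumption: uniform prior P(A_i) over all features.
        priors = [1.0 / len(columns)] * len(columns)
    weighted = [o * p for o, p in zip(observed, priors)]
    total = sum(weighted)  # the denominator: sum over i of P(A_i | +) P(A_i)
    if total == 0:
        raise ValueError("all features are constant; significance is undefined")
    return [w / total for w in weighted]


table = [[0, 0, 5], [1, 0, 5], [0, 2, 5], [1, 2, 5]]
print(significance_probabilities(table))
```

In this toy table the third feature is constant, so it receives zero significance, and the second feature, which has the largest variance, receives the largest share.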

Page 28

Probabilistic Feature Selection

Given the significance probability of each feature of a data set we can ask interesting questions:

How many features do we need to include in our training data in order to be 95% sure that we included all the significant features?

What is the significance level of the top 10% of my features?
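
Both questions can be answered from a ranked list of significance probabilities. The sketch below assumes one particular reading, which the slides do not spell out: "95% sure" is taken to mean that the cumulative significance probability of the selected features reaches 0.95. The function names and the example probabilities are hypothetical.

```python
import math


def features_needed(probabilities, confidence=0.95):
    """Smallest number of top-ranked features whose probabilities sum to the confidence."""
    ranked = sorted(probabilities, reverse=True)
    total = 0.0
    for count, p in enumerate(ranked, start=1):
        total += p
        if total >= confidence - 1e-12:  # small tolerance for floating-point sums
            return count
    return len(ranked)


def significance_of_top(probabilities, fraction=0.10):
    """Cumulative significance probability of the top fraction of features."""
    ranked = sorted(probabilities, reverse=True)
    k = max(1, math.ceil(fraction * len(ranked)))
    return sum(ranked[:k])


probs = [0.5, 0.3, 0.1, 0.05, 0.03, 0.02, 0.0, 0.0, 0.0, 0.0]
print(features_needed(probs, 0.95))      # 4
print(significance_of_top(probs, 0.10))  # 0.5
```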

Page 29

Feature Selection

[Figure: significance plot and probability distribution, annotated with two questions: given significance = 95%, how many features? Given features = 10%, what significance?]

Page 30

Evaluating a SOM

The canonical performance metric for SOMs is the quantization error.

It is very difficult to relate to the training data (e.g., how small is the optimal quantization error?).

Here we take a different approach: we view a map as a non-parametric, generative model.

This gives rise to a new model evaluation criterion via the classical two-sample problem.
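
For reference, the quantization error mentioned above is commonly computed as the average distance between each training row and its best-matching neuron. The sketch below uses that common definition with Euclidean distance; the function name and the toy data are hypothetical.

```python
import math


def quantization_error(rows, neurons):
    """Mean Euclidean distance from each row to its closest neuron."""
    total = 0.0
    for row in rows:
        total += min(math.dist(row, neuron) for neuron in neurons)
    return total / len(rows)


rows = [[0.0, 0.0], [3.0, 4.0]]
neurons = [[0.0, 0.0], [0.0, 4.0]]
print(quantization_error(rows, neurons))  # 1.5
```

The result is in the units of the data, which is one reason the value is hard to interpret on its own: there is no data-independent threshold for a "small enough" error.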

Page 31

Generative Models

A generative model is a model from which we can sample to compute new values of the underlying input domain.

The classical generative model is the Gaussian function: once we have fitted the function to our known samples, we can compute the probability of any sample of the input domain.

However, the model is parametric; it is governed by the mean µ and the standard deviation σ.

Notation: $N(\mu, \sigma^2),\ \forall x$

Image source: www.wikipedia.com
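
A short sketch of the Gaussian as a parametric generative model, using `statistics.NormalDist` from the Python standard library. The sample values are made up for illustration. Note that `pdf` returns a probability density, not a probability.

```python
from statistics import NormalDist

samples = [4.8, 5.1, 5.0, 4.9, 5.2]         # made-up known samples

# Fitting the model means estimating its two parameters, mu and sigma.
model = NormalDist.from_samples(samples)
print(model.mean, model.stdev)

# Evaluate the fitted density at any point of the input domain.
density = model.pdf(5.0)

# Generate new values from the fitted model.
new_values = model.samples(3, seed=1)
print(density, new_values)
```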

Page 32

Insight: SOMs Sample the Data Space

Given some distribution in the data space, SOM will try to construct a sample that looks like it was drawn from the same distribution.

It will construct the sample using interpolation (neighborhood function) and constraints (the map grid).

We can then measure the quality of the map using a statistical two-sample approach.

Algorithm:

Repeat until Done
    For each row in Data Table Do
        Find the neuron that best describes the row.
        Make that neuron look more like the row.
        Smooth the immediate neighborhood of that neuron.
    End For
End Repeat

Image source: www.peltarion.com

Page 33

SOM as a Non-parametric Generative Model

Let $D$ be our training set drawn from a distribution $N(\mu, \sigma^2)$; then $N(\mu_D, \sigma_D^2)$ is a good approximation to the original distribution if $D$ is large enough,

\[ N(\mu, \sigma^2) \approx N(\mu_D, \sigma_D^2). \]

Now, let $M$ be the set of samples SOM constructs at its map grid nodes; then we say that SOM is converged if the mean $\mu_M$ and the variance $\sigma_M^2$ of the model samples appear to be drawn from the same underlying distribution $N(\mu, \sigma^2)$ as the training data,

\[ N(\mu, \sigma^2) \approx N(\mu_M, \sigma_M^2). \]

Page 34

SOM as a Non-parametric Generative Model

Now, the distribution $N(\mu, \sigma^2)$ is unknown, but we have a good approximation to it as our training set $D$, $N(\mu_D, \sigma_D^2)$.

Therefore, in order to test for convergence we have to show that

\[ N(\mu_M, \sigma_M^2) \approx N(\mu_D, \sigma_D^2), \]

or "we test that the model samples and training samples were drawn from the same distribution".

This is an application of the classical statistical two-sample test; we use Student's t-test to test that the means $\mu_M$ and $\mu_D$ are due to the same distribution, and we use the F-test to show that the variances $\sigma_M^2$ and $\sigma_D^2$ are due to the same distribution.
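
The two test statistics can be computed for a single feature as sketched below. This is an illustration under assumptions, not the authors' code: the function name and the toy values are hypothetical, the t statistic uses the pooled-variance (equal variance) form of Student's test, and the sketch stops at the statistics and their degrees of freedom. Turning them into an accept/reject decision requires critical values or p-values from a t and an F distribution, for example from a statistics library or a table.

```python
from statistics import mean, variance


def two_sample_statistics(model_samples, training_samples):
    """Student's t and F statistics for one feature of model vs. training samples."""
    n_m, n_d = len(model_samples), len(training_samples)
    mean_m, mean_d = mean(model_samples), mean(training_samples)
    var_m, var_d = variance(model_samples), variance(training_samples)  # sample variances
    # Pooled variance, as in the classical equal-variance Student's t-test.
    pooled = ((n_m - 1) * var_m + (n_d - 1) * var_d) / (n_m + n_d - 2)
    t = (mean_m - mean_d) / (pooled * (1 / n_m + 1 / n_d)) ** 0.5
    # F statistic: ratio of the two sample variances (fails if var_d is zero).
    f = var_m / var_d
    return {"t": t, "t_df": n_m + n_d - 2, "F": f, "F_df": (n_m - 1, n_d - 1)}


stats = two_sample_statistics([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(stats)
```

A t value near 0 and an F value near 1 are what one would expect from a converged map; how near is near enough depends on the chosen significance level and the degrees of freedom.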

Page 35

SOM as a Non-parametric Generative Model

Observations:

The SOM model is non-parametric (or distribution free) since there are no distribution parameters to fit.

We can sample from a SOM model using linear interpolation on the node grid.

A converged model is a good fitting model; it models the underlying distribution very well.
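
The slides do not give the interpolation scheme, so the following is only one possible reading of sampling by linear interpolation on the node grid: pick a node at random, pick one of its horizontal or vertical grid neighbors, and return a random point on the line between their weight vectors. The function name and the dictionary layout, with (x, y) grid positions as keys and weight vectors as values, are assumptions.

```python
import random


def sample_from_som(grid, rng=None):
    """Draw one sample by interpolating between a random node and a grid neighbor."""
    rng = rng or random.Random()
    x, y = rng.choice(sorted(grid))
    # Horizontal and vertical neighbors that actually exist on the grid.
    neighbors = [p for p in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)) if p in grid]
    other = rng.choice(neighbors) if neighbors else (x, y)
    t = rng.random()  # interpolation position between the two nodes
    a, b = grid[(x, y)], grid[other]
    return [(1 - t) * u + t * v for u, v in zip(a, b)]


grid = {(0, 0): [0.0, 0.0], (1, 0): [1.0, 2.0]}
print(sample_from_som(grid, random.Random(0)))
```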

Page 36

Conclusions

SOMs have a wide range of applications.

We have developed two statistical tools that allow us to evaluate SOMs very effectively:

Probabilistic feature selection

Goodness of fit

In the future we need to address the reliance on normal distributions in our tests, for example with resampling techniques (e.g., bootstrap).