CS494/594: Overview of Self-Organizing Maps · web.eecs.utk.edu/~leparker/Courses/CS594-spring06/...

CS494/594: Overview of Self-Organizing Maps (Material mostly derived from http://www.ai-junkie.com/ann/som/som1.html ) April 13, 2006 Instructor: Dr. Lynne E. Parker




Introduction: Self-Organizing Maps

• Invented by Prof. Teuvo Kohonen, Academy of Finland, in the 1970s-1980s

• Provides a way of representing multidimensional data in much lower-dimensional spaces (e.g., 1-2 dimensions)
  – This is similar to the data compression technique called “vector quantization” (clustering for the purpose of data compression)
  – Also: creates a network that stores information so as to maintain the topological relationships within the training set

• Not intended for optimal classification or statistical pattern recognition; SOMs are an abstraction method


Example: Mapping of colors (RGB) into 2 dimensions

• Here is a SOM trained to recognize the 8 different colors shown on the right

• Colors are presented to the network as 3D vectors (one component each for red, green, blue)

• Network has learned to represent them in 2D space

• Note:
  – Colors are clustered into distinct regions
  – Regions of similar properties are adjacent to each other


SOMs are related to K-Means Clustering

• K-Means: You choose the number of clusters
  SOM: You choose the size and shape of the network of clusters; but the SOM won’t force a matching to a particular number of clusters

• K-Means: Input examples are processed one at a time, and the closest centroid is updated
  SOM: Ditto; but the “neighbors” of the centroid are also updated

• SOM: High-dimensional observations are projected to a two-dimensional coordinate system; this provides similarity between clusters
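
The difference in the update rule can be made concrete with a minimal sketch. It assumes a one-dimensional chain of three scalar centroids, a fixed learning rate, and a fixed neighbor strength of 0.5; the function names and all numeric values are illustrative and do not come from the slides.

```python
def closest(centroids, x):
    """Index of the centroid nearest to the scalar input x."""
    return min(range(len(centroids)), key=lambda i: abs(centroids[i] - x))

def online_kmeans_step(centroids, x, rate=0.5):
    """Online K-means: move only the closest centroid toward x."""
    out = list(centroids)
    w = closest(out, x)
    out[w] += rate * (x - out[w])
    return out

def som_step(centroids, x, rate=0.5, neighbor_strength=0.5):
    """SOM-style step: the winner's chain neighbors also move, but more weakly."""
    out = list(centroids)
    w = closest(centroids, x)
    for i in range(len(out)):
        if i == w:
            out[i] += rate * (x - out[i])
        elif abs(i - w) == 1:
            # Hypothetical fixed neighbor strength; a real SOM uses a
            # distance-dependent influence that shrinks over time.
            out[i] += rate * neighbor_strength * (x - out[i])
    return out

centroids = [0.0, 4.0, 8.0]
print(online_kmeans_step(centroids, 5.0))  # only the middle centroid moves
print(som_step(centroids, 5.0))            # the two neighbors move as well
```

Because neighbors are dragged along with the winner, adjacent nodes end up representing similar inputs, which is where the SOM's ordering comes from.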


Why are SOMs of interest?

• Hierarchical clustering is fairly fragile, especially with large data sets
  – SOMs scale well to large data sets

• K-means clustering finds local features of data, but doesn’t provide an overall organization
  – SOMs provide global structure

• Parametric clustering assumes you know the underlying distribution
  – SOMs are unsupervised


SOM: Unsupervised Clustering

• Here, the input vector is presented to the Self-Organizing Map WITHOUT the “correct” answer supplied

• That is, it is unsupervised

• Contrast with typical feedforward neural networks, which are trained with supervised learning

From here on, even though SOMs will look like neural nets, forget (for now!) what you know about neural nets, in terms of neurons, activation functions, feedforward connections, backpropagation, etc. SOMs are different!!


Network Architecture

• 2D lattice of “nodes”, each of which is fully connected to the input layer

• Here is a small SOM, 4x4 nodes connected to the input layer:

• Each node:
  – Has a specific topological position (i.e., an x,y coordinate in the lattice)
  – Contains a vector of weights of the same dimension as the input vectors

• If the input vector is V = (v1, v2, v3, …, vd), then each node’s weight vector is W = (w1, w2, w3, …, wd)

NOTE: Yellow lines between nodes only represent adjacency; these are NOT weighted connections

(Green is Input Layer)
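
As a rough illustration of this architecture, the sketch below builds the 4x4 lattice as a flat list of nodes, each carrying its lattice position and a weight vector. The `Node` class and `make_lattice` helper are hypothetical names chosen for this example, and the random weight values are only placeholders.

```python
import random
from dataclasses import dataclass

@dataclass
class Node:
    x: int          # column position in the lattice
    y: int          # row position in the lattice
    weights: list   # one weight per input dimension (d values)

def make_lattice(width, height, dim, rng):
    """Build a width x height lattice whose nodes hold dim-dimensional weights."""
    return [
        Node(x, y, [rng.random() for _ in range(dim)])
        for y in range(height)
        for x in range(width)
    ]

rng = random.Random(0)                 # fixed seed so the sketch is repeatable
lattice = make_lattice(4, 4, 3, rng)   # 4x4 nodes, 3 weights each (e.g., RGB input)
print(len(lattice), len(lattice[0].weights))
```

Note that there is no edge list: adjacency is implied by the (x, y) coordinates, which matches the point that the lines between nodes are not weighted connections.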


Another SOM

• 40x40 SOM
• Each node has 3 weights: one for each element of the input vector (i.e., corresponding to red, green, blue)
• Each node is drawn as a rectangular cell
• Each “cluster” is a feature classifier, so the graphical output is like a feature map of the input space


SOM Topology

• A couple of ways of representing the SOM topology:


Learning Algorithm Overview

1. Each node’s weights are initialized.

2. A vector is chosen randomly from the training data and presented to the lattice.

3. Every node is examined to calculate which one’s weights are most like the input vector. The winning node is called the “Best Matching Unit (BMU)”.

4. The radius of the neighborhood of the BMU is calculated.
   • Starts large, but diminishes with each time step.
   • Any node within the radius is “inside the BMU’s neighborhood”.

5. Each neighbor node’s (i.e., node from step 4) weights are adjusted to make them more like the input vector. The closer the node is to the BMU, the more its weights get altered.

6. Repeat from step 2 for N iterations.


Learning Algorithm – More Details

1. Each node’s weights are initialized.
   • Usually to small random values between 0 and 1.

2. A vector is chosen randomly from training data and presented to lattice.


Learning Algorithm – More Details (con’t.)

3. Every node is examined to calculate which one’s weights are most like the input vector. The winning node is called the “Best Matching Unit (BMU)”.

• Iterate through all nodes, calculating the Euclidean distance between each node’s weight vector and the current input vector.

• Distance calculation:

$$dist = \sqrt{\sum_{i=1}^{d} (v_i - w_i)^2}$$

• Node with closest weight vector is called the BMU.
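
The BMU search can be sketched in a few lines. This example assumes the weights are stored in a dictionary keyed by lattice position; that layout, the helper names, and the sample weight values are illustrative choices, not something specified on the slides.

```python
import math

def euclidean(v, w):
    """Euclidean distance between an input vector v and a weight vector w."""
    return math.sqrt(sum((vi - wi) ** 2 for vi, wi in zip(v, w)))

def find_bmu(weights, v):
    """Return the lattice position whose weight vector is closest to v.

    weights: dict mapping (x, y) -> weight vector (assumed storage layout).
    """
    return min(weights, key=lambda pos: euclidean(v, weights[pos]))

# A hypothetical 2x2 map with RGB weights.
weights = {
    (0, 0): [1.0, 0.0, 0.0],   # red-like node
    (1, 0): [0.0, 1.0, 0.0],   # green-like node
    (0, 1): [0.0, 0.0, 1.0],   # blue-like node
    (1, 1): [1.0, 1.0, 0.0],   # yellow-like node
}
v = [0.9, 0.8, 0.1]            # a yellowish input
print(find_bmu(weights, v))
```

Since the square root is monotonic, comparing squared distances would select the same node; the square root is kept here only to mirror the formula.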


Learning Algorithm – More Details (con’t.)

4. The radius of the neighborhood of the BMU is calculated.

• Radius starts large, but diminishes with each time step:

$$\sigma(t) = \sigma_0 \, e^{-t/\lambda}, \qquad t = 1, 2, 3, \ldots$$

where
  $\sigma(t)$ = radius at time $t$
  $\sigma_0$ = width of lattice at time $t_0$
  $\lambda$ = time constant
  $t$ = current time step

(Figure: yellow node is BMU; green arrow is radius)
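
To get a feel for how quickly the radius shrinks, the sketch below evaluates the decay for a few time steps. The lattice width of 40, the 1000 iterations, and in particular the choice of time constant (number of iterations divided by the log of the initial radius, so that the radius reaches about 1 on the last step) are assumptions made for this example; the slides do not say how λ is chosen.

```python
import math

def neighborhood_radius(t, sigma0, lam):
    """Radius at time step t: sigma0 * exp(-t / lam)."""
    return sigma0 * math.exp(-t / lam)

sigma0 = 40.0                           # initial radius: width of an assumed 40x40 lattice
n_iterations = 1000                     # assumed training length
lam = n_iterations / math.log(sigma0)   # assumed time constant (radius ends near 1)

for t in (1, 250, 500, 750, 1000):
    print(t, round(neighborhood_radius(t, sigma0, lam), 2))
```

With this choice of λ the neighborhood covers the whole map at the start and only the BMU's immediate surroundings at the end.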


Learning Algorithm – More Details (con’t.)

Decreasing radius/neighborhood over time, shown with increasing time from left to right: (Yellow node is BMU) (Green arrow is radius)

• In practice, the BMU will also move, according to the input vector presented to the network

• Over time, neighborhood shrinks to the size of just 1 node – the BMU


Learning Algorithm – More Details (con’t.)

5. Each neighbor node’s (i.e., node from step 4) weights are adjusted to make them more like the input vector. The closer the node is to the BMU, the more its weights get altered.

Update equation for all nodes in neighborhood (including the BMU itself):

$$W(t+1) = W(t) + \Theta(t)\, L(t)\, \bigl(V(t) - W(t)\bigr)$$

where
  $t$ = time step
  $L(t)$ = learning rate (which decreases over time)
  $\Theta(t)$ = influence of distance (from BMU) on learning

Decay of learning rate:

$$L(t) = L_0 \, e^{-t/\lambda}, \qquad t = 1, 2, 3, \ldots$$

(typically, the learning rate begins around $L_0 = 0.1$ and ends up near 0)

Distance influence:

$$\Theta(t) = e^{-\frac{dist^2}{2\sigma^2(t)}}, \qquad t = 1, 2, 3, \ldots$$

(where $dist$ is the node’s distance from the BMU and $\sigma(t)$ is the current radius value)
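
Steps 1 through 6 can be pulled together in a short training loop. This is a minimal sketch under stated assumptions, not the course's reference implementation: the `train_som` name, the dictionary layout, the fixed seed, the choice of λ as iterations divided by the log of the initial radius, and the eight sample colors are all illustrative, and the lattice must be larger than 1x1 for that λ to be defined.

```python
import math
import random

def train_som(data, width, height, n_iterations, l0=0.1, seed=0):
    """Train a width x height SOM on a list of equal-length vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Step 1: initialize every node's weights to random values in [0, 1).
    weights = {(x, y): [rng.random() for _ in range(dim)]
               for x in range(width) for y in range(height)}
    sigma0 = float(max(width, height))      # initial radius: lattice width
    lam = n_iterations / math.log(sigma0)   # assumed time constant
    for t in range(1, n_iterations + 1):
        v = rng.choice(data)                # Step 2: pick a random training vector.
        # Step 3: BMU = node with the smallest (squared) Euclidean distance to v.
        bmu = min(weights, key=lambda p: sum((a - b) ** 2
                                             for a, b in zip(v, weights[p])))
        sigma = sigma0 * math.exp(-t / lam)  # Step 4: current neighborhood radius.
        rate = l0 * math.exp(-t / lam)       # L(t): decayed learning rate.
        for pos, w in weights.items():       # Step 5: update nodes inside the radius.
            d2 = (pos[0] - bmu[0]) ** 2 + (pos[1] - bmu[1]) ** 2  # lattice distance^2
            if d2 <= sigma ** 2:
                theta = math.exp(-d2 / (2 * sigma ** 2))          # distance influence
                for i in range(dim):
                    w[i] += theta * rate * (v[i] - w[i])
    return weights                           # Step 6 is the loop over t.

# Eight hypothetical RGB training colors (the slides do not list the actual ones).
colors = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0],
          [1, 0, 1], [0, 1, 1], [1, 0.5, 0], [0.5, 0, 0.5]]
som = train_som(colors, 6, 6, 300)
print(len(som), [round(c, 2) for c in som[(0, 0)]])
```

Each update moves a weight vector part of the way toward the input, so weights that start in [0, 1] stay in [0, 1] when the data does.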


Applications of SOMs

• Commonly used as visualization aids
• Helpful for seeing relationships within vast amounts of data
• Example: World Poverty Map
  – Use SOM to classify statistical data describing various quality-of-life factors:
    • State of health
    • Nutrition
    • Educational services
    • Etc.
  – Countries with similar quality-of-life factors end up clustered together


Example: World Poverty Map

• Countries with better quality-of-life are in upper left
• Countries that are most poverty-stricken are in lower right
• Here, the map is drawn on a hexagonal grid and displayed as a “unified distance matrix” (“u-matrix”), which shows how far apart the weight vectors of neighboring nodes are. Each hexagon is a node in the SOM.

(Poverty map based on 39 indicators from World Bank Statistics, 1992)


Example: World Poverty Map (con’t.)

• Can then transfer to world map plot:

• This visualization approach makes it much easier to understand the data


Another Example: Animal Classification

• Animals ordered by SOM
• Animals described by attributes (e.g., size, living space)
  – Size: small=0, medium=1, big=2
  – Living space: Land=0, Water=1, Air=2

Animal | Mouse | Lion | Horse | Shark | Dove
Size | small | medium | big | big | small
Living space | Land | Land | Land | Water | Air
Code (size/living space) | (0/0) | (1/0) | (2/0) | (2/1) | (0/2)

(this is just a sampling of the data)
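
The encoding step can be sketched as a pair of lookup tables. The dictionary and function names are hypothetical, and the pairing of each animal with its attributes follows the sample on the slide as best it can be read.

```python
# Attribute codes as given on the slide.
SIZE = {"small": 0, "medium": 1, "big": 2}
LIVING_SPACE = {"land": 0, "water": 1, "air": 2}

def encode(size, living_space):
    """Turn two categorical attributes into a numeric input vector."""
    return (SIZE[size], LIVING_SPACE[living_space])

# Sample animals; the attribute assignments are read from the slide's sample.
animals = {
    "Mouse": ("small", "land"),
    "Lion": ("medium", "land"),
    "Horse": ("big", "land"),
    "Shark": ("big", "water"),
    "Dove": ("small", "air"),
}
vectors = {name: encode(*attrs) for name, attrs in animals.items()}
for name, vec in vectors.items():
    print(name, vec)
```

Vectors like these are what would be presented to the SOM as input. An integer code imposes an ordering on the categories (here Water sits "between" Land and Air), which is one reason the larger example that follows uses one binary column per attribute value.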


Example: Self-Organizing Maps

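A grouping according to similarity has emerged (regions labeled in the figure: birds, peaceful, hunters).

Animal names and their attributes:

Group | Attribute | Dove | Hen | Duck | Goose | Owl | Hawk | Eagle | Fox | Dog | Wolf | Cat | Tiger | Lion | Horse | Zebra | Cow
is | Small | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0
is | Medium | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0
is | Big | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1
has | 2 legs | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
has | 4 legs | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
has | Hair | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
has | Hooves | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1
has | Mane | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 0
has | Feathers | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
likes to | Hunt | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0
likes to | Run | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 0
likes to | Fly | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
likes to | Swim | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0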

Teuvo Kohonen, Self-Organizing Maps, Springer, 2001
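
One way to see why such a grouping emerges is to compare the attribute vectors directly. The sketch below copies four columns from the table (attribute order: Small, Medium, Big, 2 legs, 4 legs, Hair, Hooves, Mane, Feathers, Hunt, Run, Fly, Swim) and counts the attributes on which two animals differ; the helper name is hypothetical, and the comparison is an illustration rather than part of the SOM algorithm itself.

```python
# Columns copied from the attribute table, in the row order listed above.
ANIMALS = {
    "Dove":  [1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0],
    "Hen":   [1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0],
    "Horse": [0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0],
    "Zebra": [0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0],
}

def differing_attributes(a, b):
    """Number of attributes on which two animals differ (Hamming distance).

    For 0/1 vectors this equals the squared Euclidean distance used by the SOM.
    """
    return sum(x != y for x, y in zip(ANIMALS[a], ANIMALS[b]))

for pair in (("Dove", "Hen"), ("Horse", "Zebra"), ("Dove", "Horse")):
    print(pair, differing_attributes(*pair))
```

Animals whose vectors differ in few attributes compete for the same or neighboring nodes, so they land close together on the trained map.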


Example: Visualization of Song Collections on a PDA

• SOM visualization and interaction framework

Neumayer, Lidy, Rauber, Content-based organization of digital audio collections, Fifth Workshop Interactive Musiknetwork, 2005.


Case Study: Applying SOMs to Recognize Topographic Patterns in EEG Data

Remember this?

EEG electrodes reading brain waves:
• Rotation task, left brain
• Resting task, with eye blink
• Counting task
• Rotation task, right brain


SOM, EEG Case Study (con’t.)

[From IEEE Transactions on Biomedical Engineering, 42(11): 1062-1068, 1995]

• Objective: Develop method to understand background EEG activity, then use this later to find correlates of learning in disabled children
• Input: extractions from short-time power spectra of EEG channels
• Node in SOM: represents model for clusters of similar input patterns
• “Instantaneous topographic pattern in EEG”: corresponds to location of sample
  – Changes in time correspond to trajectory
• SOM learned to distinguish between these classes:
  – alpha
  – alpha attenuation
  – theta of drowsiness
  – eye movements
  – EMG artifacts
  – Electrode artifact


SOM, EEG Case Study (con’t.)

• Collect data on children with minor learning disabilities (while lying down)
• Data feature extraction
  – Apply FFT on 1.28s windows of data every 0.64 seconds
  – Power spectrum reduced to 7 features by integrating values with weighting functions:
  – Dimensions reduced to 154
• SOM lattice design: 300 nodes in hexagonal formation
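
The windowing scheme (1.28 s windows taken every 0.64 s, i.e., 50% overlap) can be sketched as follows. The 100 Hz sampling rate, the synthetic 10 Hz test signal, and the helper names are assumptions for illustration only; the slides do not state the sampling rate, and the 7 weighting functions are not reproduced here. A naive DFT stands in for the FFT so that the example needs only the standard library; it produces the same power values, just more slowly.

```python
import cmath
import math

def sliding_windows(signal, window_len, hop):
    """Cut the signal into overlapping windows of window_len samples, hop apart."""
    return [signal[s:s + window_len]
            for s in range(0, len(signal) - window_len + 1, hop)]

def power_spectrum(window):
    """Power at each non-negative frequency bin, via a naive O(n^2) DFT."""
    n = len(window)
    return [abs(sum(window[k] * cmath.exp(-2j * math.pi * f * k / n)
                    for k in range(n))) ** 2 / n
            for f in range(n // 2 + 1)]

fs = 100                          # assumed sampling rate in Hz
window_len = round(1.28 * fs)     # 128 samples per window
hop = round(0.64 * fs)            # a new window every 64 samples

# Synthetic single-channel signal: 4 seconds of a 10 Hz (alpha-band) sine wave.
signal = [math.sin(2 * math.pi * 10 * k / fs) for k in range(4 * fs)]

windows = sliding_windows(signal, window_len, hop)
spectra = [power_spectrum(w) for w in windows]
peak_bin = max(range(len(spectra[0])), key=lambda f: spectra[0][f])
print(len(windows), peak_bin * fs / window_len)   # window count, peak frequency in Hz
```

In the study, each window's spectrum would then be reduced to 7 features per channel. Note that 154 = 7 × 22, which would be consistent with 22 channels, although the slide does not state the channel count.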


SOM, EEG Case Study (con’t.)

• After learning, 6 clusters result (3 shown here individually):
  – “Continuous alpha”
  – “Muscle activity”
  – “Eye movements”


SOM, EEG Case Study (con’t.)

• Main findings:
  – SOM is able to recognize topographic patterns in EEG data
  – It can recognize eye movements and muscle activity
  – It can recognize “background” alpha activity

• Uses:
  – Aid in analysis of brain activity in neuropsychological experiments
  – Used in diagnostics for online monitoring and analysis


Strengths and Limitations of SOMs

• Strengths:
  – Neighborhood relationships amongst clusters give you information on the “similarity” of different clusters
  – Very handy for visualization

• Limitations:
  – User must choose parameters (although this is true for any learning algorithm)
  – Not guaranteed to converge (although it usually does in practice)
  – Resulting cluster may not correspond to a single natural cluster (mostly due to dimensionality reduction)