Introduction to Neural Networks


Page 1: Introduction to Neural Networks

Introduction to Neural Networks

John Paxton

Montana State University

Summer 2003

Page 2: Introduction to Neural Networks

Chapter 4: Competition

• Force a decision (yes, no, maybe) to be made.

• Winner take all is a common approach.

• Kohonen learning: wj(new) = wj(old) + α(x – wj(old))

• wj is the weight vector closest to x, as determined by Euclidean distance.
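A minimal sketch of this rule in Python with NumPy (the function name and array layout are illustrative, not from the slides); each row of w is one cluster's weight vector:

```python
import numpy as np

def kohonen_update(w, x, alpha):
    """Winner-take-all Kohonen update: find the weight row closest
    to x (Euclidean distance) and move it a fraction alpha toward x."""
    j = np.argmin(np.linalg.norm(w - x, axis=1))  # index of closest row
    w[j] += alpha * (x - w[j])  # wj(new) = wj(old) + alpha (x - wj(old))
    return j
```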

Page 3: Introduction to Neural Networks

MaxNet

• Lippmann, 1987

• Fixed-weight competitive net.

• Activation function f(x) = x if x > 0, else 0.

• Architecture

[Diagram: units a1 and a2, each with a self-connection of weight 1 and mutual inhibitory connections of weight –ε]

Page 4: Introduction to Neural Networks

Algorithm

1. wij = 1 if i = j, otherwise –ε (with 0 < ε < 1/m, where m is the number of nodes)

2. aj(0) = sj, t = 0.

3. aj(t+1) = f[aj(t) – ε Σk≠j ak(t)]

4. go to step 3 if more than one node has a non-zero activation

Special Case: More than one node has the same maximum activation.

Page 5: Introduction to Neural Networks

Example

• s1 = .5, s2 = .1, ε = .1

• a1(0) = .5, a2(0) = .1

• a1(1) = .49, a2(1) = .05

• a1(2) = .485, a2(2) = .001

• a1(3) = .4849, a2(3) = 0
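A sketch of MaxNet in Python (NumPy); eps is the inhibitory weight ε. Run on the slide's values it reproduces the trace above:

```python
import numpy as np

def maxnet(s, eps):
    """Competition by mutual inhibition: each node subtracts eps times
    the other nodes' activations, clipped at 0, until at most one node
    stays positive. Exact ties (the special case above) decay together,
    so cap the iterations in practice."""
    a = np.array(s, dtype=float)
    while np.count_nonzero(a) > 1:
        a = np.maximum(0.0, a - eps * (a.sum() - a))  # subtract the *other* nodes
    return a

print(maxnet([0.5, 0.1], eps=0.1))  # -> [0.4849 0.]
```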

Page 6: Introduction to Neural Networks

Mexican Hat

• Kohonen, 1989

• Contrast enhancement

• Architecture (w0, w1, w2, w3)

• w0 connects xi to itself; w1 connects xi+1 and xi-1 to xi; and so on by radius.

Sign of the weight from each unit into xi:

xi-3 xi-2 xi-1 xi xi+1 xi+2 xi+3
  0    -    +   +   +    -    0

Page 7: Introduction to Neural Networks

Algorithm

1. initialize weights

2. xi(0) = si

3. for some number of steps do

4. xi(t+1) = f[ Σk wk xi+k(t) ]

5. xi(t+1) = max(0, xi(t+1))

Page 8: Introduction to Neural Networks

Example

• x1, x2, x3, x4, x5

• radius 0 weight = 1

• radius 1 weight = 1

• radius 2 weight = -.5

• all other radii weights = 0

• s = (0 .5 1 .5 0)

• f(x) = 0 if x < 0, x if 0 <= x <= 2, 2 otherwise

Page 9: Introduction to Neural Networks

Example

• x(0) = (0 .5 1 .5 0)

• x1(1) = 1(0) + 1(.5) -.5(1) = 0

• x2(1) = 1(0) + 1(.5) + 1(1) -.5(.5) = 1.25

• x3(1) = -.5(0) + 1(.5) + 1(1) + 1(.5) - .5(0) = 2.0

• x4(1) = 1.25

• x5(1) = 0
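A sketch of one Mexican Hat step in Python; the dictionary mapping signed offsets to the radius weights is my own encoding, not from the slides. It reproduces x(1) = (0 1.25 2 1.25 0):

```python
import numpy as np

def mexican_hat_step(x, weights):
    """One contrast-enhancement step: each unit sums its neighbors'
    activations through radius-indexed weights, then the result is
    clipped to [0, 2] (the slide's activation function)."""
    n, new = len(x), np.zeros(len(x))
    for i in range(n):
        for k, wk in weights.items():      # k = signed offset from unit i
            if 0 <= i + k < n:
                new[i] += wk * x[i + k]
    return np.clip(new, 0.0, 2.0)

# radius 0 and 1 weights = 1, radius 2 weight = -.5 (the slide's example)
w = {0: 1.0, -1: 1.0, 1: 1.0, -2: -0.5, 2: -0.5}
print(mexican_hat_step(np.array([0, .5, 1, .5, 0]), w))  # -> [0. 1.25 2. 1.25 0.]
```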

Page 10: Introduction to Neural Networks

Why the name?

• Plot x(0) vs. x(1)

[Plot of x(0) and x(1) across units x1–x5, vertical scale 0 to 2: the activation profile after one step is taller and more sharply peaked]

Page 11: Introduction to Neural Networks

Hamming Net

• Lippmann, 1987

• Maximum likelihood classifier

• The similarity of 2 vectors is taken to be n – H(v1, v2), where H is the Hamming distance and n is the vector dimensionality

• Uses MaxNet with similarity metric

Page 12: Introduction to Neural Networks

Architecture

• Concrete example:

[Diagram: inputs x1, x2, x3 feed units y1 and y2; the y activations are passed to a MaxNet]

Page 13: Introduction to Neural Networks

Algorithm

1. wij = si(j)/2, where s(j) is the jth exemplar vector

2. n is the dimensionality of a vector

3. yin.j = Σi xi wij + n/2

4. select max(yin.j) using MaxNet

Page 14: Introduction to Neural Networks

Example

• Training examples: (1 1 1), (-1 -1 -1)

• n = 3

• Present the input x = (1 1 1).

• yin.1 = 1(.5) + 1(.5) + 1(.5) + 1.5 = 3

• yin.2 = 1(-.5) + 1(-.5) + 1(-.5) + 1.5 = 0

• These last 2 quantities are the similarities n – H: the input agrees with exemplar 1 in all 3 positions and with exemplar 2 in none

• They are then fed into MaxNet.
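A sketch of the Hamming net in Python, reusing the maxnet function from the MaxNet sketch earlier; exemplar rows and bipolar inputs are assumed:

```python
import numpy as np

def hamming_net(exemplars, x, eps=0.1):
    """Similarity layer: yin.j = sum_i x_i * e_j(i)/2 + n/2, which
    equals n - H(x, e_j); MaxNet then selects the winner."""
    e = np.array(exemplars, dtype=float)
    n = e.shape[1]
    y_in = x @ e.T / 2 + n / 2   # one similarity score per exemplar
    return maxnet(y_in, eps)     # maxnet() as defined in the earlier sketch

print(hamming_net([[1, 1, 1], [-1, -1, -1]], np.array([1, 1, 1])))
# -> [3. 0.]: unit 1 wins, matching the slide
```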

Page 15: Introduction to Neural Networks

Kohonen Self-Organizing Maps

• Kohonen, 1989

• Maps inputs onto one of m clusters

• Human brains seem to be able to self organize.

Page 16: Introduction to Neural Networks

Architecture

[Diagram: input units x1 … xn fully connected to cluster units y1 … ym]

Page 17: Introduction to Neural Networks

Neighborhoods

• Linear (each number is the unit's distance from the winner #):

3 2 1 # 1 2 3

• Rectangular (a 5 x 5 grid of distances from the winner #):

2 2 2 2 2
2 1 1 1 2
2 1 # 1 2
2 1 1 1 2
2 2 2 2 2

Page 18: Introduction to Neural Networks

Algorithm

1. initialize wij

2. select topology of yi

3. select learning rate parameters

4. while stopping criteria not reached

5. for each input vector do

6. compute D(j) = Σi (wij – xi)² for each j

Page 19: Introduction to Neural Networks

Algorithm (continued)

7. select minimum D(j)

8. update neighborhood units: wij(new) = wij(old) + α[xi – wij(old)]

9. update α

10. reduce radius of neighborhood at specified times

Page 20: Introduction to Neural Networks

Example

• Place (1 1 0 0), (0 0 0 1), (1 0 0 0), (0 0 1 1) into two clusters

• α(0) = .6, α(t+1) = .5 α(t)

• random initial weights (row i holds wi1 and wi2):

.2 .8

.6 .4

.5 .7

.9 .3

Page 21: Introduction to Neural Networks

Example

• Present (1 1 0 0)

• D(1) = (.2 – 1)² + (.6 – 1)² + (.5 – 0)² + (.9 – 0)² = 1.86

• D(2) = .98

• D(2) wins!

Page 22: Introduction to Neural Networks

Example

• wi2(new) = wi2(old) + .6[xi – wi2(old)]

• new weight matrix (only cluster 2's column changes):

.2 .92 (bigger)
.6 .76 (bigger)
.5 .28 (smaller)
.9 .12 (smaller)

• This example assumes no neighborhood

Page 23: Introduction to Neural Networks

Example

• After many epochs the weights converge to

0 1
0 .5
.5 0
1 0

• (1 1 0 0) -> category 2
• (0 0 0 1) -> category 1
• (1 0 0 0) -> category 2
• (0 0 1 1) -> category 1
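A sketch of the whole example in Python (no neighborhood, inputs presented in the listed order; the epoch count is an assumption standing in for "many epochs"):

```python
import numpy as np

X = np.array([[1,1,0,0], [0,0,0,1], [1,0,0,0], [0,0,1,1]], dtype=float)
W = np.array([[.2,.8], [.6,.4], [.5,.7], [.9,.3]])  # column j = cluster j

alpha = 0.6
for epoch in range(20):                              # "many epochs"
    for x in X:
        D = ((W - x[:, None]) ** 2).sum(axis=0)      # D(j) = sum_i (wij - xi)^2
        j = np.argmin(D)                             # winning cluster
        W[:, j] += alpha * (x - W[:, j])             # update the winner only
    alpha *= 0.5                                     # alpha(t+1) = .5 alpha(t)

for x in X:
    j = np.argmin(((W - x[:, None]) ** 2).sum(axis=0))
    print(x, '-> category', j + 1)
```

The first presentation reproduces the slides' numbers (D(1) = 1.86, D(2) = .98, cluster 2 wins), and the final pass prints the categorization shown above.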

Page 24: Introduction to Neural Networks

Applications

• Grouping characters

• Travelling Salesperson Problem

– Cluster units can be represented graphically by weight vectors

– Linear neighborhoods can be used with the first and last cluster units connected

Page 25: Introduction to Neural Networks

Learning Vector Quantization

• Kohonen, 1989

• Supervised learning

• There can be several output units per class

Page 26: Introduction to Neural Networks

Architecture

• Like Kohonen nets, but no topology for output units

• Each yi represents a known class

[Diagram: input units x1 … xn fully connected to output units y1 … ym]

Page 27: Introduction to Neural Networks

Algorithm

1. Initialize the weights

(e.g. the first m training examples, or random values)

2. choose α

3. while stopping criteria not reached do

(e.g. a fixed number of iterations, or until α is very small)

4. for each training vector do

Page 28: Introduction to Neural Networks

Algorithm

5. find minimum || x – wj ||

6. if minimum is target class

wj(new) = wj(old) + α[x – wj(old)]

else

wj(new) = wj(old) – α[x – wj(old)]

7. reduce α

Page 29: Introduction to Neural Networks

Example

• (1 1 -1 -1) belongs to category 1
• (-1 -1 -1 1) belongs to category 2
• (-1 -1 1 1) belongs to category 2
• (1 -1 -1 -1) belongs to category 1
• (-1 1 1 -1) belongs to category 2

• 2 output units, y1 represents category 1 and y2 represents category 2

Page 30: Introduction to Neural Networks

Example

• Initial weights (where did these come from? They are the first two training vectors, one per category.)

1 -1
1 -1
-1 -1
-1 1

• α = .1

Page 31: Introduction to Neural Networks

Example

• Present training example 3, (-1 -1 1 1). It belongs to category 2.

• D(1) = (1 + 1)² + (1 + 1)² + (-1 – 1)² + (-1 – 1)² = 16

• D(2) = 4

• Category 2 wins. That is correct!

Page 32: Introduction to Neural Networks

Example

• w2(new) = (-1 -1 -1 1) + .1[(-1 -1 1 1) - (-1 -1 -1 1)] =

(-1 -1 -.8 1)
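A sketch of one LVQ step in Python (rows of W are the class prototypes; the names are mine). It reproduces the update above:

```python
import numpy as np

def lvq_step(W, x, target, alpha):
    """Find the closest prototype; move it toward x if it is the
    target class, away from x otherwise."""
    j = np.argmin(((W - x) ** 2).sum(axis=1))   # minimum ||x - wj||^2
    step = alpha * (x - W[j])
    W[j] += step if j == target else -step
    return j

W = np.array([[ 1.,  1., -1., -1.],   # w1: category 1
              [-1., -1., -1.,  1.]])  # w2: category 2
lvq_step(W, np.array([-1., -1., 1., 1.]), target=1, alpha=0.1)  # target=1 is category 2's row
print(W[1])  # -> [-1.  -1.  -0.8  1. ]
```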

Page 33: Introduction to Neural Networks

Issues

• How many yi should be used?

• How should we choose the class that each yi should represent?

• LVQ2 and LVQ3 are enhancements to LVQ that sometimes also modify the runner-up

Page 34: Introduction to Neural Networks

Counterpropagation

• Hecht-Nielsen, 1987

• There are input, output, and clustering layers

• Can be used to compress data

• Can be used to approximate functions

• Can be used to associate patterns

Page 35: Introduction to Neural Networks

Stages

• Stage 1: Cluster input vectors

• Stage 2: Adapt weights from cluster units to output units

Page 36: Introduction to Neural Networks

Stage 1 Architecture

[Diagram: input units x1 … xn connect to cluster units z1 … zp through weights w (e.g. w11); output units y1 … ym also connect to the cluster units, through weights v (e.g. v11)]

Page 37: Introduction to Neural Networks

Stage 2 Architecture

[Diagram: the winning cluster unit zj connects to output units x*1 … x*n through weights tj1, … and to output units y*1 … y*m through weights vj1, …]

Page 38: Introduction to Neural Networks

Full Counterpropagation

• Stage 1 Algorithm

1. initialize weights, α, β

2. while stopping criteria is false do

3. for each training vector pair do

4. find the j that minimizes ||x – wj|| + ||y – vj||

wj(new) = wj(old) + α[x – wj(old)]
vj(new) = vj(old) + β[y – vj(old)]

5. reduce α, β

Page 39: Introduction to Neural Networks

Stage 2 Algorithm

1. while stopping criteria is false

2. for each training vector pair do

3. perform step 4 above

4. tj(new) = tj(old) + α[x – tj(old)]

vj(new) = vj(old) + β[y – vj(old)]

Page 40: Introduction to Neural Networks

Partial Example

• Approximate y = 1/x on [0.1, 10.0]

• 1 x unit

• 1 y unit

• 10 z units

• 1 x* unit

• 1 y* unit

Page 41: Introduction to Neural Networks

Partial Example

• v11 = .11, w11 = 9.0
• v12 = .14, w12 = 7.0
• …
• v10,1 = 9.0, w10,1 = .11

• test x = .12: the net predicts y = 9.0.

• In this example, the output weights will converge to the cluster weights.
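A rough Python sketch of both stages for this example; the loop lengths, learning rates, and random initialization are assumptions, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
p = 10
w = rng.uniform(0.1, 10.0, p)        # x -> cluster weights
v = rng.uniform(0.1, 10.0, p)        # y -> cluster weights
t, v_out = np.zeros(p), np.zeros(p)  # cluster -> x* and cluster -> y* weights
alpha = beta = 0.1

# Stage 1: cluster the (x, y) training pairs
for _ in range(5000):
    x = rng.uniform(0.1, 10.0); y = 1.0 / x
    j = np.argmin(np.abs(x - w) + np.abs(y - v))  # minimize ||x-w|| + ||y-v||
    w[j] += alpha * (x - w[j])
    v[j] += beta * (y - v[j])

# Stage 2: adapt the winning cluster's output weights
for _ in range(5000):
    x = rng.uniform(0.1, 10.0); y = 1.0 / x
    j = np.argmin(np.abs(x - w) + np.abs(y - v))
    t[j] += alpha * (x - t[j])
    v_out[j] += beta * (y - v_out[j])

j = np.argmin(np.abs(0.12 - w))  # test with only x = .12 given
print(v_out[j])                  # predicted y: large, in the spirit of the slide's 9.0
```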

Page 42: Introduction to Neural Networks

Forward Only Counterpropagation

• Sometimes the function y = f(x) is not invertible.

• Architecture (only 1 z unit active)

[Diagram: input units x1 … xn connect to cluster units z1 … zp, which connect to output units y1 … ym]

Page 43: Introduction to Neural Networks

Stage 1 Algorithm

1. initialize weights and learning rates α (.1), β (.6)

2. while stopping criteria is false do

3. for each input vector do

4. find minimum || x – w||

w(new) = w(old) + α[x – w(old)]

5. reduce α

Page 44: Introduction to Neural Networks

Stage 2 Algorithm

1. while stopping criteria is false do
2. for each training vector pair do
3. find minimum || x – w ||

w(new) = w(old) + α[x – w(old)]
v(new) = v(old) + β[y – v(old)]

4. reduce α, β

Note: interpolation is possible.

Page 45: Introduction to Neural Networks

Example

• y = f(x) over [0.1, 10.0]

• 10 zi units

• After phase 1, zi = 0.5, 1.5, …, 9.5.

• After phase 2, the corresponding output weights are 5.5, 0.75, …, 0.1
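A sketch of forward-only counterpropagation for this example in Python, with f = 1/x standing in for the function (learning rates, loop lengths, and the small stage-2 nudge to w are assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
w = rng.uniform(0.1, 10.0, 10)   # x -> z cluster weights
v = np.zeros(10)                 # z -> y output weights
f = lambda x: 1.0 / x            # stand-in target function

alpha, beta = 0.1, 0.6
for _ in range(5000):            # Stage 1: cluster the inputs only
    x = rng.uniform(0.1, 10.0)
    j = np.argmin(np.abs(x - w))
    w[j] += alpha * (x - w[j])

for _ in range(5000):            # Stage 2: learn the outputs
    x = rng.uniform(0.1, 10.0)
    j = np.argmin(np.abs(x - w))
    w[j] += 0.01 * (x - w[j])    # the slides keep refining w here too
    v[j] += beta * (f(x) - v[j])

print(np.sort(w))                # cluster centers, roughly 0.5 ... 9.5
x_test = 2.0
print(v[np.argmin(np.abs(x_test - w))])  # piecewise-constant estimate of f(2)
```

Because only one z unit is active per input, the output is a piecewise-constant approximation of f; interpolation between neighboring cluster units, as noted above, smooths it.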