Page 1

Back-Propagation

Page 2

Objectives

A generalization of the LMS algorithm, called backpropagation, can be used to train multilayer networks.

Backpropagation is an approximate steepest descent algorithm, in which the performance index is mean square error.

In order to calculate the derivatives, we need to use the chain rule of calculus.

Page 3

Motivation

The perceptron learning and the LMS algorithm were designed to train single-layer perceptron-like networks.

They are only able to solve linearly separable classification problems.

Parallel Distributed Processing

The multilayer perceptron, trained by the backpropagation algorithm, is currently the most widely used neural network.

Page 4

Three-Layer Network

Number of neurons in each layer: R - S^1 - S^2 - S^3.
R: number of inputs;  S^m: number of neurons in layer m.

Page 5

Pattern Classification: XOR gate

The limitations of the single-layer perceptron (Minsky & Papert, 1969)

\{p_1 = [0;\,0],\; t_1 = 0\}, \quad \{p_2 = [1;\,1],\; t_2 = 0\}
\{p_3 = [1;\,0],\; t_3 = 1\}, \quad \{p_4 = [0;\,1],\; t_4 = 1\}

[Figure: the four input vectors p_1, p_2, p_3, p_4 plotted in the input plane; no single decision boundary separates the two categories.]

Page 6

Two-Layer XOR Network

Two-layer, 2-2-1 network

[Figure: 2-2-1 XOR network. Each first-layer neuron forms one decision boundary (individual decisions); the second-layer neuron combines them with an AND operation (weights of 1 and a bias of -1.5).]

Page 7

Solved Problem P11.1

Design a multilayer network to distinguish these categories.

Four prototype vectors with \pm 1 elements are given: p_1 and p_2 belong to Class I, p_3 and p_4 to Class II.

A single-layer network would need W p_1 + b \ge 0 and W p_2 + b \ge 0 (Class I), while W p_3 + b < 0 and W p_4 + b < 0 (Class II).

There is no hyperplane that can separate these two categories.

Page 8

Solution of Problem P11.1

[Figure: solution network for P11.1. A two-layer network in which the first layer forms two decision boundaries and the second layer combines them with AND / OR logic to separate the two classes.]

Page 9

Function Approximation

Two-layer, 1-2-1 network

Transfer functions:
f^1(n) = \frac{1}{1 + e^{-n}} \;\; (\text{log-sigmoid}), \qquad f^2(n) = n \;\; (\text{linear})

Nominal values of the weights and biases:
w^1_{1,1} = 10, \quad w^1_{2,1} = 10, \quad b^1_1 = -10, \quad b^1_2 = 10
w^2_{1,1} = 1, \quad w^2_{1,2} = 1, \quad b^2 = 0

Page 10

Function Approximation

The centers of the steps occur where the net input to a neuron in the first layer is zero.

The steepness of each step can be adjusted by changing the network weights.

n^1_1 = w^1_{1,1}\, p + b^1_1 = 0 \;\Rightarrow\; p = -\frac{b^1_1}{w^1_{1,1}} = -\frac{-10}{10} = 1

n^1_2 = w^1_{2,1}\, p + b^1_2 = 0 \;\Rightarrow\; p = -\frac{b^1_2}{w^1_{2,1}} = -\frac{10}{10} = -1

Page 11

Effect of Parameter Changes

[Figure: network response a^2 versus p (-2 \le p \le 2) as the bias b^1_2 is varied over 0, 5, 10, 15, 20.]

Page 12

Effect of Parameter Changes

[Figure: network response a^2 versus p (-2 \le p \le 2) as w^2_{1,1} is varied over -1.0, -0.5, 0.0, 0.5, 1.0.]

Page 13

Effect of Parameter Changes

[Figure: network response a^2 versus p (-2 \le p \le 2) as w^2_{1,2} is varied over -1.0, -0.5, 0.0, 0.5, 1.0.]

Page 14

Effect of Parameter Changes

[Figure: network response a^2 versus p (-2 \le p \le 2) as b^2 is varied over -1.0, -0.5, 0.0, 0.5, 1.0.]

Page 15

Function Approximation

Two-layer networks, with sigmoid transfer functions in the hidden layer and linear transfer functions in the output layer, can approximate virtually any function of interest to any degree of accuracy, provided sufficiently many hidden units are available.

Page 16

Backpropagation Algorithm

For multilayer networks the outputs of one layer become the inputs to the following layer.

a^{m+1} = f^{m+1}(W^{m+1} a^m + b^{m+1}), \qquad m = 0, 1, \ldots, M-1
a^0 = p, \qquad a = a^M
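A minimal sketch of this layer-by-layer recursion, assuming NumPy; the list-of-dicts layout for the layers is just one convenient choice, not something defined in the lecture.

    import numpy as np

    def forward(p, layers):
        # a0 = p;  a(m+1) = f(m+1)(W(m+1) a(m) + b(m+1));  the final a is the output aM
        a = p
        for layer in layers:          # layer = {'W': matrix, 'b': vector, 'f': transfer function}
            a = layer['f'](layer['W'] @ a + layer['b'])
        return a

    # Example: the 1-2-1 network from the earlier slides
    logsig = lambda n: 1.0 / (1.0 + np.exp(-n))
    layers = [{'W': np.array([[10.0], [10.0]]), 'b': np.array([-10.0, 10.0]), 'f': logsig},
              {'W': np.array([[1.0, 1.0]]),     'b': np.array([0.0]),        'f': lambda n: n}]
    print(forward(np.array([1.0]), layers))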

Page 17

Performance Index

Training set:
\{p_1, t_1\}, \{p_2, t_2\}, \ldots, \{p_Q, t_Q\}

Mean square error:
F(x) = E[e^2] = E[(t - a)^2]

Vector case:
F(x) = E[e^T e] = E[(t - a)^T (t - a)]

Approximate mean square error (one sample):
\hat{F}(x) = (t(k) - a(k))^T (t(k) - a(k)) = e^T(k)\, e(k)

Approximate steepest descent algorithm:
w^m_{i,j}(k+1) = w^m_{i,j}(k) - \alpha \frac{\partial \hat{F}}{\partial w^m_{i,j}}, \qquad
b^m_i(k+1) = b^m_i(k) - \alpha \frac{\partial \hat{F}}{\partial b^m_i}

Page 18

Chain Rule

The chain rule of calculus:
\frac{d f(n(w))}{d w} = \frac{d f(n)}{d n} \cdot \frac{d n(w)}{d w}

Example: if f(n) = e^n and n = 2w, so that f(n(w)) = e^{2w}, then
\frac{d f(n(w))}{d w} = \frac{d f(n)}{d n} \cdot \frac{d n(w)}{d w} = (e^n)(2) = 2 e^{2w}

Approximate mean square error:
\hat{F}(x) = (t(k) - a(k))^T (t(k) - a(k)) = e^T(k)\, e(k)

Applying the chain rule to the steepest descent updates:
w^m_{i,j}(k+1) = w^m_{i,j}(k) - \alpha \frac{\partial \hat{F}}{\partial n^m_i} \cdot \frac{\partial n^m_i}{\partial w^m_{i,j}}
b^m_i(k+1) = b^m_i(k) - \alpha \frac{\partial \hat{F}}{\partial n^m_i} \cdot \frac{\partial n^m_i}{\partial b^m_i}
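A quick numerical check of the example above (f(n) = e^n, n = 2w, so df/dw = 2 e^{2w}); the finite-difference comparison is only an illustration.

    import math

    def f_of_w(w):                   # f(n(w)) = exp(2w)
        return math.exp(2.0 * w)

    w = 0.3
    analytic = math.exp(2.0 * w) * 2.0                        # df/dn * dn/dw
    numeric = (f_of_w(w + 1e-6) - f_of_w(w - 1e-6)) / 2e-6    # central difference
    print(analytic, numeric)                                  # the two values agree closely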

Page 19

Sensitivity & Gradient

The net input to the ith neuron of layer m:
n^m_i = \sum_{j=1}^{S^{m-1}} w^m_{i,j}\, a^{m-1}_j + b^m_i,
so that
\frac{\partial n^m_i}{\partial w^m_{i,j}} = a^{m-1}_j, \qquad \frac{\partial n^m_i}{\partial b^m_i} = 1

The sensitivity of \hat{F} to changes in the ith element of the net input at layer m:
s^m_i \equiv \frac{\partial \hat{F}}{\partial n^m_i}

Gradient:
\frac{\partial \hat{F}}{\partial w^m_{i,j}} = \frac{\partial \hat{F}}{\partial n^m_i} \cdot \frac{\partial n^m_i}{\partial w^m_{i,j}} = s^m_i\, a^{m-1}_j
\frac{\partial \hat{F}}{\partial b^m_i} = \frac{\partial \hat{F}}{\partial n^m_i} \cdot \frac{\partial n^m_i}{\partial b^m_i} = s^m_i

Page 20

Steepest Descent Algorithm

The steepest descent algorithm for the approximate mean square error:
w^m_{i,j}(k+1) = w^m_{i,j}(k) - \alpha \frac{\partial \hat{F}}{\partial w^m_{i,j}} = w^m_{i,j}(k) - \alpha\, s^m_i\, a^{m-1}_j
b^m_i(k+1) = b^m_i(k) - \alpha \frac{\partial \hat{F}}{\partial b^m_i} = b^m_i(k) - \alpha\, s^m_i

Matrix form:
W^m(k+1) = W^m(k) - \alpha\, s^m (a^{m-1})^T
b^m(k+1) = b^m(k) - \alpha\, s^m

where the sensitivity vector is
s^m \equiv \frac{\partial \hat{F}}{\partial n^m} = \left[ \frac{\partial \hat{F}}{\partial n^m_1},\; \frac{\partial \hat{F}}{\partial n^m_2},\; \ldots,\; \frac{\partial \hat{F}}{\partial n^m_{S^m}} \right]^T

Page 21

BP the Sensitivity

Backpropagation: a recurrence relationship in which the sensitivity at layer m is computed from the sensitivity at layer m+1.

Jacobian matrix:

\frac{\partial \mathbf{n}^{m+1}}{\partial \mathbf{n}^m} =
\begin{bmatrix}
\frac{\partial n^{m+1}_1}{\partial n^m_1} & \frac{\partial n^{m+1}_1}{\partial n^m_2} & \cdots & \frac{\partial n^{m+1}_1}{\partial n^m_{S^m}} \\
\frac{\partial n^{m+1}_2}{\partial n^m_1} & \frac{\partial n^{m+1}_2}{\partial n^m_2} & \cdots & \frac{\partial n^{m+1}_2}{\partial n^m_{S^m}} \\
\vdots & \vdots & & \vdots \\
\frac{\partial n^{m+1}_{S^{m+1}}}{\partial n^m_1} & \frac{\partial n^{m+1}_{S^{m+1}}}{\partial n^m_2} & \cdots & \frac{\partial n^{m+1}_{S^{m+1}}}{\partial n^m_{S^m}}
\end{bmatrix}

Page 22

Matrix Representation

The i,j element of the Jacobian matrix:

\frac{\partial n^{m+1}_i}{\partial n^m_j}
= \frac{\partial \left( \sum_{l=1}^{S^m} w^{m+1}_{i,l}\, a^m_l + b^{m+1}_i \right)}{\partial n^m_j}
= w^{m+1}_{i,j}\, \frac{\partial a^m_j}{\partial n^m_j}
= w^{m+1}_{i,j}\, \frac{\partial f^m(n^m_j)}{\partial n^m_j}
= w^{m+1}_{i,j}\, \dot{f}^m(n^m_j)

Matrix form of the Jacobian:
\frac{\partial \mathbf{n}^{m+1}}{\partial \mathbf{n}^m} = W^{m+1}\, \dot{F}^m(n^m), \qquad
\dot{F}^m(n^m) =
\begin{bmatrix}
\dot{f}^m(n^m_1) & 0 & \cdots & 0 \\
0 & \dot{f}^m(n^m_2) & \cdots & 0 \\
\vdots & \vdots & & \vdots \\
0 & 0 & \cdots & \dot{f}^m(n^m_{S^m})
\end{bmatrix}

Page 23

Recurrence Relation

The recurrence relation for the sensitivity

The sensitivities are propagated backward through the network from the last layer to the first layer.

s^m = \frac{\partial \hat{F}}{\partial \mathbf{n}^m}
    = \left( \frac{\partial \mathbf{n}^{m+1}}{\partial \mathbf{n}^m} \right)^{\!T} \frac{\partial \hat{F}}{\partial \mathbf{n}^{m+1}}
    = \dot{F}^m(\mathbf{n}^m)\, (W^{m+1})^T\, \mathbf{s}^{m+1}

The order of propagation:
\mathbf{s}^M \rightarrow \mathbf{s}^{M-1} \rightarrow \cdots \rightarrow \mathbf{s}^2 \rightarrow \mathbf{s}^1

Page 24

Backpropagation Algorithm

At the final layer:

s^M_i = \frac{\partial \hat{F}}{\partial n^M_i}
      = \frac{\partial (\mathbf{t} - \mathbf{a})^T (\mathbf{t} - \mathbf{a})}{\partial n^M_i}
      = \frac{\partial \sum_{j=1}^{S^M} (t_j - a_j)^2}{\partial n^M_i}
      = -2\,(t_i - a_i)\, \frac{\partial a_i}{\partial n^M_i}

Since
\frac{\partial a_i}{\partial n^M_i} = \frac{\partial f^M(n^M_i)}{\partial n^M_i} = \dot{f}^M(n^M_i),

s^M_i = -2\,(t_i - a_i)\, \dot{f}^M(n^M_i)

Matrix form:
\mathbf{s}^M = -2\, \dot{F}^M(\mathbf{n}^M)\, (\mathbf{t} - \mathbf{a})

Page 25

Summary

The first step is to propagate the input forward through the network:
a^0 = p
a^{m+1} = f^{m+1}(W^{m+1} a^m + b^{m+1}), \qquad m = 0, 1, \ldots, M-1
a = a^M

The second step is to propagate the sensitivities backward through the network:
Output layer:  s^M = -2\, \dot{F}^M(n^M)\,(t - a)
Hidden layers:  s^m = \dot{F}^m(n^m)\,(W^{m+1})^T s^{m+1}, \qquad m = M-1, \ldots, 2, 1

The final step is to update the weights and biases:
W^m(k+1) = W^m(k) - \alpha\, s^m (a^{m-1})^T
b^m(k+1) = b^m(k) - \alpha\, s^m
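A compact sketch of these three steps for a two-layer network with a log-sigmoid hidden layer and a linear output layer, assuming NumPy; the function name backprop_step is illustrative, and alpha stands for the learning rate.

    import numpy as np

    def logsig(n):
        return 1.0 / (1.0 + np.exp(-n))

    def backprop_step(p, t, W1, b1, W2, b2, alpha):
        # 1. Propagate the input forward
        a0 = p
        a1 = logsig(W1 @ a0 + b1)
        a2 = W2 @ a1 + b2                     # purelin output layer
        # 2. Propagate the sensitivities backward
        s2 = -2.0 * (t - a2)                  # F'2(n2) = I for a linear output layer
        F1 = np.diag((1.0 - a1) * a1)         # derivative of logsig, in diagonal form
        s1 = F1 @ W2.T @ s2
        # 3. Update the weights and biases
        W2 = W2 - alpha * np.outer(s2, a1)
        b2 = b2 - alpha * s2
        W1 = W1 - alpha * np.outer(s1, a0)
        b1 = b1 - alpha * s1
        return W1, b1, W2, b2

Repeating this step over the training set, one input/target pair at a time, is the approximate (stochastic) steepest descent described on the earlier slides.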

Page 26

BP Neural Network

[Figure: general multilayer architecture. Layer 1 receives the inputs p_1, \ldots, p_R through weights w^1_{1,1}, \ldots, w^1_{S^1,R}; layer m has S^m neurons and receives the outputs of the S^{m-1} neurons of layer m-1 through weights w^m_{i,j}; the final layer M produces the outputs a^M_1, \ldots, a^M_{S^M}.]

Page 27

Ex: Function Approximation

g(p) = 1 + \sin\!\left( \frac{\pi}{4} p \right)

1-2-1 Network

[Figure: the 1-2-1 network is trained to approximate g(p); the target t = g(p) is compared with the network output a to form the error e = t - a.]

Page 28

Network Architecture

1-2-1 Network

[Figure: architecture of the 1-2-1 network, with input p and output a.]

Page 29

g(p) = 1 + \sin\!\left( \frac{\pi}{4} p \right) \quad \text{for } -2 \le p \le 2

p = 1

Page 30

Initial Values

W^1(0) = \begin{bmatrix} -0.27 \\ -0.41 \end{bmatrix}, \quad
b^1(0) = \begin{bmatrix} -0.48 \\ -0.13 \end{bmatrix}, \quad
W^2(0) = \begin{bmatrix} 0.09 & -0.17 \end{bmatrix}, \quad
b^2(0) = \begin{bmatrix} 0.48 \end{bmatrix}

Initial network response:
[Figure: the initial network output a^2 versus p, plotted together with the sine-wave target g(p) for -2 \le p \le 2.]

Page 31

Forward Propagation

Initial input:
a^0 = p = 1

Output of the 1st layer:
a^1 = f^1(W^1 a^0 + b^1)
    = logsig\!\left( \begin{bmatrix} -0.27 \\ -0.41 \end{bmatrix} [1] + \begin{bmatrix} -0.48 \\ -0.13 \end{bmatrix} \right)
    = logsig\!\left( \begin{bmatrix} -0.75 \\ -0.54 \end{bmatrix} \right)
    = \begin{bmatrix} 1/(1 + e^{0.75}) \\ 1/(1 + e^{0.54}) \end{bmatrix}
    = \begin{bmatrix} 0.321 \\ 0.368 \end{bmatrix}

Output of the 2nd layer:
a^2 = f^2(W^2 a^1 + b^2)
    = purelin\!\left( \begin{bmatrix} 0.09 & -0.17 \end{bmatrix} \begin{bmatrix} 0.321 \\ 0.368 \end{bmatrix} + 0.48 \right)
    = 0.446

Error:
e = t - a = \left( 1 + \sin\frac{\pi}{4} p \right) - a^2 = \left( 1 + \sin\frac{\pi}{4}(1) \right) - 0.446 = 1.261

Page 32

Transfer Func. Derivatives

\dot{f}^1(n) = \frac{d}{dn}\left( \frac{1}{1 + e^{-n}} \right)
             = \frac{e^{-n}}{(1 + e^{-n})^2}
             = \left( 1 - \frac{1}{1 + e^{-n}} \right)\left( \frac{1}{1 + e^{-n}} \right)
             = (1 - a^1)(a^1)

\dot{f}^2(n) = \frac{d}{dn}(n) = 1

Page 33

Backpropagation

The second-layer sensitivity:
s^2 = -2\, \dot{F}^2(n^2)\,(t - a) = -2\,[\dot{f}^2(n^2)]\, e = -2\,(1)(1.261) = -2.522

The first-layer sensitivity:
s^1 = \dot{F}^1(n^1)\,(W^2)^T s^2
    = \begin{bmatrix} (1 - a^1_1)\,a^1_1 & 0 \\ 0 & (1 - a^1_2)\,a^1_2 \end{bmatrix}
      \begin{bmatrix} w^2_{1,1} \\ w^2_{1,2} \end{bmatrix} s^2
    = \begin{bmatrix} (1 - 0.321)(0.321) & 0 \\ 0 & (1 - 0.368)(0.368) \end{bmatrix}
      \begin{bmatrix} 0.09 \\ -0.17 \end{bmatrix} (-2.522)
    = \begin{bmatrix} -0.0495 \\ 0.0997 \end{bmatrix}

Page 34

Weight Update

Learning rate: \alpha = 0.1

W^2(1) = W^2(0) - \alpha\, s^2 (a^1)^T
       = \begin{bmatrix} 0.09 & -0.17 \end{bmatrix} - 0.1\,(-2.522)\begin{bmatrix} 0.321 & 0.368 \end{bmatrix}
       = \begin{bmatrix} 0.171 & -0.0772 \end{bmatrix}

b^2(1) = b^2(0) - \alpha\, s^2 = [0.48] - 0.1\,(-2.522) = [0.732]

W^1(1) = W^1(0) - \alpha\, s^1 (a^0)^T
       = \begin{bmatrix} -0.27 \\ -0.41 \end{bmatrix} - 0.1 \begin{bmatrix} -0.0495 \\ 0.0997 \end{bmatrix}[1]
       = \begin{bmatrix} -0.265 \\ -0.420 \end{bmatrix}

b^1(1) = b^1(0) - \alpha\, s^1
       = \begin{bmatrix} -0.48 \\ -0.13 \end{bmatrix} - 0.1 \begin{bmatrix} -0.0495 \\ 0.0997 \end{bmatrix}
       = \begin{bmatrix} -0.475 \\ -0.140 \end{bmatrix}

Page 35

Exercise

The network transfer functions are f^1(n) = (n)^2 and f^2(n) = 1/n, and the input/target pair is given to be (p = 1, t = 1).

For the network shown in the figure, the initial weights and biases are chosen to be w^1(0) = 1, b^1(0) = 2, w^2(0) = 1, b^2(0) = 1.

Perform one iteration of backpropagation with \alpha = 1.

Page 36

Choice of Network Structure

Multilayer networks can be used to approximate almost any function, if we have enough neurons in the hidden layers.

We cannot say, in general, how many layers or how many neurons are necessary for adequate performance.

Page 37

Illustrated Example 1

g(p) = 1 + \sin\!\left( \frac{i\pi}{4} p \right), \qquad -2 \le p \le 2

[Figure: responses of a trained 1-3-1 network for i = 1, 2, 4, 8; each panel shows the network output and the target over -2 \le p \le 2.]

Page 38

Illustrated Example 2

g(p) = 1 + \sin\!\left( \frac{6\pi}{4} p \right), \qquad -2 \le p \le 2

[Figure: responses of trained 1-2-1, 1-3-1, 1-4-1, and 1-5-1 networks to this function; each panel shows the network output and the target over -2 \le p \le 2.]

Page 39

Convergence

g(p) = 1 + \sin(\pi p), \qquad -2 \le p \le 2

[Figure: two training runs on this function. Left: convergence to the global minimum. Right: convergence to a local minimum. The numbers next to each curve indicate the sequence of iterations (0 through 5).]

Page 40

Generalization

In most cases the multilayer network is trained with a finite number of examples of proper network behavior:
\{p_1, t_1\}, \{p_2, t_2\}, \ldots, \{p_Q, t_Q\}

This training set is normally representative of a much larger class of possible input/output pairs.

Can the network successfully generalize what it has learned to the total population?

Page 41

Generalization Example

g(p) = 1 + \sin\!\left( \frac{\pi}{4} p \right), \qquad p = -2, -1.6, -1.2, \ldots, 1.6, 2

[Figure: responses of a trained 1-2-1 network and a trained 1-9-1 network to the training points.]

For a network to be able to generalize, it should have fewer parameters than there are data points in the training set.

The 1-2-1 network generalizes well; the 1-9-1 network does not.

Page 42

Objectives

The neural networks, trained in a supervised manner, require a target signal to define correct network behavior.

The unsupervised learning rules give networks the ability to learn associations between patterns that occur together frequently.

Associative learning allows networks to perform useful tasks such as pattern recognition (instar) and recall (outstar).

Page 43

What is an Association?

An association is any link between a system’s input and output such that when a pattern A is presented to the system it will respond with pattern B.

When two patterns are linked by an association, the input pattern is referred to as the stimulus and the output pattern is referred to as the response.

Page 44

Classic Experiment

Ivan Pavlov: he trained a dog to salivate at the sound of a bell, by ringing the bell whenever food was presented. When the bell is repeatedly paired with the food, the dog is conditioned to salivate at the sound of the bell, even when no food is present.

B. F. Skinner: he trained a rat to press a bar in order to obtain a food pellet.

Page 45

Associative Learning

Anderson and Kohonen independently developed the linear associator in the late 1960s and early 1970s.

Grossberg introduced nonlinear continuous-time associative networks during the same time period.

Page 46

Simple Associative Network

Single-Input Hard Limit Associator

Restrict the value of p to be either 0 or 1, indicating whether a stimulus is absent or present.

The output a indicates the presence or absence of the network’s response.

p = 1 (stimulus),  p = 0 (no stimulus)
a = 1 (response),  a = 0 (no response)

a = hardlim(wp + b) = hardlim(wp - 0.5)

[Figure: single-input hard-limit associator with input p, weight w, bias b, net input n, and output a.]

Page 47

Two Types of Inputs

Unconditioned Stimulus

Analogous to the food presented to the dog in Pavlov’s experiment.

Conditioned Stimulus

Analogous to the bell in Pavlov’s experiment.

The dog salivates only when food is presented. This is an innate response that does not have to be learned.

Page 48

Banana Associator

An unconditioned stimulus (banana shape) and a conditioned stimulus (banana smell)

Initially the network responds to the shape of the banana, but not to its smell.

w^0 = 1, \qquad w = 0, \qquad b = -0.5

p^0 = 1 (shape detected),  p^0 = 0 (shape not detected)
p = 1 (smell detected),  p = 0 (smell not detected)

a = hardlim(w^0 p^0 + w\,p + b)

[Figure: banana associator network with the unconditioned input p^0 (shape) weighted by w^0 and the conditioned input p (smell) weighted by w.]

Page 49

Associative Learning

Both animals and humans tend to associate things that occur simultaneously.

If a banana smell stimulus occurs simultaneously with a banana concept response (activated by some other stimulus such as the sight of a banana shape), the network should strengthen the connection between them so that later it can activate its banana concept in response to the banana smell alone.

Page 50

Unsupervised Hebb Rule

Increase the weight w_{ij} between a neuron's input p_j and output a_i in proportion to their product:
w_{ij}(q) = w_{ij}(q-1) + \alpha\, a_i(q)\, p_j(q)

The Hebb rule uses only signals available within the layer containing the weight being updated, so it is a local learning rule.

Vector form:
W(q) = W(q-1) + \alpha\, a(q)\, p^T(q)

Learning is performed in response to the training sequence
p(1), p(2), \ldots, p(Q)

Page 51

Ex: Banana Associator

Initial weights:  w^0 = 1,  w(0) = 0

Training sequence:
\{p^0(1) = 0,\, p(1) = 1\}, \{p^0(2) = 1,\, p(2) = 1\}, \{p^0(3) = 0,\, p(3) = 1\}, \ldots

Learning rule:  w(q) = w(q-1) + a(q)\, p(q), \qquad \alpha = 1

[Figure: banana associator. The sight (shape) and smell inputs feed the fruit network, which answers "banana?" from either stimulus.]

Page 52

Ex: Banana Associator

First iteration (sight fails):
a(1) = hardlim(w^0 p^0(1) + w(0)\, p(1) - 0.5) = hardlim(1 \cdot 0 + 0 \cdot 1 - 0.5) = 0 \qquad (no response)
w(1) = w(0) + a(1)\, p(1) = 0 + 0 \cdot 1 = 0

Second iteration (sight works):
a(2) = hardlim(w^0 p^0(2) + w(1)\, p(2) - 0.5) = hardlim(1 \cdot 1 + 0 \cdot 1 - 0.5) = 1 \qquad (banana)
w(2) = w(1) + a(2)\, p(2) = 0 + 1 \cdot 1 = 1

Page 53

Ex: Banana Associator

Third iteration (sight fails):
a(3) = hardlim(w^0 p^0(3) + w(2)\, p(3) - 0.5) = hardlim(1 \cdot 0 + 1 \cdot 1 - 0.5) = 1 \qquad (banana)
w(3) = w(2) + a(3)\, p(3) = 1 + 1 \cdot 1 = 2

From now on, the network is capable of responding to bananas that are detected either by sight or by smell. Even if both detection systems suffer intermittent faults, the network will be correct most of the time.
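These three iterations can be replayed with a few lines of plain Python (no libraries needed); the stimulus sequence and parameters are the ones given on the earlier slides (w^0 = 1, b = -0.5, alpha = 1).

    def hardlim(n):
        return 1 if n >= 0 else 0

    w0, w, b = 1.0, 0.0, -0.5                    # fixed shape weight, adaptive smell weight, bias
    sequence = [(0, 1), (1, 1), (0, 1)]          # (p0 = shape, p = smell) for q = 1, 2, 3

    for q, (p0, p) in enumerate(sequence, start=1):
        a = hardlim(w0 * p0 + w * p + b)         # banana response?
        w = w + a * p                            # Hebb rule: w(q) = w(q-1) + a(q) p(q)
        print(q, a, w)                           # q=1: a=0, w=0;  q=2: a=1, w=1;  q=3: a=1, w=2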

Page 54

Problems of Hebb Rule

Weights will become arbitrarily large

Synapses cannot grow without bound.

There is no mechanism for weights to decrease

If the inputs or outputs of a Hebb network experience any noise, every weight will grow (however slowly) until the network responds to any stimulus.

Page 55

Hebb Rule with Decay

Hebb rule with decay:
W(q) = W(q-1) + \alpha\, a(q)\, p^T(q) - \gamma\, W(q-1) = (1 - \gamma)\, W(q-1) + \alpha\, a(q)\, p^T(q)

\gamma, the decay rate, is a positive constant less than one. It keeps the weight matrix from growing without bound.

The maximum weight value can be found by setting both a_i and p_j to 1 at steady state:
w^{max}_{ij} = (1 - \gamma)\, w^{max}_{ij} + \alpha \quad \Rightarrow \quad w^{max}_{ij} = \frac{\alpha}{\gamma}

The maximum weight value is therefore determined by the decay rate \gamma.

Page 56

Ex: Banana Associator

Hebb rule with decay:  w(q) = w(q-1) + a(q)\, p(q) - 0.1\, w(q-1), \qquad \alpha = 1,\; \gamma = 0.1

First iteration (sight fails):
a(1) = hardlim(w^0 p^0(1) + w(0)\, p(1) - 0.5) = 0 \qquad (no response)
w(1) = w(0) + a(1)\, p(1) - 0.1\, w(0) = 0 + 0 - 0 = 0

Second iteration (sight works):
a(2) = hardlim(w^0 p^0(2) + w(1)\, p(2) - 0.5) = 1 \qquad (banana)
w(2) = w(1) + a(2)\, p(2) - 0.1\, w(1) = 0 + 1 - 0 = 1

Third iteration (sight fails):
a(3) = hardlim(w^0 p^0(3) + w(2)\, p(3) - 0.5) = 1 \qquad (banana)
w(3) = w(2) + a(3)\, p(3) - 0.1\, w(2) = 1 + 1 - 0.1 = 1.9

Page 57

Ex: Banana Associator

w^{max}_{ij} = \frac{\alpha}{\gamma} = \frac{1}{0.1} = 10

[Figure: growth of the weight w over 30 iterations. Left: Hebb rule (unbounded growth). Right: Hebb rule with decay (bounded by w^{max} = 10).]

Page 58

Prob. of Hebb Rule with Decay

[Figure: decay of the weight w over 30 iterations when no stimulus is presented.]

Associations will decay away if stimuli are not occasionally presented.

If a_i = 0, then
w_{ij}(q) = (1 - \gamma)\, w_{ij}(q-1)

If \gamma = 0.1, this reduces to
w_{ij}(q) = 0.9\, w_{ij}(q-1)

The weight decays by 10% at each iteration for which a_i = 0 (no stimulus).

Page 59

Instar (Recognition Network)

A neuron that has a vector input and a scalar output is referred to as an instar.

This neuron is capable of pattern recognition.

The instar is similar to the perceptron, the ADALINE, and the linear associator.

[Figure: instar. A single neuron with input vector p = (p_1, p_2, \ldots, p_R), weights w_{1,1}, \ldots, w_{1,R}, bias b, net input n, and scalar output a.]

Page 60

Instar Operation

Input-output expression:
a = hardlim(Wp + b) = hardlim({}_1w^T p + b)

The instar is active when {}_1w^T p \ge -b, or
\|{}_1w\|\,\|p\| \cos\theta \ge -b,
where \theta is the angle between the two vectors.

If \|{}_1w\| = \|p\|, the inner product is maximized when the angle between the two vectors is 0, i.e., when p points in the same direction as {}_1w.

Assume that all input vectors have the same length (norm).

Page 61

Vector Recognition

If b = -\|{}_1w\|\,\|p\|, then the instar will be active only when \theta = 0.

If b > -\|{}_1w\|\,\|p\|, then the instar will be active for a range of angles.

The larger the value of b, the more patterns there will be that can activate the instar, thus making it less discriminatory.

Forgetting problem of the Hebb rule with decay: it requires stimuli to be repeated, or the associations will be lost.

Page 62

Instar Rule

Hebb rule:
w_{ij}(q) = w_{ij}(q-1) + \alpha\, a_i(q)\, p_j(q)

Hebb rule with decay:
w_{ij}(q) = (1 - \gamma)\, w_{ij}(q-1) + \alpha\, a_i(q)\, p_j(q)

Instar rule: to get rid of the forgetting problem, a decay term is added that is proportional to a_i(q), so the weights decay only when the instar is active (a_i \ne 0):
w_{ij}(q) = w_{ij}(q-1) + \alpha\, a_i(q)\, p_j(q) - \gamma\, a_i(q)\, w_{ij}(q-1)

If \gamma = \alpha:
w_{ij}(q) = w_{ij}(q-1) + \alpha\, a_i(q)\, \left( p_j(q) - w_{ij}(q-1) \right)

Vector form:
{}_i w(q) = {}_i w(q-1) + \alpha\, a_i(q)\, \left( p(q) - {}_i w(q-1) \right)
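A minimal sketch of this update, assuming NumPy; the function name instar_update is illustrative.

    import numpy as np

    def instar_update(w, p, a_i, alpha=1.0):
        # iw(q) = iw(q-1) + alpha * a_i(q) * (p(q) - iw(q-1)); decay happens only when the instar is active
        return w + alpha * a_i * (p - w)

    w = np.zeros(3)
    w = instar_update(w, np.array([1.0, -1.0, -1.0]), a_i=1)   # the weight vector moves toward p
    print(w)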

Page 63

Graphical Representation

For the case where the instar is active (a_i = 1):
{}_i w(q) = {}_i w(q-1) + \alpha\, \left( p(q) - {}_i w(q-1) \right) = (1 - \alpha)\, {}_i w(q-1) + \alpha\, p(q)
[Figure: {}_i w(q) lies on the line between {}_i w(q-1) and p(q); the weight vector moves toward the input vector.]

For the case where the instar is inactive (a_i = 0):
{}_i w(q) = {}_i w(q-1)

Page 64

Ex: Orange Recognizer

The elements of p will be constrained to \pm 1 values.

p = [measured shape; measured texture; measured weight]

p^0 = 1 (orange detected visually),  p^0 = 0 (orange not detected)

a = hardlim(w^0 p^0 + W p + b), \qquad W = {}_1w^T = [w_{1,1} \;\; w_{1,2} \;\; w_{1,3}]

b = -2, a value slightly more positive than -\|p\|^2 = -3.

[Figure: orange recognizer. The sight input p^0 and the three measurements p feed the fruit network, which answers "orange?".]

Page 65

Initialization & Training

Initial weights:  w^0 = 3, \quad {}_1w(0) = [0 \;\; 0 \;\; 0]^T

The instar rule (\alpha = 1):
{}_1w(q) = {}_1w(q-1) + a(q)\, \left( p(q) - {}_1w(q-1) \right)

Training sequence:
\{p^0(1) = 0,\; p(1) = [1,\,-1,\,-1]^T\}, \quad \{p^0(2) = 1,\; p(2) = [1,\,-1,\,-1]^T\}, \ldots

First iteration:
a(1) = hardlim(w^0 p^0(1) + W p(1) - 2) = hardlim(3 \cdot 0 + 0 - 2) = 0 \qquad (no response)
{}_1w(1) = {}_1w(0) + a(1)\, \left( p(1) - {}_1w(0) \right) = [0 \;\; 0 \;\; 0]^T

Page 66

Second Training Iteration

Second iteration:
a(2) = hardlim(w^0 p^0(2) + W p(2) - 2)
     = hardlim\!\left( 3 \cdot 1 + [0 \;\; 0 \;\; 0]\begin{bmatrix} 1 \\ -1 \\ -1 \end{bmatrix} - 2 \right) = 1 \qquad (orange)

{}_1w(2) = {}_1w(1) + a(2)\, \left( p(2) - {}_1w(1) \right)
         = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} + 1\left( \begin{bmatrix} 1 \\ -1 \\ -1 \end{bmatrix} - \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \right)
         = \begin{bmatrix} 1 \\ -1 \\ -1 \end{bmatrix}

The network can now recognize the orange by its measurements.

Page 67

Third Training Iteration

Third iteration:
a(3) = hardlim(w^0 p^0(3) + W p(3) - 2)
     = hardlim\!\left( 3 \cdot 0 + [1 \;\; -1 \;\; -1]\begin{bmatrix} 1 \\ -1 \\ -1 \end{bmatrix} - 2 \right) = 1 \qquad (orange)

{}_1w(3) = {}_1w(2) + a(3)\, \left( p(3) - {}_1w(2) \right)
         = \begin{bmatrix} 1 \\ -1 \\ -1 \end{bmatrix} + 1\left( \begin{bmatrix} 1 \\ -1 \\ -1 \end{bmatrix} - \begin{bmatrix} 1 \\ -1 \\ -1 \end{bmatrix} \right)
         = \begin{bmatrix} 1 \\ -1 \\ -1 \end{bmatrix}

Orange will now be detected if either set of sensors works.
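A short script (assuming NumPy) replays these three iterations; the measurement vector [1, -1, -1] follows the reconstruction used on the preceding slides.

    import numpy as np

    hardlim = lambda n: 1 if n >= 0 else 0

    w0, b = 3.0, -2.0                            # fixed sight weight and bias
    w = np.zeros(3)                              # adaptive measurement weights (1w)
    p_orange = np.array([1.0, -1.0, -1.0])       # measured shape, texture, weight
    sequence = [(0, p_orange), (1, p_orange), (0, p_orange)]

    for q, (p0, p) in enumerate(sequence, start=1):
        a = hardlim(w0 * p0 + w @ p + b)         # orange detected?
        w = w + a * (p - w)                      # instar rule with alpha = 1
        print(q, a, w)                           # q=1: a=0;  q=2: a=1, w -> p;  q=3: a=1 from measurements alone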

Page 68

P13.5

Consider the instar network shown on slide 64. The training sequence for this network will consist of the following inputs:

\{p^0(1) = 0,\; p(1) = [1;\,1]\}, \quad \{p^0(2) = 1,\; p(2) = [1;\,1]\}, \ldots

These two sets of inputs are repeatedly presented to the network until the weight matrix W converges.

1. Perform the first four iterations of the instar rule, with learning rate \alpha = 0.5. Assume that the initial weight matrix is set to all zeros.

2. Display the results of each iteration of the instar rule in graphical form.

Page 69

Kohonen Rule

Kohonen rule:
{}_i w(q) = {}_i w(q-1) + \alpha\, \left( p(q) - {}_i w(q-1) \right), \qquad i \in X(q)

Learning occurs when the neuron's index i is a member of the set X(q).

The Kohonen rule can be made equivalent to the instar rule by defining X(q) as the set of all i such that a_i(q) = 1.

The Kohonen rule allows the weights of a neuron to learn an input vector and is therefore suitable for recognition applications.
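A sketch of the Kohonen update, assuming NumPy; winners plays the role of the set X(q), and the function name is illustrative.

    import numpy as np

    def kohonen_update(W, p, winners, alpha=0.5):
        # iw(q) = iw(q-1) + alpha * (p(q) - iw(q-1)) for every row i in the winning set X(q)
        W = W.copy()
        for i in winners:
            W[i] = W[i] + alpha * (p - W[i])
        return W

With winners chosen as the set of neurons whose output is 1, this reduces to the instar rule above.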

Page 70

Outstar (Recall Network)

The outstar network has a scalar input and a vector output.

It can perform pattern recall by associating a stimulus with a vector response.

[Figure: outstar. A scalar input p fans out through weights w_{1,1}, w_{2,1}, \ldots, w_{S,1} to the net inputs n_1, \ldots, n_S and outputs a_1, \ldots, a_S.]

Page 71

Outstar Operation

Input-output expression:
a = satlins(Wp)

If we would like the outstar network to associate a stimulus (an input of 1) with a particular output vector a*, set W = a*.

If p = 1, then a = satlins(Wp) = satlins(a*) = a*, and the pattern is correctly recalled.

A column of the weight matrix represents the pattern to be recalled.

Page 72

Outstar Rule

In the instar rule, the weight decay term of the Hebb rule is proportional to the output of the network, a_i.

In the outstar rule, the weight decay term of the Hebb rule is proportional to the input of the network, p_j:
w_{ij}(q) = w_{ij}(q-1) + \alpha\, a_i(q)\, p_j(q) - \gamma\, p_j(q)\, w_{ij}(q-1)

If \gamma = \alpha:
w_{ij}(q) = w_{ij}(q-1) + \alpha\, \left( a_i(q) - w_{ij}(q-1) \right) p_j(q)

Vector (column) form:
w_j(q) = w_j(q-1) + \alpha\, \left( a(q) - w_j(q-1) \right) p_j(q)

Learning occurs whenever p_j is nonzero (instead of a_i). When learning occurs, column w_j moves toward the output vector (complementary to the instar rule).

Page 73

Ex: Pineapple Recaller

Any set of measurements p^0 (with \pm 1 values) will be copied to a.

a = satlins(W^0 p^0 + W p), \qquad
W^0 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}

p^0 = [measured shape; measured texture; measured weight]

p = 1 (if a pineapple can be seen),  p = 0 (otherwise)

[Figure: pineapple recaller. The measurement inputs p^0_1, p^0_2, p^0_3 and the sight input p feed three neurons whose outputs a_1, a_2, a_3 are the recalled shape, texture, and weight.]

Page 74

Initialization

The outstar rule (\alpha = 1):
w_j(q) = w_j(q-1) + \left( a(q) - w_j(q-1) \right) p_j(q)

Training sequence:
\{p^0(1) = [0;\,0;\,0],\; p(1) = 1\}, \quad \{p^0(2) = p^{pineapple},\; p(2) = 1\}, \ldots

Pineapple measurements:
p^{pineapple} = \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix}

Page 75

First Training Iteration

First iteration:
a(1) = satlins(W^0 p^0(1) + W(0)\, p(1))
     = satlins\!\left( \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}(1) \right)
     = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \qquad (no response)

w_1(1) = w_1(0) + \left( a(1) - w_1(0) \right) p(1)
       = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} + \left( \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} - \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \right)(1)
       = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}

Page 76

Second Training Iteration

Second iteration:

The network forms an association between the sight and the measurements.

a(2) = satlins(W^0 p^0(2) + W(1)\, p(2))
     = satlins\!\left( \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}(1) \right)
     = \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix} \qquad (measurements given)

w_1(2) = w_1(1) + \left( a(2) - w_1(1) \right) p(2)
       = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} + \left( \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix} - \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \right)(1)
       = \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix}

Page 77

Third Training Iteration

Third iteration:

Even if the measurement system fails, the network is now able to recall the measurements of the pineapple when it sees it.

a(3) = satlins(W^0 p^0(3) + W(2)\, p(3))
     = satlins\!\left( \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix}(1) \right)
     = \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix} \qquad (measurements recalled)

w_1(3) = w_1(2) + \left( a(3) - w_1(2) \right) p(3)
       = \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix} + \left( \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix} - \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix} \right)(1)
       = \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix}