NN 5 2.09.98
Neural Networks 1
Associative Memories

We now consider NN models for unsupervised learning problems, called auto-association problems. Association is the task of mapping patterns to patterns. In an associative memory, the stimulus of an incomplete or corrupted pattern leads to the response of a stored pattern that corresponds in some manner to the input pattern. The neural network model most commonly used for (auto-)association problems is the Hopfield network.
Example

- States: bit maps.
- Attractors: prototype patterns.
- Input: an arbitrary pattern (e.g. a picture with noise).
- Output: the best prototype for that pattern.

[Figure: a corrupted image, an intermediate state of the Hopfield net, and the output.]
HOPFIELD NETWORKS

The Hopfield network implements a so-called content-addressable memory. A collection of patterns, called fundamental memories, is stored in the NN by means of weights. Each neuron represents a component of the input. The weight of the link between two neurons measures the correlation between the two corresponding components over the fundamental memories: if the weight is high, the corresponding components are often equal in the fundamental memories.
ARCHITECTURE: recurrent

[Figure: a multiple-loop feedback system with no self-feedback; each feedback loop contains a unit-delay operator z^-1.]
Hopfield discrete NN

Input vector values are in {-1, 1} (or {0, 1}). The number of neurons is equal to the input dimension. Every neuron has a link from every other neuron (recurrent architecture) except itself (no self-feedback). The neuron state at time n is its output value. The network state at time n is the vector of neuron states. The activation function used to update a neuron state is the sign function, but if the input of the activation function is 0, then the new output (state) of the neuron is equal to the old one. Weights are symmetric:

w_ij = w_ji
Notation

- N: input dimension.
- M: number of fundamental memories.
- f_{μ,i}: i-th component of the fundamental memory f_μ.
- x_i(n): state of neuron i at time n.
Weights computation

1. Storage. Let f_1, f_2, …, f_M denote a known set of N-dimensional fundamental memories. The synaptic weights of the network are:

w_ji = (1/N) Σ_{μ=1}^{M} f_{μ,j} f_{μ,i}   for j ≠ i,  j, i = 1, …, N
w_ji = 0                                   for j = i

where w_ji is the weight from neuron i to neuron j. The elements of the vectors f_μ are in {-1, +1}. Once computed, the synaptic weights are fixed.
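The storage step can be sketched as follows (a minimal sketch assuming the 1/N outer-product rule above; the function and variable names are ours):

```python
import numpy as np

def store(memories):
    """Compute the Hopfield weight matrix from a list of {-1,+1} patterns.

    Implements w_ji = (1/N) * sum over mu of f_mu[j] * f_mu[i],
    with zero diagonal (no self-feedback). `memories` is (M, N).
    """
    F = np.asarray(memories)
    M, N = F.shape
    W = (F.T @ F) / N          # outer-product (Hebbian) sum over the M memories
    np.fill_diagonal(W, 0.0)   # w_jj = 0: no self-feedback
    return W

# Two fundamental memories of dimension N = 4
W = store([[1, 1, -1, -1],
           [1, -1, 1, -1]])
print(W)  # symmetric, zero diagonal
```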
NN Execution

2. Initialisation. Let x_probe denote an input vector (probe) presented to the network. The algorithm is initialised by setting:

x_j(0) = x_probe,j,   j = 1, …, N

where x_j(0) is the state of neuron j at time n = 0, and x_probe,j is the j-th element of the probe vector x_probe.

3. Iteration Until Convergence. Update the elements of the network state vector x(n) asynchronously (i.e. randomly and one at a time) according to the rule

x_j(n+1) = sign( Σ_{i=1}^{N} w_ji x_i(n) ),   j = 1, 2, …, N

Repeat the iteration until the state vector x remains unchanged.

4. Outputting. Let x_fixed denote the fixed point (or stable state, i.e. such that x(n+1) = x(n)) computed at the end of step 3. The resulting output y of the network is:

y = x_fixed
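Steps 2-4 can be sketched as an asynchronous update loop (a sketch, not the original course code; `recall` and its parameters are our naming):

```python
import numpy as np

def recall(W, probe, rng=None, max_sweeps=100):
    """Asynchronous Hopfield retrieval.

    Neurons are updated one at a time in random order with the sign rule;
    a neuron whose local field is exactly 0 keeps its previous state.
    Stops when a full sweep changes no neuron (a fixed point).
    """
    rng = np.random.default_rng(rng)
    x = np.array(probe, dtype=float)
    for _ in range(max_sweeps):
        changed = False
        for j in rng.permutation(len(x)):
            field = W[j] @ x
            if field != 0:                    # field == 0: state unchanged
                new = 1.0 if field > 0 else -1.0
                if new != x[j]:
                    x[j] = new
                    changed = True
        if not changed:
            break
    return x

# Store one memory and recall it from a probe with one flipped bit
f = np.array([1, 1, -1, -1, 1])
W = np.outer(f, f) / len(f)
np.fill_diagonal(W, 0.0)
print(recall(W, [1, -1, -1, -1, 1], rng=0))  # recovers f
```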
Example 1

[Figure: a 3-neuron network whose links are labelled with the signs of the weights (+ and -), together with the cube of all 8 states in {-1, 1}^3. The network has two stable states; the remaining states fall into attraction basin 1 or attraction basin 2 and converge to the corresponding stable state.]
Example 2

Separation of patterns using the two fundamental memories (-1 -1 -1) and (1 1 1). Find weights for the 3-neuron network that obtain the following behavior: every state with a majority of -1 components converges to (-1 -1 -1), and every state with a majority of +1 components converges to (1 1 1).

w_ij = ?
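One way to answer "w_ij = ?" is to apply the storage rule to the two fundamental memories and check the resulting behavior (a sketch; `update_until_stable` is our helper and uses a deterministic sweep rather than a random update order):

```python
import numpy as np

# The two fundamental memories from the example
F = np.array([[-1, -1, -1],
              [ 1,  1,  1]])
N = F.shape[1]
W = F.T @ F / N
np.fill_diagonal(W, 0.0)
print(W)   # every off-diagonal weight is 2/3

def update_until_stable(W, x):
    """Sequential sign-rule updates until a fixed point is reached."""
    x = np.array(x, dtype=float)
    while True:
        changed = False
        for j in range(len(x)):
            field = W[j] @ x
            if field != 0 and np.sign(field) != x[j]:
                x[j] = np.sign(field)
                changed = True
        if not changed:
            return x

print(update_until_stable(W, [1, -1, 1]))   # majority +1: converges to (1, 1, 1)
```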
CONVERGENCE

Every stable state x is at an "energy minimum". (A state x is stable if x(n+1) = x(n).)

What is "energy"? Energy is a function (Lyapunov function)

E: States → R

such that every firing (change of its output value) of a neuron decreases the value of E:

E(x) = -(1/2) Σ_{i,j} w_ij x_i x_j = -(1/2) x^T W x

x → x' implies E(x') < E(x)
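The energy function can be written down directly (a minimal sketch; `energy` is our helper name). For the weights of the storage rule, a stored pattern sits lower in energy than a corrupted version of it:

```python
import numpy as np

def energy(W, x):
    """Hopfield energy E(x) = -1/2 * x^T W x (zero-threshold case)."""
    x = np.asarray(x, dtype=float)
    return -0.5 * x @ W @ x

f = np.array([1, 1, -1, -1, 1])
W = np.outer(f, f) / len(f)
np.fill_diagonal(W, 0.0)
print(energy(W, f))                   # about -2.0: the stored pattern is a minimum
print(energy(W, [1, -1, -1, -1, 1]))  # about -0.4: one flipped bit raises E
```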
CONVERGENCE

A stored pattern establishes a correlation between pairs of neurons: neurons tend to have the same state or opposite states according to their values in the pattern. If w_ji is large, this expresses an expectation that neurons i and j are positively correlated; if it is small (very negative), this indicates a negative correlation.

- Σ_{i,j} w_ij x_i x_j will thus be large for a state x which is a stored pattern (since w_ij will be positive if x_i x_j > 0 and negative if x_i x_j < 0).
- The negative of the sum will thus be small.
Energy Decreasing

Claim: firing a neuron decreases the energy E.

Proof: let l be the neuron that fires. Either:

1. x_l goes from -1 to 1, which implies Σ_{i=1}^{N} w_li x_i > 0 (above threshold), or
2. x_l goes from 1 to -1, which implies Σ_{i=1}^{N} w_li x_i < 0 (below threshold).

Thus in both cases (x'_l denotes the value after firing):

(x'_l - x_l) Σ_{i=1}^{N} w_li x_i > 0
Proof (continued)

Now we "factor" x_l from the energy function E:

E = -(1/2) Σ_{i=1}^{N} Σ_{j=1}^{N} w_ij x_i x_j

  = -(1/2) x_l Σ_{j=1}^{N} w_lj x_j - (1/2) Σ_{i≠l} Σ_{j=1}^{N} w_ij x_i x_j          (pulled out the i = l case)

  = -(1/2) x_l Σ_{j≠l} w_lj x_j - (1/2) Σ_{i≠l} w_il x_i x_l - (1/2) Σ_{i≠l} Σ_{j≠l} w_ij x_i x_j

  = -x_l Σ_{j≠l} w_lj x_j - (1/2) Σ_{i≠l} Σ_{j≠l} w_ij x_i x_j

In the second step the term j = l can be dropped from the first sum since w_ll = 0; in the last step the two single sums are merged using the symmetry w_il = w_li. The remaining double sum is a term independent of x_l.
Proof (continued)

How does E change when x_l changes? Let E be the value before the change and E' the value after. The term independent of x_l cancels, so:

E' - E = -x'_l Σ_{j≠l} w_lj x_j + x_l Σ_{j≠l} w_lj x_j
       = -(x'_l - x_l) Σ_{j≠l} w_lj x_j

We showed on the previous slide that this quantity without the '-' sign is > 0 (the j = l term can be added to the sum since w_ll = 0). So E' - E < 0 always.
Convergence result

We have shown that the energy E decreases with each neuron firing, and the overall number of states is finite (the state space is a subset of {+1, -1}^N). Therefore the energy cannot decrease forever, firing cannot continue forever, and the network converges to a stable state.
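The convergence argument can be illustrated numerically: for any symmetric weight matrix with zero diagonal (the only properties the proof used), the energy never increases under asynchronous sign updates. A sketch with random weights of our own choosing:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random symmetric weights with zero diagonal
N = 10
A = rng.standard_normal((N, N))
W = (A + A.T) / 2
np.fill_diagonal(W, 0.0)

x = rng.choice([-1.0, 1.0], size=N)
energies = [-0.5 * x @ W @ x]
for _ in range(200):                     # asynchronous sign updates
    j = rng.integers(N)
    field = W[j] @ x
    if field != 0:                       # zero field: keep the old state
        x[j] = 1.0 if field > 0 else -1.0
    energies.append(-0.5 * x @ W @ x)

# E never increases along the trajectory
assert all(e2 <= e1 + 1e-12 for e1, e2 in zip(energies, energies[1:]))
print(energies[0], energies[-1])
```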
Computer experiment

We illustrate the behavior of the discrete Hopfield network as a content-addressable memory. n = 120 neurons (⇒ n^2 - n = 14280 weights). The network is trained to retrieve 8 black-and-white patterns. Each pattern contains 120 pixels. The inputs of the net assume value +1 for black pixels and -1 for white pixels.

Retrieval 1: the fundamental memories are presented to the network to test its ability to recover them correctly from the information stored in the weights. In each case, the desired pattern was produced by the network after 1 iteration.
Computer experiment

[Figure: the eight patterns used as fundamental memories to create the weight matrix.]
Computer experiment

Retrieval 2: to test the error-correcting capability of the network, a pattern of interest is distorted by randomly and independently reversing each pixel of the pattern with a probability of 0.25. The distorted pattern is used as a probe for the network. The number of iterations needed to recall, averaged over the eight patterns, is about 31. The net behaves as expected.
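The probe-generation step of the experiment (flip each pixel independently with probability 0.25) can be sketched as follows; the all-ones "pattern" is a stand-in for illustration, not one of the eight digit patterns:

```python
import numpy as np

def corrupt(pattern, p=0.25, rng=None):
    """Flip each {-1,+1} pixel independently with probability p."""
    rng = np.random.default_rng(rng)
    pattern = np.asarray(pattern)
    flips = rng.random(pattern.shape) < p
    return np.where(flips, -pattern, pattern)

clean = np.ones(120)              # a 120-pixel pattern, as in the experiment
probe = corrupt(clean, p=0.25, rng=42)
print((probe != clean).mean())    # fraction of flipped pixels, around 0.25
```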
Spurious attractor states

Problem of incorrect retrieval: for example, the network with input a corrupted pattern "2" converges to "6".

Spurious attractor states: the next slide shows 108 spurious attractors found in 43097 tests of randomly selected digits corrupted with a bit-flip probability of 0.25.
Spurious attractor states

There are some local minima of E not correlated with any of the fundamental memories, because of the weights definition:

w_ji = (1/N) Σ_{μ=1}^{M} f_{μ,j} f_{μ,i}   for j ≠ i,  j, i = 1, …, N
w_ji = 0                                   for j = i

[Figure: a spurious attractor that is a combination of digit 1, digit 4 and digit 9.]
Storage Capacity

Storage capacity: the quantity of information that can be stored in a network in such a way that it can be retrieved correctly. Definition of storage capacity:

C = (number of fundamental patterns) / (number of neurons in the network)

or, equivalently,

C = (number of fundamental patterns) / (number of weights in the network)
Storage Capacity: bounds

Theorem: the maximum storage capacity of a discrete Hopfield NN, defined as C = M/N, is bounded above by the quantity

C = 1 / (2 ln N)

Defining M_max as the storage capacity almost without error, i.e. the largest number of fundamental memories that can be stored in the network with correct recall of most of them, we have:

M_max = N / (2 ln N)
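Plugging the computer experiment's N = 120 into the almost-error-free bound gives roughly 12-13 memories, consistent with the experiment storing 8 patterns (a small sketch; the function name is ours):

```python
import math

def capacity_bound(N):
    """Almost-error-free storage bound M_max = N / (2 ln N)."""
    return N / (2 * math.log(N))

N = 120
print(capacity_bound(N))   # about 12.5 memories for the 120-neuron experiment
```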
Hopfield NN for Optimization

- Optimization problems (like the Travelling Salesman Problem) can be encoded into Hopfield networks.
- The objective function corresponds to the energy of the network.
- Good solutions are stable states of the network.
Travelling Salesman Problem

Travelling Salesman Problem (TSP): given N cities with distances d_ij, what is the shortest tour?
Encoding

Construct a Hopfield network with N^2 nodes. Semantics: n_ia = 1 iff town i is visited at step a. Constraints:

Σ_a n_ia = 1 for all i,   Σ_i n_ia = 1 for all a

The town distances are encoded by the weights, i.e. w_{ia,jb} = -d_ij between nodes in adjacent steps.
Hopfield Network for TSP

        place
city    1  2  3  4
A       0  1  0  0
B       1  0  0  0
C       0  0  0  1
D       0  0  1  0

[Figures contrast an allowed configuration (exactly one 1 per row and per column, as above) with a configuration that is not allowed.]
Energy and Weights

E = -(1/2) Σ_{i,a,j,b} w_{ia,jb} n_ia n_jb

- Nodes within each row (i = j, same city) are connected with weights w_{ia,ib} = -γ.
- Nodes within each column (a = b, same step) are connected with weights w_{ia,ja} = -γ.
- Each node is connected to the nodes in the columns to its left and right (adjacent steps) with weight w_{ia,jb} = -d_ij.
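The weight scheme above can be assembled into a small sketch. The indexing scheme, the single penalty γ, and the unit distances below are our illustrative choices, not fixed by the slides:

```python
import numpy as np

def tsp_weights(d, gamma):
    """Weights w[(i,a),(j,b)] for a Hopfield TSP encoding with N^2 nodes.

    -gamma within a row (same city) and within a column (same step),
    -d[i][j] between different cities at adjacent steps, 0 otherwise.
    """
    d = np.asarray(d, dtype=float)
    N = d.shape[0]
    W = np.zeros((N, N, N, N))
    for i in range(N):
        for a in range(N):
            for j in range(N):
                for b in range(N):
                    if (i, a) == (j, b):
                        continue            # no self-feedback
                    if i == j or a == b:
                        W[i, a, j, b] = -gamma
                    elif abs(a - b) == 1:
                        W[i, a, j, b] = -d[i, j]
    return W

def energy(W, n):
    """E = -1/2 * sum over i,a,j,b of w[(i,a),(j,b)] n_ia n_jb for a 0/1 matrix n."""
    return -0.5 * np.einsum('iajb,ia,jb->', W, n, n)

# The allowed tour B, A, D, C from the table (rows = cities A..D, cols = steps 1..4)
n = np.array([[0, 1, 0, 0],
              [1, 0, 0, 0],
              [0, 0, 0, 1],
              [0, 0, 1, 0]])
d = np.ones((4, 4)) - np.eye(4)             # unit distances, for illustration
W = tsp_weights(d, gamma=2.0)
print(energy(W, n))   # -> 3.0: no penalty terms fire, E equals the 3-leg path length
```

For a valid permutation matrix no two active nodes share a row or column, so the -γ penalties contribute nothing and the energy reduces to the tour length along adjacent steps, which is what minimising E is meant to achieve.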