Page 1:

Internet Engineering

Jacek Mazurkiewicz, PhD

Softcomputing

Part 3: Recurrent Artificial Neural Networks

Self-Organising Artificial Neural Networks

Page 2:

Recurrent Artificial Neural Networks

Feedback signals between neurons

Dynamic relations

A change in a single neuron is transmitted to the whole net

A stable state is reached after a sequence of temporary states

A stable state is available only if strict assumptions are imposed on the weights

Recurrent artificial neural networks are equipped with symmetric inter-neuron connections

Page 3:

Associative Memory

computer „memory” – as close as possible to human memory:

associative memory – to store „patterns”

auto-associative: Smoth – Smith

learning procedure – to imprint the set of patterns

retrieving phase – output the stored pattern closest to the actual input signal

hetero-associative: Smith – Smith’s face (Smith’s feature)

Page 4:

Hopfield Network (1)

Hamming distance for a binary input:

$$d_H = \sum_{i=1}^{n} \left[\, x_i (1 - y_i) + (1 - x_i)\, y_i \,\right]$$

Hamming distance equals zero if:

y = x

Hamming distance is the number of differing bits

[Figure: Hopfield network – input signals x1 … xN, feedback signals v1 … vN, neuron outputs y1 … yN, connection weights wij]
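The distance above can be sketched in Python as a minimal, illustrative function, assuming the two vectors are binary (0/1) NumPy arrays; the function name is ours, not part of the original material:

```python
import numpy as np

def hamming_distance(x, y):
    """Number of differing bits between two binary (0/1) vectors,
    computed exactly as in the slide formula:
    d_H = sum_i [ x_i * (1 - y_i) + (1 - x_i) * y_i ]."""
    x = np.asarray(x)
    y = np.asarray(y)
    return int(np.sum(x * (1 - y) + (1 - x) * y))

# example: vectors differing in two positions
print(hamming_distance([1, 0, 1, 1], [1, 1, 1, 0]))  # -> 2
```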

Page 5:

Retrieving Phase (1)

each neuron performs the following two steps:

– computes the coproduct:

$$u_p(k+1) = \sum_{j=1}^{N} w_{pj}\, v_j(k) - \theta_p$$

– updates the state:

$$v_p(k+1) = \begin{cases} +1 & \text{for } u_p(k+1) > 0 \\ v_p(k) & \text{for } u_p(k+1) = 0 \\ -1 & \text{for } u_p(k+1) < 0 \end{cases}$$


where:

wpj – weight related to feedback signal

vi(k) – feedback signal

θ_p – bias

Page 6:

Retrieving Phase (2)

initial condition:

$$v_p(0) = x_p$$

the process is repeated until convergence, which occurs when none of the elements changes state during any iteration:

$$v_p(k+1) = v_p(k) = y_p$$

A converged state of the Hopfield net means that the net has already reached one of its attractors. An attractor is a point of local minimum of the energy function (Lyapunov function):

$$E(x) = -\frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} w_{ij}\, x_i\, x_j + \sum_{i=1}^{N} \theta_i\, x_i$$

$$E(x) = -\frac{1}{2}\, x^T W x + \theta^T x$$
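The retrieval procedure and the energy function can be sketched in Python as below. This is only an illustrative asynchronous version, assuming bipolar (±1) states and an optional bias vector θ; names and the sweep limit are our own choices, not the original implementation:

```python
import numpy as np

def energy(W, x, theta=None):
    """Energy (Lyapunov) function: E(x) = -1/2 x^T W x + theta^T x."""
    theta = np.zeros(len(x)) if theta is None else theta
    return -0.5 * x @ W @ x + theta @ x

def retrieve(W, x, theta=None, max_sweeps=100):
    """Asynchronous retrieval: neurons are updated one by one until no
    element changes state during a full sweep (a converged attractor)."""
    v = np.array(x, dtype=float)              # initial condition: v(0) = x
    theta = np.zeros(len(v)) if theta is None else theta
    for _ in range(max_sweeps):
        changed = False
        for p in range(len(v)):
            u = W[p] @ v - theta[p]           # coproduct of neuron p
            new = v[p] if u == 0 else (1.0 if u > 0 else -1.0)
            if new != v[p]:
                v[p], changed = new, True
        if not changed:                       # convergence: v(k+1) = v(k) = y
            break
    return v
```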

Page 7:

Hebbian Learning

training patterns are presented one by one in fitted time intervals


during each interval the input data is communicated to the neuron's neighbours N times

$$w_{ij} = \begin{cases} \dfrac{1}{N} \displaystyle\sum_{m=1}^{M} x_i^{(m)} x_j^{(m)} & \text{for } i \neq j \\ 0 & \text{for } i = j \end{cases}$$

convergence condition:

$$w_{pp} = 0, \qquad w_{pj} = w_{jp}$$

algorithm: easy, fast, low memory capacity:

$$M_{max} = 0.138\, N$$
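A minimal sketch of this Hebbian weight construction, assuming the M bipolar (±1) patterns are stacked as the rows of a NumPy array; per the capacity result above, reliable storage needs M below roughly 0.138·N:

```python
import numpy as np

def hebbian_weights(patterns):
    """Hopfield weights from M bipolar (+1/-1) patterns of length N:
    w_ij = (1/N) * sum_m x_i^(m) x_j^(m) for i != j, and w_ii = 0."""
    X = np.asarray(patterns, dtype=float)   # shape (M, N), one pattern per row
    M, N = X.shape
    W = (X.T @ X) / N                       # sums the outer products of all patterns
    np.fill_diagonal(W, 0.0)                # zero self-connections
    return W
```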

Page 8:

Pseudoinverse Learning

correct weight values mean:
– input signal generates itself as output
– converged state is available at once:

$$W X = X$$

one of the possible solutions is:

$$W = X \left( X^T X \right)^{-1} X^T$$

algorithm: sophisticated, high memory capacity:

$$M_{max} = N$$
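A possible NumPy rendering of this solution, assuming the stored patterns are linearly independent (so that X^T X is invertible) and are given as rows of the input array:

```python
import numpy as np

def pseudoinverse_weights(patterns):
    """Weights solving W X = X via the pseudoinverse:
    W = X (X^T X)^(-1) X^T, with the patterns stored as the columns of X."""
    X = np.asarray(patterns, dtype=float).T      # shape (N, M), one pattern per column
    W = X @ np.linalg.inv(X.T @ X) @ X.T
    # equivalently: W = X @ np.linalg.pinv(X)
    return W
```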

Page 9:

Delta-Rule Learning

weights are tuned step by step using all learning signals, presented in a sequence:

$$W' = W + \frac{\eta}{N} \left( x^{(i)} - W x^{(i)} \right) \left( x^{(i)} \right)^T$$

$$\eta \in [\,0.7,\ 0.9\,] \text{ – learning rate}$$

algorithm is quite similar to gradient methods used for Multilayer Perceptron learning

algorithm: sophisticated, high memory capacity:

$$M_{max} = N$$
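An illustrative implementation of the iterative rule, assuming bipolar patterns as rows of a NumPy array; the learning rate 0.8 sits inside the range quoted above and the epoch count is an arbitrary choice:

```python
import numpy as np

def delta_rule_weights(patterns, eta=0.8, epochs=100):
    """Iterative delta-rule training:
    W <- W + (eta/N) * (x - W x) x^T, patterns presented in a sequence."""
    X = np.asarray(patterns, dtype=float)   # shape (M, N)
    M, N = X.shape
    W = np.zeros((N, N))
    for _ in range(epochs):
        for x in X:
            W += (eta / N) * np.outer(x - W @ x, x)
    return W
```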

Page 10:

Retrieving Phase – Problems

Input signals heavily corrupted by noise can lead to a false answer
– net output is far from learned/stored patterns

Energy function value for symmetric states is identical: (+1,+1,−1) and (−1,−1,+1)
– both solutions offer the same „acceptance factor”

Learning algorithms can produce additional local minima
– as linear combinations of learning patterns

Additional minima are not fixed to any learning pattern
– especially important if the number of learning patterns is significant

Page 11:

Example of Answers

10 digits, 7x7 pixels

Hebbian learning:
– 1 correct answer

Pseudoinverse & Delta-rule learning:
– 7 correct answers
– 9 answers with 1 wrong pixel
– 4 answers with 2 wrong pixels

Page 12:

Hamming Network (1)

Page 13:

Hamming Network (2)

Hamming net – maximum likelihood classifier for binary inputs corrupted by noise

Lower Sub Net calculates N minus the Hamming distance to M exemplar patterns

Upper Sub Net selects that node with the maximum output

All nodes use threshold logic nonlinearities
– the outputs of these nonlinearities never saturate

Thresholds and weights in the Maxnet are fixed

All thresholds are set to zero, weights from each node to itself are 1

Weights between nodes are inhibitory

Page 14:

Hamming Network (3)

weights and offsets of the Lower Sub Net:

$$w_{ji} = \frac{x_i^j}{2}, \qquad \theta_j = \frac{N}{2} \qquad \text{for } 0 \le i \le N-1 \text{ and } 0 \le j \le M-1$$

weights in the Maxnet are fixed as:

$$w_{lk} = \begin{cases} 1 & \text{if } k = l \\ -\varepsilon & \text{if } k \neq l \end{cases} \qquad \text{for } l, k \in \{0, \dots, M-1\} \text{ and } \varepsilon < \frac{1}{M}$$

all thresholds in the Maxnet are kept zero

Page 15:

Hamming Network (4)

outputs of the Lower Sub Net are obtained as:

$$\mu_j = \sum_{i=0}^{N-1} w_{ji}\, x_i - \theta_j \qquad \text{for } 0 \le i \le N-1 \text{ and } 0 \le j \le M-1$$

$$y_j(0) = f(\mu_j) \qquad \text{for } 0 \le j \le M-1$$

Maxnet does the maximisation by evaluating:

$$y_j(t+1) = f\!\left( y_j(t) - \varepsilon \sum_{k \neq j} y_k(t) \right) \qquad \text{for } 0 \le j, k \le M-1$$

this process is repeated until convergence
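A compact sketch of the whole Hamming network, assuming bipolar (±1) exemplars and inputs, a threshold-linear nonlinearity f, and ε = 1/(M+1) (any value below 1/M would do); all names and the stopping rule are our own illustrative choices:

```python
import numpy as np

def hamming_net(exemplars, x, eps=None, max_iter=1000):
    """Hamming network sketch. The Lower Sub Net scores each exemplar with
    N - Hamming distance; the Maxnet then suppresses all but the best match."""
    E = np.asarray(exemplars, dtype=float)      # shape (M, N), one exemplar per row
    M, N = E.shape
    eps = 1.0 / (M + 1) if eps is None else eps # must satisfy eps < 1/M

    f = lambda u: np.maximum(u, 0.0)            # threshold-logic nonlinearity
    y = f(E @ x / 2.0 + N / 2.0)                # Lower Sub Net: N - d_H per exemplar

    for _ in range(max_iter):                   # Maxnet iterations
        y_new = f(y - eps * (y.sum() - y))      # inhibition from all other nodes
        if np.count_nonzero(y_new) <= 1 or np.allclose(y_new, y):
            y = y_new
            break
        y = y_new
    return int(np.argmax(y))                    # index of the winning exemplar
```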

Page 16:

Introduction

learning without a teacher – data overload

unsupervised learning:
– similarity
– PCA algorithms
– classification
– archetype finding
– feature maps

Page 17:

Pavlov Experiment

FOOD (UCS) → SALIVATION (UCR)

BELL (CS) → SALIVATION (CR)

FOOD + BELL (UCS + CS) → SALIVATION (CR)

CS – conditioned stimulus CR – conditioned reflexUCS – unconditioned stimulus UCR – unconditioned reflex

Page 18:

Fields of Use

similarity
– single-output net
– how close the input signal is to the „mean-learned-pattern”

PCA
– multi-output net, each output = single principal component
– principal components responsible for similarity
– actual output vector – correlation level

classification
– binary multi-output with 1 of n code – class of closest data

stored patterns finding
– associative memory

coding
– data compression

Page 19:

Hebbian Rule (1949)

if neuron A is activated in a cyclic way by neuron B
– neuron A becomes more and more sensitive to activation from neuron B

f(a) is any function – linear, for example

[Figure: single neuron – inputs x1 … xm, weights wi1 … wim, activation ui, output yi = f(ai)]

$$w_{ij}(k+1) = w_{ij}(k) + \Delta w_{ij}(k)$$

$$\Delta w_{ij} = \eta\, x_j(k)\, y_i(k)$$

Page 20:

General Hebbian Rule

Problem:
– unlimited weight growth

Solution:
– set limitations (Linsker)
– Oja's rule

general rule and neuron output:

$$\Delta w_{ij} = F(x_j, y_i), \qquad y_i(k) = \sum_{j=0}^{m} w_{ij}\, x_j(k)$$

Limitations:

$$\Delta w_{ij} = \eta\, x_j(k)\, y_i(k); \qquad x_j \ge 0, \quad y_i \ge 0, \quad \Delta w_{ij} \ge 0; \qquad w_i \in \left[\, w_i^-,\ w_i^+ \,\right]$$

Oja's rule:
– Hebbian rule + normalisation
– additional requirements

$$\Delta w_{ij}(k) = \eta\, y_i(k) \left[\, x_j(k) - y_i(k)\, w_{ij}(k) \,\right]$$
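The difference between a plain Hebbian step and Oja's normalised step can be seen in a short sketch for a single linear neuron; the learning rate and the synthetic data below are arbitrary illustration choices, not values from the lecture:

```python
import numpy as np

def oja_update(w, x, eta=0.01):
    """One Oja's-rule step for a single linear neuron:
    y = w^T x,  w <- w + eta * y * (x - y * w).
    The subtracted term keeps the weight norm bounded (the normalisation)."""
    y = w @ x
    return w + eta * y * (x - y * w)

# repeated over many samples, w drifts towards the first principal direction
rng = np.random.default_rng(0)
data = rng.normal(size=(2000, 5)) @ np.diag([3.0, 1.0, 0.5, 0.2, 0.1])
w = rng.normal(size=5)
for x in data:
    w = oja_update(w, x)
print(w / np.linalg.norm(w))   # approximately the leading principal direction
```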

Page 21:

Principal Component Analysis - PCA

Statistical lossy compression in telecommunications
– Karhunen-Loève approach

Linear conversion into an output space with reduced dimensions
– preserves the most important features of the stochastic process x

First component estimation
– weights vector – using Oja's rule:

Other principal components
– by Sanger's rule:

$$y = W x, \qquad x \in R^N, \quad W \in R^{K \times N}, \quad y \in R^K, \quad K < N$$

$$y_1(k) = W_1^T x(k) = \sum_{j=0}^{N} W_{1j}\, x_j(k)$$

$$y_i(k) = \sum_{j=0}^{N} W_{ij}\, x_j(k)$$

Page 22:

Neural Networks for PCA

Oja's rule - 1989:

$$\Delta w_{ij} = \eta\, y_i \left( x_j - \sum_{l=1}^{k} y_l\, w_{lj} \right) \qquad i = 1, \dots, k, \quad j = 1, \dots, n$$

Sanger's rule - 1989:

$$\Delta w_{ij} = \eta\, y_i \left( x_j - \sum_{l=1}^{i} y_l\, w_{lj} \right) \qquad i = 1, \dots, k, \quad j = 1, \dots, n$$
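A sketch of one Sanger's-rule (GHA) update, assuming W holds the k weight vectors as rows of a float NumPy array; for the multi-output Oja rule the slice [: i + 1] would simply be replaced by the full matrix. Names and the learning rate are illustrative:

```python
import numpy as np

def sanger_update(W, x, eta=0.01):
    """One Sanger's-rule (GHA) step. W has shape (k, n): row i holds the
    weights of output i. For output i only the components already
    'explained' by outputs 1..i are subtracted from the input."""
    y = W @ x                                    # outputs y_1 .. y_k
    for i in range(W.shape[0]):
        residual = x - y[: i + 1] @ W[: i + 1]   # x - sum_{l<=i} y_l * w_l
        W[i] += eta * y[i] * residual
    return W
```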

Page 23:

Rubner & Tavan Network – 1989 (1)

Single-layer

One-way connections

Weights:
– input layer → calculation layer, according to the Hebbian rule

Internal connections within the calculation layer
– according to the anti-Hebbian rule

$$\Delta w_{ij} = \eta\, x_j\, y_i$$

$$\Delta v_{ij} = -\eta\, y_i\, y_j$$

Page 24:

Rubner & Tavan Network – 1989 (2)

[Figure: Rubner & Tavan network – inputs x1 … x5, outputs y1 … y4, feedforward weights w11 … w45, lateral connections v21, v31, v32, v41, v42, v43]

Page 25:

Picture Compression for PCA

A large amount of input data is substituted by a smaller amount, combined in the vector y and the weights Wi

Level of compression – number of PCA components
– main factor of the restored picture quality

More principal components:
– better quality
– lower compression level

Picture restored based on:
– 2 principal components
– compression level: 28

Page 26:

Self-Organising Artificial Neural Networks

Inter-neuron actions

Goal: input signals mapped into output signals

Similar input data are grouped

Groups are separated

Kohonen neural network – leader!

T. Kohonen from Finland!

Page 27:

Concurrent Learning

WTA – Winner Takes All WTM – Winner Takes Most

[Figure: competitive layer – input vector X, weight matrix W, output vector Y]

Page 28:

WTA (1)

Single layer of working neurons

The same input signals xj are loaded to all competitive neurons

Starting weight values are random

Each neuron calculates the product:

The winner is … the neuron with a maximum output!

The winner neuron – final output equals 1

Other neurons set output values to 0

$$u_i = \sum_j w_{ij}\, x_j$$

Page 29:

WTA (2)

The first presentation of the learning vectors is the basis for pointing out the winner neuron

Weights are modified by the Grossberg rule

If similar learning vectors activate the same winner neuron, the winner's weights converge to the mean values of the input signals

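A minimal WTA training step, assuming the competitive weights form the rows of a matrix and that only the winner's weights are nudged towards the input (one common form of the Grossberg-type update described above); the learning rate is illustrative:

```python
import numpy as np

def wta_step(W, x, eta=0.1):
    """One Winner-Takes-All step: the neuron with the largest product
    u_i = sum_j w_ij x_j wins (output 1, all others 0), and only its
    weights are pulled towards the input, so they drift to the mean
    of similar inputs."""
    u = W @ x                           # products for all competitive neurons
    winner = int(np.argmax(u))          # the winner is the neuron with maximum output
    W[winner] += eta * (x - W[winner])  # move the winner's weights towards the input
    return winner, W
```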

Page 30:

WTM (1)

Winner selection like in WTA

Winner’s output is maximum

Winner activates the neighbourhood neurons

Distance from the winner drives the level of activation

Level of activation is a part of weight tuning algorithm

All weights are modified during learning algorithm

Page 31:

Neurons Neighbourhood (1)

Neurons as nodes of regular network

Central neuron – in the middle of the region

Neighbourhood neurons in the closest columns and rows

simple neighbourhood sophisticated neighbourhood

Page 32:

Neurons Neighbourhood (2)

2-D neighbourhood

1-D neighbourhood

Neighbourhood function h(r)

distance function between each neuron and the winner

defines the necessary parameters for weights tuning

$$h(r) = \frac{1}{r} \qquad \text{or} \qquad h(r) = e^{-r^2}$$

r – distance between the winner and neurons in the neighbourhood

Page 33:

Grossberg Rule

neighbourhood around the winning neuron,

size of neighbourhood decreases with iteration,

modulation of learning rate by frequency sensitivity.

Neighbourhood function = Mexican Hat:

a – neighbourhood parameter, r – distance from the winner neuron to each single neuron

$$h(i, j, i_w, j_w) = \begin{cases} 1 & \text{for } 0 \le r \le a \\ a \sin(\ldots) & \text{for } a < r \le 2a \\ 0 & \text{for other values of } r \end{cases}$$

The Grossberg rule:

$$w_{lij}(k+1) = w_{lij}(k) + \eta(k)\, h(i, j, i_w, j_w) \left[\, x_l - w_{lij}(k) \,\right]$$

k – iteration index, η – learning rate function, x_l – component of the input learning vector,
w_lij – weight associated with the proper connection, h – neighbourhood function,
(i_w, j_w) – indexes related to the winner neuron, (i, j) – indexes related to a single neuron
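The Grossberg rule combined with a neighbourhood function can be sketched for a 2-D Kohonen grid as below. This is only an illustration: the winner is picked by the smallest distance to the input, the neighbourhood uses the Gaussian h(r) = e^(−r²) from the earlier slide rather than the Mexican hat, and all parameter values are arbitrary assumptions:

```python
import numpy as np

def som_step(W, x, eta, a=1.0):
    """One WTM/Kohonen step on a 2-D grid of neurons.
    W has shape (rows, cols, n): W[i, j] is the weight vector of neuron (i, j).
    A Gaussian h(r) = exp(-(r/a)^2) stands in for the neighbourhood function."""
    dist = np.linalg.norm(W - x, axis=2)                      # distance of every neuron to the input
    iw, jw = np.unravel_index(np.argmin(dist), dist.shape)    # winner indexes (i_w, j_w)

    ii, jj = np.indices(dist.shape)
    r = np.sqrt((ii - iw) ** 2 + (jj - jw) ** 2)              # grid distance r to the winner
    h = np.exp(-(r / a) ** 2)                                 # neighbourhood function h(r)

    # Grossberg rule: w(k+1) = w(k) + eta * h * (x - w(k)), applied to every neuron
    W += eta * h[..., None] * (x - W)
    return (iw, jw), W
```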