Emilio Kropff Thesis presentation September 19, 2007


Statistical and dynamical properties of large cortical network models: insights into semantic memory and language

Emilio Kropff

Thesis presentation

September 19, 2007

Cerebral cortex – Braitenberg & Schüz, 1991

# of neurons >> # of input fibers

No preferred direction in the connections

Mostly excitatory synapses

Great convergence & divergence

Connections are very weak

Modifiable synapses

Two-level associative memory with formation of cell assemblies

Thesis presentation. September 19th, 2007

Potts Networks – Latching – Correlated patterns

Emilio Kropff, LIMBO-CNS-SISSA

- Pattern #2 active

- No activity

- Pattern #1 active

- Pattern #3 active

Hebbian Learning !!

Auto-associative memories


Testing the memory

- Pattern #2 active


Testing the memory

Network damage


Two-level associative memory with formation of cell assemblies


• Anatomical studies – Braitenberg & Schüz

• Embodied theories of semantic memory

• Feature representation

The model: the trick

[Figure: a Potts unit with states 0, 1, 2, 3, ..., S.]

• S: number of alternative states of a unit

• a (global sparseness): average number of units that are active in a global memory


• S: number of alternative features of a unit

• a (global sparseness): average number of features describing a concept

Testing Kanter’s Potts network (Kanter, 1988)

• The state of the network is set to some initial value (e.g., a random state, or a stored memory if we want to test its stability).

• A unit i is picked randomly and the fields hik are calculated.

• The S states of unit i are updated following:


• The patterns ξ are constructed randomly and stored in the network by modifying J.

[Equation not recoverable from the transcript: a relation between p_max, N and S that involves erf and log terms.]

Reviewing and extending the results of Kanter, 1988

• Kanter’s result for low S is p_max ~ S(S−1)

• We find that the high-S behaviour is p_max ~ S(S−1)/log(S)

[Equation not recoverable from the transcript; the surviving symbols are h, m, v, z, U, q, P, k, a, S and C.]

Adding a zero state and sparseness a

Sparse Hopfield (Buhmann, 1989; Tsodyks, 1988): αc ~ 1/a

Potts without sparseness (Kanter, 1988): αc ~ 0.14 S(S−1)

Sparse Potts (Kropff & Treves, 2005): αc ~ S²/a

• Two units speak to each other with probability cM/N

• Two states speak to each other with probability e
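The two dilution levels above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the slides: connections are drawn independently, self-connections are excluded, and the helper names `sample_unit_connections` and `effective_load` are hypothetical. The load is taken to be the ratio p/(cM e) that the talk quotes for highly diluted networks.

```python
import random


def sample_unit_connections(n_units, c_m, seed=0):
    """Map each unit to the set of units it listens to.

    Assumption: every ordered pair (i, j), i != j, is connected
    independently with probability c_m / n_units.
    """
    rng = random.Random(seed)
    prob = c_m / n_units
    return {
        i: {j for j in range(n_units) if j != i and rng.random() < prob}
        for i in range(n_units)
    }


def effective_load(p, c_m, e):
    """Ratio p / (c_m * e): stored patterns per effective state-to-state connection."""
    return p / (c_m * e)


connections = sample_unit_connections(n_units=200, c_m=20, seed=1)
mean_inputs = sum(len(pre) for pre in connections.values()) / len(connections)
load = effective_load(p=100, c_m=20, e=0.5)
```

With these toy values each unit receives about cM = 20 inputs on average, and halving e doubles the load for the same number of patterns.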

[Equations not recoverable from the transcript; the only legible fragment is α_eff = p/(cM e).]

Highly diluted approximation


• If well defined, a sparse Potts network reaches an optimal storage capacity

• The Kanter result for Potts networks has a logarithmic correction for high S.

• In highly diluted networks this result applies, with α = p/(cM e).

• These results are in line with the conjecture of a limit in the amount of information per synapse that a network can store.

[Figure: three panels showing the overlaps between the state and the patterns (patterns 1 to 5) and the overall activity, versus time (cycles, 100 to 400). Panel labels: retrieval; + adaptation; + correlation: latching!]

Dynamics of the network


• In addition, the transition matrix is not symmetric


Dynamics: latching of 2 patterns

[Figure: overlaps between the state and the patterns versus time (cycles, 200 to 800).]

[Figure: four panels with ticks from 0 to 100 and the labels "First Pattern" and "Maximum at t1"; values range from 0 to 1.]

[Figure: three time courses (100 to 600, values from -0.2 to 1), each annotated with the units involved:]

• Units active in one of the two patterns; units active in both in the same state

• Units active in one of the two patterns; units active in both in the same state; weakly, units active in both patterns in different states

• Units active in both patterns in different states

[Figure: regions labelled ‘alternative features’, ‘pathological case’ and ‘shared features’, on axes "c: shared units" and "d: shared features".]

Does latching present a natural rudimentary grammar?


• However, the transition matrix is not symmetric, which means that there are other important factors.

• Correlation seems to be at least one of the main properties determining latching.

• The equilibrium between global inhibition and local self excitation can control the complexity of symbolic chains.

• Three types of latching transition exist, each in a restricted region of parameters. Do they organize in time?

Category specific deficits

• Patients were found with a significant impairment in their knowledge about living things (animals + foodstuffs) as opposed to manmade artifacts (Warrington & Shallice, 1984).

• Impairment for nonliving has also been reported → double dissociation. Current ratio: 23% vs 77% (Capitani, 03)


• The selective impairments respond to differences in the networks representing different categories: sensory/functional theory (Warrington & Shallice, 84), domain-specific hypothesis (Caramazza & Shelton, 98).

• The network sustaining semantic memory is quite homogeneous, but different categories have different typical correlation properties (McRae et al., 97; Tyler & Moss, 01; Sartori & Lombardi, 04).

Theoretical accounts

Hopfield memories

• Solution #1: Orthogonalize the patterns before feeding the network (e.g., dentate gyrus in the hippocampus).

• If patterns are randomly correlated (Tsodyks, 88),

• However, if patterns have a non-trivial structure of correlations, the storage capacity collapses.

• In semantic memory correlation between stored patterns seems to play a major role.


Solution #2 ??

• We assume that pattern 1 is being retrieved

• We split hi into the contribution of pattern 1 (signal) and the rest (noise)

• We minimize the noise
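The signal-to-noise split in the steps above can be illustrated with a toy binary network. This is a sketch under assumptions not stated on the slide: binary 0/1 patterns, the covariance Hebbian rule J_ij = Σ_μ (ξ_i^μ − a)(ξ_j^μ − a) without normalization, no self-connections, and the network state set exactly to pattern 1. The function names are hypothetical.

```python
import random


def hebbian_weights(patterns, a):
    """J_ij = sum_mu (xi_i^mu - a)(xi_j^mu - a), with J_ii = 0 (assumed)."""
    n = len(patterns[0])
    J = [[0.0] * n for _ in range(n)]
    for xi in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    J[i][j] += (xi[i] - a) * (xi[j] - a)
    return J


def field(J, state):
    """h_i = sum_j J_ij * state_j."""
    return [sum(w * s for w, s in zip(row, state)) for row in J]


rng = random.Random(0)
N, p, a = 60, 5, 0.2
patterns = [[1 if rng.random() < a else 0 for _ in range(N)] for _ in range(p)]

state = patterns[0]  # pattern 1 is being retrieved
h = field(hebbian_weights(patterns, a), state)

# Signal: the part of h_i contributed by pattern 1 alone.
signal = field(hebbian_weights(patterns[:1], a), state)
# Noise: everything else, i.e. the contribution of the other p - 1 patterns.
noise = [h_i - s_i for h_i, s_i in zip(h, signal)]
```

Because the weights are a sum over patterns, the noise term equals the field generated by the remaining patterns, which is the quantity a modified learning rule would try to reduce.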


Classical result: Hebbian learning supports uncorrelated memories

Classical result: catastrophe associated with correlated memories

New result: a modification that supports correlated memories

New result: the performance is the same with uncorrelated memories

J_ij = Σ_μ (ξ_i^μ − a)(ξ_j^μ − a)

J_ij = Σ_μ (ξ_i^μ − a_i)(ξ_j^μ − a_j)

popularity: a_k = (1/p) Σ_μ ξ_k^μ
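A small Python sketch may help to compare the two rules. It assumes binary 0/1 patterns, no self-connections and no normalization of the weights; the function names and the toy patterns are hypothetical and are not taken from the thesis.

```python
def popularity(patterns):
    """a_k = (1/p) * sum_mu xi_k^mu for every neuron k."""
    p = len(patterns)
    n = len(patterns[0])
    return [sum(xi[k] for xi in patterns) / p for k in range(n)]


def weights(patterns, thresholds):
    """J_ij = sum_mu (xi_i^mu - t_i)(xi_j^mu - t_j), with J_ii = 0 (assumed).

    thresholds = [a, a, ..., a] gives the rule with the global sparseness a;
    thresholds = popularity(patterns) gives the rule with a_i and a_j.
    """
    n = len(thresholds)
    J = [[0.0] * n for _ in range(n)]
    for xi in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    J[i][j] += (xi[i] - thresholds[i]) * (xi[j] - thresholds[j])
    return J


# Toy correlated patterns: neuron 0 is active in every pattern.
patterns = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
]
a_k = popularity(patterns)
a = sum(a_k) / len(a_k)  # global sparseness

J_global = weights(patterns, [a] * len(a_k))
J_popularity = weights(patterns, a_k)
```

In this toy case the always-active neuron 0 keeps non-zero weights under the rule with a global a, while under the popularity-based rule all of its weights vanish, since ξ_0^μ − a_0 = 0 for every pattern.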

Properties with finite α, C ≈ ln(N)

GAUSSIAN noise (if there is independence between neurons i and j).

...

[Figure: F(x); annotation: (uncorrelated patterns).]

If F(x) decays fast enough

If F(x) decays exponentially

If F(x) decays as a power law

Storing memories

Damaging the network

[Figure: performance (0 to 1) versus C (# of afferent connections per neuron), with the value Cc marked. Storage capacity ~ fixed p.]

[Figure: number of neurons (10 to 70) versus popularity a_i (0.2 to 0.4), with the values Cc1 and Cc2 marked.]

Entropy: Sf = Σ_i a_i (1 − a_i), summed over the active neurons in the pattern

[Figure: axis labels "Connectivity", "Entropy Sf of objects" and "Probability distribution"; curves labelled "living" and "non living".]

Category specific effects

McRae feature norms

• 541 concepts described in terms of 2526 features

• ξ_i^μ = 1 if feature i is included in the description of concept μ, and ξ_i^μ = 0 otherwise
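As an illustration of the entropy Sf defined above, the following Python sketch computes it for a tiny concept-by-feature matrix. The data are invented for the example (they are not the McRae norms), and the function names are hypothetical.

```python
def popularity(concepts):
    """a_i = fraction of concepts whose description includes feature i."""
    p = len(concepts)
    n = len(concepts[0])
    return [sum(c[i] for c in concepts) / p for i in range(n)]


def entropy_sf(concept, a):
    """Sf = sum of a_i * (1 - a_i) over the features active in the concept."""
    return sum(a_i * (1 - a_i) for x, a_i in zip(concept, a) if x == 1)


# Rows are concepts, columns are features (toy data).
concepts = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 0],
]
a = popularity(concepts)                      # [1.0, 0.5, 0.25, 0.25]
sf = [entropy_sf(c, a) for c in concepts]     # [0.25, 0.1875, 0.4375, 0.0]
```

A feature shared by every concept (popularity 1) contributes nothing to Sf, so the last concept, described only by that feature, has Sf = 0.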


• The robustness of a memory is inversely related to how informative it is (Sf).

• When memories are correlated, they have variable degrees of resistance to damage.

• In addition, popular neurons negatively affect the general performance (decay of F(x)).

• These results show how the current trend in category specific deficits (‘living’ weaker than ‘non living’) could emerge even in a purely homogeneous network.

• A single cortical network with Potts units, including adaptation and storing correlated patterns of activity in its long range synapses, presents all the properties studied in this thesis.

Thank you!

McRae’s feature norms

• In the semantic memory literature, auto-associative networks are often presented as weak models. Why?

• To convince psychologists one must show an auto-associative memory that is able to store feature norms.

[Figure: # of features versus popularity, for McRae’s feature norms.]

[Figure: performance of the network versus size of the subgroup of patterns, for McRae’s feature norms; curves: theoretical prediction, simulations.]

[Figure: a (average sparseness) and If (average information) versus number of patterns p in the subgroup, for McRae’s feature norms.]

McRae’s feature norms

Why does the real network perform poorly?

• Independence between features is not valid (e.g., beak and wings). Is this effect strong enough? If it is, there would be a storage capacity collapse.

• The system works but the approximation of diluted connectivity is not good.

McRae’s feature norms: the full solution

[Equation not recoverable from the transcript; only the fragment "+ 2+ 3+ ..." survives.]

[Figure: performance of the network versus size of the subgroup of patterns; curves: highly diluted, simulations, full solution.]

McRae’s feature norms: strategies to store more patterns

1- add unpopular neurons

2- eliminate popular neurons

[Figure: % of patterns retrieved (0 to 100) versus total number of neurons; panel a: 3000 to 6000, panel b: 2490 to 2520.]

3- recombination

4- popularity dependent connectivity

Neurons i and j have high popularity: their coincidence will be less popular. If applied massively, this principle could change the whole distribution.
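The statement about coincidences can be checked on a toy example. The sketch below reads "coincidence" as a hypothetical unit that is active only when both neurons i and j are active (a logical AND); this reading is an assumption, and the activity values are invented.

```python
def popularity(activity):
    """Fraction of patterns in which a neuron is active."""
    return sum(activity) / len(activity)


def coincidence(activity_i, activity_j):
    """Activity of a unit that fires only when both i and j fire (assumed AND)."""
    return [x & y for x, y in zip(activity_i, activity_j)]


# Activity of two popular neurons across six patterns (toy data).
neuron_i = [1, 1, 1, 0, 1, 1]
neuron_j = [1, 1, 0, 1, 1, 0]

a_i = popularity(neuron_i)                          # 5/6
a_j = popularity(neuron_j)                          # 4/6
a_ij = popularity(coincidence(neuron_i, neuron_j))  # 3/6
```

The coincidence can never be more popular than the less popular of the two neurons, and it is strictly less popular whenever the two neurons are not active in exactly the same patterns.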

The (episodic) memory pyramid

Primary cortex: sensory motor areas

Unimodal and polymodal association areas

Perirhinal and parahippocampal cortex

Entorhinal cortex

Hippocampus

Coding

Consolidation

popularity calculation

orthogonalization

final storage

Properties with α ≈ 0, C ≈ ln(N)

[Figure: only the label "ac" survives.]

• If you want to be an attractor, you should pick at least some unpopular units.

• Lowering U can make any pattern retrievable -> ATTENTION

Dynamics: latching of 2 patterns

[Figure: four panels with ticks from 0 to 100 and the labels "First Pattern" and "Maximum at t1" (values from 0 to 1), followed by three time courses (100 to 600, values from -0.2 to 1), each annotated with the units involved:]

• Units active in one of the two patterns; units active in both in the same state

• Units active in one of the two patterns; units active in both in the same state; weakly, units active in both patterns in different states

• Units active in both patterns in different states

[Figure: regions labelled ‘alternative features’, ‘pathological case’ and ‘shared features’, on axes "c: alternative features" and "d: shared features".]