Post on 03-Aug-2020

Statistical physics on random graphs and its applications

Lenka Zdeborová

Lenka’s four-line CV

Currently: CR2, IPhT, Saclay

2008-2010: Director’s Postdoctoral Fellow, Los Alamos National Laboratory

2005-2008: PhD, LPTMS, University Paris-Sud, Orsay; supervisor Marc Mezard

1999-2004: master’s degree in theoretical physics, Charles University in Prague

Statistical Physics of Complex Systems

Systems: composed of a large number of interacting elements, often disordered, living on random or real networks, out of equilibrium (e.g. matter, optimization problems, power grid, internet, market, neurons ...)

Goal: Understand and predict global properties based on local interactions (or vice versa).

Methods: Analytic, rigorous, toy models, + simulations & algorithms

Lenka Zdeborová, Florent Krzakala (ESPCI), Aurelien Decelle (LPTMS)

Detection of functional modules from network topology

Popular example

62 bottlenose dolphins living in Doubtful Sound in New Zealand, observed by Lusseau (2003).

Edge if the pair is seen together more often than expected by chance.

The group split into two groups after one dolphin left the place.

The structure of the two groups can be predicted from the network topology (Arenas, Fernandez, Gomez’2008)

State of the art

Hundreds of papers on the topic (Newman, Girvan’02, ...).

Focus on communities: nodes of the same kind tend to be together. Not useful in many cases, e.g. food webs, adjacency of words in text.

Current methods are unable to tell that a random graph does not have any communities. E.g.: for the Ising model on random graphs of degree 3, the best bisection has only about 11.4% of edges between the two groups.

Missing measures of significance, estimate of probability of error.

Need for more fundamental and formal approach!

Block model

$q$ groups, $N$ nodes

$n_a$, $a = 1, \dots, q$: proportion of nodes in group $a$

$p_{ab} = c_{ab}/N$: probability that an edge is present between a node from group $a$ and another from group $b$

Generate a random network as follows:

Example: $n_1 = 7/12$, $n_2 = 5/12$, $p_{11} = p_{22} = 0.39$, $p_{12} = p_{21} = 0.14$
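The generative step can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `sample_block_model`, the choice `N = 12`, and the decision to draw each node's group independently with probability $n_a$ (rather than fixing the group sizes exactly) are all assumptions made here.

```python
import numpy as np

def sample_block_model(N, n, p, seed=0):
    """Sample group labels and a symmetric adjacency matrix from the block model."""
    rng = np.random.default_rng(seed)
    n = np.asarray(n, dtype=float)
    p = np.asarray(p, dtype=float)
    # Assumption: each node joins group a independently with probability n[a].
    q = rng.choice(len(n), size=N, p=n)
    # probs[i, j] = p[q_i, q_j], the edge probability of the pair (i, j).
    probs = p[q][:, q]
    # Draw each unordered pair once (strict upper triangle), then symmetrize.
    upper = np.triu(rng.random((N, N)) < probs, k=1)
    A = (upper | upper.T).astype(int)
    return q, A

# Parameters from the slide; N = 12 is an illustrative choice.
q, A = sample_block_model(N=12, n=[7/12, 5/12],
                          p=[[0.39, 0.14], [0.14, 0.39]])
```

The returned `q` plays the role of the original group assignment that Question 1 tries to recover, and `A` is the only thing available in Question 2.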

Question 1: Given $q, \{n_a\}, \{p_{ab}\}$, what is the best possible guess for the original group assignment?

Question 2: Given only the graph, what is the best guess for $q, \{n_a\}, \{p_{ab}\}$?

Question 2: Given only the graph, what is the best guess for $\{n_a, p_{ab}\}$?

$$P(\{n_a, p_{ab}\} \mid G) = \frac{P(\{n_a, p_{ab}\})}{P(G)}\, P(G \mid \{n_a, p_{ab}\}) = \frac{P(\{n_a, p_{ab}\})}{P(G)} \sum_{\{q_i\}} P(G, \{q_i\} \mid \{n_a, p_{ab}\})$$

$$P(G, \{q_i\} \mid \{n_a, p_{ab}\}) = \prod_{i=1}^{N} n_{q_i} \prod_{ij} p_{q_i q_j}^{A_{ij}} \left(1 - p_{q_i q_j}\right)^{1 - A_{ij}}$$

$$Z(\{n_a, p_{ab}\}) \equiv \sum_{\{q_i\}} P(G, \{q_i\} \mid \{n_a, p_{ab}\})$$

Maximize $Z$ to learn $\{n_a, p_{ab}\}$.
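For a very small graph the sum over assignments can be evaluated by brute force, which makes the "maximize to learn" idea concrete. The sketch below is only an illustration: the function names and the toy graph are hypothetical, and the pair product is taken over unordered pairs $i < j$, which is one possible reading of the "ij" index on the slide.

```python
import itertools
import numpy as np

def log_joint(A, q, n, p):
    """log P(G, {q_i} | {n_a, p_ab}), with the product over unordered pairs i < j."""
    n, p, q = np.asarray(n), np.asarray(p), np.asarray(q)
    P = p[q][:, q]                        # P[i, j] = p_{q_i q_j}
    terms = A * np.log(P) + (1 - A) * np.log(1 - P)
    iu = np.triu_indices(len(q), k=1)     # count each pair once
    return np.log(n[q]).sum() + terms[iu].sum()

def partition_function(A, n, p):
    """Z = sum over all q^N assignments; only feasible for very small N."""
    N, groups = A.shape[0], len(n)
    return sum(np.exp(log_joint(A, np.array(s), n, p))
               for s in itertools.product(range(groups), repeat=N))

# Hypothetical toy graph: two triangles joined by a single edge.
A = np.zeros((6, 6), dtype=int)
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1

n = [0.5, 0.5]
Z_assortative = partition_function(A, n, [[0.8, 0.2], [0.2, 0.8]])
Z_uniform = partition_function(A, n, [[0.5, 0.5], [0.5, 0.5]])
```

On this toy graph the assortative parameters give the larger $Z$, so they would be preferred over the structureless ones. For realistic $N$ the $q^N$ terms cannot be enumerated, which is why the talk turns to Monte Carlo and belief propagation.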

Equilibrium statistical physics of

$$-H(\{q_i\}) = \sum_{i=1}^{N} \log n_{q_i} + \sum_{ij} \left[ A_{ij} \log p_{q_i q_j} + (1 - A_{ij}) \log \left(1 - p_{q_i q_j}\right) \right]$$

$$= \sum_{i=1}^{N} \log n_{q_i} + \sum_{(ij) \in E} \log \frac{p_{q_i q_j}}{1 - p_{q_i q_j}} + \sum_{a,b=1}^{q} N_a N_b \log \left(1 - p_{ab}\right)$$

where $N_a$ is the number of nodes assigned to group $a$.

Partition function maximized if and only if:

$$\frac{1}{N} \left\langle \sum_{i} \delta_{a,q_i} \right\rangle = n_a$$

$$\frac{1}{N^2} \left\langle \sum_{(ij) \in E} \delta_{a,q_i} \delta_{b,q_j} \right\rangle = p_{ab}\, n_a n_b$$

quenched energy = annealed energy (Nishimori condition)
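The step between the two forms of $-H(\{q_i\})$ can be spelled out. A short derivation, assuming the pair sum runs over all ordered pairs, that $(ij) \in E$ is read in the same convention, and writing $N_a = \sum_i \delta_{a,q_i}$ for the number of nodes assigned to group $a$:

```latex
\begin{align}
A_{ij}\log p_{q_i q_j} + (1 - A_{ij})\log(1 - p_{q_i q_j})
  &= A_{ij}\log\frac{p_{q_i q_j}}{1 - p_{q_i q_j}} + \log(1 - p_{q_i q_j}), \\
\sum_{ij} A_{ij}\log\frac{p_{q_i q_j}}{1 - p_{q_i q_j}}
  &= \sum_{(ij)\in E}\log\frac{p_{q_i q_j}}{1 - p_{q_i q_j}}, \\
\sum_{ij}\log(1 - p_{q_i q_j})
  &= \sum_{a,b=1}^{q} N_a N_b \log(1 - p_{ab}).
\end{align}
```

The first term survives only on edges, while the second depends on the pair only through its group labels, so it can be collected group by group.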

Learning of parameters

(1) Compute the averages:
➡ With Monte Carlo (detailed balance)
➡ With belief propagation (= Bethe-Peierls = TAP equations = cavity method), faster

$$\psi^{i \to j}_{q_i} = \frac{1}{Z^{i \to j}}\, n_{q_i}\, e^{-h_{q_i}} \prod_{k \in \partial i \setminus j} \left[ \sum_{q_k} c_{q_k q_i}\, \psi^{k \to i}_{q_k} \right]$$

$$h_{q_i} = \frac{1}{N} \sum_{k} \sum_{q_k} c_{q_k q_i}\, \psi^{k}_{q_k}, \qquad p_{ab} = \frac{c_{ab}}{N}$$

(2) Update parameters as

$$\frac{1}{N} \left\langle \sum_{i} \delta_{a,q_i} \right\rangle = n_a$$

$$\frac{1}{N^2} \left\langle \sum_{(ij) \in E} \delta_{a,q_i} \delta_{b,q_j} \right\rangle = p_{ab}\, n_a n_b$$

(3) Repeat till convergence.
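The three steps can be sketched as a small, unoptimized Python loop. This is an illustration under stated assumptions, not the authors' implementation: the function names, the random message initialization, the once-per-sweep refresh of the field $h$, the edge two-point marginal proportional to $c_{ab}\,\psi^{i\to j}_a \psi^{j\to i}_b$ counted in both orientations, the test graph, and the starting guess for $c_{ab}$ are all choices made here.

```python
import numpy as np

def belief_propagation(edges, N, n, c, sweeps=20, seed=0):
    """Run BP sweeps; return node marginals psi[i, a] and the edge messages."""
    rng = np.random.default_rng(seed)
    n, c = np.asarray(n, float), np.asarray(c, float)
    nbrs = [[] for _ in range(N)]
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    # msg[(i, j)] holds psi^{i -> j}, a probability vector over groups.
    msg = {}
    for i in range(N):
        for j in nbrs[i]:
            m = rng.random(len(n))
            msg[(i, j)] = m / m.sum()
    psi = np.tile(n, (N, 1))
    for _ in range(sweeps):
        # Field h_a = (1/N) sum_k sum_b c[b, a] psi[k, b], refreshed once per sweep.
        h = psi.sum(axis=0) @ c / N
        prior = n * np.exp(-h)
        for i in range(N):
            # Factor sent by neighbour k: sum_b c[b, a] psi^{k -> i}_b.
            inc = {k: msg[(k, i)] @ c for k in nbrs[i]}
            for j in nbrs[i]:
                new = prior.copy()
                for k in nbrs[i]:
                    if k != j:
                        new *= inc[k]
                msg[(i, j)] = new / new.sum()
            full = prior.copy()
            for k in nbrs[i]:
                full *= inc[k]
            psi[i] = full / full.sum()
    return psi, msg

def update_parameters(edges, N, c, psi, msg):
    """Step (2): re-estimate n_a and c_ab = N p_ab from the BP averages."""
    n_new = psi.mean(axis=0)
    acc = np.zeros_like(c)
    for i, j in edges:
        # Assumed edge marginal, proportional to c_ab psi^{i->j}_a psi^{j->i}_b.
        joint = c * np.outer(msg[(i, j)], msg[(j, i)])
        joint /= joint.sum()
        acc += joint + joint.T          # count both orientations of the edge
    return n_new, acc / (N * np.outer(n_new, n_new))

# Hypothetical test graph: two equal groups, c_in = 14, c_out = 2.
rng = np.random.default_rng(1)
N = 150
truth = rng.integers(0, 2, size=N)
c_true = np.array([[14.0, 2.0], [2.0, 14.0]])
edges = [(i, j) for i in range(N) for j in range(i + 1, N)
         if rng.random() < c_true[truth[i], truth[j]] / N]

# Steps (1)-(3), starting from a deliberately imprecise guess.
n, c = np.array([0.5, 0.5]), np.array([[10.0, 4.0], [4.0, 10.0]])
for _ in range(3):
    psi, msg = belief_propagation(edges, N, n, c)
    n, c = update_parameters(edges, N, c, psi, msg)
```

A fixed number of outer iterations stands in for a proper convergence check, and the dictionary of messages is chosen for readability rather than speed.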

Question 1: Given $\{n_a\}, \{p_{ab}\}$, what is the best possible guess for the original group assignment?

Bayes optimal inference (in error correcting codes by Nishimori’93, Sourlas’94):
(1) Compute marginals (local magnetizations).
(2) For each node take the most probable value.

The overlap between the inferred and the original assignment is maximized at the right value of $\{n_a\}, \{p_{ab}\}$.

[Figure: overlap (vertical axis, 0.914 to 0.916) versus a model parameter (horizontal axis, 0.1 to 0.4).]
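The two inference steps and a simple agreement score can be written down directly. The slides do not define the overlap, so the score below (fraction of correctly labelled nodes, maximized over relabelings of the groups) is an assumption made for illustration, as are the function names and the toy marginals.

```python
import itertools
import numpy as np

def bayes_assignment(marginals):
    """For each node take the most probable group (argmax of its marginal)."""
    return np.asarray(marginals).argmax(axis=1)

def agreement(guess, truth, groups):
    """Fraction of correctly labelled nodes, maximized over group relabelings."""
    guess, truth = np.asarray(guess), np.asarray(truth)
    best = 0.0
    for perm in itertools.permutations(range(groups)):
        relabeled = np.array(perm)[guess]
        best = max(best, float((relabeled == truth).mean()))
    return best

# Hypothetical marginals for four nodes and two groups.
marginals = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]]
guess = bayes_assignment(marginals)
score = agreement(guess, truth=[1, 0, 1, 1], groups=2)
```

The maximization over permutations is needed because group labels are only defined up to relabeling; enumerating all $q!$ permutations is fine for the small $q$ used in the examples.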

Example I: ferromagnet = communities

$n_a = \frac{1}{q}$, $c_{aa} = c_{in}$, $c_{a \neq b} = c_{out}$, $cq = c_{in} + (q - 1)\, c_{out}$

$q = 4$, $c = 16$

[Figure: overlap (0 to 1) versus $c_{out}/c_{in}$ (0 to 1), from separated groups at $c_{out}/c_{in} = 0$ to a random graph at $c_{out}/c_{in} = 1$. The ferromagnetic region is labelled "detection easy", the paramagnetic region "detection impossible".]

Paramagnetic phase: a random graph was created (Achlioptas, Coja-Oghlan’08). Zero overlap between an equilibrium configuration and the original one. Learning impossible.

Ferromagnetic phase: the network contains information about modules, equilibration easy (Nishimori line: no RSB, no glass).

Detection is easy when $|c_{in} - c_{out}| > q\sqrt{c}$ and impossible otherwise (de Almeida-Thouless condition).
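The condition is easy to evaluate for the parameters of this example. The sketch below reads it as "detectable when $|c_{in} - c_{out}| > q\sqrt{c}$" and solves for the boundary in the assortative case $c_{in} > c_{out}$; the function names are hypothetical.

```python
import math

def detectable(c_in, c_out, q):
    """True when |c_in - c_out| > q * sqrt(c), with c the average degree."""
    c = (c_in + (q - 1) * c_out) / q      # from c*q = c_in + (q-1)*c_out
    return abs(c_in - c_out) > q * math.sqrt(c)

def critical_ratio(q, c):
    """c_out / c_in on the assortative boundary c_in - c_out = q * sqrt(c)."""
    c_out = c - math.sqrt(c)              # substitute c_in = c_out + q*sqrt(c)
    c_in = c_out + q * math.sqrt(c)
    return c_out / c_in

ratio = critical_ratio(q=4, c=16)         # c_in = 28, c_out = 12, ratio = 3/7
```

For $q = 4$, $c = 16$ the boundary sits at $c_{out}/c_{in} = 3/7 \approx 0.43$ on the horizontal axis of the plot.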

Example II: anti-ferromagnet = coloring

$q = 5$, $n_a = \frac{1}{q}$, $c_{aa} = 0$, $c_{a \neq b} = \frac{cq}{q - 1}$

[Figure: overlap (0 to 1) versus $c$ (12 to 17), with regions labelled impossible, hard, easy.]

Impossible: random graph created, paramagnetic phase.

Hard: equilibrium solution correlated with the planted configuration, but hidden in a 1RSB phase.

Easy: planted configuration attractive beyond the AT condition.

$c_K = 13.23$, $c_{AT} = 16$

Values of phase transitions the same as in random graph coloring (Zdeborova, Krzakala’07).
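As a consistency check on the value $c_{AT} = 16$, one can plug the Example II parameters into the stability condition $|c_{in} - c_{out}| = q\sqrt{c}$ quoted for Example I. This is a sketch that assumes the same condition carries over unchanged to the anti-ferromagnetic case:

```latex
\begin{align}
c_{in} = 0,\quad c_{out} = \frac{cq}{q-1}
  \;&\Rightarrow\; \frac{cq}{q-1} = q\sqrt{c}
  \;\Rightarrow\; \sqrt{c} = q - 1, \\
c_{AT} &= (q-1)^2 = 16 \quad \text{for } q = 5.
\end{align}
```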

Conclusion

Using basic properties of (planted) Potts and spin glass models gives us a fundamental approach and new algorithms for module detection in networks.

Currently (with Mark Newman and Cris Moore): using a slightly more realistic model (correcting for the observed degree distribution) for the analysis of real networks.