Statistical physics on random graphs and its applications
Lenka Zdeborová
Lenka's four-line CV
Currently: CR2, IPhT, Saclay
2008-2010: Director's Postdoctoral Fellow, Los Alamos National Laboratory
2005-2008: PhD, LPTMS, University Paris-Sud, Orsay; supervisor Marc Mézard
1999-2004: master's degree in theoretical physics, Charles University in Prague
Statistical Physics of Complex Systems
Systems: composed of a large number of interacting elements often disordered, living on random or real networks, out of equilibrium (e.g. matter, optimization problems, power grid, internet, market, neurons ...)
Goal: Understand and predict global properties based on local interactions (or vice versa).
Methods: Analytic, rigorous, toy models, + simulations & algorithms
Lenka Zdeborová, Florent Krzakala (ESPCI), Aurelien Decelle (LPTMS)
Detection of functional modules from network topology
Popular example
62 bottlenose dolphins living in Doubtful Sound, New Zealand, observed by Lusseau (2003).
An edge is drawn if the pair is seen together more often than expected by chance.
The group split into two after one dolphin left the place.
The structure of the two groups can be predicted from the network topology (Arenas, Fernandez, Gomez’2008).
State of the art
Hundreds of papers on the topic (Newman, Girvan’02, ...).
Focus on communities: nodes of the same kind tend to be together. Not useful in many cases, e.g. food webs, adjacency of words in text.
Current methods are unable to tell that a random graph does not have any communities. E.g.: Ising model on random graphs of degree 3, where the best bisection has only about 11.4% of edges between the two groups.
Missing measures of significance and estimates of the probability of error.
Need for a more fundamental and formal approach!
Block model
Generate a random network as follows:
$q$ groups, $N$ nodes
$n_a$, $a = 1, \dots, q$: proportion of nodes in group $a$
$p_{ab} = c_{ab}/N$: probability that an edge is present between a node from group $a$ and another from group $b$
[Figure: example network with $n_1 = 7/12$, $n_2 = 5/12$, $p_{11} = p_{22} = 0.39$, $p_{12} = p_{21} = 0.14$.]
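The generative procedure above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `generate_block_model` is hypothetical, each node is assumed to draw its group independently with probability $n_a$, and the graph is assumed undirected with no self-loops. The parameter values are the ones quoted on the slide; $N = 120$ is an arbitrary choice.

```python
import numpy as np

def generate_block_model(N, n, p, seed=0):
    """Sample group labels and a symmetric adjacency matrix from the block model."""
    rng = np.random.default_rng(seed)
    n = np.asarray(n, dtype=float)
    p = np.asarray(p, dtype=float)
    # Assumption: each node draws its group a independently with probability n[a].
    groups = rng.choice(len(n), size=N, p=n)
    # Edge probability for every pair (i, j) is p[q_i, q_j].
    edge_prob = p[groups][:, groups]
    # Decide each unordered pair once (strict upper triangle), then symmetrize.
    upper = np.triu(rng.random((N, N)) < edge_prob, k=1)
    A = (upper | upper.T).astype(int)
    return groups, A

groups, A = generate_block_model(
    N=120, n=[7 / 12, 5 / 12], p=[[0.39, 0.14], [0.14, 0.39]]
)
```

The returned `groups` array plays the role of the original (planted) assignment that the two questions below try to recover or learn from.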
Question 1: Given $q$, $\{n_a\}$, $\{p_{ab}\}$, what is the best possible guess for the original group assignment?
Question 2: Given only the graph, what is the best guess for $q$, $\{n_a\}$, $\{p_{ab}\}$?
Question 2: Given only the graph, what is the best guess for $\{n_a, p_{ab}\}$?
$$P(\{n_a, p_{ab}\} \mid G) = \frac{P(\{n_a, p_{ab}\})}{P(G)} \, P(G \mid \{n_a, p_{ab}\}) = \frac{P(\{n_a, p_{ab}\})}{P(G)} \sum_{\{q_i\}} P(G, \{q_i\} \mid \{n_a, p_{ab}\})$$
$$P(G, \{q_i\} \mid \{n_a, p_{ab}\}) = \prod_{i=1}^{N} n_{q_i} \prod_{ij} p_{q_i q_j}^{A_{ij}} \left(1 - p_{q_i q_j}\right)^{1 - A_{ij}}$$
$$Z(\{n_a, p_{ab}\}) \equiv \sum_{\{q_i\}} P(G, \{q_i\} \mid \{n_a, p_{ab}\})$$
Maximize $Z$ to learn $\{n_a, p_{ab}\}$.
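To make the sum over assignments concrete, here is a minimal brute-force sketch. It is not a practical method, since the sum has $q^N$ terms; the function names are hypothetical, the edge product is assumed to run over unordered pairs $i < j$, and all $p_{ab}$ are assumed to lie strictly between 0 and 1.

```python
import itertools
import numpy as np

def log_joint(A, groups, n, p):
    """log P(G, {q_i} | {n_a, p_ab}); edge product assumed over pairs i < j."""
    n, p = np.asarray(n, float), np.asarray(p, float)
    groups = np.asarray(groups)
    iu = np.triu_indices(len(groups), k=1)      # indices of all pairs i < j
    pq = p[groups][:, groups][iu]               # p_{q_i q_j} for each pair
    a = np.asarray(A)[iu]                       # A_ij for each pair
    edge_term = (a * np.log(pq) + (1 - a) * np.log1p(-pq)).sum()
    return np.log(n[groups]).sum() + edge_term

def log_Z_brute_force(A, n, p):
    """log Z, summing over all q^N assignments; only feasible for tiny N."""
    N, q = len(A), len(n)
    terms = [log_joint(A, g, n, p)
             for g in itertools.product(range(q), repeat=N)]
    return np.logaddexp.reduce(terms)

# Two connected nodes with uniform p = 0.3: Z reduces to the
# Erdos-Renyi probability of the single edge, i.e. 0.3.
A_tiny = np.array([[0, 1], [1, 0]])
log_Z = log_Z_brute_force(A_tiny, [0.5, 0.5], [[0.3, 0.3], [0.3, 0.3]])
```

The exponential cost of this enumeration is what motivates the Monte Carlo and belief propagation estimates discussed later in the talk.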
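Equilibrium statistical physics of
$$-H(\{q_i\}) = \sum_{i=1}^{N} \log n_{q_i} + \sum_{ij} \left[ A_{ij} \log p_{q_i q_j} + (1 - A_{ij}) \log\left(1 - p_{q_i q_j}\right) \right]$$
$$= \sum_{i=1}^{N} \log n_{q_i} + \sum_{(ij) \in E} \log \frac{p_{q_i q_j}}{1 - p_{q_i q_j}} + \sum_{a,b=1}^{q} N_a N_b \log\left(1 - p_{ab}\right)$$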
Partition function maximized if and only if:
$$\frac{1}{N} \left\langle \sum_i \delta_{a, q_i} \right\rangle = n_a$$
$$\frac{1}{N^2} \left\langle \sum_{(ij) \in E} \delta_{a, q_i} \delta_{b, q_j} \right\rangle = p_{ab} n_a n_b$$
quenched energy = annealed energy (Nishimori condition)
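Learning of parameters
(1) Compute the averages:
➡ With Monte Carlo (detailed balance)
➡ With belief propagation (= Bethe-Peierls = TAP equations = cavity method), faster
$$\psi^{i \to j}_{q_i} = \frac{1}{Z^{i \to j}} \, n_{q_i} e^{-h_{q_i}} \prod_{k \in \partial i \setminus j} \left[ \sum_{q_k} c_{q_k q_i} \psi^{k \to i}_{q_k} \right]$$
$$h_{q_i} = \frac{1}{N} \sum_k \sum_{q_k} c_{q_k q_i} \psi^{k}_{q_k}, \qquad p_{ab} = \frac{c_{ab}}{N}$$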
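The belief propagation update can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `belief_propagation` is hypothetical, and the sketch assumes a dense symmetric adjacency matrix, strictly positive $c_{ab}$ (so the logarithms are finite), random initial messages, parallel message updates, and a fixed number of sweeps instead of a convergence test.

```python
import numpy as np

def belief_propagation(A, n, c, n_iter=50, seed=0):
    """Iterate the BP equations on graph A; returns node marginals psi[i, a]."""
    rng = np.random.default_rng(seed)
    A = np.asarray(A)
    n, c = np.asarray(n, float), np.asarray(c, float)
    N, q = len(A), len(n)
    neighbors = [np.flatnonzero(A[i]) for i in range(N)]
    # msg[(i, j)] is the message psi^{i->j}: a length-q probability vector.
    msg = {}
    for i in range(N):
        for j in neighbors[i]:
            m = rng.random(q)
            msg[(i, int(j))] = m / m.sum()
    psi = np.tile(n, (N, 1))  # start the marginals at the prior n_a
    for _ in range(n_iter):
        # External field h_a = (1/N) sum_k sum_b c[b, a] psi[k, b].
        h = psi.sum(axis=0) @ c / N
        new_msg = {}
        for i in range(N):
            # log of each incoming factor sum_b c[b, a] psi^{k->i}_b.
            log_in = {int(k): np.log(msg[(int(k), i)] @ c)
                      for k in neighbors[i]}
            total = np.log(n) - h + sum(log_in.values())
            for j in log_in:
                # Cavity message i -> j: leave out the factor coming from j.
                cavity = total - log_in[j]
                w = np.exp(cavity - cavity.max())
                new_msg[(i, j)] = w / w.sum()
            # Full marginal of node i uses all incoming factors.
            w = np.exp(total - total.max())
            psi[i] = w / w.sum()
        msg = new_msg
    return psi

# Sanity check on a 6-node ring with a flat affinity matrix:
# no group structure, so the marginals should stay at the prior.
ring = np.zeros((6, 6), dtype=int)
for i in range(6):
    ring[i, (i + 1) % 6] = ring[(i + 1) % 6, i] = 1
psi_flat = belief_propagation(ring, [0.5, 0.5], [[3.0, 3.0], [3.0, 3.0]])
```

Working in log space and subtracting the maximum before exponentiating is a design choice made here to avoid underflow on high-degree nodes; it does not change the normalized messages.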
(2) Update parameters as
$$\frac{1}{N} \left\langle \sum_i \delta_{a, q_i} \right\rangle = n_a$$
$$\frac{1}{N^2} \left\langle \sum_{(ij) \in E} \delta_{a, q_i} \delta_{b, q_j} \right\rangle = p_{ab} n_a n_b$$
(3) Repeat till convergence.
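The update step can be illustrated with averages estimated from sampled assignments, as a Monte Carlo run would provide. The sketch below is an assumption-laden illustration: `update_parameters` is a hypothetical name, the thermal average is replaced by a plain mean over the rows of `samples`, the sum over $(ij) \in E$ is taken over ordered pairs (both directions of each edge), and every group is assumed to be non-empty on average so that the division is defined.

```python
import numpy as np

def update_parameters(A, samples, q):
    """Re-estimate n_a and p_ab from sampled assignments (rows of `samples`)."""
    A = np.asarray(A, float)
    samples = np.asarray(samples)
    S, N = samples.shape
    # one_hot[s, i, a] = delta_{a, q_i} in sample s.
    one_hot = np.eye(q)[samples]
    # n_a = (1/N) < sum_i delta_{a, q_i} >
    n = one_hot.mean(axis=0).sum(axis=0) / N
    # edge_counts[a, b] = < sum over ordered edges (i, j) of
    #                       delta_{a, q_i} delta_{b, q_j} >
    edge_counts = np.einsum("sia,ij,sjb->ab", one_hot, A, one_hot) / S
    # Solve (1/N^2) edge_counts[a, b] = p_ab n_a n_b for p_ab.
    p = edge_counts / (N**2 * np.outer(n, n))
    return n, p

# Complete graph on 4 nodes, one sample splitting it into two pairs.
K4 = np.ones((4, 4)) - np.eye(4)
n_new, p_new = update_parameters(K4, [[0, 0, 1, 1]], q=2)
```

In this toy case the diagonal entries come out as 0.5, not 1, because the condition normalizes by $N_a^2$ and not by $N_a (N_a - 1)$; the difference vanishes for large groups.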
Question 1: Given $\{n_a\}$, $\{p_{ab}\}$, what is the best possible guess for the original group assignment?
Bayes optimal inference (in error correcting codes by Nishimori’93, Sourlas’94):
(1) Compute marginals (local magnetizations).
(2) For each node take the most probable value.
[Figure: overlap (vertical axis, 0.914 to 0.916) over a horizontal axis running from 0.1 to 0.4.]
This overlap is maximized at the right value of $\{n_a\}$, $\{p_{ab}\}$.
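The two inference steps and the overlap can be sketched as below. The slides do not spell out how the overlap is normalized, so this sketch assumes the simplest choice: the fraction of correctly labeled nodes, maximized over relabelings of the groups. The function names are hypothetical, and the permutation search is only practical for small $q$.

```python
import itertools
import numpy as np

def most_probable_groups(psi):
    """Step (2): for each node take the group with the largest marginal."""
    return np.argmax(psi, axis=1)

def overlap(true_groups, guessed_groups, q):
    """Fraction of agreeing nodes, maximized over permutations of group labels."""
    true_groups = np.asarray(true_groups)
    guessed_groups = np.asarray(guessed_groups)
    best = 0.0
    for perm in itertools.permutations(range(q)):
        # Relabel the guess: group a becomes perm[a].
        relabeled = np.asarray(perm)[guessed_groups]
        best = max(best, float(np.mean(relabeled == true_groups)))
    return best

# Marginals for three nodes; the guess is [0, 1, 0], which matches the
# truth [1, 0, 1] exactly once the two labels are swapped.
psi_example = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]])
guess = most_probable_groups(psi_example)
score = overlap([1, 0, 1], guess, q=2)
```

The maximization over permutations is needed because the group labels themselves carry no meaning: only the partition can be recovered.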
Example I: ferromagnet = communities
$n_a = 1/q$, $c_{aa} = c_{\rm in}$, $c_{a \neq b} = c_{\rm out}$, $cq = c_{\rm in} + (q - 1) c_{\rm out}$
$q = 4$, $c = 16$
[Figure: overlap (0 to 1) versus $c_{\rm out}/c_{\rm in}$ (0 to 1); labels on the plot: separated, random, ferro, para.]
Paramagnetic phase: effectively a random graph was created (Achlioptas, Coja-Oghlan’08). Zero overlap between an equilibrium configuration and the original one. Learning is impossible.
Ferromagnetic phase: the network contains information about the modules, and equilibration is easy (on the Nishimori line there is no RSB and no glass).
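[Figure: the same plot divided into two regions, "detection easy" and "detection impossible".]
The two regions are separated by the de Almeida-Thouless condition: detection is easy when $|c_{\rm in} - c_{\rm out}| > q \sqrt{c}$ and impossible otherwise.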
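As a worked check of where this boundary falls for the parameters of Example I, the sketch below solves $c_{\rm in} - c_{\rm out} = q \sqrt{c}$ together with $cq = c_{\rm in} + (q - 1) c_{\rm out}$. It assumes the community-like side $c_{\rm in} > c_{\rm out}$, and `detection_threshold` is a hypothetical helper name.

```python
import math

def detection_threshold(q, c):
    """Solve c_in - c_out = q*sqrt(c) with c*q = c_in + (q - 1)*c_out."""
    gap = q * math.sqrt(c)
    # Substituting c_in = c_out + gap gives c*q = q*c_out + gap.
    c_out = (q * c - gap) / q
    c_in = c_out + gap
    return c_in, c_out

c_in, c_out = detection_threshold(q=4, c=16)
ratio = c_out / c_in
```

For $q = 4$ and $c = 16$ this gives $c_{\rm in} = 28$ and $c_{\rm out} = 12$, so the boundary sits at $c_{\rm out}/c_{\rm in} = 3/7 \approx 0.43$ on the horizontal axis of the plot.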
Example II: anti-ferromagnet = coloring
$q = 5$, $n_a = 1/q$, $c_{aa} = 0$, $c_{a \neq b} = \frac{cq}{q - 1}$
[Figure: overlap (0 to 1) versus $c$ (12 to 17), with regions labeled impossible, hard, and easy; $c_K = 13.23$, $c_{AT} = 16$.]
Impossible: random graph created, paramagnetic phase.
Easy: planted configuration attractive beyond the AT condition.
Values of the phase transitions are the same as in random graph coloring (Zdeborova, Krzakala’07).
Hard: equilibrium solution correlated with the planted configuration, but hidden in a 1RSB phase.
Conclusion
Using basic properties of (planted) Potts and spin glass models gives us a fundamental approach and new algorithms for module detection in networks.
Currently (with Mark Newman and Cris Moore): using a slightly more realistic model (correcting for the observed degree distribution) to analyze real networks.