Page 1

Information Theory of Complex Networks:

on evolution and architectural constraints

Ricard V. Solè and Sergi Valverde

Prepared by Amaç Herdağdelen

Page 2

Introduction

• Complex systems as complex networks of interactions

• Metabolic networks, software class diagrams, electronic circuits

• Describing complex networks by quantitative measures:
– Degree distribution (exponential, power law, “normal”)
– Statistical properties (average degree, clustering, diameter)

Page 3

Problem

• The space of possible networks is much more complex

• Average statistics fail to capture all essential features and provide little insight

• Need for additional measures to analyze and classify complex networks

Page 4

Possible Measures

• Heterogeneity
– How heterogeneous the nodes are (based on degree)
• Randomness
– Is there an underlying order?
• Modularity
– Is there a hierarchical organization?

Page 5

Zoo of Complex Networks

Page 6

Notation

• G = (V, E): the classical graph representation
• k(i): degree of node i
• P(k): degree distribution (as a probability, summing to 1)
• q(k): the “remaining degree” distribution:
– Choose a random edge. q(k) is the probability that it ends at a node of degree k + 1, i.e. a node with k remaining edges besides the chosen one.
• <k>: average degree

Page 7

Notation

k   P(k)            q(k)
0   0               0.5 (= 8/16)
1   0.89 (= 8/9)    0
2   0               0
3   0               0
4   0               0
5   0               0
6   0               0
7   0               0.5 (= 8/16)
8   0.11 (= 1/9)    0

(These values correspond to a star graph: one hub of degree 8 linked to 8 leaves, so 8 of the 9 nodes have degree 1. See the sketch below.)

q(k) = (k + 1) · P(k + 1) / <k>
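As a cross-check (a minimal Python sketch, not part of the original slides), the following computes P(k) and q(k) directly from an edge list; running it on the star graph behind the table reproduces the values above.

from collections import Counter

def degree_distributions(edges):
    # P(k): fraction of nodes with degree k.
    # q(k): probability that a random edge end touches a node of
    #       degree k + 1, i.e. that the remaining degree is k.
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    n = len(deg)
    ends = 2 * len(edges)               # total number of edge ends
    counts = Counter(deg.values())      # number of nodes per degree
    P = {k: c / n for k, c in counts.items()}
    # A node of degree k contributes k edge ends, each of remaining degree k - 1.
    q = {k - 1: k * c / ends for k, c in counts.items()}
    return P, q

# Star graph from the table: hub 0 linked to leaves 1..8.
P, q = degree_distributions([(0, leaf) for leaf in range(1, 9)])
print(P)   # {1: 0.888..., 8: 0.111...}
print(q)   # {0: 0.5, 7: 0.5}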

Page 8

Degree vs. Remaining Degree

[Figure: classical degree distribution vs. remaining-degree distribution for a random graph with evenly distributed degrees]

Page 9

An Example Measure

• Assortative Mixing (AM)
– High-degree nodes tend to link to high-degree nodes
– Found in social networks
• Disassortative Mixing (DM)
– The reverse: high-degree nodes tend to link to low-degree nodes
– Found in biological networks

Page 10

An Example Measure

• qc(i, j): the probability that a randomly chosen edge connects two nodes with remaining degrees i and j
• In the non-assortative case (no AM/DM):
– qc(i, j) = q(i) · q(j) (the two degrees are independent)
• Assortativeness measure r:
– Proportional to the correlation Σi,j i · j · [qc(i, j) − q(i) · q(j)]
– Normalized such that −1 ≤ r ≤ 1
– r = −1 means highly DM, r = +1 means highly AM (a sketch follows below)
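The slides do not spell out the formula for r; the sketch below assumes it is Newman’s standard degree-assortativity coefficient, the normalized correlation of the remaining degrees at the two ends of each edge.

from collections import Counter

def assortativity(edges):
    # Newman-style assortativity r from the remaining degrees
    # (degree - 1) of the two endpoints of every undirected edge.
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    m = len(edges)
    s_prod = s_mean = s_sq = 0.0
    for a, b in edges:
        j, k = deg[a] - 1, deg[b] - 1
        s_prod += j * k
        s_mean += 0.5 * (j + k)
        s_sq += 0.5 * (j * j + k * k)
    mean = s_mean / m
    return (s_prod / m - mean ** 2) / (s_sq / m - mean ** 2)

# A star is maximally disassortative: the hub links only to leaves.
print(assortativity([(0, leaf) for leaf in range(1, 9)]))   # -1.0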

Page 11

An Example Measure

Consider an edge between nodes a and b, and suppose the remaining degree of a is q(a) = i. What does that tell us about k(b)?

High AM (r > 0):    with high probability, k(b) is close to i
High DM (r < 0):    with high probability, k(b) differs from i (either higher or lower)
No AM/DM (r = 0):   no conclusion can be drawn

Page 12

Entropy and Information

• Entropy is defined in several domains. The relevant ones are:
– Thermodynamic entropy (Clausius): a measure of the amount of energy in a physical system that cannot be used to do work
– Statistical entropy (Boltzmann): a measure of how ordered a system is
– Information entropy (Shannon): a measure of how random a signal or random event is

Page 13

Information

• The information of a message is a measure of the decrease of uncertainty at the receiver:

Before the message:          Sender (M = 5)  →  Receiver: M = ? (1? 2? ... 100?)
After the message [M = 5]:   Sender (M = 5)  →  Receiver: M = 5!

Page 14

Information Entropy

• The more uncertainty, the more information
• Let x be the result of a coin toss (x = H or x = T):
– Unbiased coin (P(H) = ½, P(T) = ½): x carries 1 bit of information (knowing x gives 1 bit of information)
– Biased coin (P(H) = 0.9, P(T) = 0.1): x does not carry that much information (the decrease of uncertainty at the receiver is low; compare with the possible values of M in the previous example)
• The more uncertain (random) a message is to an outsider, the more information it carries!

Page 15

Information Entropy

• Information ~ Uncertainty and Information Entropy is a measure of randomness of an event

• Entropy = Information carried by an event• High entropy corresponds to more informative, random

events• Low entropy corresponds to less informative, ordered

events• Consider Turkish. A Turkish text of 10 letters does not

contain “10-letters of information” (try your fav. compression algorithm on a Turkish text, for English it is found that a letter carries out 1.5 bits of information)

Page 16

Information Entropy

• Formally:

H(x) = − Σ (i = 1..n) p(i) · log p(i)

• H(x): the entropy of an event x (e.g., a message)
• i ∈ [1..n]: all possible outcomes for x
• p(i): the probability that the i-th outcome occurs
• The more random the event (the more equal the probabilities), the higher the entropy
• Highest possible entropy = log(n), reached when all outcomes are equally likely
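A minimal Python version of the formula (log base 2, so the result is in bits). The coin values match the earlier slide, and the uniform 10-outcome case gives log2(10) ≈ 3.3219, one of the values on the “Example Entropy Calculations” slide below (presumably its uniform distribution).

import math

def entropy(probs, base=2):
    # Shannon entropy H = -sum_i p(i) * log(p(i)); outcomes with
    # zero probability contribute nothing and are skipped.
    return -sum(p * math.log(p, base) for p in probs if p > 0)

print(entropy([0.5, 0.5]))    # 1.0 bit: the unbiased coin
print(entropy([0.9, 0.1]))    # ~0.47 bits: the biased coin is less informative
print(entropy([0.1] * 10))    # ~3.3219 = log2(10): the maximum for 10 outcomes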

Page 17

Information Entropy

• For a Bernoulli trial (X ∈ {0, 1}), plot the entropy against Pr(X = 1): the highest value, H(X) = 1 = log2(2), occurs at Pr(X = 1) = ½
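Written out (the standard binary entropy function, not reproduced on the slide):

H(X) = -p \log_2 p - (1 - p) \log_2 (1 - p), \qquad p = \Pr(X = 1)

which is 0 at p = 0 or p = 1 and peaks at H(X) = 1 when p = ½.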

Page 18

Example Entropy Calculations

[Figure: four example distributions and their entropies, H = 3.1036, H = 1.2764, H = 3.3219, and H = 2.7251]

Page 19

So What?

Any questions so far?

Page 20

So What?

• Apply information theory, using entropy as a measure of the “orderedness” of a graph
• Remember assortativeness? It is a correlation measure, so it only captures a linear relation between the two variables of qc(i, j)
• Mutual information between two variables is a more general measure that also captures non-linear relations (see the identity below):
– When I know X, how much do I know about Y?
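For reference, the standard identity tying these quantities together (and consistent with the H(q) = H(q|q') remark on the Results slide) is:

I(q) = H(q) - H(q \mid q') = \sum_{k,k'} q_c(k,k') \, \log \frac{q_c(k,k')}{q(k)\, q(k')}

so the information transfer vanishes exactly when the remaining degrees at the two ends of an edge are independent, i.e. qc(k, k') = q(k) · q(k').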

Page 21

Measures

• (Network Entropy) The heterogeneity of the node degrees: H(q)
• (Noise) The entropy of the probability distribution of observing a node with remaining degree k, given that the node at the other end of the chosen edge has k' remaining edges: H(q|q')
• (Information Transfer) The mutual information between the degrees of two neighboring nodes: I(q) = H(q) − H(q|q')

A sketch computing all three measures follows.
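A minimal Python sketch of the three measures, following the definitions above (log base 2 is an assumption; the slides do not fix the base):

import math
from collections import Counter

def network_information_measures(edges):
    # Network entropy H(q), noise H(q|q'), and information
    # transfer I = H(q) - H(q|q') for an undirected edge list.
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    ends = 2 * len(edges)
    q = Counter()    # q(k): remaining-degree distribution over edge ends
    qc = Counter()   # qc(k, k'): joint remaining degrees across an edge
    for a, b in edges:
        j, k = deg[a] - 1, deg[b] - 1
        q[j] += 1 / ends
        q[k] += 1 / ends
        qc[(j, k)] += 1 / ends
        qc[(k, j)] += 1 / ends
    H = -sum(p * math.log2(p) for p in q.values())
    # Noise: -sum qc(k,k') * log2 pi(k|k'), where pi(k|k') = qc(k,k') / q(k').
    Hc = -sum(p * math.log2(p / q[kp]) for (k, kp), p in qc.items())
    return H, Hc, H - Hc

# Star graph: either end of an edge determines the other, so I = H = 1 bit.
print(network_information_measures([(0, leaf) for leaf in range(1, 9)]))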

Page 22

Results

[Figure: noise H(q|q') versus network entropy H(q); the line consists of the points where the information transfer is 0, i.e. H(q) = H(q|q')]

Page 23

Results

• Low information transfer means that knowing the degree of a node does not tell us much about the degrees of its neighbors: small assortativeness

• It looks like many (if not all) complex networks are heterogeneous (high entropy) and have low degree correlations

• Are degree correlations irrelevant? Or are they non-existent for some reason?

Page 24

Results

• Maybe there is a selective pressure that favors networks with a heterogeneous degree distribution and low assortativeness once a complexity limit is reached

• A Monte Carlo search by simulated annealing is performed to provide evidence suggesting this is NOT the case

Page 25

Monte Carlo Search

• The search is done in the space of all networks with N nodes and E edges
• Each network maps into a 2-dimensional parameter space:
– H: entropy and Hc: noise
– For every random sample, find the corresponding point (H, Hc)
– Perform a Monte Carlo search to minimize a potential function over candidate graphs Ω (see the sketch after this list)
– By looking at the error ε(Ω) for Ω, we can calculate a likelihood for Ω. This value gives us a measure of how likely it is to reach Ω from a given random starting point.
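The potential function itself did not survive into this transcript, so the sketch below only illustrates the general procedure: it assumes a squared-distance potential ε(Ω) = (H − H*)² + (Hc − Hc*)² toward a target point (H*, Hc*), single-edge rewiring moves, and a linear cooling schedule; none of these specifics are confirmed by the slides. It reuses network_information_measures from the earlier sketch.

import math
import random

def anneal_to_target(n_nodes, n_edges, target, steps=20000, t0=1.0):
    # Simulated annealing: rewire a random graph with fixed N and E
    # so that its (H, Hc) point approaches the target; the final
    # error plays the role of epsilon(Omega) in the likelihood.
    nodes = range(n_nodes)
    edges = set()
    while len(edges) < n_edges:                     # random initial graph
        edges.add(tuple(sorted(random.sample(nodes, 2))))
    edges = list(edges)

    def potential(es):
        H, Hc, _ = network_information_measures(es)
        return (H - target[0]) ** 2 + (Hc - target[1]) ** 2

    err = potential(edges)
    for step in range(steps):
        t = t0 * (1 - step / steps)                 # linear cooling
        trial = edges[:]
        cand = tuple(sorted(random.sample(nodes, 2)))
        if cand in trial:                           # avoid duplicate edges
            continue
        trial[random.randrange(len(trial))] = cand  # rewire one edge
        new_err = potential(trial)
        # Accept improvements always; worse moves with Boltzmann probability.
        if new_err <= err or random.random() < math.exp(-(new_err - err) / max(t, 1e-9)):
            edges, err = trial, new_err
    return edges, err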

Page 26

Results

• The candidate graphs occupying the same region as the observed real graphs turned out to be the most likely ones.

• Note that for a very large portion of the theoretically possible network space, it is almost impossible to obtain graphs located in that area.

• The high-likelihood area is where the scale-free networks reside.

Page 27

Discussion

• The authors claim that the observed lack of degree correlation and high heterogeneity are not a result of adaptation or parameter selection but of higher-level limitations (!) on network architectures

• Without assuming a particular network growth model, they showed that only a very specific domain of all possible networks is attainable by an optimization algorithm; outside this domain it is not feasible to find graphs that satisfy the complexity constraints

• These results and this formulation might be a step toward explaining why such different networks, operating and evolving under such different conditions, share so many common properties

Page 28

Thank you for your attention

?

Page 29

• "In many different domains with different constraints, the systems usually end up with networks that fall in the specific area we found in the monte-carlo simulations. This is not only because of some evolutionary (or other) constraints which favor the networks in that area but also because most of the networks actually reside in that area. We mean, even if there was a constraint in the system that favors the networks whose entropy/noise values fall outside of the mentioned area, the system would be unsuccesful most of the time in its search/evolution/development for such a network (just as our monte-carlo search did)".