NETWORKS BASICS Danail Bonchev Center for the Study of Biological Complexity Virginia Commonwealth University Singapore, July 9-17, 2007

Page 1:

NETWORKS BASICS

Danail Bonchev

Center for the Study of Biological Complexity

Virginia Commonwealth University

Singapore, July 9-17, 2007

Page 2:

Recommended Literature

1. A.-L. Barabási, Linked: The New Science of Networks. Perseus Publishing, 2002. ISBN 0-738-20667-9, 304 pp., $15.00.

2. M. Newman, A.-L. Barabási, and D. J. Watts, The Structure and Dynamics of Networks. Princeton University Press, 2006. ISBN 0-691-11357-2, 624 pp., $49.50.

3. S. N. Dorogovtsev and J. F. F. Mendes, Evolution of Networks: From Biological Nets to the Internet and WWW. Oxford University Press, 2003. ISBN 0198515901, 344 pp., $95.00.

4. U. Alon, An Introduction to Systems Biology: Design Principles of Biological Circuits. Chapman & Hall/CRC, Taylor and Francis Group, 2006. ISBN 1584886420.

Page 3:

Systems Biology. What Is It?

A branch of science that seeks to integrate different levels of information to understand how biological systems function.

L. Hood: “Systems biology defines and analyses the interrelationships of all of the elements in a functioning system in order to understand how the system works.”

What matters is not the number and properties of the system's elements but the relations among them!

Page 4:

More on Systems Biology

The essence of living systems is the flow of mass, energy, and information in space and time.

The flow occurs along specific networks

Flow of mass and energy (metabolic networks)

Flow of information involving DNA (transcriptional regulation networks)

Flow of information not involving DNA (signaling networks)

The Goal of Systems Biology: To understand the flow of mass, energy, and information in living systems.

Page 5:

Networks and the Core Concepts of Systems Biology

(i) Complexity emerges at all levels of the hierarchy of life

(ii) System properties emerge from interactions of components

(iii) The whole is more than the sum of the parts.

(iv) Applied mathematics provides approaches to modeling biological systems.

Page 6:

How to Describe a System As a Whole?

Networks - The Language of Complex Systems

Page 7:

What is a Network?

A network is a mathematical structure composed of points connected by lines.

Network Theory <-> Graph Theory

Network <-> Graph

Nodes <-> Vertices (points)

Links <-> Edges (lines)

A network can be built for any functional system.

System vs. Parts = Networks vs. Nodes

F. Harary, Graph Theory, Addison Wesley, Reading, MA, 1969.

Gross & Yellen, Handbook of Graph Theory, CRC Press, Boca Raton, FL, 2004.

Page 8:

Networks As Graphs

Networks can be undirected or directed, depending on whether the interaction between two neighboring nodes proceeds in both directions or in only one of them, respectively.

The specificity of network nodes and links can be quantitatively characterized by weights.

[Figure: two weighted example graphs, one vertex-weighted and one edge-weighted]

Page 9:

Networks As Graphs - 2

Networks having no cycles are termed trees. The more cycles the network has, the more complex it is.

A network can be connected (presented by a single component) or disconnected (presented by several disjoint components).

[Figure: examples of connected and disconnected graphs, trees, and cyclic graphs]
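The definitions on this slide can be checked mechanically. The sketch below is illustrative only: the function names and the small edge lists are hypothetical, and the cycle count used here (E - V + number of components) is a standard graph-theory quantity rather than something stated on the slide. It assumes an undirected graph without self-loops or repeated edges.

```python
from collections import deque

def components(nodes, edges):
    """Return the connected components of an undirected graph as a list of sets."""
    neighbors = {v: set() for v in nodes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    seen, result = set(), []
    for start in nodes:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        while queue:  # breadth-first search from an unvisited node
            u = queue.popleft()
            for w in neighbors[u]:
                if w not in comp:
                    comp.add(w)
                    queue.append(w)
        seen |= comp
        result.append(comp)
    return result

def is_tree(nodes, edges):
    """A connected graph with V - 1 edges has no cycles, so it is a tree."""
    return len(components(nodes, edges)) == 1 and len(edges) == len(nodes) - 1

def cycle_count(nodes, edges):
    """Number of independent cycles: E - V + number of components."""
    return len(edges) - len(nodes) + len(components(nodes, edges))

# Hypothetical five-node examples
nodes = [1, 2, 3, 4, 5]
tree_edges = [(1, 3), (2, 3), (3, 4), (4, 5)]
cyclic_edges = tree_edges + [(1, 2)]
split_edges = [(1, 2), (3, 4)]

tree_is_tree = is_tree(nodes, tree_edges)
cyclic_is_tree = is_tree(nodes, cyclic_edges)
cyclic_cycles = cycle_count(nodes, cyclic_edges)
split_components = len(components(nodes, split_edges))
print(tree_is_tree, cyclic_is_tree, cyclic_cycles, split_components)
```

Here the first graph is a tree, the second has one cycle, and the third is disconnected, falling into three components.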

Page 10:

Networks As Graphs - 3

Some Basic Types of Graphs

Paths

Stars

Cycles

Complete Graphs

Bipartite Graphs

Page 11:

Air Transportation Network

Page 12:

The World Wide Web

Page 13:

Fragment of a Social Network (Melburn, 2004)

Page 14:

Biological Networks

A. Intra-Cellular Networks

Protein interaction networks

Metabolic networks

Signaling networks

Gene regulatory networks

Composite networks

Networks of modules, functional networks

Disease networks

B. Inter-Cellular Networks

Neural networks

C. Organ and Tissue Networks

D. Ecological Networks

E. Evolution Network

Page 15:

[Figure, credited to A.-L. Barabási: the cell as a set of layered networks. Labels: GENOME, PROTEOME, METABOLISM; protein-gene interactions, protein-protein interactions, biochemical reactions (citrate cycle), miRNA regulation?]

Page 16:

The Protein Network of Drosophila

CuraGen Corporation, Science, 2003

Page 17:

Source: ExPASy

Metabolic Networks

Page 18:

Apoptosis Pathway - 1

Apoptosis is a mechanism of controlled cell death that is critically important in many biological processes.

[Figure: apoptosis pathway diagram. Labels: death activator; membrane protein; FAS-L, FAS-R, FADD; DISC (Death-Inducing Signaling Complex); initiator caspases CASP10 and CASP8; executor caspases CASP6, CASP3, and CASP7; cleavage of caspase substrates; heterodimer DFF (DFF45, DFF40); start of DNA fragmentation]

D. Bonchev, L.B. Kier, C. Cheng, Lecture Series on Computer and Computational Sciences 6, 581-591 (2006).

Page 19:

Gene Regulation Networks

Page 20:

The Longevity Gene-Protein Network (LGPN)

T. Witten, D. Bonchev, in press

C. elegans

Page 21:

Network of Interacting Pathways (NIP)

A. Mazurie, D. Bonchev, G. A. Buck, 2007

381 organisms

Page 22:

Functional Networks

Yeast: 1400 proteins, 232 complexes, nine functional groups of complexes

[Figure: network of the nine functional groups of yeast protein complexes. Each group is annotated with its number of proteins and number of protein complexes; links between groups are annotated with the number of shared proteins. Groups: Cell Cycle; Cell Polarity & Structure; Intermediate and Energy Metabolism; Membrane Biogenesis & Turnover; Protein Synthesis and Turnover; Protein/RNA Transport; RNA Metabolism; Signaling; Transcription/DNA Maintenance/Chromatin Structure]

(Data: A.-M. Gavin et al. (2002) Nature 415, 141-147)

D. Bonchev, Chemistry & Biodiversity 1 (2004) 312-326

Page 23:

Summary

All complex networks in nature and technology have common features.

They differ considerably from random networks of the same size.

By studying network structure and dynamics, and by using comparative network analysis, one can answer important biological questions.

Page 24:

Fundamental Biological Questions to Answer

(i) Which interactions and groups of interactions are likely to have equivalent functions across species?

(ii) Based on these similarities, can we predict new functional information about proteins and interactions that are poorly characterized?

(iii) What do these relationships tell us about the evolution of proteins, networks and whole species?

(iv) How to reduce the noise in biological data: Which interactions represent true binding events?

A false-positive interaction is unlikely to be reproduced across the interaction maps of multiple species.

Page 25:

All Complex Dynamic Networks Have Similar Structure and Common Properties

Scale-Freeness

Small-Worldness

Centrality

Robustness/Fragility

Hubs

Page 26:

How To Characterize a Network?

Page 27:

Quantifying Networks

A. Graph-Theoretical (Topological) Descriptors

A1. Connectivity-based

A2. Distance-based

B. Information-Theoretic Descriptors

B1. Compositional

B2. Structural

C. Complexity Measures

C1. Subgraph Count

C2. Overall Connectivity

C3. Walk Count

C4. Small-World Connectivity

Page 28:

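Connectivity-Based Topological Descriptors

Adjacency Matrix

Adjacency relation, a_ij:

a_ij = 1 (nodes i and j are neighbors)

a_ij = 0 (otherwise)

Example graph G (random node numbering): V = 5, E = 4, with edges 1-3, 2-3, 3-4, and 4-5.

\[
A(G) =
\begin{array}{c|ccccc|c}
  & 1 & 2 & 3 & 4 & 5 & a_i \\ \hline
1 & 0 & 0 & 1 & 0 & 0 & 1 \\
2 & 0 & 0 & 1 & 0 & 0 & 1 \\
3 & 1 & 1 & 0 & 1 & 0 & 3 \\
4 & 0 & 0 & 1 & 0 & 1 & 2 \\
5 & 0 & 0 & 0 & 1 & 0 & 1
\end{array}
\]

a_i: node degree (the sum of row i). Node degrees of nodes 1 to 5: 1, 1, 3, 2, 1.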
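As a minimal sketch of the adjacency relation, the matrix of the five-node example can be built from an edge list and the node degrees read off as row sums. The edge list (1-3, 2-3, 3-4, 4-5) is read from the rows of the slide's matrix; the variable names are illustrative.

```python
# Edges of the slide's example graph G (V = 5, E = 4), read from its adjacency matrix
edges = [(1, 3), (2, 3), (3, 4), (4, 5)]
V = 5

# Symmetric 0/1 adjacency matrix: a_ij = 1 if i and j are neighbors, 0 otherwise
A = [[0] * V for _ in range(V)]
for i, j in edges:
    A[i - 1][j - 1] = 1
    A[j - 1][i - 1] = 1

# Node degree a_i is the sum of row i
degrees = [sum(row) for row in A]

for row, a_i in zip(A, degrees):
    print(row, a_i)
```

The row sums come out as 1, 1, 3, 2, 1, matching the degree column of A(G).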

Page 29:

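Connectivity Descriptors

Local (node) descriptors: vertex (node) degree, a_i

\[
a_i = \sum_{j=1}^{V} a_{ij} = \sum_{j \,\in\, \text{neighbors of } i} a_{ij}
\]

Global (network) descriptors: total adjacency, A

\[
A(G) = \sum_{i=1}^{V} a_i = \sum_{i=1}^{V} \sum_{j=1}^{V} a_{ij}
\]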

Page 30:

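Connectivity Descriptors - 2

Average and Normalized Descriptors

Average vertex (node) degree:

\[
\langle a_i \rangle = A(G)/V
\]

Network connectedness (density):

\[
\mathrm{Conn}(G) = \frac{A}{V(V-1)} = \frac{2E}{V(V-1)}; \qquad \mathrm{Conn}(G) = \frac{A}{V^2}
\]

Example (node degrees 1, 1, 3, 2, 1; V = 5):

A = 1 + 1 + 3 + 2 + 1 = 8

<a_i> = 8/5 = 1.6

Conn = 8/(5×4) = 0.4 = 40%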
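The example on this slide can be reproduced in a few lines. This is a sketch that takes the node degrees as given and uses the V(V-1) form of the connectedness; the function name and the dictionary keys are illustrative.

```python
def connectivity_descriptors(degrees):
    """Total adjacency, average degree, and connectedness of an undirected graph."""
    V = len(degrees)
    A = sum(degrees)            # total adjacency A(G): the sum of node degrees
    E = A // 2                  # each undirected edge is counted twice in A
    return {
        "A": A,
        "E": E,
        "avg_degree": A / V,            # <a_i> = A / V
        "conn": A / (V * (V - 1)),      # Conn = A / (V(V-1)) = 2E / (V(V-1))
    }

# Node degrees of the slide's five-node example
result = connectivity_descriptors([1, 1, 3, 2, 1])
print(result)
```

For the slide's degrees this gives A = 8, <a_i> = 1.6, and Conn = 0.4.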

Page 31:

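Adjacency in Directed Graphs

Adjacency relation, a_ij:

a_ij = -1 (incoming edge (arc))

a_ij = +1 (outgoing edge (arc))

a_ij = 0 (otherwise)

In-degrees are counted with the sign -1, out-degrees with the sign +1.

In the matrix below each row lists the outgoing arcs of a node; the last column gives the out-degrees and the last row the in-degrees.

\[
A(DG) =
\begin{array}{c|ccccc|c}
  & 1 & 2 & 3 & 4 & 5 & a_i(\mathrm{out}) \\ \hline
1 & 0 & 0 & 1 & 0 & 0 & +1 \\
2 & 0 & 0 & 1 & 0 & 0 & +1 \\
3 & 1 & 0 & 0 & 0 & 0 & +1 \\
4 & 0 & 0 & 1 & 0 & 0 & +1 \\
5 & 0 & 0 & 0 & 1 & 0 & +1 \\ \hline
a_i(\mathrm{in}) & -1 & 0 & -3 & -1 & 0 &
\end{array}
\]

Node labels (in-degree, out-degree): node 1: (-1, 1); node 2: (0, 1); node 3: (-3, 1); node 4: (-1, 1); node 5: (0, 1).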
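A short sketch of the same bookkeeping in code: the arc list below (1→3, 2→3, 3→1, 4→3, 5→4) is read from the rows of the slide's matrix, and in-degrees are stored with a negative sign to follow the slide's convention. The names are illustrative.

```python
# Arcs (tail, head) of the slide's directed example, read from the rows of A(DG)
arcs = [(1, 3), (2, 3), (3, 1), (4, 3), (5, 4)]
nodes = [1, 2, 3, 4, 5]

out_degree = {v: 0 for v in nodes}
in_degree = {v: 0 for v in nodes}
for tail, head in arcs:
    out_degree[tail] += 1   # outgoing arc counts +1 for its tail
    in_degree[head] -= 1    # incoming arc counts -1 for its head (slide's sign convention)

# (in-degree, out-degree) label of every node
labels = {v: (in_degree[v], out_degree[v]) for v in nodes}
print(labels)
```

The resulting labels agree with the pairs shown next to the nodes: (-1, 1), (0, 1), (-3, 1), (-1, 1), (0, 1).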

Page 32:

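Distance-Based Topological Descriptors

Distance Matrix

Distance relation: d_ij = 1 for neighboring nodes i and j.

The distance between two non-neighboring nodes is equal to the number of edges along the shortest path that connects them.

\[
D(G) =
\begin{array}{c|ccccc|c}
  & 1 & 2 & 3 & 4 & 5 & d_i \\ \hline
1 & 0 & 2 & 1 & 2 & 3 & 8 \\
2 & 2 & 0 & 1 & 2 & 3 & 8 \\
3 & 1 & 1 & 0 & 1 & 2 & 5 \\
4 & 2 & 2 & 1 & 0 & 1 & 6 \\
5 & 3 & 3 & 2 & 1 & 0 & 9
\end{array}
\]

d_i: node distance (node distance degree), the sum of row i.

[Figure: a second example graph with numbered nodes]

Exercise: d_26 = ? d_57 = ?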
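One common way to obtain the distance matrix of an unweighted graph is a breadth-first search from every node. The sketch below applies it to the five-node example (edges 1-3, 2-3, 3-4, 4-5, as read from the adjacency matrix of the same graph); the function name is illustrative, and unreachable pairs would be reported as infinity.

```python
from collections import deque

def distance_matrix(V, edges):
    """Shortest-path distances between all node pairs of an undirected, unweighted graph."""
    neighbors = {v: [] for v in range(1, V + 1)}
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    D = []
    for source in range(1, V + 1):
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search: nodes are reached in order of distance
            u = queue.popleft()
            for w in neighbors[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        D.append([dist.get(t, float("inf")) for t in range(1, V + 1)])
    return D

D = distance_matrix(5, [(1, 3), (2, 3), (3, 4), (4, 5)])
node_distances = [sum(row) for row in D]   # d_i = sum of row i
for row, d_i in zip(D, node_distances):
    print(row, d_i)
```

The row sums reproduce the node distances 8, 8, 5, 6, 9 of the slide.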

Page 33:

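Distance Descriptors

Node descriptors:

Node distance, d_i

\[
d_i = \sum_{j=1}^{V} d_{ij}
\]

Node eccentricity, e_i

\[
e_i = \max_{j}(d_{ij})
\]

Network descriptors:

Network distance, D(G)

\[
D(G) = \sum_{i=1}^{V} d_i = \sum_{i=1}^{V} \sum_{j=1}^{V} d_{ij}
\]

Network diameter, Diam(G)

\[
\mathrm{Diam}(G) = \max_{i,j}(d_{ij})
\]

Network radius, Rad(G)

\[
\mathrm{Rad}(G) = \min_{i}(e_i) = \min_{i}\bigl(\max_{j}(d_{ij})\bigr)
\]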

Page 34:

Distance Descriptors - 2

Average and Normalized Descriptors

Average node distance, <d_i>

\[
\langle d_i \rangle = D(G)/V
\]

Average network distance, <d> (average degree of separation, average path length)

\[
\langle d \rangle = D / \bigl(V(V-1)\bigr)
\]

Example (node distances 8, 8, 5, 6, 9; V = 5):

D = 8 + 8 + 5 + 6 + 9 = 36

<d_i> = 36/5 = 7.2

<d> = 36/(5×4) = 1.8
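The node and network distance descriptors, together with their averages, follow directly from the distance matrix. The sketch below hard-codes the matrix of the five-node example; the eccentricity, diameter, and radius values it computes are not quoted on the slides and are shown only as an illustration of the definitions.

```python
# Distance matrix D(G) of the five-node example graph
D_matrix = [
    [0, 2, 1, 2, 3],
    [2, 0, 1, 2, 3],
    [1, 1, 0, 1, 2],
    [2, 2, 1, 0, 1],
    [3, 3, 2, 1, 0],
]
V = len(D_matrix)

node_distance = [sum(row) for row in D_matrix]   # d_i = sum_j d_ij
eccentricity = [max(row) for row in D_matrix]    # e_i = max_j d_ij

D = sum(node_distance)           # network distance D(G)
diameter = max(eccentricity)     # Diam(G) = max d_ij
radius = min(eccentricity)       # Rad(G) = min e_i
avg_node_distance = D / V        # <d_i> = D / V
avg_distance = D / (V * (V - 1)) # <d> = D / (V(V-1))

print(node_distance, eccentricity)
print(D, diameter, radius, avg_node_distance, avg_distance)
```

This gives D = 36, <d_i> = 7.2, and <d> = 1.8, as in the example, with diameter 3 and radius 2.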

Page 35:

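Distances in Directed Networks

Some distances in directed graphs are equal to infinity!

How to calculate D and <d>?

In-distances and out-distances

[Figure: directed graph on nodes 1 to 4; each node is labelled with its (in-distance, out-distance): (0, 2), (-4, 0), (0, 3), (-2, 1)]

Exercise: d_21 = ? d_13 = ?

D(in) = D(out) = 6

<d> = 6/(4×3) = 0.5 ???

An average below 1 cannot be a meaningful distance, because every finite distance between two distinct nodes is at least 1. It arises because D sums only the finite distances, while the denominator V(V-1) = 12 counts all ordered node pairs, including the unreachable ones.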

Page 36:

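Distances in Directed Networks - 2

Network Node Accessibility

\[
\mathrm{Acc}(DG) = N_d(DG)/N_d(G)
\]

where N_d(DG) is the number of finite distances in the directed graph DG and N_d(G) is the number of distances in the corresponding undirected graph G (20 and 6×5 = 30 in the example below).

Adjusted Average Network Distance

\[
AD(DG) = \langle d \rangle / \mathrm{Acc}
\]

Example:

Undirected graph G (node distances 12, 8, 9, 9, 6, 8): D = 52, <d(G)> = 52/(6×5) = 1.73

Directed graph DG (node distances 9, 0, 1, 9, 8, 7): D = 34, <d(DG)> = 34/20 = 1.70 < <d(G)> ??

Acc = 20/30 = 0.667, AD = 1.70/0.667 = 2.55 > <d(G)>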
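The adjustment can be sketched as follows. The slide's six-node directed graph cannot be recovered from the transcript, so the example below uses a hypothetical three-node directed path 1→2→3; the function names are illustrative, and the code assumes that the accessibility is the share of ordered node pairs joined by a finite directed distance, which is how the slide's 20/30 is read here.

```python
from collections import deque

def finite_directed_distances(nodes, arcs):
    """Shortest directed path lengths for all ordered pairs (u, v), u != v, with v reachable from u."""
    successors = {v: [] for v in nodes}
    for tail, head in arcs:
        successors[tail].append(head)
    finite = {}
    for source in nodes:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in successors[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        for target, d in dist.items():
            if target != source:
                finite[(source, target)] = d
    return finite

def adjusted_average_distance(nodes, arcs):
    """Return (<d>, Acc, AD); assumes at least one finite distance exists."""
    V = len(nodes)
    finite = finite_directed_distances(nodes, arcs)
    mean_d = sum(finite.values()) / len(finite)   # average over the finite distances only
    acc = len(finite) / (V * (V - 1))             # fraction of ordered pairs that are reachable
    return mean_d, acc, mean_d / acc

# Hypothetical directed path 1 -> 2 -> 3 (not the graph on the slide)
mean_d, acc, ad = adjusted_average_distance([1, 2, 3], [(1, 2), (2, 3)])
print(round(mean_d, 3), acc, round(ad, 3))
```

In this toy case only 3 of the 6 ordered pairs are reachable, so the raw average of 4/3 is doubled to 8/3 after dividing by Acc = 0.5.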

Page 37:

Shannon’s Information Theory

What Is Information?

Wiener: Information is neither matter, nor energy.

Forget about meaning!

Information is contained in any system, the elements of which can be grouped according to one or more criteria.

How to Measure Information?

The more diverse the distribution of system elements, the larger its information content.

Information is a measure of a system’s diversity.

The more complex the system, the larger its information content.

Information is a measure of system complexity.

References

1. Shannon, C.; Weaver, W. The Mathematical Theory of Communication. University of Illinois Press: Urbana, IL, 1949.

2. Bonchev, D. Information-Theoretic Indices for Characterization of Chemical Structures. Research Studies Press: Chichester, UK, 1983.

Page 38:

Shannon’s Information Theory: Basic Equations

Finite Probability Scheme: a system of N elements and k equivalence classes with equivalence criterion α:

class    number of elements    probability
1        N_1                   p_1
2        N_2                   p_2
...      ...                   ...
k        N_k                   p_k

where p_i = N_i / N, and Σ p_i = 1.

Mean Information (bits/element):

\[
I(\alpha) = -\sum_{i=1}^{k} p_i \log_2 p_i
\]

Total Information (bits):

\[
I_{tot}(\alpha) = N \log_2 N - \sum_{i=1}^{k} N_i \log_2 N_i
\]

Normalized Information:

\[
I_{norm}(\alpha) = \frac{I_{tot}(\alpha)}{N \log_2 N} = 1 - \frac{\sum_{i=1}^{k} N_i \log_2 N_i}{N \log_2 N}, \qquad 0 \le I_{norm}(\alpha) \le 1
\]
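The three quantities are easy to compute from the class sizes N_1, ..., N_k. The sketch below uses illustrative function names and a hypothetical partition of four elements into two classes of two; it assumes every class is non-empty and N > 1, so that all logarithms and the normalization are defined.

```python
from math import log2

def mean_information(class_sizes):
    """I = -sum p_i log2 p_i, in bits per element, with p_i = N_i / N."""
    N = sum(class_sizes)
    return -sum((n / N) * log2(n / N) for n in class_sizes)

def total_information(class_sizes):
    """I_tot = N log2 N - sum N_i log2 N_i, in bits."""
    N = sum(class_sizes)
    return N * log2(N) - sum(n * log2(n) for n in class_sizes)

def normalized_information(class_sizes):
    """I_norm = I_tot / (N log2 N), between 0 and 1 (requires N > 1)."""
    N = sum(class_sizes)
    return total_information(class_sizes) / (N * log2(N))

# Hypothetical system: N = 4 elements in k = 2 classes of 2 elements each
sizes = [2, 2]
I_mean = mean_information(sizes)
I_tot = total_information(sizes)
I_norm = normalized_information(sizes)
print(I_mean, I_tot, I_norm)
```

For this partition the mean information is 1 bit per element, the total is 4 bits, and the normalized value is 0.5; in general the total equals N times the mean.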

Page 39:

Network Information Descriptors

Information on the system elements equivalence, eI

[Figure: example graph with six nodes; node degrees 3, 2, 2, 4, 2, 1 and node distances 12, 8, 6, 8, 9, 9]

Vertex degree equivalence distribution: 6 {3, 1, 1, 1}

eItot(deg) = 6 log2(6) - 3 log2(3) - 3×1 log2(1) = 10.75 bits

eI(deg) = -(3/6) log2(3/6) - 3×(1/6) log2(1/6) = 1.79 bits/node

eInorm(deg) = 10.75/(6 log2(6)) = 0.693

Vertex distance equivalence distribution: 6 {2, 2, 1, 1}

eItot(dist) = 6 log2(6) - 2×2 log2(2) - 2×1 log2(1) = 11.51 bits

eI(dist) = -2×(2/6) log2(2/6) - 2×(1/6) log2(1/6) = 1.92 bits/node

eInorm(dist) = 11.51/(6 log2(6)) = 0.742

Composition distribution: 6 {2, 2, 1, 1}
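The equivalence-based values can be reproduced by grouping nodes that share the same degree (or the same node distance) into classes and applying the Shannon formulas to the class sizes. This is a sketch with illustrative names; the degree and distance lists are the node labels of the slide's six-node example.

```python
from collections import Counter
from math import log2

def equivalence_information(values):
    """Group equal values into classes and return class sizes with total, mean, and normalized information."""
    sizes = list(Counter(values).values())   # sizes N_i of the equivalence classes
    N = len(values)
    total = N * log2(N) - sum(n * log2(n) for n in sizes)
    return {
        "classes": sorted(sizes, reverse=True),
        "total": total,                       # bits
        "mean": total / N,                    # bits per node
        "normalized": total / (N * log2(N)),
    }

degrees = [3, 2, 2, 4, 2, 1]       # node degrees of the slide's example
distances = [12, 8, 6, 8, 9, 9]    # node distances of the slide's example

deg_info = equivalence_information(degrees)
dist_info = equivalence_information(distances)
print(deg_info)
print(dist_info)
```

The degree classes come out as {3, 1, 1, 1} and the distance classes as {2, 2, 1, 1}, with totals of about 10.75 and 11.51 bits.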

Page 40:

Network Information Descriptors - 2

Information on the system elements weight (or magnitude), mI: weighted information descriptors (indices)

[Figure: example graph with six nodes; node distances 12, 8, 6, 8, 9, 9 (D = 52) and node degrees 3, 2, 2, 4, 2, 1 (A = 14)]

Distance magnitude distribution: 52 {12, 2×9, 2×8, 6}

mItot(dist) = 52 log2(52) - 12 log2(12) - 2×9 log2(9) - 2×8 log2(8) - 6 log2(6) = 132.83 bits

mI(dist) = -(12/52) log2(12/52) - 2×(9/52) log2(9/52) - 2×(8/52) log2(8/52) - (6/52) log2(6/52) = 2.55 bits/node

mInorm(dist) = 132.83/(52 log2(52)) = 0.448

Vertex degree distribution: 14 {4, 3, 3×2, 1}

mItot(deg) = 14 log2(14) - 4 log2(4) - 3 log2(3) - 3×2 log2(2) - 1 log2(1) = 34.55 bits

mI(deg) = -(4/14) log2(4/14) - (3/14) log2(3/14) - 3×(2/14) log2(2/14) - (1/14) log2(1/14) = 2.47 bits/node

mInorm(deg) = 34.55/(14 log2(14)) = 0.648
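For the magnitude-based descriptors the node values themselves play the role of the class sizes: the total (52 or 14) is partitioned into the individual node distances or degrees. A minimal sketch, with an illustrative function name and assuming all values are positive:

```python
from math import log2

def magnitude_information(weights):
    """Total, mean, and normalized information of a total W partitioned into the given positive weights."""
    W = sum(weights)
    total = W * log2(W) - sum(w * log2(w) for w in weights)
    return total, total / W, total / (W * log2(W))

# Node distances (sum D = 52) and node degrees (sum A = 14) of the slide's example
dist_total, dist_mean, dist_norm = magnitude_information([12, 9, 9, 8, 8, 6])
deg_total, deg_mean, deg_norm = magnitude_information([4, 3, 2, 2, 2, 1])

print(round(dist_total, 2), round(dist_mean, 2), round(dist_norm, 3))
print(round(deg_total, 2), round(deg_mean, 2), round(deg_norm, 3))
```

Up to rounding, the results agree with the values quoted for the distance and degree magnitude distributions.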

Page 41:

Network Complexity Descriptors - 1

Subgraph Count, eSC, and Overall Connectivity, eOC (e = number of edges in the subgraph)

Example: V = 5, E = 4

[Figure: all subgraphs of the example graph, grouped by their number of edges e = 0 to 4, with the vertex degrees marked]

e        eSC      eOC
0        5        8
1        4        16
2        4        23
3        3        21
4        1        8
Total    17       76

SC = 17 (5, 4, 4, 3, 1); OC = 76 (8, 16, 23, 21, 8)
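For a graph this small, both measures can be obtained by brute-force enumeration. The sketch below rests on two assumptions that are not spelled out on the slide but reproduce its numbers: eSC counts the connected subgraphs with e edges (single vertices for e = 0), and eOC adds up, over those subgraphs, the degrees that their vertices have in the full graph. The edge list (1-3, 2-3, 3-4, 4-5) is the five-node example used for the adjacency matrix; function names are illustrative, and the approach is only practical for very small graphs.

```python
from itertools import combinations

def spanned_vertices_if_connected(edge_subset):
    """Return the vertex set of the edge subset if it forms one connected piece, else None."""
    vertices = {v for edge in edge_subset for v in edge}
    start = edge_subset[0][0]
    reached, frontier = {start}, [start]
    while frontier:
        x = frontier.pop()
        for u, v in edge_subset:
            if u == x and v not in reached:
                reached.add(v)
                frontier.append(v)
            elif v == x and u not in reached:
                reached.add(u)
                frontier.append(u)
    return vertices if reached == vertices else None

def subgraph_count_and_overall_connectivity(nodes, edges):
    """Return lists sc[e] and oc[e] for e = 0 .. E by enumerating all edge subsets."""
    degree = {v: 0 for v in nodes}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # e = 0: every single vertex is a subgraph; its contribution to OC is its degree
    sc = [len(nodes)] + [0] * len(edges)
    oc = [sum(degree.values())] + [0] * len(edges)
    for e in range(1, len(edges) + 1):
        for subset in combinations(edges, e):
            vertices = spanned_vertices_if_connected(subset)
            if vertices is not None:
                sc[e] += 1
                oc[e] += sum(degree[v] for v in vertices)
    return sc, oc

sc, oc = subgraph_count_and_overall_connectivity(
    [1, 2, 3, 4, 5], [(1, 3), (2, 3), (3, 4), (4, 5)]
)
print(sc, sum(sc))
print(oc, sum(oc))
```

Under these assumptions the enumeration yields SC = 17 (5, 4, 4, 3, 1) and OC = 76 (8, 16, 23, 21, 8).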

Page 42:

Network Complexity Descriptors - 2

Walk Count, WC

Example: WC = 106 (8, 16, 28, 54), the numbers of walks of length l = 1, 2, 3, 4.

[Figure: walks of length l = 1 and l = 2 marked on a five-node example graph]

The three complexity measures, SC, OC, and WC, can discriminate very subtle complexity features.

[Figure: two example graphs, 1 and 2]

           Graph 1                      Graph 2
SC         28 (5, 8, 9, 5, 1)           30 (5, 9, 10, 5, 1)
OC(in)     111 (12, 28, 41, 25, 5)      135 (16, 40, 49, 25, 5)
WC         15 (5, 5, 5)                 21 (5, 7, 9)

For networks use only complexity measures with e = 1, 2, and 3!
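Walk counts can be obtained by repeatedly multiplying the adjacency matrix into a vector of ones. The sketch below assumes, since this reproduces WC = 106 (8, 16, 28, 54), that the example is the five-node graph with edges 1-3, 2-3, 3-4, 4-5, that each walk is counted once per direction, and that lengths run from 1 to V - 1. The function name is illustrative.

```python
def walk_counts(A, max_length):
    """Number of walks of length 1 .. max_length in a graph with adjacency matrix A."""
    V = len(A)
    ends = [1] * V   # walks of length 0: exactly one (the node itself) ends at each node
    counts = []
    for _ in range(max_length):
        # extend every walk by one step: new_ends[i] = sum over neighbors j of ends[j]
        ends = [sum(A[i][j] * ends[j] for j in range(V)) for i in range(V)]
        counts.append(sum(ends))
    return counts

# Adjacency matrix of the five-node example graph (edges 1-3, 2-3, 3-4, 4-5)
A = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
counts = walk_counts(A, max_length=4)
WC = sum(counts)
print(counts, WC)
```

The counts by length are 8, 16, 28, and 54, which sum to 106.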

Page 43:

Network Complexity Descriptors - 3

Small-World Connectivity

Network complexity increases with connectivity.

Network complexity increases with the decrease in its radius.

Can one unite the two patterns into a single complexity measure?

\[
B1 = \frac{A}{D}
\]

\[
B2 = \sum_{i=1}^{V} \frac{a_i}{d_i} = \sum_{i=1}^{V} b_i
\]

b_i = a_i/d_i is a measure of node centrality.

D. Bonchev and G. A. Buck, Quantitative Measures of Network Complexity. In: Complexity in Chemistry, Biology and Ecology, D. Bonchev and D. H. Rouvray, Eds., Springer, New York, 2005, p. 191-235.
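Both indices combine the node degrees a_i with the node distances d_i. As a small sketch (illustrative function name), they can be evaluated for the five-node example with degrees 1, 1, 3, 2, 1 and node distances 8, 8, 5, 6, 9:

```python
def small_world_connectivity(degrees, node_distances):
    """Return B1 = A / D, B2 = sum of a_i / d_i, and the node centralities b_i = a_i / d_i."""
    A = sum(degrees)            # total adjacency
    D = sum(node_distances)     # network distance
    b = [a / d for a, d in zip(degrees, node_distances)]
    return A / D, sum(b), b

B1, B2, b = small_world_connectivity([1, 1, 3, 2, 1], [8, 8, 5, 6, 9])
print(round(B1, 3), round(B2, 3))
print([round(x, 3) for x in b])
```

This gives B1 = 8/36 ≈ 0.222 and B2 ≈ 1.294; the most central node is the degree-3 node, with b_i = 0.6.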

Page 44:

Examples of Increasing Complexity: N = 5

[Figure: thirteen five-node graphs, numbered 3 to 15, in order of increasing complexity]

Graph        3        4        5        6
SC           11       17       20       26
OC           32       76       100      160
WC           58       106      140      150
B1           0.2      0.222    0.250    0.333
B2           1.105    1.294    1.571    1.6667

Graph        7        8        9        10
SC           29       31       54       57
OC           190      212      482      522
WC           178      214      300      350
B1           0.313    0.313    0.429    0.400
B2           1.6774   1.783    2.200    2.211

Graph        11       12       13       14       15
SC           61       114      119      477      973
OC           566      1316     1396     7806     18180
WC           337      538      638      1200     1700
B1 = A/D     0.429    0.538    0.538    0.818    1
B2           2.410    2.867    2.943    4.200    5

Page 45:

Thank You for Your Attention!