NETWORKS BASICS
Danail Bonchev
Center for the Study of Biological Complexity
Virginia Commonwealth University
Singapore, July 9-17, 2007
Recommended Literature
1. Linked: The New Science of Networks. Albert-László Barabási. Perseus Publisher, 2002. ISBN: 0-738-20667-9, 304 pp., Price: $ 15.00
2. The Structure and Dynamics of Networks. Mark Newman, Albert-László Barabási, and Duncan J. Watts, Princeton University Press, 2006 | $49.50 / ISBN: 0-691-11357-2; 624 pp.
3. Evolution of Networks: From Biological Nets to the Internet and WWW. Serguei N. Dorogovtsev and Jose Fernando Ferreira Mendes, Oxford University Press, 2003, ISBN: 0198515901, $95.00, 344 pp.
4. An introduction to Systems Biology: Design Principles of Biological Circuits, Uri Alon, Chapman & Hall/CRC, Taylor and Francis Group, 2006, ISBN:1584886420.
Systems Biology. What Is It?
A branch of science that seeks to integrate different levels of information to understand how biological systems function.
It is not the number and properties of system elements but their relations!!
L. Hood: “Systems biology defines and analyses the interrelationships of all of the elements in a functioning system in order to understand how the system works.”
More on Systems Biology
The essence of living systems is the flow of mass, energy, and information in space and time.
The flow occurs along specific networks
Flow of mass and energy (metabolic networks)
Flow of information involving DNA (transcriptional regulation networks)
Flow of information not involving DNA (signaling networks)
The Goal of Systems Biology: To understand the flow of mass, energy, and information in living systems.
Networks and the Core Concepts of Systems Biology
(i) Complexity emerges at all levels of the hierarchy of life
(ii) System properties emerge from interactions of components
(iii) The whole is more than the sum of the parts.
(iv) Applied mathematics provides approaches to modeling biological systems.
How to Describe a System As a Whole?
Networks - The Language of Complex Systems
What is a Network?
A network is a mathematical structure composed of points connected by lines.
Network Theory <-> Graph Theory
Network <-> Graph
Nodes <-> Vertices (points)
Links <-> Edges (lines)
A network can be built for any functional system.
F. Harary, Graph Theory, Addison Wesley, Reading, MA, 1969
Gross & Yellen, Handbook of Graph Theory, CRC Press, Boca Raton, FL, 2004
System vs. Parts = Networks vs. Nodes
Networks As Graphs
Networks can be undirected or directed, depending on whether the interaction between two neighboring nodes proceeds in both directions or in only one of them, respectively.
The specificity of network nodes and links can be quantitatively characterized by weights
[Figure: a vertex-weighted graph and an edge-weighted graph]
Networks As Graphs - 2
Networks having no cycles are termed trees. The more cycles the
network has, the more complex it is.
A network can be connected (represented by a single component) or disconnected (represented by several disjoint components).
[Figure: examples of connected and disconnected networks, trees, and cyclic graphs]
Networks As Graphs - 3
Some Basic Types of Graphs
Paths
Stars
Cycles
Complete Graphs
Bipartite Graphs
Air Transportation Network
The World Wide Web
Fragment of a Social Network (Melburn, 2004)
Biological Networks
A. Intra-Cellular Networks
Protein interaction networks
Metabolic Networks
Signaling Networks
Gene Regulatory Networks
Composite networks
Networks of Modules, Functional Networks
Disease networks
B. Inter-Cellular Networks
Neural Networks
C. Organ and Tissue Networks
D. Ecological Networks
E. Evolution Network
[Figure (L.-A. Barabási): intra-cellular network levels. Labels: GENOME, protein-gene interactions, PROTEOME, protein-protein interactions, METABOLISM, bio-chemical reactions, citrate cycle, miRNA regulation?]
The Protein Network of Drosophila
CuraGen Corporation Science, 2003
Source: ExPASy
Metabolic Networks
Apoptosis Pathway - 1
[Figure: apoptosis pathway. Labels: FAS-L, FAS-R, FADD, CASP10, CASP8, CASP6, CASP3, CASP7, DFF45, DFF40; death activator; membrane protein; DISC (Death-Inducing Signaling Complex); heterodimer DFF; initiator caspases; executor caspases; cleavage of caspase substrates; start of DNA fragmentation]
D. Bonchev, L.B. Kier, C. Cheng, Lecture Series on Computer and Computational Sciences 6, 581-591 (2006).
Apoptosis is a mechanism of controlled cell death that is critically important in many biological processes.
Gene Regulation Networks
The Longevity Gene-Protein Network (LGPN)
T. Witten, D. Bonchev, in press
C. elegans
Network of Interacting Pathways (NIP)
A. Mazurie, D. Bonchev, G.A. Buck, 2007
381 organisms
Functional Networks
[Figure: network of functional groups of yeast protein complexes. Legend: number of proteins, number of protein complexes, number of shared proteins. Groups: Cell Cycle; Cell Polarity & Structure; Intermediate and Energy Metabolism; Membrane Biogenesis & Turnover; Protein Synthesis and Turnover; Protein RNA / Transport; RNA Metabolism; Signaling; Transcription/DNA Maintenance/Chromatin Structure]
Yeast: 1400 proteins, 232 complexes, nine functional groups of complexes
(Data A.-M. Gavin et al. (2002) Nature 415,141-147)
D. Bonchev, Chemistry & Biodiversity 1(2004)312-326
Summary
All complex networks in nature and technology have common features.
They differ considerably from random networks of the same size.
By studying network structure and dynamics, and by using comparative network analysis, one can answer important biological questions.
Fundamental Biological Questions to Answer
(i) Which interactions and groups of interactions are likely to have equivalent functions across species?
(ii) Based on these similarities, can we predict new functional information about proteins and interactions that are poorly characterized?
(iii) What do these relationships tell us about the evolution of proteins, networks and whole species?
(iv) How to reduce the noise in biological data: Which interactions represent true binding events?
A false-positive interaction is unlikely to be reproduced across the interaction maps of multiple species.
All Complex Dynamic Networks Have Similar Structure and Common Properties
Scale-Freeness
Small-Worldness
Centrality
Robustness/Fragility
Hubs
How To Characterize a Network?
Quantifying Networks
A. Graph-Theoretical (Topological) Descriptors
A1. Connectivity-based
A2. Distance-based
B. Information-Theoretic Descriptors
B1. Compositional
B2. Structural
C. Complexity Measures
C1. Subgraph Count
C2. Overall Connectivity
C3. Walk Count
C4. Small-World Connectivity
Connectivity-Based Topological Descriptors
Adjacency Matrix
Adjacency relation, $a_{ij}$:
$a_{ij} = 1$ (neighbors)
$a_{ij} = 0$ (otherwise)
Example: graph G with V = 5 and E = 4 (random node numbering); edges 1-3, 2-3, 3-4, 4-5.
A(G) =
      1  2  3  4  5 | a_i
  1   0  0  1  0  0 |  1
  2   0  0  1  0  0 |  1
  3   1  1  0  1  0 |  3
  4   0  0  1  0  1 |  2
  5   0  0  0  1  0 |  1
$a_i$: node degree
Node degrees: $a_1 = 1$, $a_2 = 1$, $a_3 = 3$, $a_4 = 2$, $a_5 = 1$
Connectivity Descriptors
Local (node) descriptors: vertex (node) degrees, $a_i$
$a_i = \sum_{j=1}^{V} a_{ij}$ (the sum runs over the neighbors j of node i)
Global (network) descriptors: total adjacency, A
$A(G) = \sum_{i=1}^{V} a_i = \sum_{i=1}^{V} \sum_{j=1}^{V} a_{ij}$
Connectivity Descriptors - 2
Average and Normalized Descriptors
Average vertex (node) degree:
$\langle a_i \rangle = A(G)/V$
Network connectedness (density):
$Conn(G) = \frac{A}{V(V-1)} = \frac{2E}{V(V-1)}$; $Conn(G) = \frac{A}{V^2}$
Example (node degrees 1, 1, 3, 2, 1):
$A = 1+1+3+2+1 = 8$
$\langle a_i \rangle = 8/5 = 1.6$
$Conn = 8/(5 \times 4) = 0.4 = 40\%$
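The connectivity example can be checked with a few lines of Python. This is only a sketch: the edge list (1-3, 2-3, 3-4, 4-5) is read off the adjacency matrix of the five-node example, and the names `EDGES`, `adjacency_matrix`, and `connectedness` are illustrative, not from the slides.

```python
# Five-node example graph (V = 5, E = 4); the edge list is an assumption
# read from the adjacency matrix of the example.
EDGES = [(1, 3), (2, 3), (3, 4), (4, 5)]
V = 5

def adjacency_matrix(edges, n):
    """Return the symmetric 0/1 adjacency matrix for nodes numbered 1..n."""
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        a[i - 1][j - 1] = 1
        a[j - 1][i - 1] = 1
    return a

A_matrix = adjacency_matrix(EDGES, V)
degrees = [sum(row) for row in A_matrix]         # a_i = sum_j a_ij
total_adjacency = sum(degrees)                   # A(G) = 2E
average_degree = total_adjacency / V             # <a_i> = A/V
connectedness = total_adjacency / (V * (V - 1))  # Conn = A / (V(V-1))

print(degrees, total_adjacency, average_degree, connectedness)
```

The printed values should agree with the slide: degrees 1, 1, 3, 2, 1, A = 8, average degree 1.6, and connectedness 0.4.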
Adjacency in Directed Graphs
Adjacency relation, $a_{ij}$:
$a_{ij} = -1$ (incoming edge (arc))
$a_{ij} = +1$ (outgoing edge (arc))
$a_{ij} = 0$ (otherwise)
In-degree = -1, Out-degree = +1
A(DG) =
          1   2   3   4   5 | a_i(out)
  1       0   0   1   0   0 |  +1
  2       0   0   1   0   0 |  +1
  3       1   0   0   0   0 |  +1
  4       0   0   1   0   0 |  +1
  5       0   0   0   1   0 |  +1
a_i(in)  -1   0  -3  -1   0
Node labels (in-degree, out-degree): node 1: (-1, 1); node 2: (0, 1); node 3: (-3, 1); node 4: (-1, 1); node 5: (0, 1)
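A short sketch of how the signed in- and out-degrees could be tallied. The arc list (1→3, 2→3, 3→1, 4→3, 5→4) is an assumption, taken from reading the directed adjacency matrix row by row; the function name `in_out_degrees` is hypothetical.

```python
# Assumed arcs as (tail, head) pairs, read from the rows of A(DG).
ARCS = [(1, 3), (2, 3), (3, 1), (4, 3), (5, 4)]
V = 5

def in_out_degrees(arcs, n):
    """Count incoming and outgoing arcs for nodes numbered 1..n."""
    indeg = [0] * n
    outdeg = [0] * n
    for tail, head in arcs:
        outdeg[tail - 1] += 1
        indeg[head - 1] += 1
    return indeg, outdeg

indeg, outdeg = in_out_degrees(ARCS, V)
# Slide convention: in-degrees carry a minus sign, out-degrees a plus sign.
signed_labels = [(-i, o) for i, o in zip(indeg, outdeg)]
print(signed_labels)
```

Under that reading of the matrix, the labels come out as (-1, 1), (0, 1), (-3, 1), (-1, 1), (0, 1) for nodes 1 to 5.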
Distance-Based Topological Descriptors
Distance Matrix
Distance relation: $d_{ij} = 1$ for neighbors i, j.
The distance between two non-neighboring nodes is equal to the number of edges along the shortest path that connects them.
D(G) =
      1  2  3  4  5 | d_i
  1   0  2  1  2  3 |  8
  2   2  0  1  2  3 |  8
  3   1  1  0  1  2 |  5
  4   2  2  1  0  1 |  6
  5   3  3  2  1  0 |  9
$d_i$: node distance (node distance degree)
[Figure: a second example graph with numbered nodes]
$d_{26} = ?$ $d_{57} = ?$
Distance Descriptors
Node descriptors:
Node distance, $d_i$: $d_i = \sum_{j=1}^{V} d_{ij}$
Node eccentricity, $e_i$: $e_i = \max_j(d_{ij})$
Network descriptors:
Network distance, D(G): $D(G) = \sum_{i=1}^{V} d_i = \sum_{i=1}^{V} \sum_{j=1}^{V} d_{ij}$
Network diameter, Diam(G): $Diam(G) = \max(d_{ij})$
Network radius, Rad(G): $Rad(G) = \min_i(e_i) = \min_i(\max_j(d_{ij}))$
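Distance Descriptors - 2
Average and Normalized Descriptors
Average node distance, $\langle d_i \rangle$:
$\langle d_i \rangle = D(G)/V$
Average network distance, $\langle d \rangle$ (average degree of separation, average path length):
$\langle d \rangle = D/(V(V-1))$
Example (node distances 8, 8, 5, 6, 9):
$D = 8+8+5+6+9 = 36$
$\langle d_i \rangle = 36/5 = 7.2$
$\langle d \rangle = 36/(5 \times 4) = 1.8$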
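The distance descriptors can be reproduced with a breadth-first search from every node. The sketch below assumes the same five-node graph (edges 1-3, 2-3, 3-4, 4-5) and a connected, unweighted network; the helper name `distance_matrix` is illustrative.

```python
from collections import deque

# Five-node example graph; the edge list is an assumption read from the slides.
EDGES = [(1, 3), (2, 3), (3, 4), (4, 5)]
V = 5

def distance_matrix(edges, n):
    """Shortest-path distances between all node pairs (connected graph assumed)."""
    neighbors = {v: [] for v in range(1, n + 1)}
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    matrix = []
    for source in range(1, n + 1):
        dist = {source: 0}
        queue = deque([source])
        while queue:                      # breadth-first search from source
            u = queue.popleft()
            for w in neighbors[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        matrix.append([dist[t] for t in range(1, n + 1)])
    return matrix

Dm = distance_matrix(EDGES, V)
node_distances = [sum(row) for row in Dm]         # d_i
eccentricities = [max(row) for row in Dm]         # e_i
network_distance = sum(node_distances)            # D(G)
diameter = max(eccentricities)                    # Diam(G)
radius = min(eccentricities)                      # Rad(G)
avg_node_distance = network_distance / V          # <d_i>
avg_network_distance = network_distance / (V * (V - 1))  # <d>

print(node_distances, network_distance, diameter, radius)
```

For this graph the node distances are 8, 8, 5, 6, 9, so D = 36, with diameter 3 and radius 2.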
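Distances in Directed Networks
Some distances in directed graphs are equal to infinity!!
How to calculate D and $\langle d \rangle$?
In-distances and out-distances
[Figure: directed graph with nodes 1-4, labeled with (in-distance, out-distance) pairs: (0, 2), (-4, 0), (0, 3), (-2, 1)]
$d_{21} = ?$ $d_{13} = ?$
D(in) = D(out) = 6
$\langle d \rangle = 6/(4 \times 3) = 0.5$ ???
Network Node Accessibility: $Acc(G) = N_d(DG)/N_d(G)$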
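Distances in Directed Networks - 2
Adjusted Average Network Distance:
$AD(DG) = \langle d \rangle / Acc$
Example:
[Figure: undirected graph G with node distances 12, 8, 9, 9, 6, 8]
$D = 52$, $\langle d(G) \rangle = 52/(6 \times 5) = 1.73$
[Figure: directed graph DG with node distances 9, 0, 1, 9, 8, 7]
$D = 34$, $\langle d(DG) \rangle = 34/20 = 1.70 < \langle d(G) \rangle$ ??
$Acc = 20/30 = 0.667$, $AD = 1.70/0.667 = 2.55 > \langle d(G) \rangle$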
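A sketch of the adjusted average distance for a small directed graph. Several things here are assumptions rather than content of the slides: the arcs are those of the five-node graph from the "Adjacency in Directed Graphs" slide (1→3, 2→3, 3→1, 4→3, 5→4), the average distance is taken over finite distances only (as in 34/20 above), and N_d(G) is taken as V(V-1), which holds when the underlying undirected graph is connected.

```python
from collections import deque

# Assumed arcs (tail, head) of a five-node directed example graph.
ARCS = [(1, 3), (2, 3), (3, 1), (4, 3), (5, 4)]
V = 5

def directed_distances(arcs, n):
    """Return {(i, j): d_ij} for every ordered pair with a finite distance."""
    successors = {v: [] for v in range(1, n + 1)}
    for tail, head in arcs:
        successors[tail].append(head)
    finite = {}
    for source in range(1, n + 1):
        dist = {source: 0}
        queue = deque([source])
        while queue:                      # BFS following arc directions only
            u = queue.popleft()
            for w in successors[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        for target, d in dist.items():
            if target != source:
                finite[(source, target)] = d
    return finite

finite = directed_distances(ARCS, V)
n_finite = len(finite)                      # N_d(DG): number of finite distances
D = sum(finite.values())                    # network distance over finite pairs
mean_distance = D / n_finite                # <d(DG)>
accessibility = n_finite / (V * (V - 1))    # Acc, assuming N_d(G) = V(V-1)
adjusted_distance = mean_distance / accessibility  # AD = <d> / Acc

print(n_finite, D, round(mean_distance, 3), accessibility, round(adjusted_distance, 3))
```

Because only 9 of the 20 ordered pairs are reachable in this graph, the plain average understates how far apart the nodes are; dividing by the accessibility penalizes the unreachable pairs, which is the effect the slide illustrates with 1.70 versus 2.55.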
Shannon’s Information Theory
References
1. Shannon, C.; Weaver, W. The Mathematical Theory of Communication. University of Illinois Press: Urbana, IL, 1949.
2. Bonchev, D. Information-Theoretic Indices for Characterization of Chemical Structures. Research Studies Press: Chichester, UK, 1983.
The more diverse the distribution of system elements, the larger its information content.
Information is a measure of system’s diversity
How to Measure Information?
What Is Information?
Wiener: Information is neither matter, nor energy.
Forget about meaning!
Information is contained in any system, the elements of which can be grouped according to one or more criteria.
The more complex the system, the larger its information content.
Information is a measure of system complexity
Shannon’s Information Theory
Basic Equations
Finite Probability Scheme: a system of N elements and k equivalence classes with equivalence criterion α:
class   number of elements   probability
1       N1                   p1
2       N2                   p2
...     ...                  ...
k       Nk                   pk
where $p_i = N_i/N$, and $\sum_i p_i = 1$.
Mean Information: $I(\alpha) = -\sum_{i=1}^{k} p_i \log_2 p_i$, bits/element
Total Information: $I_{tot}(\alpha) = N \log_2 N - \sum_{i=1}^{k} N_i \log_2 N_i$, bits
Normalized Information: $I_{norm}(\alpha) = \frac{I_{tot}(\alpha)}{N \log_2 N} = 1 - \frac{\sum_{i=1}^{k} N_i \log_2 N_i}{N \log_2 N}$
$0 \le I_{norm}(\alpha) \le 1$
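The three quantities translate directly into code. In this sketch the function names are illustrative, the input is the list of class sizes N_1, ..., N_k, and N > 1 is assumed so that the normalization does not divide by zero.

```python
from math import log2

def mean_information(counts):
    """I = -sum p_i log2 p_i, in bits per element; counts are the class sizes N_i."""
    n = sum(counts)
    return -sum((c / n) * log2(c / n) for c in counts if c > 0)

def total_information(counts):
    """I_tot = N log2 N - sum N_i log2 N_i, in bits."""
    n = sum(counts)
    return n * log2(n) - sum(c * log2(c) for c in counts if c > 0)

def normalized_information(counts):
    """I_norm = I_tot / (N log2 N); assumes N > 1."""
    n = sum(counts)
    return total_information(counts) / (n * log2(n))

# Four elements split into two classes of two elements each.
print(mean_information([2, 2]), total_information([2, 2]), normalized_information([2, 2]))
# Limiting cases: all elements equivalent, and all elements different.
print(normalized_information([4]), normalized_information([1, 1, 1, 1]))
```

The two limiting cases show the bounds: a single class gives I_norm = 0, and a system in which every element forms its own class gives I_norm = 1. The total information is always N times the mean information.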
Network Information Descriptors
Information on the system elements equivalence, eI
[Figure: six-node example graph. Node degrees: 3, 2, 2, 4, 2, 1. Node distances: 12, 8, 6, 8, 9, 9]
Vertex degree equivalence distribution: 6{3, 1, 1, 1}
eItot(deg) = $6\log_2 6 - 3\log_2 3 - 3 \times 1\log_2 1$ = 10.75 bits
eI(deg) = $-(3/6)\log_2(3/6) - 3 \times (1/6)\log_2(1/6)$ = 1.79 bits/node
eInorm(deg) = $10.75/(6\log_2 6)$ = 0.693
Vertex distance equivalence distribution: 6{2, 2, 1, 1}
eItot(dist) = $6\log_2 6 - 2 \times 2\log_2 2 - 2 \times 1\log_2 1$ = 11.51 bits
eI(dist) = $-2 \times (2/6)\log_2(2/6) - 2 \times (1/6)\log_2(1/6)$ = 1.92 bits/node
eInorm(dist) = $11.51/(6\log_2 6)$ = 0.742
Composition distribution: 6{2, 2, 1, 1}
Network Information Descriptors - 2
Information on the system elements weight (or magnitude), mI: weighted information descriptors (indices)
[Figure: six-node example graph. Node distances: 12, 8, 6, 8, 9, 9 (D = 52). Node degrees: 3, 2, 2, 4, 2, 1 (A = 14)]
Distance magnitude distribution: 52{12, 2x9, 2x8, 6}
mItot(dist) = $52\log_2 52 - 12\log_2 12 - 2 \times 9\log_2 9 - 2 \times 8\log_2 8 - 6\log_2 6$ = 132.83 bits
mI(dist) = $-(12/52)\log_2(12/52) - 2 \times (9/52)\log_2(9/52) - 2 \times (8/52)\log_2(8/52) - (6/52)\log_2(6/52)$ = 2.55 bits/node
mInorm(dist) = $132.83/(52\log_2 52)$ = 0.448
Vertex degree distribution: 14{4, 3, 3x2, 1}
mItot(deg) = $14\log_2 14 - 4\log_2 4 - 3\log_2 3 - 3 \times 2\log_2 2 - 1\log_2 1$ = 34.55 bits
mI(deg) = $-(4/14)\log_2(4/14) - (3/14)\log_2(3/14) - 3 \times (2/14)\log_2(2/14) - (1/14)\log_2(1/14)$ = 2.47 bits/node
mInorm(deg) = $34.55/(14\log_2 14)$ = 0.648
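The equivalence-based and magnitude-based descriptors differ only in what is fed into Shannon's formulas: class sizes in the first case, the node values themselves in the second. The sketch below recomputes the slide values from the node degrees and node distances of the six-node example; the helper names `info` and `equivalence_classes` are hypothetical.

```python
from collections import Counter
from math import log2

# Node degrees and node distances of the six-node example graph (from the slides).
degrees = [3, 2, 2, 4, 2, 1]
distances = [12, 8, 6, 8, 9, 9]

def info(parts):
    """Return (I_tot, I_mean, I_norm) for a distribution given as a list of parts."""
    n = sum(parts)
    total = n * log2(n) - sum(p * log2(p) for p in parts if p > 0)
    return total, total / n, total / (n * log2(n))

def equivalence_classes(values):
    """Sizes of the classes of nodes sharing the same value, largest first."""
    return sorted(Counter(values).values(), reverse=True)

e_deg = info(equivalence_classes(degrees))     # distribution 6{3, 1, 1, 1}
e_dist = info(equivalence_classes(distances))  # distribution 6{2, 2, 1, 1}
m_deg = info(degrees)                          # distribution 14{4, 3, 3x2, 1}
m_dist = info(distances)                       # distribution 52{12, 2x9, 2x8, 6}

for name, triple in [("eI(deg)", e_deg), ("eI(dist)", e_dist),
                     ("mI(deg)", m_deg), ("mI(dist)", m_dist)]:
    print(name, [round(x, 3) for x in triple])
```

Each triple is (total, mean, normalized) and should match the slide figures to the stated rounding, for example roughly 10.75 bits, 1.79 bits/node, and 0.693 for the degree equivalence distribution.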
Network Complexity Descriptors - 1
Subgraph Count, eSC
Overall Connectivity, eOC
Example: V = 5, E = 4; e = number of edges
[Figure: all subgraphs with e = 0 to 4 edges, with vertex degrees marked]
e     eSC   eOC
0     5     8
1     4     16
2     4     23
3     3     21
4     1     8
SC = 17 (5, 4, 4, 3, 1)
OC = 76 (8, 16, 23, 21, 8)
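The example values can be reproduced by brute force. The sketch below rests on assumptions that the slide does not spell out but that its numbers are consistent with: eSC counts the connected subgraphs with e edges (single vertices for e = 0), and eOC adds up, over those subgraphs, the degrees that their vertices have in the whole graph. The graph is assumed to be the five-node example with edges 1-3, 2-3, 3-4, 4-5. Enumerating all edge subsets is exponential, so this is practical only for very small graphs.

```python
from itertools import combinations

# Five-node example graph (V = 5, E = 4); the edge list is an assumption.
EDGES = [(1, 3), (2, 3), (3, 4), (4, 5)]
V = 5

def is_connected(edge_subset):
    """True if the edges in the subset form a single connected subgraph."""
    nodes = {v for edge in edge_subset for v in edge}
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for a, b in edge_subset:
            for x, y in ((a, b), (b, a)):
                if x == u and y not in seen:
                    seen.add(y)
                    stack.append(y)
    return seen == nodes

def subgraph_counts(edges, n):
    """Return ([eSC], [eOC]) for e = 0..E, using degrees from the whole graph."""
    degree = {v: 0 for v in range(1, n + 1)}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    sc = [n]                        # e = 0: the single vertices
    oc = [sum(degree.values())]
    for e in range(1, len(edges) + 1):
        count = 0
        connectivity = 0
        for subset in combinations(edges, e):
            if is_connected(subset):
                count += 1
                nodes = {v for edge in subset for v in edge}
                connectivity += sum(degree[v] for v in nodes)
        sc.append(count)
        oc.append(connectivity)
    return sc, oc

sc, oc = subgraph_counts(EDGES, V)
print(sum(sc), sc)
print(sum(oc), oc)
```

For the assumed graph this gives SC = 17 (5, 4, 4, 3, 1) and OC = 76 (8, 16, 23, 21, 8), matching the example.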
Network Complexity Descriptors - 2
Walk Count, WC
Example:
[Figure: five-node graph with walks of length l = 1 and l = 2 marked]
WC = 106 (8, 16, 28, 54)
The three complexity measures, SC, OC, and WC, can discriminate very subtle complexity features.
[Figure: two graphs, 1 and 2]
          Graph 1                  Graph 2
SC        28 (5,8,9,5,1)           30 (5,9,10,5,1)
OC(in)    111 (12,28,41,25,5)      135 (16,40,49,25,5)
WC        15 (5,5,5)               21 (5,7,9)
For networks use only complexity measures with e = 1, 2, and 3!!
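One way to obtain walk counts is to sum the entries of successive powers of the adjacency matrix. The sketch below assumes that the example refers to the five-node graph with edges 1-3, 2-3, 3-4, 4-5, that walks of length 1 to V-1 are counted, and that a walk and its reverse are counted separately; under those assumptions it reproduces WC = 106 (8, 16, 28, 54).

```python
# Five-node example graph; the edge list is an assumption.
EDGES = [(1, 3), (2, 3), (3, 4), (4, 5)]
V = 5

def walk_counts(edges, n, max_length):
    """Number of walks of each length 1..max_length (sum of entries of A^l)."""
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        a[i - 1][j - 1] = 1
        a[j - 1][i - 1] = 1
    counts = []
    walks_from = [1] * n   # one walk of length 0 starts at every node
    for _ in range(max_length):
        # A walk of length l from i is an edge i-j followed by a walk of length l-1 from j.
        walks_from = [sum(a[i][j] * walks_from[j] for j in range(n)) for i in range(n)]
        counts.append(sum(walks_from))
    return counts

wc_by_length = walk_counts(EDGES, V, V - 1)
WC = sum(wc_by_length)
print(WC, wc_by_length)
```

The first two terms have simple interpretations: walks of length 1 number 2E = 8, and walks of length 2 equal the sum of squared node degrees, 16.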
Small-World Connectivity
Network Complexity Descriptors - 3
Network complexity increases with connectivity
Network complexity increases with the decrease in its radius
Can one unite the two patterns into a single complexity measure?
D. Bonchev and G. A. Buck, Quantitative Measures of Network Complexity. In: Complexity in Chemistry, Biology and Ecology, D. Bonchev and D. H. Rouvray, Eds., Springer, New York, 2005, p. 191-235.
$B1 = \frac{A}{D}$
$B2 = \sum_{i=1}^{V} \frac{a_i}{d_i} = \sum_{i=1}^{V} b_i$
$b_i$ is a measure of node centrality.
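A minimal sketch of both small-world connectivity indices, taking node degrees and node distances as inputs. The inputs below are those of the five-node tree used in the earlier examples (degrees 1, 1, 3, 2, 1; node distances 8, 8, 5, 6, 9) and of the complete graph on five nodes; the function name is illustrative.

```python
def small_world_connectivity(degrees, node_distances):
    """Return (B1, B2, [b_i]) with B1 = A/D and B2 = sum of b_i = a_i / d_i."""
    b = [a / d for a, d in zip(degrees, node_distances)]
    b1 = sum(degrees) / sum(node_distances)
    b2 = sum(b)
    return b1, b2, b

# Five-node tree: A = 8, D = 36.
b1_tree, b2_tree, b_tree = small_world_connectivity([1, 1, 3, 2, 1], [8, 8, 5, 6, 9])
# Complete graph on five nodes: every node has degree 4 and node distance 4.
b1_k5, b2_k5, _ = small_world_connectivity([4] * 5, [4] * 5)

print(round(b1_tree, 3), round(b2_tree, 3))
print(b1_k5, b2_k5)
```

The tree gives B1 ≈ 0.222 and B2 ≈ 1.294, and the complete graph gives B1 = 1 and B2 = 5. In the tree, the largest b_i belongs to the degree-3 node, which has the highest degree and the smallest node distance.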
Examples of Increasing Complexity: N = 5
[Figure: graphs 3 to 15 with five nodes each, ordered by increasing complexity]
Graph     3       4       5       6
SC        11      17      20      26
OC        32      76      100     160
WC        58      106     140     150
B1        0.2     0.222   0.250   0.333
B2        1.105   1.294   1.571   1.6667

Graph     7       8       9       10
SC        29      31      54      57
OC        190     212     482     522
WC        178     214     300     350
B1        0.313   0.313   0.429   0.400
B2        1.6774  1.783   2.200   2.211

Graph     11      12      13      14      15
SC        61      114     119     477     973
OC        566     1316    1396    7806    18180
WC        337     538     638     1200    1700
B1 (A/D)  0.429   0.538   0.538   0.818   1
B2        2.410   2.867   2.943   4.200   5
Thank You for
Your Attention!!!