SI 614 Network subgraphs (motifs) Biological networks
![Page 1: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/1.jpg)
School of Information, University of Michigan
SI 614: Network subgraphs (motifs)
Biological networks
Lecture 11
Instructor: Lada Adamic
![Page 2: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/2.jpg)
Outline
motifs
  motif detection (software & Pajek)
review of network characteristics used to compare model with real-world network
  one more: degree assortativity
biological networks
  types
  characteristics
  hierarchical modularity model
![Page 3: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/3.jpg)
Schematic view of network motif detection
![Page 4: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/4.jpg)
Motifs can overlap in the network
http://mavisto.ipk-gatersleben.de/frequency_concepts.html
Figure labels: motif to be found; graph; motif matches in the target graph
![Page 5: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/5.jpg)
Examples of network motifs (3 nodes)
Feed forward loop
  found in neural networks
  seems to be used to neutralize “biological noise”
Single-Input Module
  e.g. gene control networks
![Page 6: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/6.jpg)
All 3 node motifs
![Page 7: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/7.jpg)
Examples of network motifs (4 nodes)
Parallel paths
  found in neural networks
  food webs
Figure: nodes W (top), X and Y (middle), Z (bottom)
![Page 8: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/8.jpg)
4 node subgraphs (computational expense increases with the size of the graph!)
![Page 9: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/9.jpg)
Network motif detection
Some motifs will occur more often in real world networks than random networks
Technique:
  construct many random graphs with the same number of nodes and edges (same node degree distribution?)
  count the number of motifs in those graphs
  calculate the Z score: the number of standard deviations by which the motif count in the real world network differs from the mean count in the random graphs; a large |Z| means the count is unlikely to have occurred by chance
Software available: http://www.weizmann.ac.il/mcb/UriAlon/
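The technique above can be sketched in a few lines of Python. This is a minimal illustration, not the software linked above: it assumes the motif is the undirected triangle, the random graphs keep only the number of nodes and edges (not the degree distribution), and all names (`count_triangles`, `random_graph`, `motif_z_score`, `toy_edges`) are hypothetical.

```python
import itertools
import random
import statistics

def count_triangles(n, edges):
    """Count undirected triangles in a simple graph on nodes 0..n-1."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # every triangle is found once from each of its three edges
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3

def random_graph(n, m, rng):
    """Random simple graph with n nodes and m distinct edges."""
    return rng.sample(list(itertools.combinations(range(n), 2)), m)

def motif_z_score(n, edges, n_random=200, seed=0):
    """Z score of the triangle count against random graphs of the same size."""
    rng = random.Random(seed)
    real = count_triangles(n, edges)
    counts = [count_triangles(n, random_graph(n, len(edges), rng))
              for _ in range(n_random)]
    sd = statistics.pstdev(counts)
    return (real - statistics.mean(counts)) / sd if sd > 0 else float("nan")

# toy example (assumed data): two 4-node cliques joined by one edge
toy_edges = (list(itertools.combinations(range(4), 2))
             + list(itertools.combinations(range(4, 8), 2))
             + [(3, 4)])
z = motif_z_score(8, toy_edges)
```

A positive `z` here would indicate that triangles are over-represented in the toy graph relative to the random ensemble.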
![Page 10: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/10.jpg)
What the Z score means
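$z_x = \frac{x - \bar{x}}{\sigma_x}$
$x$: # of times the motif appeared in the network being scored
$\bar{x}$: mean number of times the motif appeared in the random graphs
$\sigma_x$: standard deviation of the motif count across the random graphs
the probability of observing a Z score of 2 or higher is 0.02275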
In the context of motifs:
Z > 0, motif occurs more often
than for random graphs
Z < 0, motif occurs less often
than in random graphs
Z > 1.65, only a 5% chance of
random occurrence (one-tailed; for |Z| the two-tailed 5% threshold is 1.96)
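The tail probabilities quoted on this slide can be checked numerically. A small sketch, assuming the Z score follows a standard normal distribution; the function name `upper_tail_probability` is made up for this example.

```python
import math

def upper_tail_probability(z):
    """P(Z >= z) for a standard normal variable (one-tailed)."""
    return 0.5 * math.erfc(z / math.sqrt(2))

p_at_2 = upper_tail_probability(2.0)      # close to 0.02275
p_at_165 = upper_tail_probability(1.65)   # close to 0.05
```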
![Page 11: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/11.jpg)
Finding classes on graphs based on their motif “profiles”
![Page 12: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/12.jpg)
Finding motifs (cliques and subgraphs) in Pajek
Create a second network that is the subgraph you are looking for e.g. an undirected triad
*Vertices 3
1 "v1"
2 "v2"
3 "v3"
*Arcs
*Edges
2 3 1
1 2 1
1 3 1
![Page 13: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/13.jpg)
finding motifs with Pajek
Use the two drop down menus in the ‘networks’ list to specify two networks:
Then run Nets>Fragment (1 in 2)>Find
Under Nets>Fragment (1 in 2)>Options you can select ‘induced’
subnetwork containing only overlapping fragments
![Page 14: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/14.jpg)
finding motifs with Pajek (cont’d)
Now we have just the triads:
Creates a hierarchy object with the membership of each triad listed
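For comparison outside Pajek, the same search for undirected triads can be sketched in plain Python. This is an illustration only and does not reproduce Pajek's fragment search or its hierarchy object; `find_triads` and the example edge list are hypothetical.

```python
def find_triads(edges):
    """Return every undirected triad (triangle) as a sorted tuple of its members."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    triads = set()
    for u, v in edges:
        # any common neighbor of u and v closes a triad
        for w in adj[u] & adj[v]:
            triads.add(tuple(sorted((u, v, w))))
    return sorted(triads)

# assumed example: one triad (1, 2, 3) plus a pendant node 4
example_triads = find_triads([(1, 2), (2, 3), (1, 3), (3, 4)])
```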
![Page 15: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/15.jpg)
Comparing network models with the real thing
check for structural similarity between the artificial network (the model) and the real world network
  degree distribution
  assortativity
    do high degree nodes connect to other high degree nodes?
  average shortest path
    dependence on size of network
  clustering coefficient
    compare to a randomized version conserving node degree
    dependence on node degree
    dependence on size of network
  motif profile
![Page 16: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/16.jpg)
How can we randomize a network while preserving the degree distribution?
Stub reconnection algorithm (M. E. Newman et al., 2001, also known in mathematical literature since 1960s)
  Break every edge into two “edge stubs”: the edge A-B becomes one stub at A and one stub at B
  Randomly reconnect stubs
Problems:
  Leads to multiple edges
  Cannot be modified to preserve additional topological properties
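A minimal sketch of stub reconnection for an undirected edge list, assuming nodes are hashable labels; the names `stub_reconnect` and `degrees` are hypothetical. As the slide warns, the result may contain multiple edges (and self-loops), which this sketch does not remove.

```python
import random

def stub_reconnect(edges, seed=0):
    """Cut every edge into two stubs, shuffle the stubs, and pair them up again."""
    rng = random.Random(seed)
    stubs = [node for edge in edges for node in edge]  # two stubs per edge
    rng.shuffle(stubs)
    return list(zip(stubs[0::2], stubs[1::2]))

def degrees(edges):
    """Degree of every node; a self-loop adds 2 to its node."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg
```

Every node keeps its number of stubs, so `degrees(stub_reconnect(edges))` equals `degrees(edges)`.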
![Page 17: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/17.jpg)
Local rewiring algorithm
Randomly select and rewire two edges (Maslov, Sneppen, 2002, also known in mathematical literature since 1960s)
Repeat many times
Preserves both the number of upstream and downstream neighbors of each node
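A sketch of the local rewiring step for a directed edge list: two edges A→B and C→D are replaced by A→D and C→B. Rejecting swaps that would create a self-loop or a duplicate edge is an assumption added here, as are the name `local_rewire` and the attempt limit.

```python
import random

def local_rewire(edges, n_swaps, seed=0):
    """Swap the targets of randomly chosen edge pairs; in- and out-degrees stay fixed."""
    rng = random.Random(seed)
    edges = list(edges)
    present = set(edges)
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if a == d or c == b:
            continue  # would create a self-loop
        if (a, d) in present or (c, b) in present:
            continue  # would create a multiple edge
        present -= {(a, b), (c, d)}
        present |= {(a, d), (c, b)}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges
```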
![Page 18: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/18.jpg)
Conserving additional low-level topological properties
In addition to $k_i$ one may also conserve:
  the exact numbers of loops or other motifs
  the size and numbers of components (Internet: all nodes have to be connected to each other)
Metropolis algorithm: two edges are rewired based on $E = (N_{\mathrm{actual}} - N_{\mathrm{desired}})^2 / N_{\mathrm{desired}}$
  if $\Delta E \le 0$ the rewiring step is always accepted
  if $\Delta E > 0$ the rewiring step is accepted with $p = \exp(-\Delta E / T)$
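The acceptance rule can be sketched as follows, reading the condition as a test on the change in E caused by a proposed rewiring step (an interpretation, since the slide text is garbled at that point). The function names and the `rng` parameter are hypothetical.

```python
import math
import random

def energy(n_actual, n_desired):
    """E = (N_actual - N_desired)^2 / N_desired."""
    return (n_actual - n_desired) ** 2 / n_desired

def accept_step(e_old, e_new, temperature, rng=random):
    """Metropolis rule: always accept if E does not increase, else with p = exp(-dE/T)."""
    delta = e_new - e_old
    if delta <= 0:
        return True
    return rng.random() < math.exp(-delta / temperature)
```

Lower values of `temperature` make steps that move the motif count away from the desired value less likely to be accepted.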
![Page 19: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/19.jpg)
Assortativity
Social networks are assortative:
  the gregarious people associate with other gregarious people
  the loners associate with other loners
The Internet is disassortative
Figure panels:
  Assortative: hubs connect to hubs
  Random
  Disassortative: hubs are in the periphery
![Page 20: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/20.jpg)
Correlation profile of a network
Detects preferences in linking of nodes to each other based on their connectivity
Measure N(k0,k1) – the number of edges between nodes with connectivities k0 and k1
Compare it to Nr(k0,k1) – the same property in a properly randomized network
Very noise-tolerant with respect to both false positives and negatives
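Measuring N(k0,k1) can be sketched as a count over an undirected edge list. The name `degree_pair_counts` is hypothetical, and storing each pair with the smaller degree first is a choice made here for an undirected network. Running the same function on a randomized network would give Nr(k0,k1) for the comparison.

```python
from collections import Counter

def degree_pair_counts(edges):
    """N(k0, k1): number of edges joining nodes of degree k0 and k1 (k0 <= k1)."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    counts = Counter()
    for u, v in edges:
        k0, k1 = sorted((deg[u], deg[v]))
        counts[(k0, k1)] += 1
    return counts
```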
![Page 21: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/21.jpg)
Correlation profiles give complex networks unique identities
Internet; Protein interactions
slide by Sergei Maslov
2D picture
![Page 22: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/22.jpg)
Correlation profiles give complex networks unique identities
Internet; Protein interactions
Sergei Maslov: 2D histogram
![Page 23: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/23.jpg)
Correlation profiles (cont’d)
Pastor-Satorras and Vespignani: 2D plot
y axis: average degree of the node’s neighbors
x axis: degree of node
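The quantity plotted here, the average degree of a node's neighbors as a function of the node's own degree, can be sketched for an undirected edge list. The function name is made up for this example.

```python
def neighbor_degree_by_degree(edges):
    """Map each degree k to the mean neighbor degree of the nodes that have degree k."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    per_degree = {}
    for node, nbrs in adj.items():
        k = len(nbrs)
        mean_nbr_degree = sum(len(adj[w]) for w in nbrs) / k
        per_degree.setdefault(k, []).append(mean_nbr_degree)
    return {k: sum(vals) / len(vals) for k, vals in sorted(per_degree.items())}
```

A curve that falls with k suggests a disassortative network, and one that rises with k suggests an assortative one.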
![Page 24: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/24.jpg)
Correlation profiles (cont’d)
Newman: single number
Internet degree correlation coefficient: -0.189
The Pearson correlation coefficient of the degrees of the nodes on each side of an edge
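A sketch of this single-number summary for an undirected edge list. Counting every edge in both directions, so that the measure is symmetric, is an implementation choice made here; `degree_correlation` is a hypothetical name.

```python
def degree_correlation(edges):
    """Pearson correlation of the degrees found at the two ends of each edge."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    xs, ys = [], []
    for u, v in edges:
        # each undirected edge contributes both (u, v) and (v, u)
        xs += [deg[u], deg[v]]
        ys += [deg[v], deg[u]]
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    if var == 0:
        return float("nan")  # all degrees equal: correlation undefined
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / len(xs)
    return cov / var
```

A star graph, where the hub only touches degree-1 nodes, gives -1, the most disassortative value.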
![Page 25: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/25.jpg)
Other examples of assortative mixing
Assortativity is not limited to degree-degree correlations
other attributes:
  social networks: race, income, gender, age
  food webs: herbivores, carnivores
  internet: high level connectivity providers, ISPs, consumers
Tendency of like individuals to associate: ‘homophily’
Scott Feld paper
![Page 26: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/26.jpg)
Biological networks
In biological systems nodes and edges can represent different things
  nodes: protein, gene, chemical
  edges: mass transfer, regulation
Can construct bipartite or tripartite networks: e.g. genes and proteins
![Page 27: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/27.jpg)
Figure levels:
  GENOME: protein-gene interactions
  PROTEOME: protein-protein interactions
  METABOLISM: bio-chemical reactions
slide after Reka Albert
![Page 28: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/28.jpg)
Cellular processes form networks on many levels
metabolic reaction networks (tri-partite)
slide after Reka Albert
Node types:
  metabolites (substrates or products): open rectangles
  metabolite-enzyme complexes: black rectangles
  enzymes: open ovals
Edges:
  substrate to complex or complex to product
  symmetrical edges
![Page 29: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/29.jpg)
regulatory networks
nodes: genes, proteins
edges: translation
regulation: activating, inhibiting
slide after Reka Albert
![Page 30: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/30.jpg)
the yeast two-hybrid method
Activation and binding domains are separated and each attached to a different protein
If the proteins interact, the two domains will be brought together and activate the transcription of a reporter gene
Can do simultaneous genome-wide experiments
slide after Reka Albert
![Page 31: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/31.jpg)
Resulting interaction network
slide after Reka Albert
![Page 32: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/32.jpg)
Properties and problems of resulting networks
Properties
  giant component exists
  power law distribution with an exponential cutoff
  longer path length than randomized
  higher incidence of short loops than randomized
Problems
  false positives
  false negatives
  only 20% overlap between different studies
![Page 33: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/33.jpg)
Implications
Robustness
  resilient to random breakdowns
  mutations in hubs can be deadly
Evolution
  most connected hubs conserved across organisms (important)
  gene duplication hypothesis
    new gene still has same output protein, but no selection pressure because the original gene is still present, so some interactions can be added or dropped
    leads to scale free topology
![Page 34: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/34.jpg)
Metabolic networks: how to represent them
Can consider the one-mode projection of substrate interactions (undirected)
slide after Reka Albert
![Page 35: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/35.jpg)
Metabolic networks are scale-free
In the bi-partite graph: the probability that a given substrate participates in k reactions is $P(k) \sim k^{-\gamma}$
indegree: $\gamma = 2.2$
outdegree: $\gamma = 2.2$
(a) A. fulgidus (Archaea), (b) E. coli (Bacterium), (c) C. elegans (Eukaryote), (d) averaged over 43 organisms
![Page 36: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/36.jpg)
Modularity
No modularity
Modularity
Hierarchical modularity
E. Ravasz et al., Science 297, 1551-1555 (2002) (Pajek!)
![Page 37: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/37.jpg)
How do we know that metabolic networks are modular?
clustering decreases with degree as $C(k) \sim k^{-1}$
randomized networks (which preserve the power law degree distribution) have a clustering coefficient independent of degree
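To check this on a given network, C(k) can be computed by averaging the local clustering coefficient over all nodes of the same degree. A minimal sketch for an undirected edge list with orderable node labels; the name `clustering_by_degree` is hypothetical, and nodes of degree below 2 are skipped because their clustering coefficient is undefined.

```python
def clustering_by_degree(edges):
    """C(k): mean local clustering coefficient of the nodes with degree k (k >= 2)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    by_degree = {}
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue
        # number of links among the neighbors of this node
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        by_degree.setdefault(k, []).append(2 * links / (k * (k - 1)))
    return {k: sum(vals) / len(vals) for k, vals in sorted(by_degree.items())}
```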
![Page 38: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/38.jpg)
How do we know that metabolic networks are modular?
clustering coefficient is the same across metabolic networks in different species with the same substrate
corresponding randomized scale free network: $C(N) \sim N^{-0.75}$ (simulation, no analytical result)
bacteria
archaea (extreme-environment single cell organisms)
eukaryotes (plants, animals, fungi, protists)
scale free network of the same size
![Page 39: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/39.jpg)
review: what would the clustering coefficient of a random network be?
assume average degree of node is k
probability of one neighbor linking to another is ~ k/N
scales as $N^{-1}$
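Written out as a short derivation, assuming a random graph in which every pair of nodes is linked independently with the same probability $p$ (the symbols $p$ and $\langle k \rangle$ for the average degree are introduced here for illustration):

$$C_{\mathrm{random}} = p = \frac{\langle k \rangle}{N - 1} \approx \frac{\langle k \rangle}{N} \sim N^{-1}$$

For example, with N = 1000 and an average degree of 10, the expected clustering coefficient is about 0.01.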
![Page 40: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/40.jpg)
Constructing a hierarchically modular network
RSMOB model
  Start from a fully connected cluster of nodes
  Create 4 identical replicas of the cluster, linking the outside nodes of the replicas to the center node of the original (N = 25 nodes)
  This process can be repeated indefinitely
  (initial number of nodes can be different from 5)
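The construction can be sketched as follows. This is one reading of the steps above, not a reference implementation: it assumes node 0 is the center, that at every level the "outside nodes" are the replica copies of the previous level's outside nodes, and that the number of replicas stays at 4. The names `hierarchical_network`, `levels`, `cluster_size` and `replicas` are hypothetical.

```python
def hierarchical_network(levels, cluster_size=5, replicas=4):
    """Return (number of nodes, set of undirected edges) after `levels` replication steps."""
    # fully connected starting cluster; node 0 acts as the center
    edges = {(i, j) for i in range(cluster_size) for j in range(i + 1, cluster_size)}
    n = cluster_size
    outside = list(range(1, cluster_size))
    for _ in range(levels):
        new_edges = set(edges)
        new_outside = []
        for r in range(1, replicas + 1):
            offset = r * n  # relabel the replica's nodes
            new_edges |= {(u + offset, v + offset) for u, v in edges}
            for node in outside:
                new_edges.add((0, node + offset))  # link replica outside node to center
                new_outside.append(node + offset)
        edges, n, outside = new_edges, n * (replicas + 1), new_outside
    return n, edges
```

With the defaults, one step gives the 25-node network mentioned above and two steps give 125 nodes.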
![Page 41: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/41.jpg)
Properties of the hierarchically modular model
RSMOB model
  Power law exponent = 2.26 (in agreement with real world metabolic networks)
  C ≈ 0.6, independent of network size (also comparable with observed real-world values)
  $C(k) \sim k^{-1}$, as in real world networks
How to test for hierarchically arranged modules in real world networks
  perform hierarchical clustering on the topological overlap map (we’ll cover hierarchical clustering in a few weeks…)
  can be done with Pajek
![Page 42: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/42.jpg)
Topological overlap
A: Network consisting of nested modules B: Topological overlap matrix
hierarchical
clustering
![Page 43: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/43.jpg)
Hubs may act within a module, or connect modules
Party hub:
  simultaneous interactions
  tends to be within the same module
Date hub:
  sequential interactions
  connects different modules
Han et al., Nature 430, 88 (2004)
slide after Reka Albert
![Page 44: SI 614 Network subgraphs (motifs) Biological networks](https://reader036.fdocuments.in/reader036/viewer/2022062423/568143db550346895db06912/html5/thumbnails/44.jpg)
some matching motifs frequently overlap (e.g. feed forward loop)
Zhang et al, J. Biol 4, 6 (2005)