Why bacteria run Linux while eukaryotes run Windows? Sergei Maslov Brookhaven National Laboratory New York


Page 1:

Why bacteria run Linux while eukaryotes run Windows?

Sergei Maslov, Brookhaven National Laboratory, New York

Page 2:

Physical vs. Biological Laws

Physical laws are often discovered by finding a simple common explanation for very different phenomena

Newton's Law: apples fall to the ground; planets revolve around the Sun

Discovery of biological laws is slowed down by our having a cookie-cutter explanation in terms of natural selection:

Page 3:

Drawing from the Facebook group "Trust me, I'm a 'Biologist'"

Page 4:

Genes encoded in bacterial genomes ~ Packages installed on Linux computers

Page 5:

Complex systems have many components

• Genes (Bacteria)

• Software packages (Linux OS)

Components do not work alone: they need to be assembled to work

In individual systems only a subset of components is installed

• Genome (Bacteria): collection of genes

• Computer (Linux OS): collection of software packages

Components have vastly different frequencies of installation

Page 6:

Justin Pollard, http://www.designboom.com

IKEA kits have many components

Page 7:

Justin Pollard, http://www.designboom.com

They need to be assembled to work

Page 8:

Different frequencies of use: common vs. rare

Page 9:

What determines the frequency of installation/use of a gene/package?

Popularity (AKA preferential attachment):

• Frequency ~ self-amplifying popularity

• Relevant for social systems: WWW links, Facebook friendships, scientific citations

Functional role:

• Frequency ~ breadth or importance of the functional role

• Relevant for biological and technological systems where selection adjusts undeserved popularity

Page 10:

Empirical data on component frequencies

Bacterial genomes (eggnog.embl.de):

• 500 sequenced prokaryotic genomes

• 44,000 orthologous gene families

Linux packages (popcon.ubuntu.com):

• 200,000 Linux packages installed on 2,000,000 individual computers

Binary tables: a component is either present or not in a given system
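A binary presence/absence table of this kind reduces to a per-component frequency f, the fraction of systems that contain the component. The sketch below is a minimal illustration only; the function name `component_frequencies` and the toy genome and gene names are hypothetical and are not taken from the eggNOG or popcon data.

```python
from collections import Counter


def component_frequencies(systems):
    """Return {component: fraction of systems that contain it}.

    `systems` maps a system name (genome or computer) to the set of
    components (genes or packages) present in it, i.e. one row of the
    binary presence/absence table.
    """
    counts = Counter()
    for components in systems.values():
        # set() guards against a component listed twice in one system
        counts.update(set(components))
    n_systems = len(systems)
    return {name: k / n_systems for name, k in counts.items()}


# Hypothetical toy table: four "genomes" and three "genes".
systems = {
    "genome_A": {"rpsA", "lacZ", "recA"},
    "genome_B": {"rpsA", "recA"},
    "genome_C": {"rpsA", "lacZ"},
    "genome_D": {"rpsA"},
}

freq = component_frequencies(systems)
for name in sorted(freq, key=freq.get, reverse=True):
    print(name, freq[name])
```

In this toy table `rpsA` plays the role of a universal component (f = 1), while the other two are present in half of the systems.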

Page 11:

Frequency distributions

$P(f) \sim f^{-1.5}$ except the top $\sqrt{N}$ "universal" components with $f \sim 1$

[Figure: frequency distribution with regions labeled Core, Shell, Cloud, and ORFans]

TY Pang, S. Maslov, PNAS (2013)

Page 12:

How to quantify functional importance?

We want to check Frequency ~ Importance

Usefulness = Importance ~ a component is needed for proper functioning of other components

Dependency network:

• A → B means A depends on B for its function

• Formalized for Linux software packages

• For metabolic enzymes, given by upstream-downstream positions in pathways

Frequency ~ dependency degree, Kdep

Kdep = the total number of components that directly or indirectly depend on the selected one
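The definition of Kdep can be made concrete with a small graph traversal. This is a minimal sketch under stated assumptions: the function name `dependency_degrees`, the dictionary layout, and the toy package names are all hypothetical, and it is not the procedure used in the cited paper. It reverses each "A depends on B" edge and counts everything reachable from a component through the reversed edges.

```python
from collections import defaultdict, deque


def dependency_degrees(depends_on):
    """Compute Kdep for every component.

    `depends_on[a]` is the set of components that `a` directly depends on.
    Kdep of a component is the number of other components that depend on
    it directly or indirectly.
    """
    # Reverse the edges: dependents[b] = components that directly need b.
    dependents = defaultdict(set)
    nodes = set(depends_on)
    for a, deps in depends_on.items():
        for b in deps:
            dependents[b].add(a)
            nodes.add(b)

    kdep = {}
    for start in nodes:
        # Breadth-first search over reversed edges from `start`.
        seen = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for d in dependents[node]:
                if d not in seen:
                    seen.add(d)
                    queue.append(d)
        kdep[start] = len(seen) - 1  # exclude the component itself
    return kdep


# Hypothetical toy dependency network.
depends_on = {
    "app": {"libgui", "libc"},
    "libgui": {"libc"},
    "tool": {"libc"},
    "libc": set(),
}

kdep = dependency_degrees(depends_on)
print(kdep)
```

Running one search per component is quadratic in the worst case, which is fine for a toy example but would need a smarter traversal for hundreds of thousands of packages.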

Page 13:

TY Pang, S. Maslov, PNAS (2013)

Page 14:

Correlation coefficient ~0.4 for both Linux and genes

Could be improved by using weighted dependency degree

Frequency is positively correlated with functional importance

TY Pang, S. Maslov, PNAS (2013)

Page 15:

Warm-up: tree-like metabolic network

Kdep=5

Kdep=15

TCA cycle

TY Pang, S. Maslov, PNAS (2013)

Page 16:

Dependency degree distribution on a critical branching tree

$P(K) \sim K^{-1.5}$ for a critical branching tree

Paradox: $K_{\max}^{-0.5} \sim 1/N \Rightarrow K_{\max} = N^{2} > N$

Answer: parent tree size imposes a cutoff: there will be $\sqrt{N}$ "core" nodes with $K_{\max} = N$ present in almost all systems (ribosomal genes or core metabolic enzymes)

Need a new model: in a tree D = 1, while in real systems D ~ 2 > 1
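The paradox and its resolution can be written out as a short heuristic estimate. This is a sketch, assuming N is the number of nodes in the parent tree and that the tail of P(K) is a pure power law; the cumulative form $P(K_{dep} \ge K)$ is notation introduced here for illustration.

```latex
\begin{align}
P(K) \sim K^{-3/2}
  &\;\Rightarrow\; P(K_{dep} \ge K) \sim K^{-1/2} \\
N \, K_{\max}^{-1/2} \sim 1
  &\;\Rightarrow\; K_{\max} \sim N^{2} > N
  \quad \text{(impossible: at most } N \text{ nodes can depend on any one node)} \\
N \, P(K_{dep} \ge N) \sim N \cdot N^{-1/2}
  &= \sqrt{N}
  \quad \text{(nodes pushed to the cutoff } K_{dep} \approx N \text{)}
\end{align}
```

The first line integrates the power law, the second asks where the largest of N samples would fall, and the third counts how many nodes the unconstrained distribution would place at or beyond the cutoff, which recovers the $\sqrt{N}$ core.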

Page 17:

Bottom-down model of dependency network evolution

Components added gradually over evolutionary time

New component directly depends on D previously existing components selected randomly

Versions:

• D is drawn from some distribution → same as above

• Recent components are preferentially selected → citations

• There is a fixed probability to connect to any previously existing component → food webs
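The basic version of the model (fixed D, uniformly random choice among earlier components) is simple enough to simulate directly. The code below is a minimal sketch under those assumptions; the function names `grow_dependency_network` and `count_dependents`, the seed, and the network size are hypothetical choices, not parameters from the talk.

```python
import random


def grow_dependency_network(n_components, d, seed=0):
    """Add components one at a time; each new component directly depends
    on `d` distinct, uniformly chosen earlier components (fewer if fewer
    than `d` exist yet). Returns depends_on[i] = list of direct deps."""
    rng = random.Random(seed)
    depends_on = [[] for _ in range(n_components)]
    for new in range(1, n_components):
        k = min(d, new)
        depends_on[new] = rng.sample(range(new), k)
    return depends_on


def count_dependents(depends_on):
    """Return Kdep for each component: how many later components depend
    on it directly or indirectly."""
    n = len(depends_on)
    closure = [set() for _ in range(n)]  # all direct + indirect deps of i
    kdep = [0] * n
    for i in range(n):
        # Dependencies always point to earlier components, so closure[j]
        # is already complete for every j in depends_on[i].
        for j in depends_on[i]:
            closure[i].add(j)
            closure[i] |= closure[j]
        for j in closure[i]:
            kdep[j] += 1
    return kdep


network = grow_dependency_network(n_components=200, d=2)
kdep = count_dependents(network)
print("Kdep of the oldest component:", kdep[0])
print("Kdep of the newest component:", kdep[-1])
```

In this version every component ultimately depends on the very first one, so the oldest component has Kdep = N - 1 and the newest has Kdep = 0. Storing the full closure costs memory quadratic in N in the worst case, which is acceptable only for small illustrative networks.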

Page 18:

• p(t,T): probability that a component added at time T directly or indirectly depends on one added at time t


Page 20:

Kdep and Kout degree distributions

Page 21:

Kdep decreases with layer number

[Figure panels: Linux; Model with D=2]

TY Pang, S. Maslov, PNAS (2013)

Page 22:

Zipf plot for Kdep distributions

[Figure panels: Metabolic enzymes vs. Model; Linux vs. Model]

TY Pang, S. Maslov, PNAS (2013)

Page 23:

Frequency distributions

$P(f) \sim f^{-1.5}$ except the top $\sqrt{N}$ "universal" components with $f \sim 1$

[Figure: frequency distribution with regions labeled Core, Shell, Cloud, and ORFans]

TY Pang, S. Maslov, PNAS (2013)

Page 24:

What experiments does P(f) help to interpret?

Page 25:

Pan-genome of E. coli strains

M Touchon et al. PLoS Genetics (2009)

Page 26:

Metagenomes

The Human Microbiome Project Consortium, Nature (2012)

Page 27:

Pan-genome scaling

Page 28:

Pan-genome of all bacteria

Slope = -0.4; prediction of the toolbox model: -0.5

P. Lapierre, JP Gogarten, TIG (2009)

$(\#\text{ of genes in pan-genome}) \sim (\#\text{ of sequenced genomes})^{0.5}$

$(\#\text{ of new genes added to pan-genome}) \sim (\#\text{ of sequenced genomes})^{-0.5}$
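One way to see where exponents of 0.5 and -0.5 can come from is a numerical check built on the frequency distribution alone. The sketch below is not the toolbox-model calculation; it assumes (hypothetically) that each gene family is present in each genome independently with probability f, that P(f) ~ f^(-1.5) down to a small cutoff `f_min`, and it ignores the universal core. Under those assumptions the expected pan-genome size of n genomes is proportional to the integral of f^(-1.5) (1 - (1 - f)^n) over f.

```python
import math


def expected_pangenome_size(n_genomes, f_min=1e-8, steps=4000):
    """Unnormalized expected number of distinct gene families seen in
    `n_genomes` genomes: the integral over [f_min, 1] of
    f**-1.5 * (1 - (1 - f)**n_genomes) df, evaluated with the trapezoid
    rule on a logarithmic grid in f."""
    log_lo = math.log(f_min)
    h = (0.0 - log_lo) / steps
    total = 0.0
    for i in range(steps + 1):
        f = min(math.exp(log_lo + i * h), 1.0)
        # df = f d(ln f), so f**-1.5 df becomes f**-0.5 d(ln f)
        integrand = f ** -0.5 * (1.0 - (1.0 - f) ** n_genomes)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * integrand * h
    return total


def scaling_exponent(n1=100, n2=10000):
    """Two-point log-log slope of pan-genome size vs. number of genomes."""
    g1 = expected_pangenome_size(n1)
    g2 = expected_pangenome_size(n2)
    return math.log(g2 / g1) / math.log(n2 / n1)


exponent = scaling_exponent()
print("pan-genome scaling exponent:", round(exponent, 2))
```

The slope comes out close to 0.5. Differentiating a size that grows as n^0.5 with respect to n gives a rate of new genes per added genome that falls off as n^(-0.5), matching the second relation on the slide.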

Page 29:

Bacterial genome evolution happens in cooperation with phages

Page 30:

Comparative genomics of E. coli implicates phages for BitTorrent

Phage capacity: 20 kb

Other strains up to 40 kb

K-12 to B comparison

1 kb: gene length

Page 31:

Phage-Bacteria Infection Network

Data from Flores et al. (2011); experiments by Moebus, Nattkemper (1981)

WWW from AT&T website circa 1996 visualized by Mark Newman

Page 32:

Why eukaryotes run Windows?

Dependency network = reuse of components

• Bacteria do not keep redundant genes after HGT

• Linux developers rely on previous efforts

• Pros: smaller genomes, open source, economies of scale

• Cons: less specialized, potentially unstable, "dependency hell"

Eukaryotes are like Windows or Mac OS X

• Keep redundant components

• Proprietary software

Page 33:

Figure adapted from S. Maslov, TY Pang, K. Sneppen, S. Krishna, PNAS (2009)

[Figure: # of pathways (or their regulators) vs. # of genes]

Page 34:

Software packages for Linux

[Figure: # of selected packages vs. # of installed packages on log-log axes; Linux data, slope 1.7]

$N_{\text{selected packages}} \sim N_{\text{installed packages}}^{1.7}$

Page 35:

Collaborators: Tin Yau Pang, Stony Brook University

Support:

Office of Biological and Environmental Research

Page 36:

Thank you!