Olivier Elemento, Tavazoie lab

Ab initio genotype-phenotype association reveals intrinsic modularity in genetic networks (in bacteria). Olivier Elemento, Tavazoie lab


Transcript of Olivier Elemento, Tavazoie lab

Page 1: Olivier Elemento, Tavazoie lab

Ab initio genotype-phenotype association reveals intrinsic modularity in genetic networks (in bacteria)

Olivier Elemento, Tavazoie lab

Page 2: Olivier Elemento, Tavazoie lab

Some bacterial phenotypes …

• Motility

• Spore formation

• Gram-staining

• Hyper-thermophily

Page 3: Olivier Elemento, Tavazoie lab

Can we find the genes underlying these phenotypes?

Page 4: Olivier Elemento, Tavazoie lab

http://www.ncbi.nlm.nih.gov/genomes/lproks.cgi

Page 5: Olivier Elemento, Tavazoie lab
Page 6: Olivier Elemento, Tavazoie lab

Motility in bacteria

• Some (but not all) bacteria are motile

• Motile bacteria may share genes involved in motility

• These genes may be absent from non-motile bacteria

Page 7: Olivier Elemento, Tavazoie lab

[Figure: motility (present / absent) in each of ~200 bacterial genomes: B. subtilis, B. anthracis, C. jejuni, E. coli, M. tuberculosis, S. aureus, M. leprae, S. pneumoniae, …]

(Levesque et al, 2003; Jim, Parmar, Singh and Tavazoie, 2004)

Page 8: Olivier Elemento, Tavazoie lab

[Figure: motility (present / absent) in each of ~200 bacterial genomes (B. subtilis, B. anthracis, C. jejuni, E. coli, M. tuberculosis, S. aureus, M. leprae, S. pneumoniae, …), shown alongside the presence / absence profile of E. coli Gene X in the same genomes]

(Levesque et al, 2003; Jim, Parmar, Singh and Tavazoie, 2004)

Page 9: Olivier Elemento, Tavazoie lab

[Figure: motility (present / absent) in each of ~200 bacterial genomes (B. subtilis, B. anthracis, C. jejuni, E. coli, M. tuberculosis, S. aureus, M. leprae, S. pneumoniae, …), shown alongside the presence / absence profiles of E. coli Gene X and E. coli Gene Y in the same genomes]

High correlation

Gene Y is likely involved in motility

(Levesque et al, 2003; Jim, Parmar, Singh and Tavazoie, 2004)
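The slides do not name the correlation measure. As a minimal sketch, assuming gene presence and the phenotype are both coded as 0/1 per genome, the phi coefficient (Pearson correlation on binary vectors) is one way to score how closely a gene's phylogenetic profile tracks a phenotype. The function name and the toy vectors below are hypothetical and are not taken from the talk.

```python
from math import sqrt


def phi_correlation(profile, phenotype):
    """Pearson correlation between two equal-length 0/1 vectors (phi coefficient)."""
    if len(profile) != len(phenotype):
        raise ValueError("vectors must cover the same genomes")
    pairs = list(zip(profile, phenotype))
    n11 = sum(1 for g, p in pairs if g and p)          # gene present, phenotype present
    n10 = sum(1 for g, p in pairs if g and not p)      # gene present, phenotype absent
    n01 = sum(1 for g, p in pairs if not g and p)      # gene absent, phenotype present
    n00 = sum(1 for g, p in pairs if not g and not p)  # both absent
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return 0.0  # a constant vector carries no information
    return (n11 * n00 - n10 * n01) / denom


# Toy data (hypothetical): 8 genomes, 1 = present, 0 = absent.
motility = [1, 1, 1, 1, 0, 0, 0, 0]
gene_x = [1, 1, 1, 1, 1, 1, 1, 1]  # present everywhere: uninformative
gene_y = [1, 1, 1, 0, 0, 0, 0, 0]  # tracks motility closely

score_x = phi_correlation(gene_x, motility)
score_y = phi_correlation(gene_y, motility)
print(f"Gene X: {score_x:.2f}, Gene Y: {score_y:.2f}")
```

In this toy case the gene that is present in every genome scores zero, while the gene whose presence follows the phenotype scores high, which is the contrast the slide draws between Gene X and Gene Y.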

Page 10: Olivier Elemento, Tavazoie lab

[Figure: motility (present / absent) in each of ~200 bacterial genomes (B. subtilis, B. anthracis, C. jejuni, E. coli, M. tuberculosis, S. aureus, M. leprae, S. pneumoniae, …), shown alongside the presence / absence profile of B. subtilis gene Z (e.g. CheV) in the same genomes]

(Levesque et al, 2003; Jim, Parmar, Singh and Tavazoie, 2004)

Page 11: Olivier Elemento, Tavazoie lab

• Calculate a phylogenetic profile for all 600,000 genes in bacteria (~1.2x10^8 BLASTs, i.e. 600,000 genes against each of ~200 genomes)

• Collect the genes most correlated with the phenotype in all bacteria that have the phenotype (~3,000 for motility)

• Merge homologous genes (based on sequence similarity)
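The first two steps above can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in the talk: the E-value cutoff, the simple agreement score, and every name and value in the toy data are hypothetical stand-ins. The BLAST searches themselves are assumed to have been run already and summarized as one best-hit E-value per (gene, genome) pair.

```python
E_VALUE_CUTOFF = 1e-5  # assumed threshold; the slides do not state one


def build_profiles(best_hits, genes, genomes, cutoff=E_VALUE_CUTOFF):
    """Return {gene: [0/1 per genome]} from best-hit E-values.

    best_hits maps (gene, genome) -> E-value of the best BLAST hit;
    a missing key means no hit was reported for that pair.
    """
    return {
        gene: [
            1 if best_hits.get((gene, genome), float("inf")) <= cutoff else 0
            for genome in genomes
        ]
        for gene in genes
    }


def agreement(profile, phenotype):
    """Fraction of genomes where gene presence matches phenotype presence.

    A deliberately simple stand-in for a correlation measure.
    """
    return sum(g == p for g, p in zip(profile, phenotype)) / len(phenotype)


def select_genes(profiles, phenotype, min_score):
    """Keep genes whose profile agrees with the phenotype at least min_score."""
    return sorted(
        gene for gene, prof in profiles.items()
        if agreement(prof, phenotype) >= min_score
    )


# Toy data (hypothetical): 4 genomes, the first two have the phenotype.
genomes = ["genome_A", "genome_B", "genome_C", "genome_D"]
phenotype = [1, 1, 0, 0]
best_hits = {
    ("gene_X", "genome_A"): 1e-30, ("gene_X", "genome_B"): 1e-25,
    ("gene_X", "genome_C"): 1e-28, ("gene_X", "genome_D"): 1e-31,
    ("gene_Y", "genome_A"): 1e-40, ("gene_Y", "genome_B"): 1e-12,
    ("gene_Y", "genome_C"): 2.0,  # weak hit, treated as absent
}

profiles = build_profiles(best_hits, ["gene_X", "gene_Y"], genomes)
selected = select_genes(profiles, phenotype, min_score=0.9)
print(profiles, selected)
```

Passing the profiles as plain 0/1 lists keeps the selection step independent of how the hits were produced, so a different score can be swapped in without touching the profile construction.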

Page 12: Olivier Elemento, Tavazoie lab

~3,000 motility genes

Merging homologous (orthologous/paralogous) genes

75 groups of homologs (Generic Genes)
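The slides say only that homologs are merged based on sequence similarity. One possible reading, sketched below, is single-linkage grouping: any two genes linked by a chain of similar pairs fall into the same Generic Gene. The union-find approach, the function name, and the gene labels are assumptions made for illustration, and the list of similar pairs is assumed to come from an earlier similarity search.

```python
def merge_homologs(genes, similar_pairs):
    """Group genes into connected components of the similarity graph.

    genes: iterable of gene identifiers.
    similar_pairs: iterable of (gene_a, gene_b) judged similar in sequence.
    Returns a sorted list of sorted groups.
    """
    genes = list(genes)
    parent = {g: g for g in genes}

    def find(g):
        # Follow parent links to the group representative, compressing the path.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b in similar_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = {}
    for g in genes:
        groups.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in groups.values())


# Toy data (hypothetical labels in "species:gene" form).
genes = ["E.coli:Y", "B.subtilis:Y", "C.jejuni:Y", "E.coli:Z", "B.subtilis:Z"]
similar_pairs = [
    ("E.coli:Y", "B.subtilis:Y"),
    ("B.subtilis:Y", "C.jejuni:Y"),
    ("E.coli:Z", "B.subtilis:Z"),
]

generic_genes = merge_homologs(genes, similar_pairs)
print(generic_genes)
```

Single linkage is the most permissive choice and can chain distant sequences together through intermediates. A stricter grouping rule would produce more, smaller groups.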


Page 13: Olivier Elemento, Tavazoie lab

E. coli Gene Y

B. subtilis Gene Y

B. anthracis Gene Y

C. jejuni Gene Y

Generic Gene Y

Motility

Page 14: Olivier Elemento, Tavazoie lab
Page 15: Olivier Elemento, Tavazoie lab

Can we recover such modules?

Generic Gene V

Generic Gene W

Generic Gene Y

Generic Gene Z

Motility

Page 16: Olivier Elemento, Tavazoie lab

Can we recover such modules?

Generic Gene V

Generic Gene W

Generic Gene Y

Generic Gene Z

Module 1

Module 2

Page 17: Olivier Elemento, Tavazoie lab

Can we recover such modules?

• Cluster Generic Gene profiles 1,000 times using Iclust with different random initializations (obtain slightly different clusters)

• Group together genes which almost always end up in the same cluster

Iclust: Slonim et al, 2006
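The consensus step can be sketched independently of the clustering algorithm. The code below does not implement Iclust. It assumes the cluster labels from each run are already available, counts how often each pair of Generic Genes shares a cluster, and groups the pairs that do so in at least a chosen fraction of runs. The threshold, the function name, and the toy runs are hypothetical.

```python
from itertools import combinations


def consensus_modules(runs, min_fraction=0.9):
    """Group items that share a cluster in at least min_fraction of runs.

    runs: list of dicts, each mapping item -> cluster label for one run.
    Returns a list of sorted groups (singletons included).
    """
    items = sorted(runs[0])
    together = {pair: 0 for pair in combinations(items, 2)}
    for labels in runs:
        for a, b in together:
            if labels[a] == labels[b]:
                together[(a, b)] += 1

    # Link pairs that co-cluster often enough, then take connected components.
    neighbours = {item: set() for item in items}
    for (a, b), count in together.items():
        if count / len(runs) >= min_fraction:
            neighbours[a].add(b)
            neighbours[b].add(a)

    modules, seen = [], set()
    for start in items:
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for nxt in neighbours[node] - seen:
                seen.add(nxt)
                stack.append(nxt)
        modules.append(sorted(component))
    return modules


# Toy data (hypothetical): 10 runs over four Generic Genes; one run differs.
usual = {"Generic Gene V": 0, "Generic Gene W": 0,
         "Generic Gene Y": 1, "Generic Gene Z": 1}
odd = {"Generic Gene V": 0, "Generic Gene W": 0,
       "Generic Gene Y": 0, "Generic Gene Z": 1}
runs = [usual] * 9 + [odd]

modules = consensus_modules(runs, min_fraction=0.8)
print(modules)
```

Only the pairwise co-occurrence counts matter here, so cluster labels do not need to be comparable between runs.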

Page 18: Olivier Elemento, Tavazoie lab

GG-3 flagellar biosynthetic protein flhB
GG-4 flagellar biosynthetic protein flhA
GG-5 flagellar biosynthetic protein fliP
GG-22 flagellar biosynthetic protein fliR
GG-56 flagellar biosynthetic protein fliQ
GG-6 flagellar hook flgE/F/G
GG-7 flagellar motor switch fliG
GG-10 flagellar basal-body rod flgC
GG-12 flagellar MS-ring fliF
GG-13 flagellar hook-associated protein 1 flgK
GG-18 flagellar motor switch fliN
GG-21 flagellar motor switch fliM
GG-27 flagellar hook-associated protein 3 flgL
GG-29 flagellar hook-associated protein 2 fliD
GG-8 flagellin fliC
GG-17 motility protein A motA
GG-74 flagellar protein fliS
GG-20 motility protein B motB
GG-1 methyl-accepting chemotaxis protein

GG-11 chemotaxis protein cheA
GG-45 methyl-accepting chemotaxis protein
GG-73 methyl-accepting chemotaxis protein
GG-38 chemotaxis protein cheV
GG-15 chemotaxis protein cheW
GG-2 chemotaxis methyltransferase cheR
GG-30 glutamate methylesterase cheB

GG-32 flagellar L-ring protein precursor flgH
GG-36 flagellar P-ring protein precursor flgI

GG-9 RNA-polymerase sigma-54 factor
GG-14 transcription factor, sigma-54-dependent

[Figure axes: Motility GG index (both axes)]

These results are based on no prior knowledge, apart from genome sequences along with their phenotypic annotations

Page 19: Olivier Elemento, Tavazoie lab

Phylogenetic profiles / modules for motility

Page 20: Olivier Elemento, Tavazoie lab

Motility

fliI, cheY

fliO, cheZ

E. coli chemotaxis and flagellum modules

Some E. coli genes are not recovered. Why?

Page 21: Olivier Elemento, Tavazoie lab

Phylogenetic profiles / modules for Gram-staining

GG-9 PAL peptidoglycan-associated lipoprotein
GG-10 tolQ/exbB protein
GG-12 tolB protein
GG-72 lipid A biosynthesis lauroyl acyltransferase

GG-2 3-deoxy-manno-octulosonate cytidylyltransferase
GG-3 UDP-3-O glucosamine N-acyltransferase
GG-4 lipid-A-disaccharide synthase
GG-5 polysialic acid capsule expression protein
GG-7 UDP-3-O N-acetylglucosamine deacetylase
GG-8 3-deoxy-D-manno-octulosonic-acid transferase
GG-11 tetraacyldisaccharide 4'-kinase
GG-1 outer membrane protein yaeT

GG-68 glutaredoxin 3
GG-29 2-octaprenyl-6-methoxyphenol hydroxylase
GG-31 glutathione synthetase
GG-18 glutaredoxin-related protein
GG-73 coproporphyrinogen III oxidase, aerobic
GG-107 hydroxyacylglutathione hydrolase

GG-20 HlyD family secretion protein
GG-96 HlyD family secretion protein
GG-53 HlyD family secretion protein
GG-111 membrane fusion protein (MFP)
GG-15 pyridoxal phosphate biosynthetic protein
GG-52 pyridoxal phosphate biosynthetic protein
GG-35 ABC transporter, permease

Page 22: Olivier Elemento, Tavazoie lab

GG-8 sporulation-blocking protein yabP
GG-130 sporulation sigma-E factor processing peptidase
GG-58 stage III sporulation protein AC
GG-6 stage III sporulation protein AD
GG-3 stage III sporulation protein D

GG-63 spore-cortex-lytic enzyme
GG-87 spore germination protein
GG-104 spore protease
GG-136 spore protease related
GG-71 stage III sporulation protein AB
GG-103 stage III sporulation protein AE
GG-132 stage III sporulation protein AG
GG-95 stage II sporulation protein E
GG-137 stage II sporulation protein M
GG-11 stage II sporulation protein P
GG-134 stage II sporulation protein R
GG-135 stage IV sporulation protein
GG-76 stage IV sporulation protein A
GG-46 stage IV sporulation protein B
GG-40 stage V sporulation protein AC
GG-34 stage V sporulation protein AD
GG-15 stage V sporulation protein AF
GG-37 translocation-enhancing protein
GG-94 hypothetical membrane protein
GG-127 hypothetical membrane protein

GG-49 small acid-soluble spore protein I sspI
GG-69 spoVID-dependent spore coat assembly factor
GG-101 spore coat protein
GG-52 spore coat protein E
GG-99 spore coat related, putative
GG-97 spore cortex biosynthesis, putative
GG-84 spore germination protein
GG-90 spore germination protein
GG-55 spore germination protein C1
GG-62 sporulation initiation phosphotransferase
GG-113 stage III sporulation protein AF
GG-64 stage IV sporulation protein FA
GG-91 stage VI sporulation protein D
GG-54 abi, CAAX amino terminal protease
GG-42 cytochrome C-550/C-551
GG-53 cytochrome C oxidase subunit IV
GG-36 menaquinol-cytochrome C reductase qcrC
GG-50 lipoprotein, putative
GG-18 prespore-specific transcriptional regulator
GG-66 putative lipoprotein
GG-56 putative ribonuclease H
GG-26 reductase ribT / acetyltransferase gnaT
GG-124 hypothetical membrane protein
GG-118 hypothetical membrane protein
GG-29 hypothetical cytosolic protein
GG-38 hypothetical cytosolic protein
GG-120 hypothetical cytosolic protein
GG-24 hypothetical protein
GG-27 hypothetical protein
GG-28 hypothetical protein
GG-30 hypothetical protein
GG-31 hypothetical protein
GG-32 hypothetical protein
GG-33 hypothetical protein
GG-41 hypothetical protein
GG-43 hypothetical protein
GG-47 hypothetical protein
GG-60 hypothetical protein
GG-61 hypothetical protein
GG-65 hypothetical protein
GG-67 hypothetical protein
GG-68 hypothetical protein
GG-70 hypothetical protein
GG-72 hypothetical protein
GG-73 hypothetical protein
GG-83 hypothetical protein
GG-88 hypothetical protein, HD domain
GG-100 hypothetical protein (ecsc)
GG-114 hypothetical protein
GG-116 hypothetical protein
GG-117 hypothetical protein

Focused hypotheses for experimental validation

Page 23: Olivier Elemento, Tavazoie lab

• Community sequencing

Page 24: Olivier Elemento, Tavazoie lab

Conclusion

• Systematic association of genotype / phenotype for several phenotypes

• Clustering reveals robust modules that correspond to protein complexes, signal transduction pathways, enzymatic pathways

• Many predictions that can be verified experimentally

Page 25: Olivier Elemento, Tavazoie lab

Acknowledgements

• Saeed Tavazoie

• Noam Slonim

• Tavazoie lab members