Center for Biological Sequence Analysis The Technical University of Denmark DTU Comparative...
Center for Biological Sequence Analysis, The Technical University of Denmark (DTU)
Comparative Microbial Genomics group
Towards a “Systems Microbiology” of E. coli
Dave Ussery
Biological Sequence Analysis
DTU course # 27803
15 May, 2006
Part the first: A Brief History of Systems Biology
What is Systems Microbiology?
The ability to observe, quantitatively define, quantitatively simulate, and rationally manipulate, the complete gene-expression patterns and molecular-interaction networks of a microbe has created an entirely new scientific discipline: "systems microbiology."
"Systems Biology" articles in PubMed
157 as of 14 May, 2006
PNAS, vol. 102 no. 48, pages 17296–17301, 29 November, 2005
PNAS, vol. 102 no. 48, pages 17302–17307, 29 November, 2005
Part the second: Gene expression in E. coli
Mol. Genet. Genomics, 267:721-729, (2001).
Genome Biology, 5:252, (2004).
1. What is Regulated?
2. How is it Regulated?
http://string.embl.de/
1. What is Regulated?
2. How is it Regulated?
Three levels of Regulation
1. Global - chromatin: ~100,000 proteins/cell
2. Sigma factors: ~1,000 proteins/cell
3. Transcription factors: 10 to 100 proteins/cell
1. Global - chromatin
2. Sigma factors
-35
-10 TATA
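As a rough illustration of what the -35 and -10 labels refer to, the sketch below scans a DNA string for the two sigma-70 hexamers separated by a spacer. The consensus sequences (TTGACA and TATAAT), the 15 to 19 bp spacer range, the mismatch tolerance and the function names are assumptions taken from the textbook description of E. coli promoters, not from the slides.

```python
# Textbook consensus hexamers for E. coli sigma-70 promoters (assumed, not from the slides).
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"


def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoters(seq, max_mismatch=2, spacers=range(15, 20)):
    """Return (start of -35 box, spacer length, -35 hexamer, -10 hexamer) tuples.

    A hit needs both hexamers within max_mismatch of the consensus and a
    spacer length drawn from `spacers`. Only the given strand is scanned.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 6 + 1):
        box35 = seq[i:i + 6]
        if mismatches(box35, MINUS_35) > max_mismatch:
            continue
        for spacer in spacers:
            j = i + 6 + spacer
            box10 = seq[j:j + 6]
            if len(box10) == 6 and mismatches(box10, MINUS_10) <= max_mismatch:
                hits.append((i, spacer, box35, box10))
    return hits


# Hypothetical test sequence: perfect hexamers with a 17 bp spacer.
demo = "GG" + MINUS_35 + "A" * 17 + MINUS_10 + "GG"
print(find_promoters(demo, max_mismatch=0))
# → [(2, 17, 'TTGACA', 'TATAAT')]
```

A consensus match like this is only a first filter; it ignores the reverse strand and the structural features of promoter DNA.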
3. Transcription factors
Interlude: Comparison of 20 E. coli genomes
Predicted vs. Published Genes

strain        pathotype     size (bp)  contigs  genes (a)  genes (b)
O157 EDL93    EHEC          5528445    1        4886       5324
O157 RIMD     EHEC          5498450    1        4987       5253
E22           EPEC          5516160    109      4943       4788 (d)
E110019       EPEC          5384084    119      4839       4746
B171          EPEC          5299753    159      4780       4467
53638         EIEC          5289471    119      4779       4783
E2348/69      EPEC          5157864    12       4673       n/a (d)
CFT073        UPEC          5231428    1        4653       5379
UTI89         UPEC          5065741    1        4466       5066
H10407        ETEC          5347466    95       4808       4808
B7A           ETEC          5202558    198      4646       4637
E24377A       ETEC          4980187    1        4407       4254
F11           ExPEC         5206906    88       4593       4467
NMEC RS18     NMEC          5089235    1        4896       n/a (g)
O42           EAEC          5355323    1        4713       n/a (d)
101-1         EAEC          4880380    70       4353       4238 (e)
HS            apathogenic   4643538    1        4126       3689
K-12 W3110    apathogenic   4641433    1        4122       4227 (f)
K-12 MG1655   apathogenic   4639675    1        4133       4237
B03           -             4629810    1        4076       4387
average       -             5129395    -        4647       4593

(a) predicted with EasyGene (Larsen and Krogh, 2003)
(b) published; GenBank (NCBI) information, genome project database
(c) contiguous 16S RNA
(d) TIGR
(e) NARA Institute
(f) U Wisconsin
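The gap between the EasyGene-predicted and the published gene counts can be made explicit with a few lines of Python. This is a minimal sketch using five rows copied from the table above; the variable and function names are illustrative assumptions, and the choice of rows is arbitrary.

```python
# (strain, EasyGene-predicted genes, published genes), copied from the table above.
rows = [
    ("O157 EDL93", 4886, 5324),
    ("CFT073", 4653, 5379),
    ("H10407", 4808, 4808),
    ("HS", 4126, 3689),
    ("K-12 MG1655", 4133, 4237),
]


def gene_count_gap(rows):
    """Map each strain to (published gene count - predicted gene count)."""
    return {strain: published - predicted for strain, predicted, published in rows}


gaps = gene_count_gap(rows)

# List strains from the most under-annotated to the most over-annotated,
# relative to the EasyGene prediction.
for strain, gap in sorted(gaps.items(), key=lambda kv: kv[1]):
    print(f"{strain:12s} {gap:+5d}")
```

A positive gap means the published annotation lists more genes than EasyGene predicts; a negative gap means fewer.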
Codon usage of selected strains
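For readers unfamiliar with the term, codon usage is simply the relative frequency of each triplet in the coding sequences of a genome. The sketch below computes it for a single in-frame coding sequence; the function name and the toy sequence are assumptions for illustration, and a real comparison would pool all annotated genes of each strain.

```python
from collections import Counter


def codon_usage(cds):
    """Return the relative frequency of each codon in an in-frame coding sequence.

    Trailing bases that do not fill a complete codon are ignored.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


# Hypothetical six-codon sequence: ATG AAA AAA GCG CTG TAA
usage = codon_usage("ATGAAAAAAGCGCTGTAA")
for codon, freq in sorted(usage.items()):
    print(codon, round(freq, 3))
```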
20 “public” genomes at hand

strain        pathotype     size (bp)  GC%  contigs  genes (a)  genes (b)  rRNA (c)  tRNA
O157 EDL93    EHEC          5528445    50   1        4886       5324       7         100
O157 RIMD     EHEC          5498450    50   1        4987       5253       7         103
E22           EPEC          5516160    50   109      4943       4788 (d)   3         84
E110019       EPEC          5384084    50   119      4839       4746       3         70
B171          EPEC          5299753    50   159      4780       4467       0         70
53638         EIEC          5289471    50   119      4779       4783       2         80
E2348/69      EPEC          5157864    50   12       4673       n/a (d)    7         91
CFT073        UPEC          5231428    51   1        4653       5379       7         89
UTI89         UPEC          5065741    50   1        4466       5066       7         102
H10407        ETEC          5347466    50   95       4808       4808       9         84
B7A           ETEC          5202558    50   198      4646       4637       2         70
E24377A       ETEC          4980187    50   1        4407       4254       6         97
F11           ExPEC         5206906    50   88       4593       4467       2         88
NMEC RS18     NMEC          5089235    50   1        4896       n/a (g)    7         103
O42           EAEC          5355323    50   1        4713       n/a (d)    7         112
101-1         EAEC          4880380    50   70       4353       4238 (e)   0         65
HS            apathogenic   4643538    50   1        4126       3689       6         89
K-12 W3110    apathogenic   4641433    50   1        4122       4227 (f)   7         88
K-12 MG1655   apathogenic   4639675    50   1        4133       4237       7         88
B03           -             4629810    51   1        4076       4387       6         86
average       -             5129395    50   -        4647       4593       5         88

(a) predicted with EasyGene (Larsen and Krogh, 2003)
(b) published; GenBank (NCBI) information, genome project database
(c) contiguous 16S RNA
(d) TIGR
(e) NARA Institute
(f) U Wisconsin
E. coli Core
E. coli strain-specific genes
Total of 8027 different E. coli gene families
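The split into a core, strain-specific genes and a total count of gene families maps naturally onto set operations. The sketch below is an illustration under stated assumptions: each genome is taken to be already reduced to a set of gene-family identifiers (how families are defined is not covered by the slides), and the strain and family names are hypothetical.

```python
def core_and_pan(genomes):
    """Split gene families into core, pan and strain-specific sets.

    genomes: dict mapping strain name -> iterable of gene-family identifiers.
    Returns (core, pan, specific), where `specific` maps each strain to the
    families found in that strain only.
    """
    families = {strain: set(fams) for strain, fams in genomes.items()}
    all_sets = list(families.values())
    pan = set().union(*all_sets)  # every family seen in any strain
    core = set.intersection(*all_sets) if all_sets else set()  # seen in all strains
    specific = {}
    for strain, fams in families.items():
        others = set().union(*(f for s, f in families.items() if s != strain))
        specific[strain] = fams - others
    return core, pan, specific


# Hypothetical toy input; real input would be the 20 annotated genomes.
toy = {
    "strainA": {"fam1", "fam2", "fam3"},
    "strainB": {"fam1", "fam2", "fam4"},
    "strainC": {"fam1", "fam5"},
}
core, pan, specific = core_and_pan(toy)
print(len(core), len(pan), specific)
```

In this picture the total of 8027 families corresponds to the size of the pan set, and the core is the intersection across all strains.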
Part the last:
• DNA curvature
• Chromatin structure
J. Mol. Biol., 299:907-930, (2000).
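Promoter Structural profile
[Figure: promoter region with -35, -10 and +1 marked, followed by the CDS and its mRNA; RNA polymerase subunits α, β, β’ and σ shown at the promoter. Annotations: "DNA curvature, flexibility important here", "melts", "rigid", "cruciform".]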
Biochimie, 83:201-212, (2001).
plectonemic supercoils
Curved DNA
RNA polymerase
toroidal supercoiling
topological boundary
Genome Biology, 5:252, (2004).
E. coli genomic landscape
CHRO2003
Data kindly provided by Alain Stintzi
protein synthesis
electron transport
DNA replication, flagella biosynthesis
anaerobic growth, alcohol metabolism, acid response
stress
membrane, secretion
aerobic growth, oxidation
cell cycle control
Genomic landscape of E. coli kindly provided by Josh Stuart, Stanford; Science, 293:2087-2092 (2001).