A historical account of RegulonDB
“Swiss-Prot 20 years meeting”, Fortaleza, Brazil, August 3rd, 2006
Julio Collado-Vides
Center for Genomic Sciences, UNAM, México
Campus Morelos, UNAM
S1, S2, S3, Sn… T1, T2, T3…Tn
Specific interactions map (i,j) dependency relationships
Collado-Vides (1991) Comput. Applic. Biosci.
Long-distance dependencies: (i, j) works, (i, m) does not work, (k, m) works
Collado-Vides J., Magasanik B. and Gralla J.D. (1991) "Control site location and transcriptional regulation in Escherichia coli" Microbiol. Reviews. 55:371-394
The collection of sigma 54 promoters
Universals of regulation or stamp collection?
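[Figure: derivation tree for the lac transcription unit. Node labels: Pr, Pr', Pr'', Pr''', Op, Op', Op(r), Op(p), Op(R), D-Op, I, I', I(R), I(p), I(r), D-I. Labels and positions in reading order: Op, di = -93; I, cj2 = -62.5, CRP; Pr, lac; Op, ci = +9, LacI; Op, di = +402.]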
A final derivation for a transcription unit
Collado-Vides J. (1992) "Grammatical model of the regulation of gene expression" Proc.Natl.Acad.Sci.USA 89:9405-9409
Predictions. A Linear set of combinations
Rosenblueth D.A., Thieffry D., Huerta A.M., Salgado H., and Collado-Vides J. (1996) "Syntactic recognition of regulatory regions in Escherichia coli" CABIOS 12: 415-422.
Map of the complete E.coli sequence.
[Figure: bar chart. Y axis: 0 to 90. X axis: bins from 0-10 to 100-101. Series: All, Activator, Repressor, Dual, Unknown.]
Pérez-Rueda E. and Collado-Vides J. (2000) Nucleic Acids Research 28: 1838-1847
HTH Position and Function of Regulators
Distances: between genes inside operons; between transcription units
Prediction of operons
Salgado et al., (2000) Proc.Natl.Acad.Sci.USA 97: 6652-6657
Prediction of operons
[Figure: sensitivity, specificity and accuracy (0 to 1) plotted against threshold (-1.5 to 1.5). Panels: Escherichia coli; Bacillus subtilis.]
Moreno-Hagelsieb (2001) Trends in Genetics
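Sensitivity, specificity and accuracy at a given threshold can be computed as in the minimal sketch below. It assumes each pair of adjacent genes has a prediction score and a known same-operon label. The function name, the scores and the labels are illustrative assumptions, not data or code from the cited work.

```python
def evaluate_threshold(scores, labels, threshold):
    """Return (sensitivity, specificity, accuracy) at one threshold.

    scores: hypothetical per-gene-pair scores (higher = more operon-like).
    labels: True if the pair is known to lie in the same operon.
    """
    tp = fp = tn = fn = 0
    for score, same_operon in zip(scores, labels):
        predicted_operon = score >= threshold
        if predicted_operon and same_operon:
            tp += 1
        elif predicted_operon and not same_operon:
            fp += 1
        elif not predicted_operon and same_operon:
            fn += 1
        else:
            tn += 1
    # Guard against empty classes so the sketch never divides by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    return sensitivity, specificity, accuracy


# Made-up example values, for illustration only.
scores = [1.2, 0.4, -0.3, -1.1]
labels = [True, True, False, False]
for threshold in (-1.5, 0.0, 0.5, 1.5):
    print(threshold, evaluate_threshold(scores, labels, threshold))
```

Sweeping the threshold over a range and plotting the three values gives curves of the kind shown on the slide.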
Promoter prediction. Dense Regions of Promoter-like Signals
Huerta A.M. and Collado-Vides J. (2003) Transcription in Dense Regions of Promoter-like Signals. J. Mol. Biol. 333:261-278
http://www.ccg.unam.mx/Computational_Genomics/regulondb/ http://www.ecocyc.org/
Annotation (Regulation, operon organization)
Summary of RegulonDB information by year.
Object 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 (4)
Regulons 99 83 83 165 166 172 179 183 266
Regulatory Interactions 533 433 433 642 935 990 1119 1303 1811 1930
TF Binding Sites 406 469 750 812 950 1129 1230 1326
Products:
RNAs 11 5 115 115 115 116 117 117 117
Polypeptides 4207 4290 4290 4290 4292 4290 4290 4290
Transcription Factors 80 83 83 165 166 172 179 183 (1) 139 150
Genes 542 456 4405 4405 4405 4405 4408 4408 4408 4408
Transcription units 292 230 374 528 657 694 747 803 1230 1304
Promoters 200 239 432 624 746 783 860 939 1160 1285
Effectors 35 36 36 36 66 66 67 68 73 73
External References 2050 2011 4394 4704 4943 5053 5224 1952 (2) 2787 (2) 2991 (2)
Synonyms 681 3525 3525 3525 3544 3578 3612 11945 16430
Terminators 40 86 106 108 118 122 128 130
RBSs 59 98 133 134 153 158 179 179
Conformation of Transcriptional Regulators 83 201 203 221 234 150 (3) 160
Evidences 39 39
External Databases 11 14
Operons 770 899
1. There is a total of 322 transcriptional DNA-binding regulators, of which 183 have experimental evidence; the rest have been predicted based on their helix-turn-helix DNA-binding motif.
2. Only MedLine references.
3. Only the active conformations.
4. RegulonDB Release 5.1.
[Figure: Summary of RegulonDB information by year. X axis: year, 1997 to 2005. Y axis: total, 0 to 2000. Series: Transcription Unit, Promoter, Terminator, Regulon, TF, Conformation, RI, Site, Effector, RBS.]
TRN and the Global and Local TFs
Martínez-Antonio A. Current Opinion in Microbiology, 2003
Classifying TFs based on their signal metabolite
• TFs were classified on the basis of the cellular location of the signal sensed:
• (a) External sensing: the signal is located outside the periplasm.
– two-component systems (E-TC)
– TFs using metabolites transported into the cell (E-TM)
• (b) Internal sensing: the signal is inside the periplasmic region.
– synthesized by the cell or redox state (I-SM)
– DNA-bending (I-DB)
• (c) Hybrid TFs: the signal is synthesized in the cell and can also be transported into the cell.
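The classification scheme above can be expressed as a small decision rule. This is only a sketch: the boolean annotations, the function name, the order in which the rules are checked, and the label "Hybrid" are assumptions made for illustration. The slide gives abbreviations only for E-TC, E-TM, I-SM and I-DB.

```python
def classify_tf(made_in_cell, transported_in, two_component=False, dna_bending=False):
    """Assign a sensing class to a TF from hypothetical boolean annotations.

    made_in_cell:   the signal is synthesized inside the cell.
    transported_in: the signal can be transported into the cell.
    two_component:  the TF belongs to a two-component system.
    dna_bending:    the TF is a DNA-bending protein.
    """
    if made_in_cell and transported_in:
        return "Hybrid"  # signal both synthesized and imported
    if two_component:
        return "E-TC"    # external sensing, two-component system
    if transported_in:
        return "E-TM"    # external sensing, transported metabolite
    if dna_bending:
        return "I-DB"    # internal sensing, DNA-bending
    return "I-SM"        # internal sensing, synthesized or redox signal


# Hypothetical annotations, not taken from RegulonDB.
print(classify_tf(made_in_cell=True, transported_in=True))
print(classify_tf(made_in_cell=False, transported_in=True))
```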
Internal sensing regulators govern regulation of regulators
• (a) Venn diagram showing the overlap in the transcriptional regulation of regulated genes affected by the Internal (32 TFs, 27%), External (62 TFs, 52%) and Hybrid (26 TFs, 22%) classes.
• (d) Regulation of regulators. Continuous lines: regulation; dashed lines: autoregulation.
Active annotation in RegulonDB
* Transcription start site
* Promoter assignment
* Sigma factor assignment
Results and data analysis
Experimental mapping procedure: 5’ RACE (rapid amplification of cDNA ends), modified for high throughput
Magallanes Project (with Enrique Morett at IBT): Global Transcription Initiation Mapping and Active Promoter Annotation
Transcriptional units database
Microarray database
Primer design
First genomic project for global mapping of transcription start sites in a prokaryotic organism
Magallanes Project. Examples: Hypothetical genes.
Transcriptional units experimentally explored: 421
Transcriptional units experimentally resolved: 164 (145 new data)
Experimental growth conditions tested: 3
Data up to date
b0119 (yeaL)
Genomic sequence:
AACGACACAATGCCCGGTGAATGAGATTCCCGGGCATTTTTTTATTTCTAAACCATCGCCGTTCCGCTGTTTTTCTCCG
GTAAGGCTGCGATAATTACATCAATGGCGCAATGCGATTTCGGTGCATTGCCGGGAGCAGAGGAACACACTATGGATTA
Annotated promoter: P1 (sigma 70); marked positions: -35, -10, +1
b0391 (yaiE)
Genomic sequence:
TACTGCAAAAACAGCTCGATGTTCCTGTCTTGCTGTCTAACGTATTGATTGCACGGCTGGCTGCGGAATTACTGGTGTA
ATTTTGCGTGACAGCCAGCGCCTCTGGCCCCTATAGTGAAGTAGATGTTCAACTACCAAACAGGGCCAGTTTATGCTTC
Annotated promoter: P1 (sigma 70); marked positions: -35, -10, +1
Computational Genomics Group
Araceli Huerta, Ernesto Pérez Rueda, David Rosenblueth, Denis Thieffry, Gabriel Moreno, Enrique Morett
The Lord of The Rings: Lessons for Genomics and Bioinformatics
1. Achievement of communities of diversity of beings
2. Ambitious projects
GETools
Full genome prediction of regulatory binding sites