Riboswitches: the oldest regulatory system? Mikhail Gelfand December 2004.
Riboswitches: the oldest regulatory system?
Mikhail Gelfand
December 2004
Riboflavin biosynthesis pathway
[Figure: riboflavin biosynthesis pathway.
Gene labels: ribA, ribB, ribD, ribG, ribH, ribE, ypaA.
Enzymes: GTP cyclohydrolase II; 3,4-DHBP synthase; pyrimidine deaminase; pyrimidine reductase; riboflavin synthase (two chains).
Inputs: GTP (purine biosynthesis pathway) and ribulose-5-phosphate (pentose-phosphate pathway).
Intermediates: 2,5-diamino-6-hydroxy-4-(5'-phosphoribosylamino)pyrimidine; 5-amino-6-(5'-phosphoribosylamino)uracil; 5-amino-6-(5'-phosphoribitylamino)uracil; 3,4-dihydroxy-2-butanone-4-phosphate; 6,7-dimethyl-8-ribityllumazine.
Product: riboflavin.]
5’ UTR regions of riboflavin genes from various bacteria 1 2 2’ 3 Add. 3’ Variable 4 4’ 5 5’ 1’ =========> ==> <== ===> -><- <=== -> <- ====> <==== ==> <== <========= BS TTGTATCTTCGGGG-CAGGGTGGAAATCCCGACCGGCGGT 21 AGCCCGTGAC-- 8 4 8 -----TGGATTCAGTTTAA-GCTGAAGCCGACAGTGAA-AGTCTGGAT-GGGAGAAGGATGAT BQ AGCATCCTTCGGGG-TCGGGTGAAATTCCCAACCGGCGGT 19 AGTCCGTGAC-- 8 5 8 -----TGGATCTAGTGAAACTCTAGGGCCGACAGT-AT-AGTCTGGAT-GGGAGAAGGATATG BE TGCATCCTTCGGGG-CAGGGTGAAATTCCCGACCGGCGGT 20 AGCCCGCGA--- 3 4 3 -----AGGATCCGGTGCGATTCCGGAGCCGACAGT-AT-AGTCTGGAT-GGGAGAAGGATGCC HD TTTATCCTTCGGGG-CTGGGTGGAAATCCCGACCGGCGGT 19 AGTCCGTGAC-- 10 4 10 ----–TGGACCTGGTGAAAATCCGGGACCGACAGTGAA-AGTCTGGAT-GGGAGAAGGAAACG Bam TGTATCCTTCGGGG-CTGGGTGAAAATCCCGACCGGCGGT 23 AGCCCGTGAC-- 8 4 8 ----–TGGATTCAGTGAAAAGCTGAAGCCGACAGTGAA-AGTCTGGAT-GGGAGAAGGATGAG CA GATGTTCTTCAGGG-ATGGGTGAAATTCCCAATCGGCGGT 2 AGCCCGCAA--- 3 4 3 ------AGATCCGGTTAAACTCCGGGGCCGACAGTTAA-AGTCTGGAT-GAAAGAAGAAATAG DF CTTAATCTTCGGGG-TAGGGTGAAATTCCCAATCGGCGGT 2 AGCCCGCG---- 7 6 7 --------ATTTGGTTAAATTCCAAAGCCGACAGT-AA-AGTCTGGAT-GGAAGAAGATATTT SA TAATTCTTTCGGGG-CAGGGTGAAATTCCCAACCGGCAGT 6 AGCCTGCGAC-- 11 3 11 ----–CTGATCTAGTGAGATTCTAGAGCCGACAGTTAA-AGTCTGGAT-GGGAGAAAGAATGT LLX ATAAATCTTCAGGG-CAGGGTGTAATTCCCTACCGGCGGT 2 AGCCCGCGA--- 4 4 4 -----ATGATTCGGTGAAACTCCGAGGCCGACAGT-AT-AGTCTGGAT-GAAAGAAGATAATA PN AACTATCTTCAGGG-CAGGGTGAAATTCCCTACCGGTGGT 2 AGCCCACGA--- 3 4 3 -----ATGATTTGGTGAAATTCCAAAGCCGACAGT-AT-AGTCTGGAT-GAAAGAAGATAAAA TM AAACGCTCTCGGGG-CAGGGTGGAATTCCCGACCGGCGGT 3 AGCCCGCGAG-- 5 4 5 ----–TTGACCCGGTGGAATTCCGGGGCCGACGGTGAA-AGTCCGGAT-GGGAGAGAGCGTGA DR GACCTCTTTCGGGG-CGGGGCGAAATTCCCCACCGGCGGT 15 AGCCCGCGAA-- 8 12 9 ----–CCGATGCCGCGCAACTCGGCAGCCGACGGTCAC-AGTCCGGAC-GAAAGAAGGAGGAG TQ CACCTCCTTCGGGG-CGGGGTGGAAGTCCCCACCGGCGGT 3 AGCCCGCGAA-- 5 4 5 -----CCGACCCGGTGGAATTCCGGGGCCGACGGTGAA-AGTCCGGAT-GGGAGAAGGAGGGC AO AATAATCTTCAGGG-CAGGGTGAAATTCCCGATCGGCGGT 2 AGTCCGCGA--- 7 7 7 -----AGGAACCGGTGAGATTCCGGTACCGACAGT-AT-AGTCTGGAT-GGAAGAAGATGAAA DU 
TTTAATCTTCAGGG-CAGGGTGAAATTCCCGATCGGTGGT 2 AGTCCGCGA--- 13 4 12 -----AGGAACTAGTGAAATTCTAGTACCGACAGT-AT-AGTCTGGAT-GGAAGAAGAGCAGA CAU GAAGACCTTCGGGG-CAAGGTGAAATTCCTGATCGGCGGT 20 AGCCCGCGA--- 3 4 3 -----AGGACCCGGTGTGATTCCGGGGCCGACGGT-AT-AGTCCGGAT-GGGAGAAGGTCGGC FN TAAAGTCTTCAGGG-CAGGGTGAAATTCCCGACCGGTGGT 2 AGTCCACG---- 5 4 5 -------GATTTGGTGAAATTCCAAAACCGACAGT-AG-AGTCTGGAT-GGGAGAAGAATTAG TFU ACGCGTGCTCCGGG-GTCGGTGAAAGTCCGAACCGGCGGT 3 AGTCCGCGAC-- 8 5 8 -----TGGAACCGGTGAAACTCCGGTACCGACGGTGAA-AGTCCGGAT-GGGAGGTAGTACGTG SX -AGCGCACTCCGGG-GTCGGTGAAAGTCCGAACCGGCGGT 3 AGTCCGCGAC-- 8 5 8 -----TTGACCAGGTGAAATTCCTGGACCGACGGTTAA-AGTCCGGAT-GGGAGGCAGTGCGCG BU GTGCGTCTTCAGGG-CGGGGTGAAATTCCCCACCGGCGGT 30 AGCCCGCGAGCG 137 GTCAGCAGATCTGGTGAGAAGCCAGAGCCGACGGTTAG-AGTCCGGAT-GGAAGAAGATGTGC BPS GTGCGTCTTCAGGG-CGGGGCGAAATTCCCCACCGGCGGT 21 AGCCCGCGAGCG 8 4 8 GTCAGCAGATCTGGTCCGATGCCAGAGCCGACGGTCAT-AGTCCGGAT-GAAAGAAGATGTGC REU TTACGTCTTCAGGG-CGGGGTGCAATTCCCCACCGGCGGT 31 AGCCCGCGAGCG 7 5 7 GTCAGCAGATCTGGTGAGAGGCCAGGGCCGACGGTTAA-AGTCCGGAT-GAAAGAAGATGGGC RSO GTACGTCTTCAGGG-CGGGGTGGAATTCCCCACCGGCGGT 21 AGCCCGCGAGCG 11 3 11 GTCAGCAGATCCGGTGAGATGCCGGGGCCGACGGTCAG-AGTCCGGAT-GGAAGAAGATGTGC EC GCTTATTCTCAGGG-CGGGGCGAAATTCCCCACCGGCGGT 17 AGCCCGCGAGCG 8 4 8 GACAGCAGATCCGGTGTAATTCCGGGGCCGACGGTTAG-AGTCCGGAT-GGGAGAGAGTAACG TY GCTTATTCTCAGGG-CGGGGCGAAATTCCCCACCGGCGGT 67 AGCCCGCGAGCG 8 3 8 GTCAGCAGATCCGGTGTAATTCCGGGGCCGACGGTTAA-AGTCCGGAT-GGGAGAGGGTAACG KP GCTTATTCTCAGGG-CGGGGCGAAATTCCCCACCGGCGGT 20 AGCCCGCGAGCG 8 4 8 GTCAGCAGATCCGGTGTAATTCCGGGGCCGACGGTTAA-AGTCCGGAT-GGGAGAGAGTAACG HI TCGCATTCTCAGGG-CAGGGTGAAATTCCCTACCGGTGGT 2 AGCCCACGAGCG 26 9 30 GTCAGCAGATTTGGTGAAATTCCAAAGCCGACAGT-AA-AGTCTGGAT-GAAAGAGAATAAAA VK GCGCATTCTCAGGG-CAGGGTGAAATTCCCTACCGGTGGT 14 AGCCCACGAGCG 11 9 11 GTCAGCAGATTTGGTGAGAATCCAAAGCCGACAGT-AT-AGTCTGGAT-GAAAGAGAATAAGC VC CAATATTCTCAGGG-CGGGGCGAAATTCCCCACCGGTGGT 13 AGCCCACGAGCG 5 4 5 GTCAGCAGATCTGGTGAGAAGCCAGGGCCGACGGTTAC-AGTCCGGAT-GAGAGAGAATGACA YP GCTTATTCTCAGGG-CGGGGTGAAAGTCCCCACCGGCGGT 
40 AGCCCGCGAGCG 16 6 16 GTCAGCAGACCCGGTGTAATTCCGGGGCCGACGGTTAT-AGTCCGGAT-GGGAGAGAGTAACG AB GCGCATTCTCAGGG-CAGGGTGAAAGTCCCTACCGGTGGT 25 AGCCCACGAGCG 16 4 27 GTCAGCAGATTTGGTGCGAATCCAAAGCCGACAGTGAC-AGTCTGGAT-GAAAGAGAATAAAA BP GTACGTCTTCAGGG-CGGGGTGCAATTCCCCACCGGCGGT 18 AGCCCGCGAGCG 10 4 10 GTCAGCAGACCTGGTGAGATGCCAGGGCCGACGGTCAT-AGTCCGGAT-GAGAGAAGATGTGC AC ACATCGCTTCAGGG-CGGGGCGTAATTCCCCACCGGCGGT 16 AGCCCGCGAGCA 10 3 11 ---CGCAGATCTGGTGTAAATCCAGAGCCGACGGT-AT-AGTCCGGAT-GAAAGAAGACGACG Spu AACAATTCTCAGGG-CGGGGTGAAACTCCCCACCGGCGGT 34 AGCCCGCGAGCG 6 6 6 GTCAGCAGATCTGGTG 52 TCCAGAGCCGACGGT 31 AGTCCGGAT-GGAAGAGAATGTAA PP GTCGGTCTTCAGGG-CGGGGTGTAAGTCCCCACCGGCGGT 13 AGCCCGCGAGCG 7 3 7 GTCAGCAGATCTGGTGCAACTCCAGAGCCGACGGTCAT-AGTCCGGAT-GAAAGAAGGCGTCA AU GGTTGTTCTCAGGG-CGGGGTGCAATTCCCCACCGGCGGT 17 AGCCCGCGAGCG 7 9 7 GTCAGCAGATCCGGTGAGAGGCCGGAGCCGACGGT-AT-AGTCCGGAT-GGAAGAGGACAAGG PU AAACGTTCTCAGGG-CGGGGTGCAATTCCCCACCGGCGGT 19 AGCCCGCGAGCG 19 4 18 GTCAGCAGACCCGGTGTGATTCCGGGGCCGACGGTCAC-AGTCCGGATGAAGAGAGAACGGGA PY TAACGTTCTCAGGG-CGGGGTGCAACTCCCCACCGGCGGT 19 AGCCCGCGAGCG 15 4 16 GTCAGCAGACCCGGTGTGATTCCGGGGCCGACGGTCAT-AGTCCGGATGAAGAGAGAGCGGGA PA TAACGTTCTCAGGG-CGGGGTGAAAGTCCCCACCGGCGGT 19 AGCCCGCGAGCG 14 4 13 GTCAGCAGACCCGGTGCGATTCCGGGGCCGACGGTCAT-AGTCCGGATAAAGAGAGAACGGGA MLO TAAAGTTCTCAGGG-CGGGGTGAAAGTCCCCACCGGCGGT 16 AGCCCGCGAGCG 8 5 8 GTCAGCAGATCCGGTGTGATTCCGGAGCCGACGGTTAG-AGTCCGGAT-GAAAGAGGACGAAA SM AAGCGTTCTCAGGG-CGGGGTGAAATTCCCCACCGGCGGT 34 AGCCCGCGAGCG 8 3 8 GTCAGCAGATCCGGTCGAATTCCGGAGCCGACGGTTAT-AGTCCGGAT-GGAAGAGAGCAAGC BME GCTTGTTCTCGGGG-CGGGGTGAAACTCCCCACCGGCGGT 17 AGCCCGCGAGCG 10 15 10 GTCAGCAGATCCGGTGAGATGCCGGAGCCGACGGTTAA-AGTCCGGAT-GGAAGAGAGCGAAT BS ATCAATCTTCGGGG-CAGGGTGAAATTCCCTACCGGCGGT 18 AGCCCGCGA--- 5 4 5 -----AGGATTCGGTGAGATTCCGGAGCCGACAGT-AC-AGTCTGGAT-GGGAGAAGATGGAG BQ GTCTATCTTCGGGG-CAGGGTGAAAATCCCGACCGGCGGT 27 AGCCCGCGA—-- 3 5 3 -----AGGATTTGGTGTGATTCCAAAGCCGACAGT-AT-AGTCTGGAT-GGGAGAAGATGGAG BE ATTCATCTTCGGGG-CAGGGTGAAATTCCCGACCGGCGGT 20 AGCCCGCGA--- 3 4 3 
-----AGGATCCGGTGCGAGTCCGGAGCCGACAGT-AT-AGTCTGGAT-GGGAGAAGATGAAG CA AATGATCTTCAGGG-CAGGGTGAAATTCCCTACCGGCGGT 2 AGCCCGCGAG-- 3 4 3 ----TATGATCCGGTTTGATTCCGGAGCCGACAGT-AA-AGTCTGGAT-GAAAGAAGATATAT DF GAAGATCTTCGGGG-CAGGGTGAAATTCCCTACCGGCGGT 2 AGCCCGCG---- 6 4 6 -------GATTTGGTGAGATTCCAAAGCCGACAGT-AA-AGTCTGGAT-GAGAGAAGATATTT EF GTTCGTCTTCAGGGGCAGGGTGTAATTCCCGACCGGTGGT 3 AGTCCACGAC-- 5 3 5 ----ATTGAATTGGTGTAATTCCAATACCGACAGT-AT-AGTCTGGAT—-AAAGAAGATAGGG LLX AAATATCTTCAGGG-CACCGTGTAATTCGGGACCGGCGGT 21 ACTCCGCGAT-- 4 4 4 ----–TTGAAGCAGTGAGAATCTGCTAGCGACAGT-AA-AGTCTGGAT-GGAAGAAGATGAAC LO GTTCATCTTCGGGG-CAGGGTGCAATTCCCGACCGGTGGT 3 AGTCCACGAT-- 3 10 3 ----TTGACTCTGGTGTAATTCCAGGACCGACAGT-AT-AGTCTGGAT-GGGAGAAGATGTTG PN AAGAGTCTTCAGGG-CAGGGTGAAATTCCCGACCGGCGGT 125 AGTCCGTG---- 3 4 3 -------GATGTGGTGAGATTCCACAACCGACAGT-AT-AGTCTGGAT-GGGAGAAGACGAAA ST AAGTGTCTTCAGGG-CAGGGTGTGATTCCCGACCGGCGGT 14 AGTCCGCG---- 3 4 3 -------GATGTGGTGTAACTCCACAACCGACAGT-AT-AGTCTGGAT-GAGAGAAGACCGGG MN AAGTGTCTTCAGGG-CAGGGTGAGATTCCCGACCGGCGGT 104 AGTCCGCG---- 3 4 3 -------GATGTGGTGAAATTCCACAACCGACAGT-AA-AGTCTGGAT-GGGAGAAGACTGAG SA ATTCATCTTCGGGG-TCGGGTGTAATTCCCAACCGGCAGT 6 AGCCTGCGAC-- 11 3 11 ----–CTGATCTAGTGAGATTCTAGAGCCGACAGT-AT-AGTCTGGAT-GGGAGAAGATGGAG AMI TCACAGTTTCAGGG-CGGGGTGCAATTCCCCACTGGCGGT 14 AGCCCGCGC--- 5 5 5 ------TGATCTGGTGCAAATCCAGAGCCAACGGT-AT-AGTCCGGAT-GGAAGAAACGGAGC DHA ACGAACCTTCGAGG-TAGGGTGAAATTCCCGACCGGCGGT 20 AGCCCGCAAC-- 11 4 11 --CGACTGACTTGGTGAGACTCCAAGGCCGACGGT-AT-AGTCCGGAT-GGGAGAAGGTACAA FN AATAATCTTCGGGG-CAGGGTGAAATTCCCGACCGGTGGT 2 AGTCCACG---- 4 6 4 -------GATTTGGTGAAATTCCAAAACCGACAGT-AG-AGTCTGGAT-GAGAGAAGAAAAGA GLU ---TGTTCTCAGGG-CGGGGCGAAATTCCCCACCGGCGGT 28 AGCCCGCGAGCG 10 4 10 GTCAGCAGATCCGGTTAAATTCCGGAGCCGACGGTCAT-AGTCCGGAT-GCAAGAGAACC---
Conserved secondary structure of the RFN-element
[Figure: consensus secondary structure of the RFN element, drawn from the 5' to the 3' end with base-paired stems 1 to 5, a variable stem-loop and an additional stem-loop.]
Capitals: invariant (absolutely conserved) positions.
Lower case letters: strongly conserved positions.
Dashes and stars: obligatory and facultative base pairs.
Degenerate positions: R = A or G; Y = C or U; K = G or U; B = not A; V = not U; N = any nucleotide; X = any nucleotide or deletion.
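The degenerate codes in this legend map directly onto regular-expression character classes, which gives a quick way to scan a sequence for a consensus. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the consensus `GGRY` in the usage note is made up for illustration, upper and lower case are treated alike (the invariant versus strongly conserved distinction is ignored), and X is modelled as an optional position.

```python
import re

# Degenerate codes exactly as defined in the legend (RNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "[AG]",      # A or G
    "Y": "[CU]",      # C or U
    "K": "[GU]",      # G or U
    "B": "[CGU]",     # not A
    "V": "[ACG]",     # not U
    "N": "[ACGU]",    # any nucleotide
    "X": "[ACGU]?",   # any nucleotide or deletion
}

def consensus_to_regex(consensus: str) -> re.Pattern:
    """Compile a degenerate consensus into a regex (case is ignored)."""
    return re.compile("".join(IUPAC[ch] for ch in consensus.upper()))

def find_matches(consensus: str, sequence: str) -> list[int]:
    """Return start positions of non-overlapping matches; DNA input is read as RNA."""
    rna = sequence.upper().replace("T", "U")
    return [m.start() for m in consensus_to_regex(consensus).finditer(rna)]
```

For example, `find_matches("GGRY", "TTGGACGGGT")` reports the two places where GG is followed by a purine and then a pyrimidine. A real search for the element would also have to check the base pairing of the stems, which a plain regex cannot express.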
Attenuation of transcription
[Figure labels: the RFN element; terminator; antiterminator]
Bam GACAAAAAAATATTGATTGTATCCTTCGGGGCTGGGTG --- TCTGGATGGGAGAAGGATGA 59 ----------GTAAAGCCCCGAATGTGTAA---ACATTCGGGGCTTTTTGACGCCAAAT BS GGACAAATGAATAAAGATTGTATCTTCGGGGCAGGGTG --- TCTGGATGGGAGAAGGATGA 59 ----------CTAAAGCCCCGAATTTTTTA--TAAATTCGGGGCTTTTTTGACGGTAAA BQ CTATAATTTGAGCAAACAGCATCCTTCGGGGTCGGGTG --- TCTGGATGGGAGAAGGATAT 250 -----------CCAAACCCCAAGGATATTAAA--ATCCTTGGGGTTTTTTGTTTTTTTT BE ACATAACGATATAGTGATGCATCCTTCGGGGCAGGGTG --- TCTGGATGGGAGAAGGATGC 155 ------------TGAGCCCCCGGGGACAT--------CCCGGGGGTTTCATTTTTATTG HD AAATTGAATAATTAATTTTTATCCTTCGGGGCTGGGTG --- TCTGGATGGGAGAAGGAAAC 148 -------------ATGCCCCGTGAGAACAAAA-----TCTCTGGGGCTTTTTTGCGCGC CA TAATGGTAATTTAATAGGATGTTCTTCAGGGATGGGTG --- TCTGGATGAAAGAAGAAATA 34 -------------AATCTCCGAAGGATTACC----TTTCTTTGGAGATTTTTTTATTTG DF TAAATATAAATTTAATACTTAATCTTCGGGGTAGGGTG --- TCTGGATGGAAGAAGATATT 63 ------------TAAACCCTGAGTTAATT--------CTCAGGGTTTTTTGTTTAAAAA LLX ACTTTAGCTACAATTGAATAAATCTTCAGGGCAGGGTG --- TCTGGATGAAAGAAGATAAT 127 ----------AAAAGACCCTGAAATTTT------ATTTTAGGGTCTTATTTTTTATTAG PN* ATCATCTGTAATTGAATAACTATCTTCAGGGCAGGGTG --- TCTGGATGAAAGAAGATAAA 81 ----------TGTATGCCTTGAGTAGTCCCC---TATTCAAGGTATATTTTTTTGGAGG PN* ATCATCTGTAATTGAATAACTATCTTCAGGGCAGGGTG --- TCTGGATGAAAGAAGATAAA 19 ------------CGTGCTCTGAAATGATTACTTGTCATTTCAGAGCATTTTTGTTAATC TM AAAACTGAATACAAAAGAAACGCTCTCGGGGCAGGGTG --- TCCGGATGGGAGAGAGCGTG 13 -----------ATGGGACCCGAGA----------------GGGTCCCTTTTCTTTTACA AO ATTTGCAACAATTTTTTAATAATCTTCAGGGCAGGGTG --- TCTGGATGGAAGAAGATGAA 33 --------TTTACAAGCCTTGAGATCGAAAG----ATTTCAAGGCTTTTTTCATCATTA DU AATTTTTTTAATACTATTTTAATCTTCAGGGCAGGGTG --- TCTGGATGGAAGAAGAAGAG 47 --------TGCATAAGCCTTGAGATCTTAG----GATTTCAAGGCTTTTTCATTAGTTA FN TAATCGAATATGTAAAATAAAGTCTTCAGGGCAGGGTG --- TCTGGATGGGAGAAGAATTA 18 ----------ATATTGCTCAGACTTT------------GTTTGAGCATTTTTTTATTAA SA TATAACAATTTCATATATAATTCTTTCGGGGCAGGGTG --- TCTGGATGGGAGAAAGAATG 74 ------TTTTCTCCTTGCATCTTAATT----------GATGTGAGGATTTTTGTTTATA DHA 
ACTCTTTTTAGATGAATACGAACCTTCGAGGTAGGGTG --- TCCGGATGGGAGAAGGTACA 43 -----------GTTTATGCCTCGAGGAACACCATTTCCTCGAGGCATTTTTGTTCTTTC FN GAAAAATAAATATTAAAAATAATCTTCGGGGCAGGGTG --- TCTGGATGAGAGAAGAAAAG 40 ------------CTTACCCGAATTCTAT------------AATTCGGTTTTTTTATTTT CA AATATAAAAAAATAAAGAATGATCTTCAGGGCAGGGTG --- TCTGGATGAAAGAAGATATA 19 ----------–-TATGCCCTGACGTTTTT---------CGTTGGGGCTTTTTTAATGCT DF AAAATTAAAAAATCAAAGAAGATCTTCGGGGCAGGGTG --- TCTGGATGAGAGAAGATATT 45 ----------ATAAAAACTCGAAGATAGGG----TCTTCGAGTTTTTTGTTTTTCCTAA BS TAATTAAATTTCATATGATCAATCTTCGGGGCAGGGTG --- TCTGGATGGGAGAAGATGGA 103 --AAAGAACCTTTCCGTTTTCGAGTAAGATGTGATCGAAAAGGAGAGAATGAAGTGAAA BQ GGGAAAATAGAATATCGGTCTATCTTCGGGGCAGGGTG --- TCTGGATGGGAGAAGATGGA 54 -------ATTCTCCCTTTGTGTAAA------------ACACAAAGGGTTTTTTCGTTCTATG BE ATAAAAATGTATAAGCGATTCATCTTCGGGGCAGGGTG --- TCTGGATGGGAGAAGATGAA 114 --------GGCAGCCTTCTTCTTGTGAGGATGAATCACGAGAAGGGGAGGAGAACAAGCATG PN GTTTTTTGTTATGATAAAAGAGTCTTCAGGGCAGGGTG --- TCTGGATGGGAGAAGACGAA 137 -–AACTTCTTCTGATTTTATAG------------AAAATTGGAGGAACCTGTTATGACA ST TAAATCTGCTATGCTAGAAGTGTCTTCAGGGCAGGGTG --- TCTGGATGAGAGAAGACCGG 130 ---GGAACTTCTTTCAATTTGAAA-----------AAATTGGAGGAATTTTTTAATGTC MN ATTTTTTGATATGCTATAAGTGTCTTCAGGGCAGGGTG --- TCTGGATGGGAGAAGACTGA 138 ---–GGCCTTCTTTCGATTTGTAA-----------AAATTGGAGGAATTTTTTTATGAA SA AAATTTAATAATGTAAAATTCATCTTCGGGGTCGGGTG --- TCTGGATGGGAGAAGATGGA 17 --------TCCTCCTATTCTTACG--------AGATGAATGGAAGGAGAAAATTGAATATG EF AAAAAATATAATACAAGGTTCGTCTTCAGGGGCAGGGT --- GTCTGGATAAAGAAGATAGG 33 ---CTACTCTATTTTTCCCTGCAGA------------AAAATAGGGTTTTTTTGTATGA LLX TTTTTGTGCTATAATAAAAATATCTTCAGGGCACCGTG --- TCTGGATGGAAGAAGATGAA 66 -–TCAACTTCCTCGAAATTTGAAGAAT-TATTTTCTCATATTTGGAGGTTTTTTTATGT LO ATTGTAAGAAAATATTCGTTCATCTTCGGGGCAGGGTG --- TCTGGATGGGAGAAGATGTTG 79 ---ATGCACAAACTCTCCCTCAACTTTTTTTA--------GTTGAGGTTTTTTATTTGC
Attenuation of translation
EC AATCCGCTTATTCTCAGGGCGGGGCG --- TCCGGATGGGAGAGAGTAACG 59 ----------CTGCCCTGATTCTGGTAACCATAATTTTAGTGAGGTTTTT-------TACCATGAATCAGACGCTA TY AACCCGCTTATTCTCAGGGCGGGGCG --- TCCGGATGGGAGAGGGTAACG 61 ----------CTGCCCTGATTCTGGTAACCATAATGTTAATGAGGTTTTTT------TACCATGAATCAGACGCTA KP ATCTCGCTTATTCTCAGGGCGGGGCG --- TCCGGATGGGAGAGAGTAACG 61 ----------CTGCCCTGATTCTGGTAACCATAATTTTAATGAGGTTTTTT------TACCATGAATCAGACGCTC HI TTAGCTCGCATTCTCAGGGCAGGGTG --- TCTGGATGAAAGAGAATAAAA 41 ----------CAGCCCTGATTCTGGTATTTAATTGAAATCTCAAAT-TAGGAAAT--TACTATGAATCAGTCAATT VK TATTTGCGCATTCTCAGGGCAGGGTG --- TCTGGATGAAAGAGAATAAGC 76 ----------CAGCCCTGATTCTGGTATCTAAATATCTTTATATTTCAAGGAATT--TACTATGAATCAGTCTATT AB TAGGCGCGCATTCTCAGGGCAGGGTG --- TCTGGATGAAAGAGAATAAAA 54 ----------CCGCCCTGATTCTGGTATAAATTCATCTTATTAAA—AAGGCATT---TACTATGAATCAGTCATTA YP ATGGGGCTTATTCTCAGGGCGGGGTG --- TCCGGATGGGAGAGAGTAACG 194 ----------CCGCCCTGATTCTGGTAATCCATAATTTTTTAATGAGGTTTCT---TTACCATGAATCAGACGCTT VC CACAACAATATTCTCAGGGCGGGGCG --- TCCGGATGAGAGAGAATGACA 83 ----------AAGCCCTGATTCTGGTCATTTTTT--------------GGAGTATT--ACCATGAATCAGTCCTCA Spu CTATCAACAATTCTCAGGGCGGGGTG --- TCCGGATGGAAGAGAATGTAA 145 ----------ACGCCCTGATTCTGGATATTCCCATGTCGTATTTTTGAAGGATATTAA-CCATGAATCAGTCTTTA MLO GACGTTAAAGTTCTCAGGGCGGGGTG --- TCCGGATGAAAGAGGACGAAA 44 -------CGTGCGTCCTGATTCTGGTTCGAAACGGA--------------AGGATGGACCCATGAATCAGCATTCC AC AAGCGACATCGCTTCAGGGCGGGGCG --- TCCGGATGAAAGAAGACGACG 51 ----------CAGTCCTGAAATGTTTAACCGTAATT-------------------TACGAGAGCATTTCATATGTC BP AAGCAGTACGTCTTCAGGGCGGGGTG --- TCCGGATGAGAGAAGATGTGC 62 ----------TAGCCCTGAAACGTTTTTCGCCATTTCCTTTTTT------------GCGAGAGCGTTTCAATGTCC BPS AGTCAGTGCGTCTTCAGGGCGGGGCG --- TCCGGATGAAAGAAGATGTGC 86 ----------GAGCCCTGAAACGTTTTTCGCCCATTCATGTTTC-----------GCGAGGAGCGTTTCACATCATG BU AATCAGTGCGTCTTCAGGGCGGGGTG --- GCCGGATGGAAGAAGATGTGC 99 ----------ATGCCCTGAAACGTTTTTCGCCCAACTTTT--------------GCGATGAGCGTTTCAACTATGT REU CATCGTTACGTCTTCAGGGCGGGGTG --- TCCGGATGAAAGAAGATGGGC 77 
----------ATCCCCTGAAACGCCCATCCATGGAAATCCACGCAC-------------GGAGCGTTTCAATGCTG RSO GCTTGGTACGTCTTCAGGGCGGGGTG --- TCCGGATGGAAGAAGATGTGC 80 ---------CGTGCCCTGGAACGTCTTGTCGCCCATTTCA---------------GCGAGGAGCGTTTCCATGTTG PP GGTCGGTCGGTCTTCAGGGCGGGGTG --- TCCGGATGAAAGAAGGCGTCA 50 ----------TCGCCCCGAGACGTTCATCGATCATTCA------------------CGAGGAGCGTTTCATGTTCA PY GCCGGTAACGTTCTCAGGGCGGGGTG --- CCGGATGAAGAGAGAGCGGGA 91 ----------ATGCCCTGTTTTTTCATTAAATT---------------------AAACAGGAGTCAGAACACGTGC PU CGGCGAAACGTTCTCAGGGCGGGGTG --- CCGGATGAAGAGAGAACGGGA 68 ----------ACGCCCTGTTTTTCACAC--------------------------AAACAGGAGTCAGAACATGCAA PA GGCCGTAACGTTCTCAGGGCGGGGTG --- CCGGATAAAGAGAGAACGGG 53 ---------AAAGCCCTGTTTTTCAC---------------------------GAAACAGGAGTTCGTCATATG-- BME CGCGGGCTTGTTCTCGGGGCGGGGTG --- TCCGGATGGAAGAGAGCGAAT 54 ----------GCGCCCTGATTCTAGTTTCGTG--------------------------AGGAACCTATGAACCAAA CAU AATCCGAAGACCTTCGGGGCAAGGTG --- TCCGGATGGGAGAAGGTCGGC 116 ------CGCGATGCCCCGAAGGTGTG-----------------------------TTCAGGGGTGTCGCGATGAAC TFU GTACACACGCGTGCTCCGGGGTCGGT --- GGATGGGAGGTAGTACGTGGT 58 -------GCCTTACCCCGGAGCCTGACCT-------------------------GGCTAGGGGGAAGGCTTCTCGCATG GLU TGAGTTTTGTTCTCAGGGCGGGGCG --- TCCGGATGCAAGAGAACCG 32 ---------AAGGCCCCGAGGATTACATGCTTTTAAATCCTTTGAAAAGGGGACAAGATCATGAATCCTATAACCG DR GAACCGACCTCTTTCGGGGCGGGGCG --- TCCGGACGAAAGAAGGAGGAG 1 GACGCTCAGCTTGCCCCCCA------------------------------------GCAGGCGGCGTCCGCGTATG SM GTCGCAAGCGTTCTCAGGGCGGGGTG --- TCCGGATGGAAGAGAGCAAGC 45 ATCATTGGAAAAATGCCAACCCTGAAA-------------------GGCTTGAGACCATGACCATACTT TQ TTCGGCACCTCCTTCGGGGCGGGGTG --- TCCGGATGGGAGAAGGAGGGCCACTTGCGC AMI CTTACTCACAGTTTCAGGGCGGGGTG --- TCCGGATGGAAGAAACGGAGCGCCTTATGG
[Figure labels: the RFN element; SD-sequestor; antisequestor]
RFN: the mechanism of regulation
• Transcription attenuation
• Translation attenuation
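The terminators in the transcription-attenuation alignment are hairpins followed by a run of T's, so candidates can be found by a brute-force search for a stem whose 3' arm is the reverse complement of its 5' arm. This is only a sketch, assuming perfect Watson-Crick stems (no G-U pairs, bulges or energy model); the function names and the thresholds (`min_stem`, `loops`, `t_run`, `slack`, `window`) are hypothetical choices, not parameters from the slides.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna: str) -> str:
    return dna.translate(COMPLEMENT)[::-1]

def find_terminators(seq, min_stem=6, loops=range(3, 11),
                     t_run=4, slack=3, window=8):
    """Return (start, stem_length, loop_length) for perfect hairpins
    whose 3' arm ends in, or is closely followed by, a run of T's."""
    dna = seq.upper().replace("-", "").replace("U", "T")  # drop alignment gaps
    hits = []
    for start in range(len(dna)):
        for loop in loops:
            max_stem = (len(dna) - start - loop) // 2
            for stem in range(min_stem, max_stem + 1):
                end = start + 2 * stem + loop
                left = dna[start:start + stem]
                right = dna[end - stem:end]
                if right != reverse_complement(left):
                    continue
                # The T-run may overlap the last few bases of the 3' arm.
                tail = dna[end - slack:end + window]
                if "T" * t_run in tail:
                    hits.append((start, stem, loop))
    return hits

def best_terminator(seq):
    """Pick the hit with the longest stem, or None if there is none."""
    hits = find_terminators(seq)
    return max(hits, key=lambda h: h[1]) if hits else None
```

Applied to the Bam terminator region from the alignment, `GTAAAGCCCCGAATGTGTAA---ACATTCGGGGCTTTTTGACGCCAAAT`, the search recovers the stem AAAGCCCCGAATGT / ACATTCGGGGCTTT around the GTAA loop. The antiterminator could be sought the same way, as an alternative pairing between the RFN element and the 5' arm of this stem.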
Distribution of RFN-elements
Taxonomic group            Genomes analyzed   Genomes with RFN   RFN elements
α-proteobacteria                   8                  4                4
β-proteobacteria                   7                  4                4
γ-proteobacteria                  17                 15               15
δ- and ε-proteobacteria            3                  0                0
Bacillus/Clostridium              12                 12               19
Actinomycetes                      9                  4                4
Cyanobacteria                      5                  0                0
Other eubacteria                   7                  5                6
Total                             68                 47               52
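A short script can cross-check the distribution table and express it as elements per RFN-containing genome. The numbers below are copied from the table as transcribed; the function names are hypothetical. As transcribed, the "genomes with RFN" column adds up to 44 while the Total row lists 47, and the transcript alone does not show whether a row or the total is off, so the sketch reports the sums rather than correcting anything.

```python
# (group, genomes analyzed, genomes with RFN, RFN elements), as transcribed
RFN_TABLE = [
    ("α-proteobacteria", 8, 4, 4),
    ("β-proteobacteria", 7, 4, 4),
    ("γ-proteobacteria", 17, 15, 15),
    ("δ- and ε-proteobacteria", 3, 0, 0),
    ("Bacillus/Clostridium", 12, 12, 19),
    ("Actinomycetes", 9, 4, 4),
    ("Cyanobacteria", 5, 0, 0),
    ("Other eubacteria", 7, 5, 6),
]
LISTED_TOTAL = (68, 47, 52)

def column_sums(rows):
    """Sum the three numeric columns."""
    return tuple(sum(row[i] for row in rows) for i in (1, 2, 3))

def elements_per_positive_genome(rows):
    """Average number of RFN elements in genomes that have at least one."""
    return {name: elements / with_rfn
            for name, _, with_rfn, elements in rows if with_rfn}

if __name__ == "__main__":
    print("column sums :", column_sums(RFN_TABLE))
    print("listed total:", LISTED_TOTAL)
    for name, ratio in elements_per_positive_genome(RFN_TABLE).items():
        print(f"{name}: {ratio:.2f} elements per genome with RFN")
```

The ratio is 1 in most groups and rises above 1 only for the Bacillus/Clostridium group (19 elements in 12 genomes) and "Other eubacteria" (6 in 5), that is, where some genomes carry more than one RFN element.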
Phylogenetic tree of RFN-elements
YpaA: riboflavin transporter in Gram-positive bacteria
• 5 predicted transmembrane segments => a transporter
• Upstream RFN element (likely co-regulation with riboflavin genes) => transport of riboflavin or a precursor
• S. pyogenes, E. faecalis, Listeria sp.: ypaA, no riboflavin pathway => transport of riboflavin
Prediction: YpaA is a riboflavin transporter (Gelfand et al., 1999)
Verification:
• YpaA transports flavins (riboflavin, FMN, FAD) (by genetic analysis, Kreneva et al., 2000)
• ypaA is regulated by riboflavin (by microarray expression study, Lee et al., 2001)
• … via attenuation of transcription (and to some extent inhibition of translation) (Winkler et al., 2003)
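The three-step inference behind the YpaA prediction can be written down as a small rule. This is a sketch of the reasoning only: the gene sets are toy data (only the absence of the riboflavin pathway in S. pyogenes and E. faecalis comes from the slide), the pathway is reduced to five gene names, and `min_tms` is a hypothetical cut-off that the slides do not specify.

```python
# Simplified stand-in for the riboflavin biosynthesis genes (assumption).
RIB_PATHWAY = {"ribA", "ribB", "ribD", "ribE", "ribH"}

# Toy gene content; illustrative, not real annotation.
GENOMES = {
    "B. subtilis": RIB_PATHWAY | {"ypaA"},
    "S. pyogenes": {"ypaA"},
    "E. faecalis": {"ypaA"},
}

def has_pathway(genes):
    return RIB_PATHWAY <= genes

def salvage_only_genomes(gene, genomes):
    """Genomes that carry the gene but lack the biosynthetic pathway."""
    return sorted(name for name, genes in genomes.items()
                  if gene in genes and not has_pathway(genes))

def predict_substrate(gene, genomes, rfn_upstream, tms, min_tms=3):
    # Step 1: several transmembrane segments => membrane transporter.
    # Step 2: RFN element upstream => substrate is riboflavin or a precursor.
    if tms < min_tms or not rfn_upstream:
        return "no prediction"
    # Step 3: present where the pathway is absent => a precursor would be
    # useless there, so the substrate must be riboflavin itself.
    if salvage_only_genomes(gene, genomes):
        return "riboflavin"
    return "riboflavin or a precursor"
```

With the toy data, `predict_substrate("ypaA", GENOMES, rfn_upstream=True, tms=5)` returns `"riboflavin"`; restricted to a genome that has the full pathway, the same call can only narrow the substrate to "riboflavin or a precursor".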
More predicted (riboflavin) transporters
impX from Fusobacterium and Desulfitobacterium
– no similarity with any known protein; no homologs in other complete genomes
– 9 predicted TMS
– single RFN-regulated gene
pnuX from Actinomycetes (Corynebacterium, Streptomyces, Thermomonospora)
– no orthologs in other genomes
– 6 predicted TMS
– either a single gene or a part of the riboflavin operon
– regulated by RFN
– similar to the nicotinamide mononucleotide transporter PnuC from E. coli
thi-box and regulation of thiamine metabolism genes by thiamin pyrophosphate (Miranda-Rios et al., 2001)
TTCGGGATCCGCGGAACCTGA-TCAGGCTAA-TACCTGCG-AAGGGAACAAGAGTTA THIC_EC
TTCGGGATCCGTTGAACCTGA-TCAGGTTAA-TACCTGCG-AAGGGAACAAGAGAAG THIC_VC
GCAGTGACCCGTTGAACCTGA-TCCAGTTCA-TACTGGCG-TAGGGACGGTGCAAGC THIC_MLO
GCAGTGACCCGTTGAACCTGA-TCCAGTTCA-CACTGGCG-TAGGGACGGTGCAGAC THIC_SM
AGAAATACCCTTTACACCCGA-TCGGGATAA-TACCTGCG-TGGGGAGTTTTCACGG THIC_NM
TTCTTAACCCTTTGGACCTGA-TCTGGTTCG-TACCAGCG-TGGGGAAGTAGAGGAA thiC_BS
CCGTCGACCGTACGAACCTGA--CCGGGTAA-TGCCGGCG-TAGGGAGTTGCAAATG THIC_MT
GGATCGACCCTTTGAACCTGA-TCCGGGTAA-TGCCGGCG-GAGGGAAATTATGTCG THIT2_TVO
TCCTCGACCCCAAGAACCTGA-TCCGGGTAA-TGCCGGCG-GAGGGATCGGGGAAGG thi1_TM
Notation: red, conserved nucleotides; green, positions conserved as purine or pyrimidine; blue, non-conserved nucleotides.
Alignment of THI-elements 1 2 3 3' FACULTATIVE STEM-LOOP 2' 4 5 5' 4' 1' ----====>===> -=====> <===== ========> <======= <=== ===> =====> <===== <=== <====---- BACILLUS/CLOSTRIDIUM GROUP BS_THIC TAGTTACTGGGGGTGCCCGCT----------------TTCcgGGCTGAGAGAGAAGGCA-------------AGCTTCTTAACCCTTT---GGACCTGA-TCTGGTTCG-TACCAGCG-TGGGGA-AGTAGAGGA BS_TENA TAACCACTAGGGGTGTCCTTC----------------ATAAGGGCTGAGATAAAAGTGT-------------GACTTTTAGACCCTCA---TAACTTGA-ACAGGTTCA-GACCTGCG-TAGGGA-AGTGGAGCG BS_YLMB TTCATCCTAGGGGTGCTTTG-------------------CGAAGCTGAGAGAGACTT-----------------TGTCTCAACCCTTT---TGACCTGA-TCTGGATCA-TGCCAGCG-GAGGGA-AGCGGTGAA BS_YKOF AAAGCACTAGGGGTGCTGT--------------------TTTGGCTGAGATAAAGCGCGGAA-----GAAACGCGCTTTGATCCCTTA---TGACCCGA-TCTGGATAA-TACCAGCG-TGGGGA-AGTGCAGGT SA_TENA GAACTACTAGGGGAGCCTAAT----------------GATATGGCTGAGATGAATT-------------------GTTCAGACCCTTA---TGACCTGA-TTTGGTTAG-TACCAACG-TAGGAA-AGTAGTTAT SA_YKOE CACACACTAGGGGTGTTT----------------------TATACTGAGATGAGGCTT---------------GCCCTCAAACCCTTT---GAACCTGA-TCTAGCTTG-AACTAGCG-TAGGAA-AGTGTTACT LLX_YUAJ TTTGCACAATGGGTCTATTGACAAA---------ACTGTCAGTAGCGAGA----------------------------AATACCATC----TGACCTGA-TCTGGGTAA-TGCCAGCG-TAGGAA-TGTGTTAAG CA_THIS ATAGTTAACGGGGAGCCTGTA-----------------GACAGGCTGAGAGTGGAATG--------------TGATTCCAGACCCTCA---TAACCTGA-TTTGGATAA-TGCCAACG-TAGGGA-GTTAATGCA CA_YUAJ TATGTGCTAGGGGTGCCTT---------------------TAGGCTGAGAAACAGTTT--------------GTCACGTTAACCCTT-----AACCTGA-TCTGGATAA-TACCAGCG-TAGGGA-AGCAGTTTG ST_YUAJ TTTCACAAAGGAGTGCTT-----------------------TGGCTGAGATCGCAA------------------TTGCGAAATCCTGA---GGACCTGA-TCTTGTTAG-TACAAGCG-TAGGGA-TTGTGACCA DHA_THIC TAATCACTAGGGGGGCCGAATA---------------AGGTCGGCTGAGATAAAGGACCCA---------AGAATCCTTTGACCCTT-----AACCTGA-TCTGGGTAA-TGCCAGCG-TAGGGAAGGTGGATAA LMO_TENA GAAAAACTAGGGGGGCCGAT-------------------TCTGGCTGAGATAGGAAGGTAAT-----------GCTTTCTGACCCTTT---GAACCTGT-TT--GTTAG-TGCAAGCG-TAGGGA-AGTGAATGT LMO_YUAJ 
TTACCACAGGGGGGGCTTC---------------------TTAGCTGAGATTGAGTCCACGTGT-----TTTTGGATTCTGACCCTTT---GAACCTGT-TC--GTTAA-TACGAGCG-TAGGGA-TTGTGGCGA PROTEOBACTERIA EC_THIB GTTCTCAACGGGGTGCCACGCGT------------ACGCGTGCGCTGAGAAA---------------------------ATACCCGTCGA---ACCTGA-TCCGGATAA-CGCCGGCG-AAGGGATTTGAGGC EC_THIM AAACGACTCGGGGTGCCCTTCTGC-------------GTGAAGGCTGAGAAA----------------------------TACCCGTATC---ACCTGA-TCTGGATAA-TGCCAGCG-TAGGGA-AGTCACG EC_THIC TTTCTTGTCGGAGTGCCTTA-------------------ACTGGCTGAGACCGTTT------------------ATTCGGGATCCGCGGA---ACCTGA-TCAGGCTAA-TACCTGCG-AAGGGA-ACAAGAG VC_THIC CCACTTGTCGGAGTGCCAT---------------------TGGGCTGAGACCGTTT------------------ATTCGGGATCCGTTGA---ACCTGA-TCAGGTTAA-TACCTGCG-AAGGGA-ACAAGAG VC_THID CCTGTAGTCGGGGAGCCTGAGAG-- 66 5 71 -AATTAAAGGCTGAGATCGCGT-------------------AGCGAGACCCGTTGA---ACCTGA-TTCAGTTAG-GACTGACG-TAGGGA-ACTATCC VC_THIB CCCACTCACGGGGGGCCACCCATTCAT-------CCGAATGGCGCTGAGATCAAGCAC---------------TGCTTGGGACCCGCA 21 -ACCTGA-ACCAGATAA-TGCTGGCG-TAGGAATTGAGCTA XFA_THIC TTTGAAGCGGGGGTACCATAGCCA------------AGCTGCGGTTGAGAC----------------------------ACACCCTTCGA---ACCTGA-TCCGGTTTA-CACCGGCG-TAGGAAAGCTTCGT MLO_THIC CATTCACCAGGGGAGTCCCGG----------------CAAGGGGCTGAGATACTGCTGGCTTTC------GCGGCGCAGTGACCCGTTGA---ACCTGA-TCCAGTTCA-TACTGGCG-TAGGGACGGTGCAA MLO_THIB CGCTCTAACGGGGTGCCGGA------ 5 3 5 -----GACCGGCTGAGAGGCAGT------------------CTCGCCAACCCGCTGA---ACCTGA-TCCGGTTTG-TACCGGCG-GAGGGA-TTAGACG MLO_YK GCCCATCCACAGGGGTGCTCCGTAC-------------GGTCGGGGCTGAGACGGGGGCGG-----------CAAGCCCACAGACCCTAGA----AGCTGA-TCTGGGTAA-TACCAGCG-GAGCGA-GGCGGGCG NX_CITX CTCCTTGTCGGAGTGCCGCCGC---------------CGGGCGGCTGAGATTGCGA------------------AAGCAGAATCCGTAGA---ACCTGT--CGGGGTAA-TGCCTGCG-TAGGAA-ACAAACC NX_THIC ATTGAAACAGGGGTGCTGCCTGAT----------GTTTAGGCGGCTGAGAA----------------------------ATACCCTTTAC---ACCCGA-TCGGGATAA-TACCTGCG-TGGGGA-GTTTTCA ACTINOBACTERIAE MT_THIO 
CTGTAGACACGGGAGTCCCGGG--------------AGCGGGGTCTGAGAGTGGGCGCGCCT-------------GCCCTTACCGTCAC----ACCTGA-TCCGGATCA-TGCCGGCG-AAGGGAGGTCAAGGATG MT_THIC GTACCCACGCGGGAGCGCACGC--------------CGAGTGCGCTGAGAGGACGGCTCGGG------------GCCGTCGACCGTACGA---ACCTGA--CCGGGTAA-TGCCGGCG-TAGGGAGTTGCAAATG CGL_THIC CAGTCCCCACGGGCGCCCGA-----------------GCACGGGCTGAGATCGCGCTGATT---------GCTGCGCGAGCACCGTTTGA---ACCTG--TCCGGTTAG-CACCGGCG-AAGGAAGAGAGGAATGGTGCAATG CGL_THID ACTAGGCACGGGGTGCCAACCGGATGG---AAAAATTCCGGAGGCTGAGAAA---------------------------ACACCCGTTGA---ACCTGC-TCTAGCTCG-TACTAGCG-AAGGGATGGCCTTAACGTG CGL_THIE CTTACCCCACGGGTGCCCAAT---------------GCATTGGGCTGAGATTGCGCGCTGT---------TGCTGCGCGGGACCGTTCGA---ACCTG--TCTGGTTAA-CACCAGCG-AAGGAAGCGAGGATTGATTGTCCCGTG CGL_YKOE TCATAGACACGGGTGCTCGGTGA------------AAATCCGGGCTGAGATCTGGCA----------------TAGCCACGACCGTCGA----ACCTG-ATCCGGATAA-TGCCGGCG-ATAGGGAGGAAAAATATG CGL_OARX TAGTGACACGGGGTGCAAAAGCACTTT----AAAAAAGCTTTCGCTGAGATT---------------------------ACACCCGTCGA---ACCTG-ATCCAGTTAG-TACTGGCG-AAGGGACTGTCGCAT CYANOBACTERIA NPU_THIC TCCATGCTAGGGGTGCCTACAT---------------AACCAGGCTGAGATC---------------------------ACACCCTTAAC---ACCTGAGTCTGGGTAA-TACCAGCG-GAGGGAAGCTGTTTATTG CY_THIC CCATAGCTAGGGGTGTCTAGAA---------------AGCTAGGCTGAGAA----------------------------AAACCCTTAGA---ACCTGAGACTGGGTAA-TACCAGCG-GAGGGAAGCTCACCATTC AN_THIC TCCATGCTAGGGGTGCTTGCAC---------------TAACAGGCTGAGATT---------------------------ACACCCTTAAC---ACCTGAGACTGGGTAA-TACCAGCG-AAGGGAAGCTGTTTATTG THERMUS/DEINOCOCCUS, THERMOTOGALES, Fusobacterium, CFB group DR_THIB CGCGTCACCGGGGGTGCCCTGCTT------------CGGCAGCGGCTGAGAAC---------------------------ACACCCCAGGA---ACCTGA-ACCGGGTCA-TTCCGGCG-GAGGGAGTGTGATGC DR_THIC ATCGTCAACAGGGGTGCCTCCGCATA--------TGGGCCGGAGGCTGAGAGGGCAACT---------------CGGGCCTAACCCTATGA---ACCTGA-ACTGGTTAG-CACCAGCG-GAGGGA-GTGTGACG TQ_THIBGGCCGTCACCGGGGGTGCCCCA------------------AAAGGGCTGAGAGC---------------------------ATACCCTTGGA---ACCTGA-TCCGGGTCA-TGCCGGCG-TAGGGAAGGTGACGGCC TM_THI1 
CCTTCCCCAGGGGGAGCTCCTAT---------------TCCGGGGCTGAGAGGAGGACGG-------------AAGTCCTCGACCCCAAGA---ACCTGA-TCCGGGTAA-TGCCGGCG-GAGGGATCGGGGAAGGA FN_THIC TATATGTACTGGGGAGCTT----------------------TGTGCTGAGATTAGAACCT------------TTTTTCTTAGACCCATAGT---ACCT-GA-TTTGGATAA-TGCCAACG-AAGGGA—GTACCA FN_THIX ACTAGTTACAAGGGAGTTAATA-----------------AATTGACTGAGAAAAGGATG--------------TGAGCCTTGACCTTTTG----ACCT-GA-TTTGGATAA-TGCCAACG-TAGGAA--GTAAA PG_THIS AGACCGCTACGGGGGTGCTTGCCG--- 4 3 4 -GATACGGCAGGCTGAGAT---------------------------AATACCCATAG---ACCT-GA-TCCGGATAA-TACCGGCG-GAGGGAT-GTAG PG_OMR ATTGGGAGAAGGGGTGCTTCCTGTA--- 3 7 3 --GTGGATGGCTGAGAAC---------------------------AAACCCTCATC---ACCT-GA-ACCGGATAA-TACCGGCG-TAGGAAA-CTCTC BX_THIS TAAAGACAAAGGGGTGCCACC------------------CGGTGGCTGAGATT---------------------------ATACCCTAAGA---ACCT-GA-TGCAGTTAG-TACTGCCG-AAGGGA—TTGTG ARCHAEA TAC_T1 GGTGTGGTGGGGGAGCTCCAT-----------------AAGGGGCTGAGAGGATCCGG---------------ATGGATCGATCCCTGGA---ACCTGA-TCCGGGTAA-TACCGGCG-GAGGGAAATTATG FAC_T1 AGTTATACCGGGGAGCTAA---------------------AATGCTGAGAGGATAA-------------------GGATCGACCCGTGCA---ACCTGA-TCCGGACAA-TACCGGCG-GAGGGAGATGGATA
Conserved secondary structure of the THI-element
[Figure: consensus secondary structure of the THI-element, with base-paired stems 1 to 5, the thi-box and a facultative stem-loop.]
Capitals: strongly conserved positions. Dashes and points: obligatory and facultative base pairs.
Degenerate positions: R = A or G; Y = C or U; K = G or U; M = A or C; N = any nucleotide.
THI: the mechanism of regulation
• Translation attenuation: Thermus/Deinococcus group, CFB group, Proteobacteria, Actinobacteria, Cyanobacteria, Archaea
• Transcription attenuation: Bacillus/Clostridium group, Thermotoga, Fusobacterium, Chloroflexus
Distribution of THI-elements
Taxonomic group              Genomes analyzed   Genomes with THI   THI elements
α-proteobacteria                     7                  7               15
β-proteobacteria                     6                  6               12
γ-proteobacteria                    18                 17               38
δ- and ε-proteobacteria              3                  1                1
Bacillus/Clostridium group          18                 18               51
Actinomycetes                        9                  9               25
Cyanobacteria                        5                  5                5
Other eubacteria                    14                 11               11
Archaea (Thermoplasma)              17                  3                6
Total                               97                 77              164
Mandal et al., 2003: THI in 3’UTR (plants). THI in untranslated intron (fungi)
Predicted THI-regulated genes: transporters
yuaJ: predicted thiamin transporter (possibly H+-dependent)
• Found only in the Bacillus/Clostridium group;
• Occurs in genomes without the thiamin pathway (Streptococci);
• Has 6 predicted transmembrane segments (TMS);
• Regulated by THI-elements in all cases with only one exception (E. faecalis);
• In B. cereus, the thiamin uptake is coupled to proton movement (Arch Microbiol, 1977).
thiX-thiY-thiZ and ykoF-ykoE-ykoD-ykoC: predicted ATP-dependent HMP transporters
• Found in some Proteobacteria and Firmicutes;
• Not found in genomes without the thiamin pathway;
• Always co-occur with thiD and thiE;
• In Pasteurellae, Brucella and some Gram-positive cocci, they are present without thiC;
• Regulated by THI-elements in all cases with only one exception (T. maritima);
• Putative substrate-binding protein ThiY is homologous to Thi12 from yeast, known to be involved in the biosynthesis of HMP
Predicted THI-regulated genes: more transporters
• thiU from P. multocida and H. influenzae belongs to the possible thiMDE-thiU operon, has 12 predicted TMS; similar to proline permease; no orthologs in other genomes
• thiV from Methylobacillus and H. volcanii is clustered with thiamin genes or has THI-elements; has 13 predicted TMS; similar to the pantothenate symporter PanF from E. coli; no orthologs in other genomes
• thiW from S. pneumoniae and E. faecalis forms an operon with thiamin genes, has 5 predicted TMS; no homologs in other complete genomes
• pnuT from the CFB group of bacteria forms an operon with thiamin-related genes; has 6 TMS; similar to the nicotinamide mononucleotide transporter PnuC from E. coli; no orthologs in other genomes
• cytX from Neisseria and Chloroflexus has 12 TMS; similar to the cytosine permease CodB from E. coli; forms an operon with thiamin genes in Neisseria and Pyrococcus; homologs in other genomes are not regulated by THI-elements.
• thiT1 and thiT2 from three different Thermoplasma (Archaea) are two paralogous genes; have 9 TMS; belong to the MFS family of transporters. This is the first example of THI-element-regulated genes in Archaea
The PnuC family of transporters
[Figure labels: the RFN elements; the THI elements]
Predicted THI-regulated genes: enzymes
• thiN: non-orthologous displacement of thiE
– Separate gene in archaea or with thiD (in M. thermoautotrophicum)
– Always present if ThiD is present and ThiE is absent
• tenA: gene of unknown function somehow associated with thiD
– Found in most firmicutes, some proteobacteria and archaea; ThiD-TenA gene fusions in some eukaryotes
– Forms clusters with thiD and other THI-element-regulated genes in most bacteria
– Single tenA gene is also regulated by THI-elements in some bacteria
– Not found in genomes without the thiamin pathway
– Always co-occurs with the thiD and thiE genes
• tenI: gene of unknown function, thiE paralog
– Found in some unrelated bacteria
– Forms a separate branch in the phylogenetic tree for thiE
– In most bacteria, located in clusters of THI-element-regulated genes
• ylmB from Bacilli belongs to the ArgE/dapE/ACY1/CPG2/yscS family of metallopeptidases; regulated by the THI-elements in B. subtilis and B. halodurans, not regulated in B. cereus.
• thi-4 from Thermotoga maritima belongs to a family of putative thiamine biosynthetic enzymes from archaea and eukaryotes. Located in one operon with thiC and thiD.
• oarX from Methylobacillus and Staphylococcus is a single THI-elements-regulated gene; belongs to the short-chain dehydrogenase/reductase (SDR) superfamily
Metabolic reconstruction of the thiamin biosynthesis
[Figure: reconstructed thiamin biosynthesis pathway. Labels: "= thiN (confirmed)"; Gram-positive bacteria; Gram-negative bacteria; transport of HMP; transport of HET]
THI-elements in delta-proteobacteria: co-operative binding?
• Tandem arrangement of THI-elements upstream of the main thiamine operon thiSGHFE1 in Desulfovibrio spp.
• Tandem arrangement of glycine riboswitches in B. subtilis and V. cholerae (Mandal et al., 2004):
– co-operative binding of the cofactor (glycine)
– rapid activation/repression
– same arrangement in all glycine riboswitches
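Why co-operative binding would give "rapid activation/repression" can be illustrated with the Hill equation. This is a textbook phenomenological model brought in here as an assumption, not something the slides specify; the function names are hypothetical, and a Hill coefficient of 2 is the idealised upper limit for two coupled binding sites rather than a measured value.

```python
def hill_occupancy(ligand, k_half, n):
    """Fraction of riboswitches bound at a given ligand concentration."""
    return ligand ** n / (k_half ** n + ligand ** n)

def switching_range(n, low=0.1, high=0.9):
    """Fold change in ligand needed to go from `low` to `high` occupancy.

    From theta / (1 - theta) = (ligand / k_half) ** n it follows that the
    ratio of the two concentrations is the odds ratio raised to 1 / n.
    """
    odds_ratio = (high / (1 - high)) / (low / (1 - low))
    return odds_ratio ** (1 / n)

if __name__ == "__main__":
    for n in (1, 2):
        print(f"n = {n}: {switching_range(n):.1f}-fold change in ligand")
```

Under this model a single independent site (n = 1) needs an 81-fold change in ligand to move from 10% to 90% occupancy, whereas two fully co-operative sites (n = 2) need only a 9-fold change, so a tandem arrangement responds over a much narrower concentration window.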
B12-box and regulation of cobalamin metabolism genes by adenosylcobalamin (Nou & Kadner, 2000; Ravnum & Andersson, 2001; Nahvi et al., 2002)
• Long mRNA leader is essential for regulation of btuB by vitamin B12.
• Involvement of highly conserved B12-box rAGYCMGgAgaCCkGCcd in regulation of the cobalamin biosynthetic genes (E. coli, S. typhimurium)
• Post-transcriptional regulation: an RBS-sequestering hairpin is essential for regulation of btuB and cbiA
• Ado-CBL is an effector molecule involved in the regulation of the cobalamin biosynthesis genes
Conserved RNA secondary structure of the regulatory B12-element
[Figure: consensus secondary structure of the B12-element. Labels: 5' and 3' ends, base stem, helices P0–P6, B12 box, VS, Add-I, Add-II, facultative stem-loop. Legend of taxonomic groups: Bacillus/Clostridium; -proteobacteria; other taxonomic groups.]
[Figure, panels A and B: the B12-element (helices P0–P6, with pseudoknot) in two states labelled "+Ado-CBL" and "Ado-CBL". Panel A: terminator versus antiterminator, formed by regions 1, 2 and 3. Panel B: RBS-sequestor hairpin versus antisequestor, formed by regions 1 and 2.]
The predicted mechanism of the B12-mediated regulation of cobalamin genes
B12-element regulates cobalamin biosynthetic genes and transporters, cobalt transporters and a number of other cobalamin-related genes.
Distribution of B12-elements in bacterial genomes
Metabolic reconstruction of cobalamin biosynthesis: new enzymes and transporters
Cobalt ion transport: cbiMNQO, hoxN, hupE, cbtAB, cbtC, cbtD, cbtE, cbtG, cnoABCD
If a bacterial genome contains B12-dependent and B12-independent isoenzymes, the genes encoding the B12-independent isoenzymes are regulated by B12-elements
Ribonucleotide reductases
NrdJ (B12-dependent)    NrdAB/NrdDG (B12-independent)
+                       –
–                       +
+                       +
Methionine synthase
MetH (B12-dependent)    MetE (B12-independent)
+                       –
–                       +
+                       +
[Figure labels: three "B12" marks]
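The isoenzyme rule stated above can be written as a small lookup. This is a hedged sketch: the function name and the representation of a genome as a set of enzyme labels are assumptions made for illustration, and the result is only the expectation derived from the rule, not an observed regulon.

```python
# Isoenzyme pairs taken from the table above; labels are used as plain strings.
B12_ISOENZYMES = {
    "ribonucleotide reductase": {
        "dependent": {"NrdJ"},
        "independent": {"NrdAB", "NrdDG"},
    },
    "methionine synthase": {
        "dependent": {"MetH"},
        "independent": {"MetE"},
    },
}


def expected_b12_regulated(genome_enzymes: set[str]) -> set[str]:
    """Return B12-independent isoenzymes expected to be under a B12-element.

    The rule applies only when the genome carries both the B12-dependent and
    the B12-independent isoenzyme of the same family.
    """
    regulated: set[str] = set()
    for family in B12_ISOENZYMES.values():
        has_dependent = bool(family["dependent"] & genome_enzymes)
        independent_present = family["independent"] & genome_enzymes
        if has_dependent and independent_present:
            regulated |= independent_present
    return regulated


if __name__ == "__main__":
    # Hypothetical genome content, not a real organism.
    print(expected_b12_regulated({"MetH", "MetE", "NrdAB"}))
```

In the hypothetical example only MetE is returned: NrdAB has no B12-dependent counterpart (NrdJ) in that genome, so the rule does not apply to it.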
LYS-element: lysine riboswitch
[Figure: consensus secondary structure of the LYS-element. Labels: 5' and 3' ends, base stem, helices P1–P7.]
Reconstruction of the lysine metabolism
[Figure: lysine biosynthesis pathway.
Metabolites: L-aspartate; β-aspartyl-phosphate; aspartate semialdehyde; homoserine; dihydrodipicolinate; tetrahydrodipicolinate; N-acetyl-2-amino-6-ketopimelate / N-succinyl-2-amino-6-ketopimelate; N-acetyl-L,L-diaminopimelate / N-succinyl-L,L-diaminopimelate; L,L-diaminopimelate; meso-diaminopimelate. Lysine transport is also shown.
Gene labels: lysC, dapG, yclM; lysC, thrA, metL; asd; hom; thrA, metL; dapA; dapB; dapD; ykuR; dapC (argD); ddh; patA; dapE; dapF, dal; lysA.
Predicted genes are boxed (pathway of acetylated intermediates in B. subtilis).]
Regulation of lysine catabolism: the first example of an activating riboswitch
• LYS-elements upstream of the pspFkamADEatoDA operon in Thermoanaerobacter tengcongensis and the kamADElysE operon in Fusobacterium nucleatum
– lysine catabolism pathway
– LYS-element overlaps the candidate terminator
=> acts as an activator
• similar architecture of activating adenine riboswitch upstream of purine efflux pump ydhL (pbuE) in B. subtilis (Mandal and Breaker, 2004)
S-box (SAM riboswitch)
[Figure: consensus secondary structure of the S-box. Labels: 5' and 3' ends, base stem, helices P1–P5.]
Reconstruction of the methionine metabolism
[Figure: methionine metabolism pathway.
Metabolites: aspartate semialdehyde; homoserine; threonine; O-acetylhomoserine; sulfide; cystathionine; homocysteine; methionine; S-adenosyl-methionine (SAM); S-adenosyl-homocysteine (SAH); S-ribosyl-homocysteine (SRH); MTA; methylthioribose (MTR); THF; methylene-THF; methyl-THF.
Gene labels: hom; metX; metY; metI; metC; metB; cysH-...; metF; metE; metH; metK; mtn; mtnKSUVWXYZ; yrhA; yrhB; yxjH*.
Predicted genes are marked by * (transport, salvage cycle).]
A new family of amino acid transporters
[Figure: phylogenetic tree of the transporter family. Legend: S-box (rectangle frame); MetJ (circle frame); LYS-element (circles); Tyr-T-box (rectangles). Branch labels: LysT, MetT, TyrT, MleN, NhaC, malate/lactate. Taxon labels: Archaea, clostridia, Pasteurellaceae. Individual gene identifiers at the leaves are not reproduced here.]
Regulation of reverse pathway Met-Cys in Clostridium acetobutylicum
[Figure: locus with the genes ubiG, yrhA and yrhB, showing a sense transcript and an antisense transcript. Labels: Cys-T-box, S-box, cysteine, S-adenosylmethionine.]
Three methionine regulatory systems in Gram-positive bacteria: loss of S-box regulons
• S-boxes (riboswitch)
– Bacillales
– Clostridiales
– the Zoo:
• Petrotoga
• actinobacteria (Streptomyces, Thermobifida)
• Chlorobium, Chloroflexus, Cytophaga
• Fusobacterium
• Deinococcus
• proteobacteria (Xanthomonas, Geobacter)
• Met-T-boxes (Met-tRNA-dependent attenuator)
– Lactobacillales
• MET-boxes (transcription factor MtaR)
– Streptococcales
[Figure: tree with branches Lact., Strep., Bac., Clostr. and ZOO; MetJ, MetR in proteobacteria.]
Riboswitches in the Sargasso sea metagenome
• 125 THI-elements
• 38 LYS-elements
• 25 B12-elements
• 9 RFN-elements
• 3 S-boxes
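The counts above are easier to compare as shares of all riboswitches found in the sample. The short sketch below only restates the slide's numbers; the variable and function names are illustrative.

```python
# Riboswitch counts in the Sargasso Sea metagenome, as listed above.
SARGASSO_COUNTS = {"THI": 125, "LYS": 38, "B12": 25, "RFN": 9, "S-box": 3}


def shares(counts: dict[str, int]) -> dict[str, float]:
    """Return each element's fraction of the total count."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


if __name__ == "__main__":
    total = sum(SARGASSO_COUNTS.values())
    print(f"total elements: {total}")
    for name, frac in shares(SARGASSO_COUNTS).items():
        print(f"{name}: {frac:.1%}")
```

The five classes sum to 200 elements, of which THI-elements make up 62.5%.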
Conserved structures of known riboswitches
[Figure: consensus secondary structures of six riboswitches: RFN-element, THI-element, B12-element, S-box, G-box and LYS-element. Labels: 5' and 3' ends, base stem, numbered helices (P1–P7), B12 box, additional stem-loops (Add, Add I, Add II, Add III), variable region (Var).]
Characterized riboswitches (more are predicted)
• RFN: riboflavin biosynthesis and transport
– ligand: FMN (flavin mononucleotide)
– distribution: Bacillus/Clostridium group, proteobacteria, actinobacteria, other bacteria
• THI: biosynthesis and transport of thiamin and related compounds
– ligand: TPP (thiamin pyrophosphate)
– distribution: Bacillus/Clostridium group, proteobacteria, actinobacteria, cyanobacteria, other bacteria, archaea (thermoplasmas), plants, fungi
• B12: biosynthesis of cobalamin, transport of cobalt, cobalamin-dependent enzymes
– ligand: coenzyme B12 (adenosyl-cobalamin)
– distribution: Bacillus/Clostridium group, proteobacteria, actinobacteria, cyanobacteria, spirochaetes, other bacteria
• S-box: metabolism of methionine and cysteine
– ligand: SAM (S-adenosyl-methionine)
– distribution: Bacillus/Clostridium group and some other bacteria
• LYS: lysine metabolism
– ligand: lysine
– distribution: Bacillus/Clostridium group, enterobacteria, other bacteria
• G-box: metabolism of purines
– ligand: purines
– distribution: Bacillus/Clostridium group and some other bacteria
• glmS: synthesis of glucosamine-6-phosphate
– ligand: glucosamine-6-phosphate
– distribution: Bacillus/Clostridium group
• gcvT: catabolism of glycine
– ligand: glycine
– distribution: Bacillus/Clostridium group
Mechanisms
[Figure, panels A–C: alternative conformations of the 5' leader upstream of the regulated GENES, shown with (+ Effector) and without (- Effector) the ligand. Labels: RNA-element; regulatory hairpin (terminator of transcription, followed by a UUUUUUUU run, in the case of regulation of transcription, and/or RBS-sequestor in the case of regulation of translation); antiterminator/antisequestor; numbered regions 1, 2 and 3.]
glmS: ribozyme, cleaves its mRNA (the Breaker group)
Properties of riboswitches
• Direct binding of ligands
• Same structure – different mechanisms
• Distribution in all taxonomic groups
– diverse bacteria
– archaea – thermoplasmas
– eukaryotes – plants and fungi
• Lineage-specific features…
• … horizontal transfer, duplications, lineage-specific loss
• Correlation of the mechanism and taxonomy:
– attenuation of transcription (anti-anti-terminator) – Bacillus/Clostridium group
– attenuation of translation (anti-anti-sequestor of translation initiation) – proteobacteria
– attenuation of translation (direct sequestor of translation initiation) – actinobacteria
• Andrei Mironov
– software genome analysis, conserved RNA patterns
• Alexei Vitreschak
– analysis of RNA structures
• Dmitry Rodionov
– metabolic reconstruction
• Support:
– Howard Hughes Medical Institute
– INTAS
– Russian Fund of Basic Research
– Russian Academy of Sciences