Identification of a putative methylenetetrahydrofolate reductase by sequence analysis of a 6·8 kb...


YEAST VOL. 12: 1047-1051 (1996)

Yeast Sequencing Reports

Identification of a Putative Methylenetetrahydrofolate Reductase by Sequence Analysis of a 6.8 kb DNA Fragment of Yeast Chromosome VII

BELEN TIZON, ANA M. RODRIGUEZ-TORRES, ESTHER RODRIGUEZ-BELMONTE, JOSE L. CADAHIA AND ESPERANZA CERDAN*

Departamento de Biología Celular y Molecular, Facultad de Ciencias, Universidad de A Coruña, 15071 Spain

Received 19 February 1996; accepted 12 April 1996

We report the sequence analysis of a 6.8 kb DNA fragment from Saccharomyces cerevisiae chromosome VII. This sequence contains five open reading frames (ORFs) greater than 100 amino acids. There is also an incomplete ORF flanking one of the extremes, G2868, which is the 3' end of the SCS3 gene (Hosaka et al., 1994). The translated sequence of ORF G2882 shows similarity to the human methylenetetrahydrofolate reductase (Goyette et al., 1994). ORF G2889 shows no significant homologies with the sequences compiled in databases. ORF G2893 corresponds to the gene SUP44, coding for the yeast ribosomal protein S4 (All-Robyn et al., 1990). G2873 and G2896 are internal ORFs. The whole sequence of the fragment is available at the EMBL nucleotide sequence database, GenBank and Data Bank of Japan under the Accession Number X94106.

KEY WORDS - genome sequencing; Saccharomyces cerevisiae; chromosome VII; methylenetetrahydrofolate reductase; SCS3; SUP44

INTRODUCTION

We have determined a sequence of 6.8 kb from Saccharomyces cerevisiae chromosome VII. This fragment is contained in cosmid pEGH348, and 24,600 bp from the S. cerevisiae insert have been published elsewhere (Rodriguez-Belmonte et al., 1996; Escribano et al., 1996). This report represents the last remaining part to complete the analysis of the whole clone. The sequence reported here contains a fragment of the SCS3 gene (Hosaka et al., 1994), a putative 5,10-methylenetetrahydrofolate reductase (MTHFR) gene, one open reading frame (ORF) of unknown function, two internal ORFs, and the previously sequenced SUP44 gene (All-Robyn et al., 1990). This work forms part of the European BIOTECH II project to sequence the genome of S. cerevisiae.

*Corresponding author.

MATERIALS AND METHODS

Cosmid

Cosmid pEGH348 was provided by H. Tettelin (Université Catholique de Louvain, Louvain-la-Neuve). It contains about 40.9 kbp of DNA from region 28 of chromosome VII inserted in the BamHI site of pWE15 (Evans and Wahl, 1987). The EcoRI restriction map of the pEGH348 insert and the sequenced area presented here is shown in Figure 1.

Strains and plasmids

Escherichia coli strains HB101 and JM109 were used for plasmid transformation and amplification. Cells were grown, maintained and transformed on selective media by conventional methods (Sambrook et al., 1989). Subcloning from pEGH348 was done in pBluescript KS+ (Stratagene), which was also used for double-strand sequencing.

CCC 0749-503X/96/101047-05 © 1996 by John Wiley & Sons Ltd

[Figure 1: map of the pEGH348 insert showing EcoRI sites (R), a 1 kb scale bar and arrows for ORFs G2868, G2882, G2889 and G2893.]

Figure 1. EcoRI restriction map of the Saccharomyces cerevisiae DNA insert in cosmid clone pEGH348. The black bar indicates the sequenced region presented here. Relative positions of ORFs within the contiguous sequence are indicated by arrows.

DNA preparation and recombinant DNA techniques

Enzymes, from Boehringer Mannheim, were employed as recommended by the manufacturer. Cosmid DNA was purified using the plasmid kit from Qiagen. Plasmid DNA for sequencing was obtained with the Wizard midiprep DNA purification system (Promega).

Sequencing strategy

Three contiguous EcoRI-EcoRI subclones of 5600, 4200 and 5900 bp were deleted by the DNase I method (Sambrook et al., 1989) and sequenced on both strands by the method of Sanger et al. (1977). The junctions between contiguous fragments and some gaps not covered by the deletions were sequenced using synthetic primers and overlapping subclones as templates.

Sequence analysis

The complete nucleotide sequence has been entered in the EMBL Data Library under Accession Number X94106. The sequence was analysed using DNASIS/PROSIS (Hitachi) and FASTA (Pearson and Lipman, 1988) software. Additional analyses were performed at MIPS (Martinsrieder Institut für Proteinsequenzen).
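The ORF detection step behind this analysis can be illustrated with a small six-frame scan. The following is only a sketch under stated assumptions, not the DNASIS/PROSIS procedure used by the authors: it assumes the standard stop codons, requires an ATG start, ignores ORFs truncated at the ends of the fragment, and all function names are hypothetical. Coordinates are reported 1-based on the Watson strand, with start greater than end for Crick-strand ORFs.

```python
# Minimal six-frame ORF scan (illustrative; not the software used in the paper).
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard nuclear genetic code


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_strand(seq: str, min_codons: int):
    """Yield 0-based half-open (start, end) spans of ATG..stop ORFs.

    The stop codon is excluded from the span, and only ORFs with at
    least `min_codons` codons are reported.
    """
    n = len(seq)
    for frame in range(3):
        start = None
        for i in range(frame, n - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG after the previous stop in this frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    yield start, i
                start = None


def find_orfs(seq: str, min_codons: int = 100):
    """Return (strand, start, end) tuples in 1-based Watson coordinates."""
    seq = seq.upper()
    n = len(seq)
    found = [("W", s + 1, e) for s, e in orfs_in_strand(seq, min_codons)]
    for s, e in orfs_in_strand(reverse_complement(seq), min_codons):
        # Map positions on the reverse complement back to the Watson strand.
        found.append(("C", n - s, n - e + 1))
    return found


if __name__ == "__main__":
    toy = "CCATGGCTGCTGCTTAAG"  # made-up sequence, not taken from X94106
    print(find_orfs(toy, min_codons=2))
```

With the default `min_codons=100` the scan mirrors the 100 amino acid cut-off mentioned in the abstract; a real analysis would also need to handle ORFs that run off either end of the fragment, such as G2868.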

RESULTS AND DISCUSSION

The region of the pEGH348 insert which is described in this work is shown in Figure 1. The left end overlaps with the insert of cosmid pEGH163 (sequenced at Guy Lauquin's laboratory). Sequence analysis reveals five complete ORFs plus another one that extends into an area already published (Rodriguez-Belmonte et al., 1996). The characteristics of these ORFs are described in Table 1.

Two of the ORFs have been described already. G2868 overlaps with the SCS3 gene which is required for inositol prototrophy (Hosaka et al., 1994; Rodriguez-Belmonte et al., 1996). G2893 is identical to SUP44, the gene encoding S. cerevisiae ribosomal protein S4 (All-Robyn et al., 1990). Analysis of the two novel, non-internal ORFs is described below.

Table 1. Major characteristic features of the open reading frames identified in the 6.8 kb fragment.

ORF¹     Amino acids  Strand²  Start  End   CAI³  Homologous protein  Accession number⁴  Molecular weight (Da)
G2868u   380          W               442   0.07  SCS3                pir:S53293         42733
G2873i   104          C        393    82
G2882    599          W        826    2622  0.22  MTHFR               pir:S53294         68554
                                                                      pi1346454
                                                                      pir:S40884
                                                                      pir:S03169
                                                                      pir:H64123
G2889    644          C        5017   3086  0.01                                         73490
G2893    254          W        5920   6681  0.80  SUP44               pir:R3BYS2         27448
G2896i   108          C        6375   6052

¹Open reading frame, u = incomplete, i = internal; ²C = Crick strand, W = Watson strand; ³codon adaptation index; ⁴pir = Protein Information Resource.
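The coordinates in Table 1 can be cross-checked against the reported amino acid counts. The sketch below assumes that Start and End are 1-based inclusive positions that exclude the stop codon, which is an inference from the numbers rather than something the table states; the incomplete ORF G2868 is left out because only one of its coordinates is listed. The names `TABLE_1`, `codons_spanned` and `check_table` are hypothetical.

```python
# Coordinates transcribed from Table 1 (complete and internal ORFs only).
TABLE_1 = {
    # name: (amino_acids, strand, start, end)
    "G2873i": (104, "C", 393, 82),
    "G2882": (599, "W", 826, 2622),
    "G2889": (644, "C", 5017, 3086),
    "G2893": (254, "W", 5920, 6681),
    "G2896i": (108, "C", 6375, 6052),
}


def codons_spanned(start: int, end: int) -> int:
    """Number of codons between two inclusive coordinates, on either strand."""
    span = abs(end - start) + 1
    if span % 3:
        raise ValueError(f"span of {span} nt is not a whole number of codons")
    return span // 3


def check_table(table):
    """Map each ORF name to True when coordinates and strand agree with the table."""
    result = {}
    for name, (amino_acids, strand, start, end) in table.items():
        # Watson ORFs run left to right, Crick ORFs right to left.
        orientation_ok = (start < end) if strand == "W" else (start > end)
        result[name] = orientation_ok and codons_spanned(start, end) == amino_acids
    return result


if __name__ == "__main__":
    for name, ok in check_table(TABLE_1).items():
        print(name, "consistent" if ok else "inconsistent")
```

Under that assumption every listed ORF is internally consistent; for example, G2882 spans 2622 - 826 + 1 = 1797 nucleotides, or 599 codons.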


[Figure 2: multiple sequence alignment; rows are labelled rad1, G2882, MTHFHU, MTHFEC and MTHFST.]

Figure 2. Alignment of methylenetetrahydrofolate reductases from various species. Bold residues represent identities or equivalent substitutions common to eukaryotic and prokaryotic MTHFR. Underlined residues represent identities only conserved at the eukaryotic level. The highly charged motif KRREED is indicated with asterisks.

G2882

FASTA analysis shows that translation of G2882 gives the amino acid sequence of a protein with extensive homology to human MTHFR (EC 1.5.1.20). This enzyme catalyses the NADPH-linked reduction of 5,10-methylenetetrahydrofolate to 5-methyltetrahydrofolate, a cofactor for methylation of homocysteine to methionine. The most common human inborn error of folic acid metabolism is related to a deficiency of this enzyme, which, in the most severe forms, can cause developmental delay, motor abnormalities, seizures and psychiatric disturbances (Rosenblatt, 1989).

Mammalian enzymes characterized so far, human and porcine, can be reduced by NADPH or NADH and are allosterically regulated by S-adenosylmethionine (Kutzbach and Stokstad, 1971; Daubner and Matthews, 1982). Both native enzymes appear to consist of dimers of identical 77 kDa subunits (Zhou et al., 1990; Matthews, 1986). The porcine MTHFR protein is organized into structurally and functionally different domains, which can be evidenced by tryptic cleavage: the 40 kDa N-terminal domain contains the catalytic center and the 37 kDa C-terminal domain the regulatory center. After treatment with trypsin there is a loss of allosteric regulation by S-adenosylmethionine without effect on the catalytic activity, and the two fragments remain associated non-covalently (Matthews et al., 1984). It has been proposed that, in the human protein, the point of cleavage lies between residues 351 and 374; this region contains the highly charged sequence KRREED (Figure 2; Goyette et al., 1994).
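A highly charged stretch such as KRREED can be located with a simple sliding-window count of charged residues. This is a minimal sketch, not a method described by the authors: treating D, E, K and R as the charged set, the window size and the threshold are all assumptions, and the example peptide is invented for illustration rather than taken from any MTHFR sequence.

```python
CHARGED = set("DEKR")  # assumption: histidine is not counted as charged


def charged_windows(protein: str, window: int = 6, min_charged: int = 6):
    """Return (1-based start, segment) for windows rich in charged residues."""
    hits = []
    for i in range(len(protein) - window + 1):
        segment = protein[i:i + window]
        if sum(residue in CHARGED for residue in segment) >= min_charged:
            hits.append((i + 1, segment))
    return hits


if __name__ == "__main__":
    toy_peptide = "MAGSLKRREEDVLPA"  # made-up peptide containing the motif
    print(charged_windows(toy_peptide))
    print(charged_windows(toy_peptide, min_charged=5))
```

Lowering `min_charged` widens the search to neighbouring windows, which is useful when a homologue carries a conservative substitution inside the motif.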

Sequences for two MTHFR proteins from bacteria are available in the databases, one from E. coli (accession number S40884) and the other from Salmonella typhimurium (accession number S03169); both have 296 amino acids and a high identity between them. MTHFR purified from E. coli (EC 1.7.99.5) is a 33 kDa protein which utilizes reduced FAD as a source of reducing equivalents for reduction of methylene to methyltetrahydrofolate and it does not oxidize NADH or NADPH. S-adenosylmethionine has no effect on the activity of the enzyme (Katzen and Buchanan, 1965).

The genes encoding MTHFR in bacteria and humans have been cloned and sequenced, although the human cDNA clone does not contain the whole ORF but only a portion corresponding to 416 amino acids from the N-terminal end (Saint-Girons et al., 1983; Stauffer and Stauffer, 1988; Goyette et al., 1994). No data about the MTHFR gene(s) in S. cerevisiae have been reported so far, although an ORF that is divergently transcribed with respect to the excision repair gene RAD1, located on chromosome XVI (accession number K02070), shows homology to bacterial and mammalian MTHFR genes (Yang and Friedberg, 1984).

Figure 2 shows the alignment of G2882 and four MTHFR proteins from different sources. The alignment of MTHFRs from prokaryotes and eukaryotes reveals that there is a high degree of sequence conservation within the N-terminal domain; this region probably contains the substrate binding site. The deduced molecular weight of the protein encoded by G2882 is 68 kDa, in contrast to the ORF located in yeast chromosome XVI, which has a molecular weight of 19 kDa and only contains the N-terminal domain. The putative MTHFR from chromosome VII has a sequence very similar to human MTHFR, both in the catalytic and regulatory domains. Moreover, the sequence flanking the proposed point of tryptic cleavage is highly conserved (Figure 2). Future functional analysis of these ORFs and purification of the corresponding proteins will be necessary to understand the evolutionary and physiological significance of the coexistence of these two genes in yeast.
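Statements about sequence conservation are usually backed by a percent identity computed over aligned positions. The sketch below shows one common convention, counting only columns where neither sequence has a gap; other denominators (alignment length, shorter sequence length) give different values, and the paper does not say which was used. The function name and the two aligned strings are hypothetical and are not copied from Figure 2.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    # Made-up aligned fragments, for illustration only.
    seq_a = "FSFEYF-PKT"
    seq_b = "FSLEFFPPRT"
    print(f"{percent_identity(seq_a, seq_b):.1f}% identity")
```

Applied separately to the N-terminal and C-terminal halves of an alignment, the same function would give a rough per-domain comparison of the kind discussed above.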

G2889

Translation of ORF G2889 shows no significant homologies with other proteins in the data banks, and no hypothesis about its function can be deduced from the sequence analysis. The G2889 protein has a molecular weight of 73.5 kDa, with leucine (13%) and serine (11%) as predominant amino acids. No transmembrane segments could be detected by the method of Kyte and Doolittle (1982). The carboxy-terminal region has a moderate abundance of acidic amino acids. A codon adaptation index value of 0.01 indicates a very low level of expression.
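The transmembrane check mentioned above rests on a sliding-window average of Kyte and Doolittle hydropathy values. The following sketch uses the published hydropathy scale; the 19-residue window and the 1.6 cut-off are the values commonly associated with that method, but the parameters actually used for G2889 are not given in the text, so they are assumptions here, as are the function names.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 19):
    """Return the mean hydropathy of every full window along the protein."""
    if window > len(protein):
        return []
    scores = [KD[residue] for residue in protein]  # KeyError on non-standard residues
    total = sum(scores[:window])
    profile = [total / window]
    for i in range(window, len(scores)):
        total += scores[i] - scores[i - window]  # slide the window by one residue
        profile.append(total / window)
    return profile


def candidate_tm_windows(protein: str, window: int = 19, threshold: float = 1.6):
    """Return 1-based start positions of windows at or above the threshold."""
    profile = hydropathy_profile(protein, window)
    return [i + 1 for i, value in enumerate(profile) if value >= threshold]


if __name__ == "__main__":
    hydrophobic = "L" * 25    # artificial stretch, expected to be flagged
    charged = "DEKR" * 10     # artificial stretch, expected to be clean
    print(candidate_tm_windows(hydrophobic))
    print(candidate_tm_windows(charged))
```

An empty result for a whole protein corresponds to the "no transmembrane segments" outcome reported for G2889, within the limits of this simple criterion.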

ACKNOWLEDGEMENTS

This work was supported by the Commission of the European Communities under the BIOTECH II programme and by CICYT (BIO94-1333-CE). We are grateful to H. Tettelin for the clone and preliminary information. We thank MIPS, especially K. Kleine, for help with sequence analysis; the Servei of Bioinformatics of Valencia University for access to the GCG package; and A. Santos from the Servicios Informáticos de Apoyo a la Investigación, Universidad de A Coruña. E. R.-B. was supported by a fellowship from the Universidad de A Coruña (Spain).


REFERENCES

All-Robyn, J. A., Brown, N., Otaka, E. and Liebman, S. W. (1990). Sequence and functional similarity between a yeast ribosomal protein and the Escherichia coli S5 ram protein. Mol. Cell. Biol. 10, 6544-6553.

Daubner, S. C. and Matthews, R. G. (1982). Purification and properties of methylenetetrahydrofolate reductase from pig liver. J. Biol. Chem. 257, 140-145.

Escribano, V., Eraso, P., Portillo, F. and Mazon, M. J. (1996). A 14.6 kb DNA fragment of Saccharomyces cerevisiae chromosome VII reveals SEC27, SSM1b, a putative S-adenosyl methionine-dependent enzyme and six new ORFs. Yeast (submitted).

Evans, G. A. and Wahl, G. M. (1987). Cosmid vectors for genomic walking and rapid restriction mapping. Methods Enzymol. 152, 604-610.

Goyette, P., Sumner, J. S., Milos, R. et al. (1994). Human methylenetetrahydrofolate reductase: isolation of cDNA, mapping and mutation identification. Nature Genetics 7, 195-200.

Hosaka, K., Nikawa, J., Kodaki, T., Ishizu, H. and Yamashita, S. (1994). Cloning and sequence of the SCS3 gene which is required for inositol prototrophy in Saccharomyces cerevisiae. J. Biochem. 116, 1317-1321.

Katzen, H. M. and Buchanan, J. M. (1965). Enzymatic synthesis of the methyl group of methionine VIII. Repression-derepression, purification and properties of 5,10-methylenetetrahydrofolate reductase from Escherichia coli. J. Biol. Chem. 240, 825-835.

Kutzbach, C. and Stokstad, E. L. R. (1971). Mammalian methylenetetrahydrofolate reductase. Partial purification, properties and inhibition by S-adenosylmethionine. Biochim. Biophys. Acta 250, 459-477.

Kyte, J. and Doolittle, R. F. (1982). A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157, 105-132.

Matthews, R. G., Vanoni, M. A., Hainfeld, J. F. and Wall, J. (1984). Methylenetetrahydrofolate reductase. Evidence for spatially distinct subunit domains obtained by scanning transmission electron microscopy and limited proteolysis. J. Biol. Chem. 259, 11647-11650.

Matthews, R. G. (1986). Methylenetetrahydrofolate reductase from pig liver. Methods Enzymol. 122, 372-381.

Pearson, W. R. and Lipman, D. J. (1988). Improved tools for biological sequence comparison. Proc. Natl. Acad. Sci. USA 85, 2444-2448.

Rodriguez-Belmonte, E., Rodriguez-Torres, A. M., Tizon, B. et al. (1996). Sequence analysis of a 10 kb DNA fragment from yeast chromosome VII reveals a novel member of the DnaJ family. Yeast 12, 145-148.

Rosenblatt, D. S. (1989). Inherited disorders of folate transport and metabolism. In Scriver, C. R., Beaudet, A. L., Sly, W. S. and Valle, D. (Eds), The Metabolic Basis of Inherited Disease. McGraw-Hill, New York, pp. 2049-2064.

Saint-Girons, I. et al. (1983). Nucleotide sequence of metF, the E. coli structural gene for 5-10 methylene tetrahydrofolate reductase and its control region. Nucl. Acids Res. 11, 6723-6732.

Sambrook, J., Fritsch, E. F. and Maniatis, T. (1989). Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory Press, New York.

Sanger, F., Nicklen, S. and Coulson, A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74, 5463-5467.

Stauffer, G. V. and Stauffer, L. T. (1988). Cloning and nucleotide sequence of the Salmonella typhimurium LT2 metF gene and its homology with the corresponding sequence of Escherichia coli. Mol. Gen. Genet. 212, 246-251.

Yang, E. and Friedberg, E. C. (1984). Molecular cloning and nucleotide sequence analysis of the Saccharomyces cerevisiae RAD1 gene. Mol. Cell. Biol. 4, 2161-2169.

Zhou, J., Kang, S.-S., Wong, P. W. K., Fournier, B. and Rozen, R. (1990). Purification and characterization of methylenetetrahydrofolate reductase from human cadaver liver. Biochem. Med. Metab. Biol. 43, 234-242.