ELSEVIER Biochimica et Biophysica Acta 1308 (1996) 88-92


Naturally occurring splicing variants of the hMSH2 gene containing nonsense codons identify possible mRNA instability motifs within the gene coding region

Brendan Marshall *, Gloria Isidro, Maria Guida Boavida

Departamento de Genética Humana, Instituto Nacional de Saúde Dr. Ricardo Jorge, Avenida Padre Cruz, 1699 Lisboa, Portugal

Received 19 December 1995; revised 26 March 1996; accepted 4 April 1996

Abstract

We have identified certain unusually spliced cDNA species following PCR amplification of peripheral blood lymphocyte (PBL) mRNA from the hMSH2 gene. A naturally occurring transcript containing a nonsense codon due to the skipping of 5 exons was amplified from PBLs of several healthy individuals. A feature of this and another unusual splicing product was the presence, in the region immediately downstream of the stop codon, of sequence motifs which bore significant similarity to mRNA instability determinants. In particular, the rare tetranucleotide GAUG, previously identified in yeast as being of critical importance to the rapid degradation of nonsense-containing mRNAs, was situated 23 base pairs downstream of the stop codon. Furthermore, the region downstream of the stop codon was A:U rich and contained two copies of the AUUUA motif. As other forms of alternative splicing would not result in the same juxtaposition of stop codons and instability motifs, we suggest that the stop codons may have been deliberately introduced by the splicing process for their proximity to these destabilising motifs, and that splicing may play a role in channelling mRNAs into degradative pathways. These results are consistent with the hypothesis that nuclear factors may scan pre-mRNAs prior to splicing.

Keywords: Splicing variant; mRNA; Instability motif; hMSH2 gene; Nonsense codon; (Human lymphocyte)

1. Introduction

Correct splice site selection is an essential component of normal eukaryotic gene expression, and multiple cis- and trans-acting factors have now been described which ensure the fidelity of the splicing process [1,2]. Under particular physiological conditions splice site selection may vary depending on the gene in question, resulting in alternative splicing and multiple transcripts from a single gene. Recently an unexpected but intriguing observation has suggested that inappropriate nonsense mutations inserted within the coding region of various genes may also result in altered splice site selection, leading to the production of mature transcripts which lack the nonsense-containing exon ('exon skipping') [3]. This has led to the suggestion that some component of the splicing machinery may be able to 'proof read' the frame of pre-mRNAs and thus eliminate potentially deleterious nonsense codons which would result in a truncated protein. Multiple examples now exist of this phenomenon [4]. Whether proof-reading is completed by the translation machinery or some other factor is presently uncertain.

* Corresponding author. Fax: + 351 1 7590441; e-mail: p.loureir @ pen.gulbenkian.pt.

0167-4781/96/$15.00 © 1996 Elsevier Science B.V. All rights reserved. PII S0167-4781(96)00078-4

Nevertheless, an association between the processes of translation and mRNA turnover is indicated by the phenomenon of nonsense-mediated mRNA decay, whereby transcripts which contain nonsense mutations exhibit diminished abundance relative to normal transcripts. Premature stop codons introduced into the coding region of genes may cause a diminution of transcript half-life, particularly in nuclear RNA, whereas cytoplasmic RNA appears to be unaffected [5,6]. It is now well established that cis-acting sequences within the transcript itself contribute to this enhanced instability when they are positioned downstream of stop codons. These sequences may also occur within the 3'-untranslated region of genes. There are essentially three main types of mRNA destabilising motifs which can predict mRNA turnover in other genes. The first is the AUUUA motif, which is present in the 3'-untranslated region of many oncogene and cytokine mRNAs [7,8]. However, it has recently been reported that the AUUUA motif, while being essential for mRNA destabilisation, is not in itself sufficient and that the nonamer UUAUUUAUU is the minimal determinant [9,10]. Secondly, Zhang and co-workers have reported the identification in yeast of a 13 base pair consensus sequence involved in nonsense-mediated mRNA decay, namely TGYYGATGYYYYY, where Y is T or C [11]. The sequence is typically located within coding regions and its activity is modulated by flanking sequences. The critical feature of the motif is the GATG tetranucleotide, which is complementary to 18S rRNA and may initiate mRNA degradation by facilitating pausing of the ribosome on the nonsense-containing mRNA. Thirdly, a 19 base consensus sequence has been identified in Xenopus and appears to serve as an endonuclease recognition site [12].
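As an illustration of how the sequence-defined motifs described above could be located in a transcript, the following is a minimal Python sketch, not a method used in this study. The function name `find_motifs`, the motif labels and the toy input sequences are hypothetical; the patterns themselves are taken from the text (AUUUA, UUAUUUAUU, and TGYYGATGYYYYY with Y = T or C). The 19 base Xenopus consensus is omitted because its sequence is not given here.

```python
import re

# Motif patterns written in RNA letters; DNA input is converted (T -> U) first.
MOTIFS = {
    "AUUUA pentamer": "AUUUA",
    "UUAUUUAUU nonamer": "UUAUUUAUU",
    # Yeast consensus TGYYGATGYYYYY, where Y is T or C (U or C in RNA).
    "yeast 13-base consensus": "UG[UC]{2}GAUG[UC]{5}",
    "GAUG tetranucleotide": "GAUG",
}


def find_motifs(seq):
    """Return {motif label: [0-based start positions]} for a DNA or RNA string."""
    rna = seq.upper().replace("T", "U")
    hits = {}
    for label, pattern in MOTIFS.items():
        # A zero-width lookahead reports overlapping occurrences as well.
        hits[label] = [m.start() for m in re.finditer(f"(?={pattern})", rna)]
    return hits


if __name__ == "__main__":
    # Toy sequences for illustration only; they are not hMSH2 sequence.
    print(find_motifs("UUAUUUAUUUA"))
    print(find_motifs("TGTCGATGCTTCC"))
```

Exact matching of this kind only flags candidate sites; as noted above, the activity of such motifs is reported to depend on flanking sequence and position.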

Two general hypotheses, which are not mutually exclusive, have been put forward to explain the mechanism of nonsense-mediated mRNA decay. Firstly, it has been suggested that translation of nonsense-containing mRNAs may result in pausing of the ribosomes upon contact with the stop codon and subsequent inhibition of splicing of unspliced introns downstream of the stop codon, causing rapid degradation of the intron-containing RNAs. A second hypothesis states that nuclear factors may scan and detect in-frame nonsense codons on nuclear RNAs, initiating degradation. We wished to investigate the possibility of a relationship between the phenomenon of nonsense-mediated mRNA decay on the one hand and the apparent ability of the splicing machinery to scan reading frames on the other. We reasoned that processes such as these may be utilised not only in the presence of nonsense mutations but also under normal circumstances. We therefore wished to determine whether naturally occurring transcripts encoding truncated proteins due to the presence of premature nonsense codons could be detected in vivo, as naturally occurring splicing variants might provide evidence for our hypothesis. We chose to investigate the hMSH2 gene, which is the human homologue of the bacterial MutS gene.

2. Materials and methods

2.1. Amplification of hMSH2 cDNA

Peripheral blood lymphocytes were isolated from 10 ml of peripheral blood drawn from 10 healthy individuals and total cellular mRNA was isolated using the Pharmacia 'Quick Prep' Kit. cDNA was synthesised by the random hexamer primer method using 50 ng of total cellular mRNA. Codons 8 to 388 of hMSH2 cDNA were amplified using 2 rounds of nested PCR. In the first round PCR reaction, using primer set 1, reaction conditions were an initial denaturation at 95°C for 5 minutes followed by 35 cycles of 95°C for 30 seconds, 54°C for 1 minute and 72°C for 2 minutes, using 50-100 ng of cDNA template, 300 ng each of forward and reverse primer and 1.5 units of AmpliTaq DNA polymerase.

In the second nested PCR, 1 µl of the first 50 µl PCR reaction was diluted 1:10 and 1 µl of the dilution was used as template with primer set 2. Conditions for this reaction were the same as for the first PCR reaction except that the primer annealing temperature was 62°C. Negative controls were processed as above except that the reverse transcriptase step was omitted. PCR products were visualised following electrophoresis in 2% agarose.

Primer set 1 was: 5'-CAGCCGAAGGAGACGCTGC-3' sense; 5'-CTTCTTGGCAAGTCGGTTAAG-3' antisense.

Primer set 2 was: 5'-GCTGCAGTTGGAGAGAGCGCGGC-3' sense; 5'-GGTTAAGATCTGGGAATCGAC-3' antisense.

Codons 680-934 and 146 bp of the 3'-untranslated region were amplified using primer sets 3 and 4.

Primer set 3 was: 5'-GACAAACTGGGGTGATAGTAC-3' sense; 5'-CAGCACATCACTTATTATTGC-3' antisense.

Primer set 4 was: 5'-GTACTCATGGCCCAAATTGGG-3' sense; 5'-CTATGTCAATTGCAAACAGTC-3' antisense.
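For readers who wish to sanity-check the primers listed above, the short Python sketch below computes length, G+C count, reverse complement and a rough melting temperature. It is an illustration only: the primer strings are the sequences as transcribed here with spacing and line-break hyphens removed, the dictionary keys and function names are hypothetical, and the 2(A+T) + 4(G+C) rule of thumb is a generic approximation that is not stated in this paper and does not reflect the annealing temperatures actually used.

```python
# Primer sequences as transcribed from the Methods (5'->3'); keys are hypothetical labels.
PRIMERS = {
    "set1_sense": "CAGCCGAAGGAGACGCTGC",
    "set1_antisense": "CTTCTTGGCAAGTCGGTTAAG",
    "set2_sense": "GCTGCAGTTGGAGAGAGCGCGGC",
    "set2_antisense": "GGTTAAGATCTGGGAATCGAC",
    "set3_sense": "GACAAACTGGGGTGATAGTAC",
    "set3_antisense": "CAGCACATCACTTATTATTGC",
    "set4_sense": "GTACTCATGGCCCAAATTGGG",
    "set4_antisense": "CTATGTCAATTGCAAACAGTC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_count(seq):
    """Number of G and C bases in a DNA string."""
    return seq.count("G") + seq.count("C")


def rule_of_thumb_tm(seq):
    """Rough melting temperature in degrees C: 2 per A/T plus 4 per G/C (approximation)."""
    gc = gc_count(seq)
    return 2 * (len(seq) - gc) + 4 * gc


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), gc_count(seq), rule_of_thumb_tm(seq), reverse_complement(seq))
```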

2.2. DNA sequencing

hMSH2 cDNA PCR products were direct sequenced using the 'Sequenase' kit (United States Biochemical) according to the manufacturer's instructions. Sequencing was performed using both primers from primer set 2 as shown above.

3. Results

The hMSH2 gene is involved in the mismatch repair of DNA and the entire cDNA sequence of the gene has recently been published due to the involvement of the gene in hereditary non-polyposis colon cancer [13,14]. As nonsense-containing transcripts are likely to be rare due to possible rapid degradation, we used a nested PCR protocol to amplify codons 8 to 388 utilising primer sets 1 and 2. As template, we used cDNA prepared from PBLs of 10 healthy individuals. Under these amplification conditions we were able to amplify, in addition to the expected full length product of 1.1 kb, additional smaller amplification products in 6 of the 10 samples (lanes 2, 3, 4, 5, 6, and 8) as shown in Fig. 1. Negative controls in which the reverse transcriptase was omitted were routinely negative for all of the above bands.

Sequencing of the smaller products revealed that they were all products of the hMSH2 gene and most were the result of unexpected splicing events. All but one contained a premature stop codon. Samples 3, 4, 5 and 6 in Fig. 1 all contain smaller amplification products of approx. 280 bp. Sequencing revealed that these products were identical and resulted from the splicing of exon 1 to exon 7 at the constitutively used splice sites with the subsequent deletion of exons 2-6 inclusive (Fig. 2A). This resulted in the introduction of a premature stop codon (TGA) in exon 7 at nucleotide 1079 of the cDNA sequence [14]. We have since determined that this particular splicing product is relatively common, being detectable in a significant number of individuals. The ≈600 base pair PCR product in sample 2 appears to be the result of a splicing event between nucleotide 228 in exon 2 and nucleotide 800 in exon 5, resulting in the out-of-frame deletion of 572 base pairs of cDNA sequence with the consequent introduction of a TAA stop codon at nucleotides 818-820 (Fig. 2B). The sequences in exons 2 and 5 at which splicing has occurred appear to be cryptic splice sites, as the splice donor and acceptor sequences show considerable homology to consensus splice site sequences. The splice donor sequence of AG/AGTGTT bears substantial homology to the consensus mammalian splice donor site of AG/GTRAGT, where the bar represents the exon/intron boundary, although the invariant GT splice donor sequence is one base removed from the usual position. The splice acceptor sequence at nucleotides 797-799 of the cDNA is the canonical AG dinucleotide sequence, invariably found at mammalian splice acceptor sites. Sequencing of exon-intron boundaries failed to show any mutations at splice donor-acceptor sites in these individuals (not shown), while genomic DNA could not be amplified using the primers shown, indicating that the amplification products are not due to deletions in contaminating genomic DNA.
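The reading-frame reasoning in the paragraph above can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: the deletion length is taken as the simple difference of the two cDNA coordinates quoted in the text (800 - 228 = 572), and the helper names and the toy sequence passed to `first_in_frame_stop` are hypothetical and are not hMSH2 sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def deletion_length(donor_nt, acceptor_nt):
    """Length of deleted cDNA, taken as the difference of the quoted coordinates."""
    return acceptor_nt - donor_nt


def preserves_frame(length_change_bp):
    """A deletion or insertion keeps the reading frame only if it is a multiple of 3."""
    return length_change_bp % 3 == 0


def first_in_frame_stop(cdna, frame_start=0):
    """0-based index of the first in-frame stop codon, or None if there is none."""
    seq = cdna.upper()
    for i in range(frame_start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


if __name__ == "__main__":
    n = deletion_length(228, 800)
    print(n, n % 3, preserves_frame(n))          # 572 bp leaves a remainder of 2
    print(first_in_frame_stop("ATGAAATAAGGG"))   # toy sequence: ATG AAA TAA GGG
```

Because 572 is not a multiple of 3, codons downstream of the novel junction are read in a shifted frame, which is how a stop codon can arise shortly after the junction.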

In addition, other shorter RT-PCR products were present in sample 8 and sample 3, which contained the ≈280 bp product referred to above and a second shorter transcript of ≈600 bp (Fig. 1B). This second shorter transcript of ≈600 bp appeared to be the result of an unusual process, as it contained a deletion of 553 bp which did not coincide with the use of constitutive or cryptic splice sites and, in place of the missing sequence, contained an insertion of 36 bp which bore 100% homology to certain Alu repeats and is not present in any hMSH2 exonic region (Marshall et al., unpublished results). The shorter transcript in sample 8 was a result of a ≈700 bp in-frame deletion and thus no stop codon was present.

Fig. 1. Amplification of codons 8 to 388 of hMSH2 cDNA. Individual lanes (1-10) represent the amplification products obtained from different healthy individuals. The smallest product present in all lanes is due to the PCR primers. M: Molecular weight ladder.

Fig. 2. Sequence of nonsense-containing cDNAs from samples 3, 4, 5 and 6 (A) and sample 2 (B).

We analysed the sequences in the region immediately downstream from the stop codons in sample 2 and samples 3, 4, 5 and 6 and were surprised to find sequences which bear similarity to previously described mRNA destabilising motifs. A common feature of these downstream sequences in sample 2 and samples 3, 4, 5 and 6 was that they were all A:U rich and the relatively rare tetranucleotide GAUG was situated 23 base pairs downstream from the stop codon in each. Within the 1.1 kb being amplified there are only 4 GAUG motifs (in exons 3, 4, 5 and 7). Furthermore, within the 23 base pair A:U rich region which separates the stop codon from the GAUG motif there is a 14 base pair region of homology with only 2 mismatches, although the position of the 14 base pair sequence relative to the GAUG motif differs in the two samples (Fig. 3). Additionally, in samples 3, 4, 5 and 6 there are two AUUUA motifs within 60 base pairs downstream of the introduced stop codon (not shown). Within the entire 2.8 kb cDNA sequence of the gene there are only 5 such pentamers. The first AUUUA motif is located immediately downstream of the stop codon while the second is located 54-58 nucleotides downstream of the stop codon. Sample 2 does not contain any AUUUA motifs downstream of the stop codon, although the downstream region is A:U rich.

Sample 2 (stop UAA): UAA UC [boxed region, illegible in this transcript] CUCUUAUCA GAUG
Samples 3, 4, 5, 6 (stop UGA): UGA AUUUAGUGG [AAGCUUUUGUAGAA] GAUG

Fig. 3. Comparison of sequences of hMSH2 nonsense-containing transcripts downstream of stop codon. Regions of homology are boxed (shown here in square brackets) while non-homologous bases are shown in bold type.
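The distance and base-composition statements above can be illustrated with a short Python sketch. It is not the authors' analysis: the sequence used is the stretch following the UGA stop codon in samples 3, 4, 5 and 6 as it can be read from Fig. 3 in this transcript (an assumption, since the figure is partly garbled), and the constant and function names are hypothetical.

```python
# Sequence downstream of the UGA stop codon in samples 3-6, as read from Fig. 3
# (assumed reading of a partly garbled figure); ends with the GAUG tetranucleotide.
DOWNSTREAM_3_TO_6 = "AUUUAGUGGAAGCUUUUGUAGAAGAUG"


def bases_before_motif(downstream, motif="GAUG"):
    """Number of bases between the stop codon and the first motif; -1 if absent."""
    return downstream.find(motif)


def au_count(region):
    """Number of A and U bases in an RNA string."""
    return region.count("A") + region.count("U")


if __name__ == "__main__":
    gap = bases_before_motif(DOWNSTREAM_3_TO_6)
    spacer = DOWNSTREAM_3_TO_6[:gap]
    print("bases between stop codon and GAUG:", gap)
    print("A+U bases in spacer:", au_count(spacer), "of", len(spacer))
    print("starts with AUUUA:", spacer.startswith("AUUUA"))
```

On this reading the spacer is 23 bases long and begins with an AUUUA pentamer, in line with the description in the text; the second AUUUA motif lies further downstream and is not part of the sequence shown.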

4. Discussion

Two possible explanations could underlie the presence of the nonsense-containing transcripts. Firstly, they may be the result of occasional splicing errors which the cell is nevertheless able to tolerate due to their relative infrequency or lack of functional consequences. If this were so, it might be expected that out-of-frame splicing errors resulting in nonsense mutations might occur elsewhere within the coding region. We therefore amplified a region of ≈1 kb from the 3'-end of the hMSH2 gene using primer sets 3 and 4, encompassing the final 4 exons and 3'-untranslated region of the gene. We could detect no discrete shorter splice products of the type described above and only full length product was observed (data not shown). This ≈1 kb region contains only two GAUG motifs, both situated towards the 5'-end of the segment, with the final 600 base pairs of the hMSH2 cDNA sequence lacking any GAUG tetranucleotide.

A second possible explanation is that the nonsense-containing transcripts were the result of deliberate splicing decisions and have definite functional consequences for the cell. The limited number of nonsense-containing transcripts, together with the relatively common occurrence of the ≈280 bp product in 4 out of 10 individuals, is consistent with the idea that they do not occur randomly. However, this does not necessarily mean that they are not 'errors'. It is possible, for instance, that intronic polymorphism may play a role in the frequency of unusual splicing products. Although the splicing of exon 1 to exon 7 could theoretically occur as a particularly error-prone splicing artefact, we regard this possibility as unlikely. The factors which determine splice site usage include both cis- and trans-acting factors. With respect to the former, both intronic [15-17] and exonic [18,19] sequences are involved in guiding the splicing machinery toward splice site definition, with distance between splice donor and acceptor sequences an important component of the equation. Furthermore, recent evidence suggests that an additional level of scrutiny is involved in splice site selection, namely, the necessity to maintain an open reading frame [4]. Thus considerable safeguards exist to prevent nonsense codon insertion due to error. It should be noted that the nonsense-containing transcripts do not contain introns and are thus not partial splicing products due to incomplete splicing, although we cannot rule out the possibility that introns are present in the region downstream from that analysed. Intron-containing transcripts have been reported to be unstable due to the presence of stop codons within the intron and are likely candidates for a nonsense-mediated mRNA decay pathway [20].

It has been suggested that mRNA surveillance and degradation of transcripts containing nonsense codons may exist in order to protect against severe clinical manifestations due to truncated, non-functional proteins [21]. We propose that it is part of a general post-transcriptional regulatory process. We suggest that the nonsense-containing splicing products are part of a post-transcriptional system for the control of gene expression and that the stop codons arose due to definite splicing decisions which placed the stop codons in the vicinity of sequences which bring about rapid destabilisation of mRNA following translation. Instability motifs have been identified within the coding region of several other genes, most notably the c-fos and c-myc oncogenes [22,23], and promote RNA turnover. The 'proof-reading' ability of the splicing machinery may function not only to splice out exons which contain inappropriate stop codons but also to insert stop codons when mRNA turnover requires it. The 280 bp splicing variant contains elements of two definite mRNA turnover motifs in the region downstream of the stop codon, in the context of an A:U rich domain. A:U rich regions, particularly those located in 3'-untranslated regions, have been demonstrated to be essential determinants of mRNA instability. It is possible that the GAUG and AUUUA motifs may be able to interact to bring about a rapid degradation of mRNA, as neither on its own constitutes a perfect match to reported consensus sequences. For the phenomenon to be a general one it would be necessary for nonsense-containing transcripts to be kept to a relatively low level, as some prematurely truncated proteins may have a dominant negative effect. Previous workers have reported that stop codons located near the 3'-end of mRNAs have no effect on mRNA stability. Our inability to amplify nonsense-containing transcripts from the 3' region of the hMSH2 gene is consistent with this finding.

The manner in which the expression of the hMSH2 gene is regulated is of particular interest. It has been suggested that mismatch repair activity may be attenuated in G2 phase of the cell cycle in order to allow pairing of homologous chromosomes before mitosis [24]. Confirmation that the downstream sequences described here facilitate mRNA turnover will require the production of constructs containing the putative instability elements and assaying their in vitro activity when placed downstream of stop codons in reporter genes. We are presently undertaking these experiments.

Acknowledgements

We wish to thank Luís Vieira for preparation of cDNA samples, Cristina Alves for photographic assistance and Margarida Amaral for helpful discussions.

References

[1] Sharp, P.A. (1987) Science 235, 766-771.
[2] Guthrie, C. (1991) Science 253, 157-163.
[3] Dietz, H.C., Valle, D., Francomano, C.A., Kendzior, R.J., Pyeritz, R.E. and Cutting, G.R. (1993) Science 259, 680-683.
[4] Dietz, H.C. and Kendzior, R.J. (1994) Nature Genet. 8, 183-188.
[5] Cheng, J. and Maquat, L.E. (1993) Mol. Cell Biol. 13, 1892-1902.
[6] Daar, I.O. and Maquat, L.E. (1988) Mol. Cell Biol. 8, 802-813.
[7] Caput, D., Beutler, B., Hartog, R., Thayer, R., Brown-Shimer, S. and Cerami, A. (1986) Proc. Natl. Acad. Sci. USA 83, 1670-1674.
[8] Shaw, G. and Kamen, R. (1986) Cell 46, 659-667.
[9] Zubiaga, A.M., Belasco, J.G. and Greenberg, M.E. (1995) Mol. Cell Biol. 15, 2219-2230.
[10] Lagnado, C.A., Brown, C.Y. and Goodall, G.J. (1995) Mol. Cell Biol. 14, 7984-7995.
[11] Zhang, S., Ruiz-Echevarria, M.J., Quan, Y. and Peltz, S.W. (1995) Mol. Cell Biol. 15, 2231-2244.
[12] Brown, B.D., Zipkin, I.D. and Harland, R.M. (1993) Genes Dev. 7, 1620-1631.
[13] Fishel, R., Lescoe, M.K., Rao, M.R.S., Copeland, N.G., Jenkins, N.A., Garber, J., Kane, M. and Kolodner, R. (1993) Cell 75, 1027-1038.
[14] Leach, F.S., Nicolaides, N.C., Papadopoulos, N., Liu, B., Jen, J., Parsons, R., Peltomaki, P., Sistonen, P., Aaltonen, L.A., Nystrom-Lahti, M., Guan, X.-Y., Zhang, J., Meltzer, P.S., Yu, J.W., Kao, F.-T., Chen, D.J., Cerosaletti, K.M., Fournier, R.E.K., Todd, S., Lewis, T., Leach, R.J., Naylor, S.L., Weissenbach, J., Mecklin, J.-P., Jarvinen, H., Petersen, G.M., Hamilton, S.R., Green, J., Jass, J., Watson, P., Lynch, H.T., Trent, J.M., de la Chapelle, A., Kinzler, K.W. and Vogelstein, B. (1993) Cell 75, 1215-1225.
[15] Fu, X.Y. and Manley, M. (1987) Mol. Cell Biol. 7, 738-748.
[16] Green, M.R. (1986) Annu. Rev. Genet. 20, 671-708.
[17] Maniatis, T. and Reed, R. (1987) Nature 325, 673-678.
[18] Domenjoud, D., Gallinoro, H., Kister, L., Meyer, S. and Jacob, M. (1991) Mol. Cell Biol. 11, 4581-4590.
[19] Robberson, B.L., Cote, G.J. and Berget, S.M. (1990) Mol. Cell Biol. 10, 84-94.
[20] He, F., Peltz, S.W., Donahue, J.L., Rosbash, M. and Jacobson, A. (1993) Proc. Natl. Acad. Sci. USA 90, 7034-7038.
[21] Kugler, W., Enssle, J., Hentze, M.W. and Kulozik, A.E. (1995) Nucleic Acids Res. 23, 413-418.
[22] Kabnick, K.S. and Housman, D.E. (1988) Mol. Cell Biol. 8, 3244-3250.
[23] Bernstein, P.L., Herrick, D.J., Prokipcak, R.D. and Ross, J. (1992) Genes Dev. 6, 642-654.
[24] Rayssiguier, C., Thaler, D.S. and Radman, M. (1989) Nature 342, 396-401.