Nonsense Mutations



Complementation groups

Many of these conditions can be divided into a series of complementation groups by using cell fusion tests (Figure 13.10). If two cells, A and B, lack function in different repair genes, then when the cells are fused the resulting hybrid cell will contain functioning copies of both genes. Cell A, with gene A defective, will provide a functioning copy of gene B, and cell B, in which gene B is defective, will provide a functional copy of gene A. Thus, the hybrid should recover the wildtype resistance to DNA damage.

Using this technique, cells from patients with xeroderma pigmentosum, caused by defects in nucleotide excision repair (see Box 13.3), have been divided into seven different groups. Cells from any one group will complement (make good) the defect in a cell from any other group. Fanconi anemia, caused by a defective cellular response to DNA damage, has been divided into at least 12 groups. In general, a different gene is mutated in each separate complementation group, although the clinical phenotypes overlap. Details of these genetic diseases and their complementation groups can be found in the OMIM database (http://www.ncbi.nlm.nih.gov/omim).

Molecular studies of these various groups have defined a large number of genes involved in human DNA repair. Sorting out the individual pathways has been greatly aided by the very strong conservation of repair mechanisms across the whole spectrum of life. Not only the reaction mechanisms but also the protein structures and gene sequences are often conserved from E. coli to humans. Generally, eukaryotes have multiple systems corresponding to each single system in E. coli. For example, nucleotide excision repair requires six proteins in E. coli but at least 30 in mammals. A downside of the conservation is a confusing gene nomenclature, referring sometimes to human diseases (e.g. xeroderma pigmentosum type D, or XPD), sometimes to yeast mutants (RAD genes), and sometimes to mammalian cell complementation systems (ERCC, excision repair cross-complementing). So, for example, XPD, ERCC2, and RAD3 are the same gene in human, mouse, and yeast.

Not all the diseases that involve hypersensitivity to DNA-damaging agents are caused by defects in the DNA repair systems themselves. Sometimes it is the broader cellular response to damage that is defective. Normal cells react to DNA damage by stalling progress through the cell cycle at a checkpoint until the damage has been repaired, or by triggering apoptosis if the damage is irreparable. Patients with ataxia-telangiectasia and Fanconi anemia have intact repair systems but are deficient in damage-sensing or response mechanisms. Defects in cell cycle control and in the apoptotic response are central to the development of cancer, and they will be discussed further in Chapter 17.

13.3 PATHOGENIC DNA VARIANTS

As mentioned above, the genomes of healthy individuals have huge numbers of sequence variants. The great majority of these are completely harmless and have no known effect on the phenotype. Even most of those that do affect the phenotype are part of the normal variation that makes us all individual. Special interest, however, naturally attaches to those variants that are pathogenic, that is, they either make us ill or make us susceptible to an illness.

Deciding whether a DNA sequence change is pathogenic can be difficult

Not every sequence variant seen in an affected person will be pathogenic. Just as perfectly healthy people carry innumerable sequence variants, the same will be true of a person with a genetic disease. How can we decide whether a sequence change we have discovered in such a person is the cause of their disease or a harmless variant? Only a functional test can give a definitive answer, but functional tests are often difficult to integrate into the work of a diagnostic laboratory.

In any case, for many gene products no laboratory test is available that checks all aspects of the gene's function in vivo. Some variants may be pathogenic only at times of environmental stress, and others may have subtle effects that manifest as susceptibility to a disease, perhaps only when in combination with certain other genetic variants.

In the absence of a definitive functional test, the nature of the sequence change often provides a clue. First we can ask whether the variant affects a sequence that is known to be functional. Such sequences would include the coding sequences of genes, sequences flanking exon-intron junctions (splice sites), the promoter sequence immediately upstream of a gene, and any other known regulatory sequences. The great majority of all known pathogenic variants affect sequences that were already known to be functional, and these comprise only a small percentage of our total DNA. However, it is always possible that a variant located outside any known functional sequence might lie in a currently unidentified functional element. As we saw in Chapter 11, the ENCODE project is revealing many previously unsuspected functional elements in the human genome. Such elements are suspected to be locations for variants that merely alter susceptibility to a disease, rather than directly causing any disease.

If a variant does affect a known functional sequence, we must try to predict its effect. A table of the genetic code (see Figure 1.25) can be used to identify the effect of a coding sequence variant on the protein product of a gene. As described below, nonsense mutations, frameshifts, and many deletions can be confidently predicted to wreck the protein. Similarly, changes to the invariant GT...AG sequences at splice sites are highly likely to be pathogenic. Changes that merely replace one amino acid with a different one (missense changes) are more difficult to interpret.

Another approach is to look for precedents.
Maybe a variant is already documented in dbSNP, the database of single nucleotide polymorphisms (see above). Alternatively, it may be documented in one of the databases of pathogenic mutations listed in Further Reading. A different sort of precedent can be sought by checking the normal sequence of related genes. These may be in humans (paralogs) or other species (orthologs). If the variant is present as the normal, wildtype sequence of a related gene it is unlikely to be pathogenic. Further aspects of this problem are considered in Chapter 18, where we discuss genetic testing. In the rest of this section we consider some of the many ways in which a change in a functional sequence can be pathogenic.

Single nucleotide and other small-scale changes are a common type of pathogenic change

Pathogenic changes are often caused by small-scale sequence changes in either the coding sequence or the regulatory region of a gene.

Missense mutations

A single nucleotide substitution within the coding sequence of a gene may or may not alter the sequence of the encoded protein. The genetic code is degenerate: of the 64 codons, 61 encode only 20 different amino acids and the remaining three are stop codons. Thus, some codon changes do not alter the amino acid; they are silent or synonymous. When the codon change does result in a changed amino acid (a nonsynonymous change), the effect depends partly on the chemical differences between the old and new amino acids. As explained in Chapter 1, the 20 amino acids can be classified into acidic, basic, uncharged polar, and uncharged nonpolar types. Replacing an amino acid by one in the same class (a conservative substitution) has less effect on the protein structure than a nonconservative substitution. Adding or removing cysteine alters the potential for forming disulfide bridges, and so can cause major structural changes.
Similarity matrices have been constructed that give a quantitative score for the likely disruptive effect of any substitution (see Further Reading).

Some amino acids are crucial to the functioning of a particular protein, for example those at the active site of an enzyme. Others may be important for maintaining the protein structure. Globular proteins tend to have uncharged nonpolar amino acids in the interior and charged ones on the outside; any substitution that changes this may disrupt the three-dimensional folding. The sickle cell mutation is pathogenic because it replaces a polar amino acid with a nonpolar one on the outside of the globin molecule (Figure 13.11). This makes the molecules tend to stick together. Protein aggregation, the result of abnormal proteins having sticky external areas, has emerged as a common pathogenic mechanism in a variety of diseases, especially progressive neurodegenerative conditions, and is discussed further on p. 425.

It is seldom possible to predict these effects with much confidence. It helps if the three-dimensional structure of the protein has been solved, so that one can model the likely structural effect of a substitution. If amino acid sequences of related proteins (from humans or other organisms) are known, we can see which amino acids are invariant and which seem free to vary widely between species. Most amino acid substitutions probably have no effect on the functioning of a protein.
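
As a rough illustration of the distinctions drawn in this section, the Python sketch below classifies a single-codon change as synonymous, nonsense, or missense using the standard genetic code, and notes whether a missense change stays within one of the four amino acid classes. The function name and the assignment of individual amino acids to classes are assumptions made for this example (sources differ on borderline residues such as histidine, tyrosine, and glycine). It says nothing about pathogenicity, which depends on the position of the residue in the protein.

```python
from itertools import product

# Standard genetic code, with codons enumerated in the order T, C, A, G
# at each position (DNA alphabet); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Illustrative grouping into the four classes named in the text.
# This grouping is an assumption; some residues are classified
# differently in different sources.
AA_CLASS = {}
for members, label in [
    ("DE", "acidic"),
    ("KRH", "basic"),
    ("STNQYC", "uncharged polar"),
    ("GAVLIMFWP", "uncharged nonpolar"),
]:
    for aa in members:
        AA_CLASS[aa] = label


def classify_substitution(ref_codon, alt_codon):
    """Classify a codon change by its predicted effect on the protein."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    if ref_aa == "*":
        # A stop codon replaced by a sense codon (read-through).
        return "stop-loss"
    same_class = AA_CLASS[ref_aa] == AA_CLASS[alt_aa]
    kind = "conservative" if same_class else "nonconservative"
    return f"missense ({kind}): {ref_aa}->{alt_aa}"


print(classify_substitution("GAG", "GTG"))  # sickle cell change, Glu to Val
print(classify_substitution("GAT", "GAA"))  # Asp to Glu, same class
print(classify_substitution("TTC", "TTT"))  # both codons encode Phe
print(classify_substitution("CAG", "TAG"))  # Gln codon becomes a stop
```

A class-based label of this kind is much cruder than the similarity matrices mentioned above, which assign a graded score to every possible pair of amino acids.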

Nonsense mutations

Three of the 64 codons in the genetic code are stop codons, and so it is quite common for a nucleotide substitution to convert the codon for an internal amino acid of a protein into a stop codon. When ribosomes encounter a stop codon they dissociate from the mRNA, and the nascent polypeptide is released (Figure 13.12A). However, genes containing premature termination codons seldom cause production of the truncated protein that might be predicted. Cells have a mechanism, nonsense-mediated decay (NMD), that detects mRNAs containing premature termination codons and degrades them. Thus, the usual result of a nonsense mutation is to prevent any expression of the gene.

NMD works because the spliced mRNA that travels from the nucleus to the ribosomes retains a memory of the positions of the introns. The splicing mechanism leaves proteins of the exon junction complex (EJC) attached to splice sites. During the first (pioneer) round of translation, as the ribosome passes each splice site it clears the EJC proteins attached to that site. If there is a premature termination codon, the ribosome will not have traversed every splice site before it detaches. Some EJC proteins will remain attached to the mRNA, and this marks the mRNA for destruction (Figure 13.12B).

Nonsense-mediated decay is not always fully effective. It does not apply to premature stop codons that are in the last exon of a gene, or less than about 50 nucleotides upstream of the last splice junction. In some cases, some quantity of truncated protein is produced even when the stop codon is not in this protected zone. Truncated proteins are potentially more pathogenic than a simple absence of the protein (Figure 13.12C) because they have the potential to interfere with the function of the normal product. Such dominant-negative effects will be discussed later in this chapter (see p. 431). It is assumed that NMD has arisen to protect against this problem.
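
The positional rule just described can be written as a small check. The Python sketch below is a minimal rule-of-thumb predictor, assuming the exon lengths of the spliced mRNA and the mRNA coordinate of the premature stop codon are already known. The function name, the coordinate convention, and the default boundary of 50 nucleotides are choices made for this example. Since NMD is not always fully effective, the result is a prediction and not a measurement.

```python
def predicts_nmd(exon_lengths, ptc_position, boundary=50):
    """Rule-of-thumb check for nonsense-mediated decay (NMD).

    exon_lengths: lengths in nucleotides of the exons in the spliced
        mRNA, listed 5' to 3'.
    ptc_position: 1-based mRNA coordinate of the first base of the
        premature termination codon.
    boundary: approximate distance (nt) upstream of the last exon-exon
        junction inside which a stop codon escapes NMD.
    """
    if len(exon_lengths) < 2:
        # No exon-exon junction, so no exon junction complex can remain.
        return False
    # Coordinate of the last base of the penultimate exon.
    last_junction = sum(exon_lengths[:-1])
    if ptc_position > last_junction:
        # Stop codon lies in the last exon.
        return False
    return last_junction - ptc_position >= boundary


# Hypothetical three-exon transcript; the last junction is at position 350.
exons = [200, 150, 300]
print(predicts_nmd(exons, 120))  # far upstream of the last junction
print(predicts_nmd(exons, 320))  # within 50 nt of the last junction
print(predicts_nmd(exons, 400))  # in the last exon
```

With these toy numbers the first call predicts decay and the other two predict escape, matching the protected zone described above.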
Changes that affect splicing of the primary transcript

The positions of splice sites are marked by the (almost) invariant canonical GT...AG sequence, embedded within a less tightly defined consensus splice site recognition sequence (see Chapter 1). Mutations that change the canonical GT or AG will always prevent recognition of the site by the spliceosome and so will disrupt splicing at that site (Figure 13.13A), but a variety of other sequence changes may also affect it.

Splicing is not an all-or-nothing process. As mentioned in Chapter 11, splice sites can be strong or weak. Because of variable use of weak splice sites, most human genes produce a variety of alternatively spliced transcripts. Splicing enhancer or suppressor sequences modulate the strength of an adjacent splice site by binding proteins of the SR (serine and arginine-rich) and hnRNP (heterogeneous nuclear ribonucleoprotein) families, possibly in a stage-specific or tissue-specific way. If one of these modulating sequences is mutated in a gene that naturally produces multiple splice isoforms, the effect may just be to alter the balance of isoforms. Depending on the functions of the various isoforms, this may abolish all gene function or make more subtle changes, such as affecting the pattern of tissue-specific isoforms.

Inactivating a splice site will usually destroy the function of a gene, or at least of all isoforms that use that site, but the precise molecular events are hard to predict. Sometimes an exon is skipped; sometimes intronic sequence is retained in the mature mRNA; often an adjacent cryptic splice site is used. Cryptic splice sites are sequences within a primary transcript (in exons or introns) that resemble true splice sites, but not sufficiently closely that they are normally recognized as such by the cell (Figure 13.13B). A nucleotide substitution within a cryptic site may increase the resemblance sufficiently to convert it into a functional site.
This will disrupt the correct processing of the transcript. Alternatively, a sequence change may reduce the strength of a true splice site so that a nearby cryptic site is used preferentially.

It is often hard to predict from the DNA sequence whether or not a change will affect splicing. Apparent missense or silent mutations may actually be pathogenic through an effect on splicing. The difference between the two human SMN genes illustrates such an effect (Figure 13.13C). At least two duplicated but slightly divergent copies of the SMN gene are found on chromosome 5q13; the copy closest to the telomere (SMN1) is highly expressed, whereas the copy or copies nearer the centromere (SMN2) produce almost no functional protein. The difference is due to a TTC→TTT change in exon 7. Although this is apparently silent (both TTC and TTT encode phenylalanine), the change inactivates a splicing enhancer and prevents correct splicing at the intron 6-exon 7 junction. Individuals homozygous for a loss of function of the telomeric SMN1 gene suffer from spinal muscular atrophy (Werdnig-Hoffmann disease; OMIM 253300), but the severity is reduced in those patients who have multiple copies of the weakly expressed centromeric SMN2 gene.

Computer programs are available that estimate the strength of a splice site, or check whether an apparent missense mutation might affect a splicing enhancer or suppressor, but there is no substitute for real experimental data obtained by RT-PCR. This is also the only method likely to identify a mutation that activates a cryptic splice site deep within an intron, as does the cystic fibrosis 3849+10 kb C→T mutation (see Figure 13.13B).

Frameshifts

The translational reading frame is set by the initiator AUG codon; any downstream change that adds or removes a non-integral number of codons (that is, a number of nucleotides that is not a multiple of three) will cause a frameshift (Figure 13.14).
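
The frame arithmetic in this definition is easy to check directly. The Python sketch below tests whether a length change is a multiple of three and then reads a toy coding sequence codon by codon until the first stop codon, before and after deleting a single nucleotide. The 21-nucleotide sequence is invented for illustration and does not come from any real gene; the function names are likewise hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_frameshift(length_change):
    """True if an insertion (+n) or deletion (-n) of n nucleotides
    changes the reading frame, i.e. n is not a multiple of three."""
    return length_change % 3 != 0


def codons_until_stop(coding_seq):
    """Read from the first base in steps of three and return the number
    of codons up to and including the first stop codon, or None if no
    in-frame stop codon is found."""
    for i in range(0, len(coding_seq) - 2, 3):
        if coding_seq[i:i + 3] in STOP_CODONS:
            return i // 3 + 1
    return None


# Invented example: ATG GCT GAT AGC CTG AAA TAA
normal = "ATGGCTGATAGCCTGAAATAA"
# Delete the single nucleotide at index 4 (the C of the second codon).
mutant = normal[:4] + normal[5:]

print(is_frameshift(len(mutant) - len(normal)))  # one base lost
print(is_frameshift(-3))                         # one whole codon lost
print(codons_until_stop(normal))                 # stop is the 7th codon
print(codons_until_stop(mutant))                 # new frame: ATG GTG ATA GCC TGA

# Share of length changes from +1 to +9 nucleotides that shift the frame.
print(sum(is_frameshift(n) for n in range(1, 10)) / 9)
```

In this toy case the deletion brings a TGA into frame as the fifth codon, two codons earlier than the normal stop.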
Two out of every three random changes to the length of a coding sequence would be expected to produce a frameshift. Because 3 out of 64 possible codons are stop codons, reading a message out of frame will usually quickly hit a premature termination codon. Nonsense-mediated decay will then most probably mean that no protein is produced. An example is found in the GJB2 gene that encodes connexin 26, a component of gap junctions between cells. A run of six consecutive G nucleotides in the gene is prone to replication slippage. Loss of one G from this run (the 35delG mutation) shifts the reading frame and introduces a premature stop codon (Figure 13.15). This mutation is the most frequent single cause of autosomal recessive congenital hearing loss (OMIM 220290) in most European populations.

Several different types of event can produce a frameshift. As well as small insertions or deletions within a single exon, abnormal splicing or whole exon deletions or duplications usually cause frameshifts. Most exons are not frame-neutral (that is, the number of nucleotides in the exon is not a multiple of three), so excluding one or more exons because of a deletion or splicing error will more often than not create a frameshift. Most introns are much larger than the exons they surround, so in most genes the breakpoints of random intragenic deletions or duplications will lie within introns. The result is to delete or duplicate one or more whole exons (Figure 13.16).

Changes that affect the level of gene expression

A variant in a control sequence might affect the level of transcription of a gene, so that although the gene product is entirely normal, too little (or too much) of it is produced. Most obviously this would happen if the variant changed the promoter sequence. In reality, few such variants have been described. This is partly because diagnostic laboratories do not routinely sequence promoters. If they did find a change, they would seldom know how to interpret its significance.
As described in Chapter 1, promoters can include consensus binding sites for a variety of transcription factors, and the effect of a sequence change can usually only be determined by experiment.

Patients with α- or β-thalassemia have been thoroughly investigated in the search for mutations affecting transcription. Their diseases are forms of anemia caused by a quantitative deficiency of α- or β-globin, respectively, and so were prime candidates for mutations of this type. However, α-thalassemia is most commonly caused by reduced numbers of active α-globin genes (see below). Some mutations in the β-globin gene promoter have been identified (see OMIM 141900, listed mutations 370-381, and also Figure 13.17A), but the great majority of β-thalassemia mutations work by producing an unstable mRNA or an unstable globin protein, rather than directly repressing transcription. Three common types of event have been noted:

• Many thalassemia mutations are splicing errors or nonsense mutations that produce a premature termination codon, resulting in nonsense-mediated decay of the mRNA.
• Changes in the 3' untranslated region may cause the mRNA to be unstable. A change in this region may create or abolish a critical binding site for a microRNA or a protein that regulates translation (see Chapter 11).
• Cells are very efficient at detecting and degrading abnormal proteins that fold incorrectly. For example, one form of α-thalassemia is caused by a change of the normal TAA stop codon of an α-globin gene to CAA. This encodes glutamine, and translation of the mRNA continues for a further 31 amino acids until another stop codon is encountered (Figure 13.17B). This variant hemoglobin (Hb Constant Spring) is unstable, and clinically the effect is α-thalassemia resulting from a quantitative deficiency of α-globin chains.
Pathogenic synonymous (silent) changes

As mentioned above, synonymous changes are nucleotide substitutions that convert one codon for a particular amino acid into another that encodes the same amino acid. Such changes would not be expected to have any phenotypic effect. Sometimes, however, the altered nucleotide is part of a splicing enhancer or suppressor and the change affects splicing. As mentioned above, in spinal muscular atrophy a silent TTC→TTT change marks the difference between an active, correctly spliced gene and an inactive, incorrectly spliced gene (see Figure 13.13C).

Even silent changes that do not affect splicing may not be entirely neutral. The MDR1 gene encodes the P-glycoprotein, which is important in the transport of many drugs; its overexpression is an important determinant of multi-drug resistance in chemotherapy. A silent change, c.3435C→T, in exon 26 of the gene results in the production of a protein that has an unchanged amino acid sequence (the change converts one codon for isoleucine 1145 into another) but is subtly differently folded and has a different activity profile. This is thought to be because the alternative codon requires an alternative tRNA that is in short supply, and this therefore changes the kinetics of translation. How common such effects might be is not known.

Variations at short tandem repeats are occasionally pathogenic

Whereas single nucleotide changes might affect any coding sequence, most short tandem repeats are located in noncoding DNA. Those that do occur in coding sequences are seldom polymorphic. However, tandem repeat variants located near promoters or splice sites of genes can sometimes affect gene expression. For example, different alleles of a 14 bp minisatellite near the promoter of the insulin gene on 11p15 are associated with differential risk of type 2 diabetes (see OMIM 176730).
Another example occurs within the cystic fibrosis transmembrane conductance regulator (CFTR) gene, where a run of T nucleotides near the 3' end of intron 8 affects the efficiency of the adjacent splice site. Alleles with five, seven, or nine T nucleotides are common. Whereas the splicing of 7-T or 9-T alleles is normal, 5-T alleles are often mis-spliced and exon 9 is skipped. 5-T alleles on their own do not reduce the output of correctly spliced mRNA so greatly as to be pathogenic, but in conjunction with other low-functioning variants they can be a cause of cystic fibrosis. Their effect is enhanced if a (TG)n repeat nearby in the intron has more than 11 repeats (Figure 13.18).
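
To make the allele description concrete, the Python sketch below measures a (TG)m(T)n tract in a short sequence and reports how the text above would describe it. It assumes, for simplicity, that the TG repeat is immediately followed by the T run, and it uses an invented flanking sequence rather than the real intron 8 sequence. The function names and the wording of the returned labels are hypothetical.

```python
import re


def tg_t_tract(sequence):
    """Return (TG repeat count, T count) for the first (TG)m(T)n tract
    found in the sequence, or None if there is no such tract."""
    match = re.search(r"((?:TG)+)(T+)", sequence.upper())
    if match is None:
        return None
    return len(match.group(1)) // 2, len(match.group(2))


def describe_poly_t(tg_repeats, t_count):
    """Summarize the splicing behavior described in the text."""
    if t_count in (7, 9):
        return "normal splicing expected"
    if t_count == 5:
        if tg_repeats > 11:
            return "5-T with more than 11 TG repeats: exon 9 skipping enhanced"
        return "5-T: exon 9 often skipped"
    return "not one of the common alleles described"


# Invented sequence: 12 TG repeats followed by 5 T nucleotides.
toy_allele = "AA" + "TG" * 12 + "T" * 5 + "AACAG"
tg, t = tg_t_tract(toy_allele)
print(tg, t)
print(describe_poly_t(tg, t))
```

As the text notes, a 5-T allele is not pathogenic on its own, so a label of this kind would only be one input to interpreting a genotype.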

Tandem repeats within coding sequences are not normally polymorphic, but may be liable to pathogenic mutations because of polymerase stutter. Somatic mutations of this type are a major cause of disease in people with defects in the post-replicative mismatch repair system (see Chapter 17). Expanded polyalanine runs in certain proteins are responsible for several inherited diseases. Examples include the PHOX2B protein in people with congenital central hypoventilation syndrome (OMIM 209880) and the HOXD13 protein in patients with synpolydactyly 1 (OMIM 186000). These variants presumably originated through polymerase stutter, but within a family they are stably transmitted, just like any other STRP allele. In at least some cases, the expanded alanine run interferes with correct localization of the protein within the cell.

Dynamic mutations: a special class of pathogenic microsatellite variants

So-called dynamic mutations are STRPs that, above a certain size, become intensely unstable. The molecular causes are not well understood, but they may be a consequence of the way in which, when DNA is replicated, one strand (the lagging strand) is synthesized as a series of discontinuous fragments, the Okazaki fragments (see Chapter 1). A special endonuclease, FEN1, cuts off the overhangs in overlapping Okazaki fragments. One proposed mechanism for repeat expansion is that FEN1 fails to make the cuts, and overlapping fragments end up being joined end-to-end. Repeats up to a certain size are stable, and it may be significant that, in most cases, the threshold of instability occurs when the repeat sequence reaches the typical size of an Okazaki fragment.

Not all dynamic mutations are pathogenic, but several are (Table 13.1). Others are responsible for the nonpathogenic fragile sites seen by cytogeneticists when cells of some people are subjected to replicative stress (see Figure 13.5).
For example, the FRA16A fragile site on chromosome 16 is due to an expanded (CCG)n repeat, whereas FRA16B on the same chromosome is caused by an expanded 33 bp minisatellite.

The diseases in Table 13.1 are heterogeneous in many respects. There are different-sized repeat units, different degrees of expansion, different locations with respect to the affected gene, and different pathogenic mechanisms. Within these, the polyglutamine diseases form a well-defined group, of which Huntington disease (HD; OMIM 143100) is the prototype. In these conditions, modest expansions of a (CAG)n repeat in the coding sequence of a gene lead to an expanded polyglutamine run in the encoded protein (Figure 13.19C). This, in turn, predisposes the protein to form intracellular aggregates that are toxic to cells, especially neurons (Box 13.4). The result is a progressive late-onset neurodegenerative disease.

Other dynamic mutations affect gene sequences outside coding regions, and may involve much larger expansions. The expanded tandem repeat may be in the promoter, in the 5' or 3' untranslated region, or in an intron. Usually the effect is to prevent expression of the gene (Figure 13.19A), and the pathology comes from the resulting lack of gene function. When this occurs, the same disease can sometimes be caused by other loss-of-function mutations in the same gene.

In myotonic dystrophy 1 (DM1; OMIM 160900) the mechanism is quite different. There is a gain of function involving a toxic mRNA. A massively expanded (CTG)n run in the 3' untranslated region of the DMPK gene produces an mRNA that sequesters CUG-binding proteins in the nucleus. These are required for correct splicing of the primary transcripts of several unrelated genes, which therefore no longer function correctly. The result is a multisystem disease whose features bear no relation to the function of the DMPK gene product, a protein kinase, which is still produced in normal quantities.
No other mutations in the DMPK gene produce myotonic dystrophy, but a very similar clinical disease (DM2; OMIM 116955) can be caused by massive expansion of a (CCTG)n sequence in the completely unrelated ZNF9 gene.

A hallmark of diseases caused by dynamic repeats is anticipation: in successive generations the age of onset is lower and/or the severity worse because of successive expansions of the repeat. See Chapter 3 for a cautionary note about how biases in the ascertainment of family members can produce a spurious appearance of anticipation. In some cases, the threshold for instability is lower than the threshold for pathogenic effects. In these cases intermediate-sized nonpathogenic but unstable premutation alleles are seen that readily expand to full mutation alleles when transmitted to a child (e.g. FRAXA repeats of 50-200 units in fragile X syndrome; OMIM 300624). In other cases, alleles below the pathogenic threshold only very occasionally expand (e.g. HD alleles with 29-35 repeats). In some cases, there is a sex effect such that large expansions are mainly seen in alleles inherited from a parent of one sex (the father in HD, the mother in myotonic dystrophy). These reflect a differential survival of gametes carrying large expansions, not an inherent tendency to expand in one sex rather than the other.
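
The size ranges quoted above lend themselves to a simple worked example. The Python sketch below counts the longest uninterrupted run of a repeat unit in a sequence and places a FRAXA allele relative to the 50-200 unit premutation range given in the text. Several things here are assumptions for illustration: the function names, the use of CGG as the FRAXA repeat unit (the unit is not stated in this section), the treatment of anything above 200 units as the full mutation range, and the invented test sequence.

```python
import re


def longest_repeat_run(sequence, unit):
    """Return the largest number of uninterrupted tandem copies of
    `unit` found anywhere in `sequence`."""
    pattern = f"(?:{re.escape(unit.upper())})+"
    runs = re.findall(pattern, sequence.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


def classify_fraxa(repeat_units):
    """Place a FRAXA allele relative to the 50-200 unit premutation
    range quoted in the text. Treating more than 200 units as the full
    mutation range is an inference from that range."""
    if repeat_units > 200:
        return "full mutation range"
    if repeat_units >= 50:
        return "premutation range"
    return "below premutation range"


# Invented allele: 60 CGG units with arbitrary flanking bases.
toy_allele = "AT" + "CGG" * 60 + "TA"
units = longest_repeat_run(toy_allele, "CGG")
print(units, classify_fraxa(units))
```

Counting only uninterrupted runs is a simplification, and a repeat size measured in blood DNA may not match the size in other tissues, a point the text takes up next.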

Comparisons of repeat sizes in parent and child are hard to interpret mechanistically because they may reflect the mitotic or meiotic instability of the repeat in the parental germ line or, alternatively, the ability of gametes to transmit large repeats. Moreover, repeat sizes are usually studied in DNA extracted from peripheral blood lymphocytes, and if there is mitotic instability the lymphocytes may have very different repeat sizes from those in sperm or egg, or in the tissues involved in the disease pathology. Some repeats are unstable somatically, and so give a smeared band when blood DNA is analyzed by electrophoresis. Others are unstable between parent and child but stable in mitosis, and so blood DNA gives a sharp band, but of different sizes in parent and child.

Variants that affect dosage of one or more genes may be pathogenic

The pathogenic potential of abnormal gene dosage has long been appreciated because of the severe phenotypes produced by chromosomal trisomies. Nevertheless, only a minority of genes are sensitive to dosage. If a condition is recessive, that means that heterozygotes are phenotypically normal despite having only a single functional copy of the gene in question. For such genes, dosage evidently does not matter. The recent discovery of abundant and large-scale variations in copy number between apparently normal people reinforces the message that not all variants in copy number are harmful.

Chromosomal trisomies probably owe their characteristic phenotypes to just a few dosage-sensitive genes. For example, the characteristic features of Down syndrome are thought to be due largely to dosage effects of just two genes, DSCR1 and DYRK1A. It is to be expected that more genes would produce phenotypic effects at half dosage than at 1.5-fold increased dosage. Thus, large deletions or monosomy of a whole chromosome are less well tolerated than duplications or trisomy in human development.

One common mechanism generating changes in gene dosage is non-allelic homologous recombination (NAHR). Segmental duplications (often defined as sequences 1 kb or longer with 95% or greater sequence identity) may misalign when homologous chromosomes pair in meiosis. NAHR then produces deletions or duplications. The misaligned repeats have the same sequence but not the same chromosomal location, so recombination is homologous but the sequences are not alleles. Many (but by no means all) of the common nonpathogenic variants in copy number seen in normal healthy people are generated by this mechanism.

α-Thalassemia provides a good example of NAHR producing a pathogenic variation in gene dosage. Most people have four copies of the α-globin gene (αα/αα) as a result of an ancient tandem duplication. As shown in Figure 13.20, NAHR between low-copy repeat sequences flanking the α-globin genes can produce chromosomes carrying more or fewer α-globin genes. Reduced copy numbers of α-globin genes produce successively more severe effects. People with three copies (αα/α-) are healthy; those with two (whether the phase is α-/α- or αα/--) suffer mild α-thalassemia; those with only one gene (α-/--) have severe disease; and lack of all α genes (--/--) causes lethal hydrops fetalis (fluid accumulation in the fetus).

X-chromosome monosomy and trisomy are particularly interesting because X-inactivation (inactivation of all except one of the X chromosomes in a cell; see Chapter 3) ought to render them asymptomatic in somatic tissues. However, as noted in Chapter 11, a surprisingly large number of genes on the X chromosome escape inactivation. Some of these have a counterpart on the Y chromosome, but most do not. For those genes that escape X-inactivation but lack a Y-linked counterpart, normal females would have two functional copies and males only one.
Turner (45,X) females would have the same single active copy as normal males, but perhaps in the context of female development a single copy is not sufficient. The skeletal abnormalities of Turner syndrome are caused by haploinsufficiency for SHOX (50% of the normal gene product is not sufficient to produce a normal phenotype). This is a homeobox gene that is located in the Xp/Yp pseudoautosomal region, and so is present in two copies in both males and normal females.

Below the level of conventional cytogenetic resolution but above the single gene level, pathogenic variations in copy number are classified as microdeletions or microduplications (Table 13.2). Among these, three different molecular pathologies can be distinguished:

Single gene syndromes, in which all the phenotypic effects are due to the deletion (or sometimes duplication) of a single gene. For example, Alagille syndrome (OMIM 118450) is seen in patients with a microdeletion at 20p11. However, 93% of Alagille patients have no deletion but instead are heterozygous for point mutations in the JAG1 gene located at 20p12. The cause of the syndrome in all cases is a half dosage of the JAG1 gene product.

Contiguous gene syndromes are seen primarily in males with X-chromosome deletions (Figure 13.21A). The classic case was a boy, BB, who had Duchenne muscular dystrophy (DMD; OMIM 310200), chronic granulomatous disease (CGD; OMIM 306400), and retinitis pigmentosa (OMIM 312600), together with mental retardation. He had a chromosomal deletion in Xp21 that removed a contiguous set of genes and incidentally provided investigators with the means to clone the genes whose absence caused two of his diseases, DMD and CGD. Deletions of the tip of Xp are seen in another set of contiguous gene syndromes. Successively larger deletions remove more genes and add more diseases to the syndrome.
Microdeletions are relatively frequent in some parts of the X chromosome (such as Xp21 and proximal Xq) but are rare or unknown in others (such as Xp22.1-22.2 and Xq28). No doubt the deletion of certain individual genes, and visible deletions in gene-rich regions, would be lethal. Similar contiguous gene syndromes are much less common with autosomes because of the presence of the balancing normal chromosome (Figure 13.21B). Langer-Giedion syndrome (trichorhinophalangeal syndrome, type II; OMIM 150230) is a rare example.

Segmental aneuploidy syndromes are a special type of contiguous gene syndrome that regularly recur with a well-recognized phenotype. Examples include Williams-Beuren (OMIM 194050), Prader-Willi (OMIM 176270), Angelman (OMIM 105830), Smith-Magenis (OMIM 182290), and DiGeorge/velocardiofacial (OMIM 188400/192430) syndromes (see Table 13.2). These syndromes all have deletions produced by NAHR between low-copy repeats that flank the region in question. NAHR will also produce duplications of these regions, although these may not be pathogenic. The example of Prader-Willi and Angelman syndromes (see Figure 11.20, p. 367) happens to involve an imprinted region, which complicates the phenotype, but the same mechanism produces the other syndromes mentioned above. As with other contiguous gene syndromes, the phenotype usually depends on dosage effects of more than one gene and is not seen in people with a point mutation in just one of the genes. Williams-Beuren syndrome is typical. Patients are heterozygous for a 1.5 Mb deletion on chromosome 7q11.23 that removes about 20 genes. Patients with smaller deletions have been described, but no typical case has been found with just one gene deleted or mutated.

Some other recognizable recurrent syndromes are produced by independent random terminal deletions of chromosomes in which a dosage-sensitive gene lies close to the telomere. Examples are the Wolf-Hirschhorn (OMIM 194190) and cri-du-chat (OMIM 123450) syndromes.
In Miller-Dieker lissencephaly syndrome (OMIM 247200), random terminal deletions of 17p can remove one or more dosage-sensitive genes, producing a contiguous deletion syndrome.
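The clearest quantitative dosage series in this section is the α-globin example, where phenotype tracks the number of functional gene copies. The following minimal Python sketch restates that series as a lookup. It assumes an ASCII genotype notation in which "a" stands for a functional α-globin gene and "-" for a deleted one (so "aa/a-" corresponds to αα/α-). The function names are hypothetical, and chromosomes carrying extra copies are not handled because the text gives no phenotype for them.

```python
# Phenotype by number of functional alpha-globin genes, as stated in the text.
ALPHA_PHENOTYPE = {
    4: "normal",
    3: "healthy",
    2: "mild alpha-thalassemia",
    1: "severe disease",
    0: "lethal hydrops fetalis",
}


def count_alpha_genes(genotype: str) -> int:
    """Count functional genes in a genotype written like 'aa/a-'.

    ASSUMED notation: two haplotypes separated by '/', each with two
    positions, 'a' for a functional gene and '-' for a deleted one.
    """
    haplotypes = genotype.split("/")
    if len(haplotypes) != 2:
        raise ValueError("expected two haplotypes separated by '/'")
    for haplotype in haplotypes:
        if len(haplotype) != 2 or set(haplotype) - {"a", "-"}:
            raise ValueError(f"unrecognized haplotype: {haplotype!r}")
    return genotype.count("a")


def alpha_phenotype(genotype: str) -> str:
    """Look up the phenotype for a genotype by its functional gene count."""
    return ALPHA_PHENOTYPE[count_alpha_genes(genotype)]


if __name__ == "__main__":
    for genotype in ["aa/aa", "aa/a-", "a-/a-", "aa/--", "a-/--", "--/--"]:
        print(f"{genotype}: {alpha_phenotype(genotype)}")
```

Because the lookup depends only on the copy count, the two two-copy genotypes ("a-/a-" and "aa/--") map to the same phenotype, matching the statement that phase does not matter for mild α-thalassemia.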