
Copyright © 1991 by the Genetics Society of America

Circumsporozoite Protein Genes of Malaria Parasites (Plasmodium spp.): Evidence for Positive Selection on Immunogenic Regions

Austin L. Hughes¹

Center for Demographic and Population Genetics, The University of Texas Health Science Center, Houston, Texas 77225

Manuscript received May 30, 1990
Accepted for publication October 10, 1990

ABSTRACT

The circumsporozoite (CS) protein is a cell surface protein of the sporozoite, the stage of the life cycle of malaria parasites (Plasmodium spp.) that infects the vertebrate host. Analysis of DNA sequences supports the hypothesis that in Plasmodium falciparum, positive Darwinian selection favors diversity in the T-cell epitopes (peptides presented to T cells by host MHC molecules) of the CS protein. In gene regions encoding T cell epitopes of P. falciparum, the rate of nonsynonymous nucleotide substitution is significantly higher than that of synonymous substitution, whereas this is not true of other gene regions. Furthermore, nonsynonymous nucleotide substitutions in these regions cause a change of amino acid residue charge significantly more frequently than expected by chance. By contrast, in Plasmodium cynomolgi, the same regions show no evidence of positive selection, and residue charge is conserved. The CS protein has a central repeat region, which is the target of host antibodies. In P. falciparum, the amino acid sequence of the repeat region is conserved within and between alleles. In P. cynomolgi, on the other hand, there is evidence that positive selection has favored evolution of two different repeat types within a given allele.

Several recent papers have reported that positive Darwinian selection acts on genes encoding proteins involved in recognition of foreign antigens by the vertebrate immune system (HUGHES and NEI 1988, 1989; TANAKA and NEI 1989). So far, no study has used similar methods to test the hypothesis that genes of pathogenic organisms are subject to corresponding selection for the ability to evade the host's defense mechanisms. I test this hypothesis in the case of the circumsporozoite (CS) protein genes of malaria parasites (Protista: Sporozoa: Plasmodium spp.). The CS protein is expressed only on the surface of the sporozoite, the stage of the Plasmodium life history that is infective to the vertebrate host, and has long been known to be a target of host immune responses (CLYDE et al. 1973, 1975; NARDIN et al. 1982; GODSON et al. 1983; ZAVALA et al. 1983).

Recent DNA sequence data have revealed considerable polymorphism at the CS protein locus in Plasmodium species, and several authors have suggested that this polymorphism may be the result of positive selection. For example, in the simian parasite Plasmodium cynomolgi, there is considerable polymorphism in the repeat region of the CS protein, which is the target of host antibodies (GALINSKI et al. 1987). ENEA and ARNOT (1988) suggest that selection may have promoted this polymorphism. In the human parasite Plasmodium falciparum, variation among alleles in nonrepeat regions has been shown to involve only differences at first and second codon positions (MCCUTCHAN, GOOD and MILLER 1989). Since differences at first and second codon positions are generally nonsynonymous and since the nonrepeat regions include peptides presented by major histocompatibility complex (MHC) molecules to T cells, it has been argued that this polymorphism is caused by positive selection to evade T cell recognition (GOOD et al. 1988a; MCCUTCHAN, GOOD and MILLER 1989). However, since the number of nucleotide differences involved is small, this interpretation has been questioned (ARNOT 1989). Because evidence regarding positive selection on CS protein genes has so far been inconclusive, I conducted detailed statistical tests of this hypothesis by examining patterns of nucleotide substitution in immunologically important regions of CS protein genes.

¹ Current address: Department of Biology and Institute of Molecular Evolutionary Genetics, 208 Mueller Laboratory, The Pennsylvania State University, University Park, Pennsylvania 16802.

DNA SEQUENCES ANALYZED

In Plasmodium, fertilization occurs in the midgut of the insect host and is directly followed by meiosis. The haploid products of meiosis develop into sporozoites; these migrate to the mosquito's salivary glands, where they mature and eventually infect a subsequent vertebrate host. The CS protein covers the entire surface of the mature sporozoite (YOSHIDA et al. 1980; FINE et al. 1984), accounting in one species for 10-20% of the protein synthesized by the sporozoite (COCHRANE et al. 1982). The CS gene, which is present in a single copy per haploid genome, lacks introns and

    Genetics 127: 345-353 (February, 1991)


encodes a protein which varies in length between species but is usually around 400 amino acid residues long. The CS gene can be divided into three regions: (1) the 5' nonrepeat region (5'NR); (2) a central repeat region, consisting of one or two short motifs (4-12 codons in length) repeated in tandem numerous times; (3) the 3' nonrepeat (3'NR) region. Table 1 lists DNA sequences analyzed in this paper; where data are available, the number of amino acid residues encoded by each of these gene regions is given.

TABLE 1

Circumsporozoite protein gene sequences used in analyses

                                                       Amino acid residues
Species         Host               Allele (strain)   5'NR   Repeats        3'NR   References
P. falciparum   Human              LE5               N/A    160 (4)        N/A    DE LA CRUZ et al. (1987, 1988)
                                   NF54              104    176 (4)        125    CASPERS et al. (1989)
                                   T4                123    176 (4)        125    DEL PORTILLO, NUSSENZWEIG and ENEA (1987)
                                   We1               133    184 (4)        125    LOCKYER and SCHWARZ (1987)
                                   7G8               125    164 (4)        125    DE LA CRUZ, LAL and MCCUTCHAN (1987)
P. yoelii       Murine rodents                       138    118 (6, 4)     111    LAL et al. (1987)
P. berghei      Murine rodents     NK65              92     136 (8, 4)     100    LANAR (year illegible)
P. cynomolgi    Monkeys (Macaca)   B                 96     172 (9, 16)    110    GALINSKI et al. (1987)
                                   C                 96     195 (9, 17)    107    GALINSKI et al. (1987)
                                   G                 97     182 (11)       123    GALINSKI et al. (1987)
                                   L                 97     180 (6, 11)    101    GALINSKI et al. (1987)
                                   M/N               98     208 (4)        109    GALINSKI et al. (1987)
P. knowlesi     Monkeys (Macaca)   H                 99     144 (12)       120    SHARMA et al. (1985)
                                   N                 97     144 (9)        110    SHARMA et al. (1985)
P. vivax        Human              B                 95     180 (9)        103    ARNOT, BARNWELL and STEWART (1988)

Numbers in parentheses are lengths of amino acid repeat units. N/A = complete sequence not available.

Evidence from a number of different host and parasite species indicates that the repeat region is the target of antibodies against the CS protein (GODSON et al. 1983; ZAVALA et al. 1983; BALLOU et al. 1985). In the case of P. falciparum, epitopes presented by class II MHC molecules and recognized by helper T cells have been identified. These consist of two peptides (T cell epitopes 1 and 2, henceforth TCE) encoded in the 3'NR region of the gene (GOOD, BERZOFSKY and MILLER 1988; GOOD et al. 1987, 1988; LOCKYER, MARSH and NEWBOLD 1989). So far, regions presented by class I MHC molecules and recognized by cytotoxic T cells have not been recognized experimentally, although it has been argued that they are likely to have similar properties to the helper T cell epitopes and thus to overlap them (GOOD et al. 1987).

The number of repeats in the repeat region may vary between alleles of one species, and in some species (such as P. cynomolgi) the length of each repeated unit varies widely between alleles (Table 1). For these reasons, the repeat region cannot be reliably aligned even between alleles of the same species. The nonrepeat regions were aligned at the protein level by the method of GOTOH (1986) (Figures 1 and 2). In both 5'NR and 3'NR, not all sequences are complete. Also, in some parts of the 5'NR, homology between different species is low. So that all analyses would be based on a comparable data set, 65 aligned codons from the 5'NR and 78 aligned codons from the 3'NR (including 36 codons in the TCE) were used in analyses (Figures 1 and 2).

FIGURE 1. Alignment of the N-terminal region of the CS protein from six Plasmodium species (encoded by the 5'NR region of the gene). Regions analyzed in this paper are underlined in the top sequence. [Sequence alignment not recoverable in this copy.]

RESULTS

Nucleotide substitution in nonrepeat regions: I computed the number of synonymous differences per synonymous site (pS) and the number of nonsynonymous differences per nonsynonymous site (pN) in pairwise comparisons among available sequences. pS and pN were computed separately for aligned codons in


the 5'NR, the 3'NR excluding the TCE, and the TCE (Table 2). NEI and GOJOBORI's (1986) method was used to estimate pS and pN. Standard errors of mean pS and pN were computed by the method of NEI and JIN (1989).

TABLE 2

Percent synonymous (pS) and nonsynonymous (pN) differences in different regions of circumsporozoite protein genes

                                  5'NR (N = 65)                 3'NR (excluding TCE) (N = 42)   TCE (N = 36)
Comparison (No.)                  pS           pN               pS            pN                pS            pN
P. falciparum
  vs. P. falciparum (10)          0.0 ± 0.0    0.6 ± 0.5        0.0 ± 0.0     0.8 ± 0.6         0.0 ± 0.0     5.4 ± 1.6***b
  vs. P. yoelii (5)               73.4 ± 7.0   42.5 ± 4.0***    45.1 ± 10.0   27.8 ± 4.4a       72.3 ± 9.3    42.0 ± 5.3**
  vs. P. berghei (5)              82.4 ± 6.0   49.1 ± 4.0***    50.5 ± 10.0   29.5 ± 4.5b       69.5 ± 9.6    42.6 ± 5.3*
  vs. P. cynomolgi (25)           64.8 ± 7.7   56.2 ± 4.0       57.5 ± 10.0   36.4 ± 4.8b       69.6 ± 9.8    38.5 ± 5.2**b
  vs. P. knowlesi (10)            63.5 ± 7.6   59.6 ± 4.0       55.1 ± 10.0   36.9 ± 4.8c       66.4 ± 9.7    46.1 ± 5.4a
  vs. P. vivax (5)                60.5 ± 7.7   59.0 ± 4.0       53.5 ± 10.0   36.0 ± 4.8c       64.1 ± 10.0   46.1 ± 5.4
P. yoelii
  vs. P. berghei (1)              27.0 ± 7.0   24.0 ± 3.4       12.3 ± 6.7    6.9 ± 2.5c        15.5 ± 7.5    17.0 ± 4.1
  vs. P. cynomolgi (5)            75.9 ± 7.1   60.4 ± 4.0*      46.1 ± 9.9    25.8 ± 4.3c       77.5 ± 8.8    45.7 ± 5.4*a
  vs. P. knowlesi (2)             70.2 ± 7.3   58.9 ± 4.0       48.1 ± 10.0   27.9 ± 4.5c       94.7 ± 4.7    43.8 ± 5.5***a
  vs. P. vivax (1)                77.7 ± 6.6   67.5 ± 3.8       36.8 ± 9.6    25.3 ± 4.3c       81.6 ± 8.1    45.0 ± 5.4***c
P. berghei
  vs. P. cynomolgi (5)            74.3 ± 7.2   54.7 ± 4.0*      55.0 ± 10.0   30.9 ± 4.6*c      89.4 ± 6.6    41.0 ± 5.3***a
  vs. P. knowlesi (2)             69.6 ± 7.3   58.4 ± 4.0       54.6 ± 10.0   33.4 ± 4.7        98.5 ± 2.5    35.6 ± 5.2***c
  vs. P. vivax (1)                75.2 ± 6.8   57.7 ± 4.0*      46.5 ± 10.0   30.8 ± 4.6c       93.2 ± 5.2    38.0 ± 5.3***a
P. cynomolgi
  vs. P. cynomolgi (10)           10.0 ± 3.7   5.0 ± 1.3        5.2 ± 3.1     2.2 ± 0.9         1.6 ± 1.7     3.3 ± 1.3
  vs. P. knowlesi (10)            13.8 ± 3.6   14.6 ± 2.6       20.8 ± 7.8    14.4 ± 3.4        21.8 ± 8.5    22.3 ± 4.4
  vs. P. vivax (5)                27.9 ± 6.7   16.3 ± 2.8       17.9 ± 7.3    9.8 ± 2.8         13.0 ± 6.9    14.6 ± 3.7
P. knowlesi
  vs. P. knowlesi (1)             4.8 ± 3.4    2.0 ± 1.0        0.0 ± 0.0     1.0 ± 1.0         0.0 ± 0.0     1.2 ± 1.2
  vs. P. vivax (2)                31.7 ± 7.2   22.4 ± 3.3       25.2 ± 8.7    13.7 ± 3.4        20.9 ± 8.5    20.2 ± 4.3
All comparisons (105)             47.5 ± 5.1   38.8 ± 2.5       37.5 ± 6.3    23.5 ± 2.7*c      49.3 ± 6.7    30.5 ± 3.2**a

Values are percent synonymous difference at synonymous sites (pS) and percent nonsynonymous difference at nonsynonymous sites (pN) ± SE. N = number of codons compared. pS significantly different from pN at (*) 5% level; (**) 1% level; (***) 0.1% level. pN significantly different from pN in 5'NR at (a) 5% level; (b) 1% level; (c) 0.1% level.
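The pS and pN estimates attributed above to the method of NEI and GOJOBORI (1986) can be illustrated with a minimal Python sketch. It is not the author's program. It assumes the standard genetic code, gap-free aligned sequences without stop codons, and two simplifications that are labeled in the comments: a change to a stop codon is counted as nonsynonymous when sites are tallied, and mutational paths that pass through a stop codon are discarded. The function names are hypothetical.

```python
from itertools import permutations

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, with codons enumerated in TCAG order.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def syn_sites(codon):
    """Number of synonymous sites in one codon (between 0 and 3)."""
    syn_changes = 0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            # Assumption: a change to a stop codon counts as nonsynonymous.
            if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                syn_changes += 1
    return syn_changes / 3


def codon_diffs(c1, c2):
    """Mean (synonymous, nonsynonymous) differences over all mutational paths."""
    positions = [i for i in range(3) if c1[i] != c2[i]]
    if not positions:
        return 0.0, 0.0
    paths = []
    for order in permutations(positions):
        current, syn, nonsyn = c1, 0, 0
        for pos in order:
            nxt = current[:pos] + c2[pos] + current[pos + 1:]
            if CODON_TABLE[nxt] == "*":
                break  # assumption: discard paths through stop codons
            if CODON_TABLE[nxt] == CODON_TABLE[current]:
                syn += 1
            else:
                nonsyn += 1
            current = nxt
        else:
            paths.append((syn, nonsyn))
    if not paths:
        # Every path hit a stop codon: treat all changes as nonsynonymous.
        return 0.0, float(len(positions))
    return (sum(p[0] for p in paths) / len(paths),
            sum(p[1] for p in paths) / len(paths))


def p_distances(seq1, seq2):
    """Return (pS, pN) as proportions for two aligned coding sequences."""
    codons = [(seq1[i:i + 3], seq2[i:i + 3]) for i in range(0, len(seq1), 3)]
    s_sites = sum((syn_sites(a) + syn_sites(b)) / 2 for a, b in codons)
    n_sites = 3 * len(codons) - s_sites
    s_diff = sum(codon_diffs(a, b)[0] for a, b in codons)
    n_diff = sum(codon_diffs(a, b)[1] for a, b in codons)
    return s_diff / s_sites, n_diff / n_sites
```

Multiplying the returned proportions by 100 gives values on the percent scale used in Table 2. The sketch raises a division error if the compared codons contain no synonymous sites at all, which a production implementation would need to handle.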

In the comparison among the five P. falciparum alleles, there are no synonymous differences in the 5'NR or 3'NR. Only in the TCE, however, is pN significantly higher than pS (Table 2), and in P. falciparum, pN in the TCE is significantly higher than pN in the other regions. Comparison among alleles in P. cynomolgi and between the two P. knowlesi alleles does not reveal a similarly elevated pN in the TCE. In between-species comparisons, the TCE seems to be a relatively conserved region, since pS significantly exceeds pN in this region in a majority of such comparisons. The 5'NR is the least conserved region in between-species comparisons; in most cases, pN in this region is significantly higher than that in the other two regions.

ARNOT (1989) suggested that the lack of synonymous differences among P. falciparum alleles may be due to some factor that prevents synonymous substitution at this locus. It is known that an extreme bias in G+C content (leading to either very high or very low levels of G+C at third-codon positions) is associated with a reduction in the rate of synonymous substitution (WOLFE, SHARP and LI 1989). Plasmodium species are known to have low G+C content in genomic DNA (WEBER 1988), and in regions of CS genes analyzed in Table 2 a similar bias is apparent (Table 3). Third-position G+C content is particularly low in the TCE of P. falciparum and the two rodent parasites (Table 3). This G+C content bias may well lower the rate of synonymous substitution in these species, but it does not seem to have totally eliminated synonymous substitution in this region, since there are synonymous differences among these species in the TCE (Table 2).
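The G+C statistics discussed above (percent G+C at all positions and at third-codon positions) are simple to compute for an in-frame coding sequence. The following sketch is illustrative only; the function name is hypothetical, and the input is assumed to start at the first position of a codon.

```python
def gc_content(seq):
    """Return (%G+C at all positions, %G+C at third-codon positions)."""
    seq = seq.upper()
    third = seq[2::3]  # every third base, starting at codon position 3
    all_gc = 100 * sum(base in "GC" for base in seq) / len(seq)
    third_gc = 100 * sum(base in "GC" for base in third) / len(third)
    return all_gc, third_gc
```

For a species with several alleles, the per-allele values would be averaged to obtain a mean comparable to those reported in Table 3.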

LOCKYER, MARSH and NEWBOLD (1989) sequenced the TCE only from a number of CS alleles from P. falciparum. Combining their sequence data with that previously published, it was possible to compare the TCE (36 codons) for 16 alleles. In this comparison pS = 0.5 ± 0.8, and pN = 5.7 ± 1.3. pS and pN are significantly different at the 1% level, and the value of pN in the TCE for this expanded data set is very close to that obtained for the five complete alleles analyzed in Table 2. These data also provide evidence that synonymous substitutions can occur in the TCE of P. falciparum.
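The text does not state at this point which test was used to compare pS and pN. As a rough illustration only, the sketch below assumes a two-tailed z-test on the difference between two estimates with independent standard errors; the function name is hypothetical.

```python
import math


def z_test(p1, se1, p2, se2):
    """Two-tailed z-test for the difference between two estimates.

    Assumption: the estimates are independent and approximately normal.
    """
    z = (p1 - p2) / math.sqrt(se1 ** 2 + se2 ** 2)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Values quoted above for the 16-allele TCE comparison (percent scale).
z, p_value = z_test(5.7, 1.3, 0.5, 0.8)
```

Under this assumed test, the difference quoted above for the 16 alleles clears the 1% threshold, which agrees with the significance level reported in the text.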

In P. cynomolgi and Plasmodium knowlesi, there is no evidence that the same regions which form the TCE of P. falciparum are under positive selection. These


regions may not be T cell epitopes for other species, but instead certain other sections of the 3'NR may serve as T cell epitopes. If such epitopes are under similar positive selection to the TCE of P. falciparum, one way to look for them is to look for regions showing pN > pS. In P. cynomolgi, a search for such a region produced a candidate T cell epitope corresponding to the portion of the 3'NR that is 5' to T cell epitope 1 of P. falciparum plus T cell epitope 1 of P. falciparum (30 codons; see Figure 2). In this region, in the comparison among the five available P. cynomolgi alleles, pS = 0.0 ± 0.0 and pN = 3.8 ± 1.6. The difference between pS and pN in this region is significant at the 5% level. By contrast, in the remainder of the 3'NR outside this region, in the comparison among all P. cynomolgi alleles, pS = 6.1 ± 3.0 and pN = 2.1 ± 0.9. In this case, the difference between pS and pN is not statistically significant.

TABLE 3

Mean G+C content (in %) at all positions and at third-codon positions (3d) in different regions of circumsporozoite protein genes

                              5'NR (N = 65)    3'NR (excluding TCE) (N = 42)   TCE (N = 36)
Species (No. alleles)         3d      All      3d      All                     3d      All
P. falciparum (5)             30.8    33.9     19.0    31.1                    16.7    30.0
P. yoelii (1)                 24.6    32.8     11.9    33.3                    25.0    33.3
P. berghei (1)                29.2    31.3     16.7    27.0                    25.0    33.3
P. cynomolgi (5)              34.2    42.2     31.0    44.3                    32.8    38.5
P. knowlesi (2)               40.0    47.2     28.6    46.0                    27.8    38.0
P. vivax (1)                  40.0    47.2     28.6    46.0                    27.8    38.0

N = number of aligned codons analyzed.

Conservative and radical amino acid replacements: Under positive selection favoring diversity at the protein level, diversity with respect to a particular amino acid property may be favored; and thus amino acid replacements that are radical (nonconservative) with respect to this property will occur with a disproportionate frequency. MONOS et al. (1984) noted that class I MHC alleles show an exceptionally large number of charge differences and speculated that these differences may be important in determining differences among alleles with respect to their peptide-binding capacities. HUGHES, OTA and NEI (1990) tested this idea statistically by a method which classifies nonsynonymous nucleotide sites (NEI and GOJOBORI 1986) as conservative or radical (with respect to an amino acid property of interest) and, in comparing two DNA sequences, estimates the number of conservative nonsynonymous nucleotide differences per conservative nonsynonymous site (pNC) and the number of radical nonsynonymous nucleotide differences per radical nonsynonymous site (pNR). If pNC > pNR, the amino acid property of interest is conserved. If pNC = pNR, nonsynonymous substitutions occur at random with respect to the property, and thus there is no particular constraint with respect to that property. If pNR > pNC, then selection favors diversification between the sequences compared with respect to the property. In the case of class I MHC genes of humans and mice, pNR > pNC with respect to amino acid residue charge in the binding cleft, indicating that in this region nonsynonymous nucleotide substitutions causing a charge change occur more frequently than expected by chance (HUGHES, OTA and NEI 1990).

FIGURE 2. Alignment of the C-terminal region of the CS protein from six Plasmodium species (encoded by the 3'NR region of the gene). + indicates amino acid residues in the P. falciparum T cell epitopes. The region analyzed in this paper is underlined in the top sequence. [Sequence alignment not recoverable in this copy.]
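The conservative versus radical distinction can be illustrated at the amino acid level with a short Python sketch. It is a simplification: the method of HUGHES, OTA and NEI (1990) classifies nonsynonymous nucleotide sites, whereas this sketch only classifies observed amino acid replacements between two aligned protein sequences. The charge classes used here (positive: Lys, Arg, His; negative: Asp, Glu; neutral: all others) are an assumption made for illustration, and the function names are hypothetical.

```python
POSITIVE = set("KRH")  # assumption: histidine is treated as positively charged
NEGATIVE = set("DE")


def charge_class(residue):
    """Return +1, -1 or 0 for a one-letter amino acid code."""
    if residue in POSITIVE:
        return 1
    if residue in NEGATIVE:
        return -1
    return 0


def classify_replacements(prot1, prot2):
    """Count (conservative, radical) replacements with respect to charge."""
    conservative = radical = 0
    for a, b in zip(prot1, prot2):
        if a == b:
            continue  # identical residues are not replacements
        if charge_class(a) == charge_class(b):
            conservative += 1
        else:
            radical += 1
    return conservative, radical
```

For example, a Lys to Arg replacement keeps the charge class and is counted as conservative, while a Glu to Lys replacement changes it and is counted as radical.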

If positive selection on the TCE of P. falciparum favors the ability of this region to evade binding by MHC molecules, it might be predicted that in the TCE also nonsynonymous substitutions causing charge change occur more frequently than expected by chance. I tested this prediction by applying the method of HUGHES, OTA and NEI (1990) to nonrepeat regions of Plasmodium CS genes (Table 4). In the TCE of P. falciparum, pNR with respect to charge exceeds pNC by a ratio of over 9:1, whereas in other gene regions in this species pNC and pNR are not significantly different. In P. cynomolgi, by contrast, pNC is significantly greater than pNR in the TCE. Also in several between-species comparisons, pNC is significantly greater than pNR in the TCE. Outside the TCE, by contrast, pNC and pNR tend to be about the same in most comparisons (Table 4). These results support the hypothesis that in P. falciparum selection favors diversification of charge profile in the TCE.

A similar analysis was applied to the candidate T cell epitope for P. cynomolgi mentioned above. In the


putative epitope, pNC = 6.0 ± 2.8 and pNR = 1.3 ± 1.2; in this case the difference between pNC and pNR is not statistically significant. In the remainder of the 3'NR, pNC = 3.7 ± 1.6 and pNR = 0.0 ± 0.0; here the difference between pNC and pNR is significant at the 5% level. Thus, there is no evidence of positive selection favoring charge profile diversity in the putative T cell epitope of P. cynomolgi. However, it appears that the entire C-terminal region of the CS protein is under fairly strong constraint with respect to charge in P. cynomolgi (Table 4).

TABLE 4

Percent conservative (pNC) and radical (pNR) nonsynonymous nucleotide difference in different regions of circumsporozoite protein genes

                                  5'NR (N = 65)               3'NR (excluding TCE) (N = 42)   TCE (N = 36)
Comparison (No.)                  pNC          pNR            pNC          pNR                pNC          pNR
P. falciparum
  vs. P. falciparum (10)          0.8 ± 0.8    0.5 ± 0.5      1.3 ± 1.0    0.0 ± 0.0          0.9 ± 0.9    9.7 ± 3.1**
  vs. P. yoelii (5)               43.3 ± 5.6   49.6 ± 5.6     26.1 ± 5.6   30.3 ± 7.1         45.0 ± 9.3   39.0 ± 7.3
  vs. P. berghei (5)              49.5 ± 5.8   48.7 ± 5.5     26.7 ± 5.7   33.7 ± 7.3         46.1 ± 7.6   39.1 ± 7.3
  vs. P. cynomolgi (25)           55.0 ± 5.7   58.4 ± 5.6     31.8 ± 5.9   43.1 ± 7.6         48.8 ± 7.7   28.9 ± 6.6*
  vs. P. knowlesi (10)            59.7 ± 5.6   59.4 ± 5.5     30.1 ± 5.9   47.0 ± 7.9         58.4 ± 7.6   34.5 ± 7.0*
  vs. P. vivax (5)                55.6 ± 5.6   62.4 ± 5.6     29.1 ± 5.8   46.3 ± 7.9         61.6 ± 7.4   30.9 ± 6.8**
P. yoelii
  vs. P. berghei (1)              18.9 ± 4.6   28.4 ± 4.9     8.2 ± 3.5    4.9 ± 3.4          15.2 ± 7.5   18.8 ± 6.0
  vs. P. cynomolgi (5)            54.3 ± 5.8   66.1 ± 5.4     23.7 ± 5.4   29.0 ± 7.1         50.8 ± 7.6   40.6 ± 7.6
  vs. P. knowlesi (2)             55.6 ± 5.8   61.8 ± 5.4     26.8 ± 5.7   29.7 ± 7.3         47.2 ± 7.6   40.4 ± 7.5
  vs. P. vivax (1)                59.6 ± 5.7   75.1 ± 4.9*    23.2 ± 5.4   28.6 ± 7.3         50.0 ± 7.6   39.7 ± 7.6
P. berghei
  vs. P. cynomolgi (5)            57.5 ± 5.9   52.5 ± 5.5     31.4 ± 6.0   30.1 ± 7.1         52.3 ± 7.6   29.5 ± 7.0*
  vs. P. knowlesi (2)             57.7 ± 5.9   59.0 ± 5.4     34.4 ± 6.1   32.0 ± 7.3         43.3 ± 7.5   28.0 ± 6.8
  vs. P. vivax (1)                56.1 ± 5.8   59.2 ± 5.5     31.6 ± 6.0   29.7 ± 7.2         47.7 ± 7.6   28.1 ± 7.0
P. cynomolgi
  vs. P. cynomolgi (10)           4.8 ± 1.7    5.1 ± 1.8      3.1 ± 1.5    1.0 ± 1.0          6.6 ± 2.7    0.0 ± 0.0*
  vs. P. knowlesi (10)            13.9 ± 3.7   15.3 ± 3.8     17.8 ± 4.8   9.3 ± 4.5          30.6 ± 6.9   14.1 ± 5.3
  vs. P. vivax (5)                16.6 ± 4.1   15.8 ± 4.0     13.4 ± 4.3   4.4 ± 3.1          24.1 ± 6.3   4.9 ± 3.4**
P. knowlesi
  vs. P. knowlesi (1)             1.4 ± 1.4    2.5 ± 1.7      1.7 ± 1.7    0.0 ± 0.0          2.4 ± 2.4    0.0 ± 0.0
  vs. P. vivax (2)                20.7 ± 4.7   24.0 ± 4.8     15.7 ± 4.7   10.5 ± 5.0         30.9 ± 7.1   9.5 ± 4.5*
All comparisons (105)             37.4 ± 3.5   40.0 ± 3.6     21.8 ± 3.4   26.0 ± 4.4         37.6 ± 4.8   23.5 ± 4.1*

Values are percent conservative nonsynonymous difference at conservative nonsynonymous sites (pNC) and percent radical nonsynonymous difference at radical nonsynonymous sites (pNR) ± SE. N = number of codons compared. pNC significantly different from pNR at (*) 5% level; (**) 1% level.

Evolution of repeat regions: Because alignment of the repeat region of CS genes is not possible between species, repeat regions were analyzed within species for the two species with the most available allelic sequences, P. falciparum and P. cynomolgi. In the case of P. falciparum, the basic repeat unit encodes a 4-amino acid unit with the consensus sequence Asn-Ala-Asn-Pro. The number of repeat units varies among alleles, presumably as a result of deletion and duplication due to unequal intralocus crossing over. The difference between alleles in the number of repeat units makes any alignment of this region arbitrary. Rather than attempting to align alleles, I computed pS and pN between individual repeat units, both within and between alleles, as a way of examining the type of natural selection acting on the repeat region (Table 5). pS was found to exceed pN in the comparison of repeat units both within and between alleles. In fact, mean pS and mean pN in within-allele comparisons are almost identical to those in between-allele comparisons (Table 5). These results indicate that the amino acid sequence of the repeat unit is conserved within and between alleles in P. falciparum.
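The number of pairwise comparisons among repeat units follows directly from the number of units in each allele: n(n - 1)/2 within an allele of n units, and n × m between alleles of n and m units. The sketch below is a small illustration; the function names are hypothetical, and the unit counts are those shown in Table 5, with 41 units for 7G8 inferred from its 164 repeat residues in 4-residue units.

```python
def within_allele_pairs(n_units):
    """Number of distinct pairs of repeat units within one allele."""
    return n_units * (n_units - 1) // 2


def between_allele_pairs(n_units_a, n_units_b):
    """Number of pairs formed by one unit from each of two alleles."""
    return n_units_a * n_units_b


# Repeat-unit counts per allele (7G8 is inferred, see the lead-in).
UNITS = {"LE5": 40, "NF54": 44, "T4": 44, "We1": 46, "7G8": 41}
```

With these counts, `within_allele_pairs(UNITS["NF54"])` gives the 946 within-allele comparisons listed for NF54, and `between_allele_pairs(UNITS["LE5"], UNITS["NF54"])` gives the 1760 comparisons listed for LE5 versus NF54.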

In P. cynomolgi, the length of the repeat unit varies among alleles, and several alleles have two separate repeat types which differ in length (Table 1). Alignment of the repeat region in this species is therefore difficult. However, GALINSKI et al. (1987) have identified a 4-amino acid core sequence, homologs to which can be found in all P. cynomolgi repeat types. When pS and pN are calculated for this core sequence within and between repeat types, a very different pattern emerges from that seen in P. falciparum repeat units (Table 6). In P. cynomolgi, pN exceeds pS in comparisons between different repeat types of the same allele. This suggests that positive selection favors diversification of the amino acid sequence of the core repeat unit within alleles. On the other hand, in the comparison between alleles, pS and pN are about equal.


TABLE 5

Percent synonymous (pS) and nonsynonymous (pN) differences in comparisons among repeat units (4 codons) within and between P. falciparum alleles

Allele (No. repeat units)   Comparison (No.)     pS               pN
LE5 (40)                    vs. LE5 (780)        [?]0.8 ± 18.9    4.8 ± 3.1**
                            vs. NF54 (1760)      52.9 ± 17.5      4.1 ± 2.6**
                            vs. T4 (1760)        52.4 ± 17.5      3.8 ± 2.4**
                            vs. We1 (1840)       54.3 ± 17.8      4.3 ± 2.7**
                            vs. 7G8 (1640)       54.5 ± 17.6      4.2 ± 2.7**
NF54 (44)                   vs. NF54 (946)       44.0 ± 16.0      3.6 ± 2.3*
                            vs. T4 (1936)        42.0 ± 15.6      3.1 ± 2.0*
                            vs. We1 (2024)       46.2 ± 16.9      3.7 ± 2.3*
                            vs. 7G8 (1804)       46.1 ± 16.8      3.7 ± 2.3*
T4 (44)                     vs. T4 (946)         41.8 ± 16.4      2.7 ± 1.8*
                            vs. We1 (2024)       45.5 ± 16.8      3.3 ± 2.1*
                            vs. 7G8 (1804)       45.3 ± 16.7      3.3 ± 2.1*
We1 (46)                    vs. We1 (1035)       48.7 ± 17.8      [?].9 ± 2.4*
                            vs. 7G8 (1886)       48.3 ± 17.4      3.8 ± 2.4*


TABLE 6

Percent synonymous (pS) and nonsynonymous (pN) differences in comparisons among core repeat units (4 codons) within and between P. cynomolgi alleles

Allele (No. core repeat units)   Comparison                                 pS              pN
B (17)                           Within allele
                                   Repeat type 1                            16.6 ± 13.3     2.3 ± 2.0
                                   Repeat type 2                            4.5 ± [?].0     10.0 ± 5.1
                                   1 vs. 2                                  13.3 ± 10.3     35.7 ± 15.2
                                   All comparisons                          13.7 ± 10.9     19.4 ± 8.1
                                 Between alleles
                                   vs. C                                    17.4 ± 13.0     28.5 ± 11.4
                                   vs. G                                    16.4 ± 11.1     35.6 ± 12.7
                                   vs. L                                    7.9 ± 6.6       34.1 ± 11.6*
                                   vs. M/N                                  43.6 ± 22.1     25.1 ± 10.8
C                                Within allele
                                   Repeat type 1                            18.4 ± 14.5     5.3 ± 5.0
                                   Repeat type 2                            0.0 ± 0.0       0.0 ± 0.0
                                   1 vs. 2                                  13.4 ± 12.7     23.2 ± 13.5
                                   All comparisons                          17.2 ± 14.0     9.2 ± 5.8
                                 Between alleles
                                   vs. G                                    22.1 ± 14.2     22.6 ± 12.6
                                   vs. L                                    12.5 ± 11.0     38.2 ± 13.2
                                   vs. M/N                                  49.6 ± 23.7     21.8 ± 11.4
G                                Within allele
                                   All comparisons                          13.2 ± 11.7     0.0 ± 0.0
                                 Between alleles
                                   vs. L                                    8.6 ± 7.8       39.1 ± 14.0
                                   vs. M/N                                  31.1 ± 18.3     14.0 ± 10.0
L                                Within allele
                                   Repeat 1                                 0.0 ± 0.0       6.1 ± 5.6
                                   Repeat 2                                 0.0 ± 0.0       0.0 ± 0.0
                                   1 vs. 2                                  0.0 ± 0.0       58.4 ± 16.5***
                                   All comparisons                          0.0 ± 0.0       24.5 ± 7.7**
                                 Between alleles
                                   vs. M/N                                  36.1 ± 21.4     46.9 ± 13.1
M/N (52)                         Within allele
                                   All comparisons                          27.2 ± 14.9     4.4 ± 3.3
Means                            Within alleles
                                   Within repeat types (B, C, and L)        8.6 ± 8.5       5.2 ± 3.8
                                   Within repeat types (all alleles)        22.6 ± 13.5     4.2 ± 3.5
                                   Between repeat types (B, C, and L)       6.9 ± 9.4       44.7 ± 15.1**
                                 Between alleles                            30.7            30.3

Values are mean percent synonymous difference at synonymous sites (pS) and mean percent nonsynonymous difference at nonsynonymous sites (pN) ± SE. pS is significantly different from pN at (*) 5% level; (**) 1% level; (***) 0.1% level. A method is not available for estimating SE of between-allele means.

evolved, such a new repeat type may be duplicated within the allele by unequal crossing over; finally, this process will lead to the production of an allele whose repeats are entirely of the new type. Once a new repeat type is formed, there would be no further selection for change in its amino acid sequence; this would explain the fact that in P. cynomolgi, synonymous differences predominate within repeat types. Note that, on this model, P. cynomolgi alleles having two different repeat types represent transitional stages in the spread of new repeat types.

The reasons for the differences between P. falciparum and P. cynomolgi in the way the CS genes have evolved are not fully understood at present, but it may be worthwhile to mention some possible differences between their host species that may be correlated with different selective pressures. Polymorphism at human MHC loci is very high, suggesting that the long-term effective population number for humans is quite high (NEI and HUGHES 1990). The Macaca species which serve as hosts for P. cynomolgi may not have as high effective population sizes and therefore may have more limited MHC polymorphism. Thus, in P. falciparum selection on the CS genes may mainly have arisen as a result of adaptation to a host with high MHC polymorphism, whereas in P. cynomolgi the host's MHC may have been less important as a source of selection, giving a correspondingly greater role to the avoidance of host antibody defenses.

This research was supported by National Institutes of Health grants R01 GM43940 and R01 GM20293 and by National Science Foundation grant BSR8807910.

LITERATURE CITED

ARNOT, D. E., 1989 Malaria and the major histocompatibility complex. Parasitol. Today 5: 138-142.

ARNOT, D. E., J. W. BARNWELL and M. J. STEWART, 1988 Does biased gene conversion influence polymorphism in the circumsporozoite protein-encoding gene of Plasmodium vivax? Proc. Natl. Acad. Sci. USA 85: 8102-8106.

BALLOU, W. R., J. ROTHBARD, R. A. WIRTZ, R. W. GORE, I. SCHNEIDER, M. R. HOLLINGDALE, R. L. BEAUDOIN, W. L. MALOY, L. H. MILLER and W. T. HOCKMEYER, 1985 Immunogenicity of synthetic peptides from circumsporozoite protein of Plasmodium falciparum. Science 228: 996-999.

CASPERS, P., R. GENTZ, H. MATILE, J. R. PINK and F. SINIGAGLIA, 1989 The circumsporozoite protein gene from NF54, a Plasmodium falciparum isolate used in malaria vaccine trials. Mol. Biochem. Parasitol. 35: 185-190.

CLYDE, D. F., H. MOST, V. C. MCCARTHY and J. P. VANDERBERG, 1973 Immunization of man against sporozoite-induced falciparum malaria. Am. J. Med. Sci. 266: 169-177.

CLYDE, D. F., V. C. MCCARTHY, R. M. MILLER and W. E. WOODWARD, 1975 Immunization of man against falciparum and vivax malaria by use of attenuated sporozoites. Am. J. Trop. Med. Hyg. 24: 397-401.

COCHRANE, A. H., F. SANTORO, V. NUSSENZWEIG, R. W. GWADZ and R. S. NUSSENZWEIG, 1982 Monoclonal antibodies identify the protective antigens of sporozoites of Plasmodium knowlesi. Proc. Natl. Acad. Sci. USA 79: 5651-5655.

DE LA CRUZ, V. F., A. A. LAL and T. F. MCCUTCHAN, 1987 Sequence variation in putative functional domains of Plasmodium falciparum: implications for vaccine development. J. Biol. Chem. 262: 11935-11939.

DE LA CRUZ, V. F., W. L. MALOY, L. H. MILLER, A. A. LAL, M. F. GOOD and T. F. MCCUTCHAN, 1988 Lack of cross-reactivity between variant T cell determinants from malaria circumsporozoite protein. J. Immunol. 141: 2456-2460.

DEL PORTILLO, H. A., R. S. NUSSENZWEIG and V. ENEA, 1987 Circumsporozoite protein gene of a Plasmodium falciparum strain from Thailand. Mol. Biochem. Parasitol. 24: 289-294.

ENEA, V., and D. ARNOT, 1988 The circumsporozoite gene in Plasmodia, pp. 5-11 in Molecular Genetics of Parasitic Protozoa, edited by M. J. TURNER and D. ARNOT. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.

FINE, E., M. AIKAWA, A. H. COCHRANE and R. S. NUSSENZWEIG, 1984 Immunoelectron microscopic observations on Plasmodium knowlesi sporozoites: localization of protective antigen and its precursors. Am. J. Trop. Med. Hyg. 33: 220-226.


[...] strains of the malarial parasite Plasmodium knowlesi. Science 229: 779-782.

TANAKA, T., and M. NEI, 1989 Positive Darwinian selection observed at the variable-region genes of immunoglobulins. Mol. Biol. Evol. 6: 447-459.

VERGARA, U., R. GWADZ, D. SCHLESINGER, V. NUSSENZWEIG and A. FERREIRA, 1986 Multiple non-repeated epitopes on the circumsporozoite protein of Plasmodium falciparum. Mol. Biochem. Parasitol. 14: 283-292.

WEBER, J. L., 1988 Molecular biology of malaria parasites. Exp. Parasitol. 66: 143-170.

WEISS, W. R., M. F. GOOD, M. R. HOLLINGDALE, L. H. MILLER and J. A. BERZOFSKY, 1989 Genetic control of immunity to Plasmodium yoelii sporozoites. J. Immunol. 143: 4263-4266.

WOLFE, K. H., P. M. SHARP and W.-H. LI, 1989 Mutation rates vary among regions of the mammalian genome. Nature 337: 283-285.

YOSHIDA, N., R. S. NUSSENZWEIG, P. POTOCNJAK, V. NUSSENZWEIG and M. AIKAWA, 1980 Hybridoma produces protective antibodies directed against the sporozoite stages of malaria parasite. Science 207: 71-73.

ZAVALA, F., A. H. COCHRANE, E. H. NARDIN, R. S. NUSSENZWEIG and V. NUSSENZWEIG, 1983 Circumsporozoite proteins of malaria parasites contain a single immunodominant region with two or more identical epitopes. J. Exp. Med. 157: 1947-1957.

Communicating editor: A. G. CLARK