Domain Loss Facilitates Accelerated Evolution and Neofunctionalization of Duplicate Snake Venom Metalloproteinase Toxin Genes

Nicholas R. Casewell,*,1,2 Simon C. Wagstaff,1 Robert A. Harrison,1 Camila Renjifo,1 and Wolfgang Wüster2

1Alistair Reid Venom Research Unit, Liverpool School of Tropical Medicine, Liverpool, United Kingdom
2School of Biological Sciences, Environment Centre Wales, Bangor University, Bangor, United Kingdom

*Corresponding author: E-mail: [email protected].

Associate editor: Willie Swanson

Abstract

Gene duplication is a key mechanism for the adaptive evolution and neofunctionalization of gene families. Large multigene families often exhibit complex evolutionary histories as a result of frequent gene duplication acting in concordance with positive selection pressures. Alterations in the domain structure of genes, causing changes in the molecular scaffold of proteins, can also result in a complex evolutionary history and have been observed in functionally diverse multigene toxin families. Here, we investigate the role alterations in domain structure have on the tempo of evolution and neofunctionalization of multigene families using the snake venom metalloproteinases (SVMPs) as a model system. Our results reveal that the evolutionary history of viperid (Serpentes: Viperidae) SVMPs is repeatedly punctuated by domain loss, with the single loss of the cysteine-rich domain, facilitating the formation of P-II class SVMPs, occurring prior to the convergent loss of the disintegrin domain to form multiple P-I SVMP structures. Notably, the majority of phylogenetic branches where domain loss was inferred to have occurred exhibited highly significant evidence of positive selection in surface-exposed amino acid residues, resulting in the neofunctionalization of P-II and P-I SVMP classes. These results provide a valuable insight into the mechanisms by which complex gene families evolve and detail how the loss of domain structures can catalyze the accelerated evolution of novel gene paralogues. The ensuing generation of differing molecular scaffolds encoded by the same multigene family facilitates gene neofunctionalization while presenting an evolutionary advantage through the retention of multiple genes capable of encoding functionally distinct proteins.

Key words: molecular evolution, domain loss, positive selection, gene duplication, Serpentes, metalloproteinase.

Introduction

The evolution of gene families is a key mechanism for the adaptive evolution of organisms (cf. Ohno 1970, 1973; Zhang et al. 1998). One process central to the generation of new genetic material for biological evolution is gene duplication (Ohno 1970, 1973; Kimura and Ohta 1974; Hughes 1994; Zhang et al. 1998; Lynch and Conery 2000; Bergthorsson et al. 2007). Since the inception of the “neofunctionalization model” (or “mutation during nonfunctionality”), whereby the duplication of a gene precedes the emergence of a new function through the relief of functional constraints from one copy (Ohno 1970; Kimura and Ohta 1974), the evolutionary processes that govern the neofunctionalization of duplicate genes have been the subject of much interest. However, this model fails to explain how duplicate genes may be sufficiently maintained in a population to accumulate mutations to confer a new function, particularly when maintaining a duplicated gene by selection would restrict the freedom to diverge (“Ohno’s dilemma”; Bergthorsson et al. 2007). It has since been proposed that genes may develop one or more discrete or neutral secondary functions prior to gene duplication, with subsequent mutations (Hughes 1994) or further gene amplification prior to mutation (Bergthorsson et al. 2007), allowing selective pressures to maintain duplicates with divergent functions. Frequent gene duplication is consequently capable of generating large related multigene families that exhibit a diverse array of functions (cf. Nei et al. 1997; Kordiš and Gubenšek 2000; Lynch and Conery 2000; Fry et al. 2003; Lynch 2007).

The evolutionary history of gene families containing numerous paralogues can be particularly difficult to trace, especially when genes have been the subject of positive Darwinian selection and/or alterations in domain structure, as the result of deletion or divergence (Ohta 1991; Richards and Cavalier-Smith 2005). A number of snake venom toxin families demonstrated to have evolved via gene duplication and positive selection are consequently typified by a complex evolutionary history, with diverse functional and pathological activities encoded by members of the same multigene family (cf. Kini and Chan 1999; Fry et al. 2003; Fox and Serrano 2005; Lynch 2007; Casewell et al. 2011). Furthermore, some toxin families exhibit fascinating diversity in their molecular scaffolds: Genes have been identified that tandemly repeat domains (Ducancel et al. 1993; Soares et al. 2005) or motifs (Wagstaff et al. 2008), encode specific domain products (cf. Calvete et al. 2005; Fry et al. 2008), and combine through domain loss and gene duplication (Fry et al. 2010) to produce structurally distinct venom toxins. Multigene toxin families therefore represent a model system to investigate how neofunctionalization occurs in rapidly evolving gene families and the role alterations in domain structure may take in such a process.

© The Author 2011. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: [email protected]

Research article. Mol. Biol. Evol. 28(9):2637–2649. 2011 doi:10.1093/molbev/msr091 Advance Access publication April 4, 2011

The snake venom metalloproteinases (SVMPs) are a large multigene toxin family that encode differing multidomain proteins capable of inducing a diverse array of functions that include: hemorrhage, coagulopathy, fibrinolysis, apoptosis, and the activation of factor X and prothrombin (cf. Fox and Serrano 2005). As a result of their extensive representation in viperid venoms (cf. Bazaa et al. 2005; Sanz et al. 2008; Wagstaff et al. 2009) and their aforementioned functional diversity, SVMPs are one of the principal causes of the life-threatening hemorrhagic pathologies observed following viperid snakebite. SVMPs are typically characterized by the presence of a catalytic “H-box” amino acid motif (HEX₂HX₂GX₂HD) in the metalloproteinase (MP) domain, and conventionally further categorized into classes (P-I, P-II, and P-III) based upon the presence or absence of additional nonproteinase domains extending the MP domain (Hite et al. 1994; Fox and Serrano 2005, 2008): The P-I class comprises only a MP domain, P-IIs are those sequentially extended by a disintegrin (DIS) domain, and P-IIIs by a DIS-like and cysteine-rich domain (fig. 1). Dissimilarities in the numbering of structural cysteine residues and the absence of an RGD or RGD-like integrin-binding amino acid motif differentiate the DIS-like domain of P-IIIs from the DIS domain of P-IIs (Fox and Serrano 2005, 2008). Subclassifications of P-II and P-III classes have also been advocated on the basis of evidence of posttranslational modification of SVMP toxins, including the proteolytic processing of specific domains and the formation of multimeric structures (Fox and Serrano 2005, 2008) (fig. 1). Although all three major SVMP classes are capable of inducing hemorrhage, the inclusion of additional domains appears to correspond to increases in hemorrhagic activity (Fox and Serrano 2005, 2008, 2009). Furthermore, known and suspected alterations in the domain structures of SVMPs are thought to account for their functional diversity, with certain activities seemingly associated with specific domains, such as fibrinolysis (P-I MP domain), potent platelet aggregation (P-II DIS domain), and activation of prothrombin and factor X (P-III structure) (cf. Calvete et al. 2005; Fox and Serrano 2005).

Because of their pathological importance, there is intense interest surrounding the evolution of SVMPs following their recruitment into venom. Early studies investigating their evolutionary history implied a common ancestry with the mammalian matrix metalloproteinases (Bjarnason and Fox 1994; Hite et al. 1994; Moura-da-Silva et al. 1996), preceding divergence by positive selection (Glassey and Civetta 2004). More recently, the phylogenetic placement of SVMP recruitment was inferred to have occurred at the base of the advanced snake (Caenophidia) radiation, prior to the divergence of the Viperidae from most other Caenophidians (Fry et al. 2008). This recruited protein was likely a P-III ancestor, with subsequent divergence of molecular scaffolds responsible for the generation of the P-II and P-I classes (Moura-da-Silva et al. 1996), recently speculated to be the result of neofunctionalization of the DIS-like domains of duplicated P-III genes (Calvete et al. 2005). However, Moura-da-Silva et al. (1996) were constrained by the very limited molecular data set available at the time and were unable to obtain significant support for a number of important nodes on their gene tree, including the monophyly of the P-II class and the P-I and P-II classes. To date, no comprehensive analysis has been undertaken to further elucidate the evolutionary history of this medically important toxin family, with a limited number of studies investigating either the phylogenetic relationships of SVMPs isolated from closely related species or of specific classes or domains (e.g., Tsai et al. 2000; Chen et al. 2003; Calvete et al. 2005; Guo et al. 2007; Juarez et al. 2008). Notably, it remains untested whether the multiple domain structures observed in SVMPs have undergone the same evolutionary history (i.e., independently of recombination), and if so, what influence alterations in molecular structure have bestowed on the rate of evolution and neofunctionalization of SVMP genes. Here, we trace the evolutionary history of SVMPs and their constituent domains by implementing Bayesian inference analyses on multiple domain partitions of an extensive SVMP data set, before utilizing tests of adaptive molecular evolution on points of the tree where domain alterations were inferred to have occurred. These analyses permit investigation of the role gene duplication and alterations in domain structure have on the tempo of evolution and neofunctionalization of complex multigene families.

FIG. 1. Schematic of SVMP classes and their posttranslationally modified forms in venom. SP, signal peptide; P, predomain; Pro, prodomain; Metalloproteinase, metalloproteinase domain; DIS, disintegrin domain; DIS-like, disintegrin-like domain; CYS, cysteine-rich domain. Intact: P-Ia, P-IIb, and P-IIIa; proteolytically processed into multiple products: P-IIa and P-IIIb; dimeric: P-IIc and P-IIIc.

Materials and Methods

Echis SVMPs

Venom gland cDNA libraries were constructed for four species of saw-scaled viper, Echis ocellatus, E. coloratus, E. pyramidum leakeyi, and E. carinatus sochureki, representing the four major species groups in the genus (Pook et al. 2009), using procedures previously outlined (Wagstaff and Harrison 2006; Casewell et al. 2009). Briefly, expressed sequence tags were generated and bioinformatically processed using the PartiGene pipeline (Parkinson et al. 2004), including high stringency CLOBB clustering (Parkinson et al. 2002; Wagstaff and Harrison 2006), into clusters representing putative gene products. Subsequently, a single clone from each cluster that exhibited Blast similarity to SVMPs was selected for full-length sequencing via primer walking. Multiple nucleotide reads were stitched together in SeqMan (DNASTAR, Lasergene software suite), trimmed to the open reading frame, and translated into amino acids. GenBank accession numbers of the Echis sequences generated are displayed in the appropriate figures.

Non-Echis SVMPs

Nucleotide and amino acid SVMP sequences were obtained through a search of the public databases (GenBank and UniProt accession numbers of sequences used are displayed in the appropriate figures). Due to the greater availability of amino acid sequences, nucleotide sequences were translated and incorporated into the amino acid data set before alignment in ClustalW (Thompson et al. 1994). Identical sequences and those containing truncations or frameshifts as the result of insertions or deletions were excluded in MEGA4 (Tamura et al. 2007). Due to the frequency of partial length sequences obtained from the public databases, sequences that failed to demonstrate sequence similarity to the catalytic site (H-box motif) of the MP domain (Fox and Serrano 2005) were excluded from the analysis to ensure sufficient sequence overlap. In addition, P-I SVMP amino acid sequences derived exclusively from proteomic studies were excluded due to classification uncertainties as true P-I SVMPs or proteolytically processed P-II SVMPs. Following exclusions, non-Serpentes outgroup sequences were identified by sequence similarity searches against non-Serpentes sequence databases. Outgroup sequences (GenBank: BC161221 and XM_001233495) and the Echis data set were added to the nonredundant Serpentes data set, prior to realignment in ClustalW and final manual adjustments.
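The H-box screening step can be illustrated with a short sketch. It assumes the consensus HEX2HX2GX2HD (X being any residue) is matched exactly with a regular expression; the study itself screened for sequence similarity to the catalytic site, which is a looser criterion, so this is a simplification. The function name and the toy sequences are hypothetical and are not taken from the data set.

```python
import re

# H-box consensus quoted in the text: HEX2HX2GX2HD, where X is any residue.
# Exact matching is an assumption made for illustration only.
H_BOX = re.compile(r"HE.{2}H.{2}G.{2}HD")


def has_h_box(protein_seq: str) -> bool:
    """Return True if the sequence (alignment gaps removed) contains the H-box."""
    ungapped = protein_seq.replace("-", "").upper()
    return H_BOX.search(ungapped) is not None


# Hypothetical toy sequences: one carries the motif, one is truncated before it ends.
toy_sequences = {
    "with_motif": "MKTLA" + "HELGHNLGMEHD" + "GKDC",
    "truncated": "MKTLAHELGHNLG",
}

# Keep only the sequences that overlap the catalytic site.
kept = [name for name, seq in toy_sequences.items() if has_h_box(seq)]
```

A similarity-based screen, as used in the study, would also retain sequences with substitutions inside the motif; the exact pattern above would discard them.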

Phylogenetic and Recombination Analyses

Prior to phylogenetic analyses, data sets were partitioned into: 1) full-length, 2) MP domain, 3) disintegrin/disintegrin-like (DIS) domain, 4) MP and DIS domains, 5) MP, DIS, and cysteine-rich (CYS) domains, and 6) DIS and CYS domains (cf. fig. 1; domains identified as per Fox and Serrano 2005, 2008). The full data set was also partitioned into a data set consisting of P-I and P-II SVMPs (containing three representative P-IIIs for phylogenetic placement) and another consisting of P-III SVMPs (containing two representative P-I and P-II SVMPs). Each data set was tested for evidence of recombination in the recombination detection program RDP3 v3.44 (Heath et al. 2006), using standard parameters and RDP, GENECONV, MaxChi, Chimaera, and 3Seq search models (Smith 1992; Padidam et al. 1999; Martin and Rybicki 2000; Posada and Crandall 2001; Boni et al. 2007). Subsequently, gene trees were produced from generated alignments using optimized models of sequence evolution combined with Bayesian inference. Considering that complex models of sequence evolution have been demonstrated to extract additional phylogenetic signal from data (cf. Castoe et al. 2005; Castoe and Parkinson 2006), we subjected the data sets to analysis in ModelGenerator v0.85 (Keane et al. 2006). The model favored under the Akaike information criterion (AIC) (Posada and Buckley 2004) was selected for all partitions. Bayesian inference analyses were undertaken using Markov Chain Monte Carlo randomization in MrBayes v3.1 (Ronquist and Huelsenbeck 2003) on the freely available bioinformatic platform Bioportal (www.bioportal.uio.no; Kumar et al. 2009). Each data set was run in duplicate using four chains simultaneously (three heated and one cold) for 5 × 10⁶ generations, sampling every 500th cycle from the chain and using default settings in regards to priors. We used the program Tracer v1.4 (Drummond and Rambaut 2007) to estimate effective sample sizes for all parameters and to construct plots of ln(L) against generation to verify the point of convergence (burn-in); trees generated prior to the completion of burn-in were discarded.
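As a quick check on the scale of the sampling scheme described above, the following sketch works out how many samples each run would store. The variable names are illustrative; the count ignores the initial state at generation 0, which sampling software may also record.

```python
# Run settings as stated in the text.
generations = 5_000_000   # 5 x 10^6 generations per run
sample_every = 500        # one sample every 500th cycle
runs = 2                  # each data set was run in duplicate
chains_per_run = 4        # three heated chains and one cold chain

# Samples stored per run (excluding any sample taken at generation 0).
samples_per_run = generations // sample_every

# Samples available across both runs before any burn-in is discarded.
total_samples = samples_per_run * runs
```

Under these settings each run stores 10,000 samples, so the effective sample sizes checked in Tracer are estimated from chains of that length minus the burn-in.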

Detection of Adaptive Molecular Evolution

To test for episodes of positive selection following the loss of SVMP domains, the branch-site codon substitution model was implemented in CODEML in PAML4 (Yang 2007). Two data sets were generated from the full amino acid data set used in Bayesian analysis to test for selection following the loss of 1) the P-III cysteine-rich domain to form P-IIs (all viperid P-III and P-II SVMPs were retained and P-I and non-viperid SVMPs [excluding outgroups] were excluded) and 2) the P-II DIS domain to form P-Is (all viperid P-II and P-I SVMPs were retained and P-III and non-viperid SVMPs were excluded). The two data sets were back-translated into DNA sequences (cf. supplementary table 1, Supplementary Material online; for the small number of amino acid sequences excluded due to the absence of DNA sequence) in MEGA4 (Tamura et al. 2007) and partitioned into first-, second-, and third-codon positions to incorporate any differences in patterns of sequence evolution in MrModeltest v2.3 (Nylander 2004). Models selected were implemented for Bayesian analyses using the parameters previously described. Bayesian DNA gene trees generated were checked for homology to amino acid trees, and foreground branches allocated to clades containing more than one DNA sequence where domain losses were inferred to have occurred. The CODEML alternative (branch-site model A) and null model (Yang et al. 2005; Zhang et al. 2005) were run in duplicate and compared with each other for eight sets of data: foreground branches representing sequences in 1) six independent P-I class branches (numbered 1–6 in supplementary fig. 1, Supplementary Material online), 2) the P-II class branch (numbered 7 in supplementary fig. 2, Supplementary Material online), and 3) the P-II class branch with exclusion of the DIS domain. Twice the log-likelihood difference between the two models was compared with the 50:50 mixture of point mass 0 and χ² (degrees of freedom = 1, critical values 3.84 and 5.99 at 5% and 1% significance levels; Zhang et al. 2005). To exclude the possibility that positive selection is acting independently of domain loss, we randomly selected 20 branches from the DNA gene trees to act as a null test; nodes on the two trees were numbered consecutively, and a random number generator used to select branches (cf. supplementary figs. 1 and 2, Supplementary Material online for those selected). Tests of adaptive molecular evolution were subsequently undertaken as previously described. Differences between the test statistic (twice the log-likelihood difference) obtained from positively selected null and domain loss branches were statistically analyzed: The test statistics were log transformed to meet normality according to the Anderson–Darling test (Anderson and Darling 1952) and tested for significance (P > 0.95) in a two-tailed unequal variance t-test (Ruxton 2006), as the result of significant differences in group variance (Bartlett 1937).
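The likelihood ratio test described above can be sketched as follows. This is a minimal illustration, not the CODEML implementation: the function names and the two log-likelihood values are hypothetical, and the χ² tail probability for one degree of freedom is computed from the identity P(χ²₁ > x) = erfc(√(x/2)) so that only the standard library is needed.

```python
import math


def chi2_df1_sf(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    if x <= 0:
        return 1.0
    return math.erfc(math.sqrt(x / 2.0))


def branch_site_lrt(lnl_alt: float, lnl_null: float):
    """Return the LRT statistic and its p-values under chi-square(1) and the 50:50 mixture."""
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))  # twice the log-likelihood difference
    p_chi2 = chi2_df1_sf(stat)
    # Half of the mixture's mass sits at zero, so the tail is halved for stat > 0.
    p_mixture = 0.5 * p_chi2 if stat > 0 else 1.0
    return stat, p_chi2, p_mixture


# Hypothetical log-likelihoods for one foreground branch (not values from the study).
stat, p_chi2, p_mixture = branch_site_lrt(lnl_alt=-10000.0, lnl_null=-10005.0)
```

The statistic returned here is the quantity that the text compares against its quoted critical values; the plain χ² p-value is the more conservative of the two.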

Where evidence of positive selection in the domain loss branches was obtained by branch-site analysis, codon positions identified by the Bayes Empirical Bayes (BEB) method (Yang et al. 2005) as under positive selection (where P < 0.05) were structurally mapped into SVMP 3D macromolecular structures using YASARA (www.yasara.org). Due to the absence of P-II crystal structures, codon positions identified in the MP domain of P-I and P-II SVMP foreground branches were mapped into a P-I structure acting as the domain template (BAP1 from Bothrops asper, PDB: 1ND1). Codons identified in the DIS domain of sequences on the P-II foreground branch were mapped into a DIS structure (Trimestatin from Protobothrops [formerly Trimeresurus] flavoviridis, PDB: 1J2L). The accessible surface area (ASA) was predicted for all individual residues present in both macromolecular structures using Real-SPINE 3.0 (Faraggi et al. 2009) with standard parameters. To test whether positive selection is predominately acting upon surface-exposed residues, ASA scores of positively selected amino acid residues in P-I and P-II domain loss branches were statistically compared with entire residues present in each macromolecular structure by two-tailed equal variance t-tests.

Results

Phylogenetic Parameters

The SVMP data set, containing 738 amino acid positions (n = 190), was partitioned into the following data sets to rigorously test the hypothesis that differing SVMP domains have undergone the same evolutionary history: 1) full-length (738 amino acid positions), 2) MP domain (226), 3) DIS domain (110), 4) MP and DIS domains (336), 5) MP, DIS, and CYS domains (481) and 6) DIS and CYS domains (255). The data set was also partitioned into: P-I and P-II SVMPs (594 amino acid positions, n = 86) and P-III SVMPs (738 amino acid positions, n = 96) to increase the resolution of nodes within major SVMP classes. The SVMP alignments revealed no evidence of recombination following analyses in the recombination detection program RDP3 (Heath et al. 2006) using standard parameters (data not shown). Models of sequence evolution assigned by ModelGenerator (Keane et al. 2006) for the amino acid Bayesian analyses are displayed in supplementary table 2, Supplementary Material online. Tracer (Drummond and Rambaut 2007) revealed that the point of convergence (burn-in) for each Bayesian analysis had occurred prior to the first 1.5 × 10⁶ generations for all parameters, but we conservatively discarded these generations and calculated the consensus trees from the remaining 75% of the posterior distribution. All parameters of the Tracer analyses had effective sample sizes above 200 and in most cases by a large margin.
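The partition sizes reported above are mutually consistent, which the following sketch checks. The cysteine-rich domain length is not stated in the text and is inferred here by subtraction; the remaining columns are assumed to correspond to regions outside the MP, DIS, and CYS domains (for example the pre- and prodomains). Variable names are illustrative.

```python
# Alignment lengths (amino acid positions) as reported for each partition.
reported = {
    "full": 738,
    "MP": 226,
    "DIS": 110,
    "MP+DIS": 336,
    "MP+DIS+CYS": 481,
    "DIS+CYS": 255,
}

mp = reported["MP"]
dis = reported["DIS"]

# Inferred, not stated in the text: CYS columns obtained by subtraction.
cys = reported["MP+DIS+CYS"] - reported["MP+DIS"]

# Columns of the full-length alignment that lie outside MP, DIS, and CYS.
outside_domains = reported["full"] - reported["MP+DIS+CYS"]

# Each combined partition should equal the sum of its component domains.
consistent = {
    "MP+DIS": mp + dis == reported["MP+DIS"],
    "DIS+CYS": dis + cys == reported["DIS+CYS"],
    "MP+DIS+CYS": mp + dis + cys == reported["MP+DIS+CYS"],
}
```

All three checks hold, giving an inferred cysteine-rich partition of 145 positions and 257 positions outside the three domains.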

Evolutionary History of SVMPs

Consensus gene trees produced by the data sets (representing different SVMP domains) revealed the consistent presence of highly supported identical clades and largely consistent tree topologies, particularly in those data sets encompassing multiple domains (full-length; MP and DIS; and MP, DIS, and CYS, data not shown). Perhaps unsurprisingly, the highest support values for the consistent topology found throughout the analyses were observed in the MP and disintegrin/disintegrin-like analysis (fig. 2), where the cysteine-rich domain (redundant for classes P-I and P-II) and the conserved and proteolytically processed pre- and prodomains were excluded. Due to the absence of recombination or strongly supported contradictory nodes in different domain analyses, we accept the hypothesis that different SVMP domains have undergone the same evolutionary history.

The consensus gene tree for the SVMP toxin family (fig. 2) displays a strongly supported monophyletic group consisting of Viperidae SVMPs from all three classes (P-I, P-II, and P-III), whereas non-viperid SVMPs are solely represented by the P-III class. The evolutionary history of viperid SVMPs is characterized by repeated domain loss; the loss of the P-III cysteine-rich domain precedes the formation of a P-II structure, which in turn precedes the evolution of the P-I scaffold through the loss of the DIS domain (fig. 2) (Moura-da-Silva et al. 1996). The P-III class consensus gene tree (fig. 3) displays evidence of numerous strongly supported clades containing SVMPs from members of multiple genera, highlighting the importance of gene duplication in the evolutionary history of this toxin family. The P-I/P-II consensus gene tree (fig. 4) contains a number of strongly supported clades that are uniquely composed of either P-I or P-II SVMPs, alongside multiple clades that contain both P-I and P-II representatives. These results highlight apparent multiple convergent evolution of the P-I SVMP structure; loss of the P-II DIS domain to form the P-I scaffold has evolved independently on at least eight occasions within the Viperidae.

FIG. 2. Bayesian SVMP amino acid gene tree. Nodes with black circles received a Bayesian posterior probability of 1.00 and nodes with gray circles >0.95. Outgroup sequences are Xenopus tropicalis (BC161221) and Gallus gallus (XM_001233495).

Adaptive Molecular Evolution

Tests of adaptive molecular evolution utilized back-translated amino acid data sets consisting of 2082 DNA positions of P-III and P-II SVMPs (n = 128) and 1635 DNA positions of P-II and P-I SVMPs (n = 76). Models of sequence evolution assigned by MrModeltest (Nylander 2004) for Bayesian analyses are displayed in supplementary table 2, Supplementary Material online. The DNA consensus gene trees (supplementary figs. 1 and 2, Supplementary Material online) produced nodes consistent with the amino acid analyses (figs. 2 and 4) where domain loss was inferred to have occurred. Foreground branches for tests of positive selection in CODEML (Yang 2007) were selected on clades containing two representatives where episodes of domain loss were inferred to have occurred (cf. fig. 4 and supplementary figs. 1 and 2, Supplementary Material online); once following the loss of the P-III cysteine-rich domain to form a P-II structure (P-III and P-II data set), and six times following the convergent loss of the P-II DIS domain to form a P-I (P-II and P-I data set). Foreground branches were tested against background branches for evidence of positive selection following the divergence of toxins exhibiting domain loss. The P-II foreground branch and four of the six P-I branches exhibited highly significant (P < 0.01) evidence of positive selection over background branches (table 1). Because the DIS domain of P-IIs and the DIS-like domain of P-IIIs exhibit considerable sequence disparity following their divergence (Fox and Serrano 2005, 2008), we repeated the P-III and P-II analysis excluding these domains to test whether the MP domain of these SVMP classes had also evolved by positive selection; the results remained highly significant (P < 0.01) (table 1). Null tests of positive selection on 20 randomly selected branches in the SVMP trees (cf. supplementary figs. 1 and 2, Supplementary Material online) revealed evidence of positive selection in ten branches. Although this represents a lower percentage (50%) than branches where domain loss was inferred to have occurred (71%; 5 of 7 branches), it is apparent that positive selection acting on SVMPs is not exclusively associated with domain loss. However, it is notable that when statistically comparing the positive selection test statistic between the two groups of branches exhibiting evidence of positive selection (i.e., domain loss branches, cf. table 1, and null branches, cf. supplementary table 3, Supplementary Material online), a significant increase (P = 0.040) was observed in branches where domain loss was inferred to have occurred (cf. supplementary table 4, Supplementary Material online for test statistics). These results robustly imply selective pressures acting on SVMPs are strongest following the loss of domains.
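The comparison between the two groups of test statistics can be sketched as an unequal variance (Welch) t-test on log-transformed values. The numbers below are hypothetical placeholders chosen only to match the group sizes (five domain loss branches and ten null branches); the actual statistics are in table 1 and the supplementary tables. The function name is illustrative, and the p-value is omitted because the standard library has no t distribution.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's unequal variance t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical LRT statistics (twice the log-likelihood difference), NOT study data.
domain_loss_lrt = [20.0, 35.0, 50.0, 80.0, 120.0]
null_lrt = [5.0, 8.0, 10.0, 12.0, 15.0, 7.0, 9.0, 20.0, 6.0, 11.0]

# Log transformation, as applied in the text to meet the normality assumption.
log_loss = [math.log(x) for x in domain_loss_lrt]
log_null = [math.log(x) for x in null_lrt]

t_stat, dof = welch_t(log_loss, log_null)
```

A positive t statistic indicates larger test statistics in the domain loss group; a two-tailed p-value would then be read from a t distribution with the computed degrees of freedom.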

Domain loss foreground branches that exhibited evidence of positive selection were analyzed with the BEB method in CODEML (Yang et al. 2005) to detect the amino acid residues that are the subject of positive selection. Residues in each branch that tested significantly (P < 0.05; cf. supplementary table 5, Supplementary Material online for a list) were mapped to P-I (MP domain) and DIS 3D macromolecular structures (fig. 5). Nineteen residues were identified in the four P-I foreground branches (MP domain only), whereas 53 residues (MP domain: 38, DIS domain: 15) were identified in the P-II foreground branch (table 1). It is worth noting that of the 15 DIS domain residues identified here, 11 correspond to those previously identified by tests of positive selection specific to the evolution of this domain (Juarez et al. 2008). For both analyses undertaken here, the majority of the identified residues are predicted to be surface exposed on the macromolecular structure (fig. 5). Notably, the residues identified from both sets of branch tests exhibited highly significant increases in Real-SPINE 3.0 ASA scores (Faraggi et al. 2009) when compared with the total residues present in each macromolecular structure (P = 0.021 for P-I branch residues and P = 0.010 for P-II branch residues; cf. supplementary table 6, Supplementary Material online for test statistics), providing strong evidence that highly surface-exposed residues are the focus of positive selection.
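One way to ask whether positively selected residues are more surface exposed than expected by chance is a one-sided permutation test on ASA scores. The sketch below is a swapped-in illustration, not the test used in the study (whose statistics are in supplementary table 6); the function name, the ASA values, and the selected indices are all hypothetical.

```python
import random
from statistics import mean


def permutation_p(all_asa, selected_idx, n_perm=10000, seed=0):
    """One-sided permutation P value: do the selected residues have a higher
    mean ASA than random residue subsets of the same size?"""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    k = len(selected_idx)
    observed = mean(all_asa[i] for i in selected_idx)
    hits = 0
    for _ in range(n_perm):
        # Draw k residues without replacement from the whole structure.
        if mean(rng.sample(all_asa, k)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# HYPOTHETICAL per-residue ASA scores for a small structure.
hypothetical_asa = [5.0, 12.0, 80.0, 3.0, 95.0, 40.0, 7.0, 110.0, 22.0, 60.0]
# HYPOTHETICAL indices of residues flagged by the BEB analysis.
hypothetical_selected = [2, 4, 7]

print(permutation_p(hypothetical_asa, hypothetical_selected, n_perm=2000))
```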

Discussion

Recruitment and Diversification of SVMPs
Multiple phylogenetic analyses of different SVMP domains revealed consistent evolutionary relationships (data not shown) and absence of recombination, implying different domains have undergone the same evolutionary history and therefore do not confound traditional phylogenetic analyses. Our results confirm that the ancestral recruitment of a P-III-like SVMP likely occurred at the base of the Caenophidian radiation, as previously suggested by Moura-da-Silva et al. (1996) and Fry et al. (2008), prior to substantial gene duplication and diversification (fig. 2). The alternative hypothesis, where independent recruitment events have occurred once in the Viperidae

Casewell et al. · doi:10.1093/molbev/msr091 MBE


and again at the base of the remaining Caenophidians, has previously been rejected (Fry and Wuster 2004). Notably, the paraphyly of non-viperid SVMPs (i.e., Elapidae and Atractaspidae P-III SVMPs) near the base of the tree (fig. 2) strongly implies that recruitment and primary gene duplication occurred prior to the divergence of the Caenophidians, with the majority of subsequent increases in gene diversity occurring in the Viperidae. The identification of robustly supported nodes supporting the placement of P-I and P-II SVMPs within the viperid P-IIIs (fig. 2) confirms the diversification of SVMP molecular scaffolds has occurred following the separation of the viperids from the remaining Caenophidia; these observations explain the absence of P-I and P-II SVMPs from non-viperid species and suggest their functional activities are exclusive to viperid venom. Despite the production of consistent tree topologies derived from multiple domain analyses, the monophyly of the clade containing the P-I and P-II classes is not significantly (>0.95) supported in these analyses (cf. fig. 2). However, DNA analysis of the Viperidae P-III and P-II classes produced a strongly supported monophyletic P-II clade (supplementary fig. 2, Supplementary Material

FIG. 3. Bayesian P-III SVMP amino acid gene tree. Nodes with black circles received a Bayesian posterior probability of 1.00 and nodes with gray circles >0.95. Phylogenetic placement of P-I and P-II class SVMPs is boxed in gray. Identified evidence of posttranslational modifications (dimerism and proteolytic processing) and the respective P-III subclass (Fox and Serrano 2005, 2008) are mapped to clades. Outgroup sequences are Xenopus tropicalis (BC161221) and Gallus gallus (XM_001233495).

Evolution of Duplicate Toxin Genes · doi:10.1093/molbev/msr091 MBE


online). Although a small number of P-II sequences were excluded from this analysis (cf. supplementary table 1, Supplementary Material online; absence of DNA sequence), including the basal Agkistrodon bilineatus toxin sequence (GenBank: P0C6E3) (cf. fig. 2), the results here and elsewhere (cf. fig. 4) imply that P-I and P-II SVMPs represent a monophyletic group, despite the absence of significant node support in the full amino acid data set.

Evolution of P-II SVMPs
The evolutionary history of the viperid SVMPs implies the origin of the P-II structure has occurred through the loss of the P-III cysteine-rich domain followed by mutation of the DIS-like domain, forming the DIS domain (Moura-da-Silva et al. 1996; Calvete et al. 2003; Juarez et al. 2008). The DIS domain is characterized by the loss of specific cysteine residues which remain in the DIS-like domains of their P-III precursors (Calvete et al. 2003; Fox and Serrano 2005; Juarez et al. 2008). Notably, the most basal P-II SVMPs have been identified as those that do not undergo proteolytic processing (cf. fig. 4) and, although possibly coincidental, it is worth noting that dimeric clades of SVMPs exist near the base of both the P-III and P-II/P-I diversifications (figs. 3 and 4), suggesting that the formation of a dimeric structure may facilitate the subsequent diversification and modification of SVMP domain structures. Subsequent evolutionary

FIG. 4. Bayesian P-I and P-II SVMP amino acid gene tree. Nodes with black circles received a Bayesian posterior probability of 1.00 and nodes with gray circles >0.95. Phylogenetic placement of P-I class SVMPs is boxed in gray. Identified evidence of dimerism and non-proteolytic processing and the respective P-II subclass (Fox and Serrano 2005, 2008) are mapped to clades. Numbers on the tree mark the nodes selected as foreground branches for adaptive molecular evolution analyses: 1–6, formation of P-I structures; 7, formation of P-II structures. Outgroup sequences are Xenopus tropicalis (BC161221) and Gallus gallus (XM_001233495).


loss of dimerism may provide a functional advantage by reducing toxin size and enabling more rapid physiological diffusion in prey (Doley and Kini 2009).

Evolution of P-I SVMPs
The formation of the P-I molecular scaffold has occurred through the loss of the P-II DIS domain on at least eight independent occasions (fig. 4). Notably, genus-specific domain losses have occurred in the genera Echis, Macrovipera, Deinagkistrodon, and at least twice in Bothrops. Given the incomplete sampling of SVMPs, this likely underrepresents the true number of DIS domain losses that have occurred throughout the evolutionary history of the Viperidae. The presence of multiple clades containing both P-I and P-II SVMPs implies these domain losses are convergent, with parallel loss occurring independently in multiple genera. This result is surprising because the evolution of the P-II structure appears to have occurred once at the base of the Viperidae radiation (fig. 2) approximately 62 Ma (Wuster et al. 2008); P-I evolution has apparently occurred much later following 1) the split of Macrovipera from Bitis (~40 Ma), 2) the Echis radiation (~18 Ma) in the Viperinae, and 3) the split of Bothrops from Crotalus in the Crotalinae (~22 Ma) (Wuster et al. 2008) (fig. 4).

Adaptive Evolution of P-II SVMPs
Tests of adaptive molecular evolution revealed that loss of SVMP domain structures (cysteine-rich domain from P-IIIs and subsequently the DIS domain from P-IIs) precedes significant increases in positive selection on the novel molecular scaffold (table 1). Notably, this truncation of SVMP toxins appears to facilitate the adaptive evolution of surface-exposed amino acid residues (fig. 5) likely responsible for protein–protein interaction and function. Our results support the hypothesis that the P-II molecular scaffold has evolved through the loss of the cysteine-rich domain followed by accelerated mutation of the DIS-like domain (fig. 5C) (Moura-da-Silva et al. 1996; Calvete et al. 2005). This accelerated evolution, causing the formation of the DIS domain, has also facilitated protein neofunctionalization. SVMP disintegrins have developed novel, potent, platelet aggregation inhibitory activities, due to the evolution and surface exposure of integrin-binding motifs that are absent in the disintegrin-like domains of P-IIIs (Calvete et al. 2005; Fox and Serrano 2005, 2008). Notably, the more basal unprocessed P-IIb subclass exhibits considerably lower platelet aggregation inhibitory activities than their derived proteolytically processed P-IIa counterparts (Fox and Serrano 2005), providing strong evidence that optimization of this neofunctionalization has occurred (fig. 4). However, positive selection has also acted upon numerous residues present in the MP domain (fig. 5B), implying

Table 1. Statistics From Tests of Adaptive Molecular Evolution on Domain Loss Foreground Branches.

Data Set                                            2 × Log Likelihood   Significance (P)   Positive Sites
Evolutionary loss of DIS domain
  P-II and P-I, branch 1                            15.66                <0.001             MP domain: 2
  P-II and P-I, branch 2                            8.68                 0.003              MP domain: 3*
  P-II and P-I, branch 3                            <0.01                ns                 -
  P-II and P-I, branch 4                            0.98                 ns                 -
  P-II and P-I, branch 5                            79.90                <0.001             MP domain: 5
  P-II and P-I, branch 6                            58.13                <0.001             MP domain: 9
Evolutionary loss of CYS domain
  P-III and P-II, branch 7 (DIS domain excluded)    130.03               <0.001             MP domain: 31
  P-III and P-II, branch 7 (DIS domain included)    198.57               <0.001             MP domain: 38; DIS domain: 15

NOTE. Log likelihood is the difference in test statistic between the alternative and null model, and significance is the significance of the model when compared with its neutral partner under the χ2 distribution. Positive sites indicate the number of sites identified by the BEB method (Yang et al. 2005) as under positive selection (P < 0.05). MP: metalloproteinase domain, DIS: disintegrin domain, CYS: cysteine-rich domain, ns: not significant. *Only one of the three residues was mapped to the macromolecular structure (cf. fig. 5); the remaining two sites were identified in a region that extends past the C-terminus of the structure.
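The relationship between the likelihood ratio statistics and the significance values in table 1 can be checked with a short calculation. The sketch below assumes one degree of freedom for the χ2 comparison, which is conventional for branch-site tests but is not stated in this section; the function name is illustrative. For one degree of freedom the upper-tail probability has the closed form erfc(sqrt(x / 2)), so only the standard library is needed.

```python
import math


def lrt_p_value(two_delta_lnl):
    """Upper-tail chi-square probability (1 degree of freedom, an assumption)
    for a likelihood ratio statistic 2 x (lnL_alternative - lnL_null)."""
    if two_delta_lnl <= 0:
        return 1.0
    return math.erfc(math.sqrt(two_delta_lnl / 2.0))


# Statistics from table 1 (branch 3 is omitted because only "<0.01" is reported).
table1 = {
    "branch 1": 15.66,
    "branch 2": 8.68,
    "branch 4": 0.98,
    "branch 5": 79.90,
    "branch 6": 58.13,
}

for branch, statistic in table1.items():
    p = lrt_p_value(statistic)
    verdict = "significant at 0.01" if p < 0.01 else "ns"
    print(f"{branch}: 2 x dlnL = {statistic:6.2f}, P = {p:.3g} ({verdict})")
```

Under this assumption the four P-I branches reported as significant fall below 0.01 and branch 4 does not, in line with the table.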

FIG. 5. The effect of positive selection on SVMPs following the loss of domains. Codons identified by sequence analysis as under positive selection (colored) are mapped to 3D macromolecular structures at 0° and rotated 180°. (A) codons identified in the metalloproteinase domain of P-I foreground branches (cf. fig. 4): green, branch 1; blue, branch 2; cyan, branch 5; and magenta, branch 6. (B) codons identified in the metalloproteinase domain and (C) the DIS domain of the P-II foreground branch (branch 7) are displayed in red.


that selective pressures are not exclusively associated with the neofunctionalization of the DIS domain. These results are also consistent with observations from proteolytically processed P-II and P-III SVMPs (cf. fig. 1); the MP domain of P-IIIs is thought to be rapidly degraded following processing, whereas in P-IIs, the domain is stable and functionally active (Modesto et al. 2005; Fox and Serrano 2008). Considering basal P-II SVMPs are not proteolytically processed (fig. 4), we propose that the P-II precursor (i.e., a P-III SVMP) was incapable of proteolytically processing the newly evolved DIS domain, thereby also permitting the MP domain to selectively neofunctionalize through retention. Subsequently, mutations likely facilitated the proteolytic processing of the DIS domain to form two functionally distinct domain products (Modesto et al. 2005), both of which have neofunctionalized or optimized their functional potency compared with domain homologues present in ancestral P-III SVMPs (Fox and Serrano 2005, 2008; Modesto et al. 2005).

Adaptive Evolution of P-I SVMPs
Positive selection has also driven the evolution of the MP domain in P-I structures, following the convergent loss of DIS domains (table 1). Of the 19 sites identified by tests of adaptive molecular evolution, 17 occur in regions of the MP domain adjacent (up to four codons separation) to those identified as being under selection in the P-II structure (cf. fig. 5). Notably, only two of these residues are identical, implying that evolution of the P-I MP domain is predominantly occurring through the mutation of residues that have not previously been the subject of selection. As with the P-II foreground branch, these identified residues exhibit significantly higher ASA scores than would be expected by chance (P = 0.021), suggesting positive selection is principally acting upon the surface-exposed regions of the macromolecular structure (fig. 5A). P-I SVMPs predominantly exhibit hemorrhagic and/or fibrinolytic activities, yet they are widely regarded to be the least hemorrhagic of the SVMP classes (Fox and Serrano 2005, 2008, 2009). However, fibrinolytic functionality is seemingly associated with P-I SVMPs, especially when considering that a number of the independent P-I clades determined here contain SVMPs that have been characterized as exhibiting fibrin(ogen)olytic activity (Baker et al. 1995; Rodrigues et al. 2000; Bernardes et al. 2008; Jia et al. 2009). We therefore hypothesize that evolution is convergently driving fibrinolytic neofunctionalization of the P-I class through the truncation of the P-II structure, perhaps as a response to dietary selection

FIG. 6. Summary schematic detailing the evolutionary history of the SVMP toxin family including identified domain loss events. SP: signal peptide, P: predomain, Pro: prodomain, Proteinase: metalloproteinase domain, DIS: disintegrin domain, DIS-like: disintegrin-like domain, and CYS: cysteine-rich domain. Short coding disintegrins, pre- and prodomain truncations, and partial disintegrin loss have previously been described (cf. Okuda et al. 2002; Wagstaff et al. 2006; Fry et al. 2008). Proteinase loss was observed in sequence data generated for this study (GenBank: GU012129).


pressures for different prey types (cf. Daltry et al. 1996; Barlow et al. 2009; Gibbs and Mackessy 2009). Furthermore, evolutionary pressures appear to be reducing the severity of SVMP-induced hemorrhagic activity (from P-III to P-I) in favor of altering domain structures to diversify function. However, it is notable that P-II and P-III class SVMPs have been retained in every genus where the evolution of the P-I molecular scaffold has occurred (cf. figs. 3 and 4). We therefore suggest that, through gene duplication, the diversification of multiple SVMP class isoforms exhibiting structural and functional diversity presents an evolutionary advantage for prey capture over the "optimization" of a single SVMP structure.
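The adjacency comparison described in this section for P-I and P-II sites (sites within four codons of one another, and sites that are identical) reduces to a simple positional check. The sketch below shows one way to tally it; the function names and the codon positions are hypothetical, since the actual positions are listed only in supplementary table 5.

```python
def near_sites(query, reference, max_sep=4):
    """Return query codon positions lying within max_sep codons of any
    reference position (a separation of zero, i.e. identity, counts)."""
    ref = sorted(set(reference))
    return [q for q in query if any(abs(q - r) <= max_sep for r in ref)]


def shared_sites(query, reference):
    """Return codon positions present in both site lists."""
    return sorted(set(query) & set(reference))


# HYPOTHETICAL alignment positions, for illustration only.
hypothetical_p1_sites = [12, 47, 88, 130, 175]
hypothetical_p2_sites = [10, 47, 95, 131, 160]

adjacent = near_sites(hypothetical_p1_sites, hypothetical_p2_sites)
identical = shared_sites(hypothetical_p1_sites, hypothetical_p2_sites)
print(f"{len(adjacent)} adjacent sites, of which {len(identical)} identical")
```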

SVMPs and Models of Gene Evolution
The evolutionary history of the SVMPs provides a valuable insight into how complex gene families evolve by gene duplication. We have identified how alterations in domain structure (through the loss of toxin domains) underlie the accelerated evolution of surface-exposed residues that likely result in the neofunctionalization of toxins. Current models of gene neofunctionalization propose that the evolution of secondary functions occurs prior to gene duplication, permitting the maintenance of multiple genes through selection for divergent functions (Hughes 1994; Bergthorsson et al. 2007). The SVMPs provide support for these models, particularly when considering the evolution of P-I structures from P-IIs: proteolytically processed P-IIs are capable of forming multiple domain products (MP and DIS domains) with different activities (Modesto et al. 2005). Assuming that the DIS domain (which has evolved from a P-III precursor DIS-like domain) encodes a primary function and the MP domain a secondary function, selective pressures acting to promote the secondary function are likely to induce gene amplification, and, ultimately, facilitate the loss of the primary function in the duplicate gene through loss of the DIS domain.

Conclusion
By exploring the evolutionary history of complex multigene families, insights into the processes that govern the functional evolution of genes can be obtained. The SVMPs are an extreme example of multigene protein family evolution: their history is punctuated by posttranslational modifications, accelerated evolution by positive selection, and, perhaps most importantly, multiple episodes of domain alteration, with the domain losses described here and elsewhere (Okuda et al. 2002; Wagstaff et al. 2006; Fry et al. 2008) resulting in frequent alterations to the molecular scaffold (fig. 6). The SVMPs therefore provide a model system to investigate the influencing role of domain loss on gene family evolution. Here, we have demonstrated how the neofunctionalization of duplicate genes encoded by complex multigene families can be facilitated by domain loss. The ensuing generation of novel molecular scaffolds likely presents an evolutionary advantage, with subsequent accelerated evolution of duplicate genes by positive selection permitting the retention of multiple genes capable of encoding functionally distinct proteins.

Supplementary Material
Supplementary tables 1–6 and figures 1 and 2 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).

Acknowledgments
We thank the following persons and institutions who helped us during the course of this study: P. Rowley (Liverpool School of Tropical Medicine); D. Egan and P. Vercammen (Breeding Centre for Endangered Arabian Wildlife, United Arab Emirates); A. Hedley, M. Blaxter (Natural Environmental Research Council [NERC] Molecular Genetics Facility, University of Edinburgh); and T. Booth, B. Tiwari, and J. Soares (NERC Environmental Bioinformatics Centre, Oxford). This work was funded by Research Studentship NER/S/A/2006/14086 from the NERC to N.R.C., access to the NERC Molecular Genetics Facility at the University of Edinburgh (ref MGF 150) to W.W., the Leverhulme Trust (Grant F/00 174/I) to W.W. and R.A.H., and the Biotechnology and Biological Sciences Research Council to R.A.H. and S.C.W. (BB/F012675/1).

References
Anderson TW, Darling DA. 1952. Asymptotic theory of certain "goodness of fit" criteria based on stochastic processes. Ann Math Stat. 23:193–212.

Baker BJ, Wongvibulsin S, Nyborg J, Tu AT. 1995. Nucleotide-sequence encoding the snake-venom fibrinolytic enzyme atroxase obtained from a Crotalus atrox venom gland cDNA library. Arch Biochem Biophys. 317:357–364.

Barlow A, Pook CE, Harrison RA, Wuster W. 2009. Co-evolution of diet and prey-specific venom activity supports the role of selection in snake venom evolution. Proc R Soc B Biol Sci. 276:2443–2449.

Bartlett MS. 1937. Properties of sufficiency and statistical tests. Proc R Soc A (Lond). 160:268–282.

Bazaa A, Marrakchi N, El Ayeh M, Sanz L, Calvete JJ. 2005. Snake venomics: comparative analysis of the venom proteomes of the Tunisian snakes Cerastes cerastes, Cerastes vipera and Macrovipera lebetina. Proteomics 5:4223–4235.

Bergthorsson U, Andersson DI, Roth JR. 2007. Ohno’s dilemma: evolution of new genes under continuous selection. Proc Natl Acad Sci U S A. 104:17004–17009.

Bernardes CP, Santos-Filho NA, Costa TR, et al. (13 co-authors). 2008. Isolation and structural characterization of a new fibrin(ogen)olytic metalloproteinase from Bothrops moojeni snake venom. Toxicon 51:574–584.

Bjarnason JB, Fox JW. 1994. Hemorrhagic metalloproteinases from snake venoms. Pharmacol Ther. 62:325–372.

Boni MF, Posada D, Feldman MW. 2007. An exact nonparametric method for inferring mosaic structure in sequence triplets. Genetics 176:1035–1047.

Calvete JJ, Marcinkiewicz C, Monleon D, Esteve V, Celda B, Juarez P, Sanz L. 2005. Snake venom disintegrins: evolution of structure and function. Toxicon 45:1063–1074.


Calvete JJ, Moreno-Murciano MP, Theakston DG, Kisiel DG, Marcinkiewicz C. 2003. Snake venom disintegrins: novel dimeric disintegrins and structural diversification by disulphide bond engineering. Biochem J. 372:725–734.

Casewell NR, Harrison RA, Wuster W, Wagstaff SC. 2009. Comparative venom gland transcriptome surveys of the saw-scaled vipers (Viperidae: Echis) reveal substantial intra-family gene diversity and novel venom transcripts. BMC Genomics. 10:564.

Casewell NR, Wagstaff SC, Harrison RA, Wuster W. 2011. Gene tree parsimony of multi-locus snake venom protein families reveals species tree conflict as a result of multiple parallel gene loss. Mol Biol Evol. 28(3):91–110.

Castoe TC, Parkinson CL. 2006. Bayesian mixed models and the phylogeny of pitvipers (Viperidae: Serpentes). Mol Phylogenet Evol. 39:91–110.

Castoe TC, Sasa M, Parkinson CL. 2005. Modelling nucleotide evolution at the mesoscale: the phylogeny of the Neotropical pitvipers of the Porthidium group (Viperidae: Crotalinae). Mol Phylogenet Evol. 37:881–898.

Chen R-Q, Jin Y, Wu J-B, Zhou X-D, Lu Q-M, Wang W-Y, Xiong Y-L. 2003. A new protein structure of P-II class snake venom metalloproteinases: it comprises metalloproteinase and disintegrin domains. Biochem Biophys Res Commun. 310:182–187.

Daltry JC, Wuster W, Thorpe RS. 1996. Diet and snake venom evolution. Nature 379:537–540.

Doley R, Kini RM. 2009. Protein complexes in snake venom. Cell Mol Life Sci. 66:2851–2871.

Drummond AJ, Rambaut A. 2007. BEAST: bayesian evolutionary analysis by sampling trees. BMC Evol Biol. 7:214.

Ducancel F, Matre V, Dupont C, Lajeunesse E, Wollberg Z, Bdolah A, Kochva E, Boulain JC, Menez A. 1993. Cloning and sequence-analysis of cDNAs encoding precursors of sarafotoxins: evidence for an unusual rosary-type organization. J Biol Chem. 268:3052–3055.

Faraggi E, Xue B, Zhou Y. 2009. Improving the prediction accuracy of residue solvent accessibility and real-value backbone torsion angles of proteins by guided-learning through a two-layer neural network. Proteins 74:847–856.

Fox JW, Serrano SMT. 2005. Structural considerations of the snake venom metalloproteinases, key members of the M12 reprolysin family of metalloproteinases. Toxicon 45:969–985.

Fox JW, Serrano SMT. 2008. Insights into and speculations about snake venom metalloproteinase (SVMP) synthesis, folding and disulfide bond formation and their contribution to venom complexity. FEBS J. 275:3016–3030.

Fox JW, Serrano SMT. 2009. Timeline of key events in snake venom metalloproteinase research. J Proteomics. 72:200–209.

Fry BG, Roelants K, Winter K, et al. (11 co-authors). 2010. Novel venom proteins produced by differential domain-expression strategies in beaded lizards and gila monsters (genus Heloderma). Mol Biol Evol. 27:395–407.

Fry BG, Scheib H, van der Weerd L, Young B, McNaughtan J, Ramjan SFR, Vidal N, Poelmann RE, Norman JA. 2008. Evolution of an arsenal. Structural and functional diversification of the venom system in the advanced snakes (Caenophidia). Mol Cell Proteomics. 7:215–246.

Fry BG, Wuster W. 2004. Assembling an arsenal: origin and evolution of the snake venom proteome inferred from phylogenetic analysis of toxin sequences. Mol Biol Evol. 21:870–883.

Fry BG, Wuster W, Kini RM, Brusic V, Khan A, Venkataraman D, Rooney AP. 2003. Molecular evolution and phylogeny of the elapid snake venom three-finger toxins. J Mol Evol. 57:110–129.

Gibbs HL, Mackessy SP. 2009. Functional basis of a molecular adaptation: prey-specific toxic effects of venom from Sistrurus rattlesnakes. Toxicon 53:672–679.

Glassey B, Civetta A. 2004. Positive selection at reproductive ADAM genes with potential intercellular binding activity. Mol Biol Evol. 21:851–859.

Guo X-X, Zeng L, Lee W-H, Zhang Y, Jin Y. 2007. Isolation and cloning of a metalloproteinase from king cobra snake venom. Toxicon 49:954–965.

Heath L, van der Walt E, Varsani A, Martin DP. 2006. Recombination patterns in aphthoviruses mirror those found in picornaviruses. J Virol. 80:11827–11832.

Hite LA, Jia L-G, Bjarnason JB, Fox JW. 1994. cDNA sequences for four snake venom metalloproteinases: structure, classification, and their relationship to mammalian reproductive proteins. Arch Biochem Biophys. 308:182–191.

Hughes AL. 1994. The evolution of functionally novel proteins after gene duplication. Proc R Soc B Biol Sci. 256:119–124.

Jia Y, Lucena S, Cantu E, Sanchez EE, Perez JC. 2009. cDNA cloning, expression and fibrin(ogen)olytic activity of two low-molecular weight snake venom metalloproteinases. Toxicon 54:233–243.

Juarez P, Comas I, Gonzalez-Candelas F, Calvete JJ. 2008. Evolution of snake venom disintegrins by positive Darwinian selection. Mol Biol Evol. 25:2391–2407.

Keane TM, Creevey CJ, Pentony MM, Naughton TJ, McInerney JO. 2006. Assessment of methods for amino acid matrix selection and their use on empirical data shows that ad hoc assumptions for choice of matrix are not justified. BMC Evol Biol. 6:29.

Kimura M, Ohta T. 1974. On some principles governing molecular evolution. Proc Natl Acad Sci U S A. 71:2848–2852.

Kini RM, Chan YM. 1999. Accelerated evolution and molecular surface of venom phospholipase A2 enzymes. J Mol Evol. 48:125–132.

Kordis D, Gubensek F. 2000. Adaptive evolution of animal toxin multigene families. Gene 261:43–52.

Kumar S, Skjæveland A, Orr RJS, Enger P, Ruden T, Mevik B-H, Burki F, Botnen A, Shalchian-Tabrizi K. 2009. AIR: a batch-oriented web program package for construction of supermatrices ready for phylogenomic analyses. BMC Bioinformatics. 10:357.

Lynch M, Conery JS. 2000. The evolutionary fate and consequences of duplicate genes. Science 290:1151–1155.

Lynch VJ. 2007. Inventing an arsenal: adaptive evolution and neofunctionalization of snake venom phospholipase A2 genes. BMC Evol Biol. 7:2.

Martin D, Rybicki E. 2000. RDP: detection of recombination amongst aligned sequences. Bioinformatics 16:562–563.

Modesto JCA, Junqueira-de-Azevedo ILM, Neves-Ferreira AGC, Fritzen M, Oliva MLV, Ho PL, Perales J, Chudzinski-Tavassi AM. 2005. Insularinase A, a prothrombin activator from Bothrops insularis venom, is a metalloproteinase derived from a gene encoding protease and disintegrin domains. J Biol Chem. 386:589–600.

Moura-da-Silva AM, Theakston RDG, Crampton JM. 1996. Evolution of disintegrin cysteine-rich and mammalian matrix-degrading metalloproteinases: gene duplication and divergence of a common ancestor rather than convergent evolution. J Mol Evol. 43:263–269.

Nei M, Gu X, Sitnikova T. 1997. Evolution by the birth-and-death process in multigene families of the vertebrate immune system. Proc Natl Acad Sci U S A. 94:7799–7806.

Nylander JAA. 2004. MrModeltest v2. Program distributed by the author. Uppsala (Sweden): Evolutionary Biology Centre, Uppsala University.

Ohno S. 1970. Evolution by gene duplication. New York: Springer.

Ohno S. 1973. Ancient linkage groups and frozen accidents. Nature 244:259–262.

Ohta T. 1991. Multigene families and the evolution of complexity. J Mol Evol. 33:34–41.


Okuda D, Koike H, Morita T. 2002. A new gene structure of the disintegrin family: a subunit of dimeric disintegrin has a short coding region. Biochemistry 41:14248–14254.

Padidam M, Sawyer S, Fauquet CM. 1999. Possible emergence of new geminiviruses by frequent recombination. Virology 265:218–225.

Parkinson J, Anthony A, Wasmuth J, Schmid R, Hedley A, Blaxter M. 2004. PartiGene: constructing partial genomes. Bioinformatics 20:1398–1404.

Parkinson J, Guiliano DB, Blaxter M. 2002. Making sense of EST sequences by CLOBBing them. BMC Bioinformatics. 3:31.

Pook CE, Joger U, Stumpel N, Wuster W. 2009. When continents collide: phylogeny, historical biogeography and systematics of the medically important viper genus Echis (Squamata: Serpentes: Viperidae). Mol Phylogenet Evol. 53:792–807.

Posada D, Buckley TR. 2004. Model selection and model averaging in phylogenetics: advantages of akaike information criterion and bayesian approaches over likelihood ratio tests. Syst Biol. 53:793–808.

Posada D, Crandall KA. 2001. Evaluation of methods for detecting recombination from DNA sequences: computer simulations. Proc Natl Acad Sci U S A. 98:13757–13762.

Richards TA, Cavalier-Smith T. 2005. Myosin domain evolution and the primary divergence of eukaryotes. Nature 436:1113–1118.

Rodrigues VM, Soares AM, Guerra-Sa R, Rodrigues V, Fontes MRM, Giglio JR. 2000. Structural and functional characterisation of Neuwiedase, a nonhemorrhagic fibrin(ogen)olytic metalloprotease from Bothrops neuwiedi snake venom. Arch Biochem Biophys. 381:213–224.

Ronquist F, Huelsenbeck JP. 2003. MrBayes 3: bayesian phylogenetic inference under mixed models. Bioinformatics 19:1572–1574.

Ruxton GD. 2006. The unequal variance t-test is an underused alternative to Student’s t-test and the Mann–Whitney U test. Behav Ecol. 17:688–690.

Sanz L, Escolano J, Ferretti M, Biscoglio MJ, Rivera E, Crescenti EJ, Angulo Y, Lomonte B, Gutierrez JM, Calvete JJ. 2008. Snake venomics of the South and Central American Bushmasters. Comparison of the toxin composition of Lachesis muta gathered from proteomic versus transcriptomic analysis. J Proteomics. 71:46–60.

Smith JM. 1992. Analyzing the mosaic structure of genes. J Mol Evol. 34:126–129.

Soares MR, Oliveria-Carvalho AL, Wermelinger LS, Zingali RB, Ho PL, Junqueira-de-Azevedo IDM, Diniz MRV. 2005. Identification of novel bradykinin-potentiating peptides and C-type natriuretic peptide from Lachesis muta venom. Toxicon 46:31–38.

Tamura K, Dudley J, Nei M, Kumar S. 2007. MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Mol Biol Evol. 24(8):1596–1599.

Thompson JD, Higgins DG, Gibson TJ. 1994. Clustal W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific-gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673–4680.

Tsai I-H, Wang Y-M, Chiang T-Y, Chen Y-L, Huang R-J. 2000. Purification, cloning and sequence analysis for pro-metalloproteinase-disintegrin variants from Deinagkistrodon acutus venom and subclassification of the small venom metalloproteinases. Eur J Biochem. 267:1359–1367.

Wagstaff SC, Favreau P, Cheneval O, Laing GD, Wilkinson MC, Miller RL, Stocklin R, Harrison RA. 2008. Molecular characterisation of endogenous snake venom metalloproteinase inhibitors. Biochem Biophys Res Commun. 365:650–656.

Wagstaff SC, Harrison RA. 2006. Venom gland EST analysis of the saw-scaled viper, Echis ocellatus, reveals novel α9β1 integrin-binding motifs in venom metalloproteinases and a new group of putative toxins, renin-like aspartic proteases. Gene 377:21–32.

Wagstaff SC, Laing GD, Theakston RDG, Papaspyridis C, Harrison RA. 2006. Bioinformatics and multiepitope DNA immunization to design rational snake antivenom. PLoS Med. 3(6):e184.

Wagstaff SC, Sanz L, Juarez P, Harrison RA, Calvete JJ. 2009. Combined snake venomics and venom gland transcriptomic analysis of the ocellated carpet viper, Echis ocellatus. J Proteomics. 71:609–623.

Wuster W, Peppin L, Pook CE, Walker DE. 2008. A nesting of vipers: phylogeny and historical biogeography of the Viperidae (Squamata: Serpentes). Mol Phylogenet Evol. 49:445–459.

Yang Z. 2007. PAML4: Phylogenetic Analysis by Maximum Likelihood. Mol Biol Evol. 24:1586–1591.

Yang Z, Wong WSW, Nielsen R. 2005. Bayes empirical Bayes inference of amino acid sites under positive selection. Mol Biol Evol. 22:1107–1118.

Zhang J, Nielsen R, Yang Z. 2005. Evaluation of an improved branch-site likelihood method for detecting positive selection at the molecular level. Mol Biol Evol. 22:2472–2479.

Zhang J, Rosenberg HF, Nei M. 1998. Positive Darwinian selection after gene duplication in primate ribonuclease genes. Proc Natl Acad Sci U S A. 95:3708–3713.
