The phylogenetics of the global population of potato virus Y and … · 2019. 4. 30. · The...

12
The phylogenetics of the global population of potato virus Y and its necrogenic recombinants Adrian J. Gibbs 1, *, Kazusato Ohshima 2 , Ryosuke Yasaka 2,† , Musa Mohammadi 3,‡ , Mark J. Gibbs 4 , and Roger A. C. Jones 5,6 1 Emeritus Faculty, Australian National University, Canberra, ACT 2601, Australia, 2 Laboratory of Plant Virology, Department of Applied Biological Sciences, Faculty of Agriculture, Saga University, 1-banchi, Honjo-machi, Saga 840-8502, Japan, 3 Department of Plant Protection, Vali-e-asr University of Rafsanjan, Rafsanjan, Iran, 4 79 Carruthers St, Curtin, ACT 2605, Australia, 5 Department of Agriculture and Food Western Australia, Institute of Agriculture, Faculty of Science, The University of Western Australia, 35 Stirling Highway, Crawley, WA 6009, Australia and 6 3 Baron-Hay Court, South Perth, WA 6151, Australia *Corresponding author: E-mail: [email protected] http://orcid.org/0000-0002-2240-1767 http://orcid.org/0000-0003-0725-417X Abstract Potato virus Y (PVY) is a major pathogen of potatoes and other solanaceous crops worldwide. It is most closely related to potyviruses first or only found in the Americas, and it almost certainly originated in the Andes, where its hosts were domes- ticated. We have inferred the phylogeny of the published genomic sequences of 240 PVY isolates collected since 1938 world- wide, but not the Andes. All fall into five groupings, which mostly, but not exclusively, correspond with groupings already devised using biological and taxonomic data. Only 42 percent of the sequences are not recombinant, and all these fall into one or other of three phylogroups; the previously named C (common), O (ordinary), and N (necrotic) groups. There are also two other distinct groups of isolates all of which are recombinant; the R-1 isolates have N (5 0 terminal minor) and O (major) parents, and the R-2 isolates have R-1 (major) and N (3 0 terminal minor) parents. Many isolates also have additional minor intra- and inter-group recombinant genomic regions. The complex interrelationships between the genomes were resolved by progressively identifying and removing recombinants using partitioned sequences of synonymous codons. Least squared dating and BEAST analyses of two datasets of gene sequences from non-recombinant heterochronously-sampled isolates (seventy-three non-recombinant major ORFs and 166 partial ORFs) found the 95% confidence intervals of the TMRCA esti- mates overlap around 1,000 CE (Common Era; AD). We attempted to identify the most accurate datings by comparing the es- timated phylogenetic dates with historical events in the worldwide adoption of potato and other PVY hosts as crops, but found that more evidence from gene sequences of non-potato isolates, especially from South America, was required. Key words: Potato virus Y; phylogenetics; recombination; least squares dating; probabilistic dating. 1. Introduction Potato virus Y (PVY) is the type species of the genus Potyvirus, one of the largest, most widespread, and economically impor- tant genera of plant viruses. It is a major world pathogen of potatoes, and other solanaceous crops, such as tobacco, tomato, and pepper, and is probably the most damaging virus of the world’s potato crop (Loebenstein et al. 2001; Stevenson et al. 2001; Kerlan 2006; Kerlan and Moury 2008; Gray et al. 2010; V C The Author 2017. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/ licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected] 1 Virus Evolution, 2017, 3(1): vex002 doi: 10.1093/ve/vex002 Research article Downloaded from https://academic.oup.com/ve/article-abstract/3/1/vex002/3060989 by Music Library, School of Music, National Institute of the Arts, Australian National University user on 15 December 2017

Transcript of The phylogenetics of the global population of potato virus Y and … · 2019. 4. 30. · The...

Page 1: The phylogenetics of the global population of potato virus Y and … · 2019. 4. 30. · The phylogenetics of the global population of potato virus Y and its necrogenic recombinants

The phylogenetics of the global population of potato

virus Y and its necrogenic recombinantsAdrian J Gibbs1 Kazusato Ohshima2 Ryosuke Yasaka2daggerMusa Mohammadi3Dagger Mark J Gibbs4 and Roger A C Jones56

1Emeritus Faculty Australian National University Canberra ACT 2601 Australia 2Laboratory of PlantVirology Department of Applied Biological Sciences Faculty of Agriculture Saga University 1-banchiHonjo-machi Saga 840-8502 Japan 3Department of Plant Protection Vali-e-asr University of RafsanjanRafsanjan Iran 479 Carruthers St Curtin ACT 2605 Australia 5Department of Agriculture and Food WesternAustralia Institute of Agriculture Faculty of Science The University of Western Australia 35 StirlingHighway Crawley WA 6009 Australia and 63 Baron-Hay Court South Perth WA 6151 Australia

Corresponding author E-mail adrian_j_gibbshotmailcomdaggerhttporcidorg0000-0002-2240-1767Daggerhttporcidorg0000-0003-0725-417X

Abstract

Potato virus Y (PVY) is a major pathogen of potatoes and other solanaceous crops worldwide It is most closely related topotyviruses first or only found in the Americas and it almost certainly originated in the Andes where its hosts were domes-ticated We have inferred the phylogeny of the published genomic sequences of 240 PVY isolates collected since 1938 world-wide but not the Andes All fall into five groupings which mostly but not exclusively correspond with groupings alreadydevised using biological and taxonomic data Only 42 percent of the sequences are not recombinant and all these fall intoone or other of three phylogroups the previously named C (common) O (ordinary) and N (necrotic) groups There are alsotwo other distinct groups of isolates all of which are recombinant the R-1 isolates have N (50 terminal minor) and O (major)parents and the R-2 isolates have R-1 (major) and N (30 terminal minor) parents Many isolates also have additional minorintra- and inter-group recombinant genomic regions The complex interrelationships between the genomes were resolvedby progressively identifying and removing recombinants using partitioned sequences of synonymous codons Least squareddating and BEAST analyses of two datasets of gene sequences from non-recombinant heterochronously-sampled isolates(seventy-three non-recombinant major ORFs and 166 partial ORFs) found the 95 confidence intervals of the TMRCA esti-mates overlap around 1000 CE (Common Era AD) We attempted to identify the most accurate datings by comparing the es-timated phylogenetic dates with historical events in the worldwide adoption of potato and other PVY hosts as crops butfound that more evidence from gene sequences of non-potato isolates especially from South America was required

Key words Potato virus Y phylogenetics recombination least squares dating probabilistic dating

1 Introduction

Potato virus Y (PVY) is the type species of the genus Potyvirusone of the largest most widespread and economically impor-tant genera of plant viruses It is a major world pathogen of

potatoes and other solanaceous crops such as tobacco tomatoand pepper and is probably the most damaging virus of theworldrsquos potato crop (Loebenstein et al 2001 Stevenson et al2001 Kerlan 2006 Kerlan and Moury 2008 Gray et al 2010

VC The Author 2017 Published by Oxford University PressThis is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (httpcreativecommonsorglicensesby-nc40) which permits non-commercial re-use distribution and reproduction in any medium provided the original work is properly citedFor commercial re-use please contact journalspermissionsoupcom

1

Virus Evolution 2017 3(1) vex002

doi 101093vevex002Research article

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

Ogawa et al 2012 Karasev and Gray 2013 Jones 2014) PVY wasfirst described by Smith (1931) its C strain by Bawden (1936)who considered it to be a distinct but related virus (potato virusC PVC) and its N strain by Nobrega and Silberschmidt (1944)PVC was subsequently recognized as the common strain of PVYby Bawden and Sheffield (1944) and PVYrsquos original or ldquoordinaryrdquostrain then became its O strain The O and C strains from potatowere distinguished by their reactions with the potatoes Ny andNc hypersensitivity genes and the third strain the N straincaused systemic veinal necrosis in tobacco The biologicalstrains named O C and N were later adopted (De Bokx and vander Want 1987 Jones 1990) and were mostly congruent withtheir phylogenetics (Boonham et al 2002 Moury et al 2002)Isolates from pepper and tomato mostly belong to the C phylo-genetic group (Moury 2010) whereas those from tobacco weremostly placed in the N and O phylogenetic groups (Tian et al2011) However as more genomes have been sequenced andmore potato hypersensitivity genes found (Singh et al 2008Moury 2010 Karasev et al 2011 Karasev and Gray 2013 Jones2014 Rowley Gray and Karasev 2015 Kehoe and Jones 2016)continued use of the same names for biological and phyloge-netic groups has proved to be increasingly confusing due to thelack of complete coincidence between them A new strain no-menclature system for sub-dividing the phylogenetic groups us-ing Latinised numerals has therefore been proposed (Jones2014 Jones and Kehoe 2016 Kehoe and Jones 2016) Their group-ings correlate with the five broad lsquophylogroupsrsquo we discuss inthis article the traditional C O and N groups plus two groups ofrecombinants R1 and R2

The genomic sequences of more than 200 isolates of PVY arenow publicly available in the international databases Theycome from all continents except Antarctica but none have beencollected from crops in the potatorsquos main center of domestica-tion in the Andean region of Bolivia and Peru where nine potatospecies are cultivated in contrast to only one (Solanum tubero-sum) outside the Andean region (Hawkes 1978 Brown 1993Brown and Henfling 2014) The phylogenetic history of PVY iscomplex It has undoubtedly involved spread within and be-tween wild and domesticated potato species wild ancestors oftomato pepper tobacco and other solanceous crops in Southor Central America and also within plantings of potato andother cultivated solanaceous plants throughout the world(Klinkowski and Schmelzer 1960 Silberschmidt 1960 Brucher1969 Jones 1981 Spetz et al 2003) It has also involved recombi-nation between lineages (Glais Tribodet and Kerlan 2002Moury et al 2002 Lorenzen et al 2006 Schubert Fomitchevaand Sztangret-Wisniewska 2007 Ogawa et al 2008 Singh et al2008 Hu et al 2009a b Visser and Bellstedt 2009 Moury andSimon 2011 Cuevas et al 2012 Karasev and Gray 2013) Many ofthese recombinants cause tuber necrosis (Beczner et al 1984Boonham et al 2002 Karasev and Gray 2013)

Here we report an analysis updating the phylogenetic his-tory of PVY It is based on the genome sequences available inthe international databases in January 2016 They were fromisolates collected from around the world but none were fromthe potato croprsquos domestication center in the Andes wheregreater diversity would be expected The majority of the isolateswere from potatoes but several also came from pepper tomatoor tobacco As recombination confounds phylogenetic analysisour strategy was to separate the genomic sequences that aremostly non-recombinant (n-rec) from those that had major re-combinant (rec) regions To simplify the separation of rec fromn-rec sequences the PVY ORF sequences were partitioned intoalignments of codons that only varied synonymously (syn

codons) and those that had also varied non-synonymously (n-syn codons) and these were then analyzed separately Thisstrategy identified two sets of n-rec sequences one of full-length ORFs and the other partial ORFs for dating the PVY phy-logeny Our dating analyses like those of Visser Bellstedt andPirie (2012) find that the PVY population outside the Andean re-gion mostly diverged over the past few centuries This was afterthe Spanish colonization of South America following the defeatof the Inca empire in 1532 and after the introduction of onespecies of potato (S tuberosum) to Europe in the second half ofthe 16th century and later to other continents (Salaman andHawkes 1949 Salaman 1954 Brown 1993 Hawkes andFransisco-Ortega 1993 Brown and Henfling 2014) We foundlike Visser Bellstedt and Pirie (2012) that the damaging rec var-iants emerged and spread only during the past century

2 Methods and data sources

Two hundred and forty complete genomic sequences of PVY(Supplementary Table S1) were obtained from Genbank inJanuary 2016 Each was edited using BioEdit (Hall 1999) to ex-tract its main ORF These were aligned using the encodedamino acids as guide by the TranslatorX online server (AbascalZardoya and Telford 2010 httptranslatorxcouk) with itsMAFFT option (Katoh and Standley 2013) to give an alignmentof 9201 nucleotides (available at http1925598146_resourcese-textsblobs240PVYORFszip) BlastN and BlastP (Altschulet al 1990) online facilities of Genbank were used with the Chile3 sequence and representative PVY sequences from the N andC phylogroups to search for related sequences A simple pair-wise sliding-window method DnDscan (Gibbs et al 2006) avail-able at http1925598146_resourcese-textsblobsDnDscan1ZIP was used to identify codons in the alignments that hadonly evolved synonymously or had also evolved non-synonymously These were partitioned using SEQSPLIT v10 aFortran program written by the late John Armstrong (availableat http1925598146_resourcese-textsblobsSeqSplitZIP)Sequences were tested for the presence of phylogenetic anoma-lies using the full suite of options in RDP4 (Maynard Smith 1992Holmes Worobey and Rambaut 1999 Padidam Sawyer andFauquet 1999 Gibbs et al 2000 Martin and Rybicki 2000McGuire and Wright 2000 Posada and Crandall 2001 Martinet al 2005 Boni Posada and Feldman 2007 Lemey et al 2009Martin et al 2015) regions found to be anomalous by three orfewer methods and lt 106 random probability were ignored Forone test codons in the aligned sequences with gaps were re-moved using POSORT (available at http1925598146_resourcese-textsREADME-POSORTpdf)

Models for ML analysis were tested using TOPALi (Milneet al 2009) and the ProtTest 3 server at httpdarwinuvigoes(Darriba et al 2011) the best fit models were found to beGTRthornU4thorn I (Tavare 1986) for nucleotide sequences andLGthornU4thorn I (Le and Gascuel 2008) for amino acid sequencesPhylogenetic trees were inferred using the neighbor-joining (NJ)facility in ClustalX (Jeanmougin et al 1998) the SplitsTreemethod (Huson and Bryant 2006) and PhyML 30 (ML) (Guindonand Gascuel 2003) and the support for their topologies assessedusing the log-likelihood support for the trees and the SH-support (Shimodaira and Hasegawa 1999) for their nodesNucleotide diversities (Nei and Li 1979) were computed usingDAMBE5 online (Xia 2013) Trees were drawn using FigtreeVersion 13 (httptreebioedacuksoftwarefigtree) and pairsof trees were compared using PATRISTIC (Fourment and Gibbs2006) to test for mutational saturation and were confirmed by

2 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

the method of Xia (2013) Most virus isolate collection dates(Supplementary Table S1) were obtained from Genbank files orfrom Visser Bellstedt and Charleston (2012) and dates forsome N and NTN isolates were not previously published(Ohshima Kmdashunpublished data) the dating of sequenceEU563512 is mentioned in the Section 4 The temporal signal insets of aligned sequences and the dates of the most commonrecent ancestor (TMRCA) and other nodes of inferred phyloge-nies were estimated by TempEst (Rambaut et al 2016) thelsquoLeast Squares Datingrsquo method of To et al (2015) using Versionlsd-03beta and by the probabilistic methods of BEAST v182(Drummond et al 2012) In BEAST analyses Bayes factors wereused to select the best-fitting molecular-clock model and coa-lescent priors for the tree topology and node times and wecompared strict and relaxed (uncorrelated exponential anduncorrelated lognormal) molecular clocks as well as five demo-graphic models (constant population size expansion growthexponential growth logistic growth and the Bayesian skylineplot) Posterior distributions of parameters including the treewere estimated from Markov Chain Monte Carlo samples takenevery 104 from 108 steps after discarding the first 10 percentand checked using Tracer v16 (httptreebioedacuksoftwaretracer) These provided the TMRCAs and other dateswere obtained from the lsquomaximum clade credibility treersquonamely the tree that was commonest among those observedThe adequacy of the temporal signal in our data was alsochecked by using ten independently date-randomized replicatesin both the least squared dating (LSD) and BEAST analyses(Ramsden Holmes and Charleston 2009 Duchene 2015)

3 Results

The principal ORFs of 240 genomic sequences of PVY gave analignment 9201 nucleotides long with 67 percent of the 3067codons invariate The ORFs formed five major groupings in amid-point rooted NJ tree (Fig 1A) Their relationships closely re-sembled those found in a maximum likelihood phylogeny re-ported by Kehoe and Jones (2016) who examined seventy-threeisolates from among the 240 that we examined In both our NJphylogeny and the published ML phylogeny the C phylogrouphad the longest branches and formed the basal tree One of itslineages diverges to give the O group and three others N R-1and R-2 see Fig 1 legend for the correspondences between ourphylogroups and those of Kehoe and Jones (2016) In both phy-logenies many sister sequences have very asymmetric relation-ships when summed to the midpoint root the shortestbranches are less than 25 percent of the longest This may indi-cate that evolutionary rates have varied or more likely thatsome of the sequences are recombinant The same arrangementof the same phylogroups was inferred by SplitsTree analysis(Fig 2A) with the C and O phylogroups linked to others by acomplex web of interrelationships indicating that many of theORFs are recombinants some of which are closely similar asshown by clustering of the links between the groups

The ORFs were also directly searched for phylogeneticanomalies using the RDP suite of programs and more than halfgave significant evidence of recombination but with the assign-ment of lsquoparentalrsquo sequences often uncertain as many alterna-tive combinations of ORFs were identified as possible lsquoparentsrsquo

31 The search for the n-rec PVY ORFs

The search for the n-rec ORFs was simplified by using align-ments obtained by partitioning the variable codons into

sequences of the syn codons (504 percent) and the n-syn co-dons (429 percent) they contained NJ trees of these two align-ments were closely similar topologically to those of thecomplete sequences (Fig 1A) and when compared in apatristic-distance graph (Fig S1) they showed no evidence ofmutational saturation along the main axis of the tree and thiswas confirmed using the binary test for saturation (Xia 2013)However there were discrete clusters of points off the main di-agonal as would be expected for rec sequences and the maindiagonal was very broad with the spread being greater in the n-syn axis (X) than in the syn axis (Y) again indicating the pres-ence of recombinants

ML trees obtained from the complete ORF sequences andfrom their syn and n-syn codons were also closely similar how-ever the syn codon tree had a greater statistical support thanthose calculated from the n-syn codons or the complete se-quences log-likelihoods of 453906 and 608258 and10985539165 respectively Furthermore more nodes in thesyn sequence tree had SH-support gt09 than the same nodes inthe other trees

A preliminary search of the syn codon sequences using RDPas well as NJ trees and patristic distance graphs showed that allmembers of the R-1 and R-2 phylogroups had some large rec re-gions with the same lsquoparentsrsquo A representative selection of 48sequences (Supplementary Table S1) was used to resolve thesegroupings The selection included all sixteen sequences of the Cphylogroup together with eight sequences from each of theother phylogroups chosen to be as representative as possible ofthe basal divergences of those phylogroups (ie those with theshortest branch lengths and therefore least changed after thephylogroup had been established) Sequences with the same

Figure 1 The branches of NJ phylogenies calculated from the main genomic

ORFs of (A) 240 isolates of PVY and (B) 103 of the isolates that showed no signifi-

cant evidence of recombination (see text) The marked clusters are of groups

named in Kehoe and Jones (2016) cluster C is of groups C1(II) and C2(III) O is of

O(I) and O5(X) N is of N(IV) XIII and NA-N(IX) NTN-1 is of NTN-NW SYR-I(XII)

NTN-B(VI) NTN-NW SYR-II(XI) N-Wi(VII) and NO(VIII) and NTN-2 of

NTN-A(V) Chile 3 is isolate Accession Code FJ214726

A J Gibbs et al | 3

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

pattern of recombination and parentage were progressivelyidentified and removed The most strongly supported phyloge-netic anomaly was shared by all eight R-2 sequences They hadan R-1 phylogroup sequence (EF026076) as the major parent andan N phylogroup minor parent (AJ890346) the latter region wasfrom around nt 5707 to the 30 terminus of the complete ORFs (38percent of the ORF) It was identified by seven out of nine RDPmethods with probabilities ranging from 1027 to 10112 All R-2sequences were therefore removed and the remaining forty se-quences again analyzed by RDP which then found the moststrongly supported and shared anomaly to be in all R-1 se-quences These had an O phylogroup major parent (EF026074)and an N phylogroup minor parent (X97895) that had in most(see below) been provided from the 50 terminus to nt 2205 (24percent of the ORF) this region was identified by seven out ofnine RDP methods with probabilities ranging from 1023 to1084 The syn and n-syn sequences of the remaining thirty-twoC O and N phylogroup ORFs gave almost identical ML and NJphylogenetic trees and a patristic distance graph of those trees

had most points close to a single diagonal the statistical sup-port for the syn ML tree was again greater than for the n-syn MLtree (log-likelihood 247234 and 292999 respectively)

Finally the syn codon sequences from C and O phylogroupisolates that were not among the forty-eight representative se-quences were added to the remaining thirty-two C and O synsequences to produce an alignment of 120 sequences This wasthen searched using RDP for other more specific rec regions (iesequence specific not phylogroup specific) and seventeen se-quences were found to have significant inter- and intra-phylogroup rec regions (ie three C phylogroup sequencesseven from the O phylogroup and seven from the N phy-logroup) One of those N phylogroup sequences was X97895from isolate N605 which has been used in several studies as areference sequence but was found by us to be a recombinantbetween AJ890346 and DQ157180 representing the two mainlineages of the N phylogroup Figure 1B shows the NJ tree of thecomplete ORFs of the remaining 103 sequences This phylogenyis much more symmetrical than that of the original 240 se-quences (Fig 1A) and its shortest branches (tip to root) arearound 70 percent of the length of the longest Figure 2B showstheir SplitsTree graph which again has a much simpler linkagestructure in its main branches than the 240 sequence tree(Fig 2A) Figure S1 also shows the patristic distance graph com-paring the NJ trees of the 103 syn and n-syn sequences and thisalso confirms that these n-rec sequences have evolved in a bio-logically coherent manner with most points aligned with andclose to a diagonal that has a slope around 08 and shows noevidence of mutational saturation

32 The R-1 and R-2 phylogroups

The genome maps of the three rec phylogroups are shown inFig 3 Phylogroup N is the sister lineage to the C and O phy-logroups and most genomes of these three phylogroups arenon-recombinant whereas all the R-1 and R-2 phylogroup se-quences are recombinants and have parents in the O and Nphylogroups

The ORFs of all forty-three R-1 phylogroup sequences aremost closely related to ORFs of the O phylogroup especiallythat of sequence EF026074 However all have a 50 terminal re-gion (24 percent of the complete ORF) that is closely related tothe homologous region of N phylogroup ORFs especially that ofX97895 In around half the ORFs (eg DQ157179 HQ912870JF927762) the N phylogroup region is from its 50 end to around nt2205 and in others (eg AJ890349 HQ912863 JN935419 andKJ801915) it is from nts 301 to 2205 and is preceded by a shortregion closest to O sequences In two of the sequences(HM991454 and JQ969040) the N phylogroup region is mostclosely related to AB331517 not X97896 These differences indi-cate that more than one event and O parent was involved in es-tablishing the R-1 phylogroup Some of the R-1 sequences (egAJ889868 and KJ634023) also have a short rec region around nts5600ndash6300 that is most closely related to that region of R-2isolates

The sequences of all seventy-six R-2 ORFs are closest to thatof an R-1 phylogroup sequence EF026076 but all have a 30 re-gion (nts 5707-end 38 percent of the ORF) that is most closelyrelated to the homologous region of AJ890346 an N phylogroupsequence In addition around one-fifth of the R-1 and R-2 se-quences also have smaller individual regions that in RDP analy-ses are recorded as significant inter- and intra-phylogrouprecombinations

Figure 2 The branches of SplitsTree phylogenies of the main genomic ORFs of

(A) 240 isolates of PVY and (B) 103 of those isolates that showed no significant

evidence of recombination The marked clusters are the same as those in Fig 1

Figure 3 A cartoon summarizing the relationships and genomic maps of the

five major phylogroups of the PVY isolates shown in Figs 1 and 2

4 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

The C phylogroup ORFs sequences including that of theChile 3 isolate are the most variable nucleotide diversitypfrac14 00974 6 00460 Those of the other four phylogroups aremuch less variable indicating that they have probably divergedmore recently the N phylogroup population is the most vari-able pfrac14 00256 6 00121 compared with the O phylogroup00211 6 00100 R-1 phylogroup 00196 6 00093 and R-2 theleast variable 00188 600089

33 Rooting the PVY phylogeny

BlastN and BlastP searches using representative sequencesfrom all of the n-rec PVY phylogroups found the most closelyrelated genomic sequences to be those of sunflower chlo-rotic mottle virus (JN863233) pepper severe mosaic virus(AM181350) and bidens mosaic virus (KF649336) In both MLand NJ trees of seventy-three dated n-rec complete ORFs(see below) including these three viruses as an outgroup theN phylogroup was placed as sister to all other PVY phy-logroups as in Fig 1B However when ML or NJ trees werecalculated from the encoded amino acid sequences or fromthe 166 core nucleotide sequences (see below) together withthe homologous regions of the three outgroup sequences theChile 3 sequence (FJ214726) was placed as sister to all otherPVY lineages This difference was found not only when com-plete aligned sequences were used but also when all indels(95 percent of the sequences) had been removed Nonethelessall the major nodes in trees found from complete anddegapped sequences of both ORF and amino acid treeshadgt097 SH-type statistical support Thus the exact positionof the basal node of the present PVY phylogeny is unresolvedbut is either side of a single robustly supported node linkingthe Chile 3 sequence and the N phylogroup to all other PVYsThe reason for this rooting uncertainty is unknown the indi-vidual mutational differences between the Chile 3 N phy-logroup and all other sequences are spread throughout the fulllength of the sequences Furthermore the points representingthe pairwise syn and n-syn comparisons of the Chile 3 se-quence and other PVY isolates in a patristic distance graph

(Fig S1) were close to the main diagonal trend indicating thatthe DnDs ratios of the Chile 3 isolate are not unusual but sim-ilar to those of other PVY isolates

34 Dating the PVY phylogeny

The 240 PVY genomic sequences we analyzed provided two setsof heterochronously dated sequences (Supplementary TableS1) Seventy-three non-recombinant isolates from the C O andN phylogroups provided a set of full length ORFs the earliestsample was collected in 1938 CE (Common Era or AD) the mostrecent in 2013 CE and their mean sample collection date was19994 CE As described above the R-1 and R-2 genomes are re-combinants that share the central core region of their genomes(nts 2206ndash5706) with those of the O phylogroup Sample collec-tion dates are known for ninety-eight R-1 and R-2 isolates(Supplementary Table S1) Thus together with the core regionsof the seventy-three dated recombinant sequences their sharedcentral region provides another dataset of 171 sequences for es-timating PVY phylogenetic divergence dates it covers the samerange of sampling dates but with a mean of 20041 CEAlthough there are twice as many sequences they are only 38percent of the length of the complete ORFs The alignment ofthe 171 core regions was re-examined by RDP and five found tohave a short 50 terminal region of N phylogroup sequence (clos-est to JQ969036) so these were removed to leave 166 sequencesin the dataset ML and NJ trees inferred from these core regionsdiffered only in minor details and resembled closely those ofthe seventy-three n-rec ORFs (Fig 1) except that they placed allthe O R-1 and R-2 sequences in single tight cluster that did notcleanly resolve into O R-1 and R-2 lineages (Fig 4) The poorresolution within this cluster of O R-1 R-2 sequences probablyresults from the small diversity within the cluster nucleotidediversity pfrac14 0018 6 0008 compared with 0048 6 0023 in all 166core sequences and 0075 6 0035 in the seventy-three n-reccomplete ORFs Nonetheless the distribution of O R-1 and R-2sequences within the lineages (Fig 4) indicate clearly that O iso-lates are ancestral to R-1 isolates and R-2 isolates are the mostrecent

Figure 4 A cartoon summarizing the divergence dates of a ML phylogeny of seventy-three dated n-rec ORF sequences (Table 1 - line 1) estimated by the LSD method

and the dates from the MCC tree of a BEAST analysis of the same data Bars indicate the 95 CI ranges for both analyses Arrows indicate the period 1935 CEndash2016 CE

during which the isolates were collected

A J Gibbs et al | 5

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

TempEst analyses of the two datasets (73 n-rec ORFs and 166cores) found that both had a strong temporal signal correlationcoefficients 0436 and 0228 Pfrac14 000012 and 00032 for the com-plete and core sequences respectively They had small resid-uals with no trend and these were mostly associated with the Cphylogroup sequences and that of Chile 3 The estimatedTMRCAs were 14116 CE and 10768 CE respectively (Fig 5)Separate TempEst analyses of the seventy-three n-rec cores andninety-three rec cores in the 166 core dataset found that theseventy-three n-rec core sequences had a temporal correlationof 0448 (Pfrac14 000007) with a TMRCA of 14598 CE whereas the re-gression for the ninety-three rec core sequences was 00356(Pfrac14 073) Thus the temporal signal detected by TempEst in the166 core sequences seems to be mostly if not exclusively in itsseventy-three n-rec core sequences

The LSD method was used in its lsquoconstrainedrsquo mode (ie anancestral node must be older than its daughter nodes) to calcu-late the dates evolutionary rates and confidence intervalsfor the TMRCAs of the ML and NJ trees of the seventy-three ORFsequences and also their encoded amino acid sequences(Table 1) Estimates of the TMRCA dates obtained from completesequences were close to the mean of those from partitioned synand n-syn codon sequences (Table 1) However the TMRCA dateestimates from ML trees were 80ndash120 years earlier than thoseobtained from NJ trees and there were even larger differencesbetween the TMRCA date estimates from ORF versus aminoacid sequences the former were 250ndash290 years more ancientthan those from amino acid sequences Table 1 also records theestimated dates of the five principal nodes in the phylogeniesRandomizing the collection dates confirmed that the data con-tained a clear temporal signal as the ML phylogeny of seventy-three n-rec complete sequences had a TMRCA date of 1085 CEbut gave a mean TMRCA date from twenty collection date ran-domizations of 12555 BCE (standard deviation 7076 yearsrange 10 BCEndash17036 BCE) Thus the TMRCA date calculated withthe true sampling dates is outside the range of dates obtainedwith randomized collection dates and 19 standard deviationsfrom their mean

To et al (2015) stated that although LSD can analyze treesobtained by any method lsquomore accurate results are

expected from trees obtained using maximum-likelihoodmethodsrsquo which with our data gave the earliest TMRCA esti-mates and these were most precise in that their 95 percentconfidence limits were much smaller (Table 1) They estimatethat the present PVY population diverged from a common an-cestor around 1085 CE with commensurately more recentdates for the divergences within the phylogeny (Fig 6)

LSD analysis of the 166 core sequences gave TMRCA dates(Table 1) for the ML and NJ trees close to those estimated for theseventy-three complete sequences and all major nodeshadgt09 SH-type statistical support including that of the com-bined cluster of O R-1 and R-2 phylogroup sequences and itsmajor nodes The mean root date from twenty collection daterandomizations of the ML data was 11491 BCE (standard devia-tion 5716 years range 1292 BCEndash15269 BCE) again confirmingthe strong temporal signal

The lineages within the cluster of O R-1 and R-2 sequencesin the ML and NJ trees of core sequences when collapsed hadidentical groupings (Fig 4) although the dates estimated forthese lineages differed (Table 1 and Fig 4) For example the MLtree estimated the origins of O isolates R-1 isolates and R-2isolates to be 19182 CE 19467 CE and 19694 CE respectively(Fig 4) whereas the NJ tree estimates were all earlier and wererespectively 19062 CE 19333 CE and 19603 CE

In BEAST analyses of the two datasets the best-supportedmodel for mutational substitution was GTRthorn IthornC4 while thebest model for the rate of substitution was the lsquorelaxed uncorre-lated lognormalrsquo clock and a population of constant size Alldatasets passed date-randomization tests The phylogenies in-ferred by BEAST had essentially the same topology as those in-ferred by ML and NJ however the dates (Table 2) weresignificantly earlier than those estimated by the regressionmethods The TMRCA of the seventy-three n-rec ORF sequenceswas 1590 BCE and was twenty-four CE for the 166 cores The to-pology and dates of the other nodes were obtained from themaximum clade credibility tree were commensurately earlierand showed exactly the same pattern of slightly unresolvedclustering of the cluster of O R-1 R-2 core sequences as in theML and NJ trees (Fig 4A) with their origins estimated to be18329 CE 18806 and 1932 respectively (Fig 4B) The 166 core

Figure 5 The NJ phylogeny of the central core regions of the genomes of 166 dated C O N R-1 and R-2 isolates (A) the Accession Codes of the C isolates are blue O

red R-1 green R-2 orange and N yellow (B) Summary of the phylogroup composition of the collapsed clusters andashf The node dates are from LSD estimates of MJ and NJ

phylogenies and from the MCC phylogeny of a BEAST analysis These node dates were used to calculate the date scales assuming linearity Labeled arrows indicate the

dates of the likely origins of the O R-1 and R-2 populations

6 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

sequences were partitioned into those from the seventy-threefrom n-rec sequences and those from the ninety-three from recsequences they gave TMRCA estimates of 214 BCE and 7 CE re-spectively and passed the date randomization test indicatingthat they contained significant temporal signals in contrast tothe results of the TempEst tests

Part of Fig 5 summarizes and compares the TMRCAs andconfidence intervals estimated by the different methods andshows that despite the TMRCA estimates covering a six-foldrange the 95 confidence limits of most overlap between 500CE and 1500 CE

Finally a simple comparison was made of the basal branchlength of the PVY cluster as a proportion of the basal branchlength of the potyviruses A ML phylogeny was calculated fromfour representative PVY ORF sequences [AB331515 (N phy-logroup) EU563512 (C) FJ214726 (C) HM367075 (O)] aligned withthose of the other 103 representative potyviruses and rymovi-ruses used for Fig 1 of Gibbs Nguyen and Ohshima (2015) Themean patristic distances of the sequences connected throughthe root of the phylogeny (ie from Narcissus degeneration vi-rus NC_008824 Onion yellow dwarf virus NC_005029 Shallotyellow stripe virus NC_007433 Vallota speciosa virus isolateMarijiniup 7 JQ723475 to all the other potyviruses) was 2258substitutionssite and through the root of the PVY sequences(ie from AB331515 (N) to all the other PVYs) was 0227 substitu-tionssite a ratio of 9941

Figure 6 A cartoon summarizing the TMRCA dates and 95 CIs (vertical scale)

of PVY estimated by TempEst LSD (ML and NJ trees) and BEAST (MCC tree) anal-

yses of the seventy-three n-rec ORF and 161-core sequences together with his-

torical events that may have influenced the evolution of the virus

Table 2 BEAST dates of TMRCA nodes in phylogenies of PVY sequences

Dataset 73 n-rec completeORFs

All 166 core regions 73 n-rec core regions 93 rec core regions

Sequence length (nt) 9201 3501 3501 3501No of sequences 73 166 73 93Sampling date range 1938ndash2013 1938ndash2013 1938ndash2012 1970ndash2013TMRCAa (years) 3603 (1411ndash6566) 1989 (1084ndash3096) 2227 (951ndash3952) 2006 (688ndash3755)TMRCAb (dates) 1590 (602 to 4553) 24 (929 to1083) 214 (1062 to 1939) 7 (1325 to 1742)Substitution rate

(ntsiteyear) b

597 105 (250 105ndash956 105)

999 105 (688 105ndash132 104)

924 105 (433 105ndash139 104)

866 105 (371 105ndash137 104)

aTMRCA lsquotime to the most recent common ancestorrsquo years before 2013 95 credibility intervals (CI) in parenthesesbTMRCA dates positive dates are CE (Common Erafrac14AD) negative are BCE (frac14BC) 95 credibility intervals (CI) in parentheses

Table 1 LSD dates of the TMRCA and major nodes in phylogenies of 73 n-rec PVY sequences

Datasetb Datac Algorithmd Sites used Node datesa

Node 1e Node 2 Node 3 Node 4 Node 5 Node 6 Ratef (104ssy)

ORF Nucs ML All 10854 (1251ndash123) 12693 15512 16514 18982 19018 207 (098ndash255)n-syn codons 11583 (1265ndash792) 13208 15753 16805 19066 19034 278 (194ndash323)syn codons 10176 (1234ndash151) 12437 15383 16293 18813 1904 177 (092ndash221)

NJ All 12049 (1496ndash-17272) 12808 15152 14431 1890 18349 111 ((005ndash171)n-syn codons 12435 (1513 to 230) 13091 1543 14628 19019 18396 153 (055ndash228)syn codons 11525 (1467 to 72 108) 1242 14387 14598 18466 18284 091 (1 106ndash146)

AAs ML All 13636 (1501ndash238) 14271 1603 17021 1943 19336 105 (041ndash126)NJ All 14599 (1642ndash70) 1513 1615 17023 19331 19104 086 (023ndash119)

Core Nucs ML All 11512 (1296ndash703) 12599 16266 16228 19182 19257 202 (134ndash243)NJ All 13077 (1521 to 71) 13077 16248 15446 19062 18975 119 (040ndash157)

aNode dates positive dates are CE (Common Erafrac14AD) negative are BCE (frac14BC)bDataset ORFmdashmajor open reading frame Core nucleotides 2206 5706cData Nucs nucleotides AAs encoded amino acidsdAlgorithm ML maximum likelihood (PhyML) NJ neighbor-joining (ClustalX)eNodes numbered as in Fig 4 95 confidence intervals for Node 1 onlyfRate evolutionary rate substitutionssiteyear 95 confidence intervals for Node 1

A J Gibbs et al | 7

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

4 Discussion

This article reports that using syn codon sequences simplifiedthe resolution of relationships in a complex population of PVYgenomes The phylogenies and RDP analyses calculated from thesyn codon sequences were simpler to interpret than those calcu-lated from the comparable n-syn codon or complete sequencesand had greater statistical support Syn codon changes reflect thetemporal component of phylogenetic change of gene populationsbetter than n-syn codons as changes of the latter are also linkedto changes of the encoded protein and incoherent changes ofsyn and n-syn codons in complete sequences may confoundanalysis of either especially when rec sequences are presentSignificantly Ohshima et al (2016) also found that Bayesian esti-mates of the ages of cucumber mosaic cucumovirus (CMV) popu-lations were more precise if made using syn codon sequencesrather than complete sequences CMV has three genomic seg-ments which encode five ORFs and they found that the age esti-mates for different ORFs using syn codon sequences had acoefficient of variation around half that of complete sequencesand the 95 CI ranges of those estimates was around 25ndash50 per-cent smaller than those of the original complete sequences

The strategy of progressively removing the most strongly sup-ported cluster specific recombinants resolved the complexity ofthe relationships of the rec and n-rec ORFs In Fig 1A (ie a NJ tree)and in Fig 2 of Kehoe and Jones (2016) (ie an ML tree) the N phy-logroup forms an outlier of the R-2 cluster We have shown thatthis topology is false as the dominance of the recombinant link-ages between the phylogroups seriously compromise agglomera-tive methods of tree-building in the following way Whensequences of all PVY phylogroups are present in an analysis theR-1 sequences are more closely related to their O parent than theirN parent as the former provided a greater percentage (76 percent)of their sequence and therefore the R-1 and O phylogroups linkLikewise the R-2 sequences are more closely related to their R-1parent than their N parent as the former provided 62 percent oftheir sequence and so again the R-2 phylogroup links to the R-1phylogroup rather than the N parent Finally the N phylogroupclusters with the R-2 ORFs as they share two-thirds of their se-quence whereas the true links of the N phylogroup as the sisterlineage to all the other phylogroups are more distant Thus the in-ferred inter-phylogroup relationships when all are present arefalse (Figs 1A and 2A) and the true topology of the phylogroups isonly revealed when the recombinants are removed

In the second part of this project we attempted to date themajor features of the PVY phylogeny using different heterochro-nous tip-dating methods The resulting estimates althoughoverlapping as judged by their 95 CIs differed significantly inscale so that the TMRCAs estimated especially for the basalnodes by the probabilistic method were between two to fourtimes earlier than those estimated by the regression methodsSo here we discuss whether events of the history of PVY and itshosts are congruent with those of the PVY phylogeny and pro-vide in essence independent node dating that supports someestimates more than others

All our analyses supported a single PVY phylogeny that hasseveral distinctive features (Fig 6) PVY is a recent member ofthe lsquoPVY lineagersquo of potyviruses most of whose members wereisolated first or only from plants in the Americas (Gibbs andOhshima 2010) and although PVY was first identified in the UKit almost certainly came from the Americas Furthermore thebasal divergences of the PVY phylogeny produce three lineagesand one is the Chile 3 isolate found in Capsicum baccatum only inSouth America (Moury 2010) Thus the basal (TMRCA) node in

the PVY phylogeny most likely represents an event that oc-curred in South America The other two basal PVY lineageshave been found throughout the world but not yet in SouthAmerica The flora and fauna of South America was biologicallyisolated from Europe until the start of the lsquoColumbianExchangersquo in the late 15th century CE (Crosby 1972) when theearly trans-Atlantic maritime traders first took American ani-mals and plants including potato tubers to Europe and viceversa Thus the basal nodes of the PVY phylogeny are likely torepresent events that predate the late 15th century and thisconclusion agrees with all our date estimates of those nodesthe most recent is 1411 CE with estimates from regression anal-yses ranging from 1085 CE to 1411 CE and those from probabil-istic analyses from 1590 BCE to 24 BCE

The second basal lineage the C phylogroup probably di-verged in Europe and consists mostly (712 isolates) of isolatesfound in tobacco tomato and pepper crops Potato pepper to-mato and tobacco were first introduced to Europe during theColumbian Exchange from their domestication centres in theAndean regions of Peru and Bolivia (potato) more widely in theAndes (tomato tobacco) and further north in the Americas (pep-per) (Bai and Lindhout 2007 Pickersgill 2007 Brown andHenfling 2014 Kraft et al 2014) There is no evidence that PVY istransmitted by sexually produced seed so it was most likely in-troduced to Europe in the small numbers of S tuberosum tubersfirst taken from South America to the Canary Islands in 1562 CEto Spain in 1570 CE and the UK in 1588 CE if so then all the ini-tial PVY population of Europe probably originated from infectedS tuberosum tubers in those cargoes The divergence to severalnon-potato hosts probably occurred outside South America al-though it is unknown to what extent different C phylogroup iso-lates are ecologically adapted to particular hosts The datesestimated by the regression methods are congruent with thisinterpretation whereas those estimated by the probabilisticmethods are not (Fig 6 nodes 3 and 4)

The third basal lineage the N phylogroup is known onlyfrom a cluster of closely related isolates so it probably radiatedrecently and at present we have no idea of when and how itfirst migrated from the Americas It has been isolated world-wide mostly from potatoes (1315 isolates)

The earliest potato breeding programs of any size began in1810 but did not become widespread until the second half ofthe 19th century Initially there were very few potato introduc-tions to Europe and for the first two centuries the crops had lit-tle genetic diversity Virus diseases that seriously debilitatedestablished cultivars but did not pass through sexually pro-duced potato seed probably encouraged selection and as a re-sult more cultivars were grown in the late 18th and early 19thcenturies These were taken from natural berries produced byself pollination Most were selfed derivatives so selection re-duced the gene pool (Glenndinning 1983) Potato blight diseasecaused by the oomycete Phytophthora infestans appeared in themid 19th century and eliminated almost all cultivars as theywere so inbred further reducing the gene pool Breeding amongblight survivors and new introductions from the Andean regionof South America led to many new cultivars being grown byearly 20th century (Glenndinning 1983) This introduction ofnew potato germplasm led to genetic diversity in the cropwhich apparently introduced the selection pressure in the formof PVY resistance genes that diversified the PVY population inthe early 20th century Our analyses indicate that the parents ofthe recombinant R-1 and R-2 strains were members of existingO and N populations therefore probably European As the Nc re-sistance gene which C strains ellicit was widely distributed in

8 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

early potato cultivars (Bawden 1936 Cockerham 1943 Bawdenand Sheffield 1944) selection for avirulence to this gene waslikely to be an important factor in the diversification of the Ophylogroup population Similarly N strains but not O strainsinfect plants with both Nc and the less common Ny and Nzgenes so selection for avirulence would again have favored di-versity in the N phylogroup population (Cockerham 1970 Jones1990Chikh-Ali et al 2014 Kehoe and Jones 2016)

PVY strains causing veinal necrosis symptoms in tobaccowere first reported in 1935 in the UK (Smith and Dennis 1940)and soon afterwards in Brazil Europe and North America(Nobrega and Silberschmidt 1944 Bawden and Kassanis 19471951 Richardson 1958 Klinkowski and Schmelzer 1960Silberschmidt 1960 Todd 1961 Kahn and Monroe 1963 Brucher1969) However whether these strains included ones in both theN and R1 phylogroups is unknown Strains inducing this symp-tom in tobacco produce mild symptoms in potato plants so theysoon became widely distributed because infected plants wereoften missed when seed potato crops were rogued (Todd 1961Jones 1990) Possibly this increased spread can be attributed toemergence of the R1 population at that time but evidence forthat is lacking and none of the sequences we analyzed camefrom early isolates Although the R-1 population is not verydamaging to potato it is difficult to control as its symptoms areso mild and it overcomes resistance genes Nc No and Nz Bycontrast the R-2 population is more damaging as it often causestuber necrosis and still overcomes these three resistancegenes

The tuber necrosis isolates R-2 phylogroup were firstsighted in the early 1980s in Europe and shown to be recombi-nants (Beczner et al 1984 Le Romancer Kerlan and Nedellec1994 Boonham et al 2002 Lorenzen et al 2006) These reportsare likely to be accurate as R-2 infections are so noticeable andby the 1980s potato crops in Europe were intensively monitoredfor disease Our estimates of the origin of R-2 isolates were19694 CE 19603 CE and 19324 CE for the ML-LSD NJ-LSD andBEAST analyses respectively Thus the ML-LSD estimate agreesmost closely with the crop record and the BEAST estimateagrees least well but all are possible Similarly PVY infectionsthat may have been caused by R-1 isolates were first reportedin 1935 and soon confirmed to be widespread The ML-LSDNJ-LSD and BEAST methods placed the origin of R-1 as 19467CE 19333 CE and 18806 CE respectively Thus the NJ-LSD esti-mate is the most likely the BEAST estimate less likely and theML-LSD impossible

All our date estimates agree that the near simultaneous radi-ation of the O and N populations of PVY (Figs 4 and 6 nodes 5and 6) coincided with the earliest potato breeding programs ofany size These programs increased the genetic diversity ofavailable potato cultivars and this would have included the Nyand Nz resistance genes They also coincided with cultural andagronomic changes to seed and ware potato crop production(Singh et al 2008 Gray et al 2010 Jones 2014) A combination ofthese factors is likely to have led to conditions favoring recom-binants This scenario fits better with the LSD dates 19182 CEand 19062 CE respectively than with the earlier BEAST date of18329 CE

Most of our PVY date estimates are congruent with those ofthe recent history of the international potato crop but this isnot surprising as the estimated dates have very broad overlap-ping 95 CI estimates One crucial event might however informus whether some TMRCA estimates are more accurate thanothers and that is whether the divergence of the C phylogroup(Fig 6 nodes 3 and 4) occurred before or after PVY was

introduced to Europe in the late 16th century the TMRCAs esti-mated by the regression methods are congruent with aEuropean divergence of the C phylogroup whereas those esti-mated by the probabilistic methods indicate a much earlier di-vergence of the C phylogroup This dilemma will only beresolved when the genomic sequences of more Chile 3-like iso-lates or of the C phylogroup are known

Our estimates of the dates for the PVY phylogeny are some-what earlier than those reported by Visser Bellstedt and Pirie(2012) who analyzed rec n-rec and partitioned PVY genomic se-quences by probabilistic methods although our 95 CI esti-mates overlap The difference may merely be the result ofpopulation sampling differences as a much larger set of se-quences was available to us and it included the early dated iso-late KP691327 (1943 CE) A further difference between ouranalyses is the date given to sequence EU563512 which asVisser Bellstedt and Pirie (2012) reported was obtained lsquofrom apotato plant vegetatively propagated since 1938rsquo and sequencedin 2007 (Dullemans et al 2011) We dated this sequence as 1938the earliest of our sequences whereas Visser Bellstedt andPirie (2012) dated it as 2007 In making this choice we checkedhow this difference affected the date estimates of the seventy-three n-rec dataset and found that using the 1938 date gave anML-LSD TMRCA of 10854 CE (95 CI 1267 CEndash321 CE) whereasusing the 2007 date gave a TMRCA of 363 CE and very muchbroader 95 CIs (918 CE to 18 109 BCE) and when the se-quence was omitted an intermediate TMRCA of 687 CE (1016CEndash9503 BCE) Thus the 95 CI range was smallest using the1938 date and justified our choice

Finally we explored the possibility of extrapolating datesover an even larger timescale Gibbs et al (2008) suggested thatthe Neolithic invention of agriculture which in Eurasia occurredin the northern LevantineMesopotamian area around 12000BCE (Bellwood 2005 Pinhasi Fort and Ammerman 2005) pro-duced the conditions for the starburst potyvirus diversificationand this provides another possible dating point for potyvirusesWe estimated that in an ML phylogeny of the major ORFs of alarge representative set of potyviruses and four PVY isolates(one N one O and two C phylogroup isolates) the ratio of themean patristic distances of the sequences connected throughthe root of the phylogeny and through the root of the PVY se-quences was 9941 indicating that a TMRCA of all potyvirusesof around 10000 BCE would give a TMRCA of PVY around 1000CE

Supplementary data

Supplementary data are available at Virus Evolution online

Acknowledgements

We are very grateful to Hien To and Olivier Gascuel for greathelp with use of their LSD software

Conflict of interest None declared

ReferencesAbascal F Zardoya R and Telford M J (2010) lsquoTranslatorX

Multiple Alignment of Nucleotide Sequences Guided by AminoAcid Translationsrsquo Nucleic Acids Research 38 W7ndash13

Altschul S F et al (1990) lsquoBasic Local Alignment Search ToolrsquoJournal of Molecular Biology 215 403ndash10

A J Gibbs et al | 9

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

Bai Y and Lindhout P (2007) lsquoDomestication and Breeding ofTomatoes What Have We Gained and What Can We Gain inthe Futurersquo Annals of Botany 100 1085ndash94

Bawden F C (1936) lsquoThe Viruses Causing Top Necrosis(Acronecrosis) of the Potatorsquo Annals of Applied Biology 23487ndash97

and Kassanis B (1947) lsquoThe Behaviour of Some NaturallyOccurring Strains of Potato Virus Yrsquo Annals of Applied Biology31 503ndash16

and (1951) lsquoSerologically Related Strains of PotatoVirus Y That Are Mutually Antagonistic in Plantsrsquo Annals ofApplied Biology 38 402ndash10 [CrossRef]Mismatch]

and Sheffield F M L (1944) lsquoThe Relationships of SomeViruses Causing Necrotic Diseases of the Potatorsquo Annals ofApplied Biology 31 33ndash40

Beczner L et al (1984) lsquoStudies on the Etiology of Tuber NecroticRingspot Disease in Potatorsquo Potato Research 27 339ndash52

Bellwood P (2005) First Farmers The Origins of AgriculturalSocieties Malden Oxford and Carlton Blackwell Publishing

Boni M F Posada D and Feldman M W (2007) lsquoAn ExactNonparametric Method for Inferring Mosaic Structure inSequence Tripletsrsquo Genetics 176 1035ndash47

Boonham N et al (2002) lsquoBiological and Sequence Comparisonsof Potato virus Y Isolates Associated with Potato Tuber NecroticRingspot Diseasersquo Plant Pathology 51 117ndash26

Brown C R (1993) lsquoOrigin and History of the Potatorsquo AmericanJournal of Potato Research 70 363ndash73

and Henfling J W (2014) 0A History of the Potato0 inNavarre R and Pavek MJ (eds) The Potato Botany Productionand Use pp 1ndash11Wallingford UK CABI

Brucher H (1969) lsquoObservations on Origin and Spread of Y-N

Virus in South Americarsquo Angewandte Botanik 43 241ndash9Chikh-Ali M et al (2014) lsquoEvidence of a Monogenic Nature of

the Nz Gene Against Potato Virus Y Strain Z (PVYZ) in PotatorsquoAmerican Journal of Potato Research 91 649ndash54

Cockerham G (1943) lsquoThe Reactions of Potato Varieties toViruses X A B and Crsquo Annals of Applied Biology 30 338ndash44

(1970) lsquoGenetical Studies on Resistance to Potato Viruses Xand Yrsquo Heredity 25 309ndash48

Crosby A E (1972) 0The Columbian Exchange Biological and CulturalConsequences of 14940 268 pp Greenwood Publishing Group

Cuevas J M et al (2012) lsquoPhylogeography and MolecularEvolution of Potato Virus Yrsquo PLoS One 7 e37853

Darriba D et al (2011) lsquoProtTest 3 Fast Selection of Best-FitModels of Protein Evolutionrsquo Bioinformatics 27 1164ndash5

De Bokx J A and van der Want J P H (eds) (1987) 0Viruses ofPotatoes and Seed-Potato Production0 2nd edn WageningenCentre for Agricultural Publishing and Documentation

Drummond A J et al (2012) lsquoBayesian Phylogenetics with BEAUtiand the BEAST 17rsquo Molecular Biology and Evolution 29 1969ndash73

Duchene S (2015) lsquoThe Performance of the Date-RandomizationTest in Phylogenetic Analyses of Time-Structured Virus DatarsquoMolecular Biology and Evolution 32 1895ndash906

Dullemans A M et al (2011) lsquoComplete Nucleotide Sequence ofa Potato Isolate of Strain Group C of Potato Virus Y from 1938rsquoArchives of Virology 156 473ndash7

Fourment M and Gibbs M J (2006) lsquoPATRISTIC a Program forCalculating Patristic Distances and Graphically Comparing theComponents of Genetic Changersquo BMC Evolutionary Biology 6 1

Gibbs A J and Ohshima K (2010) lsquoPotyviruses in the DigitalAgersquo Annual Review of Phytopathology 48 205ndash23

et al (2008) lsquoThe Prehistory of Potyviruses Their InitialRadiation Was during the Dawn of Agriculturersquo PLoS One 3 e2523

Nguyen H D and Ohshima K (2015) lsquoThe lsquoEmergencersquo ofTurnip Mosaic Virus Was Probably a lsquoGene-for-Quasi-GenersquoEventrsquo Current Opinion in Virology 10 20ndash5

Armstrong J S and Gibbs A J (2000) lsquoSister-Scanning AMonte Carlo Procedure for Assessing Signals in RecombinantSequencesrsquo Bioinformatics 16 573ndash82

et al (2006) lsquoThe Variable Codons of H3 Influenza A VirusHaemagglutinin Genesrsquo Archives of Virology 52 11ndash24

Glais L Tribodet M and Kerlan C (2002) lsquoGenomic Variabilityin Potato Potyvirus Y (PVY) Evidence that PVYNW and PVYNTN

Variants Are Single to Multiple Recombinants Between PVYO

and PVYN Isolatesrsquo Archives of Virology 147 363ndash78Glenndinning D R (1983) lsquoPotato Introductions and Breeding up

to the 20th Centuryrsquo New Phytologist 94 479ndash505Gray S et al (2010) lsquoPotato Virus Y An Evolving Concern for Potato

Crops in the United States and Canadarsquo Plant Disease 94 1384ndash97Guindon S and Gascuel O (2003) lsquoA Simple Fast and Accurate

Algorithm to Estimate Large Phylogenies by MaximumLikelihoodrsquo Systematic Biology 52 696ndash704

Hall T A (1999) lsquoBioEdit A User-Friendly Biological SequenceAlignment Editor and Analysis Program for Windows 9598NTrsquo Nucleic Acids Symposium Series 41 95ndash8

Hawkes J G (1978) 0History of the Potato0 in Harris P M (ed) ThePotato Crop pp 1ndash14New York Springer

and Fransisco-Ortega J (1993) lsquoThe Early History of thePotato in Europersquo Euphytica 70 1ndash7

Holmes E C Worobey M and Rambaut A (1999) lsquoPhylogeneticEvidence for Recombination in Dengue Virusrsquo Molecular Biologyand Evolution 16 405ndash9

Hu X et al (2009a) lsquoSequence Characteristics of Potato Virus YRecombinantsrsquo Journal of General Virology 90 3033ndash41

et al (2009b) lsquoA Novel Recombinant Strain of Potato VirusY Suggests a New Viral Genetic Determinant of Vein Necrosisin Tobaccorsquo Virus Research 143 68ndash76

Huson D H and Bryant D (2006) lsquoApplication of PhylogeneticNetworks in Evolutionary Studiesrsquo Molecular Biology andEvolution 23 254ndash67

Jeanmougin F et al (1998) lsquoMultiple Sequence Alignment withClustal Xrsquo Trends in Biochemical Sciences 23 403ndash5

Jones R A C (1981) 0The Ecology of Viruses Infecting Wild andCultivated Potatoes in the Andean Region of South America0in Thresh J M (ed) Pests Pathogens and Vegetation pp 89ndash107London Pitman

(1990) lsquoStrain Group Specific and Virus Specific HypersensitiveReactions to Infection with Potyviruses in Potato CultivarsrsquoAnnals of Applied Biology 117 93ndash105 [CrossRef]Mismatch]

(2014) 0Virus Disease Problems Facing Potato IndustriesWorldwide Viruses Found Climate Change ImplicationsRationalising Virus Strain Nomenclature and Addressing thePotato Virus Y issue0 in Navarre R and Pavek M J (eds) ThePotato Botany Production and UsesWallingford UK CABI

and Kehoe M A (2016) lsquoA Proposal to Rationalize Within-Species Plant Virus Nomenclature Benefits and Implicationsof Inactionrsquo Archives of Virology 161 2051ndash7

Kahn R P and Monroe R L (1963) lsquoDetection of the TobaccoVeinal Necrosis Strain of Potato Virus in Solanum cardenasiiand S andigenum Introduced into the United StatesrsquoPhytopathology 53 1356ndash9

Karasev A V and Gray S M (2013) lsquoGenetic Diversity of Potatovirus Y Complexrsquo American Journal of Potato Research 90 7ndash13

et al (2011) lsquoGenetic Diversity and Origin of the OrdinaryStrain of Potato virus Y (PVY) and Origin of Recombinant PVYStrainsrsquo Phytopathology 101 778ndash85

10 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

Katoh K and Standley D M (2013) lsquoMAFFT Multiple SequenceAlignment Software Version 7 Improvements in Performanceand Usabilityrsquo Molecular Biology and Evolution 30 772ndash80

Kehoe M A and Jones R A C (2016) lsquoPotato virus Y StrainNomenclature Lessons From Comparing Isolates ObtainedOver a 73 Year Periodrsquo Plant Pathology 65 322ndash33

Kerlan C (2006) 0Potato Virus Y0 in Descriptions of Plant Viruses No414 Adams J and Antoniw J (eds) Rothamsted Research UK

and Moury B (2008) 0Potato Virus Y0 in Granoff A andWebster R G (eds) Encyclopedia of Virology 3rd edn pp 287ndash96New York Academic Press

Klinkowski M and Schmelzer K (1960) lsquoA Necrotic Type ofPotato Virus Yrsquo American Potato Journal 37 221ndash8

Kraft K H et al (2014) lsquoMultiple Lines of Evidence for the Originof Domesticated Chili Pepper Capsicum annuum in MexicorsquoProceedings of the National Academy of Science USA 111 6165ndash70

Le Romancer M Kerlan C and Nedellec M (1994) lsquoBiologicalCharacterization of Various Geographical Isolates of PotatoVirus Y Inducing Superficial Necrosis on Potato Tubersrsquo PlantPathology 43 138ndash44

Le S Q and Gascuel G (2008) lsquoAn Improved General Amino AcidReplacement Matrixrsquo Molecular Biology and Evolution 25 1307ndash20

Lemey P et al (2009) lsquoIdentifying Recombinants in Human andPrimate Immunodeficiency Virus Sequence Alignments UsingQuartet Scanningrsquo BMC Bioinformatics 10 126

Loebenstein G et al (eds) (2001) Virus and Virus-Like Diseases ofPotatoes and Production of Seed-Potatoes Dordrecht KluwerAcademic Publishers

Lorenzen J H et al (2006) lsquoWhole Genome Characterization ofPotato virus Y Isolates Collected in the Western USA and TheirComparison to Isolates From Europe and Canadarsquo Archives ofVirology 151 1055ndash74

Martin D and Rybicki E (2000) lsquoRDP Detection of RecombinationAmongst Aligned Sequencesrsquo Bioinformatics 16 562ndash3

Martin D P et al (2005) lsquoA Modified BOOTSCAN Algorithm forAutomated Identification of Recombinant Sequences andRecombination Breakpointsrsquo AIDS Research and HumanRetroviruses 21 98ndash102

et al (2015) lsquoRDP4 Detection and Analysis of RecombinationPatterns in Virus Genomesrsquo Virus Evolution 1 1ndash5

Milne I et al (2009) lsquoTOPALi v2 a Rich Graphical Interface forEvolutionary Analyses of Multiple Alignments on HPC Clustersand Multi-Core Desktopsrsquo Bioinformatics 25 126ndash7

Maynard Smith J (1992) lsquoAnalyzing the Mosaic Structure ofGenesrsquo Journal of Molecular Evolution 34 126ndash9

McGuire G and Wright F (2000) lsquoTOPAL 20 ImprovedDetection of Mosaic Sequences Within Multiple AlignmentsrsquoBioinformatics 16 130ndash4

Moury B (2010) lsquoA New Lineage Sheds Light on the EvolutionaryHistory of Potato Virus Yrsquo Molecular Plant Pathology 11 161ndash8

et al (2002) lsquoEvidence for Diversifying Selection in Potatovirusy and in the Coat Protein of Other Potyvirusesrsquo Journal ofGeneral Virology 83 2563ndash73

and Simon V (2011) lsquodNdS-Based Methods DetectPositive Selection Linked to Trade-Offs Between DifferentFitness Traits in the Coat Protein of Potato Virus Yrsquo MolecularBiology and Evolution 28 2707ndash17

Nei M and Li W H (1979) lsquoMathematical Model for StudyingGenetic Variation in Terms of Restriction EndonucleasesrsquoProceedings of the National Academy of Science USA 76 5269ndash73

Nobrega N R and Silberschmidt K (1944) lsquoSobre uma provavelvariante do virus ldquoYrdquo da batatinha (Solanum virus 2 Orton)que tern a peculiaridade de provocar necroses em plantas defumorsquo Arquivos Do Instituto Biologico (Sao Paulo) 15 307ndash30

Ogawa T et al (2008) lsquoGenetic Structure of a Population ofPotato Virus Y Inducing Potato Tuber Necrotic RingspotDisease in Japan Comparison with North American andEuropean Populationsrsquo Virus Research 131 199ndash212

et al (2012) lsquoThe Genetic Structure of Populations of PotatoVirus Y in Japan Based on the Analysis of 20 Full GenomicSequencesrsquo Journal of Phytopathology 160 661ndash73

Ohshima K et al (2016) lsquoTemporal Analysis of Reassortmentand Molecular Evolution of Cucumber Mosaic Virus ExtraClues From Its Segmented Genomersquo Virology 487 188ndash97

Padidam M Sawyer S and Fauquet C M (1999) lsquoPossibleEmergence of New Geminiviruses by Frequent RecombinationrsquoVirology 265 218ndash25

Pickersgill B (2007) lsquoDomestication of Plants in the AmericasInsights From Mendelian and Molecular Geneticsrsquo Annals ofBotany 100 925ndash40

Pinhasi R Fort J and Ammerman A J (2005) lsquoTracing the Originand Spread of Agriculture in Europersquo PLoS Biology 3 e410

Posada D and Crandall K A (2001) lsquoEvaluation of Methods forDetecting Recombination From DNA Sequences ComputerSimulationsrsquo Proceedings of the National Academy of Science USA98 13757ndash62

Rambaut A et al (2016) lsquoExploring the Temporal Structure ofHeterochronous Sequences Using TempEstrsquo Virus Evolution 2vew007

Ramsden C Holmes E C and Charleston M A (2009)lsquoHantavirus Evolution in Relation to Its Rodent and InsectivoreHosts No Evidence for Codivergencersquo Molecular BiologyEvolution 26 143ndash53

Richardson D E (1958) lsquoSome Observations on the Tobacco VeinalNecrosis Strain of Potato Virus Yrsquo Plant Pathology 7 133ndash5

Rowley J S Gray S M and Karasev A V (2015) lsquoScreeningPotato Cultivars for New Sources of Resistance to Potato VirusYrsquo American Journal of Potato Research 92 38ndash48

Salaman R N (1954) lsquoThe Early European Potato Its Characterand Place of Originrsquo Journal of the Linnean Society Botany 53 1ndash27

and Hawkes J G (1949) lsquoThe Character of the EarlyEuropean Potatorsquo Proceedings of the Linnean Society London 16171ndash84

Schubert J Fomitcheva V and Sztangret-Wisniewska J (2007)lsquoDifferentiation of Potato Virus Y Strains Using Improved Setsof Diagnostic PCR-Primersrsquo Journal of Virological Methods 14066ndash74

Shimodaira H and Hasegawa M (1999) lsquoMultiple Comparisonsof Log-Likelihoods with Applications to PhylogeneticInferencersquo Molecular Biology and Evolution 16 1114ndash6

Silberschmidt K M (1960) lsquoTypes of Potato Virus Y Necrotic toTobacco History and Recent Observationrsquo American PotatoJournal 37 151

Singh R P et al (2008) lsquoDiscussion Paper The Naming ofPotato Virus Y Strains Infecting Potatorsquo Archives of Virology153 1ndash13

Smith K M (1931) lsquoOn the Composite Nature of Certain PotatoVirus Diseases of the Mosaic Group as Revealed by the Use ofPlant Indicators and Selective Methods of TransmissionrsquoProceedings of the Royal Society B 109 251ndash66

and Dennis R W G (1940) lsquoSome Notes on a SuggestedVariant of Solanum Virus 2 (Potato Virus Y)rsquo Annals of AppliedBiology 27 65ndash70

Spetz C et al (2003) lsquoMolecular Resolution of Complexes ofPotyviruses Infecting Solanaceous Crops at the Centre ofOrigin in Perursquo Journal of General Virology 84 2565ndash78

Stevenson W R et al (eds) (2001) Compendium of Potato Diseases2nd edn St Paul MN APS Press

A J Gibbs et al | 11

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

Tavare S (1986) lsquoSome Probabilistic and Statistical Problems inthe Analysis of DNA Sequencesrsquo Lectures on Mathematics in theLife Sciences 17 57ndash86

Tian Y P et al (2011) lsquoGenetic Diversity of Potato Virus YInfecting Tobacco Crops in Chinarsquo Phytopathology 10 377ndash87

To T H et al (2015) lsquoFast Dating Using Least-Squares Criteriaand Algorithmsrsquo Systematic Biology 65 82ndash97

Todd J M (1961) lsquoTobacco Veinal Necrosis on Potato in ScotlandControl of the Outbreak and Some Characters of the Virusrsquo inProceedings of the Fourth Conference on Potato Virus DiseasesBraunschweig 1960 pp 82ndash92

Visser J C and Bellstedt D U (2009) lsquoAn Assessment ofMolecular Variability and Recombination Patterns in SouthAfrican Isolates of Potato Virus Yrsquo Archives of Virology 1541891ndash900

and Pirie M D (2012) lsquoThe Recent RecombinantEvolution of a Major Crop Pathogen Potato Virus Yrsquo PLoS One7 e50631

Xia X (2013) lsquoDAMBE5 A Comprehensive Software Package forData Analysis in Molecular Biology and Evolutionrsquo MolecularBiology and Evolution 30 1720ndash8

12 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

  • vex002-TF7
  • vex002-TF8
  • vex002-TF1
  • vex002-TF2
  • vex002-TF3
  • vex002-TF4
  • vex002-TF5
  • vex002-TF6
Page 2: The phylogenetics of the global population of potato virus Y and … · 2019. 4. 30. · The phylogenetics of the global population of potato virus Y and its necrogenic recombinants

Ogawa et al 2012 Karasev and Gray 2013 Jones 2014) PVY wasfirst described by Smith (1931) its C strain by Bawden (1936)who considered it to be a distinct but related virus (potato virusC PVC) and its N strain by Nobrega and Silberschmidt (1944)PVC was subsequently recognized as the common strain of PVYby Bawden and Sheffield (1944) and PVYrsquos original or ldquoordinaryrdquostrain then became its O strain The O and C strains from potatowere distinguished by their reactions with the potatoes Ny andNc hypersensitivity genes and the third strain the N straincaused systemic veinal necrosis in tobacco The biologicalstrains named O C and N were later adopted (De Bokx and vander Want 1987 Jones 1990) and were mostly congruent withtheir phylogenetics (Boonham et al 2002 Moury et al 2002)Isolates from pepper and tomato mostly belong to the C phylo-genetic group (Moury 2010) whereas those from tobacco weremostly placed in the N and O phylogenetic groups (Tian et al2011) However as more genomes have been sequenced andmore potato hypersensitivity genes found (Singh et al 2008Moury 2010 Karasev et al 2011 Karasev and Gray 2013 Jones2014 Rowley Gray and Karasev 2015 Kehoe and Jones 2016)continued use of the same names for biological and phyloge-netic groups has proved to be increasingly confusing due to thelack of complete coincidence between them A new strain no-menclature system for sub-dividing the phylogenetic groups us-ing Latinised numerals has therefore been proposed (Jones2014 Jones and Kehoe 2016 Kehoe and Jones 2016) Their group-ings correlate with the five broad lsquophylogroupsrsquo we discuss inthis article the traditional C O and N groups plus two groups ofrecombinants R1 and R2

The genomic sequences of more than 200 isolates of PVY arenow publicly available in the international databases Theycome from all continents except Antarctica but none have beencollected from crops in the potatorsquos main center of domestica-tion in the Andean region of Bolivia and Peru where nine potatospecies are cultivated in contrast to only one (Solanum tubero-sum) outside the Andean region (Hawkes 1978 Brown 1993Brown and Henfling 2014) The phylogenetic history of PVY iscomplex It has undoubtedly involved spread within and be-tween wild and domesticated potato species wild ancestors oftomato pepper tobacco and other solanceous crops in Southor Central America and also within plantings of potato andother cultivated solanaceous plants throughout the world(Klinkowski and Schmelzer 1960 Silberschmidt 1960 Brucher1969 Jones 1981 Spetz et al 2003) It has also involved recombi-nation between lineages (Glais Tribodet and Kerlan 2002Moury et al 2002 Lorenzen et al 2006 Schubert Fomitchevaand Sztangret-Wisniewska 2007 Ogawa et al 2008 Singh et al2008 Hu et al 2009a b Visser and Bellstedt 2009 Moury andSimon 2011 Cuevas et al 2012 Karasev and Gray 2013) Many ofthese recombinants cause tuber necrosis (Beczner et al 1984Boonham et al 2002 Karasev and Gray 2013)

Here we report an analysis updating the phylogenetic his-tory of PVY It is based on the genome sequences available inthe international databases in January 2016 They were fromisolates collected from around the world but none were fromthe potato croprsquos domestication center in the Andes wheregreater diversity would be expected The majority of the isolateswere from potatoes but several also came from pepper tomatoor tobacco As recombination confounds phylogenetic analysisour strategy was to separate the genomic sequences that aremostly non-recombinant (n-rec) from those that had major re-combinant (rec) regions To simplify the separation of rec fromn-rec sequences the PVY ORF sequences were partitioned intoalignments of codons that only varied synonymously (syn

codons) and those that had also varied non-synonymously (n-syn codons) and these were then analyzed separately Thisstrategy identified two sets of n-rec sequences one of full-length ORFs and the other partial ORFs for dating the PVY phy-logeny Our dating analyses like those of Visser Bellstedt andPirie (2012) find that the PVY population outside the Andean re-gion mostly diverged over the past few centuries This was afterthe Spanish colonization of South America following the defeatof the Inca empire in 1532 and after the introduction of onespecies of potato (S tuberosum) to Europe in the second half ofthe 16th century and later to other continents (Salaman andHawkes 1949 Salaman 1954 Brown 1993 Hawkes andFransisco-Ortega 1993 Brown and Henfling 2014) We foundlike Visser Bellstedt and Pirie (2012) that the damaging rec var-iants emerged and spread only during the past century

2 Methods and data sources

Two hundred and forty complete genomic sequences of PVY(Supplementary Table S1) were obtained from Genbank inJanuary 2016 Each was edited using BioEdit (Hall 1999) to ex-tract its main ORF These were aligned using the encodedamino acids as guide by the TranslatorX online server (AbascalZardoya and Telford 2010 httptranslatorxcouk) with itsMAFFT option (Katoh and Standley 2013) to give an alignmentof 9201 nucleotides (available at http1925598146_resourcese-textsblobs240PVYORFszip) BlastN and BlastP (Altschulet al 1990) online facilities of Genbank were used with the Chile3 sequence and representative PVY sequences from the N andC phylogroups to search for related sequences A simple pair-wise sliding-window method DnDscan (Gibbs et al 2006) avail-able at http1925598146_resourcese-textsblobsDnDscan1ZIP was used to identify codons in the alignments that hadonly evolved synonymously or had also evolved non-synonymously These were partitioned using SEQSPLIT v10 aFortran program written by the late John Armstrong (availableat http1925598146_resourcese-textsblobsSeqSplitZIP)Sequences were tested for the presence of phylogenetic anoma-lies using the full suite of options in RDP4 (Maynard Smith 1992Holmes Worobey and Rambaut 1999 Padidam Sawyer andFauquet 1999 Gibbs et al 2000 Martin and Rybicki 2000McGuire and Wright 2000 Posada and Crandall 2001 Martinet al 2005 Boni Posada and Feldman 2007 Lemey et al 2009Martin et al 2015) regions found to be anomalous by three orfewer methods and lt 106 random probability were ignored Forone test codons in the aligned sequences with gaps were re-moved using POSORT (available at http1925598146_resourcese-textsREADME-POSORTpdf)

Models for ML analysis were tested using TOPALi (Milneet al 2009) and the ProtTest 3 server at httpdarwinuvigoes(Darriba et al 2011) the best fit models were found to beGTRthornU4thorn I (Tavare 1986) for nucleotide sequences andLGthornU4thorn I (Le and Gascuel 2008) for amino acid sequencesPhylogenetic trees were inferred using the neighbor-joining (NJ)facility in ClustalX (Jeanmougin et al 1998) the SplitsTreemethod (Huson and Bryant 2006) and PhyML 30 (ML) (Guindonand Gascuel 2003) and the support for their topologies assessedusing the log-likelihood support for the trees and the SH-support (Shimodaira and Hasegawa 1999) for their nodesNucleotide diversities (Nei and Li 1979) were computed usingDAMBE5 online (Xia 2013) Trees were drawn using FigtreeVersion 13 (httptreebioedacuksoftwarefigtree) and pairsof trees were compared using PATRISTIC (Fourment and Gibbs2006) to test for mutational saturation and were confirmed by

2 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

the method of Xia (2013) Most virus isolate collection dates(Supplementary Table S1) were obtained from Genbank files orfrom Visser Bellstedt and Charleston (2012) and dates forsome N and NTN isolates were not previously published(Ohshima Kmdashunpublished data) the dating of sequenceEU563512 is mentioned in the Section 4 The temporal signal insets of aligned sequences and the dates of the most commonrecent ancestor (TMRCA) and other nodes of inferred phyloge-nies were estimated by TempEst (Rambaut et al 2016) thelsquoLeast Squares Datingrsquo method of To et al (2015) using Versionlsd-03beta and by the probabilistic methods of BEAST v182(Drummond et al 2012) In BEAST analyses Bayes factors wereused to select the best-fitting molecular-clock model and coa-lescent priors for the tree topology and node times and wecompared strict and relaxed (uncorrelated exponential anduncorrelated lognormal) molecular clocks as well as five demo-graphic models (constant population size expansion growthexponential growth logistic growth and the Bayesian skylineplot) Posterior distributions of parameters including the treewere estimated from Markov Chain Monte Carlo samples takenevery 104 from 108 steps after discarding the first 10 percentand checked using Tracer v16 (httptreebioedacuksoftwaretracer) These provided the TMRCAs and other dateswere obtained from the lsquomaximum clade credibility treersquonamely the tree that was commonest among those observedThe adequacy of the temporal signal in our data was alsochecked by using ten independently date-randomized replicatesin both the least squared dating (LSD) and BEAST analyses(Ramsden Holmes and Charleston 2009 Duchene 2015)

3 Results

The principal ORFs of 240 genomic sequences of PVY gave analignment 9201 nucleotides long with 67 percent of the 3067codons invariate The ORFs formed five major groupings in amid-point rooted NJ tree (Fig 1A) Their relationships closely re-sembled those found in a maximum likelihood phylogeny re-ported by Kehoe and Jones (2016) who examined seventy-threeisolates from among the 240 that we examined In both our NJphylogeny and the published ML phylogeny the C phylogrouphad the longest branches and formed the basal tree One of itslineages diverges to give the O group and three others N R-1and R-2 see Fig 1 legend for the correspondences between ourphylogroups and those of Kehoe and Jones (2016) In both phy-logenies many sister sequences have very asymmetric relation-ships when summed to the midpoint root the shortestbranches are less than 25 percent of the longest This may indi-cate that evolutionary rates have varied or more likely thatsome of the sequences are recombinant The same arrangementof the same phylogroups was inferred by SplitsTree analysis(Fig 2A) with the C and O phylogroups linked to others by acomplex web of interrelationships indicating that many of theORFs are recombinants some of which are closely similar asshown by clustering of the links between the groups

The ORFs were also directly searched for phylogeneticanomalies using the RDP suite of programs and more than halfgave significant evidence of recombination but with the assign-ment of lsquoparentalrsquo sequences often uncertain as many alterna-tive combinations of ORFs were identified as possible lsquoparentsrsquo

31 The search for the n-rec PVY ORFs

The search for the n-rec ORFs was simplified by using align-ments obtained by partitioning the variable codons into

sequences of the syn codons (504 percent) and the n-syn co-dons (429 percent) they contained NJ trees of these two align-ments were closely similar topologically to those of thecomplete sequences (Fig 1A) and when compared in apatristic-distance graph (Fig S1) they showed no evidence ofmutational saturation along the main axis of the tree and thiswas confirmed using the binary test for saturation (Xia 2013)However there were discrete clusters of points off the main di-agonal as would be expected for rec sequences and the maindiagonal was very broad with the spread being greater in the n-syn axis (X) than in the syn axis (Y) again indicating the pres-ence of recombinants

ML trees obtained from the complete ORF sequences andfrom their syn and n-syn codons were also closely similar how-ever the syn codon tree had a greater statistical support thanthose calculated from the n-syn codons or the complete se-quences log-likelihoods of 453906 and 608258 and10985539165 respectively Furthermore more nodes in thesyn sequence tree had SH-support gt09 than the same nodes inthe other trees

A preliminary search of the syn codon sequences using RDPas well as NJ trees and patristic distance graphs showed that allmembers of the R-1 and R-2 phylogroups had some large rec re-gions with the same lsquoparentsrsquo A representative selection of 48sequences (Supplementary Table S1) was used to resolve thesegroupings The selection included all sixteen sequences of the Cphylogroup together with eight sequences from each of theother phylogroups chosen to be as representative as possible ofthe basal divergences of those phylogroups (ie those with theshortest branch lengths and therefore least changed after thephylogroup had been established) Sequences with the same

Figure 1 The branches of NJ phylogenies calculated from the main genomic

ORFs of (A) 240 isolates of PVY and (B) 103 of the isolates that showed no signifi-

cant evidence of recombination (see text) The marked clusters are of groups

named in Kehoe and Jones (2016) cluster C is of groups C1(II) and C2(III) O is of

O(I) and O5(X) N is of N(IV) XIII and NA-N(IX) NTN-1 is of NTN-NW SYR-I(XII)

NTN-B(VI) NTN-NW SYR-II(XI) N-Wi(VII) and NO(VIII) and NTN-2 of

NTN-A(V) Chile 3 is isolate Accession Code FJ214726

A J Gibbs et al | 3

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

pattern of recombination and parentage were progressivelyidentified and removed The most strongly supported phyloge-netic anomaly was shared by all eight R-2 sequences They hadan R-1 phylogroup sequence (EF026076) as the major parent andan N phylogroup minor parent (AJ890346) the latter region wasfrom around nt 5707 to the 30 terminus of the complete ORFs (38percent of the ORF) It was identified by seven out of nine RDPmethods with probabilities ranging from 1027 to 10112 All R-2sequences were therefore removed and the remaining forty se-quences again analyzed by RDP which then found the moststrongly supported and shared anomaly to be in all R-1 se-quences These had an O phylogroup major parent (EF026074)and an N phylogroup minor parent (X97895) that had in most(see below) been provided from the 50 terminus to nt 2205 (24percent of the ORF) this region was identified by seven out ofnine RDP methods with probabilities ranging from 1023 to1084 The syn and n-syn sequences of the remaining thirty-twoC O and N phylogroup ORFs gave almost identical ML and NJphylogenetic trees and a patristic distance graph of those trees

had most points close to a single diagonal the statistical sup-port for the syn ML tree was again greater than for the n-syn MLtree (log-likelihood 247234 and 292999 respectively)

Finally the syn codon sequences from C and O phylogroupisolates that were not among the forty-eight representative se-quences were added to the remaining thirty-two C and O synsequences to produce an alignment of 120 sequences This wasthen searched using RDP for other more specific rec regions (iesequence specific not phylogroup specific) and seventeen se-quences were found to have significant inter- and intra-phylogroup rec regions (ie three C phylogroup sequencesseven from the O phylogroup and seven from the N phy-logroup) One of those N phylogroup sequences was X97895from isolate N605 which has been used in several studies as areference sequence but was found by us to be a recombinantbetween AJ890346 and DQ157180 representing the two mainlineages of the N phylogroup Figure 1B shows the NJ tree of thecomplete ORFs of the remaining 103 sequences This phylogenyis much more symmetrical than that of the original 240 se-quences (Fig 1A) and its shortest branches (tip to root) arearound 70 percent of the length of the longest Figure 2B showstheir SplitsTree graph which again has a much simpler linkagestructure in its main branches than the 240 sequence tree(Fig 2A) Figure S1 also shows the patristic distance graph com-paring the NJ trees of the 103 syn and n-syn sequences and thisalso confirms that these n-rec sequences have evolved in a bio-logically coherent manner with most points aligned with andclose to a diagonal that has a slope around 08 and shows noevidence of mutational saturation

32 The R-1 and R-2 phylogroups

The genome maps of the three rec phylogroups are shown inFig 3 Phylogroup N is the sister lineage to the C and O phy-logroups and most genomes of these three phylogroups arenon-recombinant whereas all the R-1 and R-2 phylogroup se-quences are recombinants and have parents in the O and Nphylogroups

The ORFs of all forty-three R-1 phylogroup sequences aremost closely related to ORFs of the O phylogroup especiallythat of sequence EF026074 However all have a 50 terminal re-gion (24 percent of the complete ORF) that is closely related tothe homologous region of N phylogroup ORFs especially that ofX97895 In around half the ORFs (eg DQ157179 HQ912870JF927762) the N phylogroup region is from its 50 end to around nt2205 and in others (eg AJ890349 HQ912863 JN935419 andKJ801915) it is from nts 301 to 2205 and is preceded by a shortregion closest to O sequences In two of the sequences(HM991454 and JQ969040) the N phylogroup region is mostclosely related to AB331517 not X97896 These differences indi-cate that more than one event and O parent was involved in es-tablishing the R-1 phylogroup Some of the R-1 sequences (egAJ889868 and KJ634023) also have a short rec region around nts5600ndash6300 that is most closely related to that region of R-2isolates

The sequences of all seventy-six R-2 ORFs are closest to thatof an R-1 phylogroup sequence EF026076 but all have a 30 re-gion (nts 5707-end 38 percent of the ORF) that is most closelyrelated to the homologous region of AJ890346 an N phylogroupsequence In addition around one-fifth of the R-1 and R-2 se-quences also have smaller individual regions that in RDP analy-ses are recorded as significant inter- and intra-phylogrouprecombinations

Figure 2 The branches of SplitsTree phylogenies of the main genomic ORFs of

(A) 240 isolates of PVY and (B) 103 of those isolates that showed no significant

evidence of recombination The marked clusters are the same as those in Fig 1

Figure 3 A cartoon summarizing the relationships and genomic maps of the

five major phylogroups of the PVY isolates shown in Figs 1 and 2

4 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

The C phylogroup ORFs sequences including that of theChile 3 isolate are the most variable nucleotide diversitypfrac14 00974 6 00460 Those of the other four phylogroups aremuch less variable indicating that they have probably divergedmore recently the N phylogroup population is the most vari-able pfrac14 00256 6 00121 compared with the O phylogroup00211 6 00100 R-1 phylogroup 00196 6 00093 and R-2 theleast variable 00188 600089

33 Rooting the PVY phylogeny

BlastN and BlastP searches using representative sequencesfrom all of the n-rec PVY phylogroups found the most closelyrelated genomic sequences to be those of sunflower chlo-rotic mottle virus (JN863233) pepper severe mosaic virus(AM181350) and bidens mosaic virus (KF649336) In both MLand NJ trees of seventy-three dated n-rec complete ORFs(see below) including these three viruses as an outgroup theN phylogroup was placed as sister to all other PVY phy-logroups as in Fig 1B However when ML or NJ trees werecalculated from the encoded amino acid sequences or fromthe 166 core nucleotide sequences (see below) together withthe homologous regions of the three outgroup sequences theChile 3 sequence (FJ214726) was placed as sister to all otherPVY lineages This difference was found not only when com-plete aligned sequences were used but also when all indels(95 percent of the sequences) had been removed Nonethelessall the major nodes in trees found from complete anddegapped sequences of both ORF and amino acid treeshadgt097 SH-type statistical support Thus the exact positionof the basal node of the present PVY phylogeny is unresolvedbut is either side of a single robustly supported node linkingthe Chile 3 sequence and the N phylogroup to all other PVYsThe reason for this rooting uncertainty is unknown the indi-vidual mutational differences between the Chile 3 N phy-logroup and all other sequences are spread throughout the fulllength of the sequences Furthermore the points representingthe pairwise syn and n-syn comparisons of the Chile 3 se-quence and other PVY isolates in a patristic distance graph

(Fig S1) were close to the main diagonal trend indicating thatthe DnDs ratios of the Chile 3 isolate are not unusual but sim-ilar to those of other PVY isolates

34 Dating the PVY phylogeny

The 240 PVY genomic sequences we analyzed provided two setsof heterochronously dated sequences (Supplementary TableS1) Seventy-three non-recombinant isolates from the C O andN phylogroups provided a set of full length ORFs the earliestsample was collected in 1938 CE (Common Era or AD) the mostrecent in 2013 CE and their mean sample collection date was19994 CE As described above the R-1 and R-2 genomes are re-combinants that share the central core region of their genomes(nts 2206ndash5706) with those of the O phylogroup Sample collec-tion dates are known for ninety-eight R-1 and R-2 isolates(Supplementary Table S1) Thus together with the core regionsof the seventy-three dated recombinant sequences their sharedcentral region provides another dataset of 171 sequences for es-timating PVY phylogenetic divergence dates it covers the samerange of sampling dates but with a mean of 20041 CEAlthough there are twice as many sequences they are only 38percent of the length of the complete ORFs The alignment ofthe 171 core regions was re-examined by RDP and five found tohave a short 50 terminal region of N phylogroup sequence (clos-est to JQ969036) so these were removed to leave 166 sequencesin the dataset ML and NJ trees inferred from these core regionsdiffered only in minor details and resembled closely those ofthe seventy-three n-rec ORFs (Fig 1) except that they placed allthe O R-1 and R-2 sequences in single tight cluster that did notcleanly resolve into O R-1 and R-2 lineages (Fig 4) The poorresolution within this cluster of O R-1 R-2 sequences probablyresults from the small diversity within the cluster nucleotidediversity pfrac14 0018 6 0008 compared with 0048 6 0023 in all 166core sequences and 0075 6 0035 in the seventy-three n-reccomplete ORFs Nonetheless the distribution of O R-1 and R-2sequences within the lineages (Fig 4) indicate clearly that O iso-lates are ancestral to R-1 isolates and R-2 isolates are the mostrecent

Figure 4 A cartoon summarizing the divergence dates of a ML phylogeny of seventy-three dated n-rec ORF sequences (Table 1 - line 1) estimated by the LSD method

and the dates from the MCC tree of a BEAST analysis of the same data Bars indicate the 95 CI ranges for both analyses Arrows indicate the period 1935 CEndash2016 CE

during which the isolates were collected

A J Gibbs et al | 5

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

TempEst analyses of the two datasets (73 n-rec ORFs and 166cores) found that both had a strong temporal signal correlationcoefficients 0436 and 0228 Pfrac14 000012 and 00032 for the com-plete and core sequences respectively They had small resid-uals with no trend and these were mostly associated with the Cphylogroup sequences and that of Chile 3 The estimatedTMRCAs were 14116 CE and 10768 CE respectively (Fig 5)Separate TempEst analyses of the seventy-three n-rec cores andninety-three rec cores in the 166 core dataset found that theseventy-three n-rec core sequences had a temporal correlationof 0448 (Pfrac14 000007) with a TMRCA of 14598 CE whereas the re-gression for the ninety-three rec core sequences was 00356(Pfrac14 073) Thus the temporal signal detected by TempEst in the166 core sequences seems to be mostly if not exclusively in itsseventy-three n-rec core sequences

The LSD method was used in its lsquoconstrainedrsquo mode (ie anancestral node must be older than its daughter nodes) to calcu-late the dates evolutionary rates and confidence intervalsfor the TMRCAs of the ML and NJ trees of the seventy-three ORFsequences and also their encoded amino acid sequences(Table 1) Estimates of the TMRCA dates obtained from completesequences were close to the mean of those from partitioned synand n-syn codon sequences (Table 1) However the TMRCA dateestimates from ML trees were 80ndash120 years earlier than thoseobtained from NJ trees and there were even larger differencesbetween the TMRCA date estimates from ORF versus aminoacid sequences the former were 250ndash290 years more ancientthan those from amino acid sequences Table 1 also records theestimated dates of the five principal nodes in the phylogeniesRandomizing the collection dates confirmed that the data con-tained a clear temporal signal as the ML phylogeny of seventy-three n-rec complete sequences had a TMRCA date of 1085 CEbut gave a mean TMRCA date from twenty collection date ran-domizations of 12555 BCE (standard deviation 7076 yearsrange 10 BCEndash17036 BCE) Thus the TMRCA date calculated withthe true sampling dates is outside the range of dates obtainedwith randomized collection dates and 19 standard deviationsfrom their mean

To et al (2015) stated that although LSD can analyze treesobtained by any method lsquomore accurate results are

expected from trees obtained using maximum-likelihoodmethodsrsquo which with our data gave the earliest TMRCA esti-mates and these were most precise in that their 95 percentconfidence limits were much smaller (Table 1) They estimatethat the present PVY population diverged from a common an-cestor around 1085 CE with commensurately more recentdates for the divergences within the phylogeny (Fig 6)

LSD analysis of the 166 core sequences gave TMRCA dates(Table 1) for the ML and NJ trees close to those estimated for theseventy-three complete sequences and all major nodeshadgt09 SH-type statistical support including that of the com-bined cluster of O R-1 and R-2 phylogroup sequences and itsmajor nodes The mean root date from twenty collection daterandomizations of the ML data was 11491 BCE (standard devia-tion 5716 years range 1292 BCEndash15269 BCE) again confirmingthe strong temporal signal

The lineages within the cluster of O R-1 and R-2 sequencesin the ML and NJ trees of core sequences when collapsed hadidentical groupings (Fig 4) although the dates estimated forthese lineages differed (Table 1 and Fig 4) For example the MLtree estimated the origins of O isolates R-1 isolates and R-2isolates to be 19182 CE 19467 CE and 19694 CE respectively(Fig 4) whereas the NJ tree estimates were all earlier and wererespectively 19062 CE 19333 CE and 19603 CE

In BEAST analyses of the two datasets the best-supportedmodel for mutational substitution was GTRthorn IthornC4 while thebest model for the rate of substitution was the lsquorelaxed uncorre-lated lognormalrsquo clock and a population of constant size Alldatasets passed date-randomization tests The phylogenies in-ferred by BEAST had essentially the same topology as those in-ferred by ML and NJ however the dates (Table 2) weresignificantly earlier than those estimated by the regressionmethods The TMRCA of the seventy-three n-rec ORF sequenceswas 1590 BCE and was twenty-four CE for the 166 cores The to-pology and dates of the other nodes were obtained from themaximum clade credibility tree were commensurately earlierand showed exactly the same pattern of slightly unresolvedclustering of the cluster of O R-1 R-2 core sequences as in theML and NJ trees (Fig 4A) with their origins estimated to be18329 CE 18806 and 1932 respectively (Fig 4B) The 166 core

Figure 5 The NJ phylogeny of the central core regions of the genomes of 166 dated C O N R-1 and R-2 isolates (A) the Accession Codes of the C isolates are blue O

red R-1 green R-2 orange and N yellow (B) Summary of the phylogroup composition of the collapsed clusters andashf The node dates are from LSD estimates of MJ and NJ

phylogenies and from the MCC phylogeny of a BEAST analysis These node dates were used to calculate the date scales assuming linearity Labeled arrows indicate the

dates of the likely origins of the O R-1 and R-2 populations

6 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

sequences were partitioned into those from the seventy-threefrom n-rec sequences and those from the ninety-three from recsequences they gave TMRCA estimates of 214 BCE and 7 CE re-spectively and passed the date randomization test indicatingthat they contained significant temporal signals in contrast tothe results of the TempEst tests

Part of Fig 5 summarizes and compares the TMRCAs andconfidence intervals estimated by the different methods andshows that despite the TMRCA estimates covering a six-foldrange the 95 confidence limits of most overlap between 500CE and 1500 CE

Finally a simple comparison was made of the basal branchlength of the PVY cluster as a proportion of the basal branchlength of the potyviruses A ML phylogeny was calculated fromfour representative PVY ORF sequences [AB331515 (N phy-logroup) EU563512 (C) FJ214726 (C) HM367075 (O)] aligned withthose of the other 103 representative potyviruses and rymovi-ruses used for Fig 1 of Gibbs Nguyen and Ohshima (2015) Themean patristic distances of the sequences connected throughthe root of the phylogeny (ie from Narcissus degeneration vi-rus NC_008824 Onion yellow dwarf virus NC_005029 Shallotyellow stripe virus NC_007433 Vallota speciosa virus isolateMarijiniup 7 JQ723475 to all the other potyviruses) was 2258substitutionssite and through the root of the PVY sequences(ie from AB331515 (N) to all the other PVYs) was 0227 substitu-tionssite a ratio of 9941

Figure 6 A cartoon summarizing the TMRCA dates and 95 CIs (vertical scale)

of PVY estimated by TempEst LSD (ML and NJ trees) and BEAST (MCC tree) anal-

yses of the seventy-three n-rec ORF and 161-core sequences together with his-

torical events that may have influenced the evolution of the virus

Table 2 BEAST dates of TMRCA nodes in phylogenies of PVY sequences

Dataset 73 n-rec completeORFs

All 166 core regions 73 n-rec core regions 93 rec core regions

Sequence length (nt) 9201 3501 3501 3501No of sequences 73 166 73 93Sampling date range 1938ndash2013 1938ndash2013 1938ndash2012 1970ndash2013TMRCAa (years) 3603 (1411ndash6566) 1989 (1084ndash3096) 2227 (951ndash3952) 2006 (688ndash3755)TMRCAb (dates) 1590 (602 to 4553) 24 (929 to1083) 214 (1062 to 1939) 7 (1325 to 1742)Substitution rate

(ntsiteyear) b

597 105 (250 105ndash956 105)

999 105 (688 105ndash132 104)

924 105 (433 105ndash139 104)

866 105 (371 105ndash137 104)

aTMRCA lsquotime to the most recent common ancestorrsquo years before 2013 95 credibility intervals (CI) in parenthesesbTMRCA dates positive dates are CE (Common Erafrac14AD) negative are BCE (frac14BC) 95 credibility intervals (CI) in parentheses

Table 1 LSD dates of the TMRCA and major nodes in phylogenies of 73 n-rec PVY sequences

Datasetb Datac Algorithmd Sites used Node datesa

Node 1e Node 2 Node 3 Node 4 Node 5 Node 6 Ratef (104ssy)

ORF Nucs ML All 10854 (1251ndash123) 12693 15512 16514 18982 19018 207 (098ndash255)n-syn codons 11583 (1265ndash792) 13208 15753 16805 19066 19034 278 (194ndash323)syn codons 10176 (1234ndash151) 12437 15383 16293 18813 1904 177 (092ndash221)

NJ All 12049 (1496ndash-17272) 12808 15152 14431 1890 18349 111 ((005ndash171)n-syn codons 12435 (1513 to 230) 13091 1543 14628 19019 18396 153 (055ndash228)syn codons 11525 (1467 to 72 108) 1242 14387 14598 18466 18284 091 (1 106ndash146)

AAs ML All 13636 (1501ndash238) 14271 1603 17021 1943 19336 105 (041ndash126)NJ All 14599 (1642ndash70) 1513 1615 17023 19331 19104 086 (023ndash119)

Core Nucs ML All 11512 (1296ndash703) 12599 16266 16228 19182 19257 202 (134ndash243)NJ All 13077 (1521 to 71) 13077 16248 15446 19062 18975 119 (040ndash157)

aNode dates positive dates are CE (Common Erafrac14AD) negative are BCE (frac14BC)bDataset ORFmdashmajor open reading frame Core nucleotides 2206 5706cData Nucs nucleotides AAs encoded amino acidsdAlgorithm ML maximum likelihood (PhyML) NJ neighbor-joining (ClustalX)eNodes numbered as in Fig 4 95 confidence intervals for Node 1 onlyfRate evolutionary rate substitutionssiteyear 95 confidence intervals for Node 1

A J Gibbs et al | 7

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

4 Discussion

This article reports that using syn codon sequences simplifiedthe resolution of relationships in a complex population of PVYgenomes The phylogenies and RDP analyses calculated from thesyn codon sequences were simpler to interpret than those calcu-lated from the comparable n-syn codon or complete sequencesand had greater statistical support Syn codon changes reflect thetemporal component of phylogenetic change of gene populationsbetter than n-syn codons as changes of the latter are also linkedto changes of the encoded protein and incoherent changes ofsyn and n-syn codons in complete sequences may confoundanalysis of either especially when rec sequences are presentSignificantly Ohshima et al (2016) also found that Bayesian esti-mates of the ages of cucumber mosaic cucumovirus (CMV) popu-lations were more precise if made using syn codon sequencesrather than complete sequences CMV has three genomic seg-ments which encode five ORFs and they found that the age esti-mates for different ORFs using syn codon sequences had acoefficient of variation around half that of complete sequencesand the 95 CI ranges of those estimates was around 25ndash50 per-cent smaller than those of the original complete sequences

The strategy of progressively removing the most strongly sup-ported cluster specific recombinants resolved the complexity ofthe relationships of the rec and n-rec ORFs In Fig 1A (ie a NJ tree)and in Fig 2 of Kehoe and Jones (2016) (ie an ML tree) the N phy-logroup forms an outlier of the R-2 cluster We have shown thatthis topology is false as the dominance of the recombinant link-ages between the phylogroups seriously compromise agglomera-tive methods of tree-building in the following way Whensequences of all PVY phylogroups are present in an analysis theR-1 sequences are more closely related to their O parent than theirN parent as the former provided a greater percentage (76 percent)of their sequence and therefore the R-1 and O phylogroups linkLikewise the R-2 sequences are more closely related to their R-1parent than their N parent as the former provided 62 percent oftheir sequence and so again the R-2 phylogroup links to the R-1phylogroup rather than the N parent Finally the N phylogroupclusters with the R-2 ORFs as they share two-thirds of their se-quence whereas the true links of the N phylogroup as the sisterlineage to all the other phylogroups are more distant Thus the in-ferred inter-phylogroup relationships when all are present arefalse (Figs 1A and 2A) and the true topology of the phylogroups isonly revealed when the recombinants are removed

In the second part of this project we attempted to date themajor features of the PVY phylogeny using different heterochro-nous tip-dating methods The resulting estimates althoughoverlapping as judged by their 95 CIs differed significantly inscale so that the TMRCAs estimated especially for the basalnodes by the probabilistic method were between two to fourtimes earlier than those estimated by the regression methodsSo here we discuss whether events of the history of PVY and itshosts are congruent with those of the PVY phylogeny and pro-vide in essence independent node dating that supports someestimates more than others

All our analyses supported a single PVY phylogeny that hasseveral distinctive features (Fig 6) PVY is a recent member ofthe lsquoPVY lineagersquo of potyviruses most of whose members wereisolated first or only from plants in the Americas (Gibbs andOhshima 2010) and although PVY was first identified in the UKit almost certainly came from the Americas Furthermore thebasal divergences of the PVY phylogeny produce three lineagesand one is the Chile 3 isolate found in Capsicum baccatum only inSouth America (Moury 2010) Thus the basal (TMRCA) node in

the PVY phylogeny most likely represents an event that oc-curred in South America The other two basal PVY lineageshave been found throughout the world but not yet in SouthAmerica The flora and fauna of South America was biologicallyisolated from Europe until the start of the lsquoColumbianExchangersquo in the late 15th century CE (Crosby 1972) when theearly trans-Atlantic maritime traders first took American ani-mals and plants including potato tubers to Europe and viceversa Thus the basal nodes of the PVY phylogeny are likely torepresent events that predate the late 15th century and thisconclusion agrees with all our date estimates of those nodesthe most recent is 1411 CE with estimates from regression anal-yses ranging from 1085 CE to 1411 CE and those from probabil-istic analyses from 1590 BCE to 24 BCE

The second basal lineage the C phylogroup probably di-verged in Europe and consists mostly (712 isolates) of isolatesfound in tobacco tomato and pepper crops Potato pepper to-mato and tobacco were first introduced to Europe during theColumbian Exchange from their domestication centres in theAndean regions of Peru and Bolivia (potato) more widely in theAndes (tomato tobacco) and further north in the Americas (pep-per) (Bai and Lindhout 2007 Pickersgill 2007 Brown andHenfling 2014 Kraft et al 2014) There is no evidence that PVY istransmitted by sexually produced seed so it was most likely in-troduced to Europe in the small numbers of S tuberosum tubersfirst taken from South America to the Canary Islands in 1562 CEto Spain in 1570 CE and the UK in 1588 CE if so then all the ini-tial PVY population of Europe probably originated from infectedS tuberosum tubers in those cargoes The divergence to severalnon-potato hosts probably occurred outside South America al-though it is unknown to what extent different C phylogroup iso-lates are ecologically adapted to particular hosts The datesestimated by the regression methods are congruent with thisinterpretation whereas those estimated by the probabilisticmethods are not (Fig 6 nodes 3 and 4)

The third basal lineage the N phylogroup is known onlyfrom a cluster of closely related isolates so it probably radiatedrecently and at present we have no idea of when and how itfirst migrated from the Americas It has been isolated world-wide mostly from potatoes (1315 isolates)

The earliest potato breeding programs of any size began in1810 but did not become widespread until the second half ofthe 19th century Initially there were very few potato introduc-tions to Europe and for the first two centuries the crops had lit-tle genetic diversity Virus diseases that seriously debilitatedestablished cultivars but did not pass through sexually pro-duced potato seed probably encouraged selection and as a re-sult more cultivars were grown in the late 18th and early 19thcenturies These were taken from natural berries produced byself pollination Most were selfed derivatives so selection re-duced the gene pool (Glenndinning 1983) Potato blight diseasecaused by the oomycete Phytophthora infestans appeared in themid 19th century and eliminated almost all cultivars as theywere so inbred further reducing the gene pool Breeding amongblight survivors and new introductions from the Andean regionof South America led to many new cultivars being grown byearly 20th century (Glenndinning 1983) This introduction ofnew potato germplasm led to genetic diversity in the cropwhich apparently introduced the selection pressure in the formof PVY resistance genes that diversified the PVY population inthe early 20th century Our analyses indicate that the parents ofthe recombinant R-1 and R-2 strains were members of existingO and N populations therefore probably European As the Nc re-sistance gene which C strains ellicit was widely distributed in

8 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

early potato cultivars (Bawden 1936 Cockerham 1943 Bawdenand Sheffield 1944) selection for avirulence to this gene waslikely to be an important factor in the diversification of the Ophylogroup population Similarly N strains but not O strainsinfect plants with both Nc and the less common Ny and Nzgenes so selection for avirulence would again have favored di-versity in the N phylogroup population (Cockerham 1970 Jones1990Chikh-Ali et al 2014 Kehoe and Jones 2016)

PVY strains causing veinal necrosis symptoms in tobaccowere first reported in 1935 in the UK (Smith and Dennis 1940)and soon afterwards in Brazil Europe and North America(Nobrega and Silberschmidt 1944 Bawden and Kassanis 19471951 Richardson 1958 Klinkowski and Schmelzer 1960Silberschmidt 1960 Todd 1961 Kahn and Monroe 1963 Brucher1969) However whether these strains included ones in both theN and R1 phylogroups is unknown Strains inducing this symp-tom in tobacco produce mild symptoms in potato plants so theysoon became widely distributed because infected plants wereoften missed when seed potato crops were rogued (Todd 1961Jones 1990) Possibly this increased spread can be attributed toemergence of the R1 population at that time but evidence forthat is lacking and none of the sequences we analyzed camefrom early isolates Although the R-1 population is not verydamaging to potato it is difficult to control as its symptoms areso mild and it overcomes resistance genes Nc No and Nz Bycontrast the R-2 population is more damaging as it often causestuber necrosis and still overcomes these three resistancegenes

The tuber necrosis isolates R-2 phylogroup were firstsighted in the early 1980s in Europe and shown to be recombi-nants (Beczner et al 1984 Le Romancer Kerlan and Nedellec1994 Boonham et al 2002 Lorenzen et al 2006) These reportsare likely to be accurate as R-2 infections are so noticeable andby the 1980s potato crops in Europe were intensively monitoredfor disease Our estimates of the origin of R-2 isolates were19694 CE 19603 CE and 19324 CE for the ML-LSD NJ-LSD andBEAST analyses respectively Thus the ML-LSD estimate agreesmost closely with the crop record and the BEAST estimateagrees least well but all are possible Similarly PVY infectionsthat may have been caused by R-1 isolates were first reportedin 1935 and soon confirmed to be widespread The ML-LSDNJ-LSD and BEAST methods placed the origin of R-1 as 19467CE 19333 CE and 18806 CE respectively Thus the NJ-LSD esti-mate is the most likely the BEAST estimate less likely and theML-LSD impossible

All our date estimates agree that the near simultaneous radi-ation of the O and N populations of PVY (Figs 4 and 6 nodes 5and 6) coincided with the earliest potato breeding programs ofany size These programs increased the genetic diversity ofavailable potato cultivars and this would have included the Nyand Nz resistance genes They also coincided with cultural andagronomic changes to seed and ware potato crop production(Singh et al 2008 Gray et al 2010 Jones 2014) A combination ofthese factors is likely to have led to conditions favoring recom-binants This scenario fits better with the LSD dates 19182 CEand 19062 CE respectively than with the earlier BEAST date of18329 CE

Most of our PVY date estimates are congruent with those ofthe recent history of the international potato crop but this isnot surprising as the estimated dates have very broad overlap-ping 95 CI estimates One crucial event might however informus whether some TMRCA estimates are more accurate thanothers and that is whether the divergence of the C phylogroup(Fig 6 nodes 3 and 4) occurred before or after PVY was

introduced to Europe in the late 16th century the TMRCAs esti-mated by the regression methods are congruent with aEuropean divergence of the C phylogroup whereas those esti-mated by the probabilistic methods indicate a much earlier di-vergence of the C phylogroup This dilemma will only beresolved when the genomic sequences of more Chile 3-like iso-lates or of the C phylogroup are known

Our estimates of the dates for the PVY phylogeny are some-what earlier than those reported by Visser Bellstedt and Pirie(2012) who analyzed rec n-rec and partitioned PVY genomic se-quences by probabilistic methods although our 95 CI esti-mates overlap The difference may merely be the result ofpopulation sampling differences as a much larger set of se-quences was available to us and it included the early dated iso-late KP691327 (1943 CE) A further difference between ouranalyses is the date given to sequence EU563512 which asVisser Bellstedt and Pirie (2012) reported was obtained lsquofrom apotato plant vegetatively propagated since 1938rsquo and sequencedin 2007 (Dullemans et al 2011) We dated this sequence as 1938the earliest of our sequences whereas Visser Bellstedt andPirie (2012) dated it as 2007 In making this choice we checkedhow this difference affected the date estimates of the seventy-three n-rec dataset and found that using the 1938 date gave anML-LSD TMRCA of 10854 CE (95 CI 1267 CEndash321 CE) whereasusing the 2007 date gave a TMRCA of 363 CE and very muchbroader 95 CIs (918 CE to 18 109 BCE) and when the se-quence was omitted an intermediate TMRCA of 687 CE (1016CEndash9503 BCE) Thus the 95 CI range was smallest using the1938 date and justified our choice

Finally we explored the possibility of extrapolating datesover an even larger timescale Gibbs et al (2008) suggested thatthe Neolithic invention of agriculture which in Eurasia occurredin the northern LevantineMesopotamian area around 12000BCE (Bellwood 2005 Pinhasi Fort and Ammerman 2005) pro-duced the conditions for the starburst potyvirus diversificationand this provides another possible dating point for potyvirusesWe estimated that in an ML phylogeny of the major ORFs of alarge representative set of potyviruses and four PVY isolates(one N one O and two C phylogroup isolates) the ratio of themean patristic distances of the sequences connected throughthe root of the phylogeny and through the root of the PVY se-quences was 9941 indicating that a TMRCA of all potyvirusesof around 10000 BCE would give a TMRCA of PVY around 1000CE

Supplementary data

Supplementary data are available at Virus Evolution online

Acknowledgements

We are very grateful to Hien To and Olivier Gascuel for greathelp with use of their LSD software

Conflict of interest None declared

ReferencesAbascal F Zardoya R and Telford M J (2010) lsquoTranslatorX

Multiple Alignment of Nucleotide Sequences Guided by AminoAcid Translationsrsquo Nucleic Acids Research 38 W7ndash13

Altschul S F et al (1990) lsquoBasic Local Alignment Search ToolrsquoJournal of Molecular Biology 215 403ndash10

A J Gibbs et al | 9

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

Bai Y and Lindhout P (2007) lsquoDomestication and Breeding ofTomatoes What Have We Gained and What Can We Gain inthe Futurersquo Annals of Botany 100 1085ndash94

Bawden F C (1936) lsquoThe Viruses Causing Top Necrosis(Acronecrosis) of the Potatorsquo Annals of Applied Biology 23487ndash97

and Kassanis B (1947) lsquoThe Behaviour of Some NaturallyOccurring Strains of Potato Virus Yrsquo Annals of Applied Biology31 503ndash16

and (1951) lsquoSerologically Related Strains of PotatoVirus Y That Are Mutually Antagonistic in Plantsrsquo Annals ofApplied Biology 38 402ndash10 [CrossRef]Mismatch]

and Sheffield F M L (1944) lsquoThe Relationships of SomeViruses Causing Necrotic Diseases of the Potatorsquo Annals ofApplied Biology 31 33ndash40

Beczner L et al (1984) lsquoStudies on the Etiology of Tuber NecroticRingspot Disease in Potatorsquo Potato Research 27 339ndash52

Bellwood P (2005) First Farmers The Origins of AgriculturalSocieties Malden Oxford and Carlton Blackwell Publishing

Boni M F Posada D and Feldman M W (2007) lsquoAn ExactNonparametric Method for Inferring Mosaic Structure inSequence Tripletsrsquo Genetics 176 1035ndash47

Boonham N et al (2002) lsquoBiological and Sequence Comparisonsof Potato virus Y Isolates Associated with Potato Tuber NecroticRingspot Diseasersquo Plant Pathology 51 117ndash26

Brown C R (1993) lsquoOrigin and History of the Potatorsquo AmericanJournal of Potato Research 70 363ndash73

and Henfling J W (2014) 0A History of the Potato0 inNavarre R and Pavek MJ (eds) The Potato Botany Productionand Use pp 1ndash11Wallingford UK CABI

Brucher H (1969) lsquoObservations on Origin and Spread of Y-N

Virus in South Americarsquo Angewandte Botanik 43 241ndash9Chikh-Ali M et al (2014) lsquoEvidence of a Monogenic Nature of

the Nz Gene Against Potato Virus Y Strain Z (PVYZ) in PotatorsquoAmerican Journal of Potato Research 91 649ndash54

Cockerham G (1943) lsquoThe Reactions of Potato Varieties toViruses X A B and Crsquo Annals of Applied Biology 30 338ndash44

(1970) lsquoGenetical Studies on Resistance to Potato Viruses Xand Yrsquo Heredity 25 309ndash48

Crosby A E (1972) 0The Columbian Exchange Biological and CulturalConsequences of 14940 268 pp Greenwood Publishing Group

Cuevas J M et al (2012) lsquoPhylogeography and MolecularEvolution of Potato Virus Yrsquo PLoS One 7 e37853

Darriba D et al (2011) lsquoProtTest 3 Fast Selection of Best-FitModels of Protein Evolutionrsquo Bioinformatics 27 1164ndash5

De Bokx J A and van der Want J P H (eds) (1987) 0Viruses ofPotatoes and Seed-Potato Production0 2nd edn WageningenCentre for Agricultural Publishing and Documentation

Drummond A J et al (2012) lsquoBayesian Phylogenetics with BEAUtiand the BEAST 17rsquo Molecular Biology and Evolution 29 1969ndash73

Duchene S (2015) lsquoThe Performance of the Date-RandomizationTest in Phylogenetic Analyses of Time-Structured Virus DatarsquoMolecular Biology and Evolution 32 1895ndash906

Dullemans A M et al (2011) lsquoComplete Nucleotide Sequence ofa Potato Isolate of Strain Group C of Potato Virus Y from 1938rsquoArchives of Virology 156 473ndash7

Fourment M and Gibbs M J (2006) lsquoPATRISTIC a Program forCalculating Patristic Distances and Graphically Comparing theComponents of Genetic Changersquo BMC Evolutionary Biology 6 1

Gibbs A J and Ohshima K (2010) lsquoPotyviruses in the DigitalAgersquo Annual Review of Phytopathology 48 205ndash23

et al (2008) lsquoThe Prehistory of Potyviruses Their InitialRadiation Was during the Dawn of Agriculturersquo PLoS One 3 e2523

Nguyen H D and Ohshima K (2015) lsquoThe lsquoEmergencersquo ofTurnip Mosaic Virus Was Probably a lsquoGene-for-Quasi-GenersquoEventrsquo Current Opinion in Virology 10 20ndash5

Armstrong J S and Gibbs A J (2000) lsquoSister-Scanning AMonte Carlo Procedure for Assessing Signals in RecombinantSequencesrsquo Bioinformatics 16 573ndash82

et al (2006) lsquoThe Variable Codons of H3 Influenza A VirusHaemagglutinin Genesrsquo Archives of Virology 52 11ndash24

Glais L Tribodet M and Kerlan C (2002) lsquoGenomic Variabilityin Potato Potyvirus Y (PVY) Evidence that PVYNW and PVYNTN

Variants Are Single to Multiple Recombinants Between PVYO

and PVYN Isolatesrsquo Archives of Virology 147 363ndash78Glenndinning D R (1983) lsquoPotato Introductions and Breeding up

to the 20th Centuryrsquo New Phytologist 94 479ndash505Gray S et al (2010) lsquoPotato Virus Y An Evolving Concern for Potato

Crops in the United States and Canadarsquo Plant Disease 94 1384ndash97Guindon S and Gascuel O (2003) lsquoA Simple Fast and Accurate

Algorithm to Estimate Large Phylogenies by MaximumLikelihoodrsquo Systematic Biology 52 696ndash704

Hall T A (1999) lsquoBioEdit A User-Friendly Biological SequenceAlignment Editor and Analysis Program for Windows 9598NTrsquo Nucleic Acids Symposium Series 41 95ndash8

Hawkes J G (1978) 0History of the Potato0 in Harris P M (ed) ThePotato Crop pp 1ndash14New York Springer

and Fransisco-Ortega J (1993) lsquoThe Early History of thePotato in Europersquo Euphytica 70 1ndash7

Holmes E C Worobey M and Rambaut A (1999) lsquoPhylogeneticEvidence for Recombination in Dengue Virusrsquo Molecular Biologyand Evolution 16 405ndash9

Hu X et al (2009a) lsquoSequence Characteristics of Potato Virus YRecombinantsrsquo Journal of General Virology 90 3033ndash41

et al (2009b) lsquoA Novel Recombinant Strain of Potato VirusY Suggests a New Viral Genetic Determinant of Vein Necrosisin Tobaccorsquo Virus Research 143 68ndash76

Huson D H and Bryant D (2006) lsquoApplication of PhylogeneticNetworks in Evolutionary Studiesrsquo Molecular Biology andEvolution 23 254ndash67

Jeanmougin F et al (1998) lsquoMultiple Sequence Alignment withClustal Xrsquo Trends in Biochemical Sciences 23 403ndash5

Jones R A C (1981) 0The Ecology of Viruses Infecting Wild andCultivated Potatoes in the Andean Region of South America0in Thresh J M (ed) Pests Pathogens and Vegetation pp 89ndash107London Pitman

(1990) lsquoStrain Group Specific and Virus Specific HypersensitiveReactions to Infection with Potyviruses in Potato CultivarsrsquoAnnals of Applied Biology 117 93ndash105 [CrossRef]Mismatch]

(2014) 0Virus Disease Problems Facing Potato IndustriesWorldwide Viruses Found Climate Change ImplicationsRationalising Virus Strain Nomenclature and Addressing thePotato Virus Y issue0 in Navarre R and Pavek M J (eds) ThePotato Botany Production and UsesWallingford UK CABI

and Kehoe M A (2016) lsquoA Proposal to Rationalize Within-Species Plant Virus Nomenclature Benefits and Implicationsof Inactionrsquo Archives of Virology 161 2051ndash7

Kahn R P and Monroe R L (1963) lsquoDetection of the TobaccoVeinal Necrosis Strain of Potato Virus in Solanum cardenasiiand S andigenum Introduced into the United StatesrsquoPhytopathology 53 1356ndash9

Karasev A V and Gray S M (2013) lsquoGenetic Diversity of Potatovirus Y Complexrsquo American Journal of Potato Research 90 7ndash13

et al (2011) lsquoGenetic Diversity and Origin of the OrdinaryStrain of Potato virus Y (PVY) and Origin of Recombinant PVYStrainsrsquo Phytopathology 101 778ndash85

10 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

Katoh K and Standley D M (2013) lsquoMAFFT Multiple SequenceAlignment Software Version 7 Improvements in Performanceand Usabilityrsquo Molecular Biology and Evolution 30 772ndash80

Kehoe M A and Jones R A C (2016) lsquoPotato virus Y StrainNomenclature Lessons From Comparing Isolates ObtainedOver a 73 Year Periodrsquo Plant Pathology 65 322ndash33

Kerlan C (2006) 0Potato Virus Y0 in Descriptions of Plant Viruses No414 Adams J and Antoniw J (eds) Rothamsted Research UK

and Moury B (2008) 0Potato Virus Y0 in Granoff A andWebster R G (eds) Encyclopedia of Virology 3rd edn pp 287ndash96New York Academic Press

Klinkowski M and Schmelzer K (1960) lsquoA Necrotic Type ofPotato Virus Yrsquo American Potato Journal 37 221ndash8

Kraft K H et al (2014) lsquoMultiple Lines of Evidence for the Originof Domesticated Chili Pepper Capsicum annuum in MexicorsquoProceedings of the National Academy of Science USA 111 6165ndash70

Le Romancer M Kerlan C and Nedellec M (1994) lsquoBiologicalCharacterization of Various Geographical Isolates of PotatoVirus Y Inducing Superficial Necrosis on Potato Tubersrsquo PlantPathology 43 138ndash44

Le S Q and Gascuel G (2008) lsquoAn Improved General Amino AcidReplacement Matrixrsquo Molecular Biology and Evolution 25 1307ndash20

Lemey P et al (2009) lsquoIdentifying Recombinants in Human andPrimate Immunodeficiency Virus Sequence Alignments UsingQuartet Scanningrsquo BMC Bioinformatics 10 126

Loebenstein G et al (eds) (2001) Virus and Virus-Like Diseases ofPotatoes and Production of Seed-Potatoes Dordrecht KluwerAcademic Publishers

Lorenzen J H et al (2006) lsquoWhole Genome Characterization ofPotato virus Y Isolates Collected in the Western USA and TheirComparison to Isolates From Europe and Canadarsquo Archives ofVirology 151 1055ndash74

Martin D and Rybicki E (2000) lsquoRDP Detection of RecombinationAmongst Aligned Sequencesrsquo Bioinformatics 16 562ndash3

Martin D P et al (2005) lsquoA Modified BOOTSCAN Algorithm forAutomated Identification of Recombinant Sequences andRecombination Breakpointsrsquo AIDS Research and HumanRetroviruses 21 98ndash102

et al (2015) lsquoRDP4 Detection and Analysis of RecombinationPatterns in Virus Genomesrsquo Virus Evolution 1 1ndash5

Milne I et al (2009) lsquoTOPALi v2 a Rich Graphical Interface forEvolutionary Analyses of Multiple Alignments on HPC Clustersand Multi-Core Desktopsrsquo Bioinformatics 25 126ndash7

Maynard Smith J (1992) lsquoAnalyzing the Mosaic Structure ofGenesrsquo Journal of Molecular Evolution 34 126ndash9

McGuire G and Wright F (2000) lsquoTOPAL 20 ImprovedDetection of Mosaic Sequences Within Multiple AlignmentsrsquoBioinformatics 16 130ndash4

Moury B (2010) lsquoA New Lineage Sheds Light on the EvolutionaryHistory of Potato Virus Yrsquo Molecular Plant Pathology 11 161ndash8

et al (2002) lsquoEvidence for Diversifying Selection in Potatovirusy and in the Coat Protein of Other Potyvirusesrsquo Journal ofGeneral Virology 83 2563ndash73

and Simon V (2011) lsquodNdS-Based Methods DetectPositive Selection Linked to Trade-Offs Between DifferentFitness Traits in the Coat Protein of Potato Virus Yrsquo MolecularBiology and Evolution 28 2707ndash17

Nei M and Li W H (1979) lsquoMathematical Model for StudyingGenetic Variation in Terms of Restriction EndonucleasesrsquoProceedings of the National Academy of Science USA 76 5269ndash73

Nobrega N R and Silberschmidt K (1944) lsquoSobre uma provavelvariante do virus ldquoYrdquo da batatinha (Solanum virus 2 Orton)que tern a peculiaridade de provocar necroses em plantas defumorsquo Arquivos Do Instituto Biologico (Sao Paulo) 15 307ndash30

Ogawa T et al (2008) lsquoGenetic Structure of a Population ofPotato Virus Y Inducing Potato Tuber Necrotic RingspotDisease in Japan Comparison with North American andEuropean Populationsrsquo Virus Research 131 199ndash212

et al (2012) lsquoThe Genetic Structure of Populations of PotatoVirus Y in Japan Based on the Analysis of 20 Full GenomicSequencesrsquo Journal of Phytopathology 160 661ndash73

Ohshima K et al (2016) lsquoTemporal Analysis of Reassortmentand Molecular Evolution of Cucumber Mosaic Virus ExtraClues From Its Segmented Genomersquo Virology 487 188ndash97

Padidam M Sawyer S and Fauquet C M (1999) lsquoPossibleEmergence of New Geminiviruses by Frequent RecombinationrsquoVirology 265 218ndash25

Pickersgill B (2007) lsquoDomestication of Plants in the AmericasInsights From Mendelian and Molecular Geneticsrsquo Annals ofBotany 100 925ndash40

Pinhasi R Fort J and Ammerman A J (2005) lsquoTracing the Originand Spread of Agriculture in Europersquo PLoS Biology 3 e410

Posada D and Crandall K A (2001) lsquoEvaluation of Methods forDetecting Recombination From DNA Sequences ComputerSimulationsrsquo Proceedings of the National Academy of Science USA98 13757ndash62

Rambaut A et al (2016) lsquoExploring the Temporal Structure ofHeterochronous Sequences Using TempEstrsquo Virus Evolution 2vew007

Ramsden C Holmes E C and Charleston M A (2009)lsquoHantavirus Evolution in Relation to Its Rodent and InsectivoreHosts No Evidence for Codivergencersquo Molecular BiologyEvolution 26 143ndash53

Richardson D E (1958) lsquoSome Observations on the Tobacco VeinalNecrosis Strain of Potato Virus Yrsquo Plant Pathology 7 133ndash5

Rowley J S Gray S M and Karasev A V (2015) lsquoScreeningPotato Cultivars for New Sources of Resistance to Potato VirusYrsquo American Journal of Potato Research 92 38ndash48

Salaman R N (1954) lsquoThe Early European Potato Its Characterand Place of Originrsquo Journal of the Linnean Society Botany 53 1ndash27

and Hawkes J G (1949) lsquoThe Character of the EarlyEuropean Potatorsquo Proceedings of the Linnean Society London 16171ndash84

Schubert J Fomitcheva V and Sztangret-Wisniewska J (2007)lsquoDifferentiation of Potato Virus Y Strains Using Improved Setsof Diagnostic PCR-Primersrsquo Journal of Virological Methods 14066ndash74

Shimodaira H and Hasegawa M (1999) lsquoMultiple Comparisonsof Log-Likelihoods with Applications to PhylogeneticInferencersquo Molecular Biology and Evolution 16 1114ndash6

Silberschmidt K M (1960) lsquoTypes of Potato Virus Y Necrotic toTobacco History and Recent Observationrsquo American PotatoJournal 37 151

Singh R P et al (2008) lsquoDiscussion Paper The Naming ofPotato Virus Y Strains Infecting Potatorsquo Archives of Virology153 1ndash13

Smith K M (1931) lsquoOn the Composite Nature of Certain PotatoVirus Diseases of the Mosaic Group as Revealed by the Use ofPlant Indicators and Selective Methods of TransmissionrsquoProceedings of the Royal Society B 109 251ndash66

and Dennis R W G (1940) lsquoSome Notes on a SuggestedVariant of Solanum Virus 2 (Potato Virus Y)rsquo Annals of AppliedBiology 27 65ndash70

Spetz C et al (2003) lsquoMolecular Resolution of Complexes ofPotyviruses Infecting Solanaceous Crops at the Centre ofOrigin in Perursquo Journal of General Virology 84 2565ndash78

Stevenson W R et al (eds) (2001) Compendium of Potato Diseases2nd edn St Paul MN APS Press

A J Gibbs et al | 11

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

Tavare S (1986) lsquoSome Probabilistic and Statistical Problems inthe Analysis of DNA Sequencesrsquo Lectures on Mathematics in theLife Sciences 17 57ndash86

Tian Y P et al (2011) lsquoGenetic Diversity of Potato Virus YInfecting Tobacco Crops in Chinarsquo Phytopathology 10 377ndash87

To T H et al (2015) lsquoFast Dating Using Least-Squares Criteriaand Algorithmsrsquo Systematic Biology 65 82ndash97

Todd J M (1961) lsquoTobacco Veinal Necrosis on Potato in ScotlandControl of the Outbreak and Some Characters of the Virusrsquo inProceedings of the Fourth Conference on Potato Virus DiseasesBraunschweig 1960 pp 82ndash92

Visser J C and Bellstedt D U (2009) lsquoAn Assessment ofMolecular Variability and Recombination Patterns in SouthAfrican Isolates of Potato Virus Yrsquo Archives of Virology 1541891ndash900

and Pirie M D (2012) lsquoThe Recent RecombinantEvolution of a Major Crop Pathogen Potato Virus Yrsquo PLoS One7 e50631

Xia X (2013) lsquoDAMBE5 A Comprehensive Software Package forData Analysis in Molecular Biology and Evolutionrsquo MolecularBiology and Evolution 30 1720ndash8

12 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

  • vex002-TF7
  • vex002-TF8
  • vex002-TF1
  • vex002-TF2
  • vex002-TF3
  • vex002-TF4
  • vex002-TF5
  • vex002-TF6
Page 3: The phylogenetics of the global population of potato virus Y and … · 2019. 4. 30. · The phylogenetics of the global population of potato virus Y and its necrogenic recombinants

the method of Xia (2013) Most virus isolate collection dates(Supplementary Table S1) were obtained from Genbank files orfrom Visser Bellstedt and Charleston (2012) and dates forsome N and NTN isolates were not previously published(Ohshima Kmdashunpublished data) the dating of sequenceEU563512 is mentioned in the Section 4 The temporal signal insets of aligned sequences and the dates of the most commonrecent ancestor (TMRCA) and other nodes of inferred phyloge-nies were estimated by TempEst (Rambaut et al 2016) thelsquoLeast Squares Datingrsquo method of To et al (2015) using Versionlsd-03beta and by the probabilistic methods of BEAST v182(Drummond et al 2012) In BEAST analyses Bayes factors wereused to select the best-fitting molecular-clock model and coa-lescent priors for the tree topology and node times and wecompared strict and relaxed (uncorrelated exponential anduncorrelated lognormal) molecular clocks as well as five demo-graphic models (constant population size expansion growthexponential growth logistic growth and the Bayesian skylineplot) Posterior distributions of parameters including the treewere estimated from Markov Chain Monte Carlo samples takenevery 104 from 108 steps after discarding the first 10 percentand checked using Tracer v16 (httptreebioedacuksoftwaretracer) These provided the TMRCAs and other dateswere obtained from the lsquomaximum clade credibility treersquonamely the tree that was commonest among those observedThe adequacy of the temporal signal in our data was alsochecked by using ten independently date-randomized replicatesin both the least squared dating (LSD) and BEAST analyses(Ramsden Holmes and Charleston 2009 Duchene 2015)

3 Results

The principal ORFs of 240 genomic sequences of PVY gave analignment 9201 nucleotides long with 67 percent of the 3067codons invariate The ORFs formed five major groupings in amid-point rooted NJ tree (Fig 1A) Their relationships closely re-sembled those found in a maximum likelihood phylogeny re-ported by Kehoe and Jones (2016) who examined seventy-threeisolates from among the 240 that we examined In both our NJphylogeny and the published ML phylogeny the C phylogrouphad the longest branches and formed the basal tree One of itslineages diverges to give the O group and three others N R-1and R-2 see Fig 1 legend for the correspondences between ourphylogroups and those of Kehoe and Jones (2016) In both phy-logenies many sister sequences have very asymmetric relation-ships when summed to the midpoint root the shortestbranches are less than 25 percent of the longest This may indi-cate that evolutionary rates have varied or more likely thatsome of the sequences are recombinant The same arrangementof the same phylogroups was inferred by SplitsTree analysis(Fig 2A) with the C and O phylogroups linked to others by acomplex web of interrelationships indicating that many of theORFs are recombinants some of which are closely similar asshown by clustering of the links between the groups

The ORFs were also directly searched for phylogeneticanomalies using the RDP suite of programs and more than halfgave significant evidence of recombination but with the assign-ment of lsquoparentalrsquo sequences often uncertain as many alterna-tive combinations of ORFs were identified as possible lsquoparentsrsquo

31 The search for the n-rec PVY ORFs

The search for the n-rec ORFs was simplified by using align-ments obtained by partitioning the variable codons into

sequences of the syn codons (504 percent) and the n-syn co-dons (429 percent) they contained NJ trees of these two align-ments were closely similar topologically to those of thecomplete sequences (Fig 1A) and when compared in apatristic-distance graph (Fig S1) they showed no evidence ofmutational saturation along the main axis of the tree and thiswas confirmed using the binary test for saturation (Xia 2013)However there were discrete clusters of points off the main di-agonal as would be expected for rec sequences and the maindiagonal was very broad with the spread being greater in the n-syn axis (X) than in the syn axis (Y) again indicating the pres-ence of recombinants

ML trees obtained from the complete ORF sequences andfrom their syn and n-syn codons were also closely similar how-ever the syn codon tree had a greater statistical support thanthose calculated from the n-syn codons or the complete se-quences log-likelihoods of 453906 and 608258 and10985539165 respectively Furthermore more nodes in thesyn sequence tree had SH-support gt09 than the same nodes inthe other trees

A preliminary search of the syn codon sequences using RDPas well as NJ trees and patristic distance graphs showed that allmembers of the R-1 and R-2 phylogroups had some large rec re-gions with the same lsquoparentsrsquo A representative selection of 48sequences (Supplementary Table S1) was used to resolve thesegroupings The selection included all sixteen sequences of the Cphylogroup together with eight sequences from each of theother phylogroups chosen to be as representative as possible ofthe basal divergences of those phylogroups (ie those with theshortest branch lengths and therefore least changed after thephylogroup had been established) Sequences with the same

Figure 1 The branches of NJ phylogenies calculated from the main genomic

ORFs of (A) 240 isolates of PVY and (B) 103 of the isolates that showed no signifi-

cant evidence of recombination (see text) The marked clusters are of groups

named in Kehoe and Jones (2016) cluster C is of groups C1(II) and C2(III) O is of

O(I) and O5(X) N is of N(IV) XIII and NA-N(IX) NTN-1 is of NTN-NW SYR-I(XII)

NTN-B(VI) NTN-NW SYR-II(XI) N-Wi(VII) and NO(VIII) and NTN-2 of

NTN-A(V) Chile 3 is isolate Accession Code FJ214726

A J Gibbs et al | 3

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

pattern of recombination and parentage were progressivelyidentified and removed The most strongly supported phyloge-netic anomaly was shared by all eight R-2 sequences They hadan R-1 phylogroup sequence (EF026076) as the major parent andan N phylogroup minor parent (AJ890346) the latter region wasfrom around nt 5707 to the 30 terminus of the complete ORFs (38percent of the ORF) It was identified by seven out of nine RDPmethods with probabilities ranging from 1027 to 10112 All R-2sequences were therefore removed and the remaining forty se-quences again analyzed by RDP which then found the moststrongly supported and shared anomaly to be in all R-1 se-quences These had an O phylogroup major parent (EF026074)and an N phylogroup minor parent (X97895) that had in most(see below) been provided from the 50 terminus to nt 2205 (24percent of the ORF) this region was identified by seven out ofnine RDP methods with probabilities ranging from 1023 to1084 The syn and n-syn sequences of the remaining thirty-twoC O and N phylogroup ORFs gave almost identical ML and NJphylogenetic trees and a patristic distance graph of those trees

had most points close to a single diagonal the statistical sup-port for the syn ML tree was again greater than for the n-syn MLtree (log-likelihood 247234 and 292999 respectively)

Finally the syn codon sequences from C and O phylogroupisolates that were not among the forty-eight representative se-quences were added to the remaining thirty-two C and O synsequences to produce an alignment of 120 sequences This wasthen searched using RDP for other more specific rec regions (iesequence specific not phylogroup specific) and seventeen se-quences were found to have significant inter- and intra-phylogroup rec regions (ie three C phylogroup sequencesseven from the O phylogroup and seven from the N phy-logroup) One of those N phylogroup sequences was X97895from isolate N605 which has been used in several studies as areference sequence but was found by us to be a recombinantbetween AJ890346 and DQ157180 representing the two mainlineages of the N phylogroup Figure 1B shows the NJ tree of thecomplete ORFs of the remaining 103 sequences This phylogenyis much more symmetrical than that of the original 240 se-quences (Fig 1A) and its shortest branches (tip to root) arearound 70 percent of the length of the longest Figure 2B showstheir SplitsTree graph which again has a much simpler linkagestructure in its main branches than the 240 sequence tree(Fig 2A) Figure S1 also shows the patristic distance graph com-paring the NJ trees of the 103 syn and n-syn sequences and thisalso confirms that these n-rec sequences have evolved in a bio-logically coherent manner with most points aligned with andclose to a diagonal that has a slope around 08 and shows noevidence of mutational saturation

32 The R-1 and R-2 phylogroups

The genome maps of the three rec phylogroups are shown inFig 3 Phylogroup N is the sister lineage to the C and O phy-logroups and most genomes of these three phylogroups arenon-recombinant whereas all the R-1 and R-2 phylogroup se-quences are recombinants and have parents in the O and Nphylogroups

The ORFs of all forty-three R-1 phylogroup sequences aremost closely related to ORFs of the O phylogroup especiallythat of sequence EF026074 However all have a 50 terminal re-gion (24 percent of the complete ORF) that is closely related tothe homologous region of N phylogroup ORFs especially that ofX97895 In around half the ORFs (eg DQ157179 HQ912870JF927762) the N phylogroup region is from its 50 end to around nt2205 and in others (eg AJ890349 HQ912863 JN935419 andKJ801915) it is from nts 301 to 2205 and is preceded by a shortregion closest to O sequences In two of the sequences(HM991454 and JQ969040) the N phylogroup region is mostclosely related to AB331517 not X97896 These differences indi-cate that more than one event and O parent was involved in es-tablishing the R-1 phylogroup Some of the R-1 sequences (egAJ889868 and KJ634023) also have a short rec region around nts5600ndash6300 that is most closely related to that region of R-2isolates

The sequences of all seventy-six R-2 ORFs are closest to thatof an R-1 phylogroup sequence EF026076 but all have a 30 re-gion (nts 5707-end 38 percent of the ORF) that is most closelyrelated to the homologous region of AJ890346 an N phylogroupsequence In addition around one-fifth of the R-1 and R-2 se-quences also have smaller individual regions that in RDP analy-ses are recorded as significant inter- and intra-phylogrouprecombinations

Figure 2 The branches of SplitsTree phylogenies of the main genomic ORFs of

(A) 240 isolates of PVY and (B) 103 of those isolates that showed no significant

evidence of recombination The marked clusters are the same as those in Fig 1

Figure 3 A cartoon summarizing the relationships and genomic maps of the

five major phylogroups of the PVY isolates shown in Figs 1 and 2

4 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

The C phylogroup ORFs sequences including that of theChile 3 isolate are the most variable nucleotide diversitypfrac14 00974 6 00460 Those of the other four phylogroups aremuch less variable indicating that they have probably divergedmore recently the N phylogroup population is the most vari-able pfrac14 00256 6 00121 compared with the O phylogroup00211 6 00100 R-1 phylogroup 00196 6 00093 and R-2 theleast variable 00188 600089

33 Rooting the PVY phylogeny

BlastN and BlastP searches using representative sequencesfrom all of the n-rec PVY phylogroups found the most closelyrelated genomic sequences to be those of sunflower chlo-rotic mottle virus (JN863233) pepper severe mosaic virus(AM181350) and bidens mosaic virus (KF649336) In both MLand NJ trees of seventy-three dated n-rec complete ORFs(see below) including these three viruses as an outgroup theN phylogroup was placed as sister to all other PVY phy-logroups as in Fig 1B However when ML or NJ trees werecalculated from the encoded amino acid sequences or fromthe 166 core nucleotide sequences (see below) together withthe homologous regions of the three outgroup sequences theChile 3 sequence (FJ214726) was placed as sister to all otherPVY lineages This difference was found not only when com-plete aligned sequences were used but also when all indels(95 percent of the sequences) had been removed Nonethelessall the major nodes in trees found from complete anddegapped sequences of both ORF and amino acid treeshadgt097 SH-type statistical support Thus the exact positionof the basal node of the present PVY phylogeny is unresolvedbut is either side of a single robustly supported node linkingthe Chile 3 sequence and the N phylogroup to all other PVYsThe reason for this rooting uncertainty is unknown the indi-vidual mutational differences between the Chile 3 N phy-logroup and all other sequences are spread throughout the fulllength of the sequences Furthermore the points representingthe pairwise syn and n-syn comparisons of the Chile 3 se-quence and other PVY isolates in a patristic distance graph

(Fig S1) were close to the main diagonal trend indicating thatthe DnDs ratios of the Chile 3 isolate are not unusual but sim-ilar to those of other PVY isolates

34 Dating the PVY phylogeny

The 240 PVY genomic sequences we analyzed provided two setsof heterochronously dated sequences (Supplementary TableS1) Seventy-three non-recombinant isolates from the C O andN phylogroups provided a set of full length ORFs the earliestsample was collected in 1938 CE (Common Era or AD) the mostrecent in 2013 CE and their mean sample collection date was19994 CE As described above the R-1 and R-2 genomes are re-combinants that share the central core region of their genomes(nts 2206ndash5706) with those of the O phylogroup Sample collec-tion dates are known for ninety-eight R-1 and R-2 isolates(Supplementary Table S1) Thus together with the core regionsof the seventy-three dated recombinant sequences their sharedcentral region provides another dataset of 171 sequences for es-timating PVY phylogenetic divergence dates it covers the samerange of sampling dates but with a mean of 20041 CEAlthough there are twice as many sequences they are only 38percent of the length of the complete ORFs The alignment ofthe 171 core regions was re-examined by RDP and five found tohave a short 50 terminal region of N phylogroup sequence (clos-est to JQ969036) so these were removed to leave 166 sequencesin the dataset ML and NJ trees inferred from these core regionsdiffered only in minor details and resembled closely those ofthe seventy-three n-rec ORFs (Fig 1) except that they placed allthe O R-1 and R-2 sequences in single tight cluster that did notcleanly resolve into O R-1 and R-2 lineages (Fig 4) The poorresolution within this cluster of O R-1 R-2 sequences probablyresults from the small diversity within the cluster nucleotidediversity pfrac14 0018 6 0008 compared with 0048 6 0023 in all 166core sequences and 0075 6 0035 in the seventy-three n-reccomplete ORFs Nonetheless the distribution of O R-1 and R-2sequences within the lineages (Fig 4) indicate clearly that O iso-lates are ancestral to R-1 isolates and R-2 isolates are the mostrecent

Figure 4 A cartoon summarizing the divergence dates of a ML phylogeny of seventy-three dated n-rec ORF sequences (Table 1 - line 1) estimated by the LSD method

and the dates from the MCC tree of a BEAST analysis of the same data Bars indicate the 95 CI ranges for both analyses Arrows indicate the period 1935 CEndash2016 CE

during which the isolates were collected

A J Gibbs et al | 5

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

TempEst analyses of the two datasets (73 n-rec ORFs and 166cores) found that both had a strong temporal signal correlationcoefficients 0436 and 0228 Pfrac14 000012 and 00032 for the com-plete and core sequences respectively They had small resid-uals with no trend and these were mostly associated with the Cphylogroup sequences and that of Chile 3 The estimatedTMRCAs were 14116 CE and 10768 CE respectively (Fig 5)Separate TempEst analyses of the seventy-three n-rec cores andninety-three rec cores in the 166 core dataset found that theseventy-three n-rec core sequences had a temporal correlationof 0448 (Pfrac14 000007) with a TMRCA of 14598 CE whereas the re-gression for the ninety-three rec core sequences was 00356(Pfrac14 073) Thus the temporal signal detected by TempEst in the166 core sequences seems to be mostly if not exclusively in itsseventy-three n-rec core sequences

The LSD method was used in its lsquoconstrainedrsquo mode (ie anancestral node must be older than its daughter nodes) to calcu-late the dates evolutionary rates and confidence intervalsfor the TMRCAs of the ML and NJ trees of the seventy-three ORFsequences and also their encoded amino acid sequences(Table 1) Estimates of the TMRCA dates obtained from completesequences were close to the mean of those from partitioned synand n-syn codon sequences (Table 1) However the TMRCA dateestimates from ML trees were 80ndash120 years earlier than thoseobtained from NJ trees and there were even larger differencesbetween the TMRCA date estimates from ORF versus aminoacid sequences the former were 250ndash290 years more ancientthan those from amino acid sequences Table 1 also records theestimated dates of the five principal nodes in the phylogeniesRandomizing the collection dates confirmed that the data con-tained a clear temporal signal as the ML phylogeny of seventy-three n-rec complete sequences had a TMRCA date of 1085 CEbut gave a mean TMRCA date from twenty collection date ran-domizations of 12555 BCE (standard deviation 7076 yearsrange 10 BCEndash17036 BCE) Thus the TMRCA date calculated withthe true sampling dates is outside the range of dates obtainedwith randomized collection dates and 19 standard deviationsfrom their mean

To et al (2015) stated that although LSD can analyze treesobtained by any method lsquomore accurate results are

expected from trees obtained using maximum-likelihoodmethodsrsquo which with our data gave the earliest TMRCA esti-mates and these were most precise in that their 95 percentconfidence limits were much smaller (Table 1) They estimatethat the present PVY population diverged from a common an-cestor around 1085 CE with commensurately more recentdates for the divergences within the phylogeny (Fig 6)

LSD analysis of the 166 core sequences gave TMRCA dates(Table 1) for the ML and NJ trees close to those estimated for theseventy-three complete sequences and all major nodeshadgt09 SH-type statistical support including that of the com-bined cluster of O R-1 and R-2 phylogroup sequences and itsmajor nodes The mean root date from twenty collection daterandomizations of the ML data was 11491 BCE (standard devia-tion 5716 years range 1292 BCEndash15269 BCE) again confirmingthe strong temporal signal

The lineages within the cluster of O R-1 and R-2 sequencesin the ML and NJ trees of core sequences when collapsed hadidentical groupings (Fig 4) although the dates estimated forthese lineages differed (Table 1 and Fig 4) For example the MLtree estimated the origins of O isolates R-1 isolates and R-2isolates to be 19182 CE 19467 CE and 19694 CE respectively(Fig 4) whereas the NJ tree estimates were all earlier and wererespectively 19062 CE 19333 CE and 19603 CE

In BEAST analyses of the two datasets the best-supportedmodel for mutational substitution was GTRthorn IthornC4 while thebest model for the rate of substitution was the lsquorelaxed uncorre-lated lognormalrsquo clock and a population of constant size Alldatasets passed date-randomization tests The phylogenies in-ferred by BEAST had essentially the same topology as those in-ferred by ML and NJ however the dates (Table 2) weresignificantly earlier than those estimated by the regressionmethods The TMRCA of the seventy-three n-rec ORF sequenceswas 1590 BCE and was twenty-four CE for the 166 cores The to-pology and dates of the other nodes were obtained from themaximum clade credibility tree were commensurately earlierand showed exactly the same pattern of slightly unresolvedclustering of the cluster of O R-1 R-2 core sequences as in theML and NJ trees (Fig 4A) with their origins estimated to be18329 CE 18806 and 1932 respectively (Fig 4B) The 166 core

Figure 5 The NJ phylogeny of the central core regions of the genomes of 166 dated C O N R-1 and R-2 isolates (A) the Accession Codes of the C isolates are blue O

red R-1 green R-2 orange and N yellow (B) Summary of the phylogroup composition of the collapsed clusters andashf The node dates are from LSD estimates of MJ and NJ

phylogenies and from the MCC phylogeny of a BEAST analysis These node dates were used to calculate the date scales assuming linearity Labeled arrows indicate the

dates of the likely origins of the O R-1 and R-2 populations

6 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

sequences were partitioned into those from the seventy-threefrom n-rec sequences and those from the ninety-three from recsequences they gave TMRCA estimates of 214 BCE and 7 CE re-spectively and passed the date randomization test indicatingthat they contained significant temporal signals in contrast tothe results of the TempEst tests

Part of Fig 5 summarizes and compares the TMRCAs andconfidence intervals estimated by the different methods andshows that despite the TMRCA estimates covering a six-foldrange the 95 confidence limits of most overlap between 500CE and 1500 CE

Finally a simple comparison was made of the basal branchlength of the PVY cluster as a proportion of the basal branchlength of the potyviruses A ML phylogeny was calculated fromfour representative PVY ORF sequences [AB331515 (N phy-logroup) EU563512 (C) FJ214726 (C) HM367075 (O)] aligned withthose of the other 103 representative potyviruses and rymovi-ruses used for Fig 1 of Gibbs Nguyen and Ohshima (2015) Themean patristic distances of the sequences connected throughthe root of the phylogeny (ie from Narcissus degeneration vi-rus NC_008824 Onion yellow dwarf virus NC_005029 Shallotyellow stripe virus NC_007433 Vallota speciosa virus isolateMarijiniup 7 JQ723475 to all the other potyviruses) was 2258substitutionssite and through the root of the PVY sequences(ie from AB331515 (N) to all the other PVYs) was 0227 substitu-tionssite a ratio of 9941

Figure 6 A cartoon summarizing the TMRCA dates and 95 CIs (vertical scale)

of PVY estimated by TempEst LSD (ML and NJ trees) and BEAST (MCC tree) anal-

yses of the seventy-three n-rec ORF and 161-core sequences together with his-

torical events that may have influenced the evolution of the virus

Table 2 BEAST dates of TMRCA nodes in phylogenies of PVY sequences

Dataset 73 n-rec completeORFs

All 166 core regions 73 n-rec core regions 93 rec core regions

Sequence length (nt) 9201 3501 3501 3501No of sequences 73 166 73 93Sampling date range 1938ndash2013 1938ndash2013 1938ndash2012 1970ndash2013TMRCAa (years) 3603 (1411ndash6566) 1989 (1084ndash3096) 2227 (951ndash3952) 2006 (688ndash3755)TMRCAb (dates) 1590 (602 to 4553) 24 (929 to1083) 214 (1062 to 1939) 7 (1325 to 1742)Substitution rate

(ntsiteyear) b

597 105 (250 105ndash956 105)

999 105 (688 105ndash132 104)

924 105 (433 105ndash139 104)

866 105 (371 105ndash137 104)

aTMRCA lsquotime to the most recent common ancestorrsquo years before 2013 95 credibility intervals (CI) in parenthesesbTMRCA dates positive dates are CE (Common Erafrac14AD) negative are BCE (frac14BC) 95 credibility intervals (CI) in parentheses

Table 1 LSD dates of the TMRCA and major nodes in phylogenies of 73 n-rec PVY sequences

Datasetb Datac Algorithmd Sites used Node datesa

Node 1e Node 2 Node 3 Node 4 Node 5 Node 6 Ratef (104ssy)

ORF Nucs ML All 10854 (1251ndash123) 12693 15512 16514 18982 19018 207 (098ndash255)n-syn codons 11583 (1265ndash792) 13208 15753 16805 19066 19034 278 (194ndash323)syn codons 10176 (1234ndash151) 12437 15383 16293 18813 1904 177 (092ndash221)

NJ All 12049 (1496ndash-17272) 12808 15152 14431 1890 18349 111 ((005ndash171)n-syn codons 12435 (1513 to 230) 13091 1543 14628 19019 18396 153 (055ndash228)syn codons 11525 (1467 to 72 108) 1242 14387 14598 18466 18284 091 (1 106ndash146)

AAs ML All 13636 (1501ndash238) 14271 1603 17021 1943 19336 105 (041ndash126)NJ All 14599 (1642ndash70) 1513 1615 17023 19331 19104 086 (023ndash119)

Core Nucs ML All 11512 (1296ndash703) 12599 16266 16228 19182 19257 202 (134ndash243)NJ All 13077 (1521 to 71) 13077 16248 15446 19062 18975 119 (040ndash157)

aNode dates positive dates are CE (Common Erafrac14AD) negative are BCE (frac14BC)bDataset ORFmdashmajor open reading frame Core nucleotides 2206 5706cData Nucs nucleotides AAs encoded amino acidsdAlgorithm ML maximum likelihood (PhyML) NJ neighbor-joining (ClustalX)eNodes numbered as in Fig 4 95 confidence intervals for Node 1 onlyfRate evolutionary rate substitutionssiteyear 95 confidence intervals for Node 1

A J Gibbs et al | 7

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

4 Discussion

This article reports that using syn codon sequences simplifiedthe resolution of relationships in a complex population of PVYgenomes The phylogenies and RDP analyses calculated from thesyn codon sequences were simpler to interpret than those calcu-lated from the comparable n-syn codon or complete sequencesand had greater statistical support Syn codon changes reflect thetemporal component of phylogenetic change of gene populationsbetter than n-syn codons as changes of the latter are also linkedto changes of the encoded protein and incoherent changes ofsyn and n-syn codons in complete sequences may confoundanalysis of either especially when rec sequences are presentSignificantly Ohshima et al (2016) also found that Bayesian esti-mates of the ages of cucumber mosaic cucumovirus (CMV) popu-lations were more precise if made using syn codon sequencesrather than complete sequences CMV has three genomic seg-ments which encode five ORFs and they found that the age esti-mates for different ORFs using syn codon sequences had acoefficient of variation around half that of complete sequencesand the 95 CI ranges of those estimates was around 25ndash50 per-cent smaller than those of the original complete sequences

The strategy of progressively removing the most strongly sup-ported cluster specific recombinants resolved the complexity ofthe relationships of the rec and n-rec ORFs In Fig 1A (ie a NJ tree)and in Fig 2 of Kehoe and Jones (2016) (ie an ML tree) the N phy-logroup forms an outlier of the R-2 cluster We have shown thatthis topology is false as the dominance of the recombinant link-ages between the phylogroups seriously compromise agglomera-tive methods of tree-building in the following way Whensequences of all PVY phylogroups are present in an analysis theR-1 sequences are more closely related to their O parent than theirN parent as the former provided a greater percentage (76 percent)of their sequence and therefore the R-1 and O phylogroups linkLikewise the R-2 sequences are more closely related to their R-1parent than their N parent as the former provided 62 percent oftheir sequence and so again the R-2 phylogroup links to the R-1phylogroup rather than the N parent Finally the N phylogroupclusters with the R-2 ORFs as they share two-thirds of their se-quence whereas the true links of the N phylogroup as the sisterlineage to all the other phylogroups are more distant Thus the in-ferred inter-phylogroup relationships when all are present arefalse (Figs 1A and 2A) and the true topology of the phylogroups isonly revealed when the recombinants are removed

In the second part of this project we attempted to date themajor features of the PVY phylogeny using different heterochro-nous tip-dating methods The resulting estimates althoughoverlapping as judged by their 95 CIs differed significantly inscale so that the TMRCAs estimated especially for the basalnodes by the probabilistic method were between two to fourtimes earlier than those estimated by the regression methodsSo here we discuss whether events of the history of PVY and itshosts are congruent with those of the PVY phylogeny and pro-vide in essence independent node dating that supports someestimates more than others

All our analyses supported a single PVY phylogeny that hasseveral distinctive features (Fig 6) PVY is a recent member ofthe lsquoPVY lineagersquo of potyviruses most of whose members wereisolated first or only from plants in the Americas (Gibbs andOhshima 2010) and although PVY was first identified in the UKit almost certainly came from the Americas Furthermore thebasal divergences of the PVY phylogeny produce three lineagesand one is the Chile 3 isolate found in Capsicum baccatum only inSouth America (Moury 2010) Thus the basal (TMRCA) node in

the PVY phylogeny most likely represents an event that oc-curred in South America The other two basal PVY lineageshave been found throughout the world but not yet in SouthAmerica The flora and fauna of South America was biologicallyisolated from Europe until the start of the lsquoColumbianExchangersquo in the late 15th century CE (Crosby 1972) when theearly trans-Atlantic maritime traders first took American ani-mals and plants including potato tubers to Europe and viceversa Thus the basal nodes of the PVY phylogeny are likely torepresent events that predate the late 15th century and thisconclusion agrees with all our date estimates of those nodesthe most recent is 1411 CE with estimates from regression anal-yses ranging from 1085 CE to 1411 CE and those from probabil-istic analyses from 1590 BCE to 24 BCE

The second basal lineage the C phylogroup probably di-verged in Europe and consists mostly (712 isolates) of isolatesfound in tobacco tomato and pepper crops Potato pepper to-mato and tobacco were first introduced to Europe during theColumbian Exchange from their domestication centres in theAndean regions of Peru and Bolivia (potato) more widely in theAndes (tomato tobacco) and further north in the Americas (pep-per) (Bai and Lindhout 2007 Pickersgill 2007 Brown andHenfling 2014 Kraft et al 2014) There is no evidence that PVY istransmitted by sexually produced seed so it was most likely in-troduced to Europe in the small numbers of S tuberosum tubersfirst taken from South America to the Canary Islands in 1562 CEto Spain in 1570 CE and the UK in 1588 CE if so then all the ini-tial PVY population of Europe probably originated from infectedS tuberosum tubers in those cargoes The divergence to severalnon-potato hosts probably occurred outside South America al-though it is unknown to what extent different C phylogroup iso-lates are ecologically adapted to particular hosts The datesestimated by the regression methods are congruent with thisinterpretation whereas those estimated by the probabilisticmethods are not (Fig 6 nodes 3 and 4)

The third basal lineage the N phylogroup is known onlyfrom a cluster of closely related isolates so it probably radiatedrecently and at present we have no idea of when and how itfirst migrated from the Americas It has been isolated world-wide mostly from potatoes (1315 isolates)

The earliest potato breeding programs of any size began in1810 but did not become widespread until the second half ofthe 19th century Initially there were very few potato introduc-tions to Europe and for the first two centuries the crops had lit-tle genetic diversity Virus diseases that seriously debilitatedestablished cultivars but did not pass through sexually pro-duced potato seed probably encouraged selection and as a re-sult more cultivars were grown in the late 18th and early 19thcenturies These were taken from natural berries produced byself pollination Most were selfed derivatives so selection re-duced the gene pool (Glenndinning 1983) Potato blight diseasecaused by the oomycete Phytophthora infestans appeared in themid 19th century and eliminated almost all cultivars as theywere so inbred further reducing the gene pool Breeding amongblight survivors and new introductions from the Andean regionof South America led to many new cultivars being grown byearly 20th century (Glenndinning 1983) This introduction ofnew potato germplasm led to genetic diversity in the cropwhich apparently introduced the selection pressure in the formof PVY resistance genes that diversified the PVY population inthe early 20th century Our analyses indicate that the parents ofthe recombinant R-1 and R-2 strains were members of existingO and N populations therefore probably European As the Nc re-sistance gene which C strains ellicit was widely distributed in

8 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

early potato cultivars (Bawden 1936 Cockerham 1943 Bawdenand Sheffield 1944) selection for avirulence to this gene waslikely to be an important factor in the diversification of the Ophylogroup population Similarly N strains but not O strainsinfect plants with both Nc and the less common Ny and Nzgenes so selection for avirulence would again have favored di-versity in the N phylogroup population (Cockerham 1970 Jones1990Chikh-Ali et al 2014 Kehoe and Jones 2016)

PVY strains causing veinal necrosis symptoms in tobaccowere first reported in 1935 in the UK (Smith and Dennis 1940)and soon afterwards in Brazil Europe and North America(Nobrega and Silberschmidt 1944 Bawden and Kassanis 19471951 Richardson 1958 Klinkowski and Schmelzer 1960Silberschmidt 1960 Todd 1961 Kahn and Monroe 1963 Brucher1969) However whether these strains included ones in both theN and R1 phylogroups is unknown Strains inducing this symp-tom in tobacco produce mild symptoms in potato plants so theysoon became widely distributed because infected plants wereoften missed when seed potato crops were rogued (Todd 1961Jones 1990) Possibly this increased spread can be attributed toemergence of the R1 population at that time but evidence forthat is lacking and none of the sequences we analyzed camefrom early isolates Although the R-1 population is not verydamaging to potato it is difficult to control as its symptoms areso mild and it overcomes resistance genes Nc No and Nz Bycontrast the R-2 population is more damaging as it often causestuber necrosis and still overcomes these three resistancegenes

The tuber necrosis isolates R-2 phylogroup were firstsighted in the early 1980s in Europe and shown to be recombi-nants (Beczner et al 1984 Le Romancer Kerlan and Nedellec1994 Boonham et al 2002 Lorenzen et al 2006) These reportsare likely to be accurate as R-2 infections are so noticeable andby the 1980s potato crops in Europe were intensively monitoredfor disease Our estimates of the origin of R-2 isolates were19694 CE 19603 CE and 19324 CE for the ML-LSD NJ-LSD andBEAST analyses respectively Thus the ML-LSD estimate agreesmost closely with the crop record and the BEAST estimateagrees least well but all are possible Similarly PVY infectionsthat may have been caused by R-1 isolates were first reportedin 1935 and soon confirmed to be widespread The ML-LSDNJ-LSD and BEAST methods placed the origin of R-1 as 19467CE 19333 CE and 18806 CE respectively Thus the NJ-LSD esti-mate is the most likely the BEAST estimate less likely and theML-LSD impossible

All our date estimates agree that the near simultaneous radi-ation of the O and N populations of PVY (Figs 4 and 6 nodes 5and 6) coincided with the earliest potato breeding programs ofany size These programs increased the genetic diversity ofavailable potato cultivars and this would have included the Nyand Nz resistance genes They also coincided with cultural andagronomic changes to seed and ware potato crop production(Singh et al 2008 Gray et al 2010 Jones 2014) A combination ofthese factors is likely to have led to conditions favoring recom-binants This scenario fits better with the LSD dates 19182 CEand 19062 CE respectively than with the earlier BEAST date of18329 CE

Most of our PVY date estimates are congruent with those ofthe recent history of the international potato crop but this isnot surprising as the estimated dates have very broad overlap-ping 95 CI estimates One crucial event might however informus whether some TMRCA estimates are more accurate thanothers and that is whether the divergence of the C phylogroup(Fig 6 nodes 3 and 4) occurred before or after PVY was

introduced to Europe in the late 16th century the TMRCAs esti-mated by the regression methods are congruent with aEuropean divergence of the C phylogroup whereas those esti-mated by the probabilistic methods indicate a much earlier di-vergence of the C phylogroup This dilemma will only beresolved when the genomic sequences of more Chile 3-like iso-lates or of the C phylogroup are known

Our estimates of the dates for the PVY phylogeny are some-what earlier than those reported by Visser Bellstedt and Pirie(2012) who analyzed rec n-rec and partitioned PVY genomic se-quences by probabilistic methods although our 95 CI esti-mates overlap The difference may merely be the result ofpopulation sampling differences as a much larger set of se-quences was available to us and it included the early dated iso-late KP691327 (1943 CE) A further difference between ouranalyses is the date given to sequence EU563512 which asVisser Bellstedt and Pirie (2012) reported was obtained lsquofrom apotato plant vegetatively propagated since 1938rsquo and sequencedin 2007 (Dullemans et al 2011) We dated this sequence as 1938the earliest of our sequences whereas Visser Bellstedt andPirie (2012) dated it as 2007 In making this choice we checkedhow this difference affected the date estimates of the seventy-three n-rec dataset and found that using the 1938 date gave anML-LSD TMRCA of 10854 CE (95 CI 1267 CEndash321 CE) whereasusing the 2007 date gave a TMRCA of 363 CE and very muchbroader 95 CIs (918 CE to 18 109 BCE) and when the se-quence was omitted an intermediate TMRCA of 687 CE (1016CEndash9503 BCE) Thus the 95 CI range was smallest using the1938 date and justified our choice

Finally we explored the possibility of extrapolating datesover an even larger timescale Gibbs et al (2008) suggested thatthe Neolithic invention of agriculture which in Eurasia occurredin the northern LevantineMesopotamian area around 12000BCE (Bellwood 2005 Pinhasi Fort and Ammerman 2005) pro-duced the conditions for the starburst potyvirus diversificationand this provides another possible dating point for potyvirusesWe estimated that in an ML phylogeny of the major ORFs of alarge representative set of potyviruses and four PVY isolates(one N one O and two C phylogroup isolates) the ratio of themean patristic distances of the sequences connected throughthe root of the phylogeny and through the root of the PVY se-quences was 9941 indicating that a TMRCA of all potyvirusesof around 10000 BCE would give a TMRCA of PVY around 1000CE

Supplementary data

Supplementary data are available at Virus Evolution online

Acknowledgements

We are very grateful to Hien To and Olivier Gascuel for greathelp with use of their LSD software

Conflict of interest None declared

ReferencesAbascal F Zardoya R and Telford M J (2010) lsquoTranslatorX

Multiple Alignment of Nucleotide Sequences Guided by AminoAcid Translationsrsquo Nucleic Acids Research 38 W7ndash13

Altschul S F et al (1990) lsquoBasic Local Alignment Search ToolrsquoJournal of Molecular Biology 215 403ndash10

A J Gibbs et al | 9

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

Bai Y and Lindhout P (2007) lsquoDomestication and Breeding ofTomatoes What Have We Gained and What Can We Gain inthe Futurersquo Annals of Botany 100 1085ndash94

Bawden F C (1936) lsquoThe Viruses Causing Top Necrosis(Acronecrosis) of the Potatorsquo Annals of Applied Biology 23487ndash97

and Kassanis B (1947) lsquoThe Behaviour of Some NaturallyOccurring Strains of Potato Virus Yrsquo Annals of Applied Biology31 503ndash16

and (1951) lsquoSerologically Related Strains of PotatoVirus Y That Are Mutually Antagonistic in Plantsrsquo Annals ofApplied Biology 38 402ndash10 [CrossRef]Mismatch]

and Sheffield F M L (1944) lsquoThe Relationships of SomeViruses Causing Necrotic Diseases of the Potatorsquo Annals ofApplied Biology 31 33ndash40

Beczner L et al (1984) lsquoStudies on the Etiology of Tuber NecroticRingspot Disease in Potatorsquo Potato Research 27 339ndash52

Bellwood P (2005) First Farmers The Origins of AgriculturalSocieties Malden Oxford and Carlton Blackwell Publishing

Boni M F Posada D and Feldman M W (2007) lsquoAn ExactNonparametric Method for Inferring Mosaic Structure inSequence Tripletsrsquo Genetics 176 1035ndash47

Boonham N et al (2002) lsquoBiological and Sequence Comparisonsof Potato virus Y Isolates Associated with Potato Tuber NecroticRingspot Diseasersquo Plant Pathology 51 117ndash26

Brown C R (1993) lsquoOrigin and History of the Potatorsquo AmericanJournal of Potato Research 70 363ndash73

and Henfling J W (2014) 0A History of the Potato0 inNavarre R and Pavek MJ (eds) The Potato Botany Productionand Use pp 1ndash11Wallingford UK CABI

Brucher H (1969) lsquoObservations on Origin and Spread of Y-N

Virus in South Americarsquo Angewandte Botanik 43 241ndash9Chikh-Ali M et al (2014) lsquoEvidence of a Monogenic Nature of

the Nz Gene Against Potato Virus Y Strain Z (PVYZ) in PotatorsquoAmerican Journal of Potato Research 91 649ndash54

Cockerham G (1943) lsquoThe Reactions of Potato Varieties toViruses X A B and Crsquo Annals of Applied Biology 30 338ndash44

(1970) lsquoGenetical Studies on Resistance to Potato Viruses Xand Yrsquo Heredity 25 309ndash48

Crosby A E (1972) 0The Columbian Exchange Biological and CulturalConsequences of 14940 268 pp Greenwood Publishing Group

Cuevas J M et al (2012) lsquoPhylogeography and MolecularEvolution of Potato Virus Yrsquo PLoS One 7 e37853

Darriba D et al (2011) lsquoProtTest 3 Fast Selection of Best-FitModels of Protein Evolutionrsquo Bioinformatics 27 1164ndash5

De Bokx J A and van der Want J P H (eds) (1987) 0Viruses ofPotatoes and Seed-Potato Production0 2nd edn WageningenCentre for Agricultural Publishing and Documentation

Drummond A J et al (2012) lsquoBayesian Phylogenetics with BEAUtiand the BEAST 17rsquo Molecular Biology and Evolution 29 1969ndash73

Duchene S (2015) lsquoThe Performance of the Date-RandomizationTest in Phylogenetic Analyses of Time-Structured Virus DatarsquoMolecular Biology and Evolution 32 1895ndash906

Dullemans A M et al (2011) lsquoComplete Nucleotide Sequence ofa Potato Isolate of Strain Group C of Potato Virus Y from 1938rsquoArchives of Virology 156 473ndash7

Fourment M and Gibbs M J (2006) lsquoPATRISTIC a Program forCalculating Patristic Distances and Graphically Comparing theComponents of Genetic Changersquo BMC Evolutionary Biology 6 1

Gibbs A J and Ohshima K (2010) lsquoPotyviruses in the DigitalAgersquo Annual Review of Phytopathology 48 205ndash23

et al (2008) lsquoThe Prehistory of Potyviruses Their InitialRadiation Was during the Dawn of Agriculturersquo PLoS One 3 e2523

Nguyen H D and Ohshima K (2015) lsquoThe lsquoEmergencersquo ofTurnip Mosaic Virus Was Probably a lsquoGene-for-Quasi-GenersquoEventrsquo Current Opinion in Virology 10 20ndash5

Armstrong J S and Gibbs A J (2000) lsquoSister-Scanning AMonte Carlo Procedure for Assessing Signals in RecombinantSequencesrsquo Bioinformatics 16 573ndash82

et al (2006) lsquoThe Variable Codons of H3 Influenza A VirusHaemagglutinin Genesrsquo Archives of Virology 52 11ndash24

Glais L Tribodet M and Kerlan C (2002) lsquoGenomic Variabilityin Potato Potyvirus Y (PVY) Evidence that PVYNW and PVYNTN

Variants Are Single to Multiple Recombinants Between PVYO

and PVYN Isolatesrsquo Archives of Virology 147 363ndash78Glenndinning D R (1983) lsquoPotato Introductions and Breeding up

to the 20th Centuryrsquo New Phytologist 94 479ndash505Gray S et al (2010) lsquoPotato Virus Y An Evolving Concern for Potato

Crops in the United States and Canadarsquo Plant Disease 94 1384ndash97Guindon S and Gascuel O (2003) lsquoA Simple Fast and Accurate

Algorithm to Estimate Large Phylogenies by MaximumLikelihoodrsquo Systematic Biology 52 696ndash704

Hall T A (1999) lsquoBioEdit A User-Friendly Biological SequenceAlignment Editor and Analysis Program for Windows 9598NTrsquo Nucleic Acids Symposium Series 41 95ndash8

Hawkes J G (1978) 0History of the Potato0 in Harris P M (ed) ThePotato Crop pp 1ndash14New York Springer

and Fransisco-Ortega J (1993) lsquoThe Early History of thePotato in Europersquo Euphytica 70 1ndash7

Holmes E C Worobey M and Rambaut A (1999) lsquoPhylogeneticEvidence for Recombination in Dengue Virusrsquo Molecular Biologyand Evolution 16 405ndash9

Hu X et al (2009a) lsquoSequence Characteristics of Potato Virus YRecombinantsrsquo Journal of General Virology 90 3033ndash41

et al (2009b) lsquoA Novel Recombinant Strain of Potato VirusY Suggests a New Viral Genetic Determinant of Vein Necrosisin Tobaccorsquo Virus Research 143 68ndash76

Huson D H and Bryant D (2006) lsquoApplication of PhylogeneticNetworks in Evolutionary Studiesrsquo Molecular Biology andEvolution 23 254ndash67

Jeanmougin F et al (1998) lsquoMultiple Sequence Alignment withClustal Xrsquo Trends in Biochemical Sciences 23 403ndash5

Jones R A C (1981) 0The Ecology of Viruses Infecting Wild andCultivated Potatoes in the Andean Region of South America0in Thresh J M (ed) Pests Pathogens and Vegetation pp 89ndash107London Pitman

(1990) lsquoStrain Group Specific and Virus Specific HypersensitiveReactions to Infection with Potyviruses in Potato CultivarsrsquoAnnals of Applied Biology 117 93ndash105 [CrossRef]Mismatch]

(2014) 0Virus Disease Problems Facing Potato IndustriesWorldwide Viruses Found Climate Change ImplicationsRationalising Virus Strain Nomenclature and Addressing thePotato Virus Y issue0 in Navarre R and Pavek M J (eds) ThePotato Botany Production and UsesWallingford UK CABI

and Kehoe M A (2016) lsquoA Proposal to Rationalize Within-Species Plant Virus Nomenclature Benefits and Implicationsof Inactionrsquo Archives of Virology 161 2051ndash7

Kahn R P and Monroe R L (1963) lsquoDetection of the TobaccoVeinal Necrosis Strain of Potato Virus in Solanum cardenasiiand S andigenum Introduced into the United StatesrsquoPhytopathology 53 1356ndash9

Karasev A V and Gray S M (2013) lsquoGenetic Diversity of Potatovirus Y Complexrsquo American Journal of Potato Research 90 7ndash13

et al (2011) lsquoGenetic Diversity and Origin of the OrdinaryStrain of Potato virus Y (PVY) and Origin of Recombinant PVYStrainsrsquo Phytopathology 101 778ndash85

10 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

Katoh K and Standley D M (2013) lsquoMAFFT Multiple SequenceAlignment Software Version 7 Improvements in Performanceand Usabilityrsquo Molecular Biology and Evolution 30 772ndash80

Kehoe M A and Jones R A C (2016) lsquoPotato virus Y StrainNomenclature Lessons From Comparing Isolates ObtainedOver a 73 Year Periodrsquo Plant Pathology 65 322ndash33

Kerlan C (2006) 0Potato Virus Y0 in Descriptions of Plant Viruses No414 Adams J and Antoniw J (eds) Rothamsted Research UK

and Moury B (2008) 0Potato Virus Y0 in Granoff A andWebster R G (eds) Encyclopedia of Virology 3rd edn pp 287ndash96New York Academic Press

Klinkowski M and Schmelzer K (1960) lsquoA Necrotic Type ofPotato Virus Yrsquo American Potato Journal 37 221ndash8

Kraft K H et al (2014) lsquoMultiple Lines of Evidence for the Originof Domesticated Chili Pepper Capsicum annuum in MexicorsquoProceedings of the National Academy of Science USA 111 6165ndash70

Le Romancer M Kerlan C and Nedellec M (1994) lsquoBiologicalCharacterization of Various Geographical Isolates of PotatoVirus Y Inducing Superficial Necrosis on Potato Tubersrsquo PlantPathology 43 138ndash44

Le S Q and Gascuel G (2008) lsquoAn Improved General Amino AcidReplacement Matrixrsquo Molecular Biology and Evolution 25 1307ndash20

Lemey P et al (2009) lsquoIdentifying Recombinants in Human andPrimate Immunodeficiency Virus Sequence Alignments UsingQuartet Scanningrsquo BMC Bioinformatics 10 126

Loebenstein G et al (eds) (2001) Virus and Virus-Like Diseases ofPotatoes and Production of Seed-Potatoes Dordrecht KluwerAcademic Publishers

Lorenzen J H et al (2006) lsquoWhole Genome Characterization ofPotato virus Y Isolates Collected in the Western USA and TheirComparison to Isolates From Europe and Canadarsquo Archives ofVirology 151 1055ndash74

Martin D and Rybicki E (2000) lsquoRDP Detection of RecombinationAmongst Aligned Sequencesrsquo Bioinformatics 16 562ndash3

Martin D P et al (2005) lsquoA Modified BOOTSCAN Algorithm forAutomated Identification of Recombinant Sequences andRecombination Breakpointsrsquo AIDS Research and HumanRetroviruses 21 98ndash102

et al (2015) lsquoRDP4 Detection and Analysis of RecombinationPatterns in Virus Genomesrsquo Virus Evolution 1 1ndash5

Milne I et al (2009) lsquoTOPALi v2 a Rich Graphical Interface forEvolutionary Analyses of Multiple Alignments on HPC Clustersand Multi-Core Desktopsrsquo Bioinformatics 25 126ndash7

Maynard Smith J (1992) lsquoAnalyzing the Mosaic Structure ofGenesrsquo Journal of Molecular Evolution 34 126ndash9

McGuire G and Wright F (2000) lsquoTOPAL 20 ImprovedDetection of Mosaic Sequences Within Multiple AlignmentsrsquoBioinformatics 16 130ndash4

Moury B (2010) lsquoA New Lineage Sheds Light on the EvolutionaryHistory of Potato Virus Yrsquo Molecular Plant Pathology 11 161ndash8

et al (2002) lsquoEvidence for Diversifying Selection in Potatovirusy and in the Coat Protein of Other Potyvirusesrsquo Journal ofGeneral Virology 83 2563ndash73

and Simon V (2011) lsquodNdS-Based Methods DetectPositive Selection Linked to Trade-Offs Between DifferentFitness Traits in the Coat Protein of Potato Virus Yrsquo MolecularBiology and Evolution 28 2707ndash17

Nei M and Li W H (1979) lsquoMathematical Model for StudyingGenetic Variation in Terms of Restriction EndonucleasesrsquoProceedings of the National Academy of Science USA 76 5269ndash73

Nobrega N R and Silberschmidt K (1944) lsquoSobre uma provavelvariante do virus ldquoYrdquo da batatinha (Solanum virus 2 Orton)que tern a peculiaridade de provocar necroses em plantas defumorsquo Arquivos Do Instituto Biologico (Sao Paulo) 15 307ndash30

Ogawa T et al (2008) lsquoGenetic Structure of a Population ofPotato Virus Y Inducing Potato Tuber Necrotic RingspotDisease in Japan Comparison with North American andEuropean Populationsrsquo Virus Research 131 199ndash212

et al (2012) lsquoThe Genetic Structure of Populations of PotatoVirus Y in Japan Based on the Analysis of 20 Full GenomicSequencesrsquo Journal of Phytopathology 160 661ndash73

Ohshima K et al (2016) lsquoTemporal Analysis of Reassortmentand Molecular Evolution of Cucumber Mosaic Virus ExtraClues From Its Segmented Genomersquo Virology 487 188ndash97

Padidam M Sawyer S and Fauquet C M (1999) lsquoPossibleEmergence of New Geminiviruses by Frequent RecombinationrsquoVirology 265 218ndash25

Pickersgill B (2007) lsquoDomestication of Plants in the AmericasInsights From Mendelian and Molecular Geneticsrsquo Annals ofBotany 100 925ndash40

Pinhasi R Fort J and Ammerman A J (2005) lsquoTracing the Originand Spread of Agriculture in Europersquo PLoS Biology 3 e410

Posada D and Crandall K A (2001) lsquoEvaluation of Methods forDetecting Recombination From DNA Sequences ComputerSimulationsrsquo Proceedings of the National Academy of Science USA98 13757ndash62

Rambaut A et al (2016) lsquoExploring the Temporal Structure ofHeterochronous Sequences Using TempEstrsquo Virus Evolution 2vew007

Ramsden C Holmes E C and Charleston M A (2009)lsquoHantavirus Evolution in Relation to Its Rodent and InsectivoreHosts No Evidence for Codivergencersquo Molecular BiologyEvolution 26 143ndash53

Richardson D E (1958) lsquoSome Observations on the Tobacco VeinalNecrosis Strain of Potato Virus Yrsquo Plant Pathology 7 133ndash5

Rowley J S Gray S M and Karasev A V (2015) lsquoScreeningPotato Cultivars for New Sources of Resistance to Potato VirusYrsquo American Journal of Potato Research 92 38ndash48

Salaman R N (1954) lsquoThe Early European Potato Its Characterand Place of Originrsquo Journal of the Linnean Society Botany 53 1ndash27

and Hawkes J G (1949) lsquoThe Character of the EarlyEuropean Potatorsquo Proceedings of the Linnean Society London 16171ndash84

Schubert J Fomitcheva V and Sztangret-Wisniewska J (2007)lsquoDifferentiation of Potato Virus Y Strains Using Improved Setsof Diagnostic PCR-Primersrsquo Journal of Virological Methods 14066ndash74

Shimodaira H and Hasegawa M (1999) lsquoMultiple Comparisonsof Log-Likelihoods with Applications to PhylogeneticInferencersquo Molecular Biology and Evolution 16 1114ndash6

Silberschmidt K M (1960) lsquoTypes of Potato Virus Y Necrotic toTobacco History and Recent Observationrsquo American PotatoJournal 37 151

Singh R P et al (2008) lsquoDiscussion Paper The Naming ofPotato Virus Y Strains Infecting Potatorsquo Archives of Virology153 1ndash13

Smith K M (1931) lsquoOn the Composite Nature of Certain PotatoVirus Diseases of the Mosaic Group as Revealed by the Use ofPlant Indicators and Selective Methods of TransmissionrsquoProceedings of the Royal Society B 109 251ndash66

and Dennis R W G (1940) lsquoSome Notes on a SuggestedVariant of Solanum Virus 2 (Potato Virus Y)rsquo Annals of AppliedBiology 27 65ndash70

Spetz C et al (2003) lsquoMolecular Resolution of Complexes ofPotyviruses Infecting Solanaceous Crops at the Centre ofOrigin in Perursquo Journal of General Virology 84 2565ndash78

Stevenson W R et al (eds) (2001) Compendium of Potato Diseases2nd edn St Paul MN APS Press

A J Gibbs et al | 11

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

Tavare S (1986) lsquoSome Probabilistic and Statistical Problems inthe Analysis of DNA Sequencesrsquo Lectures on Mathematics in theLife Sciences 17 57ndash86

Tian Y P et al (2011) lsquoGenetic Diversity of Potato Virus YInfecting Tobacco Crops in Chinarsquo Phytopathology 10 377ndash87

To T H et al (2015) lsquoFast Dating Using Least-Squares Criteriaand Algorithmsrsquo Systematic Biology 65 82ndash97

Todd J M (1961) lsquoTobacco Veinal Necrosis on Potato in ScotlandControl of the Outbreak and Some Characters of the Virusrsquo inProceedings of the Fourth Conference on Potato Virus DiseasesBraunschweig 1960 pp 82ndash92

Visser J C and Bellstedt D U (2009) lsquoAn Assessment ofMolecular Variability and Recombination Patterns in SouthAfrican Isolates of Potato Virus Yrsquo Archives of Virology 1541891ndash900

and Pirie M D (2012) lsquoThe Recent RecombinantEvolution of a Major Crop Pathogen Potato Virus Yrsquo PLoS One7 e50631

Xia X (2013) lsquoDAMBE5 A Comprehensive Software Package forData Analysis in Molecular Biology and Evolutionrsquo MolecularBiology and Evolution 30 1720ndash8

12 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

  • vex002-TF7
  • vex002-TF8
  • vex002-TF1
  • vex002-TF2
  • vex002-TF3
  • vex002-TF4
  • vex002-TF5
  • vex002-TF6
Page 4: The phylogenetics of the global population of potato virus Y and … · 2019. 4. 30. · The phylogenetics of the global population of potato virus Y and its necrogenic recombinants

pattern of recombination and parentage were progressivelyidentified and removed The most strongly supported phyloge-netic anomaly was shared by all eight R-2 sequences They hadan R-1 phylogroup sequence (EF026076) as the major parent andan N phylogroup minor parent (AJ890346) the latter region wasfrom around nt 5707 to the 30 terminus of the complete ORFs (38percent of the ORF) It was identified by seven out of nine RDPmethods with probabilities ranging from 1027 to 10112 All R-2sequences were therefore removed and the remaining forty se-quences again analyzed by RDP which then found the moststrongly supported and shared anomaly to be in all R-1 se-quences These had an O phylogroup major parent (EF026074)and an N phylogroup minor parent (X97895) that had in most(see below) been provided from the 50 terminus to nt 2205 (24percent of the ORF) this region was identified by seven out ofnine RDP methods with probabilities ranging from 1023 to1084 The syn and n-syn sequences of the remaining thirty-twoC O and N phylogroup ORFs gave almost identical ML and NJphylogenetic trees and a patristic distance graph of those trees

had most points close to a single diagonal the statistical sup-port for the syn ML tree was again greater than for the n-syn MLtree (log-likelihood 247234 and 292999 respectively)

Finally the syn codon sequences from C and O phylogroupisolates that were not among the forty-eight representative se-quences were added to the remaining thirty-two C and O synsequences to produce an alignment of 120 sequences This wasthen searched using RDP for other more specific rec regions (iesequence specific not phylogroup specific) and seventeen se-quences were found to have significant inter- and intra-phylogroup rec regions (ie three C phylogroup sequencesseven from the O phylogroup and seven from the N phy-logroup) One of those N phylogroup sequences was X97895from isolate N605 which has been used in several studies as areference sequence but was found by us to be a recombinantbetween AJ890346 and DQ157180 representing the two mainlineages of the N phylogroup Figure 1B shows the NJ tree of thecomplete ORFs of the remaining 103 sequences This phylogenyis much more symmetrical than that of the original 240 se-quences (Fig 1A) and its shortest branches (tip to root) arearound 70 percent of the length of the longest Figure 2B showstheir SplitsTree graph which again has a much simpler linkagestructure in its main branches than the 240 sequence tree(Fig 2A) Figure S1 also shows the patristic distance graph com-paring the NJ trees of the 103 syn and n-syn sequences and thisalso confirms that these n-rec sequences have evolved in a bio-logically coherent manner with most points aligned with andclose to a diagonal that has a slope around 08 and shows noevidence of mutational saturation

32 The R-1 and R-2 phylogroups

The genome maps of the three rec phylogroups are shown inFig 3 Phylogroup N is the sister lineage to the C and O phy-logroups and most genomes of these three phylogroups arenon-recombinant whereas all the R-1 and R-2 phylogroup se-quences are recombinants and have parents in the O and Nphylogroups

The ORFs of all forty-three R-1 phylogroup sequences aremost closely related to ORFs of the O phylogroup especiallythat of sequence EF026074 However all have a 50 terminal re-gion (24 percent of the complete ORF) that is closely related tothe homologous region of N phylogroup ORFs especially that ofX97895 In around half the ORFs (eg DQ157179 HQ912870JF927762) the N phylogroup region is from its 50 end to around nt2205 and in others (eg AJ890349 HQ912863 JN935419 andKJ801915) it is from nts 301 to 2205 and is preceded by a shortregion closest to O sequences In two of the sequences(HM991454 and JQ969040) the N phylogroup region is mostclosely related to AB331517 not X97896 These differences indi-cate that more than one event and O parent was involved in es-tablishing the R-1 phylogroup Some of the R-1 sequences (egAJ889868 and KJ634023) also have a short rec region around nts5600ndash6300 that is most closely related to that region of R-2isolates

The sequences of all seventy-six R-2 ORFs are closest to thatof an R-1 phylogroup sequence EF026076 but all have a 30 re-gion (nts 5707-end 38 percent of the ORF) that is most closelyrelated to the homologous region of AJ890346 an N phylogroupsequence In addition around one-fifth of the R-1 and R-2 se-quences also have smaller individual regions that in RDP analy-ses are recorded as significant inter- and intra-phylogrouprecombinations

Figure 2 The branches of SplitsTree phylogenies of the main genomic ORFs of

(A) 240 isolates of PVY and (B) 103 of those isolates that showed no significant

evidence of recombination The marked clusters are the same as those in Fig 1

Figure 3 A cartoon summarizing the relationships and genomic maps of the

five major phylogroups of the PVY isolates shown in Figs 1 and 2

4 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

The C phylogroup ORFs sequences including that of theChile 3 isolate are the most variable nucleotide diversitypfrac14 00974 6 00460 Those of the other four phylogroups aremuch less variable indicating that they have probably divergedmore recently the N phylogroup population is the most vari-able pfrac14 00256 6 00121 compared with the O phylogroup00211 6 00100 R-1 phylogroup 00196 6 00093 and R-2 theleast variable 00188 600089

33 Rooting the PVY phylogeny

BlastN and BlastP searches using representative sequencesfrom all of the n-rec PVY phylogroups found the most closelyrelated genomic sequences to be those of sunflower chlo-rotic mottle virus (JN863233) pepper severe mosaic virus(AM181350) and bidens mosaic virus (KF649336) In both MLand NJ trees of seventy-three dated n-rec complete ORFs(see below) including these three viruses as an outgroup theN phylogroup was placed as sister to all other PVY phy-logroups as in Fig 1B However when ML or NJ trees werecalculated from the encoded amino acid sequences or fromthe 166 core nucleotide sequences (see below) together withthe homologous regions of the three outgroup sequences theChile 3 sequence (FJ214726) was placed as sister to all otherPVY lineages This difference was found not only when com-plete aligned sequences were used but also when all indels(95 percent of the sequences) had been removed Nonethelessall the major nodes in trees found from complete anddegapped sequences of both ORF and amino acid treeshadgt097 SH-type statistical support Thus the exact positionof the basal node of the present PVY phylogeny is unresolvedbut is either side of a single robustly supported node linkingthe Chile 3 sequence and the N phylogroup to all other PVYsThe reason for this rooting uncertainty is unknown the indi-vidual mutational differences between the Chile 3 N phy-logroup and all other sequences are spread throughout the fulllength of the sequences Furthermore the points representingthe pairwise syn and n-syn comparisons of the Chile 3 se-quence and other PVY isolates in a patristic distance graph

(Fig S1) were close to the main diagonal trend indicating thatthe DnDs ratios of the Chile 3 isolate are not unusual but sim-ilar to those of other PVY isolates

34 Dating the PVY phylogeny

The 240 PVY genomic sequences we analyzed provided two setsof heterochronously dated sequences (Supplementary TableS1) Seventy-three non-recombinant isolates from the C O andN phylogroups provided a set of full length ORFs the earliestsample was collected in 1938 CE (Common Era or AD) the mostrecent in 2013 CE and their mean sample collection date was19994 CE As described above the R-1 and R-2 genomes are re-combinants that share the central core region of their genomes(nts 2206ndash5706) with those of the O phylogroup Sample collec-tion dates are known for ninety-eight R-1 and R-2 isolates(Supplementary Table S1) Thus together with the core regionsof the seventy-three dated recombinant sequences their sharedcentral region provides another dataset of 171 sequences for es-timating PVY phylogenetic divergence dates it covers the samerange of sampling dates but with a mean of 20041 CEAlthough there are twice as many sequences they are only 38percent of the length of the complete ORFs The alignment ofthe 171 core regions was re-examined by RDP and five found tohave a short 50 terminal region of N phylogroup sequence (clos-est to JQ969036) so these were removed to leave 166 sequencesin the dataset ML and NJ trees inferred from these core regionsdiffered only in minor details and resembled closely those ofthe seventy-three n-rec ORFs (Fig 1) except that they placed allthe O R-1 and R-2 sequences in single tight cluster that did notcleanly resolve into O R-1 and R-2 lineages (Fig 4) The poorresolution within this cluster of O R-1 R-2 sequences probablyresults from the small diversity within the cluster nucleotidediversity pfrac14 0018 6 0008 compared with 0048 6 0023 in all 166core sequences and 0075 6 0035 in the seventy-three n-reccomplete ORFs Nonetheless the distribution of O R-1 and R-2sequences within the lineages (Fig 4) indicate clearly that O iso-lates are ancestral to R-1 isolates and R-2 isolates are the mostrecent

Figure 4 A cartoon summarizing the divergence dates of a ML phylogeny of seventy-three dated n-rec ORF sequences (Table 1 - line 1) estimated by the LSD method

and the dates from the MCC tree of a BEAST analysis of the same data Bars indicate the 95 CI ranges for both analyses Arrows indicate the period 1935 CEndash2016 CE

during which the isolates were collected

A J Gibbs et al | 5

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

TempEst analyses of the two datasets (73 n-rec ORFs and 166cores) found that both had a strong temporal signal correlationcoefficients 0436 and 0228 Pfrac14 000012 and 00032 for the com-plete and core sequences respectively They had small resid-uals with no trend and these were mostly associated with the Cphylogroup sequences and that of Chile 3 The estimatedTMRCAs were 14116 CE and 10768 CE respectively (Fig 5)Separate TempEst analyses of the seventy-three n-rec cores andninety-three rec cores in the 166 core dataset found that theseventy-three n-rec core sequences had a temporal correlationof 0448 (Pfrac14 000007) with a TMRCA of 14598 CE whereas the re-gression for the ninety-three rec core sequences was 00356(Pfrac14 073) Thus the temporal signal detected by TempEst in the166 core sequences seems to be mostly if not exclusively in itsseventy-three n-rec core sequences

The LSD method was used in its lsquoconstrainedrsquo mode (ie anancestral node must be older than its daughter nodes) to calcu-late the dates evolutionary rates and confidence intervalsfor the TMRCAs of the ML and NJ trees of the seventy-three ORFsequences and also their encoded amino acid sequences(Table 1) Estimates of the TMRCA dates obtained from completesequences were close to the mean of those from partitioned synand n-syn codon sequences (Table 1) However the TMRCA dateestimates from ML trees were 80ndash120 years earlier than thoseobtained from NJ trees and there were even larger differencesbetween the TMRCA date estimates from ORF versus aminoacid sequences the former were 250ndash290 years more ancientthan those from amino acid sequences Table 1 also records theestimated dates of the five principal nodes in the phylogeniesRandomizing the collection dates confirmed that the data con-tained a clear temporal signal as the ML phylogeny of seventy-three n-rec complete sequences had a TMRCA date of 1085 CEbut gave a mean TMRCA date from twenty collection date ran-domizations of 12555 BCE (standard deviation 7076 yearsrange 10 BCEndash17036 BCE) Thus the TMRCA date calculated withthe true sampling dates is outside the range of dates obtainedwith randomized collection dates and 19 standard deviationsfrom their mean

To et al (2015) stated that although LSD can analyze treesobtained by any method lsquomore accurate results are

expected from trees obtained using maximum-likelihoodmethodsrsquo which with our data gave the earliest TMRCA esti-mates and these were most precise in that their 95 percentconfidence limits were much smaller (Table 1) They estimatethat the present PVY population diverged from a common an-cestor around 1085 CE with commensurately more recentdates for the divergences within the phylogeny (Fig 6)

LSD analysis of the 166 core sequences gave TMRCA dates(Table 1) for the ML and NJ trees close to those estimated for theseventy-three complete sequences and all major nodeshadgt09 SH-type statistical support including that of the com-bined cluster of O R-1 and R-2 phylogroup sequences and itsmajor nodes The mean root date from twenty collection daterandomizations of the ML data was 11491 BCE (standard devia-tion 5716 years range 1292 BCEndash15269 BCE) again confirmingthe strong temporal signal

The lineages within the cluster of O R-1 and R-2 sequencesin the ML and NJ trees of core sequences when collapsed hadidentical groupings (Fig 4) although the dates estimated forthese lineages differed (Table 1 and Fig 4) For example the MLtree estimated the origins of O isolates R-1 isolates and R-2isolates to be 19182 CE 19467 CE and 19694 CE respectively(Fig 4) whereas the NJ tree estimates were all earlier and wererespectively 19062 CE 19333 CE and 19603 CE

In BEAST analyses of the two datasets the best-supportedmodel for mutational substitution was GTRthorn IthornC4 while thebest model for the rate of substitution was the lsquorelaxed uncorre-lated lognormalrsquo clock and a population of constant size Alldatasets passed date-randomization tests The phylogenies in-ferred by BEAST had essentially the same topology as those in-ferred by ML and NJ however the dates (Table 2) weresignificantly earlier than those estimated by the regressionmethods The TMRCA of the seventy-three n-rec ORF sequenceswas 1590 BCE and was twenty-four CE for the 166 cores The to-pology and dates of the other nodes were obtained from themaximum clade credibility tree were commensurately earlierand showed exactly the same pattern of slightly unresolvedclustering of the cluster of O R-1 R-2 core sequences as in theML and NJ trees (Fig 4A) with their origins estimated to be18329 CE 18806 and 1932 respectively (Fig 4B) The 166 core

Figure 5 The NJ phylogeny of the central core regions of the genomes of 166 dated C O N R-1 and R-2 isolates (A) the Accession Codes of the C isolates are blue O

red R-1 green R-2 orange and N yellow (B) Summary of the phylogroup composition of the collapsed clusters andashf The node dates are from LSD estimates of MJ and NJ

phylogenies and from the MCC phylogeny of a BEAST analysis These node dates were used to calculate the date scales assuming linearity Labeled arrows indicate the

dates of the likely origins of the O R-1 and R-2 populations

6 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

sequences were partitioned into those from the seventy-threefrom n-rec sequences and those from the ninety-three from recsequences they gave TMRCA estimates of 214 BCE and 7 CE re-spectively and passed the date randomization test indicatingthat they contained significant temporal signals in contrast tothe results of the TempEst tests

Part of Fig 5 summarizes and compares the TMRCAs andconfidence intervals estimated by the different methods andshows that despite the TMRCA estimates covering a six-foldrange the 95 confidence limits of most overlap between 500CE and 1500 CE

Finally a simple comparison was made of the basal branchlength of the PVY cluster as a proportion of the basal branchlength of the potyviruses A ML phylogeny was calculated fromfour representative PVY ORF sequences [AB331515 (N phy-logroup) EU563512 (C) FJ214726 (C) HM367075 (O)] aligned withthose of the other 103 representative potyviruses and rymovi-ruses used for Fig 1 of Gibbs Nguyen and Ohshima (2015) Themean patristic distances of the sequences connected throughthe root of the phylogeny (ie from Narcissus degeneration vi-rus NC_008824 Onion yellow dwarf virus NC_005029 Shallotyellow stripe virus NC_007433 Vallota speciosa virus isolateMarijiniup 7 JQ723475 to all the other potyviruses) was 2258substitutionssite and through the root of the PVY sequences(ie from AB331515 (N) to all the other PVYs) was 0227 substitu-tionssite a ratio of 9941

Figure 6 A cartoon summarizing the TMRCA dates and 95 CIs (vertical scale)

of PVY estimated by TempEst LSD (ML and NJ trees) and BEAST (MCC tree) anal-

yses of the seventy-three n-rec ORF and 161-core sequences together with his-

torical events that may have influenced the evolution of the virus

Table 2 BEAST dates of TMRCA nodes in phylogenies of PVY sequences

Dataset 73 n-rec completeORFs

All 166 core regions 73 n-rec core regions 93 rec core regions

Sequence length (nt) 9201 3501 3501 3501No of sequences 73 166 73 93Sampling date range 1938ndash2013 1938ndash2013 1938ndash2012 1970ndash2013TMRCAa (years) 3603 (1411ndash6566) 1989 (1084ndash3096) 2227 (951ndash3952) 2006 (688ndash3755)TMRCAb (dates) 1590 (602 to 4553) 24 (929 to1083) 214 (1062 to 1939) 7 (1325 to 1742)Substitution rate

(ntsiteyear) b

597 105 (250 105ndash956 105)

999 105 (688 105ndash132 104)

924 105 (433 105ndash139 104)

866 105 (371 105ndash137 104)

aTMRCA lsquotime to the most recent common ancestorrsquo years before 2013 95 credibility intervals (CI) in parenthesesbTMRCA dates positive dates are CE (Common Erafrac14AD) negative are BCE (frac14BC) 95 credibility intervals (CI) in parentheses

Table 1 LSD dates of the TMRCA and major nodes in phylogenies of 73 n-rec PVY sequences

Datasetb Datac Algorithmd Sites used Node datesa

Node 1e Node 2 Node 3 Node 4 Node 5 Node 6 Ratef (104ssy)

ORF Nucs ML All 10854 (1251ndash123) 12693 15512 16514 18982 19018 207 (098ndash255)n-syn codons 11583 (1265ndash792) 13208 15753 16805 19066 19034 278 (194ndash323)syn codons 10176 (1234ndash151) 12437 15383 16293 18813 1904 177 (092ndash221)

NJ All 12049 (1496ndash-17272) 12808 15152 14431 1890 18349 111 ((005ndash171)n-syn codons 12435 (1513 to 230) 13091 1543 14628 19019 18396 153 (055ndash228)syn codons 11525 (1467 to 72 108) 1242 14387 14598 18466 18284 091 (1 106ndash146)

AAs ML All 13636 (1501ndash238) 14271 1603 17021 1943 19336 105 (041ndash126)NJ All 14599 (1642ndash70) 1513 1615 17023 19331 19104 086 (023ndash119)

Core Nucs ML All 11512 (1296ndash703) 12599 16266 16228 19182 19257 202 (134ndash243)NJ All 13077 (1521 to 71) 13077 16248 15446 19062 18975 119 (040ndash157)

aNode dates positive dates are CE (Common Erafrac14AD) negative are BCE (frac14BC)bDataset ORFmdashmajor open reading frame Core nucleotides 2206 5706cData Nucs nucleotides AAs encoded amino acidsdAlgorithm ML maximum likelihood (PhyML) NJ neighbor-joining (ClustalX)eNodes numbered as in Fig 4 95 confidence intervals for Node 1 onlyfRate evolutionary rate substitutionssiteyear 95 confidence intervals for Node 1

A J Gibbs et al | 7

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

4 Discussion

This article reports that using syn codon sequences simplifiedthe resolution of relationships in a complex population of PVYgenomes The phylogenies and RDP analyses calculated from thesyn codon sequences were simpler to interpret than those calcu-lated from the comparable n-syn codon or complete sequencesand had greater statistical support Syn codon changes reflect thetemporal component of phylogenetic change of gene populationsbetter than n-syn codons as changes of the latter are also linkedto changes of the encoded protein and incoherent changes ofsyn and n-syn codons in complete sequences may confoundanalysis of either especially when rec sequences are presentSignificantly Ohshima et al (2016) also found that Bayesian esti-mates of the ages of cucumber mosaic cucumovirus (CMV) popu-lations were more precise if made using syn codon sequencesrather than complete sequences CMV has three genomic seg-ments which encode five ORFs and they found that the age esti-mates for different ORFs using syn codon sequences had acoefficient of variation around half that of complete sequencesand the 95 CI ranges of those estimates was around 25ndash50 per-cent smaller than those of the original complete sequences

The strategy of progressively removing the most strongly sup-ported cluster specific recombinants resolved the complexity ofthe relationships of the rec and n-rec ORFs In Fig 1A (ie a NJ tree)and in Fig 2 of Kehoe and Jones (2016) (ie an ML tree) the N phy-logroup forms an outlier of the R-2 cluster We have shown thatthis topology is false as the dominance of the recombinant link-ages between the phylogroups seriously compromise agglomera-tive methods of tree-building in the following way Whensequences of all PVY phylogroups are present in an analysis theR-1 sequences are more closely related to their O parent than theirN parent as the former provided a greater percentage (76 percent)of their sequence and therefore the R-1 and O phylogroups linkLikewise the R-2 sequences are more closely related to their R-1parent than their N parent as the former provided 62 percent oftheir sequence and so again the R-2 phylogroup links to the R-1phylogroup rather than the N parent Finally the N phylogroupclusters with the R-2 ORFs as they share two-thirds of their se-quence whereas the true links of the N phylogroup as the sisterlineage to all the other phylogroups are more distant Thus the in-ferred inter-phylogroup relationships when all are present arefalse (Figs 1A and 2A) and the true topology of the phylogroups isonly revealed when the recombinants are removed

In the second part of this project we attempted to date themajor features of the PVY phylogeny using different heterochro-nous tip-dating methods The resulting estimates althoughoverlapping as judged by their 95 CIs differed significantly inscale so that the TMRCAs estimated especially for the basalnodes by the probabilistic method were between two to fourtimes earlier than those estimated by the regression methodsSo here we discuss whether events of the history of PVY and itshosts are congruent with those of the PVY phylogeny and pro-vide in essence independent node dating that supports someestimates more than others

All our analyses supported a single PVY phylogeny that hasseveral distinctive features (Fig 6) PVY is a recent member ofthe lsquoPVY lineagersquo of potyviruses most of whose members wereisolated first or only from plants in the Americas (Gibbs andOhshima 2010) and although PVY was first identified in the UKit almost certainly came from the Americas Furthermore thebasal divergences of the PVY phylogeny produce three lineagesand one is the Chile 3 isolate found in Capsicum baccatum only inSouth America (Moury 2010) Thus the basal (TMRCA) node in

the PVY phylogeny most likely represents an event that oc-curred in South America The other two basal PVY lineageshave been found throughout the world but not yet in SouthAmerica The flora and fauna of South America was biologicallyisolated from Europe until the start of the lsquoColumbianExchangersquo in the late 15th century CE (Crosby 1972) when theearly trans-Atlantic maritime traders first took American ani-mals and plants including potato tubers to Europe and viceversa Thus the basal nodes of the PVY phylogeny are likely torepresent events that predate the late 15th century and thisconclusion agrees with all our date estimates of those nodesthe most recent is 1411 CE with estimates from regression anal-yses ranging from 1085 CE to 1411 CE and those from probabil-istic analyses from 1590 BCE to 24 BCE

The second basal lineage the C phylogroup probably di-verged in Europe and consists mostly (712 isolates) of isolatesfound in tobacco tomato and pepper crops Potato pepper to-mato and tobacco were first introduced to Europe during theColumbian Exchange from their domestication centres in theAndean regions of Peru and Bolivia (potato) more widely in theAndes (tomato tobacco) and further north in the Americas (pep-per) (Bai and Lindhout 2007 Pickersgill 2007 Brown andHenfling 2014 Kraft et al 2014) There is no evidence that PVY istransmitted by sexually produced seed so it was most likely in-troduced to Europe in the small numbers of S tuberosum tubersfirst taken from South America to the Canary Islands in 1562 CEto Spain in 1570 CE and the UK in 1588 CE if so then all the ini-tial PVY population of Europe probably originated from infectedS tuberosum tubers in those cargoes The divergence to severalnon-potato hosts probably occurred outside South America al-though it is unknown to what extent different C phylogroup iso-lates are ecologically adapted to particular hosts The datesestimated by the regression methods are congruent with thisinterpretation whereas those estimated by the probabilisticmethods are not (Fig 6 nodes 3 and 4)

The third basal lineage the N phylogroup is known onlyfrom a cluster of closely related isolates so it probably radiatedrecently and at present we have no idea of when and how itfirst migrated from the Americas It has been isolated world-wide mostly from potatoes (1315 isolates)

The earliest potato breeding programs of any size began in1810 but did not become widespread until the second half ofthe 19th century Initially there were very few potato introduc-tions to Europe and for the first two centuries the crops had lit-tle genetic diversity Virus diseases that seriously debilitatedestablished cultivars but did not pass through sexually pro-duced potato seed probably encouraged selection and as a re-sult more cultivars were grown in the late 18th and early 19thcenturies These were taken from natural berries produced byself pollination Most were selfed derivatives so selection re-duced the gene pool (Glenndinning 1983) Potato blight diseasecaused by the oomycete Phytophthora infestans appeared in themid 19th century and eliminated almost all cultivars as theywere so inbred further reducing the gene pool Breeding amongblight survivors and new introductions from the Andean regionof South America led to many new cultivars being grown byearly 20th century (Glenndinning 1983) This introduction ofnew potato germplasm led to genetic diversity in the cropwhich apparently introduced the selection pressure in the formof PVY resistance genes that diversified the PVY population inthe early 20th century Our analyses indicate that the parents ofthe recombinant R-1 and R-2 strains were members of existingO and N populations therefore probably European As the Nc re-sistance gene which C strains ellicit was widely distributed in

8 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

early potato cultivars (Bawden 1936 Cockerham 1943 Bawdenand Sheffield 1944) selection for avirulence to this gene waslikely to be an important factor in the diversification of the Ophylogroup population Similarly N strains but not O strainsinfect plants with both Nc and the less common Ny and Nzgenes so selection for avirulence would again have favored di-versity in the N phylogroup population (Cockerham 1970 Jones1990Chikh-Ali et al 2014 Kehoe and Jones 2016)

PVY strains causing veinal necrosis symptoms in tobaccowere first reported in 1935 in the UK (Smith and Dennis 1940)and soon afterwards in Brazil Europe and North America(Nobrega and Silberschmidt 1944 Bawden and Kassanis 19471951 Richardson 1958 Klinkowski and Schmelzer 1960Silberschmidt 1960 Todd 1961 Kahn and Monroe 1963 Brucher1969) However whether these strains included ones in both theN and R1 phylogroups is unknown Strains inducing this symp-tom in tobacco produce mild symptoms in potato plants so theysoon became widely distributed because infected plants wereoften missed when seed potato crops were rogued (Todd 1961Jones 1990) Possibly this increased spread can be attributed toemergence of the R1 population at that time but evidence forthat is lacking and none of the sequences we analyzed camefrom early isolates Although the R-1 population is not verydamaging to potato it is difficult to control as its symptoms areso mild and it overcomes resistance genes Nc No and Nz Bycontrast the R-2 population is more damaging as it often causestuber necrosis and still overcomes these three resistancegenes

The tuber necrosis isolates R-2 phylogroup were firstsighted in the early 1980s in Europe and shown to be recombi-nants (Beczner et al 1984 Le Romancer Kerlan and Nedellec1994 Boonham et al 2002 Lorenzen et al 2006) These reportsare likely to be accurate as R-2 infections are so noticeable andby the 1980s potato crops in Europe were intensively monitoredfor disease Our estimates of the origin of R-2 isolates were19694 CE 19603 CE and 19324 CE for the ML-LSD NJ-LSD andBEAST analyses respectively Thus the ML-LSD estimate agreesmost closely with the crop record and the BEAST estimateagrees least well but all are possible Similarly PVY infectionsthat may have been caused by R-1 isolates were first reportedin 1935 and soon confirmed to be widespread The ML-LSDNJ-LSD and BEAST methods placed the origin of R-1 as 19467CE 19333 CE and 18806 CE respectively Thus the NJ-LSD esti-mate is the most likely the BEAST estimate less likely and theML-LSD impossible

All our date estimates agree that the near simultaneous radi-ation of the O and N populations of PVY (Figs 4 and 6 nodes 5and 6) coincided with the earliest potato breeding programs ofany size These programs increased the genetic diversity ofavailable potato cultivars and this would have included the Nyand Nz resistance genes They also coincided with cultural andagronomic changes to seed and ware potato crop production(Singh et al 2008 Gray et al 2010 Jones 2014) A combination ofthese factors is likely to have led to conditions favoring recom-binants This scenario fits better with the LSD dates 19182 CEand 19062 CE respectively than with the earlier BEAST date of18329 CE

Most of our PVY date estimates are congruent with those ofthe recent history of the international potato crop but this isnot surprising as the estimated dates have very broad overlap-ping 95 CI estimates One crucial event might however informus whether some TMRCA estimates are more accurate thanothers and that is whether the divergence of the C phylogroup(Fig 6 nodes 3 and 4) occurred before or after PVY was

introduced to Europe in the late 16th century the TMRCAs esti-mated by the regression methods are congruent with aEuropean divergence of the C phylogroup whereas those esti-mated by the probabilistic methods indicate a much earlier di-vergence of the C phylogroup This dilemma will only beresolved when the genomic sequences of more Chile 3-like iso-lates or of the C phylogroup are known

Our estimates of the dates for the PVY phylogeny are some-what earlier than those reported by Visser Bellstedt and Pirie(2012) who analyzed rec n-rec and partitioned PVY genomic se-quences by probabilistic methods although our 95 CI esti-mates overlap The difference may merely be the result ofpopulation sampling differences as a much larger set of se-quences was available to us and it included the early dated iso-late KP691327 (1943 CE) A further difference between ouranalyses is the date given to sequence EU563512 which asVisser Bellstedt and Pirie (2012) reported was obtained lsquofrom apotato plant vegetatively propagated since 1938rsquo and sequencedin 2007 (Dullemans et al 2011) We dated this sequence as 1938the earliest of our sequences whereas Visser Bellstedt andPirie (2012) dated it as 2007 In making this choice we checkedhow this difference affected the date estimates of the seventy-three n-rec dataset and found that using the 1938 date gave anML-LSD TMRCA of 10854 CE (95 CI 1267 CEndash321 CE) whereasusing the 2007 date gave a TMRCA of 363 CE and very muchbroader 95 CIs (918 CE to 18 109 BCE) and when the se-quence was omitted an intermediate TMRCA of 687 CE (1016CEndash9503 BCE) Thus the 95 CI range was smallest using the1938 date and justified our choice

Finally we explored the possibility of extrapolating datesover an even larger timescale Gibbs et al (2008) suggested thatthe Neolithic invention of agriculture which in Eurasia occurredin the northern LevantineMesopotamian area around 12000BCE (Bellwood 2005 Pinhasi Fort and Ammerman 2005) pro-duced the conditions for the starburst potyvirus diversificationand this provides another possible dating point for potyvirusesWe estimated that in an ML phylogeny of the major ORFs of alarge representative set of potyviruses and four PVY isolates(one N one O and two C phylogroup isolates) the ratio of themean patristic distances of the sequences connected throughthe root of the phylogeny and through the root of the PVY se-quences was 9941 indicating that a TMRCA of all potyvirusesof around 10000 BCE would give a TMRCA of PVY around 1000CE

Supplementary data

Supplementary data are available at Virus Evolution online

Acknowledgements

We are very grateful to Hien To and Olivier Gascuel for greathelp with use of their LSD software

Conflict of interest None declared

ReferencesAbascal F Zardoya R and Telford M J (2010) lsquoTranslatorX

Multiple Alignment of Nucleotide Sequences Guided by AminoAcid Translationsrsquo Nucleic Acids Research 38 W7ndash13

Altschul S F et al (1990) lsquoBasic Local Alignment Search ToolrsquoJournal of Molecular Biology 215 403ndash10

A J Gibbs et al | 9

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

Bai Y and Lindhout P (2007) lsquoDomestication and Breeding ofTomatoes What Have We Gained and What Can We Gain inthe Futurersquo Annals of Botany 100 1085ndash94

Bawden F C (1936) lsquoThe Viruses Causing Top Necrosis(Acronecrosis) of the Potatorsquo Annals of Applied Biology 23487ndash97

and Kassanis B (1947) lsquoThe Behaviour of Some NaturallyOccurring Strains of Potato Virus Yrsquo Annals of Applied Biology31 503ndash16

and (1951) lsquoSerologically Related Strains of PotatoVirus Y That Are Mutually Antagonistic in Plantsrsquo Annals ofApplied Biology 38 402ndash10 [CrossRef]Mismatch]

and Sheffield F M L (1944) lsquoThe Relationships of SomeViruses Causing Necrotic Diseases of the Potatorsquo Annals ofApplied Biology 31 33ndash40

Beczner L et al (1984) lsquoStudies on the Etiology of Tuber NecroticRingspot Disease in Potatorsquo Potato Research 27 339ndash52

Bellwood P (2005) First Farmers The Origins of AgriculturalSocieties Malden Oxford and Carlton Blackwell Publishing

Boni M F Posada D and Feldman M W (2007) lsquoAn ExactNonparametric Method for Inferring Mosaic Structure inSequence Tripletsrsquo Genetics 176 1035ndash47

Boonham N et al (2002) lsquoBiological and Sequence Comparisonsof Potato virus Y Isolates Associated with Potato Tuber NecroticRingspot Diseasersquo Plant Pathology 51 117ndash26

Brown C R (1993) lsquoOrigin and History of the Potatorsquo AmericanJournal of Potato Research 70 363ndash73

and Henfling J W (2014) 0A History of the Potato0 inNavarre R and Pavek MJ (eds) The Potato Botany Productionand Use pp 1ndash11Wallingford UK CABI

Brucher H (1969) lsquoObservations on Origin and Spread of Y-N

Virus in South Americarsquo Angewandte Botanik 43 241ndash9Chikh-Ali M et al (2014) lsquoEvidence of a Monogenic Nature of

the Nz Gene Against Potato Virus Y Strain Z (PVYZ) in PotatorsquoAmerican Journal of Potato Research 91 649ndash54

Cockerham G (1943) lsquoThe Reactions of Potato Varieties toViruses X A B and Crsquo Annals of Applied Biology 30 338ndash44

(1970) lsquoGenetical Studies on Resistance to Potato Viruses Xand Yrsquo Heredity 25 309ndash48

Crosby A E (1972) 0The Columbian Exchange Biological and CulturalConsequences of 14940 268 pp Greenwood Publishing Group

Cuevas J M et al (2012) lsquoPhylogeography and MolecularEvolution of Potato Virus Yrsquo PLoS One 7 e37853

Darriba D et al (2011) lsquoProtTest 3 Fast Selection of Best-FitModels of Protein Evolutionrsquo Bioinformatics 27 1164ndash5

De Bokx J A and van der Want J P H (eds) (1987) 0Viruses ofPotatoes and Seed-Potato Production0 2nd edn WageningenCentre for Agricultural Publishing and Documentation

Drummond A J et al (2012) lsquoBayesian Phylogenetics with BEAUtiand the BEAST 17rsquo Molecular Biology and Evolution 29 1969ndash73

Duchene S (2015) lsquoThe Performance of the Date-RandomizationTest in Phylogenetic Analyses of Time-Structured Virus DatarsquoMolecular Biology and Evolution 32 1895ndash906

Dullemans A M et al (2011) lsquoComplete Nucleotide Sequence ofa Potato Isolate of Strain Group C of Potato Virus Y from 1938rsquoArchives of Virology 156 473ndash7

Fourment M and Gibbs M J (2006) lsquoPATRISTIC a Program forCalculating Patristic Distances and Graphically Comparing theComponents of Genetic Changersquo BMC Evolutionary Biology 6 1

Gibbs A J and Ohshima K (2010) lsquoPotyviruses in the DigitalAgersquo Annual Review of Phytopathology 48 205ndash23

et al (2008) lsquoThe Prehistory of Potyviruses Their InitialRadiation Was during the Dawn of Agriculturersquo PLoS One 3 e2523

Nguyen H D and Ohshima K (2015) lsquoThe lsquoEmergencersquo ofTurnip Mosaic Virus Was Probably a lsquoGene-for-Quasi-GenersquoEventrsquo Current Opinion in Virology 10 20ndash5

Armstrong J S and Gibbs A J (2000) lsquoSister-Scanning AMonte Carlo Procedure for Assessing Signals in RecombinantSequencesrsquo Bioinformatics 16 573ndash82

et al (2006) lsquoThe Variable Codons of H3 Influenza A VirusHaemagglutinin Genesrsquo Archives of Virology 52 11ndash24

Glais L Tribodet M and Kerlan C (2002) lsquoGenomic Variabilityin Potato Potyvirus Y (PVY) Evidence that PVYNW and PVYNTN

Variants Are Single to Multiple Recombinants Between PVYO

and PVYN Isolatesrsquo Archives of Virology 147 363ndash78Glenndinning D R (1983) lsquoPotato Introductions and Breeding up

to the 20th Centuryrsquo New Phytologist 94 479ndash505Gray S et al (2010) lsquoPotato Virus Y An Evolving Concern for Potato

Crops in the United States and Canadarsquo Plant Disease 94 1384ndash97Guindon S and Gascuel O (2003) lsquoA Simple Fast and Accurate

Algorithm to Estimate Large Phylogenies by MaximumLikelihoodrsquo Systematic Biology 52 696ndash704

Hall T A (1999) lsquoBioEdit A User-Friendly Biological SequenceAlignment Editor and Analysis Program for Windows 9598NTrsquo Nucleic Acids Symposium Series 41 95ndash8

Hawkes J G (1978) 0History of the Potato0 in Harris P M (ed) ThePotato Crop pp 1ndash14New York Springer

and Fransisco-Ortega J (1993) lsquoThe Early History of thePotato in Europersquo Euphytica 70 1ndash7

Holmes E C Worobey M and Rambaut A (1999) lsquoPhylogeneticEvidence for Recombination in Dengue Virusrsquo Molecular Biologyand Evolution 16 405ndash9

Hu X et al (2009a) lsquoSequence Characteristics of Potato Virus YRecombinantsrsquo Journal of General Virology 90 3033ndash41

et al (2009b) lsquoA Novel Recombinant Strain of Potato VirusY Suggests a New Viral Genetic Determinant of Vein Necrosisin Tobaccorsquo Virus Research 143 68ndash76

Huson D H and Bryant D (2006) lsquoApplication of PhylogeneticNetworks in Evolutionary Studiesrsquo Molecular Biology andEvolution 23 254ndash67

Jeanmougin F et al (1998) lsquoMultiple Sequence Alignment withClustal Xrsquo Trends in Biochemical Sciences 23 403ndash5

Jones R A C (1981) 0The Ecology of Viruses Infecting Wild andCultivated Potatoes in the Andean Region of South America0in Thresh J M (ed) Pests Pathogens and Vegetation pp 89ndash107London Pitman

(1990) lsquoStrain Group Specific and Virus Specific HypersensitiveReactions to Infection with Potyviruses in Potato CultivarsrsquoAnnals of Applied Biology 117 93ndash105 [CrossRef]Mismatch]

(2014) 0Virus Disease Problems Facing Potato IndustriesWorldwide Viruses Found Climate Change ImplicationsRationalising Virus Strain Nomenclature and Addressing thePotato Virus Y issue0 in Navarre R and Pavek M J (eds) ThePotato Botany Production and UsesWallingford UK CABI

and Kehoe M A (2016) lsquoA Proposal to Rationalize Within-Species Plant Virus Nomenclature Benefits and Implicationsof Inactionrsquo Archives of Virology 161 2051ndash7

Kahn R P and Monroe R L (1963) lsquoDetection of the TobaccoVeinal Necrosis Strain of Potato Virus in Solanum cardenasiiand S andigenum Introduced into the United StatesrsquoPhytopathology 53 1356ndash9

Karasev A V and Gray S M (2013) lsquoGenetic Diversity of Potatovirus Y Complexrsquo American Journal of Potato Research 90 7ndash13

et al (2011) lsquoGenetic Diversity and Origin of the OrdinaryStrain of Potato virus Y (PVY) and Origin of Recombinant PVYStrainsrsquo Phytopathology 101 778ndash85

10 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

Katoh K and Standley D M (2013) lsquoMAFFT Multiple SequenceAlignment Software Version 7 Improvements in Performanceand Usabilityrsquo Molecular Biology and Evolution 30 772ndash80

Kehoe M A and Jones R A C (2016) lsquoPotato virus Y StrainNomenclature Lessons From Comparing Isolates ObtainedOver a 73 Year Periodrsquo Plant Pathology 65 322ndash33

Kerlan C (2006) 0Potato Virus Y0 in Descriptions of Plant Viruses No414 Adams J and Antoniw J (eds) Rothamsted Research UK

and Moury B (2008) 0Potato Virus Y0 in Granoff A andWebster R G (eds) Encyclopedia of Virology 3rd edn pp 287ndash96New York Academic Press

Klinkowski M and Schmelzer K (1960) lsquoA Necrotic Type ofPotato Virus Yrsquo American Potato Journal 37 221ndash8

Kraft K H et al (2014) lsquoMultiple Lines of Evidence for the Originof Domesticated Chili Pepper Capsicum annuum in MexicorsquoProceedings of the National Academy of Science USA 111 6165ndash70

Le Romancer M Kerlan C and Nedellec M (1994) lsquoBiologicalCharacterization of Various Geographical Isolates of PotatoVirus Y Inducing Superficial Necrosis on Potato Tubersrsquo PlantPathology 43 138ndash44

Le S Q and Gascuel G (2008) lsquoAn Improved General Amino AcidReplacement Matrixrsquo Molecular Biology and Evolution 25 1307ndash20

Lemey P et al (2009) lsquoIdentifying Recombinants in Human andPrimate Immunodeficiency Virus Sequence Alignments UsingQuartet Scanningrsquo BMC Bioinformatics 10 126

Loebenstein G et al (eds) (2001) Virus and Virus-Like Diseases ofPotatoes and Production of Seed-Potatoes Dordrecht KluwerAcademic Publishers

Lorenzen J H et al (2006) lsquoWhole Genome Characterization ofPotato virus Y Isolates Collected in the Western USA and TheirComparison to Isolates From Europe and Canadarsquo Archives ofVirology 151 1055ndash74

Martin D and Rybicki E (2000) lsquoRDP Detection of RecombinationAmongst Aligned Sequencesrsquo Bioinformatics 16 562ndash3

Martin D P et al (2005) lsquoA Modified BOOTSCAN Algorithm forAutomated Identification of Recombinant Sequences andRecombination Breakpointsrsquo AIDS Research and HumanRetroviruses 21 98ndash102

et al (2015) lsquoRDP4 Detection and Analysis of RecombinationPatterns in Virus Genomesrsquo Virus Evolution 1 1ndash5

Milne I et al (2009) lsquoTOPALi v2 a Rich Graphical Interface forEvolutionary Analyses of Multiple Alignments on HPC Clustersand Multi-Core Desktopsrsquo Bioinformatics 25 126ndash7

Maynard Smith J (1992) lsquoAnalyzing the Mosaic Structure ofGenesrsquo Journal of Molecular Evolution 34 126ndash9

McGuire G and Wright F (2000) lsquoTOPAL 20 ImprovedDetection of Mosaic Sequences Within Multiple AlignmentsrsquoBioinformatics 16 130ndash4

Moury B (2010) lsquoA New Lineage Sheds Light on the EvolutionaryHistory of Potato Virus Yrsquo Molecular Plant Pathology 11 161ndash8

et al (2002) lsquoEvidence for Diversifying Selection in Potatovirusy and in the Coat Protein of Other Potyvirusesrsquo Journal ofGeneral Virology 83 2563ndash73

and Simon V (2011) lsquodNdS-Based Methods DetectPositive Selection Linked to Trade-Offs Between DifferentFitness Traits in the Coat Protein of Potato Virus Yrsquo MolecularBiology and Evolution 28 2707ndash17

Nei M and Li W H (1979) lsquoMathematical Model for StudyingGenetic Variation in Terms of Restriction EndonucleasesrsquoProceedings of the National Academy of Science USA 76 5269ndash73

Nobrega N R and Silberschmidt K (1944) lsquoSobre uma provavelvariante do virus ldquoYrdquo da batatinha (Solanum virus 2 Orton)que tern a peculiaridade de provocar necroses em plantas defumorsquo Arquivos Do Instituto Biologico (Sao Paulo) 15 307ndash30

Ogawa T et al (2008) lsquoGenetic Structure of a Population ofPotato Virus Y Inducing Potato Tuber Necrotic RingspotDisease in Japan Comparison with North American andEuropean Populationsrsquo Virus Research 131 199ndash212

et al (2012) lsquoThe Genetic Structure of Populations of PotatoVirus Y in Japan Based on the Analysis of 20 Full GenomicSequencesrsquo Journal of Phytopathology 160 661ndash73

Ohshima K et al (2016) lsquoTemporal Analysis of Reassortmentand Molecular Evolution of Cucumber Mosaic Virus ExtraClues From Its Segmented Genomersquo Virology 487 188ndash97

Padidam M Sawyer S and Fauquet C M (1999) lsquoPossibleEmergence of New Geminiviruses by Frequent RecombinationrsquoVirology 265 218ndash25

Pickersgill B (2007) lsquoDomestication of Plants in the AmericasInsights From Mendelian and Molecular Geneticsrsquo Annals ofBotany 100 925ndash40

Pinhasi R Fort J and Ammerman A J (2005) lsquoTracing the Originand Spread of Agriculture in Europersquo PLoS Biology 3 e410

Posada D and Crandall K A (2001) lsquoEvaluation of Methods forDetecting Recombination From DNA Sequences ComputerSimulationsrsquo Proceedings of the National Academy of Science USA98 13757ndash62

Rambaut A et al (2016) lsquoExploring the Temporal Structure ofHeterochronous Sequences Using TempEstrsquo Virus Evolution 2vew007

Ramsden C Holmes E C and Charleston M A (2009)lsquoHantavirus Evolution in Relation to Its Rodent and InsectivoreHosts No Evidence for Codivergencersquo Molecular BiologyEvolution 26 143ndash53

Richardson D E (1958) lsquoSome Observations on the Tobacco VeinalNecrosis Strain of Potato Virus Yrsquo Plant Pathology 7 133ndash5

Rowley J S Gray S M and Karasev A V (2015) lsquoScreeningPotato Cultivars for New Sources of Resistance to Potato VirusYrsquo American Journal of Potato Research 92 38ndash48

Salaman R N (1954) lsquoThe Early European Potato Its Characterand Place of Originrsquo Journal of the Linnean Society Botany 53 1ndash27

and Hawkes J G (1949) lsquoThe Character of the EarlyEuropean Potatorsquo Proceedings of the Linnean Society London 16171ndash84

Schubert J Fomitcheva V and Sztangret-Wisniewska J (2007)lsquoDifferentiation of Potato Virus Y Strains Using Improved Setsof Diagnostic PCR-Primersrsquo Journal of Virological Methods 14066ndash74

Shimodaira H and Hasegawa M (1999) lsquoMultiple Comparisonsof Log-Likelihoods with Applications to PhylogeneticInferencersquo Molecular Biology and Evolution 16 1114ndash6

Silberschmidt K M (1960) lsquoTypes of Potato Virus Y Necrotic toTobacco History and Recent Observationrsquo American PotatoJournal 37 151

Singh R P et al (2008) lsquoDiscussion Paper The Naming ofPotato Virus Y Strains Infecting Potatorsquo Archives of Virology153 1ndash13

Smith K M (1931) lsquoOn the Composite Nature of Certain PotatoVirus Diseases of the Mosaic Group as Revealed by the Use ofPlant Indicators and Selective Methods of TransmissionrsquoProceedings of the Royal Society B 109 251ndash66

and Dennis R W G (1940) lsquoSome Notes on a SuggestedVariant of Solanum Virus 2 (Potato Virus Y)rsquo Annals of AppliedBiology 27 65ndash70

Spetz C et al (2003) lsquoMolecular Resolution of Complexes ofPotyviruses Infecting Solanaceous Crops at the Centre ofOrigin in Perursquo Journal of General Virology 84 2565ndash78

Stevenson W R et al (eds) (2001) Compendium of Potato Diseases2nd edn St Paul MN APS Press

A J Gibbs et al | 11

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

Tavare S (1986) lsquoSome Probabilistic and Statistical Problems inthe Analysis of DNA Sequencesrsquo Lectures on Mathematics in theLife Sciences 17 57ndash86

Tian Y P et al (2011) lsquoGenetic Diversity of Potato Virus YInfecting Tobacco Crops in Chinarsquo Phytopathology 10 377ndash87

To T H et al (2015) lsquoFast Dating Using Least-Squares Criteriaand Algorithmsrsquo Systematic Biology 65 82ndash97

Todd J M (1961) lsquoTobacco Veinal Necrosis on Potato in ScotlandControl of the Outbreak and Some Characters of the Virusrsquo inProceedings of the Fourth Conference on Potato Virus DiseasesBraunschweig 1960 pp 82ndash92

Visser J C and Bellstedt D U (2009) lsquoAn Assessment ofMolecular Variability and Recombination Patterns in SouthAfrican Isolates of Potato Virus Yrsquo Archives of Virology 1541891ndash900

and Pirie M D (2012) lsquoThe Recent RecombinantEvolution of a Major Crop Pathogen Potato Virus Yrsquo PLoS One7 e50631

Xia X (2013) lsquoDAMBE5 A Comprehensive Software Package forData Analysis in Molecular Biology and Evolutionrsquo MolecularBiology and Evolution 30 1720ndash8

12 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

  • vex002-TF7
  • vex002-TF8
  • vex002-TF1
  • vex002-TF2
  • vex002-TF3
  • vex002-TF4
  • vex002-TF5
  • vex002-TF6
Page 5: The phylogenetics of the global population of potato virus Y and … · 2019. 4. 30. · The phylogenetics of the global population of potato virus Y and its necrogenic recombinants

The C phylogroup ORFs sequences including that of theChile 3 isolate are the most variable nucleotide diversitypfrac14 00974 6 00460 Those of the other four phylogroups aremuch less variable indicating that they have probably divergedmore recently the N phylogroup population is the most vari-able pfrac14 00256 6 00121 compared with the O phylogroup00211 6 00100 R-1 phylogroup 00196 6 00093 and R-2 theleast variable 00188 600089

33 Rooting the PVY phylogeny

BlastN and BlastP searches using representative sequencesfrom all of the n-rec PVY phylogroups found the most closelyrelated genomic sequences to be those of sunflower chlo-rotic mottle virus (JN863233) pepper severe mosaic virus(AM181350) and bidens mosaic virus (KF649336) In both MLand NJ trees of seventy-three dated n-rec complete ORFs(see below) including these three viruses as an outgroup theN phylogroup was placed as sister to all other PVY phy-logroups as in Fig 1B However when ML or NJ trees werecalculated from the encoded amino acid sequences or fromthe 166 core nucleotide sequences (see below) together withthe homologous regions of the three outgroup sequences theChile 3 sequence (FJ214726) was placed as sister to all otherPVY lineages This difference was found not only when com-plete aligned sequences were used but also when all indels(95 percent of the sequences) had been removed Nonethelessall the major nodes in trees found from complete anddegapped sequences of both ORF and amino acid treeshadgt097 SH-type statistical support Thus the exact positionof the basal node of the present PVY phylogeny is unresolvedbut is either side of a single robustly supported node linkingthe Chile 3 sequence and the N phylogroup to all other PVYsThe reason for this rooting uncertainty is unknown the indi-vidual mutational differences between the Chile 3 N phy-logroup and all other sequences are spread throughout the fulllength of the sequences Furthermore the points representingthe pairwise syn and n-syn comparisons of the Chile 3 se-quence and other PVY isolates in a patristic distance graph

(Fig S1) were close to the main diagonal trend indicating thatthe DnDs ratios of the Chile 3 isolate are not unusual but sim-ilar to those of other PVY isolates

34 Dating the PVY phylogeny

The 240 PVY genomic sequences we analyzed provided two setsof heterochronously dated sequences (Supplementary TableS1) Seventy-three non-recombinant isolates from the C O andN phylogroups provided a set of full length ORFs the earliestsample was collected in 1938 CE (Common Era or AD) the mostrecent in 2013 CE and their mean sample collection date was19994 CE As described above the R-1 and R-2 genomes are re-combinants that share the central core region of their genomes(nts 2206ndash5706) with those of the O phylogroup Sample collec-tion dates are known for ninety-eight R-1 and R-2 isolates(Supplementary Table S1) Thus together with the core regionsof the seventy-three dated recombinant sequences their sharedcentral region provides another dataset of 171 sequences for es-timating PVY phylogenetic divergence dates it covers the samerange of sampling dates but with a mean of 20041 CEAlthough there are twice as many sequences they are only 38percent of the length of the complete ORFs The alignment ofthe 171 core regions was re-examined by RDP and five found tohave a short 50 terminal region of N phylogroup sequence (clos-est to JQ969036) so these were removed to leave 166 sequencesin the dataset ML and NJ trees inferred from these core regionsdiffered only in minor details and resembled closely those ofthe seventy-three n-rec ORFs (Fig 1) except that they placed allthe O R-1 and R-2 sequences in single tight cluster that did notcleanly resolve into O R-1 and R-2 lineages (Fig 4) The poorresolution within this cluster of O R-1 R-2 sequences probablyresults from the small diversity within the cluster nucleotidediversity pfrac14 0018 6 0008 compared with 0048 6 0023 in all 166core sequences and 0075 6 0035 in the seventy-three n-reccomplete ORFs Nonetheless the distribution of O R-1 and R-2sequences within the lineages (Fig 4) indicate clearly that O iso-lates are ancestral to R-1 isolates and R-2 isolates are the mostrecent

Figure 4 A cartoon summarizing the divergence dates of a ML phylogeny of seventy-three dated n-rec ORF sequences (Table 1 - line 1) estimated by the LSD method

and the dates from the MCC tree of a BEAST analysis of the same data Bars indicate the 95 CI ranges for both analyses Arrows indicate the period 1935 CEndash2016 CE

during which the isolates were collected

A J Gibbs et al | 5

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

TempEst analyses of the two datasets (73 n-rec ORFs and 166cores) found that both had a strong temporal signal correlationcoefficients 0436 and 0228 Pfrac14 000012 and 00032 for the com-plete and core sequences respectively They had small resid-uals with no trend and these were mostly associated with the Cphylogroup sequences and that of Chile 3 The estimatedTMRCAs were 14116 CE and 10768 CE respectively (Fig 5)Separate TempEst analyses of the seventy-three n-rec cores andninety-three rec cores in the 166 core dataset found that theseventy-three n-rec core sequences had a temporal correlationof 0448 (Pfrac14 000007) with a TMRCA of 14598 CE whereas the re-gression for the ninety-three rec core sequences was 00356(Pfrac14 073) Thus the temporal signal detected by TempEst in the166 core sequences seems to be mostly if not exclusively in itsseventy-three n-rec core sequences

The LSD method was used in its lsquoconstrainedrsquo mode (ie anancestral node must be older than its daughter nodes) to calcu-late the dates evolutionary rates and confidence intervalsfor the TMRCAs of the ML and NJ trees of the seventy-three ORFsequences and also their encoded amino acid sequences(Table 1) Estimates of the TMRCA dates obtained from completesequences were close to the mean of those from partitioned synand n-syn codon sequences (Table 1) However the TMRCA dateestimates from ML trees were 80ndash120 years earlier than thoseobtained from NJ trees and there were even larger differencesbetween the TMRCA date estimates from ORF versus aminoacid sequences the former were 250ndash290 years more ancientthan those from amino acid sequences Table 1 also records theestimated dates of the five principal nodes in the phylogeniesRandomizing the collection dates confirmed that the data con-tained a clear temporal signal as the ML phylogeny of seventy-three n-rec complete sequences had a TMRCA date of 1085 CEbut gave a mean TMRCA date from twenty collection date ran-domizations of 12555 BCE (standard deviation 7076 yearsrange 10 BCEndash17036 BCE) Thus the TMRCA date calculated withthe true sampling dates is outside the range of dates obtainedwith randomized collection dates and 19 standard deviationsfrom their mean

To et al (2015) stated that although LSD can analyze treesobtained by any method lsquomore accurate results are

expected from trees obtained using maximum-likelihoodmethodsrsquo which with our data gave the earliest TMRCA esti-mates and these were most precise in that their 95 percentconfidence limits were much smaller (Table 1) They estimatethat the present PVY population diverged from a common an-cestor around 1085 CE with commensurately more recentdates for the divergences within the phylogeny (Fig 6)

LSD analysis of the 166 core sequences gave TMRCA dates(Table 1) for the ML and NJ trees close to those estimated for theseventy-three complete sequences and all major nodeshadgt09 SH-type statistical support including that of the com-bined cluster of O R-1 and R-2 phylogroup sequences and itsmajor nodes The mean root date from twenty collection daterandomizations of the ML data was 11491 BCE (standard devia-tion 5716 years range 1292 BCEndash15269 BCE) again confirmingthe strong temporal signal

The lineages within the cluster of O R-1 and R-2 sequencesin the ML and NJ trees of core sequences when collapsed hadidentical groupings (Fig 4) although the dates estimated forthese lineages differed (Table 1 and Fig 4) For example the MLtree estimated the origins of O isolates R-1 isolates and R-2isolates to be 19182 CE 19467 CE and 19694 CE respectively(Fig 4) whereas the NJ tree estimates were all earlier and wererespectively 19062 CE 19333 CE and 19603 CE

In BEAST analyses of the two datasets the best-supportedmodel for mutational substitution was GTRthorn IthornC4 while thebest model for the rate of substitution was the lsquorelaxed uncorre-lated lognormalrsquo clock and a population of constant size Alldatasets passed date-randomization tests The phylogenies in-ferred by BEAST had essentially the same topology as those in-ferred by ML and NJ however the dates (Table 2) weresignificantly earlier than those estimated by the regressionmethods The TMRCA of the seventy-three n-rec ORF sequenceswas 1590 BCE and was twenty-four CE for the 166 cores The to-pology and dates of the other nodes were obtained from themaximum clade credibility tree were commensurately earlierand showed exactly the same pattern of slightly unresolvedclustering of the cluster of O R-1 R-2 core sequences as in theML and NJ trees (Fig 4A) with their origins estimated to be18329 CE 18806 and 1932 respectively (Fig 4B) The 166 core

Figure 5 The NJ phylogeny of the central core regions of the genomes of 166 dated C O N R-1 and R-2 isolates (A) the Accession Codes of the C isolates are blue O

red R-1 green R-2 orange and N yellow (B) Summary of the phylogroup composition of the collapsed clusters andashf The node dates are from LSD estimates of MJ and NJ

phylogenies and from the MCC phylogeny of a BEAST analysis These node dates were used to calculate the date scales assuming linearity Labeled arrows indicate the

dates of the likely origins of the O R-1 and R-2 populations

6 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

sequences were partitioned into those from the seventy-threefrom n-rec sequences and those from the ninety-three from recsequences they gave TMRCA estimates of 214 BCE and 7 CE re-spectively and passed the date randomization test indicatingthat they contained significant temporal signals in contrast tothe results of the TempEst tests

Part of Fig 5 summarizes and compares the TMRCAs andconfidence intervals estimated by the different methods andshows that despite the TMRCA estimates covering a six-foldrange the 95 confidence limits of most overlap between 500CE and 1500 CE

Finally a simple comparison was made of the basal branchlength of the PVY cluster as a proportion of the basal branchlength of the potyviruses A ML phylogeny was calculated fromfour representative PVY ORF sequences [AB331515 (N phy-logroup) EU563512 (C) FJ214726 (C) HM367075 (O)] aligned withthose of the other 103 representative potyviruses and rymovi-ruses used for Fig 1 of Gibbs Nguyen and Ohshima (2015) Themean patristic distances of the sequences connected throughthe root of the phylogeny (ie from Narcissus degeneration vi-rus NC_008824 Onion yellow dwarf virus NC_005029 Shallotyellow stripe virus NC_007433 Vallota speciosa virus isolateMarijiniup 7 JQ723475 to all the other potyviruses) was 2258substitutionssite and through the root of the PVY sequences(ie from AB331515 (N) to all the other PVYs) was 0227 substitu-tionssite a ratio of 9941

Figure 6 A cartoon summarizing the TMRCA dates and 95 CIs (vertical scale)

of PVY estimated by TempEst LSD (ML and NJ trees) and BEAST (MCC tree) anal-

yses of the seventy-three n-rec ORF and 161-core sequences together with his-

torical events that may have influenced the evolution of the virus

Table 2 BEAST dates of TMRCA nodes in phylogenies of PVY sequences

Dataset 73 n-rec completeORFs

All 166 core regions 73 n-rec core regions 93 rec core regions

Sequence length (nt) 9201 3501 3501 3501No of sequences 73 166 73 93Sampling date range 1938ndash2013 1938ndash2013 1938ndash2012 1970ndash2013TMRCAa (years) 3603 (1411ndash6566) 1989 (1084ndash3096) 2227 (951ndash3952) 2006 (688ndash3755)TMRCAb (dates) 1590 (602 to 4553) 24 (929 to1083) 214 (1062 to 1939) 7 (1325 to 1742)Substitution rate

(ntsiteyear) b

597 105 (250 105ndash956 105)

999 105 (688 105ndash132 104)

924 105 (433 105ndash139 104)

866 105 (371 105ndash137 104)

aTMRCA lsquotime to the most recent common ancestorrsquo years before 2013 95 credibility intervals (CI) in parenthesesbTMRCA dates positive dates are CE (Common Erafrac14AD) negative are BCE (frac14BC) 95 credibility intervals (CI) in parentheses

Table 1 LSD dates of the TMRCA and major nodes in phylogenies of 73 n-rec PVY sequences

Datasetb Datac Algorithmd Sites used Node datesa

Node 1e Node 2 Node 3 Node 4 Node 5 Node 6 Ratef (104ssy)

ORF Nucs ML All 10854 (1251ndash123) 12693 15512 16514 18982 19018 207 (098ndash255)n-syn codons 11583 (1265ndash792) 13208 15753 16805 19066 19034 278 (194ndash323)syn codons 10176 (1234ndash151) 12437 15383 16293 18813 1904 177 (092ndash221)

NJ All 12049 (1496ndash-17272) 12808 15152 14431 1890 18349 111 ((005ndash171)n-syn codons 12435 (1513 to 230) 13091 1543 14628 19019 18396 153 (055ndash228)syn codons 11525 (1467 to 72 108) 1242 14387 14598 18466 18284 091 (1 106ndash146)

AAs ML All 13636 (1501ndash238) 14271 1603 17021 1943 19336 105 (041ndash126)NJ All 14599 (1642ndash70) 1513 1615 17023 19331 19104 086 (023ndash119)

Core Nucs ML All 11512 (1296ndash703) 12599 16266 16228 19182 19257 202 (134ndash243)NJ All 13077 (1521 to 71) 13077 16248 15446 19062 18975 119 (040ndash157)

aNode dates positive dates are CE (Common Erafrac14AD) negative are BCE (frac14BC)bDataset ORFmdashmajor open reading frame Core nucleotides 2206 5706cData Nucs nucleotides AAs encoded amino acidsdAlgorithm ML maximum likelihood (PhyML) NJ neighbor-joining (ClustalX)eNodes numbered as in Fig 4 95 confidence intervals for Node 1 onlyfRate evolutionary rate substitutionssiteyear 95 confidence intervals for Node 1

A J Gibbs et al | 7

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

4 Discussion

This article reports that using syn codon sequences simplifiedthe resolution of relationships in a complex population of PVYgenomes The phylogenies and RDP analyses calculated from thesyn codon sequences were simpler to interpret than those calcu-lated from the comparable n-syn codon or complete sequencesand had greater statistical support Syn codon changes reflect thetemporal component of phylogenetic change of gene populationsbetter than n-syn codons as changes of the latter are also linkedto changes of the encoded protein and incoherent changes ofsyn and n-syn codons in complete sequences may confoundanalysis of either especially when rec sequences are presentSignificantly Ohshima et al (2016) also found that Bayesian esti-mates of the ages of cucumber mosaic cucumovirus (CMV) popu-lations were more precise if made using syn codon sequencesrather than complete sequences CMV has three genomic seg-ments which encode five ORFs and they found that the age esti-mates for different ORFs using syn codon sequences had acoefficient of variation around half that of complete sequencesand the 95 CI ranges of those estimates was around 25ndash50 per-cent smaller than those of the original complete sequences

The strategy of progressively removing the most strongly sup-ported cluster specific recombinants resolved the complexity ofthe relationships of the rec and n-rec ORFs In Fig 1A (ie a NJ tree)and in Fig 2 of Kehoe and Jones (2016) (ie an ML tree) the N phy-logroup forms an outlier of the R-2 cluster We have shown thatthis topology is false as the dominance of the recombinant link-ages between the phylogroups seriously compromise agglomera-tive methods of tree-building in the following way Whensequences of all PVY phylogroups are present in an analysis theR-1 sequences are more closely related to their O parent than theirN parent as the former provided a greater percentage (76 percent)of their sequence and therefore the R-1 and O phylogroups linkLikewise the R-2 sequences are more closely related to their R-1parent than their N parent as the former provided 62 percent oftheir sequence and so again the R-2 phylogroup links to the R-1phylogroup rather than the N parent Finally the N phylogroupclusters with the R-2 ORFs as they share two-thirds of their se-quence whereas the true links of the N phylogroup as the sisterlineage to all the other phylogroups are more distant Thus the in-ferred inter-phylogroup relationships when all are present arefalse (Figs 1A and 2A) and the true topology of the phylogroups isonly revealed when the recombinants are removed

In the second part of this project we attempted to date themajor features of the PVY phylogeny using different heterochro-nous tip-dating methods The resulting estimates althoughoverlapping as judged by their 95 CIs differed significantly inscale so that the TMRCAs estimated especially for the basalnodes by the probabilistic method were between two to fourtimes earlier than those estimated by the regression methodsSo here we discuss whether events of the history of PVY and itshosts are congruent with those of the PVY phylogeny and pro-vide in essence independent node dating that supports someestimates more than others

All our analyses supported a single PVY phylogeny that hasseveral distinctive features (Fig 6) PVY is a recent member ofthe lsquoPVY lineagersquo of potyviruses most of whose members wereisolated first or only from plants in the Americas (Gibbs andOhshima 2010) and although PVY was first identified in the UKit almost certainly came from the Americas Furthermore thebasal divergences of the PVY phylogeny produce three lineagesand one is the Chile 3 isolate found in Capsicum baccatum only inSouth America (Moury 2010) Thus the basal (TMRCA) node in

the PVY phylogeny most likely represents an event that oc-curred in South America The other two basal PVY lineageshave been found throughout the world but not yet in SouthAmerica The flora and fauna of South America was biologicallyisolated from Europe until the start of the lsquoColumbianExchangersquo in the late 15th century CE (Crosby 1972) when theearly trans-Atlantic maritime traders first took American ani-mals and plants including potato tubers to Europe and viceversa Thus the basal nodes of the PVY phylogeny are likely torepresent events that predate the late 15th century and thisconclusion agrees with all our date estimates of those nodesthe most recent is 1411 CE with estimates from regression anal-yses ranging from 1085 CE to 1411 CE and those from probabil-istic analyses from 1590 BCE to 24 BCE

The second basal lineage the C phylogroup probably di-verged in Europe and consists mostly (712 isolates) of isolatesfound in tobacco tomato and pepper crops Potato pepper to-mato and tobacco were first introduced to Europe during theColumbian Exchange from their domestication centres in theAndean regions of Peru and Bolivia (potato) more widely in theAndes (tomato tobacco) and further north in the Americas (pep-per) (Bai and Lindhout 2007 Pickersgill 2007 Brown andHenfling 2014 Kraft et al 2014) There is no evidence that PVY istransmitted by sexually produced seed so it was most likely in-troduced to Europe in the small numbers of S tuberosum tubersfirst taken from South America to the Canary Islands in 1562 CEto Spain in 1570 CE and the UK in 1588 CE if so then all the ini-tial PVY population of Europe probably originated from infectedS tuberosum tubers in those cargoes The divergence to severalnon-potato hosts probably occurred outside South America al-though it is unknown to what extent different C phylogroup iso-lates are ecologically adapted to particular hosts The datesestimated by the regression methods are congruent with thisinterpretation whereas those estimated by the probabilisticmethods are not (Fig 6 nodes 3 and 4)

The third basal lineage the N phylogroup is known onlyfrom a cluster of closely related isolates so it probably radiatedrecently and at present we have no idea of when and how itfirst migrated from the Americas It has been isolated world-wide mostly from potatoes (1315 isolates)

The earliest potato breeding programs of any size began in1810 but did not become widespread until the second half ofthe 19th century Initially there were very few potato introduc-tions to Europe and for the first two centuries the crops had lit-tle genetic diversity Virus diseases that seriously debilitatedestablished cultivars but did not pass through sexually pro-duced potato seed probably encouraged selection and as a re-sult more cultivars were grown in the late 18th and early 19thcenturies These were taken from natural berries produced byself pollination Most were selfed derivatives so selection re-duced the gene pool (Glenndinning 1983) Potato blight diseasecaused by the oomycete Phytophthora infestans appeared in themid 19th century and eliminated almost all cultivars as theywere so inbred further reducing the gene pool Breeding amongblight survivors and new introductions from the Andean regionof South America led to many new cultivars being grown byearly 20th century (Glenndinning 1983) This introduction ofnew potato germplasm led to genetic diversity in the cropwhich apparently introduced the selection pressure in the formof PVY resistance genes that diversified the PVY population inthe early 20th century Our analyses indicate that the parents ofthe recombinant R-1 and R-2 strains were members of existingO and N populations therefore probably European As the Nc re-sistance gene which C strains ellicit was widely distributed in

8 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

early potato cultivars (Bawden 1936 Cockerham 1943 Bawdenand Sheffield 1944) selection for avirulence to this gene waslikely to be an important factor in the diversification of the Ophylogroup population Similarly N strains but not O strainsinfect plants with both Nc and the less common Ny and Nzgenes so selection for avirulence would again have favored di-versity in the N phylogroup population (Cockerham 1970 Jones1990Chikh-Ali et al 2014 Kehoe and Jones 2016)

PVY strains causing veinal necrosis symptoms in tobaccowere first reported in 1935 in the UK (Smith and Dennis 1940)and soon afterwards in Brazil Europe and North America(Nobrega and Silberschmidt 1944 Bawden and Kassanis 19471951 Richardson 1958 Klinkowski and Schmelzer 1960Silberschmidt 1960 Todd 1961 Kahn and Monroe 1963 Brucher1969) However whether these strains included ones in both theN and R1 phylogroups is unknown Strains inducing this symp-tom in tobacco produce mild symptoms in potato plants so theysoon became widely distributed because infected plants wereoften missed when seed potato crops were rogued (Todd 1961Jones 1990) Possibly this increased spread can be attributed toemergence of the R1 population at that time but evidence forthat is lacking and none of the sequences we analyzed camefrom early isolates Although the R-1 population is not verydamaging to potato it is difficult to control as its symptoms areso mild and it overcomes resistance genes Nc No and Nz Bycontrast the R-2 population is more damaging as it often causestuber necrosis and still overcomes these three resistancegenes

The tuber necrosis isolates R-2 phylogroup were firstsighted in the early 1980s in Europe and shown to be recombi-nants (Beczner et al 1984 Le Romancer Kerlan and Nedellec1994 Boonham et al 2002 Lorenzen et al 2006) These reportsare likely to be accurate as R-2 infections are so noticeable andby the 1980s potato crops in Europe were intensively monitoredfor disease Our estimates of the origin of R-2 isolates were19694 CE 19603 CE and 19324 CE for the ML-LSD NJ-LSD andBEAST analyses respectively Thus the ML-LSD estimate agreesmost closely with the crop record and the BEAST estimateagrees least well but all are possible Similarly PVY infectionsthat may have been caused by R-1 isolates were first reportedin 1935 and soon confirmed to be widespread The ML-LSDNJ-LSD and BEAST methods placed the origin of R-1 as 19467CE 19333 CE and 18806 CE respectively Thus the NJ-LSD esti-mate is the most likely the BEAST estimate less likely and theML-LSD impossible

All our date estimates agree that the near simultaneous radi-ation of the O and N populations of PVY (Figs 4 and 6 nodes 5and 6) coincided with the earliest potato breeding programs ofany size These programs increased the genetic diversity ofavailable potato cultivars and this would have included the Nyand Nz resistance genes They also coincided with cultural andagronomic changes to seed and ware potato crop production(Singh et al 2008 Gray et al 2010 Jones 2014) A combination ofthese factors is likely to have led to conditions favoring recom-binants This scenario fits better with the LSD dates 19182 CEand 19062 CE respectively than with the earlier BEAST date of18329 CE

Most of our PVY date estimates are congruent with those ofthe recent history of the international potato crop but this isnot surprising as the estimated dates have very broad overlap-ping 95 CI estimates One crucial event might however informus whether some TMRCA estimates are more accurate thanothers and that is whether the divergence of the C phylogroup(Fig 6 nodes 3 and 4) occurred before or after PVY was

introduced to Europe in the late 16th century the TMRCAs esti-mated by the regression methods are congruent with aEuropean divergence of the C phylogroup whereas those esti-mated by the probabilistic methods indicate a much earlier di-vergence of the C phylogroup This dilemma will only beresolved when the genomic sequences of more Chile 3-like iso-lates or of the C phylogroup are known

Our estimates of the dates for the PVY phylogeny are some-what earlier than those reported by Visser Bellstedt and Pirie(2012) who analyzed rec n-rec and partitioned PVY genomic se-quences by probabilistic methods although our 95 CI esti-mates overlap The difference may merely be the result ofpopulation sampling differences as a much larger set of se-quences was available to us and it included the early dated iso-late KP691327 (1943 CE) A further difference between ouranalyses is the date given to sequence EU563512 which asVisser Bellstedt and Pirie (2012) reported was obtained lsquofrom apotato plant vegetatively propagated since 1938rsquo and sequencedin 2007 (Dullemans et al 2011) We dated this sequence as 1938the earliest of our sequences whereas Visser Bellstedt andPirie (2012) dated it as 2007 In making this choice we checkedhow this difference affected the date estimates of the seventy-three n-rec dataset and found that using the 1938 date gave anML-LSD TMRCA of 10854 CE (95 CI 1267 CEndash321 CE) whereasusing the 2007 date gave a TMRCA of 363 CE and very muchbroader 95 CIs (918 CE to 18 109 BCE) and when the se-quence was omitted an intermediate TMRCA of 687 CE (1016CEndash9503 BCE) Thus the 95 CI range was smallest using the1938 date and justified our choice

Finally we explored the possibility of extrapolating datesover an even larger timescale Gibbs et al (2008) suggested thatthe Neolithic invention of agriculture which in Eurasia occurredin the northern LevantineMesopotamian area around 12000BCE (Bellwood 2005 Pinhasi Fort and Ammerman 2005) pro-duced the conditions for the starburst potyvirus diversificationand this provides another possible dating point for potyvirusesWe estimated that in an ML phylogeny of the major ORFs of alarge representative set of potyviruses and four PVY isolates(one N one O and two C phylogroup isolates) the ratio of themean patristic distances of the sequences connected throughthe root of the phylogeny and through the root of the PVY se-quences was 9941 indicating that a TMRCA of all potyvirusesof around 10000 BCE would give a TMRCA of PVY around 1000CE

Supplementary data

Supplementary data are available at Virus Evolution online

Acknowledgements

We are very grateful to Hien To and Olivier Gascuel for greathelp with use of their LSD software

Conflict of interest None declared

ReferencesAbascal F Zardoya R and Telford M J (2010) lsquoTranslatorX

Multiple Alignment of Nucleotide Sequences Guided by AminoAcid Translationsrsquo Nucleic Acids Research 38 W7ndash13

Altschul S F et al (1990) lsquoBasic Local Alignment Search ToolrsquoJournal of Molecular Biology 215 403ndash10

A J Gibbs et al | 9

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

Bai Y and Lindhout P (2007) lsquoDomestication and Breeding ofTomatoes What Have We Gained and What Can We Gain inthe Futurersquo Annals of Botany 100 1085ndash94

Bawden F C (1936) lsquoThe Viruses Causing Top Necrosis(Acronecrosis) of the Potatorsquo Annals of Applied Biology 23487ndash97

and Kassanis B (1947) lsquoThe Behaviour of Some NaturallyOccurring Strains of Potato Virus Yrsquo Annals of Applied Biology31 503ndash16

and (1951) lsquoSerologically Related Strains of PotatoVirus Y That Are Mutually Antagonistic in Plantsrsquo Annals ofApplied Biology 38 402ndash10 [CrossRef]Mismatch]

and Sheffield F M L (1944) lsquoThe Relationships of SomeViruses Causing Necrotic Diseases of the Potatorsquo Annals ofApplied Biology 31 33ndash40

Beczner L et al (1984) lsquoStudies on the Etiology of Tuber NecroticRingspot Disease in Potatorsquo Potato Research 27 339ndash52

Bellwood P (2005) First Farmers The Origins of AgriculturalSocieties Malden Oxford and Carlton Blackwell Publishing

Boni M F Posada D and Feldman M W (2007) lsquoAn ExactNonparametric Method for Inferring Mosaic Structure inSequence Tripletsrsquo Genetics 176 1035ndash47

Boonham N et al (2002) lsquoBiological and Sequence Comparisonsof Potato virus Y Isolates Associated with Potato Tuber NecroticRingspot Diseasersquo Plant Pathology 51 117ndash26

Brown C R (1993) lsquoOrigin and History of the Potatorsquo AmericanJournal of Potato Research 70 363ndash73

and Henfling J W (2014) 0A History of the Potato0 inNavarre R and Pavek MJ (eds) The Potato Botany Productionand Use pp 1ndash11Wallingford UK CABI

Brucher H (1969) lsquoObservations on Origin and Spread of Y-N

Virus in South Americarsquo Angewandte Botanik 43 241ndash9Chikh-Ali M et al (2014) lsquoEvidence of a Monogenic Nature of

the Nz Gene Against Potato Virus Y Strain Z (PVYZ) in PotatorsquoAmerican Journal of Potato Research 91 649ndash54

Cockerham G (1943) lsquoThe Reactions of Potato Varieties toViruses X A B and Crsquo Annals of Applied Biology 30 338ndash44

(1970) lsquoGenetical Studies on Resistance to Potato Viruses Xand Yrsquo Heredity 25 309ndash48

Crosby A E (1972) 0The Columbian Exchange Biological and CulturalConsequences of 14940 268 pp Greenwood Publishing Group

Cuevas J M et al (2012) lsquoPhylogeography and MolecularEvolution of Potato Virus Yrsquo PLoS One 7 e37853

Darriba D et al (2011) lsquoProtTest 3 Fast Selection of Best-FitModels of Protein Evolutionrsquo Bioinformatics 27 1164ndash5

De Bokx J A and van der Want J P H (eds) (1987) 0Viruses ofPotatoes and Seed-Potato Production0 2nd edn WageningenCentre for Agricultural Publishing and Documentation

Drummond A J et al (2012) lsquoBayesian Phylogenetics with BEAUtiand the BEAST 17rsquo Molecular Biology and Evolution 29 1969ndash73

Duchene S (2015) lsquoThe Performance of the Date-RandomizationTest in Phylogenetic Analyses of Time-Structured Virus DatarsquoMolecular Biology and Evolution 32 1895ndash906

Dullemans A M et al (2011) lsquoComplete Nucleotide Sequence ofa Potato Isolate of Strain Group C of Potato Virus Y from 1938rsquoArchives of Virology 156 473ndash7

Fourment M and Gibbs M J (2006) lsquoPATRISTIC a Program forCalculating Patristic Distances and Graphically Comparing theComponents of Genetic Changersquo BMC Evolutionary Biology 6 1

Gibbs A J and Ohshima K (2010) lsquoPotyviruses in the DigitalAgersquo Annual Review of Phytopathology 48 205ndash23

et al (2008) lsquoThe Prehistory of Potyviruses Their InitialRadiation Was during the Dawn of Agriculturersquo PLoS One 3 e2523

Nguyen H D and Ohshima K (2015) lsquoThe lsquoEmergencersquo ofTurnip Mosaic Virus Was Probably a lsquoGene-for-Quasi-GenersquoEventrsquo Current Opinion in Virology 10 20ndash5

Armstrong J S and Gibbs A J (2000) lsquoSister-Scanning AMonte Carlo Procedure for Assessing Signals in RecombinantSequencesrsquo Bioinformatics 16 573ndash82

et al (2006) lsquoThe Variable Codons of H3 Influenza A VirusHaemagglutinin Genesrsquo Archives of Virology 52 11ndash24

Glais L Tribodet M and Kerlan C (2002) lsquoGenomic Variabilityin Potato Potyvirus Y (PVY) Evidence that PVYNW and PVYNTN

Variants Are Single to Multiple Recombinants Between PVYO

and PVYN Isolatesrsquo Archives of Virology 147 363ndash78Glenndinning D R (1983) lsquoPotato Introductions and Breeding up

to the 20th Centuryrsquo New Phytologist 94 479ndash505Gray S et al (2010) lsquoPotato Virus Y An Evolving Concern for Potato

Crops in the United States and Canadarsquo Plant Disease 94 1384ndash97Guindon S and Gascuel O (2003) lsquoA Simple Fast and Accurate

Algorithm to Estimate Large Phylogenies by MaximumLikelihoodrsquo Systematic Biology 52 696ndash704

Hall T A (1999) lsquoBioEdit A User-Friendly Biological SequenceAlignment Editor and Analysis Program for Windows 9598NTrsquo Nucleic Acids Symposium Series 41 95ndash8

Hawkes J G (1978) 0History of the Potato0 in Harris P M (ed) ThePotato Crop pp 1ndash14New York Springer

and Fransisco-Ortega J (1993) lsquoThe Early History of thePotato in Europersquo Euphytica 70 1ndash7

Holmes E C Worobey M and Rambaut A (1999) lsquoPhylogeneticEvidence for Recombination in Dengue Virusrsquo Molecular Biologyand Evolution 16 405ndash9

Hu X et al (2009a) lsquoSequence Characteristics of Potato Virus YRecombinantsrsquo Journal of General Virology 90 3033ndash41

et al (2009b) lsquoA Novel Recombinant Strain of Potato VirusY Suggests a New Viral Genetic Determinant of Vein Necrosisin Tobaccorsquo Virus Research 143 68ndash76

Huson D H and Bryant D (2006) lsquoApplication of PhylogeneticNetworks in Evolutionary Studiesrsquo Molecular Biology andEvolution 23 254ndash67

Jeanmougin F et al (1998) lsquoMultiple Sequence Alignment withClustal Xrsquo Trends in Biochemical Sciences 23 403ndash5

Jones R A C (1981) 0The Ecology of Viruses Infecting Wild andCultivated Potatoes in the Andean Region of South America0in Thresh J M (ed) Pests Pathogens and Vegetation pp 89ndash107London Pitman

(1990) lsquoStrain Group Specific and Virus Specific HypersensitiveReactions to Infection with Potyviruses in Potato CultivarsrsquoAnnals of Applied Biology 117 93ndash105 [CrossRef]Mismatch]

(2014) 0Virus Disease Problems Facing Potato IndustriesWorldwide Viruses Found Climate Change ImplicationsRationalising Virus Strain Nomenclature and Addressing thePotato Virus Y issue0 in Navarre R and Pavek M J (eds) ThePotato Botany Production and UsesWallingford UK CABI

and Kehoe M A (2016) lsquoA Proposal to Rationalize Within-Species Plant Virus Nomenclature Benefits and Implicationsof Inactionrsquo Archives of Virology 161 2051ndash7

Kahn R P and Monroe R L (1963) lsquoDetection of the TobaccoVeinal Necrosis Strain of Potato Virus in Solanum cardenasiiand S andigenum Introduced into the United StatesrsquoPhytopathology 53 1356ndash9

Karasev A V and Gray S M (2013) lsquoGenetic Diversity of Potatovirus Y Complexrsquo American Journal of Potato Research 90 7ndash13

et al (2011) lsquoGenetic Diversity and Origin of the OrdinaryStrain of Potato virus Y (PVY) and Origin of Recombinant PVYStrainsrsquo Phytopathology 101 778ndash85

10 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

Katoh K and Standley D M (2013) lsquoMAFFT Multiple SequenceAlignment Software Version 7 Improvements in Performanceand Usabilityrsquo Molecular Biology and Evolution 30 772ndash80

Kehoe M A and Jones R A C (2016) lsquoPotato virus Y StrainNomenclature Lessons From Comparing Isolates ObtainedOver a 73 Year Periodrsquo Plant Pathology 65 322ndash33

Kerlan C (2006) 0Potato Virus Y0 in Descriptions of Plant Viruses No414 Adams J and Antoniw J (eds) Rothamsted Research UK

and Moury B (2008) 0Potato Virus Y0 in Granoff A andWebster R G (eds) Encyclopedia of Virology 3rd edn pp 287ndash96New York Academic Press

Klinkowski M and Schmelzer K (1960) lsquoA Necrotic Type ofPotato Virus Yrsquo American Potato Journal 37 221ndash8

Kraft K H et al (2014) lsquoMultiple Lines of Evidence for the Originof Domesticated Chili Pepper Capsicum annuum in MexicorsquoProceedings of the National Academy of Science USA 111 6165ndash70

Le Romancer M Kerlan C and Nedellec M (1994) lsquoBiologicalCharacterization of Various Geographical Isolates of PotatoVirus Y Inducing Superficial Necrosis on Potato Tubersrsquo PlantPathology 43 138ndash44

Le S Q and Gascuel G (2008) lsquoAn Improved General Amino AcidReplacement Matrixrsquo Molecular Biology and Evolution 25 1307ndash20

Lemey P et al (2009) lsquoIdentifying Recombinants in Human andPrimate Immunodeficiency Virus Sequence Alignments UsingQuartet Scanningrsquo BMC Bioinformatics 10 126

Loebenstein G et al (eds) (2001) Virus and Virus-Like Diseases ofPotatoes and Production of Seed-Potatoes Dordrecht KluwerAcademic Publishers

Lorenzen J H et al (2006) lsquoWhole Genome Characterization ofPotato virus Y Isolates Collected in the Western USA and TheirComparison to Isolates From Europe and Canadarsquo Archives ofVirology 151 1055ndash74

Martin D and Rybicki E (2000) lsquoRDP Detection of RecombinationAmongst Aligned Sequencesrsquo Bioinformatics 16 562ndash3

Martin D P et al (2005) lsquoA Modified BOOTSCAN Algorithm forAutomated Identification of Recombinant Sequences andRecombination Breakpointsrsquo AIDS Research and HumanRetroviruses 21 98ndash102

et al (2015) lsquoRDP4 Detection and Analysis of RecombinationPatterns in Virus Genomesrsquo Virus Evolution 1 1ndash5

Milne I et al (2009) lsquoTOPALi v2 a Rich Graphical Interface forEvolutionary Analyses of Multiple Alignments on HPC Clustersand Multi-Core Desktopsrsquo Bioinformatics 25 126ndash7

Maynard Smith J (1992) lsquoAnalyzing the Mosaic Structure ofGenesrsquo Journal of Molecular Evolution 34 126ndash9

McGuire G and Wright F (2000) lsquoTOPAL 20 ImprovedDetection of Mosaic Sequences Within Multiple AlignmentsrsquoBioinformatics 16 130ndash4

Moury B (2010) lsquoA New Lineage Sheds Light on the EvolutionaryHistory of Potato Virus Yrsquo Molecular Plant Pathology 11 161ndash8

et al (2002) lsquoEvidence for Diversifying Selection in Potatovirusy and in the Coat Protein of Other Potyvirusesrsquo Journal ofGeneral Virology 83 2563ndash73

and Simon V (2011) lsquodNdS-Based Methods DetectPositive Selection Linked to Trade-Offs Between DifferentFitness Traits in the Coat Protein of Potato Virus Yrsquo MolecularBiology and Evolution 28 2707ndash17

Nei M and Li W H (1979) lsquoMathematical Model for StudyingGenetic Variation in Terms of Restriction EndonucleasesrsquoProceedings of the National Academy of Science USA 76 5269ndash73

Nobrega N R and Silberschmidt K (1944) lsquoSobre uma provavelvariante do virus ldquoYrdquo da batatinha (Solanum virus 2 Orton)que tern a peculiaridade de provocar necroses em plantas defumorsquo Arquivos Do Instituto Biologico (Sao Paulo) 15 307ndash30

Ogawa T et al (2008) lsquoGenetic Structure of a Population ofPotato Virus Y Inducing Potato Tuber Necrotic RingspotDisease in Japan Comparison with North American andEuropean Populationsrsquo Virus Research 131 199ndash212

et al (2012) lsquoThe Genetic Structure of Populations of PotatoVirus Y in Japan Based on the Analysis of 20 Full GenomicSequencesrsquo Journal of Phytopathology 160 661ndash73

Ohshima K et al (2016) lsquoTemporal Analysis of Reassortmentand Molecular Evolution of Cucumber Mosaic Virus ExtraClues From Its Segmented Genomersquo Virology 487 188ndash97

Padidam M Sawyer S and Fauquet C M (1999) lsquoPossibleEmergence of New Geminiviruses by Frequent RecombinationrsquoVirology 265 218ndash25

Pickersgill B (2007) lsquoDomestication of Plants in the AmericasInsights From Mendelian and Molecular Geneticsrsquo Annals ofBotany 100 925ndash40

Pinhasi R Fort J and Ammerman A J (2005) lsquoTracing the Originand Spread of Agriculture in Europersquo PLoS Biology 3 e410

Posada D and Crandall K A (2001) lsquoEvaluation of Methods forDetecting Recombination From DNA Sequences ComputerSimulationsrsquo Proceedings of the National Academy of Science USA98 13757ndash62

Rambaut A et al (2016) lsquoExploring the Temporal Structure ofHeterochronous Sequences Using TempEstrsquo Virus Evolution 2vew007

Ramsden C Holmes E C and Charleston M A (2009)lsquoHantavirus Evolution in Relation to Its Rodent and InsectivoreHosts No Evidence for Codivergencersquo Molecular BiologyEvolution 26 143ndash53

Richardson D E (1958) lsquoSome Observations on the Tobacco VeinalNecrosis Strain of Potato Virus Yrsquo Plant Pathology 7 133ndash5

Rowley J S Gray S M and Karasev A V (2015) lsquoScreeningPotato Cultivars for New Sources of Resistance to Potato VirusYrsquo American Journal of Potato Research 92 38ndash48

Salaman R N (1954) lsquoThe Early European Potato Its Characterand Place of Originrsquo Journal of the Linnean Society Botany 53 1ndash27

and Hawkes J G (1949) lsquoThe Character of the EarlyEuropean Potatorsquo Proceedings of the Linnean Society London 16171ndash84

Schubert J Fomitcheva V and Sztangret-Wisniewska J (2007)lsquoDifferentiation of Potato Virus Y Strains Using Improved Setsof Diagnostic PCR-Primersrsquo Journal of Virological Methods 14066ndash74

Shimodaira H and Hasegawa M (1999) lsquoMultiple Comparisonsof Log-Likelihoods with Applications to PhylogeneticInferencersquo Molecular Biology and Evolution 16 1114ndash6

Silberschmidt K M (1960) lsquoTypes of Potato Virus Y Necrotic toTobacco History and Recent Observationrsquo American PotatoJournal 37 151

Singh R P et al (2008) lsquoDiscussion Paper The Naming ofPotato Virus Y Strains Infecting Potatorsquo Archives of Virology153 1ndash13

Smith K M (1931) lsquoOn the Composite Nature of Certain PotatoVirus Diseases of the Mosaic Group as Revealed by the Use ofPlant Indicators and Selective Methods of TransmissionrsquoProceedings of the Royal Society B 109 251ndash66

and Dennis R W G (1940) lsquoSome Notes on a SuggestedVariant of Solanum Virus 2 (Potato Virus Y)rsquo Annals of AppliedBiology 27 65ndash70

Spetz C et al (2003) lsquoMolecular Resolution of Complexes ofPotyviruses Infecting Solanaceous Crops at the Centre ofOrigin in Perursquo Journal of General Virology 84 2565ndash78

Stevenson W R et al (eds) (2001) Compendium of Potato Diseases2nd edn St Paul MN APS Press

A J Gibbs et al | 11

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

Tavare S (1986) lsquoSome Probabilistic and Statistical Problems inthe Analysis of DNA Sequencesrsquo Lectures on Mathematics in theLife Sciences 17 57ndash86

Tian Y P et al (2011) lsquoGenetic Diversity of Potato Virus YInfecting Tobacco Crops in Chinarsquo Phytopathology 10 377ndash87

To T H et al (2015) lsquoFast Dating Using Least-Squares Criteriaand Algorithmsrsquo Systematic Biology 65 82ndash97

Todd J M (1961) lsquoTobacco Veinal Necrosis on Potato in ScotlandControl of the Outbreak and Some Characters of the Virusrsquo inProceedings of the Fourth Conference on Potato Virus DiseasesBraunschweig 1960 pp 82ndash92

Visser J C and Bellstedt D U (2009) lsquoAn Assessment ofMolecular Variability and Recombination Patterns in SouthAfrican Isolates of Potato Virus Yrsquo Archives of Virology 1541891ndash900

and Pirie M D (2012) lsquoThe Recent RecombinantEvolution of a Major Crop Pathogen Potato Virus Yrsquo PLoS One7 e50631

Xia X (2013) lsquoDAMBE5 A Comprehensive Software Package forData Analysis in Molecular Biology and Evolutionrsquo MolecularBiology and Evolution 30 1720ndash8

12 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

  • vex002-TF7
  • vex002-TF8
  • vex002-TF1
  • vex002-TF2
  • vex002-TF3
  • vex002-TF4
  • vex002-TF5
  • vex002-TF6
Page 6: The phylogenetics of the global population of potato virus Y and … · 2019. 4. 30. · The phylogenetics of the global population of potato virus Y and its necrogenic recombinants

TempEst analyses of the two datasets (73 n-rec ORFs and 166cores) found that both had a strong temporal signal correlationcoefficients 0436 and 0228 Pfrac14 000012 and 00032 for the com-plete and core sequences respectively They had small resid-uals with no trend and these were mostly associated with the Cphylogroup sequences and that of Chile 3 The estimatedTMRCAs were 14116 CE and 10768 CE respectively (Fig 5)Separate TempEst analyses of the seventy-three n-rec cores andninety-three rec cores in the 166 core dataset found that theseventy-three n-rec core sequences had a temporal correlationof 0448 (Pfrac14 000007) with a TMRCA of 14598 CE whereas the re-gression for the ninety-three rec core sequences was 00356(Pfrac14 073) Thus the temporal signal detected by TempEst in the166 core sequences seems to be mostly if not exclusively in itsseventy-three n-rec core sequences

The LSD method was used in its lsquoconstrainedrsquo mode (ie anancestral node must be older than its daughter nodes) to calcu-late the dates evolutionary rates and confidence intervalsfor the TMRCAs of the ML and NJ trees of the seventy-three ORFsequences and also their encoded amino acid sequences(Table 1) Estimates of the TMRCA dates obtained from completesequences were close to the mean of those from partitioned synand n-syn codon sequences (Table 1) However the TMRCA dateestimates from ML trees were 80ndash120 years earlier than thoseobtained from NJ trees and there were even larger differencesbetween the TMRCA date estimates from ORF versus aminoacid sequences the former were 250ndash290 years more ancientthan those from amino acid sequences Table 1 also records theestimated dates of the five principal nodes in the phylogeniesRandomizing the collection dates confirmed that the data con-tained a clear temporal signal as the ML phylogeny of seventy-three n-rec complete sequences had a TMRCA date of 1085 CEbut gave a mean TMRCA date from twenty collection date ran-domizations of 12555 BCE (standard deviation 7076 yearsrange 10 BCEndash17036 BCE) Thus the TMRCA date calculated withthe true sampling dates is outside the range of dates obtainedwith randomized collection dates and 19 standard deviationsfrom their mean

To et al (2015) stated that although LSD can analyze treesobtained by any method lsquomore accurate results are

expected from trees obtained using maximum-likelihoodmethodsrsquo which with our data gave the earliest TMRCA esti-mates and these were most precise in that their 95 percentconfidence limits were much smaller (Table 1) They estimatethat the present PVY population diverged from a common an-cestor around 1085 CE with commensurately more recentdates for the divergences within the phylogeny (Fig 6)

LSD analysis of the 166 core sequences gave TMRCA dates(Table 1) for the ML and NJ trees close to those estimated for theseventy-three complete sequences and all major nodeshadgt09 SH-type statistical support including that of the com-bined cluster of O R-1 and R-2 phylogroup sequences and itsmajor nodes The mean root date from twenty collection daterandomizations of the ML data was 11491 BCE (standard devia-tion 5716 years range 1292 BCEndash15269 BCE) again confirmingthe strong temporal signal

The lineages within the cluster of O R-1 and R-2 sequencesin the ML and NJ trees of core sequences when collapsed hadidentical groupings (Fig 4) although the dates estimated forthese lineages differed (Table 1 and Fig 4) For example the MLtree estimated the origins of O isolates R-1 isolates and R-2isolates to be 19182 CE 19467 CE and 19694 CE respectively(Fig 4) whereas the NJ tree estimates were all earlier and wererespectively 19062 CE 19333 CE and 19603 CE

In BEAST analyses of the two datasets the best-supportedmodel for mutational substitution was GTRthorn IthornC4 while thebest model for the rate of substitution was the lsquorelaxed uncorre-lated lognormalrsquo clock and a population of constant size Alldatasets passed date-randomization tests The phylogenies in-ferred by BEAST had essentially the same topology as those in-ferred by ML and NJ however the dates (Table 2) weresignificantly earlier than those estimated by the regressionmethods The TMRCA of the seventy-three n-rec ORF sequenceswas 1590 BCE and was twenty-four CE for the 166 cores The to-pology and dates of the other nodes were obtained from themaximum clade credibility tree were commensurately earlierand showed exactly the same pattern of slightly unresolvedclustering of the cluster of O R-1 R-2 core sequences as in theML and NJ trees (Fig 4A) with their origins estimated to be18329 CE 18806 and 1932 respectively (Fig 4B) The 166 core

Figure 5 The NJ phylogeny of the central core regions of the genomes of 166 dated C O N R-1 and R-2 isolates (A) the Accession Codes of the C isolates are blue O

red R-1 green R-2 orange and N yellow (B) Summary of the phylogroup composition of the collapsed clusters andashf The node dates are from LSD estimates of MJ and NJ

phylogenies and from the MCC phylogeny of a BEAST analysis These node dates were used to calculate the date scales assuming linearity Labeled arrows indicate the

dates of the likely origins of the O R-1 and R-2 populations

6 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

sequences were partitioned into those from the seventy-threefrom n-rec sequences and those from the ninety-three from recsequences they gave TMRCA estimates of 214 BCE and 7 CE re-spectively and passed the date randomization test indicatingthat they contained significant temporal signals in contrast tothe results of the TempEst tests

Part of Fig 5 summarizes and compares the TMRCAs andconfidence intervals estimated by the different methods andshows that despite the TMRCA estimates covering a six-foldrange the 95 confidence limits of most overlap between 500CE and 1500 CE

Finally a simple comparison was made of the basal branchlength of the PVY cluster as a proportion of the basal branchlength of the potyviruses A ML phylogeny was calculated fromfour representative PVY ORF sequences [AB331515 (N phy-logroup) EU563512 (C) FJ214726 (C) HM367075 (O)] aligned withthose of the other 103 representative potyviruses and rymovi-ruses used for Fig 1 of Gibbs Nguyen and Ohshima (2015) Themean patristic distances of the sequences connected throughthe root of the phylogeny (ie from Narcissus degeneration vi-rus NC_008824 Onion yellow dwarf virus NC_005029 Shallotyellow stripe virus NC_007433 Vallota speciosa virus isolateMarijiniup 7 JQ723475 to all the other potyviruses) was 2258substitutionssite and through the root of the PVY sequences(ie from AB331515 (N) to all the other PVYs) was 0227 substitu-tionssite a ratio of 9941

Figure 6 A cartoon summarizing the TMRCA dates and 95 CIs (vertical scale)

of PVY estimated by TempEst LSD (ML and NJ trees) and BEAST (MCC tree) anal-

yses of the seventy-three n-rec ORF and 161-core sequences together with his-

torical events that may have influenced the evolution of the virus

Table 2 BEAST dates of TMRCA nodes in phylogenies of PVY sequences

Dataset 73 n-rec completeORFs

All 166 core regions 73 n-rec core regions 93 rec core regions

Sequence length (nt) 9201 3501 3501 3501No of sequences 73 166 73 93Sampling date range 1938ndash2013 1938ndash2013 1938ndash2012 1970ndash2013TMRCAa (years) 3603 (1411ndash6566) 1989 (1084ndash3096) 2227 (951ndash3952) 2006 (688ndash3755)TMRCAb (dates) 1590 (602 to 4553) 24 (929 to1083) 214 (1062 to 1939) 7 (1325 to 1742)Substitution rate

(ntsiteyear) b

597 105 (250 105ndash956 105)

999 105 (688 105ndash132 104)

924 105 (433 105ndash139 104)

866 105 (371 105ndash137 104)

aTMRCA lsquotime to the most recent common ancestorrsquo years before 2013 95 credibility intervals (CI) in parenthesesbTMRCA dates positive dates are CE (Common Erafrac14AD) negative are BCE (frac14BC) 95 credibility intervals (CI) in parentheses

Table 1 LSD dates of the TMRCA and major nodes in phylogenies of 73 n-rec PVY sequences

Datasetb Datac Algorithmd Sites used Node datesa

Node 1e Node 2 Node 3 Node 4 Node 5 Node 6 Ratef (104ssy)

ORF Nucs ML All 10854 (1251ndash123) 12693 15512 16514 18982 19018 207 (098ndash255)n-syn codons 11583 (1265ndash792) 13208 15753 16805 19066 19034 278 (194ndash323)syn codons 10176 (1234ndash151) 12437 15383 16293 18813 1904 177 (092ndash221)

NJ All 12049 (1496ndash-17272) 12808 15152 14431 1890 18349 111 ((005ndash171)n-syn codons 12435 (1513 to 230) 13091 1543 14628 19019 18396 153 (055ndash228)syn codons 11525 (1467 to 72 108) 1242 14387 14598 18466 18284 091 (1 106ndash146)

AAs ML All 13636 (1501ndash238) 14271 1603 17021 1943 19336 105 (041ndash126)NJ All 14599 (1642ndash70) 1513 1615 17023 19331 19104 086 (023ndash119)

Core Nucs ML All 11512 (1296ndash703) 12599 16266 16228 19182 19257 202 (134ndash243)NJ All 13077 (1521 to 71) 13077 16248 15446 19062 18975 119 (040ndash157)

aNode dates positive dates are CE (Common Erafrac14AD) negative are BCE (frac14BC)bDataset ORFmdashmajor open reading frame Core nucleotides 2206 5706cData Nucs nucleotides AAs encoded amino acidsdAlgorithm ML maximum likelihood (PhyML) NJ neighbor-joining (ClustalX)eNodes numbered as in Fig 4 95 confidence intervals for Node 1 onlyfRate evolutionary rate substitutionssiteyear 95 confidence intervals for Node 1

A J Gibbs et al | 7

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

4 Discussion

This article reports that using syn codon sequences simplifiedthe resolution of relationships in a complex population of PVYgenomes The phylogenies and RDP analyses calculated from thesyn codon sequences were simpler to interpret than those calcu-lated from the comparable n-syn codon or complete sequencesand had greater statistical support Syn codon changes reflect thetemporal component of phylogenetic change of gene populationsbetter than n-syn codons as changes of the latter are also linkedto changes of the encoded protein and incoherent changes ofsyn and n-syn codons in complete sequences may confoundanalysis of either especially when rec sequences are presentSignificantly Ohshima et al (2016) also found that Bayesian esti-mates of the ages of cucumber mosaic cucumovirus (CMV) popu-lations were more precise if made using syn codon sequencesrather than complete sequences CMV has three genomic seg-ments which encode five ORFs and they found that the age esti-mates for different ORFs using syn codon sequences had acoefficient of variation around half that of complete sequencesand the 95 CI ranges of those estimates was around 25ndash50 per-cent smaller than those of the original complete sequences

The strategy of progressively removing the most strongly sup-ported cluster specific recombinants resolved the complexity ofthe relationships of the rec and n-rec ORFs In Fig 1A (ie a NJ tree)and in Fig 2 of Kehoe and Jones (2016) (ie an ML tree) the N phy-logroup forms an outlier of the R-2 cluster We have shown thatthis topology is false as the dominance of the recombinant link-ages between the phylogroups seriously compromise agglomera-tive methods of tree-building in the following way Whensequences of all PVY phylogroups are present in an analysis theR-1 sequences are more closely related to their O parent than theirN parent as the former provided a greater percentage (76 percent)of their sequence and therefore the R-1 and O phylogroups linkLikewise the R-2 sequences are more closely related to their R-1parent than their N parent as the former provided 62 percent oftheir sequence and so again the R-2 phylogroup links to the R-1phylogroup rather than the N parent Finally the N phylogroupclusters with the R-2 ORFs as they share two-thirds of their se-quence whereas the true links of the N phylogroup as the sisterlineage to all the other phylogroups are more distant Thus the in-ferred inter-phylogroup relationships when all are present arefalse (Figs 1A and 2A) and the true topology of the phylogroups isonly revealed when the recombinants are removed

In the second part of this project we attempted to date themajor features of the PVY phylogeny using different heterochro-nous tip-dating methods The resulting estimates althoughoverlapping as judged by their 95 CIs differed significantly inscale so that the TMRCAs estimated especially for the basalnodes by the probabilistic method were between two to fourtimes earlier than those estimated by the regression methodsSo here we discuss whether events of the history of PVY and itshosts are congruent with those of the PVY phylogeny and pro-vide in essence independent node dating that supports someestimates more than others

All our analyses supported a single PVY phylogeny that hasseveral distinctive features (Fig 6) PVY is a recent member ofthe lsquoPVY lineagersquo of potyviruses most of whose members wereisolated first or only from plants in the Americas (Gibbs andOhshima 2010) and although PVY was first identified in the UKit almost certainly came from the Americas Furthermore thebasal divergences of the PVY phylogeny produce three lineagesand one is the Chile 3 isolate found in Capsicum baccatum only inSouth America (Moury 2010) Thus the basal (TMRCA) node in

the PVY phylogeny most likely represents an event that oc-curred in South America The other two basal PVY lineageshave been found throughout the world but not yet in SouthAmerica The flora and fauna of South America was biologicallyisolated from Europe until the start of the lsquoColumbianExchangersquo in the late 15th century CE (Crosby 1972) when theearly trans-Atlantic maritime traders first took American ani-mals and plants including potato tubers to Europe and viceversa Thus the basal nodes of the PVY phylogeny are likely torepresent events that predate the late 15th century and thisconclusion agrees with all our date estimates of those nodesthe most recent is 1411 CE with estimates from regression anal-yses ranging from 1085 CE to 1411 CE and those from probabil-istic analyses from 1590 BCE to 24 BCE

The second basal lineage the C phylogroup probably di-verged in Europe and consists mostly (712 isolates) of isolatesfound in tobacco tomato and pepper crops Potato pepper to-mato and tobacco were first introduced to Europe during theColumbian Exchange from their domestication centres in theAndean regions of Peru and Bolivia (potato) more widely in theAndes (tomato tobacco) and further north in the Americas (pep-per) (Bai and Lindhout 2007 Pickersgill 2007 Brown andHenfling 2014 Kraft et al 2014) There is no evidence that PVY istransmitted by sexually produced seed so it was most likely in-troduced to Europe in the small numbers of S tuberosum tubersfirst taken from South America to the Canary Islands in 1562 CEto Spain in 1570 CE and the UK in 1588 CE if so then all the ini-tial PVY population of Europe probably originated from infectedS tuberosum tubers in those cargoes The divergence to severalnon-potato hosts probably occurred outside South America al-though it is unknown to what extent different C phylogroup iso-lates are ecologically adapted to particular hosts The datesestimated by the regression methods are congruent with thisinterpretation whereas those estimated by the probabilisticmethods are not (Fig 6 nodes 3 and 4)

The third basal lineage the N phylogroup is known onlyfrom a cluster of closely related isolates so it probably radiatedrecently and at present we have no idea of when and how itfirst migrated from the Americas It has been isolated world-wide mostly from potatoes (1315 isolates)

The earliest potato breeding programs of any size began in1810 but did not become widespread until the second half ofthe 19th century Initially there were very few potato introduc-tions to Europe and for the first two centuries the crops had lit-tle genetic diversity Virus diseases that seriously debilitatedestablished cultivars but did not pass through sexually pro-duced potato seed probably encouraged selection and as a re-sult more cultivars were grown in the late 18th and early 19thcenturies These were taken from natural berries produced byself pollination Most were selfed derivatives so selection re-duced the gene pool (Glenndinning 1983) Potato blight diseasecaused by the oomycete Phytophthora infestans appeared in themid 19th century and eliminated almost all cultivars as theywere so inbred further reducing the gene pool Breeding amongblight survivors and new introductions from the Andean regionof South America led to many new cultivars being grown byearly 20th century (Glenndinning 1983) This introduction ofnew potato germplasm led to genetic diversity in the cropwhich apparently introduced the selection pressure in the formof PVY resistance genes that diversified the PVY population inthe early 20th century Our analyses indicate that the parents ofthe recombinant R-1 and R-2 strains were members of existingO and N populations therefore probably European As the Nc re-sistance gene which C strains ellicit was widely distributed in

8 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

early potato cultivars (Bawden 1936 Cockerham 1943 Bawdenand Sheffield 1944) selection for avirulence to this gene waslikely to be an important factor in the diversification of the Ophylogroup population Similarly N strains but not O strainsinfect plants with both Nc and the less common Ny and Nzgenes so selection for avirulence would again have favored di-versity in the N phylogroup population (Cockerham 1970 Jones1990Chikh-Ali et al 2014 Kehoe and Jones 2016)

PVY strains causing veinal necrosis symptoms in tobaccowere first reported in 1935 in the UK (Smith and Dennis 1940)and soon afterwards in Brazil Europe and North America(Nobrega and Silberschmidt 1944 Bawden and Kassanis 19471951 Richardson 1958 Klinkowski and Schmelzer 1960Silberschmidt 1960 Todd 1961 Kahn and Monroe 1963 Brucher1969) However whether these strains included ones in both theN and R1 phylogroups is unknown Strains inducing this symp-tom in tobacco produce mild symptoms in potato plants so theysoon became widely distributed because infected plants wereoften missed when seed potato crops were rogued (Todd 1961Jones 1990) Possibly this increased spread can be attributed toemergence of the R1 population at that time but evidence forthat is lacking and none of the sequences we analyzed camefrom early isolates Although the R-1 population is not verydamaging to potato it is difficult to control as its symptoms areso mild and it overcomes resistance genes Nc No and Nz Bycontrast the R-2 population is more damaging as it often causestuber necrosis and still overcomes these three resistancegenes

The tuber necrosis isolates R-2 phylogroup were firstsighted in the early 1980s in Europe and shown to be recombi-nants (Beczner et al 1984 Le Romancer Kerlan and Nedellec1994 Boonham et al 2002 Lorenzen et al 2006) These reportsare likely to be accurate as R-2 infections are so noticeable andby the 1980s potato crops in Europe were intensively monitoredfor disease Our estimates of the origin of R-2 isolates were19694 CE 19603 CE and 19324 CE for the ML-LSD NJ-LSD andBEAST analyses respectively Thus the ML-LSD estimate agreesmost closely with the crop record and the BEAST estimateagrees least well but all are possible Similarly PVY infectionsthat may have been caused by R-1 isolates were first reportedin 1935 and soon confirmed to be widespread The ML-LSDNJ-LSD and BEAST methods placed the origin of R-1 as 19467CE 19333 CE and 18806 CE respectively Thus the NJ-LSD esti-mate is the most likely the BEAST estimate less likely and theML-LSD impossible

All our date estimates agree that the near simultaneous radi-ation of the O and N populations of PVY (Figs 4 and 6 nodes 5and 6) coincided with the earliest potato breeding programs ofany size These programs increased the genetic diversity ofavailable potato cultivars and this would have included the Nyand Nz resistance genes They also coincided with cultural andagronomic changes to seed and ware potato crop production(Singh et al 2008 Gray et al 2010 Jones 2014) A combination ofthese factors is likely to have led to conditions favoring recom-binants This scenario fits better with the LSD dates 19182 CEand 19062 CE respectively than with the earlier BEAST date of18329 CE

Most of our PVY date estimates are congruent with those ofthe recent history of the international potato crop but this isnot surprising as the estimated dates have very broad overlap-ping 95 CI estimates One crucial event might however informus whether some TMRCA estimates are more accurate thanothers and that is whether the divergence of the C phylogroup(Fig 6 nodes 3 and 4) occurred before or after PVY was

introduced to Europe in the late 16th century the TMRCAs esti-mated by the regression methods are congruent with aEuropean divergence of the C phylogroup whereas those esti-mated by the probabilistic methods indicate a much earlier di-vergence of the C phylogroup This dilemma will only beresolved when the genomic sequences of more Chile 3-like iso-lates or of the C phylogroup are known

Our estimates of the dates for the PVY phylogeny are some-what earlier than those reported by Visser Bellstedt and Pirie(2012) who analyzed rec n-rec and partitioned PVY genomic se-quences by probabilistic methods although our 95 CI esti-mates overlap The difference may merely be the result ofpopulation sampling differences as a much larger set of se-quences was available to us and it included the early dated iso-late KP691327 (1943 CE) A further difference between ouranalyses is the date given to sequence EU563512 which asVisser Bellstedt and Pirie (2012) reported was obtained lsquofrom apotato plant vegetatively propagated since 1938rsquo and sequencedin 2007 (Dullemans et al 2011) We dated this sequence as 1938the earliest of our sequences whereas Visser Bellstedt andPirie (2012) dated it as 2007 In making this choice we checkedhow this difference affected the date estimates of the seventy-three n-rec dataset and found that using the 1938 date gave anML-LSD TMRCA of 10854 CE (95 CI 1267 CEndash321 CE) whereasusing the 2007 date gave a TMRCA of 363 CE and very muchbroader 95 CIs (918 CE to 18 109 BCE) and when the se-quence was omitted an intermediate TMRCA of 687 CE (1016CEndash9503 BCE) Thus the 95 CI range was smallest using the1938 date and justified our choice

Finally we explored the possibility of extrapolating datesover an even larger timescale Gibbs et al (2008) suggested thatthe Neolithic invention of agriculture which in Eurasia occurredin the northern LevantineMesopotamian area around 12000BCE (Bellwood 2005 Pinhasi Fort and Ammerman 2005) pro-duced the conditions for the starburst potyvirus diversificationand this provides another possible dating point for potyvirusesWe estimated that in an ML phylogeny of the major ORFs of alarge representative set of potyviruses and four PVY isolates(one N one O and two C phylogroup isolates) the ratio of themean patristic distances of the sequences connected throughthe root of the phylogeny and through the root of the PVY se-quences was 9941 indicating that a TMRCA of all potyvirusesof around 10000 BCE would give a TMRCA of PVY around 1000CE

Supplementary data

Supplementary data are available at Virus Evolution online

Acknowledgements

We are very grateful to Hien To and Olivier Gascuel for greathelp with use of their LSD software

Conflict of interest None declared

ReferencesAbascal F Zardoya R and Telford M J (2010) lsquoTranslatorX

Multiple Alignment of Nucleotide Sequences Guided by AminoAcid Translationsrsquo Nucleic Acids Research 38 W7ndash13

Altschul S F et al (1990) lsquoBasic Local Alignment Search ToolrsquoJournal of Molecular Biology 215 403ndash10

A J Gibbs et al | 9

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

Bai Y and Lindhout P (2007) lsquoDomestication and Breeding ofTomatoes What Have We Gained and What Can We Gain inthe Futurersquo Annals of Botany 100 1085ndash94

Bawden F C (1936) lsquoThe Viruses Causing Top Necrosis(Acronecrosis) of the Potatorsquo Annals of Applied Biology 23487ndash97

and Kassanis B (1947) lsquoThe Behaviour of Some NaturallyOccurring Strains of Potato Virus Yrsquo Annals of Applied Biology31 503ndash16

and (1951) lsquoSerologically Related Strains of PotatoVirus Y That Are Mutually Antagonistic in Plantsrsquo Annals ofApplied Biology 38 402ndash10 [CrossRef]Mismatch]

and Sheffield F M L (1944) lsquoThe Relationships of SomeViruses Causing Necrotic Diseases of the Potatorsquo Annals ofApplied Biology 31 33ndash40

Beczner L et al (1984) lsquoStudies on the Etiology of Tuber NecroticRingspot Disease in Potatorsquo Potato Research 27 339ndash52

Bellwood P (2005) First Farmers The Origins of AgriculturalSocieties Malden Oxford and Carlton Blackwell Publishing

Boni M F Posada D and Feldman M W (2007) lsquoAn ExactNonparametric Method for Inferring Mosaic Structure inSequence Tripletsrsquo Genetics 176 1035ndash47

Boonham N et al (2002) lsquoBiological and Sequence Comparisonsof Potato virus Y Isolates Associated with Potato Tuber NecroticRingspot Diseasersquo Plant Pathology 51 117ndash26

Brown C R (1993) lsquoOrigin and History of the Potatorsquo AmericanJournal of Potato Research 70 363ndash73

and Henfling J W (2014) 0A History of the Potato0 inNavarre R and Pavek MJ (eds) The Potato Botany Productionand Use pp 1ndash11Wallingford UK CABI

Brucher H (1969) lsquoObservations on Origin and Spread of Y-N

Virus in South Americarsquo Angewandte Botanik 43 241ndash9Chikh-Ali M et al (2014) lsquoEvidence of a Monogenic Nature of

the Nz Gene Against Potato Virus Y Strain Z (PVYZ) in PotatorsquoAmerican Journal of Potato Research 91 649ndash54

Cockerham G (1943) lsquoThe Reactions of Potato Varieties toViruses X A B and Crsquo Annals of Applied Biology 30 338ndash44

(1970) lsquoGenetical Studies on Resistance to Potato Viruses Xand Yrsquo Heredity 25 309ndash48

Crosby A E (1972) 0The Columbian Exchange Biological and CulturalConsequences of 14940 268 pp Greenwood Publishing Group

Cuevas J M et al (2012) lsquoPhylogeography and MolecularEvolution of Potato Virus Yrsquo PLoS One 7 e37853

Darriba D et al (2011) lsquoProtTest 3 Fast Selection of Best-FitModels of Protein Evolutionrsquo Bioinformatics 27 1164ndash5

De Bokx J A and van der Want J P H (eds) (1987) 0Viruses ofPotatoes and Seed-Potato Production0 2nd edn WageningenCentre for Agricultural Publishing and Documentation

Drummond A J et al (2012) lsquoBayesian Phylogenetics with BEAUtiand the BEAST 17rsquo Molecular Biology and Evolution 29 1969ndash73

Duchene S (2015) lsquoThe Performance of the Date-RandomizationTest in Phylogenetic Analyses of Time-Structured Virus DatarsquoMolecular Biology and Evolution 32 1895ndash906

Dullemans A M et al (2011) lsquoComplete Nucleotide Sequence ofa Potato Isolate of Strain Group C of Potato Virus Y from 1938rsquoArchives of Virology 156 473ndash7

Fourment M and Gibbs M J (2006) lsquoPATRISTIC a Program forCalculating Patristic Distances and Graphically Comparing theComponents of Genetic Changersquo BMC Evolutionary Biology 6 1

Gibbs A J and Ohshima K (2010) lsquoPotyviruses in the DigitalAgersquo Annual Review of Phytopathology 48 205ndash23

et al (2008) lsquoThe Prehistory of Potyviruses Their InitialRadiation Was during the Dawn of Agriculturersquo PLoS One 3 e2523

Nguyen H D and Ohshima K (2015) lsquoThe lsquoEmergencersquo ofTurnip Mosaic Virus Was Probably a lsquoGene-for-Quasi-GenersquoEventrsquo Current Opinion in Virology 10 20ndash5

Armstrong J S and Gibbs A J (2000) lsquoSister-Scanning AMonte Carlo Procedure for Assessing Signals in RecombinantSequencesrsquo Bioinformatics 16 573ndash82

et al (2006) lsquoThe Variable Codons of H3 Influenza A VirusHaemagglutinin Genesrsquo Archives of Virology 52 11ndash24

Glais L Tribodet M and Kerlan C (2002) lsquoGenomic Variabilityin Potato Potyvirus Y (PVY) Evidence that PVYNW and PVYNTN

Variants Are Single to Multiple Recombinants Between PVYO

and PVYN Isolatesrsquo Archives of Virology 147 363ndash78Glenndinning D R (1983) lsquoPotato Introductions and Breeding up

to the 20th Centuryrsquo New Phytologist 94 479ndash505Gray S et al (2010) lsquoPotato Virus Y An Evolving Concern for Potato

Crops in the United States and Canadarsquo Plant Disease 94 1384ndash97Guindon S and Gascuel O (2003) lsquoA Simple Fast and Accurate

Algorithm to Estimate Large Phylogenies by MaximumLikelihoodrsquo Systematic Biology 52 696ndash704

Hall T A (1999) lsquoBioEdit A User-Friendly Biological SequenceAlignment Editor and Analysis Program for Windows 9598NTrsquo Nucleic Acids Symposium Series 41 95ndash8

Hawkes J G (1978) 0History of the Potato0 in Harris P M (ed) ThePotato Crop pp 1ndash14New York Springer

and Fransisco-Ortega J (1993) lsquoThe Early History of thePotato in Europersquo Euphytica 70 1ndash7

Holmes E C Worobey M and Rambaut A (1999) lsquoPhylogeneticEvidence for Recombination in Dengue Virusrsquo Molecular Biologyand Evolution 16 405ndash9

Hu X et al (2009a) lsquoSequence Characteristics of Potato Virus YRecombinantsrsquo Journal of General Virology 90 3033ndash41

et al (2009b) lsquoA Novel Recombinant Strain of Potato VirusY Suggests a New Viral Genetic Determinant of Vein Necrosisin Tobaccorsquo Virus Research 143 68ndash76

Huson D H and Bryant D (2006) lsquoApplication of PhylogeneticNetworks in Evolutionary Studiesrsquo Molecular Biology andEvolution 23 254ndash67

Jeanmougin F et al (1998) lsquoMultiple Sequence Alignment withClustal Xrsquo Trends in Biochemical Sciences 23 403ndash5

Jones R A C (1981) 0The Ecology of Viruses Infecting Wild andCultivated Potatoes in the Andean Region of South America0in Thresh J M (ed) Pests Pathogens and Vegetation pp 89ndash107London Pitman

(1990) lsquoStrain Group Specific and Virus Specific HypersensitiveReactions to Infection with Potyviruses in Potato CultivarsrsquoAnnals of Applied Biology 117 93ndash105 [CrossRef]Mismatch]

(2014) 0Virus Disease Problems Facing Potato IndustriesWorldwide Viruses Found Climate Change ImplicationsRationalising Virus Strain Nomenclature and Addressing thePotato Virus Y issue0 in Navarre R and Pavek M J (eds) ThePotato Botany Production and UsesWallingford UK CABI

and Kehoe M A (2016) lsquoA Proposal to Rationalize Within-Species Plant Virus Nomenclature Benefits and Implicationsof Inactionrsquo Archives of Virology 161 2051ndash7

Kahn R P and Monroe R L (1963) lsquoDetection of the TobaccoVeinal Necrosis Strain of Potato Virus in Solanum cardenasiiand S andigenum Introduced into the United StatesrsquoPhytopathology 53 1356ndash9

Karasev A V and Gray S M (2013) lsquoGenetic Diversity of Potatovirus Y Complexrsquo American Journal of Potato Research 90 7ndash13

et al (2011) lsquoGenetic Diversity and Origin of the OrdinaryStrain of Potato virus Y (PVY) and Origin of Recombinant PVYStrainsrsquo Phytopathology 101 778ndash85

10 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

Katoh K and Standley D M (2013) lsquoMAFFT Multiple SequenceAlignment Software Version 7 Improvements in Performanceand Usabilityrsquo Molecular Biology and Evolution 30 772ndash80

Kehoe M A and Jones R A C (2016) lsquoPotato virus Y StrainNomenclature Lessons From Comparing Isolates ObtainedOver a 73 Year Periodrsquo Plant Pathology 65 322ndash33

Kerlan C (2006) 0Potato Virus Y0 in Descriptions of Plant Viruses No414 Adams J and Antoniw J (eds) Rothamsted Research UK

and Moury B (2008) 0Potato Virus Y0 in Granoff A andWebster R G (eds) Encyclopedia of Virology 3rd edn pp 287ndash96New York Academic Press

Klinkowski M and Schmelzer K (1960) lsquoA Necrotic Type ofPotato Virus Yrsquo American Potato Journal 37 221ndash8

Kraft K H et al (2014) lsquoMultiple Lines of Evidence for the Originof Domesticated Chili Pepper Capsicum annuum in MexicorsquoProceedings of the National Academy of Science USA 111 6165ndash70

Le Romancer M Kerlan C and Nedellec M (1994) lsquoBiologicalCharacterization of Various Geographical Isolates of PotatoVirus Y Inducing Superficial Necrosis on Potato Tubersrsquo PlantPathology 43 138ndash44

Le S Q and Gascuel G (2008) lsquoAn Improved General Amino AcidReplacement Matrixrsquo Molecular Biology and Evolution 25 1307ndash20

Lemey P et al (2009) lsquoIdentifying Recombinants in Human andPrimate Immunodeficiency Virus Sequence Alignments UsingQuartet Scanningrsquo BMC Bioinformatics 10 126

Loebenstein G et al (eds) (2001) Virus and Virus-Like Diseases ofPotatoes and Production of Seed-Potatoes Dordrecht KluwerAcademic Publishers

Lorenzen J H et al (2006) lsquoWhole Genome Characterization ofPotato virus Y Isolates Collected in the Western USA and TheirComparison to Isolates From Europe and Canadarsquo Archives ofVirology 151 1055ndash74

Martin D and Rybicki E (2000) lsquoRDP Detection of RecombinationAmongst Aligned Sequencesrsquo Bioinformatics 16 562ndash3

Martin D P et al (2005) lsquoA Modified BOOTSCAN Algorithm forAutomated Identification of Recombinant Sequences andRecombination Breakpointsrsquo AIDS Research and HumanRetroviruses 21 98ndash102

et al (2015) lsquoRDP4 Detection and Analysis of RecombinationPatterns in Virus Genomesrsquo Virus Evolution 1 1ndash5

Milne I et al (2009) lsquoTOPALi v2 a Rich Graphical Interface forEvolutionary Analyses of Multiple Alignments on HPC Clustersand Multi-Core Desktopsrsquo Bioinformatics 25 126ndash7

Maynard Smith J (1992) lsquoAnalyzing the Mosaic Structure ofGenesrsquo Journal of Molecular Evolution 34 126ndash9

McGuire G and Wright F (2000) lsquoTOPAL 20 ImprovedDetection of Mosaic Sequences Within Multiple AlignmentsrsquoBioinformatics 16 130ndash4

Moury B (2010) lsquoA New Lineage Sheds Light on the EvolutionaryHistory of Potato Virus Yrsquo Molecular Plant Pathology 11 161ndash8

et al (2002) lsquoEvidence for Diversifying Selection in Potatovirusy and in the Coat Protein of Other Potyvirusesrsquo Journal ofGeneral Virology 83 2563ndash73

and Simon V (2011) lsquodNdS-Based Methods DetectPositive Selection Linked to Trade-Offs Between DifferentFitness Traits in the Coat Protein of Potato Virus Yrsquo MolecularBiology and Evolution 28 2707ndash17

Nei M and Li W H (1979) lsquoMathematical Model for StudyingGenetic Variation in Terms of Restriction EndonucleasesrsquoProceedings of the National Academy of Science USA 76 5269ndash73

Nobrega N R and Silberschmidt K (1944) lsquoSobre uma provavelvariante do virus ldquoYrdquo da batatinha (Solanum virus 2 Orton)que tern a peculiaridade de provocar necroses em plantas defumorsquo Arquivos Do Instituto Biologico (Sao Paulo) 15 307ndash30

Ogawa T et al (2008) lsquoGenetic Structure of a Population ofPotato Virus Y Inducing Potato Tuber Necrotic RingspotDisease in Japan Comparison with North American andEuropean Populationsrsquo Virus Research 131 199ndash212

et al (2012) lsquoThe Genetic Structure of Populations of PotatoVirus Y in Japan Based on the Analysis of 20 Full GenomicSequencesrsquo Journal of Phytopathology 160 661ndash73

Ohshima K et al (2016) lsquoTemporal Analysis of Reassortmentand Molecular Evolution of Cucumber Mosaic Virus ExtraClues From Its Segmented Genomersquo Virology 487 188ndash97

Padidam M Sawyer S and Fauquet C M (1999) lsquoPossibleEmergence of New Geminiviruses by Frequent RecombinationrsquoVirology 265 218ndash25

Pickersgill B (2007) lsquoDomestication of Plants in the AmericasInsights From Mendelian and Molecular Geneticsrsquo Annals ofBotany 100 925ndash40

Pinhasi R Fort J and Ammerman A J (2005) lsquoTracing the Originand Spread of Agriculture in Europersquo PLoS Biology 3 e410

Posada D and Crandall K A (2001) lsquoEvaluation of Methods forDetecting Recombination From DNA Sequences ComputerSimulationsrsquo Proceedings of the National Academy of Science USA98 13757ndash62

Rambaut A et al (2016) lsquoExploring the Temporal Structure ofHeterochronous Sequences Using TempEstrsquo Virus Evolution 2vew007

Ramsden C Holmes E C and Charleston M A (2009)lsquoHantavirus Evolution in Relation to Its Rodent and InsectivoreHosts No Evidence for Codivergencersquo Molecular BiologyEvolution 26 143ndash53

Richardson D E (1958) lsquoSome Observations on the Tobacco VeinalNecrosis Strain of Potato Virus Yrsquo Plant Pathology 7 133ndash5

Rowley J S Gray S M and Karasev A V (2015) lsquoScreeningPotato Cultivars for New Sources of Resistance to Potato VirusYrsquo American Journal of Potato Research 92 38ndash48

Salaman R N (1954) lsquoThe Early European Potato Its Characterand Place of Originrsquo Journal of the Linnean Society Botany 53 1ndash27

and Hawkes J G (1949) lsquoThe Character of the EarlyEuropean Potatorsquo Proceedings of the Linnean Society London 16171ndash84

Schubert J Fomitcheva V and Sztangret-Wisniewska J (2007)lsquoDifferentiation of Potato Virus Y Strains Using Improved Setsof Diagnostic PCR-Primersrsquo Journal of Virological Methods 14066ndash74

Shimodaira H and Hasegawa M (1999) lsquoMultiple Comparisonsof Log-Likelihoods with Applications to PhylogeneticInferencersquo Molecular Biology and Evolution 16 1114ndash6

Silberschmidt K M (1960) lsquoTypes of Potato Virus Y Necrotic toTobacco History and Recent Observationrsquo American PotatoJournal 37 151

Singh R P et al (2008) lsquoDiscussion Paper The Naming ofPotato Virus Y Strains Infecting Potatorsquo Archives of Virology153 1ndash13

Smith K M (1931) lsquoOn the Composite Nature of Certain PotatoVirus Diseases of the Mosaic Group as Revealed by the Use ofPlant Indicators and Selective Methods of TransmissionrsquoProceedings of the Royal Society B 109 251ndash66

and Dennis R W G (1940) lsquoSome Notes on a SuggestedVariant of Solanum Virus 2 (Potato Virus Y)rsquo Annals of AppliedBiology 27 65ndash70

Spetz C et al (2003) lsquoMolecular Resolution of Complexes ofPotyviruses Infecting Solanaceous Crops at the Centre ofOrigin in Perursquo Journal of General Virology 84 2565ndash78

Stevenson W R et al (eds) (2001) Compendium of Potato Diseases2nd edn St Paul MN APS Press

A J Gibbs et al | 11

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

Tavare S (1986) lsquoSome Probabilistic and Statistical Problems inthe Analysis of DNA Sequencesrsquo Lectures on Mathematics in theLife Sciences 17 57ndash86

Tian Y P et al (2011) lsquoGenetic Diversity of Potato Virus YInfecting Tobacco Crops in Chinarsquo Phytopathology 10 377ndash87

To T H et al (2015) lsquoFast Dating Using Least-Squares Criteriaand Algorithmsrsquo Systematic Biology 65 82ndash97

Todd J M (1961) lsquoTobacco Veinal Necrosis on Potato in ScotlandControl of the Outbreak and Some Characters of the Virusrsquo inProceedings of the Fourth Conference on Potato Virus DiseasesBraunschweig 1960 pp 82ndash92

Visser J C and Bellstedt D U (2009) lsquoAn Assessment ofMolecular Variability and Recombination Patterns in SouthAfrican Isolates of Potato Virus Yrsquo Archives of Virology 1541891ndash900

and Pirie M D (2012) lsquoThe Recent RecombinantEvolution of a Major Crop Pathogen Potato Virus Yrsquo PLoS One7 e50631

Xia X (2013) lsquoDAMBE5 A Comprehensive Software Package forData Analysis in Molecular Biology and Evolutionrsquo MolecularBiology and Evolution 30 1720ndash8

12 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

  • vex002-TF7
  • vex002-TF8
  • vex002-TF1
  • vex002-TF2
  • vex002-TF3
  • vex002-TF4
  • vex002-TF5
  • vex002-TF6
Page 7: The phylogenetics of the global population of potato virus Y and … · 2019. 4. 30. · The phylogenetics of the global population of potato virus Y and its necrogenic recombinants

sequences were partitioned into those from the seventy-threefrom n-rec sequences and those from the ninety-three from recsequences they gave TMRCA estimates of 214 BCE and 7 CE re-spectively and passed the date randomization test indicatingthat they contained significant temporal signals in contrast tothe results of the TempEst tests

Part of Fig 5 summarizes and compares the TMRCAs andconfidence intervals estimated by the different methods andshows that despite the TMRCA estimates covering a six-foldrange the 95 confidence limits of most overlap between 500CE and 1500 CE

Finally a simple comparison was made of the basal branchlength of the PVY cluster as a proportion of the basal branchlength of the potyviruses A ML phylogeny was calculated fromfour representative PVY ORF sequences [AB331515 (N phy-logroup) EU563512 (C) FJ214726 (C) HM367075 (O)] aligned withthose of the other 103 representative potyviruses and rymovi-ruses used for Fig 1 of Gibbs Nguyen and Ohshima (2015) Themean patristic distances of the sequences connected throughthe root of the phylogeny (ie from Narcissus degeneration vi-rus NC_008824 Onion yellow dwarf virus NC_005029 Shallotyellow stripe virus NC_007433 Vallota speciosa virus isolateMarijiniup 7 JQ723475 to all the other potyviruses) was 2258substitutionssite and through the root of the PVY sequences(ie from AB331515 (N) to all the other PVYs) was 0227 substitu-tionssite a ratio of 9941

Figure 6 A cartoon summarizing the TMRCA dates and 95 CIs (vertical scale)

of PVY estimated by TempEst LSD (ML and NJ trees) and BEAST (MCC tree) anal-

yses of the seventy-three n-rec ORF and 161-core sequences together with his-

torical events that may have influenced the evolution of the virus

Table 2 BEAST dates of TMRCA nodes in phylogenies of PVY sequences

Dataset 73 n-rec completeORFs

All 166 core regions 73 n-rec core regions 93 rec core regions

Sequence length (nt) 9201 3501 3501 3501No of sequences 73 166 73 93Sampling date range 1938ndash2013 1938ndash2013 1938ndash2012 1970ndash2013TMRCAa (years) 3603 (1411ndash6566) 1989 (1084ndash3096) 2227 (951ndash3952) 2006 (688ndash3755)TMRCAb (dates) 1590 (602 to 4553) 24 (929 to1083) 214 (1062 to 1939) 7 (1325 to 1742)Substitution rate

(ntsiteyear) b

597 105 (250 105ndash956 105)

999 105 (688 105ndash132 104)

924 105 (433 105ndash139 104)

866 105 (371 105ndash137 104)

aTMRCA lsquotime to the most recent common ancestorrsquo years before 2013 95 credibility intervals (CI) in parenthesesbTMRCA dates positive dates are CE (Common Erafrac14AD) negative are BCE (frac14BC) 95 credibility intervals (CI) in parentheses

Table 1 LSD dates of the TMRCA and major nodes in phylogenies of 73 n-rec PVY sequences

Datasetb Datac Algorithmd Sites used Node datesa

Node 1e Node 2 Node 3 Node 4 Node 5 Node 6 Ratef (104ssy)

ORF Nucs ML All 10854 (1251ndash123) 12693 15512 16514 18982 19018 207 (098ndash255)n-syn codons 11583 (1265ndash792) 13208 15753 16805 19066 19034 278 (194ndash323)syn codons 10176 (1234ndash151) 12437 15383 16293 18813 1904 177 (092ndash221)

NJ All 12049 (1496ndash-17272) 12808 15152 14431 1890 18349 111 ((005ndash171)n-syn codons 12435 (1513 to 230) 13091 1543 14628 19019 18396 153 (055ndash228)syn codons 11525 (1467 to 72 108) 1242 14387 14598 18466 18284 091 (1 106ndash146)

AAs ML All 13636 (1501ndash238) 14271 1603 17021 1943 19336 105 (041ndash126)NJ All 14599 (1642ndash70) 1513 1615 17023 19331 19104 086 (023ndash119)

Core Nucs ML All 11512 (1296ndash703) 12599 16266 16228 19182 19257 202 (134ndash243)NJ All 13077 (1521 to 71) 13077 16248 15446 19062 18975 119 (040ndash157)

aNode dates positive dates are CE (Common Erafrac14AD) negative are BCE (frac14BC)bDataset ORFmdashmajor open reading frame Core nucleotides 2206 5706cData Nucs nucleotides AAs encoded amino acidsdAlgorithm ML maximum likelihood (PhyML) NJ neighbor-joining (ClustalX)eNodes numbered as in Fig 4 95 confidence intervals for Node 1 onlyfRate evolutionary rate substitutionssiteyear 95 confidence intervals for Node 1

A J Gibbs et al | 7

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

4 Discussion

This article reports that using syn codon sequences simplifiedthe resolution of relationships in a complex population of PVYgenomes The phylogenies and RDP analyses calculated from thesyn codon sequences were simpler to interpret than those calcu-lated from the comparable n-syn codon or complete sequencesand had greater statistical support Syn codon changes reflect thetemporal component of phylogenetic change of gene populationsbetter than n-syn codons as changes of the latter are also linkedto changes of the encoded protein and incoherent changes ofsyn and n-syn codons in complete sequences may confoundanalysis of either especially when rec sequences are presentSignificantly Ohshima et al (2016) also found that Bayesian esti-mates of the ages of cucumber mosaic cucumovirus (CMV) popu-lations were more precise if made using syn codon sequencesrather than complete sequences CMV has three genomic seg-ments which encode five ORFs and they found that the age esti-mates for different ORFs using syn codon sequences had acoefficient of variation around half that of complete sequencesand the 95 CI ranges of those estimates was around 25ndash50 per-cent smaller than those of the original complete sequences

The strategy of progressively removing the most strongly sup-ported cluster specific recombinants resolved the complexity ofthe relationships of the rec and n-rec ORFs In Fig 1A (ie a NJ tree)and in Fig 2 of Kehoe and Jones (2016) (ie an ML tree) the N phy-logroup forms an outlier of the R-2 cluster We have shown thatthis topology is false as the dominance of the recombinant link-ages between the phylogroups seriously compromise agglomera-tive methods of tree-building in the following way Whensequences of all PVY phylogroups are present in an analysis theR-1 sequences are more closely related to their O parent than theirN parent as the former provided a greater percentage (76 percent)of their sequence and therefore the R-1 and O phylogroups linkLikewise the R-2 sequences are more closely related to their R-1parent than their N parent as the former provided 62 percent oftheir sequence and so again the R-2 phylogroup links to the R-1phylogroup rather than the N parent Finally the N phylogroupclusters with the R-2 ORFs as they share two-thirds of their se-quence whereas the true links of the N phylogroup as the sisterlineage to all the other phylogroups are more distant Thus the in-ferred inter-phylogroup relationships when all are present arefalse (Figs 1A and 2A) and the true topology of the phylogroups isonly revealed when the recombinants are removed

In the second part of this project we attempted to date themajor features of the PVY phylogeny using different heterochro-nous tip-dating methods The resulting estimates althoughoverlapping as judged by their 95 CIs differed significantly inscale so that the TMRCAs estimated especially for the basalnodes by the probabilistic method were between two to fourtimes earlier than those estimated by the regression methodsSo here we discuss whether events of the history of PVY and itshosts are congruent with those of the PVY phylogeny and pro-vide in essence independent node dating that supports someestimates more than others

All our analyses supported a single PVY phylogeny that hasseveral distinctive features (Fig 6) PVY is a recent member ofthe lsquoPVY lineagersquo of potyviruses most of whose members wereisolated first or only from plants in the Americas (Gibbs andOhshima 2010) and although PVY was first identified in the UKit almost certainly came from the Americas Furthermore thebasal divergences of the PVY phylogeny produce three lineagesand one is the Chile 3 isolate found in Capsicum baccatum only inSouth America (Moury 2010) Thus the basal (TMRCA) node in

the PVY phylogeny most likely represents an event that oc-curred in South America The other two basal PVY lineageshave been found throughout the world but not yet in SouthAmerica The flora and fauna of South America was biologicallyisolated from Europe until the start of the lsquoColumbianExchangersquo in the late 15th century CE (Crosby 1972) when theearly trans-Atlantic maritime traders first took American ani-mals and plants including potato tubers to Europe and viceversa Thus the basal nodes of the PVY phylogeny are likely torepresent events that predate the late 15th century and thisconclusion agrees with all our date estimates of those nodesthe most recent is 1411 CE with estimates from regression anal-yses ranging from 1085 CE to 1411 CE and those from probabil-istic analyses from 1590 BCE to 24 BCE

The second basal lineage the C phylogroup probably di-verged in Europe and consists mostly (712 isolates) of isolatesfound in tobacco tomato and pepper crops Potato pepper to-mato and tobacco were first introduced to Europe during theColumbian Exchange from their domestication centres in theAndean regions of Peru and Bolivia (potato) more widely in theAndes (tomato tobacco) and further north in the Americas (pep-per) (Bai and Lindhout 2007 Pickersgill 2007 Brown andHenfling 2014 Kraft et al 2014) There is no evidence that PVY istransmitted by sexually produced seed so it was most likely in-troduced to Europe in the small numbers of S tuberosum tubersfirst taken from South America to the Canary Islands in 1562 CEto Spain in 1570 CE and the UK in 1588 CE if so then all the ini-tial PVY population of Europe probably originated from infectedS tuberosum tubers in those cargoes The divergence to severalnon-potato hosts probably occurred outside South America al-though it is unknown to what extent different C phylogroup iso-lates are ecologically adapted to particular hosts The datesestimated by the regression methods are congruent with thisinterpretation whereas those estimated by the probabilisticmethods are not (Fig 6 nodes 3 and 4)

The third basal lineage the N phylogroup is known onlyfrom a cluster of closely related isolates so it probably radiatedrecently and at present we have no idea of when and how itfirst migrated from the Americas It has been isolated world-wide mostly from potatoes (1315 isolates)

The earliest potato breeding programs of any size began in1810 but did not become widespread until the second half ofthe 19th century Initially there were very few potato introduc-tions to Europe and for the first two centuries the crops had lit-tle genetic diversity Virus diseases that seriously debilitatedestablished cultivars but did not pass through sexually pro-duced potato seed probably encouraged selection and as a re-sult more cultivars were grown in the late 18th and early 19thcenturies These were taken from natural berries produced byself pollination Most were selfed derivatives so selection re-duced the gene pool (Glenndinning 1983) Potato blight diseasecaused by the oomycete Phytophthora infestans appeared in themid 19th century and eliminated almost all cultivars as theywere so inbred further reducing the gene pool Breeding amongblight survivors and new introductions from the Andean regionof South America led to many new cultivars being grown byearly 20th century (Glenndinning 1983) This introduction ofnew potato germplasm led to genetic diversity in the cropwhich apparently introduced the selection pressure in the formof PVY resistance genes that diversified the PVY population inthe early 20th century Our analyses indicate that the parents ofthe recombinant R-1 and R-2 strains were members of existingO and N populations therefore probably European As the Nc re-sistance gene which C strains ellicit was widely distributed in

8 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

early potato cultivars (Bawden 1936 Cockerham 1943 Bawdenand Sheffield 1944) selection for avirulence to this gene waslikely to be an important factor in the diversification of the Ophylogroup population Similarly N strains but not O strainsinfect plants with both Nc and the less common Ny and Nzgenes so selection for avirulence would again have favored di-versity in the N phylogroup population (Cockerham 1970 Jones1990Chikh-Ali et al 2014 Kehoe and Jones 2016)

PVY strains causing veinal necrosis symptoms in tobaccowere first reported in 1935 in the UK (Smith and Dennis 1940)and soon afterwards in Brazil Europe and North America(Nobrega and Silberschmidt 1944 Bawden and Kassanis 19471951 Richardson 1958 Klinkowski and Schmelzer 1960Silberschmidt 1960 Todd 1961 Kahn and Monroe 1963 Brucher1969) However whether these strains included ones in both theN and R1 phylogroups is unknown Strains inducing this symp-tom in tobacco produce mild symptoms in potato plants so theysoon became widely distributed because infected plants wereoften missed when seed potato crops were rogued (Todd 1961Jones 1990) Possibly this increased spread can be attributed toemergence of the R1 population at that time but evidence forthat is lacking and none of the sequences we analyzed camefrom early isolates Although the R-1 population is not verydamaging to potato it is difficult to control as its symptoms areso mild and it overcomes resistance genes Nc No and Nz Bycontrast the R-2 population is more damaging as it often causestuber necrosis and still overcomes these three resistancegenes

The tuber necrosis isolates R-2 phylogroup were firstsighted in the early 1980s in Europe and shown to be recombi-nants (Beczner et al 1984 Le Romancer Kerlan and Nedellec1994 Boonham et al 2002 Lorenzen et al 2006) These reportsare likely to be accurate as R-2 infections are so noticeable andby the 1980s potato crops in Europe were intensively monitoredfor disease Our estimates of the origin of R-2 isolates were19694 CE 19603 CE and 19324 CE for the ML-LSD NJ-LSD andBEAST analyses respectively Thus the ML-LSD estimate agreesmost closely with the crop record and the BEAST estimateagrees least well but all are possible Similarly PVY infectionsthat may have been caused by R-1 isolates were first reportedin 1935 and soon confirmed to be widespread The ML-LSDNJ-LSD and BEAST methods placed the origin of R-1 as 19467CE 19333 CE and 18806 CE respectively Thus the NJ-LSD esti-mate is the most likely the BEAST estimate less likely and theML-LSD impossible

All our date estimates agree that the near simultaneous radi-ation of the O and N populations of PVY (Figs 4 and 6 nodes 5and 6) coincided with the earliest potato breeding programs ofany size These programs increased the genetic diversity ofavailable potato cultivars and this would have included the Nyand Nz resistance genes They also coincided with cultural andagronomic changes to seed and ware potato crop production(Singh et al 2008 Gray et al 2010 Jones 2014) A combination ofthese factors is likely to have led to conditions favoring recom-binants This scenario fits better with the LSD dates 19182 CEand 19062 CE respectively than with the earlier BEAST date of18329 CE

Most of our PVY date estimates are congruent with those ofthe recent history of the international potato crop but this isnot surprising as the estimated dates have very broad overlap-ping 95 CI estimates One crucial event might however informus whether some TMRCA estimates are more accurate thanothers and that is whether the divergence of the C phylogroup(Fig 6 nodes 3 and 4) occurred before or after PVY was

introduced to Europe in the late 16th century the TMRCAs esti-mated by the regression methods are congruent with aEuropean divergence of the C phylogroup whereas those esti-mated by the probabilistic methods indicate a much earlier di-vergence of the C phylogroup This dilemma will only beresolved when the genomic sequences of more Chile 3-like iso-lates or of the C phylogroup are known

Our estimates of the dates for the PVY phylogeny are some-what earlier than those reported by Visser Bellstedt and Pirie(2012) who analyzed rec n-rec and partitioned PVY genomic se-quences by probabilistic methods although our 95 CI esti-mates overlap The difference may merely be the result ofpopulation sampling differences as a much larger set of se-quences was available to us and it included the early dated iso-late KP691327 (1943 CE) A further difference between ouranalyses is the date given to sequence EU563512 which asVisser Bellstedt and Pirie (2012) reported was obtained lsquofrom apotato plant vegetatively propagated since 1938rsquo and sequencedin 2007 (Dullemans et al 2011) We dated this sequence as 1938the earliest of our sequences whereas Visser Bellstedt andPirie (2012) dated it as 2007 In making this choice we checkedhow this difference affected the date estimates of the seventy-three n-rec dataset and found that using the 1938 date gave anML-LSD TMRCA of 10854 CE (95 CI 1267 CEndash321 CE) whereasusing the 2007 date gave a TMRCA of 363 CE and very muchbroader 95 CIs (918 CE to 18 109 BCE) and when the se-quence was omitted an intermediate TMRCA of 687 CE (1016CEndash9503 BCE) Thus the 95 CI range was smallest using the1938 date and justified our choice

Finally we explored the possibility of extrapolating datesover an even larger timescale Gibbs et al (2008) suggested thatthe Neolithic invention of agriculture which in Eurasia occurredin the northern LevantineMesopotamian area around 12000BCE (Bellwood 2005 Pinhasi Fort and Ammerman 2005) pro-duced the conditions for the starburst potyvirus diversificationand this provides another possible dating point for potyvirusesWe estimated that in an ML phylogeny of the major ORFs of alarge representative set of potyviruses and four PVY isolates(one N one O and two C phylogroup isolates) the ratio of themean patristic distances of the sequences connected throughthe root of the phylogeny and through the root of the PVY se-quences was 9941 indicating that a TMRCA of all potyvirusesof around 10000 BCE would give a TMRCA of PVY around 1000CE

Supplementary data

Supplementary data are available at Virus Evolution online

Acknowledgements

We are very grateful to Hien To and Olivier Gascuel for greathelp with use of their LSD software

Conflict of interest None declared

ReferencesAbascal F Zardoya R and Telford M J (2010) lsquoTranslatorX

Multiple Alignment of Nucleotide Sequences Guided by AminoAcid Translationsrsquo Nucleic Acids Research 38 W7ndash13

Altschul S F et al (1990) lsquoBasic Local Alignment Search ToolrsquoJournal of Molecular Biology 215 403ndash10

A J Gibbs et al | 9

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

Bai Y and Lindhout P (2007) lsquoDomestication and Breeding ofTomatoes What Have We Gained and What Can We Gain inthe Futurersquo Annals of Botany 100 1085ndash94

Bawden F C (1936) lsquoThe Viruses Causing Top Necrosis(Acronecrosis) of the Potatorsquo Annals of Applied Biology 23487ndash97

and Kassanis B (1947) lsquoThe Behaviour of Some NaturallyOccurring Strains of Potato Virus Yrsquo Annals of Applied Biology31 503ndash16

and (1951) lsquoSerologically Related Strains of PotatoVirus Y That Are Mutually Antagonistic in Plantsrsquo Annals ofApplied Biology 38 402ndash10 [CrossRef]Mismatch]

and Sheffield F M L (1944) lsquoThe Relationships of SomeViruses Causing Necrotic Diseases of the Potatorsquo Annals ofApplied Biology 31 33ndash40

Beczner L et al (1984) lsquoStudies on the Etiology of Tuber NecroticRingspot Disease in Potatorsquo Potato Research 27 339ndash52

Bellwood P (2005) First Farmers The Origins of AgriculturalSocieties Malden Oxford and Carlton Blackwell Publishing

Boni M F Posada D and Feldman M W (2007) lsquoAn ExactNonparametric Method for Inferring Mosaic Structure inSequence Tripletsrsquo Genetics 176 1035ndash47

Boonham N et al (2002) lsquoBiological and Sequence Comparisonsof Potato virus Y Isolates Associated with Potato Tuber NecroticRingspot Diseasersquo Plant Pathology 51 117ndash26

Brown C R (1993) lsquoOrigin and History of the Potatorsquo AmericanJournal of Potato Research 70 363ndash73

and Henfling J W (2014) 0A History of the Potato0 inNavarre R and Pavek MJ (eds) The Potato Botany Productionand Use pp 1ndash11Wallingford UK CABI

Brucher H (1969) lsquoObservations on Origin and Spread of Y-N

Virus in South Americarsquo Angewandte Botanik 43 241ndash9Chikh-Ali M et al (2014) lsquoEvidence of a Monogenic Nature of

the Nz Gene Against Potato Virus Y Strain Z (PVYZ) in PotatorsquoAmerican Journal of Potato Research 91 649ndash54

Cockerham G (1943) lsquoThe Reactions of Potato Varieties toViruses X A B and Crsquo Annals of Applied Biology 30 338ndash44

(1970) lsquoGenetical Studies on Resistance to Potato Viruses Xand Yrsquo Heredity 25 309ndash48

Crosby A E (1972) 0The Columbian Exchange Biological and CulturalConsequences of 14940 268 pp Greenwood Publishing Group

Cuevas J M et al (2012) lsquoPhylogeography and MolecularEvolution of Potato Virus Yrsquo PLoS One 7 e37853

Darriba D et al (2011) lsquoProtTest 3 Fast Selection of Best-FitModels of Protein Evolutionrsquo Bioinformatics 27 1164ndash5

De Bokx J A and van der Want J P H (eds) (1987) 0Viruses ofPotatoes and Seed-Potato Production0 2nd edn WageningenCentre for Agricultural Publishing and Documentation

Drummond A J et al (2012) lsquoBayesian Phylogenetics with BEAUtiand the BEAST 17rsquo Molecular Biology and Evolution 29 1969ndash73

Duchene S (2015) lsquoThe Performance of the Date-RandomizationTest in Phylogenetic Analyses of Time-Structured Virus DatarsquoMolecular Biology and Evolution 32 1895ndash906

Dullemans A M et al (2011) lsquoComplete Nucleotide Sequence ofa Potato Isolate of Strain Group C of Potato Virus Y from 1938rsquoArchives of Virology 156 473ndash7

Fourment M and Gibbs M J (2006) lsquoPATRISTIC a Program forCalculating Patristic Distances and Graphically Comparing theComponents of Genetic Changersquo BMC Evolutionary Biology 6 1

Gibbs A J and Ohshima K (2010) lsquoPotyviruses in the DigitalAgersquo Annual Review of Phytopathology 48 205ndash23

et al (2008) lsquoThe Prehistory of Potyviruses Their InitialRadiation Was during the Dawn of Agriculturersquo PLoS One 3 e2523

Nguyen H D and Ohshima K (2015) lsquoThe lsquoEmergencersquo ofTurnip Mosaic Virus Was Probably a lsquoGene-for-Quasi-GenersquoEventrsquo Current Opinion in Virology 10 20ndash5

Armstrong J S and Gibbs A J (2000) lsquoSister-Scanning AMonte Carlo Procedure for Assessing Signals in RecombinantSequencesrsquo Bioinformatics 16 573ndash82

et al (2006) lsquoThe Variable Codons of H3 Influenza A VirusHaemagglutinin Genesrsquo Archives of Virology 52 11ndash24

Glais L Tribodet M and Kerlan C (2002) lsquoGenomic Variabilityin Potato Potyvirus Y (PVY) Evidence that PVYNW and PVYNTN

Variants Are Single to Multiple Recombinants Between PVYO

and PVYN Isolatesrsquo Archives of Virology 147 363ndash78Glenndinning D R (1983) lsquoPotato Introductions and Breeding up

to the 20th Centuryrsquo New Phytologist 94 479ndash505Gray S et al (2010) lsquoPotato Virus Y An Evolving Concern for Potato

Crops in the United States and Canadarsquo Plant Disease 94 1384ndash97Guindon S and Gascuel O (2003) lsquoA Simple Fast and Accurate

Algorithm to Estimate Large Phylogenies by MaximumLikelihoodrsquo Systematic Biology 52 696ndash704

Hall T A (1999) lsquoBioEdit A User-Friendly Biological SequenceAlignment Editor and Analysis Program for Windows 9598NTrsquo Nucleic Acids Symposium Series 41 95ndash8

Hawkes J G (1978) 0History of the Potato0 in Harris P M (ed) ThePotato Crop pp 1ndash14New York Springer

and Fransisco-Ortega J (1993) lsquoThe Early History of thePotato in Europersquo Euphytica 70 1ndash7

Holmes E C Worobey M and Rambaut A (1999) lsquoPhylogeneticEvidence for Recombination in Dengue Virusrsquo Molecular Biologyand Evolution 16 405ndash9

Hu X et al (2009a) lsquoSequence Characteristics of Potato Virus YRecombinantsrsquo Journal of General Virology 90 3033ndash41

et al (2009b) lsquoA Novel Recombinant Strain of Potato VirusY Suggests a New Viral Genetic Determinant of Vein Necrosisin Tobaccorsquo Virus Research 143 68ndash76

Huson D H and Bryant D (2006) lsquoApplication of PhylogeneticNetworks in Evolutionary Studiesrsquo Molecular Biology andEvolution 23 254ndash67

Jeanmougin F et al (1998) lsquoMultiple Sequence Alignment withClustal Xrsquo Trends in Biochemical Sciences 23 403ndash5

Jones R A C (1981) 0The Ecology of Viruses Infecting Wild andCultivated Potatoes in the Andean Region of South America0in Thresh J M (ed) Pests Pathogens and Vegetation pp 89ndash107London Pitman

(1990) lsquoStrain Group Specific and Virus Specific HypersensitiveReactions to Infection with Potyviruses in Potato CultivarsrsquoAnnals of Applied Biology 117 93ndash105 [CrossRef]Mismatch]

(2014) 0Virus Disease Problems Facing Potato IndustriesWorldwide Viruses Found Climate Change ImplicationsRationalising Virus Strain Nomenclature and Addressing thePotato Virus Y issue0 in Navarre R and Pavek M J (eds) ThePotato Botany Production and UsesWallingford UK CABI

and Kehoe M A (2016) lsquoA Proposal to Rationalize Within-Species Plant Virus Nomenclature Benefits and Implicationsof Inactionrsquo Archives of Virology 161 2051ndash7

Kahn R P and Monroe R L (1963) lsquoDetection of the TobaccoVeinal Necrosis Strain of Potato Virus in Solanum cardenasiiand S andigenum Introduced into the United StatesrsquoPhytopathology 53 1356ndash9

Karasev A V and Gray S M (2013) lsquoGenetic Diversity of Potatovirus Y Complexrsquo American Journal of Potato Research 90 7ndash13

et al (2011) lsquoGenetic Diversity and Origin of the OrdinaryStrain of Potato virus Y (PVY) and Origin of Recombinant PVYStrainsrsquo Phytopathology 101 778ndash85

10 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

Katoh K and Standley D M (2013) lsquoMAFFT Multiple SequenceAlignment Software Version 7 Improvements in Performanceand Usabilityrsquo Molecular Biology and Evolution 30 772ndash80

Kehoe M A and Jones R A C (2016) lsquoPotato virus Y StrainNomenclature Lessons From Comparing Isolates ObtainedOver a 73 Year Periodrsquo Plant Pathology 65 322ndash33

Kerlan C (2006) 0Potato Virus Y0 in Descriptions of Plant Viruses No414 Adams J and Antoniw J (eds) Rothamsted Research UK

and Moury B (2008) 0Potato Virus Y0 in Granoff A andWebster R G (eds) Encyclopedia of Virology 3rd edn pp 287ndash96New York Academic Press

Klinkowski M and Schmelzer K (1960) lsquoA Necrotic Type ofPotato Virus Yrsquo American Potato Journal 37 221ndash8

Kraft K H et al (2014) lsquoMultiple Lines of Evidence for the Originof Domesticated Chili Pepper Capsicum annuum in MexicorsquoProceedings of the National Academy of Science USA 111 6165ndash70

Le Romancer M Kerlan C and Nedellec M (1994) lsquoBiologicalCharacterization of Various Geographical Isolates of PotatoVirus Y Inducing Superficial Necrosis on Potato Tubersrsquo PlantPathology 43 138ndash44

Le S Q and Gascuel G (2008) lsquoAn Improved General Amino AcidReplacement Matrixrsquo Molecular Biology and Evolution 25 1307ndash20

Lemey P et al (2009) lsquoIdentifying Recombinants in Human andPrimate Immunodeficiency Virus Sequence Alignments UsingQuartet Scanningrsquo BMC Bioinformatics 10 126

Loebenstein G et al (eds) (2001) Virus and Virus-Like Diseases ofPotatoes and Production of Seed-Potatoes Dordrecht KluwerAcademic Publishers

Lorenzen J H et al (2006) lsquoWhole Genome Characterization ofPotato virus Y Isolates Collected in the Western USA and TheirComparison to Isolates From Europe and Canadarsquo Archives ofVirology 151 1055ndash74

Martin D and Rybicki E (2000) lsquoRDP Detection of RecombinationAmongst Aligned Sequencesrsquo Bioinformatics 16 562ndash3

Martin D P et al (2005) lsquoA Modified BOOTSCAN Algorithm forAutomated Identification of Recombinant Sequences andRecombination Breakpointsrsquo AIDS Research and HumanRetroviruses 21 98ndash102

et al (2015) lsquoRDP4 Detection and Analysis of RecombinationPatterns in Virus Genomesrsquo Virus Evolution 1 1ndash5

Milne I et al (2009) lsquoTOPALi v2 a Rich Graphical Interface forEvolutionary Analyses of Multiple Alignments on HPC Clustersand Multi-Core Desktopsrsquo Bioinformatics 25 126ndash7

Maynard Smith J (1992) lsquoAnalyzing the Mosaic Structure ofGenesrsquo Journal of Molecular Evolution 34 126ndash9

McGuire G and Wright F (2000) lsquoTOPAL 20 ImprovedDetection of Mosaic Sequences Within Multiple AlignmentsrsquoBioinformatics 16 130ndash4

Moury B (2010) lsquoA New Lineage Sheds Light on the EvolutionaryHistory of Potato Virus Yrsquo Molecular Plant Pathology 11 161ndash8

et al (2002) lsquoEvidence for Diversifying Selection in Potatovirusy and in the Coat Protein of Other Potyvirusesrsquo Journal ofGeneral Virology 83 2563ndash73

and Simon V (2011) lsquodNdS-Based Methods DetectPositive Selection Linked to Trade-Offs Between DifferentFitness Traits in the Coat Protein of Potato Virus Yrsquo MolecularBiology and Evolution 28 2707ndash17

Nei M and Li W H (1979) lsquoMathematical Model for StudyingGenetic Variation in Terms of Restriction EndonucleasesrsquoProceedings of the National Academy of Science USA 76 5269ndash73

Nobrega N R and Silberschmidt K (1944) lsquoSobre uma provavelvariante do virus ldquoYrdquo da batatinha (Solanum virus 2 Orton)que tern a peculiaridade de provocar necroses em plantas defumorsquo Arquivos Do Instituto Biologico (Sao Paulo) 15 307ndash30

Ogawa T et al (2008) lsquoGenetic Structure of a Population ofPotato Virus Y Inducing Potato Tuber Necrotic RingspotDisease in Japan Comparison with North American andEuropean Populationsrsquo Virus Research 131 199ndash212

et al (2012) lsquoThe Genetic Structure of Populations of PotatoVirus Y in Japan Based on the Analysis of 20 Full GenomicSequencesrsquo Journal of Phytopathology 160 661ndash73

Ohshima K et al (2016) lsquoTemporal Analysis of Reassortmentand Molecular Evolution of Cucumber Mosaic Virus ExtraClues From Its Segmented Genomersquo Virology 487 188ndash97

Padidam M Sawyer S and Fauquet C M (1999) lsquoPossibleEmergence of New Geminiviruses by Frequent RecombinationrsquoVirology 265 218ndash25

Pickersgill B (2007) lsquoDomestication of Plants in the AmericasInsights From Mendelian and Molecular Geneticsrsquo Annals ofBotany 100 925ndash40

Pinhasi R Fort J and Ammerman A J (2005) lsquoTracing the Originand Spread of Agriculture in Europersquo PLoS Biology 3 e410

Posada D and Crandall K A (2001) lsquoEvaluation of Methods forDetecting Recombination From DNA Sequences ComputerSimulationsrsquo Proceedings of the National Academy of Science USA98 13757ndash62

Rambaut A et al (2016) lsquoExploring the Temporal Structure ofHeterochronous Sequences Using TempEstrsquo Virus Evolution 2vew007

Ramsden C Holmes E C and Charleston M A (2009)lsquoHantavirus Evolution in Relation to Its Rodent and InsectivoreHosts No Evidence for Codivergencersquo Molecular BiologyEvolution 26 143ndash53

Richardson D E (1958) lsquoSome Observations on the Tobacco VeinalNecrosis Strain of Potato Virus Yrsquo Plant Pathology 7 133ndash5

Rowley J S Gray S M and Karasev A V (2015) lsquoScreeningPotato Cultivars for New Sources of Resistance to Potato VirusYrsquo American Journal of Potato Research 92 38ndash48

Salaman R N (1954) lsquoThe Early European Potato Its Characterand Place of Originrsquo Journal of the Linnean Society Botany 53 1ndash27

and Hawkes J G (1949) lsquoThe Character of the EarlyEuropean Potatorsquo Proceedings of the Linnean Society London 16171ndash84

Schubert J Fomitcheva V and Sztangret-Wisniewska J (2007)lsquoDifferentiation of Potato Virus Y Strains Using Improved Setsof Diagnostic PCR-Primersrsquo Journal of Virological Methods 14066ndash74

Shimodaira H and Hasegawa M (1999) lsquoMultiple Comparisonsof Log-Likelihoods with Applications to PhylogeneticInferencersquo Molecular Biology and Evolution 16 1114ndash6

Silberschmidt K M (1960) lsquoTypes of Potato Virus Y Necrotic toTobacco History and Recent Observationrsquo American PotatoJournal 37 151

Singh R P et al (2008) lsquoDiscussion Paper The Naming ofPotato Virus Y Strains Infecting Potatorsquo Archives of Virology153 1ndash13

Smith K M (1931) lsquoOn the Composite Nature of Certain PotatoVirus Diseases of the Mosaic Group as Revealed by the Use ofPlant Indicators and Selective Methods of TransmissionrsquoProceedings of the Royal Society B 109 251ndash66

and Dennis R W G (1940) lsquoSome Notes on a SuggestedVariant of Solanum Virus 2 (Potato Virus Y)rsquo Annals of AppliedBiology 27 65ndash70

Spetz C et al (2003) lsquoMolecular Resolution of Complexes ofPotyviruses Infecting Solanaceous Crops at the Centre ofOrigin in Perursquo Journal of General Virology 84 2565ndash78

Stevenson W R et al (eds) (2001) Compendium of Potato Diseases2nd edn St Paul MN APS Press

A J Gibbs et al | 11

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

Tavare S (1986) lsquoSome Probabilistic and Statistical Problems inthe Analysis of DNA Sequencesrsquo Lectures on Mathematics in theLife Sciences 17 57ndash86

Tian Y P et al (2011) lsquoGenetic Diversity of Potato Virus YInfecting Tobacco Crops in Chinarsquo Phytopathology 10 377ndash87

To T H et al (2015) lsquoFast Dating Using Least-Squares Criteriaand Algorithmsrsquo Systematic Biology 65 82ndash97

Todd J M (1961) lsquoTobacco Veinal Necrosis on Potato in ScotlandControl of the Outbreak and Some Characters of the Virusrsquo inProceedings of the Fourth Conference on Potato Virus DiseasesBraunschweig 1960 pp 82ndash92

Visser J C and Bellstedt D U (2009) lsquoAn Assessment ofMolecular Variability and Recombination Patterns in SouthAfrican Isolates of Potato Virus Yrsquo Archives of Virology 1541891ndash900

and Pirie M D (2012) lsquoThe Recent RecombinantEvolution of a Major Crop Pathogen Potato Virus Yrsquo PLoS One7 e50631

Xia X (2013) lsquoDAMBE5 A Comprehensive Software Package forData Analysis in Molecular Biology and Evolutionrsquo MolecularBiology and Evolution 30 1720ndash8

12 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

  • vex002-TF7
  • vex002-TF8
  • vex002-TF1
  • vex002-TF2
  • vex002-TF3
  • vex002-TF4
  • vex002-TF5
  • vex002-TF6
Page 8: The phylogenetics of the global population of potato virus Y and … · 2019. 4. 30. · The phylogenetics of the global population of potato virus Y and its necrogenic recombinants

4 Discussion

This article reports that using syn codon sequences simplifiedthe resolution of relationships in a complex population of PVYgenomes The phylogenies and RDP analyses calculated from thesyn codon sequences were simpler to interpret than those calcu-lated from the comparable n-syn codon or complete sequencesand had greater statistical support Syn codon changes reflect thetemporal component of phylogenetic change of gene populationsbetter than n-syn codons as changes of the latter are also linkedto changes of the encoded protein and incoherent changes ofsyn and n-syn codons in complete sequences may confoundanalysis of either especially when rec sequences are presentSignificantly Ohshima et al (2016) also found that Bayesian esti-mates of the ages of cucumber mosaic cucumovirus (CMV) popu-lations were more precise if made using syn codon sequencesrather than complete sequences CMV has three genomic seg-ments which encode five ORFs and they found that the age esti-mates for different ORFs using syn codon sequences had acoefficient of variation around half that of complete sequencesand the 95 CI ranges of those estimates was around 25ndash50 per-cent smaller than those of the original complete sequences

The strategy of progressively removing the most strongly sup-ported cluster specific recombinants resolved the complexity ofthe relationships of the rec and n-rec ORFs In Fig 1A (ie a NJ tree)and in Fig 2 of Kehoe and Jones (2016) (ie an ML tree) the N phy-logroup forms an outlier of the R-2 cluster We have shown thatthis topology is false as the dominance of the recombinant link-ages between the phylogroups seriously compromise agglomera-tive methods of tree-building in the following way Whensequences of all PVY phylogroups are present in an analysis theR-1 sequences are more closely related to their O parent than theirN parent as the former provided a greater percentage (76 percent)of their sequence and therefore the R-1 and O phylogroups linkLikewise the R-2 sequences are more closely related to their R-1parent than their N parent as the former provided 62 percent oftheir sequence and so again the R-2 phylogroup links to the R-1phylogroup rather than the N parent Finally the N phylogroupclusters with the R-2 ORFs as they share two-thirds of their se-quence whereas the true links of the N phylogroup as the sisterlineage to all the other phylogroups are more distant Thus the in-ferred inter-phylogroup relationships when all are present arefalse (Figs 1A and 2A) and the true topology of the phylogroups isonly revealed when the recombinants are removed

In the second part of this project we attempted to date themajor features of the PVY phylogeny using different heterochro-nous tip-dating methods The resulting estimates althoughoverlapping as judged by their 95 CIs differed significantly inscale so that the TMRCAs estimated especially for the basalnodes by the probabilistic method were between two to fourtimes earlier than those estimated by the regression methodsSo here we discuss whether events of the history of PVY and itshosts are congruent with those of the PVY phylogeny and pro-vide in essence independent node dating that supports someestimates more than others

All our analyses supported a single PVY phylogeny that hasseveral distinctive features (Fig 6) PVY is a recent member ofthe lsquoPVY lineagersquo of potyviruses most of whose members wereisolated first or only from plants in the Americas (Gibbs andOhshima 2010) and although PVY was first identified in the UKit almost certainly came from the Americas Furthermore thebasal divergences of the PVY phylogeny produce three lineagesand one is the Chile 3 isolate found in Capsicum baccatum only inSouth America (Moury 2010) Thus the basal (TMRCA) node in

the PVY phylogeny most likely represents an event that oc-curred in South America The other two basal PVY lineageshave been found throughout the world but not yet in SouthAmerica The flora and fauna of South America was biologicallyisolated from Europe until the start of the lsquoColumbianExchangersquo in the late 15th century CE (Crosby 1972) when theearly trans-Atlantic maritime traders first took American ani-mals and plants including potato tubers to Europe and viceversa Thus the basal nodes of the PVY phylogeny are likely torepresent events that predate the late 15th century and thisconclusion agrees with all our date estimates of those nodesthe most recent is 1411 CE with estimates from regression anal-yses ranging from 1085 CE to 1411 CE and those from probabil-istic analyses from 1590 BCE to 24 BCE

The second basal lineage the C phylogroup probably di-verged in Europe and consists mostly (712 isolates) of isolatesfound in tobacco tomato and pepper crops Potato pepper to-mato and tobacco were first introduced to Europe during theColumbian Exchange from their domestication centres in theAndean regions of Peru and Bolivia (potato) more widely in theAndes (tomato tobacco) and further north in the Americas (pep-per) (Bai and Lindhout 2007 Pickersgill 2007 Brown andHenfling 2014 Kraft et al 2014) There is no evidence that PVY istransmitted by sexually produced seed so it was most likely in-troduced to Europe in the small numbers of S tuberosum tubersfirst taken from South America to the Canary Islands in 1562 CEto Spain in 1570 CE and the UK in 1588 CE if so then all the ini-tial PVY population of Europe probably originated from infectedS tuberosum tubers in those cargoes The divergence to severalnon-potato hosts probably occurred outside South America al-though it is unknown to what extent different C phylogroup iso-lates are ecologically adapted to particular hosts The datesestimated by the regression methods are congruent with thisinterpretation whereas those estimated by the probabilisticmethods are not (Fig 6 nodes 3 and 4)

The third basal lineage the N phylogroup is known onlyfrom a cluster of closely related isolates so it probably radiatedrecently and at present we have no idea of when and how itfirst migrated from the Americas It has been isolated world-wide mostly from potatoes (1315 isolates)

The earliest potato breeding programs of any size began in1810 but did not become widespread until the second half ofthe 19th century Initially there were very few potato introduc-tions to Europe and for the first two centuries the crops had lit-tle genetic diversity Virus diseases that seriously debilitatedestablished cultivars but did not pass through sexually pro-duced potato seed probably encouraged selection and as a re-sult more cultivars were grown in the late 18th and early 19thcenturies These were taken from natural berries produced byself pollination Most were selfed derivatives so selection re-duced the gene pool (Glenndinning 1983) Potato blight diseasecaused by the oomycete Phytophthora infestans appeared in themid 19th century and eliminated almost all cultivars as theywere so inbred further reducing the gene pool Breeding amongblight survivors and new introductions from the Andean regionof South America led to many new cultivars being grown byearly 20th century (Glenndinning 1983) This introduction ofnew potato germplasm led to genetic diversity in the cropwhich apparently introduced the selection pressure in the formof PVY resistance genes that diversified the PVY population inthe early 20th century Our analyses indicate that the parents ofthe recombinant R-1 and R-2 strains were members of existingO and N populations therefore probably European As the Nc re-sistance gene which C strains ellicit was widely distributed in

8 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

early potato cultivars (Bawden 1936 Cockerham 1943 Bawdenand Sheffield 1944) selection for avirulence to this gene waslikely to be an important factor in the diversification of the Ophylogroup population Similarly N strains but not O strainsinfect plants with both Nc and the less common Ny and Nzgenes so selection for avirulence would again have favored di-versity in the N phylogroup population (Cockerham 1970 Jones1990Chikh-Ali et al 2014 Kehoe and Jones 2016)

PVY strains causing veinal necrosis symptoms in tobaccowere first reported in 1935 in the UK (Smith and Dennis 1940)and soon afterwards in Brazil Europe and North America(Nobrega and Silberschmidt 1944 Bawden and Kassanis 19471951 Richardson 1958 Klinkowski and Schmelzer 1960Silberschmidt 1960 Todd 1961 Kahn and Monroe 1963 Brucher1969) However whether these strains included ones in both theN and R1 phylogroups is unknown Strains inducing this symp-tom in tobacco produce mild symptoms in potato plants so theysoon became widely distributed because infected plants wereoften missed when seed potato crops were rogued (Todd 1961Jones 1990) Possibly this increased spread can be attributed toemergence of the R1 population at that time but evidence forthat is lacking and none of the sequences we analyzed camefrom early isolates Although the R-1 population is not verydamaging to potato it is difficult to control as its symptoms areso mild and it overcomes resistance genes Nc No and Nz Bycontrast the R-2 population is more damaging as it often causestuber necrosis and still overcomes these three resistancegenes

The tuber necrosis isolates R-2 phylogroup were firstsighted in the early 1980s in Europe and shown to be recombi-nants (Beczner et al 1984 Le Romancer Kerlan and Nedellec1994 Boonham et al 2002 Lorenzen et al 2006) These reportsare likely to be accurate as R-2 infections are so noticeable andby the 1980s potato crops in Europe were intensively monitoredfor disease Our estimates of the origin of R-2 isolates were19694 CE 19603 CE and 19324 CE for the ML-LSD NJ-LSD andBEAST analyses respectively Thus the ML-LSD estimate agreesmost closely with the crop record and the BEAST estimateagrees least well but all are possible Similarly PVY infectionsthat may have been caused by R-1 isolates were first reportedin 1935 and soon confirmed to be widespread The ML-LSDNJ-LSD and BEAST methods placed the origin of R-1 as 19467CE 19333 CE and 18806 CE respectively Thus the NJ-LSD esti-mate is the most likely the BEAST estimate less likely and theML-LSD impossible

All our date estimates agree that the near simultaneous radi-ation of the O and N populations of PVY (Figs 4 and 6 nodes 5and 6) coincided with the earliest potato breeding programs ofany size These programs increased the genetic diversity ofavailable potato cultivars and this would have included the Nyand Nz resistance genes They also coincided with cultural andagronomic changes to seed and ware potato crop production(Singh et al 2008 Gray et al 2010 Jones 2014) A combination ofthese factors is likely to have led to conditions favoring recom-binants This scenario fits better with the LSD dates 19182 CEand 19062 CE respectively than with the earlier BEAST date of18329 CE

Most of our PVY date estimates are congruent with those ofthe recent history of the international potato crop but this isnot surprising as the estimated dates have very broad overlap-ping 95 CI estimates One crucial event might however informus whether some TMRCA estimates are more accurate thanothers and that is whether the divergence of the C phylogroup(Fig 6 nodes 3 and 4) occurred before or after PVY was

introduced to Europe in the late 16th century the TMRCAs esti-mated by the regression methods are congruent with aEuropean divergence of the C phylogroup whereas those esti-mated by the probabilistic methods indicate a much earlier di-vergence of the C phylogroup This dilemma will only beresolved when the genomic sequences of more Chile 3-like iso-lates or of the C phylogroup are known

Our estimates of the dates for the PVY phylogeny are some-what earlier than those reported by Visser Bellstedt and Pirie(2012) who analyzed rec n-rec and partitioned PVY genomic se-quences by probabilistic methods although our 95 CI esti-mates overlap The difference may merely be the result ofpopulation sampling differences as a much larger set of se-quences was available to us and it included the early dated iso-late KP691327 (1943 CE) A further difference between ouranalyses is the date given to sequence EU563512 which asVisser Bellstedt and Pirie (2012) reported was obtained lsquofrom apotato plant vegetatively propagated since 1938rsquo and sequencedin 2007 (Dullemans et al 2011) We dated this sequence as 1938the earliest of our sequences whereas Visser Bellstedt andPirie (2012) dated it as 2007 In making this choice we checkedhow this difference affected the date estimates of the seventy-three n-rec dataset and found that using the 1938 date gave anML-LSD TMRCA of 10854 CE (95 CI 1267 CEndash321 CE) whereasusing the 2007 date gave a TMRCA of 363 CE and very muchbroader 95 CIs (918 CE to 18 109 BCE) and when the se-quence was omitted an intermediate TMRCA of 687 CE (1016CEndash9503 BCE) Thus the 95 CI range was smallest using the1938 date and justified our choice

Finally we explored the possibility of extrapolating datesover an even larger timescale Gibbs et al (2008) suggested thatthe Neolithic invention of agriculture which in Eurasia occurredin the northern LevantineMesopotamian area around 12000BCE (Bellwood 2005 Pinhasi Fort and Ammerman 2005) pro-duced the conditions for the starburst potyvirus diversificationand this provides another possible dating point for potyvirusesWe estimated that in an ML phylogeny of the major ORFs of alarge representative set of potyviruses and four PVY isolates(one N one O and two C phylogroup isolates) the ratio of themean patristic distances of the sequences connected throughthe root of the phylogeny and through the root of the PVY se-quences was 9941 indicating that a TMRCA of all potyvirusesof around 10000 BCE would give a TMRCA of PVY around 1000CE

Supplementary data

Supplementary data are available at Virus Evolution online

Acknowledgements

We are very grateful to Hien To and Olivier Gascuel for greathelp with use of their LSD software

Conflict of interest None declared

ReferencesAbascal F Zardoya R and Telford M J (2010) lsquoTranslatorX

Multiple Alignment of Nucleotide Sequences Guided by AminoAcid Translationsrsquo Nucleic Acids Research 38 W7ndash13

Altschul S F et al (1990) lsquoBasic Local Alignment Search ToolrsquoJournal of Molecular Biology 215 403ndash10

A J Gibbs et al | 9

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

Bai Y and Lindhout P (2007) lsquoDomestication and Breeding ofTomatoes What Have We Gained and What Can We Gain inthe Futurersquo Annals of Botany 100 1085ndash94

Bawden F C (1936) lsquoThe Viruses Causing Top Necrosis(Acronecrosis) of the Potatorsquo Annals of Applied Biology 23487ndash97

and Kassanis B (1947) lsquoThe Behaviour of Some NaturallyOccurring Strains of Potato Virus Yrsquo Annals of Applied Biology31 503ndash16

and (1951) lsquoSerologically Related Strains of PotatoVirus Y That Are Mutually Antagonistic in Plantsrsquo Annals ofApplied Biology 38 402ndash10 [CrossRef]Mismatch]

and Sheffield F M L (1944) lsquoThe Relationships of SomeViruses Causing Necrotic Diseases of the Potatorsquo Annals ofApplied Biology 31 33ndash40

Beczner L et al (1984) lsquoStudies on the Etiology of Tuber NecroticRingspot Disease in Potatorsquo Potato Research 27 339ndash52

Bellwood P (2005) First Farmers The Origins of AgriculturalSocieties Malden Oxford and Carlton Blackwell Publishing

Boni M F Posada D and Feldman M W (2007) lsquoAn ExactNonparametric Method for Inferring Mosaic Structure inSequence Tripletsrsquo Genetics 176 1035ndash47

Boonham N et al (2002) lsquoBiological and Sequence Comparisonsof Potato virus Y Isolates Associated with Potato Tuber NecroticRingspot Diseasersquo Plant Pathology 51 117ndash26

Brown C R (1993) lsquoOrigin and History of the Potatorsquo AmericanJournal of Potato Research 70 363ndash73

and Henfling J W (2014) 0A History of the Potato0 inNavarre R and Pavek MJ (eds) The Potato Botany Productionand Use pp 1ndash11Wallingford UK CABI

Brucher H (1969) lsquoObservations on Origin and Spread of Y-N

Virus in South Americarsquo Angewandte Botanik 43 241ndash9Chikh-Ali M et al (2014) lsquoEvidence of a Monogenic Nature of

the Nz Gene Against Potato Virus Y Strain Z (PVYZ) in PotatorsquoAmerican Journal of Potato Research 91 649ndash54

Cockerham G (1943) lsquoThe Reactions of Potato Varieties toViruses X A B and Crsquo Annals of Applied Biology 30 338ndash44

(1970) lsquoGenetical Studies on Resistance to Potato Viruses Xand Yrsquo Heredity 25 309ndash48

Crosby A E (1972) 0The Columbian Exchange Biological and CulturalConsequences of 14940 268 pp Greenwood Publishing Group

Cuevas J M et al (2012) lsquoPhylogeography and MolecularEvolution of Potato Virus Yrsquo PLoS One 7 e37853

Darriba D et al (2011) lsquoProtTest 3 Fast Selection of Best-FitModels of Protein Evolutionrsquo Bioinformatics 27 1164ndash5

De Bokx J A and van der Want J P H (eds) (1987) 0Viruses ofPotatoes and Seed-Potato Production0 2nd edn WageningenCentre for Agricultural Publishing and Documentation

Drummond A J et al (2012) lsquoBayesian Phylogenetics with BEAUtiand the BEAST 17rsquo Molecular Biology and Evolution 29 1969ndash73

Duchene S (2015) lsquoThe Performance of the Date-RandomizationTest in Phylogenetic Analyses of Time-Structured Virus DatarsquoMolecular Biology and Evolution 32 1895ndash906

Dullemans A M et al (2011) lsquoComplete Nucleotide Sequence ofa Potato Isolate of Strain Group C of Potato Virus Y from 1938rsquoArchives of Virology 156 473ndash7

Fourment M and Gibbs M J (2006) lsquoPATRISTIC a Program forCalculating Patristic Distances and Graphically Comparing theComponents of Genetic Changersquo BMC Evolutionary Biology 6 1

Gibbs A J and Ohshima K (2010) lsquoPotyviruses in the DigitalAgersquo Annual Review of Phytopathology 48 205ndash23

et al (2008) lsquoThe Prehistory of Potyviruses Their InitialRadiation Was during the Dawn of Agriculturersquo PLoS One 3 e2523

Nguyen H D and Ohshima K (2015) lsquoThe lsquoEmergencersquo ofTurnip Mosaic Virus Was Probably a lsquoGene-for-Quasi-GenersquoEventrsquo Current Opinion in Virology 10 20ndash5

Armstrong J S and Gibbs A J (2000) lsquoSister-Scanning AMonte Carlo Procedure for Assessing Signals in RecombinantSequencesrsquo Bioinformatics 16 573ndash82

et al (2006) lsquoThe Variable Codons of H3 Influenza A VirusHaemagglutinin Genesrsquo Archives of Virology 52 11ndash24

Glais L Tribodet M and Kerlan C (2002) lsquoGenomic Variabilityin Potato Potyvirus Y (PVY) Evidence that PVYNW and PVYNTN

Variants Are Single to Multiple Recombinants Between PVYO

and PVYN Isolatesrsquo Archives of Virology 147 363ndash78Glenndinning D R (1983) lsquoPotato Introductions and Breeding up

to the 20th Centuryrsquo New Phytologist 94 479ndash505Gray S et al (2010) lsquoPotato Virus Y An Evolving Concern for Potato

Crops in the United States and Canadarsquo Plant Disease 94 1384ndash97Guindon S and Gascuel O (2003) lsquoA Simple Fast and Accurate

Algorithm to Estimate Large Phylogenies by MaximumLikelihoodrsquo Systematic Biology 52 696ndash704

Hall T A (1999) lsquoBioEdit A User-Friendly Biological SequenceAlignment Editor and Analysis Program for Windows 9598NTrsquo Nucleic Acids Symposium Series 41 95ndash8

Hawkes J G (1978) 0History of the Potato0 in Harris P M (ed) ThePotato Crop pp 1ndash14New York Springer

and Fransisco-Ortega J (1993) lsquoThe Early History of thePotato in Europersquo Euphytica 70 1ndash7

Holmes E C Worobey M and Rambaut A (1999) lsquoPhylogeneticEvidence for Recombination in Dengue Virusrsquo Molecular Biologyand Evolution 16 405ndash9

Hu X et al (2009a) lsquoSequence Characteristics of Potato Virus YRecombinantsrsquo Journal of General Virology 90 3033ndash41

et al (2009b) lsquoA Novel Recombinant Strain of Potato VirusY Suggests a New Viral Genetic Determinant of Vein Necrosisin Tobaccorsquo Virus Research 143 68ndash76

Huson D H and Bryant D (2006) lsquoApplication of PhylogeneticNetworks in Evolutionary Studiesrsquo Molecular Biology andEvolution 23 254ndash67

Jeanmougin F et al (1998) lsquoMultiple Sequence Alignment withClustal Xrsquo Trends in Biochemical Sciences 23 403ndash5

Jones R A C (1981) 0The Ecology of Viruses Infecting Wild andCultivated Potatoes in the Andean Region of South America0in Thresh J M (ed) Pests Pathogens and Vegetation pp 89ndash107London Pitman

(1990) lsquoStrain Group Specific and Virus Specific HypersensitiveReactions to Infection with Potyviruses in Potato CultivarsrsquoAnnals of Applied Biology 117 93ndash105 [CrossRef]Mismatch]

(2014) 0Virus Disease Problems Facing Potato IndustriesWorldwide Viruses Found Climate Change ImplicationsRationalising Virus Strain Nomenclature and Addressing thePotato Virus Y issue0 in Navarre R and Pavek M J (eds) ThePotato Botany Production and UsesWallingford UK CABI

and Kehoe M A (2016) lsquoA Proposal to Rationalize Within-Species Plant Virus Nomenclature Benefits and Implicationsof Inactionrsquo Archives of Virology 161 2051ndash7

Kahn R P and Monroe R L (1963) lsquoDetection of the TobaccoVeinal Necrosis Strain of Potato Virus in Solanum cardenasiiand S andigenum Introduced into the United StatesrsquoPhytopathology 53 1356ndash9

Karasev A V and Gray S M (2013) lsquoGenetic Diversity of Potatovirus Y Complexrsquo American Journal of Potato Research 90 7ndash13

et al (2011) lsquoGenetic Diversity and Origin of the OrdinaryStrain of Potato virus Y (PVY) and Origin of Recombinant PVYStrainsrsquo Phytopathology 101 778ndash85

10 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

Katoh K and Standley D M (2013) lsquoMAFFT Multiple SequenceAlignment Software Version 7 Improvements in Performanceand Usabilityrsquo Molecular Biology and Evolution 30 772ndash80

Kehoe M A and Jones R A C (2016) lsquoPotato virus Y StrainNomenclature Lessons From Comparing Isolates ObtainedOver a 73 Year Periodrsquo Plant Pathology 65 322ndash33

Kerlan C (2006) 0Potato Virus Y0 in Descriptions of Plant Viruses No414 Adams J and Antoniw J (eds) Rothamsted Research UK

and Moury B (2008) 0Potato Virus Y0 in Granoff A andWebster R G (eds) Encyclopedia of Virology 3rd edn pp 287ndash96New York Academic Press

Klinkowski M and Schmelzer K (1960) lsquoA Necrotic Type ofPotato Virus Yrsquo American Potato Journal 37 221ndash8

Kraft K H et al (2014) lsquoMultiple Lines of Evidence for the Originof Domesticated Chili Pepper Capsicum annuum in MexicorsquoProceedings of the National Academy of Science USA 111 6165ndash70

Le Romancer M Kerlan C and Nedellec M (1994) lsquoBiologicalCharacterization of Various Geographical Isolates of PotatoVirus Y Inducing Superficial Necrosis on Potato Tubersrsquo PlantPathology 43 138ndash44

Le S Q and Gascuel G (2008) lsquoAn Improved General Amino AcidReplacement Matrixrsquo Molecular Biology and Evolution 25 1307ndash20

Lemey P et al (2009) lsquoIdentifying Recombinants in Human andPrimate Immunodeficiency Virus Sequence Alignments UsingQuartet Scanningrsquo BMC Bioinformatics 10 126

Loebenstein G et al (eds) (2001) Virus and Virus-Like Diseases ofPotatoes and Production of Seed-Potatoes Dordrecht KluwerAcademic Publishers

Lorenzen J H et al (2006) lsquoWhole Genome Characterization ofPotato virus Y Isolates Collected in the Western USA and TheirComparison to Isolates From Europe and Canadarsquo Archives ofVirology 151 1055ndash74

Martin D and Rybicki E (2000) lsquoRDP Detection of RecombinationAmongst Aligned Sequencesrsquo Bioinformatics 16 562ndash3

Martin D P et al (2005) lsquoA Modified BOOTSCAN Algorithm forAutomated Identification of Recombinant Sequences andRecombination Breakpointsrsquo AIDS Research and HumanRetroviruses 21 98ndash102

et al (2015) lsquoRDP4 Detection and Analysis of RecombinationPatterns in Virus Genomesrsquo Virus Evolution 1 1ndash5

Milne I et al (2009) lsquoTOPALi v2 a Rich Graphical Interface forEvolutionary Analyses of Multiple Alignments on HPC Clustersand Multi-Core Desktopsrsquo Bioinformatics 25 126ndash7

Maynard Smith J (1992) lsquoAnalyzing the Mosaic Structure ofGenesrsquo Journal of Molecular Evolution 34 126ndash9

McGuire G and Wright F (2000) lsquoTOPAL 20 ImprovedDetection of Mosaic Sequences Within Multiple AlignmentsrsquoBioinformatics 16 130ndash4

Moury B (2010) lsquoA New Lineage Sheds Light on the EvolutionaryHistory of Potato Virus Yrsquo Molecular Plant Pathology 11 161ndash8

et al (2002) lsquoEvidence for Diversifying Selection in Potatovirusy and in the Coat Protein of Other Potyvirusesrsquo Journal ofGeneral Virology 83 2563ndash73

and Simon V (2011) lsquodNdS-Based Methods DetectPositive Selection Linked to Trade-Offs Between DifferentFitness Traits in the Coat Protein of Potato Virus Yrsquo MolecularBiology and Evolution 28 2707ndash17

Nei M and Li W H (1979) lsquoMathematical Model for StudyingGenetic Variation in Terms of Restriction EndonucleasesrsquoProceedings of the National Academy of Science USA 76 5269ndash73

Nobrega N R and Silberschmidt K (1944) lsquoSobre uma provavelvariante do virus ldquoYrdquo da batatinha (Solanum virus 2 Orton)que tern a peculiaridade de provocar necroses em plantas defumorsquo Arquivos Do Instituto Biologico (Sao Paulo) 15 307ndash30

Ogawa T et al (2008) lsquoGenetic Structure of a Population ofPotato Virus Y Inducing Potato Tuber Necrotic RingspotDisease in Japan Comparison with North American andEuropean Populationsrsquo Virus Research 131 199ndash212

et al (2012) lsquoThe Genetic Structure of Populations of PotatoVirus Y in Japan Based on the Analysis of 20 Full GenomicSequencesrsquo Journal of Phytopathology 160 661ndash73

Ohshima K et al (2016) lsquoTemporal Analysis of Reassortmentand Molecular Evolution of Cucumber Mosaic Virus ExtraClues From Its Segmented Genomersquo Virology 487 188ndash97

Padidam M Sawyer S and Fauquet C M (1999) lsquoPossibleEmergence of New Geminiviruses by Frequent RecombinationrsquoVirology 265 218ndash25

Pickersgill B (2007) lsquoDomestication of Plants in the AmericasInsights From Mendelian and Molecular Geneticsrsquo Annals ofBotany 100 925ndash40

Pinhasi R Fort J and Ammerman A J (2005) lsquoTracing the Originand Spread of Agriculture in Europersquo PLoS Biology 3 e410

Posada D and Crandall K A (2001) lsquoEvaluation of Methods forDetecting Recombination From DNA Sequences ComputerSimulationsrsquo Proceedings of the National Academy of Science USA98 13757ndash62

Rambaut A et al (2016) lsquoExploring the Temporal Structure ofHeterochronous Sequences Using TempEstrsquo Virus Evolution 2vew007

Ramsden C Holmes E C and Charleston M A (2009)lsquoHantavirus Evolution in Relation to Its Rodent and InsectivoreHosts No Evidence for Codivergencersquo Molecular BiologyEvolution 26 143ndash53

Richardson D E (1958) lsquoSome Observations on the Tobacco VeinalNecrosis Strain of Potato Virus Yrsquo Plant Pathology 7 133ndash5

Rowley J S Gray S M and Karasev A V (2015) lsquoScreeningPotato Cultivars for New Sources of Resistance to Potato VirusYrsquo American Journal of Potato Research 92 38ndash48

Salaman R N (1954) lsquoThe Early European Potato Its Characterand Place of Originrsquo Journal of the Linnean Society Botany 53 1ndash27

and Hawkes J G (1949) lsquoThe Character of the EarlyEuropean Potatorsquo Proceedings of the Linnean Society London 16171ndash84

Schubert J Fomitcheva V and Sztangret-Wisniewska J (2007)lsquoDifferentiation of Potato Virus Y Strains Using Improved Setsof Diagnostic PCR-Primersrsquo Journal of Virological Methods 14066ndash74

Shimodaira H and Hasegawa M (1999) lsquoMultiple Comparisonsof Log-Likelihoods with Applications to PhylogeneticInferencersquo Molecular Biology and Evolution 16 1114ndash6

Silberschmidt K M (1960) lsquoTypes of Potato Virus Y Necrotic toTobacco History and Recent Observationrsquo American PotatoJournal 37 151

Singh R P et al (2008) lsquoDiscussion Paper The Naming ofPotato Virus Y Strains Infecting Potatorsquo Archives of Virology153 1ndash13

Smith K M (1931) lsquoOn the Composite Nature of Certain PotatoVirus Diseases of the Mosaic Group as Revealed by the Use ofPlant Indicators and Selective Methods of TransmissionrsquoProceedings of the Royal Society B 109 251ndash66

and Dennis R W G (1940) lsquoSome Notes on a SuggestedVariant of Solanum Virus 2 (Potato Virus Y)rsquo Annals of AppliedBiology 27 65ndash70

Spetz C et al (2003) lsquoMolecular Resolution of Complexes ofPotyviruses Infecting Solanaceous Crops at the Centre ofOrigin in Perursquo Journal of General Virology 84 2565ndash78

Stevenson W R et al (eds) (2001) Compendium of Potato Diseases2nd edn St Paul MN APS Press

A J Gibbs et al | 11

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

Tavare S (1986) lsquoSome Probabilistic and Statistical Problems inthe Analysis of DNA Sequencesrsquo Lectures on Mathematics in theLife Sciences 17 57ndash86

Tian Y P et al (2011) lsquoGenetic Diversity of Potato Virus YInfecting Tobacco Crops in Chinarsquo Phytopathology 10 377ndash87

To T H et al (2015) lsquoFast Dating Using Least-Squares Criteriaand Algorithmsrsquo Systematic Biology 65 82ndash97

Todd J M (1961) lsquoTobacco Veinal Necrosis on Potato in ScotlandControl of the Outbreak and Some Characters of the Virusrsquo inProceedings of the Fourth Conference on Potato Virus DiseasesBraunschweig 1960 pp 82ndash92

Visser J C and Bellstedt D U (2009) lsquoAn Assessment ofMolecular Variability and Recombination Patterns in SouthAfrican Isolates of Potato Virus Yrsquo Archives of Virology 1541891ndash900

and Pirie M D (2012) lsquoThe Recent RecombinantEvolution of a Major Crop Pathogen Potato Virus Yrsquo PLoS One7 e50631

Xia X (2013) lsquoDAMBE5 A Comprehensive Software Package forData Analysis in Molecular Biology and Evolutionrsquo MolecularBiology and Evolution 30 1720ndash8

12 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

  • vex002-TF7
  • vex002-TF8
  • vex002-TF1
  • vex002-TF2
  • vex002-TF3
  • vex002-TF4
  • vex002-TF5
  • vex002-TF6
Page 9: The phylogenetics of the global population of potato virus Y and … · 2019. 4. 30. · The phylogenetics of the global population of potato virus Y and its necrogenic recombinants

early potato cultivars (Bawden 1936 Cockerham 1943 Bawdenand Sheffield 1944) selection for avirulence to this gene waslikely to be an important factor in the diversification of the Ophylogroup population Similarly N strains but not O strainsinfect plants with both Nc and the less common Ny and Nzgenes so selection for avirulence would again have favored di-versity in the N phylogroup population (Cockerham 1970 Jones1990Chikh-Ali et al 2014 Kehoe and Jones 2016)

PVY strains causing veinal necrosis symptoms in tobaccowere first reported in 1935 in the UK (Smith and Dennis 1940)and soon afterwards in Brazil Europe and North America(Nobrega and Silberschmidt 1944 Bawden and Kassanis 19471951 Richardson 1958 Klinkowski and Schmelzer 1960Silberschmidt 1960 Todd 1961 Kahn and Monroe 1963 Brucher1969) However whether these strains included ones in both theN and R1 phylogroups is unknown Strains inducing this symp-tom in tobacco produce mild symptoms in potato plants so theysoon became widely distributed because infected plants wereoften missed when seed potato crops were rogued (Todd 1961Jones 1990) Possibly this increased spread can be attributed toemergence of the R1 population at that time but evidence forthat is lacking and none of the sequences we analyzed camefrom early isolates Although the R-1 population is not verydamaging to potato it is difficult to control as its symptoms areso mild and it overcomes resistance genes Nc No and Nz Bycontrast the R-2 population is more damaging as it often causestuber necrosis and still overcomes these three resistancegenes

The tuber necrosis isolates R-2 phylogroup were firstsighted in the early 1980s in Europe and shown to be recombi-nants (Beczner et al 1984 Le Romancer Kerlan and Nedellec1994 Boonham et al 2002 Lorenzen et al 2006) These reportsare likely to be accurate as R-2 infections are so noticeable andby the 1980s potato crops in Europe were intensively monitoredfor disease Our estimates of the origin of R-2 isolates were19694 CE 19603 CE and 19324 CE for the ML-LSD NJ-LSD andBEAST analyses respectively Thus the ML-LSD estimate agreesmost closely with the crop record and the BEAST estimateagrees least well but all are possible Similarly PVY infectionsthat may have been caused by R-1 isolates were first reportedin 1935 and soon confirmed to be widespread The ML-LSDNJ-LSD and BEAST methods placed the origin of R-1 as 19467CE 19333 CE and 18806 CE respectively Thus the NJ-LSD esti-mate is the most likely the BEAST estimate less likely and theML-LSD impossible

All our date estimates agree that the near simultaneous radi-ation of the O and N populations of PVY (Figs 4 and 6 nodes 5and 6) coincided with the earliest potato breeding programs ofany size These programs increased the genetic diversity ofavailable potato cultivars and this would have included the Nyand Nz resistance genes They also coincided with cultural andagronomic changes to seed and ware potato crop production(Singh et al 2008 Gray et al 2010 Jones 2014) A combination ofthese factors is likely to have led to conditions favoring recom-binants This scenario fits better with the LSD dates 19182 CEand 19062 CE respectively than with the earlier BEAST date of18329 CE

Most of our PVY date estimates are congruent with those ofthe recent history of the international potato crop but this isnot surprising as the estimated dates have very broad overlap-ping 95 CI estimates One crucial event might however informus whether some TMRCA estimates are more accurate thanothers and that is whether the divergence of the C phylogroup(Fig 6 nodes 3 and 4) occurred before or after PVY was

introduced to Europe in the late 16th century the TMRCAs esti-mated by the regression methods are congruent with aEuropean divergence of the C phylogroup whereas those esti-mated by the probabilistic methods indicate a much earlier di-vergence of the C phylogroup This dilemma will only beresolved when the genomic sequences of more Chile 3-like iso-lates or of the C phylogroup are known

Our estimates of the dates for the PVY phylogeny are some-what earlier than those reported by Visser Bellstedt and Pirie(2012) who analyzed rec n-rec and partitioned PVY genomic se-quences by probabilistic methods although our 95 CI esti-mates overlap The difference may merely be the result ofpopulation sampling differences as a much larger set of se-quences was available to us and it included the early dated iso-late KP691327 (1943 CE) A further difference between ouranalyses is the date given to sequence EU563512 which asVisser Bellstedt and Pirie (2012) reported was obtained lsquofrom apotato plant vegetatively propagated since 1938rsquo and sequencedin 2007 (Dullemans et al 2011) We dated this sequence as 1938the earliest of our sequences whereas Visser Bellstedt andPirie (2012) dated it as 2007 In making this choice we checkedhow this difference affected the date estimates of the seventy-three n-rec dataset and found that using the 1938 date gave anML-LSD TMRCA of 10854 CE (95 CI 1267 CEndash321 CE) whereasusing the 2007 date gave a TMRCA of 363 CE and very muchbroader 95 CIs (918 CE to 18 109 BCE) and when the se-quence was omitted an intermediate TMRCA of 687 CE (1016CEndash9503 BCE) Thus the 95 CI range was smallest using the1938 date and justified our choice

Finally we explored the possibility of extrapolating datesover an even larger timescale Gibbs et al (2008) suggested thatthe Neolithic invention of agriculture which in Eurasia occurredin the northern LevantineMesopotamian area around 12000BCE (Bellwood 2005 Pinhasi Fort and Ammerman 2005) pro-duced the conditions for the starburst potyvirus diversificationand this provides another possible dating point for potyvirusesWe estimated that in an ML phylogeny of the major ORFs of alarge representative set of potyviruses and four PVY isolates(one N one O and two C phylogroup isolates) the ratio of themean patristic distances of the sequences connected throughthe root of the phylogeny and through the root of the PVY se-quences was 9941 indicating that a TMRCA of all potyvirusesof around 10000 BCE would give a TMRCA of PVY around 1000CE

Supplementary data

Supplementary data are available at Virus Evolution online

Acknowledgements

We are very grateful to Hien To and Olivier Gascuel for greathelp with use of their LSD software

Conflict of interest None declared

ReferencesAbascal F Zardoya R and Telford M J (2010) lsquoTranslatorX

Multiple Alignment of Nucleotide Sequences Guided by AminoAcid Translationsrsquo Nucleic Acids Research 38 W7ndash13

Altschul S F et al (1990) lsquoBasic Local Alignment Search ToolrsquoJournal of Molecular Biology 215 403ndash10

A J Gibbs et al | 9

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

Bai Y and Lindhout P (2007) lsquoDomestication and Breeding ofTomatoes What Have We Gained and What Can We Gain inthe Futurersquo Annals of Botany 100 1085ndash94

Bawden F C (1936) lsquoThe Viruses Causing Top Necrosis(Acronecrosis) of the Potatorsquo Annals of Applied Biology 23487ndash97

and Kassanis B (1947) lsquoThe Behaviour of Some NaturallyOccurring Strains of Potato Virus Yrsquo Annals of Applied Biology31 503ndash16

and (1951) lsquoSerologically Related Strains of PotatoVirus Y That Are Mutually Antagonistic in Plantsrsquo Annals ofApplied Biology 38 402ndash10 [CrossRef]Mismatch]

and Sheffield F M L (1944) lsquoThe Relationships of SomeViruses Causing Necrotic Diseases of the Potatorsquo Annals ofApplied Biology 31 33ndash40

Beczner L et al (1984) lsquoStudies on the Etiology of Tuber NecroticRingspot Disease in Potatorsquo Potato Research 27 339ndash52

Bellwood P (2005) First Farmers The Origins of AgriculturalSocieties Malden Oxford and Carlton Blackwell Publishing

Boni M F Posada D and Feldman M W (2007) lsquoAn ExactNonparametric Method for Inferring Mosaic Structure inSequence Tripletsrsquo Genetics 176 1035ndash47

Boonham N et al (2002) lsquoBiological and Sequence Comparisonsof Potato virus Y Isolates Associated with Potato Tuber NecroticRingspot Diseasersquo Plant Pathology 51 117ndash26

Brown C R (1993) lsquoOrigin and History of the Potatorsquo AmericanJournal of Potato Research 70 363ndash73

and Henfling J W (2014) 0A History of the Potato0 inNavarre R and Pavek MJ (eds) The Potato Botany Productionand Use pp 1ndash11Wallingford UK CABI

Brucher H (1969) lsquoObservations on Origin and Spread of Y-N

Virus in South Americarsquo Angewandte Botanik 43 241ndash9Chikh-Ali M et al (2014) lsquoEvidence of a Monogenic Nature of

the Nz Gene Against Potato Virus Y Strain Z (PVYZ) in PotatorsquoAmerican Journal of Potato Research 91 649ndash54

Cockerham G (1943) lsquoThe Reactions of Potato Varieties toViruses X A B and Crsquo Annals of Applied Biology 30 338ndash44

(1970) lsquoGenetical Studies on Resistance to Potato Viruses Xand Yrsquo Heredity 25 309ndash48

Crosby A E (1972) 0The Columbian Exchange Biological and CulturalConsequences of 14940 268 pp Greenwood Publishing Group

Cuevas J M et al (2012) lsquoPhylogeography and MolecularEvolution of Potato Virus Yrsquo PLoS One 7 e37853

Darriba D et al (2011) lsquoProtTest 3 Fast Selection of Best-FitModels of Protein Evolutionrsquo Bioinformatics 27 1164ndash5

De Bokx J A and van der Want J P H (eds) (1987) 0Viruses ofPotatoes and Seed-Potato Production0 2nd edn WageningenCentre for Agricultural Publishing and Documentation

Drummond A J et al (2012) lsquoBayesian Phylogenetics with BEAUtiand the BEAST 17rsquo Molecular Biology and Evolution 29 1969ndash73

Duchene S (2015) lsquoThe Performance of the Date-RandomizationTest in Phylogenetic Analyses of Time-Structured Virus DatarsquoMolecular Biology and Evolution 32 1895ndash906

Dullemans A M et al (2011) lsquoComplete Nucleotide Sequence ofa Potato Isolate of Strain Group C of Potato Virus Y from 1938rsquoArchives of Virology 156 473ndash7

Fourment M and Gibbs M J (2006) lsquoPATRISTIC a Program forCalculating Patristic Distances and Graphically Comparing theComponents of Genetic Changersquo BMC Evolutionary Biology 6 1

Gibbs A J and Ohshima K (2010) lsquoPotyviruses in the DigitalAgersquo Annual Review of Phytopathology 48 205ndash23

et al (2008) lsquoThe Prehistory of Potyviruses Their InitialRadiation Was during the Dawn of Agriculturersquo PLoS One 3 e2523

Nguyen H D and Ohshima K (2015) lsquoThe lsquoEmergencersquo ofTurnip Mosaic Virus Was Probably a lsquoGene-for-Quasi-GenersquoEventrsquo Current Opinion in Virology 10 20ndash5

Armstrong J S and Gibbs A J (2000) lsquoSister-Scanning AMonte Carlo Procedure for Assessing Signals in RecombinantSequencesrsquo Bioinformatics 16 573ndash82

et al (2006) lsquoThe Variable Codons of H3 Influenza A VirusHaemagglutinin Genesrsquo Archives of Virology 52 11ndash24

Glais L Tribodet M and Kerlan C (2002) lsquoGenomic Variabilityin Potato Potyvirus Y (PVY) Evidence that PVYNW and PVYNTN

Variants Are Single to Multiple Recombinants Between PVYO

and PVYN Isolatesrsquo Archives of Virology 147 363ndash78Glenndinning D R (1983) lsquoPotato Introductions and Breeding up

to the 20th Centuryrsquo New Phytologist 94 479ndash505Gray S et al (2010) lsquoPotato Virus Y An Evolving Concern for Potato

Crops in the United States and Canadarsquo Plant Disease 94 1384ndash97Guindon S and Gascuel O (2003) lsquoA Simple Fast and Accurate

Algorithm to Estimate Large Phylogenies by MaximumLikelihoodrsquo Systematic Biology 52 696ndash704

Hall T A (1999) lsquoBioEdit A User-Friendly Biological SequenceAlignment Editor and Analysis Program for Windows 9598NTrsquo Nucleic Acids Symposium Series 41 95ndash8

Hawkes J G (1978) 0History of the Potato0 in Harris P M (ed) ThePotato Crop pp 1ndash14New York Springer

and Fransisco-Ortega J (1993) lsquoThe Early History of thePotato in Europersquo Euphytica 70 1ndash7

Holmes E C Worobey M and Rambaut A (1999) lsquoPhylogeneticEvidence for Recombination in Dengue Virusrsquo Molecular Biologyand Evolution 16 405ndash9

Hu X et al (2009a) lsquoSequence Characteristics of Potato Virus YRecombinantsrsquo Journal of General Virology 90 3033ndash41

et al (2009b) lsquoA Novel Recombinant Strain of Potato VirusY Suggests a New Viral Genetic Determinant of Vein Necrosisin Tobaccorsquo Virus Research 143 68ndash76

Huson D H and Bryant D (2006) lsquoApplication of PhylogeneticNetworks in Evolutionary Studiesrsquo Molecular Biology andEvolution 23 254ndash67

Jeanmougin F et al (1998) lsquoMultiple Sequence Alignment withClustal Xrsquo Trends in Biochemical Sciences 23 403ndash5

Jones R A C (1981) 0The Ecology of Viruses Infecting Wild andCultivated Potatoes in the Andean Region of South America0in Thresh J M (ed) Pests Pathogens and Vegetation pp 89ndash107London Pitman

(1990) lsquoStrain Group Specific and Virus Specific HypersensitiveReactions to Infection with Potyviruses in Potato CultivarsrsquoAnnals of Applied Biology 117 93ndash105 [CrossRef]Mismatch]

(2014) 0Virus Disease Problems Facing Potato IndustriesWorldwide Viruses Found Climate Change ImplicationsRationalising Virus Strain Nomenclature and Addressing thePotato Virus Y issue0 in Navarre R and Pavek M J (eds) ThePotato Botany Production and UsesWallingford UK CABI

and Kehoe M A (2016) lsquoA Proposal to Rationalize Within-Species Plant Virus Nomenclature Benefits and Implicationsof Inactionrsquo Archives of Virology 161 2051ndash7

Kahn R P and Monroe R L (1963) lsquoDetection of the TobaccoVeinal Necrosis Strain of Potato Virus in Solanum cardenasiiand S andigenum Introduced into the United StatesrsquoPhytopathology 53 1356ndash9

Karasev A V and Gray S M (2013) lsquoGenetic Diversity of Potatovirus Y Complexrsquo American Journal of Potato Research 90 7ndash13

et al (2011) lsquoGenetic Diversity and Origin of the OrdinaryStrain of Potato virus Y (PVY) and Origin of Recombinant PVYStrainsrsquo Phytopathology 101 778ndash85

10 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

Katoh K and Standley D M (2013) lsquoMAFFT Multiple SequenceAlignment Software Version 7 Improvements in Performanceand Usabilityrsquo Molecular Biology and Evolution 30 772ndash80

Kehoe M A and Jones R A C (2016) lsquoPotato virus Y StrainNomenclature Lessons From Comparing Isolates ObtainedOver a 73 Year Periodrsquo Plant Pathology 65 322ndash33

Kerlan C (2006) 0Potato Virus Y0 in Descriptions of Plant Viruses No414 Adams J and Antoniw J (eds) Rothamsted Research UK

and Moury B (2008) 0Potato Virus Y0 in Granoff A andWebster R G (eds) Encyclopedia of Virology 3rd edn pp 287ndash96New York Academic Press

Klinkowski M and Schmelzer K (1960) lsquoA Necrotic Type ofPotato Virus Yrsquo American Potato Journal 37 221ndash8

Kraft K H et al (2014) lsquoMultiple Lines of Evidence for the Originof Domesticated Chili Pepper Capsicum annuum in MexicorsquoProceedings of the National Academy of Science USA 111 6165ndash70

Le Romancer M Kerlan C and Nedellec M (1994) lsquoBiologicalCharacterization of Various Geographical Isolates of PotatoVirus Y Inducing Superficial Necrosis on Potato Tubersrsquo PlantPathology 43 138ndash44

Le S Q and Gascuel G (2008) lsquoAn Improved General Amino AcidReplacement Matrixrsquo Molecular Biology and Evolution 25 1307ndash20

Lemey P et al (2009) lsquoIdentifying Recombinants in Human andPrimate Immunodeficiency Virus Sequence Alignments UsingQuartet Scanningrsquo BMC Bioinformatics 10 126

Loebenstein G et al (eds) (2001) Virus and Virus-Like Diseases ofPotatoes and Production of Seed-Potatoes Dordrecht KluwerAcademic Publishers

Lorenzen J H et al (2006) lsquoWhole Genome Characterization ofPotato virus Y Isolates Collected in the Western USA and TheirComparison to Isolates From Europe and Canadarsquo Archives ofVirology 151 1055ndash74

Martin D and Rybicki E (2000) lsquoRDP Detection of RecombinationAmongst Aligned Sequencesrsquo Bioinformatics 16 562ndash3

Martin D P et al (2005) lsquoA Modified BOOTSCAN Algorithm forAutomated Identification of Recombinant Sequences andRecombination Breakpointsrsquo AIDS Research and HumanRetroviruses 21 98ndash102

et al (2015) lsquoRDP4 Detection and Analysis of RecombinationPatterns in Virus Genomesrsquo Virus Evolution 1 1ndash5

Milne I et al (2009) lsquoTOPALi v2 a Rich Graphical Interface forEvolutionary Analyses of Multiple Alignments on HPC Clustersand Multi-Core Desktopsrsquo Bioinformatics 25 126ndash7

Maynard Smith J (1992) lsquoAnalyzing the Mosaic Structure ofGenesrsquo Journal of Molecular Evolution 34 126ndash9

McGuire G and Wright F (2000) lsquoTOPAL 20 ImprovedDetection of Mosaic Sequences Within Multiple AlignmentsrsquoBioinformatics 16 130ndash4

Moury B (2010) lsquoA New Lineage Sheds Light on the EvolutionaryHistory of Potato Virus Yrsquo Molecular Plant Pathology 11 161ndash8

et al (2002) lsquoEvidence for Diversifying Selection in Potatovirusy and in the Coat Protein of Other Potyvirusesrsquo Journal ofGeneral Virology 83 2563ndash73

and Simon V (2011) lsquodNdS-Based Methods DetectPositive Selection Linked to Trade-Offs Between DifferentFitness Traits in the Coat Protein of Potato Virus Yrsquo MolecularBiology and Evolution 28 2707ndash17

Nei M and Li W H (1979) lsquoMathematical Model for StudyingGenetic Variation in Terms of Restriction EndonucleasesrsquoProceedings of the National Academy of Science USA 76 5269ndash73

Nobrega N R and Silberschmidt K (1944) lsquoSobre uma provavelvariante do virus ldquoYrdquo da batatinha (Solanum virus 2 Orton)que tern a peculiaridade de provocar necroses em plantas defumorsquo Arquivos Do Instituto Biologico (Sao Paulo) 15 307ndash30

Ogawa T et al (2008) lsquoGenetic Structure of a Population ofPotato Virus Y Inducing Potato Tuber Necrotic RingspotDisease in Japan Comparison with North American andEuropean Populationsrsquo Virus Research 131 199ndash212

et al (2012) lsquoThe Genetic Structure of Populations of PotatoVirus Y in Japan Based on the Analysis of 20 Full GenomicSequencesrsquo Journal of Phytopathology 160 661ndash73

Ohshima K et al (2016) lsquoTemporal Analysis of Reassortmentand Molecular Evolution of Cucumber Mosaic Virus ExtraClues From Its Segmented Genomersquo Virology 487 188ndash97

Padidam M Sawyer S and Fauquet C M (1999) lsquoPossibleEmergence of New Geminiviruses by Frequent RecombinationrsquoVirology 265 218ndash25

Pickersgill B (2007) lsquoDomestication of Plants in the AmericasInsights From Mendelian and Molecular Geneticsrsquo Annals ofBotany 100 925ndash40

Pinhasi R Fort J and Ammerman A J (2005) lsquoTracing the Originand Spread of Agriculture in Europersquo PLoS Biology 3 e410

Posada D and Crandall K A (2001) lsquoEvaluation of Methods forDetecting Recombination From DNA Sequences ComputerSimulationsrsquo Proceedings of the National Academy of Science USA98 13757ndash62

Rambaut A et al (2016) lsquoExploring the Temporal Structure ofHeterochronous Sequences Using TempEstrsquo Virus Evolution 2vew007

Ramsden C Holmes E C and Charleston M A (2009)lsquoHantavirus Evolution in Relation to Its Rodent and InsectivoreHosts No Evidence for Codivergencersquo Molecular BiologyEvolution 26 143ndash53

Richardson D E (1958) lsquoSome Observations on the Tobacco VeinalNecrosis Strain of Potato Virus Yrsquo Plant Pathology 7 133ndash5

Rowley J S Gray S M and Karasev A V (2015) lsquoScreeningPotato Cultivars for New Sources of Resistance to Potato VirusYrsquo American Journal of Potato Research 92 38ndash48

Salaman R N (1954) lsquoThe Early European Potato Its Characterand Place of Originrsquo Journal of the Linnean Society Botany 53 1ndash27

and Hawkes J G (1949) lsquoThe Character of the EarlyEuropean Potatorsquo Proceedings of the Linnean Society London 16171ndash84

Schubert J Fomitcheva V and Sztangret-Wisniewska J (2007)lsquoDifferentiation of Potato Virus Y Strains Using Improved Setsof Diagnostic PCR-Primersrsquo Journal of Virological Methods 14066ndash74

Shimodaira H and Hasegawa M (1999) lsquoMultiple Comparisonsof Log-Likelihoods with Applications to PhylogeneticInferencersquo Molecular Biology and Evolution 16 1114ndash6

Silberschmidt K M (1960) lsquoTypes of Potato Virus Y Necrotic toTobacco History and Recent Observationrsquo American PotatoJournal 37 151

Singh R P et al (2008) lsquoDiscussion Paper The Naming ofPotato Virus Y Strains Infecting Potatorsquo Archives of Virology153 1ndash13

Smith K M (1931) lsquoOn the Composite Nature of Certain PotatoVirus Diseases of the Mosaic Group as Revealed by the Use ofPlant Indicators and Selective Methods of TransmissionrsquoProceedings of the Royal Society B 109 251ndash66

and Dennis R W G (1940) lsquoSome Notes on a SuggestedVariant of Solanum Virus 2 (Potato Virus Y)rsquo Annals of AppliedBiology 27 65ndash70

Spetz C et al (2003) lsquoMolecular Resolution of Complexes ofPotyviruses Infecting Solanaceous Crops at the Centre ofOrigin in Perursquo Journal of General Virology 84 2565ndash78

Stevenson W R et al (eds) (2001) Compendium of Potato Diseases2nd edn St Paul MN APS Press

A J Gibbs et al | 11

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

Tavare S (1986) lsquoSome Probabilistic and Statistical Problems inthe Analysis of DNA Sequencesrsquo Lectures on Mathematics in theLife Sciences 17 57ndash86

Tian Y P et al (2011) lsquoGenetic Diversity of Potato Virus YInfecting Tobacco Crops in Chinarsquo Phytopathology 10 377ndash87

To T H et al (2015) lsquoFast Dating Using Least-Squares Criteriaand Algorithmsrsquo Systematic Biology 65 82ndash97

Todd J M (1961) lsquoTobacco Veinal Necrosis on Potato in ScotlandControl of the Outbreak and Some Characters of the Virusrsquo inProceedings of the Fourth Conference on Potato Virus DiseasesBraunschweig 1960 pp 82ndash92

Visser J C and Bellstedt D U (2009) lsquoAn Assessment ofMolecular Variability and Recombination Patterns in SouthAfrican Isolates of Potato Virus Yrsquo Archives of Virology 1541891ndash900

and Pirie M D (2012) lsquoThe Recent RecombinantEvolution of a Major Crop Pathogen Potato Virus Yrsquo PLoS One7 e50631

Xia X (2013) lsquoDAMBE5 A Comprehensive Software Package forData Analysis in Molecular Biology and Evolutionrsquo MolecularBiology and Evolution 30 1720ndash8

12 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

  • vex002-TF7
  • vex002-TF8
  • vex002-TF1
  • vex002-TF2
  • vex002-TF3
  • vex002-TF4
  • vex002-TF5
  • vex002-TF6
Page 10: The phylogenetics of the global population of potato virus Y and … · 2019. 4. 30. · The phylogenetics of the global population of potato virus Y and its necrogenic recombinants

Bai Y and Lindhout P (2007) lsquoDomestication and Breeding ofTomatoes What Have We Gained and What Can We Gain inthe Futurersquo Annals of Botany 100 1085ndash94

Bawden F C (1936) lsquoThe Viruses Causing Top Necrosis(Acronecrosis) of the Potatorsquo Annals of Applied Biology 23487ndash97

and Kassanis B (1947) lsquoThe Behaviour of Some NaturallyOccurring Strains of Potato Virus Yrsquo Annals of Applied Biology31 503ndash16

and (1951) lsquoSerologically Related Strains of PotatoVirus Y That Are Mutually Antagonistic in Plantsrsquo Annals ofApplied Biology 38 402ndash10 [CrossRef]Mismatch]

and Sheffield F M L (1944) lsquoThe Relationships of SomeViruses Causing Necrotic Diseases of the Potatorsquo Annals ofApplied Biology 31 33ndash40

Beczner L et al (1984) lsquoStudies on the Etiology of Tuber NecroticRingspot Disease in Potatorsquo Potato Research 27 339ndash52

Bellwood P (2005) First Farmers The Origins of AgriculturalSocieties Malden Oxford and Carlton Blackwell Publishing

Boni M F Posada D and Feldman M W (2007) lsquoAn ExactNonparametric Method for Inferring Mosaic Structure inSequence Tripletsrsquo Genetics 176 1035ndash47

Boonham N et al (2002) lsquoBiological and Sequence Comparisonsof Potato virus Y Isolates Associated with Potato Tuber NecroticRingspot Diseasersquo Plant Pathology 51 117ndash26

Brown C R (1993) lsquoOrigin and History of the Potatorsquo AmericanJournal of Potato Research 70 363ndash73

and Henfling J W (2014) 0A History of the Potato0 inNavarre R and Pavek MJ (eds) The Potato Botany Productionand Use pp 1ndash11Wallingford UK CABI

Brucher H (1969) lsquoObservations on Origin and Spread of Y-N

Virus in South Americarsquo Angewandte Botanik 43 241ndash9Chikh-Ali M et al (2014) lsquoEvidence of a Monogenic Nature of

the Nz Gene Against Potato Virus Y Strain Z (PVYZ) in PotatorsquoAmerican Journal of Potato Research 91 649ndash54

Cockerham G (1943) lsquoThe Reactions of Potato Varieties toViruses X A B and Crsquo Annals of Applied Biology 30 338ndash44

(1970) lsquoGenetical Studies on Resistance to Potato Viruses Xand Yrsquo Heredity 25 309ndash48

Crosby A E (1972) 0The Columbian Exchange Biological and CulturalConsequences of 14940 268 pp Greenwood Publishing Group

Cuevas J M et al (2012) lsquoPhylogeography and MolecularEvolution of Potato Virus Yrsquo PLoS One 7 e37853

Darriba D et al (2011) lsquoProtTest 3 Fast Selection of Best-FitModels of Protein Evolutionrsquo Bioinformatics 27 1164ndash5

De Bokx J A and van der Want J P H (eds) (1987) 0Viruses ofPotatoes and Seed-Potato Production0 2nd edn WageningenCentre for Agricultural Publishing and Documentation

Drummond A J et al (2012) lsquoBayesian Phylogenetics with BEAUtiand the BEAST 17rsquo Molecular Biology and Evolution 29 1969ndash73

Duchene S (2015) lsquoThe Performance of the Date-RandomizationTest in Phylogenetic Analyses of Time-Structured Virus DatarsquoMolecular Biology and Evolution 32 1895ndash906

Dullemans A M et al (2011) lsquoComplete Nucleotide Sequence ofa Potato Isolate of Strain Group C of Potato Virus Y from 1938rsquoArchives of Virology 156 473ndash7

Fourment M and Gibbs M J (2006) lsquoPATRISTIC a Program forCalculating Patristic Distances and Graphically Comparing theComponents of Genetic Changersquo BMC Evolutionary Biology 6 1

Gibbs A J and Ohshima K (2010) lsquoPotyviruses in the DigitalAgersquo Annual Review of Phytopathology 48 205ndash23

et al (2008) lsquoThe Prehistory of Potyviruses Their InitialRadiation Was during the Dawn of Agriculturersquo PLoS One 3 e2523

Nguyen H D and Ohshima K (2015) lsquoThe lsquoEmergencersquo ofTurnip Mosaic Virus Was Probably a lsquoGene-for-Quasi-GenersquoEventrsquo Current Opinion in Virology 10 20ndash5

Armstrong J S and Gibbs A J (2000) lsquoSister-Scanning AMonte Carlo Procedure for Assessing Signals in RecombinantSequencesrsquo Bioinformatics 16 573ndash82

et al (2006) lsquoThe Variable Codons of H3 Influenza A VirusHaemagglutinin Genesrsquo Archives of Virology 52 11ndash24

Glais L Tribodet M and Kerlan C (2002) lsquoGenomic Variabilityin Potato Potyvirus Y (PVY) Evidence that PVYNW and PVYNTN

Variants Are Single to Multiple Recombinants Between PVYO

and PVYN Isolatesrsquo Archives of Virology 147 363ndash78Glenndinning D R (1983) lsquoPotato Introductions and Breeding up

to the 20th Centuryrsquo New Phytologist 94 479ndash505Gray S et al (2010) lsquoPotato Virus Y An Evolving Concern for Potato

Crops in the United States and Canadarsquo Plant Disease 94 1384ndash97Guindon S and Gascuel O (2003) lsquoA Simple Fast and Accurate

Algorithm to Estimate Large Phylogenies by MaximumLikelihoodrsquo Systematic Biology 52 696ndash704

Hall T A (1999) lsquoBioEdit A User-Friendly Biological SequenceAlignment Editor and Analysis Program for Windows 9598NTrsquo Nucleic Acids Symposium Series 41 95ndash8

Hawkes J G (1978) 0History of the Potato0 in Harris P M (ed) ThePotato Crop pp 1ndash14New York Springer

and Fransisco-Ortega J (1993) lsquoThe Early History of thePotato in Europersquo Euphytica 70 1ndash7

Holmes E C Worobey M and Rambaut A (1999) lsquoPhylogeneticEvidence for Recombination in Dengue Virusrsquo Molecular Biologyand Evolution 16 405ndash9

Hu X et al (2009a) lsquoSequence Characteristics of Potato Virus YRecombinantsrsquo Journal of General Virology 90 3033ndash41

et al (2009b) lsquoA Novel Recombinant Strain of Potato VirusY Suggests a New Viral Genetic Determinant of Vein Necrosisin Tobaccorsquo Virus Research 143 68ndash76

Huson D H and Bryant D (2006) lsquoApplication of PhylogeneticNetworks in Evolutionary Studiesrsquo Molecular Biology andEvolution 23 254ndash67

Jeanmougin F et al (1998) lsquoMultiple Sequence Alignment withClustal Xrsquo Trends in Biochemical Sciences 23 403ndash5

Jones R A C (1981) 0The Ecology of Viruses Infecting Wild andCultivated Potatoes in the Andean Region of South America0in Thresh J M (ed) Pests Pathogens and Vegetation pp 89ndash107London Pitman

(1990) lsquoStrain Group Specific and Virus Specific HypersensitiveReactions to Infection with Potyviruses in Potato CultivarsrsquoAnnals of Applied Biology 117 93ndash105 [CrossRef]Mismatch]

(2014) 0Virus Disease Problems Facing Potato IndustriesWorldwide Viruses Found Climate Change ImplicationsRationalising Virus Strain Nomenclature and Addressing thePotato Virus Y issue0 in Navarre R and Pavek M J (eds) ThePotato Botany Production and UsesWallingford UK CABI

and Kehoe M A (2016) lsquoA Proposal to Rationalize Within-Species Plant Virus Nomenclature Benefits and Implicationsof Inactionrsquo Archives of Virology 161 2051ndash7

Kahn R P and Monroe R L (1963) lsquoDetection of the TobaccoVeinal Necrosis Strain of Potato Virus in Solanum cardenasiiand S andigenum Introduced into the United StatesrsquoPhytopathology 53 1356ndash9

Karasev A V and Gray S M (2013) lsquoGenetic Diversity of Potatovirus Y Complexrsquo American Journal of Potato Research 90 7ndash13

et al (2011) lsquoGenetic Diversity and Origin of the OrdinaryStrain of Potato virus Y (PVY) and Origin of Recombinant PVYStrainsrsquo Phytopathology 101 778ndash85

10 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

Katoh K and Standley D M (2013) lsquoMAFFT Multiple SequenceAlignment Software Version 7 Improvements in Performanceand Usabilityrsquo Molecular Biology and Evolution 30 772ndash80

Kehoe M A and Jones R A C (2016) lsquoPotato virus Y StrainNomenclature Lessons From Comparing Isolates ObtainedOver a 73 Year Periodrsquo Plant Pathology 65 322ndash33

Kerlan C (2006) 0Potato Virus Y0 in Descriptions of Plant Viruses No414 Adams J and Antoniw J (eds) Rothamsted Research UK

and Moury B (2008) 0Potato Virus Y0 in Granoff A andWebster R G (eds) Encyclopedia of Virology 3rd edn pp 287ndash96New York Academic Press

Klinkowski M and Schmelzer K (1960) lsquoA Necrotic Type ofPotato Virus Yrsquo American Potato Journal 37 221ndash8

Kraft K H et al (2014) lsquoMultiple Lines of Evidence for the Originof Domesticated Chili Pepper Capsicum annuum in MexicorsquoProceedings of the National Academy of Science USA 111 6165ndash70

Le Romancer M Kerlan C and Nedellec M (1994) lsquoBiologicalCharacterization of Various Geographical Isolates of PotatoVirus Y Inducing Superficial Necrosis on Potato Tubersrsquo PlantPathology 43 138ndash44

Le S Q and Gascuel G (2008) lsquoAn Improved General Amino AcidReplacement Matrixrsquo Molecular Biology and Evolution 25 1307ndash20

Lemey P et al (2009) lsquoIdentifying Recombinants in Human andPrimate Immunodeficiency Virus Sequence Alignments UsingQuartet Scanningrsquo BMC Bioinformatics 10 126

Loebenstein G et al (eds) (2001) Virus and Virus-Like Diseases ofPotatoes and Production of Seed-Potatoes Dordrecht KluwerAcademic Publishers

Lorenzen J H et al (2006) lsquoWhole Genome Characterization ofPotato virus Y Isolates Collected in the Western USA and TheirComparison to Isolates From Europe and Canadarsquo Archives ofVirology 151 1055ndash74

Martin D and Rybicki E (2000) lsquoRDP Detection of RecombinationAmongst Aligned Sequencesrsquo Bioinformatics 16 562ndash3

Martin D P et al (2005) lsquoA Modified BOOTSCAN Algorithm forAutomated Identification of Recombinant Sequences andRecombination Breakpointsrsquo AIDS Research and HumanRetroviruses 21 98ndash102

et al (2015) lsquoRDP4 Detection and Analysis of RecombinationPatterns in Virus Genomesrsquo Virus Evolution 1 1ndash5

Milne I et al (2009) lsquoTOPALi v2 a Rich Graphical Interface forEvolutionary Analyses of Multiple Alignments on HPC Clustersand Multi-Core Desktopsrsquo Bioinformatics 25 126ndash7

Maynard Smith J (1992) lsquoAnalyzing the Mosaic Structure ofGenesrsquo Journal of Molecular Evolution 34 126ndash9

McGuire G and Wright F (2000) lsquoTOPAL 20 ImprovedDetection of Mosaic Sequences Within Multiple AlignmentsrsquoBioinformatics 16 130ndash4

Moury B (2010) lsquoA New Lineage Sheds Light on the EvolutionaryHistory of Potato Virus Yrsquo Molecular Plant Pathology 11 161ndash8

et al (2002) lsquoEvidence for Diversifying Selection in Potatovirusy and in the Coat Protein of Other Potyvirusesrsquo Journal ofGeneral Virology 83 2563ndash73

and Simon V (2011) lsquodNdS-Based Methods DetectPositive Selection Linked to Trade-Offs Between DifferentFitness Traits in the Coat Protein of Potato Virus Yrsquo MolecularBiology and Evolution 28 2707ndash17

Nei M and Li W H (1979) lsquoMathematical Model for StudyingGenetic Variation in Terms of Restriction EndonucleasesrsquoProceedings of the National Academy of Science USA 76 5269ndash73

Nobrega N R and Silberschmidt K (1944) lsquoSobre uma provavelvariante do virus ldquoYrdquo da batatinha (Solanum virus 2 Orton)que tern a peculiaridade de provocar necroses em plantas defumorsquo Arquivos Do Instituto Biologico (Sao Paulo) 15 307ndash30

Ogawa T et al (2008) lsquoGenetic Structure of a Population ofPotato Virus Y Inducing Potato Tuber Necrotic RingspotDisease in Japan Comparison with North American andEuropean Populationsrsquo Virus Research 131 199ndash212

et al (2012) lsquoThe Genetic Structure of Populations of PotatoVirus Y in Japan Based on the Analysis of 20 Full GenomicSequencesrsquo Journal of Phytopathology 160 661ndash73

Ohshima K et al (2016) lsquoTemporal Analysis of Reassortmentand Molecular Evolution of Cucumber Mosaic Virus ExtraClues From Its Segmented Genomersquo Virology 487 188ndash97

Padidam M Sawyer S and Fauquet C M (1999) lsquoPossibleEmergence of New Geminiviruses by Frequent RecombinationrsquoVirology 265 218ndash25

Pickersgill B (2007) lsquoDomestication of Plants in the AmericasInsights From Mendelian and Molecular Geneticsrsquo Annals ofBotany 100 925ndash40

Pinhasi R Fort J and Ammerman A J (2005) lsquoTracing the Originand Spread of Agriculture in Europersquo PLoS Biology 3 e410

Posada D and Crandall K A (2001) lsquoEvaluation of Methods forDetecting Recombination From DNA Sequences ComputerSimulationsrsquo Proceedings of the National Academy of Science USA98 13757ndash62

Rambaut A et al (2016) lsquoExploring the Temporal Structure ofHeterochronous Sequences Using TempEstrsquo Virus Evolution 2vew007

Ramsden C Holmes E C and Charleston M A (2009)lsquoHantavirus Evolution in Relation to Its Rodent and InsectivoreHosts No Evidence for Codivergencersquo Molecular BiologyEvolution 26 143ndash53

Richardson D E (1958) lsquoSome Observations on the Tobacco VeinalNecrosis Strain of Potato Virus Yrsquo Plant Pathology 7 133ndash5

Rowley J S Gray S M and Karasev A V (2015) lsquoScreeningPotato Cultivars for New Sources of Resistance to Potato VirusYrsquo American Journal of Potato Research 92 38ndash48

Salaman R N (1954) lsquoThe Early European Potato Its Characterand Place of Originrsquo Journal of the Linnean Society Botany 53 1ndash27

and Hawkes J G (1949) lsquoThe Character of the EarlyEuropean Potatorsquo Proceedings of the Linnean Society London 16171ndash84

Schubert J Fomitcheva V and Sztangret-Wisniewska J (2007)lsquoDifferentiation of Potato Virus Y Strains Using Improved Setsof Diagnostic PCR-Primersrsquo Journal of Virological Methods 14066ndash74

Shimodaira H and Hasegawa M (1999) lsquoMultiple Comparisonsof Log-Likelihoods with Applications to PhylogeneticInferencersquo Molecular Biology and Evolution 16 1114ndash6

Silberschmidt K M (1960) lsquoTypes of Potato Virus Y Necrotic toTobacco History and Recent Observationrsquo American PotatoJournal 37 151

Singh R P et al (2008) lsquoDiscussion Paper The Naming ofPotato Virus Y Strains Infecting Potatorsquo Archives of Virology153 1ndash13

Smith K M (1931) lsquoOn the Composite Nature of Certain PotatoVirus Diseases of the Mosaic Group as Revealed by the Use ofPlant Indicators and Selective Methods of TransmissionrsquoProceedings of the Royal Society B 109 251ndash66

and Dennis R W G (1940) lsquoSome Notes on a SuggestedVariant of Solanum Virus 2 (Potato Virus Y)rsquo Annals of AppliedBiology 27 65ndash70

Spetz C et al (2003) lsquoMolecular Resolution of Complexes ofPotyviruses Infecting Solanaceous Crops at the Centre ofOrigin in Perursquo Journal of General Virology 84 2565ndash78

Stevenson W R et al (eds) (2001) Compendium of Potato Diseases2nd edn St Paul MN APS Press

A J Gibbs et al | 11

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

Tavare S (1986) lsquoSome Probabilistic and Statistical Problems inthe Analysis of DNA Sequencesrsquo Lectures on Mathematics in theLife Sciences 17 57ndash86

Tian Y P et al (2011) lsquoGenetic Diversity of Potato Virus YInfecting Tobacco Crops in Chinarsquo Phytopathology 10 377ndash87

To T H et al (2015) lsquoFast Dating Using Least-Squares Criteriaand Algorithmsrsquo Systematic Biology 65 82ndash97

Todd J M (1961) lsquoTobacco Veinal Necrosis on Potato in ScotlandControl of the Outbreak and Some Characters of the Virusrsquo inProceedings of the Fourth Conference on Potato Virus DiseasesBraunschweig 1960 pp 82ndash92

Visser J C and Bellstedt D U (2009) lsquoAn Assessment ofMolecular Variability and Recombination Patterns in SouthAfrican Isolates of Potato Virus Yrsquo Archives of Virology 1541891ndash900

and Pirie M D (2012) lsquoThe Recent RecombinantEvolution of a Major Crop Pathogen Potato Virus Yrsquo PLoS One7 e50631

Xia X (2013) lsquoDAMBE5 A Comprehensive Software Package forData Analysis in Molecular Biology and Evolutionrsquo MolecularBiology and Evolution 30 1720ndash8

12 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

  • vex002-TF7
  • vex002-TF8
  • vex002-TF1
  • vex002-TF2
  • vex002-TF3
  • vex002-TF4
  • vex002-TF5
  • vex002-TF6
Page 11: The phylogenetics of the global population of potato virus Y and … · 2019. 4. 30. · The phylogenetics of the global population of potato virus Y and its necrogenic recombinants

Katoh K and Standley D M (2013) lsquoMAFFT Multiple SequenceAlignment Software Version 7 Improvements in Performanceand Usabilityrsquo Molecular Biology and Evolution 30 772ndash80

Kehoe M A and Jones R A C (2016) lsquoPotato virus Y StrainNomenclature Lessons From Comparing Isolates ObtainedOver a 73 Year Periodrsquo Plant Pathology 65 322ndash33

Kerlan C (2006) 0Potato Virus Y0 in Descriptions of Plant Viruses No414 Adams J and Antoniw J (eds) Rothamsted Research UK

and Moury B (2008) 0Potato Virus Y0 in Granoff A andWebster R G (eds) Encyclopedia of Virology 3rd edn pp 287ndash96New York Academic Press

Klinkowski M and Schmelzer K (1960) lsquoA Necrotic Type ofPotato Virus Yrsquo American Potato Journal 37 221ndash8

Kraft K H et al (2014) lsquoMultiple Lines of Evidence for the Originof Domesticated Chili Pepper Capsicum annuum in MexicorsquoProceedings of the National Academy of Science USA 111 6165ndash70

Le Romancer M Kerlan C and Nedellec M (1994) lsquoBiologicalCharacterization of Various Geographical Isolates of PotatoVirus Y Inducing Superficial Necrosis on Potato Tubersrsquo PlantPathology 43 138ndash44

Le S Q and Gascuel G (2008) lsquoAn Improved General Amino AcidReplacement Matrixrsquo Molecular Biology and Evolution 25 1307ndash20

Lemey P et al (2009) lsquoIdentifying Recombinants in Human andPrimate Immunodeficiency Virus Sequence Alignments UsingQuartet Scanningrsquo BMC Bioinformatics 10 126

Loebenstein G et al (eds) (2001) Virus and Virus-Like Diseases ofPotatoes and Production of Seed-Potatoes Dordrecht KluwerAcademic Publishers

Lorenzen J H et al (2006) lsquoWhole Genome Characterization ofPotato virus Y Isolates Collected in the Western USA and TheirComparison to Isolates From Europe and Canadarsquo Archives ofVirology 151 1055ndash74

Martin D and Rybicki E (2000) lsquoRDP Detection of RecombinationAmongst Aligned Sequencesrsquo Bioinformatics 16 562ndash3

Martin D P et al (2005) lsquoA Modified BOOTSCAN Algorithm forAutomated Identification of Recombinant Sequences andRecombination Breakpointsrsquo AIDS Research and HumanRetroviruses 21 98ndash102

et al (2015) lsquoRDP4 Detection and Analysis of RecombinationPatterns in Virus Genomesrsquo Virus Evolution 1 1ndash5

Milne I et al (2009) lsquoTOPALi v2 a Rich Graphical Interface forEvolutionary Analyses of Multiple Alignments on HPC Clustersand Multi-Core Desktopsrsquo Bioinformatics 25 126ndash7

Maynard Smith J (1992) lsquoAnalyzing the Mosaic Structure ofGenesrsquo Journal of Molecular Evolution 34 126ndash9

McGuire G and Wright F (2000) lsquoTOPAL 20 ImprovedDetection of Mosaic Sequences Within Multiple AlignmentsrsquoBioinformatics 16 130ndash4

Moury B (2010) lsquoA New Lineage Sheds Light on the EvolutionaryHistory of Potato Virus Yrsquo Molecular Plant Pathology 11 161ndash8

et al (2002) lsquoEvidence for Diversifying Selection in Potatovirusy and in the Coat Protein of Other Potyvirusesrsquo Journal ofGeneral Virology 83 2563ndash73

and Simon V (2011) lsquodNdS-Based Methods DetectPositive Selection Linked to Trade-Offs Between DifferentFitness Traits in the Coat Protein of Potato Virus Yrsquo MolecularBiology and Evolution 28 2707ndash17

Nei M and Li W H (1979) lsquoMathematical Model for StudyingGenetic Variation in Terms of Restriction EndonucleasesrsquoProceedings of the National Academy of Science USA 76 5269ndash73

Nobrega N R and Silberschmidt K (1944) lsquoSobre uma provavelvariante do virus ldquoYrdquo da batatinha (Solanum virus 2 Orton)que tern a peculiaridade de provocar necroses em plantas defumorsquo Arquivos Do Instituto Biologico (Sao Paulo) 15 307ndash30

Ogawa T et al (2008) lsquoGenetic Structure of a Population ofPotato Virus Y Inducing Potato Tuber Necrotic RingspotDisease in Japan Comparison with North American andEuropean Populationsrsquo Virus Research 131 199ndash212

et al (2012) lsquoThe Genetic Structure of Populations of PotatoVirus Y in Japan Based on the Analysis of 20 Full GenomicSequencesrsquo Journal of Phytopathology 160 661ndash73

Ohshima K et al (2016) lsquoTemporal Analysis of Reassortmentand Molecular Evolution of Cucumber Mosaic Virus ExtraClues From Its Segmented Genomersquo Virology 487 188ndash97

Padidam M Sawyer S and Fauquet C M (1999) lsquoPossibleEmergence of New Geminiviruses by Frequent RecombinationrsquoVirology 265 218ndash25

Pickersgill B (2007) lsquoDomestication of Plants in the AmericasInsights From Mendelian and Molecular Geneticsrsquo Annals ofBotany 100 925ndash40

Pinhasi R Fort J and Ammerman A J (2005) lsquoTracing the Originand Spread of Agriculture in Europersquo PLoS Biology 3 e410

Posada D and Crandall K A (2001) lsquoEvaluation of Methods forDetecting Recombination From DNA Sequences ComputerSimulationsrsquo Proceedings of the National Academy of Science USA98 13757ndash62

Rambaut A et al (2016) lsquoExploring the Temporal Structure ofHeterochronous Sequences Using TempEstrsquo Virus Evolution 2vew007

Ramsden C Holmes E C and Charleston M A (2009)lsquoHantavirus Evolution in Relation to Its Rodent and InsectivoreHosts No Evidence for Codivergencersquo Molecular BiologyEvolution 26 143ndash53

Richardson D E (1958) lsquoSome Observations on the Tobacco VeinalNecrosis Strain of Potato Virus Yrsquo Plant Pathology 7 133ndash5

Rowley J S Gray S M and Karasev A V (2015) lsquoScreeningPotato Cultivars for New Sources of Resistance to Potato VirusYrsquo American Journal of Potato Research 92 38ndash48

Salaman R N (1954) lsquoThe Early European Potato Its Characterand Place of Originrsquo Journal of the Linnean Society Botany 53 1ndash27

and Hawkes J G (1949) lsquoThe Character of the EarlyEuropean Potatorsquo Proceedings of the Linnean Society London 16171ndash84

Schubert J Fomitcheva V and Sztangret-Wisniewska J (2007)lsquoDifferentiation of Potato Virus Y Strains Using Improved Setsof Diagnostic PCR-Primersrsquo Journal of Virological Methods 14066ndash74

Shimodaira H and Hasegawa M (1999) lsquoMultiple Comparisonsof Log-Likelihoods with Applications to PhylogeneticInferencersquo Molecular Biology and Evolution 16 1114ndash6

Silberschmidt K M (1960) lsquoTypes of Potato Virus Y Necrotic toTobacco History and Recent Observationrsquo American PotatoJournal 37 151

Singh R P et al (2008) lsquoDiscussion Paper The Naming ofPotato Virus Y Strains Infecting Potatorsquo Archives of Virology153 1ndash13

Smith K M (1931) lsquoOn the Composite Nature of Certain PotatoVirus Diseases of the Mosaic Group as Revealed by the Use ofPlant Indicators and Selective Methods of TransmissionrsquoProceedings of the Royal Society B 109 251ndash66

and Dennis R W G (1940) lsquoSome Notes on a SuggestedVariant of Solanum Virus 2 (Potato Virus Y)rsquo Annals of AppliedBiology 27 65ndash70

Spetz C et al (2003) lsquoMolecular Resolution of Complexes ofPotyviruses Infecting Solanaceous Crops at the Centre ofOrigin in Perursquo Journal of General Virology 84 2565ndash78

Stevenson W R et al (eds) (2001) Compendium of Potato Diseases2nd edn St Paul MN APS Press

A J Gibbs et al | 11

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

Tavare S (1986) lsquoSome Probabilistic and Statistical Problems inthe Analysis of DNA Sequencesrsquo Lectures on Mathematics in theLife Sciences 17 57ndash86

Tian Y P et al (2011) lsquoGenetic Diversity of Potato Virus YInfecting Tobacco Crops in Chinarsquo Phytopathology 10 377ndash87

To T H et al (2015) lsquoFast Dating Using Least-Squares Criteriaand Algorithmsrsquo Systematic Biology 65 82ndash97

Todd J M (1961) lsquoTobacco Veinal Necrosis on Potato in ScotlandControl of the Outbreak and Some Characters of the Virusrsquo inProceedings of the Fourth Conference on Potato Virus DiseasesBraunschweig 1960 pp 82ndash92

Visser J C and Bellstedt D U (2009) lsquoAn Assessment ofMolecular Variability and Recombination Patterns in SouthAfrican Isolates of Potato Virus Yrsquo Archives of Virology 1541891ndash900

and Pirie M D (2012) lsquoThe Recent RecombinantEvolution of a Major Crop Pathogen Potato Virus Yrsquo PLoS One7 e50631

Xia X (2013) lsquoDAMBE5 A Comprehensive Software Package forData Analysis in Molecular Biology and Evolutionrsquo MolecularBiology and Evolution 30 1720ndash8

12 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

  • vex002-TF7
  • vex002-TF8
  • vex002-TF1
  • vex002-TF2
  • vex002-TF3
  • vex002-TF4
  • vex002-TF5
  • vex002-TF6
Page 12: The phylogenetics of the global population of potato virus Y and … · 2019. 4. 30. · The phylogenetics of the global population of potato virus Y and its necrogenic recombinants

Tavare S (1986) lsquoSome Probabilistic and Statistical Problems inthe Analysis of DNA Sequencesrsquo Lectures on Mathematics in theLife Sciences 17 57ndash86

Tian Y P et al (2011) lsquoGenetic Diversity of Potato Virus YInfecting Tobacco Crops in Chinarsquo Phytopathology 10 377ndash87

To T H et al (2015) lsquoFast Dating Using Least-Squares Criteriaand Algorithmsrsquo Systematic Biology 65 82ndash97

Todd J M (1961) lsquoTobacco Veinal Necrosis on Potato in ScotlandControl of the Outbreak and Some Characters of the Virusrsquo inProceedings of the Fourth Conference on Potato Virus DiseasesBraunschweig 1960 pp 82ndash92

Visser J C and Bellstedt D U (2009) lsquoAn Assessment ofMolecular Variability and Recombination Patterns in SouthAfrican Isolates of Potato Virus Yrsquo Archives of Virology 1541891ndash900

and Pirie M D (2012) lsquoThe Recent RecombinantEvolution of a Major Crop Pathogen Potato Virus Yrsquo PLoS One7 e50631

Xia X (2013) lsquoDAMBE5 A Comprehensive Software Package forData Analysis in Molecular Biology and Evolutionrsquo MolecularBiology and Evolution 30 1720ndash8

12 | Virus Evolution 2017 Vol 3 No 1

Downloaded from httpsacademicoupcomvearticle-abstract31vex0023060989by Music Library School of Music National Institute of the Arts Australian National University useron 15 December 2017

  • vex002-TF7
  • vex002-TF8
  • vex002-TF1
  • vex002-TF2
  • vex002-TF3
  • vex002-TF4
  • vex002-TF5
  • vex002-TF6