New perspectives on the evolutionary history of hepatitis B virus genotype F


Molecular Phylogenetics and Evolution 59 (2011) 114–122


journal homepage: www.elsevier.com/locate/ympev


Carolina Torres, Flavia Guadalupe Piñeiro y Leone, Silvana Claudia Pezzano, Viviana Andrea Mbayed, Rodolfo Héctor Campos ⇑

Cátedra de Virología, Facultad de Farmacia y Bioquímica, Universidad de Buenos Aires, Junín 956, 4° piso, Ciudad Autónoma de Buenos Aires (C1113AAD), Argentina

Article info

Article history:
Received 26 April 2010
Revised 25 August 2010
Accepted 11 January 2011
Available online 4 February 2011

Keywords:
HBV genotype F
Phylogenetics
Phylogeography
Evolution
Bayesian coalescent analysis

Abstract

Hepatitis B virus (HBV) is a globally distributed human pathogen. The aim of this work was to analyze the evolutionary history of HBV genotype F, with emphasis on the subgenotypes prevalent in the Southern area of South America. Complete genomes of HBV genotype F from 36 samples from Argentina and Chile were sequenced and analyzed by phylogenetic and Bayesian coalescent methods, along with sequences obtained from the GenBank database. The phylogeography not only separated Central American from South American isolates but also revealed that different subgenotypes are distributed in constrained, although not exclusive, areas of the continent. The results obtained with time-stamped complete genomes failed to explain the wide geographical distribution and the clustering observed in this genotype. Conversely, the use of Bayesian coalescent analyses with substitution rates as priors, instead of the co-estimation of tMRCA and substitution rate, allowed us to propose a remote origin for HBV genotype F based on the phylogeographical and epidemiological data.

© 2011 Elsevier Inc. All rights reserved.

1055-7903/$ - see front matter © 2011 Elsevier Inc. All rights reserved.
doi:10.1016/j.ympev.2011.01.010

⇑ Corresponding author. Fax: +54 11 45083645.
E-mail addresses: [email protected] (C. Torres), [email protected] (F.G. Piñeiro y Leone), [email protected] (S.C. Pezzano), vmbayed@ffyb.uba.ar (V.A. Mbayed), [email protected] (R.H. Campos).

1. Introduction

Hepatitis B virus (HBV) is a globally distributed human pathogen. It is estimated that more than 2 billion people have been infected with HBV, and that about 350 million live with a chronic HBV infection (WHO, 2008).

HBV has a partially double-stranded DNA genome of around 3200 nt. It replicates through a reverse transcription step and contains four partially overlapping open reading frames encoding the polymerase, surface antigen, nucleocapsid (core), e antigen (HBeAg) and X proteins.

Different factors determine the nucleotide substitution rate of this virus, some with opposite effects, namely the lack of proofreading activity during reverse transcription and the overlapping of open reading frames. The former contributes to the generation of mutations (the mutation rate), whereas the latter constrains the rate at which mutations are fixed at the population level (the substitution rate) (Duffy et al., 2008). Thus, the nucleotide substitution rate for HBV has been estimated to be on the order of ~10⁻⁵ substitutions per site per year (s/s/y) (Fares and Holmes, 2002; Hannoun et al., 2000; Okamoto et al., 1987; Osiowy et al., 2006), lower than that for RNA viruses (~10⁻³–10⁻⁴ s/s/y) and higher than that for most DNA viruses (~10⁻⁷–10⁻⁹ s/s/y) (Holmes, 2008). However, using retrospective longitudinal analyses, HBV substitution rates have been calculated over a wide range (~10⁻⁴–10⁻⁵ s/s/y) depending on the HBeAg status, leading to the conclusion that viruses that do not express the e antigen (HBeAg (−)) would evolve faster than those that do express it (HBeAg (+)) (Hannoun et al., 2000; Wang et al., 2010).

Besides, overlapping and nonoverlapping genomic regions would display different substitution rates, which might interfere with inferring the timescale of the evolutionary history of HBV (Bollyky and Holmes, 1999; Fares and Holmes, 2002; Mizokami et al., 1997; Zhou and Holmes, 2007).

The eight major genotypes of HBV (A–H) described so far show a distinctive geographical distribution (Kramvis et al., 2008; Lindh et al., 1997; Norder et al., 1992). Genotypes F and H are thought to be indigenous to America since they have been found in the native population, mainly in Central and South America (Arauz-Ruiz et al., 2002; Blitz et al., 1998; Devesa et al., 2008; Nakano et al., 2001; von Meltzer et al., 2008). Genotype F isolates from different geographical regions are classified into four subgenotypes (F1–F4) (Arauz-Ruiz et al., 1997; Devesa et al., 2004; Mbayed et al., 2001; Piñeiro y Leone et al., 2008). However, the origin and the evolutionary history of HBV and, in particular, of genotype F remain uncertain (Holmes, 2008).

The aim of our work was to analyze the evolutionary history of HBV genotype F, with emphasis on subgenotypes F1b and F4, which are prevalent in the Southern area of South America.


2. Materials and methods

2.1. Sequences

Complete genomes of HBV genotype F (HBV/F) from 36 serum samples from unrelated patients from Argentina and Chile were sequenced. Argentinean samples (n = 27) came from Buenos Aires city (n = 17) and the Northern region, specifically from the Provinces of Chaco (n = 6), Formosa (n = 2) and Salta (n = 2). Chilean samples (n = 9) came from Santiago de Chile city.

Briefly, DNA was extracted from a 200 µl serum sample using the QIAamp DNA Mini Kit (Qiagen) according to the manufacturer's instructions, and eluted in 150 µl of elution buffer. Two nested PCRs were performed to obtain the complete genome of the isolates (Table 1). Samples were sequenced using an ABI automatic DNA sequencer.

Nucleotide sequences were deposited in GenBank under accession numbers FJ657519 to FJ657525, FJ657528, FJ657529, FJ709457 to FJ709460, FJ709462 to FJ709465, FJ709494, EU366116, EU366118, EU366132, EU366133, EF576808, EF576812, DQ776247, DQ776248 and DQ823086 to DQ823095.

2.2. Recombination detection

HBV/F sequences introduced in this work and HBV sequences obtained from the GenBank database (excluding sequences previously reported as recombinant) were aligned with the ClustalX v1.83 software (Thompson et al., 1997) and edited with the BioEdit v7.0.9.0 software (Hall, 1999). The RDP3 v3.44 software package (Martin, 2009) was used to identify recombinant sequences. Six different algorithms were applied (RDP, Si-Scan, Bootscan, Chimaera, MaxChi, and GENECONV) with their default settings. Sequences were to be considered recombinant, and excluded from any further analysis, if they were identified as recombinant by at least four of the six methods.
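The exclusion rule described above amounts to a vote across detection methods. A minimal sketch of that rule follows; the function name, the input structure and the toy calls are illustrative assumptions, not RDP3 output and not part of the original workflow.

```python
# Hypothetical sketch of the "at least four out of six methods" criterion.
# The sequence names and method calls below are invented for illustration.

def is_recombinant(flagging_methods, min_methods=4):
    """Return True if enough distinct methods flagged the sequence."""
    return len(set(flagging_methods)) >= min_methods

toy_calls = {
    "seq_A": ["RDP", "MaxChi"],                          # flagged by two methods
    "seq_B": ["RDP", "GENECONV", "Bootscan", "MaxChi"],  # flagged by four methods
    "seq_C": [],                                         # not flagged at all
}

# Sequences that would be excluded from further analysis under the rule
excluded = [name for name, methods in toy_calls.items() if is_recombinant(methods)]
print(excluded)
```

Under this rule a sequence flagged by only one or two methods is retained in the data set.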

2.3. Phylogenetic and phylogeographical analyses

Phylogenetic relationships on the complete genome were evaluated using the maximum likelihood (ML) and maximum parsimony (MP) methods. ML trees were obtained by heuristic searches and tree bisection and reconnection (TBR) branch swapping, using the PAUP* v4b10 software package (Swofford, 2003) and an appropriate nucleotide substitution model estimated using Modeltest v3.7 (Posada and Crandall, 1998) according to the Akaike Information Criterion (AIC). The selected model was the general time-reversible (GTR) model + gamma-distributed rates with four categories (Γ4) + proportion of invariant sites (I). MP trees were obtained by heuristic searches with 100 random addition sequences (RAS) and TBR branch swapping, using the TNT v1.1 software (Goloboff et al., 2000). The strict consensus of the 318 most parsimonious trees obtained was calculated. Robustness of the phylogenetic grouping was evaluated by bootstrap analyses using ML (1000 replicates) with the PhyML v2.4.4 software (Guindon and Gascuel, 2003), MP (1000 replicates) with the TNT software and the Neighbor Joining (NJ) method (10,000 replicates) with PAUP*. ML and NJ bootstrap trees were generated under the substitution model estimated using Modeltest. The MP bootstrap test was performed with 100 RAS + TBR cycles for each resampled matrix.

Lastly, the program BaTS (Parker et al., 2008) was used to test for the presence of statistically significant phylogeographical structure. In order to avoid non-independent sampling, a subset of the sequences was analyzed. The data set (n = 67) comprised sequences from known unrelated infections and sequences considered as not coming from a particular network of infection given that they come from different countries of origin (Additional file 1). The strength of the phylogeny–geographical region association was quantified using the parsimony score (PS) (Fitch, 1971) and the association index (AI) (Wang et al., 2001). As input for BaTS, the posterior distribution of trees arising from a Bayesian analysis on the complete genome was used. This analysis was carried out with the BEAST v1.4.8 software package (Drummond and Rambaut, 2007) using the uncorrelated lognormal (UCLN) molecular clock model (Drummond et al., 2006) and the Bayesian skyline plot (BSP) coalescent model of five steps (Drummond et al., 2005). Analyses were run up to convergence and 10% of the sampling was discarded as burn-in. Acceptable mixing was visualized with the Tracer v1.4 software (Rambaut and Drummond, 2007).

Table 1
Primers and amplification conditions.

PCR I
  1st round (3259 bp):
    P1ᵃ (forward, 1825–1846)
    P2ᵃ (reverse, 1828–1809)
  2nd round (3162 bp):
    HBV49 (forward, 1859–1885): 5′ ACTGTTCAAGCCTCCAAGCTGTGCCTT 3′
    HBV44 (reverse, 1805–1776): 5′ TGAACAGACCAATTTATGCCTACAGCCTCC 3′
  Amplification conditions: 5 min at 94 °C, 36 cycles of 1 min at 94 °C, 5 min at 68 °C (annealing and extension), with an increase of 2 min every 10 cycles, and a final extension of 15 min at 68 °C

PCR II
  1st round (846 bp):
    HBV53 (forward, 1639–1660): 5′ TGCCAACAGTCTTACATAAGMG 3′
    HBV2 (reverse, 2484–2465): 5′ CCCACCTTATGAGTCCAAGG 3′
  2nd round (544 bp):
    HBV55 (forward, 1704–1728): 5′ TACATCAAAGACTGTGTATTTAAGG 3′
    HBV50 (reverse, 2247–2222): 5′ GAACTGTTTCTCTTCCAAAAGTAAG 3′
  Amplification conditions: 5 min at 94 °C, 35 cycles of 1 min at 94 °C, 1 min at 52 °C (in the first round) and at 55 °C (in the second round), 1 min at 68 °C, and a final extension of 10 min at 68 °C

ᵃ Previously described (Günther et al., 1995).

2.4. Co-estimation of substitution rates and time to the most recent common ancestors (tMRCAs)

Sequences with an available year of isolation were introduced in a Bayesian coalescent analysis in order to co-estimate an overall substitution rate and the tMRCA for HBV/F and its subgenotypes. When the year of isolation was available as a range, the mean year was used. The data set included 63 HBV/F sequences from HBeAg (+) samples (Additional file 1). The analysis was performed under the GTR + Γ4 + I model of base substitution, both on the complete genome and on the nonoverlapping regions. Different molecular clock and demographic models implemented in the BEAST v1.4.8 software package were considered, i.e. strict and UCLN molecular clock models, as well as constant population size, exponential and expansion population growth, and BSP. In addition, a model that allows each codon position to have its own substitution rate was used on the nonoverlapping polymerase regions. Analyses were run up to convergence, 10% of the sampling was discarded as burn-in and acceptable mixing was visualized with the Tracer v1.4 software. Uncertainty in parameter estimates was evaluated with the 95% highest posterior density (HPD95%) interval. The Bayes Factor was used to select the model that best fits the data (Kass and Raftery, 1995; Suchard et al., 2001).
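For readers unfamiliar with the HPD interval, the sketch below shows one common way to obtain it from a sampled chain: discard the burn-in, then take the shortest interval that contains 95% of the remaining samples. It assumes a unimodal posterior; the function names and the toy chain are hypothetical and do not reproduce the Tracer implementation.

```python
def drop_burnin(chain, percent=10):
    """Discard the first `percent` % of the sampled states."""
    return chain[len(chain) * percent // 100:]

def hpd_interval(samples, mass=0.95):
    """Shortest interval containing `mass` of the samples (unimodal case)."""
    xs = sorted(samples)
    n = len(xs)
    k = max(1, int(round(mass * n)))  # number of samples inside the interval
    # Slide a window of k consecutive sorted samples; keep the narrowest one
    width, start = min((xs[i + k - 1] - xs[i], i) for i in range(n - k + 1))
    return xs[start], xs[start + k - 1]

# Toy chain (invented values): one extreme sample in the upper tail
toy_chain = list(range(1, 20)) + [1000]
low, high = hpd_interval(toy_chain)
print(low, high)
```

Unlike an equal-tailed interval, the HPD interval is not forced to cut the same probability mass from both tails, which matters for skewed posteriors such as those of node ages.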

In addition, the temporal signal of the data was studied through two approaches. On the one hand, the relationship between genetic divergence to the root and sampling time was evaluated using the Path-O-Gen v1.2 software (Rambaut, 2009). On the other hand, the BEAST analysis was repeated using sampling times randomly assigned to the tips, and the results were compared with those obtained with the real data. If the tMRCAs and substitution rates estimated from both BEAST analyses deviated substantially, with no overlap of their HPD95% intervals, significant temporal signal could be expected in this data set (Ramsden et al., 2009).
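The first approach is, in essence, a linear regression of root-to-tip divergence on sampling date: the slope gives a crude rate estimate, the x-intercept a crude root date, and the correlation coefficient indicates how clock-like the data are. The sketch below is a plain least-squares version under that assumption; it is not the Path-O-Gen code, and the toy dates and distances are invented.

```python
def root_to_tip_regression(dates, distances):
    """Least-squares fit of root-to-tip distance on sampling date.

    Returns (slope, x_intercept, r): the slope approximates the substitution
    rate, the x-intercept the date of the root, and r is Pearson's correlation.
    """
    n = len(dates)
    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5
    return slope, -intercept / slope, r

# Toy, perfectly clock-like data: rate 1e-4 s/s/y and a root dated 1900
toy_dates = [1990, 1995, 2000, 2005]
toy_distances = [1e-4 * (d - 1900) for d in toy_dates]
slope, root_date, r = root_to_tip_regression(toy_dates, toy_distances)
print(slope, root_date, r, r ** 2)
```

With real data the points scatter around the line, and a low r² signals that the sampling window may be too short, relative to the depth of the tree, to calibrate the clock from tip dates alone.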

2.5. Estimation of tMRCAs by calibration with mean substitution rates

A Bayesian coalescent analysis was performed in order to obtain tMRCA estimates by calibration with mean substitution rates (s/s/y): 0.5 × 10⁻⁵, 0.75 × 10⁻⁵, 1.0 × 10⁻⁵, 2.5 × 10⁻⁵, 5.0 × 10⁻⁵, 7.5 × 10⁻⁵, 10.0 × 10⁻⁵ and 25.0 × 10⁻⁵. Only HBeAg (+) samples were considered. The analysis was carried out on HBV/F sequences (n = 79) as a genotype group and on each subgenotype separately: F1b (HBV/F1b) (n = 24) and F4 (HBV/F4) (n = 15). Besides, the analysis was performed on the complete genome and on the nonoverlapping regions.

Different molecular clock and demographic models implemented in the BEAST software package were considered and analyses were performed as in Section 2.4.

In addition, the demographic reconstruction of the effective number of infections as a function of time was performed with the Tracer software. The effective number of infections can be considered as a measure of the relative genetic diversity and is given as Ne × τ, where Ne is the effective population size and τ is the generation time (Drummond and Rambaut, 2007).

Moreover, the maximum clade credibility tree (MCCT) was constructed with the TreeAnnotator v1.4.8 software (part of the BEAST software package) after discarding 10% of the sampling, and then visualized with the FigTree v1.1.2 software (Rambaut, 2008).

3. Results

3.1. Recombination detection

Overall, six sequences were identified as potential recombinants by one (n = 1) or two (n = 5) methods. Since none of them met the established criterion, no genome sequence was considered recombinant and all sequences were included in the analyses.

3.2. Phylogenetic and phylogeographical analyses

In order to evaluate alternative evidence about the phylogeny of HBV/F, different approaches were used. Congruent topologies were found between the phylogenetic trees obtained by the ML and MP methods (Fig. 1 and Additional file 2).

The phylogenetic analysis on the complete genome of HBV/F suggested a geographical association (Fig. 1). The presence of phylogeographical structure was demonstrated using the PS and AI statistics. When the sequences were labeled according to their country of origin, both statistics strongly rejected the hypothesis of panmixis (observed PS of 21.4, expected PS of 34.4, p < 0.01; observed AI of 2.8, expected AI of 6.0, p < 0.01).
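To make the PS statistic concrete: it is the minimum number of changes in the tip label (here, country) needed to explain the labels on a tree, so geographically clustered tips give a lower score than randomly mixed ones. The sketch below computes it with Fitch's algorithm on a single toy tree encoded as nested tuples. It is only an illustration under those assumptions; BaTS itself evaluates the statistic over a posterior sample of trees and builds the null distribution by randomizing tip labels, which is not reproduced here.

```python
def fitch_parsimony_score(tree, states):
    """Minimum number of state changes on a rooted binary tree (Fitch, 1971).

    tree   -- nested 2-tuples whose leaves are tip names (strings)
    states -- dict mapping each tip name to its label (e.g. country)
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):           # a tip: its state set is its label
            return {states[node]}
        left, right = (visit(child) for child in node)
        common = left & right
        if common:                          # children agree: no change needed
            return common
        changes += 1                        # children disagree: one change
        return left | right

    visit(tree)
    return changes

# Toy tree and labels (invented, not sequences from this study)
toy_tree = ((("a", "b"), ("c", "d")), ("e", "f"))
clustered = {"a": "Arg", "b": "Arg", "c": "Chi", "d": "Chi", "e": "Ven", "f": "Ven"}
mixed = {"a": "Arg", "b": "Chi", "c": "Ven", "d": "Arg", "e": "Chi", "f": "Ven"}

ps_clustered = fitch_parsimony_score(toy_tree, clustered)
ps_mixed = fitch_parsimony_score(toy_tree, mixed)
print(ps_clustered, ps_mixed)
```

In the toy example the clustered labeling needs fewer changes than the mixed one, which mirrors the direction of the comparison between observed and expected PS reported above.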

A split into two main groups ("FI" and "FII") was observed. "FI" corresponded to subgenotype F1, which split into Central American isolates (F1a) and South American isolates (F1b); and "FII" corresponded to subgenotypes F2–F4, which were found exclusively in South America (Figs. 1 and 2). Besides, HBV/F1b separated into three monophyletic groups. A first cluster was formed almost completely by intermingled Argentinean and Chilean isolates. A sequence from Venezuela and another from the USA were found in this group, being the only reported complete genome sequences from these regions. Peruvian sequences formed a second cluster, previously proposed as F1c (von Meltzer et al., 2008). Finally, a third cluster was formed by two Japanese isolates, although it was supported by a bootstrap value only with the ML method. An Argentinean sequence was located close to this group, but without bootstrap support.

Likewise, subgenotype F2 separated into two groups, previously proposed as F2a and F2b (Devesa et al., 2008). In addition, subgenotype F3 showed three groups: a Venezuelan, a Venezuelan and Colombian, and a Panamanian cluster.

Besides, Argentinean and Bolivian isolates were intermingled in HBV/F4, which presented two supported groups. One cluster was formed by isolates from the Provinces of Salta and Formosa, Buenos Aires city and Bolivia, while the other cluster was exclusively composed of isolates from Buenos Aires city.

3.3. Co-estimation of substitution rates and tMRCAs

In order to co-estimate an overall substitution rate and the tMRCA for HBV/F and its subgenotypes, we carried out a Bayesian coalescent analysis on a set of 63 complete genome sequences from HBeAg (+) samples with an available sampling time.

In general, regardless of the demographic model used, the Bayes Factor (BF) significantly favored the UCLN relaxed clock model over the strict molecular clock. However, for both the complete genome and the nonoverlapping regions, the BF did not favor any combination of UCLN-demographic model (constant population size, exponential and expansion population growth, and BSP) (Additional file 3). Here, we show the results obtained with the UCLN-BSP model, which can fit a wide range of demographic scenarios (Drummond et al., 2005). It is worth noting that all combinations of UCLN-demographic model produced similar results (Additional file 3).
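As a reminder of how such a comparison is read, the Bayes Factor is the ratio of the marginal likelihoods of two models, usually handled on the log scale, and Kass and Raftery (1995) proposed reference categories for 2 ln BF. The sketch below applies those categories to two log marginal likelihoods. The numerical values are placeholders chosen for illustration and are not results from this study; how the marginal likelihoods are estimated is outside the scope of the sketch.

```python
def ln_bayes_factor(ln_ml_model1, ln_ml_model0):
    """Natural log of the Bayes Factor of model 1 over model 0."""
    return ln_ml_model1 - ln_ml_model0

def kass_raftery_category(ln_bf):
    """Evidence category for model 1 based on 2 ln BF (Kass and Raftery, 1995)."""
    two_ln_bf = 2.0 * ln_bf
    if two_ln_bf < 0:
        return "evidence favors model 0"
    if two_ln_bf <= 2:
        return "not worth more than a bare mention"
    if two_ln_bf <= 6:
        return "positive"
    if two_ln_bf <= 10:
        return "strong"
    return "very strong"

# Placeholder log marginal likelihoods (hypothetical values)
ln_ml_relaxed = -12050.0
ln_ml_strict = -12080.0
ln_bf = ln_bayes_factor(ln_ml_relaxed, ln_ml_strict)
print(ln_bf, kass_raftery_category(ln_bf))
```

When the differences among models fall in the lowest category, as is typical when several demographic models fit equally well, the choice among them rests on other grounds, such as flexibility.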

Thus, the mean substitution rate and the mean tMRCA for the HBV/F complete genome were estimated at 1.67 × 10⁻⁴ s/s/y and 284 years (ys), respectively. The youngest mean tMRCA was obtained for HBV/F1b (23 ys), whereas the mean tMRCA was ~55–70 ys for the other subgenotypes. Similarly, for the nonoverlapping regions, the mean substitution rate and the mean tMRCA were estimated at 2.00 × 10⁻⁴ s/s/y and 329 ys, respectively (Table 2). Besides, using a substitution rate of 1.67 × 10⁻⁴ s/s/y for the complete genome, the demographic reconstruction of the effective number of infections showed an increase starting ~15 ys ago (Additional file 4).

In addition, similar results for the substitution rate and the tMRCA were obtained on the nonoverlapping polymerase region when a model that allows each codon position to have its own substitution rate was used (Additional file 6).

However, the temporal structure of the data could not be demonstrated, since the sequence divergence from the root of the tree did not show evidence of a strong correlation or linear relationship with sampling time (correlation coefficient = 0.62; r² = 0.38). The temporal signal was also in doubt given the overlap of the HPD95% intervals observed for the tMRCAs of HBV/F estimated from the randomly time-stamped and the real data sets. However, the substitution rates did not show an overlap of their HPD95% intervals (Additional file 3).

Fig. 1. Phylogenetic analysis of HBV/F. ML phylogenetic tree constructed on the complete genome of HBV/F HBeAg (+) and HBeAg (−) samples from Buenos Aires city (BA), Provinces of Chaco (CH), Salta (Sal) and Formosa (FOR), and from Chile (Chi). The numbers at nodes or in parentheses correspond to ML/MP bootstrap values (1000 replicates) higher than 70%. The sequences reported in this work are shown in bold and HBeAg (−) samples are shown in italics. Genotype H was used as outgroup. Sequences from GenBank are indicated by their corresponding accession numbers with an abbreviation of their country of origin. Arg: Argentina, Bol: Bolivia, Bra: Brazil, Col: Colombia, Fra: France, Jap: Japan, Nic: Nicaragua, Pan: Panama, Per: Peru, Ric: Costa Rica, ESa: El Salvador, USA: United States of America, Ven: Venezuela.

3.4. Estimation of the tMRCAs by calibration with mean substitution rates

We estimated the tMRCAs by calibration with different mean substitution rates that covered the range of those previously estimated for the HBV complete genomes (0.6 × 10⁻⁵–7.7 × 10⁻⁴ s/s/y) (Hannoun et al., 2000; Osiowy et al., 2006; Wang et al., 2010; Zhou and Holmes, 2007) and for the nonoverlapping regions (4.2 × 10⁻⁵ s/s/y) (Fares and Holmes, 2002).

Once more, the UCLN relaxed molecular clock model fitted the data better than the strict molecular clock. Also in this case, for both the complete genome and the nonoverlapping regions, the BF did not favor any particular combination of UCLN-demographic model for each mean substitution rate assayed (Additional file 5).

Fig. 2. Phylogeography of HBV/F in Central and South America. Only the names of the countries of origin of the samples analyzed in this work were placed on the map. ᵃ Only partial sequences are reported.

Table 2
Co-estimation of substitution rates and tMRCAsᵃ.

                            Complete genome                          Nonoverlapping regions
Group                       tMRCA (years)   HPD95% (years)           tMRCA (years)   HPD95% (years)
F1                          91              46–150                   102             43–184
F1a                         55              33–82                    54              28–85
F1b                         23              16–31                    24              14–33
F2                          70              41–109                   86              42–153
F3                          62              39–93                    61              33–100
F4                          69              37–110                   79              34–143
F                           284             120–501                  329             126–640
Substitution rate (s/s/y)   1.67 × 10⁻⁴     9.42 × 10⁻⁵–2.37 × 10⁻⁴  2.00 × 10⁻⁴     9.61 × 10⁻⁵–3.08 × 10⁻⁴

ᵃ Mean substitution rates and tMRCAs corresponding to the Bayesian coalescent analysis performed under the UCLN-BSP model by calibration with time-stamped sequences of HBV/F for the complete genome and the nonoverlapping regions.

Faster substitution rates led to lower values of tMRCA. Complete genomes of HBV/F showed tMRCA values from 8474 (0.5 × 10⁻⁵ s/s/y) to 163 ys (25 × 10⁻⁵ s/s/y), while those of HBV/F1b and HBV/F4 displayed values from 1416 and 1611 ys (0.5 × 10⁻⁵ s/s/y) to 29 and 32 ys (2.5 × 10⁻⁴ s/s/y), respectively (Fig. 3 and Additional file 5). Similar results were obtained when the sequences of HBV/F1b and HBV/F4 were analyzed as separate data sets (Additional file 7).

In addition, the nonoverlapping regions of HBV/F showed tMRCA values from 12,800 (0.5 × 10⁻⁵ s/s/y) to 271 ys (2.5 × 10⁻⁴ s/s/y), while those of HBV/F1b and HBV/F4 displayed values from 3096 and 1879 ys (0.5 × 10⁻⁵ s/s/y) to 61 and 39 ys (2.5 × 10⁻⁴ s/s/y), respectively (Fig. 3 and Additional file 3).
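The inverse relationship between the assumed rate and the tMRCA can be checked with simple arithmetic: when the rate is fixed, the product of rate and tMRCA approximates the genetic depth of the tree (substitutions per site from the root to the present), which should be roughly constant for a given alignment. The sketch below applies this to the two HBV/F complete genome values quoted above. The helper names are hypothetical, and exact proportionality is an assumption, since the relaxed clock and the coalescent prior also influence the estimates.

```python
# Mean tMRCA values (years) quoted above for the HBV/F complete genome,
# keyed by the fixed mean substitution rate (s/s/y) used as calibration.
reported_tmrca = {0.5e-5: 8474, 25e-5: 163}

def implied_depth(rate, tmrca_years):
    """Approximate substitutions per site accumulated since the root."""
    return rate * tmrca_years

def tmrca_for_rate(depth, rate):
    """tMRCA (years) expected for a given tree depth under a fixed rate."""
    return depth / rate

depths = {rate: implied_depth(rate, years) for rate, years in reported_tmrca.items()}
for rate, depth in depths.items():
    print(f"rate {rate:.1e} s/s/y -> implied depth {depth:.5f} subs/site")
```

Both products come out close to 0.04 substitutions per site, so a 50-fold change in the assumed rate translates into an approximately 50-fold change in the inferred age of the genotype.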

For the complete genome, the demographic reconstruction of the effective number of infections showed that the mean value at present was almost tenfold higher than the historical value, which seems to have remained constant for thousands of years.

Given that the HBV substitution rates have been previously estimated in a range of 0.6 × 10⁻⁵–6.1 × 10⁻⁵ s/s/y for HBeAg (+) samples, an "intermediate" nucleotide substitution rate of 1 × 10⁻⁵ s/s/y was chosen in this work to test one of the possible scenarios for the evolutionary history of this virus. This substitution rate would be in agreement with an HBV/F ancestor of ~4400 ys and HBV/F1b and HBV/F4 ancestors of ~700 and ~800 ys, respectively (Figs. 3 and 4). Likewise, under this condition, an increase in the effective number of infections would have started ~250 ys ago (Fig. 4). However, the results obtained with the BSP model should be interpreted cautiously given that sequences have been sampled from a geographically structured population (Drummond et al., 2005).

4. Discussion

HBV has a complex evolutionary history. Its study is complicated by several factors, such as the small size and the overlapping nature of the genome, which make it difficult to accurately estimate substitution rates and divergence times (Holmes, 2008).

Only a few complete genomes belonging to all the F subgenotypes were available in the database, thus limiting the adequate representation of this genotype in the evolutionary analysis of HBV. Here, we introduce 36 complete genomes with the aim of thoroughly studying the evolutionary history of HBV/F.

To our knowledge, the phylogenetic analysis of HBV/F presented here is the most comprehensive study of this genotype carried out so far. By using the ML and MP methods, HBV/F showed deep clustering and geographical structure. The phylogeography revealed that different subgenotypes are distributed in constrained, although not exclusive, areas. That is, subgenotypes F2 and F3 co-circulate in the Northern area of South America, while subgenotype F2 was also found in Brazil. Similarly, F4 was associated with the Central-South area of South America: Bolivia, Argentina and the Southern area of Brazil. In contrast, subgenotype F1 split into Central American (F1a) and South American (F1b) isolates, and it also showed inner groups separating Peruvian F1b isolates (Central area of South America) from Argentinean and Chilean F1b isolates (Southern area of South America). Besides, despite the geographical associations described, some exceptions were observed. Possibly, they represent recent migration processes, since the low prevalence of the subgenotypes involved argues against a longstanding evolutionary history in those locations (Devesa et al., 2008; Halfon et al., 2006; Kato et al., 2005; Matsuura et al., 2009).

Thus, the clustering observed in the phylogeny of HBV/F and the presence of specific lineages in particular regions seem to imply that a deep historical association has taken place.

Since it has been suggested that sequences from patients in the HBeAg (−) phase would have higher substitution rates (Hannoun et al., 2000) and that mainly HBeAg (+) viruses would be involved in transmission (Milich and Liang, 2003), only sequences with the HBeAg (+) phenotype were included in our coalescent analyses.

Hence, in order to explore different hypotheses about the timescale of the HBV/F evolutionary history, we applied relaxed phylogenetic methods in a Bayesian framework to obtain information about evolutionary rates and ancestral divergence dates.

Fig. 3. tMRCA values for different substitution rates. Substitution rates and tMRCA values corresponding to the Bayesian coalescent analyses performed by calibration with mean substitution rates under the UCLN-BSP model for the complete genome (cg) and the nonoverlapping regions (nr). The shaded area covers the substitution rates that better fit the phylogeographical data of HBV/F. HPD95% intervals are not shown for clarity (they are available in the Additional data 5). Axes are in log scale.

In a previous study, this methodology was used to date earlyevents of HBV diversification by using time-stamped data; how-ever, the data set used in that work comprised only a few HBV/Fsequences and those from our region were not represented (Zhouand Holmes, 2007).

Using that approach, we obtained ancestral ages that would imply a recent diversification process of HBV/F (284 years) and of its subgenotypes (e.g. 23 and 69 years for HBV/F1b and HBV/F4, respectively). Likewise, according to that approach, the expansion of the effective number of HBV/F infections in the Americas would have occurred ~15 years ago, which is epidemiologically unrealistic. In contrast, epidemiological data such as the presence of specific subgenotypes in native populations of South America (Blitz et al., 1998; Devesa et al., 2008; Nakano et al., 2001) suggest a longer evolutionary process within those groups. The scarce interrelation of these populations with other communities argues against a more recent introduction of the virus into those populations.

As a consequence, neither our results nor previous analyses using time-stamped sequences seem to explain the wide distribution of HBV/F in America and the diversification of its subgenotypes. Therefore, the demographic and genetic processes underlying the geographical structure would require a longer time period than the one calculated by using time-stamped sequences.

The high substitution rate obtained with time-stamped samples might be explained by the fact that the analysis of heterochronous sequences may be adversely affected if the differences in the dates associated with the tips of the tree do not comprise a significant proportion of the age of the entire tree (Drummond et al., 2002, 2003; Drummond and Rambaut, 2007; Rambaut, 2000). That is, given that fossil records of viruses are not available, the reconstruction of the evolutionary history of viruses has to be based on coalescent analyses of sequences isolated during the last 20–30 years; but if the “real” phylogeny of HBV/F were much older, a calibration with that short time period would probably not lead to appropriate estimates. Besides, the temporal structure of the data could not be demonstrated, and thus the substitution rate estimated from time-stamped sequences might not be accurate.
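As a rough illustration of why a 20–30 year sampling window may carry little temporal signal, the sketch below computes (i) the share of the tree age covered by the sampling dates and (ii) the expected number of substitutions that a single lineage would accumulate over that window. The genome length of 3200 nt and the 25-year window are assumptions chosen for illustration, and the function names are hypothetical; none of this reproduces the analyses performed here.

```python
def window_fraction(sampling_window_years, tree_age_years):
    """Share of the total tree age that the sampling dates cover."""
    if tree_age_years <= 0:
        raise ValueError("tree age must be positive")
    return sampling_window_years / tree_age_years


def expected_substitutions(rate_per_site_per_year, genome_length_nt, years):
    """Expected substitutions accumulated along one lineage over `years`."""
    return rate_per_site_per_year * genome_length_nt * years


GENOME_LENGTH_NT = 3200  # assumption: round figure for the HBV genome length
SAMPLING_WINDOW = 25     # assumption: midpoint of the 20-30 year sampling span

# Tree ages taken from the text: 284 years (time-stamped analysis) and
# ~4400 years (analysis calibrated with 1e-5 s/s/y).
for tree_age in (284, 4400):
    share = window_fraction(SAMPLING_WINDOW, tree_age)
    print(f"tree age {tree_age} y: window covers {share:.2%} of the tree")

subs = expected_substitutions(1e-5, GENOME_LENGTH_NT, SAMPLING_WINDOW)
print(f"expected substitutions per genome over the window: {subs:.2f}")
```

Under these assumptions, at 1 × 10⁻⁵ s/s/y a lineage would accumulate on average less than one substitution over the whole genome in 25 years, and the window would cover under 1% of a ~4400-year-old tree, compared with roughly 9% of a 284-year-old one.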

The substitution rate of HBV is uncertain and difficult to determine with precision (Holmes, 2008). There are few studies in which HBV substitution rates were calculated on the complete genome (Hannoun et al., 2000; Okamoto et al., 1987; Osiowy et al., 2006; Wang et al., 2010; Zhou and Holmes, 2007). Indeed, only two of those works treated HBeAg (+) samples separately, obtaining substitution rates of 0.6 × 10⁻⁵ and 3.0 × 10⁻⁵ s/s/y (for the median) and of 2.0 × 10⁻⁵ and 6.1 × 10⁻⁵ s/s/y (for the mean) in longitudinal analyses (Hannoun et al., 2000; Wang et al., 2010).

Besides, despite the influence of the method used to perform the estimations, it has been proposed that substitution rates measured in the short term may reflect the intrinsic rate of mutation better than the long-term rate of nucleotide substitution (Holmes, 2008). This long-term rate should reflect an average rate of change of the transmissible virus, i.e. of HBeAg (+).

Consequently, in order to consider alternative hypotheses about the time to the most recent common ancestor of HBV/F, the time of possible emergence of the F subgenotypes and the rates of evolution implied, we applied the Bayesian coalescent approach by assaying a range of substitution rates. The results suggest that, for the complete genome and the nonoverlapping regions, a long-term HBV substitution rate of ~1 × 10⁻⁵ s/s/y would be in agreement with the ancestral ages for HBV/F and its subgenotypes that best fit the phylogeographical and demographic data.
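The inverse relationship between the assumed rate and the resulting age can be sketched with a simple strict-clock rescaling. This is only a back-of-the-envelope approximation: it assumes that the root-to-tip divergence stays fixed, whereas the analyses described here used a relaxed clock (UCLN) with a BSP tree prior, so the actual estimates need not scale exactly. The anchor (~4400 years at 1 × 10⁻⁵ s/s/y) and the alternative rates are taken from the text; the function names are hypothetical.

```python
def rescale_tmrca(anchor_tmrca_years, anchor_rate, new_rate):
    """tMRCA under `new_rate`, assuming a fixed root-to-tip divergence."""
    if anchor_rate <= 0 or new_rate <= 0:
        raise ValueError("rates must be positive")
    divergence = anchor_tmrca_years * anchor_rate  # substitutions per site
    return divergence / new_rate


def rate_for_tmrca(anchor_tmrca_years, anchor_rate, target_tmrca_years):
    """Rate that would be needed to obtain `target_tmrca_years`."""
    if anchor_rate <= 0 or target_tmrca_years <= 0:
        raise ValueError("rate and target age must be positive")
    return anchor_tmrca_years * anchor_rate / target_tmrca_years


ANCHOR_TMRCA = 4400  # years, HBV/F age quoted in the text
ANCHOR_RATE = 1e-5   # s/s/y, rate used for that calibration

# HBeAg (+) rates quoted in the text (Hannoun et al., 2000; Wang et al., 2010)
for rate in (0.6e-5, 2.0e-5, 3.0e-5, 6.1e-5):
    age = rescale_tmrca(ANCHOR_TMRCA, ANCHOR_RATE, rate)
    print(f"rate {rate:.1e} s/s/y -> tMRCA of about {age:.0f} years")

# Rate that a strict-clock rescaling would associate with a 284-year tMRCA
print(f"{rate_for_tmrca(ANCHOR_TMRCA, ANCHOR_RATE, 284):.2e} s/s/y")
```

With that anchor, the four HBeAg (+) rates quoted above would map to roughly 7300, 2200, 1500 and 700 years, and an age of 284 years would correspond to a rate on the order of 1.5 × 10⁻⁴ s/s/y. This last figure is an illustrative back-calculation, not a value reported here.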

In addition, beyond the estimated ancestral ages for HBV/F (~4400 years), HBV/F1b (~700 years) and HBV/F4 (~800 years), viruses can only emerge into human populations when they are adapted to do so and are provided with the appropriate host exposure (Holland, 1996). The emergence of subgenotypes in America would have required favorable transmission circumstances, including a critical host population size, an increase in the population, migration processes, changes in behavioral patterns, a war or civil conflict, etc. (Holmes, 2008; Holland, 1996; Morse, 1995). We speculate that during the first viral diversification events, the virus was transmitted at a low endemic level until emergence conditions appeared. Since then, the current subgenotypes have settled in particular regions. Those conditions would have been present in Central and South America at least ~300 years ago, given that the post-colonization and independence periods were characterized by large military conflicts and migration processes, with an incessant increase of the Latin American population since the 18th century (UNSD, 1999). Concurrently, some HBV/F lineages could have disappeared because of the extreme population reduction caused by the conquest during the 16th and 17th centuries.

Fig. 4. MCCT and BSP for a substitution rate of 1 × 10⁻⁵ s/s/y. Maximum clade credibility tree (MCCT) and BSP performed under the UCLN-BSP model by calibration with 1 × 10⁻⁵ s/s/y for the complete genome of HBV/F HBeAg (+) samples. The HPD95% values are shown in the shaded areas. For the tMRCA of HBV/F, only the lower value of the HPD95% interval is shown as a shaded area (MCCT) or as a dotted line (BSP), for clarity.

120 C. Torres et al. / Molecular Phylogenetics and Evolution 59 (2011) 114–122

However, the history of the spread of HBV/F in America remains elusive. It has been suggested that HBV/F could have entered the continent across the Bering Strait with the first settlers that came to America from Asia (Arauz-Ruiz et al., 2002). Nevertheless, the timing and the routes of the peopling of America have been subject to controversy. Different models have been proposed, with migration dates ranging from ~12,000 years ago to ~30,000–40,000 years ago (Bonatto and Salzano, 1997; Endicott and Ho, 2008; Forster et al., 1996; Goebel et al., 2008; Schurr and Sherry, 2004; Silva et al., 2002; Torroni et al., 1993). In this context, a long-term substitution rate of at least ~1 × 10⁻⁵ s/s/y would imply ancestral times for HBV/F that are in agreement with the timescale of the latest stage of the peopling of America. Alternatively, one or several jumps from another carrier host to humans would be another possibility to explain the current distribution of HBV/F in the continent.

5. Conclusions

Using a large number of sequences and an extensive phylogenetic analysis, we described in depth the phylogeographical structure of HBV/F, with emphasis on subgenotypes F1b and F4. Besides, the use of Bayesian coalescent analyses calibrated with external substitution rates, instead of the co-estimation of tMRCA and substitution rate, allowed us to propose a remote origin for HBV/F based on the phylogeographical and epidemiological data.

Finally, to date, it has not been possible to infer with certainty when the first HBV/F diversification events took place or how the subgenotypes arose and spread; however, this should not discourage further research on the matter.

Acknowledgments

This work was supported by grants from Universidad de Buenos Aires (SECyT-UBA2008 B037), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET; PIP 112-200801-01169), and Agencia Nacional de Promoción Científica y Tecnológica (ANPCyT; PICT2004 25355).

Appendix A. Supplementary material

Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.ympev.2011.01.010.

References

Arauz-Ruiz, P., Norder, H., Robertson, B.H., Magnius, L.O., 2002. Genotype H: a new Amerindian genotype of hepatitis B virus revealed in Central America. J. Gen. Virol. 83, 2059–2073.

Arauz-Ruiz, P., Norder, H., Visona, K.A., Magnius, L.O., 1997. Molecular epidemiology of hepatitis B virus in Central America reflected in the genetic variability of the small S gene. J. Infect. Dis. 176, 851–858.

Blitz, L., Pujol, F.H., Swenson, P.D., Porto, L., Atencio, R., Araujo, M., Costa, L., Monsalve, D.C., Torres, J.R., Fields, H.A., Lambert, S., Van Geyt, C., Norder, H., Magnius, L.O., Echevarria, J.M., Stuyver, L., 1998. Antigenic diversity of hepatitis B virus strains of genotype F in Amerindians and other population groups from Venezuela. J. Clin. Microbiol. 36, 648–651.

Bollyky, P.L., Holmes, E.C., 1999. Reconstructing the complex evolutionary history of hepatitis B virus. J. Mol. Evol. 49, 130–141.

Bonatto, S.L., Salzano, F.M., 1997. Diversity and age of the four major mtDNA haplogroups, and their implications for the peopling of the New World. Am. J. Hum. Genet. 61, 1413–1423.

Devesa, M., Loureiro, C.L., Rivas, Y., Monsalve, F., Cardona, N., Duarte, M.C., Poblete, F., Gutierrez, M.F., Botto, C., Pujol, F.H., 2008. Subgenotype diversity of hepatitis B virus American genotype F in Amerindians from Venezuela and the general population of Colombia. J. Med. Virol. 80, 20–26.

Devesa, M., Rodriguez, C., Leon, G., Liprandi, F., Pujol, F.H., 2004. Clade analysis and surface antigen polymorphism of hepatitis B virus American genotypes. J. Med. Virol. 72, 377–384.

Drummond, A.J., Ho, S.Y., Phillips, M.J., Rambaut, A., 2006. Relaxed phylogenetics and dating with confidence. PLoS Biol. 4, e88.

Drummond, A.J., Nicholls, G.K., Rodrigo, A.G., Solomon, W., 2002. Estimating mutation parameters, population history and genealogy simultaneously from temporally spaced sequence data. Genetics 161, 1307–1320.

Drummond, A.J., Pybus, O.G., Rambaut, A., Forsberg, R., Rodrigo, A.G., 2003. Measurably evolving populations. Trends Ecol. Evol. 18, 481–488.

Drummond, A.J., Rambaut, A., 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol. Biol. 7, 214.

Drummond, A.J., Rambaut, A., Shapiro, B., Pybus, O.G., 2005. Bayesian coalescent inference of past population dynamics from molecular sequences. Mol. Biol. Evol. 22, 1185–1192.

Duffy, S., Shackelton, L.A., Holmes, E.C., 2008. Rates of evolutionary change in viruses: patterns and determinants. Nat. Rev. Genet. 9, 267–276.

Endicott, P., Ho, S.Y., 2008. A Bayesian evaluation of human mitochondrial substitution rates. Am. J. Hum. Genet. 82, 895–902.

Fares, M.A., Holmes, E.C., 2002. A revised evolutionary history of hepatitis B virus (HBV). J. Mol. Evol. 54, 807–814.

Fitch, W.M., 1971. Toward defining the course of evolution: minimum change for a specific tree topology. Syst. Zool. 20, 406–416.

Forster, P., Harding, R., Torroni, A., Bandelt, H.J., 1996. Origin and evolution of Native American mtDNA variation: a reappraisal. Am. J. Hum. Genet. 59, 935–945.

Goebel, T., Waters, M.R., O’Rourke, D.H., 2008. The late Pleistocene dispersal of modern humans in the Americas. Science 319, 1497–1502.

Goloboff, P., Farris, S., Nixon, K., 2000. TNT (Tree Analysis Using New Technology) (BETA).

Guindon, S., Gascuel, O., 2003. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. 52, 696–704.

Günther, S., Li, B.C., Miska, S., Kruger, D.H., Meisel, H., Will, H., 1995. A novel method for efficient amplification of whole hepatitis B virus genomes permits rapid functional analysis and reveals deletion mutants in immunosuppressed patients. J. Virol. 69, 5437–5444.

Halfon, P., Bourliere, M., Pol, S., Benhamou, Y., Ouzan, D., Rotily, M., Khiri, H., Renou, C., Penaranda, G., Saadoun, D., Thibault, V., Serpaggi, J., Varastet, M., Tainturier, M.H., Poynard, T., Cacoub, P., 2006. Multicentre study of hepatitis B virus genotypes in France. Correlation with liver fibrosis and hepatitis B e antigen status. J. Viral Hepat. 13, 329–335.

Hall, T.A., 1999. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acid Symp. Ser. 41, 95–98.

Hannoun, C., Horal, P., Lindh, M., 2000. Long-term mutation rates in the hepatitis B virus genome. J. Gen. Virol. 81, 75–83.

Holmes, E.C., 2008. Evolutionary history and phylogeography of human viruses. Annu. Rev. Microbiol. 62, 307–328.

Holland, J.J., 1996. Evolving virus plagues. Proc. Natl Acad. Sci. USA 93, 545–546.

Kass, R.E., Raftery, A.E., 1995. Bayes factors. J. Am. Stat. Assoc. 90, 773–795.

Kato, H., Fujiwara, K., Gish, R.G., Sakugawa, H., Yoshizawa, H., Sugauchi, F., Orito, E., Ueda, R., Tanaka, Y., Kato, T., Miyakawa, Y., Mizokami, M., 2005. Classifying genotype F of hepatitis B virus into F1 and F2 subtypes. World J. Gastroenterol. 11, 6295–6304.

Kramvis, A., Arakawa, K., Yu, M.C., Nogueira, R., Stram, D.O., Kew, M.C., 2008. Relationship of serological subtype, basic core promoter and precore mutations to genotypes/subgenotypes of hepatitis B virus. J. Med. Virol. 80, 27–46.

Lindh, M., Andersson, A.S., Gusdal, A., 1997. Genotypes, nt 1858 variants, and geographic origin of hepatitis B virus – large-scale analysis using a new genotyping method. J. Infect. Dis. 175, 1285–1293.

Martin, D.P., 2009. Recombination detection and analysis using RDP3. Methods Mol. Biol. 537, 185–205.

Matsuura, K., Tanaka, Y., Hige, S., Yamada, G., Murawaki, Y., Komatsu, M., Kuramitsu, T., Kawata, S., Tanaka, E., Izumi, N., Okuse, C., Kakumu, S., Okanoue, T., Hino, K., Hiasa, Y., Sata, M., Maeshiro, T., Sugauchi, F., Nojiri, S., Joh, T., Miyakawa, Y., Mizokami, M., 2009. Distribution of hepatitis B virus genotypes among patients with chronic infection in Japan shifting toward an increase of genotype A. J. Clin. Microbiol. 47, 1476–1483.

Mbayed, V.A., Barbini, L., Lopez, J.L., Campos, R.H., 2001. Phylogenetic analysis of the hepatitis B virus (HBV) genotype F including Argentine isolates. Arch. Virol. 146, 1803–1810.

Milich, D., Liang, T.J., 2003. Exploring the biological basis of hepatitis B e antigen in hepatitis B virus infection. Hepatology 38, 1075–1086.

Mizokami, M., Orito, E., Ohba, K., Ikeo, K., Lau, J.Y., Gojobori, T., 1997. Constrained evolution with respect to gene overlap of hepatitis B virus. J. Mol. Evol. 44 (Suppl. 1), S83–S90.

Morse, S.S., 1995. Factors in the emergence of infectious diseases. Emerg. Infect. Dis. 1, 7–15.

Nakano, T., Lu, L., Hu, X., Mizokami, M., Orito, E., Shapiro, C., Hadler, S., Robertson, B., 2001. Characterization of hepatitis B virus genotypes among Yucpa Indians in Venezuela. J. Gen. Virol. 82, 359–365.

Norder, H., Couroucé-Pauty, A.M., Magnius, L.O., 1992. Molecular basis of hepatitis B virus serotype variations within the four major subtypes. J. Gen. Virol. 73, 3141–3145.

Okamoto, H., Imai, M., Kametani, M., Nakamura, T., Mayumi, M., 1987. Genomic heterogeneity of hepatitis B virus in a 54-year-old woman who contracted the infection through materno-fetal transmission. Jpn. J. Exp. Med. 57, 231–236.


Osiowy, C., Giles, E., Tanaka, Y., Mizokami, M., Minuk, G.Y., 2006. Molecular evolution of hepatitis B virus over 25 years. J. Virol. 80, 10307–10314.

Parker, J., Rambaut, A., Pybus, O.G., 2008. Correlating viral phenotypes with phylogeny: accounting for phylogenetic uncertainty. Infect. Genet. Evol. 8, 239–246.

Piñeiro y Leone, F.G., Pezzano, S.C., Torres, C., Rodriguez, C.E., Garay, M.E., Fainboim, H.A., Remondegui, C., Sorrentino, A.P., Mbayed, V.A., Campos, R.H., 2008. Hepatitis B virus genetic diversity in Argentina: dissimilar genotype distribution in two different geographical regions; description of hepatitis B surface antigen variants. J. Clin. Virol.

Posada, D., Crandall, K.A., 1998. MODELTEST: testing the model of DNA substitution. Bioinformatics 14, 817–818.

Rambaut, A., 2000. Estimating the rate of molecular evolution: incorporating non-contemporaneous sequences into maximum likelihood phylogenies. Bioinformatics 16, 395–399.

Rambaut, A., 2008. FigTree v1.1.2 Software.

Rambaut, A., 2009. Path-O-Gen Program. Temporal Signal Investigation Tool.

Rambaut, A., Drummond, A.J., 2007. Tracer v1.4. <http://beast.bio.ed.ac.uk/Tracer>.

Ramsden, C., Holmes, E.C., Charleston, M.A., 2009. Hantavirus evolution in relation to its rodent and insectivore hosts: no evidence for codivergence. Mol. Biol. Evol. 26, 143–153.

Schurr, T.G., Sherry, S.T., 2004. Mitochondrial DNA and Y chromosome diversity and the peopling of the Americas: evolutionary and demographic evidence. Am. J. Hum. Biol. 16, 420–439.

Silva Jr., W.A., Bonatto, S.L., Holanda, A.J., Ribeiro-Dos-Santos, A.K., Paixao, B.M., Goldman, G.H., Abe-Sandes, K., Rodriguez-Delfin, L., Barbosa, M., Paco-Larson, M.L., Petzl-Erler, M.L., Valente, V., Santos, S.E., Zago, M.A., 2002. Mitochondrial genome diversity of Native Americans supports a single early entry of founder populations into America. Am. J. Hum. Genet. 71, 187–192.

Suchard, M.A., Weiss, R.E., Sinsheimer, J.S., 2001. Bayesian selection of continuous-time Markov chain evolutionary models. Mol. Biol. Evol. 18, 1001–1013.

Swofford, D.L., 2003. PAUP*: Phylogenetic Analysis Using Parsimony (*and Other Methods). Version 4.0b10. Sinauer Associates, Sunderland, Massachusetts.

Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F., Higgins, D.G., 1997. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25, 4876–4882.

Torroni, A., Schurr, T.G., Cabell, M.F., Brown, M.D., Neel, J.V., Larsen, M., Smith, D.G., Vullo, C.M., Wallace, D.C., 1993. Asian affinities and continental radiation of the four founding Native American mtDNAs. Am. J. Hum. Genet. 53, 563–590.

UNSD, 1999. The World at Six Billion. United Nations Statistical Division.

von Meltzer, M., Vasquez, S., Sun, J., Wendt, U.C., May, A., Gerlich, W.H., Radtke, M., Schaefer, S., 2008. A new clade of hepatitis B virus subgenotype F1 from Peru with unusual properties. Virus Genes 37, 225–230.

Wang, H.Y., Chien, M.H., Huang, H.P., Chang, H.C., Wu, C.C., Chen, P.J., Chang, M.H., Chen, D.S., 2010. Distinct hepatitis B virus dynamics in the immunotolerant and early immunoclearance phases. J. Virol. 84, 3454–3463.

Wang, T.H., Donaldson, Y.K., Brettle, R.P., Bell, J.E., Simmonds, P., 2001. Identification of shared populations of human immunodeficiency virus type 1 infecting microglia and tissue macrophages outside the central nervous system. J. Virol. 75, 11686–11699.

WHO, 2008. Hepatitis B. Fact Sheet No. 204. World Health Organization (WHO).

Zhou, Y., Holmes, E.C., 2007. Bayesian estimates of the evolutionary rate and age of hepatitis B virus. J. Mol. Evol. 65, 197–205.