Molecular Phylogenetics and Evolution 66 (2013) 737–747
journal homepage: www.elsevier.com/locate/ympev

Fourfold paralogy regions on human HOX-bearing chromosomes: Role of ancient segmental duplications in the evolution of vertebrate genome

Zainab Asrar, Farhan Haq, Amir Ali Abbasi*
National Center for Bioinformatics, Program of Comparative and Evolutionary Genomics, Faculty of Biological Sciences, Quaid-i-Azam University, Islamabad 45320, Pakistan

* Corresponding author. E-mail address: [email protected] (A.A. Abbasi).

Article history: Received 10 June 2012; Revised 27 October 2012; Accepted 29 October 2012; Available online 7 November 2012

Keywords: Multigene family; HOX cluster; Phylogeny; Segmental duplication; Whole genome duplication; Intra-genomic synteny

1055-7903/$ - see front matter © 2012 Elsevier Inc. All rights reserved.
http://dx.doi.org/10.1016/j.ympev.2012.10.024

Abstract

Background: Susumu Ohno's idea that modern vertebrates are degenerate polyploids (a concept referred to as the 2R hypothesis) has been the subject of intense debate for the past four decades. It was proposed that intra-genomic synteny regions (paralogons) in the human genome are remains of ancient polyploidization events that occurred early in vertebrate history. The quadruplicated paralogon centered on the human HOX clusters is taken as evidence that human HOX-bearing chromosomes were structured by two rounds of whole genome duplication (WGD) events.

Results: The evolutionary history of human HOX-bearing chromosomes (chromosomes 2/7/12/17) was evaluated by the phylogenetic analysis of multigene families with triplicated or quadruplicated distribution on these chromosomes. The topology comparison approach categorized the members of 44 families into four distinct co-duplicated groups. Distinct gene families belonging to a particular co-duplicated group exhibit similar evolutionary history and hence have duplicated simultaneously, whereas genes of two distinct co-duplicated groups do not share their evolutionary history and have not duplicated in concert with each other.

Conclusion: The recovery of co-duplicated groups suggests that "ancient segmental duplications and rearrangements" is the most rational model of evolutionary events that have generated the triplicated and quadruplicated paralogy regions seen on the human HOX-bearing chromosomes.

1. Introduction

To explain the genetic basis of increasing developmental and morphological complexity during the early history of vertebrates, Susumu Ohno postulated that two or multiple rounds of whole genome duplications (WGDs) might have occurred at the root of the early vertebrate lineage (Ohno, 1970, 1973). This notion, popularly referred to as the 2R hypothesis (two rounds of WGDs), has been intensely debated over the years (Abbasi, 2008, 2010b; Abbasi et al., 2009; Abbasi and Hanif, 2012; Donoghue and Purnell, 2005; Furlong and Holland, 2002, 2004; Hughes and Friedman, 2003; Kasahara, 2007; Martin, 2001, 1999; Skrabanek and Wolfe, 1998). Among the substantial evidence adduced in favor of ancient vertebrate polyploidy (genome duplications), the most widely cited is the existence of paralogons or paralogous genomic segments in vertebrate genomes: homologous chromosomal segments within the genome sharing similar sets of genes (Dehal and Boore, 2005; Furlong and Holland, 2002; Gibson and Spring, 2000; Hokamp et al., 2003; Kuraku et al., 2009; Larhammar et al., 2002; Lundin et al., 2003; McLysaght et al., 2002; Vanneste et al., 2012). Specifically, four potential quadruplicated regions, notably on Homo sapiens autosomes (Hsa) 1/6/9/19, Hsa 4/5/8/10, Hsa 1/2/8/10 and the HOX-cluster bearing chromosomes Hsa 2/7/12/17, are considered to have been structured by two rounds of polyploidy (Hokamp et al., 2003; Lundin et al., 2003; Sundstrom et al., 2008).

In-depth analysis of genomic data from a diverse set of vertebrate and invertebrate species has challenged the basis of two rounds of tetraploidy (2R hypothesis) (Abbasi, 2008, 2010b). It was proposed that elucidation of intra-genomic syntenic regions through the map self-comparison approach does not provide compelling support for the proposed mechanism of origin of paralogons (Abbasi, 2008). Therefore, the sheer global physical organization of genes should not be taken as evidence that the vertebrate genome was shaped by ancient WGDs. However, such patterns are in support of the 2R hypothesis if the following two conditions are met: the duplication history of multigene families constituting paralogons should advocate that the majority of them duplicated within the time window of the invertebrate–vertebrate and bony fish–tetrapod splits (proposed timings of 2R) (Abbasi, 2010b; Abbasi and Grzeschik, 2007; Hughes, 1998; Hughes et al., 2001; Martin, 2001); similarly, consistencies should be reflected among the tree topologies of distinct gene families whose members show syntenic associations on more than one genomic location (Abbasi, 2010b; Abbasi and
Grzeschik, 2007; Hughes et al., 2001; Martin, 2001). Furthermore, ideally the quadruplicated gene families should exhibit a symmetric tree topology showing two clusters of two genes, referred to as (AB) (CD) (Hughes, 1999; Martin, 2001).
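
The difference between the symmetric (AB) (CD) shape and an asymmetric shape can be illustrated with a minimal Python sketch. Trees are written here as nested tuples with string leaves; the function name and the labels A to D are illustrative assumptions and not part of the original analysis.

```python
def is_symmetric_2r(tree):
    """Return True if a rooted four-leaf tree has the (AB)(CD) shape.

    A tree is a nested tuple whose leaves are strings. The symmetric shape
    is a root with two children, each of which is a pair of leaves.
    """
    def is_leaf_pair(node):
        return (isinstance(node, tuple) and len(node) == 2
                and all(isinstance(child, str) for child in node))

    return (isinstance(tree, tuple) and len(tree) == 2
            and all(is_leaf_pair(child) for child in tree))


# Two clusters of two genes: the shape referred to as (AB)(CD).
symmetric = (("A", "B"), ("C", "D"))
# Sequential branching: one gene splits off at each step.
asymmetric = ((("A", "B"), "C"), "D")

print(is_symmetric_2r(symmetric))   # True
print(is_symmetric_2r(asymmetric))  # False
```

The check looks only at tree shape; it says nothing about branch support or about when the duplications occurred, both of which the topology comparison approach also takes into account.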

The quadruplicate paralogy regions organized around the HOX gene clusters (located on human chromosomes 2, 7, 12 and 17) have been taken as evidence that these paralogous gene sets, along with the HOX clusters, might have originated simultaneously through two rounds of whole genome or block duplication events (Hokamp et al., 2003; Kuraku et al., 2009; Larhammar et al., 2002; Lundin et al., 2003; Sundstrom et al., 2008). To test this assumption, a phylogenetic analysis of 22 gene families with members on at least three of the HOX-bearing chromosomes (2, 7, 12, and 17) was previously performed, but the results contradicted this assumption (Abbasi, 2010b; Abbasi and Grzeschik, 2007).

In the present study, we extend our previous work (Abbasi, 2010b; Abbasi and Grzeschik, 2007) and report 21 novel multigene families that have representation on at least three of the four HOX-bearing chromosomes (Fig. 1 and Table 1). A large amount of protein data from vertebrate and invertebrate species was used, and a detailed phylogenetic analysis of these multigene families was performed by the neighbor joining (NJ) and maximum likelihood (ML) methods. The topology comparison approach (Abbasi, 2010b; Abbasi and Grzeschik, 2007; Hughes et al., 2001; Martin, 2001) was then applied to the phylogenetic data of a total of 43 families (21 present and 22 previous data) to identify the genes that might have duplicated simultaneously with each other and with the HOX cluster early in the vertebrate lineage.

Fig. 1. Gene families with members on at least three of the human HOX-bearing chromosomes 2, 7, 12 and 17. Restricted location of members of many of these gene families near the HOX clusters suggests that HOX cluster paralogons might have been shaped by two rounds of block/whole chromosome duplications. ACCN, Amiloride-sensitive cation channel, neuronal; ATP6V0A, ATPase, H+ transporting, lysosomal V0 subunit A; CACNB, Calcium channel, voltage-dependent, beta subunit; CHST, Carbohydrate sulfotransferase; CLIP, CAP-GLY domain containing linker protein; FKBP, FK506 binding protein; FMNL, Formin-like; GRB, Growth factor receptor-bound protein; HDAC, Histone deacetylase; ING, Inhibitor of growth family; NXPH, Neurexophilin; ORMDL, ORM1-like; PDK, Pyruvate dehydrogenase kinase; PLEKHA, Pleckstrin homology domain containing, family A; PPP1R1, Protein phosphatase 1, regulatory (inhibitor) subunit; PRKAG, AMP-activated protein kinase gamma subunit; RAMP, Receptor (G protein-coupled) activity modifying protein; TMEM106, Transmembrane protein 106; TNS, Tensin; VAMP, Vesicle-associated membrane protein; ZNF385, Zinc finger protein 385. Genes analyzed in this study are enclosed within rectangles, whereas the histories of other genes (not enclosed in rectangles) were presented in our previously published work (Abbasi and Grzeschik, 2007; Abbasi, 2010). None of the features of this figure are drawn to scale.

In support of previous work (Abbasi, 2010b; Abbasi and Grzeschik, 2007), the results from the present study suggest that the gene families with three or more paralogs linked to HOX clusters did not arise simultaneously through two rounds of whole chromosome duplication or WGD. Instead, our study concludes that these HOX cluster paralogons have resulted from independent gene duplications, segmental duplications and rearrangement events that occurred at widely different time points during the early evolution of vertebrates.

2. Materials and methods

2.1. Dataset

Gene families with triplicated or quadruplicated members on human HOX-bearing chromosomes (Hsa2/7/12/17) were identified by scanning the human genome sequence maps available at the Ensembl and UCSC genome browsers (Hubbard et al., 2002). A total of 21 gene families were included in this study: five of these families have members on each of the four human HOX-bearing chromosomes, while the remaining 16 have their members on at least three of the HOX chromosomes (Table 1 and Fig. 1).

The closest putative orthologous sequences of the human proteins in other species were obtained using BLASTP in the Ensembl genome browser (Hubbard et al., 2002). To enrich these gene families with sequences from organisms for which sequence information was not available at Ensembl, a BLASTP (Altschul et al., 1990) search was carried out against the protein database

Table 1
List of human gene families used in the phylogenetic analysis.

Each gene family is followed by the number of included taxa and the number of sequences included. Member rows give: Member, Chr location, Human protein accession no.

Amiloride-sensitive cation channel, neuronal (taxa: 20; sequences: 60)
    ACCN1       17q12           Q16515
    ACCN2       12q12           P78348
    ACCN3       7q35            Q9UHC3
    ACCN4       2q35            Q96FT7
    ACCN5       4q31.3-q32      Q9NY37
    ACCN1       17q12           Q16515

Calcium channel, voltage-dependent, beta subunit (taxa: 14; sequences: 46)
    CACNB1      17q21-q22       Q02641
    CACNB2      10p12           Q08289
    CACNB3      12q13           P54284
    CACNB4      2q22-q23        O00305

FK506 binding protein (taxa: 25; sequences: 104)
    FKBP2       11q13.1-q13.3   P26885
    FKBP7       2q31.2          Q9Y680
    FKBP9       7p11.1          O95302
    FKBP10      17q21.2         Q96AY3
    FKBP11      12q13.12        Q9NYL4
    FKBP14      7p14.3          Q9NWM8

Inhibitor of growth family (taxa: 18; sequences: 70)
    ING1        13q34           Q9UK53
    ING2        4q35.1          Q9H160
    ING3        7q31            Q9NXR8
    ING4        12p13.31        Q9UNL4
    ING5        2q37.3          Q8WYH8

ORM1-like (taxa: 15; sequences: 30)
    ORMDL1      2q32            Q9P0S3
    ORMDL2      12q13.2         Q53FV1
    ORMDL3      17q12           Q8N138

Pleckstrin homology domain containing, family A (taxa: 26; sequences: 41)
    PLEKHA3     2q31.2          Q9HB20
    PLEKHA8     7p21-p11.2      Q96JA3
    PLEKHA9     12q             O95397

Protein phosphatase 1, regulatory (inhibitor) subunit (taxa: 18; sequences: 38)
    PPP1R1A     12q13.2         Q13522
    PPP1R1B     17q12           Q9UD71
    PPP1R1C     2q31.3          Q8WVI7

AMP-activated protein kinase gamma subunit (taxa: 26; sequences: 63)
    PRKAG1      12q12-q14       P54619
    PRKAG2      7q36.1          Q9UGJ0
    PRKAG3      2q35            Q9UGI9

Tensin (taxa: 25; sequences: 74)
    TNS1        2q35-q36        Q9HBL0
    TNS3        7p12.3          Q68CZ2
    TNS4        17q21.2         Q8IZW8
    TENC1       12q13.13        Q63HR2

Vesicle-associated membrane protein (taxa: 15; sequences: 59)
    VAMP1       12p             P23763
    VAMP2       17p13.1         P63027
    VAMP3       1p36.23         Q15836
    VAMP4       1q24-q25        O75379
    VAMP5       2p11.2          O95183
    VAMP8       2p12-p11.2      Q9BV40

Zinc finger protein 385 (taxa: 17; sequences: 51)
    ZNF385A     12q13.13        Q96PM9
    ZNF385B     2q31.2-q31.3    Q569K4
    ZNF385C     17q21.2         Q66K41
    ZNF385D     3p24.3          Q9H6B1

ATPase, H+ transporting, lysosomal V0 subunit A (taxa: 23; sequences: 74)
    ATP6V0A1    17q21           Q93050
    ATP6V0A2    12q24.31        Q9Y487
    TCIRG1      11q13.2         Q13488
    ATP6V0A4    7q34            Q9HBG4

Carbohydrate sulfotransferase (taxa: 17; sequences: 84)
    CHST8       19q13.1         Q9H2A9
    CHST9       18q11.2         Q7L1S5
    CHST10      2q11.2          O43529
    CHST11      12q             Q9NPF2
    CHST12      7p22            Q9NRB3
    CHST13      3q21.3          Q8NET6
    CHST14      15q15.1         Q8NCH0

CAP-GLY domain containing linker protein (taxa: 28; sequences: 89)
    CLIP1       12q24.3         P30622
    CLIP2       7q11.23         Q9UDT6
    CLIP3       19q13.12        Q96DZ5
    CLIP4       2p23.2          Q8N3C7

Pyruvate dehydrogenase kinase (taxa: 20; sequences: 67)
    PDK1        2q31.1          Q15118
    PDK2        17q21.33        Q15119
    PDK3        Xp22.11         Q15120
    PDK4        7q21.3          Q16654

Growth factor receptor-bound protein (taxa: 27; sequences: 67)
    GRB7        17q12           Q14451
    GRB10       7p12.2          Q13322
    GRB14       2q22-q24        Q14449

Formin-like (taxa: 21; sequences: 50)
    FMNL1       17q21           O95466
    FMNL2       2q23.3          Q96PY5
    FMNL3       12q13.12        Q8IVF7

Histone deacetylase (taxa: 17; sequences: 61)
    HDAC4       2q37.3          P56524
    HDAC5       17q21           Q9UQL6
    HDAC6       Xp11.23         Q9UBN7
    HDAC7       12q13.1         Q8WUI4
    HDAC9       7p21.1          Q9UKV0
    HDAC10      22q13.31        Q969S8

Neurexophilin (taxa: 13; sequences: 34)
    NXPH1       7p22            P58417
    NXPH2       2q22.1          O95156
    NXPH3       17q21.33        O95157
    NXPH4       12q13.3         O95158

Receptor (G protein-coupled) activity modifying protein (taxa: 18; sequences: 44)
    RAMP1       2q36-q37.1      O60894
    RAMP2       17q12-q21.1     O60895
    RAMP3       7p13-p12        O60896

Transmembrane protein 106 (taxa: 24; sequences: 46)
    TMEM106A    17q21.31        Q96A25
    TMEM106B    7p21.3          Q9NUM4
    TMEM106C    12q13.1         Q9BVX2

available at the National Center for Biotechnology Information (Johnson et al., 2008) and the Joint Genome Institute (http://www.jgi.doe.gov/).

Because the main objective of this study was to identify the duplication events that occurred during vertebrate evolution, the BLAST hits with higher scores than the available invertebrate ancestral sequences were retained. The ancestor–descendant relationships among putative orthologs were further confirmed by clustering homologous proteins within phylogenetic trees. Sequences whose position within a tree was sharply in conflict with the uncontested animal phylogeny were excluded from the analysis.
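
One possible reading of this retention rule is sketched below in Python. The sketch assumes that each BLASTP hit carries a bit score and a flag marking invertebrate species, that vertebrate hits are retained when they outscore the best invertebrate hit, and that the invertebrate hits themselves are kept as candidate ancestral sequences. The `Hit` record, the function name, the accessions and the scores are all hypothetical and serve only as illustration.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    species: str
    accession: str       # hypothetical identifier, for illustration only
    bit_score: float     # hypothetical score, for illustration only
    invertebrate: bool


def retain_hits(hits):
    """Keep vertebrate hits that outscore the best invertebrate hit.

    Invertebrate hits are kept as candidate ancestral (outgroup) sequences.
    If no invertebrate hit is available, nothing can be filtered.
    """
    invertebrate_scores = [h.bit_score for h in hits if h.invertebrate]
    if not invertebrate_scores:
        return list(hits)
    threshold = max(invertebrate_scores)
    return [h for h in hits if h.invertebrate or h.bit_score > threshold]


example_hits = [
    Hit("Homo sapiens", "HYPO_1", 400.0, False),
    Hit("Danio rerio", "HYPO_2", 250.0, False),
    Hit("Ciona intestinalis", "HYPO_3", 180.0, True),
    Hit("Mus musculus", "HYPO_4", 120.0, False),
]

for hit in retain_hits(example_hits):
    print(hit.species, hit.accession)
```

In this toy input the last hit scores below the invertebrate sequence and is dropped, on the assumption that such a hit is more likely to belong to a more distantly related family.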

The list of sequences used in the analysis is provided in the Supplementary material (Appendix 1). The species selected for the analysis comprise Homo sapiens (Human), Mus musculus (Mouse), Pan troglodytes (Chimpanzee), Gorilla gorilla (Gorilla), Callithrix jacchus (Marmoset), Pongo pygmaeus (Orangutan), Macaca mulatta (Macaque), Rattus norvegicus (Rat), Oryctolagus cuniculus (Rabbit), Gallus gallus (Chicken), Taeniopygia guttata (Zebra finch), Canis familiaris (Dog), Felis catus (Cat), Bos taurus (Cow), Equus caballus (Horse), Loxodonta africana (Elephant), Dasypus novemcinctus (Armadillo), Myotis lucifugus (Microbat), Pteropus vampyrus (Megabat), Monodelphis domestica (Opossum), Ornithorhynchus anatinus (Platypus), Anolis carolinensis (Lizard), Xenopus tropicalis (Frog), Erinaceus europaeus (Hedgehog), Danio rerio (Zebrafish), Takifugu rubripes (Fugu), Tetraodon nigroviridis (Tetraodon), Gasterosteus aculeatus (Stickleback), Oryzias latipes (Medaka), Ciona intestinalis (Ascidian), Ciona savignyi (Ascidian), Branchiostoma floridae (Amphioxus), Strongylocentrotus purpuratus (Sea urchin), Drosophila melanogaster (Fruit fly), Apis mellifera (Honey bee), Anopheles gambiae (Mosquito), Caenorhabditis elegans (Nematode), Nematostella vectensis (Sea anemone), and Hydra magnipapillata.

2.2. Alignment and phylogenetic analysis

The phylogenetic analyses for each gene family were performed using MEGA version 5 (Kumar et al., 2008). Amino acid sequences were aligned using the multiple alignment tool CLUSTAL W with default parameters (Thompson et al., 1994). Phylogenetic trees for each gene family were reconstructed using the neighbor joining (NJ) method (Russo et al., 1996; Saitou and Nei, 1987); the complete deletion option was used to exclude any site at which a gap was postulated in the sequences. The uncorrected proportion (p) of amino acid differences and the Poisson-corrected (PC) amino acid distance were used as amino acid substitution models. Since both methodologies produced similar results, only the results from NJ trees based on the uncorrected p-distance are presented here (Figs. 2 and 3 and Appendix 2). The reliability of the resulting tree topologies was tested by the bootstrap method (1000 pseudoreplicates), which generated the bootstrap probability for each interior branch in the tree (Felsenstein, 1985). Sequences that were too diverged and disrupted the entire alignment were excluded. To estimate phylogenetic trees using a different reconstruction method, the Maximum Likelihood procedure based on the Whelan and Goldman (WAG) model of amino acid replacement was employed (Whelan and Goldman, 2001), using the MEGA 5 program (Appendix 3). Furthermore, to validate our results, phylogenetic trees for the 21 gene families were reconstructed using another program, PhyML 3.0 aLRT (Guindon et al., 2005), based on the maximum likelihood approach (Appendix 3).
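
The two distance measures and the complete deletion option can be illustrated with a short Python sketch. It is not the MEGA implementation: the function names and the toy alignment are assumptions made for illustration. The Poisson-corrected distance is computed from the p-distance with the standard formula d = -ln(1 - p).

```python
import math


def complete_deletion(alignment):
    """Drop every alignment column that contains a gap in any sequence."""
    kept = [column for column in zip(*alignment) if "-" not in column]
    if not kept:
        return ["" for _ in alignment]
    return ["".join(residues) for residues in zip(*kept)]


def p_distance(seq_a, seq_b):
    """Uncorrected proportion of sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, equal-length and non-empty")
    differences = sum(a != b for a, b in zip(seq_a, seq_b))
    return differences / len(seq_a)


def poisson_distance(seq_a, seq_b):
    """Poisson-corrected amino acid distance, d = -ln(1 - p)."""
    p = p_distance(seq_a, seq_b)
    if p >= 1.0:
        raise ValueError("distance is undefined when every site differs")
    return -math.log(1.0 - p)


# Toy alignment (invented residues); columns 4 and 7 contain a gap.
toy_alignment = ["MKT-LLVA", "MKTALLIA", "MRTALL-A"]
cleaned = complete_deletion(toy_alignment)

print(cleaned)                                          # ['MKTLLA', 'MKTLLA', 'MRTLLA']
print(round(p_distance(cleaned[0], cleaned[2]), 3))     # 0.167
print(round(poisson_distance(cleaned[0], cleaned[2]), 3))  # 0.182
```

For small p the two measures are nearly equal, which is consistent with the observation above that both produced similar trees; the correction grows as sequences become more divergent.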

The timing of gene duplication events relative to the divergence of major taxa of organisms was estimated from the branching order of each gene family within the phylogenetic tree. This method of relative dating does not rely on the assumption of a constant rate of evolution. Therefore the process is sensitive to the varying rate of evolution in different branches of the tree (Hughes, 1998). The tree topology of each gene family was compared with those of other families and with the HOX cluster phylogeny to test for consistency in duplication events (Zhang and Nei, 1996) (Fig. 4).
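
Relative dating by branching order can be sketched as follows. The sketch assumes a simplified rule: a duplication is placed before a speciation event when both gene copies it produced retain descendants on both sides of that event. The function name, the three taxon labels and the returned strings are illustrative assumptions; gene loss or missing data can make a duplication appear younger than it is, which the sketch does not address.

```python
BEFORE_VI = "before vertebrate-invertebrate split"
BETWEEN = "after vertebrate-invertebrate split, before teleost-tetrapod split"
AFTER_TT = "after teleost-tetrapod split (tetrapod-specific)"
UNRESOLVED = "unresolved"


def duplication_window(copy_a_taxa, copy_b_taxa):
    """Place a duplication relative to two speciation events.

    Each argument is the set of taxon groups ("invertebrate", "teleost",
    "tetrapod") found among the descendants of one of the two gene copies
    produced by the duplication.
    """
    shared = set(copy_a_taxa) & set(copy_b_taxa)
    if "invertebrate" in shared:
        # Both copies have invertebrate orthologs: the duplication predates
        # the vertebrate-invertebrate split.
        return BEFORE_VI
    if {"teleost", "tetrapod"} <= shared:
        # Both copies are present in fish and tetrapods only.
        return BETWEEN
    if shared == {"tetrapod"}:
        return AFTER_TT
    return UNRESOLVED


print(duplication_window({"teleost", "tetrapod"}, {"teleost", "tetrapod"}))
print(duplication_window({"invertebrate", "tetrapod"}, {"invertebrate", "teleost"}))
```

No branch lengths enter the rule, which is why this kind of dating does not depend on a constant rate of evolution.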

Among the topologies of the 21 gene families, the phylogenetic trees of CACNB, ORMDL, PLEKHA, PPP1R1, ZNF385, FMNL, GRB, NXPH, RAMP, and TMEM106 were rooted with orthologous genes from invertebrates, whereas the ACCN, TNS, ATP6V0A, CLIP and PDK phylogenies were rooted with both invertebrate and vertebrate sequences. The phylogenies of the ING, VAMP and HDAC families consisted of two subfamilies, each of which served to root the other. For the FKBP tree, vertebrate FKBP7 sequences served as an outgroup to root the remainder of the tree, while the remaining sequences served to root the vertebrate FKBP7 sequences. Similarly, for the CHST and PRKAG trees, vertebrate CHST12 and PRKAG3 sequences, respectively, served as an outgroup to root the remainder of the tree, while the remaining sequences served to root the CHST12 and PRKAG3 sequences.

Fig. 2. Neighbor-Joining tree of the (A) FKBP family (B) TNS family. Uncorrected p-distance was used. Complete-deletion option was used. Numbers on branches represent bootstrap values (based on 1000 replications) supporting that branch; only the values ≥50% are presented here. Scale bar shows amino acid substitution per site.

3. Results

To test the validity of the 2R hypothesis, which postulates that the distinct multigene families residing on HOX-bearing chromosomes resulted from two rounds of WGD events in the early vertebrate lineage, a phylogenetic analysis was performed for 21 gene families having paralogs residing on at least three of the four human HOX cluster-bearing chromosomes (Hsa2/7/12/17). Genomic sequence data from a diverse set of vertebrate and invertebrate species were employed to construct Neighbor-Joining (NJ) and Maximum-Likelihood (ML) trees for each gene family (Appendices 2 and 3).

To present a plausible explanation of the evolutionary events that shaped the syntenic relationships seen on present-day human HOX cluster-bearing chromosomes, the topology comparison approach was employed for the 21 gene families to identify the genes that exhibited strong statistical support for concurrent duplication events (Abbasi, 2010b; Abbasi and Grzeschik, 2007; Zhang and Nei, 1996). For this purpose, only those portions of the phylogenies were selected (for further analysis) which indicated a strong signal for at least two duplication events within the time window of the teleost–tetrapod and vertebrate–invertebrate splits (proposed timings of WGDs) (Fig. 5 and Table 2).

Fig. 3. Neighbor-Joining tree of the (A) ATP6V0A family (B) CHST family. Uncorrected p-distance was used. Complete-deletion option was used. Numbers on branches represent bootstrap values (based on 1000 replications) supporting that branch; only the values ≥50% are presented here. Scale bar shows amino acid substitution per site.

The phylogenies of the TNS, GRB, NXPH and RAMP gene families, which have members on three of the human HOX cluster-bearing chromosomes, revealed that the relevant members duplicated in the chordate lineage. Careful analysis of these families showed harmony in duplication patterns and suggested a topology in which the genes on Hsa7 and Hsa2 are grouped together and the gene on Hsa12 forms an outgroup to them (Fig. 2B). Assuming an independent translocation event (from Hsa12 to Hsa17), the RAMP family might have duplicated simultaneously with the TNS, GRB, and NXPH gene families (Fig. 4A).

The syntenic region including the ORMDL, PPP1R1 and ZNF385 gene families, with representation on three HOX-bearing chromosomes, illustrates the conservation of linkage relationships and gene order (Fig. 1). The ORMDL and PPP1R1 families showed a topology in which the genes on Hsa17 and Hsa2 are clustered together and the gene on Hsa12 forms an outgroup to them. The phylogenetic tree of the ZNF385 gene family exhibited a topology in which the genes on Hsa3 and Hsa2 grouped together, the gene on Hsa12 branched next, and the gene on Hsa17 was the first member of this family to diverge (Table 2). Assuming an independent translocation event from Hsa17 to Hsa3 in the ZNF385 gene family, the segment comprising the ORMDL, PPP1R1 and ZNF385 gene families might have duplicated as a block through ancient segmental duplication (aSD) events (Fig. 4D).

The phylogenies of the ACCN, PLEKHA, TMEM106, CACNB, PDK, VAMP, and FMNL families revealed at least two vertebrate-specific duplication events (Table 2). Careful analysis of the tree branching order demonstrates seven distinct topology patterns for these HOX-linked gene families (Table 2). For instance, the PDK gene family has representation on three HOX-bearing chromosomes and showed a topology of the type ((Hsa17 Hsa7)Hsa2) (Table 2). Paralogs of the ACCN gene family, residing on all four HOX-bearing chromosomes, exhibited a topology pattern of (((Hsa12 Hsa17) Hsa2) Hsa7) with bootstrap support of 96% (Appendix 2). PLEKHA gene family members are located on three human HOX-bearing chromosomes and diversified through two vertebrate-specific duplication events, one of which occurred recently within the primate lineage (Table 2). The TMEM106 gene family has representation on three HOX-bearing chromosomes and showed a topology of the type ((Hsa7 Hsa12) Hsa17) (Table 2). The phylogenetic tree of VAMP exhibits a topology of the form ((Hsa1 Hsa12) Hsa17) (Appendix 2). FMNL paralogs have representation on three of the human HOX cluster-bearing chromosomes and showed a tree topology pattern of ((Hsa2 Hsa12)Hsa17) with highly significant (100%) bootstrap support (Table 2). The CACNB gene family has representation on three HOX-bearing chromosomes and received strong bootstrap support (99%) for a topology of the type ((Hsa2 Hsa12) (Hsa17 Hsa10)) (Appendix 2).

Fig. 4. Consistencies in phylogenies of families (analyzed in this and our previous study) including members on at least three of the human HOX-bearing chromosomes. (A) Schematic topology of the GLI, INHB, HH, SLC4A, OSBPL, PDE, SCRN, TNS, RAMP, NXPH and GRB families. (B) Schematic topology of ERBB, ZNFN1A, IGFBP, CBX and PDK family members. (C) Schematic topology of the HOX clusters, SP members, HNRNPA and FMNL family. (D) Schematic topology of the integrin beta chain, ATP5G, RND, ORMDL, PPP1R1, and ZNF385 gene families. In each case the percentage bootstrap support of the internal branches is given in parentheses, except for gene families exhibiting slightly lower bootstrap values. The bars/half brackets connecting some gene families on the left depict the close physical linkages of the relevant genes. The notations 12/17* and 17/3* indicate that genes marked with an asterisk belong to chromosome 17 or 3, respectively.

Phylogenetic trees of six gene families (CHST, HDAC, FKBP, ING, ATP6V0A and CLIP) revealed ancient duplication events that occurred at different time points, at least prior to the invertebrate–vertebrate split. The CHST gene family phylogeny recovered six duplication events, four of which occurred anciently, at least prior to the divergence of the invertebrate chordate and vertebrate lineages (Fig. 3B). Similarly, HDAC paralogs experienced five duplication events that may also predate the divergence of the invertebrate chordate and vertebrate lineages (Appendix 2). The phylogeny of the FKBP gene family revealed four ancient duplication events that might have occurred at least before the divergence of the echinoderm and chordate lineages (Fig. 2A). The tree topology of the ING family suggested that members of this family diversified through a total of four duplication events, two of which occurred prior to the protostome–deuterostome split (Appendix 2). Phylogenetic trees for ATP6V0A and CLIP suggested that each of these families diversified through three duplication events. One of these duplication events is ancient and may predate the divergence of the protostome and deuterostome lineages. The other two are more recent and are estimated to have occurred within the vertebrate lineage prior to the teleost–tetrapod split (Fig. 3A and Appendix 2).

The tree topology comparison approach suggests that harmony among the phylogenetic tree branching patterns of distinct gene families revealing conserved physical linkage on human paralogy groups might reflect their concurrent origin, thus defining the same co-duplicated group (Abbasi, 2010b; Abbasi and Grzeschik, 2007). In contrast, dissimilar tree topologies of distinct gene sets sharing physical location on human paralogy groups indicate that the families concerned might not have duplicated in concert with each other (Abbasi, 2010b; Abbasi and Grzeschik, 2007; Hughes and Friedman, 2003). Based on these assumptions, previous data comprising 23 multigene families residing on HOX cluster paralogons were categorized into four distinct co-duplicated groups (Abbasi, 2010b). The first co-duplicated group, with a topology of the type ((Hsa7 Hsa2)Hsa12/17), was the largest and suggested simultaneous duplication of eight gene families, i.e. the GLI, HH, INHB, IGFBP (subfamily-1), SLC4A, OSBPL, PDE1, and SCRN families (Fig. 4A); co-duplicated group-2 presented a topology of the type (((Hsa7 Hsa17) Hsa2) Hsa12) and involved members of the ERBB, ZNFN1A, IGFBP (subfamily 2) and CBX families (Fig. 4B); co-duplicated group-3, with a topology of the type (((Hsa2 Hsa12)Hsa7)Hsa17), included the HOX clusters and members of the SP and HNRNPA gene families (Fig. 4C); and co-duplicated group-4 involved genes from the ITGB, MYL, ATP5G, and RND families with a topology of the type ((Hsa17/3 Hsa2)Hsa12) (Fig. 4D).
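
The core of the topology comparison can be sketched in Python by grouping families whose trees have the same branching pattern over chromosome labels. The sketch uses the three-member topologies stated above for ORMDL, PPP1R1, FMNL, TMEM106 and PDK; the nested-tuple encoding and the function names are assumptions made for illustration. Identical topology is treated here as the only criterion, whereas the study also weighs bootstrap support, physical linkage and duplication timing.

```python
from collections import defaultdict


def canonical(tree):
    """Convert a nested tuple into an order-independent topology."""
    if isinstance(tree, str):
        return tree
    return frozenset(canonical(child) for child in tree)


def group_by_topology(topologies):
    """Group gene families that share the same rooted topology."""
    groups = defaultdict(list)
    for family, tree in topologies.items():
        groups[canonical(tree)].append(family)
    return sorted(sorted(members) for members in groups.values())


# Topologies as described in the text; child order within a pair is arbitrary.
family_topologies = {
    "ORMDL": (("Hsa17", "Hsa2"), "Hsa12"),
    "PPP1R1": (("Hsa2", "Hsa17"), "Hsa12"),
    "FMNL": (("Hsa2", "Hsa12"), "Hsa17"),
    "TMEM106": (("Hsa7", "Hsa12"), "Hsa17"),
    "PDK": (("Hsa17", "Hsa7"), "Hsa2"),
}

for group in group_by_topology(family_topologies):
    print(group)
```

ORMDL and PPP1R1 fall into one group because their trees differ only in the order in which the two sister genes are written, while each of the other three families forms a group of its own.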

Interestingly, a large majority of the multigene families analyzed in the present study revealed tree topologies that reconciled with the four previously recovered co-duplicated groups (Fig. 4). For instance, the phylogenetic trees of four gene families, i.e. TNS, GRB, NXPH and RAMP, showed topologies that were in harmony with the previously recovered co-duplicated group-1. Thus, taking the previous and current data together, co-duplicated group-1 involves simultaneous diversification of at least 12 HOX linked gene families through two rounds of gene cluster or segmental duplication events. These include members of the TNS, GRB, NXPH, RAMP, GLI, HH, INHB, IGFBP (subfamily-1), SLC4A, OSBPL, PDE1 and SCRN families (Fig. 4A). The gene order and physical organization of the constituent genes are largely disturbed, except for the GLI-INHB and TNS-IGFBP genes, which are tightly linked to each other on each of the relevant chromosomal segments (Fig. 4A).

Fig. 5. The relative timing of duplication events that expanded the multigene families residing on human HOX cluster paralogons. The branching order within phylogenetic trees was used to estimate the time windows of gene duplication events relative to major cladogenetic events. For 43 multigene families residing on Hsa2/7/12/17, 25 duplication events were detected before the vertebrate–invertebrate split. 41 duplications were detected after the vertebrate–invertebrate split and before the tetrapod–bony fish divergence, whereas only four tetrapod-specific duplication events were detected. The numbers enclosed in parentheses in front of the gene family names represent the number of duplications experienced by that gene family.

744 Z. Asrar et al. / Molecular Phylogenetics and Evolution 66 (2013) 737–747

Assuming an independent gene loss (i.e. of the PDK gene from Hsa12), the phylogenetic tree of the PDK family exhibited a topology that is in concordance with the previously recovered co-duplicated group-2, hence extending the number of gene families belonging to this group to five: PDK, ERBB, ZNFN1A, IGFBP (subfamily 2) and CBX (Fig. 4B). This co-duplicated group indicates conservation of physical linkage and gene order for the ERBB, ZNFN1A and IGFBP genes subsequent to three rounds of segmental duplication events.

Table 2. Summary of the phylogenetic analysis of gene families with three or more members residing on HOX-cluster paralogons.

| Family name | Hsa2/1^a | Hsa7/3^a | Hsa12 | Hsa17 | Consistency with HOX phylogeny | Topology (bootstrap support, %) |
|---|---|---|---|---|---|---|
| Previous studies (Abbasi and Grzeschik, 2007; Abbasi, 2010b) | | | | | | |
| ERBB | ERBB4 | EGFR | ERBB3 | ERBB2 | – | (((17, 7) 2) 12) 97,98 |
| Collagen | COL3A1, COL5A2 | COL1A2 | COL2A1 | COL1A1 | – | ((((12,17)7)2)2) 93,92,83 |
| IGFBP | IGFBP2, IGFBP5 | IGFBP1, IGFBP3 | IGFBP6 | IGFBP4 | – | ((17, 7)2) ((7, 2)12) 99,91 |
| ITGB | ITGB6 | ITGB5^a | ITGB7 | ITGB3 | – | (((3, 17)2)12) 98,99 |
| MYL | MYL1 | – | MYL6 | MYL4 | – | ((17, 2)12) 87 |
| SP | Sp3 | Sp4 | Sp1 | Sp2 | Yes | (((2, 12) 7) 17) 98,89 |
| ZNFN1A | ZNFN1A2 | ZNFN1A1 | ZNFN1A4 | ZNFN1A3 | – | (((7, 17) 2) 12) 94,90 |
| INHB | INHBB | INHBA | INHBC, INHBE | – | – | ((7, 2)12) 93 |
| SLC4A | SLC4A3 | SLC4A2 | – | SLC4A1 | – | ((7, 2)17) 85 |
| GLI | GLI2 | GLI3 | GLI1 | – | – | ((7, 2)12) 99 |
| HH | IHH | SHH | DHH | – | – | ((7, 2)12) 97 |
| OSBPL | OSBPL6 | OSBPL3 | – | OSBPL7 | – | ((7, 2)17) 90 |
| PDE1 | PDE1A | PDE1C | PDE1B | – | – | ((7, 2)12) 99 |
| SCRN | SCRN3 | SCRN1 | – | SCRN2 | – | ((7, 2)17) 68 |
| CBX | – | CBX3 | CBX5 | CBX1 | – | ((17, 7) 12) 51 |
| HNRNPA | HNRNPA3 | HNRNPA2B1 | HNRNPA1 | – | Yes | ((12, 2) 7) 98 |
| ATP5G | ATP5G3 | – | ATP5G2 | ATP5G1 | – | ((17, 2)12) 92 |
| RND | RND3 | – | RND1 | RND2 | – | ((17, 2)12) 99 |
| This study | | | | | | |
| PDK | PDK1 | PDK4 | – | PDK2 | – | ((17,7)2) |
| FMNL | FMNL2 | – | FMNL3 | FMNL1 | Yes | ((2,12)17) 100 |
| GRB | GRB14 | GRB10 | – | GRB7 | – | ((2,7)17) 92 |
| NXPH | NXPH2 | NXPH1 | NXPH4 | NXPH3 | – | (((2,7)12)17) 95,89 |
| RAMP | RAMP1 | RAMP3 | – | RAMP2 | – | ((2,7)17) 76 |
| TMEM106 | – | TMEM106B | TMEM106C | TMEM106A | – | ((7,12)17) 81 |
| ORMDL | ORMDL1 | – | ORMDL2 | ORMDL3 | – | ((2,17)12) 60 |
| PLEKHA | PLEKHA3 | PLEKHA8 | PLEKHA9 | – | – | ((2,12)7) 99 |
| PPP1R1 | PPP1R1C | – | PPP1R1A | PPP1R1B | – | ((17,2)12) 98 |
| TNS | TNS1 | TNS3 | TENC1 | – | – | (((2,7)12)17) 76,83 |
| VAMP | VAMP3^a | – | VAMP1 | VAMP2 | – | ((1,12)17) 67 |
| ZNF385 | ZNF385B | ZNF385D^a | ZNF385A | ZNF385C | – | (((3,2)12)17) 58,73 |
| ACCN | ACCN4 | ACCN3 | ACCN2 | ACCN1 | – | (((12,17)2)7) 96,69 |
| CACNB | CACNB4 | – | CACNB3 | CACNB1 | – | ((2,12)(17,10))^b 99 |

For each gene family, the chromosomal locations and topologies (in Newick format) are given for those genes that arose through duplications after the invertebrate–vertebrate split and before the tetrapod–fish divergence. The percentage bootstrap support of the internal branches is given next to each relevant topology.

^a Represents the situation where a gene family member does not reside on the human HOX-bearing chromosomes.
^b Indicates that the paralog CACNB2 is located on a different chromosome, i.e. Hsa10.

The third co-duplicated group comprises the prominent HOX gene clusters, whose evolutionarily conserved genomic architecture suggests ancient duplication events (Abbasi, 2010b; Abbasi and Grzeschik, 2007). To elucidate the timing and pattern of these events, the evolutionary histories of HOX linked gene families provide an important insight into HOX evolution. Previously, the histories of the SP and HNRNPA gene families, whose members reside in close proximity to the human HOX clusters, suggested that these families might have diversified in concert with the vertebrate HOX clusters through three rounds of successive SD events (Fig. 4C). Among the families analyzed in the present study, FMNL paralogs are positioned in the close vicinity of the human HOX clusters, with the FMNL1 gene mapping ~3 Mb centromeric to HOXB and FMNL3 ~4 Mb centromeric to HOXC, while FMNL2 is ~23 Mb centromeric to HOXD. The relatively weak adjacency of FMNL2 and the HOXD cluster might be the consequence of chromosomal rearrangement events (Fig. 1). These HOX linked FMNL genes revealed a tree topology in which the paralogs on Hsa2 and Hsa12 are grouped together, while the paralog on Hsa17 forms an outgroup to them. Assuming an independent gene loss from Hsa7, we can conclude that the vertebrate FMNL gene tree topology is in harmony with those of the closely linked SP, HNRNPA and HOX genes (Fig. 4C). Thus, among the families analyzed in the present study, the tree topology and chromosomal location of the FMNL paralogs support the notion that the HOX clusters were duplicated deep in vertebrate history through three rounds of SDs (Fig. 4C and Table 2).
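The gene-loss reasoning applied to FMNL can be sketched in Python as follows. Assuming, purely for illustration, that topologies are encoded as nested tuples of chromosome numbers, removing the chromosome presumed to have lost its paralog from the four-chromosome group topology should leave the observed three-chromosome topology. The helpers `canonical`, `prune` and `consistent_after_loss` are hypothetical names introduced here and are not part of any published pipeline.

```python
def canonical(tree):
    """Return an order-independent form of a nested-tuple topology."""
    if isinstance(tree, tuple):
        return tuple(sorted((canonical(child) for child in tree), key=repr))
    return tree


def prune(tree, leaf):
    """Remove one leaf and collapse any node left with a single child."""
    if not isinstance(tree, tuple):
        return None if tree == leaf else tree
    kept = [p for p in (prune(child, leaf) for child in tree) if p is not None]
    if not kept:
        return None
    return kept[0] if len(kept) == 1 else tuple(kept)


def consistent_after_loss(family_tree, group_tree, lost):
    """True if the family topology equals the group topology minus `lost`."""
    return canonical(family_tree) == canonical(prune(group_tree, lost))


group3 = (((2, 12), 7), 17)  # topology of co-duplicated group-3 (HOX, SP)
fmnl = ((2, 12), 17)         # FMNL topology from Table 2

# FMNL matches group-3 once a loss from Hsa7 is assumed.
print(consistent_after_loss(fmnl, group3, lost=7))
```

The same check applies to PDK, whose topology ((17,7)2) corresponds to the group-2 topology (((Hsa7 Hsa17) Hsa2) Hsa12) after removal of Hsa12.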

The ORMDL, PPP1R1 and ZNF385 gene families showed duplication patterns in concordance with the previously recovered co-duplicated group-4, hence increasing the number of multigene families in this group from four to seven: ORMDL, PPP1R1, ZNF385, ITGB, MYL, ATP5G and RND (Fig. 4D). This co-duplicated group indicates conservation of physical linkage for the ORMDL, PPP1R1 and ZNF385 genes on each of the relevant chromosomal segments. In addition, MYL4 is closely linked to the ITGB3 gene on Hsa17.

4. Discussion

Post-genomic approaches such as map self-comparison and genome-wide pairwise comparison, which are used to identify paralogons in vertebrate genomes, provide indispensable insight into those genome-shaping events that occurred in the recent history of vertebrates, because such events are not obscured by long-term evolutionary divergence, breakage and rearrangements. For instance, comparing the physical organization of genes within the human genome and among the genomes of multiple primate species revealed an intricate pattern of recent duplications, also referred to as segmental duplications (SDs) (Bailey et al., 2002; Cheng et al., 2005). These recent duplications are large genomic blocks ranging in size from ~300 kb to 1 Mb, positioned at two or more genomic locations and showing sequence identity of more than 90% (Samonte and Eichler, 2002). It has been estimated that these primate segmental duplications account for up to 5% of the human genome and correspond to duplication events correlating with the divergence of New World and Old World monkeys ~35–40 Mya (Bailey and Eichler, 2006). Comparative analysis of genomic data across species has attributed various roles to these segmental duplication events: creating novel primate genes, shaping primate genomes, expanding gene families and initiating large-scale hominoid-specific chromosomal rearrangements (Marques-Bonet et al., 2009).

In contrast, predicting the exact nature of the evolutionary events that led to the creation of ancient (>450 Mya) paralogy regions (paralogons) in the vertebrate genome is a far more difficult task. It is extremely difficult to track these genetic events through inter-genomic and intra-genomic map comparison approaches, since such ancient events have been followed by a long period of stochastic translocation involving multiple chromosomal breakage and rearrangement events, resulting in altered karyotypes and disrupted gene order on chromosomes. A more robust approach to determining the mechanism by which ancient vertebrate intra-genomic synteny blocks (paralogons) arose is phylogenetic analysis (Abbasi, 2010a,b; Abbasi and Grzeschik, 2007; Hughes, 1998; Hughes et al., 2001). This approach captures the nature of ancient duplication events in two ways. Firstly, the relative timing of duplication events occurring before or after a speciation event provides a bird's-eye view of all the duplication events that occurred in a particular time window. If the phylogenies indicate that the majority of the paralogons originated before the tetrapod–fish separation and after the invertebrate–vertebrate split, this suggests that large-scale duplications occurred between these speciation events (Van de Peer, 2004). Secondly, the origin of paralogons can be inferred by coupling information on the global physical organization of the gene families that make up the paralogons with their tree topologies (the branching order of the phylogenetic trees). Correspondence among the topologies of distinct multigene families that make up human paralogons would suggest simultaneous block or segmental duplications. This approach is explained and applied in previous studies (Abbasi, 2010a,b; Abbasi and Grzeschik, 2007; Abbasi and Hanif, 2012; Hughes et al., 2001).
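The relative-timing argument can be made concrete with a simplified Python sketch. It assumes, for illustration only, that each duplication node is summarized by the sets of lineages ("invertebrate", "fish", "tetrapod") represented in its two daughter clades, that the tree is correct and that no lineage-specific gene loss has occurred. The function `duplication_window`, the lineage labels and the example nodes are hypothetical and are not taken from the original analysis.

```python
from collections import Counter

VERTEBRATES = {"fish", "tetrapod"}


def duplication_window(left, right):
    """Assign a duplication node to a time window.

    `left` and `right` are the sets of lineages found in the two daughter
    clades of the node. Simplifying assumption: no gene loss.
    """
    left, right = set(left), set(right)
    shared = left & right
    # Invertebrate sequences in both daughter clades: the duplication is
    # older than the invertebrate-vertebrate split.
    if "invertebrate" in shared and (left | right) & VERTEBRATES:
        return "before invertebrate-vertebrate split"
    # Fish and tetrapod sequences in both daughter clades: the duplication
    # is older than the tetrapod-fish split.
    if VERTEBRATES <= shared:
        return "between invertebrate-vertebrate and tetrapod-fish splits"
    # Only tetrapod sequences on both sides.
    if left | right == {"tetrapod"}:
        return "tetrapod-specific"
    return "unresolved"


# Hypothetical duplication nodes, given as (left clade, right clade).
nodes = [
    ({"invertebrate", "fish", "tetrapod"}, {"invertebrate", "fish", "tetrapod"}),
    ({"fish", "tetrapod"}, {"fish", "tetrapod"}),
    ({"fish", "tetrapod"}, {"fish", "tetrapod"}),
    ({"tetrapod"}, {"tetrapod"}),
]

tally = Counter(duplication_window(left, right) for left, right in nodes)
for window, count in tally.items():
    print(window, count)
```

Nodes that do not fit any rule, for example because one daughter clade lacks fish sequences, are reported as unresolved rather than forced into a window.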

Together with our recent data (Abbasi, 2010b; Abbasi and Grzeschik, 2007), we constructed the phylogenetic trees of 43 gene families with members on at least three of the four HOX-bearing chromosomes (Fig. 1). The topology comparison approach was applied to these phylogenies to test the WGD hypothesis, which suggests that the fourfold paralogous regions on human HOX-bearing chromosomes originated through two rounds of whole genome duplication early in vertebrate history (Table 2). This analysis categorized the phylogenies into four discrete co-duplicated groups, in which the constituent gene families diversified through duplications that occurred within the time window between the vertebrate–invertebrate and tetrapod–bony fish splits (Fig. 4). Gene families belonging to a particular co-duplicated group share a similar evolutionary history and might have originated through a gene cluster duplication event, whereas genes belonging to different co-duplicated groups may not share their evolutionary history and might not have duplicated simultaneously. The recovery of these large co-duplicated groups indicates that the HOX cluster paralogons were shaped by segmental duplications and rearrangement events that occurred at the root of the vertebrate lineage, at least before the divergence of tetrapods and bony fish. The conservation of gene content organization in different chromosomal regions implies some functional significance. For instance, the co-expression of neighboring genes is mediated by the higher-order structural organization of chromosomes, which brings genomic regions into close proximity to shared regions of gene expression (Meaburn and Misteli, 2007). Similarly, gene regulatory elements spread across long regions impose critical constraints on genomic architecture and are known to have maintained exceptionally long syntenic blocks both within and across species (Goode et al., 2005; Kikuta et al., 2007; Lee et al., 2006).

Apart from the co-duplicated groups, two gene families, namely PLEKHA and CACNB, showed novel and interesting phylogenies. In PLEKHA, a very recent primate-specific duplication (PLEKHA8–PLEKHA9) was observed (Appendix 2). The CACNB gene family, in contrast, presented a topology of the type "(AB)(CD)", and according to 2R proponents such a topology favors two rounds of WGD. However, local duplications and the random process of chromosomal breakage and rearrangement present an alternative and more realistic explanation for such symmetrical tree patterns (Abbasi, 2010a; Abbasi and Hanif, 2012). Therefore, even in the absence of WGD events, it would not be surprising to find some paralogous sets with this type of topology.
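The distinction between a symmetric "(AB)(CD)" topology and a sequential one can be checked mechanically. The following Python sketch assumes, for illustration only, that a rooted four-member topology is encoded as nested tuples of chromosome numbers; the function names `leaf_count` and `shape` are hypothetical, and the two example topologies are transcribed from Table 2.

```python
def leaf_count(tree):
    """Count the leaves of a nested-tuple topology."""
    if isinstance(tree, tuple):
        return sum(leaf_count(child) for child in tree)
    return 1


def shape(tree):
    """Classify a rooted, bifurcating four-leaf topology."""
    if leaf_count(tree) != 4 or len(tree) != 2:
        raise ValueError("expected a bifurcating topology with four leaves")
    # Two leaves on each side of the root gives the (AB)(CD) pattern;
    # a 1 + 3 split gives the sequential (((AB)C)D) pattern.
    sizes = sorted(leaf_count(child) for child in tree)
    return "symmetric" if sizes == [2, 2] else "sequential"


cacnb = ((2, 12), (17, 10))  # CACNB topology from Table 2
erbb = (((17, 7), 2), 12)    # ERBB topology from Table 2

print(shape(cacnb), shape(erbb))
```

The sketch only describes tree shape; as argued above, a symmetric shape by itself does not discriminate between WGD and local duplication followed by rearrangement.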

The present study aims to test the validity of the hypothesis linking the sudden appearance of complex morphological traits in vertebrates with ancient whole genome or extensive gene duplication events (Vanneste et al., 2012; Zhang and Cohn, 2008). Taken together with previous data, our results based on the duplication history of 44 gene families provide compelling evidence that the vertebrate genome evolved through relatively small-scale, regional duplication events at widely different time points in animal history. Therefore, it is reasonable to argue that studies that have favored the whole genome duplication hypothesis on the basis of an analysis of only a few gene families cannot be considered reliable (Daza et al., 2011; Meyer and Schartl, 1999; Zhang and Cohn, 2008).

5. Conclusion

To unravel the evolutionary events that have structured the mammalian HOX cluster paralogons, a careful phylogenetic analysis of 44 gene families with members residing on at least three of the human HOX-bearing chromosomes 2, 7, 12 and 17 was performed. Our results suggest that the multigene families with triplicated or quadruplicated distributions on these human chromosomes have not arisen through two rounds of block/chromosome or WGD events. Instead, our data indicate that the extensive intra-genomic synteny centered on human HOX clusters is the consequence of ancient segmental duplications (aSDs) and chromosomal rearrangements that occurred at widely different time points in early vertebrate history. These results are in concordance with data that present definite evidence for the extensive occurrence of genomic segmental duplications in recent vertebrate history (rSDs). This would imply that the mechanisms of duplication at the base of vertebrate history might not differ from those that shaped our genome during its recent history.

Appendix A. Supplementary material

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.ympev.2012.10.024.

References

Abbasi, A.A., 2008. Are we degenerate tetraploids? More genomes, new facts. Biol. Direct 3, 50.

Abbasi, A.A., 2010a. Piecemeal or big bangs: correlating the vertebrate evolution with proposed models of gene expansion events. Nat. Rev. Genet. 11, 166.

Abbasi, A.A., 2010b. Unraveling ancient segmental duplication events in human genome by phylogenetic analysis of multigene families residing on HOX-cluster paralogons. Mol. Phylogenet. Evol. 57, 836–848.

Abbasi, A.A., Goode, D.K., Amir, S., Grzeschik, K.H., 2009. Evolution and functional diversification of the GLI family of transcription factors in vertebrates. Evol. Bioinform. Online 5, 5–13.

Abbasi, A.A., Grzeschik, K.H., 2007. An insight into the phylogenetic history of HOX linked gene families in vertebrates. BMC Evol. Biol. 7, 239.

Abbasi, A.A., Hanif, H., 2012. Phylogenetic history of paralogous gene quartets on human chromosomes 1, 2, 8 and 20 provides no evidence in favor of the vertebrate octoploidy hypothesis. Mol. Phylogenet. Evol. 63, 922–927.

Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J., 1990. Basic local alignment search tool. J. Mol. Biol. 215, 403–410.

Bailey, J.A., Eichler, E.E., 2006. Primate segmental duplications: crucibles of evolution, diversity and disease. Nat. Rev. Genet. 7, 552–564.


Bailey, J.A., Gu, Z., Clark, R.A., Reinert, K., Samonte, R.V., Schwartz, S., Adams, M.D., Myers, E.W., Li, P.W., Eichler, E.E., 2002. Recent segmental duplications in the human genome. Science 297, 1003–1007.

Cheng, Z., Ventura, M., She, X., Khaitovich, P., Graves, T., Osoegawa, K., Church, D., DeJong, P., Wilson, R.K., Paabo, S., Rocchi, M., Eichler, E.E., 2005. A genome-wide comparison of recent chimpanzee and human segmental duplications. Nature 437, 88–93.

Daza, D.O., Sundstrom, G., Bergqvist, C.A., Duan, C., Larhammar, D., 2011. Evolution of the insulin-like growth factor binding protein (IGFBP) family. Endocrinology 152, 2278–2289.

Dehal, P., Boore, J.L., 2005. Two rounds of whole genome duplication in the ancestral vertebrate. PLoS Biol. 3, e314.

Donoghue, P.C., Purnell, M.A., 2005. Genome duplication, extinction and vertebrate evolution. Trends Ecol. Evol. 20, 312–319.

Felsenstein, J., 1985. Confidence limit on phylogenies: an approach using the bootstrap. Evolution 39, 95–105.

Furlong, R.F., Holland, P.W., 2002. Were vertebrates octoploid? Philos. Trans. R. Soc. Lond. B Biol. Sci. 357, 531–544.

Furlong, R.F., Holland, P.W., 2004. Polyploidy in vertebrate ancestry: Ohno and beyond. Biol. J. Linnean Soc. 82, 425–430.

Gibson, T.J., Spring, J., 2000. Evidence in favour of ancient octaploidy in the vertebrate genome. Biochem. Soc. Trans. 28, 259–264.

Goode, D.K., Snell, P., Smith, S.F., Cooke, J.E., Elgar, G., 2005. Highly conserved regulatory elements around the SHH gene may contribute to the maintenance of conserved synteny across human chromosome 7q36.3. Genomics 86, 172–181.

Guindon, S., Lethiec, F., Duroux, P., Gascuel, O., 2005. PHYML Online–a web server for fast maximum likelihood-based phylogenetic inference. Nucleic Acids Res. 33, W557–W559.

Hokamp, K., McLysaght, A., Wolfe, K.H., 2003. The 2R hypothesis and the human genome sequence. J. Struct. Funct. Genomics 3, 95–110.

Hubbard, T., Barker, D., Birney, E., Cameron, G., Chen, Y., Clark, L., Cox, T., Cuff, J., Curwen, V., Down, T., Durbin, R., Eyras, E., Gilbert, J., Hammond, M., Huminiecki, L., Kasprzyk, A., Lehvaslaiho, H., Lijnzaad, P., Melsopp, C., Mongin, E., Pettett, R., Pocock, M., Potter, S., Rust, A., Schmidt, E., Searle, S., Slater, G., Smith, J., Spooner, W., Stabenau, A., Stalker, J., Stupka, E., Ureta-Vidal, A., Vastrik, I., Clamp, M., 2002. The Ensembl genome database project. Nucleic Acids Res. 30, 38–41.

Hughes, A.L., 1998. Phylogenetic tests of the hypothesis of block duplication of homologous genes on human chromosomes 6, 9, and 1. Mol. Biol. Evol. 15, 854–870.

Hughes, A.L., 1999. Phylogenies of developmentally important proteins do not support the hypothesis of two rounds of genome duplication early in vertebrate history. J. Mol. Evol. 48, 565–576.

Hughes, A.L., da Silva, J., Friedman, R., 2001. Ancient genome duplications did not structure the human Hox-bearing chromosomes. Genome Res. 11, 771–780.

Hughes, A.L., Friedman, R., 2003. 2R or not 2R: testing hypotheses of genome duplication in early vertebrates. J. Struct. Funct. Genomics 3, 85–93.

Johnson, M., Zaretskaya, I., Raytselis, Y., Merezhuk, Y., McGinnis, S., Madden, T.L., 2008. NCBI BLAST: a better web interface. Nucleic Acids Res. 36, W5–W9.

Kasahara, M., 2007. The 2R hypothesis: an update. Curr. Opin. Immunol. 19, 547–552.

Kikuta, H., Laplante, M., Navratilova, P., Komisarczuk, A.Z., Engstrom, P.G., Fredman, D., Akalin, A., Caccamo, M., Sealy, I., Howe, K., Ghislain, J., Pezeron, G., Mourrain, P., Ellingsen, S., Oates, A.C., Thisse, C., Thisse, B., Foucher, I., Adolf, B., Geling, A., Lenhard, B., Becker, T.S., 2007. Genomic regulatory blocks encompass multiple neighboring genes and maintain conserved synteny in vertebrates. Genome Res. 17, 545–555.

Kumar, S., Nei, M., Dudley, J., Tamura, K., 2008. MEGA: a biologist-centric software for evolutionary analysis of DNA and protein sequences. Brief Bioinform. 9, 299–306.

Kuraku, S., Meyer, A., Kuratani, S., 2009. Timing of genome duplications relative to the origin of the vertebrates: did cyclostomes diverge before or after? Mol. Biol. Evol. 26, 47–59.

Larhammar, D., Lundin, L.G., Hallbook, F., 2002. The human Hox-bearing chromosome regions did arise by block or chromosome (or even genome) duplications. Genome Res. 12, 1910–1920.

Lee, A.P., Koh, E.G., Tay, A., Brenner, S., Venkatesh, B., 2006. Highly conserved syntenic blocks at the vertebrate Hox loci and conserved regulatory elements within and outside Hox gene clusters. Proc. Natl. Acad. Sci. USA 103, 6994–6999.

Lundin, L.G., Larhammar, D., Hallbook, F., 2003. Numerous groups of chromosomal regional paralogies strongly indicate two genome doublings at the root of the vertebrates. J. Struct. Funct. Genomics 3, 53–63.

Marques-Bonet, T., Girirajan, S., Eichler, E.E., 2009. The origins and impact of primate segmental duplications. Trends Genet. 25, 443–454.

Martin, A., 2001. Is tetralogy true? Lack of support for the "one-to-four rule". Mol. Biol. Evol. 18, 89–93.

Martin, A.P., 1999. Increasing genomic complexity by gene duplication and the origin of vertebrates. Am. Nat. 154, 2.

McLysaght, A., Hokamp, K., Wolfe, K.H., 2002. Extensive genomic duplication during early chordate evolution. Nat. Genet. 31, 200–204.

Meaburn, K.J., Misteli, T., 2007. Cell biology: chromosome territories. Nature 445, 379–781.

Meyer, A., Schartl, M., 1999. Gene and genome duplications in vertebrates: the one-to-four (-to-eight in fish) rule and the evolution of novel gene functions. Curr. Opin. Cell Biol. 11, 699–704.

Ohno, S., 1970. Evolution by Gene Duplication. Springer-Verlag.

Ohno, S., 1973. Ancient linkage groups and frozen accidents. Nature 244, 259–262.

Russo, C.A., Takezaki, N., Nei, M., 1996. Efficiencies of different genes and different tree-building methods in recovering a known vertebrate phylogeny. Mol. Biol. Evol. 13, 525–536.

Saitou, N., Nei, M., 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4, 406–425.

Samonte, R.V., Eichler, E.E., 2002. Segmental duplications and the evolution of the primate genome. Nat. Rev. Genet. 3, 65–72.

Skrabanek, L., Wolfe, K.H., 1998. Eukaryote genome duplication – where's the evidence? Curr. Opin. Genet. Dev. 8, 694–700.

Sundstrom, G., Larsson, T.A., Larhammar, D., 2008. Phylogenetic and chromosomal analyses of multiple gene families syntenic with vertebrate Hox clusters. BMC Evol. Biol. 8, 254.

Thompson, J.D., Higgins, D.G., Gibson, T.J., 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680.

Van de Peer, Y., 2004. Computational approaches to unveiling ancient genome duplications. Nat. Rev. Genet. 5, 752–763.

Vanneste, K., Van de Peer, Y., Maere, S., 2012. Inference of genome duplications from age distributions revisited. Mol. Biol. Evol.

Whelan, S., Goldman, N., 2001. A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Mol. Biol. Evol. 18, 691–699.

Zhang, G., Cohn, M.J., 2008. Genome duplication and the origin of the vertebrate skeleton. Curr. Opin. Genet. Dev. 18, 387–393.

Zhang, J., Nei, M., 1996. Evolution of Antennapedia-class homeobox genes. Genetics 142, 295–303.