Genetic Architecture and Selection of Chinese Cattle Revealed by Whole Genome Resequencing

Chugang Mei,†,1 Hongcheng Wang,†,1 Qijun Liao,†,2 Lizhong Wang,†,2 Gong Cheng,1 Hongbao Wang,1 Chunping Zhao,1 Shancen Zhao,2 Jiuzhou Song,3 Xuanmin Guang,2 George E. Liu,4 Anning Li,1 Xueli Wu,2 Chongzhi Wang,2 Xiaodong Fang,2 Xin Zhao,1,5 Stephen B. Smith,6 Wucai Yang,1 Wanqiang Tian,7 Linsheng Gui,1 Yingying Zhang,1 Rodney A. Hill,8 Zhongliang Jiang,1 Yaping Xin,1 Cunling Jia,1 Xiuzhu Sun,1 Shuhui Wang,1 Huanming Yang,9,10 Jian Wang,9,10 Wenjuan Zhu,*,2 and Linsen Zan*,1

1 College of Animal Science and Technology, Northwest A&F University, Yangling, China
2 BGI Genomics, BGI-Shenzhen, Shenzhen, China
3 Department of Animal and Avian Sciences, University of Maryland, Maryland, USA
4 Animal Genomics and Improvement Laboratory, USDA-ARS, Maryland, USA
5 Department of Animal Science, McGill University, Montreal, Canada
6 Department of Animal Science, Texas A&M University, Texas, USA
7 Yangling Vocational & Technical College, Yangling, China
8 School of Biomedical Sciences, Charles Sturt University, New South Wales, Australia
9 BGI-Shenzhen, Shenzhen, China
10 James D. Watson Institute of Genome Sciences, Hangzhou, China

†These authors contributed equally to this work.
The sequencing reads have been deposited in the NCBI Sequence Read Archive (SRA) under accession PRJNA283480.
*Corresponding authors: E-mails: zanlinsen@163.com; wenjuanzhu@genomics.cn.
Associate editor: Bing Su

© The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com
Mol. Biol. Evol. 35(3):688–699 doi:10.1093/molbev/msx322 Advance Access publication December 19, 2017

Abstract

The bovine genetic resources in China are diverse, but their value and potential are yet to be discovered. To determine the genetic diversity and population structure of Chinese cattle, we analyzed the whole genomes of 46 cattle from six phenotypically and geographically representative Chinese cattle breeds, together with 18 Red Angus cattle genomes, 11 Japanese black cattle genomes, and taurine and indicine genomes available from previous studies. Our results showed that Chinese cattle originated from hybridization between Bos taurus and Bos indicus. Moreover, we found that the level of genetic variation in Chinese cattle depends upon the degree of indicine content. We also discovered many potential selective sweep regions associated with domestication related to breed-specific characteristics, with selective sweep regions including genes associated with coat color (ERCC2, MC1R, ZBTB17, and MAP2K1), dairy traits (NCAPG, MAPK7, FST, ITFG1, SETMAR, PAG1, CSN3, and RPL37A), and meat production/quality traits (such as BBS2, R3HDM1, IGFBP2, IGFBP5, MYH9, MYH4, and MC5R). These findings substantially expand the catalogue of genetic variants in cattle and reveal new insights into the evolutionary history and domestication traits of Chinese cattle.

Key words: whole genome sequencing, selection, Chinese cattle, indicine components, admixture.

Introduction

Domesticated extant cattle can be categorized into two major geographic taxa: humpless taurine (B. taurus) and humped indicine (B. indicus) cattle, which diverged from each other >250,000 years ago (Hiendleder et al. 2008; Gibbs et al. 2009; Canavez et al. 2012; Porto-Neto et al. 2013). According to previous reports, taurine cattle were domesticated in the Fertile Crescent 8,000–10,000 years ago, and indicine cattle were domesticated in the Indus Valley 6,000–8,000 years ago (Loftus et al. 1994; Van Vuure 2002; Bickhart et al. 2016). As a representative ruminant, cattle provide hides, meat, and milk for human needs and work as draught animals for pulling carts, ploughing, and other tasks in less mechanized cultures (Sherratt 1983; Zhang et al. 2013). Through artificial selection, >1,000 cattle breeds were established throughout the world (Scherf and Pilling 2015). Of these breeds, 72 originated from and are endemic to China. These Chinese breeds vary in their intrinsic characteristics and are important genetic resources for cattle worldwide. Chinese cattle have long been used as draught animals and are valued for their parasite resistance, utilization of roughage-based diets, and tolerance to environmental challenges (Qiu et al. 1993; Wang and Ding 1996). Chinese cattle are roughly divided into three groups according to their ecological characteristics and sex chromosome polymorphisms: a southern group largely descended from the indicine lineage, a northern group belonging to the taurine lineage, and a central group, which originated from B. taurus × B. indicus hybrids (Qiu et al. 1993; Cai et al. 2006).

High-throughput whole genome sequencing can be used to exploit population structure and characteristics to identify the effects of selection upon the cattle genome in different breeds. This approach has been performed with dairy cattle, such as Holstein–Friesian, Fleckvieh, and Jersey populations, for traits including embryonic death, lethal chondrodysplasia, milk production, and curly coat (Daetwyler et al. 2014). Studies have also been performed on economic traits with breeds such as Hereford, Black Angus, and Limousin (Gibbs et al. 2009; Stothard et al. 2011). Several studies of traits under positive selection have been performed with many European breeds; however, positive selection signatures in Chinese cattle have yet to be determined. A limited number of phylogenetic studies of Chinese cattle have been performed with Y chromosomal and mitochondrial DNAs (Lei et al. 2000; Cai et al. 2007, 2014). These sequences reflect the histories of individual loci and thus do not have the power to track artificial selection signals, complex histories of introgression, or admixture of genomes. Thus, the population stratification of Chinese cattle and signatures of selection in these breeds remain poorly understood.

In this study, we performed whole-genome sequencing on six phenotypically and geographically diverse domestic Chinese cattle breeds (Qinchuan cattle, QCC, n = 37; Nanyang cattle, NYC, n = 2; Luxi cattle, LXC, n = 1; Yanbian cattle, YBC, n = 2; Yunnan cattle, YNC, n = 2; and Leiqiong cattle, LQC, n = 2) and two non-Chinese breeds (Japanese black cattle, JBC, n = 11; and Red Angus cattle, RAN, n = 18). Using the obtained whole-genome sequence data together with publicly available whole-genome sequence data for seven additional breeds, we explored the genetic diversity, phylogenetic relationships, and demographic history of Chinese cattle. We also integrated patterns of hybridization and detected genes and corresponding variants that are associated with agriculturally important traits. Our analyses provide new insights into the population stratification and local breeding of Chinese cattle and the interface with worldwide domestic breeds.

Results and Discussion

Whole-Genome Sequencing and Genetic Variation

Whole-genome sequencing of 75 samples generated a total of 27.52 billion paired-end reads with a 500-bp insert size. Alignment with the reference genome of B. taurus (UMD3.1) showed an average depth of 11.4× and an average coverage of 98.46% (supplementary table S1, Supplementary Material online). To place these cattle in a more detailed phylogeographic context, we also analyzed previously published whole-genome sequence data from individuals of representative taurine and indicine breeds (n = 76; supplementary tables S2 and S3 and fig. S1, Supplementary Material online). We detected a total of 57.22 million single-nucleotide polymorphisms (SNPs) and 5.27 million small insertions and deletions (InDels) (supplementary table S3 and fig. S2, Supplementary Material online). More than half (59.90% and 72.45%) of the SNPs and InDels were absent in the SNP Database (dbSNP, release 140); the novel variants, which substantially expanded the set of genetic variants in cattle, were mainly contributed by B. indicus and Chinese breeds, especially LQC and QCC (supplementary table S3 and figs. S3 and S4, Supplementary Material online). Rare variants (<1%) captured 37.64% of the data set. Approximately 21.54 million autosomal variants had a minor allele frequency <1%, 16.38 million had frequencies between 1% and 5%, and 19.30 million had a frequency >5% (supplementary fig. S3a, Supplementary Material online). Most of the common variants with a >5% minor allele frequency (80.94% of 19.30 million) were found in the dbSNP database. In contrast, only 30.91% (5.06 million of 16.38 million) of the variants with frequencies of 1–5% and 10.50% (2.26 million of 21.54 million) of the variants with frequencies <1% were found in dbSNP.
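The frequency classes used in this paragraph can be illustrated with a short sketch. It is not the authors' pipeline: the function names `maf_bin` and `bin_shares` and the toy frequencies are hypothetical, and the treatment of the boundary values (exactly 1% and 5% fall in the middle class) is an assumption, since the text does not say how ties were assigned.

```python
from collections import Counter


def maf_bin(maf):
    """Assign a minor allele frequency (0 to 0.5) to one of three classes."""
    if not 0.0 <= maf <= 0.5:
        raise ValueError("minor allele frequency must lie in [0, 0.5]")
    if maf < 0.01:
        return "rare (<1%)"
    if maf <= 0.05:
        return "low (1-5%)"  # assumption: boundary values land here
    return "common (>5%)"


def bin_shares(mafs):
    """Return the fraction of variants that falls in each frequency class."""
    counts = Counter(maf_bin(m) for m in mafs)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Toy input for illustration only, not data from the study.
example = [0.004, 0.02, 0.03, 0.2, 0.45]
shares = bin_shares(example)
```

Applied to a full call set, the same tally would yield the per-class totals that the paragraph reports, which can then be intersected with dbSNP membership.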

Taurine breeds had an average of 5.06 million single-nucleotide variants (SNVs) per sample, 96.2% of which were found in the dbSNP database. Indicine breeds had an average of 11.91 million SNVs per sample, 2.35 times higher than taurine breeds (supplementary table S3, Supplementary Material online). Only 59.03% of the indicine SNVs were found in the database. Specifically, LQC from South China had an average of 16.78 million SNVs, with only 48.63% found in the database, which is 3.8 times the number for the taurine breeds (supplementary tables S3 and S4, Supplementary Material online). In addition, 95.85% of the singletons were found in Chinese cattle breeds, especially LQC (supplementary figs. S3a, S4, and S5c, Supplementary Material online), indicating that the Chinese cattle had high genetic diversity.

Most of the cattle groups experienced population bottlenecks during domestication. Taurine cattle showed a similar level of nucleotide diversity (θπ ≈ 1.2 × 10⁻³) to that of yak (Qiu et al. 2015) and giant panda (1.3 × 10⁻³) (Zhao et al. 2013). Moreover, they showed higher nucleotide diversity than that estimated for human populations (1.0 × 10⁻³) and lower than that of indicine cattle (2 × 10⁻³) (supplementary table S3, Supplementary Material online). This low level of variation in taurine cattle was also reflected by the extensive linkage disequilibrium (LD) levels among taurine breeds, especially JBC and JER (supplementary fig. S6b, Supplementary Material online), indicating a severe bottleneck in taurine cattle (Gibbs et al. 2009; Stothard et al. 2011; Daetwyler et al. 2014).

Compared with European cattle (1 × 10⁻³), Chinese cattle (2 × 10⁻³ to 4 × 10⁻³) showed relatively high nucleotide diversity. The nucleotide diversity of LQC (4.2 × 10⁻³) was approximately two times higher than that of indicine breeds (2 × 10⁻³). This distinction between Leiqiong (LQC) and indicine cattle was also apparent from the high level of fixation index (FST) between them (supplementary table S5, Supplementary Material online). The difference of Chinese cattle from European cattle was also reflected by the values of heterozygosity, haplotype diversity, runs of homozygosity (ROH), inbreeding coefficients, and identity-by-descent (IBD) (supplementary table S6 and figs. S6a and c and S7, Supplementary Material online). These findings are consistent with previous studies (Lai et al. 2006; Lei et al. 2006; Decker et al. 2014), in which B. taurus × B. indicus admixture events in Chinese cattle breeds were found to have significantly increased the genetic diversity of Chinese cattle relative to European taurine cattle.
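Nucleotide diversity, the quantity compared across breeds above, is the average number of pairwise differences per site among sampled haplotypes. The sketch below shows that definition on toy sequences; it is an illustration only, the function name `nucleotide_diversity` is hypothetical, and genome-scale estimates such as those in the text are normally computed from variant calls in windows rather than from raw strings.

```python
from itertools import combinations


def nucleotide_diversity(haplotypes):
    """Average pairwise differences per site among equal-length haplotypes."""
    if len(haplotypes) < 2:
        raise ValueError("need at least two haplotypes")
    length = len(haplotypes[0])
    if length == 0 or any(len(h) != length for h in haplotypes):
        raise ValueError("haplotypes must be non-empty and of equal length")
    pairs = list(combinations(haplotypes, 2))
    # Count mismatching sites for every pair of haplotypes.
    total_diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total_diffs / (len(pairs) * length)


# Toy alignment for illustration only, not data from the study.
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC", "ACGTACGTAC"]
pi = nucleotide_diversity(toy)
```

In the toy alignment the six pairs differ at six sites in total over ten positions, so the estimate is 0.1; the values reported for cattle are about two orders of magnitude smaller.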

By comparing the population-specific SNPs among B. indicus (including only BRM, GIR, and NEL), LQC, and B. taurus (including RAN, JBC, HOL, JER, FLV, and LIM), we found 938, 79, and 4 population-specific nonsynonymous variants (PSNVs) with high variant allele frequency (>0.9), respectively (supplementary table S7, Supplementary Material online). Strikingly, some genes had more than three novel nonsynonymous variants (supplementary tables S8–S10, Supplementary Material online). For example, there were 12, 5, and 4 nonsynonymous mutations in the SPTBN5, RP1L1, and GHRHR genes, respectively, of B. indicus and 9, 8, and 5 novel nonsynonymous mutations in the LOC616720, LOC101903496, and RNF213 genes, respectively, of LQC (supplementary tables S8 and S10, Supplementary Material online). In the SPTBN5 gene, there were 14 B. indicus-specific missense mutations with high allele frequency (>0.93; supplementary table S8, Supplementary Material online). These missense mutations were only observed in indicine cattle, and only two of them have been found in dbSNP. In humans, SPTBN5 imparts a certain resistance to the parasite Plasmodium falciparum and to enterohemorrhagic Escherichia coli (Ruetz et al. 2012; Labrecque et al. 2013). Indicine cattle are more resistant to thermal stress, parasites, and disease than taurine cattle (Hansen 2004; Sartori et al. 2010). This knowledge led us to speculate that the SPTBN5 gene may contribute to parasite resistance in indicine cattle. Therefore, these genes with specific nonsynonymous variants might have played roles in the formation of the characteristic phenotypes of each breed.

Population Structure and Characterization of Chinese Cattle Breeds

Using yak (Bos grunniens) as an outgroup, we explored the phylogenetic relationships among 151 cattle samples based on whole genome SNP data. The resulting neighbor-joining tree supported the clustering of the taurine clade (RAN, JBC, HOL, FLV, LIM, JER, and YBC) and the indicine clade (BRM, NEL, and GIR). NYC, LXC, YNC, and LQC were grouped together near the indicine clade (fig. 1a). Interestingly, QCC was situated between these two clades and had many branches connecting to the trunk, consistent with previous studies (Lai et al. 2006; Lei et al. 2006; Decker et al. 2014), suggesting that this breed has two main ancestral components, taurine and indicine. The principal component analysis (PCA) provided similar results, with all of the taurine cattle breeds except JBC and YBC forming a tight cluster clearly separate from indicine cattle, and most of the Chinese populations occupying intermediate positions between the two major clusters (fig. 1b and supplementary figs. S8 and S9, Supplementary Material online). PC2 tended to separate populations sampled in East Asia from those of India and Europe. Both the phylogenetic and PCA analyses indicated a heterogeneous nature of Chinese cattle. The genetic influence of B. taurus was greater on QCC and YBC than on the other breeds of central and southern China, whereas B. indicus contributed more to LQC, YNC, NYC, and LXC than to the remaining breeds. These results are consistent with the hypothesis that Chinese cattle breeds are admixtures of taurine (B. taurus) and indicine (B. indicus) cattle (Yu et al. 1999).

We used clustering models for estimating ancestral populations, setting K = 2 through K = 6, with ADMIXTURE (Alexander et al. 2009) for all 151 cattle samples (fig. 1c). With K changing progressively from 2 to 6, we found that Chinese breeds showed evidence of admixture, with the extent varying among the different breeds. The average ancestry proportions for each of the admixed populations, assuming K = 2 ancestral populations, are shown in supplementary table S11, Supplementary Material online. We found a strong association between genetic diversity and indicine descent in Chinese breeds (supplementary fig. S10, Supplementary Material online). Our results indicate that the heterogeneous nature of Chinese cattle mainly originated from hybridization between B. taurus and B. indicus.

Inference of Population Size from Whole-Genome Sequencing Data

Historical fluctuations in the effective population size (Ne) for all cattle were reconstructed using the Pairwise Sequential Markovian Coalescent (PSMC) model, and two bottlenecks and two expansions were identified for all cattle (fig. 2a and supplementary figs. S11–S13, Supplementary Material online). The split between the ancestor of indicine cattle and the ancestor of taurine cattle occurred 1.6 Ma, around which time the uplift of the Himalayas (Yuanmu movement, ~1.6 Ma) (Zheng et al. 2002) established geographical isolation. It is highly possible that the habitat of B. primigenius was split into two regions (South Asia and South China), resulting in population separation between the ancestor of B. indicus and the ancestor of B. taurus, consistent with the aforementioned results.
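PSMC reports time and population size in scaled units, so the dates and Ne values discussed here depend on the assumed generation time and mutation rate (both are stated in the caption of figure 2). The sketch below shows the conventional rescaling as we understand it from the PSMC documentation: N0 = θ0 / (4μs), years = 2·N0·t·g, and Ne = N0·λ. The function name `scale_psmc`, the default bin size of 100 bp, and the toy inputs are assumptions for illustration, not settings taken from the paper.

```python
def scale_psmc(theta0, t_k, lambda_k, mu, generation_years, bin_size=100):
    """Convert scaled PSMC output into years and effective population sizes.

    theta0            scaled mutation rate per bin, as reported by PSMC
    t_k, lambda_k     scaled time points and relative population sizes
    mu                mutation rate per nucleotide per generation
    generation_years  generation time in years
    bin_size          bases per PSMC bin (assumed default of 100)
    """
    if len(t_k) != len(lambda_k):
        raise ValueError("t_k and lambda_k must have the same length")
    n0 = theta0 / (4.0 * mu * bin_size)          # reference population size
    years = [2.0 * n0 * t * generation_years for t in t_k]
    ne = [n0 * lam for lam in lambda_k]
    return years, ne


# Round toy numbers for illustration only, not values from the study.
years, ne = scale_psmc(theta0=0.0004, t_k=[0.0, 0.5], lambda_k=[1.0, 2.0],
                       mu=1e-8, generation_years=5)
```

Because both axes scale inversely with μ, halving the assumed mutation rate doubles every inferred date and every Ne, which is why such demographic curves are usually read with their scaling parameters in mind.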

The cattle population declined ~0.8 Ma, at the same time as the three largest Pleistocene glaciations: the Xixiabangma Glaciation (XG, 1.1–0.8 Ma), the Naynayxungla Glaciation (NG, 0.78–0.5 Ma), and the Penultimate Glaciation (0.30–0.13 Ma) (fig. 2a). However, after a very short bottleneck in the ancestor of indicine cattle (BRM, GIR, and NEL) ~500,000 years ago, Ne recovered very quickly and reached a peak ~140,000 years ago. In contrast, the ancestor of taurine cattle (RAN, JBC, HOL, FLV, LIM, and JER) experienced a long, stable bottleneck until 70,000 years ago, consistent with their lower genetic diversity relative to that of indicine cattle. Divergence among the ancestors of indicine and taurine cattle may have begun ~0.5 Ma, coinciding with the uplift of the Tibetan–Pamir Plateau, which caused drying and desertification that were dramatically enhanced 0.5 Ma (Fang et al. 2002). The demographic trajectories of Chinese cattle breeds (LQC, LXC, NYC, QCC, YBC, and YNC) were distinct from those of typical indicine cattle or taurine cattle due to the influence of B. taurus × B. indicus admixture events. The historical pattern of four Chinese cattle breeds (LQC, NYC, LXC, and YNC) with more descent contributed by B. indicus (>69%; supplementary table S11, Supplementary Material online) roughly correlates with the indicine lineage but is distinct from QCC and YBC.

It is noteworthy that a historical pattern of two bottlenecks and two expansions has been observed in many mammals, such as yak (Qiu et al. 2015), giant panda (Zhao et al. 2013), wild boar (Choi et al. 2013), snub-nosed monkey (Zhou et al. 2014), gayal (Mei et al. 2016), and bear (Miller et al. 2012). These concordant patterns suggest that terrestrial mammals might share similar demographic histories and that the evolution of terrestrial mammals at the Early-Middle Pleistocene boundary was strongly affected by global glaciations and severely cold climates.

The Multiple Sequentially Markovian Coalescent (MSMC) analysis was used to study the genetic separation between two populations as a function of time by modeling the relationships of multiple haplotypes. For each population split scenario, the relative cross coalescence rate estimates were obtained by dividing the cross-population coalescence rate by the average within-population coalescence rate (Schiffels and Durbin 2014). Based on the analysis of four haplotypes for each pair of populations, the MSMC results show that the beginning of the split between the NEL ancestors and the GIR and BRM ancestors occurred 11,000 years ago. This split occurred shortly after the Younger Dryas epoch (an abrupt, rapid cooling period that occurred 12,800–11,500 years ago; Chen et al. 2006) (fig. 2b). The split between the NEL and BRM ancestors occurred ~6,600 years ago. After separation, the Ne of indicine cattle expanded, reaching a peak ~1,500 years ago, whereas the Ne of taurine cattle remained stable due to inbreeding and artificial domestication (supplementary fig. S14, Supplementary Material online). These data suggest that NEL, GIR, and BRM shared the same ancestor and that NEL separated from indicine ancestors earlier than GIR and BRM did. As we observed, the separation was slow and might have been the result of continuous gene exchange among these breeds.
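The relative cross coalescence rate described above is a simple ratio, sketched here for one time segment and for a series of segments. The function names and the toy rates are hypothetical; the only thing taken from the text is the definition (cross-population rate divided by the mean of the two within-population rates). Values near 1 indicate a single well-mixed population and values near 0 indicate complete separation.

```python
def relative_cross_coalescence(within_a, within_b, cross):
    """Cross-population rate divided by the mean within-population rate."""
    mean_within = (within_a + within_b) / 2.0
    if mean_within <= 0:
        raise ValueError("within-population rates must have a positive mean")
    return cross / mean_within


def relative_ccr_series(segments):
    """Apply the ratio to (within_a, within_b, cross) tuples, one per segment."""
    return [relative_cross_coalescence(a, b, c) for a, b, c in segments]


# Toy rates for illustration only, ordered from recent to ancient segments.
toy_segments = [(2.0, 2.0, 0.0), (2.0, 4.0, 1.5), (2.0, 4.0, 3.0)]
toy_curve = relative_ccr_series(toy_segments)
```

A gradual rise of this curve from 0 to 1 going back in time, rather than a sharp step, is the pattern the text interprets as a slow separation with continued gene exchange.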

Different from the split among indicine cattle groups, a sharp separation among the ancestors of taurine breeds occurred 5,000–2,000 years ago, with a split time at 3,500 years ago (fig. 2c), coinciding with the Unetice culture (4,200–3,500 years ago). Taurine breeds underwent strong domestication in a very short time. We infer that the considerable economic prosperity based on the diversified agriculture of the Unetice culture contributed to the early formation of European cattle breeds (Svizzero and Tisdell 2016).

FIG. 1. Population genetic analysis. (a) Neighbor-joining tree of indicine, taurine, and hybrid cattle based on our data and publicly available whole-genome sequencing data. Orange branches indicate indicine cattle, green branches represent taurine cattle, and blue branches represent hybrid cattle. The scale bar represents the identity-by-state (IBS) score between individuals. (b) Principal component analysis of cattle. (c) Genetic structure of cattle breeds using the ADMIXTURE program.

Signatures of Positive Selection in Cattle Genome

We identified regions that exhibit high levels of differentiation among cattle breeds using the di statistic in a reduced data set containing breeds with at least 10 samples (fig. 3). We treated GIR, NEL, and BRM as one group (IND) based on their close genetic relationships, as evidenced by both the PCA and phylogenetic results (fig. 1a and b). Those windows with the highest average di values within each breed, which fell into the upper 99th percentile of the empirical distribution, were considered putative signatures of selection (supplementary fig. S15, Supplementary Material online).
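The text does not spell out how di is computed, so the following sketch assumes the commonly used definition: for a focal breed, the FST of each window against every other breed is standardized by that pair's genome-wide mean and standard deviation, and the standardized values are summed over partner breeds. Windows above an empirical percentile cutoff are then flagged. The function names, the nearest-rank cutoff rule, and the toy FST values are all hypothetical.

```python
import math
from statistics import mean, pstdev


def di_statistic(fst_by_partner):
    """Sum of standardized FST values per window for one focal breed.

    fst_by_partner maps each partner breed to a list of per-window FST
    values measured against the focal breed (same windows, same order).
    """
    lengths = {len(values) for values in fst_by_partner.values()}
    if len(lengths) != 1:
        raise ValueError("every partner needs the same number of windows")
    di = [0.0] * lengths.pop()
    for values in fst_by_partner.values():
        mu, sd = mean(values), pstdev(values)
        if sd == 0:
            continue  # a constant FST track carries no signal
        for w, fst in enumerate(values):
            di[w] += (fst - mu) / sd
    return di


def windows_above_percentile(di, percentile=99.0):
    """Indices of windows strictly above the nearest-rank percentile cutoff."""
    ranked = sorted(di)
    k = max(0, math.ceil(percentile / 100.0 * len(ranked)) - 1)
    cutoff = ranked[k]
    return [i for i, value in enumerate(di) if value > cutoff]


# Toy FST tracks for illustration only, not data from the study.
toy_fst = {"breed_B": [0.1, 0.1, 0.1, 0.5], "breed_C": [0.2, 0.2, 0.2, 0.6]}
toy_di = di_statistic(toy_fst)
toy_hits = windows_above_percentile(toy_di, percentile=75.0)
```

Summing over partners is what makes the statistic breed-specific: a window scores highly only when the focal breed is unusually differentiated from most of the other breeds at once.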

In total, we identified 2842 potential selective sweep regions in one or more of the seven breeds (full genomic regions are detailed in supplementary tables S12–S18, Supplementary Material online), which had an average size of 67 kb (ranging from 11 kb to 1150 kb). These regions harbored 1429 protein-coding genes, 682 (47.72%) of which were previously identified as under positive selection in cattle (Randhawa et al. 2016). More specifically, we detected 357, 381, 232, 234, 307, 300, and 307 potentially positively selected genes on breed-specific selection events in the IND, QCC, JBC, RAN, FLV, HOL, and JER genomes, respectively (fig. 3a and supplementary tables S19–S25, Supplementary Material online).

To obtain a broad overview of the molecular functions of these genes and to test the hypothesis that particular functional classes are enriched in the most differentiated regions of the cattle genome, we performed a gene ontology (GO) analysis using ClueGO (Bindea et al. 2009) for each group separately. A potential concern regarding this analysis is the low power to detect enrichment due to the low expected counts for many categories. Nonetheless, several categories showed enrichment for signals of positive selection in one or more groups (supplementary table S26, Supplementary Material online), including the related categories of cellular response to UV as well as immune response and pathogen defence. These findings suggest that immune-related genes are pervasive targets of positive selection because of their critical role in immune and defence functions.

Many genes associated with shaping particular characteristics of the populations are present within these regions (table 1). These include morphological traits (coat color, horn/polledness) and production traits (dairy, muscle formation, skeletal development, energy partitioning, fertility, and draft traits).

Several genes involved in coat color phenotypes were identified as targets of positive selection in one or more groups (table 1), including ERCC2 in QCC and IND, MC1R in IND, ZBTB17 in QCC, and MAP2K1 in JBC. One of these genes, MC1R, is well known for its role in regulating the switch between eumelanin and pheomelanin biosynthesis pathways in mammals, including cattle. The selection signals of IND in MC1R reflect the significant role of light coloration (light gray to white in BRM and NEL, yellowish-red to white in GIR) associated with the adaptation of IND to its tropical environment.

FIG. 2. Demographic history of cattle. (a) Ancestral population size is inferred using PSMC. A generation time of 5 years and a mutation rate of 0.98 × 10⁻⁸ mutations per nucleotide per generation are used. The relative cross coalescence rates over time between indicine (b) and taurine (c) breeds are estimated using MSMC with four haplotypes for each pair.

Some of the strongest signals of selection appeared in various types of genes related to production traits (table 1). For example, several genes involved in milk production showed clear evidence of positive selection in dairy cattle (NCAPG in both FLV and JER; MAPK7 in FLV; FST, ITFG1, SETMAR, and PAG1 in HOL; CSN3 and RPL37A in JER). Various genes involved in meat traits have also been targets of recent positive selection (table 1). Some genes related to skeletal muscle development and muscle fibre type appeared to be targets of positive selection, including CASP9, DIO1, SREBF2, and PLOD3 in QCC; ASGR1, IGFBP2, IGFBP5, and MYH9 in JBC; OSTN, CPT2, CSF2RB, and SLC2A4 in FLV; and SLC2A5, TMEM97, MYH4, SRPK3, and POLDIP2 in RAN. We also found that a set of important genes associated with lipid metabolism were putatively positively selected (AOX1 in QCC, RAN, FLV, JER, and IND; MC5R in IND; BBS2, S1PR3, and LRP2BP in JBC). Interestingly, we identified a missense mutation in BBS2 (exon 15, rs135889003, c.A1880G, p.Q627R) that was almost fixed (allele frequency > 0.95) in JBC, a breed known for producing the intensely marbled Wagyu beef (with >30% intramuscular fat) (Gotoh et al. 2014). BBS2 is a member of the Bardet–Biedl syndrome gene family, the primary clinical feature of which is obesity, and has been found to play a significant role in adipogenesis (Forti et al. 2007). The positive selection signals near the BBS2 region are further supported by significantly lower values of Tajima's D and the long haplotype patterns in JBC (fig. 3b), so this region may be useful as a genetic target in breeding selection for beef marbling improvement. In addition, R3HDM1, a gene associated with efficient food conversion and intramuscular fat content, showed signals of positive selection in five groups (QCC, FLV, HOL, JER, and IND). These genes might be associated with the tenderness and quality of meat in cattle.

FIG. 3. Candidate positively selected genes. (a) Shared candidate positively selected genes among groups. Only partial numbers are shown. (b) An example of selective sweep at the BBS2 gene in JBC; only positive values of di are shown (top). The Tajima's D values in each group are shown (middle). SNPs with minor allele frequencies > 0.05 are used to construct haplotype patterns (bottom). The major alleles in JBC are green, and the minor alleles in JBC are yellow.

Table 1. Summary of Partial Traits Associated with Positively Selected Genes.

Gene | Breed | Trait | Reference
MC1R | IND | CC | Lee et al. 2002; Gan et al. 2007
MAP2K1ᵃ | JBC | CC | Gutierrez-Gil et al. 2015
ZBTB17 | QCC | CC | Gutierrez-Gil et al. 2015
ERCC2ᵃ | QCC, IND | CC | Gutierrez-Gil et al. 2015
FST, ITFG1, SETMAR, PAG1 | HOL | DY | Bech-Sabat et al. 2008; Rincon et al. 2009; Bloise et al. 2010; Xu et al. 2015
RPL37Aᵃ, CSN3 | JER | DY | Wedholm et al. 2006; Yahvah et al. 2015
MAPK7ᵃ | FLV | DY | Lin et al. 2013
NCAPG | FLV, JER | DY | Eberlein et al. 2009; Setoguchi et al. 2011
BBS2ᵃ, S1PR3ᵃ, LRP2BPᵃ, IGFBP2, IGFBP5, MYH9, ASGR1 | JBC | MT, GT | Forti et al. 2007; Sattler and Levkau 2009; Zhang et al. 2012; Lee et al. 2013; Sorbolini et al. 2015; Yoon and Ko 2016
SRPK3ᵃ, POLDIP2ᵃ, SLC2A5, TMEM97, MYH4 | RAN | MT, GT | Smith et al. 2001; Clark et al. 2011; Xu et al. 2011; Zhao et al. 2012; Lee et al. 2013; Zhang et al. 2016
CASP9ᵃ, DIO1, SREBF2, PLOD3 | QCC | MT, GT | Ouali et al. 2006; Lee et al. 2013
SLC2A4ᵃ, OSTN, CPT2, CSF2RB | FLV | MT, GT | Grindflek et al. 2002; Lee et al. 2013; Xu et al. 2015
MC5Rᵃ | IND | MT, GT | Kovacik et al. 2012; Switonski et al. 2013
AOX1ᵃ | QCC, RAN, FLV, JER, and IND | MT, GT | Brandes et al. 1995
R3HDM1 | QCC, FLV, HOL, JER, and IND | MT, FC | Gibbs et al. 2009

CC, coat color; MT, meat traits; GT, growth traits; DY, dairy traits; FC, food conversion efficiency.
ᵃNewly identified genes associated with phenotypic features of cattle.
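Tajima's D, cited above as supporting evidence for the BBS2 sweep, compares two estimates of diversity in a region: the mean number of pairwise differences and the number of segregating sites scaled by sample size. The sketch below implements the standard formula from summary counts; the function name `tajimas_d` is hypothetical, and nothing here reproduces the windowing or filtering used in the study. Strongly negative values indicate an excess of rare alleles, as expected after a recent sweep.

```python
import math


def tajimas_d(n, segregating_sites, pi):
    """Tajima's D from sample size, segregating sites, and pairwise diversity.

    n                  number of sampled sequences (at least 4)
    segregating_sites  number of polymorphic sites S in the region
    pi                 mean number of pairwise differences in the region
    """
    if n < 4:
        raise ValueError("the variance term is zero for fewer than 4 sequences")
    if segregating_sites <= 0:
        raise ValueError("D is undefined without segregating sites")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    s = segregating_sites
    variance = e1 * s + e2 * s * (s - 1)
    return (pi - s / a1) / math.sqrt(variance)


# Textbook-style toy input for illustration only, not data from the study.
d_example = tajimas_d(n=10, segregating_sites=16, pi=3.888889)
```

With ten sequences, 16 segregating sites, and a pairwise diversity of about 3.89, the statistic is close to -1.45, a moderately negative value of the kind that flags a region for closer inspection alongside haplotype patterns.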

Conclusion

Whole-genome sequencing of representative Chinese cattle breeds and two additional breeds (JBC and RAN) generated a comprehensive catalogue of genetic variations. This is the first population genomic study on Chinese cattle to use next-generation whole-genome sequencing data and is an important source of genetic information for cattle worldwide. Bovine haplotypes have been inferred in Mongolian yaks, with recent admixture at least 1,500 years ago (Medugorac et al. 2017). It is highly possible that there was recent introgression from yak (B. grunniens) to Chinese cattle, as suggested by previous studies (Lei et al. 2000; Cai et al. 2007, 2014). The genetic influence of yak is too limited to have been detected in the representative cattle breeds examined in our study. We also discovered many potential selective sweeps associated with domestication and related to breed-specific characteristics, with selective sweep regions including genes associated with coat color, dairy traits, and meat production/quality traits. Collectively, these findings substantially expand the catalogue of genetic variants in cattle and reveal new insights into the evolutionary history and domestication traits of Chinese cattle.

Materials and Methods

Sample Collection and Sequencing

To represent the overall genetic diversity of Chinese cattle, we selected 46 samples from six representative Chinese cattle breeds with divergent phenotypic characters across the main geographic distribution: Qinchuan cattle (QCC, n = 37), Nanyang cattle (NYC, n = 2), Luxi cattle (LXC, n = 1), Yanbian cattle (YBC, n = 2), Yunnan cattle (YNC, n = 2), and Leiqiong cattle (LQC, n = 2). For comparison, samples from two specialized beef cattle breeds, Red Angus (RAN, n = 18) and JBC (n = 11), were also collected (supplementary table S2, Supplementary Material online). Total genomic DNA was extracted from blood samples using a standard phenol–chloroform protocol. For each individual, at least 5 μg of genomic DNA was used to construct paired-end libraries with an insert size of 500 bp according to Illumina's library preparation protocol. Moreover, we collected 76 genome sequences from previous studies for the breeds Brahman (BRM, indicine, n = 6), Nelore (NEL, indicine, n = 5), Gir (GIR, indicine, n = 4), Limousin (LIM, taurine, n = 6), Jersey (JER, taurine, n = 18), Fleckvieh (FLV, taurine, n = 19), and Holstein (HOL, taurine, n = 18) (details in supplementary tables S1 and S2, Supplementary Material online).

Alignments and Variant Identification

Paired-end reads (100 bp) obtained from sequencing in the present study and previous studies were mapped to the B. taurus genome (UMD3.1) (Zimin et al. 2009) using BWA (Li and Durbin 2009) with the default parameters. Sequence Alignment/Map (SAM) format files were imported into SAMtools (Li et al. 2009) for sorting and merging, and into Picard (http://broadinstitute.github.io/picard/, version 1.92) to remove duplicated reads. To identify the ancestral state of cattle, we mapped the raw reads of yak (Qiu et al. 2012), sequenced to 65×, to the reference genome.

Initial variant site identification was performed using SAMtools mpileup and GATK UnifiedGenotyper (Genome Analysis Toolkit, version 2.4-9) (McKenna et al. 2010) with the default settings. The overlapping subset of 53,979,675 single-nucleotide polymorphisms (SNPs) and 5,924,578 small insertions and deletions (InDels; 91% of InDels were 1–30 bp in length, and the largest InDel was 403 bp in length) was defined as a high-confidence catalogue and used for base quality recalibration using GATK with the default set of covariates. The resulting recalibrated BAM files were then used as input for a second round of variant calling with GATK. The resulting variant calls were analyzed, and approximately the highest scoring 10% of the predicted variant sites were used as a training set for variant quality recalibration and filtering with GATK. These steps resulted in 60,031,459 SNPs and 5,603,383 InDels. To obtain high-quality results for further analyses, we only retained biallelic SNPs and InDels with > 90% calling rates, resulting in 57,220,105 SNPs and 5,270,518 InDels. Beagle (Browning and Browning 2007), which has been shown to yield highly accurate solutions, was used to improve the genotype calls using genotype likelihoods from GATK and to infer the haplotypes in the sample. Short InDels were not included in the diversity or divergence estimates, nor in the other analyses. Variants (SNPs and InDels) were annotated using ANNOVAR (Wang et al. 2010).
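The final filtering rule in this paragraph (keep biallelic sites whose calling rate exceeds 90%) can be illustrated with a minimal Python sketch. The function names and the in-memory genotype representation below are hypothetical and chosen only for illustration; they are not part of GATK or of the authors' pipeline, which operated on VCF files.

```python
from typing import List, Optional

# Hypothetical representation: alternate-allele count per sample (0, 1, 2),
# or None when the genotype was not called.
Genotype = Optional[int]


def call_rate(genotypes: List[Genotype]) -> float:
    """Fraction of samples with a called genotype at one site."""
    if not genotypes:
        return 0.0
    called = sum(g is not None for g in genotypes)
    return called / len(genotypes)


def keep_site(alt_alleles: List[str],
              genotypes: List[Genotype],
              min_call_rate: float = 0.90) -> bool:
    """Apply the two filters described in the text.

    A site is retained when it has exactly one alternate allele
    (biallelic) and its call rate is strictly above the threshold.
    """
    is_biallelic = len(alt_alleles) == 1
    return is_biallelic and call_rate(genotypes) > min_call_rate


if __name__ == "__main__":
    site = [0] * 19 + [None]          # 19 of 20 samples called (95%)
    print(keep_site(["T"], site))      # biallelic and well called
    print(keep_site(["T", "G"], site)) # multiallelic, dropped
```

Whether the threshold is applied as strictly greater than 90%, as assumed here, or as at least 90% is not specified beyond the wording of the text.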

Phylogenetic and Population Structure Analyses

A phylogenetic tree was constructed from the SNP data by using the neighbor-joining method in the program PHYLIP v3.695 (http://evolution.genetics.washington.edu/phylip.html), and distance matrices were calculated using PLINK (Purcell et al. 2007). The ancestral states of the SNPs were determined using a close relative of cattle, B. grunniens, as the outgroup. Population structure was further inferred using ADMIXTURE (Alexander et al. 2009), with the number of assumed ancestral populations (K) set from 2 to 7. Principal component analysis was carried out using the smartPCA program of the EIGENSOFT package (Patterson et al. 2006).

Genome-Wide Patterns of Genetic Diversity and Divergence

The average pairwise nucleotide diversity (θπ) and Tajima's D statistic of each breed were calculated using a sliding window approach (50-kb sliding windows in 10-kb steps) with the default parameters of VCFtools (Danecek et al. 2011). Population differentiation was measured by pairwise FST using the unbiased estimator of Weir and Cockerham (1984) with the default parameters.
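As a rough illustration of the sliding-window diversity calculation (50-kb windows advanced in 10-kb steps), the Python sketch below computes per-site pairwise diversity for biallelic SNPs and averages it over window length. It is a simplified stand-in, not VCFtools' implementation: the function names are hypothetical, missing data are ignored, and every base in a window is assumed to be callable.

```python
from typing import List, Tuple


def site_pi(alt_count: int, n_chrom: int) -> float:
    """Pairwise diversity at one biallelic SNP.

    This is the probability that two chromosomes drawn without
    replacement carry different alleles: 2k(n-k) / (n(n-1)).
    """
    if n_chrom < 2:
        return 0.0
    return 2.0 * alt_count * (n_chrom - alt_count) / (n_chrom * (n_chrom - 1))


def windowed_pi(sites: List[Tuple[int, int, int]],
                chrom_len: int,
                window: int = 50_000,
                step: int = 10_000) -> List[Tuple[int, int, float]]:
    """Sliding-window nucleotide diversity.

    sites: (1-based position, alternate-allele count, chromosomes sampled).
    Returns (window start, window end, pi per base pair) for each window.
    """
    results = []
    start = 1
    while start <= chrom_len:
        end = min(start + window - 1, chrom_len)
        total = sum(site_pi(k, n) for pos, k, n in sites if start <= pos <= end)
        results.append((start, end, total / (end - start + 1)))
        start += step
    return results


if __name__ == "__main__":
    toy_sites = [(1_200, 3, 10), (48_000, 5, 10), (61_000, 1, 10)]
    for w_start, w_end, pi in windowed_pi(toy_sites, chrom_len=100_000):
        print(w_start, w_end, f"{pi:.3e}")
```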

Linkage Disequilibrium

To estimate the genome-wide LD of each breed, we calculated the mean r² values for pairwise markers with Haploview (Barrett et al. 2005). Only SNPs with a minor allele frequency > 0.05 in three groups (Chinese cattle, indicine, and taurine) were used. The parameters of Haploview were set to "-maxdistance 200 -dprime -memory 5000 -minGeno 0.6 -minMAF 0.05 -hwcutoff 0.001". To minimize the influence of sample size, only breeds with at least five individuals were used, and breeds with more than five samples were downsampled to five.
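For readers unfamiliar with the r² measure of linkage disequilibrium, the following Python sketch computes it for one SNP pair. It assumes phased haplotypes coded as 0/1, which is a simplification: Haploview estimates haplotype frequencies from genotype data, so this is an illustration of the statistic rather than of Haploview's procedure, and the function name is hypothetical.

```python
from typing import Sequence


def r_squared(hap_a: Sequence[int], hap_b: Sequence[int]) -> float:
    """r^2 between two biallelic SNPs from phased haplotypes.

    hap_a[i] and hap_b[i] are the alleles (0 or 1) carried by
    haplotype i at SNP A and SNP B, respectively.
    """
    if len(hap_a) != len(hap_b) or not hap_a:
        raise ValueError("haplotype vectors must be non-empty and equal in length")
    n = len(hap_a)
    p_a = sum(hap_a) / n                      # frequency of allele 1 at SNP A
    p_b = sum(hap_b) / n                      # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return float("nan")                   # undefined for a monomorphic SNP
    return d * d / denom


if __name__ == "__main__":
    print(r_squared([0, 0, 1, 1], [0, 0, 1, 1]))  # complete LD
    print(r_squared([0, 1, 0, 1], [0, 0, 1, 1]))  # no association
```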

Haplotype Diversity

For the haplotype diversity analysis, the same breeds and SNP set were used as in the linkage disequilibrium analysis. To calculate haplotype diversity, the genome was divided into 5- to 500-kb bins (detailed in supplementary table S6, Supplementary Material online). Windows with fewer than two SNPs per 5 kb were removed, and for those with more than four SNPs, four SNPs were randomly selected. Considering the substantial variation in the recombination rate across the cattle genome, we adopted a sliding-window strategy and allowed the window to slide by half its length each time. The frequencies of haplotypes were counted, and haplotype diversity (H) was calculated as described previously (Daetwyler et al. 2014).
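The text defers the exact definition of H to Daetwyler et al. (2014). As an illustration only, the Python sketch below uses Nei's sample-size-corrected haplotype diversity, H = n/(n-1) × (1 - Σ p_i²); that this is the estimator the authors applied is an assumption, and the function name is hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def haplotype_diversity(haplotypes: Sequence[Hashable]) -> float:
    """Nei's haplotype diversity for the haplotypes seen in one window.

    haplotypes: one entry per sampled chromosome, e.g. the string of
    alleles at the (up to four) SNPs retained in that window.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two haplotypes are required")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


if __name__ == "__main__":
    # Ten chromosomes (five downsampled diploid animals), four SNPs each.
    window = ["0011", "0011", "0011", "0101", "0101",
              "1100", "1100", "0011", "0101", "0000"]
    print(round(haplotype_diversity(window), 4))
```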

PSMC Analysis

We inferred the demographic history of B. taurus and B. indicus using the Pairwise Sequentially Markovian Coalescent (PSMC) model (Li and Durbin 2011). In the default PSMC approach, a whole-genome diploid consensus sequence was generated using the alignment file from one sample. Recalling that most of our genomes have not been sequenced to a high average depth of coverage (mostly ~10×) and that PSMC has high false-negative rates at low depths of coverage (i.e., < 20×), leading to a systematic underestimation of true event times (Orlando et al. 2013; Nadachowska-Brzyska et al. 2016), we applied a modified PSMC approach: the SNPs of one sample were extracted from variants called on cohorts of all samples and converted to consensus sequences. This procedure was followed for samples (marked in supplementary table S1, Supplementary Material online) with relatively high sequencing depth in each breed to ensure the quality of the consensus sequences. We then transformed the consensus sequence into a fasta-like format using "fq2psmcfa". The PSMC parameters were set as follows: "-p 4+25*2+4+6". The mutation rate per generation per site was estimated as $\mu = D \times g / (2 \times T)$, where D is the observed frequency of pairwise differences between two species, T is the estimated divergence time, and g is the estimated generation time for the two species. The cattle generation time (g) was set to an estimate of 5 years, and the estimated divergence time was set to 4.9 Ma based on a previous study on cattle and yak (Qiu et al. 2012). These values yielded an estimated mutation rate of $9.796 \times 10^{-9}$ mutations per generation per site. We obtained the mass accumulation rate (MAR) of Chinese loess over the past 3.6 My (Sun and An 2005), an index indicating cold and dry or warm and wet climatic periods in China (fig. 2a and supplementary figs. S13 and S14, Supplementary Material online).
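The mutation-rate estimate can be checked with a few lines of Python, reading the formula as μ = D × g / (2 × T) with g = 5 years and T = 4.9 million years. The paper does not report D in this section; the value D ≈ 0.0192 used below is back-calculated from the reported rate and is therefore an assumption, as is the function name.

```python
def mutation_rate_per_generation(pairwise_divergence: float,
                                 generation_time_years: float,
                                 divergence_time_years: float) -> float:
    """mu = D * g / (2 * T).

    D accumulates along both lineages since the split, hence the factor
    of two; dividing T by g converts years into generations.
    """
    return (pairwise_divergence * generation_time_years
            / (2.0 * divergence_time_years))


if __name__ == "__main__":
    G_YEARS = 5.0          # cattle generation time used in the text
    T_YEARS = 4.9e6        # cattle-yak divergence time used in the text
    D_ASSUMED = 0.0192     # back-calculated, NOT stated in the paper

    mu = mutation_rate_per_generation(D_ASSUMED, G_YEARS, T_YEARS)
    print(f"{mu:.3e} mutations per site per generation")
```

With these inputs the result agrees with the reported rate to about three significant figures.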

To evaluate the differences between our revised PSMC approach and the default method, we reconstructed trajectories from two samples of the same breed (FLV) with different depths of coverage (SRR1262805 with 24× and SRR1262808 with 9×), which should yield similar inferences. The PSMC profiles retrieved from the default and revised approaches for the high-depth sample were almost identical (supplementary fig. S11, Supplementary Material online) with regard to both the timing and the magnitude of demographic events, except for the most recent expansion phase, in which a lower intensity was found using the revised approach. We found that PSMC inference based on the low-depth sample showed a biased demographic model, which could be satisfactorily corrected with our revised PSMC approach. Additionally, we note that the bias observed for genomes with low depth (< 20×) could also be corrected, assuming a uniform false-negative rate (uFNR), by using the option "-M" of the plotting script "psmc_plot.pl" to specify the uFNR correction rate (Orlando et al. 2013; Hung et al. 2014). With the uFNR correction, the plot for the low-depth sample was similar to the high-depth PSMC inference (supplementary fig. S11, Supplementary Material online). No striking differences were observed among the PSMC profiles reconstructed from different taurine breeds with different sequencing depths of coverage (ranging from 9× to 24×; supplementary table S1, Supplementary Material online, and fig. 2a). Consequently, we considered our revised approach a suitable method, introducing only acceptable new biases, for PSMC inference from samples with low average sequencing depth.

To explore the potential impact of the reference genome on the PSMC results of indicine breeds, we mapped sequence reads of indicine samples against the assembly of B. indicus (Nelore breed, GenBank assembly accession GCF_000247795.1) and repeated the PSMC analysis (default setting with uFNR correction). Although the PSMC profiles reconstructed from different references were not identical, the qualitative results hold for indicine breeds with the B. indicus reference genome (fig. 2a and supplementary fig. S12, Supplementary Material online).

MSMC Analysis

The Multiple Sequentially Markovian Coalescent (MSMC) model (Schiffels and Durbin 2014) was used to infer changes in effective population size (Ne) and divergence time between breeds (samples marked in supplementary table S1, Supplementary Material online). MSMC is an extension of the PSMC model, which uses a hidden Markov model to scan genomes and analyze patterns of heterozygosity, with long DNA segments of low heterozygosity reflecting recent coalescent events. The rate of coalescent events is then used to estimate Ne at a given time. To scale the output of MSMC to real-time population sizes, we used the generation time and mutation rate mentioned earlier (see the description of the PSMC analysis). We obtained atmospheric surface air temperature (SAT) and global sea level (GSL) data for the past 3 My (Bintanja and van de Wal 2008) (supplementary figs. S13 and S14, Supplementary Material online).
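The scaling step mentioned above can be sketched in Python. MSMC reports time boundaries and coalescence rates in mutation-scaled units; the conversions below (years = scaled time / μ × g, and Ne = 1 / (2 × μ × λ)) follow the scaling commonly described for MSMC output. The function names are hypothetical, and the exact post-processing used by the authors is not stated beyond the generation time and mutation rate.

```python
MU = 9.796e-9      # mutations per site per generation (from the PSMC section)
GEN_YEARS = 5.0    # generation time in years (from the PSMC section)


def scaled_time_to_years(scaled_time: float,
                         mu: float = MU,
                         gen_years: float = GEN_YEARS) -> float:
    """Convert a mutation-scaled time boundary to calendar years."""
    return scaled_time / mu * gen_years


def coalescence_rate_to_ne(scaled_rate: float, mu: float = MU) -> float:
    """Convert a mutation-scaled coalescence rate to effective population size."""
    return 1.0 / (2.0 * mu * scaled_rate)


if __name__ == "__main__":
    # Hypothetical values of the kind found in one row of MSMC output.
    left_boundary, rate = 2.0e-5, 4.0e3
    print(f"{scaled_time_to_years(left_boundary):,.0f} years ago")
    print(f"Ne = {coalescence_rate_to_ne(rate):,.0f}")
```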

Selective Sweep Analysis

Considering the sample size and close genetic background of indicine cattle (NEL, BRM, and GIR), we pooled these three indicine breeds into one group (IND) in our selection analysis. Seven cattle groups (QCC, RAN, JBC, IND, HOL, FLV, and JER) with sample sizes > 10 were retained for the following analysis. To identify candidate loci for breed-specific phenotypes that are known to be under positive selection, we used the d_i statistic (Akey et al. 2010) to measure the locus-specific divergences in allele frequency for each group based on unbiased estimates of pairwise FST. Briefly, for each SNP, we calculated the statistic

$$d_i = \sum_{j \neq i} \frac{F_{ST}^{ij} - E[F_{ST}^{ij}]}{sd[F_{ST}^{ij}]}$$

where $E[F_{ST}^{ij}]$ and $sd[F_{ST}^{ij}]$ denote the expected value and SD of FST between groups i and j, calculated from all SNPs. For each group, d_i was averaged over the SNPs in nonoverlapping 50-kb windows. Windows with fewer than 10 SNPs were removed. The top 1% of windows with the highest mean d_i score were defined as candidate selective sweep regions. Adjacent sweeps within a distance of 50 kb were merged into one sweep. Selective sweep regions were annotated with cattle QTLdb release 29 from the Animal Quantitative Trait Loci Database (Hu et al. 2016). Candidate genes under positive selection were defined as those in which more than half of the gene interval was found in selective sweep regions. Tajima's D statistic was computed by using VCFtools for each candidate gene. Gene Ontology (GO) enrichment analysis for genes in selective sweep regions was performed with a hypergeometric test using ClueGO (Bindea et al. 2009). The false discovery rate (FDR) was used to correct the P values with the Benjamini–Hochberg approach.
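The d_i calculation and the window averaging can be sketched with NumPy as follows. This is an illustrative reading of the description, in which each pairwise FST is standardized by its genome-wide mean and SD and then summed over the other groups; the array layout and function names are hypothetical, and the per-SNP FST values are assumed to have been computed beforehand.

```python
import numpy as np


def di_statistic(fst: np.ndarray) -> np.ndarray:
    """Per-SNP d_i for every group.

    fst: array of shape (n_groups, n_groups, n_snps) holding pairwise
    FST per SNP (symmetric in the first two axes; the diagonal is ignored).
    Returns an array of shape (n_groups, n_snps).
    """
    n_groups, _, n_snps = fst.shape
    di = np.zeros((n_groups, n_snps))
    for i in range(n_groups):
        for j in range(n_groups):
            if i == j:
                continue
            f = fst[i, j]
            # Standardize by the mean and SD taken over all SNPs.
            di[i] += (f - f.mean()) / f.std()
    return di


def window_means(positions, values, window=50_000, min_snps=10):
    """Mean score in nonoverlapping windows, dropping sparse windows.

    Returns {window index: mean value}; window k spans positions
    k*window + 1 through (k + 1)*window (1-based coordinates).
    """
    positions = np.asarray(positions)
    values = np.asarray(values, dtype=float)
    idx = (positions - 1) // window
    out = {}
    for w in np.unique(idx):
        mask = idx == w
        if mask.sum() >= min_snps:   # windows with fewer SNPs are removed
            out[int(w)] = float(values[mask].mean())
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    toy = rng.uniform(0.0, 0.3, size=(3, 3, 200))
    toy = (toy + toy.transpose(1, 0, 2)) / 2     # make it symmetric
    scores = di_statistic(toy)
    pos = np.sort(rng.integers(1, 500_001, size=200))
    print(window_means(pos, scores[0]))
```

Candidate sweeps would then be the windows in the top 1% of mean d_i for each group, with neighbors within 50 kb merged, as described in the text.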

Supplementary Material

Supplementary data are available at Molecular Biology and Evolution online.

Author Contributions

L.-S.Z., W.-J.Z., G.C., and H.-B.W. led the experiments and designed the analytical strategy. L.-S.Z., C.-G.M., H.-C.W., W.-Q.T., L.-S.G., Y.-Y.Z., Z.-L.J., Y.-P.X., and X.-Z.S. performed animal work and prepared biological samples. C.-G.M., H.-C.W., G.C., H.-B.W., C.-P.Z., A.-N.L., W.-C.Y., C.-L.J., and S.-H.W. constructed the DNA library and performed sequencing. W.-J.Z., Q.-J.L., L.-Z.W., X.-L.W., X.-M.G., and C.-Z.W. detected, annotated, and summarized variants. W.-J.Z., Q.-J.L., C.-G.M., and H.-C.W. performed selection analysis. W.-J.Z., L.-Z.W., C.-G.M., and H.-C.W. analyzed the origin of Chinese indicine cattle and population history. C.-G.M., H.-C.W., W.-J.Z., Q.-J.L., and L.-Z.W. wrote the manuscript. L.-S.Z., S.-C.Z., J.-Z.S., G.L., X.-D.F., X.Z., S.S., H.-M.Y., J.W., and R.H. revised the manuscript. All the authors reviewed and approved the final manuscript.

Acknowledgments

We thank many people not listed as authors who provided feedback, samples, and encouragement, especially Changguo Yan, Yimin Xu, Shanzhai Liu, Guanli Wang, Xiang Gao, Jianghong Wan, and Kaixing Qu. This work was supported by the National 863 Program of China (2013AA102505), the National Science-technology Support Plan Projects (2015BAD03B04), the Program of National Beef Cattle and Yak Industrial Technology System (CARS-37), the Technical Innovation Engineering Project of Shaanxi Province (2014KTZB02-02-01), the National Beef Cattle Improvement Center, the National & Local Joint Engineering Research Center for Modern Cattle Biotechnology and Application, the Beef Cattle Engineering Technology Research Center of Shaanxi Province, and the State Key Laboratory of Agricultural Genomics.

References

Akey JM, Ruhe AL, Akey DT, Wong AK, Connelly CF, Madeoy J, Nicholas TJ, Neff MW. 2010. Tracking footprints of artificial selection in the dog genome. Proc Natl Acad Sci U S A. 107(3):1160–1165.

Alexander DH, Novembre J, Lange K. 2009. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19(9):1655–1664.

Barrett JC, Fry B, Maller J, Daly MJ. 2005. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 21(2):263–265.

Bech-Sabat G, Lopez-Gatius F, Yaniz JL, Garcia-Ispierto I, Santolaria P, Serrano B, Sulon J, de Sousa NM, Beckers JF. 2008. Factors affecting plasma progesterone in the early fetal period in high producing dairy cows. Theriogenology 69(4):426–432.

Bickhart DM, Xu L, Hutchison JL, Cole JB, Null DJ, Schroeder SG, Song J, Garcia JF, Sonstegard TS, Van Tassell CP, et al. 2016. Diversity and population-genetic properties of copy number variations and multicopy genes in cattle. DNA Res. 23(3):253–262.

Bindea G, Mlecnik B, Hackl H, Charoentong P, Tosolini M, Kirilovsky A, Fridman WH, Pages F, Trajanoski Z, Galon J. 2009. ClueGO: a Cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks. Bioinformatics 25(8):1091–1093.

Bintanja R, van de Wal RSW. 2008. North American ice-sheet dynamics and the onset of 100,000-year glacial cycles. Nature 454(7206):869–872.


Bloise E, Cassali GD, Ferreira MC, Ciarmela P, Petraglia F, Reis FM. 2010. Activin-related proteins in bovine mammary gland: localization and differential expression during gestational development and differentiation. J Dairy Sci. 93(10):4592–4601.

Brandes R, Arad R, Bar-Tana J. 1995. Inducers of adipose conversion activate transcription promoted by a peroxisome proliferator's response element in 3T3-L1 cells. Biochem Pharmacol. 50(11):1949–1951.

Browning SR, Browning BL. 2007. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am J Hum Genet. 81(5):1084–1097.

Cai D, Sun Y, Tang Z, Hu S, Li W, Zhao X, Xiang H, Zhou H. 2014. The origins of Chinese domestic cattle as revealed by ancient DNA analysis. J Archaeol Sci. 41:423–434.

Cai X, Chen H, Lei C, Wang S, Xue K, Zhang B. 2007. mtDNA diversity and genetic lineages of eighteen cattle breeds from Bos taurus and Bos indicus in China. Genetica 131(2):175–183.

Cai X, Chen H, Wang S, Xue K, Lei C. 2006. Polymorphisms of two Y chromosome microsatellites in Chinese cattle. Genet Sel Evol. 38(5):525–534.

Canavez FC, Luche DD, Stothard P, Leite KR, Sousa-Canavez JM, Plastow G, Meidanis J, Souza MA, Feijao P, Moore SS, et al. 2012. Genome sequence and assembly of Bos indicus. J Hered. 103(3):342–348.

Chen S, Wang Y, Kong X, Liu D, Cheng H, Edwards RL. 2006. A possible Younger Dryas-type event during Asian monsoonal Termination 3. Sci China D Earth Sci. 49(9):982–990.

Choi JW, Liao X, Park S, Jeon HJ, Chung WH, Stothard P, Park YS, Lee JK, Lee KT, Kim SH, et al. 2013. Massively parallel sequencing of Chikso (Korean brindle cattle) to discover genome-wide SNPs and InDels. Mol Cells 36(3):203–211.

Clark DL, Boler DD, Kutzler LW, Jones KA, McKeith FK, Killefer J, Carr TR, Dilger AC. 2011. Muscle gene expression associated with increased marbling in beef cattle. Anim Biotechnol. 22(2):51–63.

Daetwyler HD, Capitan A, Pausch H, Stothard P, van Binsbergen R, Brondum RF, Liao X, Djari A, Rodriguez SC, Grohs C, et al. 2014. Whole-genome sequencing of 234 bulls facilitates mapping of monogenic and complex traits in cattle. Nat Genet. 46(8):858–865.

Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, Handsaker RE, Lunter G, Marth GT, Sherry ST, et al. 2011. The variant call format and VCFtools. Bioinformatics 27(15):2156–2158.

Decker JE, McKay SD, Rolf MM, Kim J, Molina AA, Sonstegard TS, Hanotte O, Gotherstrom A, Seabury CM, Praharani L, et al. 2014. Worldwide patterns of ancestry, divergence, and admixture in domesticated cattle. PLoS Genet. 10(3):e1004254.

Eberlein A, Takasuga A, Setoguchi K, Pfuhl R, Flisikowski K, Fries R, Klopp N, Furbass R, Weikard R, Kuhn C. 2009. Dissection of genetic factors modulating fetal growth in cattle indicates a substantial role of the non-SMC condensin I complex, subunit G (NCAPG) gene. Genetics 183(3):951–964.

Fang X, Lu L, Yang S, Li J, An Z, Jiang PA, Chen X. 2002. Loess in Kunlun Mountains and its implications on desert development and Tibetan Plateau uplift in west China. Sci China D Earth Sci. 45(4):289–299.

Forti E, Aksanov O, Birk RZ. 2007. Temporal expression pattern of Bardet-Biedl syndrome genes in adipogenesis. Int J Biochem Cell B. 39(5):1055–1062.

Gan HY, Li JB, Wang HM, Gao YD, Liu WH, Li JP, Zhong JF. 2007. Relationship between the melanocortin receptor 1 (MC1R) gene and the coat color phenotype in cattle. Yi Chuan 29:195–200.

Gibbs RA, Taylor JF, Van Tassell CP, Barendse W, Eversole KA, Gill CA, Green RD, Hamernik DL, Kappes SM, Lien S, et al. 2009. Genome-wide survey of SNP variation uncovers the genetic structure of cattle breeds. Science 324(5926):528–532.

Gotoh T, Takahashi H, Nishimura T, Kuchida K, Mannen H. 2014. Meat produced by Japanese Black cattle and Wagyu. Anim Front. 4(4):46–54.

Grindflek E, Holzbauer R, Plastow G, Rothschild MF. 2002. Mapping and investigation of the porcine major insulin sensitive glucose transport (SLC2A4/GLUT4) gene as a candidate gene for meat quality and carcass traits. J Anim Breed Genet. 119(1):47–55.

Gutierrez-Gil B, Arranz JJ, Wiener P. 2015. An interpretive review of selective sweep studies in Bos taurus cattle populations: identification of unique and shared selection signals across breeds. Front Genet. 6:167.

Hansen PJ. 2004. Physiological and cellular adaptations of zebu cattle to thermal stress. Anim Reprod Sci. 82–83:349–360.

Hiendleder S, Lewalski H, Janke A. 2008. Complete mitochondrial genomes of Bos taurus and Bos indicus provide new insights into intra-species variation, taxonomy and domestication. Cytogenet Genome Res. 120(1–2):150–156.

Hu ZL, Park CA, Reecy JM. 2016. Developmental progress and current status of the Animal QTLdb. Nucleic Acids Res. 44(D1):D827–D833.

Hung CM, Shaner PJ, Zink RM, Liu WC, Chu TC, Huang WS, Li SH. 2014. Drastic population fluctuations explain the rapid extinction of the passenger pigeon. Proc Natl Acad Sci U S A. 111(29):10636–10641.

Kovacik A, Bulla J, Trakovicka A, Zitny J, Rafayova A. 2012. The effect of the porcine melanocortin-5 receptor (MC5R) gene associated with feed intake, carcass and physico-chemical characteristics. J Microbiol Biotechnol Food Sci. 1:498–506.

Labrecque R, Vigneault C, Blondin P, Sirard MA. 2013. Gene expression analysis of bovine oocytes with high developmental competence obtained from FSH-stimulated animals. Mol Reprod Dev. 80(6):428–440.

Lai SJ, Liu YP, Liu YX, Li XW, Yao YG. 2006. Genetic diversity and origin of Chinese cattle revealed by mtDNA D-loop sequence variation. Mol Phylogenet Evol. 38(1):146–154.

Lee KT, Chung WH, Lee SY, Choi JW, Kim J, Lim D, Lee S, Jang GW, Kim B, Choy YH, et al. 2013. Whole-genome resequencing of Hanwoo (Korean cattle) and insight into regions of homozygosity. BMC Genomics 14:519.

Lee SS, Yang BS, Yang YH, Kang SY, Ko SB, Jung JK, Oh WY, Oh SJ, Kim KI. 2002. Analysis of melanocortin receptor 1 (MC1R) genotype in Korean brindle cattle and Korean cattle with dark muzzle. J Anim Sci Technol. 44(1):23–30.

Lei C, Chen H, Hu S. 2000. Studies on Y chromosome polymorphism and the origin and classification of Chinese yellow cattle. Acta Agric Boreali-Occidentalis Sin. 9:43–47.

Lei CZ, Chen H, Zhang HC, Cai X, Liu RY, Luo LY, Wang CF, Zhang W, Ge QL, Zhang RF, et al. 2006. Origin and phylogeographical structure of Chinese cattle. Anim Genet. 37(6):579–582.

Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25(14):1754–1760.

Li H, Durbin R. 2011. Inference of human population history from individual whole-genome sequences. Nature 475(7357):493–496.

Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25(16):2078–2079.

Lin X, Luo J, Zhang L, Zhu J. 2013. MicroRNAs synergistically regulate milk fat synthesis in mammary gland epithelial cells of dairy goats. Gene Expr. 16(1):1–13.

Loftus RT, MacHugh DE, Bradley DG, Sharp PM, Cunningham P. 1994. Evidence for two independent domestications of cattle. Proc Natl Acad Sci U S A. 91(7):2757–2761.

McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, et al. 2010. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20(9):1297–1303.


Medugorac I, Graf A, Grohs C, Rothammer S, Zagdsuren Y, Gladyr E, Zinovieva N, Barbieri J, Seichter D, Russ I, et al. 2017. Whole-genome analysis of introgressive hybridization and characterization of the bovine legacy of Mongolian yaks. Nat Genet. 49(3):470–475.

Mei C, Wang H, Zhu W, Wang H, Cheng G, Qu K, Guang X, Li A, Zhao C, Yang W, et al. 2016. Whole-genome sequencing of the endangered bovine species Gayal (Bos frontalis) provides new insights into its genetic features. Sci Rep. 6:19787.

Miller W, Schuster SC, Welch AJ, Ratan A, Bedoya-Reina OC, Zhao F, Kim HL, Burhans RC, Drautz DI, Wittekindt NE, et al. 2012. Polar and brown bear genomes reveal ancient admixture and demographic footprints of past climate change. Proc Natl Acad Sci U S A. 109(36):E2382–E2390.

Nadachowska-Brzyska K, Burri R, Smeds L, Ellegren H. 2016. PSMC analysis of effective population sizes in molecular ecology and its application to black-and-white Ficedula flycatchers. Mol Ecol. 25(5):1058–1072.

Orlando L, Ginolhac A, Zhang G, Froese D, Albrechtsen A, Stiller M, Schubert M, Cappellini E, Petersen B, Moltke I, et al. 2013. Recalibrating Equus evolution using the genome sequence of an early Middle Pleistocene horse. Nature 499(7456):74–78.

Ouali A, Herrera-Mendez CH, Coulis G, Becila S, Boudjellal A, Aubry L, Sentandreu MA. 2006. Revisiting the conversion of muscle into meat and the underlying mechanisms. Meat Sci. 74(1):44–58.

Patterson N, Price AL, Reich D. 2006. Population structure and eigenanalysis. PLoS Genet. 2(12):e190.

Porto-Neto LR, Sonstegard TS, Liu GE, Bickhart DM, Da SM, Machado MA, Utsunomiya YT, Garcia JF, Gondro C, Van Tassell CP. 2013. Genomic divergence of zebu and taurine cattle identified through high-density SNP genotyping. BMC Genomics 14(1):876.

Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, Maller J, Sklar P, de Bakker PI, Daly MJ, et al. 2007. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 81(3):559–575.

Qiu H, Ju ZY, Chang ZJ. 1993. A survey of cattle production in China. More attention to animal genetic resources. Food and Agriculture Organization of the United Nations.

Qiu Q, Wang L, Wang K, Yang Y, Ma T, Wang Z, Zhang X, Ni Z, Hou F, Long R, et al. 2015. Yak whole-genome resequencing reveals domestication signatures and prehistoric population expansions. Nat Commun. 6:10283.

Qiu Q, Zhang G, Ma T, Qian W, Wang J, Ye Z, Cao C, Hu Q, Kim J, Larkin DM, et al. 2012. The yak genome and adaptation to life at high altitude. Nat Genet. 44(8):946–949.

Randhawa IAS, Khatkar MS, Thomson PC, Raadsma HW, Barendse W. 2016. A meta-assembly of selection signatures in cattle. PLoS One 11(4):e153013.

Rincon G, Islas-Trejo A, Casellas J, Ronin Y, Soller M, Lipkin E, Medrano JF. 2009. Fine mapping and association analysis of a quantitative trait locus for milk production traits on Bos taurus autosome 4. J Dairy Sci. 92(2):758–764.

Ruetz TJ, Lin AE, Guttman JA. 2012. Enterohaemorrhagic Escherichia coli requires the spectrin cytoskeleton for efficient attachment and pedestal formation on host cells. Microb Pathog. 52(3):149–156.

Sartori R, Bastos MR, Baruselli PS, Gimenes LU, Ereno RL, Barros CM. 2010. Physiological differences and implications to reproductive management of Bos taurus and Bos indicus cattle in a tropical environment. Reprod Domest Rumin VII 7(1):357–375.

Sattler K, Levkau B. 2009. Sphingosine-1-phosphate as a mediator of high-density lipoprotein effects in cardiovascular protection. Cardiovasc Res. 82(2):201–211.

Scherf BD, Pilling D. 2015. The second report on the state of the world's animal genetic resources for food and agriculture. Food and Agriculture Organization of the United Nations.

Schiffels S, Durbin R. 2014. Inferring human population size and separation history from multiple genome sequences. Nat Genet. 46(8):919–925.

Setoguchi K, Watanabe T, Weikard R, Albrecht E, Kuhn C, Kinoshita A, Sugimoto Y, Takasuga A. 2011. The SNP c.1326T>G in the non-SMC condensin I complex, subunit G (NCAPG) gene encoding a p.Ile442Met variant is associated with an increase in body frame size at puberty in cattle. Anim Genet. 42(6):650–655.

Sherratt A. 1983. The secondary exploitation of animals in the Old World. World Archaeol. 15(1):90–104.

Smith TP, Grosse WM, Freking BA, Roberts AJ, Stone RT, Casas E, Wray JE, White J, Cho J, Fahrenkrug SC, et al. 2001. Sequence evaluation of four pooled-tissue normalized bovine cDNA libraries and construction of a gene index for cattle. Genome Res. 11(4):626–630.

Sorbolini S, Marras G, Gaspa G, Dimauro C, Cellesi M, Valentini A, Macciotta NP. 2015. Detection of selection signatures in Piemontese and Marchigiana cattle, two breeds with similar production aptitudes but different selection histories. Genet Sel Evol. 47:52.

Stothard P, Choi JW, Basu U, Sumner-Thomson JM, Meng Y, Liao X, Moore SS. 2011. Whole genome resequencing of black Angus and Holstein cattle for SNP and CNV discovery. BMC Genomics 12:559.

Sun YB, An ZS. 2005. Late Pliocene-Pleistocene changes in mass accumulation rates of eolian deposits on the central Chinese Loess Plateau. J Geophys Res-Atmos. 110(D23):1193–1194.

Svizzero S, Tisdell C. 2016. Input shortages and the lack of sustainability of bronze production by the Unetice. In: Working Papers on Economics, Ecology and the Environment, No. 202. Queensland (Australia): University of Queensland.

Switonski M, Mankowska M, Salamon S. 2013. Family of melanocortin receptor (MCR) genes in mammals-mutations, polymorphisms and phenotypic effects. J Appl Genet. 54(4):461–472.

Van Vuure T. 2002. History, morphology and ecology of the aurochs (Bos primigenius). Available from: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.534.6285

Wang K, Li M, Hakonarson H. 2010. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38(16):e164.

Wang M, Ding Y. 1996. The importance of work animals in rural China. World Anim Rev. 86:65–67.

Wedholm A, Larsen LB, Lindmark-Mansson H, Karlsson AH, Andren A. 2006. Effect of protein composition on the cheese-making properties of milk from individual dairy cows. J Dairy Sci. 89(9):3296–3305.

Weir BS, Cockerham CC. 1984. Estimating F-statistics for the analysis of population structure. Evolution 38(6):1358–1370.

Xu L, Bickhart DM, Cole JB, Schroeder SG, Song J, Tassell CP, Sonstegard TS, Liu GE. 2015. Genomic signatures reveal new evidences for selection of important traits in domestic cattle. Mol Biol Evol. 32(3):711–725.

Xu Y, Yu W, Xiong Y, Xie H, Ren Z, Xu D, Lei M, Zuo B, Feng X. 2011. Molecular characterization and expression patterns of serine/arginine-rich specific kinase 3 (SPRK3) in porcine skeletal muscle. Mol Biol Rep. 38(5):2903–2909.

Yahvah KM, Brooker SL, Williams JE, Settles M, Mcguire MA, Mcguire MK. 2015. Elevated dairy fat intake in lactating women alters milk lipid and fatty acids without detectible changes in expression of genes related to lipid uptake or synthesis. Nutr Res. 35(3):221.

Yoon D, Ko E. 2016. Association study between SNPs of the genes within bovine QTLs and meat quality of Hanwoo. J Anim Sci. 94(Suppl 4):145.

Yu Y, Nie L, He ZQ, Wen JK, Jian CS, Zhang YP. 1999. Mitochondrial DNA variation in cattle of south China: origin and introgression. Anim Genet. 30(4):245–250.

Zhang H, Paijmans JL, Chang F, Wu X, Chen G, Lei C, Yang X, Wei Z, Bradley DG, Orlando L, et al. 2013. Morphological and genetic evidence for early Holocene cattle management in northeastern China. Nat Commun. 4:2755.

Zhang H, Wang S, Wang Z, Da Y, Wang N, Hu X, Zhang Y, Wang Y, Leng L, Tang Z, et al. 2012. A genome-wide scan of selective sweeps in two broiler chicken lines divergently selected for abdominal fat content. BMC Genomics 13:704.


Zhang Z, Wang Z, Yang Y, Zhao J, Chen Q, Liao R, Chen Z, Zhang X, Xue M, Yang H, et al. 2016. Identification of pleiotropic genes and gene sets underlying growth and immunity traits: a case study on Meishan pigs. Animal 10(4):550–557.

Zhao C, Tian F, Yu Y, Luo J, Hu Q, Bequette BJ, Baldwin VR, Liu G, Zan L, Scott UM, et al. 2012. Muscle transcriptomic analyses in Angus cattle with divergent tenderness. Mol Biol Rep. 39(4):4185–4193.

Zhao S, Zheng P, Dong S, Zhan X, Wu Q, Guo X, Hu Y, He W, Zhang S, Fan W, et al. 2013. Whole-genome sequencing of giant pandas provides insights into demographic history and local adaptation. Nat Genet. 45(1):67–71.

Zheng B, Xu Q, Shen Y. 2002. The relationship between climate change and Quaternary glacial cycles on the Qinghai–Tibetan Plateau: review and speculation. Quatern Int. 97–98:93–101.

Zhou X, Wang B, Pan Q, Zhang J, Kumar S, Sun X, Liu Z, Pan H, Lin Y, Liu G, et al. 2014. Whole-genome sequencing of the snub-nosed monkey provides insights into folivory and evolutionary history. Nat Genet. 46(12):1303–1310.

Zimin AV, Delcher AL, Florea L, Kelley DR, Schatz MC, Puiu D, Hanrahan F, Pertea G, Van Tassell CP, Sonstegard TS, et al. 2009. A whole-genome assembly of the domestic cow, Bos taurus. Genome Biol. 10(4):R42.


chromosome polymorphisms: a southern group largely descended from the indicine lineage, a northern group belonging to the taurine lineage, and a central group, which originated from B. taurus × B. indicus hybrids (Qiu et al. 1993; Cai et al. 2006).

High-throughput whole-genome sequencing can be used to exploit population structure and characteristics to identify the effects of selection upon the cattle genome in different breeds. This approach has been performed with dairy cattle, such as Holstein–Friesian, Fleckvieh, and Jersey populations, for traits including embryonic death, lethal chondrodysplasia, milk production, and curly coat (Daetwyler et al. 2014). Studies have also been performed on economic traits with breeds such as Hereford, Black Angus, and Limousin (Gibbs et al. 2009; Stothard et al. 2011). Several studies of traits under positive selection have been performed with many European breeds; however, positive selection signatures in Chinese cattle have yet to be determined. A limited number of phylogenetic studies of Chinese cattle have been performed with Y chromosomal and mitochondrial DNAs (Lei et al. 2000; Cai et al. 2007, 2014). These sequences reflect the histories of individual loci and thus do not have the power to track artificial selection signals, complex histories of introgression, or admixture of genomes. Thus, the population stratification of Chinese cattle and signatures of selection in these breeds remain poorly understood.

In this study, we performed whole-genome sequencing on six phenotypically and geographically diverse domestic Chinese cattle breeds (Qinchuan cattle, QCC, n = 37; Nanyang cattle, NYC, n = 2; Luxi cattle, LXC, n = 1; Yanbian cattle, YBC, n = 2; Yunnan cattle, YNC, n = 2; and Leiqiong cattle, LQC, n = 2) and two non-Chinese breeds (Japanese black cattle, JBC, n = 11; and Red Angus cattle, RAN, n = 18). Using the obtained whole-genome sequence data together with publicly available whole-genome sequence data for seven additional breeds, we explored the genetic diversity, phylogenetic relationships, and demographic history of Chinese cattle. We also integrated patterns of hybridization and detected genes and corresponding variants that are associated with agriculturally important traits. Our analyses provide new insights into the population stratification and local breeding of Chinese cattle and the interface with worldwide domestic breeds.

Results and Discussion

Whole-Genome Sequencing and Genetic Variation

Whole-genome sequencing of 75 samples generated a total of 27.52 billion paired-end reads with 500-bp insert size. Alignment with the reference genome of B. taurus (UMD3.1) showed an average depth of 11.4× and an average coverage of 98.46% (supplementary table S1, Supplementary Material online). To place these cattle in a more detailed phylogeographic context, we also analyzed previously published whole-genome sequence data from individuals of representative taurine and indicine breeds (n = 76; supplementary tables S2 and S3 and fig. S1, Supplementary Material online). We detected a total of 57.22 million single-nucleotide polymorphisms (SNPs) and 5.27 million small insertions and deletions (InDels) (supplementary table S3 and fig. S2, Supplementary Material online). More than half (59.90% and 72.45%) of the SNPs and InDels were absent in the SNP Database (dbSNP, release 140); the novel variants, which substantially expanded the set of genetic variants in cattle, were mainly contributed by B. indicus and Chinese breeds, especially LQC and QCC (supplementary table S3 and figs. S3 and S4, Supplementary Material online). Rare variants (<1%) captured 37.64% of the data set. Approximately 21.54 million autosomal variants had a minor allele frequency <1%, 16.38 million had frequencies between 1% and 5%, and 19.30 million had a frequency >5% (supplementary fig. S3a, Supplementary Material online). Most of the common variants with a minor allele frequency >5% (80.94% of 19.30 million) were found in the dbSNP database. In contrast, only 30.91% (5.06 million of 16.38 million) of the variants with frequencies of 1–5% and 10.50% (2.26 million of 21.54 million) of the variants with frequencies <1% were found in dbSNP.

Taurine breeds had an average of 5.06 million single-nucleotide variants (SNVs) per sample, 96.2% of which were found in the dbSNP database. Indicine breeds had an average of 11.91 million SNVs per sample, 2.35 times higher than taurine breeds (supplementary table S3, Supplementary Material online). Only 59.03% of the indicine SNVs were found in the database. Specifically, LQC from South China had an average of 16.78 million SNVs, with only 48.63% found in the database, which is 3.8 times the number for the taurine breeds (supplementary tables S3 and S4, Supplementary Material online). In addition, 95.85% of the singletons were found in Chinese cattle breeds, especially LQC (supplementary figs. S3a, S4, and S5c, Supplementary Material online), indicating that the Chinese cattle had high genetic diversity.

Most of the cattle groups experienced population bottlenecks during domestication. Taurine cattle showed a similar level of nucleotide diversity (θπ ~1.2 × 10⁻³) as that of yak (Qiu et al. 2015) and giant panda (1.3 × 10⁻³) (Zhao et al. 2013). Moreover, they showed higher nucleotide diversity than that estimated for human populations (1.0 × 10⁻³) and lower than that of indicine cattle (2 × 10⁻³) (supplementary table S3, Supplementary Material online). This low level of variation in taurine cattle was also reflected by the extensive linkage disequilibrium (LD) levels among taurine breeds, especially JBC and JER (supplementary fig. S6b, Supplementary Material online), indicating a severe bottleneck in taurine cattle (Gibbs et al. 2009; Stothard et al. 2011; Daetwyler et al. 2014).

Compared with European cattle (1 × 10⁻³), Chinese cattle (2 × 10⁻³ to 4 × 10⁻³) showed relatively high nucleotide diversity. The nucleotide diversity of LQC (4.2 × 10⁻³) was approximately two times higher than that of indicine breeds (2 × 10⁻³). This distinction between Leiqiong (LQC) and indicine cattle was also apparent from the high level of fixation index (FST) between them (supplementary table S5, Supplementary Material online). The difference of Chinese cattle from European cattle was also reflected by the values of heterozygosity, haplotype diversity, runs of homozygosity (ROH), inbreeding coefficients, and identity-by-descent (IBD) (supplementary table S6 and figs. S6a and c and S7, Supplementary Material online). These findings are consistent with previous studies (Lai et al. 2006; Lei et al. 2006; Decker et al. 2014), in which B. taurus × B. indicus admixture events in Chinese cattle breeds were found to have significantly increased the genetic diversity of Chinese cattle relative to European taurine cattle.

By comparing the population-specific SNPs among B. indicus (including only BRM, GIR, and NEL), LQC, and B. taurus (including RAN, JBC, HOL, JER, FLV, and LIM), we found 938, 79, and 4 population-specific nonsynonymous variants (PSNVs) with high variant allele frequency (>0.9), respectively (supplementary table S7, Supplementary Material online). Strikingly, some genes had more than three novel nonsynonymous variants (supplementary tables S8–S10, Supplementary Material online). For example, there were 12, 5, and 4 nonsynonymous mutations in the SPTBN5, RP1L1, and GHRHR genes, respectively, of B. indicus and 9, 8, and 5 novel nonsynonymous mutations in the LOC616720, LOC101903496, and RNF213 genes, respectively, of LQC (supplementary tables S8 and S10, Supplementary Material online). In the SPTBN5 gene, there were 14 B. indicus-specific missense mutations with high allele frequency (>0.93; supplementary table S8, Supplementary Material online). These missense mutations were only observed in indicine cattle, and only two of them have been found in dbSNP. In humans, SPTBN5 imparts a certain resistance to the parasite Plasmodium falciparum and to enterohemorrhagic Escherichia coli (Ruetz et al. 2012; Labrecque et al. 2013). Indicine cattle are more resistant to thermal stress, parasites, and disease than taurine cattle (Hansen 2004; Sartori et al. 2010). This knowledge led us to speculate that the SPTBN5 gene may contribute to parasite resistance in indicine cattle. Therefore, these genes with specific nonsynonymous variants might have played roles in the formation of the characteristic phenotypes of each breed.

Population Structure and Characterization of Chinese Cattle Breeds

Using yak (Bos grunniens) as an outgroup, we explored the phylogenetic relationships among 151 cattle samples based on whole-genome SNP data. The resulting neighbor-joining tree supported the clustering of the taurine clade (RAN, JBC, HOL, FLV, LIM, JER, and YBC) and the indicine clade (BRM, NEL, and GIR). NYC, LXC, YNC, and LQC were grouped together near the indicine clade (fig. 1a). Interestingly, QCC was situated between these two clades and had many branches connecting to the trunk, consistent with previous studies (Lai et al. 2006; Lei et al. 2006; Decker et al. 2014), suggesting that this breed has two main ancestor components: taurine and indicine. The principal component analysis (PCA) provided similar results, with all of the taurine cattle breeds except JBC and YBC forming a tight cluster clearly separate from indicine cattle, and most of the Chinese populations occupying intermediate positions between the two major clusters (fig. 1b and supplementary figs. S8 and S9, Supplementary Material online). PC2 tended to separate populations sampled in East Asia from those of India and Europe. Both the phylogenetic and PCA analyses indicated a heterogeneous nature of Chinese cattle. The genetic influence of B. taurus was greater on QCC and YBC than on the other breeds of central and southern China, whereas B. indicus contributed more to LQC, YNC, NYC, and LXC than to the remaining breeds. These results are consistent with the hypothesis that Chinese cattle breeds are admixtures of taurine (B. taurus) and indicine (B. indicus) cattle (Yu et al. 1999).

We used clustering models for estimating ancestral populations, setting K = 2 through K = 6, with ADMIXTURE (Alexander et al. 2009) for all 151 cattle samples (fig. 1c). With K changing progressively from 2 to 6, we found that Chinese breeds showed evidence of admixture, with the extent varying among the different breeds. The average ancestry proportions for each of the admixed populations, assuming K = 2 ancestral populations, are shown in supplementary table S11, Supplementary Material online. We found a strong association between genetic diversity and indicine descent in Chinese breeds (supplementary fig. S10, Supplementary Material online). Our results indicate that the heterogeneous nature of Chinese cattle mainly originated from hybridization between B. taurus and B. indicus.

Inference of Population Size from Whole-Genome Sequencing Data

Historical fluctuations in the effective population size (Ne) for all cattle were reconstructed using the Pairwise Sequentially Markovian Coalescent (PSMC) model, and two bottlenecks and two expansions were identified for all cattle (fig. 2a and supplementary figs. S11–S13, Supplementary Material online). The split between the ancestor of indicine cattle and the ancestor of taurine cattle occurred ~1.6 Ma, around which time the uplift of the Himalayas (Yuanmu movement, ~1.6 Ma) (Zheng et al. 2002) established geographical isolation. It is highly possible that the habitat of B. primigenius was split into two regions (South Asia and South China), resulting in population separation between the ancestor of B. indicus and the ancestor of B. taurus, consistent with the aforementioned results.

The cattle population declined ~0.8 Ma, at the same time as the three largest Pleistocene glaciations: the Xixiabangma Glaciation (XG, 1.1–0.8 Ma), the Naynayxungla Glaciation (NG, 0.78–0.5 Ma), and the Penultimate Glaciation (0.30–0.13 Ma) (fig. 2a). However, after a very short bottleneck in the ancestor of indicine cattle (BRM, GIR, and NEL) ~500,000 years ago, Ne recovered very quickly and reached a peak ~140,000 years ago. In contrast, the ancestor of taurine cattle (RAN, JBC, HOL, FLV, LIM, and JER) experienced a long stable bottleneck until 70,000 years ago, consistent with their lower genetic diversity relative to that of indicine cattle. Divergence among the ancestors of indicine and taurine cattle may have begun ~0.5 Ma, coinciding with the uplift of the Tibetan–Pamir Plateau, which caused drying and desertification that were dramatically enhanced 0.5 Ma (Fang et al. 2002). The demographic trajectories of Chinese cattle breeds (LQC, LXC, NYC, QCC, YBC, and YNC) were distinct from those of typical indicine cattle or taurine cattle due to the influence of B. taurus × B. indicus admixture events. The historical pattern of four Chinese cattle breeds (LQC, NYC, LXC, and YNC) with more descent contributed by B. indicus (>69%; supplementary table S11, Supplementary Material online) roughly correlates with the indicine lineage but is distinct from QCC and YBC.

It is noteworthy that a historical pattern of two bottlenecks and two expansions has been observed in many mammals, such as yak (Qiu et al. 2015), giant panda (Zhao et al. 2013), wild boar (Choi et al. 2013), snub-nosed monkey (Zhou et al. 2014), gayal (Mei et al. 2016), and bear (Miller et al. 2012). These concordant patterns suggest that terrestrial mammals might share similar demographic histories and that the evolution of terrestrial mammals at the Early–Middle Pleistocene boundary was strongly affected by global glaciations and severely cold climates.

The Multiple Sequentially Markovian Coalescent (MSMC) analysis was used to study the genetic separation between two populations as a function of time by modeling the relationships of multiple haplotypes. For each population split scenario, the relative cross coalescence rate estimates were obtained by dividing the cross-population coalescence rate by the average within-population coalescence rate (Schiffels and Durbin 2014). Based on the analysis of four haplotypes for each pair of populations, the MSMC results show that the beginning of the split between the NEL ancestors and the GIR and BRM ancestors occurred 11,000 years ago. This split occurred shortly after the Younger Dryas epoch (an abrupt rapid cooling period that occurred 12,800–11,500 years ago; Chen et al. 2006) (fig. 2b). The split between the NEL and BRM ancestors occurred ~6,600 years ago. After separation, the Ne of indicine cattle expanded, reaching a peak ~1,500 years ago, whereas the Ne of taurine cattle remained stable due to inbreeding and artificial domestication (supplementary fig. S14, Supplementary Material online). These data suggest that NEL, GIR, and BRM shared the same ancestor and that NEL separated from indicine ancestors earlier than GIR and BRM did. As we observed, the separation was slow and might have been the result of continuous gene exchange among these breeds.
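The ratio described above (cross-population rate divided by the average within-population rate) can be illustrated with a minimal Python sketch. The function name and the per-segment rate values are hypothetical and are not taken from the MSMC software or from our data; values near 1 indicate a well-mixed ancestral population, and values near 0 indicate fully separated populations.

```python
def relative_cross_coalescence(lambda_within_a, lambda_within_b, lambda_cross):
    """Return the relative cross coalescence rate for each time segment.

    Each argument is a list of coalescence rates, one value per time segment.
    """
    rates = []
    for within_a, within_b, cross in zip(lambda_within_a, lambda_within_b, lambda_cross):
        # Average of the two within-population coalescence rates.
        mean_within = (within_a + within_b) / 2.0
        rates.append(cross / mean_within)
    return rates


# Hypothetical rates for three time segments, ordered from recent to ancient.
rccr = relative_cross_coalescence([4.0, 2.0, 1.0], [4.0, 2.0, 1.0], [0.4, 1.0, 1.0])
```

In this toy input the ratio rises toward 1 in older segments, which is the qualitative shape expected when two populations share a common ancestor.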

Different from the split among indicine cattle groups, a sharp separation among the ancestors of taurine breeds occurred 5,000–2,000 years ago, with a split time at 3,500 years ago (fig. 2c), coinciding with the Unetice culture (4,200–3,500 years ago). Taurine breeds underwent strong domestication in a very short time. We infer that the considerable economic prosperity based on the diversified agriculture of the Unetice culture contributed to the early formation of European cattle breeds (Svizzero and Tisdell 2016).

FIG. 1. Population genetic analysis. (a) Neighbor-joining tree of indicine, taurine, and hybrid cattle based on our data and publicly available whole-genome sequencing data. Orange branches indicate indicine cattle, green branches represent taurine cattle, and blue branches represent hybrid cattle. The scale bar represents the identity-by-state (IBS) score between individuals. (b) Principal component analysis of cattle. (c) Genetic structure of cattle breeds using the ADMIXTURE program.

Signatures of Positive Selection in Cattle Genome

We identified regions that exhibit high levels of differentiation among cattle breeds using the di statistic in a reduced data set containing breeds with at least 10 samples (fig. 3). We treated GIR, NEL, and BRM as one group (IND) based on their close genetic relationships, as evidenced by both the PCA and phylogenetic results (fig. 1a and b). Those windows with the highest average di values within each breed, which fell into the upper 99th percentile of the empirical distribution, were considered putative signatures of selection (supplementary fig. S15, Supplementary Material online).
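The di statistic is not defined in this section. The Python sketch below assumes a commonly used form in which, for breed i, the FST between i and every other breed j is standardized across genomic windows and the standardized values are summed per window; the function names, the input layout, and the toy FST values are hypothetical, and the percentile rule is a simple rank-based cutoff rather than the exact procedure used here.

```python
import statistics


def di_per_window(fst_by_pair, breed):
    """Sum of standardized pairwise FST values per window for one breed.

    fst_by_pair maps (breed_a, breed_b) -> list of per-window FST values.
    """
    pairs = [pair for pair in fst_by_pair if breed in pair]
    if not pairs:
        raise ValueError("no pairwise FST values involve this breed")
    n_windows = len(fst_by_pair[pairs[0]])
    di = [0.0] * n_windows
    for pair in pairs:
        values = fst_by_pair[pair]
        mean = statistics.mean(values)   # genome-wide expectation for this pair
        sd = statistics.stdev(values)    # genome-wide standard deviation
        for w, fst in enumerate(values):
            di[w] += (fst - mean) / sd
    return di


def top_percentile_windows(di, percentile=99.0):
    """Indices of windows at or above the empirical percentile cutoff."""
    ranked = sorted(di)
    cutoff_index = min(len(ranked) - 1, int(len(ranked) * percentile / 100.0))
    cutoff = ranked[cutoff_index]
    return [w for w, value in enumerate(di) if value >= cutoff]


# Toy example: four windows, three breeds (values are illustrative only).
toy_fst = {
    ("QCC", "RAN"): [0.1, 0.1, 0.1, 0.5],
    ("QCC", "JBC"): [0.2, 0.2, 0.2, 0.6],
    ("RAN", "JBC"): [0.05, 0.10, 0.05, 0.10],
}
toy_di = di_per_window(toy_fst, "QCC")
toy_outliers = top_percentile_windows(toy_di)
```

In the toy data only the last window is strongly differentiated in both QCC comparisons, so it is the only one retained as a candidate.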

In total, we identified 2,842 potential selective sweep regions in one or more of the seven breeds (full genomic regions are detailed in supplementary tables S12–S18, Supplementary Material online), which had an average size of 67 kb (ranging from 11 kb to 1150 kb). These regions harbored 1,429 protein-coding genes, 682 (47.72%) of which were previously identified as under positive selection in cattle (Randhawa et al. 2016). More specifically, we detected 357, 381, 232, 234, 307, 300, and 307 potentially positively selected genes on breed-specific selection events in the IND, QCC, JBC, RAN, FLV, HOL, and JER genomes, respectively (fig. 3a and supplementary tables S19–S25, Supplementary Material online).

To obtain a broad overview of the molecular functions of these genes and to test the hypothesis that particular functional classes are enriched in the most differentiated regions of the cattle genome, we performed a gene ontology (GO) analysis using ClueGO (Bindea et al. 2009) for each group separately. A potential concern regarding this analysis is the low power to detect enrichment due to the low expected counts for many categories. Nonetheless, several categories showed enrichment for signals of positive selection in one or more groups (supplementary table S26, Supplementary Material online), including the related categories of cellular response to UV, as well as immune response and pathogen defence. These findings suggest that immune-related genes are pervasive targets of positive selection because of their critical role in immune and defence functions.

Many genes associated with shaping particular characteristics of the populations are present within these regions (table 1). These include morphological (coat color, horn/polledness) and production traits (dairy, muscle formation, skeletal development, energy partitioning, fertility, draft traits).

Several genes involved in coat color phenotypes were identified as targets of positive selection in one or more groups (table 1), including ERCC2 in QCC and IND, MC1R in IND, ZBTB17 in QCC, and MAP2K1 in JBC. One of these genes, MC1R, is well known for its role in regulating the switch between eumelanin and pheomelanin biosynthesis pathways in mammals, including cattle. The selection signals of IND in MC1R represent the significant role of light coloration (light gray to white in BRM and NEL, yellowish-red to white in GIR) associated with the adaptation of IND to its tropical environment.

FIG. 2. Demographic history of cattle. (a) Ancestral population size is inferred using PSMC. A generation time of 5 years and a mutation rate of 0.98 × 10⁻⁸ mutations per nucleotide per generation are used. The relative cross coalescence rates over time between indicine (b) and taurine (c) breeds are estimated using MSMC with four haplotypes for each pair.

Some of the strongest signals of selection appeared in various types of genes related to production traits (table 1). For example, several genes involved in milk production showed clear evidence of positive selection in dairy cattle (NCAPG in both FLV and JER; MAPK7 in FLV; FST, ITFG1, SETMAR, and PAG1 in HOL; CSN3 and RPL37A in JER). Various genes involved in meat traits have also been targets of recent positive selection (table 1). Some genes related to skeletal muscle development and muscle fibre type appeared to be targets of positive selection, including CASP9, DIO1, SREBF2, and PLOD3 in QCC; ASGR1, IGFBP2, IGFBP5, and MYH9 in JBC; OSTN, CPT2, CSF2RB, and SLC2A4 in FLV; and SLC2A5, TMEM97, MYH4, SRPK3, and POLDIP2 in RAN. We also found that a set of important genes associated with lipid metabolism were putatively positively selected (AOX1 in QCC, RAN, FLV, JER, and IND; MC5R in IND; BBS2, S1PR3, and LRP2BP in JBC). Interestingly, we identified a missense mutation in BBS2 (exon 15, rs135889003, c.A1880G, p.Q627R) that was almost fixed (allele frequency >0.95) in JBC, a breed known for producing the intensely marbled Wagyu beef (with >30% intramuscular fat) (Gotoh et al. 2014). BBS2 is a member of the Bardet–Biedl syndrome gene family, the primary clinical feature of which is obesity, and has been found to play a significant role in adipogenesis (Forti et al. 2007). The positive selection signals near the BBS2 region are further confirmed by significantly lower values of Tajima's D and the long haplotype patterns in JBC (fig. 3b); this region may be useful as a genetic target for breeding selection for beef marbling improvement. In addition, R3HDM1, a gene associated with efficient food conversion and intramuscular fat content, showed signals of positive selection in five groups (QCC, FLV, HOL, JER, and IND). These genes might be associated with the tenderness and quality of meat in cattle.

FIG. 3. Candidate positively selected genes. (a) Shared candidate positively selected genes among groups. Only partial numbers are shown. (b) An example of selective sweep at the BBS2 gene in JBC; only positive values of di are shown (top). The Tajima's D values in each group are shown (middle). SNPs with minor allele frequencies >0.05 are used to construct haplotype patterns (bottom). The major alleles in JBC are green and the minor alleles in JBC are yellow.

Table 1. Summary of Partial Traits Associated with Positively Selected Genes.

Gene | Breed | Trait | Reference
--- | --- | --- | ---
MC1R | IND | CC | Lee et al. 2002; Gan et al. 2007
MAP2K1ᵃ | JBC | CC | Gutierrez-Gil et al. 2015
ZBTB17 | QCC | CC | Gutierrez-Gil et al. 2015
ERCC2ᵃ | QCC, IND | CC | Gutierrez-Gil et al. 2015
FST, ITFG1, SETMAR, PAG1 | HOL | DY | Bech-Sabat et al. 2008; Rincon et al. 2009; Bloise et al. 2010; Xu et al. 2015
RPL37Aᵃ, CSN3 | JER | DY | Wedholm et al. 2006; Yahvah et al. 2015
MAPK7ᵃ | FLV | DY | Lin et al. 2013
NCAPG | FLV, JER | DY | Eberlein et al. 2009; Setoguchi et al. 2011
BBS2ᵃ, S1PR3ᵃ, LRP2BPᵃ, IGFBP2, IGFBP5, MYH9, ASGR1 | JBC | MT, GT | Forti et al. 2007; Sattler and Levkau 2009; Zhang et al. 2012; Lee et al. 2013; Sorbolini et al. 2015; Yoon and Ko 2016
SRPK3ᵃ, POLDIP2ᵃ, SLC2A5, TMEM97, MYH4 | RAN | MT, GT | Smith et al. 2001; Clark et al. 2011; Xu et al. 2011; Zhao et al. 2012; Lee et al. 2013; Zhang et al. 2016
CASP9ᵃ, DIO1, SREBF2, PLOD3 | QCC | MT, GT | Ouali et al. 2006; Lee et al. 2013
SLC2A4ᵃ, OSTN, CPT2, CSF2RB | FLV | MT, GT | Grindflek et al. 2002; Lee et al. 2013; Xu et al. 2015
MC5Rᵃ | IND | MT, GT | Kovacik et al. 2012; Switonski et al. 2013
AOX1ᵃ | QCC, RAN, FLV, JER, and IND | MT, GT | Brandes et al. 1995
R3HDM1 | QCC, FLV, HOL, JER, and IND | MT, FC | Gibbs et al. 2009

CC, coat color; MT, meat traits; GT, growth traits; DY, dairy traits; FC, food conversion efficiency.
ᵃ Newly identified genes associated with phenotypic features of cattle.

Conclusion

Whole-genome sequencing of representative Chinese cattle breeds and two additional breeds (JBC and RAN) generated a comprehensive catalogue of genetic variations. This is the first population genomic study on Chinese cattle to use next-generation whole-genome sequencing data and is an important source of genetic information for cattle worldwide. Bovine haplotypes have been inferred in Mongolian yaks, with recent admixture at least 1,500 years ago (Medugorac et al. 2017). It is highly possible that there was recent introgression from yak (B. grunniens) to Chinese cattle, as suggested by previous studies (Lei et al. 2000; Cai et al. 2007, 2014). The genetic influence of yak, however, is too limited to have been detected in the representative cattle breeds examined in our study. We also discovered many potential selective sweeps associated with domestication related to breed-specific characteristics, with selective sweep regions including genes associated with coat color, dairy traits, and meat production/quality traits. Collectively, these findings substantially expand the catalogue of genetic variants in cattle and reveal new insights into the evolutionary history and domestication traits of Chinese cattle.

Materials and Methods

Sample Collection and Sequencing

To represent the overall genetic diversity of Chinese cattle, we selected 46 samples from 6 representative Chinese cattle breeds with divergent phenotypic characters across the main geographic distribution: Qinchuan cattle (QCC, n = 37), Nanyang cattle (NYC, n = 2), Luxi cattle (LXC, n = 1), Yanbian cattle (YBC, n = 2), Yunnan cattle (YNC, n = 2), and Leiqiong cattle (LQC, n = 2). For comparison, samples from two specialized beef cattle breeds, Red Angus (RAN, n = 18) and JBC (n = 11), were also collected (supplementary table S2, Supplementary Material online). Total genomic DNA was extracted from the blood samples of the animals using a standard phenol–chloroform protocol. For each individual, at least 5 μg of genomic DNA was used to construct paired-end libraries with an insert size of 500 bp according to Illumina's library preparation protocol. Moreover, we collected 76 genome sequences from previous studies for the breeds Brahman (BRM, indicine, n = 6), Nelore (NEL, indicine, n = 5), Gir (GIR, indicine, n = 4), Limousin (LIM, taurine, n = 6), Jersey (JER, taurine, n = 18), Fleckvieh (FLV, taurine, n = 19), and Holstein (HOL, taurine, n = 18) (details in supplementary tables S1 and S2, Supplementary Material online).

Alignments and Variant Identification

Paired-end reads (100 bp) obtained from sequencing in the present study and previous studies were mapped to the B. taurus genome (UMD3.1) (Zimin et al. 2009) using BWA (Li and Durbin 2009) with the default parameters. Sequence Alignment Map (SAM) format files were imported into SAMtools (Li et al. 2009) for sorting and merging and into Picard (http://broadinstitute.github.io/picard/, version 1.92) to remove duplicated reads. To identify the ancestral state of cattle, we mapped the raw reads of yak (Qiu et al. 2012), sequenced to 65×, to the reference genome.

Initial variant site identification was performed using SAMtools mpileup and GATK UnifiedGenotyper (Genome Analysis Toolkit, version 2.4-9) (McKenna et al. 2010) with the default settings. The overlap subset of 53,979,675 single-nucleotide polymorphisms (SNPs) and 5,924,578 small insertions and deletions (InDels; 91% of InDels were 1–30 bp in length, and the largest InDel was 403 bp in length) was defined as a high-confidence catalogue used for base quality recalibration using GATK with the default set of covariates. The resulting recalibrated bam files were then used as input for a second variant calling with GATK. The resulting variant calls were analyzed, and approximately the highest scoring 10% of the predicted variant sites were used as a training set for variant quality recalibration and filtering by using GATK. These steps resulted in 60,031,459 SNPs and 5,603,383 InDels. To obtain high-quality results for further analyses, we only retained biallelic SNPs and InDels with >90% calling rates, resulting in 57,220,105 SNPs and 5,270,518 InDels. Beagle (Browning and Browning 2007), which has been shown to yield highly accurate solutions, was used to improve the genotype calls using genotype likelihoods from GATK and to infer the haplotypes in the sample. Short InDels were not included in the diversity or divergence estimates and were not included in the other analyses. Variants (SNPs and InDels) were annotated using ANNOVAR (Wang et al. 2010).
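The final retention rule (biallelic sites with a calling rate above 90%) can be sketched in Python as follows. This is a simplified illustration, not the pipeline used here: the function name is hypothetical, genotypes are assumed to be VCF-style strings such as "0/1", and a genotype containing "." is treated as missing.

```python
def passes_site_filters(alt_alleles, genotypes, min_call_rate=0.9):
    """Keep a site only if it is biallelic and its call rate exceeds the threshold.

    alt_alleles: list of alternate alleles at the site (one entry means biallelic).
    genotypes:   list of VCF-style genotype strings, e.g. "0/1" or "./.".
    """
    if len(alt_alleles) != 1 or not genotypes:
        return False
    # A genotype is counted as called when none of its alleles is missing.
    called = sum(1 for genotype in genotypes if "." not in genotype)
    return called / len(genotypes) > min_call_rate


# Hypothetical sites: 19 of 20 samples called, then 9 of 10 samples called.
site_kept = passes_site_filters(["T"], ["0/1"] * 19 + ["./."])
site_dropped = passes_site_filters(["T"], ["0/1"] * 9 + ["./."])
```

Note the strict inequality: a site called in exactly 90% of samples is dropped under this reading of ">90%".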

Phylogenetic and Population Structure Analyses

A phylogenetic tree was constructed from the SNP data by using the neighbor-joining method in the program PHYLIP v3.695 (http://evolution.genetics.washington.edu/phylip.html), and distance matrices were calculated using PLINK (Purcell et al. 2007). The ancestral states of the SNPs were determined using a close relative of cattle, B. grunniens, as the outgroup. Population structure was further inferred using ADMIXTURE (Alexander et al. 2009) with the number of ancestral populations (K) set from 2 to 7. Principal component analysis was carried out using the smartPCA program of the EIGENSOFT (Patterson et al. 2006) package.

Genome-Wide Patterns of Genetic Diversity and Divergence

The average pairwise nucleotide diversity (θπ) and Tajima's D statistic of each breed were calculated using a sliding window approach (50-kb sliding windows in 10-kb steps) with the default parameters of VCFtools (Danecek et al. 2011). Population differentiation was measured by pairwise FST using the unbiased estimator of Weir and Cockerham (1984) with the default parameters.
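As an illustration of the sliding-window scheme (50-kb windows, 10-kb steps), the Python sketch below computes per-base pairwise nucleotide diversity from alternate-allele counts at biallelic sites. It is a simplified stand-in under stated assumptions, not the VCFtools implementation: the function names and toy sites are hypothetical, every site is assumed to be genotyped in all chromosomes, and missing data are ignored.

```python
def site_pi(alt_count, n_chromosomes):
    """Mean number of pairwise differences at one biallelic site."""
    ref_count = n_chromosomes - alt_count
    n_pairs = n_chromosomes * (n_chromosomes - 1) / 2
    # Pairs that differ are those with one ref and one alt chromosome.
    return alt_count * ref_count / n_pairs


def windowed_pi(sites, chrom_length, n_chromosomes, window=50_000, step=10_000):
    """Per-base nucleotide diversity in sliding windows.

    sites: list of (position, alt_allele_count) with 1-based positions.
    Returns a list of (start, end, pi_per_base) tuples.
    """
    results = []
    start = 1
    while start <= chrom_length:
        end = min(start + window - 1, chrom_length)
        total = sum(site_pi(count, n_chromosomes)
                    for pos, count in sites if start <= pos <= end)
        results.append((start, end, total / (end - start + 1)))
        start += step
    return results


# Toy chromosome of 100 kb with two variable sites in four chromosomes.
toy_windows = windowed_pi([(5_000, 2), (55_000, 1)], 100_000, 4)
```

Because the step is one fifth of the window length, each site contributes to up to five overlapping windows.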

Linkage Disequilibrium

To estimate the genome-wide LD of each breed, we calculated the mean r² values for pairwise markers with Haploview (Barrett et al. 2005) software. Only SNPs with a minor allele frequency >0.05 in three groups (Chinese cattle, indicine, and taurine) were used. The parameters of Haploview were set to "-maxdistance 200 -dprime -memory 5000 -minGeno 0.6 -minMAF 0.05 -hwcutoff 0.001". To minimize the influence of sample size, only breeds with at least five individuals were used, and breeds with more than five samples were down-sampled to five.
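For reference, the r² measure between two biallelic SNPs can be sketched in Python from phased haplotypes. This is only an illustration of the definition: the function name and toy haplotypes are hypothetical, and Haploview itself estimates haplotype frequencies from unphased genotypes, which this sketch does not attempt.

```python
def r_squared(alleles_a, alleles_b):
    """Squared correlation between two biallelic SNPs on phased haplotypes.

    alleles_a, alleles_b: equal-length lists of 0/1 alleles, one entry per haplotype.
    """
    n = len(alleles_a)
    p_a = sum(alleles_a) / n
    p_b = sum(alleles_b) / n
    # Frequency of haplotypes carrying allele 1 at both SNPs.
    p_ab = sum(1 for a, b in zip(alleles_a, alleles_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("both SNPs must be polymorphic")
    return d * d / denominator


# Toy haplotypes: complete LD, then no association.
r2_complete = r_squared([0, 0, 1, 1], [0, 0, 1, 1])
r2_none = r_squared([0, 0, 1, 1], [0, 1, 0, 1])
```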

Haplotype Diversity

For the haplotype diversity analysis, the same breeds and SNP set were used as in the linkage disequilibrium analysis. To calculate haplotype diversity, the genome was divided into 5- to 500-kb bins (detailed in supplementary table S6, Supplementary Material online). Windows with fewer than two SNPs per 5 kb were removed, and for those with more than four SNPs, four SNPs were randomly selected. Considering the substantial variation in the recombination rate across the cattle genome, we adopted a sliding-window strategy and allowed the window to slide by half its length each time. The frequencies of haplotypes were counted, and haplotype diversity (H) was calculated as described previously (Daetwyler et al. 2014).
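The exact formula for H is given in the cited reference and is not reproduced in this section. As a hedged illustration, the Python sketch below assumes the widely used unbiased estimator H = n/(n − 1) × (1 − Σ pᵢ²), where pᵢ is the frequency of haplotype i among n sampled haplotypes; the function name and toy haplotype strings are hypothetical.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Unbiased haplotype diversity for the haplotypes observed in one window.

    haplotypes: list of hashable haplotype labels, e.g. strings of four SNP alleles.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two haplotypes are required")
    counts = Counter(haplotypes)
    # Sum of squared haplotype frequencies (expected homozygosity).
    sum_squared = sum((count / n) ** 2 for count in counts.values())
    return n / (n - 1) * (1 - sum_squared)


# Toy windows: one haplotype only, then four distinct haplotypes.
h_uniform = haplotype_diversity(["ACGT"] * 4)
h_distinct = haplotype_diversity(["ACGT", "ACGA", "TCGT", "ACTT"])
```

Under this estimator H ranges from 0 (a single haplotype) to 1 (every sampled haplotype distinct).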

PSMC AnalysisWe inferred the demographic history of B taurus and B in-dicus using the Pairwise Sequentially Markovian Coalescent(PSMC) model (Li and Durbin 2011) In the default PSMCapproach a whole genome diploid consensus sequence wasgenerated using the alignment file from one sample Recallingthat most of our genomes have not been sequenced to a highaverage depth of coverage (mostly 10) and that PSMChas high false-negative rates at low depths of coverage (ielt20) leading to a systematic underestimation of true eventtimes (Orlando et al 2013 Nadachowska-Brzyska et al 2016)we applied a modified PSMC approach the SNPs of onesample were extracted from variants called on cohortsof all samples and converted to consensus sequences

This procedure was followed for samples (marked in supplementary table S1, Supplementary Material online) with relatively high sequencing depth in each breed to ensure the quality of the consensus sequences. We then transformed the consensus sequence into a fasta-like format using "fq2psmcfa". The PSMC parameters were set as follows: "-p 4+25*2+4+6". The mutation rate per generation per site was estimated as $\mu = D \times g / (2T)$, where D is the observed frequency of pairwise differences between two species, T is the estimated divergence time, and g is the estimated generation time for the two species. The cattle generation time (g) was set to an estimate of 5 years, and the estimated divergence time was set to 4.9 Ma based on a previous study on cattle and yak (Qiu et al. 2012). These values yielded an estimated mutation rate of 9.796 × 10⁻⁹ mutations per generation per site. We obtained the mass accumulation rate (MAR) of Chinese loess for the past 3.6 My (Sun and An 2005), an index indicating cold and dry or warm and wet climatic periods in China (fig. 2a and supplementary figs. S13 and S14, Supplementary Material online).
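The arithmetic behind the mutation rate can be checked with a short sketch. The value of D is not reported in this section, so the code back-calculates the D implied by the stated g, T, and mutation rate rather than asserting what the authors measured. Function and constant names are hypothetical.

```python
def mutation_rate(d, generation_years, divergence_years):
    """Per-site, per-generation mutation rate: mu = D * g / (2 * T)."""
    return d * generation_years / (2.0 * divergence_years)


def implied_divergence(mu, generation_years, divergence_years):
    """Pairwise difference D implied by mu, g and T (inverse of mutation_rate)."""
    return mu * 2.0 * divergence_years / generation_years


G_YEARS = 5.0      # generation time stated in the text
T_YEARS = 4.9e6    # cattle-yak divergence time stated in the text
MU = 9.796e-9      # mutation rate stated in the text

# D is NOT given in the text; this is only the value consistent with the above.
d_implied = implied_divergence(MU, G_YEARS, T_YEARS)
```

Under these inputs the implied pairwise difference is about 0.0192, i.e., roughly 1.9% sequence divergence between the two species.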

To evaluate the differences between our revised PSMC approach and the default method, we reconstructed trajectories from two samples of the same breed (FLV) with different depths of coverage (SRR1262805 with 24× and SRR1262808 with 9×), which should yield similar inferences. The PSMC profiles retrieved from the default and revised approaches for the high-depth sample were almost identical (supplementary fig. S11, Supplementary Material online) with regard to both the timing and the magnitude of demographic events, except for the most recent expansion phase, in which a lower intensity was found using the revised approach. PSMC inference based on the low-depth sample showed a biased demographic model, which could be satisfactorily corrected with our revised PSMC approach. Additionally, we note that the bias observed for genomes with low depth (<20×) could also be corrected by assuming a uniform false-negative rate (uFNR), using the option "-M" of the plotting script "psmc_plot.pl" to specify the uFNR correction rate (Orlando et al. 2013; Hung et al. 2014). With the uFNR correction, the plot for the low-depth sample was similar to the high-depth PSMC inference (supplementary fig. S11, Supplementary Material online). No striking differences were observed among the PSMC profiles reconstructed from different taurine breeds with different sequencing depths of coverage (ranging from 9× to 24×; supplementary table S1, Supplementary Material online, and fig. 2a). Consequently, we found our revised approach to be a suitable method, introducing acceptable new biases, for PSMC inference from samples with low average sequencing depth.

To explore the potential impact of the reference genome on the PSMC results of indicine breeds, we mapped the sequence reads of indicine samples against the assembly of B. indicus (Nelore breed, GenBank assembly accession GCF_000247795.1) and repeated the PSMC analysis (default setting with uFNR correction). Although the PSMC profiles reconstructed from different references were not identical, the qualitative results hold for indicine breeds with the B. indicus reference genome (fig. 2a and supplementary fig. S12, Supplementary Material online).

MSMC Analysis

The Multiple Sequentially Markovian Coalescent (MSMC) model (Schiffels and Durbin 2014) was used to infer changes in effective population size (Ne) and divergence times between breeds (samples marked in supplementary table S1, Supplementary Material online). MSMC is an extension of the PSMC model, which uses a hidden Markov model to scan genomes and analyze patterns of heterozygosity, with long DNA segments of low heterozygosity reflecting recent coalescent events. The rate of coalescent events is then used to estimate Ne at a given time. To scale the output of MSMC to real time and population sizes, we used the generation time and mutation rate mentioned earlier (see PSMC Analysis). We obtained atmospheric surface air temperature (SAT) and global sea level (GSL) data for the past 3 My (Bintanja and van de Wal 2008) (supplementary figs. S13 and S14, Supplementary Material online).
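The scaling step can be sketched as follows. This assumes the usual convention for MSMC output, in which time boundaries are reported in units scaled by the per-generation mutation rate and the coalescence rate λ is scaled accordingly, so that years = t / μ × g and Ne = 1 / (2μλ). The function name and argument names are hypothetical, and the input values in the usage note are invented for illustration.

```python
def scale_msmc(time_scaled, lambda_scaled, mu, generation_years):
    """Convert one scaled MSMC time boundary and coalescence rate to (years, Ne).

    Assumes time is in mutation-scaled units and lambda is the scaled
    coalescence rate (see lead-in); mu is per site per generation.
    """
    years = time_scaled / mu * generation_years
    ne = 1.0 / (2.0 * mu * lambda_scaled)
    return years, ne
```

With the mutation rate and generation time given above (9.796 × 10⁻⁹ and 5 years), a made-up scaled time of 9.796 × 10⁻⁶ and a made-up λ of 1,000 would correspond to about 5,000 years ago and an Ne of roughly 51,000.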

Selective Sweep Analysis

Considering the sample size and close genetic background of indicine cattle (NEL, BRM, and GIR), we pooled these three indicine breeds into one group (IND) in our selection analysis. Seven cattle groups (QCC, RAN, JBC, IND, HOL, FLV, and JER) with sample sizes >10 were retained for the following analysis. To identify candidate loci for breed-specific phenotypes that are known to be under positive selection, we used the $d_i$ statistic (Akey et al. 2010) to measure the locus-specific divergence in allele frequency for each group based on unbiased estimates of pairwise $F_{ST}$. Briefly, for each SNP, we calculated the statistic

$$d_i = \sum_{j \neq i} \frac{F_{ST}^{ij} - E[F_{ST}^{ij}]}{sd[F_{ST}^{ij}]},$$

where $E[F_{ST}^{ij}]$ and $sd[F_{ST}^{ij}]$ denote the expected value and SD of $F_{ST}$ between groups i and j, calculated from all SNPs. For each group, $d_i$ was averaged over the SNPs in nonoverlapping 50-kb windows. Windows with fewer than 10 SNPs were removed. The top 1% of windows with the highest mean $d_i$ score were defined as candidate selective sweep regions. Adjacent sweeps within a distance of 50 kb were merged into one sweep. Selective sweep regions were annotated with cattle QTLdb release 29 from the Animal Quantitative Trait Loci Database (Hu et al. 2016). Candidate genes under positive selection were defined as those in which more than half of the gene interval was found in selective sweep regions. Tajima's D statistic was computed for each candidate gene using VCFtools. Gene Ontology (GO) enrichment analysis for genes in selective sweep regions was performed with a hypergeometric test using ClueGO (Bindea et al. 2009). The false discovery rate (FDR) was used to correct the P values with the Benjamini–Hochberg approach.
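The di calculation and the window filter can be sketched as below. This is an illustration, not the authors' pipeline: it takes precomputed per-SNP pairwise FST values as input, uses the sample standard deviation (the text does not say which SD was used), and skips pairs with zero variance. All function names and the input layout are hypothetical.

```python
from statistics import mean, stdev


def di_per_snp(pairwise_fst, group):
    """Per-SNP d_i for one group.

    pairwise_fst maps a (group_a, group_b) tuple to a list of per-SNP FST
    values; every list must be in the same SNP order.
    """
    n_snps = len(next(iter(pairwise_fst.values())))
    scores = [0.0] * n_snps
    for pair, values in pairwise_fst.items():
        if group not in pair:
            continue                      # only pairs involving the focal group
        mu, sd = mean(values), stdev(values)
        if sd == 0:
            continue                      # no variation, nothing to standardize
        for k, v in enumerate(values):
            scores[k] += (v - mu) / sd    # standardized FST, summed over j != i
    return scores


def window_means(positions, scores, window=50_000, min_snps=10):
    """Mean score per nonoverlapping window, dropping windows with too few SNPs."""
    bins = {}
    for pos, s in zip(positions, scores):
        bins.setdefault(pos // window, []).append(s)
    return {idx * window: mean(v) for idx, v in sorted(bins.items())
            if len(v) >= min_snps}


def top_fraction(window_scores, fraction=0.01):
    """Start coordinates of the highest-scoring windows (at least one)."""
    ranked = sorted(window_scores, key=window_scores.get, reverse=True)
    keep = max(1, int(len(ranked) * fraction))
    return sorted(ranked[:keep])
```

Merging adjacent candidate windows and annotating them against QTLdb would follow as separate steps.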

Supplementary Material

Supplementary data are available at Molecular Biology and Evolution online.

Author Contributions

L.-S.Z., W.-J.Z., G.C., and H.-B.W. led the experiments and designed the analytical strategy. L.-S.Z., C.-G.M., H.-C.W., W.-Q.T., L.-S.G., Y.-Y.Z., Z.-L.J., Y.-P.X., and X.-Z.S. performed animal work and prepared biological samples. C.-G.M., H.-C.W., G.C., H.-B.W., C.-P.Z., A.-N.L., W.-C.Y., C.-L.J., and S.-H.W. constructed the DNA library and performed sequencing. W.-J.Z., Q.-J.L., L.-Z.W., X.-L.W., X.-M.G., and C.-Z.W. detected, annotated, and summarized variants. W.-J.Z., Q.-J.L., C.-G.M., and H.-C.W. performed selection analysis. W.-J.Z., L.-Z.W., C.-G.M., and H.-C.W. analyzed the origin of Chinese indicine cattle and population history. C.-G.M., H.-C.W., W.-J.Z., Q.-J.L., and L.-Z.W. wrote the manuscript. L.-S.Z., S.-C.Z., J.-Z.S., G.L., X.-D.F., X.Z., S.S., H.-M.Y., J.W., and R.H. revised the manuscript. All the authors reviewed and approved the final manuscript.

Acknowledgments

We thank many people not listed as authors who provided feedback, samples, and encouragement, especially Changguo Yan, Yimin Xu, Shanzhai Liu, Guanli Wang, Xiang Gao, Jianghong Wan, and Kaixing Qu. This work was supported by the National 863 Program of China (2013AA102505), the National Science-technology Support Plan Projects (2015BAD03B04), the Program of National Beef Cattle and Yak Industrial Technology System (CARS-37), the Technical Innovation Engineering Project of Shaanxi Province (2014KTZB02-02-01), the National Beef Cattle Improvement Center, the National & Local Joint Engineering Research Center for Modern Cattle Biotechnology and Application, the Beef Cattle Engineering Technology Research Center of Shaanxi Province, and the State Key Laboratory of Agricultural Genomics.

References

Akey JM, Ruhe AL, Akey DT, Wong AK, Connelly CF, Madeoy J, Nicholas TJ, Neff MW. 2010. Tracking footprints of artificial selection in the dog genome. Proc Natl Acad Sci U S A. 107(3):1160–1165.

Alexander DH, Novembre J, Lange K. 2009. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19(9):1655–1664.

Barrett JC, Fry B, Maller J, Daly MJ. 2005. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 21(2):263–265.

Bech-Sabat G, Lopez-Gatius F, Yaniz JL, Garcia-Ispierto I, Santolaria P, Serrano B, Sulon J, de Sousa NM, Beckers JF. 2008. Factors affecting plasma progesterone in the early fetal period in high producing dairy cows. Theriogenology 69(4):426–432.

Bickhart DM, Xu L, Hutchison JL, Cole JB, Null DJ, Schroeder SG, Song J, Garcia JF, Sonstegard TS, Van Tassell CP, et al. 2016. Diversity and population-genetic properties of copy number variations and multicopy genes in cattle. DNA Res. 23(3):253–262.

Bindea G, Mlecnik B, Hackl H, Charoentong P, Tosolini M, Kirilovsky A, Fridman WH, Pages F, Trajanoski Z, Galon J. 2009. ClueGO: a Cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks. Bioinformatics 25(8):1091–1093.

Bintanja R, van de Wal RSW. 2008. North American ice-sheet dynamics and the onset of 100,000-year glacial cycles. Nature 454(7206):869–872.

Bloise E, Cassali GD, Ferreira MC, Ciarmela P, Petraglia F, Reis FM. 2010. Activin-related proteins in bovine mammary gland: localization and differential expression during gestational development and differentiation. J Dairy Sci. 93(10):4592–4601.

Brandes R, Arad R, Bar-Tana J. 1995. Inducers of adipose conversion activate transcription promoted by a peroxisome proliferator's response element in 3T3-L1 cells. Biochem Pharmacol. 50(11):1949–1951.

Browning SR, Browning BL. 2007. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am J Hum Genet. 81(5):1084–1097.

Cai D, Sun Y, Tang Z, Hu S, Li W, Zhao X, Xiang H, Zhou H. 2014. The origins of Chinese domestic cattle as revealed by ancient DNA analysis. J Archaeol Sci. 41:423–434.

Cai X, Chen H, Lei C, Wang S, Xue K, Zhang B. 2007. mtDNA diversity and genetic lineages of eighteen cattle breeds from Bos taurus and Bos indicus in China. Genetica 131(2):175–183.

Cai X, Chen H, Wang S, Xue K, Lei C. 2006. Polymorphisms of two Y chromosome microsatellites in Chinese cattle. Genet Sel Evol. 38(5):525–534.

Canavez FC, Luche DD, Stothard P, Leite KR, Sousa-Canavez JM, Plastow G, Meidanis J, Souza MA, Feijao P, Moore SS, et al. 2012. Genome sequence and assembly of Bos indicus. J Hered. 103(3):342–348.

Chen S, Wang Y, Kong X, Liu D, Cheng H, Edwards RL. 2006. A possible Younger Dryas-type event during Asian monsoonal Termination 3. Sci China D Earth Sci. 49(9):982–990.

Choi JW, Liao X, Park S, Jeon HJ, Chung WH, Stothard P, Park YS, Lee JK, Lee KT, Kim SH, et al. 2013. Massively parallel sequencing of Chikso (Korean brindle cattle) to discover genome-wide SNPs and InDels. Mol Cells 36(3):203–211.

Clark DL, Boler DD, Kutzler LW, Jones KA, McKeith FK, Killefer J, Carr TR, Dilger AC. 2011. Muscle gene expression associated with increased marbling in beef cattle. Anim Biotechnol. 22(2):51–63.

Daetwyler HD, Capitan A, Pausch H, Stothard P, van Binsbergen R, Brondum RF, Liao X, Djari A, Rodriguez SC, Grohs C, et al. 2014. Whole-genome sequencing of 234 bulls facilitates mapping of monogenic and complex traits in cattle. Nat Genet. 46(8):858–865.

Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, Handsaker RE, Lunter G, Marth GT, Sherry ST, et al. 2011. The variant call format and VCFtools. Bioinformatics 27(15):2156–2158.

Decker JE, McKay SD, Rolf MM, Kim J, Molina AA, Sonstegard TS, Hanotte O, Gotherstrom A, Seabury CM, Praharani L, et al. 2014. Worldwide patterns of ancestry, divergence, and admixture in domesticated cattle. PLoS Genet. 10(3):e1004254.

Eberlein A, Takasuga A, Setoguchi K, Pfuhl R, Flisikowski K, Fries R, Klopp N, Furbass R, Weikard R, Kuhn C. 2009. Dissection of genetic factors modulating fetal growth in cattle indicates a substantial role of the non-SMC condensin I complex, subunit G (NCAPG) gene. Genetics 183(3):951–964.

Fang X, Lu L, Yang S, Li J, An Z, Jiang PA, Chen X. 2002. Loess in Kunlun Mountains and its implications on desert development and Tibetan Plateau uplift in west China. Sci China D Earth Sci. 45(4):289–299.

Forti E, Aksanov O, Birk RZ. 2007. Temporal expression pattern of Bardet-Biedl syndrome genes in adipogenesis. Int J Biochem Cell B. 39(5):1055–1062.

Gan HY, Li JB, Wang HM, Gao YD, Liu WH, Li JP, Zhong JF. 2007. Relationship between the melanocortin receptor 1 (MC1R) gene and the coat color phenotype in cattle. Yi Chuan 29:195–200.

Gibbs RA, Taylor JF, Van Tassell CP, Barendse W, Eversole KA, Gill CA, Green RD, Hamernik DL, Kappes SM, Lien S, et al. 2009. Genome-wide survey of SNP variation uncovers the genetic structure of cattle breeds. Science 324(5926):528–532.

Gotoh T, Takahashi H, Nishimura T, Kuchida K, Mannen H. 2014. Meat produced by Japanese Black cattle and Wagyu. Anim Front. 4(4):46–54.

Grindflek E, Holzbauer R, Plastow G, Rothschild MF. 2002. Mapping and investigation of the porcine major insulin sensitive glucose transport (SLC2A4/GLUT4) gene as a candidate gene for meat quality and carcass traits. J Anim Breed Genet. 119(1):47–55.

Gutierrez-Gil B, Arranz JJ, Wiener P. 2015. An interpretive review of selective sweep studies in Bos taurus cattle populations: identification of unique and shared selection signals across breeds. Front Genet. 6:167.

Hansen PJ. 2004. Physiological and cellular adaptations of zebu cattle to thermal stress. Anim Reprod Sci. 82–83:349–360.

Hiendleder S, Lewalski H, Janke A. 2008. Complete mitochondrial genomes of Bos taurus and Bos indicus provide new insights into intra-species variation, taxonomy and domestication. Cytogenet Genome Res. 120(1–2):150–156.

Hu ZL, Park CA, Reecy JM. 2016. Developmental progress and current status of the Animal QTLdb. Nucleic Acids Res. 44(D1):D827–D833.

Hung CM, Shaner PJ, Zink RM, Liu WC, Chu TC, Huang WS, Li SH. 2014. Drastic population fluctuations explain the rapid extinction of the passenger pigeon. Proc Natl Acad Sci U S A. 111(29):10636–10641.

Kovacik A, Bulla J, Trakovicka A, Zitny J, Rafayova A. 2012. The effect of the porcine melanocortin-5 receptor (MC5R) gene associated with feed intake, carcass and physico-chemical characteristics. J Microbiol Biotechnol Food Sci. 1:498–506.

Labrecque R, Vigneault C, Blondin P, Sirard MA. 2013. Gene expression analysis of bovine oocytes with high developmental competence obtained from FSH-stimulated animals. Mol Reprod Dev. 80(6):428–440.

Lai SJ, Liu YP, Liu YX, Li XW, Yao YG. 2006. Genetic diversity and origin of Chinese cattle revealed by mtDNA D-loop sequence variation. Mol Phylogenet Evol. 38(1):146–154.

Lee KT, Chung WH, Lee SY, Choi JW, Kim J, Lim D, Lee S, Jang GW, Kim B, Choy YH, et al. 2013. Whole-genome resequencing of Hanwoo (Korean cattle) and insight into regions of homozygosity. BMC Genomics 14:519.

Lee SS, Yang BS, Yang YH, Kang SY, Ko SB, Jung JK, Oh WY, Oh SJ, Kim KI. 2002. Analysis of melanocortin receptor 1 (MC1R) genotype in Korean brindle cattle and Korean cattle with dark muzzle. J Anim Sci Technol. 44(1):23–30.

Lei C, Chen H, Hu S. 2000. Studies on Y chromosome polymorphism and the origin and classification of Chinese yellow cattle. Acta Agric Boreali-Occidentalis Sin. 9:43–47.

Lei CZ, Chen H, Zhang HC, Cai X, Liu RY, Luo LY, Wang CF, Zhang W, Ge QL, Zhang RF, et al. 2006. Origin and phylogeographical structure of Chinese cattle. Anim Genet. 37(6):579–582.

Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25(14):1754–1760.

Li H, Durbin R. 2011. Inference of human population history from individual whole-genome sequences. Nature 475(7357):493–496.

Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25(16):2078–2079.

Lin X, Luo J, Zhang L, Zhu J. 2013. MicroRNAs synergistically regulate milk fat synthesis in mammary gland epithelial cells of dairy goats. Gene Expr. 16(1):1–13.

Loftus RT, MacHugh DE, Bradley DG, Sharp PM, Cunningham P. 1994. Evidence for two independent domestications of cattle. Proc Natl Acad Sci U S A. 91(7):2757–2761.

McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, et al. 2010. The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20(9):1297–1303.

Medugorac I, Graf A, Grohs C, Rothammer S, Zagdsuren Y, Gladyr E, Zinovieva N, Barbieri J, Seichter D, Russ I, et al. 2017. Whole-genome analysis of introgressive hybridization and characterization of the bovine legacy of Mongolian yaks. Nat Genet. 49(3):470–475.

Mei C, Wang H, Zhu W, Wang H, Cheng G, Qu K, Guang X, Li A, Zhao C, Yang W, et al. 2016. Whole-genome sequencing of the endangered bovine species Gayal (Bos frontalis) provides new insights into its genetic features. Sci Rep. 6:19787.

Miller W, Schuster SC, Welch AJ, Ratan A, Bedoya-Reina OC, Zhao F, Kim HL, Burhans RC, Drautz DI, Wittekindt NE, et al. 2012. Polar and brown bear genomes reveal ancient admixture and demographic footprints of past climate change. Proc Natl Acad Sci U S A. 109(36):E2382–E2390.

Nadachowska-Brzyska K, Burri R, Smeds L, Ellegren H. 2016. PSMC analysis of effective population sizes in molecular ecology and its application to black-and-white Ficedula flycatchers. Mol Ecol. 25(5):1058–1072.

Orlando L, Ginolhac A, Zhang G, Froese D, Albrechtsen A, Stiller M, Schubert M, Cappellini E, Petersen B, Moltke I, et al. 2013. Recalibrating Equus evolution using the genome sequence of an early Middle Pleistocene horse. Nature 499(7456):74–78.

Ouali A, Herrera-Mendez CH, Coulis G, Becila S, Boudjellal A, Aubry L, Sentandreu MA. 2006. Revisiting the conversion of muscle into meat and the underlying mechanisms. Meat Sci. 74(1):44–58.

Patterson N, Price AL, Reich D. 2006. Population structure and eigenanalysis. PLoS Genet. 2(12):e190.

Porto-Neto LR, Sonstegard TS, Liu GE, Bickhart DM, Da SM, Machado MA, Utsunomiya YT, Garcia JF, Gondro C, Van Tassell CP. 2013. Genomic divergence of zebu and taurine cattle identified through high-density SNP genotyping. BMC Genomics 14(1):876.

Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, Maller J, Sklar P, de Bakker PI, Daly MJ, et al. 2007. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 81(3):559–575.

Qiu H, Ju ZY, Chang ZJ. 1993. A survey of cattle production in China. More attention to animal genetic resources. Food and Agriculture Organization of the United Nations.

Qiu Q, Wang L, Wang K, Yang Y, Ma T, Wang Z, Zhang X, Ni Z, Hou F, Long R, et al. 2015. Yak whole-genome resequencing reveals domestication signatures and prehistoric population expansions. Nat Commun. 6:10283.

Qiu Q, Zhang G, Ma T, Qian W, Wang J, Ye Z, Cao C, Hu Q, Kim J, Larkin DM, et al. 2012. The yak genome and adaptation to life at high altitude. Nat Genet. 44(8):946–949.

Randhawa IAS, Khatkar MS, Thomson PC, Raadsma HW, Barendse W. 2016. A meta-assembly of selection signatures in cattle. PLoS One 11(4):e153013.

Rincon G, Islas-Trejo A, Casellas J, Ronin Y, Soller M, Lipkin E, Medrano JF. 2009. Fine mapping and association analysis of a quantitative trait locus for milk production traits on Bos taurus autosome 4. J Dairy Sci. 92(2):758–764.

Ruetz TJ, Lin AE, Guttman JA. 2012. Enterohaemorrhagic Escherichia coli requires the spectrin cytoskeleton for efficient attachment and pedestal formation on host cells. Microb Pathog. 52(3):149–156.

Sartori R, Bastos MR, Baruselli PS, Gimenes LU, Ereno RL, Barros CM. 2010. Physiological differences and implications to reproductive management of Bos taurus and Bos indicus cattle in a tropical environment. Reprod Domest Rumin Vii. 7(1):357–375.

Sattler K, Levkau B. 2009. Sphingosine-1-phosphate as a mediator of high-density lipoprotein effects in cardiovascular protection. Cardiovasc Res. 82(2):201–211.

Scherf BD, Pilling D. 2015. The second report on the state of the world's animal genetic resources for food and agriculture. Food and Agriculture Organization of the United Nations.

Schiffels S, Durbin R. 2014. Inferring human population size and separation history from multiple genome sequences. Nat Genet. 46(8):919–925.

Setoguchi K, Watanabe T, Weikard R, Albrecht E, Kuhn C, Kinoshita A, Sugimoto Y, Takasuga A. 2011. The SNP c.1326T>G in the non-SMC condensin I complex, subunit G (NCAPG) gene encoding a p.Ile442Met variant is associated with an increase in body frame size at puberty in cattle. Anim Genet. 42(6):650–655.

Sherratt A. 1983. The secondary exploitation of animals in the Old World. World Archaeol. 15(1):90–104.

Smith TP, Grosse WM, Freking BA, Roberts AJ, Stone RT, Casas E, Wray JE, White J, Cho J, Fahrenkrug SC, et al. 2001. Sequence evaluation of four pooled-tissue normalized bovine cDNA libraries and construction of a gene index for cattle. Genome Res. 11(4):626–630.

Sorbolini S, Marras G, Gaspa G, Dimauro C, Cellesi M, Valentini A, Macciotta NP. 2015. Detection of selection signatures in Piemontese and Marchigiana cattle, two breeds with similar production aptitudes but different selection histories. Genet Sel Evol. 47:52.

Stothard P, Choi JW, Basu U, Sumner-Thomson JM, Meng Y, Liao X, Moore SS. 2011. Whole genome resequencing of black Angus and Holstein cattle for SNP and CNV discovery. BMC Genomics 12:559.

Sun YB, An ZS. 2005. Late Pliocene-Pleistocene changes in mass accumulation rates of eolian deposits on the central Chinese Loess Plateau. J Geophys Res-Atmos. 110(D23):1193–1194.

Svizzero S, Tisdell C. 2016. Input shortages and the lack of sustainability of bronze production by the Unetice. In: Working Papers on Economics, Ecology and the Environment, No. 202. Queensland (Australia): University of Queensland.

Switonski M, Mankowska M, Salamon S. 2013. Family of melanocortin receptor (MCR) genes in mammals-mutations, polymorphisms and phenotypic effects. J Appl Genet. 54(4):461–472.

Van Vuure T. 2002. History, morphology and ecology of the aurochs (Bos primigenius). Available from: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.534.6285

Wang K, Li M, Hakonarson H. 2010. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38(16):e164.

Wang M, Ding Y. 1996. The importance of work animals in rural China. World Anim Rev. 86:65–67.

Wedholm A, Larsen LB, Lindmark-Mansson H, Karlsson AH, Andren A. 2006. Effect of protein composition on the cheese-making properties of milk from individual dairy cows. J Dairy Sci. 89(9):3296–3305.

Weir BS, Cockerham CC. 1984. Estimating F-statistics for the analysis of population structure. Evolution 38(6):1358–1370.

Xu L, Bickhart DM, Cole JB, Schroeder SG, Song J, Tassell CP, Sonstegard TS, Liu GE. 2015. Genomic signatures reveal new evidences for selection of important traits in domestic cattle. Mol Biol Evol. 32(3):711–725.

Xu Y, Yu W, Xiong Y, Xie H, Ren Z, Xu D, Lei M, Zuo B, Feng X. 2011. Molecular characterization and expression patterns of serine/arginine-rich specific kinase 3 (SPRK3) in porcine skeletal muscle. Mol Biol Rep. 38(5):2903–2909.

Yahvah KM, Brooker SL, Williams JE, Settles M, Mcguire MA, Mcguire MK. 2015. Elevated dairy fat intake in lactating women alters milk lipid and fatty acids without detectible changes in expression of genes related to lipid uptake or synthesis. Nutr Res. 35(3):221.

Yoon D, Ko E. 2016. Association study between SNPs of the genes within bovine QTLs and meat quality of Hanwoo. J Anim Sci. 94(Suppl 4):145.

Yu Y, Nie L, He ZQ, Wen JK, Jian CS, Zhang YP. 1999. Mitochondrial DNA variation in cattle of south China: origin and introgression. Anim Genet. 30(4):245–250.

Zhang H, Paijmans JL, Chang F, Wu X, Chen G, Lei C, Yang X, Wei Z, Bradley DG, Orlando L, et al. 2013. Morphological and genetic evidence for early Holocene cattle management in northeastern China. Nat Commun. 4:2755.

Zhang H, Wang S, Wang Z, Da Y, Wang N, Hu X, Zhang Y, Wang Y, Leng L, Tang Z, et al. 2012. A genome-wide scan of selective sweeps in two broiler chicken lines divergently selected for abdominal fat content. BMC Genomics 13:704.

Zhang Z, Wang Z, Yang Y, Zhao J, Chen Q, Liao R, Chen Z, Zhang X, Xue M, Yang H, et al. 2016. Identification of pleiotropic genes and gene sets underlying growth and immunity traits: a case study on Meishan pigs. Animal 10(4):550–557.

Zhao C, Tian F, Yu Y, Luo J, Hu Q, Bequette BJ, Baldwin VR, Liu G, Zan L, Scott UM, et al. 2012. Muscle transcriptomic analyses in Angus cattle with divergent tenderness. Mol Biol Rep. 39(4):4185–4193.

Zhao S, Zheng P, Dong S, Zhan X, Wu Q, Guo X, Hu Y, He W, Zhang S, Fan W, et al. 2013. Whole-genome sequencing of giant pandas provides insights into demographic history and local adaptation. Nat Genet. 45(1):67–71.

Zheng B, Xu Q, Shen Y. 2002. The relationship between climate change and Quaternary glacial cycles on the Qinghai–Tibetan Plateau: review and speculation. Quatern Int. 97–98:93–101.

Zhou X, Wang B, Pan Q, Zhang J, Kumar S, Sun X, Liu Z, Pan H, Lin Y, Liu G, et al. 2014. Whole-genome sequencing of the snub-nosed monkey provides insights into folivory and evolutionary history. Nat Genet. 46(12):1303–1310.

Zimin AV, Delcher AL, Florea L, Kelley DR, Schatz MC, Puiu D, Hanrahan F, Pertea G, Van Tassell CP, Sonstegard TS, et al. 2009. A whole-genome assembly of the domestic cow, Bos taurus. Genome Biol. 10(4):R42.

(ROH), inbreeding coefficients, and identity-by-descent (IBD) (supplementary table S6 and figs. S6a and c and S7, Supplementary Material online). These findings are consistent with previous studies (Lai et al. 2006; Lei et al. 2006; Decker et al. 2014), in which B. taurus × B. indicus admixture events in Chinese cattle breeds were found to have significantly increased the genetic diversity of Chinese cattle relative to European taurine cattle.

By comparing the population-specific SNPs among B. indicus (including only BRM, GIR, and NEL), LQC, and B. taurus (including RAN, JBC, HOL, JER, FLV, and LIM), we found 938, 79, and 4 population-specific nonsynonymous variants (PSNVs) with high variant allele frequency (>0.9), respectively (supplementary table S7, Supplementary Material online). Strikingly, some genes had more than three novel nonsynonymous variants (supplementary tables S8–S10, Supplementary Material online). For example, there were 12, 5, and 4 nonsynonymous mutations in the SPTBN5, RP1L1, and GHRHR genes, respectively, of B. indicus, and 9, 8, and 5 novel nonsynonymous mutations in the LOC616720, LOC101903496, and RNF213 genes, respectively, of LQC (supplementary tables S8 and S10, Supplementary Material online). In the SPTBN5 gene, there were 14 B. indicus-specific missense mutations with high allele frequency (>0.93; supplementary table S8, Supplementary Material online). These missense mutations were observed only in indicine cattle, and only two of them have been found in dbSNP. In humans, SPTBN5 imparts a certain resistance to the parasite Plasmodium falciparum and to enterohemorrhagic Escherichia coli (Ruetz et al. 2012; Labrecque et al. 2013). Indicine cattle are more resistant to thermal stress, parasites, and disease than taurine cattle (Hansen 2004; Sartori et al. 2010). This knowledge led us to speculate that the SPTBN5 gene may contribute to parasite resistance in indicine cattle. More generally, genes carrying population-specific nonsynonymous variants might have played roles in the formation of the characteristic phenotypes of each breed.
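The filter described above can be illustrated with a small sketch. The exact definition of "population-specific" is not spelled out in this passage, so the code assumes one plausible reading: the variant allele exceeds the frequency threshold in the target group and is absent from every other group. The function name, input layout, and variant identifiers are hypothetical.

```python
def population_specific(variant_freqs, target, threshold=0.9):
    """Return variant IDs that are high-frequency in target and absent elsewhere.

    variant_freqs maps a variant ID to a dict of group -> variant allele
    frequency. "Absent elsewhere" is an assumed criterion (see lead-in).
    """
    hits = []
    for variant_id, freqs in variant_freqs.items():
        in_target = freqs.get(target, 0.0)
        others = [f for group, f in freqs.items() if group != target]
        if in_target > threshold and all(f == 0.0 for f in others):
            hits.append(variant_id)
    return hits
```

A variant at frequency 0.95 in the indicine group and 0 in the other two groups would pass; one that also segregates at low frequency elsewhere would not.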

Population Structure and Characterization of Chinese Cattle Breeds

Using yak (Bos grunniens) as an outgroup, we explored the phylogenetic relationships among 151 cattle samples based on whole-genome SNP data. The resulting neighbor-joining tree supported the clustering of the taurine clade (RAN, JBC, HOL, FLV, LIM, JER, and YBC) and the indicine clade (BRM, NEL, and GIR). NYC, LXC, YNC, and LQC were grouped together near the indicine clade (fig. 1a). Interestingly, QCC was situated between these two clades and had many branches connecting to the trunk, consistent with previous studies (Lai et al. 2006; Lei et al. 2006; Decker et al. 2014), suggesting that this breed has two main ancestral components, taurine and indicine. The principal component analysis (PCA) provided similar results: all of the taurine cattle breeds except JBC and YBC formed a tight cluster clearly separate from indicine cattle, and most of the Chinese populations occupied intermediate positions between the two major clusters (fig. 1b and supplementary figs. S8 and S9, Supplementary Material online). PC2 tended to separate populations sampled in East Asia from those of India and Europe. Both the phylogenetic and PCA analyses indicated a heterogeneous nature of Chinese cattle. The genetic influence of B. taurus was greater on QCC and YBC than on the other breeds of central and southern China, whereas B. indicus contributed more to LQC, YNC, NYC, and LXC than to the remaining breeds. These results are consistent with the hypothesis that Chinese cattle breeds are admixtures of taurine (B. taurus) and indicine (B. indicus) cattle (Yu et al. 1999).

We used clustering models to estimate ancestral populations, setting K = 2 through K = 6 with ADMIXTURE (Alexander et al. 2009) for all 151 cattle samples (fig. 1c). With K changing progressively from 2 to 6, we found that Chinese breeds showed evidence of admixture, with the extent varying among breeds. The average ancestry proportions for each of the admixed populations, assuming K = 2 ancestral populations, are shown in supplementary table S11, Supplementary Material online. We found a strong association between genetic diversity and indicine descent in Chinese breeds (supplementary fig. S10, Supplementary Material online). Our results indicate that the heterogeneous nature of Chinese cattle mainly originated from hybridization between B. taurus and B. indicus.
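Averaging ancestry proportions by breed is a simple aggregation, sketched below. The input is assumed to be one row of K ancestry fractions per individual (the layout of an ADMIXTURE Q matrix) plus a parallel list of breed labels; the function name and the toy values are hypothetical and are not data from the study.

```python
def mean_ancestry_by_breed(q_rows, breeds):
    """Average per-individual ancestry fractions within each breed.

    q_rows[i] is the list of K ancestry fractions for individual i and
    breeds[i] is that individual's breed label.
    """
    sums, counts = {}, {}
    for row, breed in zip(q_rows, breeds):
        if breed not in sums:
            sums[breed] = [0.0] * len(row)
            counts[breed] = 0
        sums[breed] = [s + x for s, x in zip(sums[breed], row)]
        counts[breed] += 1
    return {b: [s / counts[b] for s in sums[b]] for b in sums}
```

For instance, two individuals of one breed with K = 2 fractions (0.9, 0.1) and (0.7, 0.3) average to (0.8, 0.2).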

Inference of Population Size from Whole-Genome Sequencing Data

Historical fluctuations in the effective population size (Ne) for all cattle were reconstructed using the Pairwise Sequentially Markovian Coalescent (PSMC) model, and two bottlenecks and two expansions were identified for all cattle (fig. 2a and supplementary figs. S11–S13, Supplementary Material online). The split between the ancestor of indicine cattle and the ancestor of taurine cattle occurred ~1.6 Ma, around which time the uplift of the Himalayas (Yuanmu movement, ~1.6 Ma) (Zheng et al. 2002) established geographical isolation. It is highly possible that the habitat of B. primigenius was split into two regions (South Asia and South China), resulting in population separation between the ancestor of B. indicus and the ancestor of B. taurus, consistent with the aforementioned results.

The cattle population declined ~0.8 Ma, at the same time as the three largest Pleistocene glaciations: the Xixiabangma Glaciation (XG, 1.1–0.8 Ma), the Naynayxungla Glaciation (NG, 0.78–0.5 Ma), and the Penultimate Glaciation (0.30–0.13 Ma) (fig. 2a). However, after a very short bottleneck in the ancestor of indicine cattle (BRM, GIR, and NEL) ~500,000 years ago, Ne recovered very quickly and reached a peak ~140,000 years ago. In contrast, the ancestor of taurine cattle (RAN, JBC, HOL, FLV, LIM, and JER) experienced a long, stable bottleneck until 70,000 years ago, consistent with their lower genetic diversity relative to that of indicine cattle. Divergence among the ancestors of indicine and taurine cattle may have begun ~0.5 Ma, coinciding with the uplift of the Tibetan–Pamir Plateau, which caused drying and desertification that were dramatically enhanced ~0.5 Ma (Fang et al. 2002). The demographic trajectories of Chinese cattle breeds (LQC, LXC, NYC, QCC, YBC, and YNC) were distinct from those of typical indicine cattle or taurine cattle due to the influence of B. taurus × B. indicus admixture events. The historical pattern of four Chinese cattle breeds (LQC, NYC, LXC, and YNC) with more descent contributed by B. indicus (>69%; supplementary table S11, Supplementary Material online) roughly correlates with the indicine lineage but is distinct from that of QCC and YBC.

It is noteworthy that a historical pattern of two bottlenecks and two expansions has been observed in many mammals, such as yak (Qiu et al. 2015), giant panda (Zhao et al. 2013), wild boar (Choi et al. 2013), snub-nosed monkey (Zhou et al. 2014), gayal (Mei et al. 2016), and bear (Miller et al. 2012). These concordant patterns suggest that terrestrial mammals might share similar demographic histories and that the evolution of terrestrial mammals at the Early–Middle Pleistocene boundary was strongly affected by global glaciations and severely cold climates.

The Multiple Sequentially Markovian Coalescent (MSMC) analysis was used to study the genetic separation between two populations as a function of time by modeling the relationships of multiple haplotypes. For each population split scenario, the relative cross coalescence rate estimates were obtained by dividing the cross-population coalescence rate by the average within-population coalescence rate (Schiffels and Durbin 2014). Based on the analysis of four haplotypes for each pair of populations, the MSMC results show that the beginning of the split between the NEL ancestors and the GIR and BRM ancestors occurred 11,000 years ago. This split occurred shortly after the Younger Dryas epoch (an abrupt, rapid cooling period that occurred 12,800–11,500 years ago; Chen et al. 2006) (fig. 2b). The split between the NEL and BRM ancestors occurred ~6,600 years ago. After separation, the Ne of indicine cattle expanded, reaching a peak ~1,500 years ago, whereas the Ne of taurine cattle remained stable due to inbreeding and artificial domestication (supplementary fig. S14, Supplementary Material online). These data suggest that NEL, GIR, and BRM shared the same ancestor and that NEL separated from indicine ancestors earlier than GIR and BRM did. As we observed, the separation was slow and might have been the result of continuous gene exchange among these breeds.
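The relative cross coalescence rate described above can be illustrated with a minimal sketch. The function name and the per-interval rates below are hypothetical and are not taken from the study or from the MSMC software; the code only restates the ratio given in the text (cross-population rate divided by the mean of the two within-population rates).

```python
def relative_cross_coalescence_rate(lambda_within_1, lambda_within_2, lambda_cross):
    """Relative cross coalescence rate for one time interval.

    The cross-population coalescence rate is divided by the mean of the
    two within-population coalescence rates.
    """
    mean_within = (lambda_within_1 + lambda_within_2) / 2.0
    if mean_within <= 0:
        raise ValueError("within-population coalescence rates must be positive")
    return lambda_cross / mean_within


# Hypothetical rates for three time intervals (illustration only).
# Columns: years ago, within-population 1, within-population 2, cross-population.
intervals = [
    (2_000, 4.0e-4, 6.0e-4, 0.5e-4),
    (6_000, 3.0e-4, 5.0e-4, 2.0e-4),
    (12_000, 2.0e-4, 2.0e-4, 2.0e-4),
]

rccr = [
    (years, relative_cross_coalescence_rate(l11, l22, l12))
    for years, l11, l22, l12 in intervals
]
```

Values near 1 indicate that the two populations still behave as one, and values near 0 indicate that they are fully separated, so in this toy series the separation deepens toward the present.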

Different from the split among indicine cattle groups, a sharp separation among the ancestors of taurine breeds occurred 5,000–2,000 years ago, with a split time at 3,500 years ago (fig. 2c), coinciding with the Unetice culture (4,200–3,500 years ago). Taurine breeds underwent strong domestication in a very short time. We infer that the considerable economic prosperity based on the diversified agriculture of the Unetice culture contributed to the early formation of European cattle breeds (Svizzero and Tisdell 2016).

FIG. 1. Population genetic analysis. (a) Neighbor-joining tree of indicine, taurine, and hybrid cattle based on our data and publicly available whole-genome sequencing data. Orange branches indicate indicine cattle, green branches represent taurine cattle, and blue branches represent hybrid cattle. The scale bar represents the identity-by-state (IBS) score between individuals. (b) Principal component analysis of cattle. (c) Genetic structure of cattle breeds using the ADMIXTURE program.

Signatures of Positive Selection in Cattle Genome
We identified regions that exhibit high levels of differentiation among cattle breeds using the di statistic in a reduced data set containing breeds with at least 10 samples (fig. 3). We treated GIR, NEL, and BRM as one group (IND) based on their close genetic relationships, as evidenced by both the PCA and phylogenetic results (fig. 1a and b). Those windows with the highest average di values within each breed, which fell into the upper 99th percentile of the empirical distribution, were considered putative signatures of selection (supplementary fig. S15, Supplementary Material online).

In total, we identified 2,842 potential selective sweep regions in one or more of the seven breeds (full genomic regions are detailed in supplementary tables S12–S18, Supplementary Material online), which had an average size of 67 kb (ranging from 11 kb to 1,150 kb). These regions harbored 1,429 protein-coding genes, 682 (47.72%) of which were previously identified as under positive selection in cattle (Randhawa et al. 2016). More specifically, we detected 357, 381, 232, 234, 307, 300, and 307 potentially positively selected genes on breed-specific selection events in the IND, QCC, JBC, RAN, FLV, HOL, and JER genomes, respectively (fig. 3a and supplementary tables S19–S25, Supplementary Material online).

To obtain a broad overview of the molecular functions of these genes and to test the hypothesis that particular functional classes are enriched in the most differentiated regions of the cattle genome, we performed a gene ontology (GO) analysis using ClueGO (Bindea et al. 2009) for each group separately. A potential concern regarding this analysis is the low power to detect enrichment due to the low expected counts for many categories. Nonetheless, several categories showed enrichment for signals of positive selection in one or more groups (supplementary table S26, Supplementary Material online), including the related categories of cellular response to UV, as well as immune response and pathogen defence. These findings suggest that immune-related genes are pervasive targets of positive selection because of their critical role in immune and defence functions.

Many genes associated with shaping particular characteristics of the populations are present within these regions (table 1). These include genes underlying morphological traits (coat color, horn, polledness) and production traits (dairy, muscle formation, skeletal development, energy partitioning, fertility, draft traits).

Several genes involved in coat color phenotypes were identified as targets of positive selection in one or more groups (table 1), including ERCC2 in QCC and IND, MC1R in IND, ZBTB17 in QCC, and MAP2K1 in JBC. One of these genes, MC1R, is well known for its role in regulating the switch between eumelanin and pheomelanin biosynthesis pathways in mammals, including cattle. The selection signals of IND in MC1R represent the significant role of light coloration (light gray to white in BRM and NEL, yellowish-red to white in GIR) associated with the adaptation of IND to its tropical environment.

FIG. 2. Demographic history of cattle. (a) Ancestral population size is inferred using PSMC. A generation time of 5 years and a mutation rate of 0.98 × 10⁻⁸ mutations per nucleotide per generation are used. The relative cross coalescence rates over time between indicine (b) and taurine (c) breeds are estimated using MSMC with four haplotypes for each pair.

Some of the strongest signals of selection appeared in various types of genes related to production traits (table 1). For example, several genes involved in milk production showed clear evidence of positive selection in dairy cattle (NCAPG in both FLV and JER; MAPK7 in FLV; FST, ITFG1, SETMAR, and PAG1 in HOL; CSN3 and RPL37A in JER). Various genes involved in meat traits have also been targets of recent positive selection (table 1). Some genes related to skeletal muscle development and muscle fibre type appeared to be targets of positive selection, including CASP9, DIO1, SREBF2, and PLOD3 in QCC; ASGR1, IGFBP2, IGFBP5, and MYH9 in JBC; OSTN, CPT2, CSF2RB, and SLC2A4 in FLV; and SLC2A5, TMEM97, MYH4, SRPK3, and POLDIP2 in RAN. We also found that a set of important genes associated with lipid metabolism were putatively positively selected (AOX1 in QCC, RAN, FLV, JER, and IND; MC5R in IND; BBS2, S1PR3, and LRP2BP in JBC). Interestingly, we identified a missense mutation in BBS2 (exon 15, rs135889003, c.A1880G, p.Q627R) that was almost fixed (allele frequency >0.95) in JBC, a breed known for producing the intensely marbled Wagyu beef (with >30% intramuscular fat) (Gotoh et al. 2014). BBS2 is a member of the Bardet–Biedl syndrome gene family, the primary clinical feature of which is obesity, and has been found to play a significant role in adipogenesis (Forti et al. 2007). The positive selection signals near the BBS2 region are further confirmed by significantly lower values of Tajima's D and the long haplotype patterns in JBC (fig. 3b), which may be useful as a genetic target for breeding selection for beef marbling improvement. In addition, R3HDM1, a gene associated with efficient food conversion and intramuscular fat content, showed signals of positive selection in five groups (QCC, FLV, HOL, JER, and IND). These genes might be associated with the tenderness and quality of meat in cattle.

FIG. 3. Candidate positively selected genes. (a) Shared candidate positively selected genes among groups. Only partial numbers are shown. (b) An example of selective sweep at the BBS2 gene in JBC; only positive values of di are shown (top). The Tajima's D values in each group are shown (middle). SNPs with minor allele frequencies >0.05 are used to construct haplotype patterns (bottom). The major alleles in JBC are green and the minor alleles in JBC are yellow.

Table 1. Summary of Partial Traits Associated with Positively Selected Genes.

| Gene | Breed | Trait | Reference |
| --- | --- | --- | --- |
| MC1R | IND | CC | Lee et al. 2002; Gan et al. 2007 |
| MAP2K1^a | JBC | CC | Gutierrez-Gil et al. 2015 |
| ZBTB17 | QCC | CC | Gutierrez-Gil et al. 2015 |
| ERCC2^a | QCC, IND | CC | Gutierrez-Gil et al. 2015 |
| FST, ITFG1, SETMAR, PAG1 | HOL | DY | Bech-Sabat et al. 2008; Rincon et al. 2009; Bloise et al. 2010; Xu et al. 2015 |
| RPL37A^a, CSN3 | JER | DY | Wedholm et al. 2006; Yahvah et al. 2015 |
| MAPK7^a | FLV | DY | Lin et al. 2013 |
| NCAPG | FLV, JER | DY | Eberlein et al. 2009; Setoguchi et al. 2011 |
| BBS2^a, S1PR3^a, LRP2BP^a, IGFBP2, IGFBP5, MYH9, ASGR1 | JBC | MT, GT | Forti et al. 2007; Sattler and Levkau 2009; Zhang et al. 2012; Lee et al. 2013; Sorbolini et al. 2015; Yoon and Ko 2016 |
| SRPK3^a, POLDIP2^a, SLC2A5, TMEM97, MYH4 | RAN | MT, GT | Smith et al. 2001; Clark et al. 2011; Xu et al. 2011; Zhao et al. 2012; Lee et al. 2013; Zhang et al. 2016 |
| CASP9^a, DIO1, SREBF2, PLOD3 | QCC | MT, GT | Ouali et al. 2006; Lee et al. 2013 |
| SLC2A4^a, OSTN, CPT2, CSF2RB | FLV | MT, GT | Grindflek et al. 2002; Lee et al. 2013; Xu et al. 2015 |
| MC5R^a | IND | MT, GT | Kovacik et al. 2012; Switonski et al. 2013 |
| AOX1^a | QCC, RAN, FLV, JER, and IND | MT, GT | Brandes et al. 1995 |
| R3HDM1 | QCC, FLV, HOL, JER, and IND | MT, FC | Gibbs et al. 2009 |

CC, coat color; MT, meat traits; GT, growth traits; DY, dairy traits; FC, food conversion efficiency.
^a Newly identified genes associated with phenotypic features of cattle.

Conclusion
Whole-genome sequencing of representative Chinese cattle breeds and two additional breeds (JBC and RAN) generated a comprehensive catalogue of genetic variations. This is the first population genomic study on Chinese cattle to use next-generation whole-genome sequencing data and is an important source of genetic information for cattle worldwide. Bovine haplotypes have been inferred in Mongolian yaks, with recent admixture at least 1,500 years ago (Medugorac et al. 2017). It is highly possible that there was recent introgression from yak (B. grunniens) to Chinese cattle, as suggested by previous studies (Lei et al. 2000; Cai et al. 2007, 2014). The genetic influence of yak is too limited to have been detected in the representative cattle breeds examined in our study. We also discovered many potential selective sweeps associated with domestication related to breed-specific characteristics, with selective sweep regions including genes associated with coat color, dairy traits, and meat production/quality traits. Collectively, these findings substantially expand the catalogue of genetic variants in cattle and reveal new insights into the evolutionary history and domestication traits of Chinese cattle.

Materials and Methods

Sample Collection and Sequencing
To represent the overall genetic diversity of Chinese cattle, we selected 46 samples from 6 representative Chinese cattle breeds with divergent phenotypic characters across the main geographic distribution: Qinchuan cattle (QCC, n = 37), Nanyang cattle (NYC, n = 2), Luxi cattle (LXC, n = 1), Yanbian cattle (YBC, n = 2), Yunnan cattle (YNC, n = 2), and Leiqiong cattle (LQC, n = 2). For comparison, samples from two specialized beef cattle breeds, Red Angus (RAN, n = 18) and Japanese Black cattle (JBC, n = 11), were also collected (supplementary table S2, Supplementary Material online). Total genomic DNA was extracted from the blood samples of the animals using a standard phenol–chloroform protocol. For each individual, at least 5 μg of genomic DNA was used to construct paired-end libraries with an insert size of 500 bp according to Illumina's library preparation protocol. Moreover, we collected 76 genome sequences from previous studies for the breeds Brahman (BRM, indicine, n = 6), Nelore (NEL, indicine, n = 5), Gir (GIR, indicine, n = 4), Limousin (LIM, taurine, n = 6), Jersey (JER, taurine, n = 18), Fleckvieh (FLV, taurine, n = 19), and Holstein (HOL, taurine, n = 18) (details in supplementary tables S1 and S2, Supplementary Material online).

Alignments and Variant Identification
Paired-end reads (100 bp) obtained from sequencing in the present study and previous studies were mapped to the B. taurus genome (UMD3.1) (Zimin et al. 2009) using BWA (Li and Durbin 2009) with the default parameters. Sequence Alignment/Map (SAM) format files were imported into SAMtools (Li et al. 2009) for sorting and merging and into Picard (http://broadinstitute.github.io/picard/, version 1.92) to remove duplicated reads. To identify the ancestral state of cattle, we mapped the raw reads of yak (Qiu et al. 2012), sequenced to 65×, to the reference genome.

Initial variant site identification was performed using SAMtools mpileup and GATK UnifiedGenotyper (Genome Analysis Toolkit, version 2.4-9) (McKenna et al. 2010) with the default settings. The overlapping subset of 53,979,675 single-nucleotide polymorphisms (SNPs) and 5,924,578 small insertions and deletions (InDels; 91% of InDels were 1–30 bp in length, and the largest InDel was 403 bp in length) was defined as a high-confidence catalogue used for base quality recalibration using GATK with the default set of covariates. The resulting recalibrated BAM files were then used as input for a second round of variant calling with GATK. The resulting variant calls were analyzed, and approximately the highest scoring 10% of the predicted variant sites were used as a training set for variant quality recalibration and filtering using GATK. These steps resulted in 60,031,459 SNPs and 5,603,383 InDels. To obtain high-quality results for further analyses, we only retained biallelic SNPs and InDels with >90% calling rates, resulting in 57,220,105 SNPs and 5,270,518 InDels. Beagle (Browning and Browning 2007), which has been shown to yield highly accurate solutions, was used to improve the genotype calls using genotype likelihoods from GATK and to infer the haplotypes in the sample. Short InDels were not included in the diversity or divergence estimates and were not included in the other analyses. Variants (SNPs and InDels) were annotated using ANNOVAR (Wang et al. 2010).
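The final retention rule above (biallelic sites with a calling rate above 90%) can be sketched as a small predicate. This is only an illustration under stated assumptions: the function name, the genotype string encoding, and the toy sites are hypothetical, and the study applied its filters with its own tooling rather than with this code.

```python
def passes_filters(alt_alleles, genotypes, min_call_rate=0.90):
    """Return True for a biallelic site whose call rate exceeds min_call_rate.

    alt_alleles: list of alternate alleles observed at the site.
    genotypes:   per-sample genotype strings such as "0/1"; any "." marks
                 a missing call (assumed encoding, as in VCF files).
    """
    if len(alt_alleles) != 1:  # keep biallelic sites only
        return False
    if not genotypes:
        return False
    called = sum(1 for g in genotypes if "." not in g)
    return called / len(genotypes) > min_call_rate


# Toy sites (hypothetical data): 20 samples each.
site_ok = passes_filters(["T"], ["0/1"] * 19 + ["./."])        # 95% called
site_low = passes_filters(["T"], ["0/1"] * 18 + ["./."] * 2)   # exactly 90%
site_multi = passes_filters(["T", "G"], ["0/1"] * 20)          # multiallelic
```

The comparison is strict, so a site called in exactly 90% of samples is dropped, matching the ">90%" wording.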

Phylogenetic and Population Structure Analyses
A phylogenetic tree was constructed from the SNP data using the neighbor-joining method in the program PHYLIP v3.695 (http://evolution.genetics.washington.edu/phylip.html), and distance matrices were calculated using PLINK (Purcell et al. 2007). The ancestral states of the SNPs were determined using a close relative of cattle, B. grunniens, as the outgroup. Population structure was further inferred using ADMIXTURE (Alexander et al. 2009) with the number of ancestral populations (K) set from 2 to 7. Principal component analysis was carried out using the smartPCA program of the EIGENSOFT (Patterson et al. 2006) package.
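As an illustration of the kind of pairwise distance that feeds a neighbor-joining tree, the sketch below computes an identity-by-state (IBS) distance from genotype vectors. It assumes genotypes coded as alternate-allele counts (0, 1, 2) with None for missing calls and defines the distance as one minus the mean proportion of alleles shared; the function names are hypothetical, and this is not a reimplementation of PLINK.

```python
def ibs_distance(g1, g2):
    """1 - mean IBS sharing between two individuals.

    g1, g2: genotype vectors coded as alternate-allele counts (0, 1, 2);
            None marks a missing call and the site is skipped.
    """
    shared = 0.0
    n_sites = 0
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue
        shared += 1.0 - abs(a - b) / 2.0  # proportion of alleles shared at this site
        n_sites += 1
    if n_sites == 0:
        raise ValueError("no sites called in both individuals")
    return 1.0 - shared / n_sites


def distance_matrix(genotypes):
    """Symmetric matrix of pairwise IBS distances for a list of individuals."""
    n = len(genotypes)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = ibs_distance(genotypes[i], genotypes[j])
    return matrix


# Toy genotypes for three individuals (hypothetical data).
toy = [[0, 1, 2, None], [1, 1, 0, 2], [0, 1, 2, 2]]
dist = distance_matrix(toy)
```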

Genome-Wide Patterns of Genetic Diversity and Divergence
The average pairwise nucleotide diversity (θπ) and Tajima's D statistic of each breed were calculated using a sliding window approach (50-kb sliding windows in 10-kb steps) with the default parameters of VCFtools (Danecek et al. 2011). Population differentiation was measured by pairwise FST using the unbiased estimator of Weir and Cockerham (1984) with the default parameters.
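A minimal sketch of the two window statistics named above follows. It uses the standard definitions of pairwise nucleotide diversity and Tajima's D for a sample of n chromosomes; the function names are hypothetical, windows that run past the chromosome end are simply truncated, and the study itself relied on VCFtools rather than on this code.

```python
from math import sqrt


def sliding_windows(chrom_length, size=50_000, step=10_000):
    """Yield (start, end) half-open windows of `size` bp advanced by `step` bp."""
    start = 0
    while start < chrom_length:
        yield start, min(start + size, chrom_length)
        start += step


def pi_and_tajimas_d(alt_counts, n):
    """Summed nucleotide diversity and Tajima's D for one window.

    alt_counts: alternate-allele counts at the SNPs in the window.
    n:          number of sampled chromosomes (2 x diploid individuals).
    """
    counts = [c for c in alt_counts if 0 < c < n]  # segregating sites only
    s = len(counts)
    pi = sum(2.0 * c * (n - c) / (n * (n - 1)) for c in counts)
    if s == 0:
        return pi, float("nan")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * s + e2 * s * (s - 1)
    if variance <= 0:
        return pi, float("nan")
    return pi, (pi - s / a1) / sqrt(variance)


windows = list(sliding_windows(100_000))
pi_rare, d_rare = pi_and_tajimas_d([1, 1, 1, 1, 1], n=10)    # only singletons
pi_common, d_common = pi_and_tajimas_d([5, 5, 5, 5, 5], n=10)  # intermediate frequency
```

Dividing the summed diversity by the window length gives a per-site value. An excess of rare alleles pushes D below zero, which is the direction reported for the BBS2 region in JBC.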

Linkage Disequilibrium
To estimate the genome-wide LD of each breed, we calculated the mean r² values for pairwise markers with Haploview (Barrett et al. 2005) software. Only SNPs with a minor allele frequency >0.05 in three groups (Chinese cattle, indicine, and taurine) were used. The parameters of Haploview were set to "-maxdistance 200 -dprime -memory 5000 -minGeno 0.6 -minMAF 0.05 -hwcutoff 0.001". To minimize the influence of sample size, only breeds with at least five individuals were used, and breeds with more than five samples were downsampled to five.
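For readers unfamiliar with the r² measure, the sketch below computes it for a pair of biallelic SNPs. It assumes phased haplotypes coded as 0/1 alleles, which is a simplification: Haploview works from genotype data, and the function name and toy haplotypes here are hypothetical.

```python
def r_squared(hap_a, hap_b):
    """r^2 between two biallelic SNPs from phased haplotypes.

    hap_a, hap_b: equal-length lists of 0/1 alleles, one entry per haplotype.
    """
    n = len(hap_a)
    if n == 0 or n != len(hap_b):
        raise ValueError("haplotype lists must be non-empty and of equal length")
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:  # at least one SNP is monomorphic in this sample
        return float("nan")
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    return d * d / denom


# Ten haplotypes, i.e. five diploid individuals as in the downsampled breeds.
snp1 = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
snp2 = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]  # perfectly correlated with snp1
snp3 = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]  # only weakly associated with snp1
```

Averaging such values over all SNP pairs within a given physical distance yields the LD decay curves that the mean r² summarizes.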

Haplotype Diversity
For the haplotype diversity analysis, the same breeds and SNP set were used as in the linkage disequilibrium analysis. To calculate haplotype diversity, the genome was divided into 5- to 500-kb bins (detailed in supplementary table S6, Supplementary Material online). Windows with fewer than two SNPs per 5 kb were removed, and for those with more than four SNPs, four SNPs were randomly selected. Considering the substantial variation in the recombination rate across the cattle genome, we adopted a sliding-window strategy and allowed the window to slide by half its length each time. The frequencies of haplotypes were counted, and haplotype diversity (H) was calculated as described previously (Daetwyler et al. 2014).
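The windowing and counting steps above can be sketched as follows. The exact formula used by Daetwyler et al. (2014) is not restated in this section, so the code assumes the widely used Nei estimator, H = n/(n − 1) × (1 − Σ pᵢ²), where pᵢ is the frequency of haplotype i among n sampled haplotypes; the function names are hypothetical.

```python
from collections import Counter


def half_overlapping_windows(length, size):
    """Yield (start, end) windows of `size` bp that slide by half their length."""
    step = size // 2
    start = 0
    while start + size <= length:
        yield start, start + size
        start += step


def haplotype_diversity(haplotypes):
    """Nei's haplotype diversity (assumed estimator) for one window.

    haplotypes: list of haplotype strings, e.g. "0110" for four SNPs.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two haplotypes are required")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Toy window with four SNPs and four sampled haplotypes (hypothetical data).
h_two_types = haplotype_diversity(["0101", "0101", "1100", "1100"])
h_all_same = haplotype_diversity(["0101"] * 4)
windows_10kb = list(half_overlapping_windows(20_000, 10_000))
```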

PSMC Analysis
We inferred the demographic history of B. taurus and B. indicus using the Pairwise Sequentially Markovian Coalescent (PSMC) model (Li and Durbin 2011). In the default PSMC approach, a whole-genome diploid consensus sequence is generated using the alignment file from one sample. Given that most of our genomes were not sequenced to a high average depth of coverage (mostly 10×) and that PSMC has high false-negative rates at low depths of coverage (i.e., <20×), leading to a systematic underestimation of true event times (Orlando et al. 2013; Nadachowska-Brzyska et al. 2016), we applied a modified PSMC approach: the SNPs of one sample were extracted from variants called on cohorts of all samples and converted to consensus sequences. This procedure was followed for samples (marked in supplementary table S1, Supplementary Material online) with relatively high sequencing depth in each breed to ensure the quality of consensus sequences. We then transformed the consensus sequence into a fasta-like format using "fq2psmcfa". The PSMC parameters were set as follows: "-p 4+25*2+4+6". The mutation rate per generation per site was estimated as μ = D × g / (2 × T), where D is the observed frequency of pairwise differences between two species, T is the estimated divergence time, and g is the estimated generation time for the two species. The cattle generation time (g) was set to an estimate of 5 years, and the estimated divergence time was set to 4.9 Ma based on a previous study on cattle and yak (Qiu et al. 2012). These values yielded an estimated mutation rate of 9.796 × 10⁻⁹ mutations per generation per site. We obtained the mass accumulation rate (MAR) of Chinese loess of the past 3.6 My (Sun and An 2005), an index indicating cold and dry or warm and wet climatic periods in China (fig. 2a and supplementary figs. S13 and S14, Supplementary Material online).
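The mutation rate formula above can be checked with a few lines of arithmetic. The sketch below encodes μ = D × g / (2 × T) and back-calculates the pairwise divergence D implied by the reported values; the text does not state D, so the implied value (about 1.9%) is a derived illustration rather than a figure from the study, and the function and variable names are hypothetical.

```python
def mutation_rate_per_generation(d, generation_time_years, divergence_time_years):
    """mu = D * g / (2 * T), in mutations per site per generation."""
    return d * generation_time_years / (2.0 * divergence_time_years)


G = 5.0        # generation time in years (from the text)
T = 4.9e6      # cattle-yak divergence time in years (from the text)
MU = 9.796e-9  # reported mutation rate per site per generation

# Pairwise divergence implied by the reported values (back-calculated).
implied_d = MU * 2.0 * T / G

# Plugging the implied divergence back in recovers the reported rate.
recovered_mu = mutation_rate_per_generation(implied_d, G, T)
```

The factor of 2 in the denominator reflects that differences accumulate along both lineages since the split.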

To evaluate the differences between our revised PSMC approach and the default method, we reconstructed trajectories from two samples with different depths of coverage (SRR1262805 with 24× and SRR1262808 with 9×) of the same breed (FLV), which should yield similar inferences. The PSMC profiles retrieved from the default and revised approaches for the high-depth sample were found to be almost identical (supplementary fig. S11, Supplementary Material online) with regard to both the timing and the magnitude of demographic events, except for the most recent expansion phase, in which a lower intensity was found using the revised approach. We found that PSMC inference based on the low-depth sample showed a biased demographic model and could be satisfactorily corrected with our revised PSMC approach. Additionally, we note that the bias observed for genomes with low depth (<20×) could also be corrected, assuming a uniform false-negative rate (uFNR), by using the option "-M" of the plotting script "psmc_plot.pl" to specify the uFNR correction rate (Orlando et al. 2013; Hung et al. 2014). With the uFNR correction, the plot for the low-depth sample was similar to the high-depth PSMC inference (supplementary fig. S11, Supplementary Material online). No striking differences were observed among the PSMC profiles reconstructed from different taurine breeds with different sequencing depths of coverage (ranging from 9× to 24×; supplementary table S1, Supplementary Material online, and fig. 2a). Consequently, we found our revised approach to be a suitable method that introduced only acceptable biases when applying PSMC inference to samples with low average sequencing depth.

To explore the potential impact of the reference genome on the PSMC results of indicine breeds, we mapped sequence reads of indicine samples against the assembly of B. indicus (Nelore breed, GenBank assembly accession GCF_000247795.1) and repeated the PSMC analysis (default setting with uFNR correction). Although the PSMC profiles reconstructed from different references were not identical, the qualitative results hold for indicine breeds with the B. indicus reference genome (fig. 2a and supplementary fig. S12, Supplementary Material online).

MSMC Analysis
The Multiple Sequentially Markovian Coalescent (MSMC) model (Schiffels and Durbin 2014) was used to infer changes in effective population size (Ne) and divergence time between breeds (samples marked in supplementary table S1, Supplementary Material online). MSMC is an extension of the PSMC model, which uses a hidden Markov model to scan genomes and analyze patterns of heterozygosity, with long DNA segments with low heterozygosity reflecting recent coalescent events. The rate of coalescent events is then used to estimate Ne at a given time. To scale the output of MSMC to real-time population sizes, we used the generation time and mutation rate mentioned earlier (description of PSMC analysis). We obtained atmospheric surface air temperature (SAT) and global sea level (GSL) data of the past 3 My (Bintanja and van de Wal 2008) (supplementary figs. S13 and S14, Supplementary Material online).

Selective Sweep Analysis
Considering the sample size and close genetic background of indicine cattle (NEL, BRM, and GIR), we pooled these three indicine breeds into one group (IND) in our selection analysis. Seven cattle groups (QCC, RAN, JBC, IND, HOL, FLV, and JER) with sample sizes >10 were retained for the following analysis. To identify candidate loci for breed-specific phenotypes that are known to be under positive selection, we used the di statistic (Akey et al. 2010) to measure the locus-specific divergences in allele frequency for each group based on unbiased estimates of pairwise FST. Briefly, for each SNP, we calculated the statistic

$$d_i = \sum_{j \neq i} \frac{F_{ST}^{ij} - E[F_{ST}^{ij}]}{sd[F_{ST}^{ij}]},$$

where $E[F_{ST}^{ij}]$ and $sd[F_{ST}^{ij}]$ denote the expected value and SD of FST between groups i and j calculated from all SNPs. For each group, di was averaged over the SNPs in nonoverlapping 50-kb windows. Windows with SNP number <10 were removed. The top 1% of windows with the highest mean di score were defined as candidate selective sweep regions. Adjacent sweeps within a distance of 50 kb were merged into one sweep. Selective sweep regions were annotated with cattle QTLdb release 29 from the Animal Quantitative Trait Loci Database (Hu et al. 2016). Candidate genes under positive selection were defined as those in which more than half of the gene interval was found in selective sweep regions. Tajima's D statistic was computed using VCFtools for each candidate gene. Gene Ontology (GO) enrichment analysis for genes in selective sweep regions was performed with a hypergeometric test using ClueGO (Bindea et al. 2009). The false discovery rate (FDR) was used to correct the P values with the Benjamini–Hochberg approach.
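The sweep-detection steps described above (per-SNP di, window averaging, top 1% cutoff, merging of nearby windows) can be sketched in plain Python. All function names and the toy inputs are hypothetical, the use of the population SD is an assumption, and this is not the pipeline used in the study.

```python
from statistics import mean, pstdev


def d_statistics(fst, groups):
    """Per-SNP d_i for every group.

    fst: dict mapping frozenset({i, j}) to a list of per-SNP FST values.
    Each pairwise FST is standardised across all SNPs, then summed over j != i.
    """
    z = {}
    for pair, values in fst.items():
        mu, sd = mean(values), pstdev(values)  # population SD: an assumption
        z[pair] = [(v - mu) / sd if sd > 0 else 0.0 for v in values]
    n_snps = len(next(iter(fst.values())))
    return {
        i: [sum(z[frozenset((i, j))][k] for j in groups if j != i) for k in range(n_snps)]
        for i in groups
    }


def window_means(positions, values, size=50_000, min_snps=10):
    """Mean d_i per nonoverlapping window; windows with too few SNPs are dropped."""
    acc = {}
    for pos, v in zip(positions, values):
        total, n = acc.get(pos // size, (0.0, 0))
        acc[pos // size] = (total + v, n + 1)
    return {w * size: total / n for w, (total, n) in acc.items() if n >= min_snps}


def top_fraction(window_scores, fraction=0.01):
    """Start coordinates of the highest-scoring windows (at least one)."""
    ranked = sorted(window_scores, key=window_scores.get, reverse=True)
    return sorted(ranked[: max(1, int(len(ranked) * fraction))])


def merge_sweeps(starts, size=50_000, max_gap=50_000):
    """Merge candidate windows whose gap is at most max_gap bp."""
    merged = []
    for s in sorted(starts):
        if merged and s - merged[-1][1] <= max_gap:
            merged[-1] = (merged[-1][0], s + size)
        else:
            merged.append((s, s + size))
    return merged


# Toy example with three groups and four SNPs (hypothetical FST values).
toy_fst = {
    frozenset(("A", "B")): [0.1, 0.1, 0.1, 0.5],
    frozenset(("A", "C")): [0.2, 0.2, 0.2, 0.6],
    frozenset(("B", "C")): [0.0, 0.1, 0.0, 0.1],
}
toy_d = d_statistics(toy_fst, ["A", "B", "C"])
```

In the toy data, the fourth SNP is strongly differentiated between group A and both other groups, so it receives the largest d value for A.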

Supplementary Material
Supplementary data are available at Molecular Biology and Evolution online.

Author Contributions
L.-S.Z., W.-J.Z., G.C., and H.-B.W. led the experiments and designed the analytical strategy. L.-S.Z., C.-G.M., H.-C.W., W.-Q.T., L.-S.G., Y.-Y.Z., Z.-L.J., Y.-P.X., and X.-Z.S. performed animal work and prepared biological samples. C.-G.M., H.-C.W., G.C., H.-B.W., C.-P.Z., A.-N.L., W.-C.Y., C.-L.J., and S.-H.W. constructed the DNA library and performed sequencing. W.-J.Z., Q.-J.L., L.-Z.W., X.-L.W., X.-M.G., and C.-Z.W. detected, annotated, and summarized variants. W.-J.Z., Q.-J.L., C.-G.M., and H.-C.W. performed selection analysis. W.-J.Z., L.-Z.W., C.-G.M., and H.-C.W. analyzed the origin of Chinese indicine cattle and population history. C.-G.M., H.-C.W., W.-J.Z., Q.-J.L., and L.-Z.W. wrote the manuscript. L.-S.Z., S.-C.Z., J.-Z.S., G.L., X.-D.F., X.Z., S.S., H.-M.Y., J.W., and R.H. revised the manuscript. All the authors reviewed and approved the final manuscript.

Acknowledgments
We thank many people not listed as authors who provided feedback, samples, and encouragement, especially Changguo Yan, Yimin Xu, Shanzhai Liu, Guanli Wang, Xiang Gao, Jianghong Wan, and Kaixing Qu. This work was supported by the National 863 Program of China (2013AA102505), the National Science-technology Support Plan Projects (2015BAD03B04), the Program of National Beef Cattle and Yak Industrial Technology System (CARS-37), the Technical Innovation Engineering Project of Shaanxi Province (2014KTZB02-02-01), the National Beef Cattle Improvement Center, the National & Local Joint Engineering Research Center for Modern Cattle Biotechnology and Application, the Beef Cattle Engineering Technology Research Center of Shaanxi Province, and the State Key Laboratory of Agricultural Genomics.

References
Akey JM, Ruhe AL, Akey DT, Wong AK, Connelly CF, Madeoy J, Nicholas TJ, Neff MW. 2010. Tracking footprints of artificial selection in the dog genome. Proc Natl Acad Sci U S A. 107(3):1160–1165.

Alexander DH, Novembre J, Lange K. 2009. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19(9):1655–1664.

Barrett JC, Fry B, Maller J, Daly MJ. 2005. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics. 21(2):263–265.

Bech-Sabat G, Lopez-Gatius F, Yaniz JL, Garcia-Ispierto I, Santolaria P, Serrano B, Sulon J, de Sousa NM, Beckers JF. 2008. Factors affecting plasma progesterone in the early fetal period in high producing dairy cows. Theriogenology. 69(4):426–432.

Bickhart DM, Xu L, Hutchison JL, Cole JB, Null DJ, Schroeder SG, Song J, Garcia JF, Sonstegard TS, Van Tassell CP, et al. 2016. Diversity and population-genetic properties of copy number variations and multicopy genes in cattle. DNA Res. 23(3):253–262.

Bindea G, Mlecnik B, Hackl H, Charoentong P, Tosolini M, Kirilovsky A, Fridman WH, Pages F, Trajanoski Z, Galon J. 2009. ClueGO: a Cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks. Bioinformatics. 25(8):1091–1093.

Bintanja R, van de Wal RSW. 2008. North American ice-sheet dynamics and the onset of 100,000-year glacial cycles. Nature. 454(7206):869–872.


Bloise E Cassali GD Ferreira MC Ciarmela P Petraglia F Reis FM 2010Activin-related proteins in bovine mammary gland localization anddifferential expression during gestational development and differen-tiation J Dairy Sci 93(10)4592ndash4601

Brandes R Arad R Bar-Tana J 1995 Inducers of adipose conversionactivate transcription promoted by a peroxisome proliferatorrsquos re-sponse element in 3T3-L1 cells Biochem Pharmacol50(11)1949ndash1951

Browning SR Browning BL 2007 Rapid and accurate haplotype phasingand missing-data inference for whole-genome association studies byuse of localized haplotype clustering Am J Hum Genet81(5)1084ndash1097

Cai D Sun Y Tang Z Hu S Li W Zhao X Xiang H Zhou H 2014 Theorigins of Chinese domestic cattle as revealed by ancient DNA anal-ysis J Archaeol Sci 41423ndash434

Cai X Chen H Lei C Wang S Xue K Zhang B 2007 mtDNA diversity andgenetic lineages of eighteen cattle breeds from Bos taurus and Bosindicus in China Genetica 131(2)175ndash183

Cai X Chen H Wang S Xue K Lei C 2006 Polymorphisms of two Ychromosome microsatellites in Chinese cattle Genet Sel Evol38(5)525ndash534

Canavez FC Luche DD Stothard P Leite KR Sousa-Canavez JMPlastow G Meidanis J Souza MA Feijao P Moore SS et al2012 Genome sequence and assembly of Bos indicus J Hered103(3)342ndash348

Chen S Wang Y Kong X Liu D Cheng H Edwards RL 2006 A possibleYounger Dryas-type event during Asian monsoonal Termination 3Sci China D Earth Sci 49(9)982ndash990

Choi JW Liao X Park S Jeon HJ Chung WH Stothard P Park YS Lee JKLee KT Kim SH et al 2013 Massively parallel sequencing of Chikso(Korean brindle cattle) to discover genome-wide SNPs and InDelsMol Cells 36(3)203ndash211

Clark DL Boler DD Kutzler LW Jones KA McKeith FK Killefer J Carr TRDilger AC 2011 Muscle gene expression associated with increasedmarbling in beef cattle Anim Biotechnol 22(2)51ndash63

Daetwyler HD Capitan A Pausch H Stothard P van Binsbergen RBrondum RF Liao X Djari A Rodriguez SC Grohs C et al 2014Whole-genome sequencing of 234 bulls facilitates mapping ofmonogenic and complex traits in cattle Nat Genet46(8)858ndash865

Danecek P Auton A Abecasis G Albers CA Banks E DePristo MAHandsaker RE Lunter G Marth GT Sherry ST et al 2011 Thevariant call format and VCFtools Bioinformatics 27(15)2156ndash2158

Decker JE, McKay SD, Rolf MM, Kim J, Molina AA, Sonstegard TS, Hanotte O, Gotherstrom A, Seabury CM, Praharani L, et al. 2014. Worldwide patterns of ancestry, divergence, and admixture in domesticated cattle. PLoS Genet 10(3):e1004254.

Eberlein A, Takasuga A, Setoguchi K, Pfuhl R, Flisikowski K, Fries R, Klopp N, Furbass R, Weikard R, Kuhn C. 2009. Dissection of genetic factors modulating fetal growth in cattle indicates a substantial role of the non-SMC condensin I complex, subunit G (NCAPG) gene. Genetics 183(3):951–964.

Fang X, Lu L, Yang S, Li J, An Z, Jiang PA, Chen X. 2002. Loess in Kunlun Mountains and its implications on desert development and Tibetan Plateau uplift in west China. Sci China D Earth Sci 45(4):289–299.

Forti E, Aksanov O, Birk RZ. 2007. Temporal expression pattern of Bardet-Biedl syndrome genes in adipogenesis. Int J Biochem Cell B 39(5):1055–1062.

Gan HY, Li JB, Wang HM, Gao YD, Liu WH, Li JP, Zhong JF. 2007. Relationship between the melanocortin receptor 1 (MC1R) gene and the coat color phenotype in cattle. Yi Chuan 29:195–200.

Gibbs RA, Taylor JF, Van Tassell CP, Barendse W, Eversole KA, Gill CA, Green RD, Hamernik DL, Kappes SM, Lien S, et al. 2009. Genome-wide survey of SNP variation uncovers the genetic structure of cattle breeds. Science 324(5926):528–532.

Gotoh T, Takahashi H, Nishimura T, Kuchida K, Mannen H. 2014. Meat produced by Japanese Black cattle and Wagyu. Anim Front 4(4):46–54.

Grindflek E, Holzbauer R, Plastow G, Rothschild MF. 2002. Mapping and investigation of the porcine major insulin sensitive glucose transport (SLC2A4/GLUT4) gene as a candidate gene for meat quality and carcass traits. J Anim Breed Genet 119(1):47–55.

Gutierrez-Gil B, Arranz JJ, Wiener P. 2015. An interpretive review of selective sweep studies in Bos taurus cattle populations: identification of unique and shared selection signals across breeds. Front Genet 6:167.

Hansen PJ. 2004. Physiological and cellular adaptations of zebu cattle to thermal stress. Anim Reprod Sci 82–83:349–360.

Hiendleder S, Lewalski H, Janke A. 2008. Complete mitochondrial genomes of Bos taurus and Bos indicus provide new insights into intra-species variation, taxonomy and domestication. Cytogenet Genome Res 120(1–2):150–156.

Hu ZL, Park CA, Reecy JM. 2016. Developmental progress and current status of the Animal QTLdb. Nucleic Acids Res 44(D1):D827–D833.

Hung CM, Shaner PJ, Zink RM, Liu WC, Chu TC, Huang WS, Li SH. 2014. Drastic population fluctuations explain the rapid extinction of the passenger pigeon. Proc Natl Acad Sci U S A 111(29):10636–10641.

Kovacik A, Bulla J, Trakovicka A, Zitny J, Rafayova A. 2012. The effect of the porcine melanocortin-5 receptor (MC5R) gene associated with feed intake, carcass and physico-chemical characteristics. J Microbiol Biotechnol Food Sci 1:498–506.

Labrecque R, Vigneault C, Blondin P, Sirard MA. 2013. Gene expression analysis of bovine oocytes with high developmental competence obtained from FSH-stimulated animals. Mol Reprod Dev 80(6):428–440.

Lai SJ, Liu YP, Liu YX, Li XW, Yao YG. 2006. Genetic diversity and origin of Chinese cattle revealed by mtDNA D-loop sequence variation. Mol Phylogenet Evol 38(1):146–154.

Lee KT, Chung WH, Lee SY, Choi JW, Kim J, Lim D, Lee S, Jang GW, Kim B, Choy YH, et al. 2013. Whole-genome resequencing of Hanwoo (Korean cattle) and insight into regions of homozygosity. BMC Genomics 14:519.

Lee SS, Yang BS, Yang YH, Kang SY, Ko SB, Jung JK, Oh WY, Oh SJ, Kim KI. 2002. Analysis of melanocortin receptor 1 (MC1R) genotype in Korean brindle cattle and Korean cattle with dark muzzle. J Anim Sci Technol 44(1):23–30.

Lei C, Chen H, Hu S. 2000. Studies on Y chromosome polymorphism and the origin and classification of Chinese yellow cattle. Acta Agric Boreali-Occidentalis Sin 9:43–47.

Lei CZ, Chen H, Zhang HC, Cai X, Liu RY, Luo LY, Wang CF, Zhang W, Ge QL, Zhang RF, et al. 2006. Origin and phylogeographical structure of Chinese cattle. Anim Genet 37(6):579–582.

Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25(14):1754–1760.

Li H, Durbin R. 2011. Inference of human population history from individual whole-genome sequences. Nature 475(7357):493–496.

Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25(16):2078–2079.

Lin X, Luo J, Zhang L, Zhu J. 2013. MicroRNAs synergistically regulate milk fat synthesis in mammary gland epithelial cells of dairy goats. Gene Expr 16(1):1–13.

Loftus RT, MacHugh DE, Bradley DG, Sharp PM, Cunningham P. 1994. Evidence for two independent domestications of cattle. Proc Natl Acad Sci U S A 91(7):2757–2761.

McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, et al. 2010. The genome analysis toolkit: a mapreduce framework for analyzing next-generation DNA sequencing data. Genome Res 20(9):1297–1303.

Medugorac I, Graf A, Grohs C, Rothammer S, Zagdsuren Y, Gladyr E, Zinovieva N, Barbieri J, Seichter D, Russ I, et al. 2017. Whole-genome analysis of introgressive hybridization and characterization of the bovine legacy of Mongolian yaks. Nat Genet 49(3):470–475.

Mei C, Wang H, Zhu W, Wang H, Cheng G, Qu K, Guang X, Li A, Zhao C, Yang W, et al. 2016. Whole-genome sequencing of the endangered bovine species Gayal (Bos frontalis) provides new insights into its genetic features. Sci Rep 6:19787.

Miller W, Schuster SC, Welch AJ, Ratan A, Bedoya-Reina OC, Zhao F, Kim HL, Burhans RC, Drautz DI, Wittekindt NE, et al. 2012. Polar and brown bear genomes reveal ancient admixture and demographic footprints of past climate change. Proc Natl Acad Sci U S A 109(36):E2382–E2390.

Nadachowska-Brzyska K, Burri R, Smeds L, Ellegren H. 2016. PSMC analysis of effective population sizes in molecular ecology and its application to black-and-white Ficedula flycatchers. Mol Ecol 25(5):1058–1072.

Orlando L, Ginolhac A, Zhang G, Froese D, Albrechtsen A, Stiller M, Schubert M, Cappellini E, Petersen B, Moltke I, et al. 2013. Recalibrating Equus evolution using the genome sequence of an early Middle Pleistocene horse. Nature 499(7456):74–78.

Ouali A, Herrera-Mendez CH, Coulis G, Becila S, Boudjellal A, Aubry L, Sentandreu MA. 2006. Revisiting the conversion of muscle into meat and the underlying mechanisms. Meat Sci 74(1):44–58.

Patterson N, Price AL, Reich D. 2006. Population structure and eigenanalysis. PLoS Genet 2(12):e190.

Porto-Neto LR, Sonstegard TS, Liu GE, Bickhart DM, Da SM, Machado MA, Utsunomiya YT, Garcia JF, Gondro C, Van Tassell CP. 2013. Genomic divergence of zebu and taurine cattle identified through high-density SNP genotyping. BMC Genomics 14(1):876.

Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, Maller J, Sklar P, de Bakker PI, Daly MJ, et al. 2007. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 81(3):559–575.

Qiu H, Ju ZY, Chang ZJ. 1993. A survey of cattle production in China. More attention to animal genetic resources. Food and Agriculture Organization of the United Nations.

Qiu Q, Wang L, Wang K, Yang Y, Ma T, Wang Z, Zhang X, Ni Z, Hou F, Long R, et al. 2015. Yak whole-genome resequencing reveals domestication signatures and prehistoric population expansions. Nat Commun 6:10283.

Qiu Q, Zhang G, Ma T, Qian W, Wang J, Ye Z, Cao C, Hu Q, Kim J, Larkin DM, et al. 2012. The yak genome and adaptation to life at high altitude. Nat Genet 44(8):946–949.

Randhawa IAS, Khatkar MS, Thomson PC, Raadsma HW, Barendse W. 2016. A meta-assembly of selection signatures in cattle. PLoS One 11(4):e153013.

Rincon G, Islas-Trejo A, Casellas J, Ronin Y, Soller M, Lipkin E, Medrano JF. 2009. Fine mapping and association analysis of a quantitative trait locus for milk production traits on Bos taurus autosome 4. J Dairy Sci 92(2):758–764.

Ruetz TJ, Lin AE, Guttman JA. 2012. Enterohaemorrhagic Escherichia coli requires the spectrin cytoskeleton for efficient attachment and pedestal formation on host cells. Microb Pathog 52(3):149–156.

Sartori R, Bastos MR, Baruselli PS, Gimenes LU, Ereno RL, Barros CM. 2010. Physiological differences and implications to reproductive management of Bos taurus and Bos indicus cattle in a tropical environment. Reprod Domest Rumin VII 7(1):357–375.

Sattler K, Levkau B. 2009. Sphingosine-1-phosphate as a mediator of high-density lipoprotein effects in cardiovascular protection. Cardiovasc Res 82(2):201–211.

Scherf BD, Pilling D. 2015. The second report on the state of the world's animal genetic resources for food and agriculture. Food and Agriculture Organization of the United Nations.

Schiffels S, Durbin R. 2014. Inferring human population size and separation history from multiple genome sequences. Nat Genet 46(8):919–925.

Setoguchi K, Watanabe T, Weikard R, Albrecht E, Kuhn C, Kinoshita A, Sugimoto Y, Takasuga A. 2011. The SNP c.1326T>G in the non-SMC condensin I complex, subunit G (NCAPG) gene encoding a p.Ile442Met variant is associated with an increase in body frame size at puberty in cattle. Anim Genet 42(6):650–655.

Sherratt A. 1983. The secondary exploitation of animals in the Old World. World Archaeol 15(1):90–104.

Smith TP, Grosse WM, Freking BA, Roberts AJ, Stone RT, Casas E, Wray JE, White J, Cho J, Fahrenkrug SC, et al. 2001. Sequence evaluation of four pooled-tissue normalized bovine cDNA libraries and construction of a gene index for cattle. Genome Res 11(4):626–630.

Sorbolini S, Marras G, Gaspa G, Dimauro C, Cellesi M, Valentini A, Macciotta NP. 2015. Detection of selection signatures in Piemontese and Marchigiana cattle, two breeds with similar production aptitudes but different selection histories. Genet Sel Evol 47:52.

Stothard P, Choi JW, Basu U, Sumner-Thomson JM, Meng Y, Liao X, Moore SS. 2011. Whole genome resequencing of black Angus and Holstein cattle for SNP and CNV discovery. BMC Genomics 12:559.

Sun YB, An ZS. 2005. Late Pliocene-Pleistocene changes in mass accumulation rates of eolian deposits on the central Chinese Loess Plateau. J Geophys Res-Atmos 110(D23):1193–1194.

Svizzero S, Tisdell C. 2016. Input shortages and the lack of sustainability of bronze production by the Unetice. In: Working Papers on Economics, Ecology and the Environment. No. 202. Queensland (Australia): University of Queensland.

Switonski M, Mankowska M, Salamon S. 2013. Family of melanocortin receptor (MCR) genes in mammals-mutations, polymorphisms and phenotypic effects. J Appl Genet 54(4):461–472.

Van Vuure T. 2002. History, morphology and ecology of the aurochs (Bos primigenius). Available from: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.534.6285

Wang K, Li M, Hakonarson H. 2010. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res 38(16):e164.

Wang M, Ding Y. 1996. The importance of work animals in rural China. World Anim Rev 86:65–67.

Wedholm A, Larsen LB, Lindmark-Mansson H, Karlsson AH, Andren A. 2006. Effect of protein composition on the cheese-making properties of milk from individual dairy cows. J Dairy Sci 89(9):3296–3305.

Weir BS, Cockerham CC. 1984. Estimating F-statistics for the analysis of population structure. Evolution 38(6):1358–1370.

Xu L, Bickhart DM, Cole JB, Schroeder SG, Song J, Tassell CP, Sonstegard TS, Liu GE. 2015. Genomic signatures reveal new evidences for selection of important traits in domestic cattle. Mol Biol Evol 32(3):711–725.

Xu Y, Yu W, Xiong Y, Xie H, Ren Z, Xu D, Lei M, Zuo B, Feng X. 2011. Molecular characterization and expression patterns of serine/arginine-rich specific kinase 3 (SPRK3) in porcine skeletal muscle. Mol Biol Rep 38(5):2903–2909.

Yahvah KM, Brooker SL, Williams JE, Settles M, Mcguire MA, Mcguire MK. 2015. Elevated dairy fat intake in lactating women alters milk lipid and fatty acids without detectible changes in expression of genes related to lipid uptake or synthesis. Nutr Res 35(3):221.

Yoon D, Ko E. 2016. Association study between SNPs of the genes within bovine QTLs and meat quality of Hanwoo. J Anim Sci 94(Suppl 4):145.

Yu Y, Nie L, He ZQ, Wen JK, Jian CS, Zhang YP. 1999. Mitochondrial DNA variation in cattle of south China: origin and introgression. Anim Genet 30(4):245–250.

Zhang H, Paijmans JL, Chang F, Wu X, Chen G, Lei C, Yang X, Wei Z, Bradley DG, Orlando L, et al. 2013. Morphological and genetic evidence for early Holocene cattle management in northeastern China. Nat Commun 4:2755.

Zhang H, Wang S, Wang Z, Da Y, Wang N, Hu X, Zhang Y, Wang Y, Leng L, Tang Z, et al. 2012. A genome-wide scan of selective sweeps in two broiler chicken lines divergently selected for abdominal fat content. BMC Genomics 13:704.

Zhang Z, Wang Z, Yang Y, Zhao J, Chen Q, Liao R, Chen Z, Zhang X, Xue M, Yang H, et al. 2016. Identification of pleiotropic genes and gene sets underlying growth and immunity traits: a case study on Meishan pigs. Animal 10(4):550–557.

Zhao C, Tian F, Yu Y, Luo J, Hu Q, Bequette BJ, Baldwin VR, Liu G, Zan L, Scott UM, et al. 2012. Muscle transcriptomic analyses in Angus cattle with divergent tenderness. Mol Biol Rep 39(4):4185–4193.

Zhao S, Zheng P, Dong S, Zhan X, Wu Q, Guo X, Hu Y, He W, Zhang S, Fan W, et al. 2013. Whole-genome sequencing of giant pandas provides insights into demographic history and local adaptation. Nat Genet 45(1):67–71.

Zheng B, Xu Q, Shen Y. 2002. The relationship between climate change and Quaternary glacial cycles on the Qinghai–Tibetan Plateau: review and speculation. Quatern Int 97–98:93–101.

Zhou X, Wang B, Pan Q, Zhang J, Kumar S, Sun X, Liu Z, Pan H, Lin Y, Liu G, et al. 2014. Whole-genome sequencing of the snub-nosed monkey provides insights into folivory and evolutionary history. Nat Genet 46(12):1303–1310.

Zimin AV, Delcher AL, Florea L, Kelley DR, Schatz MC, Puiu D, Hanrahan F, Pertea G, Van Tassell CP, Sonstegard TS, et al. 2009. A whole-genome assembly of the domestic cow, Bos taurus. Genome Biol 10(4):R42.

influence of B. taurus × B. indicus admixture events. The historical pattern of four Chinese cattle breeds (LQC, NYC, LXC, and YNC) with more descent contributed by B. indicus (>69%; supplementary table S11, Supplementary Material online) roughly correlates with the indicine lineage but is distinct from QCC and YBC.

It is noteworthy that a historical pattern of two bottlenecks and two expansions has been observed in many mammals, such as yak (Qiu et al. 2015), giant panda (Zhao et al. 2013), wild boar (Choi et al. 2013), snub-nosed monkey (Zhou et al. 2014), gayal (Mei et al. 2016), and bear (Miller et al. 2012). These concordant patterns suggest that terrestrial mammals might share similar demographic histories and that the evolution of terrestrial mammals at the Early-Middle Pleistocene boundary was strongly affected by global glaciations and severely cold climates.

The Multiple Sequentially Markovian Coalescent (MSMC) analysis was used to study the genetic separation between two populations as a function of time by modeling the relationships of multiple haplotypes. For each population split scenario, the relative cross coalescence rate estimates were obtained by dividing the cross-population coalescence rate by the average within-population coalescence rate (Schiffels and Durbin 2014). Based on the analysis of four haplotypes for each pair of populations, the MSMC results show that the beginning of the split between the NEL ancestors and the GIR and BRM ancestors occurred 11,000 years ago. This split occurred shortly after the Younger Dryas epoch (an abrupt, rapid cooling period that occurred 12,800–11,500 years ago; Chen et al. 2006) (fig. 2b). The split between the NEL and BRM ancestors occurred ~6,600 years ago. After separation, the $N_e$ of indicine cattle expanded, reaching a peak ~1,500 years ago, whereas the $N_e$ of taurine cattle remained stable due to inbreeding and artificial domestication (supplementary fig. S14, Supplementary Material online). These data suggest that NEL, GIR, and BRM shared the same ancestor and that NEL separated from indicine ancestors earlier than GIR and BRM did. As we observed, the separation was slow and might have been the result of continuous gene exchange among these breeds.
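The relative cross coalescence rate described above can be illustrated with a minimal sketch. It assumes that per-time-segment coalescence rates are already available; the function name and the numeric rates are hypothetical and are not taken from the study.

```python
def relative_cross_coalescence_rate(lambda_cross, lambda_within_a, lambda_within_b):
    """Cross-population rate divided by the mean within-population rate."""
    mean_within = (lambda_within_a + lambda_within_b) / 2.0
    if mean_within <= 0:
        raise ValueError("within-population rates must be positive")
    return lambda_cross / mean_within


# Hypothetical (cross, within A, within B) rates for three time segments,
# ordered from oldest to most recent.
segments = [
    (2.0, 2.0, 2.0),  # populations still behave as one: ratio 1.0
    (1.0, 2.0, 2.0),  # partial separation: ratio 0.5
    (0.0, 3.0, 1.0),  # complete separation: ratio 0.0
]
rccr = [relative_cross_coalescence_rate(*s) for s in segments]
print(rccr)
```

A ratio near 1 indicates that the two populations still coalesce as a single population in that time segment, whereas a ratio near 0 indicates that they have fully separated.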

In contrast to the split among indicine cattle groups, a sharp separation among the ancestors of taurine breeds occurred 5,000–2,000 years ago, with a split time at 3,500 years ago (fig. 2c), coinciding with the Unetice culture (4,200–3,500 years ago). Taurine breeds underwent strong domestication in a very short time. We infer that the considerable economic prosperity based on the diversified agriculture of the Unetice culture contributed to the early formation of European cattle breeds (Svizzero and Tisdell 2016).

FIG. 1. Population genetic analysis. (a) Neighbor-joining tree of indicine, taurine, and hybrid cattle based on our data and publicly available whole-genome sequencing data. Orange branches indicate indicine cattle, green branches represent taurine cattle, and blue branches represent hybrid cattle. The scale bar represents the identity-by-state (IBS) score between individuals. (b) Principal component analysis of cattle. (c) Genetic structure of cattle breeds using the ADMIXTURE program.

Signatures of Positive Selection in Cattle Genome

We identified regions that exhibit high levels of differentiation among cattle breeds using the $d_i$ statistic in a reduced data set containing breeds with at least 10 samples (fig. 3). We treated GIR, NEL, and BRM as one group (IND) based on their close genetic relationships, as evidenced by both the PCA and phylogenetic results (fig. 1a and b). Those windows with the highest average $d_i$ values within each breed, which fell into the upper 99th percentile of the empirical distribution, were considered putative signatures of selection (supplementary fig. S15, Supplementary Material online).
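The empirical cutoff described above can be sketched as follows. This is only an illustration of selecting windows above the 99th percentile of a score distribution; the nearest-rank percentile definition, the function name, and the toy scores are assumptions, and the per-window statistic itself is taken as given.

```python
def top_percentile_windows(scores, percentile=99):
    """Return the nearest-rank percentile threshold and the indices above it."""
    if not scores:
        raise ValueError("scores must not be empty")
    ordered = sorted(scores)
    # Ceiling of percentile/100 * n, computed with integers to avoid float error.
    rank = max((percentile * len(ordered) + 99) // 100, 1)
    threshold = ordered[rank - 1]
    outliers = [i for i, s in enumerate(scores) if s > threshold]
    return threshold, outliers


# Toy input: 200 windows with hypothetical scores 0.0, 1.0, ..., 199.0.
window_scores = [float(i) for i in range(200)]
threshold, outliers = top_percentile_windows(window_scores)
print(threshold, outliers)
```

With 200 windows, the two highest-scoring windows (1%) exceed the threshold and would be carried forward as candidate regions.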

In total, we identified 2,842 potential selective sweep regions in one or more of the seven breeds (full genomic regions are detailed in supplementary tables S12–S18, Supplementary Material online), which had an average size of 67 kb (ranging from 11 kb to 1,150 kb). These regions harbored 1,429 protein-coding genes, 682 (47.72%) of which were previously identified as under positive selection in cattle (Randhawa et al. 2016). More specifically, we detected 357, 381, 232, 234, 307, 300, and 307 potentially positively selected genes from breed-specific selection events in the IND, QCC, JBC, RAN, FLV, HOL, and JER genomes, respectively (fig. 3a and supplementary tables S19–S25, Supplementary Material online).

To obtain a broad overview of the molecular functions of these genes and to test the hypothesis that particular functional classes are enriched in the most differentiated regions of the cattle genome, we performed a gene ontology (GO) analysis using ClueGO (Bindea et al. 2009) for each group separately. A potential concern regarding this analysis is the low power to detect enrichment due to the low expected counts for many categories. Nonetheless, several categories showed enrichment for signals of positive selection in one or more groups (supplementary table S26, Supplementary Material online), including the related categories of cellular response to UV, as well as immune response and pathogen defence. These findings suggest that immune-related genes are pervasive targets of positive selection because of their critical role in immune and defence functions.

Many genes associated with shaping particular characteristics of the populations are present within these regions (table 1). These include genes for morphological traits (coat color, horn, polledness) and production traits (dairy, muscle formation, skeletal development, energy partitioning, fertility, draft traits).

Several genes involved in coat color phenotypes were identified as targets of positive selection in one or more groups (table 1), including ERCC2 in QCC and IND, MC1R in IND, ZBTB17 in QCC, and MAP2K1 in JBC. One of these genes, MC1R, is well known for its role in regulating the switch between eumelanin and pheomelanin biosynthesis pathways in mammals, including cattle. The selection signal at MC1R in IND likely reflects the importance of light coloration (light gray to white in BRM and NEL; yellowish-red to white in GIR) in the adaptation of IND to its tropical environment.

FIG. 2. Demographic history of cattle. (a) Ancestral population size is inferred using PSMC. A generation time of 5 years and a mutation rate of $0.98 \times 10^{-8}$ mutations per nucleotide per generation are used. The relative cross coalescence rates over time between indicine (b) and taurine (c) breeds are estimated using MSMC with four haplotypes for each pair.

Some of the strongest signals of selection appeared in various types of genes related to production traits (table 1). For example, several genes involved in milk production showed clear evidence of positive selection in dairy cattle (NCAPG in both FLV and JER; MAPK7 in FLV; FST, ITFG1, SETMAR, and PAG1 in HOL; CSN3 and RPL37A in JER). Various genes involved in meat traits have also been targets of recent positive selection (table 1). Some genes related to skeletal muscle development and muscle fibre type appeared to be targets of positive selection, including CASP9, DIO1, SREBF2, and PLOD3 in QCC; ASGR1, IGFBP2, IGFBP5, and MYH9 in JBC; OSTN, CPT2, CSF2RB, and SLC2A4 in FLV; and SLC2A5, TMEM97, MYH4, SRPK3, and POLDIP2 in RAN. We also found that a set of important genes associated with lipid metabolism were putatively positively selected (AOX1 in QCC, RAN, FLV, JER, and IND; MC5R in IND; BBS2, S1PR3, and LRP2BP in JBC). Interestingly, we identified a missense mutation in BBS2 (exon 15; rs135889003; c.A1880G; p.Q627R) that was almost fixed (allele frequency > 0.95) in JBC, a breed known for producing the intensely marbled Wagyu beef (with >30% intramuscular fat) (Gotoh et al. 2014). BBS2 is a member of the Bardet–Biedl syndrome gene family, the primary clinical feature of which is obesity, and has been found to play a significant role in adipogenesis (Forti et al. 2007). The positive selection signals near the BBS2 region are further confirmed by significantly lower values of Tajima's D and the long haplotype patterns in JBC (fig. 3b); this region may be useful as a genetic target in breeding for improved beef marbling. In addition, R3HDM1, a gene associated with efficient food conversion and intramuscular fat content, showed signals of positive selection in five groups (QCC, FLV, HOL, JER, and IND). These genes might be associated with the tenderness and quality of meat in cattle.

FIG. 3. Candidate positively selected genes. (a) Shared candidate positively selected genes among groups. Only partial numbers are shown. (b) An example of selective sweep at the BBS2 gene in JBC; only positive values of $d_i$ are shown (top). The Tajima's D values in each group are shown (middle). SNPs with minor allele frequencies > 0.05 are used to construct haplotype patterns (bottom). The major alleles in JBC are green and the minor alleles in JBC are yellow.

Table 1. Summary of Partial Traits Associated with Positively Selected Genes.

Gene | Breed | Trait | Reference
MC1R | IND | CC | (Lee et al. 2002; Gan et al. 2007)
MAP2K1^a | JBC | CC | (Gutierrez-Gil et al. 2015)
ZBTB17 | QCC | CC | (Gutierrez-Gil et al. 2015)
ERCC2^a | QCC, IND | CC | (Gutierrez-Gil et al. 2015)
FST, ITFG1, SETMAR, PAG1 | HOL | DY | (Bech-Sabat et al. 2008; Rincon et al. 2009; Bloise et al. 2010; Xu et al. 2015)
RPL37A^a, CSN3 | JER | DY | (Wedholm et al. 2006; Yahvah et al. 2015)
MAPK7^a | FLV | DY | (Lin et al. 2013)
NCAPG | FLV, JER | DY | (Eberlein et al. 2009; Setoguchi et al. 2011)
BBS2^a, S1PR3^a, LRP2BP^a, IGFBP2, IGFBP5, MYH9, ASGR1 | JBC | MT, GT | (Forti et al. 2007; Sattler and Levkau 2009; Zhang et al. 2012; Lee et al. 2013; Sorbolini et al. 2015; Yoon and Ko 2016)
SRPK3^a, POLDIP2^a, SLC2A5, TMEM97, MYH4 | RAN | MT, GT | (Smith et al. 2001; Clark et al. 2011; Xu et al. 2011; Zhao et al. 2012; Lee et al. 2013; Zhang et al. 2016)
CASP9^a, DIO1, SREBF2, PLOD3 | QCC | MT, GT | (Ouali et al. 2006; Lee et al. 2013)
SLC2A4^a, OSTN, CPT2, CSF2RB | FLV | MT, GT | (Grindflek et al. 2002; Lee et al. 2013; Xu et al. 2015)
MC5R^a | IND | MT, GT | (Kovacik et al. 2012; Switonski et al. 2013)
AOX1^a | QCC, RAN, FLV, JER, and IND | MT, GT | (Brandes et al. 1995)
R3HDM1 | QCC, FLV, HOL, JER, and IND | MT, FC | (Gibbs et al. 2009)

CC, coat color; MT, meat traits; GT, growth traits; DY, dairy traits; FC, food conversion efficiency.
^a Newly identified genes associated with phenotypic features of cattle.

Conclusion

Whole-genome sequencing of representative Chinese cattle breeds and two additional breeds (JBC and RAN) generated a comprehensive catalogue of genetic variations. This is the first population genomic study on Chinese cattle to use next-generation whole-genome sequencing data and is an important source of genetic information for cattle worldwide. Bovine haplotypes have been inferred in Mongolian yaks, with recent admixture at least 1,500 years ago (Medugorac et al. 2017). It is highly possible that there was recent introgression from yak (B. grunniens) to Chinese cattle, as suggested by previous studies (Lei et al. 2000; Cai et al. 2007, 2014). However, the genetic influence of yak was too limited to be detected in the representative cattle breeds examined in our study. We also discovered many potential selective sweeps associated with domestication and related to breed-specific characteristics, with selective sweep regions including genes associated with coat color, dairy traits, and meat production/quality traits. Collectively, these findings substantially expand the catalogue of genetic variants in cattle and reveal new insights into the evolutionary history and domestication traits of Chinese cattle.

Materials and Methods

Sample Collection and Sequencing

To represent the overall genetic diversity of Chinese cattle, we selected 46 samples from 6 representative Chinese cattle breeds with divergent phenotypic characters across the main geographic distribution: Qinchuan cattle (QCC, n = 37), Nanyang cattle (NYC, n = 2), Luxi cattle (LXC, n = 1), Yanbian cattle (YBC, n = 2), Yunnan cattle (YNC, n = 2), and Leiqiong cattle (LQC, n = 2). For comparison, samples from two specialized beef cattle breeds, Red Angus (RAN, n = 18) and JBC (n = 11), were also collected (supplementary table S2, Supplementary Material online). Total genomic DNA was extracted from the blood samples of the animals using a standard phenol–chloroform protocol. For each individual, at least 5 μg of genomic DNA was used to construct paired-end libraries with an insert size of 500 bp according to the Illumina library preparation protocol. Moreover, we collected 76 genome sequences from previous studies for the breeds Brahman (BRM, indicine, n = 6), Nelore (NEL, indicine, n = 5), Gir (GIR, indicine, n = 4), Limousin (LIM, taurine, n = 6), Jersey (JER, taurine, n = 18), Fleckvieh (FLV, taurine, n = 19), and Holstein (HOL, taurine, n = 18) (details in supplementary tables S1 and S2, Supplementary Material online).

Alignments and Variant Identification

Paired-end reads (100 bp) obtained from sequencing in the present study and previous studies were mapped to the B. taurus genome (UMD3.1) (Zimin et al. 2009) using BWA (Li and Durbin 2009) with the default parameters. Sequence Alignment/Map (SAM) format files were imported into SAMtools (Li et al. 2009) for sorting and merging and into Picard (http://broadinstitute.github.io/picard/; version 1.92) to remove duplicated reads. To identify the ancestral state of cattle, we mapped the raw reads of yak (Qiu et al. 2012), sequenced to 65× coverage, to the reference genome.

Initial variant site identification was performed using SAMtools mpileup and GATK UnifiedGenotyper (Genome Analysis Toolkit, version 2.4-9) (McKenna et al. 2010) with the default settings. The overlap subset of 53,979,675 single-nucleotide polymorphisms (SNPs) and 5,924,578 small insertions and deletions (InDels; 91% of InDels were 1–30 bp in length, and the largest InDel was 403 bp in length) was defined as a high-confidence catalogue used for base quality recalibration using GATK with the default set of covariates. The resulting recalibrated BAM files were then used as input for a second round of variant calling with GATK. The resulting variant calls were analyzed, and approximately the highest-scoring 10% of the predicted variant sites were used as a training set for variant quality recalibration and filtering using GATK. These steps resulted in 60,031,459 SNPs and 5,603,383 InDels. To obtain high-quality results for further analyses, we only retained biallelic SNPs and InDels with >90% calling rates, resulting in 57,220,105 SNPs and 5,270,518 InDels. Beagle (Browning and Browning 2007), which has been shown to yield highly accurate solutions, was used to improve the genotype calls using genotype likelihoods from GATK and to infer the haplotypes in the sample. Short InDels were not included in the diversity or divergence estimates or in the other analyses. Variants (SNPs and InDels) were annotated using ANNOVAR (Wang et al. 2010).
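The final retention rule (biallelic sites with a calling rate above 90%) can be expressed as a small predicate. The sketch below is illustrative only: the in-memory representation of a site (a list of alternate alleles plus one genotype per sample, with None for a missing call) is an assumption and does not reflect the tools actually used in the study.

```python
def passes_filters(alt_alleles, genotypes, min_call_rate=0.90):
    """Keep a site only if it is biallelic and its call rate exceeds the cutoff.

    alt_alleles: alternate alleles observed at the site (hypothetical layout).
    genotypes:   one entry per sample; None marks a missing genotype call.
    """
    if len(alt_alleles) != 1 or not genotypes:
        return False
    called = sum(g is not None for g in genotypes)
    return called / len(genotypes) > min_call_rate


# Hypothetical sites typed in 20 samples.
site_ok = passes_filters(["T"], [(0, 1)] * 19 + [None])       # 95% called
site_low = passes_filters(["T"], [(0, 1)] * 17 + [None] * 3)  # 85% called
site_multi = passes_filters(["T", "G"], [(0, 1)] * 20)        # multiallelic
print(site_ok, site_low, site_multi)
```

Only the first site is retained; the second fails on call rate and the third because it carries more than one alternate allele.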

Phylogenetic and Population Structure Analyses

A phylogenetic tree was constructed from the SNP data using the neighbor-joining method in the program PHYLIP v3.695 (http://evolution.genetics.washington.edu/phylip.html), and distance matrices were calculated using PLINK (Purcell et al. 2007). The ancestral states of the SNPs were determined using a close relative of cattle, B. grunniens, as the outgroup. Population structure was further inferred using ADMIXTURE (Alexander et al. 2009) with the number of assumed ancestral populations (K) set from 2 to 7. Principal component analysis was carried out using the smartpca program of the EIGENSOFT package (Patterson et al. 2006).

Genome-Wide Patterns of Genetic Diversity and Divergence

The average pairwise nucleotide diversity ($\theta_\pi$) and Tajima's D statistic of each breed were calculated using a sliding window approach (50-kb sliding windows in 10-kb steps) with the default parameters of VCFtools (Danecek et al. 2011). Population differentiation was measured by pairwise $F_{ST}$ using the unbiased estimator of Weir and Cockerham (1984) with the default parameters.
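As a rough illustration of the sliding-window scheme (50-kb windows moved in 10-kb steps), the sketch below generates window coordinates and averages a per-site pairwise diversity over each window. It is not the VCFtools implementation: the half-open coordinates, the truncated windows at the chromosome end, the per-site estimator for biallelic SNPs, and all names are assumptions made for the example.

```python
def sliding_windows(chrom_length, size=50_000, step=10_000):
    """Yield half-open (start, end) windows; trailing windows are truncated."""
    start = 0
    while start < chrom_length:
        yield start, min(start + size, chrom_length)
        start += step


def site_pi(alt_count, n_chrom):
    """Fraction of chromosome pairs that differ at one biallelic SNP."""
    ref_count = n_chrom - alt_count
    return 2.0 * alt_count * ref_count / (n_chrom * (n_chrom - 1))


def window_pi(snps, start, end, n_chrom):
    """Pairwise diversity per base in [start, end); snps = [(pos, alt_count)]."""
    total = sum(site_pi(c, n_chrom) for pos, c in snps if start <= pos < end)
    return total / (end - start)


# Hypothetical 100-kb chromosome with two SNPs typed on four chromosomes.
windows = list(sliding_windows(100_000))
pi_first = window_pi([(100, 1), (60_000, 2)], 0, 50_000, n_chrom=4)
print(len(windows), windows[0], pi_first)
```

Because the step is smaller than the window size, neighbouring windows overlap by 40 kb, which smooths the diversity profile along the chromosome.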

Linkage Disequilibrium

To estimate the genome-wide LD of each breed, we calculated the mean $r^2$ values for pairwise markers with Haploview (Barrett et al. 2005). Only SNPs with a minor allele frequency > 0.05 in three groups (Chinese cattle, indicine, and taurine) were used. The parameters of Haploview were set to "-maxdistance 200 -dprime -memory 5000 -minGeno 0.6 -minMAF 0.05 -hwcutoff 0.001". To minimize the influence of sample size, only breeds with at least five individuals were used, and breeds with more than five samples were down-sampled to five.
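For readers unfamiliar with the statistic, the following sketch computes $r^2$ between two biallelic SNPs from phased haplotypes coded 0/1. It is an assumption-laden illustration rather than a reproduction of Haploview, which estimates haplotype frequencies from genotype data; the function name and example haplotypes are hypothetical.

```python
def r_squared(hap_a, hap_b):
    """r^2 between two biallelic SNPs from phased 0/1 alleles per haplotype."""
    if len(hap_a) != len(hap_b) or not hap_a:
        raise ValueError("haplotype vectors must be non-empty and equal in length")
    n = len(hap_a)
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    return d * d / denom


# Four hypothetical haplotypes at three SNPs.
snp1 = [1, 1, 0, 0]
snp2 = [1, 1, 0, 0]  # always co-inherited with snp1
snp3 = [1, 0, 1, 0]  # independent of snp1
ld_complete = r_squared(snp1, snp2)
ld_none = r_squared(snp1, snp3)
print(ld_complete, ld_none)
```

Averaging such pairwise values within distance bins gives the LD decay curve that is typically compared across breeds.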

Haplotype Diversity

For the haplotype diversity analysis, the same breeds and SNP set were used as in the linkage disequilibrium analysis. To calculate haplotype diversity, the genome was divided into 5- to 500-kb bins (detailed in supplementary table S6, Supplementary Material online). Windows with fewer than two SNPs per 5 kb were removed, and for those with more than four SNPs, four SNPs were randomly selected. Considering the substantial variation in the recombination rate across the cattle genome, we adopted a sliding-window strategy and allowed the window to slide by half its length each time. The frequencies of haplotypes were counted, and haplotype diversity (H) was calculated as described previously (Daetwyler et al. 2014).
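The text does not spell out the formula for H. A commonly used estimator, assumed here purely for illustration, is H = n/(n − 1) × (1 − Σ p_i²), where p_i is the frequency of the i-th distinct haplotype among n sampled haplotypes; the procedure in the cited reference may differ in detail.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Assumed estimator: n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two haplotypes are required")
    freqs = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1.0 - sum(f * f for f in freqs))


# Hypothetical four-SNP haplotypes within one window.
h_two_types = haplotype_diversity(["0101", "0101", "1100", "1100"])
h_all_same = haplotype_diversity(["0101"] * 4)
h_all_diff = haplotype_diversity(["0101", "1100", "0011", "1111"])
print(h_two_types, h_all_same, h_all_diff)
```

H ranges from 0, when every sampled haplotype is identical, to 1, when every sampled haplotype is distinct.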

PSMC Analysis

We inferred the demographic history of B. taurus and B. indicus using the Pairwise Sequentially Markovian Coalescent (PSMC) model (Li and Durbin 2011). In the default PSMC approach, a whole-genome diploid consensus sequence is generated using the alignment file from one sample. Because most of our genomes were not sequenced to a high average depth of coverage (mostly ~10×) and PSMC has high false-negative rates at low depths of coverage (i.e., <20×), leading to a systematic underestimation of true event times (Orlando et al. 2013; Nadachowska-Brzyska et al. 2016), we applied a modified PSMC approach: the SNPs of one sample were extracted from variants called on cohorts of all samples and converted to consensus sequences. This procedure was followed for samples (marked in supplementary table S1, Supplementary Material online) with relatively high sequencing depth in each breed to ensure the quality of consensus sequences. We then transformed the consensus sequence into a fasta-like format using "fq2psmcfa". The PSMC parameters were set as follows: "-p 4+25*2+4+6". The mutation rate per generation per site was estimated as $\mu = D \times g / (2T)$, where D is the observed frequency of pairwise differences between two species, T is the estimated divergence time, and g is the estimated generation time for the two species. The cattle generation time (g) was set to an estimate of 5 years, and the estimated divergence time was set to 4.9 Ma based on a previous study on cattle and yak (Qiu et al. 2012). These values yielded an estimated mutation rate of $9.796 \times 10^{-9}$ mutations per generation per site. We obtained the mass accumulation rate (MAR) of Chinese loess over the past 3.6 My (Sun and An 2005), an index indicating cold and dry or warm and wet climatic periods in China (fig. 2a and supplementary figs. S13 and S14, Supplementary Material online).
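The mutation rate calculation can be checked with a few lines of arithmetic, reading the formula as μ = D × g / (2T). The value of D is not reported in the text; the figure used below (about 0.0192) is a hypothetical value back-calculated from the stated rate, generation time, and divergence time, and should not be read as the measured divergence.

```python
def mutation_rate_per_generation(d, generation_years, divergence_years):
    """mu = D * g / (2 * T): per-site, per-generation mutation rate."""
    return d * generation_years / (2.0 * divergence_years)


# D_ASSUMED is hypothetical (back-calculated); g and T are the values in the text.
D_ASSUMED = 0.0192
GENERATION_YEARS = 5
DIVERGENCE_YEARS = 4.9e6

mu = mutation_rate_per_generation(D_ASSUMED, GENERATION_YEARS, DIVERGENCE_YEARS)
print(f"{mu:.3e}")
```

The factor of 2 reflects that mutations accumulate along both lineages after the split, so the pairwise divergence corresponds to 2T years of evolution.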

To evaluate the differences between our revised PSMC approach and the default method, we reconstructed trajectories from two samples of the same breed (FLV) with different depths of coverage (SRR1262805 with 24× and SRR1262808 with 9×), which should yield similar inferences. The PSMC profiles retrieved from the default and revised approaches for the high-depth sample were almost identical (supplementary fig. S11, Supplementary Material online) with regard to both the timing and the magnitude of demographic events, except for the most recent expansion phase, in which a lower intensity was found using the revised approach. We found that PSMC inference based on the low-depth sample showed a biased demographic model, which could be satisfactorily corrected with our revised PSMC approach. Additionally, we note that the bias observed for genomes with low depth (<20×) could also be corrected, assuming a uniform false-negative rate (uFNR), by using the option "-M" of the plotting script "psmc_plot.pl" to specify the uFNR correction rate (Orlando et al. 2013; Hung et al. 2014). With the uFNR correction, the plot for the low-depth sample was similar to the high-depth PSMC inference (supplementary fig. S11, Supplementary Material online). No striking differences were observed among the PSMC profiles reconstructed from different taurine breeds with different sequencing depths of coverage (ranging from 9× to 24×; supplementary table S1, Supplementary Material online, and fig. 2a). Consequently, we consider our revised approach a suitable method, introducing only acceptable new biases, for PSMC inference from samples with low average sequencing depth.

To explore the potential impact of the reference genome on the PSMC results of indicine breeds, we mapped sequence reads of indicine samples against the assembly of B. indicus (Nelore breed; GenBank assembly accession GCF_000247795.1) and repeated the PSMC analysis (default setting with uFNR correction). Although the PSMC profiles reconstructed from different references were not identical, the qualitative results hold for indicine breeds with the B. indicus reference genome (fig. 2a and supplementary fig. S12, Supplementary Material online).

MSMC Analysis
The Multiple Sequentially Markovian Coalescent (MSMC) model (Schiffels and Durbin 2014) was used to infer changes in effective population size (Ne) and divergence time between breeds (samples marked in supplementary table S1, Supplementary Material online). MSMC is an extension of the PSMC model, which uses a hidden Markov model to scan genomes and analyze patterns of heterozygosity, with long DNA segments of low heterozygosity reflecting recent coalescent events. The rate of coalescent events is then used to estimate Ne at a given time. To scale the output of MSMC to real time and population sizes, we used the generation time and mutation rate mentioned earlier (description of PSMC analysis). We obtained atmospheric surface air temperature (SAT) and global sea level (GSL) data of the past 3 My (Bintanja and van de Wal 2008) (supplementary figs. S13 and S14, Supplementary Material online).
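The scaling step described above can be sketched as follows. This is a minimal illustration, assuming the usual MSMC convention that scaled time divided by the mutation rate gives generations and that Ne equals the inverse of the scaled coalescence rate divided by 2 mu. The function name and the example input values are hypothetical and are not taken from the paper's output files.

```python
def msmc_to_real_units(scaled_time, scaled_coal_rate, mu, generation_time_years):
    """Convert one MSMC time segment to (years, effective population size).

    Assumed convention: generations = scaled_time / mu and
    Ne = (1 / scaled_coal_rate) / (2 * mu).
    """
    generations = scaled_time / mu
    years = generations * generation_time_years
    ne = (1.0 / scaled_coal_rate) / (2.0 * mu)
    return years, ne


MU = 9.796e-9   # mutation rate per site per generation stated in the text
G_YEARS = 5.0   # generation time stated in the text

# Hypothetical segment: scaled time 1e-4, scaled coalescence rate 2000
years, ne = msmc_to_real_units(1e-4, 2000.0, MU, G_YEARS)
print(f"time = {years:.0f} years, Ne = {ne:.0f}")
```

Because both quantities are inversely proportional to mu, any uncertainty in the mutation rate shifts the whole curve in time and in size but does not change its shape.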

Selective Sweep Analysis
Considering the sample size and close genetic background of indicine cattle (NEL, BRM, and GIR), we pooled these three indicine breeds into one group (IND) in our selection analysis. Seven cattle groups (QCC, RAN, JBC, IND, HOL, FLV, and JER) with sample sizes >10 were retained for the following analysis. To identify candidate loci for breed-specific phenotypes that are known to be under positive selection, we used the d_i statistic (Akey et al. 2010) to measure the locus-specific divergence in allele frequency for each group based on unbiased estimates of pairwise F_ST. Briefly, for each SNP, we calculated the statistic

$$d_i = \sum_{j \neq i} \frac{F_{ST}^{ij} - E[F_{ST}^{ij}]}{sd[F_{ST}^{ij}]},$$

where $E[F_{ST}^{ij}]$ and $sd[F_{ST}^{ij}]$ denote the expected value and SD of F_ST between groups i and j, calculated from all SNPs. For each group, d_i was averaged over the SNPs in nonoverlapping 50-kb windows. Windows with SNP number <10 were removed. The top 1% of windows with the highest mean d_i score were defined as candidate selective sweep regions. Adjacent sweeps within a distance of 50 kb were merged into one sweep. Selective sweep regions were annotated with cattle QTLdb release 29 from the Animal Quantitative Trait Loci Database (Hu et al. 2016). Candidate genes under positive selection were defined as those in which more than half of the gene interval was found in selective sweep regions. Tajima's D statistic was computed using VCFtools for each candidate gene. Gene Ontology (GO) enrichment analysis for genes in selective sweep regions was performed with a hypergeometric test using ClueGO (Bindea et al. 2009). The false discovery rate (FDR) was used to correct the P values with the Benjamini–Hochberg approach.
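The d_i computation and windowing described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the per-SNP pairwise F_ST values are assumed to be precomputed, the SD is taken as the population SD (the text does not say which estimator was used), positions are assumed to be 1-based on a single chromosome, and all function names are hypothetical.

```python
from statistics import mean, pstdev


def di_per_snp(fst_by_pair, focal):
    """Sum standardized per-SNP FST over all group pairs that include `focal`.

    fst_by_pair maps a pair such as ("QCC", "RAN") to a list of per-SNP FST
    values; every list is assumed to cover the same SNPs in the same order.
    """
    pairs = [pair for pair in fst_by_pair if focal in pair]
    n_snps = len(fst_by_pair[pairs[0]])
    di = [0.0] * n_snps
    for pair in pairs:
        values = fst_by_pair[pair]
        expected = mean(values)   # E[FST] over all SNPs
        sd = pstdev(values)       # assumption: population SD; must be nonzero
        for k, value in enumerate(values):
            di[k] += (value - expected) / sd
    return di


def window_means(positions, scores, window=50_000, min_snps=10):
    """Average scores in nonoverlapping windows; drop windows with too few SNPs."""
    bins = {}
    for pos, score in zip(positions, scores):
        bins.setdefault((pos - 1) // window, []).append(score)
    return {idx * window: mean(vals)
            for idx, vals in sorted(bins.items()) if len(vals) >= min_snps}


def top_windows(window_scores, fraction=0.01):
    """Return the window starts with the highest mean score (top `fraction`)."""
    ranked = sorted(window_scores, key=window_scores.get, reverse=True)
    n_keep = max(1, int(len(ranked) * fraction))
    return ranked[:n_keep]
```

Standardizing each pairwise F_ST before summing keeps a single highly diverged pair from dominating d_i, which is the reason for the subtraction and division in the formula.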

Supplementary Material
Supplementary data are available at Molecular Biology and Evolution online.

Author Contributions
L.-S.Z., W.-J.Z., G.C., and H.-B.W. led the experiments and designed the analytical strategy. L.-S.Z., C.-G.M., H.-C.W., W.-Q.T., L.-S.G., Y.-Y.Z., Z.-L.J., Y.-P.X., and X.-Z.S. performed animal work and prepared biological samples. C.-G.M., H.-C.W., G.C., H.-B.W., C.-P.Z., A.-N.L., W.-C.Y., C.-L.J., and S.-H.W. constructed the DNA library and performed sequencing. W.-J.Z., Q.-J.L., L.-Z.W., X.-L.W., X.-M.G., and C.-Z.W. detected, annotated, and summarized variants. W.-J.Z., Q.-J.L., C.-G.M., and H.-C.W. performed selection analysis. W.-J.Z., L.-Z.W., C.-G.M., and H.-C.W. analyzed the origin of Chinese indicine cattle and population history. C.-G.M., H.-C.W., W.-J.Z., Q.-J.L., and L.-Z.W. wrote the manuscript. L.-S.Z., S.-C.Z., J.-Z.S., G.L., X.-D.F., X.Z., S.S., H.-M.Y., J.W., and R.H. revised the manuscript. All the authors reviewed and approved the final manuscript.

Acknowledgments
We thank many people not listed as authors who provided feedback, samples, and encouragement, especially Changguo Yan, Yimin Xu, Shanzhai Liu, Guanli Wang, Xiang Gao, Jianghong Wan, and Kaixing Qu. This work was supported by the National 863 Program of China (2013AA102505), the National Science-technology Support Plan Projects (2015BAD03B04), the Program of National Beef Cattle and Yak Industrial Technology System (CARS-37), the Technical Innovation Engineering Project of Shaanxi Province (2014KTZB02-02-01), the National Beef Cattle Improvement Center, the National & Local Joint Engineering Research Center for Modern Cattle Biotechnology and Application, the Beef Cattle Engineering Technology Research Center of Shaanxi Province, and the State Key Laboratory of Agricultural Genomics.

References
Akey JM, Ruhe AL, Akey DT, Wong AK, Connelly CF, Madeoy J, Nicholas TJ, Neff MW. 2010. Tracking footprints of artificial selection in the dog genome. Proc Natl Acad Sci U S A. 107(3):1160–1165.

Alexander DH, Novembre J, Lange K. 2009. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19(9):1655–1664.

Barrett JC, Fry B, Maller J, Daly MJ. 2005. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 21(2):263–265.

Bech-Sabat G, Lopez-Gatius F, Yaniz JL, Garcia-Ispierto I, Santolaria P, Serrano B, Sulon J, de Sousa NM, Beckers JF. 2008. Factors affecting plasma progesterone in the early fetal period in high producing dairy cows. Theriogenology 69(4):426–432.

Bickhart DM, Xu L, Hutchison JL, Cole JB, Null DJ, Schroeder SG, Song J, Garcia JF, Sonstegard TS, Van Tassell CP, et al. 2016. Diversity and population-genetic properties of copy number variations and multicopy genes in cattle. DNA Res. 23(3):253–262.

Bindea G, Mlecnik B, Hackl H, Charoentong P, Tosolini M, Kirilovsky A, Fridman WH, Pages F, Trajanoski Z, Galon J. 2009. ClueGO: a Cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks. Bioinformatics 25(8):1091–1093.

Bintanja R, van de Wal RSW. 2008. North American ice-sheet dynamics and the onset of 100,000-year glacial cycles. Nature 454(7206):869–872.


Bloise E, Cassali GD, Ferreira MC, Ciarmela P, Petraglia F, Reis FM. 2010. Activin-related proteins in bovine mammary gland: localization and differential expression during gestational development and differentiation. J Dairy Sci. 93(10):4592–4601.

Brandes R, Arad R, Bar-Tana J. 1995. Inducers of adipose conversion activate transcription promoted by a peroxisome proliferator's response element in 3T3-L1 cells. Biochem Pharmacol. 50(11):1949–1951.

Browning SR, Browning BL. 2007. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am J Hum Genet. 81(5):1084–1097.

Cai D, Sun Y, Tang Z, Hu S, Li W, Zhao X, Xiang H, Zhou H. 2014. The origins of Chinese domestic cattle as revealed by ancient DNA analysis. J Archaeol Sci. 41:423–434.

Cai X, Chen H, Lei C, Wang S, Xue K, Zhang B. 2007. mtDNA diversity and genetic lineages of eighteen cattle breeds from Bos taurus and Bos indicus in China. Genetica 131(2):175–183.

Cai X, Chen H, Wang S, Xue K, Lei C. 2006. Polymorphisms of two Y chromosome microsatellites in Chinese cattle. Genet Sel Evol. 38(5):525–534.

Canavez FC, Luche DD, Stothard P, Leite KR, Sousa-Canavez JM, Plastow G, Meidanis J, Souza MA, Feijao P, Moore SS, et al. 2012. Genome sequence and assembly of Bos indicus. J Hered. 103(3):342–348.

Chen S, Wang Y, Kong X, Liu D, Cheng H, Edwards RL. 2006. A possible Younger Dryas-type event during Asian monsoonal Termination 3. Sci China D Earth Sci. 49(9):982–990.

Choi JW, Liao X, Park S, Jeon HJ, Chung WH, Stothard P, Park YS, Lee JK, Lee KT, Kim SH, et al. 2013. Massively parallel sequencing of Chikso (Korean brindle cattle) to discover genome-wide SNPs and InDels. Mol Cells 36(3):203–211.

Clark DL, Boler DD, Kutzler LW, Jones KA, McKeith FK, Killefer J, Carr TR, Dilger AC. 2011. Muscle gene expression associated with increased marbling in beef cattle. Anim Biotechnol. 22(2):51–63.

Daetwyler HD, Capitan A, Pausch H, Stothard P, van Binsbergen R, Brondum RF, Liao X, Djari A, Rodriguez SC, Grohs C, et al. 2014. Whole-genome sequencing of 234 bulls facilitates mapping of monogenic and complex traits in cattle. Nat Genet. 46(8):858–865.

Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, Handsaker RE, Lunter G, Marth GT, Sherry ST, et al. 2011. The variant call format and VCFtools. Bioinformatics 27(15):2156–2158.

Decker JE, McKay SD, Rolf MM, Kim J, Molina AA, Sonstegard TS, Hanotte O, Gotherstrom A, Seabury CM, Praharani L, et al. 2014. Worldwide patterns of ancestry, divergence, and admixture in domesticated cattle. PLoS Genet. 10(3):e1004254.

Eberlein A, Takasuga A, Setoguchi K, Pfuhl R, Flisikowski K, Fries R, Klopp N, Furbass R, Weikard R, Kuhn C. 2009. Dissection of genetic factors modulating fetal growth in cattle indicates a substantial role of the non-SMC condensin I complex subunit G (NCAPG) gene. Genetics 183(3):951–964.

Fang X, Lu L, Yang S, Li J, An Z, Jiang PA, Chen X. 2002. Loess in Kunlun Mountains and its implications on desert development and Tibetan Plateau uplift in west China. Sci China D Earth Sci. 45(4):289–299.

Forti E, Aksanov O, Birk RZ. 2007. Temporal expression pattern of Bardet-Biedl syndrome genes in adipogenesis. Int J Biochem Cell B. 39(5):1055–1062.

Gan HY, Li JB, Wang HM, Gao YD, Liu WH, Li JP, Zhong JF. 2007. Relationship between the melanocortin receptor 1 (MC1R) gene and the coat color phenotype in cattle. Yi Chuan 29:195–200.

Gibbs RA, Taylor JF, Van Tassell CP, Barendse W, Eversole KA, Gill CA, Green RD, Hamernik DL, Kappes SM, Lien S, et al. 2009. Genome-wide survey of SNP variation uncovers the genetic structure of cattle breeds. Science 324(5926):528–532.

Gotoh T, Takahashi H, Nishimura T, Kuchida K, Mannen H. 2014. Meat produced by Japanese Black cattle and Wagyu. Anim Front. 4(4):46–54.

Grindflek E, Holzbauer R, Plastow G, Rothschild MF. 2002. Mapping and investigation of the porcine major insulin sensitive glucose transport (SLC2A4/GLUT4) gene as a candidate gene for meat quality and carcass traits. J Anim Breed Genet. 119(1):47–55.

Gutierrez-Gil B, Arranz JJ, Wiener P. 2015. An interpretive review of selective sweep studies in Bos taurus cattle populations: identification of unique and shared selection signals across breeds. Front Genet. 6:167.

Hansen PJ. 2004. Physiological and cellular adaptations of zebu cattle to thermal stress. Anim Reprod Sci. 82–83:349–360.

Hiendleder S, Lewalski H, Janke A. 2008. Complete mitochondrial genomes of Bos taurus and Bos indicus provide new insights into intra-species variation, taxonomy and domestication. Cytogenet Genome Res. 120(1–2):150–156.

Hu ZL, Park CA, Reecy JM. 2016. Developmental progress and current status of the Animal QTLdb. Nucleic Acids Res. 44(D1):D827–D833.

Hung CM, Shaner PJ, Zink RM, Liu WC, Chu TC, Huang WS, Li SH. 2014. Drastic population fluctuations explain the rapid extinction of the passenger pigeon. Proc Natl Acad Sci U S A. 111(29):10636–10641.

Kovacik A, Bulla J, Trakovicka A, Zitny J, Rafayova A. 2012. The effect of the porcine melanocortin-5 receptor (MC5R) gene associated with feed intake, carcass and physico-chemical characteristics. J Microbiol Biotechnol Food Sci. 1:498–506.

Labrecque R, Vigneault C, Blondin P, Sirard MA. 2013. Gene expression analysis of bovine oocytes with high developmental competence obtained from FSH-stimulated animals. Mol Reprod Dev. 80(6):428–440.

Lai SJ, Liu YP, Liu YX, Li XW, Yao YG. 2006. Genetic diversity and origin of Chinese cattle revealed by mtDNA D-loop sequence variation. Mol Phylogenet Evol. 38(1):146–154.

Lee KT, Chung WH, Lee SY, Choi JW, Kim J, Lim D, Lee S, Jang GW, Kim B, Choy YH, et al. 2013. Whole-genome resequencing of Hanwoo (Korean cattle) and insight into regions of homozygosity. BMC Genomics 14:519.

Lee SS, Yang BS, Yang YH, Kang SY, Ko SB, Jung JK, Oh WY, Oh SJ, Kim KI. 2002. Analysis of melanocortin receptor 1 (MC1R) genotype in Korean brindle cattle and Korean cattle with dark muzzle. J Anim Sci Technol. 44(1):23–30.

Lei C, Chen H, Hu S. 2000. Studies on Y chromosome polymorphism and the origin and classification of Chinese yellow cattle. Acta Agric Boreali-Occidentalis Sin. 9:43–47.

Lei CZ, Chen H, Zhang HC, Cai X, Liu RY, Luo LY, Wang CF, Zhang W, Ge QL, Zhang RF, et al. 2006. Origin and phylogeographical structure of Chinese cattle. Anim Genet. 37(6):579–582.

Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25(14):1754–1760.

Li H, Durbin R. 2011. Inference of human population history from individual whole-genome sequences. Nature 475(7357):493–496.

Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25(16):2078–2079.

Lin X, Luo J, Zhang L, Zhu J. 2013. MicroRNAs synergistically regulate milk fat synthesis in mammary gland epithelial cells of dairy goats. Gene Expr. 16(1):1–13.

Loftus RT, MacHugh DE, Bradley DG, Sharp PM, Cunningham P. 1994. Evidence for two independent domestications of cattle. Proc Natl Acad Sci U S A. 91(7):2757–2761.

McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, et al. 2010. The genome analysis toolkit: a mapreduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20(9):1297–1303.


Medugorac I, Graf A, Grohs C, Rothammer S, Zagdsuren Y, Gladyr E, Zinovieva N, Barbieri J, Seichter D, Russ I, et al. 2017. Whole-genome analysis of introgressive hybridization and characterization of the bovine legacy of Mongolian yaks. Nat Genet. 49(3):470–475.

Mei C, Wang H, Zhu W, Wang H, Cheng G, Qu K, Guang X, Li A, Zhao C, Yang W, et al. 2016. Whole-genome sequencing of the endangered bovine species Gayal (Bos frontalis) provides new insights into its genetic features. Sci Rep. 6:19787.

Miller W, Schuster SC, Welch AJ, Ratan A, Bedoya-Reina OC, Zhao F, Kim HL, Burhans RC, Drautz DI, Wittekindt NE, et al. 2012. Polar and brown bear genomes reveal ancient admixture and demographic footprints of past climate change. Proc Natl Acad Sci U S A. 109(36):E2382–E2390.

Nadachowska-Brzyska K, Burri R, Smeds L, Ellegren H. 2016. PSMC analysis of effective population sizes in molecular ecology and its application to black-and-white Ficedula flycatchers. Mol Ecol. 25(5):1058–1072.

Orlando L, Ginolhac A, Zhang G, Froese D, Albrechtsen A, Stiller M, Schubert M, Cappellini E, Petersen B, Moltke I, et al. 2013. Recalibrating Equus evolution using the genome sequence of an early Middle Pleistocene horse. Nature 499(7456):74–78.

Ouali A, Herrera-Mendez CH, Coulis G, Becila S, Boudjellal A, Aubry L, Sentandreu MA. 2006. Revisiting the conversion of muscle into meat and the underlying mechanisms. Meat Sci. 74(1):44–58.

Patterson N, Price AL, Reich D. 2006. Population structure and eigenanalysis. PLoS Genet. 2(12):e190.

Porto-Neto LR, Sonstegard TS, Liu GE, Bickhart DM, Da SM, Machado MA, Utsunomiya YT, Garcia JF, Gondro C, Van Tassell CP. 2013. Genomic divergence of zebu and taurine cattle identified through high-density SNP genotyping. BMC Genomics 14(1):876.

Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, Maller J, Sklar P, de Bakker PI, Daly MJ, et al. 2007. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 81(3):559–575.

Qiu H, Ju ZY, Chang ZJ. 1993. A survey of cattle production in China. More attention to animal genetic resources. Food and Agriculture Organization of the United Nations.

Qiu Q, Wang L, Wang K, Yang Y, Ma T, Wang Z, Zhang X, Ni Z, Hou F, Long R, et al. 2015. Yak whole-genome resequencing reveals domestication signatures and prehistoric population expansions. Nat Commun. 6:10283.

Qiu Q, Zhang G, Ma T, Qian W, Wang J, Ye Z, Cao C, Hu Q, Kim J, Larkin DM, et al. 2012. The yak genome and adaptation to life at high altitude. Nat Genet. 44(8):946–949.

Randhawa IAS, Khatkar MS, Thomson PC, Raadsma HW, Barendse W. 2016. A meta-assembly of selection signatures in cattle. PLoS One 11(4):e153013.

Rincon G, Islas-Trejo A, Casellas J, Ronin Y, Soller M, Lipkin E, Medrano JF. 2009. Fine mapping and association analysis of a quantitative trait locus for milk production traits on Bos taurus autosome 4. J Dairy Sci. 92(2):758–764.

Ruetz TJ, Lin AE, Guttman JA. 2012. Enterohaemorrhagic Escherichia coli requires the spectrin cytoskeleton for efficient attachment and pedestal formation on host cells. Microb Pathog. 52(3):149–156.

Sartori R, Bastos MR, Baruselli PS, Gimenes LU, Ereno RL, Barros CM. 2010. Physiological differences and implications to reproductive management of Bos taurus and Bos indicus cattle in a tropical environment. Reprod Domest Rumin VII. 7(1):357–375.

Sattler K, Levkau B. 2009. Sphingosine-1-phosphate as a mediator of high-density lipoprotein effects in cardiovascular protection. Cardiovasc Res. 82(2):201–211.

Scherf BD, Pilling D. 2015. The second report on the state of the world's animal genetic resources for food and agriculture. Food and Agriculture Organization of the United Nations.

Schiffels S, Durbin R. 2014. Inferring human population size and separation history from multiple genome sequences. Nat Genet. 46(8):919–925.

Setoguchi K, Watanabe T, Weikard R, Albrecht E, Kuhn C, Kinoshita A, Sugimoto Y, Takasuga A. 2011. The SNP c.1326T>G in the non-SMC condensin I complex subunit G (NCAPG) gene encoding a p.Ile442Met variant is associated with an increase in body frame size at puberty in cattle. Anim Genet. 42(6):650–655.

Sherratt A. 1983. The secondary exploitation of animals in the Old World. World Archaeol. 15(1):90–104.

Smith TP, Grosse WM, Freking BA, Roberts AJ, Stone RT, Casas E, Wray JE, White J, Cho J, Fahrenkrug SC, et al. 2001. Sequence evaluation of four pooled-tissue normalized bovine cDNA libraries and construction of a gene index for cattle. Genome Res. 11(4):626–630.

Sorbolini S, Marras G, Gaspa G, Dimauro C, Cellesi M, Valentini A, Macciotta NP. 2015. Detection of selection signatures in Piemontese and Marchigiana cattle, two breeds with similar production aptitudes but different selection histories. Genet Sel Evol. 47:52.

Stothard P, Choi JW, Basu U, Sumner-Thomson JM, Meng Y, Liao X, Moore SS. 2011. Whole genome resequencing of black Angus and Holstein cattle for SNP and CNV discovery. BMC Genomics 12:559.

Sun YB, An ZS. 2005. Late Pliocene-Pleistocene changes in mass accumulation rates of eolian deposits on the central Chinese Loess Plateau. J Geophys Res-Atmos. 110(D23):1193–1194.

Svizzero S, Tisdell C. 2016. Input shortages and the lack of sustainability of bronze production by the Unetice. In: Working Papers on Economics, Ecology and the Environment, No. 202. Queensland (Australia): University of Queensland.

Switonski M, Mankowska M, Salamon S. 2013. Family of melanocortin receptor (MCR) genes in mammals-mutations, polymorphisms and phenotypic effects. J Appl Genet. 54(4):461–472.

Van Vuure T. 2002. History, morphology and ecology of the aurochs (Bos primigenius). Available from: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.534.6285

Wang K, Li M, Hakonarson H. 2010. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38(16):e164.

Wang M, Ding Y. 1996. The importance of work animals in rural China. World Anim Rev. 86:65–67.

Wedholm A, Larsen LB, Lindmark-Mansson H, Karlsson AH, Andren A. 2006. Effect of protein composition on the cheese-making properties of milk from individual dairy cows. J Dairy Sci. 89(9):3296–3305.

Weir BS, Cockerham CC. 1984. Estimating F-statistics for the analysis of population structure. Evolution 38(6):1358–1370.

Xu L, Bickhart DM, Cole JB, Schroeder SG, Song J, Tassell CP, Sonstegard TS, Liu GE. 2015. Genomic signatures reveal new evidences for selection of important traits in domestic cattle. Mol Biol Evol. 32(3):711–725.

Xu Y, Yu W, Xiong Y, Xie H, Ren Z, Xu D, Lei M, Zuo B, Feng X. 2011. Molecular characterization and expression patterns of serine/arginine-rich specific kinase 3 (SPRK3) in porcine skeletal muscle. Mol Biol Rep. 38(5):2903–2909.

Yahvah KM, Brooker SL, Williams JE, Settles M, Mcguire MA, Mcguire MK. 2015. Elevated dairy fat intake in lactating women alters milk lipid and fatty acids without detectible changes in expression of genes related to lipid uptake or synthesis. Nutr Res. 35(3):221.

Yoon D, Ko E. 2016. Association study between SNPs of the genes within bovine QTLs and meat quality of Hanwoo. J Anim Sci. 94(Suppl 4):145.

Yu Y, Nie L, He ZQ, Wen JK, Jian CS, Zhang YP. 1999. Mitochondrial DNA variation in cattle of south China: origin and introgression. Anim Genet. 30(4):245–250.

Zhang H, Paijmans JL, Chang F, Wu X, Chen G, Lei C, Yang X, Wei Z, Bradley DG, Orlando L, et al. 2013. Morphological and genetic evidence for early Holocene cattle management in northeastern China. Nat Commun. 4:2755.

Zhang H, Wang S, Wang Z, Da Y, Wang N, Hu X, Zhang Y, Wang Y, Leng L, Tang Z, et al. 2012. A genome-wide scan of selective sweeps in two broiler chicken lines divergently selected for abdominal fat content. BMC Genomics 13:704.


Zhang Z, Wang Z, Yang Y, Zhao J, Chen Q, Liao R, Chen Z, Zhang X, Xue M, Yang H, et al. 2016. Identification of pleiotropic genes and gene sets underlying growth and immunity traits: a case study on Meishan pigs. Animal 10(4):550–557.

Zhao C, Tian F, Yu Y, Luo J, Hu Q, Bequette BJ, Baldwin VR, Liu G, Zan L, Scott UM, et al. 2012. Muscle transcriptomic analyses in Angus cattle with divergent tenderness. Mol Biol Rep. 39(4):4185–4193.

Zhao S, Zheng P, Dong S, Zhan X, Wu Q, Guo X, Hu Y, He W, Zhang S, Fan W, et al. 2013. Whole-genome sequencing of giant pandas provides insights into demographic history and local adaptation. Nat Genet. 45(1):67–71.

Zheng B, Xu Q, Shen Y. 2002. The relationship between climate change and Quaternary glacial cycles on the Qinghai–Tibetan Plateau: review and speculation. Quatern Int. 97–98:93–101.

Zhou X, Wang B, Pan Q, Zhang J, Kumar S, Sun X, Liu Z, Pan H, Lin Y, Liu G, et al. 2014. Whole-genome sequencing of the snub-nosed monkey provides insights into folivory and evolutionary history. Nat Genet. 46(12):1303–1310.

Zimin AV, Delcher AL, Florea L, Kelley DR, Schatz MC, Puiu D, Hanrahan F, Pertea G, Van Tassell CP, Sonstegard TS, et al. 2009. A whole-genome assembly of the domestic cow, Bos taurus. Genome Biol. 10(4):R42.


economic prosperity based on the diversified agriculture of the Unetice culture contributed to the early formation of European cattle breeds (Svizzero and Tisdell 2016).

Signatures of Positive Selection in Cattle Genome
We identified regions that exhibit high levels of differentiation among cattle breeds using the d_i statistic in a reduced data set containing breeds with at least 10 samples (fig. 3). We treated GIR, NEL, and BRM as one group (IND) based on their close genetic relationships, as evidenced by both the PCA and phylogenetic results (fig. 1a and b). Those windows with the highest average d_i values within each breed, which fell above the 99th percentile of the empirical distribution, were considered putative signatures of selection (supplementary fig. S15, Supplementary Material online).

In total, we identified 2,842 potential selective sweep regions in one or more of the seven breeds (full genomic regions are detailed in supplementary tables S12–S18, Supplementary Material online), which had an average size of 67 kb (ranging from 11 kb to 1150 kb). These regions harbored 1,429 protein-coding genes, 682 (47.72%) of which were previously identified as under positive selection in cattle (Randhawa et al. 2016). More specifically, we detected 357, 381, 232, 234, 307, 300, and 307 potentially positively selected genes from breed-specific selection events in the IND, QCC, JBC, RAN, FLV, HOL, and JER genomes, respectively (fig. 3a and supplementary tables S19–S25, Supplementary Material online).
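The sweep regions counted here result from merging candidate windows that lie within 50 kb of each other, as described in Materials and Methods. A minimal sketch of such a merge is given below. It assumes regions on a single chromosome given as (start, end) pairs and treats a gap of exactly 50 kb as "within" the threshold; both choices and the function name are assumptions made for illustration.

```python
def merge_sweeps(regions, max_gap=50_000):
    """Merge (start, end) regions on one chromosome whose gap is <= max_gap."""
    merged = []
    for start, end in sorted(regions):
        if merged and start - merged[-1][1] <= max_gap:
            # Close enough to the previous sweep: extend it
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(region) for region in merged]


# Hypothetical candidate windows (coordinates in bp)
candidates = [(0, 50_000), (100_000, 150_000), (300_000, 350_000)]
print(merge_sweeps(candidates))
```

Merging explains why the reported regions vary in size even though the underlying windows have a fixed width.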

To obtain a broad overview of the molecular functions of these genes and to test the hypothesis that particular functional classes are enriched in the most differentiated regions of the cattle genome, we performed a gene ontology (GO) analysis using ClueGO (Bindea et al. 2009) for each group separately. A potential concern regarding this analysis is the low power to detect enrichment due to the low expected counts for many categories. Nonetheless, several categories showed enrichment for signals of positive selection in one or more groups (supplementary table S26, Supplementary Material online), including the related categories of cellular response to UV as well as immune response and pathogen defence. These findings suggest that immune-related genes are pervasive targets of positive selection because of their critical role in immune and defence functions.
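The statistical idea behind this enrichment analysis, a hypergeometric test followed by Benjamini–Hochberg correction, can be sketched with the Python standard library. This is an illustration of the general technique only; it is not ClueGO's implementation, and the function names and example counts are hypothetical.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) when drawing `draws` genes without replacement from
    `population` genes, of which `successes` belong to the GO category."""
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(comb(successes, i) * comb(population - successes, draws - i)
               for i in range(k, upper + 1))
    return tail / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical example: 3 of 10 genes are in a category; 2 genes are selected
p = hypergeom_sf(1, 10, 3, 2)
print(f"P(at least 1 category gene among 2 selected) = {p:.4f}")
```

With small categories the expected counts are low, so few P values can reach significance after correction, which is the power concern raised in the text.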

Many genes associated with shaping particular characteristics of the populations are present within these regions (table 1). These include morphological traits (coat color, horn/polledness) and production traits (dairy, muscle formation, skeletal development, energy partitioning, fertility, draft traits).

Several genes involved in coat color phenotypes were identified as targets of positive selection in one or more groups (table 1), including ERCC2 in QCC and IND, MC1R in IND, ZBTB17 in QCC, and MAP2K1 in JBC. One of these genes, MC1R, is well known for its role in regulating the switch between the eumelanin and pheomelanin biosynthesis pathways in mammals, including cattle. The selection signals of IND in MC1R reflect the significant role of light coloration (light gray to white in BRM and NEL, yellowish-red to white in GIR) in the adaptation of IND to its tropical environment.

FIG. 2. Demographic history of cattle. (a) Ancestral population size is inferred using PSMC. A generation time of 5 years and a mutation rate of 0.98 × 10^-8 mutations per nucleotide per generation are used. The relative cross coalescence rates over time between indicine (b) and taurine (c) breeds are estimated using MSMC with four haplotypes for each pair.

Some of the strongest signals of selection appeared in various types of genes related to production traits (table 1). For example, several genes involved in milk production showed clear evidence of positive selection in dairy cattle (NCAPG in both FLV and JER; MAPK7 in FLV; FST, ITFG1, SETMAR, and PAG1 in HOL; CSN3 and RPL37A in JER). Various genes involved in meat traits have also been targets of recent positive selection (table 1). Some genes related to skeletal muscle development and muscle fibre type appeared to be targets of positive selection, including CASP9, DIO1, SREBF2, and PLOD3 in QCC; ASGR1, IGFBP2, IGFBP5, and MYH9 in JBC; OSTN, CPT2, CSF2RB, and SLC2A4 in FLV; and SLC2A5, TMEM97, MYH4, SRPK3, and POLDIP2 in RAN. We also found that a set of important genes associated with lipid metabolism were putatively positively selected (AOX1 in QCC, RAN, FLV, JER, and IND; MC5R in IND; BBS2, S1PR3, and LRP2BP in JBC). Interestingly, we identified a missense mutation in BBS2 (exon 15, rs135889003, c.A1880G, p.Q627R) that was almost fixed (allele frequency >0.95) in JBC, a breed known for producing the intensely marbled Wagyu beef (with >30% intramuscular fat) (Gotoh et al. 2014). BBS2 is a member of the Bardet–Biedl syndrome gene family, the primary clinical feature of which is obesity, and has been found to play a significant role in adipogenesis (Forti et al. 2007). The positive selection signals near the BBS2 region are further supported by significantly lower values of Tajima's D and the long haplotype patterns in JBC (fig. 3b); this region may be useful as a genetic target in breeding for improved beef marbling. In addition, R3HDM1, a gene associated with efficient food conversion and intramuscular fat content, showed signals of positive selection in five groups (QCC, FLV, HOL, JER, and IND). These genes might be associated with the tenderness and quality of meat in cattle.

FIG. 3. Candidate positively selected genes. (a) Shared candidate positively selected genes among groups. Only partial numbers are shown. (b) An example of a selective sweep at the BBS2 gene in JBC; only positive values of d_i are shown (top). The Tajima's D values in each group are shown (middle). SNPs with minor allele frequencies >0.05 are used to construct haplotype patterns (bottom). The major alleles in JBC are green and the minor alleles in JBC are yellow.

Table 1. Summary of Partial Traits Associated with Positively Selected Genes.

Gene | Breed | Trait | Reference
MC1R | IND | CC | Lee et al. 2002; Gan et al. 2007
MAP2K1(a) | JBC | CC | Gutierrez-Gil et al. 2015
ZBTB17 | QCC | CC | Gutierrez-Gil et al. 2015
ERCC2(a) | QCC, IND | CC | Gutierrez-Gil et al. 2015
FST, ITFG1, SETMAR, PAG1 | HOL | DY | Bech-Sabat et al. 2008; Rincon et al. 2009; Bloise et al. 2010; Xu et al. 2015
RPL37A(a), CSN3 | JER | DY | Wedholm et al. 2006; Yahvah et al. 2015
MAPK7(a) | FLV | DY | Lin et al. 2013
NCAPG | FLV, JER | DY | Eberlein et al. 2009; Setoguchi et al. 2011
BBS2(a), S1PR3(a), LRP2BP(a), IGFBP2, IGFBP5, MYH9, ASGR1 | JBC | MT, GT | Forti et al. 2007; Sattler and Levkau 2009; Zhang et al. 2012; Lee et al. 2013; Sorbolini et al. 2015; Yoon and Ko 2016
SRPK3(a), POLDIP2(a), SLC2A5, TMEM97, MYH4 | RAN | MT, GT | Smith et al. 2001; Clark et al. 2011; Xu et al. 2011; Zhao et al. 2012; Lee et al. 2013; Zhang et al. 2016
CASP9(a), DIO1, SREBF2, PLOD3 | QCC | MT, GT | Ouali et al. 2006; Lee et al. 2013
SLC2A4(a), OSTN, CPT2, CSF2RB | FLV | MT, GT | Grindflek et al. 2002; Lee et al. 2013; Xu et al. 2015
MC5R(a) | IND | MT, GT | Kovacik et al. 2012; Switonski et al. 2013
AOX1(a) | QCC, RAN, FLV, JER, and IND | MT, GT | Brandes et al. 1995
R3HDM1 | QCC, FLV, HOL, JER, and IND | MT, FC | Gibbs et al. 2009

CC, coat color; MT, meat traits; GT, growth traits; DY, dairy traits; FC, food conversion efficiency.
(a) Newly identified genes associated with phenotypic features of cattle.
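Tajima's D, used above as supporting evidence for the BBS2 sweep, compares two estimates of diversity: the mean number of pairwise differences and the number of segregating sites scaled by a harmonic sum. The sketch below implements the standard formula in plain Python for illustration. It is not the VCFtools implementation used in the study; the input format (one allele count per site among n sampled chromosomes) and the function name are assumptions.

```python
from math import sqrt


def tajimas_d(allele_counts, n):
    """Tajima's D from per-site counts of one allele among n chromosomes.

    Sites with a count of 0 or n are treated as nonsegregating. Returns NaN
    when there are no segregating sites or n < 4, where the variance term
    is zero.
    """
    seg = [k for k in allele_counts if 0 < k < n]
    s = len(seg)
    if s == 0 or n < 4:
        return float("nan")
    # Mean pairwise differences, summed over sites
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in seg)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical toy data: 8 chromosomes, an excess of rare alleles
print(tajimas_d([1, 1, 1, 1, 2], 8))
```

An excess of rare alleles, as expected after a recent sweep, pulls the pairwise term below the segregating-site term and makes D negative, which is the pattern reported for the BBS2 region in JBC.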

Conclusion
Whole-genome sequencing of representative Chinese cattle breeds and two additional breeds (JBC and RAN) generated a comprehensive catalogue of genetic variations. This is the first population genomic study on Chinese cattle to use next-generation whole-genome sequencing data and is an important source of genetic information for cattle worldwide. Bovine haplotypes have been inferred in Mongolian yaks, with recent admixture at least 1500 years ago (Medugorac et al. 2017). It is highly possible that there was recent introgression from yak (B. grunniens) to Chinese cattle, as suggested by previous studies (Lei et al. 2000; Cai et al. 2007, 2014). However, the genetic influence of yak is too limited to have been detected in the representative cattle breeds examined in our study. We also discovered many potential selective sweeps associated with domestication and related to breed-specific characteristics, with selective sweep regions including genes associated with coat color, dairy traits, and meat production/quality traits. Collectively, these findings substantially expand the catalogue of genetic variants in cattle and reveal new insights into the evolutionary history and domestication traits of Chinese cattle.

Materials and Methods

Sample Collection and Sequencing
To represent the overall genetic diversity of Chinese cattle, we selected 46 samples from 6 representative Chinese cattle breeds with divergent phenotypic characters across the main geographic distribution: Qinchuan cattle (QCC, n = 37), Nanyang cattle (NYC, n = 2), Luxi cattle (LXC, n = 1), Yanbian cattle (YBC, n = 2), Yunnan cattle (YNC, n = 2), and Leiqiong cattle (LQC, n = 2). For comparison, samples from two specialized beef cattle breeds, Red Angus (RAN, n = 18) and JBC (n = 11), were also collected (supplementary table S2, Supplementary Material online). Total genomic DNA was extracted from the blood samples of the animals using a standard phenol–chloroform protocol. For each individual, at least 5 mg of genomic DNA was used to construct paired-end libraries with an insert size of 500 bp according to Illumina's library preparation protocol. Moreover, we collected 76 genome sequences from previous studies for the breeds Brahman (BRM, indicine, n = 6), Nelore (NEL, indicine, n = 5), Gir (GIR, indicine, n = 4), Limousin (LIM, taurine, n = 6), Jersey (JER, taurine, n = 18), Fleckvieh (FLV, taurine, n = 19), and Holstein (HOL, taurine, n = 18) (details in supplementary tables S1 and S2, Supplementary Material online).

Alignments and Variant Identification

Paired-end reads (100 bp) obtained from sequencing in the present study and previous studies were mapped to the B. taurus genome (UMD3.1) (Zimin et al. 2009) using BWA (Li and Durbin 2009) with the default parameters. Sequence Alignment/Map (SAM) format files were imported into SAMtools (Li et al. 2009) for sorting and merging and into Picard (http://broadinstitute.github.io/picard, version 1.92) to remove duplicated reads. To identify the ancestral state of cattle, we mapped the raw reads of yak (Qiu et al. 2012), sequenced to 65×, to the reference genome.

Initial variant site identification was performed using SAMtools mpileup and GATK UnifiedGenotyper (Genome Analysis Toolkit, version 2.4-9) (McKenna et al. 2010) with the default settings. The overlap subset of 53,979,675 single-nucleotide polymorphisms (SNPs) and 5,924,578 small insertions and deletions (InDels; 91% of InDels were 1–30 bp in length, and the largest InDel was 403 bp in length) was defined as a high-confidence catalogue used for base quality recalibration using GATK with the default set of covariates. The resulting recalibrated BAM files were then used as input for a second variant calling with GATK. The resulting variant calls were analyzed, and approximately the highest scoring 10% of the predicted variant sites were used as a training set for variant quality recalibration and filtering by using GATK. These steps resulted in 60,031,459 SNPs and 5,603,383 InDels. To obtain high-quality results for further analyses, we only retained biallelic SNPs and InDels with >90% calling rates, resulting in 57,220,105 SNPs and 5,270,518 InDels. Beagle (Browning and Browning 2007), which has been shown to yield highly accurate solutions, was used to improve the genotype calls using genotype likelihoods from GATK and to infer the haplotypes in the sample. Short InDels were not included in the diversity or divergence estimates or in the other analyses. Variants (SNPs and InDels) were annotated using ANNOVAR (Wang et al. 2010).
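The final site filter (biallelic variants with a calling rate above 90%) can be illustrated with a minimal sketch. This is not the pipeline used in the study, which relied on GATK; the function names and the genotype encoding (alternate-allele count, or None when a genotype is missing) are assumptions made only for illustration.

```python
from typing import Optional, Sequence

# Assumed encoding: alt-allele count (0, 1, 2) or None for a missing call.
Genotype = Optional[int]


def call_rate(genotypes: Sequence[Genotype]) -> float:
    """Fraction of samples with a non-missing genotype at one site."""
    called = sum(g is not None for g in genotypes)
    return called / len(genotypes)


def keep_site(alt_alleles: Sequence[str],
              genotypes: Sequence[Genotype],
              min_call_rate: float = 0.90) -> bool:
    """True for biallelic sites whose call rate exceeds the threshold."""
    is_biallelic = len(alt_alleles) == 1  # exactly one alternate allele
    return is_biallelic and call_rate(genotypes) > min_call_rate
```

The comparison is strict, following the ">90%" wording; a site called in exactly 90% of samples is dropped under this reading.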

Phylogenetic and Population Structure Analyses

A phylogenetic tree was constructed from the SNP data by using the neighbor-joining method in the program PHYLIP v3.695 (http://evolution.genetics.washington.edu/phylip.html), and distance matrices were calculated using PLINK (Purcell et al. 2007). The ancestral states of the SNPs were determined using a close relative of cattle, B. grunniens, as the outgroup. Population structure was further inferred using ADMIXTURE (Alexander et al. 2009) with the number of ancestral populations (K) set from 2 to 7. Principal component analysis was carried out using the smartPCA program of the EIGENSOFT package (Patterson et al. 2006).

Genome-Wide Patterns of Genetic Diversity and Divergence

The average pairwise nucleotide diversity (θπ) and Tajima's D statistic of each breed were calculated using a sliding window approach (50-kb sliding windows in 10-kb steps) with the default parameters of VCFtools (Danecek et al. 2011). Population differentiation was measured by pairwise FST using the unbiased estimator of Weir and Cockerham (1984) with the default parameters.
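To make the windowed statistics concrete, the following sketch generates 50-kb windows in 10-kb steps and computes pairwise diversity and Tajima's D from per-site alternate-allele counts. It is a minimal illustration of the standard formulas, not the VCFtools implementation; the function names, the window-boundary convention, and the input encoding are assumptions.

```python
import math
from typing import Iterator, Optional, Sequence, Tuple


def sliding_windows(chrom_length: int, size: int = 50_000,
                    step: int = 10_000) -> Iterator[Tuple[int, int]]:
    """Yield 1-based inclusive (start, end) windows along a chromosome."""
    start = 1
    while start <= chrom_length:
        yield start, min(start + size - 1, chrom_length)
        start += step


def pairwise_diversity(alt_counts: Sequence[int], n: int) -> float:
    """Mean number of pairwise differences among n haplotypes, summed over
    sites; alt_counts[i] is how many haplotypes carry the alt allele."""
    pairs = n * (n - 1) / 2
    return sum(k * (n - k) for k in alt_counts) / pairs


def tajimas_d(alt_counts: Sequence[int], n: int) -> Optional[float]:
    """Tajima's D for one window; None when it is undefined."""
    seg = [k for k in alt_counts if 0 < k < n]  # segregating sites only
    s = len(seg)
    if s == 0 or n < 4:
        return None
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    variance = (c1 / a1) * s + (c2 / (a1 ** 2 + a2)) * s * (s - 1)
    if variance <= 0:
        return None
    return (pairwise_diversity(seg, n) - s / a1) / math.sqrt(variance)
```

Dividing the value returned by `pairwise_diversity` by the window length gives a per-site diversity comparable to a windowed θπ.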

Linkage Disequilibrium

To estimate the genome-wide LD of each breed, we calculated the mean r² values for pairwise markers with Haploview (Barrett et al. 2005) software. Only SNPs with a minor allele frequency > 0.05 in three groups (Chinese cattle, indicine, and taurine) were used. The parameters of Haploview were set to "-maxdistance 200 -dprime -memory 5000 -minGeno 0.6 -minMAF 0.05 -hwcutoff 0.001". To minimize the influence of sample size, only breeds with at least five individuals were used, and breeds with more than five samples were down-sampled to five.
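For reference, the r² statistic between two biallelic SNPs can be written out as below. The sketch assumes phased haplotypes coded 0/1, which is a simplification: Haploview works from genotype data and estimates haplotype frequencies itself. The function name is hypothetical.

```python
from typing import Sequence


def r_squared(hap_a: Sequence[int], hap_b: Sequence[int]) -> float:
    """r^2 between two biallelic SNPs from phased 0/1 haplotypes."""
    n = len(hap_a)
    if n == 0 or n != len(hap_b):
        raise ValueError("haplotype vectors must be non-empty and equal in length")
    p_a = sum(hap_a) / n                                   # allele 1 frequency at SNP A
    p_b = sum(hap_b) / n                                   # allele 1 frequency at SNP B
    p_ab = sum(a & b for a, b in zip(hap_a, hap_b)) / n    # frequency of the 1-1 haplotype
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    d = p_ab - p_a * p_b                                   # LD coefficient D
    return d * d / denom
```

Averaging this value over SNP pairs binned by physical distance gives the kind of LD decay summary described above.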

Haplotype Diversity

For the haplotype diversity analysis, the same breeds and SNP set were used as in the linkage disequilibrium analysis. To calculate haplotype diversity, the genome was divided into 5- to 500-kb bins (detailed in supplementary table S6, Supplementary Material online). Windows with fewer than two SNPs per 5 kb were removed, and for those with more than four SNPs, four SNPs were randomly selected. Considering the substantial variation in the recombination rate across the cattle genome, we adopted a sliding-window strategy and allowed the window to slide by half its length each time. The frequencies of haplotypes were counted, and haplotype diversity (H) was calculated as described previously (Daetwyler et al. 2014).
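The exact formula for H is given in Daetwyler et al. (2014) and is not reproduced in this text. As a hedged illustration only, the sketch below uses Nei's commonly used estimator, H = n/(n − 1) × (1 − Σ p_i²), where p_i is the frequency of the i-th haplotype among n haplotypes in a window; whether this matches the cited method exactly is an assumption.

```python
from collections import Counter
from typing import Sequence


def haplotype_diversity(haplotypes: Sequence[str]) -> float:
    """Nei's haplotype diversity for the haplotypes observed in one window.

    Each haplotype is a string of alleles at the selected SNPs, e.g. "0110".
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two haplotypes are required")
    counts = Counter(haplotypes)
    homozygosity = sum((c / n) ** 2 for c in counts.values())  # sum of p_i^2
    return n / (n - 1) * (1 - homozygosity)
```

With four SNPs per window there are at most 16 distinct haplotypes, so the statistic ranges from 0 (one haplotype fixed) to 1 (every sampled haplotype distinct).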

PSMC Analysis

We inferred the demographic history of B. taurus and B. indicus using the Pairwise Sequentially Markovian Coalescent (PSMC) model (Li and Durbin 2011). In the default PSMC approach, a whole-genome diploid consensus sequence is generated using the alignment file from one sample. Because most of our genomes were not sequenced to a high average depth of coverage (mostly ~10×) and PSMC has high false-negative rates at low depths of coverage (i.e., <20×), leading to a systematic underestimation of true event times (Orlando et al. 2013; Nadachowska-Brzyska et al. 2016), we applied a modified PSMC approach: the SNPs of one sample were extracted from variants called on cohorts of all samples and converted to consensus sequences. This procedure was followed for samples (marked in supplementary table S1, Supplementary Material online) with relatively high sequencing depth in each breed to ensure the quality of consensus sequences. We then transformed the consensus sequence into a fasta-like format using "fq2psmcfa". The PSMC parameters were set as follows: "-p 4+25*2+4+6". The mutation rate per generation per site was estimated as $\mu = D \times g / (2T)$, where D is the observed frequency of pairwise differences between two species, T is the estimated divergence time, and g is the estimated generation time for the two species. The cattle generation time (g) was set to an estimate of 5 years, and the estimated divergence time was set to 4.9 Ma based on a previous study on cattle and yak (Qiu et al. 2012). These values yielded an estimated mutation rate of 9.796 × 10⁻⁹ mutations per generation per site. We obtained the mass accumulation rate (MAR) of Chinese loess of the past 3.6 My (Sun and An 2005), an index indicating cold and dry or warm and wet climatic periods in China (fig. 2a and supplementary figs. S13 and S14, Supplementary Material online).
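The mutation-rate arithmetic can be checked with a short worked example. The text does not report the pairwise divergence D; the value 0.0192 used here is back-calculated from the reported rate, assuming a generation time of 5 years and a divergence time of 4.9 million years, and should be read as an illustrative assumption rather than a figure from the study.

```python
def mutation_rate_per_generation(divergence: float,
                                 generation_time_years: float,
                                 divergence_time_years: float) -> float:
    """mu = D * g / (2 * T), in mutations per site per generation."""
    return divergence * generation_time_years / (2 * divergence_time_years)


# D = 0.0192 is an assumed, back-calculated value (not stated in the text).
mu = mutation_rate_per_generation(divergence=0.0192,
                                  generation_time_years=5,
                                  divergence_time_years=4.9e6)
print(f"{mu:.3e}")
```

The factor of 2 reflects that differences accumulate along both lineages since the split, and multiplying by g converts a per-year rate into a per-generation rate.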

To evaluate the differences between our revised PSMC approach and the default method, we reconstructed trajectories from two samples with different depths of coverage (SRR1262805 with 24× and SRR1262808 with 9×) of the same breed (FLV), which should yield similar inferences. The PSMC profiles retrieved from the default and revised approaches for the high-depth sample were found to be almost identical (supplementary fig. S11, Supplementary Material online) with regard to both the timing and the magnitude of demographic events, except for the most recent expansion phase, in which a lower intensity was found using the revised approach. We found that PSMC inference based on the low-depth sample showed a biased demographic model, which could be satisfactorily corrected with our revised PSMC approach. Additionally, we note that the bias observed for genomes with low depth (<20×) could also be corrected, assuming a uniform false-negative rate (uFNR), by using the option "-M" of the plotting script "psmc_plot.pl" to specify the uFNR correction rate (Orlando et al. 2013; Hung et al. 2014). The uFNR-corrected plot of a low-depth sample was similar to the high-depth PSMC inference (supplementary fig. S11, Supplementary Material online). No striking differences were observed among the PSMC profiles reconstructed from different taurine breeds with different sequencing depths of coverage (ranging from 9× to 24×; supplementary table S1, Supplementary Material online, and fig. 2a). Consequently, we found our revised approach to be a suitable method, introducing only acceptable biases, for PSMC inference from samples with low average sequencing depth.

To explore the potential impact of the reference genome on the PSMC results of indicine breeds, we mapped sequence reads of indicine samples against the assembly of B. indicus (Nelore breed, GenBank assembly accession GCF_000247795.1) and repeated the PSMC analysis (default setting with uFNR correction). Although the PSMC profiles reconstructed from different references were not identical, the qualitative results hold for indicine breeds with the B. indicus reference genome (fig. 2a and supplementary fig. S12, Supplementary Material online).

MSMC Analysis

The Multiple Sequentially Markovian Coalescent (MSMC) model (Schiffels and Durbin 2014) was used to infer changes in effective population size (Ne) and divergence time between breeds (samples marked in supplementary table S1, Supplementary Material online). MSMC is an extension of the PSMC model, which uses a hidden Markov model to scan genomes and analyze patterns of heterozygosity, with long DNA segments of low heterozygosity reflecting recent coalescent events. The rate of coalescent events is then used to estimate Ne at a given time. To scale the output of MSMC to real-time population sizes, we used the generation time and mutation rate mentioned earlier (description of PSMC analysis). We obtained atmospheric surface air temperature (SAT) and global sea level (GSL) data of the past 3 My (Bintanja and van de Wal 2008) (supplementary figs. S13 and S14, Supplementary Material online).

Selective Sweep Analysis

Considering the sample size and close genetic background of indicine cattle (NEL, BRM, and GIR), we pooled these three indicine breeds into one group (IND) in our selection analysis. Seven cattle groups (QCC, RAN, JBC, IND, HOL, FLV, and JER) with sample sizes >10 were retained for the following analysis. To identify candidate loci for breed-specific phenotypes that are known to be under positive selection, we used the d_i statistic (Akey et al. 2010) to measure the locus-specific divergences in allele frequency for each group based on unbiased estimates of pairwise FST. Briefly, for each SNP, we calculated the statistic

$d_i = \sum_{j \neq i} \frac{F_{ST}^{ij} - E[F_{ST}^{ij}]}{sd[F_{ST}^{ij}]}$,

where $E[F_{ST}^{ij}]$ and $sd[F_{ST}^{ij}]$ denote the expected value and SD of FST between groups i and j calculated from all SNPs. For each group, d_i was averaged over the SNPs in nonoverlapping 50-kb windows. Windows with SNP number <10 were removed. The top 1% of windows with the highest mean d_i score were defined as candidate selective sweep regions. Adjacent sweeps within a distance of 50 kb were merged into one sweep. Selective sweep regions were annotated with cattle QTLdb release 29 from the Animal Quantitative Trait Loci Database (Hu et al. 2016). Candidate genes under positive selection were defined as those in which more than half of the gene interval was found in selective sweep regions. Tajima's D statistic was computed by using VCFtools for each candidate gene. Gene Ontology (GO) enrichment analysis for genes in selective sweep regions was performed with a hypergeometric test using ClueGO (Bindea et al. 2009). The false discovery rate (FDR) was used to correct the P values with the Benjamini–Hochberg approach.
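The window-based d_i scan described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: per-SNP pairwise FST values are assumed to be precomputed and stored per unordered group pair, the function names are hypothetical, the sample SD is used for standardization, and the rounding rule for the top 1% of windows is a guess.

```python
import math
from collections import defaultdict
from statistics import fmean, stdev
from typing import Dict, FrozenSet, List, Sequence


def di_per_snp(fst: Dict[FrozenSet[str], Sequence[float]],
               group: str) -> List[float]:
    """Sum standardized pairwise FST over every pair that contains `group`."""
    n_snps = len(next(iter(fst.values())))
    di = [0.0] * n_snps
    for pair, values in fst.items():
        if group not in pair:
            continue
        mean, sd = fmean(values), stdev(values)  # E[FST] and sd[FST] over all SNPs
        if sd == 0:
            continue  # an uninformative pair contributes nothing
        for k, value in enumerate(values):
            di[k] += (value - mean) / sd
    return di


def window_means(positions: Sequence[int], scores: Sequence[float],
                 window: int = 50_000, min_snps: int = 10) -> Dict[int, float]:
    """Mean score per nonoverlapping window, dropping windows with < min_snps SNPs."""
    bins: Dict[int, List[float]] = defaultdict(list)
    for pos, score in zip(positions, scores):
        bins[(pos - 1) // window].append(score)
    return {idx: fmean(v) for idx, v in bins.items() if len(v) >= min_snps}


def top_windows(means: Dict[int, float], fraction: float = 0.01) -> List[int]:
    """Indices of the highest-scoring windows (assumed rounding: ceil, at least one)."""
    k = max(1, math.ceil(len(means) * fraction))
    return sorted(means, key=means.get, reverse=True)[:k]
```

Merging candidate windows that lie within 50 kb of each other and intersecting the merged regions with gene intervals would follow as separate steps.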

Supplementary Material

Supplementary data are available at Molecular Biology and Evolution online.

Author Contributions

L.-S.Z., W.-J.Z., G.C., and H.-B.W. led the experiments and designed the analytical strategy. L.-S.Z., C.-G.M., H.-C.W., W.-Q.T., L.-S.G., Y.-Y.Z., Z.-L.J., Y.-P.X., and X.-Z.S. performed animal work and prepared biological samples. C.-G.M., H.-C.W., G.C., H.-B.W., C.-P.Z., A.-N.L., W.-C.Y., C.-L.J., and S.-H.W. constructed the DNA library and performed sequencing. W.-J.Z., Q.-J.L., L.-Z.W., X.-L.W., X.-M.G., and C.-Z.W. detected, annotated, and summarized variants. W.-J.Z., Q.-J.L., C.-G.M., and H.-C.W. performed selection analysis. W.-J.Z., L.-Z.W., C.-G.M., and H.-C.W. analyzed the origin of Chinese indicine cattle and population history. C.-G.M., H.-C.W., W.-J.Z., Q.-J.L., and L.-Z.W. wrote the manuscript. L.-S.Z., S.-C.Z., J.-Z.S., G.L., X.-D.F., X.Z., S.S., H.-M.Y., J.W., and R.H. revised the manuscript. All the authors reviewed and approved the final manuscript.

Acknowledgments

We thank many people not listed as authors who provided feedback, samples, and encouragement, especially Changguo Yan, Yimin Xu, Shanzhai Liu, Guanli Wang, Xiang Gao, Jianghong Wan, and Kaixing Qu. This work was supported by the National 863 Program of China (2013AA102505), the National Science-technology Support Plan Projects (2015BAD03B04), the Program of National Beef Cattle and Yak Industrial Technology System (CARS-37), the Technical Innovation Engineering Project of Shaanxi Province (2014KTZB02-02-01), the National Beef Cattle Improvement Center, the National & Local Joint Engineering Research Center for Modern Cattle Biotechnology and Application, the Beef Cattle Engineering Technology Research Center of Shaanxi Province, and the State Key Laboratory of Agricultural Genomics.

References

Akey JM, Ruhe AL, Akey DT, Wong AK, Connelly CF, Madeoy J, Nicholas TJ, Neff MW. 2010. Tracking footprints of artificial selection in the dog genome. Proc Natl Acad Sci U S A. 107(3):1160–1165.

Alexander DH, Novembre J, Lange K. 2009. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19(9):1655–1664.

Barrett JC, Fry B, Maller J, Daly MJ. 2005. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 21(2):263–265.

Bech-Sabat G, Lopez-Gatius F, Yaniz JL, Garcia-Ispierto I, Santolaria P, Serrano B, Sulon J, de Sousa NM, Beckers JF. 2008. Factors affecting plasma progesterone in the early fetal period in high producing dairy cows. Theriogenology 69(4):426–432.

Bickhart DM, Xu L, Hutchison JL, Cole JB, Null DJ, Schroeder SG, Song J, Garcia JF, Sonstegard TS, Van Tassell CP, et al. 2016. Diversity and population-genetic properties of copy number variations and multicopy genes in cattle. DNA Res. 23(3):253–262.

Bindea G, Mlecnik B, Hackl H, Charoentong P, Tosolini M, Kirilovsky A, Fridman WH, Pages F, Trajanoski Z, Galon J. 2009. ClueGO: a Cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks. Bioinformatics 25(8):1091–1093.

Bintanja R, van de Wal RSW. 2008. North American ice-sheet dynamics and the onset of 100,000-year glacial cycles. Nature 454(7206):869–872.

Bloise E, Cassali GD, Ferreira MC, Ciarmela P, Petraglia F, Reis FM. 2010. Activin-related proteins in bovine mammary gland: localization and differential expression during gestational development and differentiation. J Dairy Sci. 93(10):4592–4601.

Brandes R, Arad R, Bar-Tana J. 1995. Inducers of adipose conversion activate transcription promoted by a peroxisome proliferator's response element in 3T3-L1 cells. Biochem Pharmacol. 50(11):1949–1951.

Browning SR, Browning BL. 2007. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am J Hum Genet. 81(5):1084–1097.

Cai D, Sun Y, Tang Z, Hu S, Li W, Zhao X, Xiang H, Zhou H. 2014. The origins of Chinese domestic cattle as revealed by ancient DNA analysis. J Archaeol Sci. 41:423–434.

Cai X, Chen H, Lei C, Wang S, Xue K, Zhang B. 2007. mtDNA diversity and genetic lineages of eighteen cattle breeds from Bos taurus and Bos indicus in China. Genetica 131(2):175–183.

Cai X, Chen H, Wang S, Xue K, Lei C. 2006. Polymorphisms of two Y chromosome microsatellites in Chinese cattle. Genet Sel Evol. 38(5):525–534.

Canavez FC, Luche DD, Stothard P, Leite KR, Sousa-Canavez JM, Plastow G, Meidanis J, Souza MA, Feijao P, Moore SS, et al. 2012. Genome sequence and assembly of Bos indicus. J Hered. 103(3):342–348.

Chen S, Wang Y, Kong X, Liu D, Cheng H, Edwards RL. 2006. A possible Younger Dryas-type event during Asian monsoonal Termination 3. Sci China D Earth Sci. 49(9):982–990.

Choi JW, Liao X, Park S, Jeon HJ, Chung WH, Stothard P, Park YS, Lee JK, Lee KT, Kim SH, et al. 2013. Massively parallel sequencing of Chikso (Korean brindle cattle) to discover genome-wide SNPs and InDels. Mol Cells 36(3):203–211.

Clark DL, Boler DD, Kutzler LW, Jones KA, McKeith FK, Killefer J, Carr TR, Dilger AC. 2011. Muscle gene expression associated with increased marbling in beef cattle. Anim Biotechnol. 22(2):51–63.

Daetwyler HD, Capitan A, Pausch H, Stothard P, van Binsbergen R, Brondum RF, Liao X, Djari A, Rodriguez SC, Grohs C, et al. 2014. Whole-genome sequencing of 234 bulls facilitates mapping of monogenic and complex traits in cattle. Nat Genet. 46(8):858–865.

Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, Handsaker RE, Lunter G, Marth GT, Sherry ST, et al. 2011. The variant call format and VCFtools. Bioinformatics 27(15):2156–2158.

Decker JE, McKay SD, Rolf MM, Kim J, Molina AA, Sonstegard TS, Hanotte O, Gotherstrom A, Seabury CM, Praharani L, et al. 2014. Worldwide patterns of ancestry, divergence, and admixture in domesticated cattle. PLoS Genet. 10(3):e1004254.

Eberlein A, Takasuga A, Setoguchi K, Pfuhl R, Flisikowski K, Fries R, Klopp N, Furbass R, Weikard R, Kuhn C. 2009. Dissection of genetic factors modulating fetal growth in cattle indicates a substantial role of the non-SMC condensin I complex, subunit G (NCAPG) gene. Genetics 183(3):951–964.

Fang X, Lu L, Yang S, Li J, An Z, Jiang PA, Chen X. 2002. Loess in Kunlun Mountains and its implications on desert development and Tibetan Plateau uplift in west China. Sci China D Earth Sci. 45(4):289–299.

Forti E, Aksanov O, Birk RZ. 2007. Temporal expression pattern of Bardet-Biedl syndrome genes in adipogenesis. Int J Biochem Cell B. 39(5):1055–1062.

Gan HY, Li JB, Wang HM, Gao YD, Liu WH, Li JP, Zhong JF. 2007. Relationship between the melanocortin receptor 1 (MC1R) gene and the coat color phenotype in cattle. Yi Chuan 29:195–200.

Gibbs RA, Taylor JF, Van Tassell CP, Barendse W, Eversole KA, Gill CA, Green RD, Hamernik DL, Kappes SM, Lien S, et al. 2009. Genome-wide survey of SNP variation uncovers the genetic structure of cattle breeds. Science 324(5926):528–532.

Gotoh T, Takahashi H, Nishimura T, Kuchida K, Mannen H. 2014. Meat produced by Japanese Black cattle and Wagyu. Anim Front. 4(4):46–54.

Grindflek E, Holzbauer R, Plastow G, Rothschild MF. 2002. Mapping and investigation of the porcine major insulin sensitive glucose transport (SLC2A4/GLUT4) gene as a candidate gene for meat quality and carcass traits. J Anim Breed Genet. 119(1):47–55.

Gutierrez-Gil B, Arranz JJ, Wiener P. 2015. An interpretive review of selective sweep studies in Bos taurus cattle populations: identification of unique and shared selection signals across breeds. Front Genet. 6:167.

Hansen PJ. 2004. Physiological and cellular adaptations of zebu cattle to thermal stress. Anim Reprod Sci. 82–83:349–360.

Hiendleder S, Lewalski H, Janke A. 2008. Complete mitochondrial genomes of Bos taurus and Bos indicus provide new insights into intra-species variation, taxonomy and domestication. Cytogenet Genome Res. 120(1–2):150–156.

Hu ZL, Park CA, Reecy JM. 2016. Developmental progress and current status of the Animal QTLdb. Nucleic Acids Res. 44(D1):D827–D833.

Hung CM, Shaner PJ, Zink RM, Liu WC, Chu TC, Huang WS, Li SH. 2014. Drastic population fluctuations explain the rapid extinction of the passenger pigeon. Proc Natl Acad Sci U S A. 111(29):10636–10641.

Kovacik A, Bulla J, Trakovicka A, Zitny J, Rafayova A. 2012. The effect of the porcine melanocortin-5 receptor (MC5R) gene associated with feed intake, carcass and physico-chemical characteristics. J Microbiol Biotechnol Food Sci. 1:498–506.

Labrecque R, Vigneault C, Blondin P, Sirard MA. 2013. Gene expression analysis of bovine oocytes with high developmental competence obtained from FSH-stimulated animals. Mol Reprod Dev. 80(6):428–440.

Lai SJ, Liu YP, Liu YX, Li XW, Yao YG. 2006. Genetic diversity and origin of Chinese cattle revealed by mtDNA D-loop sequence variation. Mol Phylogenet Evol. 38(1):146–154.

Lee KT, Chung WH, Lee SY, Choi JW, Kim J, Lim D, Lee S, Jang GW, Kim B, Choy YH, et al. 2013. Whole-genome resequencing of Hanwoo (Korean cattle) and insight into regions of homozygosity. BMC Genomics 14:519.

Lee SS, Yang BS, Yang YH, Kang SY, Ko SB, Jung JK, Oh WY, Oh SJ, Kim KI. 2002. Analysis of melanocortin receptor 1 (MC1R) genotype in Korean brindle cattle and Korean cattle with dark muzzle. J Anim Sci Technol. 44(1):23–30.

Lei C, Chen H, Hu S. 2000. Studies on Y chromosome polymorphism and the origin and classification of Chinese yellow cattle. Acta Agric Boreali-Occidentalis Sin. 9:43–47.

Lei CZ, Chen H, Zhang HC, Cai X, Liu RY, Luo LY, Wang CF, Zhang W, Ge QL, Zhang RF, et al. 2006. Origin and phylogeographical structure of Chinese cattle. Anim Genet. 37(6):579–582.

Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25(14):1754–1760.

Li H, Durbin R. 2011. Inference of human population history from individual whole-genome sequences. Nature 475(7357):493–496.

Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25(16):2078–2079.

Lin X, Luo J, Zhang L, Zhu J. 2013. MicroRNAs synergistically regulate milk fat synthesis in mammary gland epithelial cells of dairy goats. Gene Expr. 16(1):1–13.

Loftus RT, MacHugh DE, Bradley DG, Sharp PM, Cunningham P. 1994. Evidence for two independent domestications of cattle. Proc Natl Acad Sci U S A. 91(7):2757–2761.

McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, et al. 2010. The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20(9):1297–1303.

Medugorac I, Graf A, Grohs C, Rothammer S, Zagdsuren Y, Gladyr E, Zinovieva N, Barbieri J, Seichter D, Russ I, et al. 2017. Whole-genome analysis of introgressive hybridization and characterization of the bovine legacy of Mongolian yaks. Nat Genet. 49(3):470–475.

Mei C, Wang H, Zhu W, Wang H, Cheng G, Qu K, Guang X, Li A, Zhao C, Yang W, et al. 2016. Whole-genome sequencing of the endangered bovine species Gayal (Bos frontalis) provides new insights into its genetic features. Sci Rep. 6:19787.

Miller W, Schuster SC, Welch AJ, Ratan A, Bedoya-Reina OC, Zhao F, Kim HL, Burhans RC, Drautz DI, Wittekindt NE, et al. 2012. Polar and brown bear genomes reveal ancient admixture and demographic footprints of past climate change. Proc Natl Acad Sci U S A. 109(36):E2382–E2390.

Nadachowska-Brzyska K, Burri R, Smeds L, Ellegren H. 2016. PSMC analysis of effective population sizes in molecular ecology and its application to black-and-white Ficedula flycatchers. Mol Ecol. 25(5):1058–1072.

Orlando L, Ginolhac A, Zhang G, Froese D, Albrechtsen A, Stiller M, Schubert M, Cappellini E, Petersen B, Moltke I, et al. 2013. Recalibrating Equus evolution using the genome sequence of an early Middle Pleistocene horse. Nature 499(7456):74–78.

Ouali A, Herrera-Mendez CH, Coulis G, Becila S, Boudjellal A, Aubry L, Sentandreu MA. 2006. Revisiting the conversion of muscle into meat and the underlying mechanisms. Meat Sci. 74(1):44–58.

Patterson N, Price AL, Reich D. 2006. Population structure and eigenanalysis. PLoS Genet. 2(12):e190.

Porto-Neto LR, Sonstegard TS, Liu GE, Bickhart DM, Da SM, Machado MA, Utsunomiya YT, Garcia JF, Gondro C, Van Tassell CP. 2013. Genomic divergence of zebu and taurine cattle identified through high-density SNP genotyping. BMC Genomics 14(1):876.

Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, Maller J, Sklar P, de Bakker PI, Daly MJ, et al. 2007. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 81(3):559–575.

Qiu H, Ju ZY, Chang ZJ. 1993. A survey of cattle production in China. More attention to animal genetic resources. Food and Agriculture Organization of the United Nations.

Qiu Q, Wang L, Wang K, Yang Y, Ma T, Wang Z, Zhang X, Ni Z, Hou F, Long R, et al. 2015. Yak whole-genome resequencing reveals domestication signatures and prehistoric population expansions. Nat Commun. 6:10283.

Qiu Q, Zhang G, Ma T, Qian W, Wang J, Ye Z, Cao C, Hu Q, Kim J, Larkin DM, et al. 2012. The yak genome and adaptation to life at high altitude. Nat Genet. 44(8):946–949.

Randhawa IAS, Khatkar MS, Thomson PC, Raadsma HW, Barendse W. 2016. A meta-assembly of selection signatures in cattle. PLoS One 11(4):e153013.

Rincon G, Islas-Trejo A, Casellas J, Ronin Y, Soller M, Lipkin E, Medrano JF. 2009. Fine mapping and association analysis of a quantitative trait locus for milk production traits on Bos taurus autosome 4. J Dairy Sci. 92(2):758–764.

Ruetz TJ, Lin AE, Guttman JA. 2012. Enterohaemorrhagic Escherichia coli requires the spectrin cytoskeleton for efficient attachment and pedestal formation on host cells. Microb Pathog. 52(3):149–156.

Sartori R, Bastos MR, Baruselli PS, Gimenes LU, Ereno RL, Barros CM. 2010. Physiological differences and implications to reproductive management of Bos taurus and Bos indicus cattle in a tropical environment. Reprod Domest Rumin Vii. 7(1):357–375.

Sattler K, Levkau B. 2009. Sphingosine-1-phosphate as a mediator of high-density lipoprotein effects in cardiovascular protection. Cardiovasc Res. 82(2):201–211.

Scherf BD, Pilling D. 2015. The second report on the state of the world's animal genetic resources for food and agriculture. Food and Agriculture Organization of the United Nations.

Schiffels S, Durbin R. 2014. Inferring human population size and separation history from multiple genome sequences. Nat Genet. 46(8):919–925.

Setoguchi K, Watanabe T, Weikard R, Albrecht E, Kuhn C, Kinoshita A, Sugimoto Y, Takasuga A. 2011. The SNP c.1326T>G in the non-SMC condensin I complex, subunit G (NCAPG) gene encoding a p.Ile442Met variant is associated with an increase in body frame size at puberty in cattle. Anim Genet. 42(6):650–655.

Sherratt A. 1983. The secondary exploitation of animals in the Old World. World Archaeol. 15(1):90–104.

Smith TP, Grosse WM, Freking BA, Roberts AJ, Stone RT, Casas E, Wray JE, White J, Cho J, Fahrenkrug SC, et al. 2001. Sequence evaluation of four pooled-tissue normalized bovine cDNA libraries and construction of a gene index for cattle. Genome Res. 11(4):626–630.

Sorbolini S, Marras G, Gaspa G, Dimauro C, Cellesi M, Valentini A, Macciotta NP. 2015. Detection of selection signatures in Piemontese and Marchigiana cattle, two breeds with similar production aptitudes but different selection histories. Genet Sel Evol. 47:52.

Stothard P, Choi JW, Basu U, Sumner-Thomson JM, Meng Y, Liao X, Moore SS. 2011. Whole genome resequencing of black Angus and Holstein cattle for SNP and CNV discovery. BMC Genomics 12:559.

Sun YB, An ZS. 2005. Late Pliocene-Pleistocene changes in mass accumulation rates of eolian deposits on the central Chinese Loess Plateau. J Geophys Res-Atmos. 110(D23):1193–1194.

Svizzero S, Tisdell C. 2016. Input shortages and the lack of sustainability of bronze production by the Unetice. In: Working Papers on Economics, Ecology and the Environment. No. 202. Queensland (Australia): University of Queensland.

Switonski M, Mankowska M, Salamon S. 2013. Family of melanocortin receptor (MCR) genes in mammals-mutations, polymorphisms and phenotypic effects. J Appl Genet. 54(4):461–472.

Van Vuure T. 2002. History, morphology and ecology of the aurochs (Bos primigenius). Available from: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.534.6285

Wang K, Li M, Hakonarson H. 2010. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38(16):e164.

Wang M, Ding Y. 1996. The importance of work animals in rural China. World Anim Rev. 86:65–67.

Wedholm A, Larsen LB, Lindmark-Mansson H, Karlsson AH, Andren A. 2006. Effect of protein composition on the cheese-making properties of milk from individual dairy cows. J Dairy Sci. 89(9):3296–3305.

Weir BS, Cockerham CC. 1984. Estimating F-statistics for the analysis of population structure. Evolution 38(6):1358–1370.

Xu L, Bickhart DM, Cole JB, Schroeder SG, Song J, Tassell CP, Sonstegard TS, Liu GE. 2015. Genomic signatures reveal new evidences for selection of important traits in domestic cattle. Mol Biol Evol. 32(3):711–725.

Xu Y, Yu W, Xiong Y, Xie H, Ren Z, Xu D, Lei M, Zuo B, Feng X. 2011. Molecular characterization and expression patterns of serine/arginine-rich specific kinase 3 (SPRK3) in porcine skeletal muscle. Mol Biol Rep. 38(5):2903–2909.

Yahvah KM, Brooker SL, Williams JE, Settles M, Mcguire MA, Mcguire MK. 2015. Elevated dairy fat intake in lactating women alters milk lipid and fatty acids without detectible changes in expression of genes related to lipid uptake or synthesis. Nutr Res. 35(3):221.

Yoon D, Ko E. 2016. Association study between SNPs of the genes within bovine QTLs and meat quality of Hanwoo. J Anim Sci. 94(Suppl 4):145.

Yu Y, Nie L, He ZQ, Wen JK, Jian CS, Zhang YP. 1999. Mitochondrial DNA variation in cattle of south China: origin and introgression. Anim Genet. 30(4):245–250.

Zhang H, Paijmans JL, Chang F, Wu X, Chen G, Lei C, Yang X, Wei Z, Bradley DG, Orlando L, et al. 2013. Morphological and genetic evidence for early Holocene cattle management in northeastern China. Nat Commun. 4:2755.

Zhang H, Wang S, Wang Z, Da Y, Wang N, Hu X, Zhang Y, Wang Y, Leng L, Tang Z, et al. 2012. A genome-wide scan of selective sweeps in two broiler chicken lines divergently selected for abdominal fat content. BMC Genomics 13:704.

Zhang Z, Wang Z, Yang Y, Zhao J, Chen Q, Liao R, Chen Z, Zhang X, Xue M, Yang H, et al. 2016. Identification of pleiotropic genes and gene sets underlying growth and immunity traits: a case study on Meishan pigs. Animal 10(4):550–557.

Zhao C, Tian F, Yu Y, Luo J, Hu Q, Bequette BJ, Baldwin VR, Liu G, Zan L, Scott UM, et al. 2012. Muscle transcriptomic analyses in Angus cattle with divergent tenderness. Mol Biol Rep. 39(4):4185–4193.

Zhao S, Zheng P, Dong S, Zhan X, Wu Q, Guo X, Hu Y, He W, Zhang S, Fan W, et al. 2013. Whole-genome sequencing of giant pandas provides insights into demographic history and local adaptation. Nat Genet. 45(1):67–71.

Zheng B, Xu Q, Shen Y. 2002. The relationship between climate change and Quaternary glacial cycles on the Qinghai–Tibetan Plateau: review and speculation. Quatern Int. 97–98:93–101.

Zhou X, Wang B, Pan Q, Zhang J, Kumar S, Sun X, Liu Z, Pan H, Lin Y, Liu G, et al. 2014. Whole-genome sequencing of the snub-nosed monkey provides insights into folivory and evolutionary history. Nat Genet. 46(12):1303–1310.

Zimin AV, Delcher AL, Florea L, Kelley DR, Schatz MC, Puiu D, Hanrahan F, Pertea G, Van Tassell CP, Sonstegard TS, et al. 2009. A whole-genome assembly of the domestic cow, Bos taurus. Genome Biol. 10(4):R42.


between eumelanin and pheomelanin biosynthesis pathways in mammals, including cattle. The selection signals of IND in MC1R represent the significant role of light coloration (light gray to white in BRM and NEL; yellowish-red to white in GIR) associated with the adaptation of IND to its tropical environment.

Some of the strongest signals of selection appeared in various types of genes related to production traits (table 1). For example, several genes involved in milk production showed clear evidence of positive selection in dairy cattle (NCAPG in both FLV and JER; MAPK7 in FLV; FST, ITFG1, SETMAR, and PAG1 in HOL; CSN3 and RPL37A in JER). Various genes involved in meat traits have also been targets of recent positive selection (table 1). Some genes related to skeletal muscle development and muscle fibre type appeared to be targets of positive selection, including CASP9, DIO1,

FIG. 3. Candidate positively selected genes. (a) Shared candidate positively selected genes among groups. Only partial numbers are shown. (b) An example of a selective sweep at the BBS2 gene in JBC; only positive values of $d_i$ are shown (top). The Tajima's D values in each group are shown (middle). SNPs with minor allele frequencies > 0.05 are used to construct haplotype patterns (bottom). The major alleles in JBC are green and the minor alleles in JBC are yellow.

Table 1. Summary of Partial Traits Associated with Positively Selected Genes.

| Gene | Breed | Trait | Reference |
|---|---|---|---|
| MC1R | IND | CC | (Lee et al. 2002; Gan et al. 2007) |
| MAP2K1^a | JBC | CC | (Gutierrez-Gil et al. 2015) |
| ZBTB17 | QCC | CC | (Gutierrez-Gil et al. 2015) |
| ERCC2^a | QCC, IND | CC | (Gutierrez-Gil et al. 2015) |
| FST, ITFG1, SETMAR, PAG1 | HOL | DY | (Bech-Sabat et al. 2008; Rincon et al. 2009; Bloise et al. 2010; Xu et al. 2015) |
| RPL37A^a, CSN3 | JER | DY | (Wedholm et al. 2006; Yahvah et al. 2015) |
| MAPK7^a | FLV | DY | (Lin et al. 2013) |
| NCAPG | FLV, JER | DY | (Eberlein et al. 2009; Setoguchi et al. 2011) |
| BBS2^a, S1PR3^a, LRP2BP^a, IGFBP2, IGFBP5, MYH9, ASGR1 | JBC | MT, GT | (Forti et al. 2007; Sattler and Levkau 2009; Zhang et al. 2012; Lee et al. 2013; Sorbolini et al. 2015; Yoon and Ko 2016) |
| SRPK3^a, POLDIP2^a, SLC2A5, TMEM97, MYH4 | RAN | MT, GT | (Smith et al. 2001; Clark et al. 2011; Xu et al. 2011; Zhao et al. 2012; Lee et al. 2013; Zhang et al. 2016) |
| CASP9^a, DIO1, SREBF2, PLOD3 | QCC | MT, GT | (Ouali et al. 2006; Lee et al. 2013) |
| SLC2A4^a, OSTN, CPT2, CSF2RB | FLV | MT, GT | (Grindflek et al. 2002; Lee et al. 2013; Xu et al. 2015) |
| MC5R^a | IND | MT, GT | (Kovacik et al. 2012; Switonski et al. 2013) |
| AOX1^a | QCC, RAN, FLV, JER, and IND | MT, GT | (Brandes et al. 1995) |
| R3HDM1 | QCC, FLV, HOL, JER, and IND | MT, FC | (Gibbs et al. 2009) |

CC, coat color; MT, meat traits; GT, growth traits; DY, dairy traits; FC, food conversion efficiency.
^a Newly identified genes associated with phenotypic features of cattle.


SREBF2, and PLOD3 in QCC; ASGR1, IGFBP2, IGFBP5, and MYH9 in JBC; OSTN, CPT2, CSF2RB, and SLC2A4 in FLV; and SLC2A5, TMEM97, MYH4, SRPK3, and POLDIP2 in RAN. We also found that a set of important genes associated with lipid metabolism were putatively positively selected (AOX1 in QCC, RAN, FLV, JER, and IND; MC5R in IND; BBS2, S1PR3, and LRP2BP in JBC). Interestingly, we identified a missense mutation in BBS2 (exon 15, rs135889003, c.A1880G, p.Q627R) that was almost fixed (allele frequency > 0.95) in JBC, a breed known for producing the intensely marbled Wagyu beef (with >30% intramuscular fat) (Gotoh et al. 2014). BBS2 is a member of the Bardet–Biedl syndrome gene family, the primary clinical feature of which is obesity, and has been found to play a significant role in adipogenesis (Forti et al. 2007). The positive selection signals near the BBS2 region are further supported by significantly lower values of Tajima's D and the long haplotype patterns in JBC (fig. 3b); this region may be useful as a genetic target in breeding for improved beef marbling. In addition, R3HDM1, a gene associated with efficient food conversion and intramuscular fat content, showed signals of positive selection in five groups (QCC, FLV, HOL, JER, and IND). These genes might be associated with the tenderness and quality of meat in cattle.

Conclusion

Whole-genome sequencing of representative Chinese cattle breeds and two additional breeds (JBC and RAN) generated a comprehensive catalogue of genetic variations. This is the first population genomic study on Chinese cattle to use next-generation whole-genome sequencing data and is an important source of genetic information for cattle worldwide. Bovine haplotypes have been inferred in Mongolian yaks, with recent admixture at least 1,500 years ago (Medugorac et al. 2017). It is highly possible that there was recent introgression from yak (B. grunniens) to Chinese cattle, as suggested by previous studies (Lei et al. 2000; Cai et al. 2007, 2014). However, the genetic influence of yak was too limited to be detected in the representative cattle breeds examined in our study. We also discovered many potential selective sweeps associated with domestication and breed-specific characteristics; the sweep regions include genes associated with coat color, dairy traits, and meat production/quality traits. Collectively, these findings substantially expand the catalogue of genetic variants in cattle and reveal new insights into the evolutionary history and domestication traits of Chinese cattle.

Materials and Methods

Sample Collection and Sequencing

To represent the overall genetic diversity of Chinese cattle, we selected 46 samples from six representative Chinese cattle breeds with divergent phenotypic characters across the main geographic distribution: Qinchuan cattle (QCC, n = 37), Nanyang cattle (NYC, n = 2), Luxi cattle (LXC, n = 1), Yanbian cattle (YBC, n = 2), Yunnan cattle (YNC, n = 2), and Leiqiong cattle (LQC, n = 2). For comparison, samples from two specialized beef cattle breeds, Red Angus (RAN, n = 18) and JBC (n = 11), were also collected (supplementary table S2, Supplementary Material online). Total genomic DNA was extracted from the blood samples of the animals using a standard phenol–chloroform protocol. For each individual, at least 5 μg of genomic DNA was used to construct paired-end libraries with an insert size of 500 bp according to Illumina's library preparation protocol. Moreover, we collected 76 genome sequences from previous studies for the breeds Brahman (BRM, indicine, n = 6), Nelore (NEL, indicine, n = 5), Gir (GIR, indicine, n = 4), Limousin (LIM, taurine, n = 6), Jersey (JER, taurine, n = 18), Fleckvieh (FLV, taurine, n = 19), and Holstein (HOL, taurine, n = 18) (details in supplementary tables S1 and S2, Supplementary Material online).

Alignments and Variant Identification

Paired-end reads (100 bp) obtained from sequencing in the present study and previous studies were mapped to the B. taurus genome (UMD3.1) (Zimin et al. 2009) using BWA (Li and Durbin 2009) with the default parameters. Sequence Alignment/Map (SAM) format files were imported into SAMtools (Li et al. 2009) for sorting and merging and into Picard (http://broadinstitute.github.io/picard, version 1.92) to remove duplicated reads. To identify the ancestral state of cattle, we mapped the raw reads of yak (Qiu et al. 2012), sequenced to 65×, to the reference genome.

Initial variant site identification was performed using SAMtools mpileup and GATK UnifiedGenotyper (Genome Analysis Toolkit, version 2.4-9) (McKenna et al. 2010) with the default settings. The overlapping subset of 53,979,675 single-nucleotide polymorphisms (SNPs) and 5,924,578 small insertions and deletions (InDels; 91% of InDels were 1–30 bp in length, and the largest InDel was 403 bp in length) was defined as a high-confidence catalogue and used for base quality recalibration with GATK, using the default set of covariates. The resulting recalibrated BAM files were then used as input for a second round of variant calling with GATK. The resulting variant calls were analyzed, and approximately the highest-scoring 10% of the predicted variant sites were used as a training set for variant quality recalibration and filtering with GATK. These steps resulted in 60,031,459 SNPs and 5,603,383 InDels. To obtain high-quality results for further analyses, we retained only biallelic SNPs and InDels with >90% calling rates, resulting in 57,220,105 SNPs and 5,270,518 InDels. Beagle (Browning and Browning 2007), which has been shown to yield highly accurate solutions, was used to improve the genotype calls using genotype likelihoods from GATK and to infer the haplotypes in the sample. Short InDels were excluded from the diversity and divergence estimates and from the other analyses. Variants (SNPs and InDels) were annotated using ANNOVAR (Wang et al. 2010).
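The final filtering step (biallelic sites with a calling rate above 90%) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `call_rate` and `keep_site` and the genotype encoding (alternate-allele counts, with `None` for a missing call) are assumptions made only for illustration.

```python
from typing import Optional, Sequence

# Hypothetical encoding: number of alternate alleles (0, 1, 2), or None if uncalled.
Genotype = Optional[int]


def call_rate(genotypes: Sequence[Genotype]) -> float:
    """Fraction of samples with a non-missing genotype at one site."""
    if not genotypes:
        return 0.0
    called = sum(g is not None for g in genotypes)
    return called / len(genotypes)


def keep_site(n_alt_alleles: int, genotypes: Sequence[Genotype],
              min_call_rate: float = 0.90) -> bool:
    """Keep a site only if it is biallelic and its call rate exceeds the cutoff."""
    return n_alt_alleles == 1 and call_rate(genotypes) > min_call_rate
```

With ten samples, a site with one missing call has a call rate of exactly 0.9 and is dropped under the strict ">90%" reading assumed here.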

Phylogenetic and Population Structure Analyses

A phylogenetic tree was constructed from the SNP data using the neighbor-joining method in the program PHYLIP v3.695 (http://evolution.genetics.washington.edu/phylip.html), and distance matrices were calculated using PLINK (Purcell et al. 2007). The ancestral states of the SNPs were determined using a close relative of cattle, B. grunniens, as the outgroup. Population structure was further inferred using ADMIXTURE (Alexander et al. 2009), with the number of assumed ancestral populations (K) set from 2 to 7. Principal component analysis was carried out using the smartPCA program of the EIGENSOFT package (Patterson et al. 2006).
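The text does not state which distance PLINK computed. As a hedged illustration only, the sketch below assumes an identity-by-state (IBS) based distance, one minus the mean proportion of shared alleles, which is a common input for neighbor-joining trees. The function names and the genotype encoding (alternate-allele counts, `None` for missing) are hypothetical.

```python
from typing import Dict, Optional, Sequence

Genotype = Optional[int]  # alternate-allele count (0, 1, 2) or None if missing


def ibs_distance(g1: Sequence[Genotype], g2: Sequence[Genotype]) -> float:
    """1 - IBS, averaged over sites where both individuals are called."""
    shared = [(a, b) for a, b in zip(g1, g2) if a is not None and b is not None]
    if not shared:
        raise ValueError("no jointly called sites")
    # Each site contributes 0, 1, or 2 mismatching alleles out of 2.
    return sum(abs(a - b) for a, b in shared) / (2 * len(shared))


def distance_matrix(genotypes: Dict[str, Sequence[Genotype]]) -> Dict[str, Dict[str, float]]:
    """Symmetric matrix of pairwise distances keyed by sample name."""
    names = sorted(genotypes)
    return {a: {b: ibs_distance(genotypes[a], genotypes[b]) for b in names}
            for a in names}
```

The resulting matrix would then be written in the format expected by the tree-building program; that step is omitted here.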

Genome-Wide Patterns of Genetic Diversity and Divergence

The average pairwise nucleotide diversity (θπ) and Tajima's D statistic of each breed were calculated using a sliding window approach (50-kb windows sliding in 10-kb steps) with the default parameters of VCFtools (Danecek et al. 2011). Population differentiation was measured by pairwise FST using the unbiased estimator of Weir and Cockerham (1984) with the default parameters.
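The sliding-window diversity calculation can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the VCFtools code: per-site diversity is taken as the fraction of chromosome pairs that differ at a biallelic site, and the window value is the sum over sites divided by the window length. The names `site_pi`, `sliding_windows`, and `window_pi` are hypothetical.

```python
from typing import Dict, Iterator, Tuple


def site_pi(alt_count: int, n_chrom: int) -> float:
    """Proportion of chromosome pairs that differ at one biallelic site."""
    pairs = n_chrom * (n_chrom - 1) / 2
    return alt_count * (n_chrom - alt_count) / pairs


def sliding_windows(chrom_length: int, size: int = 50_000,
                    step: int = 10_000) -> Iterator[Tuple[int, int]]:
    """Yield half-open [start, end) windows of `size` bp every `step` bp."""
    start = 0
    while start < chrom_length:
        yield start, min(start + size, chrom_length)
        start += step


def window_pi(sites: Dict[int, int], start: int, end: int, n_chrom: int) -> float:
    """Mean per-bp diversity in [start, end); `sites` maps position -> alt count."""
    total = sum(site_pi(c, n_chrom) for pos, c in sites.items() if start <= pos < end)
    return total / (end - start)
```

For example, a site carried by two of four chromosomes differs in four of the six possible pairs, giving a per-site value of 2/3.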

Linkage Disequilibrium

To estimate the genome-wide LD of each breed, we calculated the mean $r^2$ values for pairwise markers with Haploview (Barrett et al. 2005). Only SNPs with a minor allele frequency > 0.05 in three groups (Chinese cattle, indicine, and taurine) were used. The Haploview parameters were set to "-maxdistance 200 -dprime -memory 5000 -minGeno 0.6 -minMAF 0.05 -hwcutoff 0.001". To minimize the influence of sample size, only breeds with at least five individuals were used, and breeds with more than five samples were downsampled to five.
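For readers unfamiliar with the statistic, the following sketch computes $r^2$ for one pair of biallelic SNPs. It assumes phased haplotypes coded 0/1, which is a simplification; Haploview itself works from genotype data, so this is not a reimplementation of that program. The function name `r_squared` is hypothetical.

```python
from typing import Sequence


def r_squared(hap_a: Sequence[int], hap_b: Sequence[int]) -> float:
    """r^2 = D^2 / (pA(1-pA) pB(1-pB)) from phased 0/1 haplotypes at two SNPs."""
    n = len(hap_a)
    if n == 0 or n != len(hap_b):
        raise ValueError("haplotype lists must be non-empty and of equal length")
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        # r^2 is undefined when either SNP is monomorphic; 0.0 is returned here
        # purely as a convention of this sketch.
        return 0.0
    d = p_ab - p_a * p_b
    return d * d / denom
```

Five diploid individuals, as in the downsampled breeds, would contribute ten haplotypes per SNP to such a calculation.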

Haplotype Diversity

For the haplotype diversity analysis, the same breeds and SNP set were used as in the linkage disequilibrium analysis. To calculate haplotype diversity, the genome was divided into 5- to 500-kb bins (detailed in supplementary table S6, Supplementary Material online). Windows with fewer than two SNPs per 5 kb were removed; for windows with more than four SNPs, four SNPs were randomly selected. Considering the substantial variation in the recombination rate across the cattle genome, we adopted a sliding-window strategy and allowed the window to slide by half its length each time. The frequencies of haplotypes were counted, and haplotype diversity (H) was calculated as described previously (Daetwyler et al. 2014).
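The exact formula from Daetwyler et al. (2014) is not reproduced in the text. The sketch below therefore assumes the widely used estimator $H = \frac{n}{n-1}\left(1 - \sum_k p_k^2\right)$, where $p_k$ is the frequency of haplotype $k$ among $n$ haplotypes; it may differ from the definition the authors used. It also shows windows that slide by half their length. Both function names are hypothetical.

```python
from collections import Counter
from typing import Iterator, Sequence, Tuple


def haplotype_diversity(haplotypes: Sequence[str]) -> float:
    """Assumed estimator: H = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two haplotypes are required")
    freqs = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))


def half_overlap_windows(length: int, size: int) -> Iterator[Tuple[int, int]]:
    """Yield [start, end) windows of `size` bp that slide by half their length."""
    step = max(1, size // 2)
    start = 0
    while start + size <= length:
        yield start, start + size
        start += step
```

Each haplotype is represented here as a string over the (up to four) SNPs retained in a window.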

PSMC Analysis

We inferred the demographic history of B. taurus and B. indicus using the Pairwise Sequentially Markovian Coalescent (PSMC) model (Li and Durbin 2011). In the default PSMC approach, a whole-genome diploid consensus sequence is generated using the alignment file from one sample. Because most of our genomes were not sequenced to a high average depth of coverage (mostly ~10×), and because PSMC has high false-negative rates at low depths of coverage (i.e., <20×), leading to a systematic underestimation of true event times (Orlando et al. 2013; Nadachowska-Brzyska et al. 2016), we applied a modified PSMC approach: the SNPs of one sample were extracted from variants called on the cohort of all samples and converted to consensus sequences. This procedure was followed for samples (marked in supplementary table S1, Supplementary Material online) with relatively high sequencing depth in each breed to ensure the quality of the consensus sequences. We then transformed the consensus sequence into a fasta-like format using "fq2psmcfa". The PSMC parameters were set as follows: "-p 4+25*2+4+6". The mutation rate per generation per site was estimated as $\mu = D \times g / (2T)$, where D is the observed frequency of pairwise differences between two species, T is the estimated divergence time, and g is the estimated generation time for the two species. The cattle generation time (g) was set to an estimate of 5 years, and the estimated divergence time was set to 4.9 Ma based on a previous study on cattle and yak (Qiu et al. 2012). These values yielded an estimated mutation rate of $9.796 \times 10^{-9}$ mutations per generation per site. We obtained the mass accumulation rate (MAR) of Chinese loess over the past 3.6 My (Sun and An 2005), an index indicating cold and dry or warm and wet climatic periods in China (fig. 2a and supplementary figs. S13 and S14, Supplementary Material online).
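The mutation-rate formula can be checked with a few lines of arithmetic. The sketch below reads the reported figures as a divergence time of 4.9 Ma and a rate of 9.796 × 10⁻⁹ per site per generation (the decimal placement is an assumption of this sketch), and inverts the formula to show the pairwise divergence D that these values imply. The function and constant names are hypothetical.

```python
def mutation_rate(d: float, g_years: float, t_years: float) -> float:
    """mu = D * g / (2T): per-site, per-generation mutation rate."""
    return d * g_years / (2.0 * t_years)


def implied_divergence(mu: float, g_years: float, t_years: float) -> float:
    """Invert the formula: D = mu * 2T / g."""
    return mu * 2.0 * t_years / g_years


G_YEARS = 5.0    # generation time stated in the text
T_YEARS = 4.9e6  # cattle-yak divergence time, read as 4.9 Ma (assumption)
MU = 9.796e-9    # reported rate, read as 9.796e-9 (assumption)

# Pairwise divergence implied by the three values above.
D_IMPLIED = implied_divergence(MU, G_YEARS, T_YEARS)
```

Under this reading, the implied cattle-yak pairwise divergence is about 0.0192 differences per site, that is, roughly 1.9%.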

To evaluate the differences between our revised PSMC approach and the default method, we reconstructed trajectories from two samples of the same breed (FLV) with different depths of coverage (SRR1262805 with 24× and SRR1262808 with 9×), which should yield similar inferences. The PSMC profiles retrieved from the default and revised approaches for the high-depth sample were almost identical (supplementary fig. S11, Supplementary Material online) with regard to both the timing and the magnitude of demographic events, except for the most recent expansion phase, in which a lower intensity was found using the revised approach. We found that PSMC inference based on the low-depth sample showed a biased demographic model that could be satisfactorily corrected with our revised PSMC approach. Additionally, we note that the bias observed for genomes with low depth (<20×) could also be corrected by assuming a uniform false-negative rate (uFNR), using the option "-M" of the plotting script "psmc_plot.pl" to specify the uFNR correction rate (Orlando et al. 2013; Hung et al. 2014). With the uFNR correction, the plot for the low-depth sample was similar to the high-depth PSMC inference (supplementary fig. S11, Supplementary Material online). No striking differences were observed among the PSMC profiles reconstructed from different taurine breeds with different sequencing depths of coverage (ranging from 9× to 24×; supplementary table S1, Supplementary Material online, and fig. 2a). Consequently, we consider our revised approach a suitable method, introducing only acceptable new biases, for PSMC inference from samples with low average sequencing depth.

To explore the potential impact of the reference genome on the PSMC results for indicine breeds, we mapped sequence reads of indicine samples against the assembly of B. indicus (Nelore breed; GenBank assembly accession GCF_000247795.1) and repeated the PSMC analysis (default setting with uFNR correction). Although the PSMC profiles reconstructed from the different references were not identical, the qualitative results held for indicine breeds with the B. indicus reference genome (fig. 2a and supplementary fig. S12, Supplementary Material online).

MSMC Analysis

The Multiple Sequentially Markovian Coalescent (MSMC) model (Schiffels and Durbin 2014) was used to infer changes in effective population size (Ne) and divergence time between breeds (samples marked in supplementary table S1, Supplementary Material online). MSMC is an extension of the PSMC model, which uses a hidden Markov model to scan genomes and analyze patterns of heterozygosity, with long DNA segments of low heterozygosity reflecting recent coalescent events. The rate of coalescent events is then used to estimate Ne at a given time. To scale the output of MSMC to real time and population sizes, we used the generation time and mutation rate given above (see PSMC Analysis). We obtained atmospheric surface air temperature (SAT) and global sea level (GSL) data for the past 3 My (Bintanja and van de Wal 2008) (supplementary figs. S13 and S14, Supplementary Material online).
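The scaling step can be sketched as follows, assuming the conventional conversion for MSMC output: scaled time divided by the mutation rate gives generations, and the inverse of the scaled coalescence rate divided by twice the mutation rate gives Ne. The default values are the generation time and mutation rate from the PSMC section, with the rate read as 9.796 × 10⁻⁹ (an assumption about decimal placement). The function names are hypothetical, and the authors' actual scripts are not shown in the text.

```python
def scaled_time_to_years(scaled_time: float, mu: float = 9.796e-9,
                         gen_years: float = 5.0) -> float:
    """Convert a mutation-scaled time boundary to years."""
    generations = scaled_time / mu
    return generations * gen_years


def scaled_rate_to_ne(coalescence_rate: float, mu: float = 9.796e-9) -> float:
    """Convert a scaled coalescence rate (lambda) to effective population size."""
    return (1.0 / coalescence_rate) / (2.0 * mu)
```

For example, with a round mutation rate of 1e-8 and a 5-year generation time, a scaled time of 1e-5 corresponds to 5,000 years, and a scaled coalescence rate of 5,000 corresponds to an Ne of 10,000.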

Selective Sweep Analysis

Considering the sample size and close genetic background of indicine cattle (NEL, BRM, and GIR), we pooled these three indicine breeds into one group (IND) in our selection analysis. Seven cattle groups (QCC, RAN, JBC, IND, HOL, FLV, and JER) with sample sizes >10 were retained for the following analysis. To identify candidate positively selected loci underlying breed-specific phenotypes, we used the $d_i$ statistic (Akey et al. 2010) to measure the locus-specific divergence in allele frequency for each group based on unbiased estimates of pairwise FST. Briefly, for each SNP, we calculated the statistic

$$d_i = \sum_{j \neq i} \frac{F_{ST}^{ij} - E[F_{ST}^{ij}]}{sd[F_{ST}^{ij}]}$$

where $E[F_{ST}^{ij}]$ and $sd[F_{ST}^{ij}]$ denote the expected value and SD of FST between groups i and j calculated from all SNPs. For each group, $d_i$ was averaged over the SNPs in nonoverlapping 50-kb windows. Windows with fewer than 10 SNPs were removed. The top 1% of windows with the highest mean $d_i$ scores were defined as candidate selective sweep regions. Adjacent sweeps within a distance of 50 kb were merged into one sweep. Selective sweep regions were annotated with cattle QTLdb release 29 from the Animal Quantitative Trait Loci Database (Hu et al. 2016). Candidate genes under positive selection were defined as those in which more than half of the gene interval was found in selective sweep regions. Tajima's D statistic was computed using VCFtools for each candidate gene. Gene Ontology (GO) enrichment analysis for genes in selective sweep regions was performed with a hypergeometric test using ClueGO (Bindea et al. 2009). The false discovery rate (FDR) was used to correct the P values with the Benjamini–Hochberg approach.
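The $d_i$ computation and window selection described above can be sketched in plain Python. This is an illustration under assumptions, not the authors' code: per-SNP FST values are taken as given, the standard deviation is the population SD over all SNPs, and the function names are hypothetical. Merging of adjacent sweeps and gene annotation are omitted.

```python
from statistics import mean, pstdev
from typing import Dict, List, Sequence


def di_scores(fst_by_pair: Dict[str, Sequence[float]]) -> List[float]:
    """Per-SNP d_i for a focal group i.

    `fst_by_pair` maps each other group j to the per-SNP Fst values between
    i and j (same SNP order in every list).
    """
    n_snps = len(next(iter(fst_by_pair.values())))
    scores = [0.0] * n_snps
    for values in fst_by_pair.values():
        mu, sd = mean(values), pstdev(values)
        if sd == 0:
            continue  # a constant comparison carries no standardized signal
        for k, v in enumerate(values):
            scores[k] += (v - mu) / sd
    return scores


def window_means(positions: Sequence[int], scores: Sequence[float],
                 size: int = 50_000, min_snps: int = 10) -> Dict[int, float]:
    """Mean d_i in nonoverlapping windows, dropping windows with too few SNPs."""
    bins: Dict[int, List[float]] = {}
    for pos, s in zip(positions, scores):
        bins.setdefault(pos // size, []).append(s)
    return {idx: mean(v) for idx, v in bins.items() if len(v) >= min_snps}


def top_windows(win_means: Dict[int, float], fraction: float = 0.01) -> List[int]:
    """Indices of the highest-scoring windows (top `fraction`, at least one)."""
    n_top = max(1, int(len(win_means) * fraction))
    ranked = sorted(win_means, key=win_means.get, reverse=True)
    return sorted(ranked[:n_top])
```

A window index `k` returned by `top_windows` corresponds to the interval from `k * size` to `(k + 1) * size` on the chromosome being analyzed.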

Supplementary Material

Supplementary data are available at Molecular Biology and Evolution online.

Author Contributions

L.-S.Z., W.-J.Z., G.C., and H.-B.W. led the experiments and designed the analytical strategy. L.-S.Z., C.-G.M., H.-C.W., W.-Q.T., L.-S.G., Y.-Y.Z., Z.-L.J., Y.-P.X., and X.-Z.S. performed animal work and prepared biological samples. C.-G.M., H.-C.W., G.C., H.-B.W., C.-P.Z., A.-N.L., W.-C.Y., C.-L.J., and S.-H.W. constructed the DNA library and performed sequencing. W.-J.Z., Q.-J.L., L.-Z.W., X.-L.W., X.-M.G., and C.-Z.W. detected, annotated, and summarized variants. W.-J.Z., Q.-J.L., C.-G.M., and H.-C.W. performed selection analysis. W.-J.Z., L.-Z.W., C.-G.M., and H.-C.W. analyzed the origin of Chinese indicine cattle and population history. C.-G.M., H.-C.W., W.-J.Z., Q.-J.L., and L.-Z.W. wrote the manuscript. L.-S.Z., S.-C.Z., J.-Z.S., G.L., X.-D.F., X.Z., S.S., H.-M.Y., J.W., and R.H. revised the manuscript. All the authors reviewed and approved the final manuscript.

Acknowledgments

We thank the many people not listed as authors who provided feedback, samples, and encouragement, especially Changguo Yan, Yimin Xu, Shanzhai Liu, Guanli Wang, Xiang Gao, Jianghong Wan, and Kaixing Qu. This work was supported by the National 863 Program of China (2013AA102505), the National Science-technology Support Plan Projects (2015BAD03B04), the Program of National Beef Cattle and Yak Industrial Technology System (CARS-37), the Technical Innovation Engineering Project of Shaanxi Province (2014KTZB02-02-01), the National Beef Cattle Improvement Center, the National & Local Joint Engineering Research Center for Modern Cattle Biotechnology and Application, the Beef Cattle Engineering Technology Research Center of Shaanxi Province, and the State Key Laboratory of Agricultural Genomics.

References

Akey JM, Ruhe AL, Akey DT, Wong AK, Connelly CF, Madeoy J, Nicholas TJ, Neff MW. 2010. Tracking footprints of artificial selection in the dog genome. Proc Natl Acad Sci U S A 107(3):1160–1165.

Alexander DH, Novembre J, Lange K. 2009. Fast model-based estimation of ancestry in unrelated individuals. Genome Res 19(9):1655–1664.

Barrett JC, Fry B, Maller J, Daly MJ. 2005. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 21(2):263–265.

Bech-Sabat G, Lopez-Gatius F, Yaniz JL, Garcia-Ispierto I, Santolaria P, Serrano B, Sulon J, de Sousa NM, Beckers JF. 2008. Factors affecting plasma progesterone in the early fetal period in high producing dairy cows. Theriogenology 69(4):426–432.

Bickhart DM, Xu L, Hutchison JL, Cole JB, Null DJ, Schroeder SG, Song J, Garcia JF, Sonstegard TS, Van Tassell CP, et al. 2016. Diversity and population-genetic properties of copy number variations and multicopy genes in cattle. DNA Res 23(3):253–262.

Bindea G, Mlecnik B, Hackl H, Charoentong P, Tosolini M, Kirilovsky A, Fridman WH, Pages F, Trajanoski Z, Galon J. 2009. ClueGO: a Cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks. Bioinformatics 25(8):1091–1093.

Bintanja R, van de Wal RSW. 2008. North American ice-sheet dynamics and the onset of 100,000-year glacial cycles. Nature 454(7206):869–872.


Bloise E, Cassali GD, Ferreira MC, Ciarmela P, Petraglia F, Reis FM. 2010. Activin-related proteins in bovine mammary gland: localization and differential expression during gestational development and differentiation. J Dairy Sci 93(10):4592–4601.

Brandes R, Arad R, Bar-Tana J. 1995. Inducers of adipose conversion activate transcription promoted by a peroxisome proliferator's response element in 3T3-L1 cells. Biochem Pharmacol 50(11):1949–1951.

Browning SR, Browning BL. 2007. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am J Hum Genet 81(5):1084–1097.

Cai D, Sun Y, Tang Z, Hu S, Li W, Zhao X, Xiang H, Zhou H. 2014. The origins of Chinese domestic cattle as revealed by ancient DNA analysis. J Archaeol Sci 41:423–434.

Cai X, Chen H, Lei C, Wang S, Xue K, Zhang B. 2007. mtDNA diversity and genetic lineages of eighteen cattle breeds from Bos taurus and Bos indicus in China. Genetica 131(2):175–183.

Cai X, Chen H, Wang S, Xue K, Lei C. 2006. Polymorphisms of two Y chromosome microsatellites in Chinese cattle. Genet Sel Evol 38(5):525–534.

Canavez FC, Luche DD, Stothard P, Leite KR, Sousa-Canavez JM, Plastow G, Meidanis J, Souza MA, Feijao P, Moore SS, et al. 2012. Genome sequence and assembly of Bos indicus. J Hered 103(3):342–348.

Chen S, Wang Y, Kong X, Liu D, Cheng H, Edwards RL. 2006. A possible Younger Dryas-type event during Asian monsoonal Termination 3. Sci China D Earth Sci 49(9):982–990.

Choi JW, Liao X, Park S, Jeon HJ, Chung WH, Stothard P, Park YS, Lee JK, Lee KT, Kim SH, et al. 2013. Massively parallel sequencing of Chikso (Korean brindle cattle) to discover genome-wide SNPs and InDels. Mol Cells 36(3):203–211.

Clark DL, Boler DD, Kutzler LW, Jones KA, McKeith FK, Killefer J, Carr TR, Dilger AC. 2011. Muscle gene expression associated with increased marbling in beef cattle. Anim Biotechnol 22(2):51–63.

Daetwyler HD, Capitan A, Pausch H, Stothard P, van Binsbergen R, Brondum RF, Liao X, Djari A, Rodriguez SC, Grohs C, et al. 2014. Whole-genome sequencing of 234 bulls facilitates mapping of monogenic and complex traits in cattle. Nat Genet 46(8):858–865.

Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, Handsaker RE, Lunter G, Marth GT, Sherry ST, et al. 2011. The variant call format and VCFtools. Bioinformatics 27(15):2156–2158.

Decker JE, McKay SD, Rolf MM, Kim J, Molina AA, Sonstegard TS, Hanotte O, Gotherstrom A, Seabury CM, Praharani L, et al. 2014. Worldwide patterns of ancestry, divergence, and admixture in domesticated cattle. PLoS Genet 10(3):e1004254.

Eberlein A, Takasuga A, Setoguchi K, Pfuhl R, Flisikowski K, Fries R, Klopp N, Furbass R, Weikard R, Kuhn C. 2009. Dissection of genetic factors modulating fetal growth in cattle indicates a substantial role of the non-SMC condensin I complex, subunit G (NCAPG) gene. Genetics 183(3):951–964.

Fang X, Lu L, Yang S, Li J, An Z, Jiang PA, Chen X. 2002. Loess in Kunlun Mountains and its implications on desert development and Tibetan Plateau uplift in west China. Sci China D Earth Sci 45(4):289–299.

Forti E, Aksanov O, Birk RZ. 2007. Temporal expression pattern of Bardet-Biedl syndrome genes in adipogenesis. Int J Biochem Cell B 39(5):1055–1062.

Gan HY, Li JB, Wang HM, Gao YD, Liu WH, Li JP, Zhong JF. 2007. Relationship between the melanocortin receptor 1 (MC1R) gene and the coat color phenotype in cattle. Yi Chuan 29:195–200.

Gibbs RA, Taylor JF, Van Tassell CP, Barendse W, Eversole KA, Gill CA, Green RD, Hamernik DL, Kappes SM, Lien S, et al. 2009. Genome-wide survey of SNP variation uncovers the genetic structure of cattle breeds. Science 324(5926):528–532.

Gotoh T, Takahashi H, Nishimura T, Kuchida K, Mannen H. 2014. Meat produced by Japanese Black cattle and Wagyu. Anim Front 4(4):46–54.

Grindflek E, Holzbauer R, Plastow G, Rothschild MF. 2002. Mapping and investigation of the porcine major insulin sensitive glucose transport (SLC2A4/GLUT4) gene as a candidate gene for meat quality and carcass traits. J Anim Breed Genet 119(1):47–55.

Gutierrez-Gil B, Arranz JJ, Wiener P. 2015. An interpretive review of selective sweep studies in Bos taurus cattle populations: identification of unique and shared selection signals across breeds. Front Genet 6:167.

Hansen PJ. 2004. Physiological and cellular adaptations of zebu cattle to thermal stress. Anim Reprod Sci 82–83:349–360.

Hiendleder S, Lewalski H, Janke A. 2008. Complete mitochondrial genomes of Bos taurus and Bos indicus provide new insights into intra-species variation, taxonomy and domestication. Cytogenet Genome Res 120(1–2):150–156.

Hu ZL, Park CA, Reecy JM. 2016. Developmental progress and current status of the Animal QTLdb. Nucleic Acids Res 44(D1):D827–D833.

Hung CM, Shaner PJ, Zink RM, Liu WC, Chu TC, Huang WS, Li SH. 2014. Drastic population fluctuations explain the rapid extinction of the passenger pigeon. Proc Natl Acad Sci U S A 111(29):10636–10641.

Kovacik A, Bulla J, Trakovicka A, Zitny J, Rafayova A. 2012. The effect of the porcine melanocortin-5 receptor (MC5R) gene associated with feed intake, carcass and physico-chemical characteristics. J Microbiol Biotechnol Food Sci 1:498–506.

Labrecque R, Vigneault C, Blondin P, Sirard MA. 2013. Gene expression analysis of bovine oocytes with high developmental competence obtained from FSH-stimulated animals. Mol Reprod Dev 80(6):428–440.

Lai SJ, Liu YP, Liu YX, Li XW, Yao YG. 2006. Genetic diversity and origin of Chinese cattle revealed by mtDNA D-loop sequence variation. Mol Phylogenet Evol 38(1):146–154.

Lee KT, Chung WH, Lee SY, Choi JW, Kim J, Lim D, Lee S, Jang GW, Kim B, Choy YH, et al. 2013. Whole-genome resequencing of Hanwoo (Korean cattle) and insight into regions of homozygosity. BMC Genomics 14:519.

Lee SS, Yang BS, Yang YH, Kang SY, Ko SB, Jung JK, Oh WY, Oh SJ, Kim KI. 2002. Analysis of melanocortin receptor 1 (MC1R) genotype in Korean brindle cattle and Korean cattle with dark muzzle. J Anim Sci Technol 44(1):23–30.

Lei C, Chen H, Hu S. 2000. Studies on Y chromosome polymorphism and the origin and classification of Chinese yellow cattle. Acta Agric Boreali-Occidentalis Sin 9:43–47.

Lei CZ, Chen H, Zhang HC, Cai X, Liu RY, Luo LY, Wang CF, Zhang W, Ge QL, Zhang RF, et al. 2006. Origin and phylogeographical structure of Chinese cattle. Anim Genet 37(6):579–582.

Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25(14):1754–1760.

Li H, Durbin R. 2011. Inference of human population history from individual whole-genome sequences. Nature 475(7357):493–496.

Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25(16):2078–2079.

Lin X, Luo J, Zhang L, Zhu J. 2013. MicroRNAs synergistically regulate milk fat synthesis in mammary gland epithelial cells of dairy goats. Gene Expr 16(1):1–13.

Loftus RT, MacHugh DE, Bradley DG, Sharp PM, Cunningham P. 1994. Evidence for two independent domestications of cattle. Proc Natl Acad Sci U S A 91(7):2757–2761.

McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, et al. 2010. The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res 20(9):1297–1303.


Medugorac I, Graf A, Grohs C, Rothammer S, Zagdsuren Y, Gladyr E, Zinovieva N, Barbieri J, Seichter D, Russ I, et al. 2017. Whole-genome analysis of introgressive hybridization and characterization of the bovine legacy of Mongolian yaks. Nat Genet 49(3):470–475.

Mei C, Wang H, Zhu W, Wang H, Cheng G, Qu K, Guang X, Li A, Zhao C, Yang W, et al. 2016. Whole-genome sequencing of the endangered bovine species Gayal (Bos frontalis) provides new insights into its genetic features. Sci Rep 6:19787.

Miller W, Schuster SC, Welch AJ, Ratan A, Bedoya-Reina OC, Zhao F, Kim HL, Burhans RC, Drautz DI, Wittekindt NE, et al. 2012. Polar and brown bear genomes reveal ancient admixture and demographic footprints of past climate change. Proc Natl Acad Sci U S A 109(36):E2382–E2390.

Nadachowska-Brzyska K, Burri R, Smeds L, Ellegren H. 2016. PSMC analysis of effective population sizes in molecular ecology and its application to black-and-white Ficedula flycatchers. Mol Ecol 25(5):1058–1072.

Orlando L, Ginolhac A, Zhang G, Froese D, Albrechtsen A, Stiller M, Schubert M, Cappellini E, Petersen B, Moltke I, et al. 2013. Recalibrating Equus evolution using the genome sequence of an early Middle Pleistocene horse. Nature 499(7456):74–78.

Ouali A, Herrera-Mendez CH, Coulis G, Becila S, Boudjellal A, Aubry L, Sentandreu MA. 2006. Revisiting the conversion of muscle into meat and the underlying mechanisms. Meat Sci 74(1):44–58.

Patterson N, Price AL, Reich D. 2006. Population structure and eigenanalysis. PLoS Genet 2(12):e190.

Porto-Neto LR, Sonstegard TS, Liu GE, Bickhart DM, Da SM, Machado MA, Utsunomiya YT, Garcia JF, Gondro C, Van Tassell CP. 2013. Genomic divergence of zebu and taurine cattle identified through high-density SNP genotyping. BMC Genomics 14(1):876.

Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, Maller J, Sklar P, de Bakker PI, Daly MJ, et al. 2007. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 81(3):559–575.

Qiu H, Ju ZY, Chang ZJ. 1993. A survey of cattle production in China. More attention to animal genetic resources. Food and Agriculture Organization of the United Nations.

Qiu Q, Wang L, Wang K, Yang Y, Ma T, Wang Z, Zhang X, Ni Z, Hou F, Long R, et al. 2015. Yak whole-genome resequencing reveals domestication signatures and prehistoric population expansions. Nat Commun 6:10283.

Qiu Q, Zhang G, Ma T, Qian W, Wang J, Ye Z, Cao C, Hu Q, Kim J, Larkin DM, et al. 2012. The yak genome and adaptation to life at high altitude. Nat Genet 44(8):946–949.

Randhawa IAS, Khatkar MS, Thomson PC, Raadsma HW, Barendse W. 2016. A meta-assembly of selection signatures in cattle. PLoS One 11(4):e153013.

Rincon G, Islas-Trejo A, Casellas J, Ronin Y, Soller M, Lipkin E, Medrano JF. 2009. Fine mapping and association analysis of a quantitative trait locus for milk production traits on Bos taurus autosome 4. J Dairy Sci 92(2):758–764.

Ruetz TJ, Lin AE, Guttman JA. 2012. Enterohaemorrhagic Escherichia coli requires the spectrin cytoskeleton for efficient attachment and pedestal formation on host cells. Microb Pathog 52(3):149–156.

Sartori R, Bastos MR, Baruselli PS, Gimenes LU, Ereno RL, Barros CM. 2010. Physiological differences and implications to reproductive management of Bos taurus and Bos indicus cattle in a tropical environment. Reprod Domest Rumin VII 7(1):357–375.

Sattler K, Levkau B. 2009. Sphingosine-1-phosphate as a mediator of high-density lipoprotein effects in cardiovascular protection. Cardiovasc Res 82(2):201–211.

Scherf BD, Pilling D. 2015. The second report on the state of the world's animal genetic resources for food and agriculture. Food and Agriculture Organization of the United Nations.

Schiffels S, Durbin R. 2014. Inferring human population size and separation history from multiple genome sequences. Nat Genet 46(8):919–925.

Setoguchi K Watanabe T Weikard R Albrecht E Kuhn C Kinoshita ASugimoto Y Takasuga A 2011 The SNP c1326TgtG in the non-SMC

condensin I complex subunit G (NCAPG) gene encoding apIle442Met variant is associated with an increase in body framesize at puberty in cattle Anim Genet 42(6)650ndash655

Sherratt A. 1983. The secondary exploitation of animals in the Old World. World Archaeol. 15(1):90–104.

Smith TP, Grosse WM, Freking BA, Roberts AJ, Stone RT, Casas E, Wray JE, White J, Cho J, Fahrenkrug SC, et al. 2001. Sequence evaluation of four pooled-tissue normalized bovine cDNA libraries and construction of a gene index for cattle. Genome Res. 11(4):626–630.

Sorbolini S, Marras G, Gaspa G, Dimauro C, Cellesi M, Valentini A, Macciotta NP. 2015. Detection of selection signatures in Piemontese and Marchigiana cattle, two breeds with similar production aptitudes but different selection histories. Genet Sel Evol. 47:52.

Stothard P, Choi JW, Basu U, Sumner-Thomson JM, Meng Y, Liao X, Moore SS. 2011. Whole genome resequencing of black Angus and Holstein cattle for SNP and CNV discovery. BMC Genomics. 12:559.

Sun YB, An ZS. 2005. Late Pliocene-Pleistocene changes in mass accumulation rates of eolian deposits on the central Chinese Loess Plateau. J Geophys Res-Atmos. 110(D23):1193–1194.

Svizzero S, Tisdell C. 2016. Input shortages and the lack of sustainability of bronze production by the Unetice. In: Working Papers on Economics, Ecology and the Environment, No. 202. Queensland (Australia): University of Queensland.

Switonski M, Mankowska M, Salamon S. 2013. Family of melanocortin receptor (MCR) genes in mammals-mutations, polymorphisms and phenotypic effects. J Appl Genet. 54(4):461–472.

Van Vuure T. 2002. History, morphology and ecology of the aurochs (Bos primigenius). Available from: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.534.6285

Wang K, Li M, Hakonarson H. 2010. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38(16):e164.

Wang M, Ding Y. 1996. The importance of work animals in rural China. World Anim Rev. 86:65–67.

Wedholm A, Larsen LB, Lindmark-Mansson H, Karlsson AH, Andren A. 2006. Effect of protein composition on the cheese-making properties of milk from individual dairy cows. J Dairy Sci. 89(9):3296–3305.

Weir BS, Cockerham CC. 1984. Estimating F-statistics for the analysis of population structure. Evolution. 38(6):1358–1370.

Xu L, Bickhart DM, Cole JB, Schroeder SG, Song J, Tassell CP, Sonstegard TS, Liu GE. 2015. Genomic signatures reveal new evidences for selection of important traits in domestic cattle. Mol Biol Evol. 32(3):711–725.

Xu Y, Yu W, Xiong Y, Xie H, Ren Z, Xu D, Lei M, Zuo B, Feng X. 2011. Molecular characterization and expression patterns of serine/arginine-rich specific kinase 3 (SPRK3) in porcine skeletal muscle. Mol Biol Rep. 38(5):2903–2909.

Yahvah KM, Brooker SL, Williams JE, Settles M, Mcguire MA, Mcguire MK. 2015. Elevated dairy fat intake in lactating women alters milk lipid and fatty acids without detectible changes in expression of genes related to lipid uptake or synthesis. Nutr Res. 35(3):221.

Yoon D, Ko E. 2016. Association study between SNPs of the genes within bovine QTLs and meat quality of Hanwoo. J Anim Sci. 94(Suppl 4):145.

Yu Y, Nie L, He ZQ, Wen JK, Jian CS, Zhang YP. 1999. Mitochondrial DNA variation in cattle of south China: origin and introgression. Anim Genet. 30(4):245–250.

Zhang H, Paijmans JL, Chang F, Wu X, Chen G, Lei C, Yang X, Wei Z, Bradley DG, Orlando L, et al. 2013. Morphological and genetic evidence for early Holocene cattle management in northeastern China. Nat Commun. 4:2755.

Zhang H, Wang S, Wang Z, Da Y, Wang N, Hu X, Zhang Y, Wang Y, Leng L, Tang Z, et al. 2012. A genome-wide scan of selective sweeps in two broiler chicken lines divergently selected for abdominal fat content. BMC Genomics. 13:704.

Mei et al doi101093molbevmsx322 MBE

698Downloaded from httpsacademicoupcommbearticle-abstract3536884760963by Charles Sturt University useron 16 August 2018

Zhang Z, Wang Z, Yang Y, Zhao J, Chen Q, Liao R, Chen Z, Zhang X, Xue M, Yang H, et al. 2016. Identification of pleiotropic genes and gene sets underlying growth and immunity traits: a case study on Meishan pigs. Animal. 10(4):550–557.

Zhao C, Tian F, Yu Y, Luo J, Hu Q, Bequette BJ, Baldwin VR, Liu G, Zan L, Scott UM, et al. 2012. Muscle transcriptomic analyses in Angus cattle with divergent tenderness. Mol Biol Rep. 39(4):4185–4193.

Zhao S, Zheng P, Dong S, Zhan X, Wu Q, Guo X, Hu Y, He W, Zhang S, Fan W, et al. 2013. Whole-genome sequencing of giant pandas provides insights into demographic history and local adaptation. Nat Genet. 45(1):67–71.

Zheng B, Xu Q, Shen Y. 2002. The relationship between climate change and Quaternary glacial cycles on the Qinghai–Tibetan Plateau: review and speculation. Quatern Int. 97–98:93–101.

Zhou X, Wang B, Pan Q, Zhang J, Kumar S, Sun X, Liu Z, Pan H, Lin Y, Liu G, et al. 2014. Whole-genome sequencing of the snub-nosed monkey provides insights into folivory and evolutionary history. Nat Genet. 46(12):1303–1310.

Zimin AV, Delcher AL, Florea L, Kelley DR, Schatz MC, Puiu D, Hanrahan F, Pertea G, Van Tassell CP, Sonstegard TS, et al. 2009. A whole-genome assembly of the domestic cow, Bos taurus. Genome Biol. 10(4):R42.

Genetic Architecture and Selection of Chinese Cattle doi101093molbevmsx322 MBE


SREBF2 and PLOD3 in QCC; ASGR1, IGFBP2, IGFBP5, and MYH9 in JBC; OSTN, CPT2, CSF2RB, and SLC2A4 in FLV; and SLC2A5, TMEM97, MYH4, SRPK3, and POLDIP2 in RAN. We also found that a set of important genes associated with lipid metabolism were putatively positively selected (AOX1 in QCC, RAN, FLV, JER, and IND; MC5R in IND; BBS2, S1PR3, and LRP2BP in JBC). Interestingly, we identified a missense mutation in BBS2 (exon 15, rs135889003, c.A1880G, p.Q627R) that was almost fixed (allele frequency > 0.95) in JBC, a breed known for producing the intensely marbled Wagyu beef (with >30% intramuscular fat) (Gotoh et al. 2014). BBS2 is a member of the Bardet–Biedl syndrome gene family, the primary clinical feature of which is obesity, and has been found to play a significant role in adipogenesis (Forti et al. 2007). The positive selection signals near the BBS2 region are further supported by significantly lower values of Tajima's D and by the long haplotype patterns in JBC (fig. 3b); this region may be useful as a genetic target in breeding for improved beef marbling. In addition, R3HDM1, a gene associated with efficient food conversion and intramuscular fat content, showed signals of positive selection in five groups (QCC, FLV, HOL, JER, and IND). These genes might be associated with the tenderness and quality of meat in cattle.

Conclusion
Whole-genome sequencing of representative Chinese cattle breeds and two additional breeds (JBC and RAN) generated a comprehensive catalogue of genetic variations. This is the first population genomic study on Chinese cattle to use next-generation whole-genome sequencing data and is an important source of genetic information for cattle worldwide. Bovine haplotypes have been inferred in Mongolian yaks, with recent admixture at least 1,500 years ago (Medugorac et al. 2017). It is highly possible that there was recent introgression from yak (B. grunniens) to Chinese cattle, as suggested by previous studies (Lei et al. 2000; Cai et al. 2007, 2014). However, any genetic influence of yak was too limited to be detected in the representative cattle breeds examined in our study. We also discovered many potential selective sweeps associated with domestication and related to breed-specific characteristics, with selective sweep regions including genes associated with coat color, dairy traits, and meat production/quality traits. Collectively, these findings substantially expand the catalogue of genetic variants in cattle and reveal new insights into the evolutionary history and domestication traits of Chinese cattle.

Materials and Methods

Sample Collection and Sequencing
To represent the overall genetic diversity of Chinese cattle, we selected 46 samples from 6 representative Chinese cattle breeds with divergent phenotypic characters across the main geographic distribution: Qinchuan cattle (QCC, n = 37), Nanyang cattle (NYC, n = 2), Luxi cattle (LXC, n = 1), Yanbian cattle (YBC, n = 2), Yunnan cattle (YNC, n = 2), and Leiqiong cattle (LQC, n = 2). For comparison, samples from two specialized beef cattle breeds, Red Angus (RAN, n = 18) and JBC (n = 11), were also collected (supplementary table S2, Supplementary Material online). Total genomic DNA was extracted from the blood samples of the animals using a standard phenol–chloroform protocol. For each individual, at least 5 mg of genomic DNA was used to construct paired-end libraries with an insert size of 500 bp according to Illumina's library preparation protocol. Moreover, we collected 76 genome sequences from previous studies for the breeds Brahman (BRM, indicine, n = 6), Nelore (NEL, indicine, n = 5), Gir (GIR, indicine, n = 4), Limousin (LIM, taurine, n = 6), Jersey (JER, taurine, n = 18), Fleckvieh (FLV, taurine, n = 19), and Holstein (HOL, taurine, n = 18) (details in supplementary tables S1 and S2, Supplementary Material online).

Alignments and Variant Identification
Paired-end reads (100 bp) obtained from sequencing in the present study and previous studies were mapped to the B. taurus genome (UMD3.1) (Zimin et al. 2009) using BWA (Li and Durbin 2009) with the default parameters. Sequence Alignment/Map (SAM) format files were imported into SAMtools (Li et al. 2009) for sorting and merging and into Picard (http://broadinstitute.github.io/picard, version 1.92) to remove duplicated reads. To identify the ancestral state of cattle, we mapped the raw reads of yak (Qiu et al. 2012), sequenced to 65×, to the reference genome.

Initial variant site identification was performed using SAMtools mpileup and GATK UnifiedGenotyper (Genome Analysis Toolkit, version 2.4-9) (McKenna et al. 2010) with the default settings. The overlapping subset of 53,979,675 single-nucleotide polymorphisms (SNPs) and 5,924,578 small insertions and deletions (InDels; 91% of InDels were 1–30 bp in length, and the largest InDel was 403 bp in length) was defined as a high-confidence catalogue and used for base quality recalibration using GATK with the default set of covariates. The resulting recalibrated BAM files were then used as input for a second round of variant calling with GATK. The resulting variant calls were analyzed, and approximately the highest-scoring 10% of the predicted variant sites were used as a training set for variant quality recalibration and filtering with GATK. These steps resulted in 60,031,459 SNPs and 5,603,383 InDels. To obtain high-quality results for further analyses, we retained only biallelic SNPs and InDels with >90% calling rates, resulting in 57,220,105 SNPs and 5,270,518 InDels. Beagle (Browning and Browning 2007), which has been shown to yield highly accurate solutions, was used to improve the genotype calls using genotype likelihoods from GATK and to infer the haplotypes in the sample. Short InDels were not included in the diversity or divergence estimates or in the other analyses. Variants (SNPs and InDels) were annotated using ANNOVAR (Wang et al. 2010).
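The final filtering rule described above (retain only biallelic sites with a calling rate above 90%) can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which used GATK; the genotype representation, the function names `call_rate` and `keep_site`, and the toy sites are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical in-memory genotype: a pair of allele indices, or None if uncalled.
Genotype = Optional[Tuple[int, int]]


def call_rate(genotypes: Sequence[Genotype]) -> float:
    """Fraction of samples with a non-missing genotype call."""
    if not genotypes:
        return 0.0
    return sum(g is not None for g in genotypes) / len(genotypes)


def keep_site(n_alt_alleles: int, genotypes: Sequence[Genotype],
              min_call_rate: float = 0.9) -> bool:
    """Keep a site only if it is biallelic (exactly one ALT allele)
    and its call rate is strictly greater than the threshold."""
    return n_alt_alleles == 1 and call_rate(genotypes) > min_call_rate


# Toy data (assumed): ten samples per site.
all_called: List[Genotype] = [(0, 0), (0, 1), (1, 1), (0, 0), (0, 1),
                              (0, 0), (0, 0), (1, 1), (0, 1), (0, 0)]
one_missing: List[Genotype] = all_called[:9] + [None]

sites: Dict[str, Tuple[int, List[Genotype]]] = {
    "snp_a": (1, all_called),   # biallelic, fully called: kept
    "snp_b": (2, all_called),   # two ALT alleles: dropped
    "snp_c": (1, one_missing),  # call rate exactly 0.9, not > 0.9: dropped
}

retained = [name for name, (n_alt, gts) in sites.items() if keep_site(n_alt, gts)]
print(retained)
```

Whether the threshold is strict or inclusive is not stated in the text; the sketch reads ">90%" literally.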

Phylogenetic and Population Structure Analyses
A phylogenetic tree was constructed from the SNP data by using the neighbor-joining method in the program PHYLIP v3.695 (http://evolution.genetics.washington.edu/phylip.html), and distance matrices were calculated using PLINK (Purcell et al. 2007). The ancestral states of the SNPs were determined using a close relative of cattle, B. grunniens, as the outgroup. Population structure was further inferred using ADMIXTURE (Alexander et al. 2009) with the number of assumed ancestral populations (K) set from 2 to 7. Principal component analysis was carried out using the smartPCA program of the EIGENSOFT package (Patterson et al. 2006).

Genome-Wide Patterns of Genetic Diversity and Divergence
The average pairwise nucleotide diversity (θπ) and Tajima's D statistic of each breed were calculated using a sliding window approach (50-kb windows sliding in 10-kb steps) with the default parameters of VCFtools (Danecek et al. 2011). Population differentiation was measured by pairwise FST using the unbiased estimator of Weir and Cockerham (1984) with the default parameters.
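To make the sliding-window scheme concrete, the following Python sketch computes nucleotide diversity in 50-kb windows with 10-kb steps for biallelic sites. It is an illustration only, not a reimplementation of VCFtools: the per-site estimator (expected pairwise differences among sampled chromosomes), the 1-based window coordinates, and the toy data are assumptions.

```python
from typing import Dict, List, Tuple


def site_pi(alt_count: int, n_chrom: int) -> float:
    """Expected pairwise differences at one biallelic site:
    2 * k * (n - k) / (n * (n - 1)) for k ALT copies among n chromosomes."""
    if n_chrom < 2:
        return 0.0
    return 2.0 * alt_count * (n_chrom - alt_count) / (n_chrom * (n_chrom - 1))


def windowed_pi(sites: Dict[int, Tuple[int, int]], chrom_len: int,
                window: int = 50_000, step: int = 10_000
                ) -> List[Tuple[int, int, float]]:
    """Return (start, end, pi per bp) for overlapping windows.
    `sites` maps a 1-based position to (alt_count, n_chrom)."""
    results = []
    start = 1
    while start <= chrom_len:
        end = min(start + window - 1, chrom_len)
        total = sum(site_pi(k, n) for pos, (k, n) in sites.items()
                    if start <= pos <= end)
        results.append((start, end, total / (end - start + 1)))
        start += step
    return results


# Toy chromosome (assumed): 60 kb long, two variable sites, four chromosomes sampled.
toy_sites = {5_000: (1, 4), 15_000: (2, 4)}
windows = windowed_pi(toy_sites, chrom_len=60_000)
for start, end, pi in windows:
    print(start, end, pi)
```

Dividing by window length treats all unlisted positions as invariant and callable, which is a simplification; real data would need a mask of callable sites.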

Linkage Disequilibrium
To estimate the genome-wide LD of each breed, we calculated the mean r² values for pairwise markers with Haploview (Barrett et al. 2005) software. Only SNPs with a minor allele frequency > 0.05 in three groups (Chinese cattle, indicine, and taurine) were used. The parameters of Haploview were set to "-maxdistance 200 -dprime -memory 5000 -minGeno 0.6 -minMAF 0.05 -hwcutoff 0.001". To minimize the influence of sample size, only breeds with at least five individuals were used, and breeds with more than five samples were down-sampled to five.
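For readers unfamiliar with the r² measure, the sketch below computes it for one pair of biallelic SNPs. It assumes phased haplotypes coded 0/1 (for example, ten haplotypes from five down-sampled individuals); Haploview itself works from genotype data, so this is a simplified illustration of the statistic rather than of that program, and the function name is hypothetical.

```python
from typing import Sequence


def r_squared(hap_a: Sequence[int], hap_b: Sequence[int]) -> float:
    """r^2 = D^2 / (pA * (1 - pA) * pB * (1 - pB)), with D = pAB - pA * pB,
    computed from phased 0/1 alleles at two SNPs on the same haplotypes."""
    n = len(hap_a)
    if n == 0 or n != len(hap_b):
        raise ValueError("haplotype vectors must be non-empty and equal in length")
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(a & b for a, b in zip(hap_a, hap_b)) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return float("nan")  # undefined when either SNP is monomorphic
    d = p_ab - p_a * p_b
    return d * d / denom


# Toy haplotypes (assumed): ten chromosomes from five individuals.
snp1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
print(r_squared(snp1, snp1))                    # perfectly correlated pair
print(r_squared([1, 1, 0, 0], [1, 0, 1, 0]))    # independent pair
```

Averaging such values over SNP pairs binned by physical distance gives the LD decay curves usually reported per breed.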

Haplotype Diversity
For the haplotype diversity analysis, the same breeds and SNP set were used as in the linkage disequilibrium analysis. To calculate haplotype diversity, the genome was divided into 5- to 500-kb bins (detailed in supplementary table S6, Supplementary Material online). Windows with fewer than two SNPs per 5 kb were removed, and in those with more than four SNPs, four SNPs were randomly selected. Considering the substantial variation in the recombination rate across the cattle genome, we adopted a sliding-window strategy and allowed the window to slide by half its length each time. The frequencies of haplotypes were counted, and haplotype diversity (H) was calculated as described previously (Daetwyler et al. 2014).
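The text does not restate the formula for H, so the following Python sketch should be read as an assumption: it uses the common unbiased (Nei-type) estimator H = n/(n − 1) × (1 − Σ p_i²), where p_i is the frequency of haplotype i among n sampled chromosomes in a window. The estimator actually used follows Daetwyler et al. (2014) and may differ in detail.

```python
from collections import Counter
from typing import Hashable, Sequence


def haplotype_diversity(haplotypes: Sequence[Hashable]) -> float:
    """Assumed estimator: H = n / (n - 1) * (1 - sum(p_i^2)),
    with p_i the observed frequency of each distinct haplotype."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Toy windows (assumed), each haplotype written as a string over four SNPs.
identical = ["0101"] * 10
all_distinct = ["0000", "0001", "0010", "0100"]
print(haplotype_diversity(identical))
print(haplotype_diversity(all_distinct))
```

With this estimator a window in which every chromosome carries the same haplotype scores 0, and a window in which every chromosome is different scores 1.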

PSMC Analysis
We inferred the demographic history of B. taurus and B. indicus using the Pairwise Sequentially Markovian Coalescent (PSMC) model (Li and Durbin 2011). In the default PSMC approach, a whole-genome diploid consensus sequence is generated using the alignment file from one sample. Because most of our genomes were not sequenced to a high average depth of coverage (mostly ~10×), and because PSMC has high false-negative rates at low depths of coverage (i.e., <20×), leading to a systematic underestimation of true event times (Orlando et al. 2013; Nadachowska-Brzyska et al. 2016), we applied a modified PSMC approach: the SNPs of one sample were extracted from variants called on the cohort of all samples and converted to consensus sequences. This procedure was followed for samples (marked in supplementary table S1, Supplementary Material online) with relatively high sequencing depth in each breed to ensure the quality of consensus sequences. We then transformed the consensus sequence into a fasta-like format using "fq2psmcfa". The PSMC parameters were set as follows: "-p 4+25*2+4+6". The mutation rate per generation per site was estimated as $\mu = D \times g / (2T)$, where D is the observed frequency of pairwise differences between two species, T is the estimated divergence time, and g is the estimated generation time for the two species. The cattle generation time (g) was set to an estimate of 5 years, and the estimated divergence time was set to 4.9 Ma based on a previous study on cattle and yak (Qiu et al. 2012). These values yielded an estimated mutation rate of $9.796 \times 10^{-9}$ mutations per generation per site. We obtained the mass accumulation rate (MAR) of Chinese loess for the past 3.6 My (Sun and An 2005), an index indicating cold and dry or warm and wet climatic periods in China (fig. 2a and supplementary figs. S13 and S14, Supplementary Material online).
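The mutation-rate arithmetic can be checked with a few lines of Python. The sketch reads the divergence time as 4.9 Ma, the generation time as 5 years, and the reported rate as 9.796 × 10^-9 per generation per site. The pairwise divergence D is not stated in the text, so the value below is back-calculated from the other three quantities and is an inference, not a reported figure.

```python
def mutation_rate_per_generation(divergence: float,
                                 generation_time_years: float,
                                 divergence_time_years: float) -> float:
    """mu = D * g / (2 * T): per-site divergence D accumulated along two
    lineages over T years, rescaled to one generation of g years."""
    return divergence * generation_time_years / (2.0 * divergence_time_years)


G_YEARS = 5.0          # cattle generation time used in the text
T_YEARS = 4.9e6        # cattle-yak divergence time (read as 4.9 Ma)
MU_REPORTED = 9.796e-9 # reported mutation rate per generation per site

# Back-calculated (assumed) pairwise divergence implied by the reported rate.
d_implied = MU_REPORTED * 2.0 * T_YEARS / G_YEARS
mu_check = mutation_rate_per_generation(d_implied, G_YEARS, T_YEARS)

print(f"implied D  = {d_implied:.5f}")
print(f"mu (check) = {mu_check:.4e}")
```

Under these readings the implied cattle-yak divergence is roughly 1.9% per site, which is the order of magnitude one would expect for two closely related bovine species.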

To evaluate the differences between our revised PSMC approach and the default method, we reconstructed trajectories from two samples of the same breed (FLV) with different depths of coverage (SRR1262805 with 24× and SRR1262808 with 9×), which should yield similar inferences. The PSMC profiles retrieved from the default and revised approaches for the high-depth sample were almost identical (supplementary fig. S11, Supplementary Material online) with regard to both the timing and the magnitude of demographic events, except for the most recent expansion phase, in which a lower intensity was found using the revised approach. We found that PSMC inference based on the low-depth sample showed a biased demographic model, which could be satisfactorily corrected with our revised PSMC approach. Additionally, we note that the bias observed for genomes with low depth (<20×) could also be corrected by assuming a uniform false-negative rate (uFNR), using the option "-M" of the plotting script "psmc_plot.pl" to specify the uFNR correction rate (Orlando et al. 2013; Hung et al. 2014). With the uFNR correction, the plot for the low-depth sample was similar to the high-depth PSMC inference (supplementary fig. S11, Supplementary Material online). No striking differences were observed among the PSMC profiles reconstructed from different taurine breeds with different sequencing depths of coverage (ranging from 9× to 24×; supplementary table S1, Supplementary Material online, and fig. 2a). Consequently, we consider our revised approach a suitable method for PSMC inference from samples with low average sequencing depth, as the new biases it introduces are acceptable.

To explore the potential impact of the reference genome on the PSMC results of indicine breeds, we mapped sequence reads of indicine samples against the assembly of B. indicus (Nelore breed, GenBank assembly accession GCF_000247795.1) and repeated the PSMC analysis (default setting with uFNR correction). Although the PSMC profiles reconstructed from different references were not identical, the qualitative results hold for indicine breeds with the B. indicus reference genome (fig. 2a and supplementary fig. S12, Supplementary Material online).

MSMC Analysis
The Multiple Sequentially Markovian Coalescent (MSMC) model (Schiffels and Durbin 2014) was used to infer changes in effective population size (Ne) and divergence time between breeds (samples marked in supplementary table S1, Supplementary Material online). MSMC is an extension of the PSMC model, which uses a hidden Markov model to scan genomes and analyze patterns of heterozygosity, with long DNA segments with low heterozygosity reflecting recent coalescent events. The rate of coalescent events is then used to estimate Ne at a given time. To scale the output of MSMC to real-time population sizes, we used the generation time and mutation rate mentioned earlier (see PSMC Analysis). We obtained atmospheric surface air temperature (SAT) and global sea level (GSL) data for the past 3 My (Bintanja and van de Wal 2008) (supplementary figs. S13 and S14, Supplementary Material online).

Selective Sweep Analysis
Considering the sample size and close genetic background of indicine cattle (NEL, BRM, and GIR), we pooled these three indicine breeds into one group (IND) in our selection analysis. Seven cattle groups (QCC, RAN, JBC, IND, HOL, FLV, and JER) with sample sizes >10 were retained for the following analysis. To identify candidate loci for breed-specific phenotypes that are known to be under positive selection, we used the di statistic (Akey et al. 2010) to measure the locus-specific divergence in allele frequency for each group based on unbiased estimates of pairwise FST. Briefly, for each SNP, we calculated the statistic

$$d_i = \sum_{j \neq i} \frac{F_{ST}^{ij} - E[F_{ST}^{ij}]}{sd[F_{ST}^{ij}]}$$

where $E[F_{ST}^{ij}]$ and $sd[F_{ST}^{ij}]$ denote the expected value and SD of FST between groups i and j calculated from all SNPs. For each group, di was averaged over the SNPs in nonoverlapping 50-kb windows. Windows with fewer than 10 SNPs were removed. The top 1% of windows with the highest mean di scores were defined as candidate selective sweep regions. Adjacent sweeps within a distance of 50 kb were merged into one sweep. Selective sweep regions were annotated with cattle QTLdb release 29 from the Animal Quantitative Trait Loci Database (Hu et al. 2016). Candidate genes under positive selection were defined as those in which more than half of the gene interval was found in selective sweep regions. Tajima's D statistic was computed by using VCFtools for each candidate gene. Gene Ontology (GO) enrichment analysis for genes in selective sweep regions was performed with a hypergeometric test using ClueGO (Bindea et al. 2009). The false discovery rate (FDR) was used to correct the P values with the Benjamini–Hochberg approach.
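The di computation and the window-based ranking described above can be sketched in Python as follows. This is an illustrative outline under stated assumptions, not the authors' code: per-SNP pairwise FST values are taken as given, the population standard deviation (`pstdev`) stands in for the unspecified SD estimator, and the function names and toy data are hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List, Sequence, Tuple

Pair = Tuple[str, str]


def di_per_snp(fst: Dict[Pair, List[float]], group: str) -> List[float]:
    """Sum, over every pair that includes `group`, of the per-SNP FST
    standardized by that pair's genome-wide mean and SD."""
    n_snps = len(next(iter(fst.values())))
    di = [0.0] * n_snps
    for pair, values in fst.items():
        if group not in pair:
            continue
        mu, sd = mean(values), pstdev(values)
        if sd == 0:
            continue  # a constant FST track carries no information
        for k, v in enumerate(values):
            di[k] += (v - mu) / sd
    return di


def window_means(positions: Sequence[int], scores: Sequence[float],
                 window: int = 50_000, min_snps: int = 10) -> Dict[int, float]:
    """Mean score per nonoverlapping window; windows with < min_snps are dropped."""
    bins: Dict[int, List[float]] = {}
    for pos, s in zip(positions, scores):
        bins.setdefault((pos - 1) // window, []).append(s)
    return {b: mean(v) for b, v in bins.items() if len(v) >= min_snps}


def top_fraction(window_scores: Dict[int, float], fraction: float = 0.01) -> List[int]:
    """Indices of the highest-scoring windows (at least one is returned)."""
    ranked = sorted(window_scores, key=window_scores.get, reverse=True)
    return ranked[:max(1, int(len(ranked) * fraction))]


# Toy FST tracks (assumed): three groups, four SNPs; SNP index 2 is an outlier for A.
toy_fst = {("A", "B"): [0.1, 0.1, 0.5, 0.1],
           ("A", "C"): [0.2, 0.2, 0.6, 0.2],
           ("B", "C"): [0.3, 0.3, 0.3, 0.3]}
di_a = di_per_snp(toy_fst, "A")
print(di_a)
```

Merging adjacent candidate windows within 50 kb and the downstream annotation steps are omitted here for brevity.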

Supplementary Material
Supplementary data are available at Molecular Biology and Evolution online.

Author Contributions
L.-S.Z., W.-J.Z., G.C., and H.-B.W. led the experiments and designed the analytical strategy. L.-S.Z., C.-G.M., H.-C.W., W.-Q.T., L.-S.G., Y.-Y.Z., Z.-L.J., Y.-P.X., and X.-Z.S. performed animal work and prepared biological samples. C.-G.M., H.-C.W., G.C., H.-B.W., C.-P.Z., A.-N.L., W.-C.Y., C.-L.J., and S.-H.W. constructed the DNA library and performed sequencing. W.-J.Z., Q.-J.L., L.-Z.W., X.-L.W., X.-M.G., and C.-Z.W. detected, annotated, and summarized variants. W.-J.Z., Q.-J.L., C.-G.M., and H.-C.W. performed selection analysis. W.-J.Z., L.-Z.W., C.-G.M., and H.-C.W. analyzed the origin of Chinese indicine cattle and population history. C.-G.M., H.-C.W., W.-J.Z., Q.-J.L., and L.-Z.W. wrote the manuscript. L.-S.Z., S.-C.Z., J.-Z.S., G.L., X.-D.F., X.Z., S.S., H.-M.Y., J.W., and R.H. revised the manuscript. All the authors reviewed and approved the final manuscript.

Acknowledgments
We thank many people not listed as authors who provided feedback, samples, and encouragement, especially Changguo Yan, Yimin Xu, Shanzhai Liu, Guanli Wang, Xiang Gao, Jianghong Wan, and Kaixing Qu. This work was supported by the National 863 Program of China (2013AA102505), the National Science-technology Support Plan Projects (2015BAD03B04), the Program of National Beef Cattle and Yak Industrial Technology System (CARS-37), the Technical Innovation Engineering Project of Shaanxi Province (2014KTZB02-02-01), the National Beef Cattle Improvement Center, the National & Local Joint Engineering Research Center for Modern Cattle Biotechnology and Application, the Beef Cattle Engineering Technology Research Center of Shaanxi Province, and the State Key Laboratory of Agricultural Genomics.

References
Akey JM, Ruhe AL, Akey DT, Wong AK, Connelly CF, Madeoy J, Nicholas TJ, Neff MW. 2010. Tracking footprints of artificial selection in the dog genome. Proc Natl Acad Sci U S A. 107(3):1160–1165.

Alexander DH, Novembre J, Lange K. 2009. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19(9):1655–1664.

Barrett JC, Fry B, Maller J, Daly MJ. 2005. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics. 21(2):263–265.

Bech-Sabat G, Lopez-Gatius F, Yaniz JL, Garcia-Ispierto I, Santolaria P, Serrano B, Sulon J, de Sousa NM, Beckers JF. 2008. Factors affecting plasma progesterone in the early fetal period in high producing dairy cows. Theriogenology. 69(4):426–432.

Bickhart DM, Xu L, Hutchison JL, Cole JB, Null DJ, Schroeder SG, Song J, Garcia JF, Sonstegard TS, Van Tassell CP, et al. 2016. Diversity and population-genetic properties of copy number variations and multicopy genes in cattle. DNA Res. 23(3):253–262.

Bindea G, Mlecnik B, Hackl H, Charoentong P, Tosolini M, Kirilovsky A, Fridman WH, Pages F, Trajanoski Z, Galon J. 2009. ClueGO: a Cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks. Bioinformatics. 25(8):1091–1093.

Bintanja R, van de Wal RSW. 2008. North American ice-sheet dynamics and the onset of 100,000-year glacial cycles. Nature. 454(7206):869–872.


Bloise E, Cassali GD, Ferreira MC, Ciarmela P, Petraglia F, Reis FM. 2010. Activin-related proteins in bovine mammary gland: localization and differential expression during gestational development and differentiation. J Dairy Sci. 93(10):4592–4601.

Brandes R, Arad R, Bar-Tana J. 1995. Inducers of adipose conversion activate transcription promoted by a peroxisome proliferator's response element in 3T3-L1 cells. Biochem Pharmacol. 50(11):1949–1951.

Browning SR, Browning BL. 2007. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am J Hum Genet. 81(5):1084–1097.

Cai D, Sun Y, Tang Z, Hu S, Li W, Zhao X, Xiang H, Zhou H. 2014. The origins of Chinese domestic cattle as revealed by ancient DNA analysis. J Archaeol Sci. 41:423–434.

Cai X, Chen H, Lei C, Wang S, Xue K, Zhang B. 2007. mtDNA diversity and genetic lineages of eighteen cattle breeds from Bos taurus and Bos indicus in China. Genetica. 131(2):175–183.

Cai X, Chen H, Wang S, Xue K, Lei C. 2006. Polymorphisms of two Y chromosome microsatellites in Chinese cattle. Genet Sel Evol. 38(5):525–534.

Canavez FC, Luche DD, Stothard P, Leite KR, Sousa-Canavez JM, Plastow G, Meidanis J, Souza MA, Feijao P, Moore SS, et al. 2012. Genome sequence and assembly of Bos indicus. J Hered. 103(3):342–348.

Chen S, Wang Y, Kong X, Liu D, Cheng H, Edwards RL. 2006. A possible Younger Dryas-type event during Asian monsoonal Termination 3. Sci China D Earth Sci. 49(9):982–990.

Choi JW, Liao X, Park S, Jeon HJ, Chung WH, Stothard P, Park YS, Lee JK, Lee KT, Kim SH, et al. 2013. Massively parallel sequencing of Chikso (Korean brindle cattle) to discover genome-wide SNPs and InDels. Mol Cells. 36(3):203–211.

Clark DL, Boler DD, Kutzler LW, Jones KA, McKeith FK, Killefer J, Carr TR, Dilger AC. 2011. Muscle gene expression associated with increased marbling in beef cattle. Anim Biotechnol. 22(2):51–63.

Daetwyler HD, Capitan A, Pausch H, Stothard P, van Binsbergen R, Brondum RF, Liao X, Djari A, Rodriguez SC, Grohs C, et al. 2014. Whole-genome sequencing of 234 bulls facilitates mapping of monogenic and complex traits in cattle. Nat Genet. 46(8):858–865.

Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, Handsaker RE, Lunter G, Marth GT, Sherry ST, et al. 2011. The variant call format and VCFtools. Bioinformatics. 27(15):2156–2158.

Decker JE, McKay SD, Rolf MM, Kim J, Molina AA, Sonstegard TS, Hanotte O, Gotherstrom A, Seabury CM, Praharani L, et al. 2014. Worldwide patterns of ancestry, divergence, and admixture in domesticated cattle. PLoS Genet. 10(3):e1004254.

Eberlein A, Takasuga A, Setoguchi K, Pfuhl R, Flisikowski K, Fries R, Klopp N, Furbass R, Weikard R, Kuhn C. 2009. Dissection of genetic factors modulating fetal growth in cattle indicates a substantial role of the non-SMC condensin I complex, subunit G (NCAPG) gene. Genetics. 183(3):951–964.

Fang X, Lu L, Yang S, Li J, An Z, Jiang PA, Chen X. 2002. Loess in Kunlun Mountains and its implications on desert development and Tibetan Plateau uplift in west China. Sci China D Earth Sci. 45(4):289–299.

Forti E, Aksanov O, Birk RZ. 2007. Temporal expression pattern of Bardet-Biedl syndrome genes in adipogenesis. Int J Biochem Cell B. 39(5):1055–1062.

Gan HY, Li JB, Wang HM, Gao YD, Liu WH, Li JP, Zhong JF. 2007. Relationship between the melanocortin receptor 1 (MC1R) gene and the coat color phenotype in cattle. Yi Chuan. 29:195–200.

Gibbs RA, Taylor JF, Van Tassell CP, Barendse W, Eversole KA, Gill CA, Green RD, Hamernik DL, Kappes SM, Lien S, et al. 2009. Genome-wide survey of SNP variation uncovers the genetic structure of cattle breeds. Science. 324(5926):528–532.

Gotoh T, Takahashi H, Nishimura T, Kuchida K, Mannen H. 2014. Meat produced by Japanese Black cattle and Wagyu. Anim Front. 4(4):46–54.

Grindflek E, Holzbauer R, Plastow G, Rothschild MF. 2002. Mapping and investigation of the porcine major insulin sensitive glucose transport (SLC2A4/GLUT4) gene as a candidate gene for meat quality and carcass traits. J Anim Breed Genet. 119(1):47–55.

Gutierrez-Gil B, Arranz JJ, Wiener P. 2015. An interpretive review of selective sweep studies in Bos taurus cattle populations: identification of unique and shared selection signals across breeds. Front Genet. 6:167.

Hansen PJ. 2004. Physiological and cellular adaptations of zebu cattle to thermal stress. Anim Reprod Sci. 82–83:349–360.

Hiendleder S, Lewalski H, Janke A. 2008. Complete mitochondrial genomes of Bos taurus and Bos indicus provide new insights into intra-species variation, taxonomy and domestication. Cytogenet Genome Res. 120(1–2):150–156.

Hu ZL, Park CA, Reecy JM. 2016. Developmental progress and current status of the Animal QTLdb. Nucleic Acids Res. 44(D1):D827–D833.

Hung CM, Shaner PJ, Zink RM, Liu WC, Chu TC, Huang WS, Li SH. 2014. Drastic population fluctuations explain the rapid extinction of the passenger pigeon. Proc Natl Acad Sci U S A. 111(29):10636–10641.

Kovacik A, Bulla J, Trakovicka A, Zitny J, Rafayova A. 2012. The effect of the porcine melanocortin-5 receptor (MC5R) gene associated with feed intake, carcass and physico-chemical characteristics. J Microbiol Biotechnol Food Sci. 1:498–506.

Labrecque R, Vigneault C, Blondin P, Sirard MA. 2013. Gene expression analysis of bovine oocytes with high developmental competence obtained from FSH-stimulated animals. Mol Reprod Dev. 80(6):428–440.

Lai SJ, Liu YP, Liu YX, Li XW, Yao YG. 2006. Genetic diversity and origin of Chinese cattle revealed by mtDNA D-loop sequence variation. Mol Phylogenet Evol. 38(1):146–154.

Lee KT, Chung WH, Lee SY, Choi JW, Kim J, Lim D, Lee S, Jang GW, Kim B, Choy YH, et al. 2013. Whole-genome resequencing of Hanwoo (Korean cattle) and insight into regions of homozygosity. BMC Genomics. 14:519.

Lee SS, Yang BS, Yang YH, Kang SY, Ko SB, Jung JK, Oh WY, Oh SJ, Kim KI. 2002. Analysis of melanocortin receptor 1 (MC1R) genotype in Korean brindle cattle and Korean cattle with dark muzzle. J Anim Sci Technol. 44(1):23–30.

Lei C, Chen H, Hu S. 2000. Studies on Y chromosome polymorphism and the origin and classification of Chinese yellow cattle. Acta Agric Boreali-Occidentalis Sin. 9:43–47.

Lei CZ, Chen H, Zhang HC, Cai X, Liu RY, Luo LY, Wang CF, Zhang W, Ge QL, Zhang RF, et al. 2006. Origin and phylogeographical structure of Chinese cattle. Anim Genet. 37(6):579–582.

Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 25(14):1754–1760.

Li H, Durbin R. 2011. Inference of human population history from individual whole-genome sequences. Nature. 475(7357):493–496.

Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 25(16):2078–2079.

Lin X, Luo J, Zhang L, Zhu J. 2013. MicroRNAs synergistically regulate milk fat synthesis in mammary gland epithelial cells of dairy goats. Gene Expr. 16(1):1–13.

Loftus RT, MacHugh DE, Bradley DG, Sharp PM, Cunningham P. 1994. Evidence for two independent domestications of cattle. Proc Natl Acad Sci U S A. 91(7):2757–2761.

McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, et al. 2010. The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20(9):1297–1303.


Medugorac I, Graf A, Grohs C, Rothammer S, Zagdsuren Y, Gladyr E, Zinovieva N, Barbieri J, Seichter D, Russ I, et al. 2017. Whole-genome analysis of introgressive hybridization and characterization of the bovine legacy of Mongolian yaks. Nat Genet. 49(3):470–475.

Mei C, Wang H, Zhu W, Wang H, Cheng G, Qu K, Guang X, Li A, Zhao C, Yang W, et al. 2016. Whole-genome sequencing of the endangered bovine species Gayal (Bos frontalis) provides new insights into its genetic features. Sci Rep. 6:19787.

Miller W, Schuster SC, Welch AJ, Ratan A, Bedoya-Reina OC, Zhao F, Kim HL, Burhans RC, Drautz DI, Wittekindt NE, et al. 2012. Polar and brown bear genomes reveal ancient admixture and demographic footprints of past climate change. Proc Natl Acad Sci U S A. 109(36):E2382–E2390.

Nadachowska-Brzyska K, Burri R, Smeds L, Ellegren H. 2016. PSMC analysis of effective population sizes in molecular ecology and its application to black-and-white Ficedula flycatchers. Mol Ecol. 25(5):1058–1072.

Orlando L, Ginolhac A, Zhang G, Froese D, Albrechtsen A, Stiller M, Schubert M, Cappellini E, Petersen B, Moltke I, et al. 2013. Recalibrating Equus evolution using the genome sequence of an early Middle Pleistocene horse. Nature. 499(7456):74–78.

Ouali A, Herrera-Mendez CH, Coulis G, Becila S, Boudjellal A, Aubry L, Sentandreu MA. 2006. Revisiting the conversion of muscle into meat and the underlying mechanisms. Meat Sci. 74(1):44–58.

Patterson N, Price AL, Reich D. 2006. Population structure and eigenanalysis. PLoS Genet. 2(12):e190.

Porto-Neto LR, Sonstegard TS, Liu GE, Bickhart DM, Da SM, Machado MA, Utsunomiya YT, Garcia JF, Gondro C, Van Tassell CP. 2013. Genomic divergence of zebu and taurine cattle identified through high-density SNP genotyping. BMC Genomics. 14(1):876.

Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, Maller J, Sklar P, de Bakker PI, Daly MJ, et al. 2007. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 81(3):559–575.

Qiu H, Ju ZY, Chang ZJ. 1993. A survey of cattle production in China. More attention to animal genetic resources. Food and Agriculture Organization of the United Nations.

Qiu Q, Wang L, Wang K, Yang Y, Ma T, Wang Z, Zhang X, Ni Z, Hou F, Long R, et al. 2015. Yak whole-genome resequencing reveals domestication signatures and prehistoric population expansions. Nat Commun. 6:10283.

Qiu Q Zhang G Ma T Qian W Wang J Ye Z Cao C Hu Q Kim J LarkinDM et al 2012 The yak genome and adaptation to life at highaltitude Nat Genet 44(8)946ndash949

Randhawa IAS Khatkar MS Thomson PC Raadsma HW Barendse W2016 A meta-assembly of selection signatures in cattle PLoS One11(4)e153013

Rincon G Islas-Trejo A Casellas J Ronin Y Soller M Lipkin E Medrano JF2009 Fine mapping and association analysis of a quantitative traitlocus for milk production traits on Bos taurus autosome 4 J Dairy Sci92(2)758ndash764

Ruetz TJ Lin AE Guttman JA 2012 Enterohaemorrhagic Escherichia colirequires the spectrin cytoskeleton for efficient attachment and ped-estal formation on host cells Microb Pathog 52(3)149ndash156

Sartori R Bastos MR Baruselli PS Gimenes LU Ereno RL Barros CM2010 Physiological differences and implications to reproductivemanagement of Bos taurus and Bos indicus cattle in a tropical envi-ronment Reprod Domest Rumin Vii 7(1)357ndash375

Sattler K Levkau B 2009 Sphingosine-1-phosphate as a mediator ofhigh-density lipoprotein effects in cardiovascular protectionCardiovasc Res 82(2)201ndash211

Scherf BD Pilling D 2015 The second report on the state of the worldrsquosanimal genetic resources for food and agriculture Food andAgriculture Organization of the United Nations

Schiffels S Durbin R 2014 Inferring human population size and separa-tion history from multiple genome sequences Nat Genet46(8)919ndash925

Setoguchi K Watanabe T Weikard R Albrecht E Kuhn C Kinoshita ASugimoto Y Takasuga A 2011 The SNP c1326TgtG in the non-SMC

condensin I complex subunit G (NCAPG) gene encoding apIle442Met variant is associated with an increase in body framesize at puberty in cattle Anim Genet 42(6)650ndash655

Sherratt A 1983 The secondary exploitation of animals in the OldWorld World Archaeol 15(1)90ndash104

Smith TP Grosse WM Freking BA Roberts AJ Stone RT Casas E WrayJE White J Cho J Fahrenkrug SC et al 2001 Sequence evaluation offour pooled-tissue normalized bovine cDNA libraries and construc-tion of a gene index for cattle Genome Res 11(4)626ndash630

Sorbolini S Marras G Gaspa G Dimauro C Cellesi M Valentini AMacciotta NP 2015 Detection of selection signatures inPiemontese and Marchigiana cattle two breeds with similar produc-tion aptitudes but different selection histories Genet Sel Evol 4752

Stothard P Choi JW Basu U Sumner-Thomson JM Meng Y Liao XMoore SS 2011 Whole genome resequencing of black Angusand Holstein cattle for SNP and CNV discovery BMC Genomics12559

Sun YB An ZS 2005 Late Pliocene-Pleistocene changes in mass accu-mulation rates of eolian deposits on the central Chinese LoessPlateau J Geophys Res-Atmos 110(D23)1193ndash1194

Svizzero S Tisdell C 2016 Input shortages and the lack of sustainabilityof bronze production by the Unetice In Working Papers onEconomics Ecology and the Environment No 202 Queensland(Australia) University of Queensland

Switonski M Mankowska M Salamon S 2013 Family of melanocortinreceptor (MCR) genes in mammals-mutations polymorphisms andphenotypic effects J Appl Genet 54(4)461ndash472

Van Vuure T 2002 History morphology and ecology of the aurochs (Bosprimigenius) Available from httpciteseerxistpsueduviewdocsummary doifrac1410115346285

Wang K Li M Hakonarson H 2010 ANNOVAR functional annotationof genetic variants from high-throughput sequencing data NucleicAcids Res 38(16)e164

Wang M Ding Y 1996 The importance of work animals in rural ChinaWorld Anim Rev 8665ndash67

Wedholm A Larsen LB Lindmark-Mansson H Karlsson AH AndrenA 2006 Effect of protein composition on the cheese-makingproperties of milk from individual dairy cows J Dairy Sci89(9)3296ndash3305

Weir BS Cockerham CC 1984 Estimating F-statistics for the analysis ofpopulation structure Evolution 38(6)1358ndash1370

Xu L Bickhart DM Cole JB Schroeder SG Song J Tassell CP SonstegardTS Liu GE 2015 Genomic signatures reveal new evidences for se-lection of important traits in domestic cattle Mol Biol Evol32(3)711ndash725

Xu Y Yu W Xiong Y Xie H Ren Z Xu D Lei M Zuo B Feng X 2011Molecular characterization and expression patterns of serinearginine-rich specific kinase 3 (SPRK3) in porcine skeletal muscleMol Biol Rep 38(5)2903ndash2909

Yahvah KM Brooker SL Williams JE Settles M Mcguire MAMcguire MK 2015 Elevated dairy fat intake in lactating womenalters milk lipid and fatty acids without detectible changes inexpression of genes related to lipid uptake or synthesis Nutr Res35(3)221

Yoon D Ko E 2016 Association study between SNPs of the genes withinbovine QTLs and meat quality of Hanwoo J Anim Sci 94(Suppl4)145

Yu Y Nie L He ZQ Wen JK Jian CS Zhang YP 1999 Mitochondrial DNAvariation in cattle of south China origin and introgression AnimGenet 30(4)245ndash250

Zhang H Paijmans JL Chang F Wu X Chen G Lei C Yang X Wei ZBradley DG Orlando L et al 2013 Morphological and genetic evi-dence for early Holocene cattle management in northeastern ChinaNat Commun 42755

Zhang H Wang S Wang Z Da Y Wang N Hu X Zhang Y Wang Y LengL Tang Z et al 2012 A genome-wide scan of selective sweeps in twobroiler chicken lines divergently selected for abdominal fat contentBMC Genomics 13704

Mei et al doi101093molbevmsx322 MBE

698Downloaded from httpsacademicoupcommbearticle-abstract3536884760963by Charles Sturt University useron 16 August 2018

Zhang Z Wang Z Yang Y Zhao J Chen Q Liao R Chen Z Zhang X XueM Yang H et al 2016 Identification of pleiotropic genes and genesets underlying growth and immunity traits a case study onMeishan pigs Animal 10(4)550ndash557

Zhao C, Tian F, Yu Y, Luo J, Hu Q, Bequette BJ, Baldwin VR, Liu G, Zan L, Scott UM, et al. 2012. Muscle transcriptomic analyses in Angus cattle with divergent tenderness. Mol Biol Rep. 39(4):4185–4193.

Zhao S, Zheng P, Dong S, Zhan X, Wu Q, Guo X, Hu Y, He W, Zhang S, Fan W, et al. 2013. Whole-genome sequencing of giant pandas provides insights into demographic history and local adaptation. Nat Genet. 45(1):67–71.

Zheng B, Xu Q, Shen Y. 2002. The relationship between climate change and Quaternary glacial cycles on the Qinghai–Tibetan Plateau: review and speculation. Quatern Int. 97–98:93–101.

Zhou X, Wang B, Pan Q, Zhang J, Kumar S, Sun X, Liu Z, Pan H, Lin Y, Liu G, et al. 2014. Whole-genome sequencing of the snub-nosed monkey provides insights into folivory and evolutionary history. Nat Genet. 46(12):1303–1310.

Zimin AV, Delcher AL, Florea L, Kelley DR, Schatz MC, Puiu D, Hanrahan F, Pertea G, Van Tassell CP, Sonstegard TS, et al. 2009. A whole-genome assembly of the domestic cow, Bos taurus. Genome Biol. 10(4):R42.

Genetic Architecture and Selection of Chinese Cattle . doi:10.1093/molbev/msx322 MBE

determined using a close relative of cattle, B. grunniens, as the outgroup. Population structure was further inferred using ADMIXTURE (Alexander et al. 2009) with the number of assumed ancestral populations (K) set from 2 to 7. Principal component analysis was carried out using the smartPCA program of the EIGENSOFT package (Patterson et al. 2006).

Genome-Wide Patterns of Genetic Diversity and Divergence
The average pairwise nucleotide diversity (θπ) and Tajima's D statistic of each breed were calculated using a sliding-window approach (50-kb windows sliding in 10-kb steps) with the default parameters of VCFtools (Danecek et al. 2011). Population differentiation was measured by pairwise FST using the unbiased estimator of Weir and Cockerham (1984) with the default parameters.
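The sliding-window diversity calculation can be illustrated with a minimal Python sketch. It is not the VCFtools implementation: the function names, the input layout (position, alternate-allele count, number of chromosomes) and the choice to divide by the full window span (which assumes every site in the window is callable) are assumptions made for illustration only.

```python
def site_pi(alt_count, n_chrom):
    """Per-site pairwise diversity for a biallelic SNP: the fraction of
    chromosome pairs that carry different alleles."""
    if n_chrom < 2:
        return 0.0
    return 2.0 * alt_count * (n_chrom - alt_count) / (n_chrom * (n_chrom - 1))


def windowed_pi(snps, chrom_length, window=50_000, step=10_000):
    """Average diversity in sliding windows.

    snps: iterable of (position, alt_count, n_chrom), positions 1-based.
    Returns a list of (start, end, pi), where pi is the summed per-site
    diversity divided by the window span (assumption: all sites callable).
    """
    snps = list(snps)
    results = []
    start = 1
    while start <= chrom_length:
        end = min(start + window - 1, chrom_length)
        total = sum(site_pi(k, n) for pos, k, n in snps if start <= pos <= end)
        results.append((start, end, total / (end - start + 1)))
        start += step
    return results
```

With the default arguments the windows are 50 kb wide and advance by 10 kb, matching the window geometry described above; a production analysis would also need to account for missing data and uncallable sites.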

Linkage Disequilibrium
To estimate the genome-wide LD of each breed, we calculated the mean r2 values for pairwise markers with Haploview software (Barrett et al. 2005). Only SNPs with a minor allele frequency >0.05 in three groups (Chinese cattle, indicine, and taurine) were used. The parameters of Haploview were set to "-maxdistance 200 -dprime -memory 5000 -minGeno 0.6 -minMAF 0.05 -hwcutoff 0.001". To minimize the influence of sample size, only breeds with at least five individuals were used, and breeds with more than five samples were down-sampled to five.
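For readers unfamiliar with the r2 statistic, the following Python sketch computes it for pairs of biallelic SNPs and averages it over marker pairs within a maximum distance. It assumes phased haplotypes coded 0/1, whereas Haploview estimates haplotype frequencies from unphased genotypes; the function names and the 200,000-bp default are illustrative assumptions rather than a reproduction of the Haploview settings.

```python
def r_squared(alleles_a, alleles_b):
    """r^2 between two biallelic SNPs from phased alleles (0/1),
    one entry per chromosome. Returns NaN if either SNP is monomorphic."""
    n = len(alleles_a)
    if n == 0 or n != len(alleles_b):
        raise ValueError("allele lists must be non-empty and of equal length")
    p_a = sum(alleles_a) / n
    p_b = sum(alleles_b) / n
    p_ab = sum(1 for a, b in zip(alleles_a, alleles_b) if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return float("nan")
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / denom


def mean_r2(positions, snp_alleles, max_distance=200_000):
    """Mean r^2 over SNP pairs no further apart than max_distance.
    positions must be sorted; snp_alleles[i] holds the alleles of SNP i."""
    total, count = 0.0, 0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if positions[j] - positions[i] > max_distance:
                break  # sorted positions: later SNPs are even further away
            value = r_squared(snp_alleles[i], snp_alleles[j])
            if value == value:  # skip NaN
                total += value
                count += 1
    return total / count if count else float("nan")
```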

Haplotype Diversity
For the haplotype diversity analysis, the same breeds and SNP set were used as in the linkage disequilibrium analysis. To calculate haplotype diversity, the genome was divided into 5- to 500-kb bins (detailed in supplementary table S6, Supplementary Material online). Windows with fewer than two SNPs per 5 kb were removed, and for windows with more than four SNPs, four SNPs were randomly selected. Considering the substantial variation in the recombination rate across the cattle genome, we adopted a sliding-window strategy and allowed the window to slide by half its length each time. The frequencies of haplotypes were counted, and haplotype diversity (H) was calculated as described previously (Daetwyler et al. 2014).
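The text defers the exact definition of H to Daetwyler et al. (2014). As a hedged illustration only, the sketch below uses Nei's unbiased estimator, H = n/(n - 1) × (1 - Σ p_i²), which is an assumption and may differ from the formula actually applied; the helper that generates half-overlapping windows and all names are likewise hypothetical.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity (an assumed estimator).

    haplotypes: list of hashable haplotypes (e.g. strings such as "0110"),
    one per sampled chromosome.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def half_overlapping_windows(chrom_length, window):
    """Yield (start, end) half-open intervals that slide by half a window."""
    step = max(1, window // 2)
    start = 0
    while start + window <= chrom_length:
        yield (start, start + window)
        start += step
```

For example, four chromosomes carrying two haplotypes at equal frequency give H = 4/3 × 0.5 ≈ 0.67, while four distinct haplotypes give H = 1.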

PSMC Analysis
We inferred the demographic history of B. taurus and B. indicus using the Pairwise Sequentially Markovian Coalescent (PSMC) model (Li and Durbin 2011). In the default PSMC approach, a whole-genome diploid consensus sequence is generated from the alignment file of one sample. Because most of our genomes were not sequenced to a high average depth of coverage (mostly about 10×), and because PSMC has high false-negative rates at low depths of coverage (i.e., <20×), leading to a systematic underestimation of true event times (Orlando et al. 2013; Nadachowska-Brzyska et al. 2016), we applied a modified PSMC approach: the SNPs of one sample were extracted from variants called on the cohort of all samples and converted to consensus sequences.

This procedure was followed for samples (marked in supplementary table S1, Supplementary Material online) with relatively high sequencing depth in each breed to ensure the quality of the consensus sequences. We then transformed the consensus sequence into a fasta-like format using "fq2psmcfa". The PSMC parameters were set as follows: "-p 4+25*2+4+6". The mutation rate per generation per site was estimated as $\mu = D \times g / (2 \times T)$, where D is the observed frequency of pairwise differences between two species, T is the estimated divergence time, and g is the estimated generation time for the two species. The cattle generation time (g) was set to an estimate of 5 years, and the estimated divergence time was set to 4.9 Ma based on a previous study on cattle and yak (Qiu et al. 2012). These values yielded an estimated mutation rate of $9.796 \times 10^{-9}$ mutations per generation per site. We obtained the mass accumulation rate (MAR) of Chinese loess over the past 3.6 My (Sun and An 2005), an index indicating cold and dry or warm and wet climatic periods in China (fig. 2a and supplementary figs. S13 and S14, Supplementary Material online).
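The mutation-rate formula above can be checked numerically. The Python sketch below is illustrative: the function names are hypothetical, and because the observed divergence D is not stated in this section, the value printed by the back-calculation is only what the reported rate, generation time, and divergence time imply arithmetically.

```python
def mutation_rate_per_generation(divergence, generation_time_years,
                                 divergence_time_years):
    """mu = D * g / (2 * T): per-site, per-generation mutation rate."""
    return divergence * generation_time_years / (2.0 * divergence_time_years)


def implied_divergence(mu, generation_time_years, divergence_time_years):
    """Invert the formula: D = mu * 2 * T / g."""
    return mu * 2.0 * divergence_time_years / generation_time_years


if __name__ == "__main__":
    g = 5.0          # generation time in years (from the text)
    t = 4.9e6        # cattle-yak divergence time in years (from the text)
    mu = 9.796e-9    # reported mutation rate per generation per site
    d = implied_divergence(mu, g, t)
    print(f"implied pairwise divergence D = {d:.4f}")
    print(f"round-trip mu = {mutation_rate_per_generation(d, g, t):.4e}")
```

The factor of 2 in the denominator reflects that differences accumulate along both lineages since the split.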

To evaluate the differences between our revised PSMC approach and the default method, we reconstructed trajectories from two samples of the same breed (FLV) with different depths of coverage (SRR1262805 with 24× and SRR1262808 with 9×), which should yield similar inferences. For the high-depth sample, the PSMC profiles retrieved from the default and revised approaches were almost identical (supplementary fig. S11, Supplementary Material online) with regard to both the timing and the magnitude of demographic events, except for the most recent expansion phase, for which the revised approach gave a lower intensity. PSMC inference based on the low-depth sample showed a biased demographic model, which could be satisfactorily corrected with our revised PSMC approach. Additionally, we note that the bias observed for genomes with low depth (<20×) could also be corrected by assuming a uniform false-negative rate (uFNR), using the option "-M" of the plotting script "psmc_plot.pl" to specify the uFNR correction rate (Orlando et al. 2013; Hung et al. 2014). With the uFNR correction, the plot for the low-depth sample was similar to the high-depth PSMC inference (supplementary fig. S11, Supplementary Material online). No striking differences were observed among the PSMC profiles reconstructed from different taurine breeds with different sequencing depths of coverage (ranging from 9× to 24×; supplementary table S1, Supplementary Material online and fig. 2a). Consequently, we consider our revised approach a suitable method, introducing only acceptable biases, for PSMC inference from samples with low average sequencing depth.

To explore the potential impact of the reference genome on the PSMC results of indicine breeds, we mapped sequence reads of indicine samples against the assembly of B. indicus (Nelore breed; GenBank assembly accession GCF_000247795.1) and repeated the PSMC analysis (default setting with uFNR correction). Although the PSMC profiles reconstructed from different references were not identical, the qualitative results hold for indicine breeds with the B. indicus reference genome (fig. 2a and supplementary fig. S12, Supplementary Material online).

MSMC Analysis
The Multiple Sequentially Markovian Coalescent (MSMC) model (Schiffels and Durbin 2014) was used to infer changes in effective population size (Ne) and divergence time between breeds (samples marked in supplementary table S1, Supplementary Material online). MSMC is an extension of the PSMC model, which uses a hidden Markov model to scan genomes and analyze patterns of heterozygosity, with long DNA segments of low heterozygosity reflecting recent coalescent events. The rate of coalescent events is then used to estimate Ne at a given time. To scale the output of MSMC to real time and population sizes, we used the generation time and mutation rate mentioned earlier (see PSMC Analysis). We obtained atmospheric surface air temperature (SAT) and global sea level (GSL) data for the past 3 My (Bintanja and van de Wal 2008) (supplementary figs. S13 and S14, Supplementary Material online).
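The scaling step can be sketched as follows. The conversion assumes the convention commonly documented for MSMC output, in which time boundaries are reported in units of the per-generation mutation rate and the coalescence rate is scaled accordingly; this convention, and the function names, are assumptions of the sketch rather than details given in the text. The mutation rate and generation time are the values stated in the PSMC Analysis section.

```python
MUTATION_RATE = 9.796e-9   # per site per generation (from the text)
GENERATION_TIME = 5.0      # years (from the text)


def scaled_time_to_years(scaled_time, mu=MUTATION_RATE, gen=GENERATION_TIME):
    """Convert a mutation-scaled time boundary to years:
    generations = scaled_time / mu; years = generations * gen."""
    return scaled_time / mu * gen


def coalescence_rate_to_ne(scaled_rate, mu=MUTATION_RATE):
    """Convert a scaled coalescence rate (lambda) to effective population
    size: Ne = (1 / lambda) / (2 * mu)."""
    if scaled_rate <= 0:
        raise ValueError("coalescence rate must be positive")
    return (1.0 / scaled_rate) / (2.0 * mu)
```

Under these assumptions, a scaled time equal to 1,000 × μ corresponds to 1,000 generations, or 5,000 years with a 5-year generation time.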

Selective Sweep Analysis
Considering the sample size and close genetic background of indicine cattle (NEL, BRM, and GIR), we pooled these three indicine breeds into one group (IND) in our selection analysis. Seven cattle groups (QCC, RAN, JBC, IND, HOL, FLV, and JER) with sample sizes >10 were retained for the following analysis. To identify candidate loci for breed-specific phenotypes that are known to be under positive selection, we used the di statistic (Akey et al. 2010) to measure the locus-specific divergence in allele frequency for each group based on unbiased estimates of pairwise FST. Briefly, for each SNP, we calculated the statistic

$$d_i = \sum_{j \neq i} \frac{F_{ST}^{ij} - E[F_{ST}^{ij}]}{sd[F_{ST}^{ij}]}$$

where $E[F_{ST}^{ij}]$ and $sd[F_{ST}^{ij}]$ denote the expected value and SD of FST between groups i and j calculated from all SNPs. For each group, di was averaged over the SNPs in nonoverlapping 50-kb windows. Windows with fewer than 10 SNPs were removed. The top 1% of windows with the highest mean di score were defined as candidate selective sweep regions. Adjacent sweeps within a distance of 50 kb were merged into one sweep. Selective sweep regions were annotated with cattle QTLdb release 29 from the Animal Quantitative Trait Loci Database (Hu et al. 2016). Candidate genes under positive selection were defined as those in which more than half of the gene interval was found in selective sweep regions. Tajima's D statistic was computed using VCFtools for each candidate gene. Gene Ontology (GO) enrichment analysis for genes in selective sweep regions was performed with a hypergeometric test using ClueGO (Bindea et al. 2009). The false discovery rate (FDR) was used to correct the P values with the Benjamini–Hochberg approach.
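The di computation and the window-based ranking described above can be sketched in Python as follows. This is a minimal illustration, not the authors' pipeline: the input layout (one list of per-SNP FST values per comparison group), the use of the sample standard deviation, and all function names are assumptions, and the merging of adjacent sweeps is omitted.

```python
from statistics import mean, stdev


def di_per_snp(fst_focal_vs_others):
    """Per-SNP d_i for one focal group.

    fst_focal_vs_others: dict mapping each other group j to a list of
    per-SNP FST values between the focal group i and group j (same SNP
    order in every list). Each list is standardized genome-wide and the
    standardized values are summed over j.
    """
    n_snps = len(next(iter(fst_focal_vs_others.values())))
    scores = [0.0] * n_snps
    for values in fst_focal_vs_others.values():
        expected = mean(values)
        spread = stdev(values)  # assumption: sample SD
        if spread == 0:
            continue  # no variation: contributes nothing
        for k, fst in enumerate(values):
            scores[k] += (fst - expected) / spread
    return scores


def window_means(positions, scores, window=50_000, min_snps=10):
    """Mean score in nonoverlapping windows (1-based positions),
    dropping windows with fewer than min_snps SNPs."""
    bins = {}
    for pos, score in zip(positions, scores):
        bins.setdefault((pos - 1) // window, []).append(score)
    return {idx: mean(v) for idx, v in sorted(bins.items()) if len(v) >= min_snps}


def top_windows(window_scores, fraction=0.01):
    """Indices of the highest-scoring windows (at least one window)."""
    n_top = max(1, int(len(window_scores) * fraction))
    ranked = sorted(window_scores, key=window_scores.get, reverse=True)
    return ranked[:n_top]
```

Standardizing each pairwise FST before summing keeps comparisons with highly divergent groups (for example, indicine versus taurine) from dominating the statistic.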

Supplementary Material
Supplementary data are available at Molecular Biology and Evolution online.

Author Contributions
L.-S.Z., W.-J.Z., G.C., and H.-B.W. led the experiments and designed the analytical strategy. L.-S.Z., C.-G.M., H.-C.W., W.-Q.T., L.-S.G., Y.-Y.Z., Z.-L.J., Y.-P.X., and X.-Z.S. performed animal work and prepared biological samples. C.-G.M., H.-C.W., G.C., H.-B.W., C.-P.Z., A.-N.L., W.-C.Y., C.-L.J., and S.-H.W. constructed the DNA library and performed sequencing. W.-J.Z., Q.-J.L., L.-Z.W., X.-L.W., X.-M.G., and C.-Z.W. detected, annotated, and summarized variants. W.-J.Z., Q.-J.L., C.-G.M., and H.-C.W. performed selection analysis. W.-J.Z., L.-Z.W., C.-G.M., and H.-C.W. analyzed the origin of Chinese indicine cattle and population history. C.-G.M., H.-C.W., W.-J.Z., Q.-J.L., and L.-Z.W. wrote the manuscript. L.-S.Z., S.-C.Z., J.-Z.S., G.L., X.-D.F., X.Z., S.S., H.-M.Y., J.W., and R.H. revised the manuscript. All the authors reviewed and approved the final manuscript.

Acknowledgments
We thank many people not listed as authors who provided feedback, samples, and encouragement, especially Changguo Yan, Yimin Xu, Shanzhai Liu, Guanli Wang, Xiang Gao, Jianghong Wan, and Kaixing Qu. This work was supported by the National 863 Program of China (2013AA102505), the National Science-technology Support Plan Projects (2015BAD03B04), the Program of National Beef Cattle and Yak Industrial Technology System (CARS-37), the Technical Innovation Engineering Project of Shaanxi Province (2014KTZB02-02-01), the National Beef Cattle Improvement Center, the National & Local Joint Engineering Research Center for Modern Cattle Biotechnology and Application, the Beef Cattle Engineering Technology Research Center of Shaanxi Province, and the State Key Laboratory of Agricultural Genomics.

References
Akey JM, Ruhe AL, Akey DT, Wong AK, Connelly CF, Madeoy J, Nicholas TJ, Neff MW. 2010. Tracking footprints of artificial selection in the dog genome. Proc Natl Acad Sci U S A. 107(3):1160–1165.

Alexander DH, Novembre J, Lange K. 2009. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19(9):1655–1664.

Barrett JC, Fry B, Maller J, Daly MJ. 2005. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 21(2):263–265.

Bech-Sabat G, Lopez-Gatius F, Yaniz JL, Garcia-Ispierto I, Santolaria P, Serrano B, Sulon J, de Sousa NM, Beckers JF. 2008. Factors affecting plasma progesterone in the early fetal period in high producing dairy cows. Theriogenology 69(4):426–432.

Bickhart DM, Xu L, Hutchison JL, Cole JB, Null DJ, Schroeder SG, Song J, Garcia JF, Sonstegard TS, Van Tassell CP, et al. 2016. Diversity and population-genetic properties of copy number variations and multicopy genes in cattle. DNA Res. 23(3):253–262.

Bindea G, Mlecnik B, Hackl H, Charoentong P, Tosolini M, Kirilovsky A, Fridman WH, Pages F, Trajanoski Z, Galon J. 2009. ClueGO: a Cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks. Bioinformatics 25(8):1091–1093.

Bintanja R, van de Wal RSW. 2008. North American ice-sheet dynamics and the onset of 100,000-year glacial cycles. Nature 454(7206):869–872.

Bloise E, Cassali GD, Ferreira MC, Ciarmela P, Petraglia F, Reis FM. 2010. Activin-related proteins in bovine mammary gland: localization and differential expression during gestational development and differentiation. J Dairy Sci. 93(10):4592–4601.

Brandes R, Arad R, Bar-Tana J. 1995. Inducers of adipose conversion activate transcription promoted by a peroxisome proliferator's response element in 3T3-L1 cells. Biochem Pharmacol. 50(11):1949–1951.

Browning SR, Browning BL. 2007. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am J Hum Genet. 81(5):1084–1097.

Cai D, Sun Y, Tang Z, Hu S, Li W, Zhao X, Xiang H, Zhou H. 2014. The origins of Chinese domestic cattle as revealed by ancient DNA analysis. J Archaeol Sci. 41:423–434.

Cai X, Chen H, Lei C, Wang S, Xue K, Zhang B. 2007. mtDNA diversity and genetic lineages of eighteen cattle breeds from Bos taurus and Bos indicus in China. Genetica 131(2):175–183.

Cai X, Chen H, Wang S, Xue K, Lei C. 2006. Polymorphisms of two Y chromosome microsatellites in Chinese cattle. Genet Sel Evol. 38(5):525–534.

Canavez FC, Luche DD, Stothard P, Leite KR, Sousa-Canavez JM, Plastow G, Meidanis J, Souza MA, Feijao P, Moore SS, et al. 2012. Genome sequence and assembly of Bos indicus. J Hered. 103(3):342–348.

Chen S, Wang Y, Kong X, Liu D, Cheng H, Edwards RL. 2006. A possible Younger Dryas-type event during Asian monsoonal Termination 3. Sci China D Earth Sci. 49(9):982–990.

Choi JW, Liao X, Park S, Jeon HJ, Chung WH, Stothard P, Park YS, Lee JK, Lee KT, Kim SH, et al. 2013. Massively parallel sequencing of Chikso (Korean brindle cattle) to discover genome-wide SNPs and InDels. Mol Cells 36(3):203–211.

Clark DL, Boler DD, Kutzler LW, Jones KA, McKeith FK, Killefer J, Carr TR, Dilger AC. 2011. Muscle gene expression associated with increased marbling in beef cattle. Anim Biotechnol. 22(2):51–63.

Daetwyler HD, Capitan A, Pausch H, Stothard P, van Binsbergen R, Brondum RF, Liao X, Djari A, Rodriguez SC, Grohs C, et al. 2014. Whole-genome sequencing of 234 bulls facilitates mapping of monogenic and complex traits in cattle. Nat Genet. 46(8):858–865.

Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, Handsaker RE, Lunter G, Marth GT, Sherry ST, et al. 2011. The variant call format and VCFtools. Bioinformatics 27(15):2156–2158.

Decker JE, McKay SD, Rolf MM, Kim J, Molina AA, Sonstegard TS, Hanotte O, Gotherstrom A, Seabury CM, Praharani L, et al. 2014. Worldwide patterns of ancestry, divergence, and admixture in domesticated cattle. PLoS Genet. 10(3):e1004254.

Eberlein A, Takasuga A, Setoguchi K, Pfuhl R, Flisikowski K, Fries R, Klopp N, Furbass R, Weikard R, Kuhn C. 2009. Dissection of genetic factors modulating fetal growth in cattle indicates a substantial role of the non-SMC condensin I complex, subunit G (NCAPG) gene. Genetics 183(3):951–964.

Fang X, Lu L, Yang S, Li J, An Z, Jiang PA, Chen X. 2002. Loess in Kunlun Mountains and its implications on desert development and Tibetan Plateau uplift in west China. Sci China D Earth Sci. 45(4):289–299.

Forti E, Aksanov O, Birk RZ. 2007. Temporal expression pattern of Bardet-Biedl syndrome genes in adipogenesis. Int J Biochem Cell B. 39(5):1055–1062.

Gan HY, Li JB, Wang HM, Gao YD, Liu WH, Li JP, Zhong JF. 2007. Relationship between the melanocortin receptor 1 (MC1R) gene and the coat color phenotype in cattle. Yi Chuan 29:195–200.

Gibbs RA, Taylor JF, Van Tassell CP, Barendse W, Eversole KA, Gill CA, Green RD, Hamernik DL, Kappes SM, Lien S, et al. 2009. Genome-wide survey of SNP variation uncovers the genetic structure of cattle breeds. Science 324(5926):528–532.

Gotoh T, Takahashi H, Nishimura T, Kuchida K, Mannen H. 2014. Meat produced by Japanese Black cattle and Wagyu. Anim Front. 4(4):46–54.

Grindflek E, Holzbauer R, Plastow G, Rothschild MF. 2002. Mapping and investigation of the porcine major insulin sensitive glucose transport (SLC2A4/GLUT4) gene as a candidate gene for meat quality and carcass traits. J Anim Breed Genet. 119(1):47–55.

Gutierrez-Gil B, Arranz JJ, Wiener P. 2015. An interpretive review of selective sweep studies in Bos taurus cattle populations: identification of unique and shared selection signals across breeds. Front Genet. 6:167.

Hansen PJ. 2004. Physiological and cellular adaptations of zebu cattle to thermal stress. Anim Reprod Sci. 82–83:349–360.

Hiendleder S, Lewalski H, Janke A. 2008. Complete mitochondrial genomes of Bos taurus and Bos indicus provide new insights into intra-species variation, taxonomy and domestication. Cytogenet Genome Res. 120(1–2):150–156.

Hu ZL, Park CA, Reecy JM. 2016. Developmental progress and current status of the Animal QTLdb. Nucleic Acids Res. 44(D1):D827–D833.

Hung CM, Shaner PJ, Zink RM, Liu WC, Chu TC, Huang WS, Li SH. 2014. Drastic population fluctuations explain the rapid extinction of the passenger pigeon. Proc Natl Acad Sci U S A. 111(29):10636–10641.

Kovacik A, Bulla J, Trakovicka A, Zitny J, Rafayova A. 2012. The effect of the porcine melanocortin-5 receptor (MC5R) gene associated with feed intake, carcass and physico-chemical characteristics. J Microbiol Biotechnol Food Sci. 1:498–506.

Labrecque R, Vigneault C, Blondin P, Sirard MA. 2013. Gene expression analysis of bovine oocytes with high developmental competence obtained from FSH-stimulated animals. Mol Reprod Dev. 80(6):428–440.

Lai SJ, Liu YP, Liu YX, Li XW, Yao YG. 2006. Genetic diversity and origin of Chinese cattle revealed by mtDNA D-loop sequence variation. Mol Phylogenet Evol. 38(1):146–154.

Lee KT, Chung WH, Lee SY, Choi JW, Kim J, Lim D, Lee S, Jang GW, Kim B, Choy YH, et al. 2013. Whole-genome resequencing of Hanwoo (Korean cattle) and insight into regions of homozygosity. BMC Genomics 14:519.

Lee SS, Yang BS, Yang YH, Kang SY, Ko SB, Jung JK, Oh WY, Oh SJ, Kim KI. 2002. Analysis of melanocortin receptor 1 (MC1R) genotype in Korean brindle cattle and Korean cattle with dark muzzle. J Anim Sci Technol. 44(1):23–30.

Lei C, Chen H, Hu S. 2000. Studies on Y chromosome polymorphism and the origin and classification of Chinese yellow cattle. Acta Agric Boreali-Occidentalis Sin. 9:43–47.

Lei CZ, Chen H, Zhang HC, Cai X, Liu RY, Luo LY, Wang CF, Zhang W, Ge QL, Zhang RF, et al. 2006. Origin and phylogeographical structure of Chinese cattle. Anim Genet. 37(6):579–582.

Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25(14):1754–1760.

Li H, Durbin R. 2011. Inference of human population history from individual whole-genome sequences. Nature 475(7357):493–496.

Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25(16):2078–2079.

Lin X, Luo J, Zhang L, Zhu J. 2013. MicroRNAs synergistically regulate milk fat synthesis in mammary gland epithelial cells of dairy goats. Gene Expr. 16(1):1–13.

Loftus RT, MacHugh DE, Bradley DG, Sharp PM, Cunningham P. 1994. Evidence for two independent domestications of cattle. Proc Natl Acad Sci U S A. 91(7):2757–2761.

McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, et al. 2010. The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20(9):1297–1303.

Medugorac I, Graf A, Grohs C, Rothammer S, Zagdsuren Y, Gladyr E, Zinovieva N, Barbieri J, Seichter D, Russ I, et al. 2017. Whole-genome analysis of introgressive hybridization and characterization of the bovine legacy of Mongolian yaks. Nat Genet. 49(3):470–475.

Mei C, Wang H, Zhu W, Wang H, Cheng G, Qu K, Guang X, Li A, Zhao C, Yang W, et al. 2016. Whole-genome sequencing of the endangered bovine species Gayal (Bos frontalis) provides new insights into its genetic features. Sci Rep. 6:19787.

Miller W, Schuster SC, Welch AJ, Ratan A, Bedoya-Reina OC, Zhao F, Kim HL, Burhans RC, Drautz DI, Wittekindt NE, et al. 2012. Polar and brown bear genomes reveal ancient admixture and demographic footprints of past climate change. Proc Natl Acad Sci U S A. 109(36):E2382–E2390.

Nadachowska-Brzyska K, Burri R, Smeds L, Ellegren H. 2016. PSMC analysis of effective population sizes in molecular ecology and its application to black-and-white Ficedula flycatchers. Mol Ecol. 25(5):1058–1072.

Orlando L, Ginolhac A, Zhang G, Froese D, Albrechtsen A, Stiller M, Schubert M, Cappellini E, Petersen B, Moltke I, et al. 2013. Recalibrating Equus evolution using the genome sequence of an early Middle Pleistocene horse. Nature 499(7456):74–78.

Ouali A, Herrera-Mendez CH, Coulis G, Becila S, Boudjellal A, Aubry L, Sentandreu MA. 2006. Revisiting the conversion of muscle into meat and the underlying mechanisms. Meat Sci. 74(1):44–58.

Patterson N, Price AL, Reich D. 2006. Population structure and eigenanalysis. PLoS Genet. 2(12):e190.

Porto-Neto LR, Sonstegard TS, Liu GE, Bickhart DM, Da SM, Machado MA, Utsunomiya YT, Garcia JF, Gondro C, Van Tassell CP. 2013. Genomic divergence of zebu and taurine cattle identified through high-density SNP genotyping. BMC Genomics 14(1):876.

Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, Maller J, Sklar P, de Bakker PI, Daly MJ, et al. 2007. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 81(3):559–575.

Qiu H, Ju ZY, Chang ZJ. 1993. A survey of cattle production in China. More attention to animal genetic resources. Food and Agriculture Organization of the United Nations.

Qiu Q, Wang L, Wang K, Yang Y, Ma T, Wang Z, Zhang X, Ni Z, Hou F, Long R, et al. 2015. Yak whole-genome resequencing reveals domestication signatures and prehistoric population expansions. Nat Commun. 6:10283.

Qiu Q, Zhang G, Ma T, Qian W, Wang J, Ye Z, Cao C, Hu Q, Kim J, Larkin DM, et al. 2012. The yak genome and adaptation to life at high altitude. Nat Genet. 44(8):946–949.

Randhawa IAS, Khatkar MS, Thomson PC, Raadsma HW, Barendse W. 2016. A meta-assembly of selection signatures in cattle. PLoS One 11(4):e153013.

Rincon G, Islas-Trejo A, Casellas J, Ronin Y, Soller M, Lipkin E, Medrano JF. 2009. Fine mapping and association analysis of a quantitative trait locus for milk production traits on Bos taurus autosome 4. J Dairy Sci. 92(2):758–764.

Ruetz TJ, Lin AE, Guttman JA. 2012. Enterohaemorrhagic Escherichia coli requires the spectrin cytoskeleton for efficient attachment and pedestal formation on host cells. Microb Pathog. 52(3):149–156.

Sartori R, Bastos MR, Baruselli PS, Gimenes LU, Ereno RL, Barros CM. 2010. Physiological differences and implications to reproductive management of Bos taurus and Bos indicus cattle in a tropical environment. Reprod Domest Rumin Vii. 7(1):357–375.

Sattler K, Levkau B. 2009. Sphingosine-1-phosphate as a mediator of high-density lipoprotein effects in cardiovascular protection. Cardiovasc Res. 82(2):201–211.

Scherf BD, Pilling D. 2015. The second report on the state of the world's animal genetic resources for food and agriculture. Food and Agriculture Organization of the United Nations.

Schiffels S, Durbin R. 2014. Inferring human population size and separation history from multiple genome sequences. Nat Genet. 46(8):919–925.

Setoguchi K, Watanabe T, Weikard R, Albrecht E, Kuhn C, Kinoshita A, Sugimoto Y, Takasuga A. 2011. The SNP c.1326T>G in the non-SMC condensin I complex, subunit G (NCAPG) gene encoding a p.Ile442Met variant is associated with an increase in body frame size at puberty in cattle. Anim Genet. 42(6):650–655.

Sherratt A. 1983. The secondary exploitation of animals in the Old World. World Archaeol. 15(1):90–104.

Smith TP, Grosse WM, Freking BA, Roberts AJ, Stone RT, Casas E, Wray JE, White J, Cho J, Fahrenkrug SC, et al. 2001. Sequence evaluation of four pooled-tissue normalized bovine cDNA libraries and construction of a gene index for cattle. Genome Res. 11(4):626–630.

Sorbolini S, Marras G, Gaspa G, Dimauro C, Cellesi M, Valentini A, Macciotta NP. 2015. Detection of selection signatures in Piemontese and Marchigiana cattle, two breeds with similar production aptitudes but different selection histories. Genet Sel Evol. 47:52.

Stothard P, Choi JW, Basu U, Sumner-Thomson JM, Meng Y, Liao X, Moore SS. 2011. Whole genome resequencing of black Angus and Holstein cattle for SNP and CNV discovery. BMC Genomics 12:559.

Sun YB, An ZS. 2005. Late Pliocene-Pleistocene changes in mass accumulation rates of eolian deposits on the central Chinese Loess Plateau. J Geophys Res-Atmos. 110(D23):1193–1194.

Svizzero S, Tisdell C. 2016. Input shortages and the lack of sustainability of bronze production by the Unetice. In: Working Papers on Economics, Ecology and the Environment, No. 202. Queensland (Australia): University of Queensland.

Switonski M, Mankowska M, Salamon S. 2013. Family of melanocortin receptor (MCR) genes in mammals-mutations, polymorphisms and phenotypic effects. J Appl Genet. 54(4):461–472.

Van Vuure T. 2002. History, morphology and ecology of the aurochs (Bos primigenius). Available from: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.534.6285

Wang K, Li M, Hakonarson H. 2010. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38(16):e164.

Wang M, Ding Y. 1996. The importance of work animals in rural China. World Anim Rev. 86:65–67.

Wedholm A, Larsen LB, Lindmark-Mansson H, Karlsson AH, Andren A. 2006. Effect of protein composition on the cheese-making properties of milk from individual dairy cows. J Dairy Sci. 89(9):3296–3305.

Weir BS, Cockerham CC. 1984. Estimating F-statistics for the analysis of population structure. Evolution 38(6):1358–1370.

Xu L, Bickhart DM, Cole JB, Schroeder SG, Song J, Tassell CP, Sonstegard TS, Liu GE. 2015. Genomic signatures reveal new evidences for selection of important traits in domestic cattle. Mol Biol Evol. 32(3):711–725.

Xu Y, Yu W, Xiong Y, Xie H, Ren Z, Xu D, Lei M, Zuo B, Feng X. 2011. Molecular characterization and expression patterns of serine/arginine-rich specific kinase 3 (SPRK3) in porcine skeletal muscle. Mol Biol Rep. 38(5):2903–2909.

Yahvah KM, Brooker SL, Williams JE, Settles M, Mcguire MA, Mcguire MK. 2015. Elevated dairy fat intake in lactating women alters milk lipid and fatty acids without detectible changes in expression of genes related to lipid uptake or synthesis. Nutr Res. 35(3):221.

Yoon D, Ko E. 2016. Association study between SNPs of the genes within bovine QTLs and meat quality of Hanwoo. J Anim Sci. 94(Suppl 4):145.

Yu Y, Nie L, He ZQ, Wen JK, Jian CS, Zhang YP. 1999. Mitochondrial DNA variation in cattle of south China: origin and introgression. Anim Genet. 30(4):245–250.

Zhang H, Paijmans JL, Chang F, Wu X, Chen G, Lei C, Yang X, Wei Z, Bradley DG, Orlando L, et al. 2013. Morphological and genetic evidence for early Holocene cattle management in northeastern China. Nat Commun. 4:2755.

Zhang H, Wang S, Wang Z, Da Y, Wang N, Hu X, Zhang Y, Wang Y, Leng L, Tang Z, et al. 2012. A genome-wide scan of selective sweeps in two broiler chicken lines divergently selected for abdominal fat content. BMC Genomics 13:704.

Mei et al doi101093molbevmsx322 MBE

698Downloaded from httpsacademicoupcommbearticle-abstract3536884760963by Charles Sturt University useron 16 August 2018

Zhang Z Wang Z Yang Y Zhao J Chen Q Liao R Chen Z Zhang X XueM Yang H et al 2016 Identification of pleiotropic genes and genesets underlying growth and immunity traits a case study onMeishan pigs Animal 10(4)550ndash557

Zhao C Tian F Yu Y Luo J Hu Q Bequette BJ Baldwin VR Liu G Zan LScott UM et al 2012 Muscle transcriptomic analyses in Angus cattlewith divergent tenderness Mol Biol Rep 39(4)4185ndash4193

Zhao S Zheng P Dong S Zhan X Wu Q Guo X Hu Y He W Zhang SFan W et al 2013 Whole-genome sequencing of giant pandasprovides insights into demographic history and local adaptationNat Genet 45(1)67ndash71

Zheng B Xu Q Shen Y 2002 The relationship between climate changeand Quaternary glacial cycles on the QinghaindashTibetan Plateau re-view and speculation Quatern Int 97ndash9893ndash101

Zhou X Wang B Pan Q Zhang J Kumar S Sun X Liu Z Pan H Lin Y LiuG et al 2014 Whole-genome sequencing of the snub-nosed monkeyprovides insights into folivory and evolutionary history Nat Genet46(12)1303ndash1310

Zimin AV Delcher AL Florea L Kelley DR Schatz MC Puiu D HanrahanF Pertea G Van Tassell CP Sonstegard TS et al 2009 A whole-genome assembly of the domestic cow Bos taurus Genome Biol10(4)R42

Genetic Architecture and Selection of Chinese Cattle doi101093molbevmsx322 MBE

699Downloaded from httpsacademicoupcommbearticle-abstract3536884760963by Charles Sturt University useron 16 August 2018

  • msx322-TF1
  • msx322-TF2
Page 9: Genetic Architecture and Selection of Chinese Cattle ...researchoutput.csu.edu.au/files/23632062/21923549_Published_article_OA.pdfGenetic Architecture and Selection of Chinese Cattle

with the B. indicus reference genome (fig. 2a and supplementary fig. S12, Supplementary Material online).

MSMC Analysis

The multiple sequentially Markovian coalescent (MSMC) model (Schiffels and Durbin 2014) was used to infer changes in effective population size (Ne) and divergence times between breeds (samples marked in supplementary table S1, Supplementary Material online). MSMC is an extension of the PSMC model, which uses a hidden Markov model to scan genomes and analyze patterns of heterozygosity; long DNA segments with low heterozygosity reflect recent coalescent events. The rate of coalescent events is then used to estimate Ne at a given time. To scale the output of MSMC to years and real population sizes, we used the generation time and mutation rate mentioned earlier (see the description of the PSMC analysis). We obtained atmospheric surface air temperature (SAT) and global sea level (GSL) data for the past 3 My (Bintanja and van de Wal 2008) (supplementary figs. S13 and S14, Supplementary Material online).
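As a rough illustration of the scaling step described above, the following sketch converts MSMC's scaled times and coalescence rates into years and Ne. It assumes the conversion given in the MSMC documentation (generations = scaled time / mutation rate; Ne = 1 / (2 x mutation rate x coalescence rate)). The function name, the mutation rate, and the generation time are hypothetical placeholders for illustration, not the values used in this study.

```python
import math


def scale_msmc(rows, mu, gen_years):
    """Convert MSMC output to real units.

    rows      : iterable of (scaled_time, coalescence_rate) pairs, e.g. the
                left_time_boundary and lambda columns of an MSMC result file
    mu        : per-site, per-generation mutation rate (assumed value)
    gen_years : generation time in years (assumed value)
    Returns a list of (years_ago, effective_population_size) tuples.
    """
    scaled = []
    for t_scaled, lam in rows:
        years = t_scaled / mu * gen_years   # scaled time -> generations -> years
        ne = (1.0 / lam) / (2.0 * mu)       # inverse coalescence rate -> Ne
        scaled.append((years, ne))
    return scaled


# Hypothetical parameters and toy input, for illustration only.
MU = 1e-8        # assumed mutation rate per site per generation
GEN_YEARS = 5.0  # assumed generation time in years
scaled = scale_msmc([(1e-5, 2000.0), (1e-4, 500.0)], MU, GEN_YEARS)
for years, ne in scaled:
    print(f"{years:.0f} years ago: Ne ~ {ne:.0f}")
```

Because both outputs scale with 1/mu, any uncertainty in the assumed mutation rate shifts the inferred times and population sizes proportionally.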

Selective Sweep Analysis

Considering the sample size and close genetic background of indicine cattle (NEL, BRM, and GIR), we pooled these three indicine breeds into one group (IND) in our selection analysis. Seven cattle groups (QCC, RAN, JBC, IND, HOL, FLV, and JER) with sample sizes >10 were retained for the following analysis. To identify candidate loci for breed-specific phenotypes that are known to be under positive selection, we used the $d_i$ statistic (Akey et al. 2010) to measure the locus-specific divergence in allele frequency for each group, based on unbiased estimates of pairwise $F_{ST}$. Briefly, for each SNP, we calculated the statistic

$$d_i = \sum_{j \neq i} \frac{F_{ST}^{ij} - E[F_{ST}^{ij}]}{\mathrm{sd}[F_{ST}^{ij}]},$$

where $E[F_{ST}^{ij}]$ and $\mathrm{sd}[F_{ST}^{ij}]$ denote the expected value and SD of $F_{ST}$ between groups $i$ and $j$, calculated from all SNPs. For each group, $d_i$ was averaged over the SNPs in nonoverlapping 50-kb windows. Windows with SNP number <10 were removed. The top 1% of windows with the highest mean $d_i$ score were defined as candidate selective sweep regions. Adjacent sweeps within a distance of 50 kb were merged into one sweep. Selective sweep regions were annotated with cattle QTLdb release 29 from the Animal Quantitative Trait Loci Database (Hu et al. 2016). Candidate genes under positive selection were defined as those in which more than half of the gene interval was found in selective sweep regions. Tajima's D statistic was computed with VCFtools for each candidate gene. Gene Ontology (GO) enrichment analysis for genes in selective sweep regions was performed with a hypergeometric test using ClueGO (Bindea et al. 2009). The false discovery rate (FDR) was used to correct the P values with the Benjamini–Hochberg approach.
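The windowing procedure above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function names and the input layout (a dictionary of per-SNP pairwise FST lists) are hypothetical, the sample SD is assumed for sd[FST], and window boundaries are assumed to start at position 0.

```python
import math
from collections import defaultdict
from statistics import mean, stdev


def di_per_snp(fst, group, groups):
    """Per-SNP d_i for `group`: sum over the other groups of standardized FST.

    fst maps frozenset({a, b}) to a list of per-SNP FST values (same SNP order).
    Assumes each pair has at least two SNPs and a nonzero SD.
    """
    others = [g for g in groups if g != group]
    d = [0.0] * len(fst[frozenset((group, others[0]))])
    for other in others:
        values = fst[frozenset((group, other))]
        mu, sd = mean(values), stdev(values)   # genome-wide mean and SD for this pair
        for k, v in enumerate(values):
            d[k] += (v - mu) / sd
    return d


def window_means(positions, d, size=50_000, min_snps=10):
    """Mean d_i in nonoverlapping windows; windows with fewer than min_snps SNPs are dropped."""
    bins = defaultdict(list)
    for pos, val in zip(positions, d):
        bins[pos // size].append(val)
    return {w * size: mean(v) for w, v in bins.items() if len(v) >= min_snps}


def top_windows(win, fraction=0.01):
    """Start coordinates of the top `fraction` of windows ranked by mean d_i."""
    n_keep = max(1, math.ceil(len(win) * fraction))
    ranked = sorted(win, key=win.get, reverse=True)
    return sorted(ranked[:n_keep])


def merge_sweeps(starts, size=50_000, max_gap=50_000):
    """Merge candidate windows whose gap to the previous sweep is at most max_gap."""
    merged = []
    for s in sorted(starts):
        if merged and s - merged[-1][1] <= max_gap:
            merged[-1][1] = s + size
        else:
            merged.append([s, s + size])
    return [tuple(m) for m in merged]


# Toy example: two adjacent candidate windows merge, a distant one stays separate.
print(merge_sweeps([0, 50_000, 200_000]))  # [(0, 100000), (200000, 250000)]
```

Standardizing each pairwise FST before summing keeps a single highly diverged pair from dominating d_i, which is why the genome-wide mean and SD are computed per pair rather than pooled.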

Supplementary Material

Supplementary data are available at Molecular Biology and Evolution online.

Author Contributions

L.-S.Z., W.-J.Z., G.C., and H.-B.W. led the experiments and designed the analytical strategy. L.-S.Z., C.-G.M., H.-C.W., W.-Q.T., L.-S.G., Y.-Y.Z., Z.-L.J., Y.-P.X., and X.-Z.S. performed animal work and prepared biological samples. C.-G.M., H.-C.W., G.C., H.-B.W., C.-P.Z., A.-N.L., W.-C.Y., C.-L.J., and S.-H.W. constructed the DNA library and performed sequencing. W.-J.Z., Q.-J.L., L.-Z.W., X.-L.W., X.-M.G., and C.-Z.W. detected, annotated, and summarized variants. W.-J.Z., Q.-J.L., C.-G.M., and H.-C.W. performed selection analysis. W.-J.Z., L.-Z.W., C.-G.M., and H.-C.W. analyzed the origin of Chinese indicine cattle and population history. C.-G.M., H.-C.W., W.-J.Z., Q.-J.L., and L.-Z.W. wrote the manuscript. L.-S.Z., S.-C.Z., J.-Z.S., G.L., X.-D.F., X.Z., S.S., H.-M.Y., J.W., and R.H. revised the manuscript. All the authors reviewed and approved the final manuscript.

Acknowledgments

We thank many people not listed as authors who provided feedback, samples, and encouragement, especially Changguo Yan, Yimin Xu, Shanzhai Liu, Guanli Wang, Xiang Gao, Jianghong Wan, and Kaixing Qu. This work was supported by the National 863 Program of China (2013AA102505), the National Science-technology Support Plan Projects (2015BAD03B04), the Program of National Beef Cattle and Yak Industrial Technology System (CARS-37), the Technical Innovation Engineering Project of Shaanxi Province (2014KTZB02-02-01), the National Beef Cattle Improvement Center, the National & Local Joint Engineering Research Center for Modern Cattle Biotechnology and Application, the Beef Cattle Engineering Technology Research Center of Shaanxi Province, and the State Key Laboratory of Agricultural Genomics.

References

Akey JM, Ruhe AL, Akey DT, Wong AK, Connelly CF, Madeoy J, Nicholas TJ, Neff MW. 2010. Tracking footprints of artificial selection in the dog genome. Proc Natl Acad Sci U S A. 107(3):1160–1165.

Alexander DH, Novembre J, Lange K. 2009. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19(9):1655–1664.

Barrett JC, Fry B, Maller J, Daly MJ. 2005. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 21(2):263–265.

Bech-Sabat G, Lopez-Gatius F, Yaniz JL, Garcia-Ispierto I, Santolaria P, Serrano B, Sulon J, de Sousa NM, Beckers JF. 2008. Factors affecting plasma progesterone in the early fetal period in high producing dairy cows. Theriogenology 69(4):426–432.

Bickhart DM, Xu L, Hutchison JL, Cole JB, Null DJ, Schroeder SG, Song J, Garcia JF, Sonstegard TS, Van Tassell CP, et al. 2016. Diversity and population-genetic properties of copy number variations and multicopy genes in cattle. DNA Res. 23(3):253–262.

Bindea G, Mlecnik B, Hackl H, Charoentong P, Tosolini M, Kirilovsky A, Fridman WH, Pages F, Trajanoski Z, Galon J. 2009. ClueGO: a Cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks. Bioinformatics 25(8):1091–1093.

Bintanja R, van de Wal RSW. 2008. North American ice-sheet dynamics and the onset of 100,000-year glacial cycles. Nature 454(7206):869–872.


Bloise E, Cassali GD, Ferreira MC, Ciarmela P, Petraglia F, Reis FM. 2010. Activin-related proteins in bovine mammary gland: localization and differential expression during gestational development and differentiation. J Dairy Sci. 93(10):4592–4601.

Brandes R, Arad R, Bar-Tana J. 1995. Inducers of adipose conversion activate transcription promoted by a peroxisome proliferator's response element in 3T3-L1 cells. Biochem Pharmacol. 50(11):1949–1951.

Browning SR, Browning BL. 2007. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am J Hum Genet. 81(5):1084–1097.

Cai D, Sun Y, Tang Z, Hu S, Li W, Zhao X, Xiang H, Zhou H. 2014. The origins of Chinese domestic cattle as revealed by ancient DNA analysis. J Archaeol Sci. 41:423–434.

Cai X, Chen H, Lei C, Wang S, Xue K, Zhang B. 2007. mtDNA diversity and genetic lineages of eighteen cattle breeds from Bos taurus and Bos indicus in China. Genetica 131(2):175–183.

Cai X, Chen H, Wang S, Xue K, Lei C. 2006. Polymorphisms of two Y chromosome microsatellites in Chinese cattle. Genet Sel Evol. 38(5):525–534.

Canavez FC, Luche DD, Stothard P, Leite KR, Sousa-Canavez JM, Plastow G, Meidanis J, Souza MA, Feijao P, Moore SS, et al. 2012. Genome sequence and assembly of Bos indicus. J Hered. 103(3):342–348.

Chen S, Wang Y, Kong X, Liu D, Cheng H, Edwards RL. 2006. A possible Younger Dryas-type event during Asian monsoonal Termination 3. Sci China D Earth Sci. 49(9):982–990.

Choi JW, Liao X, Park S, Jeon HJ, Chung WH, Stothard P, Park YS, Lee JK, Lee KT, Kim SH, et al. 2013. Massively parallel sequencing of Chikso (Korean brindle cattle) to discover genome-wide SNPs and InDels. Mol Cells 36(3):203–211.

Clark DL, Boler DD, Kutzler LW, Jones KA, McKeith FK, Killefer J, Carr TR, Dilger AC. 2011. Muscle gene expression associated with increased marbling in beef cattle. Anim Biotechnol. 22(2):51–63.

Daetwyler HD, Capitan A, Pausch H, Stothard P, van Binsbergen R, Brondum RF, Liao X, Djari A, Rodriguez SC, Grohs C, et al. 2014. Whole-genome sequencing of 234 bulls facilitates mapping of monogenic and complex traits in cattle. Nat Genet. 46(8):858–865.

Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, Handsaker RE, Lunter G, Marth GT, Sherry ST, et al. 2011. The variant call format and VCFtools. Bioinformatics 27(15):2156–2158.

Decker JE, McKay SD, Rolf MM, Kim J, Molina AA, Sonstegard TS, Hanotte O, Gotherstrom A, Seabury CM, Praharani L, et al. 2014. Worldwide patterns of ancestry, divergence, and admixture in domesticated cattle. PLoS Genet. 10(3):e1004254.

Eberlein A, Takasuga A, Setoguchi K, Pfuhl R, Flisikowski K, Fries R, Klopp N, Furbass R, Weikard R, Kuhn C. 2009. Dissection of genetic factors modulating fetal growth in cattle indicates a substantial role of the non-SMC condensin I complex subunit G (NCAPG) gene. Genetics 183(3):951–964.

Fang X, Lu L, Yang S, Li J, An Z, Jiang PA, Chen X. 2002. Loess in Kunlun Mountains and its implications on desert development and Tibetan Plateau uplift in west China. Sci China D Earth Sci. 45(4):289–299.

Forti E, Aksanov O, Birk RZ. 2007. Temporal expression pattern of Bardet-Biedl syndrome genes in adipogenesis. Int J Biochem Cell B. 39(5):1055–1062.

Gan HY, Li JB, Wang HM, Gao YD, Liu WH, Li JP, Zhong JF. 2007. Relationship between the melanocortin receptor 1 (MC1R) gene and the coat color phenotype in cattle. Yi Chuan 29:195–200.

Gibbs RA, Taylor JF, Van Tassell CP, Barendse W, Eversole KA, Gill CA, Green RD, Hamernik DL, Kappes SM, Lien S, et al. 2009. Genome-wide survey of SNP variation uncovers the genetic structure of cattle breeds. Science 324(5926):528–532.

Gotoh T, Takahashi H, Nishimura T, Kuchida K, Mannen H. 2014. Meat produced by Japanese Black cattle and Wagyu. Anim Front. 4(4):46–54.

Grindflek E, Holzbauer R, Plastow G, Rothschild MF. 2002. Mapping and investigation of the porcine major insulin sensitive glucose transport (SLC2A4/GLUT4) gene as a candidate gene for meat quality and carcass traits. J Anim Breed Genet. 119(1):47–55.

Gutierrez-Gil B, Arranz JJ, Wiener P. 2015. An interpretive review of selective sweep studies in Bos taurus cattle populations: identification of unique and shared selection signals across breeds. Front Genet. 6:167.

Hansen PJ. 2004. Physiological and cellular adaptations of zebu cattle to thermal stress. Anim Reprod Sci. 82–83:349–360.

Hiendleder S, Lewalski H, Janke A. 2008. Complete mitochondrial genomes of Bos taurus and Bos indicus provide new insights into intra-species variation, taxonomy and domestication. Cytogenet Genome Res. 120(1–2):150–156.

Hu ZL, Park CA, Reecy JM. 2016. Developmental progress and current status of the Animal QTLdb. Nucleic Acids Res. 44(D1):D827–D833.

Hung CM, Shaner PJ, Zink RM, Liu WC, Chu TC, Huang WS, Li SH. 2014. Drastic population fluctuations explain the rapid extinction of the passenger pigeon. Proc Natl Acad Sci U S A. 111(29):10636–10641.

Kovacik A, Bulla J, Trakovicka A, Zitny J, Rafayova A. 2012. The effect of the porcine melanocortin-5 receptor (MC5R) gene associated with feed intake, carcass and physico-chemical characteristics. J Microbiol Biotechnol Food Sci. 1:498–506.

Labrecque R, Vigneault C, Blondin P, Sirard MA. 2013. Gene expression analysis of bovine oocytes with high developmental competence obtained from FSH-stimulated animals. Mol Reprod Dev. 80(6):428–440.

Lai SJ, Liu YP, Liu YX, Li XW, Yao YG. 2006. Genetic diversity and origin of Chinese cattle revealed by mtDNA D-loop sequence variation. Mol Phylogenet Evol. 38(1):146–154.

Lee KT, Chung WH, Lee SY, Choi JW, Kim J, Lim D, Lee S, Jang GW, Kim B, Choy YH, et al. 2013. Whole-genome resequencing of Hanwoo (Korean cattle) and insight into regions of homozygosity. BMC Genomics 14:519.

Lee SS, Yang BS, Yang YH, Kang SY, Ko SB, Jung JK, Oh WY, Oh SJ, Kim KI. 2002. Analysis of melanocortin receptor 1 (MC1R) genotype in Korean brindle cattle and Korean cattle with dark muzzle. J Anim Sci Technol. 44(1):23–30.

Lei C, Chen H, Hu S. 2000. Studies on Y chromosome polymorphism and the origin and classification of Chinese yellow cattle. Acta Agric Boreali-Occidentalis Sin. 9:43–47.

Lei CZ, Chen H, Zhang HC, Cai X, Liu RY, Luo LY, Wang CF, Zhang W, Ge QL, Zhang RF, et al. 2006. Origin and phylogeographical structure of Chinese cattle. Anim Genet. 37(6):579–582.

Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25(14):1754–1760.

Li H, Durbin R. 2011. Inference of human population history from individual whole-genome sequences. Nature 475(7357):493–496.

Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25(16):2078–2079.

Lin X, Luo J, Zhang L, Zhu J. 2013. MicroRNAs synergistically regulate milk fat synthesis in mammary gland epithelial cells of dairy goats. Gene Expr. 16(1):1–13.

Loftus RT, MacHugh DE, Bradley DG, Sharp PM, Cunningham P. 1994. Evidence for two independent domestications of cattle. Proc Natl Acad Sci U S A. 91(7):2757–2761.

McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, et al. 2010. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20(9):1297–1303.


Medugorac I, Graf A, Grohs C, Rothammer S, Zagdsuren Y, Gladyr E, Zinovieva N, Barbieri J, Seichter D, Russ I, et al. 2017. Whole-genome analysis of introgressive hybridization and characterization of the bovine legacy of Mongolian yaks. Nat Genet. 49(3):470–475.

Mei C, Wang H, Zhu W, Wang H, Cheng G, Qu K, Guang X, Li A, Zhao C, Yang W, et al. 2016. Whole-genome sequencing of the endangered bovine species Gayal (Bos frontalis) provides new insights into its genetic features. Sci Rep. 6:19787.

Miller W, Schuster SC, Welch AJ, Ratan A, Bedoya-Reina OC, Zhao F, Kim HL, Burhans RC, Drautz DI, Wittekindt NE, et al. 2012. Polar and brown bear genomes reveal ancient admixture and demographic footprints of past climate change. Proc Natl Acad Sci U S A. 109(36):E2382–E2390.

Nadachowska-Brzyska K, Burri R, Smeds L, Ellegren H. 2016. PSMC analysis of effective population sizes in molecular ecology and its application to black-and-white Ficedula flycatchers. Mol Ecol. 25(5):1058–1072.

Orlando L, Ginolhac A, Zhang G, Froese D, Albrechtsen A, Stiller M, Schubert M, Cappellini E, Petersen B, Moltke I, et al. 2013. Recalibrating Equus evolution using the genome sequence of an early Middle Pleistocene horse. Nature 499(7456):74–78.

Ouali A, Herrera-Mendez CH, Coulis G, Becila S, Boudjellal A, Aubry L, Sentandreu MA. 2006. Revisiting the conversion of muscle into meat and the underlying mechanisms. Meat Sci. 74(1):44–58.

Patterson N, Price AL, Reich D. 2006. Population structure and eigenanalysis. PLoS Genet. 2(12):e190.

Porto-Neto LR, Sonstegard TS, Liu GE, Bickhart DM, Da SM, Machado MA, Utsunomiya YT, Garcia JF, Gondro C, Van Tassell CP. 2013. Genomic divergence of zebu and taurine cattle identified through high-density SNP genotyping. BMC Genomics 14(1):876.

Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, Maller J, Sklar P, de Bakker PI, Daly MJ, et al. 2007. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 81(3):559–575.

Qiu H, Ju ZY, Chang ZJ. 1993. A survey of cattle production in China. More attention to animal genetic resources. Food and Agriculture Organization of the United Nations.

Qiu Q, Wang L, Wang K, Yang Y, Ma T, Wang Z, Zhang X, Ni Z, Hou F, Long R, et al. 2015. Yak whole-genome resequencing reveals domestication signatures and prehistoric population expansions. Nat Commun. 6:10283.

Qiu Q, Zhang G, Ma T, Qian W, Wang J, Ye Z, Cao C, Hu Q, Kim J, Larkin DM, et al. 2012. The yak genome and adaptation to life at high altitude. Nat Genet. 44(8):946–949.

Randhawa IAS, Khatkar MS, Thomson PC, Raadsma HW, Barendse W. 2016. A meta-assembly of selection signatures in cattle. PLoS One 11(4):e153013.

Rincon G, Islas-Trejo A, Casellas J, Ronin Y, Soller M, Lipkin E, Medrano JF. 2009. Fine mapping and association analysis of a quantitative trait locus for milk production traits on Bos taurus autosome 4. J Dairy Sci. 92(2):758–764.

Ruetz TJ, Lin AE, Guttman JA. 2012. Enterohaemorrhagic Escherichia coli requires the spectrin cytoskeleton for efficient attachment and pedestal formation on host cells. Microb Pathog. 52(3):149–156.

Sartori R, Bastos MR, Baruselli PS, Gimenes LU, Ereno RL, Barros CM. 2010. Physiological differences and implications to reproductive management of Bos taurus and Bos indicus cattle in a tropical environment. Reprod Domest Rumin VII. 7(1):357–375.

Sattler K, Levkau B. 2009. Sphingosine-1-phosphate as a mediator of high-density lipoprotein effects in cardiovascular protection. Cardiovasc Res. 82(2):201–211.

Scherf BD, Pilling D. 2015. The second report on the state of the world's animal genetic resources for food and agriculture. Food and Agriculture Organization of the United Nations.

Schiffels S, Durbin R. 2014. Inferring human population size and separation history from multiple genome sequences. Nat Genet. 46(8):919–925.

Setoguchi K, Watanabe T, Weikard R, Albrecht E, Kuhn C, Kinoshita A, Sugimoto Y, Takasuga A. 2011. The SNP c.1326T>G in the non-SMC condensin I complex subunit G (NCAPG) gene encoding a p.Ile442Met variant is associated with an increase in body frame size at puberty in cattle. Anim Genet. 42(6):650–655.

Sherratt A. 1983. The secondary exploitation of animals in the Old World. World Archaeol. 15(1):90–104.

Smith TP, Grosse WM, Freking BA, Roberts AJ, Stone RT, Casas E, Wray JE, White J, Cho J, Fahrenkrug SC, et al. 2001. Sequence evaluation of four pooled-tissue normalized bovine cDNA libraries and construction of a gene index for cattle. Genome Res. 11(4):626–630.

Sorbolini S, Marras G, Gaspa G, Dimauro C, Cellesi M, Valentini A, Macciotta NP. 2015. Detection of selection signatures in Piemontese and Marchigiana cattle, two breeds with similar production aptitudes but different selection histories. Genet Sel Evol. 47:52.

Stothard P, Choi JW, Basu U, Sumner-Thomson JM, Meng Y, Liao X, Moore SS. 2011. Whole genome resequencing of black Angus and Holstein cattle for SNP and CNV discovery. BMC Genomics 12:559.

Sun YB, An ZS. 2005. Late Pliocene-Pleistocene changes in mass accumulation rates of eolian deposits on the central Chinese Loess Plateau. J Geophys Res-Atmos. 110(D23):1193–1194.

Svizzero S, Tisdell C. 2016. Input shortages and the lack of sustainability of bronze production by the Unetice. In: Working Papers on Economics, Ecology and the Environment, No. 202. Queensland (Australia): University of Queensland.

Switonski M, Mankowska M, Salamon S. 2013. Family of melanocortin receptor (MCR) genes in mammals-mutations, polymorphisms and phenotypic effects. J Appl Genet. 54(4):461–472.

Van Vuure T. 2002. History, morphology and ecology of the aurochs (Bos primigenius). Available from: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.534.6285

Wang K, Li M, Hakonarson H. 2010. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38(16):e164.

Wang M, Ding Y. 1996. The importance of work animals in rural China. World Anim Rev. 86:65–67.

Wedholm A, Larsen LB, Lindmark-Mansson H, Karlsson AH, Andren A. 2006. Effect of protein composition on the cheese-making properties of milk from individual dairy cows. J Dairy Sci. 89(9):3296–3305.

Weir BS, Cockerham CC. 1984. Estimating F-statistics for the analysis of population structure. Evolution 38(6):1358–1370.

Xu L, Bickhart DM, Cole JB, Schroeder SG, Song J, Tassell CP, Sonstegard TS, Liu GE. 2015. Genomic signatures reveal new evidences for selection of important traits in domestic cattle. Mol Biol Evol. 32(3):711–725.

Xu Y, Yu W, Xiong Y, Xie H, Ren Z, Xu D, Lei M, Zuo B, Feng X. 2011. Molecular characterization and expression patterns of serine/arginine-rich specific kinase 3 (SPRK3) in porcine skeletal muscle. Mol Biol Rep. 38(5):2903–2909.

Yahvah KM, Brooker SL, Williams JE, Settles M, Mcguire MA, Mcguire MK. 2015. Elevated dairy fat intake in lactating women alters milk lipid and fatty acids without detectible changes in expression of genes related to lipid uptake or synthesis. Nutr Res. 35(3):221.

Yoon D, Ko E. 2016. Association study between SNPs of the genes within bovine QTLs and meat quality of Hanwoo. J Anim Sci. 94(Suppl 4):145.

Yu Y, Nie L, He ZQ, Wen JK, Jian CS, Zhang YP. 1999. Mitochondrial DNA variation in cattle of south China: origin and introgression. Anim Genet. 30(4):245–250.

Zhang H, Paijmans JL, Chang F, Wu X, Chen G, Lei C, Yang X, Wei Z, Bradley DG, Orlando L, et al. 2013. Morphological and genetic evidence for early Holocene cattle management in northeastern China. Nat Commun. 4:2755.

Zhang H, Wang S, Wang Z, Da Y, Wang N, Hu X, Zhang Y, Wang Y, Leng L, Tang Z, et al. 2012. A genome-wide scan of selective sweeps in two broiler chicken lines divergently selected for abdominal fat content. BMC Genomics 13:704.


Zhang Z, Wang Z, Yang Y, Zhao J, Chen Q, Liao R, Chen Z, Zhang X, Xue M, Yang H, et al. 2016. Identification of pleiotropic genes and gene sets underlying growth and immunity traits: a case study on Meishan pigs. Animal 10(4):550–557.

Zhao C, Tian F, Yu Y, Luo J, Hu Q, Bequette BJ, Baldwin VR, Liu G, Zan L, Scott UM, et al. 2012. Muscle transcriptomic analyses in Angus cattle with divergent tenderness. Mol Biol Rep. 39(4):4185–4193.

Zhao S, Zheng P, Dong S, Zhan X, Wu Q, Guo X, Hu Y, He W, Zhang S, Fan W, et al. 2013. Whole-genome sequencing of giant pandas provides insights into demographic history and local adaptation. Nat Genet. 45(1):67–71.

Zheng B, Xu Q, Shen Y. 2002. The relationship between climate change and Quaternary glacial cycles on the Qinghai–Tibetan Plateau: review and speculation. Quatern Int. 97–98:93–101.

Zhou X, Wang B, Pan Q, Zhang J, Kumar S, Sun X, Liu Z, Pan H, Lin Y, Liu G, et al. 2014. Whole-genome sequencing of the snub-nosed monkey provides insights into folivory and evolutionary history. Nat Genet. 46(12):1303–1310.

Zimin AV, Delcher AL, Florea L, Kelley DR, Schatz MC, Puiu D, Hanrahan F, Pertea G, Van Tassell CP, Sonstegard TS, et al. 2009. A whole-genome assembly of the domestic cow, Bos taurus. Genome Biol. 10(4):R42.

Genetic Architecture and Selection of Chinese Cattle doi101093molbevmsx322 MBE

699Downloaded from httpsacademicoupcommbearticle-abstract3536884760963by Charles Sturt University useron 16 August 2018

  • msx322-TF1
  • msx322-TF2
Page 10: Genetic Architecture and Selection of Chinese Cattle ...researchoutput.csu.edu.au/files/23632062/21923549_Published_article_OA.pdfGenetic Architecture and Selection of Chinese Cattle

Bloise E Cassali GD Ferreira MC Ciarmela P Petraglia F Reis FM 2010Activin-related proteins in bovine mammary gland localization anddifferential expression during gestational development and differen-tiation J Dairy Sci 93(10)4592ndash4601

Brandes R Arad R Bar-Tana J 1995 Inducers of adipose conversionactivate transcription promoted by a peroxisome proliferatorrsquos re-sponse element in 3T3-L1 cells Biochem Pharmacol50(11)1949ndash1951

Browning SR Browning BL 2007 Rapid and accurate haplotype phasingand missing-data inference for whole-genome association studies byuse of localized haplotype clustering Am J Hum Genet81(5)1084ndash1097

Cai D Sun Y Tang Z Hu S Li W Zhao X Xiang H Zhou H 2014 Theorigins of Chinese domestic cattle as revealed by ancient DNA anal-ysis J Archaeol Sci 41423ndash434

Cai X Chen H Lei C Wang S Xue K Zhang B 2007 mtDNA diversity andgenetic lineages of eighteen cattle breeds from Bos taurus and Bosindicus in China Genetica 131(2)175ndash183

Cai X Chen H Wang S Xue K Lei C 2006 Polymorphisms of two Ychromosome microsatellites in Chinese cattle Genet Sel Evol38(5)525ndash534

Canavez FC Luche DD Stothard P Leite KR Sousa-Canavez JMPlastow G Meidanis J Souza MA Feijao P Moore SS et al2012 Genome sequence and assembly of Bos indicus J Hered103(3)342ndash348

Chen S Wang Y Kong X Liu D Cheng H Edwards RL 2006 A possibleYounger Dryas-type event during Asian monsoonal Termination 3Sci China D Earth Sci 49(9)982ndash990

Choi JW Liao X Park S Jeon HJ Chung WH Stothard P Park YS Lee JKLee KT Kim SH et al 2013 Massively parallel sequencing of Chikso(Korean brindle cattle) to discover genome-wide SNPs and InDelsMol Cells 36(3)203ndash211

Clark DL Boler DD Kutzler LW Jones KA McKeith FK Killefer J Carr TRDilger AC 2011 Muscle gene expression associated with increasedmarbling in beef cattle Anim Biotechnol 22(2)51ndash63

Daetwyler HD Capitan A Pausch H Stothard P van Binsbergen RBrondum RF Liao X Djari A Rodriguez SC Grohs C et al 2014Whole-genome sequencing of 234 bulls facilitates mapping ofmonogenic and complex traits in cattle Nat Genet46(8)858ndash865

Danecek P Auton A Abecasis G Albers CA Banks E DePristo MAHandsaker RE Lunter G Marth GT Sherry ST et al 2011 Thevariant call format and VCFtools Bioinformatics 27(15)2156ndash2158

Decker JE McKay SD Rolf MM Kim J Molina AA Sonstegard TSHanotte O Gotherstrom A Seabury CM Praharani L et al 2014Worldwide patterns of ancestry divergence and admixture in do-mesticated cattle PLoS Genet 10(3)e1004254

Eberlein A Takasuga A Setoguchi K Pfuhl R Flisikowski K Fries R KloppN Furbass R Weikard R Kuhn C 2009 Dissection of genetic factorsmodulating fetal growth in cattle indicates a substantial role of thenon-SMC condensin I complex subunit G (NCAPG) gene Genetics183(3)951ndash964

Fang X Lu L Yang S Li J An Z Jiang PA Chen X 2002 Loess inKunlun Mountains and its implications on desert developmentand Tibetan Plateau uplift in west China Sci China D Earth Sci45(4)289ndash299

Forti E Aksanov O Birk RZ 2007 Temporal expression pattern ofBardet-Biedl syndrome genes in adipogenesis Int J Biochem Cell B39(5)1055ndash1062

Gan HY Li JB Wang HM Gao YD Liu WH Li JP Zhong JF 2007Relationship between the melanocortin receptor 1 (MC1R)gene and the coat color phenotype in cattle Yi Chuan29195ndash200

Gibbs RA Taylor JF Van Tassell CP Barendse W Eversole KA Gill CAGreen RD Hamernik DL Kappes SM Lien S et al 2009 Genome-wide survey of SNP variation uncovers the genetic structure of cattlebreeds Science 324(5926)528ndash532

Gotoh T Takahashi H Nishimura T Kuchida K Mannen H 2014 Meatproduced by Japanese Black cattle and Wagyu Anim Front4(4)46ndash54

Grindflek E Holzbauer R Plastow G Rothschild MF 2002 Mapping andinvestigation of the porcine major insulin sensitive glucose transport(SLC2A4GLUT4) gene as a candidate gene for meat quality andcarcass traits J Anim Breed Genet 119(1)47ndash55

Gutierrez-Gil B Arranz JJ Wiener P 2015 An interpretive review ofselective sweep studies in Bos taurus cattle populations identifica-tion of unique and shared selection signals across breeds FrontGenet 6167

Hansen PJ 2004 Physiological and cellular adaptations of zebu cattle tothermal stress Anim Reprod Sci 82ndash83349ndash360

Hiendleder S Lewalski H Janke A 2008 Complete mitochondrialgenomes of Bos taurus and Bos indicus provide new insights intointra-species variation taxonomy and domestication CytogenetGenome Res 120(1ndash2)150ndash156

Hu ZL Park CA Reecy JM 2016 Developmental progress and cur-rent status of the Animal QTLdb Nucleic Acids Res44(D1)D827ndashD833

Hung CM Shaner PJ Zink RM Liu WC Chu TC Huang WS Li SH2014 Drastic population fluctuations explain the rapid extinc-tion of the passenger pigeon Proc Natl Acad Sci U S A111(29)10636ndash10641

Kovacik A Bulla J Trakovicka A ZItny J Rafayova A 2012 The effect ofthe porcine melanocortin-5 receptor (MC5R) gene associated withfeed intake carcass and physico-chemical characteristics J MicrobiolBiotechnol Food Sci 1498ndash506

Labrecque R Vigneault C Blondin P Sirard MA 2013 Gene expressionanalysis of bovine oocytes with high developmental competenceobtained from FSH-stimulated animals Mol Reprod Dev80(6)428ndash440

Lai SJ Liu YP Liu YX Li XW Yao YG 2006 Genetic diversity and origin ofChinese cattle revealed by mtDNA D-loop sequence variation MolPhylogenet Evol 38(1)146ndash154

Lee KT Chung WH Lee SY Choi JW Kim J Lim D Lee S Jang GW Kim BChoy YH et al 2013 Whole-genome resequencing of Hanwoo(Korean cattle) and insight into regions of homozygosity BMCGenomics 14519

Lee SS Yang BS Yang YH Kang SY Ko SB Jung JK Oh WY Oh SJ Kim KI2002 Analysis of melanocortin receptor 1 (MC1R) genotype inKorean brindle cattle and Korean cattle with dark muzzle J AnimSci Technol 44(1)23ndash30

Lei C Chen H Hu S 2000 Studies on Y chromosome polymorphism andthe origin and classification of Chinese yellow cattle Acta AgricBoreali-Occidentalis Sin 943ndash47

Lei CZ Chen H Zhang HC Cai X Liu RY Luo LY Wang CF Zhang W GeQL Zhang RF et al 2006 Origin and phylogeographical structure ofChinese cattle Anim Genet 37(6)579ndash582

Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25(14):1754–1760.

Li H, Durbin R. 2011. Inference of human population history from individual whole-genome sequences. Nature 475(7357):493–496.

Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25(16):2078–2079.

Lin X, Luo J, Zhang L, Zhu J. 2013. MicroRNAs synergistically regulate milk fat synthesis in mammary gland epithelial cells of dairy goats. Gene Expr. 16(1):1–13.

Loftus RT, MacHugh DE, Bradley DG, Sharp PM, Cunningham P. 1994. Evidence for two independent domestications of cattle. Proc Natl Acad Sci U S A. 91(7):2757–2761.

McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, et al. 2010. The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20(9):1297–1303.


Medugorac I, Graf A, Grohs C, Rothammer S, Zagdsuren Y, Gladyr E, Zinovieva N, Barbieri J, Seichter D, Russ I, et al. 2017. Whole-genome analysis of introgressive hybridization and characterization of the bovine legacy of Mongolian yaks. Nat Genet. 49(3):470–475.

Mei C, Wang H, Zhu W, Wang H, Cheng G, Qu K, Guang X, Li A, Zhao C, Yang W, et al. 2016. Whole-genome sequencing of the endangered bovine species Gayal (Bos frontalis) provides new insights into its genetic features. Sci Rep. 6:19787.

Miller W, Schuster SC, Welch AJ, Ratan A, Bedoya-Reina OC, Zhao F, Kim HL, Burhans RC, Drautz DI, Wittekindt NE, et al. 2012. Polar and brown bear genomes reveal ancient admixture and demographic footprints of past climate change. Proc Natl Acad Sci U S A. 109(36):E2382–E2390.

Nadachowska-Brzyska K, Burri R, Smeds L, Ellegren H. 2016. PSMC analysis of effective population sizes in molecular ecology and its application to black-and-white Ficedula flycatchers. Mol Ecol. 25(5):1058–1072.

Orlando L, Ginolhac A, Zhang G, Froese D, Albrechtsen A, Stiller M, Schubert M, Cappellini E, Petersen B, Moltke I, et al. 2013. Recalibrating Equus evolution using the genome sequence of an early Middle Pleistocene horse. Nature 499(7456):74–78.

Ouali A, Herrera-Mendez CH, Coulis G, Becila S, Boudjellal A, Aubry L, Sentandreu MA. 2006. Revisiting the conversion of muscle into meat and the underlying mechanisms. Meat Sci. 74(1):44–58.

Patterson N, Price AL, Reich D. 2006. Population structure and eigenanalysis. PLoS Genet. 2(12):e190.

Porto-Neto LR, Sonstegard TS, Liu GE, Bickhart DM, Da SM, Machado MA, Utsunomiya YT, Garcia JF, Gondro C, Van Tassell CP. 2013. Genomic divergence of zebu and taurine cattle identified through high-density SNP genotyping. BMC Genomics 14(1):876.

Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, Maller J, Sklar P, de Bakker PI, Daly MJ, et al. 2007. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 81(3):559–575.

Qiu H, Ju ZY, Chang ZJ. 1993. A survey of cattle production in China. More attention to animal genetic resources. Food and Agriculture Organization of the United Nations.

Qiu Q, Wang L, Wang K, Yang Y, Ma T, Wang Z, Zhang X, Ni Z, Hou F, Long R, et al. 2015. Yak whole-genome resequencing reveals domestication signatures and prehistoric population expansions. Nat Commun. 6:10283.

Qiu Q, Zhang G, Ma T, Qian W, Wang J, Ye Z, Cao C, Hu Q, Kim J, Larkin DM, et al. 2012. The yak genome and adaptation to life at high altitude. Nat Genet. 44(8):946–949.

Randhawa IAS, Khatkar MS, Thomson PC, Raadsma HW, Barendse W. 2016. A meta-assembly of selection signatures in cattle. PLoS One 11(4):e153013.

Rincon G, Islas-Trejo A, Casellas J, Ronin Y, Soller M, Lipkin E, Medrano JF. 2009. Fine mapping and association analysis of a quantitative trait locus for milk production traits on Bos taurus autosome 4. J Dairy Sci. 92(2):758–764.

Ruetz TJ, Lin AE, Guttman JA. 2012. Enterohaemorrhagic Escherichia coli requires the spectrin cytoskeleton for efficient attachment and pedestal formation on host cells. Microb Pathog. 52(3):149–156.

Sartori R, Bastos MR, Baruselli PS, Gimenes LU, Ereno RL, Barros CM. 2010. Physiological differences and implications to reproductive management of Bos taurus and Bos indicus cattle in a tropical environment. Reprod Domest Rumin VII. 7(1):357–375.

Sattler K, Levkau B. 2009. Sphingosine-1-phosphate as a mediator of high-density lipoprotein effects in cardiovascular protection. Cardiovasc Res. 82(2):201–211.

Scherf BD, Pilling D. 2015. The second report on the state of the world's animal genetic resources for food and agriculture. Food and Agriculture Organization of the United Nations.

Schiffels S, Durbin R. 2014. Inferring human population size and separation history from multiple genome sequences. Nat Genet. 46(8):919–925.

Setoguchi K, Watanabe T, Weikard R, Albrecht E, Kuhn C, Kinoshita A, Sugimoto Y, Takasuga A. 2011. The SNP c.1326T>G in the non-SMC condensin I complex subunit G (NCAPG) gene encoding a p.Ile442Met variant is associated with an increase in body frame size at puberty in cattle. Anim Genet. 42(6):650–655.

Sherratt A. 1983. The secondary exploitation of animals in the Old World. World Archaeol. 15(1):90–104.

Smith TP, Grosse WM, Freking BA, Roberts AJ, Stone RT, Casas E, Wray JE, White J, Cho J, Fahrenkrug SC, et al. 2001. Sequence evaluation of four pooled-tissue normalized bovine cDNA libraries and construction of a gene index for cattle. Genome Res. 11(4):626–630.

Sorbolini S, Marras G, Gaspa G, Dimauro C, Cellesi M, Valentini A, Macciotta NP. 2015. Detection of selection signatures in Piemontese and Marchigiana cattle, two breeds with similar production aptitudes but different selection histories. Genet Sel Evol. 47:52.

Stothard P, Choi JW, Basu U, Sumner-Thomson JM, Meng Y, Liao X, Moore SS. 2011. Whole genome resequencing of black Angus and Holstein cattle for SNP and CNV discovery. BMC Genomics 12:559.

Sun YB, An ZS. 2005. Late Pliocene-Pleistocene changes in mass accumulation rates of eolian deposits on the central Chinese Loess Plateau. J Geophys Res-Atmos. 110(D23):1193–1194.

Svizzero S, Tisdell C. 2016. Input shortages and the lack of sustainability of bronze production by the Unetice. In: Working Papers on Economics, Ecology and the Environment, No. 202. Queensland (Australia): University of Queensland.

Switonski M, Mankowska M, Salamon S. 2013. Family of melanocortin receptor (MCR) genes in mammals-mutations, polymorphisms and phenotypic effects. J Appl Genet. 54(4):461–472.

Van Vuure T. 2002. History, morphology and ecology of the aurochs (Bos primigenius). Available from: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.534.6285

Wang K, Li M, Hakonarson H. 2010. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38(16):e164.

Wang M, Ding Y. 1996. The importance of work animals in rural China. World Anim Rev. 86:65–67.

Wedholm A, Larsen LB, Lindmark-Mansson H, Karlsson AH, Andren A. 2006. Effect of protein composition on the cheese-making properties of milk from individual dairy cows. J Dairy Sci. 89(9):3296–3305.

Weir BS, Cockerham CC. 1984. Estimating F-statistics for the analysis of population structure. Evolution 38(6):1358–1370.

Xu L, Bickhart DM, Cole JB, Schroeder SG, Song J, Tassell CP, Sonstegard TS, Liu GE. 2015. Genomic signatures reveal new evidences for selection of important traits in domestic cattle. Mol Biol Evol. 32(3):711–725.

Xu Y, Yu W, Xiong Y, Xie H, Ren Z, Xu D, Lei M, Zuo B, Feng X. 2011. Molecular characterization and expression patterns of serine/arginine-rich specific kinase 3 (SPRK3) in porcine skeletal muscle. Mol Biol Rep. 38(5):2903–2909.

Yahvah KM, Brooker SL, Williams JE, Settles M, Mcguire MA, Mcguire MK. 2015. Elevated dairy fat intake in lactating women alters milk lipid and fatty acids without detectible changes in expression of genes related to lipid uptake or synthesis. Nutr Res. 35(3):221.

Yoon D, Ko E. 2016. Association study between SNPs of the genes within bovine QTLs and meat quality of Hanwoo. J Anim Sci. 94(Suppl 4):145.

Yu Y, Nie L, He ZQ, Wen JK, Jian CS, Zhang YP. 1999. Mitochondrial DNA variation in cattle of south China: origin and introgression. Anim Genet. 30(4):245–250.

Zhang H, Paijmans JL, Chang F, Wu X, Chen G, Lei C, Yang X, Wei Z, Bradley DG, Orlando L, et al. 2013. Morphological and genetic evidence for early Holocene cattle management in northeastern China. Nat Commun. 4:2755.

Zhang H, Wang S, Wang Z, Da Y, Wang N, Hu X, Zhang Y, Wang Y, Leng L, Tang Z, et al. 2012. A genome-wide scan of selective sweeps in two broiler chicken lines divergently selected for abdominal fat content. BMC Genomics 13:704.


Zhang Z, Wang Z, Yang Y, Zhao J, Chen Q, Liao R, Chen Z, Zhang X, Xue M, Yang H, et al. 2016. Identification of pleiotropic genes and gene sets underlying growth and immunity traits: a case study on Meishan pigs. Animal 10(4):550–557.

Zhao C, Tian F, Yu Y, Luo J, Hu Q, Bequette BJ, Baldwin VR, Liu G, Zan L, Scott UM, et al. 2012. Muscle transcriptomic analyses in Angus cattle with divergent tenderness. Mol Biol Rep. 39(4):4185–4193.

Zhao S, Zheng P, Dong S, Zhan X, Wu Q, Guo X, Hu Y, He W, Zhang S, Fan W, et al. 2013. Whole-genome sequencing of giant pandas provides insights into demographic history and local adaptation. Nat Genet. 45(1):67–71.

Zheng B, Xu Q, Shen Y. 2002. The relationship between climate change and Quaternary glacial cycles on the Qinghai–Tibetan Plateau: review and speculation. Quatern Int. 97–98:93–101.

Zhou X, Wang B, Pan Q, Zhang J, Kumar S, Sun X, Liu Z, Pan H, Lin Y, Liu G, et al. 2014. Whole-genome sequencing of the snub-nosed monkey provides insights into folivory and evolutionary history. Nat Genet. 46(12):1303–1310.

Zimin AV, Delcher AL, Florea L, Kelley DR, Schatz MC, Puiu D, Hanrahan F, Pertea G, Van Tassell CP, Sonstegard TS, et al. 2009. A whole-genome assembly of the domestic cow, Bos taurus. Genome Biol. 10(4):R42.


Page 12: Genetic Architecture and Selection of Chinese Cattle ...researchoutput.csu.edu.au/files/23632062/21923549_Published_article_OA.pdfGenetic Architecture and Selection of Chinese Cattle

Zhang Z Wang Z Yang Y Zhao J Chen Q Liao R Chen Z Zhang X XueM Yang H et al 2016 Identification of pleiotropic genes and genesets underlying growth and immunity traits a case study onMeishan pigs Animal 10(4)550ndash557

Zhao C Tian F Yu Y Luo J Hu Q Bequette BJ Baldwin VR Liu G Zan LScott UM et al 2012 Muscle transcriptomic analyses in Angus cattlewith divergent tenderness Mol Biol Rep 39(4)4185ndash4193

Zhao S Zheng P Dong S Zhan X Wu Q Guo X Hu Y He W Zhang SFan W et al 2013 Whole-genome sequencing of giant pandasprovides insights into demographic history and local adaptationNat Genet 45(1)67ndash71

Zheng B Xu Q Shen Y 2002 The relationship between climate changeand Quaternary glacial cycles on the QinghaindashTibetan Plateau re-view and speculation Quatern Int 97ndash9893ndash101

Zhou X Wang B Pan Q Zhang J Kumar S Sun X Liu Z Pan H Lin Y LiuG et al 2014 Whole-genome sequencing of the snub-nosed monkeyprovides insights into folivory and evolutionary history Nat Genet46(12)1303ndash1310

Zimin AV Delcher AL Florea L Kelley DR Schatz MC Puiu D HanrahanF Pertea G Van Tassell CP Sonstegard TS et al 2009 A whole-genome assembly of the domestic cow Bos taurus Genome Biol10(4)R42

Genetic Architecture and Selection of Chinese Cattle doi101093molbevmsx322 MBE

699Downloaded from httpsacademicoupcommbearticle-abstract3536884760963by Charles Sturt University useron 16 August 2018

  • msx322-TF1
  • msx322-TF2