Selection of a marker gene to construct and the application ...We concluded that DNA metabarcoding...

21
Submitted 12 April 2016 Accepted 19 July 2016 Published 17 August 2016 Corresponding author Lei Cao, [email protected] Academic editor Michael Somers Additional Information and Declarations can be found on page 14 DOI 10.7717/peerj.2345 Copyright 2016 Yang et al. Distributed under Creative Commons CC-BY 4.0 OPEN ACCESS Selection of a marker gene to construct a reference library for wetland plants, and the application of metabarcoding to analyze the diet of wintering herbivorous waterbirds Yuzhan Yang 1 , Aibin Zhan 2 , Lei Cao 2 , Fanjuan Meng 2 and Wenbin Xu 3 1 School of Life Sciences, University of Science and Technology of China, Hefei, Anhui, China 2 Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing, China 3 Anhui Shengjin Lake National Nature Reserve Administration, Chizhou, Anhui, China ABSTRACT Food availability and diet selection are important factors influencing the abundance and distribution of wild waterbirds. In order to better understand changes in waterbird population, it is essential to figure out what they feed on. However, analyzing their diet could be difficult and inefficient using traditional methods such as microhistologic observation. Here, we addressed this gap of knowledge by investigating the diet of greater white-fronted goose Anser albifrons and bean goose Anser fabalis, which are obligate herbivores wintering in China, mostly in the Middle and Lower Yangtze River floodplain. First, we selected a suitable and high-resolution marker gene for wetland plants that these geese would consume during the wintering period. Eight candidate genes were included: rbc L, rpoC1, rpoB, mat K, trnH-psbA, trnL (UAA), atpF-atpH, and psbK-psbI. The selection was performed via analysis of representative sequences from NCBI and comparison of amplification efficiency and resolution power of plant samples collected from the wintering area. The trnL gene was chosen at last with c/h primers, and a local plant reference library was constructed with this gene. Then, utilizing DNA metabarcoding, we discovered 15 food items in total from the feces of these birds. Of the 15 unique dietary sequences, 10 could be identified at specie level. As for greater white-fronted goose, 73% of sequences belonged to Poaceae spp., and 26% belonged to Carex spp. In contrast, almost all sequences of bean goose belonged to Carex spp. (99%). Using the same samples, microhistology provided consistent food composition with metabarcoding results for greater white-fronted goose, while 13% of Poaceae was recovered for bean goose. In addition, two other taxa were discovered only through microhistologic analysis. Although most of the identified taxa matched relatively well between the two methods, DNA metabarcoding gave taxonomically more detailed information. Discrepancies were likely due to biased PCR amplification in metabarcoding, low discriminating power of current marker genes for monocots, and biases in microhistologic analysis. The diet differences between two geese species might indicate deeper ecological significance beyond the scope of this study. We concluded that DNA metabarcoding provides new perspectives for studies of herbivorous waterbird diets and inter-specific interactions, as well as new How to cite this article Yang et al. (2016), Selection of a marker gene to construct a reference library for wetland plants, and the applica- tion of metabarcoding to analyze the diet of wintering herbivorous waterbirds. PeerJ 4:e2345; DOI 10.7717/peerj.2345

Transcript of Selection of a marker gene to construct and the application ...We concluded that DNA metabarcoding...

Page 1: Selection of a marker gene to construct and the application ...We concluded that DNA metabarcoding provides new perspectives for studies of herbivorous waterbird diets and inter-specific

Submitted 12 April 2016Accepted 19 July 2016Published 17 August 2016

Corresponding authorLei Cao caoleiustceducn

Academic editorMichael Somers

Additional Information andDeclarations can be found onpage 14

DOI 107717peerj2345

Copyright2016 Yang et al

Distributed underCreative Commons CC-BY 40

OPEN ACCESS

Selection of a marker gene to constructa reference library for wetland plantsand the application of metabarcoding toanalyze the diet of wintering herbivorouswaterbirdsYuzhan Yang1 Aibin Zhan2 Lei Cao2 Fanjuan Meng2 and Wenbin Xu3

1 School of Life Sciences University of Science and Technology of China Hefei Anhui China2Research Center for Eco-Environmental Sciences Chinese Academy of Sciences Beijing China3Anhui Shengjin Lake National Nature Reserve Administration Chizhou Anhui China

ABSTRACTFood availability and diet selection are important factors influencing the abundanceand distribution of wild waterbirds In order to better understand changes in waterbirdpopulation it is essential to figure out what they feed on However analyzing theirdiet could be difficult and inefficient using traditional methods such as microhistologicobservation Here we addressed this gap of knowledge by investigating the diet ofgreater white-fronted goose Anser albifrons and bean goose Anser fabalis which areobligate herbivores wintering in China mostly in the Middle and Lower Yangtze Riverfloodplain First we selected a suitable and high-resolution marker gene for wetlandplants that these geese would consume during the wintering period Eight candidategenes were included rbcL rpoC1 rpoBmatK trnH-psbA trnL (UAA) atpF-atpHand psbK-psbI The selection was performed via analysis of representative sequencesfrom NCBI and comparison of amplification efficiency and resolution power of plantsamples collected from the wintering area The trnL gene was chosen at last with chprimers and a local plant reference library was constructed with this gene Thenutilizing DNA metabarcoding we discovered 15 food items in total from the fecesof these birds Of the 15 unique dietary sequences 10 could be identified at specielevel As for greater white-fronted goose 73 of sequences belonged to Poaceae sppand 26 belonged to Carex spp In contrast almost all sequences of bean goosebelonged to Carex spp (99) Using the same samples microhistology providedconsistent food composition with metabarcoding results for greater white-frontedgoose while 13 of Poaceae was recovered for bean goose In addition two othertaxa were discovered only through microhistologic analysis Although most of theidentified taxa matched relatively well between the two methods DNA metabarcodinggave taxonomically more detailed information Discrepancies were likely due to biasedPCRamplification inmetabarcoding lowdiscriminating power of currentmarker genesfor monocots and biases in microhistologic analysis The diet differences betweentwo geese species might indicate deeper ecological significance beyond the scope ofthis study We concluded that DNA metabarcoding provides new perspectives forstudies of herbivorous waterbird diets and inter-specific interactions as well as new

How to cite this article Yang et al (2016) Selection of a marker gene to construct a reference library for wetland plants and the applica-tion of metabarcoding to analyze the diet of wintering herbivorous waterbirds PeerJ 4e2345 DOI 107717peerj2345

possibilities to investigate interactions between herbivores and plants In additionmicrohistologic analysis should be used together with metabarcoding methods tointegrate this information

Subjects Animal Behavior Conservation Biology Ecology Molecular Biology VeterinaryMedicineKeywords Bean goose Greater white-fronted goose Diet analysis trnL Molecular referencelibrary Metabarcoding

INTRODUCTIONWetlands are one of the most important ecosystems in nature and they harbor a variety ofecosystem services such as protection against floods water purification climate regulationand recreational opportunities (Brander Florax amp Vermaat 2006) Waterbirds are typicallywetland-dependent animals upon which they could get abundant food and suitable habitats(Ma et al 2010) Waterbird abundance and distribution could reflect the status of wetlandstructure and functions making them important bio-indicators for wetland health (Fox etal 2011) Among all factors affecting waterbird community dynamics food availability isfrequently considered to play one of themost important roles (Wang et al 2013) Howeverrecently suitable food resources have tended to decrease or even disappear due todeterioration and loss of natural wetlands (Fox et al 2011) As a result waterbirds areforced to discard previous habitats and sometimes even feed in agricultural lands (Zhanget al 2011) In addition migratory waterbirds may aid the dispersal of aquatic plants orinvertebrates by carrying and transporting them between water bodies at various spatialscales (Reynolds Miranda amp Cumming 2015) Consequently long-time monitoring andsystematic studies of waterbird diets are essential to understand population dynamics of wa-terbirds as well as to establish effectivemanagement programs for them (Wang et al 2012)

Traditional methods for waterbird diet analysis were direct observation in the field(Swennen amp Yu 2005) or microhistologic analysis of remnants in feces andor gut contents(James amp Burney 1997 Fox et al 2007) While these approaches have been proved usefulin some cases they are relatively labor-intensive and greatly skill-dependent (Fox et al2007 Samelius amp Alisauskas 1999 Symondson 2002) Applications of other methods foranalyzing gut contents or feces were also restricted due to inherent limitations as reviewedby Pompanon et al (2012) Recently metabarcoding methods based on high-throughputsequencing have provided new perspectives for diet analysis and biodiversity assessment(Taberlet et al 2007 Creer et al 2010) These methods provide higher taxonomic resolu-tion and higher detectability with enormous sequence output from large-scale environ-mental samples such as soil water and feces (Shokralla Spall amp Gibson 2012 Bohmann etal 2014) Owing to these advantages metabarcoding has been widely employed in the dietanalysis of herbivores (Taberlet et al 2012 Ando et al 2013Hibert et al 2013) carnivores(Deagle Kirkwood amp Jarman 2009 Shehzad et al 2012) and omnivores (De Barba et al2014) But the pitfalls of metabarcoding should not be ignored when choosing suitabletechniques for new studies For instance many researches have shown that it is difficult

Yang et al (2016) PeerJ DOI 107717peerj2345 221

to obtain quantitative data using metabarcoding (Sun et al 2015) This drawback mightresult from both technical issues of this method and relevant biological features of samples(Pompanon et al 2012)

One paramount prerequisite of metabarcoding methods is to select robust geneticmarkers and corresponding primers (Zhan et al 2014 Zhan amp MacIsaac 2015) For dietstudies of herbivores at least eight chloroplast genes and two nuclear genes are usedas potential markers for land plants (Hollingsworth Graham amp Little 2011) Althoughmitochondrial cytochrome coxidase I (COI) is extensively recommended as a standardbarcode for animals its relatively low rate of evolution in botanical genomes precludes itbeing an optimum for plants (Wolfe Li amp Sharp 1987 Fazekas et al 2008) The internaltranscribed spacer (ITS) is excluded due to divergence discrepancies of individuals andlow reproducibility (Aacutelvarez amp Wendel 2003) A variety of combinations and comparisonshave been performed for the eight candidate genes however none proved equally powerfulfor all cases (Fazekas et al 2008) Consequently it is more effective to choose barcodesfor a circumscribed set of species occurring in a regional community (Kress et al 2009)Another equally important aspect of metabarcoding applications is the construction ofreference libraries which assist taxonomic assignment (Rayeacute et al 2011 Xu et al 2015)It is difficult to accurately interpret sequence reads without a reliable reference library(Elliott amp Jonathan Davies 2014)

Diet analysis is one of the central issues in waterbird research both for decipheringwaterfowl population dynamics and interpreting inter- or intra-specific interactions ofcohabitating species (Zhao et al 2015) For instance more than 60 of bean goose Anserfabalis and almost 40 of greater white-fronted gooseAnser albifrons populations along theEast AsianndashAustralian Flyway Route winter at the Shengjin Lake National Nature Reserve(Zhao et al 2015) Previous studies based on microhistologic observation illustratedthat the dominant composition of their diets was monocotyledons such as Carex spp(Zhao et al 2012) Poaceae (Zhang et al 2011) and a relatively small proportion of non-monocots (referred to as dicotyledons in the study of lsquoZhao Cao amp Fox 2013rsquo) Howeverfew food items could be identified to species-level mainly owing to variable tissue structureswithin plants similar morphology between relative species and a high level of degradationafter digestion (Zhang et al 2011 Zhao et al 2012 Zhao Cao amp Fox 2013) Ambiguousidentification has hindered understanding of waterbird population dynamics and potentialto establish effective conservation plans for them

In this study we aimed to improve this situation using the metabarcoding method toanalyze diets of these species (see flowchart in Fig 1) By examining the efficiency of eightcandidate genes (rbcL rpoC1 rpoB matK trnH-psbA trnL (UAA) atpF-atpH and psbK-psbI) we selected robust genes and corresponding primers for reference library constructionand high-throughput sequencing Subsequently we used the metabarcoding method toinvestigate diet composition of these two species based on feces collected from ShengjinLake Finally we discussed and compared results from microhistology and DNA metabar-coding using the same samples to assess the utility and efficiency of these two methods

Yang et al (2016) PeerJ DOI 107717peerj2345 321

Figure 1 Technical flowchart of this study

MATERIALS AND METHODSEthics statementOur research work did not involve capture or any direct manipulation or disturbances ofanimals We collected samples of plants and feces for molecular analyses We obtainedaccess to the reserve under the permission of the Shengjin Lake National Nature ReserveAdministration (Chizhou Anhui China) which is responsible for the management of theprotected area and wildlife We were forbidden to capture or disturb geese in the field

Study areaShengjin Lake (11655primendash11715primeE 3015primendash3030primeN) was established as a National NatureReserve in 1997 aiming to protect waterbirds including geese cranes and storks Thewater level fluctuates greatly in this lake with maximal water level of 17 m duringsummer (flood season) but only 10 m during winter (dry season) Due to this fluctuationreceding waters expose two large Carex spp meadows and provide suitable habitats forwaterbirds This makes Shengjin Lake one of the most important wintering sites formigratory waterbirds (Zhao et al 2015) Greater white-fronted goose and bean goose arethe dominant herbivores wintering (from October to April) in this area accounting for40 and 60 of populations along the East AsianndashAustralian Flyway Route respectively(Zhao et al 2015)

Field samplingThe most common plant species that these two geese may consume were collected inMay 2014 and January 2015 especially species belonging to Carex and Poaceae Fresh andintact leaves were carefully picked tin-packaged in the field and stored at minus80 C in thelaboratory before further treatment Morphological identification was carried out with theassistance of two botanists (Profs Zhenyu Li and Shuren Zhang from Institute of BotanyChinese Academy of Sciences)

Yang et al (2016) PeerJ DOI 107717peerj2345 421

Figure 2 The location of our study area Shengjin Lake National Nature Reserve and our samplingsites (Source httperosusgsgov)

All feces were collected at the reserve (Fig 2) in January 2015 Based on previousstudies and the latest waterbird survey sites with large flocks of geese (ie more than 200individuals) were chosen (Zhang et al 2011) As soon as geese finished feeding and feceswere defecated fresh droppings were picked and stored on dry ice Droppings of bean geesewere generally thicker than those of smaller greater white-fronted goose to the degree thatthese could be reliably distinguished in the field (Zhao et al 2015) Disposal gloves werechanged for each sample to avoid cross contamination To avoid repeated sampling andto make sure samples were from different individuals each sample was collected with aseparation of more than 2 m In total 21 feces were collected including 11 for greaterwhite-fronted goose and 10 for bean goose All samples were transported to laboratory ondry ice and then stored at minus80 C until further analysis

Selection of molecular markers and corresponding primersHere we aimed to select gene markers with adequate discriminating power for ourstudy We included eight chloroplast genesmdash rbcL rpoC1 rpoB matK trnH-psbAtrnL (UAA) atpF-atpH and psbK-psbI for estimation Although Shengjin Lake includedan array of plant species we focused mainly on the most likely food resources(Xu et al 2008 Zhao et al 2015) that geese would consume for candidate gene testsThese covered 11 genera and the family Poaceae (Table S1) For tests of all candidate geneswe recovered sequences of representative species in the selected groups from GenBank(httpwwwncbinlmnihgovnuccore) We calculated inter-specific divergence within

Yang et al (2016) PeerJ DOI 107717peerj2345 521

every genus or family based on the Kiruma 2-parameter model (K2P) using MEGA version6 (Tamura et al 2013) We also constructed molecular trees based on UPGMA usingMEGA and characterized the resolution of species by calculating the percentage of speciesrecovered as monophyletic based on phylogenetic trees (Rf) Secondly primers selectedout of eight candidate genes were used to amplify all specimens collected in ShengjinLake and to check their amplification efficiency and universality Thirdly we calculatedinter-specific divergence based on sequences that we obtained from last step Generallya robust barcode gene is obtained when the minimal inter-specific distance exceeds themaximal intra-specific distance (eg existence of barcoding gaps) Finally to allow therecognition of sequences after high-throughput sequencing both of the forward andreverse primers of the selected marker gene were tagged specifically for each sample with8nt nucleotide codes at the 5primeend (Parameswaran et al 2007)

DNA extraction amplification and sequencingTwo hundred milligrams of leaf was used to extract the total DNA from each plant sampleusing a modified CTAB protocol (Cota-Sanchez Remarchuk amp Ubayasena 2006) DNAextraction of feces was carried out using the same protocol with minor modification inincubation time (elongate to 12 h) Each fecal sample was crushed thoroughly and dividedinto four quarters All quarters of DNA extracts were then pooled together DNA extractionwas carried out in a clean room used particularly for this study For each batch of DNAextraction negative controls (ie extraction without feces) were included to monitorpossible contamination

For plant DNA extracts PCR amplifications were carried out in a volume of 25 microl withsim100 ng total DNA as template 1U of Taq Polymerase (Takara Dalian Liaoning ProvinceChina) 1times PCR buffer 2 mM of Mg2 + 025 mM of dNTPs 01 microM of forward primerand 01 microM of reverse primer After 4 min at 94 C the PCR cycles were as follows 35cycles of 30 s at 94 C 30 s at 56 C and 45 s at 72 C and the final extension was 10min at 72 C We applied the same PCR conditions for all primers All the successful PCRproducts were sequenced with Genewiz (Suzhou Jiangsu Province China)

For fecal DNA extracts PCR mixtures (25 microl) were prepared in six replicates foreach sample to reduce biased amplification Each replicate was subjected to the sameamplification procedure used for plant extracts The six replicates of each sample werepooled and purified using the Sangon PCR product purification kit (Sangon BiotechShanghai China) Quantification was carried out to ensure equilibrium of contributionof each sample using the NanoDrop ND-2000 UV-Vis Spectrophotometer (NanoDropTechnologies Wilmington Delaware USA) High-throughput sequencing was performedusing Illumina MiSeq platform following manufacturerrsquos instructions by BGI (ShenzhenGuangdong Province China) Reads of high-throughput sequencing could be found atNCBIrsquos Sequence Read Archive (Accession number SRP070470)

Data analysis for estimating diet compositionAfter high-throughput sequencing pair-ended readsweremergedwith the fastq_mergepairscommand using usearch (httpdrive5comusearch Edgar 2010) Reads were then split

Yang et al (2016) PeerJ DOI 107717peerj2345 621

into independent files according to unique tags using the initial process of RDP pipeline(httpspyrocmemsueduinitformspr) We removed sequences (i) that didnrsquot perfectlymatch tags and primer sequences (ii) that contained ambiguous nucleotide (Nrsquos) Tagsand primers were then trimmed using the initial process of RDP pipeline Further qualityfiltering using usearch discarded sequences with (i) quality score less than 30 (ltQ30) and(ii) length shorter than 100 bp Unique sequences were clustered to operational taxonomyunits (OTUs) at the similarity threshold of 98 (Edgar 2013) All OTUs were assignedto unique taxonomy with local blast 2230+ (Altschul et al 1990) We detected a plantwithin the reference library for each sequence with the threshold of length coverage gt98identity gt98 and e-value lt 10eminus50 If a query sequence matched two or more taxa itwas assigned to a higher taxonomic level which included all taxa

Microhistology analysisWe used the method described by Zhang et al (2011) to perform microhistologicexamination of fecal samples Each sample was first washed with pure water and filteredwith a 25-microm filter Subsequently the suspension was examined under a light microscopeat 10times magnification for quantification statistics and at 40times magnification for speciesidentification We compared photos of visible fragments with an epidermis database ofplants from Shengjin Lake to identify food items (Zhang et al 2011)

RESULTSSelection of genes and corresponding primers and reference libraryconstructionA total of 3296 representative sequences were recovered from GenBank ranging from 0to 345 sequences per gene per taxon (Table S1) For Eleocharis and Trapa only sequencesof rbcL gene and trnL gene were retained which makes it unfair to compare the efficiencyand suitability of eight candidate genes For the other ten taxa trnL trnH-psbA rbcL andpsbK-psbI showed the largest inter-specific divergence in five three one and one taxonomicgroups respectively In addition trnH-psbA atpF-atpH trnL and psbK-psbI showed thehighest mean divergence in four four one and one taxonomic groups respectivelyHowever given the small number of sequences and coverage of species the suitability andefficiency of atpF-atpH and psbK-psbI seem to be less reliable than others This comparisonmakes trnH-psbA trnL and rbcL to be selected out of the eight candidate genes As matKused to be recommended as the standard barcode gene for Carex species (Starr Naczi ampChouinard 2009) which happened to be the dominant food for herbivorous geese in ourstudy (Zhao et al 2015) we included matK as a supplement at last

Primers for these four genes (Table 1) were used to amplify the plants that we collectedin the field In total we collected 88 specimens in the field belonging to 25 families53 genera and 70 species (Table 2) The selected primers for trnL and rbcL successfullyamplified 100 and 91 of all species respectively while primers for trnH-psbA andmatKamplified only 71 and 43 respectively Therefore we chose trnL and rbcL to test theirdiscriminating power in our target plants

Yang et al (2016) PeerJ DOI 107717peerj2345 721

Table 1 Primers of candidate genes and reference library constructingOnly the c and h were used forhigh-throughput sequencing in fusion primer mode (primer+ tags) The unique tags were used to differ-entiate PCR products pooled together for highthroughput sequencing (Parameswaran et al 2007)

Gene Primer Sequence (5prime-3prime)

matK matK-XFa TAATTTACGATCAATTCATTCmatK-MALPb ACAAGAAAGTCGAAGTAT

rbcL rbcLa-Fc ATGTCACCACAAACAGAGACTAAAGCrbcLa-Rd GTAAAATCAAGTCCACCRCG

trnH-psbA pasbA3_fe CGCGCATGGTGGATTCACAATCCtrnHf_05f GTTATGCATGAACGTAATGCTC

trnL cg CGAAATCGGTAGACGCTACGhh CCATTGAGTCTCTGCACCTATC

Notesareferred to Ford et al (2009)breferred to Dunning amp Savolainen (2010)creferred to Hasebe et al (1994)dreferred to Kress et al (2009)ereferred to Tate amp Simpson (2003)freferred to Sang Crawford amp Stuessy (1997)greferred to Taberlet et al (1991)hreferred to Taberlet et al (2007)

We calculated the inter-specific divergence within genera and families with at leasttwo species to compare their discriminating power Maximal minimal and mean inter-specific distances were calculated for seven dominant genera and six dominant families(Table 3) Neither gene could differentiate species of Vallisneria (mean= 0000 plusmn 0000)or Artemisia (mean = 0000 plusmn 0000) But trnL showed a larger divergence rangefor the other six genera and five families Hence we chose trnL as the barcoding genefor reference library constructing and high-throughput sequencing for our study Thediscriminating power of trnL was strong for most species (Table 4) However some speciescould only be identified at genus-level or family-level with trnL For instance five speciesof Potamogetonaceae shared the same sequences and this made them to be identified atgenus-level Species could be identified easily to genus and family except for three grasses(Poaceae) Beckmannia syzigachne Phalaris arundinacea and Polypogon fugax which sharedidentical sequences

Data processing for estimating diet compositionIn total 021 and 018 million reads were generated for greater white-fronted goose(GWFG) and bean goose (BG) respectively (Table 5) The number of recovered OTUsranged from 8 to 123 for GWFG and BG samples We used local BLAST to compare thesesequences with the Shengjin Lake reference database Finally with DNA metabarocoding12 items were discovered in the feces of GWFG including one at family-level three atgenus-level and eight at species-level (Table 6) Four items were discovered in the feces ofBG including one at genus-level and three at species-level In total this method identified15 taxa in feces of these geese

Yang et al (2016) PeerJ DOI 107717peerj2345 821

Table 2 Plant species in the reference libraryWe collected these samples from Shengjin Lake

Species No of samples Species No of samples

Curculigo orchioides 1 Trapella sinensis 2Artemisia capillaris 1 Plantago asiatica 1Artemisia selengensis 2 Alopecurus aequalis 2Aster subulatus 1 Beckmannia syzigachne 1Bidens frondosa 1 Bromus japonicus 1Erigeron annuus 1 Cynodon dactylon 2Gnaphalium affine 1 Phalaris arundinacea 1Hemistepta lyrata 1 Phragmites australis 1Kalimeris incisa 1 Poa annua 1Bothriospermum kusnezowii 1 Polypogon fugax 1Lobelia chinensis 1 Roegneria kamoji 2Sagina japonica 1 Zizania latifolia 1Stellaria media 2 Polygonum lapathifolium 4Calystegia hederacea 1 Polygonum orientale 1Cardamine lyrata 1 Polygonum perfoliatum 1Carex heterolepis 3 Polygonum persicaria 1Carex capricornis 1 Rumex trisetiferus 3Carex paxii 1 Potamogeton crispus 1Carex remotiuscula 1 Potamogeton maackianus 1Fimbristylis dichotoma 1 Potamogeton malaianus 1Eleocharis migoana 1 Potamogeton natans 1Scripus karuizawensis 1 Potamogeton pectinatus 1Nymphoides peltatum 1 Clematis florida 1Myriophyllum spicatum 1 Ranunculus chinensis 2Hydrilla verticillta 1 Ranunculus sceleratus 2Hydrocharis dubia 1 Potentilla freyniana 2Vallisineria spiralis 1 Gratiola japonica 1Vallisneria spinulosa 1 Mazus miquelii 2Juncus effusus 1 Veronica undulata 1Juncus gracillimus 1 Trapa bispinosa 1Leonurus japonicus 1 Trapa maximowiczii 1Salvia plebeia 1 Trapa pseudoincisa 1Glycine soja 1 Trapa quadrispinosa 1Vicia sativa 1 Hydrocotyle sibthorpioides 1Euryale ferox 1 Torilis japonica 2

However the sequence percentage of each food item varied greatly (Table 6) ForGWFG the majority of sequences (9636) were composed of only five itemsmdashPoaceaespp (4798 except Poa annua) Poa annua (2186) Carex heterolepis (1751) Carexspp (901 except Carex heterolepis) and Alopecurus aequalis (321) For BG almostall the sequences belonged to Carex heterolepis (9949) Other items only occupied arelatively small proportion of sequences In addition the presence of each item per samplewas also unequal (Table S2) For example in GWFG Carex heterolepis Carex spp Poa

Yang et al (2016) PeerJ DOI 107717peerj2345 921

Table 3 Inter-specific divergences within dominant genera and families of rbcL gene and trnL gene with Kiruma 2-Parameter modelUnder-scores indicate the most common food composition based on earlier microhistologic analysis (Zhao et al 2012 Zhao et al 2015)

Inter-specific divergence Taxa rbcL trnL

Maximal Minimal Mean Maximal Minimal Mean

Artemisia 0000 0000 0000plusmn 0000 0000 0000 0000plusmn 0000Carex 0013 0000 0008plusmn 0006 0058 0000 0027plusmn 0021Polygonum 0027 0000 0010plusmn 0006 0076 0000 0033plusmn 0022Potamogeton 0012 0000 0005plusmn 00034 0016 0000 0005plusmn 0005Ranunculus 0031 0000 0020plusmn 0009 0042 0021 0024plusmn 0022Trapa 0000 0000 0000plusmn 0000 0081 0000 0049plusmn 0030

Withingenera

Vallisneria 0000 0000 0000plusmn 0000 0000 0000 0000plusmn 0000Cyperaceae 0043 0000 0018plusmn 0010 0178 0000 0084plusmn 0046Asteraceae 0120 0000 0049plusmn 0017 0087 0000 0023plusmn 0018Poaceae 0025 0000 0016plusmn 00009 0166 0000 0074plusmn 0039Hydrocharitaceae 0122 0000 0078plusmn 0020 0159 0000 0100plusmn 0054Polygonaceae 0043 0000 0020plusmn 0009 0129 0000 0031plusmn 0022

Withinfamilies

Ranunculaceae 0033 0016 0017plusmn 0015 0045 0000 0018plusmn 0013

Table 4 Number of species and unique sequences for families with more than one species in ShengjinLake plant database

Family No of species No of sequences

Asteraceae 8 7Caryophyllaceae 2 2Cyperaceae 7 5Fabaceae 2 2Hydrocharitaceae 4 3Lamiaceae 2 2Poaceae 10 8Polygonaceae 5 5Potamogetonaceae 5 1Ranunculaceae 3 3Scrophulariaceae 3 3Trapaceae 4 3Umbelliferae 2 2

annua and Potentilla supina were present in almost all the samples while Stellaria mediaAsteraceae sp and Lapsana apogonoldes occurred in only about one third of samples

When the microhistologic examination was performed using the same samples eightitems were found in the feces of greater white-fronted goose including one at family-levelfour at genus-level and three at species-level (Table 6) Dominant items were Poaceae spp(4568) Alopecurus Linn (3093) and Carex heterolepis (1639) Seven items werefound in the feces of bean goose including four at genus-level and three at species-level(Table 6) Dominant items were Carex heterolepis (6285) Asteraceae sp (1455) andAlopecurus Linn (1318)

Yang et al (2016) PeerJ DOI 107717peerj2345 1021

Table 5 Summary of the process and results of high-throughput sequencing analysis

Sample Pair-endsequences

Sequences for whichprimers and tags wereidentified and with lengthgt100 bp

Uniquesequences

OTUs Food items

GWFG1 16303 8627 1288 78 8GWFG2 25482 13449 1091 102 8GWFG3 19063 10056 1277 48 10GWFG4 23856 12548 1419 114 8GWFG5 20955 11249 1720 123 9GWFG6 11677 7205 973 52 9GWFG7 13377 6782 1328 59 9GWFG8 7749 3959 774 89 9GWFG9 16833 8799 1436 90 6GWFG10 18474 9819 449 32 9GWFG11 19648 10458 617 31 6BG1 20225 10254 784 23 4BG2 14195 7161 564 16 2BG3 2229 1149 255 12 4BG4 517 268 77 8 3BG5 28152 14033 1000 15 3BG6 16723 8484 740 17 4BG7 30166 15403 974 15 4BG8 30928 15706 1028 15 3BG9 8382 4489 446 13 4BG10 10714 5526 537 13 4

NotesGWFG Greater white-fronted goose BG Bean goose

DISCUSSIONMarker selection and reference library constructing for diet analysisWith greatly reduced cost extremely high throughput and information contentmetabarcoding has revolutionized the exploration and quantification of dietary analysiswith noninvasive samples containing degraded DNA (Fonseca et al 2010 Shokralla et al2014) Despite enormous potential to boost data acquisition successful application of thistechnology relies greatly on the power and efficiency of genetic markers and correspondingprimers (Bik et al 2012 Zhan et al 2014) In order to select the most appropriate markergene for our study we compared the performance of eight commonly used chloroplastgenes (rbcL rpoB rpoC1 matK trnL trnH-psbA atpF-atpH and psbK-psbI) and theircorresponding primers Although a higher level of discriminating power was shown inseveral studies atpF-atpH and psbK-psbI were not as commonly used as other barcodinggenes (Hollingsworth Graham amp Little 2011) As one of the most rapidly evolving codinggenes of plastid genomes matK was considered as the closest plant analogue to theanimal barcode COI (Hilu amp Liang 1997) However matK was difficult to amplify using

Yang et al (2016) PeerJ DOI 107717peerj2345 1121

Table 6 List of the lowest taxonomic food items in the diet of geese

Food items Level of identification GWFG BG

N reads Fs () Fm () N reads Fs () Fm ()

Poaceae spp (except Poa annua) Family 51705 4798 4568 0 000 000Poa annua Species 23554 2186 000 167 020 000Carex heterolepis Species 18867 1751 1639 81457 9949 6285Carex spp (except Carex heterolepis) Genus 9706 901 231 191 023 349Alopecurus aequalis Species 3458 321 000 0 000 000Potentilla chinensis Species 184 017 118 65 008 206Cynodon dactylon Species 155 014 000 0 000 000Polygonum spp Genus 56 005 000 0 000 000Stellaria media Species 26 002 000 0 000 000Ranunculus chinensis Species 14 002 000 0 000 000Lapsana apogonoides Species 11 002 000 0 000 000Asteraceae sp Genus 16 001 233 0 000 1455Alopecurus Genus 0 000 3093 0 000 1318Carex thunbergii Species 0 000 054 0 000 279Fabaceae sp Genus 0 000 064 0 000 108

NotesGWFG Greater white-fronted goose BG Bean goose Fs percentage of sequences in DNA metabarcoding Fm percentage of epidermis squares in microhistological analysis

available primer sets with only 43 of successful amplification in this study In spite ofthe higher species discrimination success of trnH-psbA than rbcL+matK in some groupsthe presence of duplicated loci microinversions and premature termination of readsby mononucleotide repeats lead to considerable proportion of low-quality sequences andover-estimation of genetic difference when using trnH-psbA (Graham et al 2000WhitlockHale amp Groff 2010) In contrast the barcode region of rbcL is easy to amplify sequenceand align in most plants and was recommended as the standard barcode for land plants(Chase et al 2007) The relatively modest discriminating power (compared to trnL) pre-cludes its application for our study aiming to recover high resolution of food items Conse-quently trnL was selected out of eight candidate markers with 100 amplification successmore than 90 of high quality sequences and relatively large inter-specific divergence

One of the biggest obstacles in biodiversity assessment and dietary analysis is the lack ofa comprehensive reference library without which it is impossible to accurately interpretand assign sequences generated from high-throughput sequencing (Valentini Pompanonamp Taberlet 2009 Barco et al 2015) In this study we constructed a local reference libraryby amplifying the most common species (70 morpho-species in total) during the winteringperiod with the trnL gene Although not all of them could be identified at species-levelwith trnL due to relatively low inter-specific divergence many species could be separatedwith distinctive sequences Previous studies have recommended group-specific barcodes todifferentiate closely related plants at the species level (Li et al 2015) For instance matKhas been proved to be more efficient for the discrimination of Carex spp (Starr Naczi ampChouinard 2009) However the primer set of matK failed to amplify species of Carex spp

Yang et al (2016) PeerJ DOI 107717peerj2345 1221

in our study suggesting the universality of selected primer pairs should be tested in eachstudy (Zhan et al 2014)

Applications of metabarcoding for geese diet analysisA variety of recent studies have demonstrated the great potential of metabarcoding fordietary analysis mainly owing to the high throughput high discriminating power and theability to process large-scale samples simultaneously (Creer et al 2010 Taberlet et al 2012Shehzad et al 2012) In this study we applied this method to recover diets of herbivorousgeese and provided standard protocols for dietary analysis of these two ecologicallyimportant waterbirds Our results further proved the more objective less experience-dependent and more time-efficient character of DNA metabarcoding However not allthe species in the reference library could be identified at species-level owing to low inter-specific divergence We suggest that multiple group-specific markers to be incorporatedin the future as in De Barba et al (2014) Two species Carex thunbergii and Fabaceae spwere only discovered via microhistologic analysis rather than metabarcoding This failuremight reflect the biased fragment amplification of current technology of which dominanttemplates could act as inhibitors of less dominant species (Pintildeol et al 2015) Howeverthree species of Poaceae were only discovered using metabarcoding In total more taxaand higher resolution were attained using metabarcoding But microhistology still proveda powerful supplementary Previous studies using metabarcoding usually detected dozensof food items even as many as more than one hundred species For instance 18 taxaprey were identified for leopard cat (Prionailurus bengalensis) (Shehzad et al 2012) 44plant taxa were recovered in feces of red-headed wood pigeon (Columba janthina nitens)(Ando et al 2013) while more than 100 taxa were found in diet studies of brown bear(Ursus arctos) (De Barba et al 2014) The relatively narrow diet spectrum of herbivorousgeese may lead to misunderstanding that this result of our study is merely an artefact dueto small sampling effort However this result is credible since these two geese species onlyfeed on Carex meadow where the dominant vegetation is Carex spp with other speciessuch as Poaceae and dicots (Zhao et al 2015) Even though other wetland plants exist theyusually composed only a small proportion of the geese diets

Quantification of food composition is another key concern in dietary analysis Althoughthe relative percentage of sequences was not truly a quantitative estimate of diet taxa ofthe majority sequences in this study were in accord with microhistologic observationswhich was considered an efficient way to provide quantitative results (Wang et al 2013)Discrepancies might come from the semi-quantitative nature of metabarcoding methods(Sun et al 2015) This is likely derived from PCR amplification which always entails biasescaused by universal primer-template mismatches annealing temperature or number ofPCR cycles (Zhan et al 2014Pintildeol et al 2015) Othermethods such as shotgun sequencingor metagenomic sequencing could be incorporated in the future to give information onabundances of food items (Srivathsan et al 2015)

Implications for waterbird conservation and wetland managementFor long-distance migratory waterbirds such as the wild geese in this study theirabundance and distribution are greatly influenced by diet availability and habitat use

Yang et al (2016) PeerJ DOI 107717peerj2345 1321

(Wang et al 2013) For example waterbirds may be restricted at (forced to leave) certainareas due to favoring (loss) of particular food (Wang et al 2013) while the recoveryof such food may contribute to return of bird populations (Noordhuis Van der Molenamp Van den Berg 2002) Results of both metabarcoding and microhistologic analysis inthis study revealed that Carex and Poaceae were dominant food components which is inaccordance with previous studies The increasing number of these two geese wintering atthe Shengjin Lake may be attributed to the expansion of Carex meadow which offers accessto abundant food resources (Zhao et al 2015) Considering the long-distance migratorycharacter of these birds it is important to maintain energy balances and good bodyconditions in wintering areas because this might further influence their departure datesand reproductive success after arriving at breeding areas (Prop Black amp Shimmings 2003)Based on this it is important for wetland managers to maintain the suitable habitats andfood resources for sustainable conservation of waterbirds which highlights the significanceof diet information Our study also indicated that overlap and dissimilarity existed betweenthe diets of these two geese Animals foraging in the same habitats may compete for limitedfood resources (Madsen amp Mortensen 1987) This discrepancy of food composition mayarise from the avoidance of inter-specific competition (Zhao et al 2015) However withthe increase of these two species in Shengjin Lake further research is needed to investigatethe mechanisms of food resource partitioning and spatial distribution

Shengjin Lake is one of themost important wintering sites for tens of thousands ofmigra-tory watebirds while annual life cycles of these birds depend on the whole migratory routeincluding breeding sites stop-over sites and wintering sites (Kear 2005) Thus a molecularreference library covering all the potential food items along the whole migratory routewill be useful both for understanding of wetland connections and waterbird conservationIn addition the ability of DNA metabarcoding to process lots of samples simultaneouslyenables rapid analyses and makes this method helpful for waterbird studies

ACKNOWLEDGEMENTSWe are very grateful to the stuff of the Shengjin Lake National Nature Reserve for theirexcellent assistance during the field work Great thanks to Zhujun Wang and An An forfeces collection in the field We thank Song Yang for collecting plants in the Shengjin LakeReserve We also thank Profs Zhenyu Li and Shuren Zhang for plant identification Specialthanks to Drs Meijuan Zhao Xin Wang Fanjuan Meng and Peihao Cong for preparingthe epidermis database and guiding microhistologic analysis

ADDITIONAL INFORMATION AND DECLARATIONS

FundingThis work was supported by the National Basic Research Program of China (973 ProgramGrant No 2012CB956104) to EZ the National Natural Science Foundation of China(Grant No 31370416) State Key Laboratory of Urban and Regional Ecology ChineseAcademy of Sciences (No SKLURE2013-1-05) to LC the National Natural Science

Yang et al (2016) PeerJ DOI 107717peerj2345 1421

Foundation of China (Grant No 31500315) to XW and 100-Talent Program of theChinese Academy of Sciences to AZ The funders had no role in study design datacollection and analysis decision to publish or preparation of the manuscript

Grant DisclosuresThe following grant information was disclosed by the authorsNational Basic Research Program of China 2012CB956104National Natural Science Foundation of China 31370416State Key Laboratory of Urban and Regional EcologyChinese Academy of Sciences SKLURE2013-1-05National Natural Science Foundation of China 31500315

Competing InterestsThe authors declare there are no competing interests

Author Contributionsbull Yuzhan Yang conceived and designed the experiments performed the experimentsanalyzed the data contributed reagentsmaterialsanalysis tools wrote the paperprepared figures andor tables reviewed drafts of the paperbull Aibin Zhan conceived and designed the experiments analyzed the data contributedreagentsmaterialsanalysis tools reviewed drafts of the paperbull Lei Cao conceived and designed the experiments contributed reagentsmaterialsanalysistools reviewed drafts of the paperbull Fanjuan Meng and Wenbin Xu performed the experiments contributed reagentsmate-rialsanalysis tools reviewed drafts of the paper

Animal EthicsThe following information was supplied relating to ethical approvals (ie approving bodyand any reference numbers)

Our research work did not involve capture or any direct manipulation or disturbancesof animals

Field Study PermissionsThe following information was supplied relating to field study approvals (ie approvingbody and any reference numbers)

We collected samples of plants and feces for molecular analyses We received permissionto access the reserve for sampling from the Shengjin Lake National Nature ReserveAdministration (Chizhou Anhui China) which is responsible for the management of theprotected area and wildlife We were forbidden to capture or disturb geese in the field

DNA DepositionThe following information was supplied regarding the deposition of DNA sequences

NCBI Sequence Read Archive (SRA) databaseAccession number SRP070470

Yang et al (2016) PeerJ DOI 107717peerj2345 1521

Data AvailabilityThe following information was supplied regarding data availability

The raw data has been supplied as a Supplemental Dataset

Supplemental InformationSupplemental information for this article can be found online at httpdxdoiorg107717peerj2345supplemental-information

REFERENCESAltschul SF GishWMillerWMyers EW Lipman DJ 1990 Basic local alignment

search tool Journal of Molecular Biology 215403ndash410 DOI 101006jmbi19909999Aacutelvarez I Wendel JF 2003 Ribosomal ITS sequences and plant phylogenetic inference

Molecular Phylogenetics and Evolution 29417ndash434DOI 101016S1055-7903(03)00208-2

Ando H Setsuko S Horikoshi K Suzuki H Umehara S Inoue-MurayamaM IsagiY 2013 Diet analysis by nextmdashgeneration sequencing indicates the frequentconsumption of introduced plants by the critically endangered redmdashheaded woodpigeon (Columba janthina nitens) in oceanic island habitats Ecology and Evolution34057ndash4069 DOI 101002ece3773

Barco A RaupachMJ Laakmann S NeumannH Knebelsberger T 2015 Identificationof North Sea molluscs with DNA barcodingMolecular Ecology Resources 16288ndash297DOI 1011111755-099812440

Bik HM Porazinska DL Creer S Caporaso JG Knight R ThomasWK 2012 Sequenc-ing our way towards understanding global eukaryotic biodiversity Trends in Ecologyand Evolution 27233ndash243 DOI 101016jtree201111010

Bohmann K Evans A Gilbert MTP Carvalho GR Creer S KnappM YuWD DeBruynM 2014 Environmental DNA for wildlife biology and biodiversity monitor-ing Trends in Ecology and Evolution 29358ndash367 DOI 101016jtree201404003

Brander LM Florax RJ Vermaat JE 2006 The empirics of wetland valuation acomprehensive summary and a meta-analysis of the literature Environmental andResource Economics 33223ndash250 DOI 101007s10640-005-3104-4

Chase MW Cowan RS Hollingsworth PM Van den Berg C Madrintildeaacuten S Petersen GSeberg O Jorgsensen T Cameron KM Carine M Pedersen N Hedderson TAJConrad F Salazar GA Richardson JE HollingsworthM Barraclough TG KellyL WilkinsonM 2007 A proposal for a standardised protocol to barcode all landplants Taxon 56295ndash299

Cota-Sanchez JH Remarchuk K Ubayasena K 2006 Ready-to-use DNA extracted witha CTAB method adapted for herbarium specimens and mucilaginous plant tissuePlant Molecular Biology Reporter 24161ndash167 DOI 101007BF02914055

Creer S Fonseca VG Porazinska DL Giblin-Davis RM SungW Power DM Packer MCarvalho GR Blaxter ML Lambshead P ThomasWK 2010 Ultrasequencing ofthe meiofaunal biosphere Practice pitfalls and promisesMolecular Ecology 194ndash20DOI 101111j1365-294X200904473x

Yang et al (2016) PeerJ DOI 107717peerj2345 1621

De BarbaMMiquel C Boyer F Mercier C Rioux D Coissac E Taberlet P 2014 DNAmetabarcoding multiplexing and validation of data accuracy for diet assessmentapplication to omnivorous dietMolecular Ecology Resources 14306ndash323DOI 1011111755-099812188

Deagle BE Kirkwood R Jarman SN 2009 Analysis of Australian fur seal diet bypyrosequencing prey DNA in faecesMolecular Ecology 182022ndash2038DOI 101111j1365-294X200904158x

Dunning LT Savolainen V 2010 Broad-scale amplification of matK for DNA barcodingplants a technical note Botanical Journal of the Linnean Society 1641ndash9DOI 101111j1095-8339201001071x

Edgar RC 2010 Search and clustering orders of magnitude faster than BLAST Bioinfor-matics 262460ndash2461 DOI 101093bioinformaticsbtq461

Edgar RC 2013 UPARSE highly accurate OTU sequences from microbial ampliconreads Nature Methods 10996ndash998 DOI 101038NMETH2604

Elliott TL Jonathan Davies T 2014 Challenges to barcoding an entire floraMolecularEcology Resources 14883ndash891 DOI 1011111755-099812277

Fazekas AJ Burgess KS Kesanakurti PR Graham SW Newmaster SG Husband BCPercy DM Hajibabaei M Barrett SC 2008Multiple multilocus DNA barcodesfrom the plastid genome discriminate plant species equally well PLoS ONE 3e2802DOI 101371journalpone0002802

Fonseca VG Carvalho GR SungW Johnson HF Power DM Neill SP Packer MBlaxter ML Labmshead PJD ThomasWK Creer S 2010 Second-generationenvironmental sequencing unmasks marine metazoan biodiversity Nature Commu-nications 1Article 98 DOI 101038ncomms1095

Ford CS Ayres KL Haider N Toomey N Van-Alpen-Stohl J 2009 Selection ofcandidate DNA barcoding regions for use on land plants Botanical Journal of theLinnean Society 1591ndash11 DOI 101111j1095-8339200800938x

Fox AD Bergersen E Tombre IM Madsen J 2007Minimal intra-seasonal dietaryoverlap of barnacle and pink-footed geese on their breeding grounds in SvalbardPolar Biology 30759ndash768 DOI 101007s00300-006-0235-1

Fox AD Cao L Zhang Y Barter M ZhaoMMeng FWang S 2011 Declines in thetuber-feeding waterbird guild at Shengjin Lake National Nature Reserve Chinandashabarometer of submerged macrophyte collapse Aquatic Conservation Marine andFreshwater Ecosystems 2182ndash91 DOI 101002aqc1154

Graham SW Reeves PA Burns AC Olmstead RG 2000Microstructural changes innoncoding chloroplast DNA interpretation evolution and utility of indels andinversions in basal angiosperm phylogenetic inference International Journal of PlantSciences 161S83ndashS96 DOI 101086317583

HasebeM Omori T NakazawaM Sano T KatoM Iwatsuki K 1994 rbcL genesequences provide evidence for the evolutionary lineages of leptosporangiate fernsProceedings of the National Academy of Sciences of the United States of America915730ndash5734 DOI 101073pnas91125730

Yang et al (2016) PeerJ DOI 107717peerj2345 1721

Hibert F Taberlet P Chave J Scotti-Saintagne C Sabatier D Richard-Hansen C 2013Unveiling the diet of elusive rainforest herbivores in next generation sequencing eraThe tapir as a case study PLoS ONE 8e60799 DOI 101371journalpone0060799

Hilu KW Liang H 1997 ThematK gene sequence variation and application in plantsystematics American Journal of Botany 84830ndash839 DOI 1023072445819

Hollingsworth PM Graham SW Little DP 2011 Choosing and using a plant DNAbarcode PLoS ONE 6e19254 DOI 101371journalpone0019254

James HF Burney DA 1997 The diet and ecology of Hawaiirsquos extinct flightless water-fowl evidence from coprolites Biological Journal of the Linnean Society 62279ndash297DOI 101111j1095-83121997tb01627x

Kear J 2005Ducks geese and swans Oxford Oxford University PressKressWJ Erickson DL Jones FA Swenson NG Perez R Sanjur O Bermingham E

2009 Plant DNA barcodes and a community phylogeny of a tropical forest dynamicsplot in Panama Proceedings of the National Academy of Sciences of the United States ofAmerica 10618621ndash18626 DOI 101073pnas0909820106

Li X Yang Y Henry RJ Rossetto MWang Y Chen S 2015 Plant DNA barcoding fromgene to genome Biological Reviews 90157ndash166 DOI 101111brv12104

Ma Z Cai Y Li B Chen J 2010Managing wetland habitats for waterbirds an interna-tional perspectiveWetlands 3015ndash27 DOI 101007s13157-009-0001-6

Madsen J Mortensen CE 1987Habitat exploitation and interspecific competition ofmoulting geese in East Greenl Ibis 12925ndash44DOI 101111j1474-919X1987tb03157x

Noordhuis R Van der Molen DT Van den BergMS 2002 Response of herbivorouswater-birds to the return of Chara in Lake Veluwemeer The Netherlands AquaticBotany 72349ndash367 DOI 101016S0304-3770(01)00210-8

Parameswaran P Jalili R Tao L Shokralla S Gharizadeh B Ronaghi M Fire AZ 2007A pyrosequencing-tailored nucleotide barcode design unveils opportunities for large-scale sample multiplexing Nucleic Acids Research 35e130 DOI 101093nargkm760

Pintildeol J Mir G Gomez-Polo P Agustiacute N 2015 Universal and blocking primermismatches limit the use of high-throughput DNA sequencing for the quanti-tative metabarcoding of arthropodsMolecular Ecology Resources 15819ndash830DOI 1011111755-099812355

Pompanon F Deagle BE SymondsonWO Brown DS Jarman SN Taberlet P 2012Who is eating what diet assessment using next generation sequencingMolecularEcology 211931ndash1950 DOI 101111j1365-294X201105403x

Prop J Black JM Shimmings P 2003 Travel schedules to the high arctic barnaclegeese trademdashoff the timing of migration with accumulation of fat deposits Oikos103403ndash414 DOI 101034j1600-0706200312042x

Rayeacute G Miquel C Coissac E Redjadj C Loison A Taberlet P 2011 New insightson diet variability revealed by DNA barcoding and high-throughput sequenc-ing chamois diet in autumn as a case study Ecological Research 26265ndash276DOI 101007s11284-010-0780-5

Yang et al (2016) PeerJ DOI 107717peerj2345 1821

Reynolds C Miranda NA Cumming GS 2015 The role of waterbirds in the dispersalof aquatic alien and invasive species Diversity and Distributions 21744ndash754DOI 101111ddi12334

Samelius G Alisauskas RT 1999 Diet and growth of glaucous gulls at a large Arcticgoose colony Canadian Journal of Zoology 771327ndash1331 DOI 101139z99-091

Sang T Crawford DJ Stuessy TF 1997 Chloroplast DNA phylogeny reticulateevolution and biogeography of Paeonia (Paeoniaceae) American Journal of Botany841120ndash1136

ShehzadW Riaz T NawazMAMiquel C Poillot C Shah SA Pompanon F CoissacE Taberlet P 2012 Carnivore diet analysis based on next-generation sequencingapplication to the leopard cat (Prionailurus bengalensis) in PakistanMolecularEcology 211951ndash1965 DOI 101111j1365-294X201105424x

Shokralla S Gibson JF Nikbakht H Janzen DH HallwachsW Hajibabaei M 2014Next-generation DNA barcoding using next-generation sequencing to enhance andaccelerate DNA barcode capture from single specimensMolecular Ecology Resources14892ndash901 DOI 1011111755-099812236

Shokralla S Spall JL Gibson JF 2012 Next-generation sequencing technologies forenvironmental DNA researchMolecular Ecology 211794ndash1805DOI 101111j1365-294X201205538x

Srivathsan A Sha J Vogler AP Meier R 2015 Comparing the effectiveness of metage-nomics and metabarcoding for diet analysis of a leaf-feeding monkey (Pygathrixnemaeus)Molecular Ecology Resources 15250ndash261 DOI 1011111755-099812302

Starr JR Naczi RFC Chouinard BN 2009 Plant DNA barcodes and species res-olution in sedge (Carex Cyperaceae)Molecular Ecology Resources 9151ndash163DOI 101111j1755-0998200902640x

Sun C Zhao Y Li H Dong Y MacIsaac HJ Zhan A 2015 Unreliable quantificationof species abundance based on high-throughput sequencing data of zooplanktoncommunities Aquatic Biology 249ndash15 DOI 103354ab00629

Swennen C Yu YT 2005 Food and feeding behavior of the black-faced SpoonbillWaterbirds 2819ndash27 DOI 1016751524-4695(2005)028[0019FAFBOT]20CO2

SymondsonWOC 2002Molecular identification of prey in predator dietsMolecularEcology 11627ndash641 DOI 101046j1365-294X200201471x

Taberlet P Coissac E Pompanon F Brochmann CWillerslev E 2012 Towards next-generation biodiversity assessment using DNA metabarcodingMolecular Ecology212045ndash2050 DOI 101111j1365-294X201205470x

Taberlet P Coissac E Pompanon F Gielly L Miquel C Valentini A Vermat TCorthier G Brocmann CWillerslev E 2007 Power and limitations of the chloro-plast trnL (UAA) intron for plant DNA barcoding Nucleic Acids Research 35e14DOI 101093nargkl938

Taberlet P Gielly L Pautou G Bouvet J 1991 Universal primers for amplifica-tion of three non-coding regions of chloroplast DNA Plant Molecular Biology171105ndash1109 DOI 101007BF00037152

Yang et al (2016) PeerJ DOI 107717peerj2345 1921

Tamura K Stecher G Peterson D Filipski A Kumar S 2013MEGA6 molecular evolu-tionary genetics analysis version 60MolecularBiology and Evolution 302725ndash2729DOI 101093molbevmst197

Tate JA Simpson BB 2003 Paraphyly of Tarasa (Malvaceae) and diverse origins of thepolyploidy species Systematic Botany 28723ndash737 DOI 10104302-641

Valentini A Pompanon F Taberlet P 2009 DNA barcoding for ecologists Trends inEcology and Evolution 24110ndash117 DOI 101016jtree200809011

Wang X Fox AD Cong P Barter M Cao L 2012 Changes in the distribution and abun-dance of wintering Lesser White-fronted Geese Anser erythropus in eastern ChinaBird Conservation International 22128ndash134 DOI 101017S095927091100030X

Wang X Fox AD Cong P Cao L 2013 Food constraints explain the restricted dis-tribution of wintering Lesser white-fronted Geese Anser erythropus in China Ibis155576ndash592 DOI 101111ibi12039

Whitlock BA Hale AM Groff PA 2010 Intraspecific inversions pose a challenge for thetrnH-psbA plant DNA barcode PLoS ONE 5e11533DOI 101371journalpone0011533

Wolfe KH LiWH Sharp PM 1987 Rates of nucleotide substitution vary greatlyamong plant mitochondrial chloroplast and nuclear DNAs Proceedings ofthe National Academy of Sciences of the United States of America 849054ndash9058DOI 101073pnas84249054

Xu C DongW Shi S Cheng T Li C Liu YWu PWuH Gao P Zhou S 2015 Acceler-ating plant DNA barcode reference library construction using herbarium specimensimproved experimental techniquesMolecular Ecology Resources 151366ndash1374DOI 1011111755-099812413

Xu L XuW Sun Q Zhou Z Shen J Zhao X 2008 Flora and vegetation in ShengjinLake Journal of Wuhan Botanical Research 27264ndash270

Zhan A Bailey SA Heath DDMacIsaac HJ 2014 Performance comparison of geneticmarkers for high-throughput sequencing-based biodiversity assessment in complexcommunitiesMolecular Ecology Resources 141049ndash1059DOI 1011111755-099812254

Zhan A MacIsaac HJ 2015 Rare biosphere exploration using high-throughput sequenc-ing research progress and perspectives Conservation Genetics 16513ndash522DOI 101007s10592-014-0678-9

Zhang Y Cao L Barter M Fox AD ZhaoMMeng F Shi H 2011 Changing distributionand abundance of Swan Goose Anser cygnoides in the Yangtze River floodplainthe likely loss of a very important wintering site Bird Conservation International2136ndash48 DOI 101017S0959270910000201

ZhaoM Cao L Fox AD 2013 Distribution and diet of wintering Tundra Bean GeeseAnser fabalis serrirostris at Shengjin LakeYangtze River floodplain ChinaWildfow6052ndash63

ZhaoM Cao L KlaassenM Zhang Y Fox AD 2015 Avoiding competition Site usediet and foraging behaviours in two similarly sized geese wintering in China Ardea10327ndash38 DOI 105253ardev103i1a3

Yang et al (2016) PeerJ DOI 107717peerj2345 2021

ZhaoM Cong P Barter M Fox AD Cao L 2012 The changing abundanceand distribu-tion of Greater White-fronted Geese Anser albifrons in the Yangtze River floodplainimpacts of recenthydrological changes Bird Conservation International 22135ndash143DOI 101017S0959270911000542

Yang et al (2016) PeerJ DOI 107717peerj2345 2121

Page 2: Selection of a marker gene to construct and the application ...We concluded that DNA metabarcoding provides new perspectives for studies of herbivorous waterbird diets and inter-specific

possibilities to investigate interactions between herbivores and plants In additionmicrohistologic analysis should be used together with metabarcoding methods tointegrate this information

Subjects Animal Behavior Conservation Biology Ecology Molecular Biology VeterinaryMedicineKeywords Bean goose Greater white-fronted goose Diet analysis trnL Molecular referencelibrary Metabarcoding

INTRODUCTIONWetlands are one of the most important ecosystems in nature and they harbor a variety ofecosystem services such as protection against floods water purification climate regulationand recreational opportunities (Brander Florax amp Vermaat 2006) Waterbirds are typicallywetland-dependent animals upon which they could get abundant food and suitable habitats(Ma et al 2010) Waterbird abundance and distribution could reflect the status of wetlandstructure and functions making them important bio-indicators for wetland health (Fox etal 2011) Among all factors affecting waterbird community dynamics food availability isfrequently considered to play one of themost important roles (Wang et al 2013) Howeverrecently suitable food resources have tended to decrease or even disappear due todeterioration and loss of natural wetlands (Fox et al 2011) As a result waterbirds areforced to discard previous habitats and sometimes even feed in agricultural lands (Zhanget al 2011) In addition migratory waterbirds may aid the dispersal of aquatic plants orinvertebrates by carrying and transporting them between water bodies at various spatialscales (Reynolds Miranda amp Cumming 2015) Consequently long-time monitoring andsystematic studies of waterbird diets are essential to understand population dynamics of wa-terbirds as well as to establish effectivemanagement programs for them (Wang et al 2012)

Traditional methods for waterbird diet analysis were direct observation in the field(Swennen amp Yu 2005) or microhistologic analysis of remnants in feces andor gut contents(James amp Burney 1997 Fox et al 2007) While these approaches have been proved usefulin some cases they are relatively labor-intensive and greatly skill-dependent (Fox et al2007 Samelius amp Alisauskas 1999 Symondson 2002) Applications of other methods foranalyzing gut contents or feces were also restricted due to inherent limitations as reviewedby Pompanon et al (2012) Recently metabarcoding methods based on high-throughputsequencing have provided new perspectives for diet analysis and biodiversity assessment(Taberlet et al 2007 Creer et al 2010) These methods provide higher taxonomic resolu-tion and higher detectability with enormous sequence output from large-scale environ-mental samples such as soil water and feces (Shokralla Spall amp Gibson 2012 Bohmann etal 2014) Owing to these advantages metabarcoding has been widely employed in the dietanalysis of herbivores (Taberlet et al 2012 Ando et al 2013Hibert et al 2013) carnivores(Deagle Kirkwood amp Jarman 2009 Shehzad et al 2012) and omnivores (De Barba et al2014) But the pitfalls of metabarcoding should not be ignored when choosing suitabletechniques for new studies For instance many researches have shown that it is difficult

Yang et al (2016) PeerJ DOI 107717peerj2345 221

to obtain quantitative data using metabarcoding (Sun et al 2015) This drawback mightresult from both technical issues of this method and relevant biological features of samples(Pompanon et al 2012)

One paramount prerequisite of metabarcoding methods is to select robust geneticmarkers and corresponding primers (Zhan et al 2014 Zhan amp MacIsaac 2015) For dietstudies of herbivores at least eight chloroplast genes and two nuclear genes are usedas potential markers for land plants (Hollingsworth Graham amp Little 2011) Althoughmitochondrial cytochrome coxidase I (COI) is extensively recommended as a standardbarcode for animals its relatively low rate of evolution in botanical genomes precludes itbeing an optimum for plants (Wolfe Li amp Sharp 1987 Fazekas et al 2008) The internaltranscribed spacer (ITS) is excluded due to divergence discrepancies of individuals andlow reproducibility (Aacutelvarez amp Wendel 2003) A variety of combinations and comparisonshave been performed for the eight candidate genes however none proved equally powerfulfor all cases (Fazekas et al 2008) Consequently it is more effective to choose barcodesfor a circumscribed set of species occurring in a regional community (Kress et al 2009)Another equally important aspect of metabarcoding applications is the construction ofreference libraries which assist taxonomic assignment (Rayeacute et al 2011 Xu et al 2015)It is difficult to accurately interpret sequence reads without a reliable reference library(Elliott amp Jonathan Davies 2014)

Diet analysis is one of the central issues in waterbird research both for decipheringwaterfowl population dynamics and interpreting inter- or intra-specific interactions ofcohabitating species (Zhao et al 2015) For instance more than 60 of bean goose Anserfabalis and almost 40 of greater white-fronted gooseAnser albifrons populations along theEast AsianndashAustralian Flyway Route winter at the Shengjin Lake National Nature Reserve(Zhao et al 2015) Previous studies based on microhistologic observation illustratedthat the dominant composition of their diets was monocotyledons such as Carex spp(Zhao et al 2012) Poaceae (Zhang et al 2011) and a relatively small proportion of non-monocots (referred to as dicotyledons in the study of lsquoZhao Cao amp Fox 2013rsquo) Howeverfew food items could be identified to species-level mainly owing to variable tissue structureswithin plants similar morphology between relative species and a high level of degradationafter digestion (Zhang et al 2011 Zhao et al 2012 Zhao Cao amp Fox 2013) Ambiguousidentification has hindered understanding of waterbird population dynamics and potentialto establish effective conservation plans for them

In this study we aimed to improve this situation using the metabarcoding method toanalyze diets of these species (see flowchart in Fig 1) By examining the efficiency of eightcandidate genes (rbcL rpoC1 rpoB matK trnH-psbA trnL (UAA) atpF-atpH and psbK-psbI) we selected robust genes and corresponding primers for reference library constructionand high-throughput sequencing Subsequently we used the metabarcoding method toinvestigate diet composition of these two species based on feces collected from ShengjinLake Finally we discussed and compared results from microhistology and DNA metabar-coding using the same samples to assess the utility and efficiency of these two methods

Yang et al (2016) PeerJ DOI 107717peerj2345 321

Figure 1 Technical flowchart of this study

MATERIALS AND METHODSEthics statementOur research work did not involve capture or any direct manipulation or disturbances ofanimals We collected samples of plants and feces for molecular analyses We obtainedaccess to the reserve under the permission of the Shengjin Lake National Nature ReserveAdministration (Chizhou Anhui China) which is responsible for the management of theprotected area and wildlife We were forbidden to capture or disturb geese in the field

Study areaShengjin Lake (11655primendash11715primeE 3015primendash3030primeN) was established as a National NatureReserve in 1997 aiming to protect waterbirds including geese cranes and storks Thewater level fluctuates greatly in this lake with maximal water level of 17 m duringsummer (flood season) but only 10 m during winter (dry season) Due to this fluctuationreceding waters expose two large Carex spp meadows and provide suitable habitats forwaterbirds This makes Shengjin Lake one of the most important wintering sites formigratory waterbirds (Zhao et al 2015) Greater white-fronted goose and bean goose arethe dominant herbivores wintering (from October to April) in this area accounting for40 and 60 of populations along the East AsianndashAustralian Flyway Route respectively(Zhao et al 2015)

Field samplingThe most common plant species that these two geese may consume were collected inMay 2014 and January 2015 especially species belonging to Carex and Poaceae Fresh andintact leaves were carefully picked tin-packaged in the field and stored at minus80 C in thelaboratory before further treatment Morphological identification was carried out with theassistance of two botanists (Profs Zhenyu Li and Shuren Zhang from Institute of BotanyChinese Academy of Sciences)

Yang et al (2016) PeerJ DOI 107717peerj2345 421

Figure 2 The location of our study area Shengjin Lake National Nature Reserve and our samplingsites (Source httperosusgsgov)

All feces were collected at the reserve (Fig 2) in January 2015 Based on previousstudies and the latest waterbird survey sites with large flocks of geese (ie more than 200individuals) were chosen (Zhang et al 2011) As soon as geese finished feeding and feceswere defecated fresh droppings were picked and stored on dry ice Droppings of bean geesewere generally thicker than those of smaller greater white-fronted goose to the degree thatthese could be reliably distinguished in the field (Zhao et al 2015) Disposal gloves werechanged for each sample to avoid cross contamination To avoid repeated sampling andto make sure samples were from different individuals each sample was collected with aseparation of more than 2 m In total 21 feces were collected including 11 for greaterwhite-fronted goose and 10 for bean goose All samples were transported to laboratory ondry ice and then stored at minus80 C until further analysis

Selection of molecular markers and corresponding primersHere we aimed to select gene markers with adequate discriminating power for ourstudy We included eight chloroplast genesmdash rbcL rpoC1 rpoB matK trnH-psbAtrnL (UAA) atpF-atpH and psbK-psbI for estimation Although Shengjin Lake includedan array of plant species we focused mainly on the most likely food resources(Xu et al 2008 Zhao et al 2015) that geese would consume for candidate gene testsThese covered 11 genera and the family Poaceae (Table S1) For tests of all candidate geneswe recovered sequences of representative species in the selected groups from GenBank(httpwwwncbinlmnihgovnuccore) We calculated inter-specific divergence within

Yang et al (2016) PeerJ DOI 107717peerj2345 521

every genus or family based on the Kiruma 2-parameter model (K2P) using MEGA version6 (Tamura et al 2013) We also constructed molecular trees based on UPGMA usingMEGA and characterized the resolution of species by calculating the percentage of speciesrecovered as monophyletic based on phylogenetic trees (Rf) Secondly primers selectedout of eight candidate genes were used to amplify all specimens collected in ShengjinLake and to check their amplification efficiency and universality Thirdly we calculatedinter-specific divergence based on sequences that we obtained from last step Generallya robust barcode gene is obtained when the minimal inter-specific distance exceeds themaximal intra-specific distance (eg existence of barcoding gaps) Finally to allow therecognition of sequences after high-throughput sequencing both of the forward andreverse primers of the selected marker gene were tagged specifically for each sample with8nt nucleotide codes at the 5primeend (Parameswaran et al 2007)

DNA extraction amplification and sequencingTwo hundred milligrams of leaf was used to extract the total DNA from each plant sampleusing a modified CTAB protocol (Cota-Sanchez Remarchuk amp Ubayasena 2006) DNAextraction of feces was carried out using the same protocol with minor modification inincubation time (elongate to 12 h) Each fecal sample was crushed thoroughly and dividedinto four quarters All quarters of DNA extracts were then pooled together DNA extractionwas carried out in a clean room used particularly for this study For each batch of DNAextraction negative controls (ie extraction without feces) were included to monitorpossible contamination

For plant DNA extracts PCR amplifications were carried out in a volume of 25 microl withsim100 ng total DNA as template 1U of Taq Polymerase (Takara Dalian Liaoning ProvinceChina) 1times PCR buffer 2 mM of Mg2 + 025 mM of dNTPs 01 microM of forward primerand 01 microM of reverse primer After 4 min at 94 C the PCR cycles were as follows 35cycles of 30 s at 94 C 30 s at 56 C and 45 s at 72 C and the final extension was 10min at 72 C We applied the same PCR conditions for all primers All the successful PCRproducts were sequenced with Genewiz (Suzhou Jiangsu Province China)

For fecal DNA extracts PCR mixtures (25 microl) were prepared in six replicates foreach sample to reduce biased amplification Each replicate was subjected to the sameamplification procedure used for plant extracts The six replicates of each sample werepooled and purified using the Sangon PCR product purification kit (Sangon BiotechShanghai China) Quantification was carried out to ensure equilibrium of contributionof each sample using the NanoDrop ND-2000 UV-Vis Spectrophotometer (NanoDropTechnologies Wilmington Delaware USA) High-throughput sequencing was performedusing Illumina MiSeq platform following manufacturerrsquos instructions by BGI (ShenzhenGuangdong Province China) Reads of high-throughput sequencing could be found atNCBIrsquos Sequence Read Archive (Accession number SRP070470)

Data analysis for estimating diet compositionAfter high-throughput sequencing pair-ended readsweremergedwith the fastq_mergepairscommand using usearch (httpdrive5comusearch Edgar 2010) Reads were then split

Yang et al (2016) PeerJ DOI 107717peerj2345 621

into independent files according to unique tags using the initial process of RDP pipeline(httpspyrocmemsueduinitformspr) We removed sequences (i) that didnrsquot perfectlymatch tags and primer sequences (ii) that contained ambiguous nucleotide (Nrsquos) Tagsand primers were then trimmed using the initial process of RDP pipeline Further qualityfiltering using usearch discarded sequences with (i) quality score less than 30 (ltQ30) and(ii) length shorter than 100 bp Unique sequences were clustered to operational taxonomyunits (OTUs) at the similarity threshold of 98 (Edgar 2013) All OTUs were assignedto unique taxonomy with local blast 2230+ (Altschul et al 1990) We detected a plantwithin the reference library for each sequence with the threshold of length coverage gt98identity gt98 and e-value lt 10eminus50 If a query sequence matched two or more taxa itwas assigned to a higher taxonomic level which included all taxa

Microhistology analysisWe used the method described by Zhang et al (2011) to perform microhistologicexamination of fecal samples Each sample was first washed with pure water and filteredwith a 25-microm filter Subsequently the suspension was examined under a light microscopeat 10times magnification for quantification statistics and at 40times magnification for speciesidentification We compared photos of visible fragments with an epidermis database ofplants from Shengjin Lake to identify food items (Zhang et al 2011)

RESULTSSelection of genes and corresponding primers and reference libraryconstructionA total of 3296 representative sequences were recovered from GenBank ranging from 0to 345 sequences per gene per taxon (Table S1) For Eleocharis and Trapa only sequencesof rbcL gene and trnL gene were retained which makes it unfair to compare the efficiencyand suitability of eight candidate genes For the other ten taxa trnL trnH-psbA rbcL andpsbK-psbI showed the largest inter-specific divergence in five three one and one taxonomicgroups respectively In addition trnH-psbA atpF-atpH trnL and psbK-psbI showed thehighest mean divergence in four four one and one taxonomic groups respectivelyHowever given the small number of sequences and coverage of species the suitability andefficiency of atpF-atpH and psbK-psbI seem to be less reliable than others This comparisonmakes trnH-psbA trnL and rbcL to be selected out of the eight candidate genes As matKused to be recommended as the standard barcode gene for Carex species (Starr Naczi ampChouinard 2009) which happened to be the dominant food for herbivorous geese in ourstudy (Zhao et al 2015) we included matK as a supplement at last

Primers for these four genes (Table 1) were used to amplify the plants that we collectedin the field In total we collected 88 specimens in the field belonging to 25 families53 genera and 70 species (Table 2) The selected primers for trnL and rbcL successfullyamplified 100 and 91 of all species respectively while primers for trnH-psbA andmatKamplified only 71 and 43 respectively Therefore we chose trnL and rbcL to test theirdiscriminating power in our target plants

Yang et al (2016) PeerJ DOI 107717peerj2345 721

Table 1 Primers of candidate genes and reference library constructingOnly the c and h were used forhigh-throughput sequencing in fusion primer mode (primer+ tags) The unique tags were used to differ-entiate PCR products pooled together for highthroughput sequencing (Parameswaran et al 2007)

Gene Primer Sequence (5prime-3prime)

matK matK-XFa TAATTTACGATCAATTCATTCmatK-MALPb ACAAGAAAGTCGAAGTAT

rbcL rbcLa-Fc ATGTCACCACAAACAGAGACTAAAGCrbcLa-Rd GTAAAATCAAGTCCACCRCG

trnH-psbA pasbA3_fe CGCGCATGGTGGATTCACAATCCtrnHf_05f GTTATGCATGAACGTAATGCTC

trnL cg CGAAATCGGTAGACGCTACGhh CCATTGAGTCTCTGCACCTATC

Notesareferred to Ford et al (2009)breferred to Dunning amp Savolainen (2010)creferred to Hasebe et al (1994)dreferred to Kress et al (2009)ereferred to Tate amp Simpson (2003)freferred to Sang Crawford amp Stuessy (1997)greferred to Taberlet et al (1991)hreferred to Taberlet et al (2007)

We calculated the inter-specific divergence within genera and families with at leasttwo species to compare their discriminating power Maximal minimal and mean inter-specific distances were calculated for seven dominant genera and six dominant families(Table 3) Neither gene could differentiate species of Vallisneria (mean= 0000 plusmn 0000)or Artemisia (mean = 0000 plusmn 0000) But trnL showed a larger divergence rangefor the other six genera and five families Hence we chose trnL as the barcoding genefor reference library constructing and high-throughput sequencing for our study Thediscriminating power of trnL was strong for most species (Table 4) However some speciescould only be identified at genus-level or family-level with trnL For instance five speciesof Potamogetonaceae shared the same sequences and this made them to be identified atgenus-level Species could be identified easily to genus and family except for three grasses(Poaceae) Beckmannia syzigachne Phalaris arundinacea and Polypogon fugax which sharedidentical sequences

Data processing for estimating diet compositionIn total 021 and 018 million reads were generated for greater white-fronted goose(GWFG) and bean goose (BG) respectively (Table 5) The number of recovered OTUsranged from 8 to 123 for GWFG and BG samples We used local BLAST to compare thesesequences with the Shengjin Lake reference database Finally with DNA metabarocoding12 items were discovered in the feces of GWFG including one at family-level three atgenus-level and eight at species-level (Table 6) Four items were discovered in the feces ofBG including one at genus-level and three at species-level In total this method identified15 taxa in feces of these geese

Yang et al (2016) PeerJ DOI 107717peerj2345 821

Table 2 Plant species in the reference libraryWe collected these samples from Shengjin Lake

Species No of samples Species No of samples

Curculigo orchioides 1 Trapella sinensis 2Artemisia capillaris 1 Plantago asiatica 1Artemisia selengensis 2 Alopecurus aequalis 2Aster subulatus 1 Beckmannia syzigachne 1Bidens frondosa 1 Bromus japonicus 1Erigeron annuus 1 Cynodon dactylon 2Gnaphalium affine 1 Phalaris arundinacea 1Hemistepta lyrata 1 Phragmites australis 1Kalimeris incisa 1 Poa annua 1Bothriospermum kusnezowii 1 Polypogon fugax 1Lobelia chinensis 1 Roegneria kamoji 2Sagina japonica 1 Zizania latifolia 1Stellaria media 2 Polygonum lapathifolium 4Calystegia hederacea 1 Polygonum orientale 1Cardamine lyrata 1 Polygonum perfoliatum 1Carex heterolepis 3 Polygonum persicaria 1Carex capricornis 1 Rumex trisetiferus 3Carex paxii 1 Potamogeton crispus 1Carex remotiuscula 1 Potamogeton maackianus 1Fimbristylis dichotoma 1 Potamogeton malaianus 1Eleocharis migoana 1 Potamogeton natans 1Scripus karuizawensis 1 Potamogeton pectinatus 1Nymphoides peltatum 1 Clematis florida 1Myriophyllum spicatum 1 Ranunculus chinensis 2Hydrilla verticillta 1 Ranunculus sceleratus 2Hydrocharis dubia 1 Potentilla freyniana 2Vallisineria spiralis 1 Gratiola japonica 1Vallisneria spinulosa 1 Mazus miquelii 2Juncus effusus 1 Veronica undulata 1Juncus gracillimus 1 Trapa bispinosa 1Leonurus japonicus 1 Trapa maximowiczii 1Salvia plebeia 1 Trapa pseudoincisa 1Glycine soja 1 Trapa quadrispinosa 1Vicia sativa 1 Hydrocotyle sibthorpioides 1Euryale ferox 1 Torilis japonica 2

However the sequence percentage of each food item varied greatly (Table 6) ForGWFG the majority of sequences (9636) were composed of only five itemsmdashPoaceaespp (4798 except Poa annua) Poa annua (2186) Carex heterolepis (1751) Carexspp (901 except Carex heterolepis) and Alopecurus aequalis (321) For BG almostall the sequences belonged to Carex heterolepis (9949) Other items only occupied arelatively small proportion of sequences In addition the presence of each item per samplewas also unequal (Table S2) For example in GWFG Carex heterolepis Carex spp Poa

Yang et al (2016) PeerJ DOI 107717peerj2345 921

Table 3 Inter-specific divergences within dominant genera and families of rbcL gene and trnL gene with Kiruma 2-Parameter modelUnder-scores indicate the most common food composition based on earlier microhistologic analysis (Zhao et al 2012 Zhao et al 2015)

Inter-specific divergence Taxa rbcL trnL

Maximal Minimal Mean Maximal Minimal Mean

Artemisia 0000 0000 0000plusmn 0000 0000 0000 0000plusmn 0000Carex 0013 0000 0008plusmn 0006 0058 0000 0027plusmn 0021Polygonum 0027 0000 0010plusmn 0006 0076 0000 0033plusmn 0022Potamogeton 0012 0000 0005plusmn 00034 0016 0000 0005plusmn 0005Ranunculus 0031 0000 0020plusmn 0009 0042 0021 0024plusmn 0022Trapa 0000 0000 0000plusmn 0000 0081 0000 0049plusmn 0030

Withingenera

Vallisneria 0000 0000 0000plusmn 0000 0000 0000 0000plusmn 0000Cyperaceae 0043 0000 0018plusmn 0010 0178 0000 0084plusmn 0046Asteraceae 0120 0000 0049plusmn 0017 0087 0000 0023plusmn 0018Poaceae 0025 0000 0016plusmn 00009 0166 0000 0074plusmn 0039Hydrocharitaceae 0122 0000 0078plusmn 0020 0159 0000 0100plusmn 0054Polygonaceae 0043 0000 0020plusmn 0009 0129 0000 0031plusmn 0022

Withinfamilies

Ranunculaceae 0033 0016 0017plusmn 0015 0045 0000 0018plusmn 0013

Table 4 Number of species and unique sequences for families with more than one species in ShengjinLake plant database

Family No of species No of sequences

Asteraceae 8 7Caryophyllaceae 2 2Cyperaceae 7 5Fabaceae 2 2Hydrocharitaceae 4 3Lamiaceae 2 2Poaceae 10 8Polygonaceae 5 5Potamogetonaceae 5 1Ranunculaceae 3 3Scrophulariaceae 3 3Trapaceae 4 3Umbelliferae 2 2

annua and Potentilla supina were present in almost all the samples while Stellaria mediaAsteraceae sp and Lapsana apogonoldes occurred in only about one third of samples

When the microhistologic examination was performed using the same samples eightitems were found in the feces of greater white-fronted goose including one at family-levelfour at genus-level and three at species-level (Table 6) Dominant items were Poaceae spp(4568) Alopecurus Linn (3093) and Carex heterolepis (1639) Seven items werefound in the feces of bean goose including four at genus-level and three at species-level(Table 6) Dominant items were Carex heterolepis (6285) Asteraceae sp (1455) andAlopecurus Linn (1318)

Yang et al (2016) PeerJ DOI 107717peerj2345 1021

Table 5 Summary of the process and results of high-throughput sequencing analysis

Sample Pair-endsequences

Sequences for whichprimers and tags wereidentified and with lengthgt100 bp

Uniquesequences

OTUs Food items

GWFG1 16303 8627 1288 78 8GWFG2 25482 13449 1091 102 8GWFG3 19063 10056 1277 48 10GWFG4 23856 12548 1419 114 8GWFG5 20955 11249 1720 123 9GWFG6 11677 7205 973 52 9GWFG7 13377 6782 1328 59 9GWFG8 7749 3959 774 89 9GWFG9 16833 8799 1436 90 6GWFG10 18474 9819 449 32 9GWFG11 19648 10458 617 31 6BG1 20225 10254 784 23 4BG2 14195 7161 564 16 2BG3 2229 1149 255 12 4BG4 517 268 77 8 3BG5 28152 14033 1000 15 3BG6 16723 8484 740 17 4BG7 30166 15403 974 15 4BG8 30928 15706 1028 15 3BG9 8382 4489 446 13 4BG10 10714 5526 537 13 4

NotesGWFG Greater white-fronted goose BG Bean goose

DISCUSSIONMarker selection and reference library constructing for diet analysisWith greatly reduced cost extremely high throughput and information contentmetabarcoding has revolutionized the exploration and quantification of dietary analysiswith noninvasive samples containing degraded DNA (Fonseca et al 2010 Shokralla et al2014) Despite enormous potential to boost data acquisition successful application of thistechnology relies greatly on the power and efficiency of genetic markers and correspondingprimers (Bik et al 2012 Zhan et al 2014) In order to select the most appropriate markergene for our study we compared the performance of eight commonly used chloroplastgenes (rbcL rpoB rpoC1 matK trnL trnH-psbA atpF-atpH and psbK-psbI) and theircorresponding primers Although a higher level of discriminating power was shown inseveral studies atpF-atpH and psbK-psbI were not as commonly used as other barcodinggenes (Hollingsworth Graham amp Little 2011) As one of the most rapidly evolving codinggenes of plastid genomes matK was considered as the closest plant analogue to theanimal barcode COI (Hilu amp Liang 1997) However matK was difficult to amplify using

Yang et al (2016) PeerJ DOI 107717peerj2345 1121

Table 6 List of the lowest taxonomic food items in the diet of geese

Food items Level of identification GWFG BG

N reads Fs () Fm () N reads Fs () Fm ()

Poaceae spp (except Poa annua) Family 51705 4798 4568 0 000 000Poa annua Species 23554 2186 000 167 020 000Carex heterolepis Species 18867 1751 1639 81457 9949 6285Carex spp (except Carex heterolepis) Genus 9706 901 231 191 023 349Alopecurus aequalis Species 3458 321 000 0 000 000Potentilla chinensis Species 184 017 118 65 008 206Cynodon dactylon Species 155 014 000 0 000 000Polygonum spp Genus 56 005 000 0 000 000Stellaria media Species 26 002 000 0 000 000Ranunculus chinensis Species 14 002 000 0 000 000Lapsana apogonoides Species 11 002 000 0 000 000Asteraceae sp Genus 16 001 233 0 000 1455Alopecurus Genus 0 000 3093 0 000 1318Carex thunbergii Species 0 000 054 0 000 279Fabaceae sp Genus 0 000 064 0 000 108

NotesGWFG Greater white-fronted goose BG Bean goose Fs percentage of sequences in DNA metabarcoding Fm percentage of epidermis squares in microhistological analysis

available primer sets with only 43 of successful amplification in this study In spite ofthe higher species discrimination success of trnH-psbA than rbcL+matK in some groupsthe presence of duplicated loci microinversions and premature termination of readsby mononucleotide repeats lead to considerable proportion of low-quality sequences andover-estimation of genetic difference when using trnH-psbA (Graham et al 2000WhitlockHale amp Groff 2010) In contrast the barcode region of rbcL is easy to amplify sequenceand align in most plants and was recommended as the standard barcode for land plants(Chase et al 2007) The relatively modest discriminating power (compared to trnL) pre-cludes its application for our study aiming to recover high resolution of food items Conse-quently trnL was selected out of eight candidate markers with 100 amplification successmore than 90 of high quality sequences and relatively large inter-specific divergence

One of the biggest obstacles in biodiversity assessment and dietary analysis is the lack ofa comprehensive reference library without which it is impossible to accurately interpretand assign sequences generated from high-throughput sequencing (Valentini Pompanonamp Taberlet 2009 Barco et al 2015) In this study we constructed a local reference libraryby amplifying the most common species (70 morpho-species in total) during the winteringperiod with the trnL gene Although not all of them could be identified at species-levelwith trnL due to relatively low inter-specific divergence many species could be separatedwith distinctive sequences Previous studies have recommended group-specific barcodes todifferentiate closely related plants at the species level (Li et al 2015) For instance matKhas been proved to be more efficient for the discrimination of Carex spp (Starr Naczi ampChouinard 2009) However the primer set of matK failed to amplify species of Carex spp

Yang et al (2016) PeerJ DOI 107717peerj2345 1221

in our study suggesting the universality of selected primer pairs should be tested in eachstudy (Zhan et al 2014)

Applications of metabarcoding for geese diet analysisA variety of recent studies have demonstrated the great potential of metabarcoding fordietary analysis mainly owing to the high throughput high discriminating power and theability to process large-scale samples simultaneously (Creer et al 2010 Taberlet et al 2012Shehzad et al 2012) In this study we applied this method to recover diets of herbivorousgeese and provided standard protocols for dietary analysis of these two ecologicallyimportant waterbirds Our results further proved the more objective less experience-dependent and more time-efficient character of DNA metabarcoding However not allthe species in the reference library could be identified at species-level owing to low inter-specific divergence We suggest that multiple group-specific markers to be incorporatedin the future as in De Barba et al (2014) Two species Carex thunbergii and Fabaceae spwere only discovered via microhistologic analysis rather than metabarcoding This failuremight reflect the biased fragment amplification of current technology of which dominanttemplates could act as inhibitors of less dominant species (Pintildeol et al 2015) Howeverthree species of Poaceae were only discovered using metabarcoding In total more taxaand higher resolution were attained using metabarcoding But microhistology still proveda powerful supplementary Previous studies using metabarcoding usually detected dozensof food items even as many as more than one hundred species For instance 18 taxaprey were identified for leopard cat (Prionailurus bengalensis) (Shehzad et al 2012) 44plant taxa were recovered in feces of red-headed wood pigeon (Columba janthina nitens)(Ando et al 2013) while more than 100 taxa were found in diet studies of brown bear(Ursus arctos) (De Barba et al 2014) The relatively narrow diet spectrum of herbivorousgeese may lead to misunderstanding that this result of our study is merely an artefact dueto small sampling effort However this result is credible since these two geese species onlyfeed on Carex meadow where the dominant vegetation is Carex spp with other speciessuch as Poaceae and dicots (Zhao et al 2015) Even though other wetland plants exist theyusually composed only a small proportion of the geese diets

Quantification of food composition is another key concern in dietary analysis Althoughthe relative percentage of sequences was not truly a quantitative estimate of diet taxa ofthe majority sequences in this study were in accord with microhistologic observationswhich was considered an efficient way to provide quantitative results (Wang et al 2013)Discrepancies might come from the semi-quantitative nature of metabarcoding methods(Sun et al 2015) This is likely derived from PCR amplification which always entails biasescaused by universal primer-template mismatches annealing temperature or number ofPCR cycles (Zhan et al 2014Pintildeol et al 2015) Othermethods such as shotgun sequencingor metagenomic sequencing could be incorporated in the future to give information onabundances of food items (Srivathsan et al 2015)

Implications for waterbird conservation and wetland managementFor long-distance migratory waterbirds such as the wild geese in this study theirabundance and distribution are greatly influenced by diet availability and habitat use

Yang et al (2016) PeerJ DOI 107717peerj2345 1321

(Wang et al 2013) For example waterbirds may be restricted at (forced to leave) certainareas due to favoring (loss) of particular food (Wang et al 2013) while the recoveryof such food may contribute to return of bird populations (Noordhuis Van der Molenamp Van den Berg 2002) Results of both metabarcoding and microhistologic analysis inthis study revealed that Carex and Poaceae were dominant food components which is inaccordance with previous studies The increasing number of these two geese wintering atthe Shengjin Lake may be attributed to the expansion of Carex meadow which offers accessto abundant food resources (Zhao et al 2015) Considering the long-distance migratorycharacter of these birds it is important to maintain energy balances and good bodyconditions in wintering areas because this might further influence their departure datesand reproductive success after arriving at breeding areas (Prop Black amp Shimmings 2003)Based on this it is important for wetland managers to maintain the suitable habitats andfood resources for sustainable conservation of waterbirds which highlights the significanceof diet information Our study also indicated that overlap and dissimilarity existed betweenthe diets of these two geese Animals foraging in the same habitats may compete for limitedfood resources (Madsen amp Mortensen 1987) This discrepancy of food composition mayarise from the avoidance of inter-specific competition (Zhao et al 2015) However withthe increase of these two species in Shengjin Lake further research is needed to investigatethe mechanisms of food resource partitioning and spatial distribution

Shengjin Lake is one of themost important wintering sites for tens of thousands ofmigra-tory watebirds while annual life cycles of these birds depend on the whole migratory routeincluding breeding sites stop-over sites and wintering sites (Kear 2005) Thus a molecularreference library covering all the potential food items along the whole migratory routewill be useful both for understanding of wetland connections and waterbird conservationIn addition the ability of DNA metabarcoding to process lots of samples simultaneouslyenables rapid analyses and makes this method helpful for waterbird studies

ACKNOWLEDGEMENTSWe are very grateful to the stuff of the Shengjin Lake National Nature Reserve for theirexcellent assistance during the field work Great thanks to Zhujun Wang and An An forfeces collection in the field We thank Song Yang for collecting plants in the Shengjin LakeReserve We also thank Profs Zhenyu Li and Shuren Zhang for plant identification Specialthanks to Drs Meijuan Zhao Xin Wang Fanjuan Meng and Peihao Cong for preparingthe epidermis database and guiding microhistologic analysis

ADDITIONAL INFORMATION AND DECLARATIONS

FundingThis work was supported by the National Basic Research Program of China (973 ProgramGrant No 2012CB956104) to EZ the National Natural Science Foundation of China(Grant No 31370416) State Key Laboratory of Urban and Regional Ecology ChineseAcademy of Sciences (No SKLURE2013-1-05) to LC the National Natural Science

Yang et al (2016) PeerJ DOI 107717peerj2345 1421

Foundation of China (Grant No 31500315) to XW and 100-Talent Program of theChinese Academy of Sciences to AZ The funders had no role in study design datacollection and analysis decision to publish or preparation of the manuscript

Grant DisclosuresThe following grant information was disclosed by the authorsNational Basic Research Program of China 2012CB956104National Natural Science Foundation of China 31370416State Key Laboratory of Urban and Regional EcologyChinese Academy of Sciences SKLURE2013-1-05National Natural Science Foundation of China 31500315

Competing InterestsThe authors declare there are no competing interests

Author Contributionsbull Yuzhan Yang conceived and designed the experiments performed the experimentsanalyzed the data contributed reagentsmaterialsanalysis tools wrote the paperprepared figures andor tables reviewed drafts of the paperbull Aibin Zhan conceived and designed the experiments analyzed the data contributedreagentsmaterialsanalysis tools reviewed drafts of the paperbull Lei Cao conceived and designed the experiments contributed reagentsmaterialsanalysistools reviewed drafts of the paperbull Fanjuan Meng and Wenbin Xu performed the experiments contributed reagentsmate-rialsanalysis tools reviewed drafts of the paper

Animal EthicsThe following information was supplied relating to ethical approvals (ie approving bodyand any reference numbers)

Our research work did not involve capture or any direct manipulation or disturbancesof animals

Field Study PermissionsThe following information was supplied relating to field study approvals (ie approvingbody and any reference numbers)

We collected samples of plants and feces for molecular analyses We received permissionto access the reserve for sampling from the Shengjin Lake National Nature ReserveAdministration (Chizhou Anhui China) which is responsible for the management of theprotected area and wildlife We were forbidden to capture or disturb geese in the field

DNA DepositionThe following information was supplied regarding the deposition of DNA sequences

NCBI Sequence Read Archive (SRA) databaseAccession number SRP070470

Yang et al (2016) PeerJ DOI 107717peerj2345 1521

Data AvailabilityThe following information was supplied regarding data availability

The raw data has been supplied as a Supplemental Dataset

Supplemental InformationSupplemental information for this article can be found online at httpdxdoiorg107717peerj2345supplemental-information

REFERENCESAltschul SF GishWMillerWMyers EW Lipman DJ 1990 Basic local alignment

search tool Journal of Molecular Biology 215403ndash410 DOI 101006jmbi19909999Aacutelvarez I Wendel JF 2003 Ribosomal ITS sequences and plant phylogenetic inference

Molecular Phylogenetics and Evolution 29417ndash434DOI 101016S1055-7903(03)00208-2

Ando H Setsuko S Horikoshi K Suzuki H Umehara S Inoue-MurayamaM IsagiY 2013 Diet analysis by nextmdashgeneration sequencing indicates the frequentconsumption of introduced plants by the critically endangered redmdashheaded woodpigeon (Columba janthina nitens) in oceanic island habitats Ecology and Evolution34057ndash4069 DOI 101002ece3773

Barco A RaupachMJ Laakmann S NeumannH Knebelsberger T 2015 Identificationof North Sea molluscs with DNA barcodingMolecular Ecology Resources 16288ndash297DOI 1011111755-099812440

Bik HM Porazinska DL Creer S Caporaso JG Knight R ThomasWK 2012 Sequenc-ing our way towards understanding global eukaryotic biodiversity Trends in Ecologyand Evolution 27233ndash243 DOI 101016jtree201111010

Bohmann K Evans A Gilbert MTP Carvalho GR Creer S KnappM YuWD DeBruynM 2014 Environmental DNA for wildlife biology and biodiversity monitor-ing Trends in Ecology and Evolution 29358ndash367 DOI 101016jtree201404003

Brander LM Florax RJ Vermaat JE 2006 The empirics of wetland valuation acomprehensive summary and a meta-analysis of the literature Environmental andResource Economics 33223ndash250 DOI 101007s10640-005-3104-4

Chase MW Cowan RS Hollingsworth PM Van den Berg C Madrintildeaacuten S Petersen GSeberg O Jorgsensen T Cameron KM Carine M Pedersen N Hedderson TAJConrad F Salazar GA Richardson JE HollingsworthM Barraclough TG KellyL WilkinsonM 2007 A proposal for a standardised protocol to barcode all landplants Taxon 56295ndash299

Cota-Sanchez JH Remarchuk K Ubayasena K 2006 Ready-to-use DNA extracted witha CTAB method adapted for herbarium specimens and mucilaginous plant tissuePlant Molecular Biology Reporter 24161ndash167 DOI 101007BF02914055

Creer S Fonseca VG Porazinska DL Giblin-Davis RM SungW Power DM Packer MCarvalho GR Blaxter ML Lambshead P ThomasWK 2010 Ultrasequencing ofthe meiofaunal biosphere Practice pitfalls and promisesMolecular Ecology 194ndash20DOI 101111j1365-294X200904473x

Yang et al (2016) PeerJ DOI 107717peerj2345 1621

De BarbaMMiquel C Boyer F Mercier C Rioux D Coissac E Taberlet P 2014 DNAmetabarcoding multiplexing and validation of data accuracy for diet assessmentapplication to omnivorous dietMolecular Ecology Resources 14306ndash323DOI 1011111755-099812188

Deagle BE Kirkwood R Jarman SN 2009 Analysis of Australian fur seal diet bypyrosequencing prey DNA in faecesMolecular Ecology 182022ndash2038DOI 101111j1365-294X200904158x

Dunning LT Savolainen V 2010 Broad-scale amplification of matK for DNA barcodingplants a technical note Botanical Journal of the Linnean Society 1641ndash9DOI 101111j1095-8339201001071x

Edgar RC 2010 Search and clustering orders of magnitude faster than BLAST Bioinfor-matics 262460ndash2461 DOI 101093bioinformaticsbtq461

Edgar RC 2013 UPARSE highly accurate OTU sequences from microbial ampliconreads Nature Methods 10996ndash998 DOI 101038NMETH2604

Elliott TL Jonathan Davies T 2014 Challenges to barcoding an entire floraMolecularEcology Resources 14883ndash891 DOI 1011111755-099812277

Fazekas AJ Burgess KS Kesanakurti PR Graham SW Newmaster SG Husband BCPercy DM Hajibabaei M Barrett SC 2008Multiple multilocus DNA barcodesfrom the plastid genome discriminate plant species equally well PLoS ONE 3e2802DOI 101371journalpone0002802

Fonseca VG Carvalho GR SungW Johnson HF Power DM Neill SP Packer MBlaxter ML Labmshead PJD ThomasWK Creer S 2010 Second-generationenvironmental sequencing unmasks marine metazoan biodiversity Nature Commu-nications 1Article 98 DOI 101038ncomms1095

Ford CS Ayres KL Haider N Toomey N Van-Alpen-Stohl J 2009 Selection ofcandidate DNA barcoding regions for use on land plants Botanical Journal of theLinnean Society 1591ndash11 DOI 101111j1095-8339200800938x

Fox AD Bergersen E Tombre IM Madsen J 2007Minimal intra-seasonal dietaryoverlap of barnacle and pink-footed geese on their breeding grounds in SvalbardPolar Biology 30759ndash768 DOI 101007s00300-006-0235-1

Fox AD Cao L Zhang Y Barter M ZhaoMMeng FWang S 2011 Declines in thetuber-feeding waterbird guild at Shengjin Lake National Nature Reserve Chinandashabarometer of submerged macrophyte collapse Aquatic Conservation Marine andFreshwater Ecosystems 2182ndash91 DOI 101002aqc1154

Graham SW Reeves PA Burns AC Olmstead RG 2000Microstructural changes innoncoding chloroplast DNA interpretation evolution and utility of indels andinversions in basal angiosperm phylogenetic inference International Journal of PlantSciences 161S83ndashS96 DOI 101086317583

HasebeM Omori T NakazawaM Sano T KatoM Iwatsuki K 1994 rbcL genesequences provide evidence for the evolutionary lineages of leptosporangiate fernsProceedings of the National Academy of Sciences of the United States of America915730ndash5734 DOI 101073pnas91125730

Yang et al (2016) PeerJ DOI 107717peerj2345 1721

Hibert F Taberlet P Chave J Scotti-Saintagne C Sabatier D Richard-Hansen C 2013Unveiling the diet of elusive rainforest herbivores in next generation sequencing eraThe tapir as a case study PLoS ONE 8e60799 DOI 101371journalpone0060799

Hilu KW Liang H 1997 ThematK gene sequence variation and application in plantsystematics American Journal of Botany 84830ndash839 DOI 1023072445819

Hollingsworth PM Graham SW Little DP 2011 Choosing and using a plant DNAbarcode PLoS ONE 6e19254 DOI 101371journalpone0019254

James HF Burney DA 1997 The diet and ecology of Hawaiirsquos extinct flightless water-fowl evidence from coprolites Biological Journal of the Linnean Society 62279ndash297DOI 101111j1095-83121997tb01627x

Kear J 2005Ducks geese and swans Oxford Oxford University PressKressWJ Erickson DL Jones FA Swenson NG Perez R Sanjur O Bermingham E

2009 Plant DNA barcodes and a community phylogeny of a tropical forest dynamicsplot in Panama Proceedings of the National Academy of Sciences of the United States ofAmerica 10618621ndash18626 DOI 101073pnas0909820106

Li X Yang Y Henry RJ Rossetto MWang Y Chen S 2015 Plant DNA barcoding fromgene to genome Biological Reviews 90157ndash166 DOI 101111brv12104

Ma Z Cai Y Li B Chen J 2010Managing wetland habitats for waterbirds an interna-tional perspectiveWetlands 3015ndash27 DOI 101007s13157-009-0001-6

Madsen J Mortensen CE 1987Habitat exploitation and interspecific competition ofmoulting geese in East Greenl Ibis 12925ndash44DOI 101111j1474-919X1987tb03157x

Noordhuis R Van der Molen DT Van den BergMS 2002 Response of herbivorouswater-birds to the return of Chara in Lake Veluwemeer The Netherlands AquaticBotany 72349ndash367 DOI 101016S0304-3770(01)00210-8

Parameswaran P Jalili R Tao L Shokralla S Gharizadeh B Ronaghi M Fire AZ 2007A pyrosequencing-tailored nucleotide barcode design unveils opportunities for large-scale sample multiplexing Nucleic Acids Research 35e130 DOI 101093nargkm760

Pintildeol J Mir G Gomez-Polo P Agustiacute N 2015 Universal and blocking primermismatches limit the use of high-throughput DNA sequencing for the quanti-tative metabarcoding of arthropodsMolecular Ecology Resources 15819ndash830DOI 1011111755-099812355

Pompanon F Deagle BE SymondsonWO Brown DS Jarman SN Taberlet P 2012Who is eating what diet assessment using next generation sequencingMolecularEcology 211931ndash1950 DOI 101111j1365-294X201105403x

Prop J Black JM Shimmings P 2003 Travel schedules to the high arctic barnaclegeese trademdashoff the timing of migration with accumulation of fat deposits Oikos103403ndash414 DOI 101034j1600-0706200312042x

Rayeacute G Miquel C Coissac E Redjadj C Loison A Taberlet P 2011 New insightson diet variability revealed by DNA barcoding and high-throughput sequenc-ing chamois diet in autumn as a case study Ecological Research 26265ndash276DOI 101007s11284-010-0780-5

Yang et al (2016) PeerJ DOI 107717peerj2345 1821

Reynolds C Miranda NA Cumming GS 2015 The role of waterbirds in the dispersalof aquatic alien and invasive species Diversity and Distributions 21744ndash754DOI 101111ddi12334

Samelius G Alisauskas RT 1999 Diet and growth of glaucous gulls at a large Arcticgoose colony Canadian Journal of Zoology 771327ndash1331 DOI 101139z99-091

Sang T Crawford DJ Stuessy TF 1997 Chloroplast DNA phylogeny reticulateevolution and biogeography of Paeonia (Paeoniaceae) American Journal of Botany841120ndash1136

ShehzadW Riaz T NawazMAMiquel C Poillot C Shah SA Pompanon F CoissacE Taberlet P 2012 Carnivore diet analysis based on next-generation sequencingapplication to the leopard cat (Prionailurus bengalensis) in PakistanMolecularEcology 211951ndash1965 DOI 101111j1365-294X201105424x

Shokralla S Gibson JF Nikbakht H Janzen DH HallwachsW Hajibabaei M 2014Next-generation DNA barcoding using next-generation sequencing to enhance andaccelerate DNA barcode capture from single specimensMolecular Ecology Resources14892ndash901 DOI 1011111755-099812236

Shokralla S Spall JL Gibson JF 2012 Next-generation sequencing technologies forenvironmental DNA researchMolecular Ecology 211794ndash1805DOI 101111j1365-294X201205538x

Srivathsan A Sha J Vogler AP Meier R 2015 Comparing the effectiveness of metage-nomics and metabarcoding for diet analysis of a leaf-feeding monkey (Pygathrixnemaeus)Molecular Ecology Resources 15250ndash261 DOI 1011111755-099812302

Starr JR Naczi RFC Chouinard BN 2009 Plant DNA barcodes and species res-olution in sedge (Carex Cyperaceae)Molecular Ecology Resources 9151ndash163DOI 101111j1755-0998200902640x

Sun C Zhao Y Li H Dong Y MacIsaac HJ Zhan A 2015 Unreliable quantificationof species abundance based on high-throughput sequencing data of zooplanktoncommunities Aquatic Biology 249ndash15 DOI 103354ab00629

Swennen C Yu YT 2005 Food and feeding behavior of the black-faced SpoonbillWaterbirds 2819ndash27 DOI 1016751524-4695(2005)028[0019FAFBOT]20CO2

SymondsonWOC 2002Molecular identification of prey in predator dietsMolecularEcology 11627ndash641 DOI 101046j1365-294X200201471x

Taberlet P Coissac E Pompanon F Brochmann CWillerslev E 2012 Towards next-generation biodiversity assessment using DNA metabarcodingMolecular Ecology212045ndash2050 DOI 101111j1365-294X201205470x

Taberlet P Coissac E Pompanon F Gielly L Miquel C Valentini A Vermat TCorthier G Brocmann CWillerslev E 2007 Power and limitations of the chloro-plast trnL (UAA) intron for plant DNA barcoding Nucleic Acids Research 35e14DOI 101093nargkl938

Taberlet P Gielly L Pautou G Bouvet J 1991 Universal primers for amplifica-tion of three non-coding regions of chloroplast DNA Plant Molecular Biology171105ndash1109 DOI 101007BF00037152

Yang et al (2016) PeerJ DOI 107717peerj2345 1921

Tamura K Stecher G Peterson D Filipski A Kumar S 2013MEGA6 molecular evolu-tionary genetics analysis version 60MolecularBiology and Evolution 302725ndash2729DOI 101093molbevmst197

Tate JA Simpson BB 2003 Paraphyly of Tarasa (Malvaceae) and diverse origins of thepolyploidy species Systematic Botany 28723ndash737 DOI 10104302-641

Valentini A Pompanon F Taberlet P 2009 DNA barcoding for ecologists Trends inEcology and Evolution 24110ndash117 DOI 101016jtree200809011

Wang X Fox AD Cong P Barter M Cao L 2012 Changes in the distribution and abun-dance of wintering Lesser White-fronted Geese Anser erythropus in eastern ChinaBird Conservation International 22128ndash134 DOI 101017S095927091100030X

Wang X Fox AD Cong P Cao L 2013 Food constraints explain the restricted dis-tribution of wintering Lesser white-fronted Geese Anser erythropus in China Ibis155576ndash592 DOI 101111ibi12039

Whitlock BA Hale AM Groff PA 2010 Intraspecific inversions pose a challenge for thetrnH-psbA plant DNA barcode PLoS ONE 5e11533DOI 101371journalpone0011533

Wolfe KH LiWH Sharp PM 1987 Rates of nucleotide substitution vary greatlyamong plant mitochondrial chloroplast and nuclear DNAs Proceedings ofthe National Academy of Sciences of the United States of America 849054ndash9058DOI 101073pnas84249054

Xu C DongW Shi S Cheng T Li C Liu YWu PWuH Gao P Zhou S 2015 Acceler-ating plant DNA barcode reference library construction using herbarium specimensimproved experimental techniquesMolecular Ecology Resources 151366ndash1374DOI 1011111755-099812413

Xu L XuW Sun Q Zhou Z Shen J Zhao X 2008 Flora and vegetation in ShengjinLake Journal of Wuhan Botanical Research 27264ndash270

Zhan A Bailey SA Heath DDMacIsaac HJ 2014 Performance comparison of geneticmarkers for high-throughput sequencing-based biodiversity assessment in complexcommunitiesMolecular Ecology Resources 141049ndash1059DOI 1011111755-099812254

Zhan A MacIsaac HJ 2015 Rare biosphere exploration using high-throughput sequenc-ing research progress and perspectives Conservation Genetics 16513ndash522DOI 101007s10592-014-0678-9

Zhang Y Cao L Barter M Fox AD ZhaoMMeng F Shi H 2011 Changing distributionand abundance of Swan Goose Anser cygnoides in the Yangtze River floodplainthe likely loss of a very important wintering site Bird Conservation International2136ndash48 DOI 101017S0959270910000201

ZhaoM Cao L Fox AD 2013 Distribution and diet of wintering Tundra Bean GeeseAnser fabalis serrirostris at Shengjin LakeYangtze River floodplain ChinaWildfow6052ndash63

ZhaoM Cao L KlaassenM Zhang Y Fox AD 2015 Avoiding competition Site usediet and foraging behaviours in two similarly sized geese wintering in China Ardea10327ndash38 DOI 105253ardev103i1a3

Yang et al (2016) PeerJ DOI 107717peerj2345 2021

ZhaoM Cong P Barter M Fox AD Cao L 2012 The changing abundanceand distribu-tion of Greater White-fronted Geese Anser albifrons in the Yangtze River floodplainimpacts of recenthydrological changes Bird Conservation International 22135ndash143DOI 101017S0959270911000542

Yang et al (2016) PeerJ DOI 107717peerj2345 2121

Page 3: Selection of a marker gene to construct and the application ...We concluded that DNA metabarcoding provides new perspectives for studies of herbivorous waterbird diets and inter-specific

to obtain quantitative data using metabarcoding (Sun et al 2015) This drawback mightresult from both technical issues of this method and relevant biological features of samples(Pompanon et al 2012)

One paramount prerequisite of metabarcoding methods is to select robust geneticmarkers and corresponding primers (Zhan et al 2014 Zhan amp MacIsaac 2015) For dietstudies of herbivores at least eight chloroplast genes and two nuclear genes are usedas potential markers for land plants (Hollingsworth Graham amp Little 2011) Althoughmitochondrial cytochrome coxidase I (COI) is extensively recommended as a standardbarcode for animals its relatively low rate of evolution in botanical genomes precludes itbeing an optimum for plants (Wolfe Li amp Sharp 1987 Fazekas et al 2008) The internaltranscribed spacer (ITS) is excluded due to divergence discrepancies of individuals andlow reproducibility (Aacutelvarez amp Wendel 2003) A variety of combinations and comparisonshave been performed for the eight candidate genes however none proved equally powerfulfor all cases (Fazekas et al 2008) Consequently it is more effective to choose barcodesfor a circumscribed set of species occurring in a regional community (Kress et al 2009)Another equally important aspect of metabarcoding applications is the construction ofreference libraries which assist taxonomic assignment (Rayeacute et al 2011 Xu et al 2015)It is difficult to accurately interpret sequence reads without a reliable reference library(Elliott amp Jonathan Davies 2014)

Diet analysis is one of the central issues in waterbird research both for decipheringwaterfowl population dynamics and interpreting inter- or intra-specific interactions ofcohabitating species (Zhao et al 2015) For instance more than 60 of bean goose Anserfabalis and almost 40 of greater white-fronted gooseAnser albifrons populations along theEast AsianndashAustralian Flyway Route winter at the Shengjin Lake National Nature Reserve(Zhao et al 2015) Previous studies based on microhistologic observation illustratedthat the dominant composition of their diets was monocotyledons such as Carex spp(Zhao et al 2012) Poaceae (Zhang et al 2011) and a relatively small proportion of non-monocots (referred to as dicotyledons in the study of lsquoZhao Cao amp Fox 2013rsquo) Howeverfew food items could be identified to species-level mainly owing to variable tissue structureswithin plants similar morphology between relative species and a high level of degradationafter digestion (Zhang et al 2011 Zhao et al 2012 Zhao Cao amp Fox 2013) Ambiguousidentification has hindered understanding of waterbird population dynamics and potentialto establish effective conservation plans for them

In this study we aimed to improve this situation using the metabarcoding method toanalyze diets of these species (see flowchart in Fig 1) By examining the efficiency of eightcandidate genes (rbcL rpoC1 rpoB matK trnH-psbA trnL (UAA) atpF-atpH and psbK-psbI) we selected robust genes and corresponding primers for reference library constructionand high-throughput sequencing Subsequently we used the metabarcoding method toinvestigate diet composition of these two species based on feces collected from ShengjinLake Finally we discussed and compared results from microhistology and DNA metabar-coding using the same samples to assess the utility and efficiency of these two methods

Yang et al (2016) PeerJ DOI 107717peerj2345 321

Figure 1 Technical flowchart of this study

MATERIALS AND METHODSEthics statementOur research work did not involve capture or any direct manipulation or disturbances ofanimals We collected samples of plants and feces for molecular analyses We obtainedaccess to the reserve under the permission of the Shengjin Lake National Nature ReserveAdministration (Chizhou Anhui China) which is responsible for the management of theprotected area and wildlife We were forbidden to capture or disturb geese in the field

Study areaShengjin Lake (11655primendash11715primeE 3015primendash3030primeN) was established as a National NatureReserve in 1997 aiming to protect waterbirds including geese cranes and storks Thewater level fluctuates greatly in this lake with maximal water level of 17 m duringsummer (flood season) but only 10 m during winter (dry season) Due to this fluctuationreceding waters expose two large Carex spp meadows and provide suitable habitats forwaterbirds This makes Shengjin Lake one of the most important wintering sites formigratory waterbirds (Zhao et al 2015) Greater white-fronted goose and bean goose arethe dominant herbivores wintering (from October to April) in this area accounting for40 and 60 of populations along the East AsianndashAustralian Flyway Route respectively(Zhao et al 2015)

Field samplingThe most common plant species that these two geese may consume were collected inMay 2014 and January 2015 especially species belonging to Carex and Poaceae Fresh andintact leaves were carefully picked tin-packaged in the field and stored at minus80 C in thelaboratory before further treatment Morphological identification was carried out with theassistance of two botanists (Profs Zhenyu Li and Shuren Zhang from Institute of BotanyChinese Academy of Sciences)

Yang et al (2016) PeerJ DOI 107717peerj2345 421

Figure 2 The location of our study area Shengjin Lake National Nature Reserve and our samplingsites (Source httperosusgsgov)

All feces were collected at the reserve (Fig 2) in January 2015 Based on previousstudies and the latest waterbird survey sites with large flocks of geese (ie more than 200individuals) were chosen (Zhang et al 2011) As soon as geese finished feeding and feceswere defecated fresh droppings were picked and stored on dry ice Droppings of bean geesewere generally thicker than those of smaller greater white-fronted goose to the degree thatthese could be reliably distinguished in the field (Zhao et al 2015) Disposal gloves werechanged for each sample to avoid cross contamination To avoid repeated sampling andto make sure samples were from different individuals each sample was collected with aseparation of more than 2 m In total 21 feces were collected including 11 for greaterwhite-fronted goose and 10 for bean goose All samples were transported to laboratory ondry ice and then stored at minus80 C until further analysis

Selection of molecular markers and corresponding primersHere we aimed to select gene markers with adequate discriminating power for ourstudy We included eight chloroplast genesmdash rbcL rpoC1 rpoB matK trnH-psbAtrnL (UAA) atpF-atpH and psbK-psbI for estimation Although Shengjin Lake includedan array of plant species we focused mainly on the most likely food resources(Xu et al 2008 Zhao et al 2015) that geese would consume for candidate gene testsThese covered 11 genera and the family Poaceae (Table S1) For tests of all candidate geneswe recovered sequences of representative species in the selected groups from GenBank(httpwwwncbinlmnihgovnuccore) We calculated inter-specific divergence within

Yang et al (2016) PeerJ DOI 107717peerj2345 521

every genus or family based on the Kiruma 2-parameter model (K2P) using MEGA version6 (Tamura et al 2013) We also constructed molecular trees based on UPGMA usingMEGA and characterized the resolution of species by calculating the percentage of speciesrecovered as monophyletic based on phylogenetic trees (Rf) Secondly primers selectedout of eight candidate genes were used to amplify all specimens collected in ShengjinLake and to check their amplification efficiency and universality Thirdly we calculatedinter-specific divergence based on sequences that we obtained from last step Generallya robust barcode gene is obtained when the minimal inter-specific distance exceeds themaximal intra-specific distance (eg existence of barcoding gaps) Finally to allow therecognition of sequences after high-throughput sequencing both of the forward andreverse primers of the selected marker gene were tagged specifically for each sample with8nt nucleotide codes at the 5primeend (Parameswaran et al 2007)

DNA extraction amplification and sequencingTwo hundred milligrams of leaf was used to extract the total DNA from each plant sampleusing a modified CTAB protocol (Cota-Sanchez Remarchuk amp Ubayasena 2006) DNAextraction of feces was carried out using the same protocol with minor modification inincubation time (elongate to 12 h) Each fecal sample was crushed thoroughly and dividedinto four quarters All quarters of DNA extracts were then pooled together DNA extractionwas carried out in a clean room used particularly for this study For each batch of DNAextraction negative controls (ie extraction without feces) were included to monitorpossible contamination

For plant DNA extracts PCR amplifications were carried out in a volume of 25 microl withsim100 ng total DNA as template 1U of Taq Polymerase (Takara Dalian Liaoning ProvinceChina) 1times PCR buffer 2 mM of Mg2 + 025 mM of dNTPs 01 microM of forward primerand 01 microM of reverse primer After 4 min at 94 C the PCR cycles were as follows 35cycles of 30 s at 94 C 30 s at 56 C and 45 s at 72 C and the final extension was 10min at 72 C We applied the same PCR conditions for all primers All the successful PCRproducts were sequenced with Genewiz (Suzhou Jiangsu Province China)

For fecal DNA extracts PCR mixtures (25 microl) were prepared in six replicates foreach sample to reduce biased amplification Each replicate was subjected to the sameamplification procedure used for plant extracts The six replicates of each sample werepooled and purified using the Sangon PCR product purification kit (Sangon BiotechShanghai China) Quantification was carried out to ensure equilibrium of contributionof each sample using the NanoDrop ND-2000 UV-Vis Spectrophotometer (NanoDropTechnologies Wilmington Delaware USA) High-throughput sequencing was performedusing Illumina MiSeq platform following manufacturerrsquos instructions by BGI (ShenzhenGuangdong Province China) Reads of high-throughput sequencing could be found atNCBIrsquos Sequence Read Archive (Accession number SRP070470)

Data analysis for estimating diet compositionAfter high-throughput sequencing pair-ended readsweremergedwith the fastq_mergepairscommand using usearch (httpdrive5comusearch Edgar 2010) Reads were then split

Yang et al (2016) PeerJ DOI 107717peerj2345 621

into independent files according to unique tags using the initial process of RDP pipeline(httpspyrocmemsueduinitformspr) We removed sequences (i) that didnrsquot perfectlymatch tags and primer sequences (ii) that contained ambiguous nucleotide (Nrsquos) Tagsand primers were then trimmed using the initial process of RDP pipeline Further qualityfiltering using usearch discarded sequences with (i) quality score less than 30 (ltQ30) and(ii) length shorter than 100 bp Unique sequences were clustered to operational taxonomyunits (OTUs) at the similarity threshold of 98 (Edgar 2013) All OTUs were assignedto unique taxonomy with local blast 2230+ (Altschul et al 1990) We detected a plantwithin the reference library for each sequence with the threshold of length coverage gt98identity gt98 and e-value lt 10eminus50 If a query sequence matched two or more taxa itwas assigned to a higher taxonomic level which included all taxa

Microhistology analysisWe used the method described by Zhang et al (2011) to perform microhistologicexamination of fecal samples Each sample was first washed with pure water and filteredwith a 25-microm filter Subsequently the suspension was examined under a light microscopeat 10times magnification for quantification statistics and at 40times magnification for speciesidentification We compared photos of visible fragments with an epidermis database ofplants from Shengjin Lake to identify food items (Zhang et al 2011)

RESULTSSelection of genes and corresponding primers and reference libraryconstructionA total of 3296 representative sequences were recovered from GenBank ranging from 0to 345 sequences per gene per taxon (Table S1) For Eleocharis and Trapa only sequencesof rbcL gene and trnL gene were retained which makes it unfair to compare the efficiencyand suitability of eight candidate genes For the other ten taxa trnL trnH-psbA rbcL andpsbK-psbI showed the largest inter-specific divergence in five three one and one taxonomicgroups respectively In addition trnH-psbA atpF-atpH trnL and psbK-psbI showed thehighest mean divergence in four four one and one taxonomic groups respectivelyHowever given the small number of sequences and coverage of species the suitability andefficiency of atpF-atpH and psbK-psbI seem to be less reliable than others This comparisonmakes trnH-psbA trnL and rbcL to be selected out of the eight candidate genes As matKused to be recommended as the standard barcode gene for Carex species (Starr Naczi ampChouinard 2009) which happened to be the dominant food for herbivorous geese in ourstudy (Zhao et al 2015) we included matK as a supplement at last

Primers for these four genes (Table 1) were used to amplify the plants that we collectedin the field In total we collected 88 specimens in the field belonging to 25 families53 genera and 70 species (Table 2) The selected primers for trnL and rbcL successfullyamplified 100 and 91 of all species respectively while primers for trnH-psbA andmatKamplified only 71 and 43 respectively Therefore we chose trnL and rbcL to test theirdiscriminating power in our target plants

Yang et al (2016) PeerJ DOI 107717peerj2345 721

Table 1 Primers of candidate genes and reference library constructingOnly the c and h were used forhigh-throughput sequencing in fusion primer mode (primer+ tags) The unique tags were used to differ-entiate PCR products pooled together for highthroughput sequencing (Parameswaran et al 2007)

Gene Primer Sequence (5prime-3prime)

matK matK-XFa TAATTTACGATCAATTCATTCmatK-MALPb ACAAGAAAGTCGAAGTAT

rbcL rbcLa-Fc ATGTCACCACAAACAGAGACTAAAGCrbcLa-Rd GTAAAATCAAGTCCACCRCG

trnH-psbA pasbA3_fe CGCGCATGGTGGATTCACAATCCtrnHf_05f GTTATGCATGAACGTAATGCTC

trnL cg CGAAATCGGTAGACGCTACGhh CCATTGAGTCTCTGCACCTATC

Notesareferred to Ford et al (2009)breferred to Dunning amp Savolainen (2010)creferred to Hasebe et al (1994)dreferred to Kress et al (2009)ereferred to Tate amp Simpson (2003)freferred to Sang Crawford amp Stuessy (1997)greferred to Taberlet et al (1991)hreferred to Taberlet et al (2007)

We calculated the inter-specific divergence within genera and families with at leasttwo species to compare their discriminating power Maximal minimal and mean inter-specific distances were calculated for seven dominant genera and six dominant families(Table 3) Neither gene could differentiate species of Vallisneria (mean= 0000 plusmn 0000)or Artemisia (mean = 0000 plusmn 0000) But trnL showed a larger divergence rangefor the other six genera and five families Hence we chose trnL as the barcoding genefor reference library constructing and high-throughput sequencing for our study Thediscriminating power of trnL was strong for most species (Table 4) However some speciescould only be identified at genus-level or family-level with trnL For instance five speciesof Potamogetonaceae shared the same sequences and this made them to be identified atgenus-level Species could be identified easily to genus and family except for three grasses(Poaceae) Beckmannia syzigachne Phalaris arundinacea and Polypogon fugax which sharedidentical sequences

Data processing for estimating diet compositionIn total 021 and 018 million reads were generated for greater white-fronted goose(GWFG) and bean goose (BG) respectively (Table 5) The number of recovered OTUsranged from 8 to 123 for GWFG and BG samples We used local BLAST to compare thesesequences with the Shengjin Lake reference database Finally with DNA metabarocoding12 items were discovered in the feces of GWFG including one at family-level three atgenus-level and eight at species-level (Table 6) Four items were discovered in the feces ofBG including one at genus-level and three at species-level In total this method identified15 taxa in feces of these geese

Yang et al (2016) PeerJ DOI 107717peerj2345 821

Table 2 Plant species in the reference libraryWe collected these samples from Shengjin Lake

Species No of samples Species No of samples

Curculigo orchioides 1 Trapella sinensis 2Artemisia capillaris 1 Plantago asiatica 1Artemisia selengensis 2 Alopecurus aequalis 2Aster subulatus 1 Beckmannia syzigachne 1Bidens frondosa 1 Bromus japonicus 1Erigeron annuus 1 Cynodon dactylon 2Gnaphalium affine 1 Phalaris arundinacea 1Hemistepta lyrata 1 Phragmites australis 1Kalimeris incisa 1 Poa annua 1Bothriospermum kusnezowii 1 Polypogon fugax 1Lobelia chinensis 1 Roegneria kamoji 2Sagina japonica 1 Zizania latifolia 1Stellaria media 2 Polygonum lapathifolium 4Calystegia hederacea 1 Polygonum orientale 1Cardamine lyrata 1 Polygonum perfoliatum 1Carex heterolepis 3 Polygonum persicaria 1Carex capricornis 1 Rumex trisetiferus 3Carex paxii 1 Potamogeton crispus 1Carex remotiuscula 1 Potamogeton maackianus 1Fimbristylis dichotoma 1 Potamogeton malaianus 1Eleocharis migoana 1 Potamogeton natans 1Scripus karuizawensis 1 Potamogeton pectinatus 1Nymphoides peltatum 1 Clematis florida 1Myriophyllum spicatum 1 Ranunculus chinensis 2Hydrilla verticillta 1 Ranunculus sceleratus 2Hydrocharis dubia 1 Potentilla freyniana 2Vallisineria spiralis 1 Gratiola japonica 1Vallisneria spinulosa 1 Mazus miquelii 2Juncus effusus 1 Veronica undulata 1Juncus gracillimus 1 Trapa bispinosa 1Leonurus japonicus 1 Trapa maximowiczii 1Salvia plebeia 1 Trapa pseudoincisa 1Glycine soja 1 Trapa quadrispinosa 1Vicia sativa 1 Hydrocotyle sibthorpioides 1Euryale ferox 1 Torilis japonica 2

However the sequence percentage of each food item varied greatly (Table 6) ForGWFG the majority of sequences (9636) were composed of only five itemsmdashPoaceaespp (4798 except Poa annua) Poa annua (2186) Carex heterolepis (1751) Carexspp (901 except Carex heterolepis) and Alopecurus aequalis (321) For BG almostall the sequences belonged to Carex heterolepis (9949) Other items only occupied arelatively small proportion of sequences In addition the presence of each item per samplewas also unequal (Table S2) For example in GWFG Carex heterolepis Carex spp Poa

Yang et al (2016) PeerJ DOI 107717peerj2345 921

Table 3 Inter-specific divergences within dominant genera and families of rbcL gene and trnL gene with Kiruma 2-Parameter modelUnder-scores indicate the most common food composition based on earlier microhistologic analysis (Zhao et al 2012 Zhao et al 2015)

Inter-specific divergence Taxa rbcL trnL

Maximal Minimal Mean Maximal Minimal Mean

Artemisia 0000 0000 0000plusmn 0000 0000 0000 0000plusmn 0000Carex 0013 0000 0008plusmn 0006 0058 0000 0027plusmn 0021Polygonum 0027 0000 0010plusmn 0006 0076 0000 0033plusmn 0022Potamogeton 0012 0000 0005plusmn 00034 0016 0000 0005plusmn 0005Ranunculus 0031 0000 0020plusmn 0009 0042 0021 0024plusmn 0022Trapa 0000 0000 0000plusmn 0000 0081 0000 0049plusmn 0030

Withingenera

Vallisneria 0000 0000 0000plusmn 0000 0000 0000 0000plusmn 0000Cyperaceae 0043 0000 0018plusmn 0010 0178 0000 0084plusmn 0046Asteraceae 0120 0000 0049plusmn 0017 0087 0000 0023plusmn 0018Poaceae 0025 0000 0016plusmn 00009 0166 0000 0074plusmn 0039Hydrocharitaceae 0122 0000 0078plusmn 0020 0159 0000 0100plusmn 0054Polygonaceae 0043 0000 0020plusmn 0009 0129 0000 0031plusmn 0022

Withinfamilies

Ranunculaceae 0033 0016 0017plusmn 0015 0045 0000 0018plusmn 0013

Table 4 Number of species and unique sequences for families with more than one species in ShengjinLake plant database

Family No of species No of sequences

Asteraceae 8 7Caryophyllaceae 2 2Cyperaceae 7 5Fabaceae 2 2Hydrocharitaceae 4 3Lamiaceae 2 2Poaceae 10 8Polygonaceae 5 5Potamogetonaceae 5 1Ranunculaceae 3 3Scrophulariaceae 3 3Trapaceae 4 3Umbelliferae 2 2

annua and Potentilla supina were present in almost all the samples while Stellaria mediaAsteraceae sp and Lapsana apogonoldes occurred in only about one third of samples

When the microhistologic examination was performed using the same samples eightitems were found in the feces of greater white-fronted goose including one at family-levelfour at genus-level and three at species-level (Table 6) Dominant items were Poaceae spp(4568) Alopecurus Linn (3093) and Carex heterolepis (1639) Seven items werefound in the feces of bean goose including four at genus-level and three at species-level(Table 6) Dominant items were Carex heterolepis (6285) Asteraceae sp (1455) andAlopecurus Linn (1318)

Yang et al (2016) PeerJ DOI 107717peerj2345 1021

Table 5 Summary of the process and results of high-throughput sequencing analysis

Sample Pair-endsequences

Sequences for whichprimers and tags wereidentified and with lengthgt100 bp

Uniquesequences

OTUs Food items

GWFG1 16303 8627 1288 78 8GWFG2 25482 13449 1091 102 8GWFG3 19063 10056 1277 48 10GWFG4 23856 12548 1419 114 8GWFG5 20955 11249 1720 123 9GWFG6 11677 7205 973 52 9GWFG7 13377 6782 1328 59 9GWFG8 7749 3959 774 89 9GWFG9 16833 8799 1436 90 6GWFG10 18474 9819 449 32 9GWFG11 19648 10458 617 31 6BG1 20225 10254 784 23 4BG2 14195 7161 564 16 2BG3 2229 1149 255 12 4BG4 517 268 77 8 3BG5 28152 14033 1000 15 3BG6 16723 8484 740 17 4BG7 30166 15403 974 15 4BG8 30928 15706 1028 15 3BG9 8382 4489 446 13 4BG10 10714 5526 537 13 4

NotesGWFG Greater white-fronted goose BG Bean goose

DISCUSSIONMarker selection and reference library constructing for diet analysisWith greatly reduced cost extremely high throughput and information contentmetabarcoding has revolutionized the exploration and quantification of dietary analysiswith noninvasive samples containing degraded DNA (Fonseca et al 2010 Shokralla et al2014) Despite enormous potential to boost data acquisition successful application of thistechnology relies greatly on the power and efficiency of genetic markers and correspondingprimers (Bik et al 2012 Zhan et al 2014) In order to select the most appropriate markergene for our study we compared the performance of eight commonly used chloroplastgenes (rbcL rpoB rpoC1 matK trnL trnH-psbA atpF-atpH and psbK-psbI) and theircorresponding primers Although a higher level of discriminating power was shown inseveral studies atpF-atpH and psbK-psbI were not as commonly used as other barcodinggenes (Hollingsworth Graham amp Little 2011) As one of the most rapidly evolving codinggenes of plastid genomes matK was considered as the closest plant analogue to theanimal barcode COI (Hilu amp Liang 1997) However matK was difficult to amplify using

Yang et al (2016) PeerJ DOI 107717peerj2345 1121

Table 6 List of the lowest taxonomic food items in the diet of geese

Food items Level of identification GWFG BG

N reads Fs () Fm () N reads Fs () Fm ()

Poaceae spp (except Poa annua) Family 51705 4798 4568 0 000 000Poa annua Species 23554 2186 000 167 020 000Carex heterolepis Species 18867 1751 1639 81457 9949 6285Carex spp (except Carex heterolepis) Genus 9706 901 231 191 023 349Alopecurus aequalis Species 3458 321 000 0 000 000Potentilla chinensis Species 184 017 118 65 008 206Cynodon dactylon Species 155 014 000 0 000 000Polygonum spp Genus 56 005 000 0 000 000Stellaria media Species 26 002 000 0 000 000Ranunculus chinensis Species 14 002 000 0 000 000Lapsana apogonoides Species 11 002 000 0 000 000Asteraceae sp Genus 16 001 233 0 000 1455Alopecurus Genus 0 000 3093 0 000 1318Carex thunbergii Species 0 000 054 0 000 279Fabaceae sp Genus 0 000 064 0 000 108

NotesGWFG Greater white-fronted goose BG Bean goose Fs percentage of sequences in DNA metabarcoding Fm percentage of epidermis squares in microhistological analysis

available primer sets with only 43 of successful amplification in this study In spite ofthe higher species discrimination success of trnH-psbA than rbcL+matK in some groupsthe presence of duplicated loci microinversions and premature termination of readsby mononucleotide repeats lead to considerable proportion of low-quality sequences andover-estimation of genetic difference when using trnH-psbA (Graham et al 2000WhitlockHale amp Groff 2010) In contrast the barcode region of rbcL is easy to amplify sequenceand align in most plants and was recommended as the standard barcode for land plants(Chase et al 2007) The relatively modest discriminating power (compared to trnL) pre-cludes its application for our study aiming to recover high resolution of food items Conse-quently trnL was selected out of eight candidate markers with 100 amplification successmore than 90 of high quality sequences and relatively large inter-specific divergence

One of the biggest obstacles in biodiversity assessment and dietary analysis is the lack ofa comprehensive reference library without which it is impossible to accurately interpretand assign sequences generated from high-throughput sequencing (Valentini Pompanonamp Taberlet 2009 Barco et al 2015) In this study we constructed a local reference libraryby amplifying the most common species (70 morpho-species in total) during the winteringperiod with the trnL gene Although not all of them could be identified at species-levelwith trnL due to relatively low inter-specific divergence many species could be separatedwith distinctive sequences Previous studies have recommended group-specific barcodes todifferentiate closely related plants at the species level (Li et al 2015) For instance matKhas been proved to be more efficient for the discrimination of Carex spp (Starr Naczi ampChouinard 2009) However the primer set of matK failed to amplify species of Carex spp

Yang et al (2016) PeerJ DOI 107717peerj2345 1221

in our study suggesting the universality of selected primer pairs should be tested in eachstudy (Zhan et al 2014)

Applications of metabarcoding for geese diet analysisA variety of recent studies have demonstrated the great potential of metabarcoding fordietary analysis mainly owing to the high throughput high discriminating power and theability to process large-scale samples simultaneously (Creer et al 2010 Taberlet et al 2012Shehzad et al 2012) In this study we applied this method to recover diets of herbivorousgeese and provided standard protocols for dietary analysis of these two ecologicallyimportant waterbirds Our results further proved the more objective less experience-dependent and more time-efficient character of DNA metabarcoding However not allthe species in the reference library could be identified at species-level owing to low inter-specific divergence We suggest that multiple group-specific markers to be incorporatedin the future as in De Barba et al (2014) Two species Carex thunbergii and Fabaceae spwere only discovered via microhistologic analysis rather than metabarcoding This failuremight reflect the biased fragment amplification of current technology of which dominanttemplates could act as inhibitors of less dominant species (Pintildeol et al 2015) Howeverthree species of Poaceae were only discovered using metabarcoding In total more taxaand higher resolution were attained using metabarcoding But microhistology still proveda powerful supplementary Previous studies using metabarcoding usually detected dozensof food items even as many as more than one hundred species For instance 18 taxaprey were identified for leopard cat (Prionailurus bengalensis) (Shehzad et al 2012) 44plant taxa were recovered in feces of red-headed wood pigeon (Columba janthina nitens)(Ando et al 2013) while more than 100 taxa were found in diet studies of brown bear(Ursus arctos) (De Barba et al 2014) The relatively narrow diet spectrum of herbivorousgeese may lead to misunderstanding that this result of our study is merely an artefact dueto small sampling effort However this result is credible since these two geese species onlyfeed on Carex meadow where the dominant vegetation is Carex spp with other speciessuch as Poaceae and dicots (Zhao et al 2015) Even though other wetland plants exist theyusually composed only a small proportion of the geese diets

Quantification of food composition is another key concern in dietary analysis Althoughthe relative percentage of sequences was not truly a quantitative estimate of diet taxa ofthe majority sequences in this study were in accord with microhistologic observationswhich was considered an efficient way to provide quantitative results (Wang et al 2013)Discrepancies might come from the semi-quantitative nature of metabarcoding methods(Sun et al 2015) This is likely derived from PCR amplification which always entails biasescaused by universal primer-template mismatches annealing temperature or number ofPCR cycles (Zhan et al 2014Pintildeol et al 2015) Othermethods such as shotgun sequencingor metagenomic sequencing could be incorporated in the future to give information onabundances of food items (Srivathsan et al 2015)

Implications for waterbird conservation and wetland managementFor long-distance migratory waterbirds such as the wild geese in this study theirabundance and distribution are greatly influenced by diet availability and habitat use

Yang et al (2016) PeerJ DOI 107717peerj2345 1321

(Wang et al 2013) For example waterbirds may be restricted at (forced to leave) certainareas due to favoring (loss) of particular food (Wang et al 2013) while the recoveryof such food may contribute to return of bird populations (Noordhuis Van der Molenamp Van den Berg 2002) Results of both metabarcoding and microhistologic analysis inthis study revealed that Carex and Poaceae were dominant food components which is inaccordance with previous studies The increasing number of these two geese wintering atthe Shengjin Lake may be attributed to the expansion of Carex meadow which offers accessto abundant food resources (Zhao et al 2015) Considering the long-distance migratorycharacter of these birds it is important to maintain energy balances and good bodyconditions in wintering areas because this might further influence their departure datesand reproductive success after arriving at breeding areas (Prop Black amp Shimmings 2003)Based on this it is important for wetland managers to maintain the suitable habitats andfood resources for sustainable conservation of waterbirds which highlights the significanceof diet information Our study also indicated that overlap and dissimilarity existed betweenthe diets of these two geese Animals foraging in the same habitats may compete for limitedfood resources (Madsen amp Mortensen 1987) This discrepancy of food composition mayarise from the avoidance of inter-specific competition (Zhao et al 2015) However withthe increase of these two species in Shengjin Lake further research is needed to investigatethe mechanisms of food resource partitioning and spatial distribution

Shengjin Lake is one of themost important wintering sites for tens of thousands ofmigra-tory watebirds while annual life cycles of these birds depend on the whole migratory routeincluding breeding sites stop-over sites and wintering sites (Kear 2005) Thus a molecularreference library covering all the potential food items along the whole migratory routewill be useful both for understanding of wetland connections and waterbird conservationIn addition the ability of DNA metabarcoding to process lots of samples simultaneouslyenables rapid analyses and makes this method helpful for waterbird studies

ACKNOWLEDGEMENTSWe are very grateful to the stuff of the Shengjin Lake National Nature Reserve for theirexcellent assistance during the field work Great thanks to Zhujun Wang and An An forfeces collection in the field We thank Song Yang for collecting plants in the Shengjin LakeReserve We also thank Profs Zhenyu Li and Shuren Zhang for plant identification Specialthanks to Drs Meijuan Zhao Xin Wang Fanjuan Meng and Peihao Cong for preparingthe epidermis database and guiding microhistologic analysis

ADDITIONAL INFORMATION AND DECLARATIONS

FundingThis work was supported by the National Basic Research Program of China (973 ProgramGrant No 2012CB956104) to EZ the National Natural Science Foundation of China(Grant No 31370416) State Key Laboratory of Urban and Regional Ecology ChineseAcademy of Sciences (No SKLURE2013-1-05) to LC the National Natural Science

Yang et al (2016) PeerJ DOI 107717peerj2345 1421

Foundation of China (Grant No 31500315) to XW and 100-Talent Program of theChinese Academy of Sciences to AZ The funders had no role in study design datacollection and analysis decision to publish or preparation of the manuscript

Grant DisclosuresThe following grant information was disclosed by the authorsNational Basic Research Program of China 2012CB956104National Natural Science Foundation of China 31370416State Key Laboratory of Urban and Regional EcologyChinese Academy of Sciences SKLURE2013-1-05National Natural Science Foundation of China 31500315

Competing InterestsThe authors declare there are no competing interests

Author Contributionsbull Yuzhan Yang conceived and designed the experiments performed the experimentsanalyzed the data contributed reagentsmaterialsanalysis tools wrote the paperprepared figures andor tables reviewed drafts of the paperbull Aibin Zhan conceived and designed the experiments analyzed the data contributedreagentsmaterialsanalysis tools reviewed drafts of the paperbull Lei Cao conceived and designed the experiments contributed reagentsmaterialsanalysistools reviewed drafts of the paperbull Fanjuan Meng and Wenbin Xu performed the experiments contributed reagentsmate-rialsanalysis tools reviewed drafts of the paper

Animal EthicsThe following information was supplied relating to ethical approvals (ie approving bodyand any reference numbers)

Our research work did not involve capture or any direct manipulation or disturbancesof animals

Field Study PermissionsThe following information was supplied relating to field study approvals (ie approvingbody and any reference numbers)

We collected samples of plants and feces for molecular analyses We received permissionto access the reserve for sampling from the Shengjin Lake National Nature ReserveAdministration (Chizhou Anhui China) which is responsible for the management of theprotected area and wildlife We were forbidden to capture or disturb geese in the field

DNA DepositionThe following information was supplied regarding the deposition of DNA sequences

NCBI Sequence Read Archive (SRA) databaseAccession number SRP070470

Yang et al (2016) PeerJ DOI 107717peerj2345 1521

Data AvailabilityThe following information was supplied regarding data availability

The raw data has been supplied as a Supplemental Dataset

Supplemental InformationSupplemental information for this article can be found online at httpdxdoiorg107717peerj2345supplemental-information

REFERENCESAltschul SF GishWMillerWMyers EW Lipman DJ 1990 Basic local alignment

search tool Journal of Molecular Biology 215403ndash410 DOI 101006jmbi19909999Aacutelvarez I Wendel JF 2003 Ribosomal ITS sequences and plant phylogenetic inference

Molecular Phylogenetics and Evolution 29417ndash434DOI 101016S1055-7903(03)00208-2

Ando H Setsuko S Horikoshi K Suzuki H Umehara S Inoue-MurayamaM IsagiY 2013 Diet analysis by nextmdashgeneration sequencing indicates the frequentconsumption of introduced plants by the critically endangered redmdashheaded woodpigeon (Columba janthina nitens) in oceanic island habitats Ecology and Evolution34057ndash4069 DOI 101002ece3773

Barco A RaupachMJ Laakmann S NeumannH Knebelsberger T 2015 Identificationof North Sea molluscs with DNA barcodingMolecular Ecology Resources 16288ndash297DOI 1011111755-099812440

Bik HM Porazinska DL Creer S Caporaso JG Knight R ThomasWK 2012 Sequenc-ing our way towards understanding global eukaryotic biodiversity Trends in Ecologyand Evolution 27233ndash243 DOI 101016jtree201111010

Bohmann K Evans A Gilbert MTP Carvalho GR Creer S KnappM YuWD DeBruynM 2014 Environmental DNA for wildlife biology and biodiversity monitor-ing Trends in Ecology and Evolution 29358ndash367 DOI 101016jtree201404003

Brander LM Florax RJ Vermaat JE 2006 The empirics of wetland valuation acomprehensive summary and a meta-analysis of the literature Environmental andResource Economics 33223ndash250 DOI 101007s10640-005-3104-4

Chase MW Cowan RS Hollingsworth PM Van den Berg C Madrintildeaacuten S Petersen GSeberg O Jorgsensen T Cameron KM Carine M Pedersen N Hedderson TAJConrad F Salazar GA Richardson JE HollingsworthM Barraclough TG KellyL WilkinsonM 2007 A proposal for a standardised protocol to barcode all landplants Taxon 56295ndash299

Cota-Sanchez JH Remarchuk K Ubayasena K 2006 Ready-to-use DNA extracted witha CTAB method adapted for herbarium specimens and mucilaginous plant tissuePlant Molecular Biology Reporter 24161ndash167 DOI 101007BF02914055

Creer S Fonseca VG Porazinska DL Giblin-Davis RM SungW Power DM Packer MCarvalho GR Blaxter ML Lambshead P ThomasWK 2010 Ultrasequencing ofthe meiofaunal biosphere Practice pitfalls and promisesMolecular Ecology 194ndash20DOI 101111j1365-294X200904473x

Yang et al (2016) PeerJ DOI 107717peerj2345 1621

De BarbaMMiquel C Boyer F Mercier C Rioux D Coissac E Taberlet P 2014 DNAmetabarcoding multiplexing and validation of data accuracy for diet assessmentapplication to omnivorous dietMolecular Ecology Resources 14306ndash323DOI 1011111755-099812188

Deagle BE Kirkwood R Jarman SN 2009 Analysis of Australian fur seal diet bypyrosequencing prey DNA in faecesMolecular Ecology 182022ndash2038DOI 101111j1365-294X200904158x

Dunning LT Savolainen V 2010 Broad-scale amplification of matK for DNA barcodingplants a technical note Botanical Journal of the Linnean Society 1641ndash9DOI 101111j1095-8339201001071x

Edgar RC 2010 Search and clustering orders of magnitude faster than BLAST Bioinfor-matics 262460ndash2461 DOI 101093bioinformaticsbtq461

Edgar RC 2013 UPARSE highly accurate OTU sequences from microbial ampliconreads Nature Methods 10996ndash998 DOI 101038NMETH2604

Elliott TL Jonathan Davies T 2014 Challenges to barcoding an entire floraMolecularEcology Resources 14883ndash891 DOI 1011111755-099812277

Fazekas AJ Burgess KS Kesanakurti PR Graham SW Newmaster SG Husband BCPercy DM Hajibabaei M Barrett SC 2008Multiple multilocus DNA barcodesfrom the plastid genome discriminate plant species equally well PLoS ONE 3e2802DOI 101371journalpone0002802

Fonseca VG Carvalho GR SungW Johnson HF Power DM Neill SP Packer MBlaxter ML Labmshead PJD ThomasWK Creer S 2010 Second-generationenvironmental sequencing unmasks marine metazoan biodiversity Nature Commu-nications 1Article 98 DOI 101038ncomms1095

Ford CS Ayres KL Haider N Toomey N Van-Alpen-Stohl J 2009 Selection ofcandidate DNA barcoding regions for use on land plants Botanical Journal of theLinnean Society 1591ndash11 DOI 101111j1095-8339200800938x

Fox AD Bergersen E Tombre IM Madsen J 2007Minimal intra-seasonal dietaryoverlap of barnacle and pink-footed geese on their breeding grounds in SvalbardPolar Biology 30759ndash768 DOI 101007s00300-006-0235-1

Fox AD Cao L Zhang Y Barter M ZhaoMMeng FWang S 2011 Declines in thetuber-feeding waterbird guild at Shengjin Lake National Nature Reserve Chinandashabarometer of submerged macrophyte collapse Aquatic Conservation Marine andFreshwater Ecosystems 2182ndash91 DOI 101002aqc1154

Graham SW Reeves PA Burns AC Olmstead RG 2000Microstructural changes innoncoding chloroplast DNA interpretation evolution and utility of indels andinversions in basal angiosperm phylogenetic inference International Journal of PlantSciences 161S83ndashS96 DOI 101086317583

HasebeM Omori T NakazawaM Sano T KatoM Iwatsuki K 1994 rbcL genesequences provide evidence for the evolutionary lineages of leptosporangiate fernsProceedings of the National Academy of Sciences of the United States of America915730ndash5734 DOI 101073pnas91125730

Yang et al (2016) PeerJ DOI 107717peerj2345 1721

Hibert F Taberlet P Chave J Scotti-Saintagne C Sabatier D Richard-Hansen C 2013Unveiling the diet of elusive rainforest herbivores in next generation sequencing eraThe tapir as a case study PLoS ONE 8e60799 DOI 101371journalpone0060799

Hilu KW Liang H 1997 ThematK gene sequence variation and application in plantsystematics American Journal of Botany 84830ndash839 DOI 1023072445819

Hollingsworth PM Graham SW Little DP 2011 Choosing and using a plant DNAbarcode PLoS ONE 6e19254 DOI 101371journalpone0019254

James HF Burney DA 1997 The diet and ecology of Hawaiirsquos extinct flightless water-fowl evidence from coprolites Biological Journal of the Linnean Society 62279ndash297DOI 101111j1095-83121997tb01627x

Kear J 2005Ducks geese and swans Oxford Oxford University PressKressWJ Erickson DL Jones FA Swenson NG Perez R Sanjur O Bermingham E

2009 Plant DNA barcodes and a community phylogeny of a tropical forest dynamicsplot in Panama Proceedings of the National Academy of Sciences of the United States ofAmerica 10618621ndash18626 DOI 101073pnas0909820106

Li X Yang Y Henry RJ Rossetto MWang Y Chen S 2015 Plant DNA barcoding fromgene to genome Biological Reviews 90157ndash166 DOI 101111brv12104

Ma Z Cai Y Li B Chen J 2010Managing wetland habitats for waterbirds an interna-tional perspectiveWetlands 3015ndash27 DOI 101007s13157-009-0001-6

Madsen J Mortensen CE 1987Habitat exploitation and interspecific competition ofmoulting geese in East Greenl Ibis 12925ndash44DOI 101111j1474-919X1987tb03157x

Noordhuis R Van der Molen DT Van den BergMS 2002 Response of herbivorouswater-birds to the return of Chara in Lake Veluwemeer The Netherlands AquaticBotany 72349ndash367 DOI 101016S0304-3770(01)00210-8

Parameswaran P Jalili R Tao L Shokralla S Gharizadeh B Ronaghi M Fire AZ 2007A pyrosequencing-tailored nucleotide barcode design unveils opportunities for large-scale sample multiplexing Nucleic Acids Research 35e130 DOI 101093nargkm760

Pintildeol J Mir G Gomez-Polo P Agustiacute N 2015 Universal and blocking primermismatches limit the use of high-throughput DNA sequencing for the quanti-tative metabarcoding of arthropodsMolecular Ecology Resources 15819ndash830DOI 1011111755-099812355

Pompanon F Deagle BE SymondsonWO Brown DS Jarman SN Taberlet P 2012Who is eating what diet assessment using next generation sequencingMolecularEcology 211931ndash1950 DOI 101111j1365-294X201105403x

Prop J Black JM Shimmings P 2003 Travel schedules to the high arctic barnaclegeese trademdashoff the timing of migration with accumulation of fat deposits Oikos103403ndash414 DOI 101034j1600-0706200312042x

Rayeacute G Miquel C Coissac E Redjadj C Loison A Taberlet P 2011 New insightson diet variability revealed by DNA barcoding and high-throughput sequenc-ing chamois diet in autumn as a case study Ecological Research 26265ndash276DOI 101007s11284-010-0780-5

Yang et al (2016) PeerJ DOI 107717peerj2345 1821

Reynolds C Miranda NA Cumming GS 2015 The role of waterbirds in the dispersalof aquatic alien and invasive species Diversity and Distributions 21744ndash754DOI 101111ddi12334

Samelius G Alisauskas RT 1999 Diet and growth of glaucous gulls at a large Arcticgoose colony Canadian Journal of Zoology 771327ndash1331 DOI 101139z99-091

Sang T Crawford DJ Stuessy TF 1997 Chloroplast DNA phylogeny reticulateevolution and biogeography of Paeonia (Paeoniaceae) American Journal of Botany841120ndash1136

ShehzadW Riaz T NawazMAMiquel C Poillot C Shah SA Pompanon F CoissacE Taberlet P 2012 Carnivore diet analysis based on next-generation sequencingapplication to the leopard cat (Prionailurus bengalensis) in PakistanMolecularEcology 211951ndash1965 DOI 101111j1365-294X201105424x

Shokralla S Gibson JF Nikbakht H Janzen DH HallwachsW Hajibabaei M 2014Next-generation DNA barcoding using next-generation sequencing to enhance andaccelerate DNA barcode capture from single specimensMolecular Ecology Resources14892ndash901 DOI 1011111755-099812236

Shokralla S Spall JL Gibson JF 2012 Next-generation sequencing technologies forenvironmental DNA researchMolecular Ecology 211794ndash1805DOI 101111j1365-294X201205538x

Srivathsan A Sha J Vogler AP Meier R 2015 Comparing the effectiveness of metage-nomics and metabarcoding for diet analysis of a leaf-feeding monkey (Pygathrixnemaeus)Molecular Ecology Resources 15250ndash261 DOI 1011111755-099812302

Starr JR Naczi RFC Chouinard BN 2009 Plant DNA barcodes and species res-olution in sedge (Carex Cyperaceae)Molecular Ecology Resources 9151ndash163DOI 101111j1755-0998200902640x

Sun C Zhao Y Li H Dong Y MacIsaac HJ Zhan A 2015 Unreliable quantificationof species abundance based on high-throughput sequencing data of zooplanktoncommunities Aquatic Biology 249ndash15 DOI 103354ab00629

Swennen C Yu YT 2005 Food and feeding behavior of the black-faced SpoonbillWaterbirds 2819ndash27 DOI 1016751524-4695(2005)028[0019FAFBOT]20CO2

SymondsonWOC 2002Molecular identification of prey in predator dietsMolecularEcology 11627ndash641 DOI 101046j1365-294X200201471x

Taberlet P Coissac E Pompanon F Brochmann CWillerslev E 2012 Towards next-generation biodiversity assessment using DNA metabarcodingMolecular Ecology212045ndash2050 DOI 101111j1365-294X201205470x

Taberlet P Coissac E Pompanon F Gielly L Miquel C Valentini A Vermat TCorthier G Brocmann CWillerslev E 2007 Power and limitations of the chloro-plast trnL (UAA) intron for plant DNA barcoding Nucleic Acids Research 35e14DOI 101093nargkl938

Taberlet P Gielly L Pautou G Bouvet J 1991 Universal primers for amplifica-tion of three non-coding regions of chloroplast DNA Plant Molecular Biology171105ndash1109 DOI 101007BF00037152

Yang et al (2016) PeerJ DOI 107717peerj2345 1921

Tamura K Stecher G Peterson D Filipski A Kumar S 2013MEGA6 molecular evolu-tionary genetics analysis version 60MolecularBiology and Evolution 302725ndash2729DOI 101093molbevmst197

Tate JA Simpson BB 2003 Paraphyly of Tarasa (Malvaceae) and diverse origins of thepolyploidy species Systematic Botany 28723ndash737 DOI 10104302-641

Valentini A Pompanon F Taberlet P 2009 DNA barcoding for ecologists Trends inEcology and Evolution 24110ndash117 DOI 101016jtree200809011

Wang X Fox AD Cong P Barter M Cao L 2012 Changes in the distribution and abun-dance of wintering Lesser White-fronted Geese Anser erythropus in eastern ChinaBird Conservation International 22128ndash134 DOI 101017S095927091100030X

Wang X Fox AD Cong P Cao L 2013 Food constraints explain the restricted dis-tribution of wintering Lesser white-fronted Geese Anser erythropus in China Ibis155576ndash592 DOI 101111ibi12039

Whitlock BA Hale AM Groff PA 2010 Intraspecific inversions pose a challenge for thetrnH-psbA plant DNA barcode PLoS ONE 5e11533DOI 101371journalpone0011533

Wolfe KH LiWH Sharp PM 1987 Rates of nucleotide substitution vary greatlyamong plant mitochondrial chloroplast and nuclear DNAs Proceedings ofthe National Academy of Sciences of the United States of America 849054ndash9058DOI 101073pnas84249054

Xu C DongW Shi S Cheng T Li C Liu YWu PWuH Gao P Zhou S 2015 Acceler-ating plant DNA barcode reference library construction using herbarium specimensimproved experimental techniquesMolecular Ecology Resources 151366ndash1374DOI 1011111755-099812413

Xu L XuW Sun Q Zhou Z Shen J Zhao X 2008 Flora and vegetation in ShengjinLake Journal of Wuhan Botanical Research 27264ndash270

Zhan A Bailey SA Heath DDMacIsaac HJ 2014 Performance comparison of geneticmarkers for high-throughput sequencing-based biodiversity assessment in complexcommunitiesMolecular Ecology Resources 141049ndash1059DOI 1011111755-099812254

Zhan A MacIsaac HJ 2015 Rare biosphere exploration using high-throughput sequenc-ing research progress and perspectives Conservation Genetics 16513ndash522DOI 101007s10592-014-0678-9

Zhang Y Cao L Barter M Fox AD ZhaoMMeng F Shi H 2011 Changing distributionand abundance of Swan Goose Anser cygnoides in the Yangtze River floodplainthe likely loss of a very important wintering site Bird Conservation International2136ndash48 DOI 101017S0959270910000201

ZhaoM Cao L Fox AD 2013 Distribution and diet of wintering Tundra Bean GeeseAnser fabalis serrirostris at Shengjin LakeYangtze River floodplain ChinaWildfow6052ndash63

ZhaoM Cao L KlaassenM Zhang Y Fox AD 2015 Avoiding competition Site usediet and foraging behaviours in two similarly sized geese wintering in China Ardea10327ndash38 DOI 105253ardev103i1a3

Yang et al (2016) PeerJ DOI 107717peerj2345 2021

ZhaoM Cong P Barter M Fox AD Cao L 2012 The changing abundanceand distribu-tion of Greater White-fronted Geese Anser albifrons in the Yangtze River floodplainimpacts of recenthydrological changes Bird Conservation International 22135ndash143DOI 101017S0959270911000542

Yang et al (2016) PeerJ DOI 107717peerj2345 2121

Page 4: Selection of a marker gene to construct and the application ...We concluded that DNA metabarcoding provides new perspectives for studies of herbivorous waterbird diets and inter-specific

Figure 1 Technical flowchart of this study

MATERIALS AND METHODSEthics statementOur research work did not involve capture or any direct manipulation or disturbances ofanimals We collected samples of plants and feces for molecular analyses We obtainedaccess to the reserve under the permission of the Shengjin Lake National Nature ReserveAdministration (Chizhou Anhui China) which is responsible for the management of theprotected area and wildlife We were forbidden to capture or disturb geese in the field

Study areaShengjin Lake (11655primendash11715primeE 3015primendash3030primeN) was established as a National NatureReserve in 1997 aiming to protect waterbirds including geese cranes and storks Thewater level fluctuates greatly in this lake with maximal water level of 17 m duringsummer (flood season) but only 10 m during winter (dry season) Due to this fluctuationreceding waters expose two large Carex spp meadows and provide suitable habitats forwaterbirds This makes Shengjin Lake one of the most important wintering sites formigratory waterbirds (Zhao et al 2015) Greater white-fronted goose and bean goose arethe dominant herbivores wintering (from October to April) in this area accounting for40 and 60 of populations along the East AsianndashAustralian Flyway Route respectively(Zhao et al 2015)

Field samplingThe most common plant species that these two geese may consume were collected inMay 2014 and January 2015 especially species belonging to Carex and Poaceae Fresh andintact leaves were carefully picked tin-packaged in the field and stored at minus80 C in thelaboratory before further treatment Morphological identification was carried out with theassistance of two botanists (Profs Zhenyu Li and Shuren Zhang from Institute of BotanyChinese Academy of Sciences)

Yang et al (2016) PeerJ DOI 107717peerj2345 421

Figure 2 The location of our study area Shengjin Lake National Nature Reserve and our samplingsites (Source httperosusgsgov)

All feces were collected at the reserve (Fig 2) in January 2015 Based on previousstudies and the latest waterbird survey sites with large flocks of geese (ie more than 200individuals) were chosen (Zhang et al 2011) As soon as geese finished feeding and feceswere defecated fresh droppings were picked and stored on dry ice Droppings of bean geesewere generally thicker than those of smaller greater white-fronted goose to the degree thatthese could be reliably distinguished in the field (Zhao et al 2015) Disposal gloves werechanged for each sample to avoid cross contamination To avoid repeated sampling andto make sure samples were from different individuals each sample was collected with aseparation of more than 2 m In total 21 feces were collected including 11 for greaterwhite-fronted goose and 10 for bean goose All samples were transported to laboratory ondry ice and then stored at minus80 C until further analysis

Selection of molecular markers and corresponding primersHere we aimed to select gene markers with adequate discriminating power for ourstudy We included eight chloroplast genesmdash rbcL rpoC1 rpoB matK trnH-psbAtrnL (UAA) atpF-atpH and psbK-psbI for estimation Although Shengjin Lake includedan array of plant species we focused mainly on the most likely food resources(Xu et al 2008 Zhao et al 2015) that geese would consume for candidate gene testsThese covered 11 genera and the family Poaceae (Table S1) For tests of all candidate geneswe recovered sequences of representative species in the selected groups from GenBank(httpwwwncbinlmnihgovnuccore) We calculated inter-specific divergence within

Yang et al (2016) PeerJ DOI 107717peerj2345 521

every genus or family based on the Kiruma 2-parameter model (K2P) using MEGA version6 (Tamura et al 2013) We also constructed molecular trees based on UPGMA usingMEGA and characterized the resolution of species by calculating the percentage of speciesrecovered as monophyletic based on phylogenetic trees (Rf) Secondly primers selectedout of eight candidate genes were used to amplify all specimens collected in ShengjinLake and to check their amplification efficiency and universality Thirdly we calculatedinter-specific divergence based on sequences that we obtained from last step Generallya robust barcode gene is obtained when the minimal inter-specific distance exceeds themaximal intra-specific distance (eg existence of barcoding gaps) Finally to allow therecognition of sequences after high-throughput sequencing both of the forward andreverse primers of the selected marker gene were tagged specifically for each sample with8nt nucleotide codes at the 5primeend (Parameswaran et al 2007)

DNA extraction amplification and sequencingTwo hundred milligrams of leaf was used to extract the total DNA from each plant sampleusing a modified CTAB protocol (Cota-Sanchez Remarchuk amp Ubayasena 2006) DNAextraction of feces was carried out using the same protocol with minor modification inincubation time (elongate to 12 h) Each fecal sample was crushed thoroughly and dividedinto four quarters All quarters of DNA extracts were then pooled together DNA extractionwas carried out in a clean room used particularly for this study For each batch of DNAextraction negative controls (ie extraction without feces) were included to monitorpossible contamination

For plant DNA extracts PCR amplifications were carried out in a volume of 25 microl withsim100 ng total DNA as template 1U of Taq Polymerase (Takara Dalian Liaoning ProvinceChina) 1times PCR buffer 2 mM of Mg2 + 025 mM of dNTPs 01 microM of forward primerand 01 microM of reverse primer After 4 min at 94 C the PCR cycles were as follows 35cycles of 30 s at 94 C 30 s at 56 C and 45 s at 72 C and the final extension was 10min at 72 C We applied the same PCR conditions for all primers All the successful PCRproducts were sequenced with Genewiz (Suzhou Jiangsu Province China)

For fecal DNA extracts PCR mixtures (25 microl) were prepared in six replicates foreach sample to reduce biased amplification Each replicate was subjected to the sameamplification procedure used for plant extracts The six replicates of each sample werepooled and purified using the Sangon PCR product purification kit (Sangon BiotechShanghai China) Quantification was carried out to ensure equilibrium of contributionof each sample using the NanoDrop ND-2000 UV-Vis Spectrophotometer (NanoDropTechnologies Wilmington Delaware USA) High-throughput sequencing was performedusing Illumina MiSeq platform following manufacturerrsquos instructions by BGI (ShenzhenGuangdong Province China) Reads of high-throughput sequencing could be found atNCBIrsquos Sequence Read Archive (Accession number SRP070470)

Data analysis for estimating diet compositionAfter high-throughput sequencing pair-ended readsweremergedwith the fastq_mergepairscommand using usearch (httpdrive5comusearch Edgar 2010) Reads were then split

Yang et al (2016) PeerJ DOI 107717peerj2345 621

into independent files according to unique tags using the initial process of RDP pipeline(httpspyrocmemsueduinitformspr) We removed sequences (i) that didnrsquot perfectlymatch tags and primer sequences (ii) that contained ambiguous nucleotide (Nrsquos) Tagsand primers were then trimmed using the initial process of RDP pipeline Further qualityfiltering using usearch discarded sequences with (i) quality score less than 30 (ltQ30) and(ii) length shorter than 100 bp Unique sequences were clustered to operational taxonomyunits (OTUs) at the similarity threshold of 98 (Edgar 2013) All OTUs were assignedto unique taxonomy with local blast 2230+ (Altschul et al 1990) We detected a plantwithin the reference library for each sequence with the threshold of length coverage gt98identity gt98 and e-value lt 10eminus50 If a query sequence matched two or more taxa itwas assigned to a higher taxonomic level which included all taxa

Microhistology analysisWe used the method described by Zhang et al (2011) to perform microhistologicexamination of fecal samples Each sample was first washed with pure water and filteredwith a 25-microm filter Subsequently the suspension was examined under a light microscopeat 10times magnification for quantification statistics and at 40times magnification for speciesidentification We compared photos of visible fragments with an epidermis database ofplants from Shengjin Lake to identify food items (Zhang et al 2011)

RESULTSSelection of genes and corresponding primers and reference libraryconstructionA total of 3296 representative sequences were recovered from GenBank ranging from 0to 345 sequences per gene per taxon (Table S1) For Eleocharis and Trapa only sequencesof rbcL gene and trnL gene were retained which makes it unfair to compare the efficiencyand suitability of eight candidate genes For the other ten taxa trnL trnH-psbA rbcL andpsbK-psbI showed the largest inter-specific divergence in five three one and one taxonomicgroups respectively In addition trnH-psbA atpF-atpH trnL and psbK-psbI showed thehighest mean divergence in four four one and one taxonomic groups respectivelyHowever given the small number of sequences and coverage of species the suitability andefficiency of atpF-atpH and psbK-psbI seem to be less reliable than others This comparisonmakes trnH-psbA trnL and rbcL to be selected out of the eight candidate genes As matKused to be recommended as the standard barcode gene for Carex species (Starr Naczi ampChouinard 2009) which happened to be the dominant food for herbivorous geese in ourstudy (Zhao et al 2015) we included matK as a supplement at last

Primers for these four genes (Table 1) were used to amplify the plants that we collectedin the field In total we collected 88 specimens in the field belonging to 25 families53 genera and 70 species (Table 2) The selected primers for trnL and rbcL successfullyamplified 100 and 91 of all species respectively while primers for trnH-psbA andmatKamplified only 71 and 43 respectively Therefore we chose trnL and rbcL to test theirdiscriminating power in our target plants

Yang et al (2016) PeerJ DOI 107717peerj2345 721

Table 1 Primers of candidate genes and reference library constructingOnly the c and h were used forhigh-throughput sequencing in fusion primer mode (primer+ tags) The unique tags were used to differ-entiate PCR products pooled together for highthroughput sequencing (Parameswaran et al 2007)

Gene Primer Sequence (5prime-3prime)

matK matK-XFa TAATTTACGATCAATTCATTCmatK-MALPb ACAAGAAAGTCGAAGTAT

rbcL rbcLa-Fc ATGTCACCACAAACAGAGACTAAAGCrbcLa-Rd GTAAAATCAAGTCCACCRCG

trnH-psbA pasbA3_fe CGCGCATGGTGGATTCACAATCCtrnHf_05f GTTATGCATGAACGTAATGCTC

trnL cg CGAAATCGGTAGACGCTACGhh CCATTGAGTCTCTGCACCTATC

Notesareferred to Ford et al (2009)breferred to Dunning amp Savolainen (2010)creferred to Hasebe et al (1994)dreferred to Kress et al (2009)ereferred to Tate amp Simpson (2003)freferred to Sang Crawford amp Stuessy (1997)greferred to Taberlet et al (1991)hreferred to Taberlet et al (2007)

We calculated the inter-specific divergence within genera and families with at leasttwo species to compare their discriminating power Maximal minimal and mean inter-specific distances were calculated for seven dominant genera and six dominant families(Table 3) Neither gene could differentiate species of Vallisneria (mean= 0000 plusmn 0000)or Artemisia (mean = 0000 plusmn 0000) But trnL showed a larger divergence rangefor the other six genera and five families Hence we chose trnL as the barcoding genefor reference library constructing and high-throughput sequencing for our study Thediscriminating power of trnL was strong for most species (Table 4) However some speciescould only be identified at genus-level or family-level with trnL For instance five speciesof Potamogetonaceae shared the same sequences and this made them to be identified atgenus-level Species could be identified easily to genus and family except for three grasses(Poaceae) Beckmannia syzigachne Phalaris arundinacea and Polypogon fugax which sharedidentical sequences

Data processing for estimating diet compositionIn total 021 and 018 million reads were generated for greater white-fronted goose(GWFG) and bean goose (BG) respectively (Table 5) The number of recovered OTUsranged from 8 to 123 for GWFG and BG samples We used local BLAST to compare thesesequences with the Shengjin Lake reference database Finally with DNA metabarocoding12 items were discovered in the feces of GWFG including one at family-level three atgenus-level and eight at species-level (Table 6) Four items were discovered in the feces ofBG including one at genus-level and three at species-level In total this method identified15 taxa in feces of these geese

Yang et al (2016) PeerJ DOI 107717peerj2345 821

Table 2 Plant species in the reference libraryWe collected these samples from Shengjin Lake

Species No of samples Species No of samples

Curculigo orchioides 1 Trapella sinensis 2Artemisia capillaris 1 Plantago asiatica 1Artemisia selengensis 2 Alopecurus aequalis 2Aster subulatus 1 Beckmannia syzigachne 1Bidens frondosa 1 Bromus japonicus 1Erigeron annuus 1 Cynodon dactylon 2Gnaphalium affine 1 Phalaris arundinacea 1Hemistepta lyrata 1 Phragmites australis 1Kalimeris incisa 1 Poa annua 1Bothriospermum kusnezowii 1 Polypogon fugax 1Lobelia chinensis 1 Roegneria kamoji 2Sagina japonica 1 Zizania latifolia 1Stellaria media 2 Polygonum lapathifolium 4Calystegia hederacea 1 Polygonum orientale 1Cardamine lyrata 1 Polygonum perfoliatum 1Carex heterolepis 3 Polygonum persicaria 1Carex capricornis 1 Rumex trisetiferus 3Carex paxii 1 Potamogeton crispus 1Carex remotiuscula 1 Potamogeton maackianus 1Fimbristylis dichotoma 1 Potamogeton malaianus 1Eleocharis migoana 1 Potamogeton natans 1Scripus karuizawensis 1 Potamogeton pectinatus 1Nymphoides peltatum 1 Clematis florida 1Myriophyllum spicatum 1 Ranunculus chinensis 2Hydrilla verticillta 1 Ranunculus sceleratus 2Hydrocharis dubia 1 Potentilla freyniana 2Vallisineria spiralis 1 Gratiola japonica 1Vallisneria spinulosa 1 Mazus miquelii 2Juncus effusus 1 Veronica undulata 1Juncus gracillimus 1 Trapa bispinosa 1Leonurus japonicus 1 Trapa maximowiczii 1Salvia plebeia 1 Trapa pseudoincisa 1Glycine soja 1 Trapa quadrispinosa 1Vicia sativa 1 Hydrocotyle sibthorpioides 1Euryale ferox 1 Torilis japonica 2

However the sequence percentage of each food item varied greatly (Table 6) ForGWFG the majority of sequences (9636) were composed of only five itemsmdashPoaceaespp (4798 except Poa annua) Poa annua (2186) Carex heterolepis (1751) Carexspp (901 except Carex heterolepis) and Alopecurus aequalis (321) For BG almostall the sequences belonged to Carex heterolepis (9949) Other items only occupied arelatively small proportion of sequences In addition the presence of each item per samplewas also unequal (Table S2) For example in GWFG Carex heterolepis Carex spp Poa

Yang et al (2016) PeerJ DOI 107717peerj2345 921

Table 3 Inter-specific divergences within dominant genera and families of rbcL gene and trnL gene with Kiruma 2-Parameter modelUnder-scores indicate the most common food composition based on earlier microhistologic analysis (Zhao et al 2012 Zhao et al 2015)

Inter-specific divergence Taxa rbcL trnL

Maximal Minimal Mean Maximal Minimal Mean

Artemisia 0000 0000 0000plusmn 0000 0000 0000 0000plusmn 0000Carex 0013 0000 0008plusmn 0006 0058 0000 0027plusmn 0021Polygonum 0027 0000 0010plusmn 0006 0076 0000 0033plusmn 0022Potamogeton 0012 0000 0005plusmn 00034 0016 0000 0005plusmn 0005Ranunculus 0031 0000 0020plusmn 0009 0042 0021 0024plusmn 0022Trapa 0000 0000 0000plusmn 0000 0081 0000 0049plusmn 0030

Withingenera

Vallisneria 0000 0000 0000plusmn 0000 0000 0000 0000plusmn 0000Cyperaceae 0043 0000 0018plusmn 0010 0178 0000 0084plusmn 0046Asteraceae 0120 0000 0049plusmn 0017 0087 0000 0023plusmn 0018Poaceae 0025 0000 0016plusmn 00009 0166 0000 0074plusmn 0039Hydrocharitaceae 0122 0000 0078plusmn 0020 0159 0000 0100plusmn 0054Polygonaceae 0043 0000 0020plusmn 0009 0129 0000 0031plusmn 0022

Withinfamilies

Ranunculaceae 0033 0016 0017plusmn 0015 0045 0000 0018plusmn 0013

Table 4 Number of species and unique sequences for families with more than one species in ShengjinLake plant database

Family No of species No of sequences

Asteraceae 8 7Caryophyllaceae 2 2Cyperaceae 7 5Fabaceae 2 2Hydrocharitaceae 4 3Lamiaceae 2 2Poaceae 10 8Polygonaceae 5 5Potamogetonaceae 5 1Ranunculaceae 3 3Scrophulariaceae 3 3Trapaceae 4 3Umbelliferae 2 2

annua and Potentilla supina were present in almost all the samples while Stellaria mediaAsteraceae sp and Lapsana apogonoldes occurred in only about one third of samples

When the microhistologic examination was performed using the same samples eightitems were found in the feces of greater white-fronted goose including one at family-levelfour at genus-level and three at species-level (Table 6) Dominant items were Poaceae spp(4568) Alopecurus Linn (3093) and Carex heterolepis (1639) Seven items werefound in the feces of bean goose including four at genus-level and three at species-level(Table 6) Dominant items were Carex heterolepis (6285) Asteraceae sp (1455) andAlopecurus Linn (1318)

Yang et al (2016) PeerJ DOI 107717peerj2345 1021

Table 5 Summary of the process and results of high-throughput sequencing analysis

Sample Pair-endsequences

Sequences for whichprimers and tags wereidentified and with lengthgt100 bp

Uniquesequences

OTUs Food items

GWFG1 16303 8627 1288 78 8GWFG2 25482 13449 1091 102 8GWFG3 19063 10056 1277 48 10GWFG4 23856 12548 1419 114 8GWFG5 20955 11249 1720 123 9GWFG6 11677 7205 973 52 9GWFG7 13377 6782 1328 59 9GWFG8 7749 3959 774 89 9GWFG9 16833 8799 1436 90 6GWFG10 18474 9819 449 32 9GWFG11 19648 10458 617 31 6BG1 20225 10254 784 23 4BG2 14195 7161 564 16 2BG3 2229 1149 255 12 4BG4 517 268 77 8 3BG5 28152 14033 1000 15 3BG6 16723 8484 740 17 4BG7 30166 15403 974 15 4BG8 30928 15706 1028 15 3BG9 8382 4489 446 13 4BG10 10714 5526 537 13 4

NotesGWFG Greater white-fronted goose BG Bean goose

DISCUSSIONMarker selection and reference library constructing for diet analysisWith greatly reduced cost extremely high throughput and information contentmetabarcoding has revolutionized the exploration and quantification of dietary analysiswith noninvasive samples containing degraded DNA (Fonseca et al 2010 Shokralla et al2014) Despite enormous potential to boost data acquisition successful application of thistechnology relies greatly on the power and efficiency of genetic markers and correspondingprimers (Bik et al 2012 Zhan et al 2014) In order to select the most appropriate markergene for our study we compared the performance of eight commonly used chloroplastgenes (rbcL rpoB rpoC1 matK trnL trnH-psbA atpF-atpH and psbK-psbI) and theircorresponding primers Although a higher level of discriminating power was shown inseveral studies atpF-atpH and psbK-psbI were not as commonly used as other barcodinggenes (Hollingsworth Graham amp Little 2011) As one of the most rapidly evolving codinggenes of plastid genomes matK was considered as the closest plant analogue to theanimal barcode COI (Hilu amp Liang 1997) However matK was difficult to amplify using

Yang et al (2016) PeerJ DOI 107717peerj2345 1121

Table 6 List of the lowest taxonomic food items in the diet of geese

Food items Level of identification GWFG BG

N reads Fs () Fm () N reads Fs () Fm ()

Poaceae spp (except Poa annua) Family 51705 4798 4568 0 000 000Poa annua Species 23554 2186 000 167 020 000Carex heterolepis Species 18867 1751 1639 81457 9949 6285Carex spp (except Carex heterolepis) Genus 9706 901 231 191 023 349Alopecurus aequalis Species 3458 321 000 0 000 000Potentilla chinensis Species 184 017 118 65 008 206Cynodon dactylon Species 155 014 000 0 000 000Polygonum spp Genus 56 005 000 0 000 000Stellaria media Species 26 002 000 0 000 000Ranunculus chinensis Species 14 002 000 0 000 000Lapsana apogonoides Species 11 002 000 0 000 000Asteraceae sp Genus 16 001 233 0 000 1455Alopecurus Genus 0 000 3093 0 000 1318Carex thunbergii Species 0 000 054 0 000 279Fabaceae sp Genus 0 000 064 0 000 108

NotesGWFG Greater white-fronted goose BG Bean goose Fs percentage of sequences in DNA metabarcoding Fm percentage of epidermis squares in microhistological analysis

available primer sets with only 43 of successful amplification in this study In spite ofthe higher species discrimination success of trnH-psbA than rbcL+matK in some groupsthe presence of duplicated loci microinversions and premature termination of readsby mononucleotide repeats lead to considerable proportion of low-quality sequences andover-estimation of genetic difference when using trnH-psbA (Graham et al 2000WhitlockHale amp Groff 2010) In contrast the barcode region of rbcL is easy to amplify sequenceand align in most plants and was recommended as the standard barcode for land plants(Chase et al 2007) The relatively modest discriminating power (compared to trnL) pre-cludes its application for our study aiming to recover high resolution of food items Conse-quently trnL was selected out of eight candidate markers with 100 amplification successmore than 90 of high quality sequences and relatively large inter-specific divergence

One of the biggest obstacles in biodiversity assessment and dietary analysis is the lack ofa comprehensive reference library without which it is impossible to accurately interpretand assign sequences generated from high-throughput sequencing (Valentini Pompanonamp Taberlet 2009 Barco et al 2015) In this study we constructed a local reference libraryby amplifying the most common species (70 morpho-species in total) during the winteringperiod with the trnL gene Although not all of them could be identified at species-levelwith trnL due to relatively low inter-specific divergence many species could be separatedwith distinctive sequences Previous studies have recommended group-specific barcodes todifferentiate closely related plants at the species level (Li et al 2015) For instance matKhas been proved to be more efficient for the discrimination of Carex spp (Starr Naczi ampChouinard 2009) However the primer set of matK failed to amplify species of Carex spp

Yang et al (2016) PeerJ DOI 107717peerj2345 1221

in our study suggesting the universality of selected primer pairs should be tested in eachstudy (Zhan et al 2014)

Applications of metabarcoding for geese diet analysisA variety of recent studies have demonstrated the great potential of metabarcoding fordietary analysis mainly owing to the high throughput high discriminating power and theability to process large-scale samples simultaneously (Creer et al 2010 Taberlet et al 2012Shehzad et al 2012) In this study we applied this method to recover diets of herbivorousgeese and provided standard protocols for dietary analysis of these two ecologicallyimportant waterbirds Our results further proved the more objective less experience-dependent and more time-efficient character of DNA metabarcoding However not allthe species in the reference library could be identified at species-level owing to low inter-specific divergence We suggest that multiple group-specific markers to be incorporatedin the future as in De Barba et al (2014) Two species Carex thunbergii and Fabaceae spwere only discovered via microhistologic analysis rather than metabarcoding This failuremight reflect the biased fragment amplification of current technology of which dominanttemplates could act as inhibitors of less dominant species (Pintildeol et al 2015) Howeverthree species of Poaceae were only discovered using metabarcoding In total more taxaand higher resolution were attained using metabarcoding But microhistology still proveda powerful supplementary Previous studies using metabarcoding usually detected dozensof food items even as many as more than one hundred species For instance 18 taxaprey were identified for leopard cat (Prionailurus bengalensis) (Shehzad et al 2012) 44plant taxa were recovered in feces of red-headed wood pigeon (Columba janthina nitens)(Ando et al 2013) while more than 100 taxa were found in diet studies of brown bear(Ursus arctos) (De Barba et al 2014) The relatively narrow diet spectrum of herbivorousgeese may lead to misunderstanding that this result of our study is merely an artefact dueto small sampling effort However this result is credible since these two geese species onlyfeed on Carex meadow where the dominant vegetation is Carex spp with other speciessuch as Poaceae and dicots (Zhao et al 2015) Even though other wetland plants exist theyusually composed only a small proportion of the geese diets

Quantification of food composition is another key concern in dietary analysis Althoughthe relative percentage of sequences was not truly a quantitative estimate of diet taxa ofthe majority sequences in this study were in accord with microhistologic observationswhich was considered an efficient way to provide quantitative results (Wang et al 2013)Discrepancies might come from the semi-quantitative nature of metabarcoding methods(Sun et al 2015) This is likely derived from PCR amplification which always entails biasescaused by universal primer-template mismatches annealing temperature or number ofPCR cycles (Zhan et al 2014Pintildeol et al 2015) Othermethods such as shotgun sequencingor metagenomic sequencing could be incorporated in the future to give information onabundances of food items (Srivathsan et al 2015)

Implications for waterbird conservation and wetland managementFor long-distance migratory waterbirds such as the wild geese in this study theirabundance and distribution are greatly influenced by diet availability and habitat use

Yang et al (2016) PeerJ DOI 107717peerj2345 1321

(Wang et al 2013) For example waterbirds may be restricted at (forced to leave) certainareas due to favoring (loss) of particular food (Wang et al 2013) while the recoveryof such food may contribute to return of bird populations (Noordhuis Van der Molenamp Van den Berg 2002) Results of both metabarcoding and microhistologic analysis inthis study revealed that Carex and Poaceae were dominant food components which is inaccordance with previous studies The increasing number of these two geese wintering atthe Shengjin Lake may be attributed to the expansion of Carex meadow which offers accessto abundant food resources (Zhao et al 2015) Considering the long-distance migratorycharacter of these birds it is important to maintain energy balances and good bodyconditions in wintering areas because this might further influence their departure datesand reproductive success after arriving at breeding areas (Prop Black amp Shimmings 2003)Based on this it is important for wetland managers to maintain the suitable habitats andfood resources for sustainable conservation of waterbirds which highlights the significanceof diet information Our study also indicated that overlap and dissimilarity existed betweenthe diets of these two geese Animals foraging in the same habitats may compete for limitedfood resources (Madsen amp Mortensen 1987) This discrepancy of food composition mayarise from the avoidance of inter-specific competition (Zhao et al 2015) However withthe increase of these two species in Shengjin Lake further research is needed to investigatethe mechanisms of food resource partitioning and spatial distribution

Shengjin Lake is one of themost important wintering sites for tens of thousands ofmigra-tory watebirds while annual life cycles of these birds depend on the whole migratory routeincluding breeding sites stop-over sites and wintering sites (Kear 2005) Thus a molecularreference library covering all the potential food items along the whole migratory routewill be useful both for understanding of wetland connections and waterbird conservationIn addition the ability of DNA metabarcoding to process lots of samples simultaneouslyenables rapid analyses and makes this method helpful for waterbird studies

ACKNOWLEDGEMENTSWe are very grateful to the stuff of the Shengjin Lake National Nature Reserve for theirexcellent assistance during the field work Great thanks to Zhujun Wang and An An forfeces collection in the field We thank Song Yang for collecting plants in the Shengjin LakeReserve We also thank Profs Zhenyu Li and Shuren Zhang for plant identification Specialthanks to Drs Meijuan Zhao Xin Wang Fanjuan Meng and Peihao Cong for preparingthe epidermis database and guiding microhistologic analysis

ADDITIONAL INFORMATION AND DECLARATIONS

FundingThis work was supported by the National Basic Research Program of China (973 ProgramGrant No 2012CB956104) to EZ the National Natural Science Foundation of China(Grant No 31370416) State Key Laboratory of Urban and Regional Ecology ChineseAcademy of Sciences (No SKLURE2013-1-05) to LC the National Natural Science

Yang et al (2016) PeerJ DOI 107717peerj2345 1421

Foundation of China (Grant No 31500315) to XW and 100-Talent Program of theChinese Academy of Sciences to AZ The funders had no role in study design datacollection and analysis decision to publish or preparation of the manuscript

Grant DisclosuresThe following grant information was disclosed by the authorsNational Basic Research Program of China 2012CB956104National Natural Science Foundation of China 31370416State Key Laboratory of Urban and Regional EcologyChinese Academy of Sciences SKLURE2013-1-05National Natural Science Foundation of China 31500315

Competing InterestsThe authors declare there are no competing interests

Author Contributionsbull Yuzhan Yang conceived and designed the experiments performed the experimentsanalyzed the data contributed reagentsmaterialsanalysis tools wrote the paperprepared figures andor tables reviewed drafts of the paperbull Aibin Zhan conceived and designed the experiments analyzed the data contributedreagentsmaterialsanalysis tools reviewed drafts of the paperbull Lei Cao conceived and designed the experiments contributed reagentsmaterialsanalysistools reviewed drafts of the paperbull Fanjuan Meng and Wenbin Xu performed the experiments contributed reagentsmate-rialsanalysis tools reviewed drafts of the paper

Animal EthicsThe following information was supplied relating to ethical approvals (ie approving bodyand any reference numbers)

Our research work did not involve capture or any direct manipulation or disturbancesof animals

Field Study PermissionsThe following information was supplied relating to field study approvals (ie approvingbody and any reference numbers)

We collected samples of plants and feces for molecular analyses We received permissionto access the reserve for sampling from the Shengjin Lake National Nature ReserveAdministration (Chizhou Anhui China) which is responsible for the management of theprotected area and wildlife We were forbidden to capture or disturb geese in the field

DNA DepositionThe following information was supplied regarding the deposition of DNA sequences

NCBI Sequence Read Archive (SRA) databaseAccession number SRP070470

Yang et al (2016) PeerJ DOI 107717peerj2345 1521

Data AvailabilityThe following information was supplied regarding data availability

The raw data has been supplied as a Supplemental Dataset

Supplemental InformationSupplemental information for this article can be found online at httpdxdoiorg107717peerj2345supplemental-information

REFERENCESAltschul SF GishWMillerWMyers EW Lipman DJ 1990 Basic local alignment

search tool Journal of Molecular Biology 215403ndash410 DOI 101006jmbi19909999Aacutelvarez I Wendel JF 2003 Ribosomal ITS sequences and plant phylogenetic inference

Molecular Phylogenetics and Evolution 29417ndash434DOI 101016S1055-7903(03)00208-2

Ando H Setsuko S Horikoshi K Suzuki H Umehara S Inoue-MurayamaM IsagiY 2013 Diet analysis by nextmdashgeneration sequencing indicates the frequentconsumption of introduced plants by the critically endangered redmdashheaded woodpigeon (Columba janthina nitens) in oceanic island habitats Ecology and Evolution34057ndash4069 DOI 101002ece3773

Barco A RaupachMJ Laakmann S NeumannH Knebelsberger T 2015 Identificationof North Sea molluscs with DNA barcodingMolecular Ecology Resources 16288ndash297DOI 1011111755-099812440

Bik HM Porazinska DL Creer S Caporaso JG Knight R ThomasWK 2012 Sequenc-ing our way towards understanding global eukaryotic biodiversity Trends in Ecologyand Evolution 27233ndash243 DOI 101016jtree201111010

Bohmann K Evans A Gilbert MTP Carvalho GR Creer S KnappM YuWD DeBruynM 2014 Environmental DNA for wildlife biology and biodiversity monitor-ing Trends in Ecology and Evolution 29358ndash367 DOI 101016jtree201404003

Brander LM Florax RJ Vermaat JE 2006 The empirics of wetland valuation acomprehensive summary and a meta-analysis of the literature Environmental andResource Economics 33223ndash250 DOI 101007s10640-005-3104-4

Chase MW Cowan RS Hollingsworth PM Van den Berg C Madrintildeaacuten S Petersen GSeberg O Jorgsensen T Cameron KM Carine M Pedersen N Hedderson TAJConrad F Salazar GA Richardson JE HollingsworthM Barraclough TG KellyL WilkinsonM 2007 A proposal for a standardised protocol to barcode all landplants Taxon 56295ndash299

Cota-Sanchez JH Remarchuk K Ubayasena K 2006 Ready-to-use DNA extracted witha CTAB method adapted for herbarium specimens and mucilaginous plant tissuePlant Molecular Biology Reporter 24161ndash167 DOI 101007BF02914055

Creer S Fonseca VG Porazinska DL Giblin-Davis RM SungW Power DM Packer MCarvalho GR Blaxter ML Lambshead P ThomasWK 2010 Ultrasequencing ofthe meiofaunal biosphere Practice pitfalls and promisesMolecular Ecology 194ndash20DOI 101111j1365-294X200904473x

Yang et al (2016) PeerJ DOI 107717peerj2345 1621

De BarbaMMiquel C Boyer F Mercier C Rioux D Coissac E Taberlet P 2014 DNAmetabarcoding multiplexing and validation of data accuracy for diet assessmentapplication to omnivorous dietMolecular Ecology Resources 14306ndash323DOI 1011111755-099812188

Deagle BE Kirkwood R Jarman SN 2009 Analysis of Australian fur seal diet bypyrosequencing prey DNA in faecesMolecular Ecology 182022ndash2038DOI 101111j1365-294X200904158x

Dunning LT Savolainen V 2010 Broad-scale amplification of matK for DNA barcodingplants a technical note Botanical Journal of the Linnean Society 1641ndash9DOI 101111j1095-8339201001071x

Edgar RC 2010 Search and clustering orders of magnitude faster than BLAST Bioinfor-matics 262460ndash2461 DOI 101093bioinformaticsbtq461

Edgar RC 2013 UPARSE highly accurate OTU sequences from microbial ampliconreads Nature Methods 10996ndash998 DOI 101038NMETH2604

Elliott TL Jonathan Davies T 2014 Challenges to barcoding an entire floraMolecularEcology Resources 14883ndash891 DOI 1011111755-099812277

Fazekas AJ Burgess KS Kesanakurti PR Graham SW Newmaster SG Husband BCPercy DM Hajibabaei M Barrett SC 2008Multiple multilocus DNA barcodesfrom the plastid genome discriminate plant species equally well PLoS ONE 3e2802DOI 101371journalpone0002802

Fonseca VG Carvalho GR SungW Johnson HF Power DM Neill SP Packer MBlaxter ML Labmshead PJD ThomasWK Creer S 2010 Second-generationenvironmental sequencing unmasks marine metazoan biodiversity Nature Commu-nications 1Article 98 DOI 101038ncomms1095

Ford CS Ayres KL Haider N Toomey N Van-Alpen-Stohl J 2009 Selection ofcandidate DNA barcoding regions for use on land plants Botanical Journal of theLinnean Society 1591ndash11 DOI 101111j1095-8339200800938x

Fox AD Bergersen E Tombre IM Madsen J 2007Minimal intra-seasonal dietaryoverlap of barnacle and pink-footed geese on their breeding grounds in SvalbardPolar Biology 30759ndash768 DOI 101007s00300-006-0235-1

Fox AD Cao L Zhang Y Barter M ZhaoMMeng FWang S 2011 Declines in thetuber-feeding waterbird guild at Shengjin Lake National Nature Reserve Chinandashabarometer of submerged macrophyte collapse Aquatic Conservation Marine andFreshwater Ecosystems 2182ndash91 DOI 101002aqc1154

Graham SW Reeves PA Burns AC Olmstead RG 2000Microstructural changes innoncoding chloroplast DNA interpretation evolution and utility of indels andinversions in basal angiosperm phylogenetic inference International Journal of PlantSciences 161S83ndashS96 DOI 101086317583

HasebeM Omori T NakazawaM Sano T KatoM Iwatsuki K 1994 rbcL genesequences provide evidence for the evolutionary lineages of leptosporangiate fernsProceedings of the National Academy of Sciences of the United States of America915730ndash5734 DOI 101073pnas91125730

Yang et al (2016) PeerJ DOI 107717peerj2345 1721

Hibert F Taberlet P Chave J Scotti-Saintagne C Sabatier D Richard-Hansen C 2013Unveiling the diet of elusive rainforest herbivores in next generation sequencing eraThe tapir as a case study PLoS ONE 8e60799 DOI 101371journalpone0060799

Hilu KW Liang H 1997 ThematK gene sequence variation and application in plantsystematics American Journal of Botany 84830ndash839 DOI 1023072445819

Hollingsworth PM Graham SW Little DP 2011 Choosing and using a plant DNAbarcode PLoS ONE 6e19254 DOI 101371journalpone0019254

James HF Burney DA 1997 The diet and ecology of Hawaiirsquos extinct flightless water-fowl evidence from coprolites Biological Journal of the Linnean Society 62279ndash297DOI 101111j1095-83121997tb01627x

Kear J 2005Ducks geese and swans Oxford Oxford University PressKressWJ Erickson DL Jones FA Swenson NG Perez R Sanjur O Bermingham E

2009 Plant DNA barcodes and a community phylogeny of a tropical forest dynamicsplot in Panama Proceedings of the National Academy of Sciences of the United States ofAmerica 10618621ndash18626 DOI 101073pnas0909820106

Li X Yang Y Henry RJ Rossetto MWang Y Chen S 2015 Plant DNA barcoding fromgene to genome Biological Reviews 90157ndash166 DOI 101111brv12104

Ma Z Cai Y Li B Chen J 2010Managing wetland habitats for waterbirds an interna-tional perspectiveWetlands 3015ndash27 DOI 101007s13157-009-0001-6

Madsen J Mortensen CE 1987Habitat exploitation and interspecific competition ofmoulting geese in East Greenl Ibis 12925ndash44DOI 101111j1474-919X1987tb03157x

Noordhuis R Van der Molen DT Van den BergMS 2002 Response of herbivorouswater-birds to the return of Chara in Lake Veluwemeer The Netherlands AquaticBotany 72349ndash367 DOI 101016S0304-3770(01)00210-8

Parameswaran P Jalili R Tao L Shokralla S Gharizadeh B Ronaghi M Fire AZ 2007A pyrosequencing-tailored nucleotide barcode design unveils opportunities for large-scale sample multiplexing Nucleic Acids Research 35e130 DOI 101093nargkm760

Pintildeol J Mir G Gomez-Polo P Agustiacute N 2015 Universal and blocking primermismatches limit the use of high-throughput DNA sequencing for the quanti-tative metabarcoding of arthropodsMolecular Ecology Resources 15819ndash830DOI 1011111755-099812355

Pompanon F Deagle BE SymondsonWO Brown DS Jarman SN Taberlet P 2012Who is eating what diet assessment using next generation sequencingMolecularEcology 211931ndash1950 DOI 101111j1365-294X201105403x

Prop J Black JM Shimmings P 2003 Travel schedules to the high arctic barnaclegeese trademdashoff the timing of migration with accumulation of fat deposits Oikos103403ndash414 DOI 101034j1600-0706200312042x

Rayeacute G Miquel C Coissac E Redjadj C Loison A Taberlet P 2011 New insightson diet variability revealed by DNA barcoding and high-throughput sequenc-ing chamois diet in autumn as a case study Ecological Research 26265ndash276DOI 101007s11284-010-0780-5

Yang et al (2016) PeerJ DOI 107717peerj2345 1821

Reynolds C Miranda NA Cumming GS 2015 The role of waterbirds in the dispersalof aquatic alien and invasive species Diversity and Distributions 21744ndash754DOI 101111ddi12334

Samelius G Alisauskas RT 1999 Diet and growth of glaucous gulls at a large Arcticgoose colony Canadian Journal of Zoology 771327ndash1331 DOI 101139z99-091

Sang T Crawford DJ Stuessy TF 1997 Chloroplast DNA phylogeny reticulateevolution and biogeography of Paeonia (Paeoniaceae) American Journal of Botany841120ndash1136

ShehzadW Riaz T NawazMAMiquel C Poillot C Shah SA Pompanon F CoissacE Taberlet P 2012 Carnivore diet analysis based on next-generation sequencingapplication to the leopard cat (Prionailurus bengalensis) in PakistanMolecularEcology 211951ndash1965 DOI 101111j1365-294X201105424x

Shokralla S Gibson JF Nikbakht H Janzen DH HallwachsW Hajibabaei M 2014Next-generation DNA barcoding using next-generation sequencing to enhance andaccelerate DNA barcode capture from single specimensMolecular Ecology Resources14892ndash901 DOI 1011111755-099812236

Shokralla S Spall JL Gibson JF 2012 Next-generation sequencing technologies forenvironmental DNA researchMolecular Ecology 211794ndash1805DOI 101111j1365-294X201205538x

Srivathsan A Sha J Vogler AP Meier R 2015 Comparing the effectiveness of metage-nomics and metabarcoding for diet analysis of a leaf-feeding monkey (Pygathrixnemaeus)Molecular Ecology Resources 15250ndash261 DOI 1011111755-099812302

Starr JR Naczi RFC Chouinard BN 2009 Plant DNA barcodes and species res-olution in sedge (Carex Cyperaceae)Molecular Ecology Resources 9151ndash163DOI 101111j1755-0998200902640x

Sun C Zhao Y Li H Dong Y MacIsaac HJ Zhan A 2015 Unreliable quantificationof species abundance based on high-throughput sequencing data of zooplanktoncommunities Aquatic Biology 249ndash15 DOI 103354ab00629

Swennen C Yu YT 2005 Food and feeding behavior of the black-faced SpoonbillWaterbirds 2819ndash27 DOI 1016751524-4695(2005)028[0019FAFBOT]20CO2

SymondsonWOC 2002Molecular identification of prey in predator dietsMolecularEcology 11627ndash641 DOI 101046j1365-294X200201471x

Taberlet P Coissac E Pompanon F Brochmann CWillerslev E 2012 Towards next-generation biodiversity assessment using DNA metabarcodingMolecular Ecology212045ndash2050 DOI 101111j1365-294X201205470x

Taberlet P Coissac E Pompanon F Gielly L Miquel C Valentini A Vermat TCorthier G Brocmann CWillerslev E 2007 Power and limitations of the chloro-plast trnL (UAA) intron for plant DNA barcoding Nucleic Acids Research 35e14DOI 101093nargkl938

Taberlet P Gielly L Pautou G Bouvet J 1991 Universal primers for amplifica-tion of three non-coding regions of chloroplast DNA Plant Molecular Biology171105ndash1109 DOI 101007BF00037152

Yang et al (2016) PeerJ DOI 107717peerj2345 1921

Tamura K Stecher G Peterson D Filipski A Kumar S 2013MEGA6 molecular evolu-tionary genetics analysis version 60MolecularBiology and Evolution 302725ndash2729DOI 101093molbevmst197

Tate JA Simpson BB 2003 Paraphyly of Tarasa (Malvaceae) and diverse origins of thepolyploidy species Systematic Botany 28723ndash737 DOI 10104302-641

Valentini A Pompanon F Taberlet P 2009 DNA barcoding for ecologists Trends inEcology and Evolution 24110ndash117 DOI 101016jtree200809011

Wang X Fox AD Cong P Barter M Cao L 2012 Changes in the distribution and abun-dance of wintering Lesser White-fronted Geese Anser erythropus in eastern ChinaBird Conservation International 22128ndash134 DOI 101017S095927091100030X

Wang X Fox AD Cong P Cao L 2013 Food constraints explain the restricted dis-tribution of wintering Lesser white-fronted Geese Anser erythropus in China Ibis155576ndash592 DOI 101111ibi12039

Whitlock BA Hale AM Groff PA 2010 Intraspecific inversions pose a challenge for thetrnH-psbA plant DNA barcode PLoS ONE 5e11533DOI 101371journalpone0011533

Wolfe KH LiWH Sharp PM 1987 Rates of nucleotide substitution vary greatlyamong plant mitochondrial chloroplast and nuclear DNAs Proceedings ofthe National Academy of Sciences of the United States of America 849054ndash9058DOI 101073pnas84249054

Xu C DongW Shi S Cheng T Li C Liu YWu PWuH Gao P Zhou S 2015 Acceler-ating plant DNA barcode reference library construction using herbarium specimensimproved experimental techniquesMolecular Ecology Resources 151366ndash1374DOI 1011111755-099812413

Xu L XuW Sun Q Zhou Z Shen J Zhao X 2008 Flora and vegetation in ShengjinLake Journal of Wuhan Botanical Research 27264ndash270

Zhan A Bailey SA Heath DDMacIsaac HJ 2014 Performance comparison of geneticmarkers for high-throughput sequencing-based biodiversity assessment in complexcommunitiesMolecular Ecology Resources 141049ndash1059DOI 1011111755-099812254

Zhan A MacIsaac HJ 2015 Rare biosphere exploration using high-throughput sequenc-ing research progress and perspectives Conservation Genetics 16513ndash522DOI 101007s10592-014-0678-9

Zhang Y Cao L Barter M Fox AD ZhaoMMeng F Shi H 2011 Changing distributionand abundance of Swan Goose Anser cygnoides in the Yangtze River floodplainthe likely loss of a very important wintering site Bird Conservation International2136ndash48 DOI 101017S0959270910000201

ZhaoM Cao L Fox AD 2013 Distribution and diet of wintering Tundra Bean GeeseAnser fabalis serrirostris at Shengjin LakeYangtze River floodplain ChinaWildfow6052ndash63

ZhaoM Cao L KlaassenM Zhang Y Fox AD 2015 Avoiding competition Site usediet and foraging behaviours in two similarly sized geese wintering in China Ardea10327ndash38 DOI 105253ardev103i1a3

Yang et al (2016) PeerJ DOI 107717peerj2345 2021

ZhaoM Cong P Barter M Fox AD Cao L 2012 The changing abundanceand distribu-tion of Greater White-fronted Geese Anser albifrons in the Yangtze River floodplainimpacts of recenthydrological changes Bird Conservation International 22135ndash143DOI 101017S0959270911000542

Yang et al (2016) PeerJ DOI 107717peerj2345 2121

Page 5: Selection of a marker gene to construct and the application ...We concluded that DNA metabarcoding provides new perspectives for studies of herbivorous waterbird diets and inter-specific

Figure 2 The location of our study area Shengjin Lake National Nature Reserve and our samplingsites (Source httperosusgsgov)

All feces were collected at the reserve (Fig 2) in January 2015 Based on previousstudies and the latest waterbird survey sites with large flocks of geese (ie more than 200individuals) were chosen (Zhang et al 2011) As soon as geese finished feeding and feceswere defecated fresh droppings were picked and stored on dry ice Droppings of bean geesewere generally thicker than those of smaller greater white-fronted goose to the degree thatthese could be reliably distinguished in the field (Zhao et al 2015) Disposal gloves werechanged for each sample to avoid cross contamination To avoid repeated sampling andto make sure samples were from different individuals each sample was collected with aseparation of more than 2 m In total 21 feces were collected including 11 for greaterwhite-fronted goose and 10 for bean goose All samples were transported to laboratory ondry ice and then stored at minus80 C until further analysis

Selection of molecular markers and corresponding primersHere we aimed to select gene markers with adequate discriminating power for ourstudy We included eight chloroplast genesmdash rbcL rpoC1 rpoB matK trnH-psbAtrnL (UAA) atpF-atpH and psbK-psbI for estimation Although Shengjin Lake includedan array of plant species we focused mainly on the most likely food resources(Xu et al 2008 Zhao et al 2015) that geese would consume for candidate gene testsThese covered 11 genera and the family Poaceae (Table S1) For tests of all candidate geneswe recovered sequences of representative species in the selected groups from GenBank(httpwwwncbinlmnihgovnuccore) We calculated inter-specific divergence within

Yang et al (2016) PeerJ DOI 107717peerj2345 521

every genus or family based on the Kiruma 2-parameter model (K2P) using MEGA version6 (Tamura et al 2013) We also constructed molecular trees based on UPGMA usingMEGA and characterized the resolution of species by calculating the percentage of speciesrecovered as monophyletic based on phylogenetic trees (Rf) Secondly primers selectedout of eight candidate genes were used to amplify all specimens collected in ShengjinLake and to check their amplification efficiency and universality Thirdly we calculatedinter-specific divergence based on sequences that we obtained from last step Generallya robust barcode gene is obtained when the minimal inter-specific distance exceeds themaximal intra-specific distance (eg existence of barcoding gaps) Finally to allow therecognition of sequences after high-throughput sequencing both of the forward andreverse primers of the selected marker gene were tagged specifically for each sample with8nt nucleotide codes at the 5primeend (Parameswaran et al 2007)

DNA extraction amplification and sequencingTwo hundred milligrams of leaf was used to extract the total DNA from each plant sampleusing a modified CTAB protocol (Cota-Sanchez Remarchuk amp Ubayasena 2006) DNAextraction of feces was carried out using the same protocol with minor modification inincubation time (elongate to 12 h) Each fecal sample was crushed thoroughly and dividedinto four quarters All quarters of DNA extracts were then pooled together DNA extractionwas carried out in a clean room used particularly for this study For each batch of DNAextraction negative controls (ie extraction without feces) were included to monitorpossible contamination

For plant DNA extracts PCR amplifications were carried out in a volume of 25 microl withsim100 ng total DNA as template 1U of Taq Polymerase (Takara Dalian Liaoning ProvinceChina) 1times PCR buffer 2 mM of Mg2 + 025 mM of dNTPs 01 microM of forward primerand 01 microM of reverse primer After 4 min at 94 C the PCR cycles were as follows 35cycles of 30 s at 94 C 30 s at 56 C and 45 s at 72 C and the final extension was 10min at 72 C We applied the same PCR conditions for all primers All the successful PCRproducts were sequenced with Genewiz (Suzhou Jiangsu Province China)

For fecal DNA extracts PCR mixtures (25 microl) were prepared in six replicates foreach sample to reduce biased amplification Each replicate was subjected to the sameamplification procedure used for plant extracts The six replicates of each sample werepooled and purified using the Sangon PCR product purification kit (Sangon BiotechShanghai China) Quantification was carried out to ensure equilibrium of contributionof each sample using the NanoDrop ND-2000 UV-Vis Spectrophotometer (NanoDropTechnologies Wilmington Delaware USA) High-throughput sequencing was performedusing Illumina MiSeq platform following manufacturerrsquos instructions by BGI (ShenzhenGuangdong Province China) Reads of high-throughput sequencing could be found atNCBIrsquos Sequence Read Archive (Accession number SRP070470)

Data analysis for estimating diet compositionAfter high-throughput sequencing pair-ended readsweremergedwith the fastq_mergepairscommand using usearch (httpdrive5comusearch Edgar 2010) Reads were then split

Yang et al (2016) PeerJ DOI 107717peerj2345 621

into independent files according to unique tags using the initial process of RDP pipeline(httpspyrocmemsueduinitformspr) We removed sequences (i) that didnrsquot perfectlymatch tags and primer sequences (ii) that contained ambiguous nucleotide (Nrsquos) Tagsand primers were then trimmed using the initial process of RDP pipeline Further qualityfiltering using usearch discarded sequences with (i) quality score less than 30 (ltQ30) and(ii) length shorter than 100 bp Unique sequences were clustered to operational taxonomyunits (OTUs) at the similarity threshold of 98 (Edgar 2013) All OTUs were assignedto unique taxonomy with local blast 2230+ (Altschul et al 1990) We detected a plantwithin the reference library for each sequence with the threshold of length coverage gt98identity gt98 and e-value lt 10eminus50 If a query sequence matched two or more taxa itwas assigned to a higher taxonomic level which included all taxa

Microhistology analysisWe used the method described by Zhang et al (2011) to perform microhistologicexamination of fecal samples Each sample was first washed with pure water and filteredwith a 25-microm filter Subsequently the suspension was examined under a light microscopeat 10times magnification for quantification statistics and at 40times magnification for speciesidentification We compared photos of visible fragments with an epidermis database ofplants from Shengjin Lake to identify food items (Zhang et al 2011)

RESULTSSelection of genes and corresponding primers and reference libraryconstructionA total of 3296 representative sequences were recovered from GenBank ranging from 0to 345 sequences per gene per taxon (Table S1) For Eleocharis and Trapa only sequencesof rbcL gene and trnL gene were retained which makes it unfair to compare the efficiencyand suitability of eight candidate genes For the other ten taxa trnL trnH-psbA rbcL andpsbK-psbI showed the largest inter-specific divergence in five three one and one taxonomicgroups respectively In addition trnH-psbA atpF-atpH trnL and psbK-psbI showed thehighest mean divergence in four four one and one taxonomic groups respectivelyHowever given the small number of sequences and coverage of species the suitability andefficiency of atpF-atpH and psbK-psbI seem to be less reliable than others This comparisonmakes trnH-psbA trnL and rbcL to be selected out of the eight candidate genes As matKused to be recommended as the standard barcode gene for Carex species (Starr Naczi ampChouinard 2009) which happened to be the dominant food for herbivorous geese in ourstudy (Zhao et al 2015) we included matK as a supplement at last

Primers for these four genes (Table 1) were used to amplify the plants that we collectedin the field In total we collected 88 specimens in the field belonging to 25 families53 genera and 70 species (Table 2) The selected primers for trnL and rbcL successfullyamplified 100 and 91 of all species respectively while primers for trnH-psbA andmatKamplified only 71 and 43 respectively Therefore we chose trnL and rbcL to test theirdiscriminating power in our target plants

Yang et al (2016) PeerJ DOI 107717peerj2345 721

Table 1 Primers of candidate genes and reference library constructingOnly the c and h were used forhigh-throughput sequencing in fusion primer mode (primer+ tags) The unique tags were used to differ-entiate PCR products pooled together for highthroughput sequencing (Parameswaran et al 2007)

Gene Primer Sequence (5prime-3prime)

matK matK-XFa TAATTTACGATCAATTCATTCmatK-MALPb ACAAGAAAGTCGAAGTAT

rbcL rbcLa-Fc ATGTCACCACAAACAGAGACTAAAGCrbcLa-Rd GTAAAATCAAGTCCACCRCG

trnH-psbA pasbA3_fe CGCGCATGGTGGATTCACAATCCtrnHf_05f GTTATGCATGAACGTAATGCTC

trnL cg CGAAATCGGTAGACGCTACGhh CCATTGAGTCTCTGCACCTATC

Notesareferred to Ford et al (2009)breferred to Dunning amp Savolainen (2010)creferred to Hasebe et al (1994)dreferred to Kress et al (2009)ereferred to Tate amp Simpson (2003)freferred to Sang Crawford amp Stuessy (1997)greferred to Taberlet et al (1991)hreferred to Taberlet et al (2007)

We calculated the inter-specific divergence within genera and families with at leasttwo species to compare their discriminating power Maximal minimal and mean inter-specific distances were calculated for seven dominant genera and six dominant families(Table 3) Neither gene could differentiate species of Vallisneria (mean= 0000 plusmn 0000)or Artemisia (mean = 0000 plusmn 0000) But trnL showed a larger divergence rangefor the other six genera and five families Hence we chose trnL as the barcoding genefor reference library constructing and high-throughput sequencing for our study Thediscriminating power of trnL was strong for most species (Table 4) However some speciescould only be identified at genus-level or family-level with trnL For instance five speciesof Potamogetonaceae shared the same sequences and this made them to be identified atgenus-level Species could be identified easily to genus and family except for three grasses(Poaceae) Beckmannia syzigachne Phalaris arundinacea and Polypogon fugax which sharedidentical sequences

Data processing for estimating diet compositionIn total 021 and 018 million reads were generated for greater white-fronted goose(GWFG) and bean goose (BG) respectively (Table 5) The number of recovered OTUsranged from 8 to 123 for GWFG and BG samples We used local BLAST to compare thesesequences with the Shengjin Lake reference database Finally with DNA metabarocoding12 items were discovered in the feces of GWFG including one at family-level three atgenus-level and eight at species-level (Table 6) Four items were discovered in the feces ofBG including one at genus-level and three at species-level In total this method identified15 taxa in feces of these geese

Yang et al (2016) PeerJ DOI 107717peerj2345 821

Table 2 Plant species in the reference libraryWe collected these samples from Shengjin Lake

Species No of samples Species No of samples

Curculigo orchioides 1 Trapella sinensis 2Artemisia capillaris 1 Plantago asiatica 1Artemisia selengensis 2 Alopecurus aequalis 2Aster subulatus 1 Beckmannia syzigachne 1Bidens frondosa 1 Bromus japonicus 1Erigeron annuus 1 Cynodon dactylon 2Gnaphalium affine 1 Phalaris arundinacea 1Hemistepta lyrata 1 Phragmites australis 1Kalimeris incisa 1 Poa annua 1Bothriospermum kusnezowii 1 Polypogon fugax 1Lobelia chinensis 1 Roegneria kamoji 2Sagina japonica 1 Zizania latifolia 1Stellaria media 2 Polygonum lapathifolium 4Calystegia hederacea 1 Polygonum orientale 1Cardamine lyrata 1 Polygonum perfoliatum 1Carex heterolepis 3 Polygonum persicaria 1Carex capricornis 1 Rumex trisetiferus 3Carex paxii 1 Potamogeton crispus 1Carex remotiuscula 1 Potamogeton maackianus 1Fimbristylis dichotoma 1 Potamogeton malaianus 1Eleocharis migoana 1 Potamogeton natans 1Scripus karuizawensis 1 Potamogeton pectinatus 1Nymphoides peltatum 1 Clematis florida 1Myriophyllum spicatum 1 Ranunculus chinensis 2Hydrilla verticillta 1 Ranunculus sceleratus 2Hydrocharis dubia 1 Potentilla freyniana 2Vallisineria spiralis 1 Gratiola japonica 1Vallisneria spinulosa 1 Mazus miquelii 2Juncus effusus 1 Veronica undulata 1Juncus gracillimus 1 Trapa bispinosa 1Leonurus japonicus 1 Trapa maximowiczii 1Salvia plebeia 1 Trapa pseudoincisa 1Glycine soja 1 Trapa quadrispinosa 1Vicia sativa 1 Hydrocotyle sibthorpioides 1Euryale ferox 1 Torilis japonica 2

However the sequence percentage of each food item varied greatly (Table 6) ForGWFG the majority of sequences (9636) were composed of only five itemsmdashPoaceaespp (4798 except Poa annua) Poa annua (2186) Carex heterolepis (1751) Carexspp (901 except Carex heterolepis) and Alopecurus aequalis (321) For BG almostall the sequences belonged to Carex heterolepis (9949) Other items only occupied arelatively small proportion of sequences In addition the presence of each item per samplewas also unequal (Table S2) For example in GWFG Carex heterolepis Carex spp Poa

Yang et al (2016) PeerJ DOI 107717peerj2345 921

Table 3 Inter-specific divergences within dominant genera and families of rbcL gene and trnL gene with Kiruma 2-Parameter modelUnder-scores indicate the most common food composition based on earlier microhistologic analysis (Zhao et al 2012 Zhao et al 2015)

Inter-specific divergence Taxa rbcL trnL

Maximal Minimal Mean Maximal Minimal Mean

Artemisia 0000 0000 0000plusmn 0000 0000 0000 0000plusmn 0000Carex 0013 0000 0008plusmn 0006 0058 0000 0027plusmn 0021Polygonum 0027 0000 0010plusmn 0006 0076 0000 0033plusmn 0022Potamogeton 0012 0000 0005plusmn 00034 0016 0000 0005plusmn 0005Ranunculus 0031 0000 0020plusmn 0009 0042 0021 0024plusmn 0022Trapa 0000 0000 0000plusmn 0000 0081 0000 0049plusmn 0030

Withingenera

Vallisneria 0000 0000 0000plusmn 0000 0000 0000 0000plusmn 0000Cyperaceae 0043 0000 0018plusmn 0010 0178 0000 0084plusmn 0046Asteraceae 0120 0000 0049plusmn 0017 0087 0000 0023plusmn 0018Poaceae 0025 0000 0016plusmn 00009 0166 0000 0074plusmn 0039Hydrocharitaceae 0122 0000 0078plusmn 0020 0159 0000 0100plusmn 0054Polygonaceae 0043 0000 0020plusmn 0009 0129 0000 0031plusmn 0022

Withinfamilies

Ranunculaceae 0033 0016 0017plusmn 0015 0045 0000 0018plusmn 0013

Table 4 Number of species and unique sequences for families with more than one species in ShengjinLake plant database

Family No of species No of sequences

Asteraceae 8 7Caryophyllaceae 2 2Cyperaceae 7 5Fabaceae 2 2Hydrocharitaceae 4 3Lamiaceae 2 2Poaceae 10 8Polygonaceae 5 5Potamogetonaceae 5 1Ranunculaceae 3 3Scrophulariaceae 3 3Trapaceae 4 3Umbelliferae 2 2

annua and Potentilla supina were present in almost all the samples while Stellaria mediaAsteraceae sp and Lapsana apogonoldes occurred in only about one third of samples

When the microhistologic examination was performed using the same samples eightitems were found in the feces of greater white-fronted goose including one at family-levelfour at genus-level and three at species-level (Table 6) Dominant items were Poaceae spp(4568) Alopecurus Linn (3093) and Carex heterolepis (1639) Seven items werefound in the feces of bean goose including four at genus-level and three at species-level(Table 6) Dominant items were Carex heterolepis (6285) Asteraceae sp (1455) andAlopecurus Linn (1318)

Yang et al (2016) PeerJ DOI 107717peerj2345 1021

Table 5 Summary of the process and results of high-throughput sequencing analysis

Sample Pair-endsequences

Sequences for whichprimers and tags wereidentified and with lengthgt100 bp

Uniquesequences

OTUs Food items

GWFG1 16303 8627 1288 78 8GWFG2 25482 13449 1091 102 8GWFG3 19063 10056 1277 48 10GWFG4 23856 12548 1419 114 8GWFG5 20955 11249 1720 123 9GWFG6 11677 7205 973 52 9GWFG7 13377 6782 1328 59 9GWFG8 7749 3959 774 89 9GWFG9 16833 8799 1436 90 6GWFG10 18474 9819 449 32 9GWFG11 19648 10458 617 31 6BG1 20225 10254 784 23 4BG2 14195 7161 564 16 2BG3 2229 1149 255 12 4BG4 517 268 77 8 3BG5 28152 14033 1000 15 3BG6 16723 8484 740 17 4BG7 30166 15403 974 15 4BG8 30928 15706 1028 15 3BG9 8382 4489 446 13 4BG10 10714 5526 537 13 4

NotesGWFG Greater white-fronted goose BG Bean goose

DISCUSSIONMarker selection and reference library constructing for diet analysisWith greatly reduced cost extremely high throughput and information contentmetabarcoding has revolutionized the exploration and quantification of dietary analysiswith noninvasive samples containing degraded DNA (Fonseca et al 2010 Shokralla et al2014) Despite enormous potential to boost data acquisition successful application of thistechnology relies greatly on the power and efficiency of genetic markers and correspondingprimers (Bik et al 2012 Zhan et al 2014) In order to select the most appropriate markergene for our study we compared the performance of eight commonly used chloroplastgenes (rbcL rpoB rpoC1 matK trnL trnH-psbA atpF-atpH and psbK-psbI) and theircorresponding primers Although a higher level of discriminating power was shown inseveral studies atpF-atpH and psbK-psbI were not as commonly used as other barcodinggenes (Hollingsworth Graham amp Little 2011) As one of the most rapidly evolving codinggenes of plastid genomes matK was considered as the closest plant analogue to theanimal barcode COI (Hilu amp Liang 1997) However matK was difficult to amplify using

Yang et al (2016) PeerJ DOI 107717peerj2345 1121

Table 6 List of the lowest taxonomic food items in the diet of geese

Food items Level of identification GWFG BG

N reads Fs () Fm () N reads Fs () Fm ()

Poaceae spp (except Poa annua) Family 51705 4798 4568 0 000 000Poa annua Species 23554 2186 000 167 020 000Carex heterolepis Species 18867 1751 1639 81457 9949 6285Carex spp (except Carex heterolepis) Genus 9706 901 231 191 023 349Alopecurus aequalis Species 3458 321 000 0 000 000Potentilla chinensis Species 184 017 118 65 008 206Cynodon dactylon Species 155 014 000 0 000 000Polygonum spp Genus 56 005 000 0 000 000Stellaria media Species 26 002 000 0 000 000Ranunculus chinensis Species 14 002 000 0 000 000Lapsana apogonoides Species 11 002 000 0 000 000Asteraceae sp Genus 16 001 233 0 000 1455Alopecurus Genus 0 000 3093 0 000 1318Carex thunbergii Species 0 000 054 0 000 279Fabaceae sp Genus 0 000 064 0 000 108

NotesGWFG Greater white-fronted goose BG Bean goose Fs percentage of sequences in DNA metabarcoding Fm percentage of epidermis squares in microhistological analysis

available primer sets with only 43 of successful amplification in this study In spite ofthe higher species discrimination success of trnH-psbA than rbcL+matK in some groupsthe presence of duplicated loci microinversions and premature termination of readsby mononucleotide repeats lead to considerable proportion of low-quality sequences andover-estimation of genetic difference when using trnH-psbA (Graham et al 2000WhitlockHale amp Groff 2010) In contrast the barcode region of rbcL is easy to amplify sequenceand align in most plants and was recommended as the standard barcode for land plants(Chase et al 2007) The relatively modest discriminating power (compared to trnL) pre-cludes its application for our study aiming to recover high resolution of food items Conse-quently trnL was selected out of eight candidate markers with 100 amplification successmore than 90 of high quality sequences and relatively large inter-specific divergence

One of the biggest obstacles in biodiversity assessment and dietary analysis is the lack ofa comprehensive reference library without which it is impossible to accurately interpretand assign sequences generated from high-throughput sequencing (Valentini Pompanonamp Taberlet 2009 Barco et al 2015) In this study we constructed a local reference libraryby amplifying the most common species (70 morpho-species in total) during the winteringperiod with the trnL gene Although not all of them could be identified at species-levelwith trnL due to relatively low inter-specific divergence many species could be separatedwith distinctive sequences Previous studies have recommended group-specific barcodes todifferentiate closely related plants at the species level (Li et al 2015) For instance matKhas been proved to be more efficient for the discrimination of Carex spp (Starr Naczi ampChouinard 2009) However the primer set of matK failed to amplify species of Carex spp

Yang et al (2016) PeerJ DOI 107717peerj2345 1221

in our study suggesting the universality of selected primer pairs should be tested in eachstudy (Zhan et al 2014)

Applications of metabarcoding for geese diet analysisA variety of recent studies have demonstrated the great potential of metabarcoding fordietary analysis mainly owing to the high throughput high discriminating power and theability to process large-scale samples simultaneously (Creer et al 2010 Taberlet et al 2012Shehzad et al 2012) In this study we applied this method to recover diets of herbivorousgeese and provided standard protocols for dietary analysis of these two ecologicallyimportant waterbirds Our results further proved the more objective less experience-dependent and more time-efficient character of DNA metabarcoding However not allthe species in the reference library could be identified at species-level owing to low inter-specific divergence We suggest that multiple group-specific markers to be incorporatedin the future as in De Barba et al (2014) Two species Carex thunbergii and Fabaceae spwere only discovered via microhistologic analysis rather than metabarcoding This failuremight reflect the biased fragment amplification of current technology of which dominanttemplates could act as inhibitors of less dominant species (Pintildeol et al 2015) Howeverthree species of Poaceae were only discovered using metabarcoding In total more taxaand higher resolution were attained using metabarcoding But microhistology still proveda powerful supplementary Previous studies using metabarcoding usually detected dozensof food items even as many as more than one hundred species For instance 18 taxaprey were identified for leopard cat (Prionailurus bengalensis) (Shehzad et al 2012) 44plant taxa were recovered in feces of red-headed wood pigeon (Columba janthina nitens)(Ando et al 2013) while more than 100 taxa were found in diet studies of brown bear(Ursus arctos) (De Barba et al 2014) The relatively narrow diet spectrum of herbivorousgeese may lead to misunderstanding that this result of our study is merely an artefact dueto small sampling effort However this result is credible since these two geese species onlyfeed on Carex meadow where the dominant vegetation is Carex spp with other speciessuch as Poaceae and dicots (Zhao et al 2015) Even though other wetland plants exist theyusually composed only a small proportion of the geese diets

Quantification of food composition is another key concern in dietary analysis Althoughthe relative percentage of sequences was not truly a quantitative estimate of diet taxa ofthe majority sequences in this study were in accord with microhistologic observationswhich was considered an efficient way to provide quantitative results (Wang et al 2013)Discrepancies might come from the semi-quantitative nature of metabarcoding methods(Sun et al 2015) This is likely derived from PCR amplification which always entails biasescaused by universal primer-template mismatches annealing temperature or number ofPCR cycles (Zhan et al 2014Pintildeol et al 2015) Othermethods such as shotgun sequencingor metagenomic sequencing could be incorporated in the future to give information onabundances of food items (Srivathsan et al 2015)

Implications for waterbird conservation and wetland managementFor long-distance migratory waterbirds such as the wild geese in this study theirabundance and distribution are greatly influenced by diet availability and habitat use

Yang et al (2016) PeerJ DOI 107717peerj2345 1321

(Wang et al 2013) For example waterbirds may be restricted at (forced to leave) certainareas due to favoring (loss) of particular food (Wang et al 2013) while the recoveryof such food may contribute to return of bird populations (Noordhuis Van der Molenamp Van den Berg 2002) Results of both metabarcoding and microhistologic analysis inthis study revealed that Carex and Poaceae were dominant food components which is inaccordance with previous studies The increasing number of these two geese wintering atthe Shengjin Lake may be attributed to the expansion of Carex meadow which offers accessto abundant food resources (Zhao et al 2015) Considering the long-distance migratorycharacter of these birds it is important to maintain energy balances and good bodyconditions in wintering areas because this might further influence their departure datesand reproductive success after arriving at breeding areas (Prop Black amp Shimmings 2003)Based on this it is important for wetland managers to maintain the suitable habitats andfood resources for sustainable conservation of waterbirds which highlights the significanceof diet information Our study also indicated that overlap and dissimilarity existed betweenthe diets of these two geese Animals foraging in the same habitats may compete for limitedfood resources (Madsen amp Mortensen 1987) This discrepancy of food composition mayarise from the avoidance of inter-specific competition (Zhao et al 2015) However withthe increase of these two species in Shengjin Lake further research is needed to investigatethe mechanisms of food resource partitioning and spatial distribution

Shengjin Lake is one of themost important wintering sites for tens of thousands ofmigra-tory watebirds while annual life cycles of these birds depend on the whole migratory routeincluding breeding sites stop-over sites and wintering sites (Kear 2005) Thus a molecularreference library covering all the potential food items along the whole migratory routewill be useful both for understanding of wetland connections and waterbird conservationIn addition the ability of DNA metabarcoding to process lots of samples simultaneouslyenables rapid analyses and makes this method helpful for waterbird studies

ACKNOWLEDGEMENTSWe are very grateful to the stuff of the Shengjin Lake National Nature Reserve for theirexcellent assistance during the field work Great thanks to Zhujun Wang and An An forfeces collection in the field We thank Song Yang for collecting plants in the Shengjin LakeReserve We also thank Profs Zhenyu Li and Shuren Zhang for plant identification Specialthanks to Drs Meijuan Zhao Xin Wang Fanjuan Meng and Peihao Cong for preparingthe epidermis database and guiding microhistologic analysis

ADDITIONAL INFORMATION AND DECLARATIONS

FundingThis work was supported by the National Basic Research Program of China (973 ProgramGrant No 2012CB956104) to EZ the National Natural Science Foundation of China(Grant No 31370416) State Key Laboratory of Urban and Regional Ecology ChineseAcademy of Sciences (No SKLURE2013-1-05) to LC the National Natural Science

Yang et al (2016) PeerJ DOI 107717peerj2345 1421

Foundation of China (Grant No 31500315) to XW and 100-Talent Program of theChinese Academy of Sciences to AZ The funders had no role in study design datacollection and analysis decision to publish or preparation of the manuscript

Grant DisclosuresThe following grant information was disclosed by the authorsNational Basic Research Program of China 2012CB956104National Natural Science Foundation of China 31370416State Key Laboratory of Urban and Regional EcologyChinese Academy of Sciences SKLURE2013-1-05National Natural Science Foundation of China 31500315

Competing InterestsThe authors declare there are no competing interests

Author Contributionsbull Yuzhan Yang conceived and designed the experiments performed the experimentsanalyzed the data contributed reagentsmaterialsanalysis tools wrote the paperprepared figures andor tables reviewed drafts of the paperbull Aibin Zhan conceived and designed the experiments analyzed the data contributedreagentsmaterialsanalysis tools reviewed drafts of the paperbull Lei Cao conceived and designed the experiments contributed reagentsmaterialsanalysistools reviewed drafts of the paperbull Fanjuan Meng and Wenbin Xu performed the experiments contributed reagentsmate-rialsanalysis tools reviewed drafts of the paper

Animal EthicsThe following information was supplied relating to ethical approvals (ie approving bodyand any reference numbers)

Our research work did not involve capture or any direct manipulation or disturbancesof animals

Field Study PermissionsThe following information was supplied relating to field study approvals (ie approvingbody and any reference numbers)

We collected samples of plants and feces for molecular analyses We received permissionto access the reserve for sampling from the Shengjin Lake National Nature ReserveAdministration (Chizhou Anhui China) which is responsible for the management of theprotected area and wildlife We were forbidden to capture or disturb geese in the field

DNA DepositionThe following information was supplied regarding the deposition of DNA sequences

NCBI Sequence Read Archive (SRA) databaseAccession number SRP070470

Yang et al (2016) PeerJ DOI 107717peerj2345 1521

Data AvailabilityThe following information was supplied regarding data availability

The raw data has been supplied as a Supplemental Dataset

Supplemental InformationSupplemental information for this article can be found online at httpdxdoiorg107717peerj2345supplemental-information

REFERENCESAltschul SF GishWMillerWMyers EW Lipman DJ 1990 Basic local alignment

search tool Journal of Molecular Biology 215403ndash410 DOI 101006jmbi19909999Aacutelvarez I Wendel JF 2003 Ribosomal ITS sequences and plant phylogenetic inference

Molecular Phylogenetics and Evolution 29417ndash434DOI 101016S1055-7903(03)00208-2

Ando H Setsuko S Horikoshi K Suzuki H Umehara S Inoue-MurayamaM IsagiY 2013 Diet analysis by nextmdashgeneration sequencing indicates the frequentconsumption of introduced plants by the critically endangered redmdashheaded woodpigeon (Columba janthina nitens) in oceanic island habitats Ecology and Evolution34057ndash4069 DOI 101002ece3773

Barco A RaupachMJ Laakmann S NeumannH Knebelsberger T 2015 Identificationof North Sea molluscs with DNA barcodingMolecular Ecology Resources 16288ndash297DOI 1011111755-099812440

Bik HM Porazinska DL Creer S Caporaso JG Knight R ThomasWK 2012 Sequenc-ing our way towards understanding global eukaryotic biodiversity Trends in Ecologyand Evolution 27233ndash243 DOI 101016jtree201111010

Bohmann K Evans A Gilbert MTP Carvalho GR Creer S KnappM YuWD DeBruynM 2014 Environmental DNA for wildlife biology and biodiversity monitor-ing Trends in Ecology and Evolution 29358ndash367 DOI 101016jtree201404003

Brander LM Florax RJ Vermaat JE 2006 The empirics of wetland valuation acomprehensive summary and a meta-analysis of the literature Environmental andResource Economics 33223ndash250 DOI 101007s10640-005-3104-4

Chase MW Cowan RS Hollingsworth PM Van den Berg C Madrintildeaacuten S Petersen GSeberg O Jorgsensen T Cameron KM Carine M Pedersen N Hedderson TAJConrad F Salazar GA Richardson JE HollingsworthM Barraclough TG KellyL WilkinsonM 2007 A proposal for a standardised protocol to barcode all landplants Taxon 56295ndash299

Cota-Sanchez JH Remarchuk K Ubayasena K 2006 Ready-to-use DNA extracted witha CTAB method adapted for herbarium specimens and mucilaginous plant tissuePlant Molecular Biology Reporter 24161ndash167 DOI 101007BF02914055

Creer S Fonseca VG Porazinska DL Giblin-Davis RM SungW Power DM Packer MCarvalho GR Blaxter ML Lambshead P ThomasWK 2010 Ultrasequencing ofthe meiofaunal biosphere Practice pitfalls and promisesMolecular Ecology 194ndash20DOI 101111j1365-294X200904473x

Yang et al (2016) PeerJ DOI 107717peerj2345 1621

De BarbaMMiquel C Boyer F Mercier C Rioux D Coissac E Taberlet P 2014 DNAmetabarcoding multiplexing and validation of data accuracy for diet assessmentapplication to omnivorous dietMolecular Ecology Resources 14306ndash323DOI 1011111755-099812188

Deagle BE Kirkwood R Jarman SN 2009 Analysis of Australian fur seal diet bypyrosequencing prey DNA in faecesMolecular Ecology 182022ndash2038DOI 101111j1365-294X200904158x

Dunning LT Savolainen V 2010 Broad-scale amplification of matK for DNA barcodingplants a technical note Botanical Journal of the Linnean Society 1641ndash9DOI 101111j1095-8339201001071x

Edgar RC 2010 Search and clustering orders of magnitude faster than BLAST Bioinfor-matics 262460ndash2461 DOI 101093bioinformaticsbtq461

Edgar RC 2013 UPARSE highly accurate OTU sequences from microbial ampliconreads Nature Methods 10996ndash998 DOI 101038NMETH2604

Elliott TL Jonathan Davies T 2014 Challenges to barcoding an entire floraMolecularEcology Resources 14883ndash891 DOI 1011111755-099812277

Fazekas AJ Burgess KS Kesanakurti PR Graham SW Newmaster SG Husband BCPercy DM Hajibabaei M Barrett SC 2008Multiple multilocus DNA barcodesfrom the plastid genome discriminate plant species equally well PLoS ONE 3e2802DOI 101371journalpone0002802

Fonseca VG Carvalho GR SungW Johnson HF Power DM Neill SP Packer MBlaxter ML Labmshead PJD ThomasWK Creer S 2010 Second-generationenvironmental sequencing unmasks marine metazoan biodiversity Nature Commu-nications 1Article 98 DOI 101038ncomms1095

Ford CS Ayres KL Haider N Toomey N Van-Alpen-Stohl J 2009 Selection ofcandidate DNA barcoding regions for use on land plants Botanical Journal of theLinnean Society 1591ndash11 DOI 101111j1095-8339200800938x

Fox AD Bergersen E Tombre IM Madsen J 2007Minimal intra-seasonal dietaryoverlap of barnacle and pink-footed geese on their breeding grounds in SvalbardPolar Biology 30759ndash768 DOI 101007s00300-006-0235-1

Fox AD Cao L Zhang Y Barter M ZhaoMMeng FWang S 2011 Declines in thetuber-feeding waterbird guild at Shengjin Lake National Nature Reserve Chinandashabarometer of submerged macrophyte collapse Aquatic Conservation Marine andFreshwater Ecosystems 2182ndash91 DOI 101002aqc1154

Graham SW Reeves PA Burns AC Olmstead RG 2000Microstructural changes innoncoding chloroplast DNA interpretation evolution and utility of indels andinversions in basal angiosperm phylogenetic inference International Journal of PlantSciences 161S83ndashS96 DOI 101086317583

HasebeM Omori T NakazawaM Sano T KatoM Iwatsuki K 1994 rbcL genesequences provide evidence for the evolutionary lineages of leptosporangiate fernsProceedings of the National Academy of Sciences of the United States of America915730ndash5734 DOI 101073pnas91125730

Yang et al (2016) PeerJ DOI 107717peerj2345 1721

Hibert F Taberlet P Chave J Scotti-Saintagne C Sabatier D Richard-Hansen C 2013Unveiling the diet of elusive rainforest herbivores in next generation sequencing eraThe tapir as a case study PLoS ONE 8e60799 DOI 101371journalpone0060799

Hilu KW Liang H 1997 ThematK gene sequence variation and application in plantsystematics American Journal of Botany 84830ndash839 DOI 1023072445819

Hollingsworth PM Graham SW Little DP 2011 Choosing and using a plant DNAbarcode PLoS ONE 6e19254 DOI 101371journalpone0019254

James HF Burney DA 1997 The diet and ecology of Hawaiirsquos extinct flightless water-fowl evidence from coprolites Biological Journal of the Linnean Society 62279ndash297DOI 101111j1095-83121997tb01627x

Kear J 2005Ducks geese and swans Oxford Oxford University PressKressWJ Erickson DL Jones FA Swenson NG Perez R Sanjur O Bermingham E

2009 Plant DNA barcodes and a community phylogeny of a tropical forest dynamicsplot in Panama Proceedings of the National Academy of Sciences of the United States ofAmerica 10618621ndash18626 DOI 101073pnas0909820106

Li X Yang Y Henry RJ Rossetto MWang Y Chen S 2015 Plant DNA barcoding fromgene to genome Biological Reviews 90157ndash166 DOI 101111brv12104

Ma Z Cai Y Li B Chen J 2010Managing wetland habitats for waterbirds an interna-tional perspectiveWetlands 3015ndash27 DOI 101007s13157-009-0001-6

Madsen J Mortensen CE 1987Habitat exploitation and interspecific competition ofmoulting geese in East Greenl Ibis 12925ndash44DOI 101111j1474-919X1987tb03157x

Noordhuis R Van der Molen DT Van den BergMS 2002 Response of herbivorouswater-birds to the return of Chara in Lake Veluwemeer The Netherlands AquaticBotany 72349ndash367 DOI 101016S0304-3770(01)00210-8

Parameswaran P Jalili R Tao L Shokralla S Gharizadeh B Ronaghi M Fire AZ 2007A pyrosequencing-tailored nucleotide barcode design unveils opportunities for large-scale sample multiplexing Nucleic Acids Research 35e130 DOI 101093nargkm760

Pintildeol J Mir G Gomez-Polo P Agustiacute N 2015 Universal and blocking primermismatches limit the use of high-throughput DNA sequencing for the quanti-tative metabarcoding of arthropodsMolecular Ecology Resources 15819ndash830DOI 1011111755-099812355

Pompanon F Deagle BE SymondsonWO Brown DS Jarman SN Taberlet P 2012Who is eating what diet assessment using next generation sequencingMolecularEcology 211931ndash1950 DOI 101111j1365-294X201105403x

Prop J Black JM Shimmings P 2003 Travel schedules to the high arctic barnaclegeese trademdashoff the timing of migration with accumulation of fat deposits Oikos103403ndash414 DOI 101034j1600-0706200312042x

Rayeacute G Miquel C Coissac E Redjadj C Loison A Taberlet P 2011 New insightson diet variability revealed by DNA barcoding and high-throughput sequenc-ing chamois diet in autumn as a case study Ecological Research 26265ndash276DOI 101007s11284-010-0780-5

Yang et al (2016) PeerJ DOI 107717peerj2345 1821

Reynolds C Miranda NA Cumming GS 2015 The role of waterbirds in the dispersalof aquatic alien and invasive species Diversity and Distributions 21744ndash754DOI 101111ddi12334

Samelius G Alisauskas RT 1999 Diet and growth of glaucous gulls at a large Arcticgoose colony Canadian Journal of Zoology 771327ndash1331 DOI 101139z99-091

Sang T Crawford DJ Stuessy TF 1997 Chloroplast DNA phylogeny reticulateevolution and biogeography of Paeonia (Paeoniaceae) American Journal of Botany841120ndash1136

ShehzadW Riaz T NawazMAMiquel C Poillot C Shah SA Pompanon F CoissacE Taberlet P 2012 Carnivore diet analysis based on next-generation sequencingapplication to the leopard cat (Prionailurus bengalensis) in PakistanMolecularEcology 211951ndash1965 DOI 101111j1365-294X201105424x

Shokralla S Gibson JF Nikbakht H Janzen DH HallwachsW Hajibabaei M 2014Next-generation DNA barcoding using next-generation sequencing to enhance andaccelerate DNA barcode capture from single specimensMolecular Ecology Resources14892ndash901 DOI 1011111755-099812236

Shokralla S Spall JL Gibson JF 2012 Next-generation sequencing technologies forenvironmental DNA researchMolecular Ecology 211794ndash1805DOI 101111j1365-294X201205538x

Srivathsan A Sha J Vogler AP Meier R 2015 Comparing the effectiveness of metage-nomics and metabarcoding for diet analysis of a leaf-feeding monkey (Pygathrixnemaeus)Molecular Ecology Resources 15250ndash261 DOI 1011111755-099812302

Starr JR Naczi RFC Chouinard BN 2009 Plant DNA barcodes and species res-olution in sedge (Carex Cyperaceae)Molecular Ecology Resources 9151ndash163DOI 101111j1755-0998200902640x

Sun C Zhao Y Li H Dong Y MacIsaac HJ Zhan A 2015 Unreliable quantificationof species abundance based on high-throughput sequencing data of zooplanktoncommunities Aquatic Biology 249ndash15 DOI 103354ab00629

Swennen C Yu YT 2005 Food and feeding behavior of the black-faced SpoonbillWaterbirds 2819ndash27 DOI 1016751524-4695(2005)028[0019FAFBOT]20CO2

SymondsonWOC 2002Molecular identification of prey in predator dietsMolecularEcology 11627ndash641 DOI 101046j1365-294X200201471x

Taberlet P Coissac E Pompanon F Brochmann CWillerslev E 2012 Towards next-generation biodiversity assessment using DNA metabarcodingMolecular Ecology212045ndash2050 DOI 101111j1365-294X201205470x

Taberlet P Coissac E Pompanon F Gielly L Miquel C Valentini A Vermat TCorthier G Brocmann CWillerslev E 2007 Power and limitations of the chloro-plast trnL (UAA) intron for plant DNA barcoding Nucleic Acids Research 35e14DOI 101093nargkl938

Taberlet P Gielly L Pautou G Bouvet J 1991 Universal primers for amplifica-tion of three non-coding regions of chloroplast DNA Plant Molecular Biology171105ndash1109 DOI 101007BF00037152

Yang et al (2016) PeerJ DOI 107717peerj2345 1921

Tamura K Stecher G Peterson D Filipski A Kumar S 2013MEGA6 molecular evolu-tionary genetics analysis version 60MolecularBiology and Evolution 302725ndash2729DOI 101093molbevmst197

Tate JA Simpson BB 2003 Paraphyly of Tarasa (Malvaceae) and diverse origins of thepolyploidy species Systematic Botany 28723ndash737 DOI 10104302-641

Valentini A Pompanon F Taberlet P 2009 DNA barcoding for ecologists Trends inEcology and Evolution 24110ndash117 DOI 101016jtree200809011

Wang X Fox AD Cong P Barter M Cao L 2012 Changes in the distribution and abun-dance of wintering Lesser White-fronted Geese Anser erythropus in eastern ChinaBird Conservation International 22128ndash134 DOI 101017S095927091100030X

Wang X Fox AD Cong P Cao L 2013 Food constraints explain the restricted dis-tribution of wintering Lesser white-fronted Geese Anser erythropus in China Ibis155576ndash592 DOI 101111ibi12039

Whitlock BA Hale AM Groff PA 2010 Intraspecific inversions pose a challenge for thetrnH-psbA plant DNA barcode PLoS ONE 5e11533DOI 101371journalpone0011533

Wolfe KH LiWH Sharp PM 1987 Rates of nucleotide substitution vary greatlyamong plant mitochondrial chloroplast and nuclear DNAs Proceedings ofthe National Academy of Sciences of the United States of America 849054ndash9058DOI 101073pnas84249054

Xu C DongW Shi S Cheng T Li C Liu YWu PWuH Gao P Zhou S 2015 Acceler-ating plant DNA barcode reference library construction using herbarium specimensimproved experimental techniquesMolecular Ecology Resources 151366ndash1374DOI 1011111755-099812413

Xu L XuW Sun Q Zhou Z Shen J Zhao X 2008 Flora and vegetation in ShengjinLake Journal of Wuhan Botanical Research 27264ndash270

Zhan A Bailey SA Heath DDMacIsaac HJ 2014 Performance comparison of geneticmarkers for high-throughput sequencing-based biodiversity assessment in complexcommunitiesMolecular Ecology Resources 141049ndash1059DOI 1011111755-099812254

Zhan A MacIsaac HJ 2015 Rare biosphere exploration using high-throughput sequenc-ing research progress and perspectives Conservation Genetics 16513ndash522DOI 101007s10592-014-0678-9

Zhang Y Cao L Barter M Fox AD ZhaoMMeng F Shi H 2011 Changing distributionand abundance of Swan Goose Anser cygnoides in the Yangtze River floodplainthe likely loss of a very important wintering site Bird Conservation International2136ndash48 DOI 101017S0959270910000201

ZhaoM Cao L Fox AD 2013 Distribution and diet of wintering Tundra Bean GeeseAnser fabalis serrirostris at Shengjin LakeYangtze River floodplain ChinaWildfow6052ndash63

ZhaoM Cao L KlaassenM Zhang Y Fox AD 2015 Avoiding competition Site usediet and foraging behaviours in two similarly sized geese wintering in China Ardea10327ndash38 DOI 105253ardev103i1a3

Yang et al (2016) PeerJ DOI 107717peerj2345 2021

ZhaoM Cong P Barter M Fox AD Cao L 2012 The changing abundanceand distribu-tion of Greater White-fronted Geese Anser albifrons in the Yangtze River floodplainimpacts of recenthydrological changes Bird Conservation International 22135ndash143DOI 101017S0959270911000542

Yang et al (2016) PeerJ DOI 107717peerj2345 2121

Page 6: Selection of a marker gene to construct and the application ...We concluded that DNA metabarcoding provides new perspectives for studies of herbivorous waterbird diets and inter-specific

every genus or family based on the Kiruma 2-parameter model (K2P) using MEGA version6 (Tamura et al 2013) We also constructed molecular trees based on UPGMA usingMEGA and characterized the resolution of species by calculating the percentage of speciesrecovered as monophyletic based on phylogenetic trees (Rf) Secondly primers selectedout of eight candidate genes were used to amplify all specimens collected in ShengjinLake and to check their amplification efficiency and universality Thirdly we calculatedinter-specific divergence based on sequences that we obtained from last step Generallya robust barcode gene is obtained when the minimal inter-specific distance exceeds themaximal intra-specific distance (eg existence of barcoding gaps) Finally to allow therecognition of sequences after high-throughput sequencing both of the forward andreverse primers of the selected marker gene were tagged specifically for each sample with8nt nucleotide codes at the 5primeend (Parameswaran et al 2007)

DNA extraction amplification and sequencingTwo hundred milligrams of leaf was used to extract the total DNA from each plant sampleusing a modified CTAB protocol (Cota-Sanchez Remarchuk amp Ubayasena 2006) DNAextraction of feces was carried out using the same protocol with minor modification inincubation time (elongate to 12 h) Each fecal sample was crushed thoroughly and dividedinto four quarters All quarters of DNA extracts were then pooled together DNA extractionwas carried out in a clean room used particularly for this study For each batch of DNAextraction negative controls (ie extraction without feces) were included to monitorpossible contamination

For plant DNA extracts PCR amplifications were carried out in a volume of 25 microl withsim100 ng total DNA as template 1U of Taq Polymerase (Takara Dalian Liaoning ProvinceChina) 1times PCR buffer 2 mM of Mg2 + 025 mM of dNTPs 01 microM of forward primerand 01 microM of reverse primer After 4 min at 94 C the PCR cycles were as follows 35cycles of 30 s at 94 C 30 s at 56 C and 45 s at 72 C and the final extension was 10min at 72 C We applied the same PCR conditions for all primers All the successful PCRproducts were sequenced with Genewiz (Suzhou Jiangsu Province China)

For fecal DNA extracts PCR mixtures (25 microl) were prepared in six replicates foreach sample to reduce biased amplification Each replicate was subjected to the sameamplification procedure used for plant extracts The six replicates of each sample werepooled and purified using the Sangon PCR product purification kit (Sangon BiotechShanghai China) Quantification was carried out to ensure equilibrium of contributionof each sample using the NanoDrop ND-2000 UV-Vis Spectrophotometer (NanoDropTechnologies Wilmington Delaware USA) High-throughput sequencing was performedusing Illumina MiSeq platform following manufacturerrsquos instructions by BGI (ShenzhenGuangdong Province China) Reads of high-throughput sequencing could be found atNCBIrsquos Sequence Read Archive (Accession number SRP070470)

Data analysis for estimating diet compositionAfter high-throughput sequencing pair-ended readsweremergedwith the fastq_mergepairscommand using usearch (httpdrive5comusearch Edgar 2010) Reads were then split

Yang et al (2016) PeerJ DOI 107717peerj2345 621

into independent files according to unique tags using the initial process of RDP pipeline(httpspyrocmemsueduinitformspr) We removed sequences (i) that didnrsquot perfectlymatch tags and primer sequences (ii) that contained ambiguous nucleotide (Nrsquos) Tagsand primers were then trimmed using the initial process of RDP pipeline Further qualityfiltering using usearch discarded sequences with (i) quality score less than 30 (ltQ30) and(ii) length shorter than 100 bp Unique sequences were clustered to operational taxonomyunits (OTUs) at the similarity threshold of 98 (Edgar 2013) All OTUs were assignedto unique taxonomy with local blast 2230+ (Altschul et al 1990) We detected a plantwithin the reference library for each sequence with the threshold of length coverage gt98identity gt98 and e-value lt 10eminus50 If a query sequence matched two or more taxa itwas assigned to a higher taxonomic level which included all taxa

Microhistology analysisWe used the method described by Zhang et al (2011) to perform microhistologicexamination of fecal samples Each sample was first washed with pure water and filteredwith a 25-microm filter Subsequently the suspension was examined under a light microscopeat 10times magnification for quantification statistics and at 40times magnification for speciesidentification We compared photos of visible fragments with an epidermis database ofplants from Shengjin Lake to identify food items (Zhang et al 2011)

RESULTSSelection of genes and corresponding primers and reference libraryconstructionA total of 3296 representative sequences were recovered from GenBank ranging from 0to 345 sequences per gene per taxon (Table S1) For Eleocharis and Trapa only sequencesof rbcL gene and trnL gene were retained which makes it unfair to compare the efficiencyand suitability of eight candidate genes For the other ten taxa trnL trnH-psbA rbcL andpsbK-psbI showed the largest inter-specific divergence in five three one and one taxonomicgroups respectively In addition trnH-psbA atpF-atpH trnL and psbK-psbI showed thehighest mean divergence in four four one and one taxonomic groups respectivelyHowever given the small number of sequences and coverage of species the suitability andefficiency of atpF-atpH and psbK-psbI seem to be less reliable than others This comparisonmakes trnH-psbA trnL and rbcL to be selected out of the eight candidate genes As matKused to be recommended as the standard barcode gene for Carex species (Starr Naczi ampChouinard 2009) which happened to be the dominant food for herbivorous geese in ourstudy (Zhao et al 2015) we included matK as a supplement at last

Primers for these four genes (Table 1) were used to amplify the plants that we collectedin the field In total we collected 88 specimens in the field belonging to 25 families53 genera and 70 species (Table 2) The selected primers for trnL and rbcL successfullyamplified 100 and 91 of all species respectively while primers for trnH-psbA andmatKamplified only 71 and 43 respectively Therefore we chose trnL and rbcL to test theirdiscriminating power in our target plants

Yang et al (2016) PeerJ DOI 107717peerj2345 721

Table 1 Primers of candidate genes and reference library constructingOnly the c and h were used forhigh-throughput sequencing in fusion primer mode (primer+ tags) The unique tags were used to differ-entiate PCR products pooled together for highthroughput sequencing (Parameswaran et al 2007)

Gene Primer Sequence (5prime-3prime)

matK matK-XFa TAATTTACGATCAATTCATTCmatK-MALPb ACAAGAAAGTCGAAGTAT

rbcL rbcLa-Fc ATGTCACCACAAACAGAGACTAAAGCrbcLa-Rd GTAAAATCAAGTCCACCRCG

trnH-psbA pasbA3_fe CGCGCATGGTGGATTCACAATCCtrnHf_05f GTTATGCATGAACGTAATGCTC

trnL cg CGAAATCGGTAGACGCTACGhh CCATTGAGTCTCTGCACCTATC

Notesareferred to Ford et al (2009)breferred to Dunning amp Savolainen (2010)creferred to Hasebe et al (1994)dreferred to Kress et al (2009)ereferred to Tate amp Simpson (2003)freferred to Sang Crawford amp Stuessy (1997)greferred to Taberlet et al (1991)hreferred to Taberlet et al (2007)

We calculated the inter-specific divergence within genera and families with at leasttwo species to compare their discriminating power Maximal minimal and mean inter-specific distances were calculated for seven dominant genera and six dominant families(Table 3) Neither gene could differentiate species of Vallisneria (mean= 0000 plusmn 0000)or Artemisia (mean = 0000 plusmn 0000) But trnL showed a larger divergence rangefor the other six genera and five families Hence we chose trnL as the barcoding genefor reference library constructing and high-throughput sequencing for our study Thediscriminating power of trnL was strong for most species (Table 4) However some speciescould only be identified at genus-level or family-level with trnL For instance five speciesof Potamogetonaceae shared the same sequences and this made them to be identified atgenus-level Species could be identified easily to genus and family except for three grasses(Poaceae) Beckmannia syzigachne Phalaris arundinacea and Polypogon fugax which sharedidentical sequences

Data processing for estimating diet compositionIn total 021 and 018 million reads were generated for greater white-fronted goose(GWFG) and bean goose (BG) respectively (Table 5) The number of recovered OTUsranged from 8 to 123 for GWFG and BG samples We used local BLAST to compare thesesequences with the Shengjin Lake reference database Finally with DNA metabarocoding12 items were discovered in the feces of GWFG including one at family-level three atgenus-level and eight at species-level (Table 6) Four items were discovered in the feces ofBG including one at genus-level and three at species-level In total this method identified15 taxa in feces of these geese

Yang et al (2016) PeerJ DOI 107717peerj2345 821

Table 2 Plant species in the reference libraryWe collected these samples from Shengjin Lake

Species No of samples Species No of samples

Curculigo orchioides 1 Trapella sinensis 2Artemisia capillaris 1 Plantago asiatica 1Artemisia selengensis 2 Alopecurus aequalis 2Aster subulatus 1 Beckmannia syzigachne 1Bidens frondosa 1 Bromus japonicus 1Erigeron annuus 1 Cynodon dactylon 2Gnaphalium affine 1 Phalaris arundinacea 1Hemistepta lyrata 1 Phragmites australis 1Kalimeris incisa 1 Poa annua 1Bothriospermum kusnezowii 1 Polypogon fugax 1Lobelia chinensis 1 Roegneria kamoji 2Sagina japonica 1 Zizania latifolia 1Stellaria media 2 Polygonum lapathifolium 4Calystegia hederacea 1 Polygonum orientale 1Cardamine lyrata 1 Polygonum perfoliatum 1Carex heterolepis 3 Polygonum persicaria 1Carex capricornis 1 Rumex trisetiferus 3Carex paxii 1 Potamogeton crispus 1Carex remotiuscula 1 Potamogeton maackianus 1Fimbristylis dichotoma 1 Potamogeton malaianus 1Eleocharis migoana 1 Potamogeton natans 1Scripus karuizawensis 1 Potamogeton pectinatus 1Nymphoides peltatum 1 Clematis florida 1Myriophyllum spicatum 1 Ranunculus chinensis 2Hydrilla verticillta 1 Ranunculus sceleratus 2Hydrocharis dubia 1 Potentilla freyniana 2Vallisineria spiralis 1 Gratiola japonica 1Vallisneria spinulosa 1 Mazus miquelii 2Juncus effusus 1 Veronica undulata 1Juncus gracillimus 1 Trapa bispinosa 1Leonurus japonicus 1 Trapa maximowiczii 1Salvia plebeia 1 Trapa pseudoincisa 1Glycine soja 1 Trapa quadrispinosa 1Vicia sativa 1 Hydrocotyle sibthorpioides 1Euryale ferox 1 Torilis japonica 2

However the sequence percentage of each food item varied greatly (Table 6) ForGWFG the majority of sequences (9636) were composed of only five itemsmdashPoaceaespp (4798 except Poa annua) Poa annua (2186) Carex heterolepis (1751) Carexspp (901 except Carex heterolepis) and Alopecurus aequalis (321) For BG almostall the sequences belonged to Carex heterolepis (9949) Other items only occupied arelatively small proportion of sequences In addition the presence of each item per samplewas also unequal (Table S2) For example in GWFG Carex heterolepis Carex spp Poa

Yang et al (2016) PeerJ DOI 107717peerj2345 921

Table 3 Inter-specific divergences within dominant genera and families of rbcL gene and trnL gene with Kiruma 2-Parameter modelUnder-scores indicate the most common food composition based on earlier microhistologic analysis (Zhao et al 2012 Zhao et al 2015)

Inter-specific divergence Taxa rbcL trnL

Maximal Minimal Mean Maximal Minimal Mean

Artemisia 0000 0000 0000plusmn 0000 0000 0000 0000plusmn 0000Carex 0013 0000 0008plusmn 0006 0058 0000 0027plusmn 0021Polygonum 0027 0000 0010plusmn 0006 0076 0000 0033plusmn 0022Potamogeton 0012 0000 0005plusmn 00034 0016 0000 0005plusmn 0005Ranunculus 0031 0000 0020plusmn 0009 0042 0021 0024plusmn 0022Trapa 0000 0000 0000plusmn 0000 0081 0000 0049plusmn 0030

Withingenera

Vallisneria 0000 0000 0000plusmn 0000 0000 0000 0000plusmn 0000Cyperaceae 0043 0000 0018plusmn 0010 0178 0000 0084plusmn 0046Asteraceae 0120 0000 0049plusmn 0017 0087 0000 0023plusmn 0018Poaceae 0025 0000 0016plusmn 00009 0166 0000 0074plusmn 0039Hydrocharitaceae 0122 0000 0078plusmn 0020 0159 0000 0100plusmn 0054Polygonaceae 0043 0000 0020plusmn 0009 0129 0000 0031plusmn 0022

Withinfamilies

Ranunculaceae 0033 0016 0017plusmn 0015 0045 0000 0018plusmn 0013

Table 4 Number of species and unique sequences for families with more than one species in ShengjinLake plant database

Family No of species No of sequences

Asteraceae 8 7Caryophyllaceae 2 2Cyperaceae 7 5Fabaceae 2 2Hydrocharitaceae 4 3Lamiaceae 2 2Poaceae 10 8Polygonaceae 5 5Potamogetonaceae 5 1Ranunculaceae 3 3Scrophulariaceae 3 3Trapaceae 4 3Umbelliferae 2 2

annua and Potentilla supina were present in almost all the samples while Stellaria mediaAsteraceae sp and Lapsana apogonoldes occurred in only about one third of samples

When the microhistologic examination was performed using the same samples eightitems were found in the feces of greater white-fronted goose including one at family-levelfour at genus-level and three at species-level (Table 6) Dominant items were Poaceae spp(4568) Alopecurus Linn (3093) and Carex heterolepis (1639) Seven items werefound in the feces of bean goose including four at genus-level and three at species-level(Table 6) Dominant items were Carex heterolepis (6285) Asteraceae sp (1455) andAlopecurus Linn (1318)

Yang et al (2016) PeerJ DOI 107717peerj2345 1021

Table 5 Summary of the process and results of high-throughput sequencing analysis

Sample Pair-endsequences

Sequences for whichprimers and tags wereidentified and with lengthgt100 bp

Uniquesequences

OTUs Food items

GWFG1 16303 8627 1288 78 8GWFG2 25482 13449 1091 102 8GWFG3 19063 10056 1277 48 10GWFG4 23856 12548 1419 114 8GWFG5 20955 11249 1720 123 9GWFG6 11677 7205 973 52 9GWFG7 13377 6782 1328 59 9GWFG8 7749 3959 774 89 9GWFG9 16833 8799 1436 90 6GWFG10 18474 9819 449 32 9GWFG11 19648 10458 617 31 6BG1 20225 10254 784 23 4BG2 14195 7161 564 16 2BG3 2229 1149 255 12 4BG4 517 268 77 8 3BG5 28152 14033 1000 15 3BG6 16723 8484 740 17 4BG7 30166 15403 974 15 4BG8 30928 15706 1028 15 3BG9 8382 4489 446 13 4BG10 10714 5526 537 13 4

NotesGWFG Greater white-fronted goose BG Bean goose

DISCUSSIONMarker selection and reference library constructing for diet analysisWith greatly reduced cost extremely high throughput and information contentmetabarcoding has revolutionized the exploration and quantification of dietary analysiswith noninvasive samples containing degraded DNA (Fonseca et al 2010 Shokralla et al2014) Despite enormous potential to boost data acquisition successful application of thistechnology relies greatly on the power and efficiency of genetic markers and correspondingprimers (Bik et al 2012 Zhan et al 2014) In order to select the most appropriate markergene for our study we compared the performance of eight commonly used chloroplastgenes (rbcL rpoB rpoC1 matK trnL trnH-psbA atpF-atpH and psbK-psbI) and theircorresponding primers Although a higher level of discriminating power was shown inseveral studies atpF-atpH and psbK-psbI were not as commonly used as other barcodinggenes (Hollingsworth Graham amp Little 2011) As one of the most rapidly evolving codinggenes of plastid genomes matK was considered as the closest plant analogue to theanimal barcode COI (Hilu amp Liang 1997) However matK was difficult to amplify using

Yang et al (2016) PeerJ DOI 107717peerj2345 1121

Table 6 List of the lowest taxonomic food items in the diet of geese

Food items Level of identification GWFG BG

N reads Fs () Fm () N reads Fs () Fm ()

Poaceae spp (except Poa annua) Family 51705 4798 4568 0 000 000Poa annua Species 23554 2186 000 167 020 000Carex heterolepis Species 18867 1751 1639 81457 9949 6285Carex spp (except Carex heterolepis) Genus 9706 901 231 191 023 349Alopecurus aequalis Species 3458 321 000 0 000 000Potentilla chinensis Species 184 017 118 65 008 206Cynodon dactylon Species 155 014 000 0 000 000Polygonum spp Genus 56 005 000 0 000 000Stellaria media Species 26 002 000 0 000 000Ranunculus chinensis Species 14 002 000 0 000 000Lapsana apogonoides Species 11 002 000 0 000 000Asteraceae sp Genus 16 001 233 0 000 1455Alopecurus Genus 0 000 3093 0 000 1318Carex thunbergii Species 0 000 054 0 000 279Fabaceae sp Genus 0 000 064 0 000 108

NotesGWFG Greater white-fronted goose BG Bean goose Fs percentage of sequences in DNA metabarcoding Fm percentage of epidermis squares in microhistological analysis

available primer sets with only 43 of successful amplification in this study In spite ofthe higher species discrimination success of trnH-psbA than rbcL+matK in some groupsthe presence of duplicated loci microinversions and premature termination of readsby mononucleotide repeats lead to considerable proportion of low-quality sequences andover-estimation of genetic difference when using trnH-psbA (Graham et al 2000WhitlockHale amp Groff 2010) In contrast the barcode region of rbcL is easy to amplify sequenceand align in most plants and was recommended as the standard barcode for land plants(Chase et al 2007) The relatively modest discriminating power (compared to trnL) pre-cludes its application for our study aiming to recover high resolution of food items Conse-quently trnL was selected out of eight candidate markers with 100 amplification successmore than 90 of high quality sequences and relatively large inter-specific divergence

One of the biggest obstacles in biodiversity assessment and dietary analysis is the lack ofa comprehensive reference library without which it is impossible to accurately interpretand assign sequences generated from high-throughput sequencing (Valentini Pompanonamp Taberlet 2009 Barco et al 2015) In this study we constructed a local reference libraryby amplifying the most common species (70 morpho-species in total) during the winteringperiod with the trnL gene Although not all of them could be identified at species-levelwith trnL due to relatively low inter-specific divergence many species could be separatedwith distinctive sequences Previous studies have recommended group-specific barcodes todifferentiate closely related plants at the species level (Li et al 2015) For instance matKhas been proved to be more efficient for the discrimination of Carex spp (Starr Naczi ampChouinard 2009) However the primer set of matK failed to amplify species of Carex spp

Yang et al (2016) PeerJ DOI 107717peerj2345 1221

in our study suggesting the universality of selected primer pairs should be tested in eachstudy (Zhan et al 2014)

Applications of metabarcoding for geese diet analysisA variety of recent studies have demonstrated the great potential of metabarcoding fordietary analysis mainly owing to the high throughput high discriminating power and theability to process large-scale samples simultaneously (Creer et al 2010 Taberlet et al 2012Shehzad et al 2012) In this study we applied this method to recover diets of herbivorousgeese and provided standard protocols for dietary analysis of these two ecologicallyimportant waterbirds Our results further proved the more objective less experience-dependent and more time-efficient character of DNA metabarcoding However not allthe species in the reference library could be identified at species-level owing to low inter-specific divergence We suggest that multiple group-specific markers to be incorporatedin the future as in De Barba et al (2014) Two species Carex thunbergii and Fabaceae spwere only discovered via microhistologic analysis rather than metabarcoding This failuremight reflect the biased fragment amplification of current technology of which dominanttemplates could act as inhibitors of less dominant species (Pintildeol et al 2015) Howeverthree species of Poaceae were only discovered using metabarcoding In total more taxaand higher resolution were attained using metabarcoding But microhistology still proveda powerful supplementary Previous studies using metabarcoding usually detected dozensof food items even as many as more than one hundred species For instance 18 taxaprey were identified for leopard cat (Prionailurus bengalensis) (Shehzad et al 2012) 44plant taxa were recovered in feces of red-headed wood pigeon (Columba janthina nitens)(Ando et al 2013) while more than 100 taxa were found in diet studies of brown bear(Ursus arctos) (De Barba et al 2014) The relatively narrow diet spectrum of herbivorousgeese may lead to misunderstanding that this result of our study is merely an artefact dueto small sampling effort However this result is credible since these two geese species onlyfeed on Carex meadow where the dominant vegetation is Carex spp with other speciessuch as Poaceae and dicots (Zhao et al 2015) Even though other wetland plants exist theyusually composed only a small proportion of the geese diets

Quantification of food composition is another key concern in dietary analysis Althoughthe relative percentage of sequences was not truly a quantitative estimate of diet taxa ofthe majority sequences in this study were in accord with microhistologic observationswhich was considered an efficient way to provide quantitative results (Wang et al 2013)Discrepancies might come from the semi-quantitative nature of metabarcoding methods(Sun et al 2015) This is likely derived from PCR amplification which always entails biasescaused by universal primer-template mismatches annealing temperature or number ofPCR cycles (Zhan et al 2014Pintildeol et al 2015) Othermethods such as shotgun sequencingor metagenomic sequencing could be incorporated in the future to give information onabundances of food items (Srivathsan et al 2015)

Implications for waterbird conservation and wetland managementFor long-distance migratory waterbirds such as the wild geese in this study theirabundance and distribution are greatly influenced by diet availability and habitat use

Yang et al (2016) PeerJ DOI 107717peerj2345 1321

(Wang et al 2013) For example waterbirds may be restricted at (forced to leave) certainareas due to favoring (loss) of particular food (Wang et al 2013) while the recoveryof such food may contribute to return of bird populations (Noordhuis Van der Molenamp Van den Berg 2002) Results of both metabarcoding and microhistologic analysis inthis study revealed that Carex and Poaceae were dominant food components which is inaccordance with previous studies The increasing number of these two geese wintering atthe Shengjin Lake may be attributed to the expansion of Carex meadow which offers accessto abundant food resources (Zhao et al 2015) Considering the long-distance migratorycharacter of these birds it is important to maintain energy balances and good bodyconditions in wintering areas because this might further influence their departure datesand reproductive success after arriving at breeding areas (Prop Black amp Shimmings 2003)Based on this it is important for wetland managers to maintain the suitable habitats andfood resources for sustainable conservation of waterbirds which highlights the significanceof diet information Our study also indicated that overlap and dissimilarity existed betweenthe diets of these two geese Animals foraging in the same habitats may compete for limitedfood resources (Madsen amp Mortensen 1987) This discrepancy of food composition mayarise from the avoidance of inter-specific competition (Zhao et al 2015) However withthe increase of these two species in Shengjin Lake further research is needed to investigatethe mechanisms of food resource partitioning and spatial distribution

Shengjin Lake is one of themost important wintering sites for tens of thousands ofmigra-tory watebirds while annual life cycles of these birds depend on the whole migratory routeincluding breeding sites stop-over sites and wintering sites (Kear 2005) Thus a molecularreference library covering all the potential food items along the whole migratory routewill be useful both for understanding of wetland connections and waterbird conservationIn addition the ability of DNA metabarcoding to process lots of samples simultaneouslyenables rapid analyses and makes this method helpful for waterbird studies

ACKNOWLEDGEMENTSWe are very grateful to the stuff of the Shengjin Lake National Nature Reserve for theirexcellent assistance during the field work Great thanks to Zhujun Wang and An An forfeces collection in the field We thank Song Yang for collecting plants in the Shengjin LakeReserve We also thank Profs Zhenyu Li and Shuren Zhang for plant identification Specialthanks to Drs Meijuan Zhao Xin Wang Fanjuan Meng and Peihao Cong for preparingthe epidermis database and guiding microhistologic analysis

ADDITIONAL INFORMATION AND DECLARATIONS

FundingThis work was supported by the National Basic Research Program of China (973 ProgramGrant No 2012CB956104) to EZ the National Natural Science Foundation of China(Grant No 31370416) State Key Laboratory of Urban and Regional Ecology ChineseAcademy of Sciences (No SKLURE2013-1-05) to LC the National Natural Science

Yang et al (2016) PeerJ DOI 107717peerj2345 1421

Foundation of China (Grant No 31500315) to XW and 100-Talent Program of theChinese Academy of Sciences to AZ The funders had no role in study design datacollection and analysis decision to publish or preparation of the manuscript

Grant DisclosuresThe following grant information was disclosed by the authorsNational Basic Research Program of China 2012CB956104National Natural Science Foundation of China 31370416State Key Laboratory of Urban and Regional EcologyChinese Academy of Sciences SKLURE2013-1-05National Natural Science Foundation of China 31500315

Competing InterestsThe authors declare there are no competing interests

Author Contributionsbull Yuzhan Yang conceived and designed the experiments performed the experimentsanalyzed the data contributed reagentsmaterialsanalysis tools wrote the paperprepared figures andor tables reviewed drafts of the paperbull Aibin Zhan conceived and designed the experiments analyzed the data contributedreagentsmaterialsanalysis tools reviewed drafts of the paperbull Lei Cao conceived and designed the experiments contributed reagentsmaterialsanalysistools reviewed drafts of the paperbull Fanjuan Meng and Wenbin Xu performed the experiments contributed reagentsmate-rialsanalysis tools reviewed drafts of the paper

Animal EthicsThe following information was supplied relating to ethical approvals (ie approving bodyand any reference numbers)

Our research work did not involve capture or any direct manipulation or disturbancesof animals

Field Study PermissionsThe following information was supplied relating to field study approvals (ie approvingbody and any reference numbers)

We collected samples of plants and feces for molecular analyses We received permissionto access the reserve for sampling from the Shengjin Lake National Nature ReserveAdministration (Chizhou Anhui China) which is responsible for the management of theprotected area and wildlife We were forbidden to capture or disturb geese in the field

DNA DepositionThe following information was supplied regarding the deposition of DNA sequences

NCBI Sequence Read Archive (SRA) databaseAccession number SRP070470

Yang et al (2016) PeerJ DOI 107717peerj2345 1521

Data AvailabilityThe following information was supplied regarding data availability

The raw data has been supplied as a Supplemental Dataset

Supplemental InformationSupplemental information for this article can be found online at httpdxdoiorg107717peerj2345supplemental-information

REFERENCESAltschul SF GishWMillerWMyers EW Lipman DJ 1990 Basic local alignment

search tool Journal of Molecular Biology 215403ndash410 DOI 101006jmbi19909999Aacutelvarez I Wendel JF 2003 Ribosomal ITS sequences and plant phylogenetic inference

Molecular Phylogenetics and Evolution 29417ndash434DOI 101016S1055-7903(03)00208-2

Ando H Setsuko S Horikoshi K Suzuki H Umehara S Inoue-MurayamaM IsagiY 2013 Diet analysis by nextmdashgeneration sequencing indicates the frequentconsumption of introduced plants by the critically endangered redmdashheaded woodpigeon (Columba janthina nitens) in oceanic island habitats Ecology and Evolution34057ndash4069 DOI 101002ece3773

Barco A RaupachMJ Laakmann S NeumannH Knebelsberger T 2015 Identificationof North Sea molluscs with DNA barcodingMolecular Ecology Resources 16288ndash297DOI 1011111755-099812440

Bik HM Porazinska DL Creer S Caporaso JG Knight R ThomasWK 2012 Sequenc-ing our way towards understanding global eukaryotic biodiversity Trends in Ecologyand Evolution 27233ndash243 DOI 101016jtree201111010

Bohmann K Evans A Gilbert MTP Carvalho GR Creer S KnappM YuWD DeBruynM 2014 Environmental DNA for wildlife biology and biodiversity monitor-ing Trends in Ecology and Evolution 29358ndash367 DOI 101016jtree201404003

Brander LM Florax RJ Vermaat JE 2006 The empirics of wetland valuation acomprehensive summary and a meta-analysis of the literature Environmental andResource Economics 33223ndash250 DOI 101007s10640-005-3104-4

Chase MW Cowan RS Hollingsworth PM Van den Berg C Madrintildeaacuten S Petersen GSeberg O Jorgsensen T Cameron KM Carine M Pedersen N Hedderson TAJConrad F Salazar GA Richardson JE HollingsworthM Barraclough TG KellyL WilkinsonM 2007 A proposal for a standardised protocol to barcode all landplants Taxon 56295ndash299

Cota-Sanchez JH Remarchuk K Ubayasena K 2006 Ready-to-use DNA extracted witha CTAB method adapted for herbarium specimens and mucilaginous plant tissuePlant Molecular Biology Reporter 24161ndash167 DOI 101007BF02914055

Creer S Fonseca VG Porazinska DL Giblin-Davis RM SungW Power DM Packer MCarvalho GR Blaxter ML Lambshead P ThomasWK 2010 Ultrasequencing ofthe meiofaunal biosphere Practice pitfalls and promisesMolecular Ecology 194ndash20DOI 101111j1365-294X200904473x

Yang et al (2016) PeerJ DOI 107717peerj2345 1621

De BarbaMMiquel C Boyer F Mercier C Rioux D Coissac E Taberlet P 2014 DNAmetabarcoding multiplexing and validation of data accuracy for diet assessmentapplication to omnivorous dietMolecular Ecology Resources 14306ndash323DOI 1011111755-099812188

Deagle BE Kirkwood R Jarman SN 2009 Analysis of Australian fur seal diet bypyrosequencing prey DNA in faecesMolecular Ecology 182022ndash2038DOI 101111j1365-294X200904158x

Dunning LT Savolainen V 2010 Broad-scale amplification of matK for DNA barcodingplants a technical note Botanical Journal of the Linnean Society 1641ndash9DOI 101111j1095-8339201001071x

Edgar RC 2010 Search and clustering orders of magnitude faster than BLAST Bioinfor-matics 262460ndash2461 DOI 101093bioinformaticsbtq461

Edgar RC 2013 UPARSE highly accurate OTU sequences from microbial ampliconreads Nature Methods 10996ndash998 DOI 101038NMETH2604

Elliott TL Jonathan Davies T 2014 Challenges to barcoding an entire floraMolecularEcology Resources 14883ndash891 DOI 1011111755-099812277

Fazekas AJ Burgess KS Kesanakurti PR Graham SW Newmaster SG Husband BCPercy DM Hajibabaei M Barrett SC 2008Multiple multilocus DNA barcodesfrom the plastid genome discriminate plant species equally well PLoS ONE 3e2802DOI 101371journalpone0002802

Fonseca VG Carvalho GR SungW Johnson HF Power DM Neill SP Packer MBlaxter ML Labmshead PJD ThomasWK Creer S 2010 Second-generationenvironmental sequencing unmasks marine metazoan biodiversity Nature Commu-nications 1Article 98 DOI 101038ncomms1095

Ford CS Ayres KL Haider N Toomey N Van-Alpen-Stohl J 2009 Selection ofcandidate DNA barcoding regions for use on land plants Botanical Journal of theLinnean Society 1591ndash11 DOI 101111j1095-8339200800938x

Fox AD Bergersen E Tombre IM Madsen J 2007Minimal intra-seasonal dietaryoverlap of barnacle and pink-footed geese on their breeding grounds in SvalbardPolar Biology 30759ndash768 DOI 101007s00300-006-0235-1

Fox AD Cao L Zhang Y Barter M ZhaoMMeng FWang S 2011 Declines in thetuber-feeding waterbird guild at Shengjin Lake National Nature Reserve Chinandashabarometer of submerged macrophyte collapse Aquatic Conservation Marine andFreshwater Ecosystems 2182ndash91 DOI 101002aqc1154

Graham SW Reeves PA Burns AC Olmstead RG 2000Microstructural changes innoncoding chloroplast DNA interpretation evolution and utility of indels andinversions in basal angiosperm phylogenetic inference International Journal of PlantSciences 161S83ndashS96 DOI 101086317583

HasebeM Omori T NakazawaM Sano T KatoM Iwatsuki K 1994 rbcL genesequences provide evidence for the evolutionary lineages of leptosporangiate fernsProceedings of the National Academy of Sciences of the United States of America915730ndash5734 DOI 101073pnas91125730

Yang et al (2016) PeerJ DOI 107717peerj2345 1721

Hibert F Taberlet P Chave J Scotti-Saintagne C Sabatier D Richard-Hansen C 2013Unveiling the diet of elusive rainforest herbivores in next generation sequencing eraThe tapir as a case study PLoS ONE 8e60799 DOI 101371journalpone0060799

Hilu KW Liang H 1997 ThematK gene sequence variation and application in plantsystematics American Journal of Botany 84830ndash839 DOI 1023072445819

Hollingsworth PM Graham SW Little DP 2011 Choosing and using a plant DNAbarcode PLoS ONE 6e19254 DOI 101371journalpone0019254

James HF Burney DA 1997 The diet and ecology of Hawaiirsquos extinct flightless water-fowl evidence from coprolites Biological Journal of the Linnean Society 62279ndash297DOI 101111j1095-83121997tb01627x

Kear J 2005Ducks geese and swans Oxford Oxford University PressKressWJ Erickson DL Jones FA Swenson NG Perez R Sanjur O Bermingham E

2009 Plant DNA barcodes and a community phylogeny of a tropical forest dynamicsplot in Panama Proceedings of the National Academy of Sciences of the United States ofAmerica 10618621ndash18626 DOI 101073pnas0909820106

Li X Yang Y Henry RJ Rossetto MWang Y Chen S 2015 Plant DNA barcoding fromgene to genome Biological Reviews 90157ndash166 DOI 101111brv12104

Ma Z Cai Y Li B Chen J 2010Managing wetland habitats for waterbirds an interna-tional perspectiveWetlands 3015ndash27 DOI 101007s13157-009-0001-6

Madsen J Mortensen CE 1987Habitat exploitation and interspecific competition ofmoulting geese in East Greenl Ibis 12925ndash44DOI 101111j1474-919X1987tb03157x

Noordhuis R Van der Molen DT Van den BergMS 2002 Response of herbivorouswater-birds to the return of Chara in Lake Veluwemeer The Netherlands AquaticBotany 72349ndash367 DOI 101016S0304-3770(01)00210-8

Parameswaran P Jalili R Tao L Shokralla S Gharizadeh B Ronaghi M Fire AZ 2007A pyrosequencing-tailored nucleotide barcode design unveils opportunities for large-scale sample multiplexing Nucleic Acids Research 35e130 DOI 101093nargkm760

Pintildeol J Mir G Gomez-Polo P Agustiacute N 2015 Universal and blocking primermismatches limit the use of high-throughput DNA sequencing for the quanti-tative metabarcoding of arthropodsMolecular Ecology Resources 15819ndash830DOI 1011111755-099812355

Pompanon F Deagle BE SymondsonWO Brown DS Jarman SN Taberlet P 2012Who is eating what diet assessment using next generation sequencingMolecularEcology 211931ndash1950 DOI 101111j1365-294X201105403x

Prop J Black JM Shimmings P 2003 Travel schedules to the high arctic barnaclegeese trademdashoff the timing of migration with accumulation of fat deposits Oikos103403ndash414 DOI 101034j1600-0706200312042x

Rayeacute G Miquel C Coissac E Redjadj C Loison A Taberlet P 2011 New insightson diet variability revealed by DNA barcoding and high-throughput sequenc-ing chamois diet in autumn as a case study Ecological Research 26265ndash276DOI 101007s11284-010-0780-5

Yang et al (2016) PeerJ DOI 107717peerj2345 1821

Reynolds C Miranda NA Cumming GS 2015 The role of waterbirds in the dispersalof aquatic alien and invasive species Diversity and Distributions 21744ndash754DOI 101111ddi12334

Samelius G Alisauskas RT 1999 Diet and growth of glaucous gulls at a large Arcticgoose colony Canadian Journal of Zoology 771327ndash1331 DOI 101139z99-091

Sang T Crawford DJ Stuessy TF 1997 Chloroplast DNA phylogeny reticulateevolution and biogeography of Paeonia (Paeoniaceae) American Journal of Botany841120ndash1136

ShehzadW Riaz T NawazMAMiquel C Poillot C Shah SA Pompanon F CoissacE Taberlet P 2012 Carnivore diet analysis based on next-generation sequencingapplication to the leopard cat (Prionailurus bengalensis) in PakistanMolecularEcology 211951ndash1965 DOI 101111j1365-294X201105424x

Shokralla S Gibson JF Nikbakht H Janzen DH HallwachsW Hajibabaei M 2014Next-generation DNA barcoding using next-generation sequencing to enhance andaccelerate DNA barcode capture from single specimensMolecular Ecology Resources14892ndash901 DOI 1011111755-099812236

Shokralla S Spall JL Gibson JF 2012 Next-generation sequencing technologies forenvironmental DNA researchMolecular Ecology 211794ndash1805DOI 101111j1365-294X201205538x

Srivathsan A Sha J Vogler AP Meier R 2015 Comparing the effectiveness of metage-nomics and metabarcoding for diet analysis of a leaf-feeding monkey (Pygathrixnemaeus)Molecular Ecology Resources 15250ndash261 DOI 1011111755-099812302

Starr JR Naczi RFC Chouinard BN 2009 Plant DNA barcodes and species res-olution in sedge (Carex Cyperaceae)Molecular Ecology Resources 9151ndash163DOI 101111j1755-0998200902640x

Sun C Zhao Y Li H Dong Y MacIsaac HJ Zhan A 2015 Unreliable quantificationof species abundance based on high-throughput sequencing data of zooplanktoncommunities Aquatic Biology 249ndash15 DOI 103354ab00629

Swennen C Yu YT 2005 Food and feeding behavior of the black-faced SpoonbillWaterbirds 2819ndash27 DOI 1016751524-4695(2005)028[0019FAFBOT]20CO2

SymondsonWOC 2002Molecular identification of prey in predator dietsMolecularEcology 11627ndash641 DOI 101046j1365-294X200201471x

Taberlet P Coissac E Pompanon F Brochmann CWillerslev E 2012 Towards next-generation biodiversity assessment using DNA metabarcodingMolecular Ecology212045ndash2050 DOI 101111j1365-294X201205470x

Taberlet P Coissac E Pompanon F Gielly L Miquel C Valentini A Vermat TCorthier G Brocmann CWillerslev E 2007 Power and limitations of the chloro-plast trnL (UAA) intron for plant DNA barcoding Nucleic Acids Research 35e14DOI 101093nargkl938

Taberlet P Gielly L Pautou G Bouvet J 1991 Universal primers for amplifica-tion of three non-coding regions of chloroplast DNA Plant Molecular Biology171105ndash1109 DOI 101007BF00037152

Yang et al (2016) PeerJ DOI 107717peerj2345 1921

Tamura K Stecher G Peterson D Filipski A Kumar S 2013MEGA6 molecular evolu-tionary genetics analysis version 60MolecularBiology and Evolution 302725ndash2729DOI 101093molbevmst197

Tate JA Simpson BB 2003 Paraphyly of Tarasa (Malvaceae) and diverse origins of thepolyploidy species Systematic Botany 28723ndash737 DOI 10104302-641

Valentini A Pompanon F Taberlet P 2009 DNA barcoding for ecologists Trends inEcology and Evolution 24110ndash117 DOI 101016jtree200809011

Wang X Fox AD Cong P Barter M Cao L 2012 Changes in the distribution and abun-dance of wintering Lesser White-fronted Geese Anser erythropus in eastern ChinaBird Conservation International 22128ndash134 DOI 101017S095927091100030X

Wang X Fox AD Cong P Cao L 2013 Food constraints explain the restricted dis-tribution of wintering Lesser white-fronted Geese Anser erythropus in China Ibis155576ndash592 DOI 101111ibi12039

Whitlock BA Hale AM Groff PA 2010 Intraspecific inversions pose a challenge for thetrnH-psbA plant DNA barcode PLoS ONE 5e11533DOI 101371journalpone0011533

Wolfe KH LiWH Sharp PM 1987 Rates of nucleotide substitution vary greatlyamong plant mitochondrial chloroplast and nuclear DNAs Proceedings ofthe National Academy of Sciences of the United States of America 849054ndash9058DOI 101073pnas84249054

Xu C DongW Shi S Cheng T Li C Liu YWu PWuH Gao P Zhou S 2015 Acceler-ating plant DNA barcode reference library construction using herbarium specimensimproved experimental techniquesMolecular Ecology Resources 151366ndash1374DOI 1011111755-099812413

Xu L XuW Sun Q Zhou Z Shen J Zhao X 2008 Flora and vegetation in ShengjinLake Journal of Wuhan Botanical Research 27264ndash270

Zhan A Bailey SA Heath DDMacIsaac HJ 2014 Performance comparison of geneticmarkers for high-throughput sequencing-based biodiversity assessment in complexcommunitiesMolecular Ecology Resources 141049ndash1059DOI 1011111755-099812254

Zhan A MacIsaac HJ 2015 Rare biosphere exploration using high-throughput sequenc-ing research progress and perspectives Conservation Genetics 16513ndash522DOI 101007s10592-014-0678-9

Zhang Y Cao L Barter M Fox AD ZhaoMMeng F Shi H 2011 Changing distributionand abundance of Swan Goose Anser cygnoides in the Yangtze River floodplainthe likely loss of a very important wintering site Bird Conservation International2136ndash48 DOI 101017S0959270910000201

ZhaoM Cao L Fox AD 2013 Distribution and diet of wintering Tundra Bean GeeseAnser fabalis serrirostris at Shengjin LakeYangtze River floodplain ChinaWildfow6052ndash63

ZhaoM Cao L KlaassenM Zhang Y Fox AD 2015 Avoiding competition Site usediet and foraging behaviours in two similarly sized geese wintering in China Ardea10327ndash38 DOI 105253ardev103i1a3

Yang et al (2016) PeerJ DOI 107717peerj2345 2021

ZhaoM Cong P Barter M Fox AD Cao L 2012 The changing abundanceand distribu-tion of Greater White-fronted Geese Anser albifrons in the Yangtze River floodplainimpacts of recenthydrological changes Bird Conservation International 22135ndash143DOI 101017S0959270911000542

Yang et al (2016) PeerJ DOI 107717peerj2345 2121

Page 7: Selection of a marker gene to construct and the application ...We concluded that DNA metabarcoding provides new perspectives for studies of herbivorous waterbird diets and inter-specific

into independent files according to unique tags using the initial process of RDP pipeline(httpspyrocmemsueduinitformspr) We removed sequences (i) that didnrsquot perfectlymatch tags and primer sequences (ii) that contained ambiguous nucleotide (Nrsquos) Tagsand primers were then trimmed using the initial process of RDP pipeline Further qualityfiltering using usearch discarded sequences with (i) quality score less than 30 (ltQ30) and(ii) length shorter than 100 bp Unique sequences were clustered to operational taxonomyunits (OTUs) at the similarity threshold of 98 (Edgar 2013) All OTUs were assignedto unique taxonomy with local blast 2230+ (Altschul et al 1990) We detected a plantwithin the reference library for each sequence with the threshold of length coverage gt98identity gt98 and e-value lt 10eminus50 If a query sequence matched two or more taxa itwas assigned to a higher taxonomic level which included all taxa

Microhistology analysisWe used the method described by Zhang et al (2011) to perform microhistologicexamination of fecal samples Each sample was first washed with pure water and filteredwith a 25-microm filter Subsequently the suspension was examined under a light microscopeat 10times magnification for quantification statistics and at 40times magnification for speciesidentification We compared photos of visible fragments with an epidermis database ofplants from Shengjin Lake to identify food items (Zhang et al 2011)

RESULTSSelection of genes and corresponding primers and reference libraryconstructionA total of 3296 representative sequences were recovered from GenBank ranging from 0to 345 sequences per gene per taxon (Table S1) For Eleocharis and Trapa only sequencesof rbcL gene and trnL gene were retained which makes it unfair to compare the efficiencyand suitability of eight candidate genes For the other ten taxa trnL trnH-psbA rbcL andpsbK-psbI showed the largest inter-specific divergence in five three one and one taxonomicgroups respectively In addition trnH-psbA atpF-atpH trnL and psbK-psbI showed thehighest mean divergence in four four one and one taxonomic groups respectivelyHowever given the small number of sequences and coverage of species the suitability andefficiency of atpF-atpH and psbK-psbI seem to be less reliable than others This comparisonmakes trnH-psbA trnL and rbcL to be selected out of the eight candidate genes As matKused to be recommended as the standard barcode gene for Carex species (Starr Naczi ampChouinard 2009) which happened to be the dominant food for herbivorous geese in ourstudy (Zhao et al 2015) we included matK as a supplement at last

Primers for these four genes (Table 1) were used to amplify the plants that we collectedin the field In total we collected 88 specimens in the field belonging to 25 families53 genera and 70 species (Table 2) The selected primers for trnL and rbcL successfullyamplified 100 and 91 of all species respectively while primers for trnH-psbA andmatKamplified only 71 and 43 respectively Therefore we chose trnL and rbcL to test theirdiscriminating power in our target plants

Yang et al (2016) PeerJ DOI 107717peerj2345 721

Table 1 Primers of candidate genes and reference library constructingOnly the c and h were used forhigh-throughput sequencing in fusion primer mode (primer+ tags) The unique tags were used to differ-entiate PCR products pooled together for highthroughput sequencing (Parameswaran et al 2007)

Gene Primer Sequence (5prime-3prime)

matK matK-XFa TAATTTACGATCAATTCATTCmatK-MALPb ACAAGAAAGTCGAAGTAT

rbcL rbcLa-Fc ATGTCACCACAAACAGAGACTAAAGCrbcLa-Rd GTAAAATCAAGTCCACCRCG

trnH-psbA pasbA3_fe CGCGCATGGTGGATTCACAATCCtrnHf_05f GTTATGCATGAACGTAATGCTC

trnL cg CGAAATCGGTAGACGCTACGhh CCATTGAGTCTCTGCACCTATC

Notesareferred to Ford et al (2009)breferred to Dunning amp Savolainen (2010)creferred to Hasebe et al (1994)dreferred to Kress et al (2009)ereferred to Tate amp Simpson (2003)freferred to Sang Crawford amp Stuessy (1997)greferred to Taberlet et al (1991)hreferred to Taberlet et al (2007)

We calculated the inter-specific divergence within genera and families with at leasttwo species to compare their discriminating power Maximal minimal and mean inter-specific distances were calculated for seven dominant genera and six dominant families(Table 3) Neither gene could differentiate species of Vallisneria (mean= 0000 plusmn 0000)or Artemisia (mean = 0000 plusmn 0000) But trnL showed a larger divergence rangefor the other six genera and five families Hence we chose trnL as the barcoding genefor reference library constructing and high-throughput sequencing for our study Thediscriminating power of trnL was strong for most species (Table 4) However some speciescould only be identified at genus-level or family-level with trnL For instance five speciesof Potamogetonaceae shared the same sequences and this made them to be identified atgenus-level Species could be identified easily to genus and family except for three grasses(Poaceae) Beckmannia syzigachne Phalaris arundinacea and Polypogon fugax which sharedidentical sequences

Data processing for estimating diet compositionIn total 021 and 018 million reads were generated for greater white-fronted goose(GWFG) and bean goose (BG) respectively (Table 5) The number of recovered OTUsranged from 8 to 123 for GWFG and BG samples We used local BLAST to compare thesesequences with the Shengjin Lake reference database Finally with DNA metabarocoding12 items were discovered in the feces of GWFG including one at family-level three atgenus-level and eight at species-level (Table 6) Four items were discovered in the feces ofBG including one at genus-level and three at species-level In total this method identified15 taxa in feces of these geese

Yang et al (2016) PeerJ DOI 107717peerj2345 821

Table 2 Plant species in the reference libraryWe collected these samples from Shengjin Lake

Species No of samples Species No of samples

Curculigo orchioides 1 Trapella sinensis 2Artemisia capillaris 1 Plantago asiatica 1Artemisia selengensis 2 Alopecurus aequalis 2Aster subulatus 1 Beckmannia syzigachne 1Bidens frondosa 1 Bromus japonicus 1Erigeron annuus 1 Cynodon dactylon 2Gnaphalium affine 1 Phalaris arundinacea 1Hemistepta lyrata 1 Phragmites australis 1Kalimeris incisa 1 Poa annua 1Bothriospermum kusnezowii 1 Polypogon fugax 1Lobelia chinensis 1 Roegneria kamoji 2Sagina japonica 1 Zizania latifolia 1Stellaria media 2 Polygonum lapathifolium 4Calystegia hederacea 1 Polygonum orientale 1Cardamine lyrata 1 Polygonum perfoliatum 1Carex heterolepis 3 Polygonum persicaria 1Carex capricornis 1 Rumex trisetiferus 3Carex paxii 1 Potamogeton crispus 1Carex remotiuscula 1 Potamogeton maackianus 1Fimbristylis dichotoma 1 Potamogeton malaianus 1Eleocharis migoana 1 Potamogeton natans 1Scripus karuizawensis 1 Potamogeton pectinatus 1Nymphoides peltatum 1 Clematis florida 1Myriophyllum spicatum 1 Ranunculus chinensis 2Hydrilla verticillta 1 Ranunculus sceleratus 2Hydrocharis dubia 1 Potentilla freyniana 2Vallisineria spiralis 1 Gratiola japonica 1Vallisneria spinulosa 1 Mazus miquelii 2Juncus effusus 1 Veronica undulata 1Juncus gracillimus 1 Trapa bispinosa 1Leonurus japonicus 1 Trapa maximowiczii 1Salvia plebeia 1 Trapa pseudoincisa 1Glycine soja 1 Trapa quadrispinosa 1Vicia sativa 1 Hydrocotyle sibthorpioides 1Euryale ferox 1 Torilis japonica 2

However the sequence percentage of each food item varied greatly (Table 6) ForGWFG the majority of sequences (9636) were composed of only five itemsmdashPoaceaespp (4798 except Poa annua) Poa annua (2186) Carex heterolepis (1751) Carexspp (901 except Carex heterolepis) and Alopecurus aequalis (321) For BG almostall the sequences belonged to Carex heterolepis (9949) Other items only occupied arelatively small proportion of sequences In addition the presence of each item per samplewas also unequal (Table S2) For example in GWFG Carex heterolepis Carex spp Poa

Yang et al (2016) PeerJ DOI 107717peerj2345 921

Table 3 Inter-specific divergences within dominant genera and families of rbcL gene and trnL gene with Kiruma 2-Parameter modelUnder-scores indicate the most common food composition based on earlier microhistologic analysis (Zhao et al 2012 Zhao et al 2015)

Inter-specific divergence Taxa rbcL trnL

Maximal Minimal Mean Maximal Minimal Mean

Artemisia 0000 0000 0000plusmn 0000 0000 0000 0000plusmn 0000Carex 0013 0000 0008plusmn 0006 0058 0000 0027plusmn 0021Polygonum 0027 0000 0010plusmn 0006 0076 0000 0033plusmn 0022Potamogeton 0012 0000 0005plusmn 00034 0016 0000 0005plusmn 0005Ranunculus 0031 0000 0020plusmn 0009 0042 0021 0024plusmn 0022Trapa 0000 0000 0000plusmn 0000 0081 0000 0049plusmn 0030

Withingenera

Vallisneria 0000 0000 0000plusmn 0000 0000 0000 0000plusmn 0000Cyperaceae 0043 0000 0018plusmn 0010 0178 0000 0084plusmn 0046Asteraceae 0120 0000 0049plusmn 0017 0087 0000 0023plusmn 0018Poaceae 0025 0000 0016plusmn 00009 0166 0000 0074plusmn 0039Hydrocharitaceae 0122 0000 0078plusmn 0020 0159 0000 0100plusmn 0054Polygonaceae 0043 0000 0020plusmn 0009 0129 0000 0031plusmn 0022

Withinfamilies

Ranunculaceae 0033 0016 0017plusmn 0015 0045 0000 0018plusmn 0013

Table 4 Number of species and unique sequences for families with more than one species in ShengjinLake plant database

Family No of species No of sequences

Asteraceae 8 7Caryophyllaceae 2 2Cyperaceae 7 5Fabaceae 2 2Hydrocharitaceae 4 3Lamiaceae 2 2Poaceae 10 8Polygonaceae 5 5Potamogetonaceae 5 1Ranunculaceae 3 3Scrophulariaceae 3 3Trapaceae 4 3Umbelliferae 2 2

annua and Potentilla supina were present in almost all the samples while Stellaria mediaAsteraceae sp and Lapsana apogonoldes occurred in only about one third of samples

When the microhistologic examination was performed using the same samples eightitems were found in the feces of greater white-fronted goose including one at family-levelfour at genus-level and three at species-level (Table 6) Dominant items were Poaceae spp(4568) Alopecurus Linn (3093) and Carex heterolepis (1639) Seven items werefound in the feces of bean goose including four at genus-level and three at species-level(Table 6) Dominant items were Carex heterolepis (6285) Asteraceae sp (1455) andAlopecurus Linn (1318)

Yang et al (2016) PeerJ DOI 107717peerj2345 1021

Table 5 Summary of the process and results of high-throughput sequencing analysis

Sample Pair-endsequences

Sequences for whichprimers and tags wereidentified and with lengthgt100 bp

Uniquesequences

OTUs Food items

GWFG1 16303 8627 1288 78 8GWFG2 25482 13449 1091 102 8GWFG3 19063 10056 1277 48 10GWFG4 23856 12548 1419 114 8GWFG5 20955 11249 1720 123 9GWFG6 11677 7205 973 52 9GWFG7 13377 6782 1328 59 9GWFG8 7749 3959 774 89 9GWFG9 16833 8799 1436 90 6GWFG10 18474 9819 449 32 9GWFG11 19648 10458 617 31 6BG1 20225 10254 784 23 4BG2 14195 7161 564 16 2BG3 2229 1149 255 12 4BG4 517 268 77 8 3BG5 28152 14033 1000 15 3BG6 16723 8484 740 17 4BG7 30166 15403 974 15 4BG8 30928 15706 1028 15 3BG9 8382 4489 446 13 4BG10 10714 5526 537 13 4

NotesGWFG Greater white-fronted goose BG Bean goose

DISCUSSIONMarker selection and reference library constructing for diet analysisWith greatly reduced cost extremely high throughput and information contentmetabarcoding has revolutionized the exploration and quantification of dietary analysiswith noninvasive samples containing degraded DNA (Fonseca et al 2010 Shokralla et al2014) Despite enormous potential to boost data acquisition successful application of thistechnology relies greatly on the power and efficiency of genetic markers and correspondingprimers (Bik et al 2012 Zhan et al 2014) In order to select the most appropriate markergene for our study we compared the performance of eight commonly used chloroplastgenes (rbcL rpoB rpoC1 matK trnL trnH-psbA atpF-atpH and psbK-psbI) and theircorresponding primers Although a higher level of discriminating power was shown inseveral studies atpF-atpH and psbK-psbI were not as commonly used as other barcodinggenes (Hollingsworth Graham amp Little 2011) As one of the most rapidly evolving codinggenes of plastid genomes matK was considered as the closest plant analogue to theanimal barcode COI (Hilu amp Liang 1997) However matK was difficult to amplify using

Yang et al (2016) PeerJ DOI 107717peerj2345 1121

Table 6 List of the lowest taxonomic food items in the diet of geese

Food items Level of identification GWFG BG

N reads Fs () Fm () N reads Fs () Fm ()

Poaceae spp (except Poa annua) Family 51705 4798 4568 0 000 000Poa annua Species 23554 2186 000 167 020 000Carex heterolepis Species 18867 1751 1639 81457 9949 6285Carex spp (except Carex heterolepis) Genus 9706 901 231 191 023 349Alopecurus aequalis Species 3458 321 000 0 000 000Potentilla chinensis Species 184 017 118 65 008 206Cynodon dactylon Species 155 014 000 0 000 000Polygonum spp Genus 56 005 000 0 000 000Stellaria media Species 26 002 000 0 000 000Ranunculus chinensis Species 14 002 000 0 000 000Lapsana apogonoides Species 11 002 000 0 000 000Asteraceae sp Genus 16 001 233 0 000 1455Alopecurus Genus 0 000 3093 0 000 1318Carex thunbergii Species 0 000 054 0 000 279Fabaceae sp Genus 0 000 064 0 000 108

NotesGWFG Greater white-fronted goose BG Bean goose Fs percentage of sequences in DNA metabarcoding Fm percentage of epidermis squares in microhistological analysis

available primer sets with only 43 of successful amplification in this study In spite ofthe higher species discrimination success of trnH-psbA than rbcL+matK in some groupsthe presence of duplicated loci microinversions and premature termination of readsby mononucleotide repeats lead to considerable proportion of low-quality sequences andover-estimation of genetic difference when using trnH-psbA (Graham et al 2000WhitlockHale amp Groff 2010) In contrast the barcode region of rbcL is easy to amplify sequenceand align in most plants and was recommended as the standard barcode for land plants(Chase et al 2007) The relatively modest discriminating power (compared to trnL) pre-cludes its application for our study aiming to recover high resolution of food items Conse-quently trnL was selected out of eight candidate markers with 100 amplification successmore than 90 of high quality sequences and relatively large inter-specific divergence

One of the biggest obstacles in biodiversity assessment and dietary analysis is the lack ofa comprehensive reference library without which it is impossible to accurately interpretand assign sequences generated from high-throughput sequencing (Valentini Pompanonamp Taberlet 2009 Barco et al 2015) In this study we constructed a local reference libraryby amplifying the most common species (70 morpho-species in total) during the winteringperiod with the trnL gene Although not all of them could be identified at species-levelwith trnL due to relatively low inter-specific divergence many species could be separatedwith distinctive sequences Previous studies have recommended group-specific barcodes todifferentiate closely related plants at the species level (Li et al 2015) For instance matKhas been proved to be more efficient for the discrimination of Carex spp (Starr Naczi ampChouinard 2009) However the primer set of matK failed to amplify species of Carex spp

Yang et al (2016) PeerJ DOI 107717peerj2345 1221

in our study suggesting the universality of selected primer pairs should be tested in eachstudy (Zhan et al 2014)

Applications of metabarcoding for geese diet analysisA variety of recent studies have demonstrated the great potential of metabarcoding fordietary analysis mainly owing to the high throughput high discriminating power and theability to process large-scale samples simultaneously (Creer et al 2010 Taberlet et al 2012Shehzad et al 2012) In this study we applied this method to recover diets of herbivorousgeese and provided standard protocols for dietary analysis of these two ecologicallyimportant waterbirds Our results further proved the more objective less experience-dependent and more time-efficient character of DNA metabarcoding However not allthe species in the reference library could be identified at species-level owing to low inter-specific divergence We suggest that multiple group-specific markers to be incorporatedin the future as in De Barba et al (2014) Two species Carex thunbergii and Fabaceae spwere only discovered via microhistologic analysis rather than metabarcoding This failuremight reflect the biased fragment amplification of current technology of which dominanttemplates could act as inhibitors of less dominant species (Pintildeol et al 2015) Howeverthree species of Poaceae were only discovered using metabarcoding In total more taxaand higher resolution were attained using metabarcoding But microhistology still proveda powerful supplementary Previous studies using metabarcoding usually detected dozensof food items even as many as more than one hundred species For instance 18 taxaprey were identified for leopard cat (Prionailurus bengalensis) (Shehzad et al 2012) 44plant taxa were recovered in feces of red-headed wood pigeon (Columba janthina nitens)(Ando et al 2013) while more than 100 taxa were found in diet studies of brown bear(Ursus arctos) (De Barba et al 2014) The relatively narrow diet spectrum of herbivorousgeese may lead to misunderstanding that this result of our study is merely an artefact dueto small sampling effort However this result is credible since these two geese species onlyfeed on Carex meadow where the dominant vegetation is Carex spp with other speciessuch as Poaceae and dicots (Zhao et al 2015) Even though other wetland plants exist theyusually composed only a small proportion of the geese diets

Quantification of food composition is another key concern in dietary analysis Althoughthe relative percentage of sequences was not truly a quantitative estimate of diet taxa ofthe majority sequences in this study were in accord with microhistologic observationswhich was considered an efficient way to provide quantitative results (Wang et al 2013)Discrepancies might come from the semi-quantitative nature of metabarcoding methods(Sun et al 2015) This is likely derived from PCR amplification which always entails biasescaused by universal primer-template mismatches annealing temperature or number ofPCR cycles (Zhan et al 2014Pintildeol et al 2015) Othermethods such as shotgun sequencingor metagenomic sequencing could be incorporated in the future to give information onabundances of food items (Srivathsan et al 2015)

Implications for waterbird conservation and wetland managementFor long-distance migratory waterbirds such as the wild geese in this study theirabundance and distribution are greatly influenced by diet availability and habitat use

Yang et al (2016) PeerJ DOI 107717peerj2345 1321

(Wang et al 2013) For example waterbirds may be restricted at (forced to leave) certainareas due to favoring (loss) of particular food (Wang et al 2013) while the recoveryof such food may contribute to return of bird populations (Noordhuis Van der Molenamp Van den Berg 2002) Results of both metabarcoding and microhistologic analysis inthis study revealed that Carex and Poaceae were dominant food components which is inaccordance with previous studies The increasing number of these two geese wintering atthe Shengjin Lake may be attributed to the expansion of Carex meadow which offers accessto abundant food resources (Zhao et al 2015) Considering the long-distance migratorycharacter of these birds it is important to maintain energy balances and good bodyconditions in wintering areas because this might further influence their departure datesand reproductive success after arriving at breeding areas (Prop Black amp Shimmings 2003)Based on this it is important for wetland managers to maintain the suitable habitats andfood resources for sustainable conservation of waterbirds which highlights the significanceof diet information Our study also indicated that overlap and dissimilarity existed betweenthe diets of these two geese Animals foraging in the same habitats may compete for limitedfood resources (Madsen amp Mortensen 1987) This discrepancy of food composition mayarise from the avoidance of inter-specific competition (Zhao et al 2015) However withthe increase of these two species in Shengjin Lake further research is needed to investigatethe mechanisms of food resource partitioning and spatial distribution

Shengjin Lake is one of themost important wintering sites for tens of thousands ofmigra-tory watebirds while annual life cycles of these birds depend on the whole migratory routeincluding breeding sites stop-over sites and wintering sites (Kear 2005) Thus a molecularreference library covering all the potential food items along the whole migratory routewill be useful both for understanding of wetland connections and waterbird conservationIn addition the ability of DNA metabarcoding to process lots of samples simultaneouslyenables rapid analyses and makes this method helpful for waterbird studies

ACKNOWLEDGEMENTSWe are very grateful to the stuff of the Shengjin Lake National Nature Reserve for theirexcellent assistance during the field work Great thanks to Zhujun Wang and An An forfeces collection in the field We thank Song Yang for collecting plants in the Shengjin LakeReserve We also thank Profs Zhenyu Li and Shuren Zhang for plant identification Specialthanks to Drs Meijuan Zhao Xin Wang Fanjuan Meng and Peihao Cong for preparingthe epidermis database and guiding microhistologic analysis

ADDITIONAL INFORMATION AND DECLARATIONS

FundingThis work was supported by the National Basic Research Program of China (973 ProgramGrant No 2012CB956104) to EZ the National Natural Science Foundation of China(Grant No 31370416) State Key Laboratory of Urban and Regional Ecology ChineseAcademy of Sciences (No SKLURE2013-1-05) to LC the National Natural Science

Yang et al (2016) PeerJ DOI 107717peerj2345 1421

Foundation of China (Grant No 31500315) to XW and 100-Talent Program of theChinese Academy of Sciences to AZ The funders had no role in study design datacollection and analysis decision to publish or preparation of the manuscript

Grant DisclosuresThe following grant information was disclosed by the authorsNational Basic Research Program of China 2012CB956104National Natural Science Foundation of China 31370416State Key Laboratory of Urban and Regional EcologyChinese Academy of Sciences SKLURE2013-1-05National Natural Science Foundation of China 31500315

Competing InterestsThe authors declare there are no competing interests

Author Contributionsbull Yuzhan Yang conceived and designed the experiments performed the experimentsanalyzed the data contributed reagentsmaterialsanalysis tools wrote the paperprepared figures andor tables reviewed drafts of the paperbull Aibin Zhan conceived and designed the experiments analyzed the data contributedreagentsmaterialsanalysis tools reviewed drafts of the paperbull Lei Cao conceived and designed the experiments contributed reagentsmaterialsanalysistools reviewed drafts of the paperbull Fanjuan Meng and Wenbin Xu performed the experiments contributed reagentsmate-rialsanalysis tools reviewed drafts of the paper

Animal EthicsThe following information was supplied relating to ethical approvals (ie approving bodyand any reference numbers)

Our research work did not involve capture or any direct manipulation or disturbancesof animals

Field Study PermissionsThe following information was supplied relating to field study approvals (ie approvingbody and any reference numbers)

We collected samples of plants and feces for molecular analyses We received permissionto access the reserve for sampling from the Shengjin Lake National Nature ReserveAdministration (Chizhou Anhui China) which is responsible for the management of theprotected area and wildlife We were forbidden to capture or disturb geese in the field

DNA DepositionThe following information was supplied regarding the deposition of DNA sequences

NCBI Sequence Read Archive (SRA) databaseAccession number SRP070470

Yang et al (2016) PeerJ DOI 107717peerj2345 1521

Data AvailabilityThe following information was supplied regarding data availability

The raw data has been supplied as a Supplemental Dataset

Supplemental InformationSupplemental information for this article can be found online at httpdxdoiorg107717peerj2345supplemental-information

REFERENCESAltschul SF GishWMillerWMyers EW Lipman DJ 1990 Basic local alignment

search tool Journal of Molecular Biology 215403ndash410 DOI 101006jmbi19909999Aacutelvarez I Wendel JF 2003 Ribosomal ITS sequences and plant phylogenetic inference

Molecular Phylogenetics and Evolution 29417ndash434DOI 101016S1055-7903(03)00208-2

Ando H Setsuko S Horikoshi K Suzuki H Umehara S Inoue-MurayamaM IsagiY 2013 Diet analysis by nextmdashgeneration sequencing indicates the frequentconsumption of introduced plants by the critically endangered redmdashheaded woodpigeon (Columba janthina nitens) in oceanic island habitats Ecology and Evolution34057ndash4069 DOI 101002ece3773

Barco A RaupachMJ Laakmann S NeumannH Knebelsberger T 2015 Identificationof North Sea molluscs with DNA barcodingMolecular Ecology Resources 16288ndash297DOI 1011111755-099812440

Bik HM Porazinska DL Creer S Caporaso JG Knight R ThomasWK 2012 Sequenc-ing our way towards understanding global eukaryotic biodiversity Trends in Ecologyand Evolution 27233ndash243 DOI 101016jtree201111010

Bohmann K Evans A Gilbert MTP Carvalho GR Creer S KnappM YuWD DeBruynM 2014 Environmental DNA for wildlife biology and biodiversity monitor-ing Trends in Ecology and Evolution 29358ndash367 DOI 101016jtree201404003

Brander LM Florax RJ Vermaat JE 2006 The empirics of wetland valuation acomprehensive summary and a meta-analysis of the literature Environmental andResource Economics 33223ndash250 DOI 101007s10640-005-3104-4

Chase MW Cowan RS Hollingsworth PM Van den Berg C Madrintildeaacuten S Petersen GSeberg O Jorgsensen T Cameron KM Carine M Pedersen N Hedderson TAJConrad F Salazar GA Richardson JE HollingsworthM Barraclough TG KellyL WilkinsonM 2007 A proposal for a standardised protocol to barcode all landplants Taxon 56295ndash299

Cota-Sanchez JH Remarchuk K Ubayasena K 2006 Ready-to-use DNA extracted witha CTAB method adapted for herbarium specimens and mucilaginous plant tissuePlant Molecular Biology Reporter 24161ndash167 DOI 101007BF02914055

Creer S Fonseca VG Porazinska DL Giblin-Davis RM SungW Power DM Packer MCarvalho GR Blaxter ML Lambshead P ThomasWK 2010 Ultrasequencing ofthe meiofaunal biosphere Practice pitfalls and promisesMolecular Ecology 194ndash20DOI 101111j1365-294X200904473x

Yang et al (2016) PeerJ DOI 107717peerj2345 1621

De BarbaMMiquel C Boyer F Mercier C Rioux D Coissac E Taberlet P 2014 DNAmetabarcoding multiplexing and validation of data accuracy for diet assessmentapplication to omnivorous dietMolecular Ecology Resources 14306ndash323DOI 1011111755-099812188

Deagle BE Kirkwood R Jarman SN 2009 Analysis of Australian fur seal diet bypyrosequencing prey DNA in faecesMolecular Ecology 182022ndash2038DOI 101111j1365-294X200904158x

Dunning LT Savolainen V 2010 Broad-scale amplification of matK for DNA barcodingplants a technical note Botanical Journal of the Linnean Society 1641ndash9DOI 101111j1095-8339201001071x

Edgar RC 2010 Search and clustering orders of magnitude faster than BLAST Bioinfor-matics 262460ndash2461 DOI 101093bioinformaticsbtq461

Edgar RC 2013 UPARSE highly accurate OTU sequences from microbial ampliconreads Nature Methods 10996ndash998 DOI 101038NMETH2604

Elliott TL Jonathan Davies T 2014 Challenges to barcoding an entire floraMolecularEcology Resources 14883ndash891 DOI 1011111755-099812277

Fazekas AJ Burgess KS Kesanakurti PR Graham SW Newmaster SG Husband BCPercy DM Hajibabaei M Barrett SC 2008Multiple multilocus DNA barcodesfrom the plastid genome discriminate plant species equally well PLoS ONE 3e2802DOI 101371journalpone0002802

Fonseca VG Carvalho GR SungW Johnson HF Power DM Neill SP Packer MBlaxter ML Labmshead PJD ThomasWK Creer S 2010 Second-generationenvironmental sequencing unmasks marine metazoan biodiversity Nature Commu-nications 1Article 98 DOI 101038ncomms1095

Ford CS Ayres KL Haider N Toomey N Van-Alpen-Stohl J 2009 Selection ofcandidate DNA barcoding regions for use on land plants Botanical Journal of theLinnean Society 1591ndash11 DOI 101111j1095-8339200800938x

Fox AD Bergersen E Tombre IM Madsen J 2007Minimal intra-seasonal dietaryoverlap of barnacle and pink-footed geese on their breeding grounds in SvalbardPolar Biology 30759ndash768 DOI 101007s00300-006-0235-1

Fox AD Cao L Zhang Y Barter M ZhaoMMeng FWang S 2011 Declines in thetuber-feeding waterbird guild at Shengjin Lake National Nature Reserve Chinandashabarometer of submerged macrophyte collapse Aquatic Conservation Marine andFreshwater Ecosystems 2182ndash91 DOI 101002aqc1154

Graham SW Reeves PA Burns AC Olmstead RG 2000Microstructural changes innoncoding chloroplast DNA interpretation evolution and utility of indels andinversions in basal angiosperm phylogenetic inference International Journal of PlantSciences 161S83ndashS96 DOI 101086317583

HasebeM Omori T NakazawaM Sano T KatoM Iwatsuki K 1994 rbcL genesequences provide evidence for the evolutionary lineages of leptosporangiate fernsProceedings of the National Academy of Sciences of the United States of America915730ndash5734 DOI 101073pnas91125730

Yang et al (2016) PeerJ DOI 107717peerj2345 1721

Hibert F Taberlet P Chave J Scotti-Saintagne C Sabatier D Richard-Hansen C 2013Unveiling the diet of elusive rainforest herbivores in next generation sequencing eraThe tapir as a case study PLoS ONE 8e60799 DOI 101371journalpone0060799

Hilu KW Liang H 1997 ThematK gene sequence variation and application in plantsystematics American Journal of Botany 84830ndash839 DOI 1023072445819

Hollingsworth PM Graham SW Little DP 2011 Choosing and using a plant DNAbarcode PLoS ONE 6e19254 DOI 101371journalpone0019254

James HF Burney DA 1997 The diet and ecology of Hawaiirsquos extinct flightless water-fowl evidence from coprolites Biological Journal of the Linnean Society 62279ndash297DOI 101111j1095-83121997tb01627x

Kear J 2005Ducks geese and swans Oxford Oxford University PressKressWJ Erickson DL Jones FA Swenson NG Perez R Sanjur O Bermingham E

2009 Plant DNA barcodes and a community phylogeny of a tropical forest dynamicsplot in Panama Proceedings of the National Academy of Sciences of the United States ofAmerica 10618621ndash18626 DOI 101073pnas0909820106

Li X Yang Y Henry RJ Rossetto MWang Y Chen S 2015 Plant DNA barcoding fromgene to genome Biological Reviews 90157ndash166 DOI 101111brv12104

Ma Z Cai Y Li B Chen J 2010Managing wetland habitats for waterbirds an interna-tional perspectiveWetlands 3015ndash27 DOI 101007s13157-009-0001-6

Madsen J Mortensen CE 1987Habitat exploitation and interspecific competition ofmoulting geese in East Greenl Ibis 12925ndash44DOI 101111j1474-919X1987tb03157x

Noordhuis R Van der Molen DT Van den BergMS 2002 Response of herbivorouswater-birds to the return of Chara in Lake Veluwemeer The Netherlands AquaticBotany 72349ndash367 DOI 101016S0304-3770(01)00210-8

Parameswaran P Jalili R Tao L Shokralla S Gharizadeh B Ronaghi M Fire AZ 2007A pyrosequencing-tailored nucleotide barcode design unveils opportunities for large-scale sample multiplexing Nucleic Acids Research 35e130 DOI 101093nargkm760

Pintildeol J Mir G Gomez-Polo P Agustiacute N 2015 Universal and blocking primermismatches limit the use of high-throughput DNA sequencing for the quanti-tative metabarcoding of arthropodsMolecular Ecology Resources 15819ndash830DOI 1011111755-099812355

Pompanon F Deagle BE SymondsonWO Brown DS Jarman SN Taberlet P 2012Who is eating what diet assessment using next generation sequencingMolecularEcology 211931ndash1950 DOI 101111j1365-294X201105403x

Prop J Black JM Shimmings P 2003 Travel schedules to the high arctic barnaclegeese trademdashoff the timing of migration with accumulation of fat deposits Oikos103403ndash414 DOI 101034j1600-0706200312042x

Rayeacute G Miquel C Coissac E Redjadj C Loison A Taberlet P 2011 New insightson diet variability revealed by DNA barcoding and high-throughput sequenc-ing chamois diet in autumn as a case study Ecological Research 26265ndash276DOI 101007s11284-010-0780-5

Yang et al (2016) PeerJ DOI 107717peerj2345 1821

Reynolds C Miranda NA Cumming GS 2015 The role of waterbirds in the dispersalof aquatic alien and invasive species Diversity and Distributions 21744ndash754DOI 101111ddi12334

Samelius G Alisauskas RT 1999 Diet and growth of glaucous gulls at a large Arcticgoose colony Canadian Journal of Zoology 771327ndash1331 DOI 101139z99-091

Sang T Crawford DJ Stuessy TF 1997 Chloroplast DNA phylogeny reticulateevolution and biogeography of Paeonia (Paeoniaceae) American Journal of Botany841120ndash1136

ShehzadW Riaz T NawazMAMiquel C Poillot C Shah SA Pompanon F CoissacE Taberlet P 2012 Carnivore diet analysis based on next-generation sequencingapplication to the leopard cat (Prionailurus bengalensis) in PakistanMolecularEcology 211951ndash1965 DOI 101111j1365-294X201105424x

Shokralla S Gibson JF Nikbakht H Janzen DH HallwachsW Hajibabaei M 2014Next-generation DNA barcoding using next-generation sequencing to enhance andaccelerate DNA barcode capture from single specimensMolecular Ecology Resources14892ndash901 DOI 1011111755-099812236

Shokralla S Spall JL Gibson JF 2012 Next-generation sequencing technologies forenvironmental DNA researchMolecular Ecology 211794ndash1805DOI 101111j1365-294X201205538x

Srivathsan A Sha J Vogler AP Meier R 2015 Comparing the effectiveness of metage-nomics and metabarcoding for diet analysis of a leaf-feeding monkey (Pygathrixnemaeus)Molecular Ecology Resources 15250ndash261 DOI 1011111755-099812302

Starr JR Naczi RFC Chouinard BN 2009 Plant DNA barcodes and species res-olution in sedge (Carex Cyperaceae)Molecular Ecology Resources 9151ndash163DOI 101111j1755-0998200902640x

Sun C Zhao Y Li H Dong Y MacIsaac HJ Zhan A 2015 Unreliable quantificationof species abundance based on high-throughput sequencing data of zooplanktoncommunities Aquatic Biology 249ndash15 DOI 103354ab00629

Swennen C Yu YT 2005 Food and feeding behavior of the black-faced SpoonbillWaterbirds 2819ndash27 DOI 1016751524-4695(2005)028[0019FAFBOT]20CO2

SymondsonWOC 2002Molecular identification of prey in predator dietsMolecularEcology 11627ndash641 DOI 101046j1365-294X200201471x

Taberlet P Coissac E Pompanon F Brochmann CWillerslev E 2012 Towards next-generation biodiversity assessment using DNA metabarcodingMolecular Ecology212045ndash2050 DOI 101111j1365-294X201205470x

Taberlet P Coissac E Pompanon F Gielly L Miquel C Valentini A Vermat TCorthier G Brocmann CWillerslev E 2007 Power and limitations of the chloro-plast trnL (UAA) intron for plant DNA barcoding Nucleic Acids Research 35e14DOI 101093nargkl938

Taberlet P Gielly L Pautou G Bouvet J 1991 Universal primers for amplifica-tion of three non-coding regions of chloroplast DNA Plant Molecular Biology171105ndash1109 DOI 101007BF00037152

Yang et al (2016) PeerJ DOI 107717peerj2345 1921

Tamura K Stecher G Peterson D Filipski A Kumar S 2013MEGA6 molecular evolu-tionary genetics analysis version 60MolecularBiology and Evolution 302725ndash2729DOI 101093molbevmst197

Tate JA Simpson BB 2003 Paraphyly of Tarasa (Malvaceae) and diverse origins of thepolyploidy species Systematic Botany 28723ndash737 DOI 10104302-641

Valentini A Pompanon F Taberlet P 2009 DNA barcoding for ecologists Trends inEcology and Evolution 24110ndash117 DOI 101016jtree200809011

Wang X Fox AD Cong P Barter M Cao L 2012 Changes in the distribution and abun-dance of wintering Lesser White-fronted Geese Anser erythropus in eastern ChinaBird Conservation International 22128ndash134 DOI 101017S095927091100030X

Wang X Fox AD Cong P Cao L 2013 Food constraints explain the restricted dis-tribution of wintering Lesser white-fronted Geese Anser erythropus in China Ibis155576ndash592 DOI 101111ibi12039

Whitlock BA Hale AM Groff PA 2010 Intraspecific inversions pose a challenge for thetrnH-psbA plant DNA barcode PLoS ONE 5e11533DOI 101371journalpone0011533

Wolfe KH LiWH Sharp PM 1987 Rates of nucleotide substitution vary greatlyamong plant mitochondrial chloroplast and nuclear DNAs Proceedings ofthe National Academy of Sciences of the United States of America 849054ndash9058DOI 101073pnas84249054

Xu C DongW Shi S Cheng T Li C Liu YWu PWuH Gao P Zhou S 2015 Acceler-ating plant DNA barcode reference library construction using herbarium specimensimproved experimental techniquesMolecular Ecology Resources 151366ndash1374DOI 1011111755-099812413

Xu L XuW Sun Q Zhou Z Shen J Zhao X 2008 Flora and vegetation in ShengjinLake Journal of Wuhan Botanical Research 27264ndash270

Zhan A Bailey SA Heath DDMacIsaac HJ 2014 Performance comparison of geneticmarkers for high-throughput sequencing-based biodiversity assessment in complexcommunitiesMolecular Ecology Resources 141049ndash1059DOI 1011111755-099812254

Zhan A MacIsaac HJ 2015 Rare biosphere exploration using high-throughput sequenc-ing research progress and perspectives Conservation Genetics 16513ndash522DOI 101007s10592-014-0678-9

Zhang Y Cao L Barter M Fox AD ZhaoMMeng F Shi H 2011 Changing distributionand abundance of Swan Goose Anser cygnoides in the Yangtze River floodplainthe likely loss of a very important wintering site Bird Conservation International2136ndash48 DOI 101017S0959270910000201

ZhaoM Cao L Fox AD 2013 Distribution and diet of wintering Tundra Bean GeeseAnser fabalis serrirostris at Shengjin LakeYangtze River floodplain ChinaWildfow6052ndash63

ZhaoM Cao L KlaassenM Zhang Y Fox AD 2015 Avoiding competition Site usediet and foraging behaviours in two similarly sized geese wintering in China Ardea10327ndash38 DOI 105253ardev103i1a3

Yang et al (2016) PeerJ DOI 107717peerj2345 2021

ZhaoM Cong P Barter M Fox AD Cao L 2012 The changing abundanceand distribu-tion of Greater White-fronted Geese Anser albifrons in the Yangtze River floodplainimpacts of recenthydrological changes Bird Conservation International 22135ndash143DOI 101017S0959270911000542

Yang et al (2016) PeerJ DOI 107717peerj2345 2121

Page 8: Selection of a marker gene to construct and the application ...We concluded that DNA metabarcoding provides new perspectives for studies of herbivorous waterbird diets and inter-specific

Table 1 Primers of candidate genes and reference library constructingOnly the c and h were used forhigh-throughput sequencing in fusion primer mode (primer+ tags) The unique tags were used to differ-entiate PCR products pooled together for highthroughput sequencing (Parameswaran et al 2007)

Gene Primer Sequence (5prime-3prime)

matK matK-XFa TAATTTACGATCAATTCATTCmatK-MALPb ACAAGAAAGTCGAAGTAT

rbcL rbcLa-Fc ATGTCACCACAAACAGAGACTAAAGCrbcLa-Rd GTAAAATCAAGTCCACCRCG

trnH-psbA pasbA3_fe CGCGCATGGTGGATTCACAATCCtrnHf_05f GTTATGCATGAACGTAATGCTC

trnL cg CGAAATCGGTAGACGCTACGhh CCATTGAGTCTCTGCACCTATC

Notesareferred to Ford et al (2009)breferred to Dunning amp Savolainen (2010)creferred to Hasebe et al (1994)dreferred to Kress et al (2009)ereferred to Tate amp Simpson (2003)freferred to Sang Crawford amp Stuessy (1997)greferred to Taberlet et al (1991)hreferred to Taberlet et al (2007)

We calculated the inter-specific divergence within genera and families with at leasttwo species to compare their discriminating power Maximal minimal and mean inter-specific distances were calculated for seven dominant genera and six dominant families(Table 3) Neither gene could differentiate species of Vallisneria (mean= 0000 plusmn 0000)or Artemisia (mean = 0000 plusmn 0000) But trnL showed a larger divergence rangefor the other six genera and five families Hence we chose trnL as the barcoding genefor reference library constructing and high-throughput sequencing for our study Thediscriminating power of trnL was strong for most species (Table 4) However some speciescould only be identified at genus-level or family-level with trnL For instance five speciesof Potamogetonaceae shared the same sequences and this made them to be identified atgenus-level Species could be identified easily to genus and family except for three grasses(Poaceae) Beckmannia syzigachne Phalaris arundinacea and Polypogon fugax which sharedidentical sequences

Data processing for estimating diet compositionIn total 021 and 018 million reads were generated for greater white-fronted goose(GWFG) and bean goose (BG) respectively (Table 5) The number of recovered OTUsranged from 8 to 123 for GWFG and BG samples We used local BLAST to compare thesesequences with the Shengjin Lake reference database Finally with DNA metabarocoding12 items were discovered in the feces of GWFG including one at family-level three atgenus-level and eight at species-level (Table 6) Four items were discovered in the feces ofBG including one at genus-level and three at species-level In total this method identified15 taxa in feces of these geese

Yang et al (2016) PeerJ DOI 107717peerj2345 821

Table 2 Plant species in the reference libraryWe collected these samples from Shengjin Lake

Species No of samples Species No of samples

Curculigo orchioides 1 Trapella sinensis 2Artemisia capillaris 1 Plantago asiatica 1Artemisia selengensis 2 Alopecurus aequalis 2Aster subulatus 1 Beckmannia syzigachne 1Bidens frondosa 1 Bromus japonicus 1Erigeron annuus 1 Cynodon dactylon 2Gnaphalium affine 1 Phalaris arundinacea 1Hemistepta lyrata 1 Phragmites australis 1Kalimeris incisa 1 Poa annua 1Bothriospermum kusnezowii 1 Polypogon fugax 1Lobelia chinensis 1 Roegneria kamoji 2Sagina japonica 1 Zizania latifolia 1Stellaria media 2 Polygonum lapathifolium 4Calystegia hederacea 1 Polygonum orientale 1Cardamine lyrata 1 Polygonum perfoliatum 1Carex heterolepis 3 Polygonum persicaria 1Carex capricornis 1 Rumex trisetiferus 3Carex paxii 1 Potamogeton crispus 1Carex remotiuscula 1 Potamogeton maackianus 1Fimbristylis dichotoma 1 Potamogeton malaianus 1Eleocharis migoana 1 Potamogeton natans 1Scripus karuizawensis 1 Potamogeton pectinatus 1Nymphoides peltatum 1 Clematis florida 1Myriophyllum spicatum 1 Ranunculus chinensis 2Hydrilla verticillta 1 Ranunculus sceleratus 2Hydrocharis dubia 1 Potentilla freyniana 2Vallisineria spiralis 1 Gratiola japonica 1Vallisneria spinulosa 1 Mazus miquelii 2Juncus effusus 1 Veronica undulata 1Juncus gracillimus 1 Trapa bispinosa 1Leonurus japonicus 1 Trapa maximowiczii 1Salvia plebeia 1 Trapa pseudoincisa 1Glycine soja 1 Trapa quadrispinosa 1Vicia sativa 1 Hydrocotyle sibthorpioides 1Euryale ferox 1 Torilis japonica 2

However the sequence percentage of each food item varied greatly (Table 6) ForGWFG the majority of sequences (9636) were composed of only five itemsmdashPoaceaespp (4798 except Poa annua) Poa annua (2186) Carex heterolepis (1751) Carexspp (901 except Carex heterolepis) and Alopecurus aequalis (321) For BG almostall the sequences belonged to Carex heterolepis (9949) Other items only occupied arelatively small proportion of sequences In addition the presence of each item per samplewas also unequal (Table S2) For example in GWFG Carex heterolepis Carex spp Poa

Yang et al (2016) PeerJ DOI 107717peerj2345 921

Table 3 Inter-specific divergences within dominant genera and families of rbcL gene and trnL gene with Kiruma 2-Parameter modelUnder-scores indicate the most common food composition based on earlier microhistologic analysis (Zhao et al 2012 Zhao et al 2015)

Inter-specific divergence Taxa rbcL trnL

Maximal Minimal Mean Maximal Minimal Mean

Artemisia 0000 0000 0000plusmn 0000 0000 0000 0000plusmn 0000Carex 0013 0000 0008plusmn 0006 0058 0000 0027plusmn 0021Polygonum 0027 0000 0010plusmn 0006 0076 0000 0033plusmn 0022Potamogeton 0012 0000 0005plusmn 00034 0016 0000 0005plusmn 0005Ranunculus 0031 0000 0020plusmn 0009 0042 0021 0024plusmn 0022Trapa 0000 0000 0000plusmn 0000 0081 0000 0049plusmn 0030

Withingenera

Vallisneria 0000 0000 0000plusmn 0000 0000 0000 0000plusmn 0000Cyperaceae 0043 0000 0018plusmn 0010 0178 0000 0084plusmn 0046Asteraceae 0120 0000 0049plusmn 0017 0087 0000 0023plusmn 0018Poaceae 0025 0000 0016plusmn 00009 0166 0000 0074plusmn 0039Hydrocharitaceae 0122 0000 0078plusmn 0020 0159 0000 0100plusmn 0054Polygonaceae 0043 0000 0020plusmn 0009 0129 0000 0031plusmn 0022

Withinfamilies

Ranunculaceae 0033 0016 0017plusmn 0015 0045 0000 0018plusmn 0013

Table 4 Number of species and unique sequences for families with more than one species in ShengjinLake plant database

Family No of species No of sequences

Asteraceae 8 7Caryophyllaceae 2 2Cyperaceae 7 5Fabaceae 2 2Hydrocharitaceae 4 3Lamiaceae 2 2Poaceae 10 8Polygonaceae 5 5Potamogetonaceae 5 1Ranunculaceae 3 3Scrophulariaceae 3 3Trapaceae 4 3Umbelliferae 2 2

annua and Potentilla supina were present in almost all the samples while Stellaria mediaAsteraceae sp and Lapsana apogonoldes occurred in only about one third of samples

When the microhistologic examination was performed using the same samples eightitems were found in the feces of greater white-fronted goose including one at family-levelfour at genus-level and three at species-level (Table 6) Dominant items were Poaceae spp(4568) Alopecurus Linn (3093) and Carex heterolepis (1639) Seven items werefound in the feces of bean goose including four at genus-level and three at species-level(Table 6) Dominant items were Carex heterolepis (6285) Asteraceae sp (1455) andAlopecurus Linn (1318)

Yang et al (2016) PeerJ DOI 107717peerj2345 1021

Table 5 Summary of the process and results of high-throughput sequencing analysis

Sample Pair-endsequences

Sequences for whichprimers and tags wereidentified and with lengthgt100 bp

Uniquesequences

OTUs Food items

GWFG1 16303 8627 1288 78 8GWFG2 25482 13449 1091 102 8GWFG3 19063 10056 1277 48 10GWFG4 23856 12548 1419 114 8GWFG5 20955 11249 1720 123 9GWFG6 11677 7205 973 52 9GWFG7 13377 6782 1328 59 9GWFG8 7749 3959 774 89 9GWFG9 16833 8799 1436 90 6GWFG10 18474 9819 449 32 9GWFG11 19648 10458 617 31 6BG1 20225 10254 784 23 4BG2 14195 7161 564 16 2BG3 2229 1149 255 12 4BG4 517 268 77 8 3BG5 28152 14033 1000 15 3BG6 16723 8484 740 17 4BG7 30166 15403 974 15 4BG8 30928 15706 1028 15 3BG9 8382 4489 446 13 4BG10 10714 5526 537 13 4

NotesGWFG Greater white-fronted goose BG Bean goose

DISCUSSIONMarker selection and reference library constructing for diet analysisWith greatly reduced cost extremely high throughput and information contentmetabarcoding has revolutionized the exploration and quantification of dietary analysiswith noninvasive samples containing degraded DNA (Fonseca et al 2010 Shokralla et al2014) Despite enormous potential to boost data acquisition successful application of thistechnology relies greatly on the power and efficiency of genetic markers and correspondingprimers (Bik et al 2012 Zhan et al 2014) In order to select the most appropriate markergene for our study we compared the performance of eight commonly used chloroplastgenes (rbcL rpoB rpoC1 matK trnL trnH-psbA atpF-atpH and psbK-psbI) and theircorresponding primers Although a higher level of discriminating power was shown inseveral studies atpF-atpH and psbK-psbI were not as commonly used as other barcodinggenes (Hollingsworth Graham amp Little 2011) As one of the most rapidly evolving codinggenes of plastid genomes matK was considered as the closest plant analogue to theanimal barcode COI (Hilu amp Liang 1997) However matK was difficult to amplify using

Yang et al (2016) PeerJ DOI 107717peerj2345 1121

Table 6 List of the lowest taxonomic food items in the diet of geese

Food items Level of identification GWFG BG

N reads Fs () Fm () N reads Fs () Fm ()

Poaceae spp (except Poa annua) Family 51705 4798 4568 0 000 000Poa annua Species 23554 2186 000 167 020 000Carex heterolepis Species 18867 1751 1639 81457 9949 6285Carex spp (except Carex heterolepis) Genus 9706 901 231 191 023 349Alopecurus aequalis Species 3458 321 000 0 000 000Potentilla chinensis Species 184 017 118 65 008 206Cynodon dactylon Species 155 014 000 0 000 000Polygonum spp Genus 56 005 000 0 000 000Stellaria media Species 26 002 000 0 000 000Ranunculus chinensis Species 14 002 000 0 000 000Lapsana apogonoides Species 11 002 000 0 000 000Asteraceae sp Genus 16 001 233 0 000 1455Alopecurus Genus 0 000 3093 0 000 1318Carex thunbergii Species 0 000 054 0 000 279Fabaceae sp Genus 0 000 064 0 000 108

NotesGWFG Greater white-fronted goose BG Bean goose Fs percentage of sequences in DNA metabarcoding Fm percentage of epidermis squares in microhistological analysis

available primer sets with only 43 of successful amplification in this study In spite ofthe higher species discrimination success of trnH-psbA than rbcL+matK in some groupsthe presence of duplicated loci microinversions and premature termination of readsby mononucleotide repeats lead to considerable proportion of low-quality sequences andover-estimation of genetic difference when using trnH-psbA (Graham et al 2000WhitlockHale amp Groff 2010) In contrast the barcode region of rbcL is easy to amplify sequenceand align in most plants and was recommended as the standard barcode for land plants(Chase et al 2007) The relatively modest discriminating power (compared to trnL) pre-cludes its application for our study aiming to recover high resolution of food items Conse-quently trnL was selected out of eight candidate markers with 100 amplification successmore than 90 of high quality sequences and relatively large inter-specific divergence

One of the biggest obstacles in biodiversity assessment and dietary analysis is the lack ofa comprehensive reference library without which it is impossible to accurately interpretand assign sequences generated from high-throughput sequencing (Valentini Pompanonamp Taberlet 2009 Barco et al 2015) In this study we constructed a local reference libraryby amplifying the most common species (70 morpho-species in total) during the winteringperiod with the trnL gene Although not all of them could be identified at species-levelwith trnL due to relatively low inter-specific divergence many species could be separatedwith distinctive sequences Previous studies have recommended group-specific barcodes todifferentiate closely related plants at the species level (Li et al 2015) For instance matKhas been proved to be more efficient for the discrimination of Carex spp (Starr Naczi ampChouinard 2009) However the primer set of matK failed to amplify species of Carex spp

Yang et al (2016) PeerJ DOI 107717peerj2345 1221

in our study suggesting the universality of selected primer pairs should be tested in eachstudy (Zhan et al 2014)

Applications of metabarcoding for geese diet analysisA variety of recent studies have demonstrated the great potential of metabarcoding fordietary analysis mainly owing to the high throughput high discriminating power and theability to process large-scale samples simultaneously (Creer et al 2010 Taberlet et al 2012Shehzad et al 2012) In this study we applied this method to recover diets of herbivorousgeese and provided standard protocols for dietary analysis of these two ecologicallyimportant waterbirds Our results further proved the more objective less experience-dependent and more time-efficient character of DNA metabarcoding However not allthe species in the reference library could be identified at species-level owing to low inter-specific divergence We suggest that multiple group-specific markers to be incorporatedin the future as in De Barba et al (2014) Two species Carex thunbergii and Fabaceae spwere only discovered via microhistologic analysis rather than metabarcoding This failuremight reflect the biased fragment amplification of current technology of which dominanttemplates could act as inhibitors of less dominant species (Pintildeol et al 2015) Howeverthree species of Poaceae were only discovered using metabarcoding In total more taxaand higher resolution were attained using metabarcoding But microhistology still proveda powerful supplementary Previous studies using metabarcoding usually detected dozensof food items even as many as more than one hundred species For instance 18 taxaprey were identified for leopard cat (Prionailurus bengalensis) (Shehzad et al 2012) 44plant taxa were recovered in feces of red-headed wood pigeon (Columba janthina nitens)(Ando et al 2013) while more than 100 taxa were found in diet studies of brown bear(Ursus arctos) (De Barba et al 2014) The relatively narrow diet spectrum of herbivorousgeese may lead to misunderstanding that this result of our study is merely an artefact dueto small sampling effort However this result is credible since these two geese species onlyfeed on Carex meadow where the dominant vegetation is Carex spp with other speciessuch as Poaceae and dicots (Zhao et al 2015) Even though other wetland plants exist theyusually composed only a small proportion of the geese diets

Quantification of food composition is another key concern in dietary analysis Althoughthe relative percentage of sequences was not truly a quantitative estimate of diet taxa ofthe majority sequences in this study were in accord with microhistologic observationswhich was considered an efficient way to provide quantitative results (Wang et al 2013)Discrepancies might come from the semi-quantitative nature of metabarcoding methods(Sun et al 2015) This is likely derived from PCR amplification which always entails biasescaused by universal primer-template mismatches annealing temperature or number ofPCR cycles (Zhan et al 2014Pintildeol et al 2015) Othermethods such as shotgun sequencingor metagenomic sequencing could be incorporated in the future to give information onabundances of food items (Srivathsan et al 2015)

Implications for waterbird conservation and wetland managementFor long-distance migratory waterbirds such as the wild geese in this study theirabundance and distribution are greatly influenced by diet availability and habitat use

Yang et al (2016) PeerJ DOI 107717peerj2345 1321

(Wang et al 2013) For example waterbirds may be restricted at (forced to leave) certainareas due to favoring (loss) of particular food (Wang et al 2013) while the recoveryof such food may contribute to return of bird populations (Noordhuis Van der Molenamp Van den Berg 2002) Results of both metabarcoding and microhistologic analysis inthis study revealed that Carex and Poaceae were dominant food components which is inaccordance with previous studies The increasing number of these two geese wintering atthe Shengjin Lake may be attributed to the expansion of Carex meadow which offers accessto abundant food resources (Zhao et al 2015) Considering the long-distance migratorycharacter of these birds it is important to maintain energy balances and good bodyconditions in wintering areas because this might further influence their departure datesand reproductive success after arriving at breeding areas (Prop Black amp Shimmings 2003)Based on this it is important for wetland managers to maintain the suitable habitats andfood resources for sustainable conservation of waterbirds which highlights the significanceof diet information Our study also indicated that overlap and dissimilarity existed betweenthe diets of these two geese Animals foraging in the same habitats may compete for limitedfood resources (Madsen amp Mortensen 1987) This discrepancy of food composition mayarise from the avoidance of inter-specific competition (Zhao et al 2015) However withthe increase of these two species in Shengjin Lake further research is needed to investigatethe mechanisms of food resource partitioning and spatial distribution

Shengjin Lake is one of themost important wintering sites for tens of thousands ofmigra-tory watebirds while annual life cycles of these birds depend on the whole migratory routeincluding breeding sites stop-over sites and wintering sites (Kear 2005) Thus a molecularreference library covering all the potential food items along the whole migratory routewill be useful both for understanding of wetland connections and waterbird conservationIn addition the ability of DNA metabarcoding to process lots of samples simultaneouslyenables rapid analyses and makes this method helpful for waterbird studies

ACKNOWLEDGEMENTSWe are very grateful to the stuff of the Shengjin Lake National Nature Reserve for theirexcellent assistance during the field work Great thanks to Zhujun Wang and An An forfeces collection in the field We thank Song Yang for collecting plants in the Shengjin LakeReserve We also thank Profs Zhenyu Li and Shuren Zhang for plant identification Specialthanks to Drs Meijuan Zhao Xin Wang Fanjuan Meng and Peihao Cong for preparingthe epidermis database and guiding microhistologic analysis

ADDITIONAL INFORMATION AND DECLARATIONS

FundingThis work was supported by the National Basic Research Program of China (973 ProgramGrant No 2012CB956104) to EZ the National Natural Science Foundation of China(Grant No 31370416) State Key Laboratory of Urban and Regional Ecology ChineseAcademy of Sciences (No SKLURE2013-1-05) to LC the National Natural Science

Yang et al (2016) PeerJ DOI 107717peerj2345 1421

Foundation of China (Grant No 31500315) to XW and 100-Talent Program of theChinese Academy of Sciences to AZ The funders had no role in study design datacollection and analysis decision to publish or preparation of the manuscript

Grant DisclosuresThe following grant information was disclosed by the authorsNational Basic Research Program of China 2012CB956104National Natural Science Foundation of China 31370416State Key Laboratory of Urban and Regional EcologyChinese Academy of Sciences SKLURE2013-1-05National Natural Science Foundation of China 31500315

Competing InterestsThe authors declare there are no competing interests

Author Contributionsbull Yuzhan Yang conceived and designed the experiments performed the experimentsanalyzed the data contributed reagentsmaterialsanalysis tools wrote the paperprepared figures andor tables reviewed drafts of the paperbull Aibin Zhan conceived and designed the experiments analyzed the data contributedreagentsmaterialsanalysis tools reviewed drafts of the paperbull Lei Cao conceived and designed the experiments contributed reagentsmaterialsanalysistools reviewed drafts of the paperbull Fanjuan Meng and Wenbin Xu performed the experiments contributed reagentsmate-rialsanalysis tools reviewed drafts of the paper

Animal EthicsThe following information was supplied relating to ethical approvals (ie approving bodyand any reference numbers)

Our research work did not involve capture or any direct manipulation or disturbancesof animals

Field Study PermissionsThe following information was supplied relating to field study approvals (ie approvingbody and any reference numbers)

We collected samples of plants and feces for molecular analyses We received permissionto access the reserve for sampling from the Shengjin Lake National Nature ReserveAdministration (Chizhou Anhui China) which is responsible for the management of theprotected area and wildlife We were forbidden to capture or disturb geese in the field

DNA DepositionThe following information was supplied regarding the deposition of DNA sequences

NCBI Sequence Read Archive (SRA) databaseAccession number SRP070470

Yang et al (2016) PeerJ DOI 107717peerj2345 1521

Data AvailabilityThe following information was supplied regarding data availability

The raw data has been supplied as a Supplemental Dataset

Supplemental InformationSupplemental information for this article can be found online at httpdxdoiorg107717peerj2345supplemental-information

REFERENCESAltschul SF GishWMillerWMyers EW Lipman DJ 1990 Basic local alignment

search tool Journal of Molecular Biology 215403ndash410 DOI 101006jmbi19909999Aacutelvarez I Wendel JF 2003 Ribosomal ITS sequences and plant phylogenetic inference

Molecular Phylogenetics and Evolution 29417ndash434DOI 101016S1055-7903(03)00208-2

Ando H Setsuko S Horikoshi K Suzuki H Umehara S Inoue-MurayamaM IsagiY 2013 Diet analysis by nextmdashgeneration sequencing indicates the frequentconsumption of introduced plants by the critically endangered redmdashheaded woodpigeon (Columba janthina nitens) in oceanic island habitats Ecology and Evolution34057ndash4069 DOI 101002ece3773

Barco A RaupachMJ Laakmann S NeumannH Knebelsberger T 2015 Identificationof North Sea molluscs with DNA barcodingMolecular Ecology Resources 16288ndash297DOI 1011111755-099812440

Bik HM Porazinska DL Creer S Caporaso JG Knight R ThomasWK 2012 Sequenc-ing our way towards understanding global eukaryotic biodiversity Trends in Ecologyand Evolution 27233ndash243 DOI 101016jtree201111010

Bohmann K Evans A Gilbert MTP Carvalho GR Creer S KnappM YuWD DeBruynM 2014 Environmental DNA for wildlife biology and biodiversity monitor-ing Trends in Ecology and Evolution 29358ndash367 DOI 101016jtree201404003

Brander LM Florax RJ Vermaat JE 2006 The empirics of wetland valuation acomprehensive summary and a meta-analysis of the literature Environmental andResource Economics 33223ndash250 DOI 101007s10640-005-3104-4

Chase MW Cowan RS Hollingsworth PM Van den Berg C Madrintildeaacuten S Petersen GSeberg O Jorgsensen T Cameron KM Carine M Pedersen N Hedderson TAJConrad F Salazar GA Richardson JE HollingsworthM Barraclough TG KellyL WilkinsonM 2007 A proposal for a standardised protocol to barcode all landplants Taxon 56295ndash299

Cota-Sanchez JH Remarchuk K Ubayasena K 2006 Ready-to-use DNA extracted witha CTAB method adapted for herbarium specimens and mucilaginous plant tissuePlant Molecular Biology Reporter 24161ndash167 DOI 101007BF02914055

Creer S Fonseca VG Porazinska DL Giblin-Davis RM SungW Power DM Packer MCarvalho GR Blaxter ML Lambshead P ThomasWK 2010 Ultrasequencing ofthe meiofaunal biosphere Practice pitfalls and promisesMolecular Ecology 194ndash20DOI 101111j1365-294X200904473x

Yang et al (2016) PeerJ DOI 107717peerj2345 1621

De BarbaMMiquel C Boyer F Mercier C Rioux D Coissac E Taberlet P 2014 DNAmetabarcoding multiplexing and validation of data accuracy for diet assessmentapplication to omnivorous dietMolecular Ecology Resources 14306ndash323DOI 1011111755-099812188

Deagle BE Kirkwood R Jarman SN 2009 Analysis of Australian fur seal diet bypyrosequencing prey DNA in faecesMolecular Ecology 182022ndash2038DOI 101111j1365-294X200904158x

Dunning LT Savolainen V 2010 Broad-scale amplification of matK for DNA barcodingplants a technical note Botanical Journal of the Linnean Society 1641ndash9DOI 101111j1095-8339201001071x

Edgar RC 2010 Search and clustering orders of magnitude faster than BLAST Bioinfor-matics 262460ndash2461 DOI 101093bioinformaticsbtq461

Edgar RC 2013 UPARSE highly accurate OTU sequences from microbial ampliconreads Nature Methods 10996ndash998 DOI 101038NMETH2604

Elliott TL Jonathan Davies T 2014 Challenges to barcoding an entire floraMolecularEcology Resources 14883ndash891 DOI 1011111755-099812277

Fazekas AJ Burgess KS Kesanakurti PR Graham SW Newmaster SG Husband BCPercy DM Hajibabaei M Barrett SC 2008Multiple multilocus DNA barcodesfrom the plastid genome discriminate plant species equally well PLoS ONE 3e2802DOI 101371journalpone0002802

Fonseca VG Carvalho GR SungW Johnson HF Power DM Neill SP Packer MBlaxter ML Labmshead PJD ThomasWK Creer S 2010 Second-generationenvironmental sequencing unmasks marine metazoan biodiversity Nature Commu-nications 1Article 98 DOI 101038ncomms1095

Ford CS Ayres KL Haider N Toomey N Van-Alpen-Stohl J 2009 Selection ofcandidate DNA barcoding regions for use on land plants Botanical Journal of theLinnean Society 1591ndash11 DOI 101111j1095-8339200800938x

Fox AD Bergersen E Tombre IM Madsen J 2007Minimal intra-seasonal dietaryoverlap of barnacle and pink-footed geese on their breeding grounds in SvalbardPolar Biology 30759ndash768 DOI 101007s00300-006-0235-1

Fox AD Cao L Zhang Y Barter M ZhaoMMeng FWang S 2011 Declines in thetuber-feeding waterbird guild at Shengjin Lake National Nature Reserve Chinandashabarometer of submerged macrophyte collapse Aquatic Conservation Marine andFreshwater Ecosystems 2182ndash91 DOI 101002aqc1154

Graham SW Reeves PA Burns AC Olmstead RG 2000Microstructural changes innoncoding chloroplast DNA interpretation evolution and utility of indels andinversions in basal angiosperm phylogenetic inference International Journal of PlantSciences 161S83ndashS96 DOI 101086317583

HasebeM Omori T NakazawaM Sano T KatoM Iwatsuki K 1994 rbcL genesequences provide evidence for the evolutionary lineages of leptosporangiate fernsProceedings of the National Academy of Sciences of the United States of America915730ndash5734 DOI 101073pnas91125730

Yang et al (2016) PeerJ DOI 107717peerj2345 1721

Hibert F Taberlet P Chave J Scotti-Saintagne C Sabatier D Richard-Hansen C 2013Unveiling the diet of elusive rainforest herbivores in next generation sequencing eraThe tapir as a case study PLoS ONE 8e60799 DOI 101371journalpone0060799

Hilu KW Liang H 1997 ThematK gene sequence variation and application in plantsystematics American Journal of Botany 84830ndash839 DOI 1023072445819

Hollingsworth PM Graham SW Little DP 2011 Choosing and using a plant DNAbarcode PLoS ONE 6e19254 DOI 101371journalpone0019254

James HF Burney DA 1997 The diet and ecology of Hawaiirsquos extinct flightless water-fowl evidence from coprolites Biological Journal of the Linnean Society 62279ndash297DOI 101111j1095-83121997tb01627x

Kear J 2005Ducks geese and swans Oxford Oxford University PressKressWJ Erickson DL Jones FA Swenson NG Perez R Sanjur O Bermingham E

2009 Plant DNA barcodes and a community phylogeny of a tropical forest dynamicsplot in Panama Proceedings of the National Academy of Sciences of the United States ofAmerica 10618621ndash18626 DOI 101073pnas0909820106

Li X Yang Y Henry RJ Rossetto MWang Y Chen S 2015 Plant DNA barcoding fromgene to genome Biological Reviews 90157ndash166 DOI 101111brv12104

Ma Z Cai Y Li B Chen J 2010Managing wetland habitats for waterbirds an interna-tional perspectiveWetlands 3015ndash27 DOI 101007s13157-009-0001-6

Madsen J Mortensen CE 1987Habitat exploitation and interspecific competition ofmoulting geese in East Greenl Ibis 12925ndash44DOI 101111j1474-919X1987tb03157x

Noordhuis R Van der Molen DT Van den BergMS 2002 Response of herbivorouswater-birds to the return of Chara in Lake Veluwemeer The Netherlands AquaticBotany 72349ndash367 DOI 101016S0304-3770(01)00210-8

Parameswaran P Jalili R Tao L Shokralla S Gharizadeh B Ronaghi M Fire AZ 2007A pyrosequencing-tailored nucleotide barcode design unveils opportunities for large-scale sample multiplexing Nucleic Acids Research 35e130 DOI 101093nargkm760

Pintildeol J Mir G Gomez-Polo P Agustiacute N 2015 Universal and blocking primermismatches limit the use of high-throughput DNA sequencing for the quanti-tative metabarcoding of arthropodsMolecular Ecology Resources 15819ndash830DOI 1011111755-099812355

Pompanon F Deagle BE SymondsonWO Brown DS Jarman SN Taberlet P 2012Who is eating what diet assessment using next generation sequencingMolecularEcology 211931ndash1950 DOI 101111j1365-294X201105403x

Prop J Black JM Shimmings P 2003 Travel schedules to the high arctic barnaclegeese trademdashoff the timing of migration with accumulation of fat deposits Oikos103403ndash414 DOI 101034j1600-0706200312042x

Rayeacute G Miquel C Coissac E Redjadj C Loison A Taberlet P 2011 New insightson diet variability revealed by DNA barcoding and high-throughput sequenc-ing chamois diet in autumn as a case study Ecological Research 26265ndash276DOI 101007s11284-010-0780-5

Yang et al (2016) PeerJ DOI 107717peerj2345 1821

Reynolds C Miranda NA Cumming GS 2015 The role of waterbirds in the dispersalof aquatic alien and invasive species Diversity and Distributions 21744ndash754DOI 101111ddi12334

Samelius G Alisauskas RT 1999 Diet and growth of glaucous gulls at a large Arcticgoose colony Canadian Journal of Zoology 771327ndash1331 DOI 101139z99-091

Sang T Crawford DJ Stuessy TF 1997 Chloroplast DNA phylogeny reticulateevolution and biogeography of Paeonia (Paeoniaceae) American Journal of Botany841120ndash1136

ShehzadW Riaz T NawazMAMiquel C Poillot C Shah SA Pompanon F CoissacE Taberlet P 2012 Carnivore diet analysis based on next-generation sequencingapplication to the leopard cat (Prionailurus bengalensis) in PakistanMolecularEcology 211951ndash1965 DOI 101111j1365-294X201105424x

Shokralla S Gibson JF Nikbakht H Janzen DH HallwachsW Hajibabaei M 2014Next-generation DNA barcoding using next-generation sequencing to enhance andaccelerate DNA barcode capture from single specimensMolecular Ecology Resources14892ndash901 DOI 1011111755-099812236

Shokralla S Spall JL Gibson JF 2012 Next-generation sequencing technologies forenvironmental DNA researchMolecular Ecology 211794ndash1805DOI 101111j1365-294X201205538x

Srivathsan A Sha J Vogler AP Meier R 2015 Comparing the effectiveness of metage-nomics and metabarcoding for diet analysis of a leaf-feeding monkey (Pygathrixnemaeus)Molecular Ecology Resources 15250ndash261 DOI 1011111755-099812302

Starr JR Naczi RFC Chouinard BN 2009 Plant DNA barcodes and species res-olution in sedge (Carex Cyperaceae)Molecular Ecology Resources 9151ndash163DOI 101111j1755-0998200902640x

Sun C Zhao Y Li H Dong Y MacIsaac HJ Zhan A 2015 Unreliable quantificationof species abundance based on high-throughput sequencing data of zooplanktoncommunities Aquatic Biology 249ndash15 DOI 103354ab00629

Swennen C Yu YT 2005 Food and feeding behavior of the black-faced SpoonbillWaterbirds 2819ndash27 DOI 1016751524-4695(2005)028[0019FAFBOT]20CO2

SymondsonWOC 2002Molecular identification of prey in predator dietsMolecularEcology 11627ndash641 DOI 101046j1365-294X200201471x

Taberlet P Coissac E Pompanon F Brochmann CWillerslev E 2012 Towards next-generation biodiversity assessment using DNA metabarcodingMolecular Ecology212045ndash2050 DOI 101111j1365-294X201205470x

Taberlet P Coissac E Pompanon F Gielly L Miquel C Valentini A Vermat TCorthier G Brocmann CWillerslev E 2007 Power and limitations of the chloro-plast trnL (UAA) intron for plant DNA barcoding Nucleic Acids Research 35e14DOI 101093nargkl938

Taberlet P Gielly L Pautou G Bouvet J 1991 Universal primers for amplifica-tion of three non-coding regions of chloroplast DNA Plant Molecular Biology171105ndash1109 DOI 101007BF00037152

Yang et al (2016) PeerJ DOI 107717peerj2345 1921

Tamura K Stecher G Peterson D Filipski A Kumar S 2013MEGA6 molecular evolu-tionary genetics analysis version 60MolecularBiology and Evolution 302725ndash2729DOI 101093molbevmst197

Tate JA Simpson BB 2003 Paraphyly of Tarasa (Malvaceae) and diverse origins of thepolyploidy species Systematic Botany 28723ndash737 DOI 10104302-641

Valentini A Pompanon F Taberlet P 2009 DNA barcoding for ecologists Trends inEcology and Evolution 24110ndash117 DOI 101016jtree200809011

Wang X Fox AD Cong P Barter M Cao L 2012 Changes in the distribution and abun-dance of wintering Lesser White-fronted Geese Anser erythropus in eastern ChinaBird Conservation International 22128ndash134 DOI 101017S095927091100030X

Wang X Fox AD Cong P Cao L 2013 Food constraints explain the restricted dis-tribution of wintering Lesser white-fronted Geese Anser erythropus in China Ibis155576ndash592 DOI 101111ibi12039

Whitlock BA Hale AM Groff PA 2010 Intraspecific inversions pose a challenge for thetrnH-psbA plant DNA barcode PLoS ONE 5e11533DOI 101371journalpone0011533

Wolfe KH LiWH Sharp PM 1987 Rates of nucleotide substitution vary greatlyamong plant mitochondrial chloroplast and nuclear DNAs Proceedings ofthe National Academy of Sciences of the United States of America 849054ndash9058DOI 101073pnas84249054

Xu C DongW Shi S Cheng T Li C Liu YWu PWuH Gao P Zhou S 2015 Acceler-ating plant DNA barcode reference library construction using herbarium specimensimproved experimental techniquesMolecular Ecology Resources 151366ndash1374DOI 1011111755-099812413

Xu L XuW Sun Q Zhou Z Shen J Zhao X 2008 Flora and vegetation in ShengjinLake Journal of Wuhan Botanical Research 27264ndash270

Zhan A Bailey SA Heath DDMacIsaac HJ 2014 Performance comparison of geneticmarkers for high-throughput sequencing-based biodiversity assessment in complexcommunitiesMolecular Ecology Resources 141049ndash1059DOI 1011111755-099812254

Zhan A MacIsaac HJ 2015 Rare biosphere exploration using high-throughput sequenc-ing research progress and perspectives Conservation Genetics 16513ndash522DOI 101007s10592-014-0678-9

Zhang Y Cao L Barter M Fox AD ZhaoMMeng F Shi H 2011 Changing distributionand abundance of Swan Goose Anser cygnoides in the Yangtze River floodplainthe likely loss of a very important wintering site Bird Conservation International2136ndash48 DOI 101017S0959270910000201

ZhaoM Cao L Fox AD 2013 Distribution and diet of wintering Tundra Bean GeeseAnser fabalis serrirostris at Shengjin LakeYangtze River floodplain ChinaWildfow6052ndash63

ZhaoM Cao L KlaassenM Zhang Y Fox AD 2015 Avoiding competition Site usediet and foraging behaviours in two similarly sized geese wintering in China Ardea10327ndash38 DOI 105253ardev103i1a3

Yang et al (2016) PeerJ DOI 107717peerj2345 2021

ZhaoM Cong P Barter M Fox AD Cao L 2012 The changing abundanceand distribu-tion of Greater White-fronted Geese Anser albifrons in the Yangtze River floodplainimpacts of recenthydrological changes Bird Conservation International 22135ndash143DOI 101017S0959270911000542

Yang et al (2016) PeerJ DOI 107717peerj2345 2121

Page 9: Selection of a marker gene to construct and the application ...We concluded that DNA metabarcoding provides new perspectives for studies of herbivorous waterbird diets and inter-specific

Table 2 Plant species in the reference libraryWe collected these samples from Shengjin Lake

Species No of samples Species No of samples

Curculigo orchioides 1 Trapella sinensis 2Artemisia capillaris 1 Plantago asiatica 1Artemisia selengensis 2 Alopecurus aequalis 2Aster subulatus 1 Beckmannia syzigachne 1Bidens frondosa 1 Bromus japonicus 1Erigeron annuus 1 Cynodon dactylon 2Gnaphalium affine 1 Phalaris arundinacea 1Hemistepta lyrata 1 Phragmites australis 1Kalimeris incisa 1 Poa annua 1Bothriospermum kusnezowii 1 Polypogon fugax 1Lobelia chinensis 1 Roegneria kamoji 2Sagina japonica 1 Zizania latifolia 1Stellaria media 2 Polygonum lapathifolium 4Calystegia hederacea 1 Polygonum orientale 1Cardamine lyrata 1 Polygonum perfoliatum 1Carex heterolepis 3 Polygonum persicaria 1Carex capricornis 1 Rumex trisetiferus 3Carex paxii 1 Potamogeton crispus 1Carex remotiuscula 1 Potamogeton maackianus 1Fimbristylis dichotoma 1 Potamogeton malaianus 1Eleocharis migoana 1 Potamogeton natans 1Scripus karuizawensis 1 Potamogeton pectinatus 1Nymphoides peltatum 1 Clematis florida 1Myriophyllum spicatum 1 Ranunculus chinensis 2Hydrilla verticillta 1 Ranunculus sceleratus 2Hydrocharis dubia 1 Potentilla freyniana 2Vallisineria spiralis 1 Gratiola japonica 1Vallisneria spinulosa 1 Mazus miquelii 2Juncus effusus 1 Veronica undulata 1Juncus gracillimus 1 Trapa bispinosa 1Leonurus japonicus 1 Trapa maximowiczii 1Salvia plebeia 1 Trapa pseudoincisa 1Glycine soja 1 Trapa quadrispinosa 1Vicia sativa 1 Hydrocotyle sibthorpioides 1Euryale ferox 1 Torilis japonica 2

However the sequence percentage of each food item varied greatly (Table 6) ForGWFG the majority of sequences (9636) were composed of only five itemsmdashPoaceaespp (4798 except Poa annua) Poa annua (2186) Carex heterolepis (1751) Carexspp (901 except Carex heterolepis) and Alopecurus aequalis (321) For BG almostall the sequences belonged to Carex heterolepis (9949) Other items only occupied arelatively small proportion of sequences In addition the presence of each item per samplewas also unequal (Table S2) For example in GWFG Carex heterolepis Carex spp Poa

Yang et al (2016) PeerJ DOI 107717peerj2345 921

Table 3 Inter-specific divergences within dominant genera and families of rbcL gene and trnL gene with Kiruma 2-Parameter modelUnder-scores indicate the most common food composition based on earlier microhistologic analysis (Zhao et al 2012 Zhao et al 2015)

Inter-specific divergence Taxa rbcL trnL

Maximal Minimal Mean Maximal Minimal Mean

Artemisia 0000 0000 0000plusmn 0000 0000 0000 0000plusmn 0000Carex 0013 0000 0008plusmn 0006 0058 0000 0027plusmn 0021Polygonum 0027 0000 0010plusmn 0006 0076 0000 0033plusmn 0022Potamogeton 0012 0000 0005plusmn 00034 0016 0000 0005plusmn 0005Ranunculus 0031 0000 0020plusmn 0009 0042 0021 0024plusmn 0022Trapa 0000 0000 0000plusmn 0000 0081 0000 0049plusmn 0030

Withingenera

Vallisneria 0000 0000 0000plusmn 0000 0000 0000 0000plusmn 0000Cyperaceae 0043 0000 0018plusmn 0010 0178 0000 0084plusmn 0046Asteraceae 0120 0000 0049plusmn 0017 0087 0000 0023plusmn 0018Poaceae 0025 0000 0016plusmn 00009 0166 0000 0074plusmn 0039Hydrocharitaceae 0122 0000 0078plusmn 0020 0159 0000 0100plusmn 0054Polygonaceae 0043 0000 0020plusmn 0009 0129 0000 0031plusmn 0022

Withinfamilies

Ranunculaceae 0033 0016 0017plusmn 0015 0045 0000 0018plusmn 0013

Table 4 Number of species and unique sequences for families with more than one species in ShengjinLake plant database

Family No of species No of sequences

Asteraceae 8 7Caryophyllaceae 2 2Cyperaceae 7 5Fabaceae 2 2Hydrocharitaceae 4 3Lamiaceae 2 2Poaceae 10 8Polygonaceae 5 5Potamogetonaceae 5 1Ranunculaceae 3 3Scrophulariaceae 3 3Trapaceae 4 3Umbelliferae 2 2

annua and Potentilla supina were present in almost all the samples while Stellaria mediaAsteraceae sp and Lapsana apogonoldes occurred in only about one third of samples

When the microhistologic examination was performed using the same samples eightitems were found in the feces of greater white-fronted goose including one at family-levelfour at genus-level and three at species-level (Table 6) Dominant items were Poaceae spp(4568) Alopecurus Linn (3093) and Carex heterolepis (1639) Seven items werefound in the feces of bean goose including four at genus-level and three at species-level(Table 6) Dominant items were Carex heterolepis (6285) Asteraceae sp (1455) andAlopecurus Linn (1318)

Yang et al (2016) PeerJ DOI 107717peerj2345 1021

Table 5 Summary of the process and results of high-throughput sequencing analysis

Sample Pair-endsequences

Sequences for whichprimers and tags wereidentified and with lengthgt100 bp

Uniquesequences

OTUs Food items

GWFG1 16303 8627 1288 78 8GWFG2 25482 13449 1091 102 8GWFG3 19063 10056 1277 48 10GWFG4 23856 12548 1419 114 8GWFG5 20955 11249 1720 123 9GWFG6 11677 7205 973 52 9GWFG7 13377 6782 1328 59 9GWFG8 7749 3959 774 89 9GWFG9 16833 8799 1436 90 6GWFG10 18474 9819 449 32 9GWFG11 19648 10458 617 31 6BG1 20225 10254 784 23 4BG2 14195 7161 564 16 2BG3 2229 1149 255 12 4BG4 517 268 77 8 3BG5 28152 14033 1000 15 3BG6 16723 8484 740 17 4BG7 30166 15403 974 15 4BG8 30928 15706 1028 15 3BG9 8382 4489 446 13 4BG10 10714 5526 537 13 4

NotesGWFG Greater white-fronted goose BG Bean goose

DISCUSSIONMarker selection and reference library constructing for diet analysisWith greatly reduced cost extremely high throughput and information contentmetabarcoding has revolutionized the exploration and quantification of dietary analysiswith noninvasive samples containing degraded DNA (Fonseca et al 2010 Shokralla et al2014) Despite enormous potential to boost data acquisition successful application of thistechnology relies greatly on the power and efficiency of genetic markers and correspondingprimers (Bik et al 2012 Zhan et al 2014) In order to select the most appropriate markergene for our study we compared the performance of eight commonly used chloroplastgenes (rbcL rpoB rpoC1 matK trnL trnH-psbA atpF-atpH and psbK-psbI) and theircorresponding primers Although a higher level of discriminating power was shown inseveral studies atpF-atpH and psbK-psbI were not as commonly used as other barcodinggenes (Hollingsworth Graham amp Little 2011) As one of the most rapidly evolving codinggenes of plastid genomes matK was considered as the closest plant analogue to theanimal barcode COI (Hilu amp Liang 1997) However matK was difficult to amplify using

Yang et al (2016) PeerJ DOI 107717peerj2345 1121

Table 6 List of the lowest taxonomic food items in the diet of geese

Food items Level of identification GWFG BG

N reads Fs () Fm () N reads Fs () Fm ()

Poaceae spp (except Poa annua) Family 51705 4798 4568 0 000 000Poa annua Species 23554 2186 000 167 020 000Carex heterolepis Species 18867 1751 1639 81457 9949 6285Carex spp (except Carex heterolepis) Genus 9706 901 231 191 023 349Alopecurus aequalis Species 3458 321 000 0 000 000Potentilla chinensis Species 184 017 118 65 008 206Cynodon dactylon Species 155 014 000 0 000 000Polygonum spp Genus 56 005 000 0 000 000Stellaria media Species 26 002 000 0 000 000Ranunculus chinensis Species 14 002 000 0 000 000Lapsana apogonoides Species 11 002 000 0 000 000Asteraceae sp Genus 16 001 233 0 000 1455Alopecurus Genus 0 000 3093 0 000 1318Carex thunbergii Species 0 000 054 0 000 279Fabaceae sp Genus 0 000 064 0 000 108

NotesGWFG Greater white-fronted goose BG Bean goose Fs percentage of sequences in DNA metabarcoding Fm percentage of epidermis squares in microhistological analysis

available primer sets with only 43 of successful amplification in this study In spite ofthe higher species discrimination success of trnH-psbA than rbcL+matK in some groupsthe presence of duplicated loci microinversions and premature termination of readsby mononucleotide repeats lead to considerable proportion of low-quality sequences andover-estimation of genetic difference when using trnH-psbA (Graham et al 2000WhitlockHale amp Groff 2010) In contrast the barcode region of rbcL is easy to amplify sequenceand align in most plants and was recommended as the standard barcode for land plants(Chase et al 2007) The relatively modest discriminating power (compared to trnL) pre-cludes its application for our study aiming to recover high resolution of food items Conse-quently trnL was selected out of eight candidate markers with 100 amplification successmore than 90 of high quality sequences and relatively large inter-specific divergence

One of the biggest obstacles in biodiversity assessment and dietary analysis is the lack ofa comprehensive reference library without which it is impossible to accurately interpretand assign sequences generated from high-throughput sequencing (Valentini Pompanonamp Taberlet 2009 Barco et al 2015) In this study we constructed a local reference libraryby amplifying the most common species (70 morpho-species in total) during the winteringperiod with the trnL gene Although not all of them could be identified at species-levelwith trnL due to relatively low inter-specific divergence many species could be separatedwith distinctive sequences Previous studies have recommended group-specific barcodes todifferentiate closely related plants at the species level (Li et al 2015) For instance matKhas been proved to be more efficient for the discrimination of Carex spp (Starr Naczi ampChouinard 2009) However the primer set of matK failed to amplify species of Carex spp

Yang et al (2016) PeerJ DOI 107717peerj2345 1221

in our study suggesting the universality of selected primer pairs should be tested in eachstudy (Zhan et al 2014)

Applications of metabarcoding for geese diet analysisA variety of recent studies have demonstrated the great potential of metabarcoding fordietary analysis mainly owing to the high throughput high discriminating power and theability to process large-scale samples simultaneously (Creer et al 2010 Taberlet et al 2012Shehzad et al 2012) In this study we applied this method to recover diets of herbivorousgeese and provided standard protocols for dietary analysis of these two ecologicallyimportant waterbirds Our results further proved the more objective less experience-dependent and more time-efficient character of DNA metabarcoding However not allthe species in the reference library could be identified at species-level owing to low inter-specific divergence We suggest that multiple group-specific markers to be incorporatedin the future as in De Barba et al (2014) Two species Carex thunbergii and Fabaceae spwere only discovered via microhistologic analysis rather than metabarcoding This failuremight reflect the biased fragment amplification of current technology of which dominanttemplates could act as inhibitors of less dominant species (Pintildeol et al 2015) Howeverthree species of Poaceae were only discovered using metabarcoding In total more taxaand higher resolution were attained using metabarcoding But microhistology still proveda powerful supplementary Previous studies using metabarcoding usually detected dozensof food items even as many as more than one hundred species For instance 18 taxaprey were identified for leopard cat (Prionailurus bengalensis) (Shehzad et al 2012) 44plant taxa were recovered in feces of red-headed wood pigeon (Columba janthina nitens)(Ando et al 2013) while more than 100 taxa were found in diet studies of brown bear(Ursus arctos) (De Barba et al 2014) The relatively narrow diet spectrum of herbivorousgeese may lead to misunderstanding that this result of our study is merely an artefact dueto small sampling effort However this result is credible since these two geese species onlyfeed on Carex meadow where the dominant vegetation is Carex spp with other speciessuch as Poaceae and dicots (Zhao et al 2015) Even though other wetland plants exist theyusually composed only a small proportion of the geese diets

Quantification of food composition is another key concern in dietary analysis Althoughthe relative percentage of sequences was not truly a quantitative estimate of diet taxa ofthe majority sequences in this study were in accord with microhistologic observationswhich was considered an efficient way to provide quantitative results (Wang et al 2013)Discrepancies might come from the semi-quantitative nature of metabarcoding methods(Sun et al 2015) This is likely derived from PCR amplification which always entails biasescaused by universal primer-template mismatches annealing temperature or number ofPCR cycles (Zhan et al 2014Pintildeol et al 2015) Othermethods such as shotgun sequencingor metagenomic sequencing could be incorporated in the future to give information onabundances of food items (Srivathsan et al 2015)

Implications for waterbird conservation and wetland managementFor long-distance migratory waterbirds such as the wild geese in this study theirabundance and distribution are greatly influenced by diet availability and habitat use

Yang et al (2016) PeerJ DOI 107717peerj2345 1321

(Wang et al 2013) For example waterbirds may be restricted at (forced to leave) certainareas due to favoring (loss) of particular food (Wang et al 2013) while the recoveryof such food may contribute to return of bird populations (Noordhuis Van der Molenamp Van den Berg 2002) Results of both metabarcoding and microhistologic analysis inthis study revealed that Carex and Poaceae were dominant food components which is inaccordance with previous studies The increasing number of these two geese wintering atthe Shengjin Lake may be attributed to the expansion of Carex meadow which offers accessto abundant food resources (Zhao et al 2015) Considering the long-distance migratorycharacter of these birds it is important to maintain energy balances and good bodyconditions in wintering areas because this might further influence their departure datesand reproductive success after arriving at breeding areas (Prop Black amp Shimmings 2003)Based on this it is important for wetland managers to maintain the suitable habitats andfood resources for sustainable conservation of waterbirds which highlights the significanceof diet information Our study also indicated that overlap and dissimilarity existed betweenthe diets of these two geese Animals foraging in the same habitats may compete for limitedfood resources (Madsen amp Mortensen 1987) This discrepancy of food composition mayarise from the avoidance of inter-specific competition (Zhao et al 2015) However withthe increase of these two species in Shengjin Lake further research is needed to investigatethe mechanisms of food resource partitioning and spatial distribution

Shengjin Lake is one of themost important wintering sites for tens of thousands ofmigra-tory watebirds while annual life cycles of these birds depend on the whole migratory routeincluding breeding sites stop-over sites and wintering sites (Kear 2005) Thus a molecularreference library covering all the potential food items along the whole migratory routewill be useful both for understanding of wetland connections and waterbird conservationIn addition the ability of DNA metabarcoding to process lots of samples simultaneouslyenables rapid analyses and makes this method helpful for waterbird studies

ACKNOWLEDGEMENTSWe are very grateful to the stuff of the Shengjin Lake National Nature Reserve for theirexcellent assistance during the field work Great thanks to Zhujun Wang and An An forfeces collection in the field We thank Song Yang for collecting plants in the Shengjin LakeReserve We also thank Profs Zhenyu Li and Shuren Zhang for plant identification Specialthanks to Drs Meijuan Zhao Xin Wang Fanjuan Meng and Peihao Cong for preparingthe epidermis database and guiding microhistologic analysis

ADDITIONAL INFORMATION AND DECLARATIONS

FundingThis work was supported by the National Basic Research Program of China (973 ProgramGrant No 2012CB956104) to EZ the National Natural Science Foundation of China(Grant No 31370416) State Key Laboratory of Urban and Regional Ecology ChineseAcademy of Sciences (No SKLURE2013-1-05) to LC the National Natural Science

Yang et al (2016) PeerJ DOI 107717peerj2345 1421

Foundation of China (Grant No 31500315) to XW and 100-Talent Program of theChinese Academy of Sciences to AZ The funders had no role in study design datacollection and analysis decision to publish or preparation of the manuscript

Grant DisclosuresThe following grant information was disclosed by the authorsNational Basic Research Program of China 2012CB956104National Natural Science Foundation of China 31370416State Key Laboratory of Urban and Regional EcologyChinese Academy of Sciences SKLURE2013-1-05National Natural Science Foundation of China 31500315

Competing InterestsThe authors declare there are no competing interests

Author Contributionsbull Yuzhan Yang conceived and designed the experiments performed the experimentsanalyzed the data contributed reagentsmaterialsanalysis tools wrote the paperprepared figures andor tables reviewed drafts of the paperbull Aibin Zhan conceived and designed the experiments analyzed the data contributedreagentsmaterialsanalysis tools reviewed drafts of the paperbull Lei Cao conceived and designed the experiments contributed reagentsmaterialsanalysistools reviewed drafts of the paperbull Fanjuan Meng and Wenbin Xu performed the experiments contributed reagentsmate-rialsanalysis tools reviewed drafts of the paper

Animal EthicsThe following information was supplied relating to ethical approvals (ie approving bodyand any reference numbers)

Our research work did not involve capture or any direct manipulation or disturbancesof animals

Field Study PermissionsThe following information was supplied relating to field study approvals (ie approvingbody and any reference numbers)

We collected samples of plants and feces for molecular analyses We received permissionto access the reserve for sampling from the Shengjin Lake National Nature ReserveAdministration (Chizhou Anhui China) which is responsible for the management of theprotected area and wildlife We were forbidden to capture or disturb geese in the field

DNA DepositionThe following information was supplied regarding the deposition of DNA sequences

NCBI Sequence Read Archive (SRA) databaseAccession number SRP070470

Yang et al (2016) PeerJ DOI 107717peerj2345 1521

Data AvailabilityThe following information was supplied regarding data availability

The raw data has been supplied as a Supplemental Dataset

Supplemental InformationSupplemental information for this article can be found online at httpdxdoiorg107717peerj2345supplemental-information

REFERENCESAltschul SF GishWMillerWMyers EW Lipman DJ 1990 Basic local alignment

search tool Journal of Molecular Biology 215403ndash410 DOI 101006jmbi19909999Aacutelvarez I Wendel JF 2003 Ribosomal ITS sequences and plant phylogenetic inference

Molecular Phylogenetics and Evolution 29417ndash434DOI 101016S1055-7903(03)00208-2

Ando H Setsuko S Horikoshi K Suzuki H Umehara S Inoue-MurayamaM IsagiY 2013 Diet analysis by nextmdashgeneration sequencing indicates the frequentconsumption of introduced plants by the critically endangered redmdashheaded woodpigeon (Columba janthina nitens) in oceanic island habitats Ecology and Evolution34057ndash4069 DOI 101002ece3773

Barco A RaupachMJ Laakmann S NeumannH Knebelsberger T 2015 Identificationof North Sea molluscs with DNA barcodingMolecular Ecology Resources 16288ndash297DOI 1011111755-099812440

Bik HM Porazinska DL Creer S Caporaso JG Knight R ThomasWK 2012 Sequenc-ing our way towards understanding global eukaryotic biodiversity Trends in Ecologyand Evolution 27233ndash243 DOI 101016jtree201111010

Bohmann K Evans A Gilbert MTP Carvalho GR Creer S KnappM YuWD DeBruynM 2014 Environmental DNA for wildlife biology and biodiversity monitor-ing Trends in Ecology and Evolution 29358ndash367 DOI 101016jtree201404003

Brander LM Florax RJ Vermaat JE 2006 The empirics of wetland valuation acomprehensive summary and a meta-analysis of the literature Environmental andResource Economics 33223ndash250 DOI 101007s10640-005-3104-4

Chase MW Cowan RS Hollingsworth PM Van den Berg C Madrintildeaacuten S Petersen GSeberg O Jorgsensen T Cameron KM Carine M Pedersen N Hedderson TAJConrad F Salazar GA Richardson JE HollingsworthM Barraclough TG KellyL WilkinsonM 2007 A proposal for a standardised protocol to barcode all landplants Taxon 56295ndash299

Cota-Sanchez JH Remarchuk K Ubayasena K 2006 Ready-to-use DNA extracted witha CTAB method adapted for herbarium specimens and mucilaginous plant tissuePlant Molecular Biology Reporter 24161ndash167 DOI 101007BF02914055

Creer S Fonseca VG Porazinska DL Giblin-Davis RM SungW Power DM Packer MCarvalho GR Blaxter ML Lambshead P ThomasWK 2010 Ultrasequencing ofthe meiofaunal biosphere Practice pitfalls and promisesMolecular Ecology 194ndash20DOI 101111j1365-294X200904473x

Yang et al (2016) PeerJ DOI 107717peerj2345 1621

De BarbaMMiquel C Boyer F Mercier C Rioux D Coissac E Taberlet P 2014 DNAmetabarcoding multiplexing and validation of data accuracy for diet assessmentapplication to omnivorous dietMolecular Ecology Resources 14306ndash323DOI 1011111755-099812188

Deagle BE Kirkwood R Jarman SN 2009 Analysis of Australian fur seal diet bypyrosequencing prey DNA in faecesMolecular Ecology 182022ndash2038DOI 101111j1365-294X200904158x

Dunning LT Savolainen V 2010 Broad-scale amplification of matK for DNA barcodingplants a technical note Botanical Journal of the Linnean Society 1641ndash9DOI 101111j1095-8339201001071x

Edgar RC 2010 Search and clustering orders of magnitude faster than BLAST Bioinfor-matics 262460ndash2461 DOI 101093bioinformaticsbtq461

Edgar RC 2013 UPARSE highly accurate OTU sequences from microbial ampliconreads Nature Methods 10996ndash998 DOI 101038NMETH2604

Elliott TL Jonathan Davies T 2014 Challenges to barcoding an entire floraMolecularEcology Resources 14883ndash891 DOI 1011111755-099812277

Fazekas AJ Burgess KS Kesanakurti PR Graham SW Newmaster SG Husband BCPercy DM Hajibabaei M Barrett SC 2008Multiple multilocus DNA barcodesfrom the plastid genome discriminate plant species equally well PLoS ONE 3e2802DOI 101371journalpone0002802

Fonseca VG Carvalho GR SungW Johnson HF Power DM Neill SP Packer MBlaxter ML Labmshead PJD ThomasWK Creer S 2010 Second-generationenvironmental sequencing unmasks marine metazoan biodiversity Nature Commu-nications 1Article 98 DOI 101038ncomms1095

Ford CS Ayres KL Haider N Toomey N Van-Alpen-Stohl J 2009 Selection ofcandidate DNA barcoding regions for use on land plants Botanical Journal of theLinnean Society 1591ndash11 DOI 101111j1095-8339200800938x

Fox AD Bergersen E Tombre IM Madsen J 2007Minimal intra-seasonal dietaryoverlap of barnacle and pink-footed geese on their breeding grounds in SvalbardPolar Biology 30759ndash768 DOI 101007s00300-006-0235-1

Fox AD Cao L Zhang Y Barter M ZhaoMMeng FWang S 2011 Declines in thetuber-feeding waterbird guild at Shengjin Lake National Nature Reserve Chinandashabarometer of submerged macrophyte collapse Aquatic Conservation Marine andFreshwater Ecosystems 2182ndash91 DOI 101002aqc1154

Graham SW Reeves PA Burns AC Olmstead RG 2000Microstructural changes innoncoding chloroplast DNA interpretation evolution and utility of indels andinversions in basal angiosperm phylogenetic inference International Journal of PlantSciences 161S83ndashS96 DOI 101086317583

HasebeM Omori T NakazawaM Sano T KatoM Iwatsuki K 1994 rbcL genesequences provide evidence for the evolutionary lineages of leptosporangiate fernsProceedings of the National Academy of Sciences of the United States of America915730ndash5734 DOI 101073pnas91125730

Yang et al (2016) PeerJ DOI 107717peerj2345 1721

Hibert F Taberlet P Chave J Scotti-Saintagne C Sabatier D Richard-Hansen C 2013Unveiling the diet of elusive rainforest herbivores in next generation sequencing eraThe tapir as a case study PLoS ONE 8e60799 DOI 101371journalpone0060799

Hilu KW Liang H 1997 ThematK gene sequence variation and application in plantsystematics American Journal of Botany 84830ndash839 DOI 1023072445819

Hollingsworth PM Graham SW Little DP 2011 Choosing and using a plant DNAbarcode PLoS ONE 6e19254 DOI 101371journalpone0019254

James HF Burney DA 1997 The diet and ecology of Hawaiirsquos extinct flightless water-fowl evidence from coprolites Biological Journal of the Linnean Society 62279ndash297DOI 101111j1095-83121997tb01627x

Kear J 2005Ducks geese and swans Oxford Oxford University PressKressWJ Erickson DL Jones FA Swenson NG Perez R Sanjur O Bermingham E

2009 Plant DNA barcodes and a community phylogeny of a tropical forest dynamicsplot in Panama Proceedings of the National Academy of Sciences of the United States ofAmerica 10618621ndash18626 DOI 101073pnas0909820106

Li X Yang Y Henry RJ Rossetto MWang Y Chen S 2015 Plant DNA barcoding fromgene to genome Biological Reviews 90157ndash166 DOI 101111brv12104

Ma Z Cai Y Li B Chen J 2010Managing wetland habitats for waterbirds an interna-tional perspectiveWetlands 3015ndash27 DOI 101007s13157-009-0001-6

Madsen J Mortensen CE 1987Habitat exploitation and interspecific competition ofmoulting geese in East Greenl Ibis 12925ndash44DOI 101111j1474-919X1987tb03157x

Noordhuis R Van der Molen DT Van den BergMS 2002 Response of herbivorouswater-birds to the return of Chara in Lake Veluwemeer The Netherlands AquaticBotany 72349ndash367 DOI 101016S0304-3770(01)00210-8

Parameswaran P Jalili R Tao L Shokralla S Gharizadeh B Ronaghi M Fire AZ 2007A pyrosequencing-tailored nucleotide barcode design unveils opportunities for large-scale sample multiplexing Nucleic Acids Research 35e130 DOI 101093nargkm760

Pintildeol J Mir G Gomez-Polo P Agustiacute N 2015 Universal and blocking primermismatches limit the use of high-throughput DNA sequencing for the quanti-tative metabarcoding of arthropodsMolecular Ecology Resources 15819ndash830DOI 1011111755-099812355

Pompanon F Deagle BE SymondsonWO Brown DS Jarman SN Taberlet P 2012Who is eating what diet assessment using next generation sequencingMolecularEcology 211931ndash1950 DOI 101111j1365-294X201105403x

Prop J Black JM Shimmings P 2003 Travel schedules to the high arctic barnaclegeese trademdashoff the timing of migration with accumulation of fat deposits Oikos103403ndash414 DOI 101034j1600-0706200312042x

Rayeacute G Miquel C Coissac E Redjadj C Loison A Taberlet P 2011 New insightson diet variability revealed by DNA barcoding and high-throughput sequenc-ing chamois diet in autumn as a case study Ecological Research 26265ndash276DOI 101007s11284-010-0780-5

Yang et al (2016) PeerJ DOI 107717peerj2345 1821

Reynolds C Miranda NA Cumming GS 2015 The role of waterbirds in the dispersalof aquatic alien and invasive species Diversity and Distributions 21744ndash754DOI 101111ddi12334

Samelius G Alisauskas RT 1999 Diet and growth of glaucous gulls at a large Arcticgoose colony Canadian Journal of Zoology 771327ndash1331 DOI 101139z99-091

Sang T Crawford DJ Stuessy TF 1997 Chloroplast DNA phylogeny reticulateevolution and biogeography of Paeonia (Paeoniaceae) American Journal of Botany841120ndash1136

ShehzadW Riaz T NawazMAMiquel C Poillot C Shah SA Pompanon F CoissacE Taberlet P 2012 Carnivore diet analysis based on next-generation sequencingapplication to the leopard cat (Prionailurus bengalensis) in PakistanMolecularEcology 211951ndash1965 DOI 101111j1365-294X201105424x

Shokralla S Gibson JF Nikbakht H Janzen DH HallwachsW Hajibabaei M 2014Next-generation DNA barcoding using next-generation sequencing to enhance andaccelerate DNA barcode capture from single specimensMolecular Ecology Resources14892ndash901 DOI 1011111755-099812236

Shokralla S Spall JL Gibson JF 2012 Next-generation sequencing technologies forenvironmental DNA researchMolecular Ecology 211794ndash1805DOI 101111j1365-294X201205538x

Srivathsan A Sha J Vogler AP Meier R 2015 Comparing the effectiveness of metage-nomics and metabarcoding for diet analysis of a leaf-feeding monkey (Pygathrixnemaeus)Molecular Ecology Resources 15250ndash261 DOI 1011111755-099812302

Starr JR Naczi RFC Chouinard BN 2009 Plant DNA barcodes and species res-olution in sedge (Carex Cyperaceae)Molecular Ecology Resources 9151ndash163DOI 101111j1755-0998200902640x

Sun C Zhao Y Li H Dong Y MacIsaac HJ Zhan A 2015 Unreliable quantificationof species abundance based on high-throughput sequencing data of zooplanktoncommunities Aquatic Biology 249ndash15 DOI 103354ab00629

Swennen C Yu YT 2005 Food and feeding behavior of the black-faced SpoonbillWaterbirds 2819ndash27 DOI 1016751524-4695(2005)028[0019FAFBOT]20CO2

SymondsonWOC 2002Molecular identification of prey in predator dietsMolecularEcology 11627ndash641 DOI 101046j1365-294X200201471x

Taberlet P Coissac E Pompanon F Brochmann CWillerslev E 2012 Towards next-generation biodiversity assessment using DNA metabarcodingMolecular Ecology212045ndash2050 DOI 101111j1365-294X201205470x

Taberlet P Coissac E Pompanon F Gielly L Miquel C Valentini A Vermat TCorthier G Brocmann CWillerslev E 2007 Power and limitations of the chloro-plast trnL (UAA) intron for plant DNA barcoding Nucleic Acids Research 35e14DOI 101093nargkl938

Taberlet P Gielly L Pautou G Bouvet J 1991 Universal primers for amplifica-tion of three non-coding regions of chloroplast DNA Plant Molecular Biology171105ndash1109 DOI 101007BF00037152

Yang et al (2016) PeerJ DOI 107717peerj2345 1921

Tamura K Stecher G Peterson D Filipski A Kumar S 2013MEGA6 molecular evolu-tionary genetics analysis version 60MolecularBiology and Evolution 302725ndash2729DOI 101093molbevmst197

Tate JA Simpson BB 2003 Paraphyly of Tarasa (Malvaceae) and diverse origins of thepolyploidy species Systematic Botany 28723ndash737 DOI 10104302-641

Valentini A Pompanon F Taberlet P 2009 DNA barcoding for ecologists Trends inEcology and Evolution 24110ndash117 DOI 101016jtree200809011

Wang X Fox AD Cong P Barter M Cao L 2012 Changes in the distribution and abun-dance of wintering Lesser White-fronted Geese Anser erythropus in eastern ChinaBird Conservation International 22128ndash134 DOI 101017S095927091100030X

Wang X Fox AD Cong P Cao L 2013 Food constraints explain the restricted dis-tribution of wintering Lesser white-fronted Geese Anser erythropus in China Ibis155576ndash592 DOI 101111ibi12039

Whitlock BA Hale AM Groff PA 2010 Intraspecific inversions pose a challenge for thetrnH-psbA plant DNA barcode PLoS ONE 5e11533DOI 101371journalpone0011533

Wolfe KH LiWH Sharp PM 1987 Rates of nucleotide substitution vary greatlyamong plant mitochondrial chloroplast and nuclear DNAs Proceedings ofthe National Academy of Sciences of the United States of America 849054ndash9058DOI 101073pnas84249054

Xu C DongW Shi S Cheng T Li C Liu YWu PWuH Gao P Zhou S 2015 Acceler-ating plant DNA barcode reference library construction using herbarium specimensimproved experimental techniquesMolecular Ecology Resources 151366ndash1374DOI 1011111755-099812413

Xu L XuW Sun Q Zhou Z Shen J Zhao X 2008 Flora and vegetation in ShengjinLake Journal of Wuhan Botanical Research 27264ndash270

Zhan A Bailey SA Heath DDMacIsaac HJ 2014 Performance comparison of geneticmarkers for high-throughput sequencing-based biodiversity assessment in complexcommunitiesMolecular Ecology Resources 141049ndash1059DOI 1011111755-099812254

Zhan A MacIsaac HJ 2015 Rare biosphere exploration using high-throughput sequenc-ing research progress and perspectives Conservation Genetics 16513ndash522DOI 101007s10592-014-0678-9

Zhang Y Cao L Barter M Fox AD ZhaoMMeng F Shi H 2011 Changing distributionand abundance of Swan Goose Anser cygnoides in the Yangtze River floodplainthe likely loss of a very important wintering site Bird Conservation International2136ndash48 DOI 101017S0959270910000201

ZhaoM Cao L Fox AD 2013 Distribution and diet of wintering Tundra Bean GeeseAnser fabalis serrirostris at Shengjin LakeYangtze River floodplain ChinaWildfow6052ndash63

ZhaoM Cao L KlaassenM Zhang Y Fox AD 2015 Avoiding competition Site usediet and foraging behaviours in two similarly sized geese wintering in China Ardea10327ndash38 DOI 105253ardev103i1a3

Yang et al (2016) PeerJ DOI 107717peerj2345 2021

ZhaoM Cong P Barter M Fox AD Cao L 2012 The changing abundanceand distribu-tion of Greater White-fronted Geese Anser albifrons in the Yangtze River floodplainimpacts of recenthydrological changes Bird Conservation International 22135ndash143DOI 101017S0959270911000542

Yang et al (2016) PeerJ DOI 107717peerj2345 2121

Page 10: Selection of a marker gene to construct and the application ...We concluded that DNA metabarcoding provides new perspectives for studies of herbivorous waterbird diets and inter-specific

Table 3 Inter-specific divergences within dominant genera and families of rbcL gene and trnL gene with Kiruma 2-Parameter modelUnder-scores indicate the most common food composition based on earlier microhistologic analysis (Zhao et al 2012 Zhao et al 2015)

Inter-specific divergence Taxa rbcL trnL

Maximal Minimal Mean Maximal Minimal Mean

Artemisia 0000 0000 0000plusmn 0000 0000 0000 0000plusmn 0000Carex 0013 0000 0008plusmn 0006 0058 0000 0027plusmn 0021Polygonum 0027 0000 0010plusmn 0006 0076 0000 0033plusmn 0022Potamogeton 0012 0000 0005plusmn 00034 0016 0000 0005plusmn 0005Ranunculus 0031 0000 0020plusmn 0009 0042 0021 0024plusmn 0022Trapa 0000 0000 0000plusmn 0000 0081 0000 0049plusmn 0030

Withingenera

Vallisneria 0000 0000 0000plusmn 0000 0000 0000 0000plusmn 0000Cyperaceae 0043 0000 0018plusmn 0010 0178 0000 0084plusmn 0046Asteraceae 0120 0000 0049plusmn 0017 0087 0000 0023plusmn 0018Poaceae 0025 0000 0016plusmn 00009 0166 0000 0074plusmn 0039Hydrocharitaceae 0122 0000 0078plusmn 0020 0159 0000 0100plusmn 0054Polygonaceae 0043 0000 0020plusmn 0009 0129 0000 0031plusmn 0022

Withinfamilies

Ranunculaceae 0033 0016 0017plusmn 0015 0045 0000 0018plusmn 0013

Table 4 Number of species and unique sequences for families with more than one species in ShengjinLake plant database

Family No of species No of sequences

Asteraceae 8 7Caryophyllaceae 2 2Cyperaceae 7 5Fabaceae 2 2Hydrocharitaceae 4 3Lamiaceae 2 2Poaceae 10 8Polygonaceae 5 5Potamogetonaceae 5 1Ranunculaceae 3 3Scrophulariaceae 3 3Trapaceae 4 3Umbelliferae 2 2

annua and Potentilla supina were present in almost all the samples while Stellaria mediaAsteraceae sp and Lapsana apogonoldes occurred in only about one third of samples

When the microhistologic examination was performed using the same samples eightitems were found in the feces of greater white-fronted goose including one at family-levelfour at genus-level and three at species-level (Table 6) Dominant items were Poaceae spp(4568) Alopecurus Linn (3093) and Carex heterolepis (1639) Seven items werefound in the feces of bean goose including four at genus-level and three at species-level(Table 6) Dominant items were Carex heterolepis (6285) Asteraceae sp (1455) andAlopecurus Linn (1318)

Yang et al (2016) PeerJ DOI 107717peerj2345 1021

Table 5 Summary of the process and results of high-throughput sequencing analysis

Sample Pair-endsequences

Sequences for whichprimers and tags wereidentified and with lengthgt100 bp

Uniquesequences

OTUs Food items

GWFG1 16303 8627 1288 78 8GWFG2 25482 13449 1091 102 8GWFG3 19063 10056 1277 48 10GWFG4 23856 12548 1419 114 8GWFG5 20955 11249 1720 123 9GWFG6 11677 7205 973 52 9GWFG7 13377 6782 1328 59 9GWFG8 7749 3959 774 89 9GWFG9 16833 8799 1436 90 6GWFG10 18474 9819 449 32 9GWFG11 19648 10458 617 31 6BG1 20225 10254 784 23 4BG2 14195 7161 564 16 2BG3 2229 1149 255 12 4BG4 517 268 77 8 3BG5 28152 14033 1000 15 3BG6 16723 8484 740 17 4BG7 30166 15403 974 15 4BG8 30928 15706 1028 15 3BG9 8382 4489 446 13 4BG10 10714 5526 537 13 4

NotesGWFG Greater white-fronted goose BG Bean goose

DISCUSSIONMarker selection and reference library constructing for diet analysisWith greatly reduced cost extremely high throughput and information contentmetabarcoding has revolutionized the exploration and quantification of dietary analysiswith noninvasive samples containing degraded DNA (Fonseca et al 2010 Shokralla et al2014) Despite enormous potential to boost data acquisition successful application of thistechnology relies greatly on the power and efficiency of genetic markers and correspondingprimers (Bik et al 2012 Zhan et al 2014) In order to select the most appropriate markergene for our study we compared the performance of eight commonly used chloroplastgenes (rbcL rpoB rpoC1 matK trnL trnH-psbA atpF-atpH and psbK-psbI) and theircorresponding primers Although a higher level of discriminating power was shown inseveral studies atpF-atpH and psbK-psbI were not as commonly used as other barcodinggenes (Hollingsworth Graham amp Little 2011) As one of the most rapidly evolving codinggenes of plastid genomes matK was considered as the closest plant analogue to theanimal barcode COI (Hilu amp Liang 1997) However matK was difficult to amplify using

Yang et al (2016) PeerJ DOI 107717peerj2345 1121

Table 6 List of the lowest taxonomic food items in the diet of geese

Food items Level of identification GWFG BG

N reads Fs () Fm () N reads Fs () Fm ()

Poaceae spp (except Poa annua) Family 51705 4798 4568 0 000 000Poa annua Species 23554 2186 000 167 020 000Carex heterolepis Species 18867 1751 1639 81457 9949 6285Carex spp (except Carex heterolepis) Genus 9706 901 231 191 023 349Alopecurus aequalis Species 3458 321 000 0 000 000Potentilla chinensis Species 184 017 118 65 008 206Cynodon dactylon Species 155 014 000 0 000 000Polygonum spp Genus 56 005 000 0 000 000Stellaria media Species 26 002 000 0 000 000Ranunculus chinensis Species 14 002 000 0 000 000Lapsana apogonoides Species 11 002 000 0 000 000Asteraceae sp Genus 16 001 233 0 000 1455Alopecurus Genus 0 000 3093 0 000 1318Carex thunbergii Species 0 000 054 0 000 279Fabaceae sp Genus 0 000 064 0 000 108

NotesGWFG Greater white-fronted goose BG Bean goose Fs percentage of sequences in DNA metabarcoding Fm percentage of epidermis squares in microhistological analysis

available primer sets with only 43 of successful amplification in this study In spite ofthe higher species discrimination success of trnH-psbA than rbcL+matK in some groupsthe presence of duplicated loci microinversions and premature termination of readsby mononucleotide repeats lead to considerable proportion of low-quality sequences andover-estimation of genetic difference when using trnH-psbA (Graham et al 2000WhitlockHale amp Groff 2010) In contrast the barcode region of rbcL is easy to amplify sequenceand align in most plants and was recommended as the standard barcode for land plants(Chase et al 2007) The relatively modest discriminating power (compared to trnL) pre-cludes its application for our study aiming to recover high resolution of food items Conse-quently trnL was selected out of eight candidate markers with 100 amplification successmore than 90 of high quality sequences and relatively large inter-specific divergence

One of the biggest obstacles in biodiversity assessment and dietary analysis is the lack ofa comprehensive reference library without which it is impossible to accurately interpretand assign sequences generated from high-throughput sequencing (Valentini Pompanonamp Taberlet 2009 Barco et al 2015) In this study we constructed a local reference libraryby amplifying the most common species (70 morpho-species in total) during the winteringperiod with the trnL gene Although not all of them could be identified at species-levelwith trnL due to relatively low inter-specific divergence many species could be separatedwith distinctive sequences Previous studies have recommended group-specific barcodes todifferentiate closely related plants at the species level (Li et al 2015) For instance matKhas been proved to be more efficient for the discrimination of Carex spp (Starr Naczi ampChouinard 2009) However the primer set of matK failed to amplify species of Carex spp

Yang et al (2016) PeerJ DOI 107717peerj2345 1221

in our study suggesting the universality of selected primer pairs should be tested in eachstudy (Zhan et al 2014)

Applications of metabarcoding for geese diet analysisA variety of recent studies have demonstrated the great potential of metabarcoding fordietary analysis mainly owing to the high throughput high discriminating power and theability to process large-scale samples simultaneously (Creer et al 2010 Taberlet et al 2012Shehzad et al 2012) In this study we applied this method to recover diets of herbivorousgeese and provided standard protocols for dietary analysis of these two ecologicallyimportant waterbirds Our results further proved the more objective less experience-dependent and more time-efficient character of DNA metabarcoding However not allthe species in the reference library could be identified at species-level owing to low inter-specific divergence We suggest that multiple group-specific markers to be incorporatedin the future as in De Barba et al (2014) Two species Carex thunbergii and Fabaceae spwere only discovered via microhistologic analysis rather than metabarcoding This failuremight reflect the biased fragment amplification of current technology of which dominanttemplates could act as inhibitors of less dominant species (Pintildeol et al 2015) Howeverthree species of Poaceae were only discovered using metabarcoding In total more taxaand higher resolution were attained using metabarcoding But microhistology still proveda powerful supplementary Previous studies using metabarcoding usually detected dozensof food items even as many as more than one hundred species For instance 18 taxaprey were identified for leopard cat (Prionailurus bengalensis) (Shehzad et al 2012) 44plant taxa were recovered in feces of red-headed wood pigeon (Columba janthina nitens)(Ando et al 2013) while more than 100 taxa were found in diet studies of brown bear(Ursus arctos) (De Barba et al 2014) The relatively narrow diet spectrum of herbivorousgeese may lead to misunderstanding that this result of our study is merely an artefact dueto small sampling effort However this result is credible since these two geese species onlyfeed on Carex meadow where the dominant vegetation is Carex spp with other speciessuch as Poaceae and dicots (Zhao et al 2015) Even though other wetland plants exist theyusually composed only a small proportion of the geese diets

Quantification of food composition is another key concern in dietary analysis Althoughthe relative percentage of sequences was not truly a quantitative estimate of diet taxa ofthe majority sequences in this study were in accord with microhistologic observationswhich was considered an efficient way to provide quantitative results (Wang et al 2013)Discrepancies might come from the semi-quantitative nature of metabarcoding methods(Sun et al 2015) This is likely derived from PCR amplification which always entails biasescaused by universal primer-template mismatches annealing temperature or number ofPCR cycles (Zhan et al 2014Pintildeol et al 2015) Othermethods such as shotgun sequencingor metagenomic sequencing could be incorporated in the future to give information onabundances of food items (Srivathsan et al 2015)

Implications for waterbird conservation and wetland managementFor long-distance migratory waterbirds such as the wild geese in this study theirabundance and distribution are greatly influenced by diet availability and habitat use

Yang et al (2016) PeerJ DOI 107717peerj2345 1321

(Wang et al 2013) For example waterbirds may be restricted at (forced to leave) certainareas due to favoring (loss) of particular food (Wang et al 2013) while the recoveryof such food may contribute to return of bird populations (Noordhuis Van der Molenamp Van den Berg 2002) Results of both metabarcoding and microhistologic analysis inthis study revealed that Carex and Poaceae were dominant food components which is inaccordance with previous studies The increasing number of these two geese wintering atthe Shengjin Lake may be attributed to the expansion of Carex meadow which offers accessto abundant food resources (Zhao et al 2015) Considering the long-distance migratorycharacter of these birds it is important to maintain energy balances and good bodyconditions in wintering areas because this might further influence their departure datesand reproductive success after arriving at breeding areas (Prop Black amp Shimmings 2003)Based on this it is important for wetland managers to maintain the suitable habitats andfood resources for sustainable conservation of waterbirds which highlights the significanceof diet information Our study also indicated that overlap and dissimilarity existed betweenthe diets of these two geese Animals foraging in the same habitats may compete for limitedfood resources (Madsen amp Mortensen 1987) This discrepancy of food composition mayarise from the avoidance of inter-specific competition (Zhao et al 2015) However withthe increase of these two species in Shengjin Lake further research is needed to investigatethe mechanisms of food resource partitioning and spatial distribution

Shengjin Lake is one of themost important wintering sites for tens of thousands ofmigra-tory watebirds while annual life cycles of these birds depend on the whole migratory routeincluding breeding sites stop-over sites and wintering sites (Kear 2005) Thus a molecularreference library covering all the potential food items along the whole migratory routewill be useful both for understanding of wetland connections and waterbird conservationIn addition the ability of DNA metabarcoding to process lots of samples simultaneouslyenables rapid analyses and makes this method helpful for waterbird studies

ACKNOWLEDGEMENTSWe are very grateful to the stuff of the Shengjin Lake National Nature Reserve for theirexcellent assistance during the field work Great thanks to Zhujun Wang and An An forfeces collection in the field We thank Song Yang for collecting plants in the Shengjin LakeReserve We also thank Profs Zhenyu Li and Shuren Zhang for plant identification Specialthanks to Drs Meijuan Zhao Xin Wang Fanjuan Meng and Peihao Cong for preparingthe epidermis database and guiding microhistologic analysis

ADDITIONAL INFORMATION AND DECLARATIONS

FundingThis work was supported by the National Basic Research Program of China (973 ProgramGrant No 2012CB956104) to EZ the National Natural Science Foundation of China(Grant No 31370416) State Key Laboratory of Urban and Regional Ecology ChineseAcademy of Sciences (No SKLURE2013-1-05) to LC the National Natural Science

Yang et al (2016) PeerJ DOI 107717peerj2345 1421

Foundation of China (Grant No 31500315) to XW and 100-Talent Program of theChinese Academy of Sciences to AZ The funders had no role in study design datacollection and analysis decision to publish or preparation of the manuscript

Grant DisclosuresThe following grant information was disclosed by the authorsNational Basic Research Program of China 2012CB956104National Natural Science Foundation of China 31370416State Key Laboratory of Urban and Regional EcologyChinese Academy of Sciences SKLURE2013-1-05National Natural Science Foundation of China 31500315

Competing InterestsThe authors declare there are no competing interests

Author Contributionsbull Yuzhan Yang conceived and designed the experiments performed the experimentsanalyzed the data contributed reagentsmaterialsanalysis tools wrote the paperprepared figures andor tables reviewed drafts of the paperbull Aibin Zhan conceived and designed the experiments analyzed the data contributedreagentsmaterialsanalysis tools reviewed drafts of the paperbull Lei Cao conceived and designed the experiments contributed reagentsmaterialsanalysistools reviewed drafts of the paperbull Fanjuan Meng and Wenbin Xu performed the experiments contributed reagentsmate-rialsanalysis tools reviewed drafts of the paper

Animal EthicsThe following information was supplied relating to ethical approvals (ie approving bodyand any reference numbers)

Our research work did not involve capture or any direct manipulation or disturbancesof animals

Field Study PermissionsThe following information was supplied relating to field study approvals (ie approvingbody and any reference numbers)

We collected samples of plants and feces for molecular analyses We received permissionto access the reserve for sampling from the Shengjin Lake National Nature ReserveAdministration (Chizhou Anhui China) which is responsible for the management of theprotected area and wildlife We were forbidden to capture or disturb geese in the field

DNA DepositionThe following information was supplied regarding the deposition of DNA sequences

NCBI Sequence Read Archive (SRA) databaseAccession number SRP070470

Yang et al (2016) PeerJ DOI 107717peerj2345 1521

Data AvailabilityThe following information was supplied regarding data availability

The raw data has been supplied as a Supplemental Dataset

Supplemental InformationSupplemental information for this article can be found online at httpdxdoiorg107717peerj2345supplemental-information

REFERENCESAltschul SF GishWMillerWMyers EW Lipman DJ 1990 Basic local alignment

search tool Journal of Molecular Biology 215403ndash410 DOI 101006jmbi19909999Aacutelvarez I Wendel JF 2003 Ribosomal ITS sequences and plant phylogenetic inference

Molecular Phylogenetics and Evolution 29417ndash434DOI 101016S1055-7903(03)00208-2

Ando H Setsuko S Horikoshi K Suzuki H Umehara S Inoue-MurayamaM IsagiY 2013 Diet analysis by nextmdashgeneration sequencing indicates the frequentconsumption of introduced plants by the critically endangered redmdashheaded woodpigeon (Columba janthina nitens) in oceanic island habitats Ecology and Evolution34057ndash4069 DOI 101002ece3773

Barco A RaupachMJ Laakmann S NeumannH Knebelsberger T 2015 Identificationof North Sea molluscs with DNA barcodingMolecular Ecology Resources 16288ndash297DOI 1011111755-099812440

Bik HM Porazinska DL Creer S Caporaso JG Knight R ThomasWK 2012 Sequenc-ing our way towards understanding global eukaryotic biodiversity Trends in Ecologyand Evolution 27233ndash243 DOI 101016jtree201111010

Bohmann K Evans A Gilbert MTP Carvalho GR Creer S KnappM YuWD DeBruynM 2014 Environmental DNA for wildlife biology and biodiversity monitor-ing Trends in Ecology and Evolution 29358ndash367 DOI 101016jtree201404003

Brander LM Florax RJ Vermaat JE 2006 The empirics of wetland valuation acomprehensive summary and a meta-analysis of the literature Environmental andResource Economics 33223ndash250 DOI 101007s10640-005-3104-4

Chase MW Cowan RS Hollingsworth PM Van den Berg C Madrintildeaacuten S Petersen GSeberg O Jorgsensen T Cameron KM Carine M Pedersen N Hedderson TAJConrad F Salazar GA Richardson JE HollingsworthM Barraclough TG KellyL WilkinsonM 2007 A proposal for a standardised protocol to barcode all landplants Taxon 56295ndash299

Cota-Sanchez JH Remarchuk K Ubayasena K 2006 Ready-to-use DNA extracted witha CTAB method adapted for herbarium specimens and mucilaginous plant tissuePlant Molecular Biology Reporter 24161ndash167 DOI 101007BF02914055

Creer S Fonseca VG Porazinska DL Giblin-Davis RM SungW Power DM Packer MCarvalho GR Blaxter ML Lambshead P ThomasWK 2010 Ultrasequencing ofthe meiofaunal biosphere Practice pitfalls and promisesMolecular Ecology 194ndash20DOI 101111j1365-294X200904473x

Yang et al (2016) PeerJ DOI 107717peerj2345 1621

De BarbaMMiquel C Boyer F Mercier C Rioux D Coissac E Taberlet P 2014 DNAmetabarcoding multiplexing and validation of data accuracy for diet assessmentapplication to omnivorous dietMolecular Ecology Resources 14306ndash323DOI 1011111755-099812188

Deagle BE Kirkwood R Jarman SN 2009 Analysis of Australian fur seal diet bypyrosequencing prey DNA in faecesMolecular Ecology 182022ndash2038DOI 101111j1365-294X200904158x

Dunning LT Savolainen V 2010 Broad-scale amplification of matK for DNA barcodingplants a technical note Botanical Journal of the Linnean Society 1641ndash9DOI 101111j1095-8339201001071x

Edgar RC 2010 Search and clustering orders of magnitude faster than BLAST Bioinfor-matics 262460ndash2461 DOI 101093bioinformaticsbtq461

Edgar RC 2013 UPARSE highly accurate OTU sequences from microbial ampliconreads Nature Methods 10996ndash998 DOI 101038NMETH2604

Elliott TL Jonathan Davies T 2014 Challenges to barcoding an entire floraMolecularEcology Resources 14883ndash891 DOI 1011111755-099812277

Fazekas AJ Burgess KS Kesanakurti PR Graham SW Newmaster SG Husband BCPercy DM Hajibabaei M Barrett SC 2008Multiple multilocus DNA barcodesfrom the plastid genome discriminate plant species equally well PLoS ONE 3e2802DOI 101371journalpone0002802

Fonseca VG Carvalho GR SungW Johnson HF Power DM Neill SP Packer MBlaxter ML Labmshead PJD ThomasWK Creer S 2010 Second-generationenvironmental sequencing unmasks marine metazoan biodiversity Nature Commu-nications 1Article 98 DOI 101038ncomms1095

Ford CS Ayres KL Haider N Toomey N Van-Alpen-Stohl J 2009 Selection ofcandidate DNA barcoding regions for use on land plants Botanical Journal of theLinnean Society 1591ndash11 DOI 101111j1095-8339200800938x

Fox AD Bergersen E Tombre IM Madsen J 2007Minimal intra-seasonal dietaryoverlap of barnacle and pink-footed geese on their breeding grounds in SvalbardPolar Biology 30759ndash768 DOI 101007s00300-006-0235-1

Fox AD Cao L Zhang Y Barter M ZhaoMMeng FWang S 2011 Declines in thetuber-feeding waterbird guild at Shengjin Lake National Nature Reserve Chinandashabarometer of submerged macrophyte collapse Aquatic Conservation Marine andFreshwater Ecosystems 2182ndash91 DOI 101002aqc1154

Graham SW Reeves PA Burns AC Olmstead RG 2000Microstructural changes innoncoding chloroplast DNA interpretation evolution and utility of indels andinversions in basal angiosperm phylogenetic inference International Journal of PlantSciences 161S83ndashS96 DOI 101086317583

HasebeM Omori T NakazawaM Sano T KatoM Iwatsuki K 1994 rbcL genesequences provide evidence for the evolutionary lineages of leptosporangiate fernsProceedings of the National Academy of Sciences of the United States of America915730ndash5734 DOI 101073pnas91125730

Yang et al (2016) PeerJ DOI 107717peerj2345 1721

Hibert F Taberlet P Chave J Scotti-Saintagne C Sabatier D Richard-Hansen C 2013Unveiling the diet of elusive rainforest herbivores in next generation sequencing eraThe tapir as a case study PLoS ONE 8e60799 DOI 101371journalpone0060799

Hilu KW Liang H 1997 ThematK gene sequence variation and application in plantsystematics American Journal of Botany 84830ndash839 DOI 1023072445819

Hollingsworth PM Graham SW Little DP 2011 Choosing and using a plant DNAbarcode PLoS ONE 6e19254 DOI 101371journalpone0019254

James HF Burney DA 1997 The diet and ecology of Hawaiirsquos extinct flightless water-fowl evidence from coprolites Biological Journal of the Linnean Society 62279ndash297DOI 101111j1095-83121997tb01627x

Kear J 2005Ducks geese and swans Oxford Oxford University PressKressWJ Erickson DL Jones FA Swenson NG Perez R Sanjur O Bermingham E

2009 Plant DNA barcodes and a community phylogeny of a tropical forest dynamicsplot in Panama Proceedings of the National Academy of Sciences of the United States ofAmerica 10618621ndash18626 DOI 101073pnas0909820106

Li X Yang Y Henry RJ Rossetto MWang Y Chen S 2015 Plant DNA barcoding fromgene to genome Biological Reviews 90157ndash166 DOI 101111brv12104

Ma Z Cai Y Li B Chen J 2010Managing wetland habitats for waterbirds an interna-tional perspectiveWetlands 3015ndash27 DOI 101007s13157-009-0001-6

Madsen J Mortensen CE 1987Habitat exploitation and interspecific competition ofmoulting geese in East Greenl Ibis 12925ndash44DOI 101111j1474-919X1987tb03157x

Noordhuis R Van der Molen DT Van den BergMS 2002 Response of herbivorouswater-birds to the return of Chara in Lake Veluwemeer The Netherlands AquaticBotany 72349ndash367 DOI 101016S0304-3770(01)00210-8

Parameswaran P Jalili R Tao L Shokralla S Gharizadeh B Ronaghi M Fire AZ 2007A pyrosequencing-tailored nucleotide barcode design unveils opportunities for large-scale sample multiplexing Nucleic Acids Research 35e130 DOI 101093nargkm760

Pintildeol J Mir G Gomez-Polo P Agustiacute N 2015 Universal and blocking primermismatches limit the use of high-throughput DNA sequencing for the quanti-tative metabarcoding of arthropodsMolecular Ecology Resources 15819ndash830DOI 1011111755-099812355

Pompanon F Deagle BE SymondsonWO Brown DS Jarman SN Taberlet P 2012Who is eating what diet assessment using next generation sequencingMolecularEcology 211931ndash1950 DOI 101111j1365-294X201105403x

Prop J Black JM Shimmings P 2003 Travel schedules to the high arctic barnaclegeese trademdashoff the timing of migration with accumulation of fat deposits Oikos103403ndash414 DOI 101034j1600-0706200312042x

Rayeacute G Miquel C Coissac E Redjadj C Loison A Taberlet P 2011 New insightson diet variability revealed by DNA barcoding and high-throughput sequenc-ing chamois diet in autumn as a case study Ecological Research 26265ndash276DOI 101007s11284-010-0780-5

Yang et al (2016) PeerJ DOI 107717peerj2345 1821

Reynolds C Miranda NA Cumming GS 2015 The role of waterbirds in the dispersalof aquatic alien and invasive species Diversity and Distributions 21744ndash754DOI 101111ddi12334

Samelius G Alisauskas RT 1999 Diet and growth of glaucous gulls at a large Arcticgoose colony Canadian Journal of Zoology 771327ndash1331 DOI 101139z99-091

Sang T Crawford DJ Stuessy TF 1997 Chloroplast DNA phylogeny reticulateevolution and biogeography of Paeonia (Paeoniaceae) American Journal of Botany841120ndash1136

ShehzadW Riaz T NawazMAMiquel C Poillot C Shah SA Pompanon F CoissacE Taberlet P 2012 Carnivore diet analysis based on next-generation sequencingapplication to the leopard cat (Prionailurus bengalensis) in PakistanMolecularEcology 211951ndash1965 DOI 101111j1365-294X201105424x

Shokralla S Gibson JF Nikbakht H Janzen DH HallwachsW Hajibabaei M 2014Next-generation DNA barcoding using next-generation sequencing to enhance andaccelerate DNA barcode capture from single specimensMolecular Ecology Resources14892ndash901 DOI 1011111755-099812236

Shokralla S Spall JL Gibson JF 2012 Next-generation sequencing technologies forenvironmental DNA researchMolecular Ecology 211794ndash1805DOI 101111j1365-294X201205538x

Srivathsan A Sha J Vogler AP Meier R 2015 Comparing the effectiveness of metage-nomics and metabarcoding for diet analysis of a leaf-feeding monkey (Pygathrixnemaeus)Molecular Ecology Resources 15250ndash261 DOI 1011111755-099812302

Starr JR Naczi RFC Chouinard BN 2009 Plant DNA barcodes and species res-olution in sedge (Carex Cyperaceae)Molecular Ecology Resources 9151ndash163DOI 101111j1755-0998200902640x

Sun C Zhao Y Li H Dong Y MacIsaac HJ Zhan A 2015 Unreliable quantificationof species abundance based on high-throughput sequencing data of zooplanktoncommunities Aquatic Biology 249ndash15 DOI 103354ab00629

Swennen C Yu YT 2005 Food and feeding behavior of the black-faced SpoonbillWaterbirds 2819ndash27 DOI 1016751524-4695(2005)028[0019FAFBOT]20CO2

SymondsonWOC 2002Molecular identification of prey in predator dietsMolecularEcology 11627ndash641 DOI 101046j1365-294X200201471x

Taberlet P Coissac E Pompanon F Brochmann CWillerslev E 2012 Towards next-generation biodiversity assessment using DNA metabarcodingMolecular Ecology212045ndash2050 DOI 101111j1365-294X201205470x

Taberlet P Coissac E Pompanon F Gielly L Miquel C Valentini A Vermat TCorthier G Brocmann CWillerslev E 2007 Power and limitations of the chloro-plast trnL (UAA) intron for plant DNA barcoding Nucleic Acids Research 35e14DOI 101093nargkl938

Taberlet P Gielly L Pautou G Bouvet J 1991 Universal primers for amplifica-tion of three non-coding regions of chloroplast DNA Plant Molecular Biology171105ndash1109 DOI 101007BF00037152

Yang et al (2016) PeerJ DOI 107717peerj2345 1921

Tamura K Stecher G Peterson D Filipski A Kumar S 2013MEGA6 molecular evolu-tionary genetics analysis version 60MolecularBiology and Evolution 302725ndash2729DOI 101093molbevmst197

Tate JA Simpson BB 2003 Paraphyly of Tarasa (Malvaceae) and diverse origins of thepolyploidy species Systematic Botany 28723ndash737 DOI 10104302-641

Valentini A Pompanon F Taberlet P 2009 DNA barcoding for ecologists Trends inEcology and Evolution 24110ndash117 DOI 101016jtree200809011

Wang X Fox AD Cong P Barter M Cao L 2012 Changes in the distribution and abun-dance of wintering Lesser White-fronted Geese Anser erythropus in eastern ChinaBird Conservation International 22128ndash134 DOI 101017S095927091100030X

Wang X Fox AD Cong P Cao L 2013 Food constraints explain the restricted dis-tribution of wintering Lesser white-fronted Geese Anser erythropus in China Ibis155576ndash592 DOI 101111ibi12039

Whitlock BA Hale AM Groff PA 2010 Intraspecific inversions pose a challenge for thetrnH-psbA plant DNA barcode PLoS ONE 5e11533DOI 101371journalpone0011533

Wolfe KH LiWH Sharp PM 1987 Rates of nucleotide substitution vary greatlyamong plant mitochondrial chloroplast and nuclear DNAs Proceedings ofthe National Academy of Sciences of the United States of America 849054ndash9058DOI 101073pnas84249054

Xu C DongW Shi S Cheng T Li C Liu YWu PWuH Gao P Zhou S 2015 Acceler-ating plant DNA barcode reference library construction using herbarium specimensimproved experimental techniquesMolecular Ecology Resources 151366ndash1374DOI 1011111755-099812413

Xu L XuW Sun Q Zhou Z Shen J Zhao X 2008 Flora and vegetation in ShengjinLake Journal of Wuhan Botanical Research 27264ndash270

Zhan A Bailey SA Heath DDMacIsaac HJ 2014 Performance comparison of geneticmarkers for high-throughput sequencing-based biodiversity assessment in complexcommunitiesMolecular Ecology Resources 141049ndash1059DOI 1011111755-099812254

Zhan A MacIsaac HJ 2015 Rare biosphere exploration using high-throughput sequenc-ing research progress and perspectives Conservation Genetics 16513ndash522DOI 101007s10592-014-0678-9

Zhang Y Cao L Barter M Fox AD ZhaoMMeng F Shi H 2011 Changing distributionand abundance of Swan Goose Anser cygnoides in the Yangtze River floodplainthe likely loss of a very important wintering site Bird Conservation International2136ndash48 DOI 101017S0959270910000201

ZhaoM Cao L Fox AD 2013 Distribution and diet of wintering Tundra Bean GeeseAnser fabalis serrirostris at Shengjin LakeYangtze River floodplain ChinaWildfow6052ndash63

ZhaoM Cao L KlaassenM Zhang Y Fox AD 2015 Avoiding competition Site usediet and foraging behaviours in two similarly sized geese wintering in China Ardea10327ndash38 DOI 105253ardev103i1a3

Yang et al (2016) PeerJ DOI 107717peerj2345 2021

ZhaoM Cong P Barter M Fox AD Cao L 2012 The changing abundanceand distribu-tion of Greater White-fronted Geese Anser albifrons in the Yangtze River floodplainimpacts of recenthydrological changes Bird Conservation International 22135ndash143DOI 101017S0959270911000542

Yang et al (2016) PeerJ DOI 107717peerj2345 2121

Page 11: Selection of a marker gene to construct and the application ...We concluded that DNA metabarcoding provides new perspectives for studies of herbivorous waterbird diets and inter-specific

Table 5 Summary of the process and results of high-throughput sequencing analysis

Sample Pair-endsequences

Sequences for whichprimers and tags wereidentified and with lengthgt100 bp

Uniquesequences

OTUs Food items

GWFG1 16303 8627 1288 78 8GWFG2 25482 13449 1091 102 8GWFG3 19063 10056 1277 48 10GWFG4 23856 12548 1419 114 8GWFG5 20955 11249 1720 123 9GWFG6 11677 7205 973 52 9GWFG7 13377 6782 1328 59 9GWFG8 7749 3959 774 89 9GWFG9 16833 8799 1436 90 6GWFG10 18474 9819 449 32 9GWFG11 19648 10458 617 31 6BG1 20225 10254 784 23 4BG2 14195 7161 564 16 2BG3 2229 1149 255 12 4BG4 517 268 77 8 3BG5 28152 14033 1000 15 3BG6 16723 8484 740 17 4BG7 30166 15403 974 15 4BG8 30928 15706 1028 15 3BG9 8382 4489 446 13 4BG10 10714 5526 537 13 4

NotesGWFG Greater white-fronted goose BG Bean goose

DISCUSSIONMarker selection and reference library constructing for diet analysisWith greatly reduced cost extremely high throughput and information contentmetabarcoding has revolutionized the exploration and quantification of dietary analysiswith noninvasive samples containing degraded DNA (Fonseca et al 2010 Shokralla et al2014) Despite enormous potential to boost data acquisition successful application of thistechnology relies greatly on the power and efficiency of genetic markers and correspondingprimers (Bik et al 2012 Zhan et al 2014) In order to select the most appropriate markergene for our study we compared the performance of eight commonly used chloroplastgenes (rbcL rpoB rpoC1 matK trnL trnH-psbA atpF-atpH and psbK-psbI) and theircorresponding primers Although a higher level of discriminating power was shown inseveral studies atpF-atpH and psbK-psbI were not as commonly used as other barcodinggenes (Hollingsworth Graham amp Little 2011) As one of the most rapidly evolving codinggenes of plastid genomes matK was considered as the closest plant analogue to theanimal barcode COI (Hilu amp Liang 1997) However matK was difficult to amplify using

Yang et al (2016) PeerJ DOI 107717peerj2345 1121

Table 6 List of the lowest taxonomic food items in the diet of geese

Food items Level of identification GWFG BG

N reads Fs () Fm () N reads Fs () Fm ()

Poaceae spp (except Poa annua) Family 51705 4798 4568 0 000 000Poa annua Species 23554 2186 000 167 020 000Carex heterolepis Species 18867 1751 1639 81457 9949 6285Carex spp (except Carex heterolepis) Genus 9706 901 231 191 023 349Alopecurus aequalis Species 3458 321 000 0 000 000Potentilla chinensis Species 184 017 118 65 008 206Cynodon dactylon Species 155 014 000 0 000 000Polygonum spp Genus 56 005 000 0 000 000Stellaria media Species 26 002 000 0 000 000Ranunculus chinensis Species 14 002 000 0 000 000Lapsana apogonoides Species 11 002 000 0 000 000Asteraceae sp Genus 16 001 233 0 000 1455Alopecurus Genus 0 000 3093 0 000 1318Carex thunbergii Species 0 000 054 0 000 279Fabaceae sp Genus 0 000 064 0 000 108

NotesGWFG Greater white-fronted goose BG Bean goose Fs percentage of sequences in DNA metabarcoding Fm percentage of epidermis squares in microhistological analysis

available primer sets with only 43 of successful amplification in this study In spite ofthe higher species discrimination success of trnH-psbA than rbcL+matK in some groupsthe presence of duplicated loci microinversions and premature termination of readsby mononucleotide repeats lead to considerable proportion of low-quality sequences andover-estimation of genetic difference when using trnH-psbA (Graham et al 2000WhitlockHale amp Groff 2010) In contrast the barcode region of rbcL is easy to amplify sequenceand align in most plants and was recommended as the standard barcode for land plants(Chase et al 2007) The relatively modest discriminating power (compared to trnL) pre-cludes its application for our study aiming to recover high resolution of food items Conse-quently trnL was selected out of eight candidate markers with 100 amplification successmore than 90 of high quality sequences and relatively large inter-specific divergence

One of the biggest obstacles in biodiversity assessment and dietary analysis is the lack ofa comprehensive reference library without which it is impossible to accurately interpretand assign sequences generated from high-throughput sequencing (Valentini Pompanonamp Taberlet 2009 Barco et al 2015) In this study we constructed a local reference libraryby amplifying the most common species (70 morpho-species in total) during the winteringperiod with the trnL gene Although not all of them could be identified at species-levelwith trnL due to relatively low inter-specific divergence many species could be separatedwith distinctive sequences Previous studies have recommended group-specific barcodes todifferentiate closely related plants at the species level (Li et al 2015) For instance matKhas been proved to be more efficient for the discrimination of Carex spp (Starr Naczi ampChouinard 2009) However the primer set of matK failed to amplify species of Carex spp

Yang et al (2016) PeerJ DOI 107717peerj2345 1221

in our study suggesting the universality of selected primer pairs should be tested in eachstudy (Zhan et al 2014)

Applications of metabarcoding for geese diet analysisA variety of recent studies have demonstrated the great potential of metabarcoding fordietary analysis mainly owing to the high throughput high discriminating power and theability to process large-scale samples simultaneously (Creer et al 2010 Taberlet et al 2012Shehzad et al 2012) In this study we applied this method to recover diets of herbivorousgeese and provided standard protocols for dietary analysis of these two ecologicallyimportant waterbirds Our results further proved the more objective less experience-dependent and more time-efficient character of DNA metabarcoding However not allthe species in the reference library could be identified at species-level owing to low inter-specific divergence We suggest that multiple group-specific markers to be incorporatedin the future as in De Barba et al (2014) Two species Carex thunbergii and Fabaceae spwere only discovered via microhistologic analysis rather than metabarcoding This failuremight reflect the biased fragment amplification of current technology of which dominanttemplates could act as inhibitors of less dominant species (Pintildeol et al 2015) Howeverthree species of Poaceae were only discovered using metabarcoding In total more taxaand higher resolution were attained using metabarcoding But microhistology still proveda powerful supplementary Previous studies using metabarcoding usually detected dozensof food items even as many as more than one hundred species For instance 18 taxaprey were identified for leopard cat (Prionailurus bengalensis) (Shehzad et al 2012) 44plant taxa were recovered in feces of red-headed wood pigeon (Columba janthina nitens)(Ando et al 2013) while more than 100 taxa were found in diet studies of brown bear(Ursus arctos) (De Barba et al 2014) The relatively narrow diet spectrum of herbivorousgeese may lead to misunderstanding that this result of our study is merely an artefact dueto small sampling effort However this result is credible since these two geese species onlyfeed on Carex meadow where the dominant vegetation is Carex spp with other speciessuch as Poaceae and dicots (Zhao et al 2015) Even though other wetland plants exist theyusually composed only a small proportion of the geese diets

Quantification of food composition is another key concern in dietary analysis Althoughthe relative percentage of sequences was not truly a quantitative estimate of diet taxa ofthe majority sequences in this study were in accord with microhistologic observationswhich was considered an efficient way to provide quantitative results (Wang et al 2013)Discrepancies might come from the semi-quantitative nature of metabarcoding methods(Sun et al 2015) This is likely derived from PCR amplification which always entails biasescaused by universal primer-template mismatches annealing temperature or number ofPCR cycles (Zhan et al 2014Pintildeol et al 2015) Othermethods such as shotgun sequencingor metagenomic sequencing could be incorporated in the future to give information onabundances of food items (Srivathsan et al 2015)

Implications for waterbird conservation and wetland managementFor long-distance migratory waterbirds such as the wild geese in this study theirabundance and distribution are greatly influenced by diet availability and habitat use

Yang et al (2016) PeerJ DOI 107717peerj2345 1321

(Wang et al 2013) For example waterbirds may be restricted at (forced to leave) certainareas due to favoring (loss) of particular food (Wang et al 2013) while the recoveryof such food may contribute to return of bird populations (Noordhuis Van der Molenamp Van den Berg 2002) Results of both metabarcoding and microhistologic analysis inthis study revealed that Carex and Poaceae were dominant food components which is inaccordance with previous studies The increasing number of these two geese wintering atthe Shengjin Lake may be attributed to the expansion of Carex meadow which offers accessto abundant food resources (Zhao et al 2015) Considering the long-distance migratorycharacter of these birds it is important to maintain energy balances and good bodyconditions in wintering areas because this might further influence their departure datesand reproductive success after arriving at breeding areas (Prop Black amp Shimmings 2003)Based on this it is important for wetland managers to maintain the suitable habitats andfood resources for sustainable conservation of waterbirds which highlights the significanceof diet information Our study also indicated that overlap and dissimilarity existed betweenthe diets of these two geese Animals foraging in the same habitats may compete for limitedfood resources (Madsen amp Mortensen 1987) This discrepancy of food composition mayarise from the avoidance of inter-specific competition (Zhao et al 2015) However withthe increase of these two species in Shengjin Lake further research is needed to investigatethe mechanisms of food resource partitioning and spatial distribution

Shengjin Lake is one of themost important wintering sites for tens of thousands ofmigra-tory watebirds while annual life cycles of these birds depend on the whole migratory routeincluding breeding sites stop-over sites and wintering sites (Kear 2005) Thus a molecularreference library covering all the potential food items along the whole migratory routewill be useful both for understanding of wetland connections and waterbird conservationIn addition the ability of DNA metabarcoding to process lots of samples simultaneouslyenables rapid analyses and makes this method helpful for waterbird studies

ACKNOWLEDGEMENTSWe are very grateful to the stuff of the Shengjin Lake National Nature Reserve for theirexcellent assistance during the field work Great thanks to Zhujun Wang and An An forfeces collection in the field We thank Song Yang for collecting plants in the Shengjin LakeReserve We also thank Profs Zhenyu Li and Shuren Zhang for plant identification Specialthanks to Drs Meijuan Zhao Xin Wang Fanjuan Meng and Peihao Cong for preparingthe epidermis database and guiding microhistologic analysis

ADDITIONAL INFORMATION AND DECLARATIONS

FundingThis work was supported by the National Basic Research Program of China (973 ProgramGrant No 2012CB956104) to EZ the National Natural Science Foundation of China(Grant No 31370416) State Key Laboratory of Urban and Regional Ecology ChineseAcademy of Sciences (No SKLURE2013-1-05) to LC the National Natural Science

Yang et al (2016) PeerJ DOI 107717peerj2345 1421

Foundation of China (Grant No 31500315) to XW and 100-Talent Program of theChinese Academy of Sciences to AZ The funders had no role in study design datacollection and analysis decision to publish or preparation of the manuscript

Grant DisclosuresThe following grant information was disclosed by the authorsNational Basic Research Program of China 2012CB956104National Natural Science Foundation of China 31370416State Key Laboratory of Urban and Regional EcologyChinese Academy of Sciences SKLURE2013-1-05National Natural Science Foundation of China 31500315

Competing InterestsThe authors declare there are no competing interests

Author Contributionsbull Yuzhan Yang conceived and designed the experiments performed the experimentsanalyzed the data contributed reagentsmaterialsanalysis tools wrote the paperprepared figures andor tables reviewed drafts of the paperbull Aibin Zhan conceived and designed the experiments analyzed the data contributedreagentsmaterialsanalysis tools reviewed drafts of the paperbull Lei Cao conceived and designed the experiments contributed reagentsmaterialsanalysistools reviewed drafts of the paperbull Fanjuan Meng and Wenbin Xu performed the experiments contributed reagentsmate-rialsanalysis tools reviewed drafts of the paper

Animal EthicsThe following information was supplied relating to ethical approvals (ie approving bodyand any reference numbers)

Our research work did not involve capture or any direct manipulation or disturbancesof animals

Field Study PermissionsThe following information was supplied relating to field study approvals (ie approvingbody and any reference numbers)

We collected samples of plants and feces for molecular analyses We received permissionto access the reserve for sampling from the Shengjin Lake National Nature ReserveAdministration (Chizhou Anhui China) which is responsible for the management of theprotected area and wildlife We were forbidden to capture or disturb geese in the field

DNA DepositionThe following information was supplied regarding the deposition of DNA sequences

NCBI Sequence Read Archive (SRA) databaseAccession number SRP070470

Yang et al (2016) PeerJ DOI 107717peerj2345 1521

Data AvailabilityThe following information was supplied regarding data availability

The raw data has been supplied as a Supplemental Dataset

Supplemental InformationSupplemental information for this article can be found online at httpdxdoiorg107717peerj2345supplemental-information

REFERENCESAltschul SF GishWMillerWMyers EW Lipman DJ 1990 Basic local alignment

search tool Journal of Molecular Biology 215403ndash410 DOI 101006jmbi19909999Aacutelvarez I Wendel JF 2003 Ribosomal ITS sequences and plant phylogenetic inference

Molecular Phylogenetics and Evolution 29417ndash434DOI 101016S1055-7903(03)00208-2

Ando H Setsuko S Horikoshi K Suzuki H Umehara S Inoue-MurayamaM IsagiY 2013 Diet analysis by nextmdashgeneration sequencing indicates the frequentconsumption of introduced plants by the critically endangered redmdashheaded woodpigeon (Columba janthina nitens) in oceanic island habitats Ecology and Evolution34057ndash4069 DOI 101002ece3773

Barco A RaupachMJ Laakmann S NeumannH Knebelsberger T 2015 Identificationof North Sea molluscs with DNA barcodingMolecular Ecology Resources 16288ndash297DOI 1011111755-099812440

Bik HM Porazinska DL Creer S Caporaso JG Knight R ThomasWK 2012 Sequenc-ing our way towards understanding global eukaryotic biodiversity Trends in Ecologyand Evolution 27233ndash243 DOI 101016jtree201111010

Bohmann K Evans A Gilbert MTP Carvalho GR Creer S KnappM YuWD DeBruynM 2014 Environmental DNA for wildlife biology and biodiversity monitor-ing Trends in Ecology and Evolution 29358ndash367 DOI 101016jtree201404003

Brander LM Florax RJ Vermaat JE 2006 The empirics of wetland valuation acomprehensive summary and a meta-analysis of the literature Environmental andResource Economics 33223ndash250 DOI 101007s10640-005-3104-4

Chase MW Cowan RS Hollingsworth PM Van den Berg C Madrintildeaacuten S Petersen GSeberg O Jorgsensen T Cameron KM Carine M Pedersen N Hedderson TAJConrad F Salazar GA Richardson JE HollingsworthM Barraclough TG KellyL WilkinsonM 2007 A proposal for a standardised protocol to barcode all landplants Taxon 56295ndash299

Cota-Sanchez JH Remarchuk K Ubayasena K 2006 Ready-to-use DNA extracted witha CTAB method adapted for herbarium specimens and mucilaginous plant tissuePlant Molecular Biology Reporter 24161ndash167 DOI 101007BF02914055

Creer S Fonseca VG Porazinska DL Giblin-Davis RM SungW Power DM Packer MCarvalho GR Blaxter ML Lambshead P ThomasWK 2010 Ultrasequencing ofthe meiofaunal biosphere Practice pitfalls and promisesMolecular Ecology 194ndash20DOI 101111j1365-294X200904473x

Yang et al (2016) PeerJ DOI 107717peerj2345 1621

De BarbaMMiquel C Boyer F Mercier C Rioux D Coissac E Taberlet P 2014 DNAmetabarcoding multiplexing and validation of data accuracy for diet assessmentapplication to omnivorous dietMolecular Ecology Resources 14306ndash323DOI 1011111755-099812188

Deagle BE Kirkwood R Jarman SN 2009 Analysis of Australian fur seal diet bypyrosequencing prey DNA in faecesMolecular Ecology 182022ndash2038DOI 101111j1365-294X200904158x

Dunning LT Savolainen V 2010 Broad-scale amplification of matK for DNA barcodingplants a technical note Botanical Journal of the Linnean Society 1641ndash9DOI 101111j1095-8339201001071x

Edgar RC 2010 Search and clustering orders of magnitude faster than BLAST Bioinfor-matics 262460ndash2461 DOI 101093bioinformaticsbtq461

Edgar RC 2013 UPARSE highly accurate OTU sequences from microbial ampliconreads Nature Methods 10996ndash998 DOI 101038NMETH2604

Elliott TL Jonathan Davies T 2014 Challenges to barcoding an entire floraMolecularEcology Resources 14883ndash891 DOI 1011111755-099812277

Fazekas AJ Burgess KS Kesanakurti PR Graham SW Newmaster SG Husband BCPercy DM Hajibabaei M Barrett SC 2008Multiple multilocus DNA barcodesfrom the plastid genome discriminate plant species equally well PLoS ONE 3e2802DOI 101371journalpone0002802

Fonseca VG Carvalho GR SungW Johnson HF Power DM Neill SP Packer MBlaxter ML Labmshead PJD ThomasWK Creer S 2010 Second-generationenvironmental sequencing unmasks marine metazoan biodiversity Nature Commu-nications 1Article 98 DOI 101038ncomms1095

Ford CS Ayres KL Haider N Toomey N Van-Alpen-Stohl J 2009 Selection ofcandidate DNA barcoding regions for use on land plants Botanical Journal of theLinnean Society 1591ndash11 DOI 101111j1095-8339200800938x

Fox AD Bergersen E Tombre IM Madsen J 2007Minimal intra-seasonal dietaryoverlap of barnacle and pink-footed geese on their breeding grounds in SvalbardPolar Biology 30759ndash768 DOI 101007s00300-006-0235-1

Fox AD Cao L Zhang Y Barter M ZhaoMMeng FWang S 2011 Declines in thetuber-feeding waterbird guild at Shengjin Lake National Nature Reserve Chinandashabarometer of submerged macrophyte collapse Aquatic Conservation Marine andFreshwater Ecosystems 2182ndash91 DOI 101002aqc1154

Graham SW Reeves PA Burns AC Olmstead RG 2000Microstructural changes innoncoding chloroplast DNA interpretation evolution and utility of indels andinversions in basal angiosperm phylogenetic inference International Journal of PlantSciences 161S83ndashS96 DOI 101086317583

HasebeM Omori T NakazawaM Sano T KatoM Iwatsuki K 1994 rbcL genesequences provide evidence for the evolutionary lineages of leptosporangiate fernsProceedings of the National Academy of Sciences of the United States of America915730ndash5734 DOI 101073pnas91125730

Yang et al (2016) PeerJ DOI 107717peerj2345 1721

Hibert F Taberlet P Chave J Scotti-Saintagne C Sabatier D Richard-Hansen C 2013Unveiling the diet of elusive rainforest herbivores in next generation sequencing eraThe tapir as a case study PLoS ONE 8e60799 DOI 101371journalpone0060799

Hilu KW Liang H 1997 ThematK gene sequence variation and application in plantsystematics American Journal of Botany 84830ndash839 DOI 1023072445819

Hollingsworth PM Graham SW Little DP 2011 Choosing and using a plant DNAbarcode PLoS ONE 6e19254 DOI 101371journalpone0019254

James HF Burney DA 1997 The diet and ecology of Hawaiirsquos extinct flightless water-fowl evidence from coprolites Biological Journal of the Linnean Society 62279ndash297DOI 101111j1095-83121997tb01627x

Kear J 2005Ducks geese and swans Oxford Oxford University PressKressWJ Erickson DL Jones FA Swenson NG Perez R Sanjur O Bermingham E

2009 Plant DNA barcodes and a community phylogeny of a tropical forest dynamicsplot in Panama Proceedings of the National Academy of Sciences of the United States ofAmerica 10618621ndash18626 DOI 101073pnas0909820106

Li X Yang Y Henry RJ Rossetto MWang Y Chen S 2015 Plant DNA barcoding fromgene to genome Biological Reviews 90157ndash166 DOI 101111brv12104

Ma Z Cai Y Li B Chen J 2010Managing wetland habitats for waterbirds an interna-tional perspectiveWetlands 3015ndash27 DOI 101007s13157-009-0001-6

Madsen J Mortensen CE 1987Habitat exploitation and interspecific competition ofmoulting geese in East Greenl Ibis 12925ndash44DOI 101111j1474-919X1987tb03157x

Noordhuis R Van der Molen DT Van den BergMS 2002 Response of herbivorouswater-birds to the return of Chara in Lake Veluwemeer The Netherlands AquaticBotany 72349ndash367 DOI 101016S0304-3770(01)00210-8

Parameswaran P Jalili R Tao L Shokralla S Gharizadeh B Ronaghi M Fire AZ 2007A pyrosequencing-tailored nucleotide barcode design unveils opportunities for large-scale sample multiplexing Nucleic Acids Research 35e130 DOI 101093nargkm760

Pintildeol J Mir G Gomez-Polo P Agustiacute N 2015 Universal and blocking primermismatches limit the use of high-throughput DNA sequencing for the quanti-tative metabarcoding of arthropodsMolecular Ecology Resources 15819ndash830DOI 1011111755-099812355

Pompanon F Deagle BE SymondsonWO Brown DS Jarman SN Taberlet P 2012Who is eating what diet assessment using next generation sequencingMolecularEcology 211931ndash1950 DOI 101111j1365-294X201105403x

Prop J Black JM Shimmings P 2003 Travel schedules to the high arctic barnaclegeese trademdashoff the timing of migration with accumulation of fat deposits Oikos103403ndash414 DOI 101034j1600-0706200312042x

Rayeacute G Miquel C Coissac E Redjadj C Loison A Taberlet P 2011 New insightson diet variability revealed by DNA barcoding and high-throughput sequenc-ing chamois diet in autumn as a case study Ecological Research 26265ndash276DOI 101007s11284-010-0780-5

Yang et al (2016) PeerJ DOI 107717peerj2345 1821

Reynolds C Miranda NA Cumming GS 2015 The role of waterbirds in the dispersalof aquatic alien and invasive species Diversity and Distributions 21744ndash754DOI 101111ddi12334

Samelius G Alisauskas RT 1999 Diet and growth of glaucous gulls at a large Arcticgoose colony Canadian Journal of Zoology 771327ndash1331 DOI 101139z99-091

Sang T Crawford DJ Stuessy TF 1997 Chloroplast DNA phylogeny reticulateevolution and biogeography of Paeonia (Paeoniaceae) American Journal of Botany841120ndash1136

ShehzadW Riaz T NawazMAMiquel C Poillot C Shah SA Pompanon F CoissacE Taberlet P 2012 Carnivore diet analysis based on next-generation sequencingapplication to the leopard cat (Prionailurus bengalensis) in PakistanMolecularEcology 211951ndash1965 DOI 101111j1365-294X201105424x

Shokralla S Gibson JF Nikbakht H Janzen DH HallwachsW Hajibabaei M 2014Next-generation DNA barcoding using next-generation sequencing to enhance andaccelerate DNA barcode capture from single specimensMolecular Ecology Resources14892ndash901 DOI 1011111755-099812236

Shokralla S Spall JL Gibson JF 2012 Next-generation sequencing technologies forenvironmental DNA researchMolecular Ecology 211794ndash1805DOI 101111j1365-294X201205538x

Srivathsan A Sha J Vogler AP Meier R 2015 Comparing the effectiveness of metage-nomics and metabarcoding for diet analysis of a leaf-feeding monkey (Pygathrixnemaeus)Molecular Ecology Resources 15250ndash261 DOI 1011111755-099812302

Starr JR Naczi RFC Chouinard BN 2009 Plant DNA barcodes and species res-olution in sedge (Carex Cyperaceae)Molecular Ecology Resources 9151ndash163DOI 101111j1755-0998200902640x

Sun C Zhao Y Li H Dong Y MacIsaac HJ Zhan A 2015 Unreliable quantificationof species abundance based on high-throughput sequencing data of zooplanktoncommunities Aquatic Biology 249ndash15 DOI 103354ab00629

Swennen C Yu YT 2005 Food and feeding behavior of the black-faced SpoonbillWaterbirds 2819ndash27 DOI 1016751524-4695(2005)028[0019FAFBOT]20CO2

SymondsonWOC 2002Molecular identification of prey in predator dietsMolecularEcology 11627ndash641 DOI 101046j1365-294X200201471x

Taberlet P Coissac E Pompanon F Brochmann CWillerslev E 2012 Towards next-generation biodiversity assessment using DNA metabarcodingMolecular Ecology212045ndash2050 DOI 101111j1365-294X201205470x

Taberlet P Coissac E Pompanon F Gielly L Miquel C Valentini A Vermat TCorthier G Brocmann CWillerslev E 2007 Power and limitations of the chloro-plast trnL (UAA) intron for plant DNA barcoding Nucleic Acids Research 35e14DOI 101093nargkl938

Taberlet P Gielly L Pautou G Bouvet J 1991 Universal primers for amplifica-tion of three non-coding regions of chloroplast DNA Plant Molecular Biology171105ndash1109 DOI 101007BF00037152

Yang et al (2016) PeerJ DOI 107717peerj2345 1921

Tamura K Stecher G Peterson D Filipski A Kumar S 2013MEGA6 molecular evolu-tionary genetics analysis version 60MolecularBiology and Evolution 302725ndash2729DOI 101093molbevmst197

Tate JA Simpson BB 2003 Paraphyly of Tarasa (Malvaceae) and diverse origins of thepolyploidy species Systematic Botany 28723ndash737 DOI 10104302-641

Valentini A Pompanon F Taberlet P 2009 DNA barcoding for ecologists Trends inEcology and Evolution 24110ndash117 DOI 101016jtree200809011

Wang X Fox AD Cong P Barter M Cao L 2012 Changes in the distribution and abun-dance of wintering Lesser White-fronted Geese Anser erythropus in eastern ChinaBird Conservation International 22128ndash134 DOI 101017S095927091100030X

Wang X Fox AD Cong P Cao L 2013 Food constraints explain the restricted dis-tribution of wintering Lesser white-fronted Geese Anser erythropus in China Ibis155576ndash592 DOI 101111ibi12039

Whitlock BA Hale AM Groff PA 2010 Intraspecific inversions pose a challenge for thetrnH-psbA plant DNA barcode PLoS ONE 5e11533DOI 101371journalpone0011533

Wolfe KH LiWH Sharp PM 1987 Rates of nucleotide substitution vary greatlyamong plant mitochondrial chloroplast and nuclear DNAs Proceedings ofthe National Academy of Sciences of the United States of America 849054ndash9058DOI 101073pnas84249054

Xu C DongW Shi S Cheng T Li C Liu YWu PWuH Gao P Zhou S 2015 Acceler-ating plant DNA barcode reference library construction using herbarium specimensimproved experimental techniquesMolecular Ecology Resources 151366ndash1374DOI 1011111755-099812413

Xu L XuW Sun Q Zhou Z Shen J Zhao X 2008 Flora and vegetation in ShengjinLake Journal of Wuhan Botanical Research 27264ndash270

Zhan A Bailey SA Heath DDMacIsaac HJ 2014 Performance comparison of geneticmarkers for high-throughput sequencing-based biodiversity assessment in complexcommunitiesMolecular Ecology Resources 141049ndash1059DOI 1011111755-099812254

Zhan A MacIsaac HJ 2015 Rare biosphere exploration using high-throughput sequenc-ing research progress and perspectives Conservation Genetics 16513ndash522DOI 101007s10592-014-0678-9

Zhang Y Cao L Barter M Fox AD ZhaoMMeng F Shi H 2011 Changing distributionand abundance of Swan Goose Anser cygnoides in the Yangtze River floodplainthe likely loss of a very important wintering site Bird Conservation International2136ndash48 DOI 101017S0959270910000201

ZhaoM Cao L Fox AD 2013 Distribution and diet of wintering Tundra Bean GeeseAnser fabalis serrirostris at Shengjin LakeYangtze River floodplain ChinaWildfow6052ndash63

ZhaoM Cao L KlaassenM Zhang Y Fox AD 2015 Avoiding competition Site usediet and foraging behaviours in two similarly sized geese wintering in China Ardea10327ndash38 DOI 105253ardev103i1a3

Yang et al (2016) PeerJ DOI 107717peerj2345 2021

ZhaoM Cong P Barter M Fox AD Cao L 2012 The changing abundanceand distribu-tion of Greater White-fronted Geese Anser albifrons in the Yangtze River floodplainimpacts of recenthydrological changes Bird Conservation International 22135ndash143DOI 101017S0959270911000542

Yang et al (2016) PeerJ DOI 107717peerj2345 2121

Page 12: Selection of a marker gene to construct and the application ...We concluded that DNA metabarcoding provides new perspectives for studies of herbivorous waterbird diets and inter-specific

Table 6 List of the lowest taxonomic food items in the diet of geese

Food items Level of identification GWFG BG

N reads Fs () Fm () N reads Fs () Fm ()

Poaceae spp (except Poa annua) Family 51705 4798 4568 0 000 000Poa annua Species 23554 2186 000 167 020 000Carex heterolepis Species 18867 1751 1639 81457 9949 6285Carex spp (except Carex heterolepis) Genus 9706 901 231 191 023 349Alopecurus aequalis Species 3458 321 000 0 000 000Potentilla chinensis Species 184 017 118 65 008 206Cynodon dactylon Species 155 014 000 0 000 000Polygonum spp Genus 56 005 000 0 000 000Stellaria media Species 26 002 000 0 000 000Ranunculus chinensis Species 14 002 000 0 000 000Lapsana apogonoides Species 11 002 000 0 000 000Asteraceae sp Genus 16 001 233 0 000 1455Alopecurus Genus 0 000 3093 0 000 1318Carex thunbergii Species 0 000 054 0 000 279Fabaceae sp Genus 0 000 064 0 000 108

NotesGWFG Greater white-fronted goose BG Bean goose Fs percentage of sequences in DNA metabarcoding Fm percentage of epidermis squares in microhistological analysis

available primer sets with only 43 of successful amplification in this study In spite ofthe higher species discrimination success of trnH-psbA than rbcL+matK in some groupsthe presence of duplicated loci microinversions and premature termination of readsby mononucleotide repeats lead to considerable proportion of low-quality sequences andover-estimation of genetic difference when using trnH-psbA (Graham et al 2000WhitlockHale amp Groff 2010) In contrast the barcode region of rbcL is easy to amplify sequenceand align in most plants and was recommended as the standard barcode for land plants(Chase et al 2007) The relatively modest discriminating power (compared to trnL) pre-cludes its application for our study aiming to recover high resolution of food items Conse-quently trnL was selected out of eight candidate markers with 100 amplification successmore than 90 of high quality sequences and relatively large inter-specific divergence

One of the biggest obstacles in biodiversity assessment and dietary analysis is the lack ofa comprehensive reference library without which it is impossible to accurately interpretand assign sequences generated from high-throughput sequencing (Valentini Pompanonamp Taberlet 2009 Barco et al 2015) In this study we constructed a local reference libraryby amplifying the most common species (70 morpho-species in total) during the winteringperiod with the trnL gene Although not all of them could be identified at species-levelwith trnL due to relatively low inter-specific divergence many species could be separatedwith distinctive sequences Previous studies have recommended group-specific barcodes todifferentiate closely related plants at the species level (Li et al 2015) For instance matKhas been proved to be more efficient for the discrimination of Carex spp (Starr Naczi ampChouinard 2009) However the primer set of matK failed to amplify species of Carex spp

Yang et al (2016) PeerJ DOI 107717peerj2345 1221

in our study suggesting the universality of selected primer pairs should be tested in eachstudy (Zhan et al 2014)

Applications of metabarcoding for geese diet analysisA variety of recent studies have demonstrated the great potential of metabarcoding fordietary analysis mainly owing to the high throughput high discriminating power and theability to process large-scale samples simultaneously (Creer et al 2010 Taberlet et al 2012Shehzad et al 2012) In this study we applied this method to recover diets of herbivorousgeese and provided standard protocols for dietary analysis of these two ecologicallyimportant waterbirds Our results further proved the more objective less experience-dependent and more time-efficient character of DNA metabarcoding However not allthe species in the reference library could be identified at species-level owing to low inter-specific divergence We suggest that multiple group-specific markers to be incorporatedin the future as in De Barba et al (2014) Two species Carex thunbergii and Fabaceae spwere only discovered via microhistologic analysis rather than metabarcoding This failuremight reflect the biased fragment amplification of current technology of which dominanttemplates could act as inhibitors of less dominant species (Pintildeol et al 2015) Howeverthree species of Poaceae were only discovered using metabarcoding In total more taxaand higher resolution were attained using metabarcoding But microhistology still proveda powerful supplementary Previous studies using metabarcoding usually detected dozensof food items even as many as more than one hundred species For instance 18 taxaprey were identified for leopard cat (Prionailurus bengalensis) (Shehzad et al 2012) 44plant taxa were recovered in feces of red-headed wood pigeon (Columba janthina nitens)(Ando et al 2013) while more than 100 taxa were found in diet studies of brown bear(Ursus arctos) (De Barba et al 2014) The relatively narrow diet spectrum of herbivorousgeese may lead to misunderstanding that this result of our study is merely an artefact dueto small sampling effort However this result is credible since these two geese species onlyfeed on Carex meadow where the dominant vegetation is Carex spp with other speciessuch as Poaceae and dicots (Zhao et al 2015) Even though other wetland plants exist theyusually composed only a small proportion of the geese diets

Quantification of food composition is another key concern in dietary analysis Althoughthe relative percentage of sequences was not truly a quantitative estimate of diet taxa ofthe majority sequences in this study were in accord with microhistologic observationswhich was considered an efficient way to provide quantitative results (Wang et al 2013)Discrepancies might come from the semi-quantitative nature of metabarcoding methods(Sun et al 2015) This is likely derived from PCR amplification which always entails biasescaused by universal primer-template mismatches annealing temperature or number ofPCR cycles (Zhan et al 2014Pintildeol et al 2015) Othermethods such as shotgun sequencingor metagenomic sequencing could be incorporated in the future to give information onabundances of food items (Srivathsan et al 2015)

Implications for waterbird conservation and wetland managementFor long-distance migratory waterbirds such as the wild geese in this study theirabundance and distribution are greatly influenced by diet availability and habitat use

Yang et al (2016) PeerJ DOI 107717peerj2345 1321

(Wang et al 2013) For example waterbirds may be restricted at (forced to leave) certainareas due to favoring (loss) of particular food (Wang et al 2013) while the recoveryof such food may contribute to return of bird populations (Noordhuis Van der Molenamp Van den Berg 2002) Results of both metabarcoding and microhistologic analysis inthis study revealed that Carex and Poaceae were dominant food components which is inaccordance with previous studies The increasing number of these two geese wintering atthe Shengjin Lake may be attributed to the expansion of Carex meadow which offers accessto abundant food resources (Zhao et al 2015) Considering the long-distance migratorycharacter of these birds it is important to maintain energy balances and good bodyconditions in wintering areas because this might further influence their departure datesand reproductive success after arriving at breeding areas (Prop Black amp Shimmings 2003)Based on this it is important for wetland managers to maintain the suitable habitats andfood resources for sustainable conservation of waterbirds which highlights the significanceof diet information Our study also indicated that overlap and dissimilarity existed betweenthe diets of these two geese Animals foraging in the same habitats may compete for limitedfood resources (Madsen amp Mortensen 1987) This discrepancy of food composition mayarise from the avoidance of inter-specific competition (Zhao et al 2015) However withthe increase of these two species in Shengjin Lake further research is needed to investigatethe mechanisms of food resource partitioning and spatial distribution

Shengjin Lake is one of themost important wintering sites for tens of thousands ofmigra-tory watebirds while annual life cycles of these birds depend on the whole migratory routeincluding breeding sites stop-over sites and wintering sites (Kear 2005) Thus a molecularreference library covering all the potential food items along the whole migratory routewill be useful both for understanding of wetland connections and waterbird conservationIn addition the ability of DNA metabarcoding to process lots of samples simultaneouslyenables rapid analyses and makes this method helpful for waterbird studies

ACKNOWLEDGEMENTSWe are very grateful to the stuff of the Shengjin Lake National Nature Reserve for theirexcellent assistance during the field work Great thanks to Zhujun Wang and An An forfeces collection in the field We thank Song Yang for collecting plants in the Shengjin LakeReserve We also thank Profs Zhenyu Li and Shuren Zhang for plant identification Specialthanks to Drs Meijuan Zhao Xin Wang Fanjuan Meng and Peihao Cong for preparingthe epidermis database and guiding microhistologic analysis

ADDITIONAL INFORMATION AND DECLARATIONS

FundingThis work was supported by the National Basic Research Program of China (973 ProgramGrant No 2012CB956104) to EZ the National Natural Science Foundation of China(Grant No 31370416) State Key Laboratory of Urban and Regional Ecology ChineseAcademy of Sciences (No SKLURE2013-1-05) to LC the National Natural Science

Yang et al (2016) PeerJ DOI 107717peerj2345 1421

Foundation of China (Grant No 31500315) to XW and 100-Talent Program of theChinese Academy of Sciences to AZ The funders had no role in study design datacollection and analysis decision to publish or preparation of the manuscript

Grant DisclosuresThe following grant information was disclosed by the authorsNational Basic Research Program of China 2012CB956104National Natural Science Foundation of China 31370416State Key Laboratory of Urban and Regional EcologyChinese Academy of Sciences SKLURE2013-1-05National Natural Science Foundation of China 31500315

Competing InterestsThe authors declare there are no competing interests

Author Contributionsbull Yuzhan Yang conceived and designed the experiments performed the experimentsanalyzed the data contributed reagentsmaterialsanalysis tools wrote the paperprepared figures andor tables reviewed drafts of the paperbull Aibin Zhan conceived and designed the experiments analyzed the data contributedreagentsmaterialsanalysis tools reviewed drafts of the paperbull Lei Cao conceived and designed the experiments contributed reagentsmaterialsanalysistools reviewed drafts of the paperbull Fanjuan Meng and Wenbin Xu performed the experiments contributed reagentsmate-rialsanalysis tools reviewed drafts of the paper

Animal EthicsThe following information was supplied relating to ethical approvals (ie approving bodyand any reference numbers)

Our research work did not involve capture or any direct manipulation or disturbancesof animals

Field Study PermissionsThe following information was supplied relating to field study approvals (ie approvingbody and any reference numbers)

We collected samples of plants and feces for molecular analyses We received permissionto access the reserve for sampling from the Shengjin Lake National Nature ReserveAdministration (Chizhou Anhui China) which is responsible for the management of theprotected area and wildlife We were forbidden to capture or disturb geese in the field

DNA DepositionThe following information was supplied regarding the deposition of DNA sequences

NCBI Sequence Read Archive (SRA) databaseAccession number SRP070470

Yang et al (2016) PeerJ DOI 107717peerj2345 1521

Data AvailabilityThe following information was supplied regarding data availability

The raw data has been supplied as a Supplemental Dataset

Supplemental InformationSupplemental information for this article can be found online at httpdxdoiorg107717peerj2345supplemental-information

REFERENCESAltschul SF GishWMillerWMyers EW Lipman DJ 1990 Basic local alignment

search tool Journal of Molecular Biology 215403ndash410 DOI 101006jmbi19909999Aacutelvarez I Wendel JF 2003 Ribosomal ITS sequences and plant phylogenetic inference

Molecular Phylogenetics and Evolution 29417ndash434DOI 101016S1055-7903(03)00208-2

Ando H Setsuko S Horikoshi K Suzuki H Umehara S Inoue-MurayamaM IsagiY 2013 Diet analysis by nextmdashgeneration sequencing indicates the frequentconsumption of introduced plants by the critically endangered redmdashheaded woodpigeon (Columba janthina nitens) in oceanic island habitats Ecology and Evolution34057ndash4069 DOI 101002ece3773

Barco A RaupachMJ Laakmann S NeumannH Knebelsberger T 2015 Identificationof North Sea molluscs with DNA barcodingMolecular Ecology Resources 16288ndash297DOI 1011111755-099812440

Bik HM Porazinska DL Creer S Caporaso JG Knight R ThomasWK 2012 Sequenc-ing our way towards understanding global eukaryotic biodiversity Trends in Ecologyand Evolution 27233ndash243 DOI 101016jtree201111010

Bohmann K Evans A Gilbert MTP Carvalho GR Creer S KnappM YuWD DeBruynM 2014 Environmental DNA for wildlife biology and biodiversity monitor-ing Trends in Ecology and Evolution 29358ndash367 DOI 101016jtree201404003

Brander LM Florax RJ Vermaat JE 2006 The empirics of wetland valuation acomprehensive summary and a meta-analysis of the literature Environmental andResource Economics 33223ndash250 DOI 101007s10640-005-3104-4

Chase MW Cowan RS Hollingsworth PM Van den Berg C Madrintildeaacuten S Petersen GSeberg O Jorgsensen T Cameron KM Carine M Pedersen N Hedderson TAJConrad F Salazar GA Richardson JE HollingsworthM Barraclough TG KellyL WilkinsonM 2007 A proposal for a standardised protocol to barcode all landplants Taxon 56295ndash299

Cota-Sanchez JH Remarchuk K Ubayasena K 2006 Ready-to-use DNA extracted witha CTAB method adapted for herbarium specimens and mucilaginous plant tissuePlant Molecular Biology Reporter 24161ndash167 DOI 101007BF02914055

Creer S Fonseca VG Porazinska DL Giblin-Davis RM SungW Power DM Packer MCarvalho GR Blaxter ML Lambshead P ThomasWK 2010 Ultrasequencing ofthe meiofaunal biosphere Practice pitfalls and promisesMolecular Ecology 194ndash20DOI 101111j1365-294X200904473x

Yang et al (2016) PeerJ DOI 107717peerj2345 1621

De BarbaMMiquel C Boyer F Mercier C Rioux D Coissac E Taberlet P 2014 DNAmetabarcoding multiplexing and validation of data accuracy for diet assessmentapplication to omnivorous dietMolecular Ecology Resources 14306ndash323DOI 1011111755-099812188

Deagle BE Kirkwood R Jarman SN 2009 Analysis of Australian fur seal diet bypyrosequencing prey DNA in faecesMolecular Ecology 182022ndash2038DOI 101111j1365-294X200904158x

Dunning LT Savolainen V 2010 Broad-scale amplification of matK for DNA barcodingplants a technical note Botanical Journal of the Linnean Society 1641ndash9DOI 101111j1095-8339201001071x

Edgar RC 2010 Search and clustering orders of magnitude faster than BLAST Bioinfor-matics 262460ndash2461 DOI 101093bioinformaticsbtq461

Edgar RC 2013 UPARSE highly accurate OTU sequences from microbial ampliconreads Nature Methods 10996ndash998 DOI 101038NMETH2604

Elliott TL Jonathan Davies T 2014 Challenges to barcoding an entire floraMolecularEcology Resources 14883ndash891 DOI 1011111755-099812277

Fazekas AJ Burgess KS Kesanakurti PR Graham SW Newmaster SG Husband BCPercy DM Hajibabaei M Barrett SC 2008Multiple multilocus DNA barcodesfrom the plastid genome discriminate plant species equally well PLoS ONE 3e2802DOI 101371journalpone0002802

Fonseca VG Carvalho GR SungW Johnson HF Power DM Neill SP Packer MBlaxter ML Labmshead PJD ThomasWK Creer S 2010 Second-generationenvironmental sequencing unmasks marine metazoan biodiversity Nature Commu-nications 1Article 98 DOI 101038ncomms1095

Ford CS Ayres KL Haider N Toomey N Van-Alpen-Stohl J 2009 Selection ofcandidate DNA barcoding regions for use on land plants Botanical Journal of theLinnean Society 1591ndash11 DOI 101111j1095-8339200800938x

Fox AD Bergersen E Tombre IM Madsen J 2007Minimal intra-seasonal dietaryoverlap of barnacle and pink-footed geese on their breeding grounds in SvalbardPolar Biology 30759ndash768 DOI 101007s00300-006-0235-1

Fox AD Cao L Zhang Y Barter M ZhaoMMeng FWang S 2011 Declines in thetuber-feeding waterbird guild at Shengjin Lake National Nature Reserve Chinandashabarometer of submerged macrophyte collapse Aquatic Conservation Marine andFreshwater Ecosystems 2182ndash91 DOI 101002aqc1154

Graham SW Reeves PA Burns AC Olmstead RG 2000Microstructural changes innoncoding chloroplast DNA interpretation evolution and utility of indels andinversions in basal angiosperm phylogenetic inference International Journal of PlantSciences 161S83ndashS96 DOI 101086317583

HasebeM Omori T NakazawaM Sano T KatoM Iwatsuki K 1994 rbcL genesequences provide evidence for the evolutionary lineages of leptosporangiate fernsProceedings of the National Academy of Sciences of the United States of America915730ndash5734 DOI 101073pnas91125730

Yang et al (2016) PeerJ DOI 107717peerj2345 1721

Hibert F Taberlet P Chave J Scotti-Saintagne C Sabatier D Richard-Hansen C 2013Unveiling the diet of elusive rainforest herbivores in next generation sequencing eraThe tapir as a case study PLoS ONE 8e60799 DOI 101371journalpone0060799

Hilu KW Liang H 1997 ThematK gene sequence variation and application in plantsystematics American Journal of Botany 84830ndash839 DOI 1023072445819

Hollingsworth PM Graham SW Little DP 2011 Choosing and using a plant DNAbarcode PLoS ONE 6e19254 DOI 101371journalpone0019254

James HF Burney DA 1997 The diet and ecology of Hawaiirsquos extinct flightless water-fowl evidence from coprolites Biological Journal of the Linnean Society 62279ndash297DOI 101111j1095-83121997tb01627x

Kear J 2005Ducks geese and swans Oxford Oxford University PressKressWJ Erickson DL Jones FA Swenson NG Perez R Sanjur O Bermingham E

2009 Plant DNA barcodes and a community phylogeny of a tropical forest dynamicsplot in Panama Proceedings of the National Academy of Sciences of the United States ofAmerica 10618621ndash18626 DOI 101073pnas0909820106

Li X Yang Y Henry RJ Rossetto MWang Y Chen S 2015 Plant DNA barcoding fromgene to genome Biological Reviews 90157ndash166 DOI 101111brv12104

Ma Z Cai Y Li B Chen J 2010Managing wetland habitats for waterbirds an interna-tional perspectiveWetlands 3015ndash27 DOI 101007s13157-009-0001-6

Madsen J Mortensen CE 1987Habitat exploitation and interspecific competition ofmoulting geese in East Greenl Ibis 12925ndash44DOI 101111j1474-919X1987tb03157x

Noordhuis R Van der Molen DT Van den BergMS 2002 Response of herbivorouswater-birds to the return of Chara in Lake Veluwemeer The Netherlands AquaticBotany 72349ndash367 DOI 101016S0304-3770(01)00210-8

Parameswaran P Jalili R Tao L Shokralla S Gharizadeh B Ronaghi M Fire AZ 2007A pyrosequencing-tailored nucleotide barcode design unveils opportunities for large-scale sample multiplexing Nucleic Acids Research 35e130 DOI 101093nargkm760

Pintildeol J Mir G Gomez-Polo P Agustiacute N 2015 Universal and blocking primermismatches limit the use of high-throughput DNA sequencing for the quanti-tative metabarcoding of arthropodsMolecular Ecology Resources 15819ndash830DOI 1011111755-099812355

Pompanon F Deagle BE SymondsonWO Brown DS Jarman SN Taberlet P 2012Who is eating what diet assessment using next generation sequencingMolecularEcology 211931ndash1950 DOI 101111j1365-294X201105403x

Prop J Black JM Shimmings P 2003 Travel schedules to the high arctic barnaclegeese trademdashoff the timing of migration with accumulation of fat deposits Oikos103403ndash414 DOI 101034j1600-0706200312042x

Rayeacute G Miquel C Coissac E Redjadj C Loison A Taberlet P 2011 New insightson diet variability revealed by DNA barcoding and high-throughput sequenc-ing chamois diet in autumn as a case study Ecological Research 26265ndash276DOI 101007s11284-010-0780-5

Yang et al (2016) PeerJ DOI 107717peerj2345 1821

Reynolds C Miranda NA Cumming GS 2015 The role of waterbirds in the dispersalof aquatic alien and invasive species Diversity and Distributions 21744ndash754DOI 101111ddi12334

Samelius G Alisauskas RT 1999 Diet and growth of glaucous gulls at a large Arcticgoose colony Canadian Journal of Zoology 771327ndash1331 DOI 101139z99-091

Sang T Crawford DJ Stuessy TF 1997 Chloroplast DNA phylogeny reticulateevolution and biogeography of Paeonia (Paeoniaceae) American Journal of Botany841120ndash1136

ShehzadW Riaz T NawazMAMiquel C Poillot C Shah SA Pompanon F CoissacE Taberlet P 2012 Carnivore diet analysis based on next-generation sequencingapplication to the leopard cat (Prionailurus bengalensis) in PakistanMolecularEcology 211951ndash1965 DOI 101111j1365-294X201105424x

Shokralla S Gibson JF Nikbakht H Janzen DH HallwachsW Hajibabaei M 2014Next-generation DNA barcoding using next-generation sequencing to enhance andaccelerate DNA barcode capture from single specimensMolecular Ecology Resources14892ndash901 DOI 1011111755-099812236

Shokralla S Spall JL Gibson JF 2012 Next-generation sequencing technologies forenvironmental DNA researchMolecular Ecology 211794ndash1805DOI 101111j1365-294X201205538x

Srivathsan A Sha J Vogler AP Meier R 2015 Comparing the effectiveness of metage-nomics and metabarcoding for diet analysis of a leaf-feeding monkey (Pygathrixnemaeus)Molecular Ecology Resources 15250ndash261 DOI 1011111755-099812302

Starr JR Naczi RFC Chouinard BN 2009 Plant DNA barcodes and species res-olution in sedge (Carex Cyperaceae)Molecular Ecology Resources 9151ndash163DOI 101111j1755-0998200902640x

Sun C Zhao Y Li H Dong Y MacIsaac HJ Zhan A 2015 Unreliable quantificationof species abundance based on high-throughput sequencing data of zooplanktoncommunities Aquatic Biology 249ndash15 DOI 103354ab00629

Swennen C Yu YT 2005 Food and feeding behavior of the black-faced SpoonbillWaterbirds 2819ndash27 DOI 1016751524-4695(2005)028[0019FAFBOT]20CO2

SymondsonWOC 2002Molecular identification of prey in predator dietsMolecularEcology 11627ndash641 DOI 101046j1365-294X200201471x

Taberlet P Coissac E Pompanon F Brochmann CWillerslev E 2012 Towards next-generation biodiversity assessment using DNA metabarcodingMolecular Ecology212045ndash2050 DOI 101111j1365-294X201205470x

Taberlet P Coissac E Pompanon F Gielly L Miquel C Valentini A Vermat TCorthier G Brocmann CWillerslev E 2007 Power and limitations of the chloro-plast trnL (UAA) intron for plant DNA barcoding Nucleic Acids Research 35e14DOI 101093nargkl938

Taberlet P Gielly L Pautou G Bouvet J 1991 Universal primers for amplifica-tion of three non-coding regions of chloroplast DNA Plant Molecular Biology171105ndash1109 DOI 101007BF00037152

Yang et al (2016) PeerJ DOI 107717peerj2345 1921

Tamura K Stecher G Peterson D Filipski A Kumar S 2013MEGA6 molecular evolu-tionary genetics analysis version 60MolecularBiology and Evolution 302725ndash2729DOI 101093molbevmst197

Tate JA Simpson BB 2003 Paraphyly of Tarasa (Malvaceae) and diverse origins of thepolyploidy species Systematic Botany 28723ndash737 DOI 10104302-641

Valentini A Pompanon F Taberlet P 2009 DNA barcoding for ecologists Trends inEcology and Evolution 24110ndash117 DOI 101016jtree200809011

Wang X Fox AD Cong P Barter M Cao L 2012 Changes in the distribution and abun-dance of wintering Lesser White-fronted Geese Anser erythropus in eastern ChinaBird Conservation International 22128ndash134 DOI 101017S095927091100030X

Wang X Fox AD Cong P Cao L 2013 Food constraints explain the restricted dis-tribution of wintering Lesser white-fronted Geese Anser erythropus in China Ibis155576ndash592 DOI 101111ibi12039

Whitlock BA Hale AM Groff PA 2010 Intraspecific inversions pose a challenge for thetrnH-psbA plant DNA barcode PLoS ONE 5e11533DOI 101371journalpone0011533

Wolfe KH LiWH Sharp PM 1987 Rates of nucleotide substitution vary greatlyamong plant mitochondrial chloroplast and nuclear DNAs Proceedings ofthe National Academy of Sciences of the United States of America 849054ndash9058DOI 101073pnas84249054

Xu C DongW Shi S Cheng T Li C Liu YWu PWuH Gao P Zhou S 2015 Acceler-ating plant DNA barcode reference library construction using herbarium specimensimproved experimental techniquesMolecular Ecology Resources 151366ndash1374DOI 1011111755-099812413

Xu L XuW Sun Q Zhou Z Shen J Zhao X 2008 Flora and vegetation in ShengjinLake Journal of Wuhan Botanical Research 27264ndash270

Zhan A Bailey SA Heath DDMacIsaac HJ 2014 Performance comparison of geneticmarkers for high-throughput sequencing-based biodiversity assessment in complexcommunitiesMolecular Ecology Resources 141049ndash1059DOI 1011111755-099812254

Zhan A MacIsaac HJ 2015 Rare biosphere exploration using high-throughput sequenc-ing research progress and perspectives Conservation Genetics 16513ndash522DOI 101007s10592-014-0678-9

Zhang Y Cao L Barter M Fox AD ZhaoMMeng F Shi H 2011 Changing distributionand abundance of Swan Goose Anser cygnoides in the Yangtze River floodplainthe likely loss of a very important wintering site Bird Conservation International2136ndash48 DOI 101017S0959270910000201

ZhaoM Cao L Fox AD 2013 Distribution and diet of wintering Tundra Bean GeeseAnser fabalis serrirostris at Shengjin LakeYangtze River floodplain ChinaWildfow6052ndash63

ZhaoM Cao L KlaassenM Zhang Y Fox AD 2015 Avoiding competition Site usediet and foraging behaviours in two similarly sized geese wintering in China Ardea10327ndash38 DOI 105253ardev103i1a3

Yang et al (2016) PeerJ DOI 107717peerj2345 2021

ZhaoM Cong P Barter M Fox AD Cao L 2012 The changing abundanceand distribu-tion of Greater White-fronted Geese Anser albifrons in the Yangtze River floodplainimpacts of recenthydrological changes Bird Conservation International 22135ndash143DOI 101017S0959270911000542

Yang et al (2016) PeerJ DOI 107717peerj2345 2121

Page 13: Selection of a marker gene to construct and the application ...We concluded that DNA metabarcoding provides new perspectives for studies of herbivorous waterbird diets and inter-specific

in our study suggesting the universality of selected primer pairs should be tested in eachstudy (Zhan et al 2014)

Applications of metabarcoding for geese diet analysisA variety of recent studies have demonstrated the great potential of metabarcoding fordietary analysis mainly owing to the high throughput high discriminating power and theability to process large-scale samples simultaneously (Creer et al 2010 Taberlet et al 2012Shehzad et al 2012) In this study we applied this method to recover diets of herbivorousgeese and provided standard protocols for dietary analysis of these two ecologicallyimportant waterbirds Our results further proved the more objective less experience-dependent and more time-efficient character of DNA metabarcoding However not allthe species in the reference library could be identified at species-level owing to low inter-specific divergence We suggest that multiple group-specific markers to be incorporatedin the future as in De Barba et al (2014) Two species Carex thunbergii and Fabaceae spwere only discovered via microhistologic analysis rather than metabarcoding This failuremight reflect the biased fragment amplification of current technology of which dominanttemplates could act as inhibitors of less dominant species (Pintildeol et al 2015) Howeverthree species of Poaceae were only discovered using metabarcoding In total more taxaand higher resolution were attained using metabarcoding But microhistology still proveda powerful supplementary Previous studies using metabarcoding usually detected dozensof food items even as many as more than one hundred species For instance 18 taxaprey were identified for leopard cat (Prionailurus bengalensis) (Shehzad et al 2012) 44plant taxa were recovered in feces of red-headed wood pigeon (Columba janthina nitens)(Ando et al 2013) while more than 100 taxa were found in diet studies of brown bear(Ursus arctos) (De Barba et al 2014) The relatively narrow diet spectrum of herbivorousgeese may lead to misunderstanding that this result of our study is merely an artefact dueto small sampling effort However this result is credible since these two geese species onlyfeed on Carex meadow where the dominant vegetation is Carex spp with other speciessuch as Poaceae and dicots (Zhao et al 2015) Even though other wetland plants exist theyusually composed only a small proportion of the geese diets

Quantification of food composition is another key concern in dietary analysis Althoughthe relative percentage of sequences was not truly a quantitative estimate of diet taxa ofthe majority sequences in this study were in accord with microhistologic observationswhich was considered an efficient way to provide quantitative results (Wang et al 2013)Discrepancies might come from the semi-quantitative nature of metabarcoding methods(Sun et al 2015) This is likely derived from PCR amplification which always entails biasescaused by universal primer-template mismatches annealing temperature or number ofPCR cycles (Zhan et al 2014Pintildeol et al 2015) Othermethods such as shotgun sequencingor metagenomic sequencing could be incorporated in the future to give information onabundances of food items (Srivathsan et al 2015)

Implications for waterbird conservation and wetland managementFor long-distance migratory waterbirds such as the wild geese in this study theirabundance and distribution are greatly influenced by diet availability and habitat use

Yang et al (2016) PeerJ DOI 107717peerj2345 1321

(Wang et al 2013) For example waterbirds may be restricted at (forced to leave) certainareas due to favoring (loss) of particular food (Wang et al 2013) while the recoveryof such food may contribute to return of bird populations (Noordhuis Van der Molenamp Van den Berg 2002) Results of both metabarcoding and microhistologic analysis inthis study revealed that Carex and Poaceae were dominant food components which is inaccordance with previous studies The increasing number of these two geese wintering atthe Shengjin Lake may be attributed to the expansion of Carex meadow which offers accessto abundant food resources (Zhao et al 2015) Considering the long-distance migratorycharacter of these birds it is important to maintain energy balances and good bodyconditions in wintering areas because this might further influence their departure datesand reproductive success after arriving at breeding areas (Prop Black amp Shimmings 2003)Based on this it is important for wetland managers to maintain the suitable habitats andfood resources for sustainable conservation of waterbirds which highlights the significanceof diet information Our study also indicated that overlap and dissimilarity existed betweenthe diets of these two geese Animals foraging in the same habitats may compete for limitedfood resources (Madsen amp Mortensen 1987) This discrepancy of food composition mayarise from the avoidance of inter-specific competition (Zhao et al 2015) However withthe increase of these two species in Shengjin Lake further research is needed to investigatethe mechanisms of food resource partitioning and spatial distribution

Shengjin Lake is one of themost important wintering sites for tens of thousands ofmigra-tory watebirds while annual life cycles of these birds depend on the whole migratory routeincluding breeding sites stop-over sites and wintering sites (Kear 2005) Thus a molecularreference library covering all the potential food items along the whole migratory routewill be useful both for understanding of wetland connections and waterbird conservationIn addition the ability of DNA metabarcoding to process lots of samples simultaneouslyenables rapid analyses and makes this method helpful for waterbird studies

ACKNOWLEDGEMENTSWe are very grateful to the stuff of the Shengjin Lake National Nature Reserve for theirexcellent assistance during the field work Great thanks to Zhujun Wang and An An forfeces collection in the field We thank Song Yang for collecting plants in the Shengjin LakeReserve We also thank Profs Zhenyu Li and Shuren Zhang for plant identification Specialthanks to Drs Meijuan Zhao Xin Wang Fanjuan Meng and Peihao Cong for preparingthe epidermis database and guiding microhistologic analysis

ADDITIONAL INFORMATION AND DECLARATIONS

FundingThis work was supported by the National Basic Research Program of China (973 ProgramGrant No 2012CB956104) to EZ the National Natural Science Foundation of China(Grant No 31370416) State Key Laboratory of Urban and Regional Ecology ChineseAcademy of Sciences (No SKLURE2013-1-05) to LC the National Natural Science

Yang et al (2016) PeerJ DOI 107717peerj2345 1421

Foundation of China (Grant No 31500315) to XW and 100-Talent Program of theChinese Academy of Sciences to AZ The funders had no role in study design datacollection and analysis decision to publish or preparation of the manuscript

Grant DisclosuresThe following grant information was disclosed by the authorsNational Basic Research Program of China 2012CB956104National Natural Science Foundation of China 31370416State Key Laboratory of Urban and Regional EcologyChinese Academy of Sciences SKLURE2013-1-05National Natural Science Foundation of China 31500315

Competing InterestsThe authors declare there are no competing interests

Author Contributionsbull Yuzhan Yang conceived and designed the experiments performed the experimentsanalyzed the data contributed reagentsmaterialsanalysis tools wrote the paperprepared figures andor tables reviewed drafts of the paperbull Aibin Zhan conceived and designed the experiments analyzed the data contributedreagentsmaterialsanalysis tools reviewed drafts of the paperbull Lei Cao conceived and designed the experiments contributed reagentsmaterialsanalysistools reviewed drafts of the paperbull Fanjuan Meng and Wenbin Xu performed the experiments contributed reagentsmate-rialsanalysis tools reviewed drafts of the paper

Animal EthicsThe following information was supplied relating to ethical approvals (ie approving bodyand any reference numbers)

Our research work did not involve capture or any direct manipulation or disturbancesof animals

Field Study PermissionsThe following information was supplied relating to field study approvals (ie approvingbody and any reference numbers)

We collected samples of plants and feces for molecular analyses We received permissionto access the reserve for sampling from the Shengjin Lake National Nature ReserveAdministration (Chizhou Anhui China) which is responsible for the management of theprotected area and wildlife We were forbidden to capture or disturb geese in the field

DNA DepositionThe following information was supplied regarding the deposition of DNA sequences

NCBI Sequence Read Archive (SRA) databaseAccession number SRP070470

Yang et al (2016) PeerJ DOI 107717peerj2345 1521

Data AvailabilityThe following information was supplied regarding data availability

The raw data has been supplied as a Supplemental Dataset

Supplemental InformationSupplemental information for this article can be found online at httpdxdoiorg107717peerj2345supplemental-information

REFERENCESAltschul SF GishWMillerWMyers EW Lipman DJ 1990 Basic local alignment

search tool Journal of Molecular Biology 215403ndash410 DOI 101006jmbi19909999Aacutelvarez I Wendel JF 2003 Ribosomal ITS sequences and plant phylogenetic inference

Molecular Phylogenetics and Evolution 29417ndash434DOI 101016S1055-7903(03)00208-2

Ando H Setsuko S Horikoshi K Suzuki H Umehara S Inoue-MurayamaM IsagiY 2013 Diet analysis by nextmdashgeneration sequencing indicates the frequentconsumption of introduced plants by the critically endangered redmdashheaded woodpigeon (Columba janthina nitens) in oceanic island habitats Ecology and Evolution34057ndash4069 DOI 101002ece3773

Barco A RaupachMJ Laakmann S NeumannH Knebelsberger T 2015 Identificationof North Sea molluscs with DNA barcodingMolecular Ecology Resources 16288ndash297DOI 1011111755-099812440

Bik HM Porazinska DL Creer S Caporaso JG Knight R ThomasWK 2012 Sequenc-ing our way towards understanding global eukaryotic biodiversity Trends in Ecologyand Evolution 27233ndash243 DOI 101016jtree201111010

Bohmann K Evans A Gilbert MTP Carvalho GR Creer S KnappM YuWD DeBruynM 2014 Environmental DNA for wildlife biology and biodiversity monitor-ing Trends in Ecology and Evolution 29358ndash367 DOI 101016jtree201404003

Brander LM Florax RJ Vermaat JE 2006 The empirics of wetland valuation acomprehensive summary and a meta-analysis of the literature Environmental andResource Economics 33223ndash250 DOI 101007s10640-005-3104-4

Chase MW Cowan RS Hollingsworth PM Van den Berg C Madrintildeaacuten S Petersen GSeberg O Jorgsensen T Cameron KM Carine M Pedersen N Hedderson TAJConrad F Salazar GA Richardson JE HollingsworthM Barraclough TG KellyL WilkinsonM 2007 A proposal for a standardised protocol to barcode all landplants Taxon 56295ndash299

Cota-Sanchez JH Remarchuk K Ubayasena K 2006 Ready-to-use DNA extracted witha CTAB method adapted for herbarium specimens and mucilaginous plant tissuePlant Molecular Biology Reporter 24161ndash167 DOI 101007BF02914055

Creer S Fonseca VG Porazinska DL Giblin-Davis RM SungW Power DM Packer MCarvalho GR Blaxter ML Lambshead P ThomasWK 2010 Ultrasequencing ofthe meiofaunal biosphere Practice pitfalls and promisesMolecular Ecology 194ndash20DOI 101111j1365-294X200904473x

Yang et al (2016) PeerJ DOI 107717peerj2345 1621

De BarbaMMiquel C Boyer F Mercier C Rioux D Coissac E Taberlet P 2014 DNAmetabarcoding multiplexing and validation of data accuracy for diet assessmentapplication to omnivorous dietMolecular Ecology Resources 14306ndash323DOI 1011111755-099812188

Deagle BE Kirkwood R Jarman SN 2009 Analysis of Australian fur seal diet bypyrosequencing prey DNA in faecesMolecular Ecology 182022ndash2038DOI 101111j1365-294X200904158x

Dunning LT Savolainen V 2010 Broad-scale amplification of matK for DNA barcodingplants a technical note Botanical Journal of the Linnean Society 1641ndash9DOI 101111j1095-8339201001071x

Edgar RC 2010 Search and clustering orders of magnitude faster than BLAST Bioinfor-matics 262460ndash2461 DOI 101093bioinformaticsbtq461

Edgar RC 2013 UPARSE highly accurate OTU sequences from microbial ampliconreads Nature Methods 10996ndash998 DOI 101038NMETH2604

Elliott TL Jonathan Davies T 2014 Challenges to barcoding an entire floraMolecularEcology Resources 14883ndash891 DOI 1011111755-099812277

Fazekas AJ Burgess KS Kesanakurti PR Graham SW Newmaster SG Husband BCPercy DM Hajibabaei M Barrett SC 2008Multiple multilocus DNA barcodesfrom the plastid genome discriminate plant species equally well PLoS ONE 3e2802DOI 101371journalpone0002802

Fonseca VG Carvalho GR SungW Johnson HF Power DM Neill SP Packer MBlaxter ML Labmshead PJD ThomasWK Creer S 2010 Second-generationenvironmental sequencing unmasks marine metazoan biodiversity Nature Commu-nications 1Article 98 DOI 101038ncomms1095

Ford CS Ayres KL Haider N Toomey N Van-Alpen-Stohl J 2009 Selection ofcandidate DNA barcoding regions for use on land plants Botanical Journal of theLinnean Society 1591ndash11 DOI 101111j1095-8339200800938x

Fox AD Bergersen E Tombre IM Madsen J 2007Minimal intra-seasonal dietaryoverlap of barnacle and pink-footed geese on their breeding grounds in SvalbardPolar Biology 30759ndash768 DOI 101007s00300-006-0235-1

Fox AD Cao L Zhang Y Barter M ZhaoMMeng FWang S 2011 Declines in thetuber-feeding waterbird guild at Shengjin Lake National Nature Reserve Chinandashabarometer of submerged macrophyte collapse Aquatic Conservation Marine andFreshwater Ecosystems 2182ndash91 DOI 101002aqc1154

Graham SW Reeves PA Burns AC Olmstead RG 2000Microstructural changes innoncoding chloroplast DNA interpretation evolution and utility of indels andinversions in basal angiosperm phylogenetic inference International Journal of PlantSciences 161S83ndashS96 DOI 101086317583

HasebeM Omori T NakazawaM Sano T KatoM Iwatsuki K 1994 rbcL genesequences provide evidence for the evolutionary lineages of leptosporangiate fernsProceedings of the National Academy of Sciences of the United States of America915730ndash5734 DOI 101073pnas91125730

Yang et al (2016) PeerJ DOI 107717peerj2345 1721

Hibert F Taberlet P Chave J Scotti-Saintagne C Sabatier D Richard-Hansen C 2013Unveiling the diet of elusive rainforest herbivores in next generation sequencing eraThe tapir as a case study PLoS ONE 8e60799 DOI 101371journalpone0060799

Hilu KW Liang H 1997 ThematK gene sequence variation and application in plantsystematics American Journal of Botany 84830ndash839 DOI 1023072445819

Hollingsworth PM Graham SW Little DP 2011 Choosing and using a plant DNAbarcode PLoS ONE 6e19254 DOI 101371journalpone0019254

James HF Burney DA 1997 The diet and ecology of Hawaiirsquos extinct flightless water-fowl evidence from coprolites Biological Journal of the Linnean Society 62279ndash297DOI 101111j1095-83121997tb01627x

Kear J 2005Ducks geese and swans Oxford Oxford University PressKressWJ Erickson DL Jones FA Swenson NG Perez R Sanjur O Bermingham E

2009 Plant DNA barcodes and a community phylogeny of a tropical forest dynamicsplot in Panama Proceedings of the National Academy of Sciences of the United States ofAmerica 10618621ndash18626 DOI 101073pnas0909820106

Li X Yang Y Henry RJ Rossetto MWang Y Chen S 2015 Plant DNA barcoding fromgene to genome Biological Reviews 90157ndash166 DOI 101111brv12104

Ma Z Cai Y Li B Chen J 2010Managing wetland habitats for waterbirds an interna-tional perspectiveWetlands 3015ndash27 DOI 101007s13157-009-0001-6

Madsen J Mortensen CE 1987Habitat exploitation and interspecific competition ofmoulting geese in East Greenl Ibis 12925ndash44DOI 101111j1474-919X1987tb03157x

Noordhuis R Van der Molen DT Van den BergMS 2002 Response of herbivorouswater-birds to the return of Chara in Lake Veluwemeer The Netherlands AquaticBotany 72349ndash367 DOI 101016S0304-3770(01)00210-8

Parameswaran P Jalili R Tao L Shokralla S Gharizadeh B Ronaghi M Fire AZ 2007A pyrosequencing-tailored nucleotide barcode design unveils opportunities for large-scale sample multiplexing Nucleic Acids Research 35e130 DOI 101093nargkm760

Pintildeol J Mir G Gomez-Polo P Agustiacute N 2015 Universal and blocking primermismatches limit the use of high-throughput DNA sequencing for the quanti-tative metabarcoding of arthropodsMolecular Ecology Resources 15819ndash830DOI 1011111755-099812355

Pompanon F Deagle BE SymondsonWO Brown DS Jarman SN Taberlet P 2012Who is eating what diet assessment using next generation sequencingMolecularEcology 211931ndash1950 DOI 101111j1365-294X201105403x

Prop J Black JM Shimmings P 2003 Travel schedules to the high arctic barnaclegeese trademdashoff the timing of migration with accumulation of fat deposits Oikos103403ndash414 DOI 101034j1600-0706200312042x

Rayeacute G Miquel C Coissac E Redjadj C Loison A Taberlet P 2011 New insightson diet variability revealed by DNA barcoding and high-throughput sequenc-ing chamois diet in autumn as a case study Ecological Research 26265ndash276DOI 101007s11284-010-0780-5

Yang et al (2016) PeerJ DOI 107717peerj2345 1821

Reynolds C Miranda NA Cumming GS 2015 The role of waterbirds in the dispersalof aquatic alien and invasive species Diversity and Distributions 21744ndash754DOI 101111ddi12334

Samelius G Alisauskas RT 1999 Diet and growth of glaucous gulls at a large Arcticgoose colony Canadian Journal of Zoology 771327ndash1331 DOI 101139z99-091

Sang T Crawford DJ Stuessy TF 1997 Chloroplast DNA phylogeny reticulateevolution and biogeography of Paeonia (Paeoniaceae) American Journal of Botany841120ndash1136

ShehzadW Riaz T NawazMAMiquel C Poillot C Shah SA Pompanon F CoissacE Taberlet P 2012 Carnivore diet analysis based on next-generation sequencingapplication to the leopard cat (Prionailurus bengalensis) in PakistanMolecularEcology 211951ndash1965 DOI 101111j1365-294X201105424x

Shokralla S Gibson JF Nikbakht H Janzen DH HallwachsW Hajibabaei M 2014Next-generation DNA barcoding using next-generation sequencing to enhance andaccelerate DNA barcode capture from single specimensMolecular Ecology Resources14892ndash901 DOI 1011111755-099812236

Shokralla S Spall JL Gibson JF 2012 Next-generation sequencing technologies forenvironmental DNA researchMolecular Ecology 211794ndash1805DOI 101111j1365-294X201205538x

Srivathsan A Sha J Vogler AP Meier R 2015 Comparing the effectiveness of metage-nomics and metabarcoding for diet analysis of a leaf-feeding monkey (Pygathrixnemaeus)Molecular Ecology Resources 15250ndash261 DOI 1011111755-099812302

Starr JR Naczi RFC Chouinard BN 2009 Plant DNA barcodes and species res-olution in sedge (Carex Cyperaceae)Molecular Ecology Resources 9151ndash163DOI 101111j1755-0998200902640x

Sun C Zhao Y Li H Dong Y MacIsaac HJ Zhan A 2015 Unreliable quantificationof species abundance based on high-throughput sequencing data of zooplanktoncommunities Aquatic Biology 249ndash15 DOI 103354ab00629

Swennen C Yu YT 2005 Food and feeding behavior of the black-faced SpoonbillWaterbirds 2819ndash27 DOI 1016751524-4695(2005)028[0019FAFBOT]20CO2

SymondsonWOC 2002Molecular identification of prey in predator dietsMolecularEcology 11627ndash641 DOI 101046j1365-294X200201471x

Taberlet P Coissac E Pompanon F Brochmann CWillerslev E 2012 Towards next-generation biodiversity assessment using DNA metabarcodingMolecular Ecology212045ndash2050 DOI 101111j1365-294X201205470x

Taberlet P Coissac E Pompanon F Gielly L Miquel C Valentini A Vermat TCorthier G Brocmann CWillerslev E 2007 Power and limitations of the chloro-plast trnL (UAA) intron for plant DNA barcoding Nucleic Acids Research 35e14DOI 101093nargkl938

Taberlet P Gielly L Pautou G Bouvet J 1991 Universal primers for amplifica-tion of three non-coding regions of chloroplast DNA Plant Molecular Biology171105ndash1109 DOI 101007BF00037152

Yang et al (2016) PeerJ DOI 107717peerj2345 1921

Tamura K Stecher G Peterson D Filipski A Kumar S 2013MEGA6 molecular evolu-tionary genetics analysis version 60MolecularBiology and Evolution 302725ndash2729DOI 101093molbevmst197

Tate JA Simpson BB 2003 Paraphyly of Tarasa (Malvaceae) and diverse origins of thepolyploidy species Systematic Botany 28723ndash737 DOI 10104302-641

Valentini A Pompanon F Taberlet P 2009 DNA barcoding for ecologists Trends inEcology and Evolution 24110ndash117 DOI 101016jtree200809011

Wang X Fox AD Cong P Barter M Cao L 2012 Changes in the distribution and abun-dance of wintering Lesser White-fronted Geese Anser erythropus in eastern ChinaBird Conservation International 22128ndash134 DOI 101017S095927091100030X

Wang X Fox AD Cong P Cao L 2013 Food constraints explain the restricted dis-tribution of wintering Lesser white-fronted Geese Anser erythropus in China Ibis155576ndash592 DOI 101111ibi12039

Whitlock BA Hale AM Groff PA 2010 Intraspecific inversions pose a challenge for thetrnH-psbA plant DNA barcode PLoS ONE 5e11533DOI 101371journalpone0011533

Wolfe KH LiWH Sharp PM 1987 Rates of nucleotide substitution vary greatlyamong plant mitochondrial chloroplast and nuclear DNAs Proceedings ofthe National Academy of Sciences of the United States of America 849054ndash9058DOI 101073pnas84249054

Xu C DongW Shi S Cheng T Li C Liu YWu PWuH Gao P Zhou S 2015 Acceler-ating plant DNA barcode reference library construction using herbarium specimensimproved experimental techniquesMolecular Ecology Resources 151366ndash1374DOI 1011111755-099812413

Xu L XuW Sun Q Zhou Z Shen J Zhao X 2008 Flora and vegetation in ShengjinLake Journal of Wuhan Botanical Research 27264ndash270

Zhan A Bailey SA Heath DDMacIsaac HJ 2014 Performance comparison of geneticmarkers for high-throughput sequencing-based biodiversity assessment in complexcommunitiesMolecular Ecology Resources 141049ndash1059DOI 1011111755-099812254

Zhan A MacIsaac HJ 2015 Rare biosphere exploration using high-throughput sequenc-ing research progress and perspectives Conservation Genetics 16513ndash522DOI 101007s10592-014-0678-9

Zhang Y Cao L Barter M Fox AD ZhaoMMeng F Shi H 2011 Changing distributionand abundance of Swan Goose Anser cygnoides in the Yangtze River floodplainthe likely loss of a very important wintering site Bird Conservation International2136ndash48 DOI 101017S0959270910000201

ZhaoM Cao L Fox AD 2013 Distribution and diet of wintering Tundra Bean GeeseAnser fabalis serrirostris at Shengjin LakeYangtze River floodplain ChinaWildfow6052ndash63

ZhaoM Cao L KlaassenM Zhang Y Fox AD 2015 Avoiding competition Site usediet and foraging behaviours in two similarly sized geese wintering in China Ardea10327ndash38 DOI 105253ardev103i1a3

Yang et al (2016) PeerJ DOI 107717peerj2345 2021

ZhaoM Cong P Barter M Fox AD Cao L 2012 The changing abundanceand distribu-tion of Greater White-fronted Geese Anser albifrons in the Yangtze River floodplainimpacts of recenthydrological changes Bird Conservation International 22135ndash143DOI 101017S0959270911000542

Yang et al (2016) PeerJ DOI 107717peerj2345 2121

Page 14: Selection of a marker gene to construct and the application ...We concluded that DNA metabarcoding provides new perspectives for studies of herbivorous waterbird diets and inter-specific

(Wang et al 2013) For example waterbirds may be restricted at (forced to leave) certainareas due to favoring (loss) of particular food (Wang et al 2013) while the recoveryof such food may contribute to return of bird populations (Noordhuis Van der Molenamp Van den Berg 2002) Results of both metabarcoding and microhistologic analysis inthis study revealed that Carex and Poaceae were dominant food components which is inaccordance with previous studies The increasing number of these two geese wintering atthe Shengjin Lake may be attributed to the expansion of Carex meadow which offers accessto abundant food resources (Zhao et al 2015) Considering the long-distance migratorycharacter of these birds it is important to maintain energy balances and good bodyconditions in wintering areas because this might further influence their departure datesand reproductive success after arriving at breeding areas (Prop Black amp Shimmings 2003)Based on this it is important for wetland managers to maintain the suitable habitats andfood resources for sustainable conservation of waterbirds which highlights the significanceof diet information Our study also indicated that overlap and dissimilarity existed betweenthe diets of these two geese Animals foraging in the same habitats may compete for limitedfood resources (Madsen amp Mortensen 1987) This discrepancy of food composition mayarise from the avoidance of inter-specific competition (Zhao et al 2015) However withthe increase of these two species in Shengjin Lake further research is needed to investigatethe mechanisms of food resource partitioning and spatial distribution

Shengjin Lake is one of themost important wintering sites for tens of thousands ofmigra-tory watebirds while annual life cycles of these birds depend on the whole migratory routeincluding breeding sites stop-over sites and wintering sites (Kear 2005) Thus a molecularreference library covering all the potential food items along the whole migratory routewill be useful both for understanding of wetland connections and waterbird conservationIn addition the ability of DNA metabarcoding to process lots of samples simultaneouslyenables rapid analyses and makes this method helpful for waterbird studies

ACKNOWLEDGEMENTSWe are very grateful to the stuff of the Shengjin Lake National Nature Reserve for theirexcellent assistance during the field work Great thanks to Zhujun Wang and An An forfeces collection in the field We thank Song Yang for collecting plants in the Shengjin LakeReserve We also thank Profs Zhenyu Li and Shuren Zhang for plant identification Specialthanks to Drs Meijuan Zhao Xin Wang Fanjuan Meng and Peihao Cong for preparingthe epidermis database and guiding microhistologic analysis

ADDITIONAL INFORMATION AND DECLARATIONS

FundingThis work was supported by the National Basic Research Program of China (973 ProgramGrant No 2012CB956104) to EZ the National Natural Science Foundation of China(Grant No 31370416) State Key Laboratory of Urban and Regional Ecology ChineseAcademy of Sciences (No SKLURE2013-1-05) to LC the National Natural Science

Yang et al (2016) PeerJ DOI 107717peerj2345 1421

Foundation of China (Grant No 31500315) to XW and 100-Talent Program of theChinese Academy of Sciences to AZ The funders had no role in study design datacollection and analysis decision to publish or preparation of the manuscript

Grant DisclosuresThe following grant information was disclosed by the authorsNational Basic Research Program of China 2012CB956104National Natural Science Foundation of China 31370416State Key Laboratory of Urban and Regional EcologyChinese Academy of Sciences SKLURE2013-1-05National Natural Science Foundation of China 31500315

Competing InterestsThe authors declare there are no competing interests

Author Contributionsbull Yuzhan Yang conceived and designed the experiments performed the experimentsanalyzed the data contributed reagentsmaterialsanalysis tools wrote the paperprepared figures andor tables reviewed drafts of the paperbull Aibin Zhan conceived and designed the experiments analyzed the data contributedreagentsmaterialsanalysis tools reviewed drafts of the paperbull Lei Cao conceived and designed the experiments contributed reagentsmaterialsanalysistools reviewed drafts of the paperbull Fanjuan Meng and Wenbin Xu performed the experiments contributed reagentsmate-rialsanalysis tools reviewed drafts of the paper

Animal EthicsThe following information was supplied relating to ethical approvals (ie approving bodyand any reference numbers)

Our research work did not involve capture or any direct manipulation or disturbancesof animals

Field Study PermissionsThe following information was supplied relating to field study approvals (ie approvingbody and any reference numbers)

We collected samples of plants and feces for molecular analyses We received permissionto access the reserve for sampling from the Shengjin Lake National Nature ReserveAdministration (Chizhou Anhui China) which is responsible for the management of theprotected area and wildlife We were forbidden to capture or disturb geese in the field

DNA DepositionThe following information was supplied regarding the deposition of DNA sequences

NCBI Sequence Read Archive (SRA) databaseAccession number SRP070470

Yang et al (2016) PeerJ DOI 107717peerj2345 1521

Data AvailabilityThe following information was supplied regarding data availability

The raw data has been supplied as a Supplemental Dataset

Supplemental InformationSupplemental information for this article can be found online at httpdxdoiorg107717peerj2345supplemental-information

REFERENCESAltschul SF GishWMillerWMyers EW Lipman DJ 1990 Basic local alignment

search tool Journal of Molecular Biology 215403ndash410 DOI 101006jmbi19909999Aacutelvarez I Wendel JF 2003 Ribosomal ITS sequences and plant phylogenetic inference

Molecular Phylogenetics and Evolution 29417ndash434DOI 101016S1055-7903(03)00208-2

Ando H Setsuko S Horikoshi K Suzuki H Umehara S Inoue-MurayamaM IsagiY 2013 Diet analysis by nextmdashgeneration sequencing indicates the frequentconsumption of introduced plants by the critically endangered redmdashheaded woodpigeon (Columba janthina nitens) in oceanic island habitats Ecology and Evolution34057ndash4069 DOI 101002ece3773

Barco A RaupachMJ Laakmann S NeumannH Knebelsberger T 2015 Identificationof North Sea molluscs with DNA barcodingMolecular Ecology Resources 16288ndash297DOI 1011111755-099812440

Bik HM Porazinska DL Creer S Caporaso JG Knight R ThomasWK 2012 Sequenc-ing our way towards understanding global eukaryotic biodiversity Trends in Ecologyand Evolution 27233ndash243 DOI 101016jtree201111010

Bohmann K Evans A Gilbert MTP Carvalho GR Creer S KnappM YuWD DeBruynM 2014 Environmental DNA for wildlife biology and biodiversity monitor-ing Trends in Ecology and Evolution 29358ndash367 DOI 101016jtree201404003

Brander LM Florax RJ Vermaat JE 2006 The empirics of wetland valuation acomprehensive summary and a meta-analysis of the literature Environmental andResource Economics 33223ndash250 DOI 101007s10640-005-3104-4

Chase MW Cowan RS Hollingsworth PM Van den Berg C Madrintildeaacuten S Petersen GSeberg O Jorgsensen T Cameron KM Carine M Pedersen N Hedderson TAJConrad F Salazar GA Richardson JE HollingsworthM Barraclough TG KellyL WilkinsonM 2007 A proposal for a standardised protocol to barcode all landplants Taxon 56295ndash299

Cota-Sanchez JH Remarchuk K Ubayasena K 2006 Ready-to-use DNA extracted witha CTAB method adapted for herbarium specimens and mucilaginous plant tissuePlant Molecular Biology Reporter 24161ndash167 DOI 101007BF02914055

Creer S Fonseca VG Porazinska DL Giblin-Davis RM SungW Power DM Packer MCarvalho GR Blaxter ML Lambshead P ThomasWK 2010 Ultrasequencing ofthe meiofaunal biosphere Practice pitfalls and promisesMolecular Ecology 194ndash20DOI 101111j1365-294X200904473x

Yang et al (2016) PeerJ DOI 107717peerj2345 1621

De BarbaMMiquel C Boyer F Mercier C Rioux D Coissac E Taberlet P 2014 DNAmetabarcoding multiplexing and validation of data accuracy for diet assessmentapplication to omnivorous dietMolecular Ecology Resources 14306ndash323DOI 1011111755-099812188

Deagle BE Kirkwood R Jarman SN 2009 Analysis of Australian fur seal diet bypyrosequencing prey DNA in faecesMolecular Ecology 182022ndash2038DOI 101111j1365-294X200904158x

Dunning LT Savolainen V 2010 Broad-scale amplification of matK for DNA barcodingplants a technical note Botanical Journal of the Linnean Society 1641ndash9DOI 101111j1095-8339201001071x

Edgar RC 2010 Search and clustering orders of magnitude faster than BLAST Bioinfor-matics 262460ndash2461 DOI 101093bioinformaticsbtq461

Edgar RC 2013 UPARSE highly accurate OTU sequences from microbial ampliconreads Nature Methods 10996ndash998 DOI 101038NMETH2604

Elliott TL Jonathan Davies T 2014 Challenges to barcoding an entire floraMolecularEcology Resources 14883ndash891 DOI 1011111755-099812277

Fazekas AJ Burgess KS Kesanakurti PR Graham SW Newmaster SG Husband BCPercy DM Hajibabaei M Barrett SC 2008Multiple multilocus DNA barcodesfrom the plastid genome discriminate plant species equally well PLoS ONE 3e2802DOI 101371journalpone0002802

Fonseca VG Carvalho GR SungW Johnson HF Power DM Neill SP Packer MBlaxter ML Labmshead PJD ThomasWK Creer S 2010 Second-generationenvironmental sequencing unmasks marine metazoan biodiversity Nature Commu-nications 1Article 98 DOI 101038ncomms1095

Ford CS Ayres KL Haider N Toomey N Van-Alpen-Stohl J 2009 Selection ofcandidate DNA barcoding regions for use on land plants Botanical Journal of theLinnean Society 1591ndash11 DOI 101111j1095-8339200800938x

Fox AD Bergersen E Tombre IM Madsen J 2007Minimal intra-seasonal dietaryoverlap of barnacle and pink-footed geese on their breeding grounds in SvalbardPolar Biology 30759ndash768 DOI 101007s00300-006-0235-1

Fox AD Cao L Zhang Y Barter M ZhaoMMeng FWang S 2011 Declines in thetuber-feeding waterbird guild at Shengjin Lake National Nature Reserve Chinandashabarometer of submerged macrophyte collapse Aquatic Conservation Marine andFreshwater Ecosystems 2182ndash91 DOI 101002aqc1154

Graham SW Reeves PA Burns AC Olmstead RG 2000Microstructural changes innoncoding chloroplast DNA interpretation evolution and utility of indels andinversions in basal angiosperm phylogenetic inference International Journal of PlantSciences 161S83ndashS96 DOI 101086317583

HasebeM Omori T NakazawaM Sano T KatoM Iwatsuki K 1994 rbcL genesequences provide evidence for the evolutionary lineages of leptosporangiate fernsProceedings of the National Academy of Sciences of the United States of America915730ndash5734 DOI 101073pnas91125730

Yang et al (2016) PeerJ DOI 107717peerj2345 1721

Hibert F Taberlet P Chave J Scotti-Saintagne C Sabatier D Richard-Hansen C 2013Unveiling the diet of elusive rainforest herbivores in next generation sequencing eraThe tapir as a case study PLoS ONE 8e60799 DOI 101371journalpone0060799

Hilu KW Liang H 1997 ThematK gene sequence variation and application in plantsystematics American Journal of Botany 84830ndash839 DOI 1023072445819

Hollingsworth PM Graham SW Little DP 2011 Choosing and using a plant DNAbarcode PLoS ONE 6e19254 DOI 101371journalpone0019254

James HF Burney DA 1997 The diet and ecology of Hawaiirsquos extinct flightless water-fowl evidence from coprolites Biological Journal of the Linnean Society 62279ndash297DOI 101111j1095-83121997tb01627x

Kear J 2005Ducks geese and swans Oxford Oxford University PressKressWJ Erickson DL Jones FA Swenson NG Perez R Sanjur O Bermingham E

2009 Plant DNA barcodes and a community phylogeny of a tropical forest dynamicsplot in Panama Proceedings of the National Academy of Sciences of the United States ofAmerica 10618621ndash18626 DOI 101073pnas0909820106

Li X Yang Y Henry RJ Rossetto MWang Y Chen S 2015 Plant DNA barcoding fromgene to genome Biological Reviews 90157ndash166 DOI 101111brv12104

Ma Z Cai Y Li B Chen J 2010Managing wetland habitats for waterbirds an interna-tional perspectiveWetlands 3015ndash27 DOI 101007s13157-009-0001-6

Madsen J Mortensen CE 1987Habitat exploitation and interspecific competition ofmoulting geese in East Greenl Ibis 12925ndash44DOI 101111j1474-919X1987tb03157x

Noordhuis R Van der Molen DT Van den BergMS 2002 Response of herbivorouswater-birds to the return of Chara in Lake Veluwemeer The Netherlands AquaticBotany 72349ndash367 DOI 101016S0304-3770(01)00210-8

Parameswaran P Jalili R Tao L Shokralla S Gharizadeh B Ronaghi M Fire AZ 2007A pyrosequencing-tailored nucleotide barcode design unveils opportunities for large-scale sample multiplexing Nucleic Acids Research 35e130 DOI 101093nargkm760

Pintildeol J Mir G Gomez-Polo P Agustiacute N 2015 Universal and blocking primermismatches limit the use of high-throughput DNA sequencing for the quanti-tative metabarcoding of arthropodsMolecular Ecology Resources 15819ndash830DOI 1011111755-099812355

Pompanon F Deagle BE SymondsonWO Brown DS Jarman SN Taberlet P 2012Who is eating what diet assessment using next generation sequencingMolecularEcology 211931ndash1950 DOI 101111j1365-294X201105403x

Prop J Black JM Shimmings P 2003 Travel schedules to the high arctic barnaclegeese trademdashoff the timing of migration with accumulation of fat deposits Oikos103403ndash414 DOI 101034j1600-0706200312042x

Rayeacute G Miquel C Coissac E Redjadj C Loison A Taberlet P 2011 New insightson diet variability revealed by DNA barcoding and high-throughput sequenc-ing chamois diet in autumn as a case study Ecological Research 26265ndash276DOI 101007s11284-010-0780-5

Yang et al (2016) PeerJ DOI 107717peerj2345 1821

Reynolds C Miranda NA Cumming GS 2015 The role of waterbirds in the dispersalof aquatic alien and invasive species Diversity and Distributions 21744ndash754DOI 101111ddi12334

Samelius G Alisauskas RT 1999 Diet and growth of glaucous gulls at a large Arcticgoose colony Canadian Journal of Zoology 771327ndash1331 DOI 101139z99-091

Sang T Crawford DJ Stuessy TF 1997 Chloroplast DNA phylogeny reticulateevolution and biogeography of Paeonia (Paeoniaceae) American Journal of Botany841120ndash1136

ShehzadW Riaz T NawazMAMiquel C Poillot C Shah SA Pompanon F CoissacE Taberlet P 2012 Carnivore diet analysis based on next-generation sequencingapplication to the leopard cat (Prionailurus bengalensis) in PakistanMolecularEcology 211951ndash1965 DOI 101111j1365-294X201105424x

Shokralla S Gibson JF Nikbakht H Janzen DH HallwachsW Hajibabaei M 2014Next-generation DNA barcoding using next-generation sequencing to enhance andaccelerate DNA barcode capture from single specimensMolecular Ecology Resources14892ndash901 DOI 1011111755-099812236

Shokralla S Spall JL Gibson JF 2012 Next-generation sequencing technologies forenvironmental DNA researchMolecular Ecology 211794ndash1805DOI 101111j1365-294X201205538x

Srivathsan A Sha J Vogler AP Meier R 2015 Comparing the effectiveness of metage-nomics and metabarcoding for diet analysis of a leaf-feeding monkey (Pygathrixnemaeus)Molecular Ecology Resources 15250ndash261 DOI 1011111755-099812302

Starr JR Naczi RFC Chouinard BN 2009 Plant DNA barcodes and species res-olution in sedge (Carex Cyperaceae)Molecular Ecology Resources 9151ndash163DOI 101111j1755-0998200902640x

Sun C Zhao Y Li H Dong Y MacIsaac HJ Zhan A 2015 Unreliable quantificationof species abundance based on high-throughput sequencing data of zooplanktoncommunities Aquatic Biology 249ndash15 DOI 103354ab00629

Swennen C Yu YT 2005 Food and feeding behavior of the black-faced SpoonbillWaterbirds 2819ndash27 DOI 1016751524-4695(2005)028[0019FAFBOT]20CO2

SymondsonWOC 2002Molecular identification of prey in predator dietsMolecularEcology 11627ndash641 DOI 101046j1365-294X200201471x

Taberlet P Coissac E Pompanon F Brochmann CWillerslev E 2012 Towards next-generation biodiversity assessment using DNA metabarcodingMolecular Ecology212045ndash2050 DOI 101111j1365-294X201205470x

Taberlet P Coissac E Pompanon F Gielly L Miquel C Valentini A Vermat TCorthier G Brocmann CWillerslev E 2007 Power and limitations of the chloro-plast trnL (UAA) intron for plant DNA barcoding Nucleic Acids Research 35e14DOI 101093nargkl938

Taberlet P Gielly L Pautou G Bouvet J 1991 Universal primers for amplifica-tion of three non-coding regions of chloroplast DNA Plant Molecular Biology171105ndash1109 DOI 101007BF00037152

Yang et al (2016) PeerJ DOI 107717peerj2345 1921

Tamura K Stecher G Peterson D Filipski A Kumar S 2013MEGA6 molecular evolu-tionary genetics analysis version 60MolecularBiology and Evolution 302725ndash2729DOI 101093molbevmst197

Tate JA Simpson BB 2003 Paraphyly of Tarasa (Malvaceae) and diverse origins of thepolyploidy species Systematic Botany 28723ndash737 DOI 10104302-641

Valentini A Pompanon F Taberlet P 2009 DNA barcoding for ecologists Trends inEcology and Evolution 24110ndash117 DOI 101016jtree200809011

Wang X Fox AD Cong P Barter M Cao L 2012 Changes in the distribution and abun-dance of wintering Lesser White-fronted Geese Anser erythropus in eastern ChinaBird Conservation International 22128ndash134 DOI 101017S095927091100030X

Wang X Fox AD Cong P Cao L 2013 Food constraints explain the restricted dis-tribution of wintering Lesser white-fronted Geese Anser erythropus in China Ibis155576ndash592 DOI 101111ibi12039

Whitlock BA Hale AM Groff PA 2010 Intraspecific inversions pose a challenge for thetrnH-psbA plant DNA barcode PLoS ONE 5e11533DOI 101371journalpone0011533

Wolfe KH LiWH Sharp PM 1987 Rates of nucleotide substitution vary greatlyamong plant mitochondrial chloroplast and nuclear DNAs Proceedings ofthe National Academy of Sciences of the United States of America 849054ndash9058DOI 101073pnas84249054

Xu C DongW Shi S Cheng T Li C Liu YWu PWuH Gao P Zhou S 2015 Acceler-ating plant DNA barcode reference library construction using herbarium specimensimproved experimental techniquesMolecular Ecology Resources 151366ndash1374DOI 1011111755-099812413

Xu L XuW Sun Q Zhou Z Shen J Zhao X 2008 Flora and vegetation in ShengjinLake Journal of Wuhan Botanical Research 27264ndash270

Zhan A Bailey SA Heath DDMacIsaac HJ 2014 Performance comparison of geneticmarkers for high-throughput sequencing-based biodiversity assessment in complexcommunitiesMolecular Ecology Resources 141049ndash1059DOI 1011111755-099812254

Zhan A MacIsaac HJ 2015 Rare biosphere exploration using high-throughput sequenc-ing research progress and perspectives Conservation Genetics 16513ndash522DOI 101007s10592-014-0678-9

Zhang Y Cao L Barter M Fox AD ZhaoMMeng F Shi H 2011 Changing distributionand abundance of Swan Goose Anser cygnoides in the Yangtze River floodplainthe likely loss of a very important wintering site Bird Conservation International2136ndash48 DOI 101017S0959270910000201

ZhaoM Cao L Fox AD 2013 Distribution and diet of wintering Tundra Bean GeeseAnser fabalis serrirostris at Shengjin LakeYangtze River floodplain ChinaWildfow6052ndash63

ZhaoM Cao L KlaassenM Zhang Y Fox AD 2015 Avoiding competition Site usediet and foraging behaviours in two similarly sized geese wintering in China Ardea10327ndash38 DOI 105253ardev103i1a3

Yang et al (2016) PeerJ DOI 107717peerj2345 2021

ZhaoM Cong P Barter M Fox AD Cao L 2012 The changing abundanceand distribu-tion of Greater White-fronted Geese Anser albifrons in the Yangtze River floodplainimpacts of recenthydrological changes Bird Conservation International 22135ndash143DOI 101017S0959270911000542

Yang et al (2016) PeerJ DOI 107717peerj2345 2121

Page 15: Selection of a marker gene to construct and the application ...We concluded that DNA metabarcoding provides new perspectives for studies of herbivorous waterbird diets and inter-specific

Foundation of China (Grant No 31500315) to XW and 100-Talent Program of theChinese Academy of Sciences to AZ The funders had no role in study design datacollection and analysis decision to publish or preparation of the manuscript

Grant DisclosuresThe following grant information was disclosed by the authorsNational Basic Research Program of China 2012CB956104National Natural Science Foundation of China 31370416State Key Laboratory of Urban and Regional EcologyChinese Academy of Sciences SKLURE2013-1-05National Natural Science Foundation of China 31500315

Competing InterestsThe authors declare there are no competing interests

Author Contributionsbull Yuzhan Yang conceived and designed the experiments performed the experimentsanalyzed the data contributed reagentsmaterialsanalysis tools wrote the paperprepared figures andor tables reviewed drafts of the paperbull Aibin Zhan conceived and designed the experiments analyzed the data contributedreagentsmaterialsanalysis tools reviewed drafts of the paperbull Lei Cao conceived and designed the experiments contributed reagentsmaterialsanalysistools reviewed drafts of the paperbull Fanjuan Meng and Wenbin Xu performed the experiments contributed reagentsmate-rialsanalysis tools reviewed drafts of the paper

Animal EthicsThe following information was supplied relating to ethical approvals (ie approving bodyand any reference numbers)

Our research work did not involve capture or any direct manipulation or disturbancesof animals

Field Study PermissionsThe following information was supplied relating to field study approvals (ie approvingbody and any reference numbers)

We collected samples of plants and feces for molecular analyses We received permissionto access the reserve for sampling from the Shengjin Lake National Nature ReserveAdministration (Chizhou Anhui China) which is responsible for the management of theprotected area and wildlife We were forbidden to capture or disturb geese in the field

DNA DepositionThe following information was supplied regarding the deposition of DNA sequences

NCBI Sequence Read Archive (SRA) databaseAccession number SRP070470

Yang et al (2016) PeerJ DOI 107717peerj2345 1521

Data AvailabilityThe following information was supplied regarding data availability

The raw data has been supplied as a Supplemental Dataset

Supplemental InformationSupplemental information for this article can be found online at httpdxdoiorg107717peerj2345supplemental-information

REFERENCESAltschul SF GishWMillerWMyers EW Lipman DJ 1990 Basic local alignment

search tool Journal of Molecular Biology 215403ndash410 DOI 101006jmbi19909999Aacutelvarez I Wendel JF 2003 Ribosomal ITS sequences and plant phylogenetic inference

Molecular Phylogenetics and Evolution 29417ndash434DOI 101016S1055-7903(03)00208-2

Ando H Setsuko S Horikoshi K Suzuki H Umehara S Inoue-MurayamaM IsagiY 2013 Diet analysis by nextmdashgeneration sequencing indicates the frequentconsumption of introduced plants by the critically endangered redmdashheaded woodpigeon (Columba janthina nitens) in oceanic island habitats Ecology and Evolution34057ndash4069 DOI 101002ece3773

Barco A RaupachMJ Laakmann S NeumannH Knebelsberger T 2015 Identificationof North Sea molluscs with DNA barcodingMolecular Ecology Resources 16288ndash297DOI 1011111755-099812440

Bik HM Porazinska DL Creer S Caporaso JG Knight R ThomasWK 2012 Sequenc-ing our way towards understanding global eukaryotic biodiversity Trends in Ecologyand Evolution 27233ndash243 DOI 101016jtree201111010

Bohmann K Evans A Gilbert MTP Carvalho GR Creer S KnappM YuWD DeBruynM 2014 Environmental DNA for wildlife biology and biodiversity monitor-ing Trends in Ecology and Evolution 29358ndash367 DOI 101016jtree201404003

Brander LM Florax RJ Vermaat JE 2006 The empirics of wetland valuation acomprehensive summary and a meta-analysis of the literature Environmental andResource Economics 33223ndash250 DOI 101007s10640-005-3104-4

Chase MW Cowan RS Hollingsworth PM Van den Berg C Madrintildeaacuten S Petersen GSeberg O Jorgsensen T Cameron KM Carine M Pedersen N Hedderson TAJConrad F Salazar GA Richardson JE HollingsworthM Barraclough TG KellyL WilkinsonM 2007 A proposal for a standardised protocol to barcode all landplants Taxon 56295ndash299

Cota-Sanchez JH Remarchuk K Ubayasena K 2006 Ready-to-use DNA extracted witha CTAB method adapted for herbarium specimens and mucilaginous plant tissuePlant Molecular Biology Reporter 24161ndash167 DOI 101007BF02914055

Creer S Fonseca VG Porazinska DL Giblin-Davis RM SungW Power DM Packer MCarvalho GR Blaxter ML Lambshead P ThomasWK 2010 Ultrasequencing ofthe meiofaunal biosphere Practice pitfalls and promisesMolecular Ecology 194ndash20DOI 101111j1365-294X200904473x

Yang et al (2016) PeerJ DOI 107717peerj2345 1621

De BarbaMMiquel C Boyer F Mercier C Rioux D Coissac E Taberlet P 2014 DNAmetabarcoding multiplexing and validation of data accuracy for diet assessmentapplication to omnivorous dietMolecular Ecology Resources 14306ndash323DOI 1011111755-099812188

Deagle BE Kirkwood R Jarman SN 2009 Analysis of Australian fur seal diet bypyrosequencing prey DNA in faecesMolecular Ecology 182022ndash2038DOI 101111j1365-294X200904158x

Dunning LT Savolainen V 2010 Broad-scale amplification of matK for DNA barcodingplants a technical note Botanical Journal of the Linnean Society 1641ndash9DOI 101111j1095-8339201001071x

Edgar RC 2010 Search and clustering orders of magnitude faster than BLAST Bioinfor-matics 262460ndash2461 DOI 101093bioinformaticsbtq461

Edgar RC 2013 UPARSE highly accurate OTU sequences from microbial ampliconreads Nature Methods 10996ndash998 DOI 101038NMETH2604

Elliott TL Jonathan Davies T 2014 Challenges to barcoding an entire floraMolecularEcology Resources 14883ndash891 DOI 1011111755-099812277

Fazekas AJ Burgess KS Kesanakurti PR Graham SW Newmaster SG Husband BCPercy DM Hajibabaei M Barrett SC 2008Multiple multilocus DNA barcodesfrom the plastid genome discriminate plant species equally well PLoS ONE 3e2802DOI 101371journalpone0002802

Fonseca VG Carvalho GR SungW Johnson HF Power DM Neill SP Packer MBlaxter ML Labmshead PJD ThomasWK Creer S 2010 Second-generationenvironmental sequencing unmasks marine metazoan biodiversity Nature Commu-nications 1Article 98 DOI 101038ncomms1095

Ford CS Ayres KL Haider N Toomey N Van-Alpen-Stohl J 2009 Selection ofcandidate DNA barcoding regions for use on land plants Botanical Journal of theLinnean Society 1591ndash11 DOI 101111j1095-8339200800938x

Fox AD Bergersen E Tombre IM Madsen J 2007Minimal intra-seasonal dietaryoverlap of barnacle and pink-footed geese on their breeding grounds in SvalbardPolar Biology 30759ndash768 DOI 101007s00300-006-0235-1

Fox AD Cao L Zhang Y Barter M ZhaoMMeng FWang S 2011 Declines in thetuber-feeding waterbird guild at Shengjin Lake National Nature Reserve Chinandashabarometer of submerged macrophyte collapse Aquatic Conservation Marine andFreshwater Ecosystems 2182ndash91 DOI 101002aqc1154

Graham SW Reeves PA Burns AC Olmstead RG 2000Microstructural changes innoncoding chloroplast DNA interpretation evolution and utility of indels andinversions in basal angiosperm phylogenetic inference International Journal of PlantSciences 161S83ndashS96 DOI 101086317583

HasebeM Omori T NakazawaM Sano T KatoM Iwatsuki K 1994 rbcL genesequences provide evidence for the evolutionary lineages of leptosporangiate fernsProceedings of the National Academy of Sciences of the United States of America915730ndash5734 DOI 101073pnas91125730

Yang et al (2016) PeerJ DOI 107717peerj2345 1721

Hibert F Taberlet P Chave J Scotti-Saintagne C Sabatier D Richard-Hansen C 2013Unveiling the diet of elusive rainforest herbivores in next generation sequencing eraThe tapir as a case study PLoS ONE 8e60799 DOI 101371journalpone0060799

Hilu KW Liang H 1997 ThematK gene sequence variation and application in plantsystematics American Journal of Botany 84830ndash839 DOI 1023072445819

Hollingsworth PM Graham SW Little DP 2011 Choosing and using a plant DNAbarcode PLoS ONE 6e19254 DOI 101371journalpone0019254

James HF Burney DA 1997 The diet and ecology of Hawaiirsquos extinct flightless water-fowl evidence from coprolites Biological Journal of the Linnean Society 62279ndash297DOI 101111j1095-83121997tb01627x

Kear J 2005Ducks geese and swans Oxford Oxford University PressKressWJ Erickson DL Jones FA Swenson NG Perez R Sanjur O Bermingham E

2009 Plant DNA barcodes and a community phylogeny of a tropical forest dynamicsplot in Panama Proceedings of the National Academy of Sciences of the United States ofAmerica 10618621ndash18626 DOI 101073pnas0909820106

Li X Yang Y Henry RJ Rossetto MWang Y Chen S 2015 Plant DNA barcoding fromgene to genome Biological Reviews 90157ndash166 DOI 101111brv12104

Ma Z Cai Y Li B Chen J 2010Managing wetland habitats for waterbirds an interna-tional perspectiveWetlands 3015ndash27 DOI 101007s13157-009-0001-6

Madsen J Mortensen CE 1987Habitat exploitation and interspecific competition ofmoulting geese in East Greenl Ibis 12925ndash44DOI 101111j1474-919X1987tb03157x

Noordhuis R Van der Molen DT Van den BergMS 2002 Response of herbivorouswater-birds to the return of Chara in Lake Veluwemeer The Netherlands AquaticBotany 72349ndash367 DOI 101016S0304-3770(01)00210-8

Parameswaran P Jalili R Tao L Shokralla S Gharizadeh B Ronaghi M Fire AZ 2007A pyrosequencing-tailored nucleotide barcode design unveils opportunities for large-scale sample multiplexing Nucleic Acids Research 35e130 DOI 101093nargkm760

Pintildeol J Mir G Gomez-Polo P Agustiacute N 2015 Universal and blocking primermismatches limit the use of high-throughput DNA sequencing for the quanti-tative metabarcoding of arthropodsMolecular Ecology Resources 15819ndash830DOI 1011111755-099812355

Pompanon F Deagle BE SymondsonWO Brown DS Jarman SN Taberlet P 2012Who is eating what diet assessment using next generation sequencingMolecularEcology 211931ndash1950 DOI 101111j1365-294X201105403x

Prop J Black JM Shimmings P 2003 Travel schedules to the high arctic barnaclegeese trademdashoff the timing of migration with accumulation of fat deposits Oikos103403ndash414 DOI 101034j1600-0706200312042x

Rayeacute G Miquel C Coissac E Redjadj C Loison A Taberlet P 2011 New insightson diet variability revealed by DNA barcoding and high-throughput sequenc-ing chamois diet in autumn as a case study Ecological Research 26265ndash276DOI 101007s11284-010-0780-5

Yang et al (2016) PeerJ DOI 107717peerj2345 1821

Reynolds C Miranda NA Cumming GS 2015 The role of waterbirds in the dispersalof aquatic alien and invasive species Diversity and Distributions 21744ndash754DOI 101111ddi12334

Samelius G Alisauskas RT 1999 Diet and growth of glaucous gulls at a large Arcticgoose colony Canadian Journal of Zoology 771327ndash1331 DOI 101139z99-091

Sang T Crawford DJ Stuessy TF 1997 Chloroplast DNA phylogeny reticulateevolution and biogeography of Paeonia (Paeoniaceae) American Journal of Botany841120ndash1136

ShehzadW Riaz T NawazMAMiquel C Poillot C Shah SA Pompanon F CoissacE Taberlet P 2012 Carnivore diet analysis based on next-generation sequencingapplication to the leopard cat (Prionailurus bengalensis) in PakistanMolecularEcology 211951ndash1965 DOI 101111j1365-294X201105424x

Shokralla S Gibson JF Nikbakht H Janzen DH HallwachsW Hajibabaei M 2014Next-generation DNA barcoding using next-generation sequencing to enhance andaccelerate DNA barcode capture from single specimensMolecular Ecology Resources14892ndash901 DOI 1011111755-099812236

Shokralla S Spall JL Gibson JF 2012 Next-generation sequencing technologies forenvironmental DNA researchMolecular Ecology 211794ndash1805DOI 101111j1365-294X201205538x

Srivathsan A Sha J Vogler AP Meier R 2015 Comparing the effectiveness of metage-nomics and metabarcoding for diet analysis of a leaf-feeding monkey (Pygathrixnemaeus)Molecular Ecology Resources 15250ndash261 DOI 1011111755-099812302

Starr JR Naczi RFC Chouinard BN 2009 Plant DNA barcodes and species res-olution in sedge (Carex Cyperaceae)Molecular Ecology Resources 9151ndash163DOI 101111j1755-0998200902640x

Sun C Zhao Y Li H Dong Y MacIsaac HJ Zhan A 2015 Unreliable quantificationof species abundance based on high-throughput sequencing data of zooplanktoncommunities Aquatic Biology 249ndash15 DOI 103354ab00629

Swennen C Yu YT 2005 Food and feeding behavior of the black-faced SpoonbillWaterbirds 2819ndash27 DOI 1016751524-4695(2005)028[0019FAFBOT]20CO2

SymondsonWOC 2002Molecular identification of prey in predator dietsMolecularEcology 11627ndash641 DOI 101046j1365-294X200201471x

Taberlet P Coissac E Pompanon F Brochmann CWillerslev E 2012 Towards next-generation biodiversity assessment using DNA metabarcodingMolecular Ecology212045ndash2050 DOI 101111j1365-294X201205470x

Taberlet P Coissac E Pompanon F Gielly L Miquel C Valentini A Vermat TCorthier G Brocmann CWillerslev E 2007 Power and limitations of the chloro-plast trnL (UAA) intron for plant DNA barcoding Nucleic Acids Research 35e14DOI 101093nargkl938

Taberlet P Gielly L Pautou G Bouvet J 1991 Universal primers for amplifica-tion of three non-coding regions of chloroplast DNA Plant Molecular Biology171105ndash1109 DOI 101007BF00037152

Yang et al (2016) PeerJ DOI 107717peerj2345 1921

Tamura K Stecher G Peterson D Filipski A Kumar S 2013MEGA6 molecular evolu-tionary genetics analysis version 60MolecularBiology and Evolution 302725ndash2729DOI 101093molbevmst197

Tate JA Simpson BB 2003 Paraphyly of Tarasa (Malvaceae) and diverse origins of thepolyploidy species Systematic Botany 28723ndash737 DOI 10104302-641

Valentini A Pompanon F Taberlet P 2009 DNA barcoding for ecologists Trends inEcology and Evolution 24110ndash117 DOI 101016jtree200809011

Wang X Fox AD Cong P Barter M Cao L 2012 Changes in the distribution and abun-dance of wintering Lesser White-fronted Geese Anser erythropus in eastern ChinaBird Conservation International 22128ndash134 DOI 101017S095927091100030X

Wang X Fox AD Cong P Cao L 2013 Food constraints explain the restricted dis-tribution of wintering Lesser white-fronted Geese Anser erythropus in China Ibis155576ndash592 DOI 101111ibi12039

Whitlock BA Hale AM Groff PA 2010 Intraspecific inversions pose a challenge for thetrnH-psbA plant DNA barcode PLoS ONE 5e11533DOI 101371journalpone0011533

Wolfe KH LiWH Sharp PM 1987 Rates of nucleotide substitution vary greatlyamong plant mitochondrial chloroplast and nuclear DNAs Proceedings ofthe National Academy of Sciences of the United States of America 849054ndash9058DOI 101073pnas84249054

Xu C DongW Shi S Cheng T Li C Liu YWu PWuH Gao P Zhou S 2015 Acceler-ating plant DNA barcode reference library construction using herbarium specimensimproved experimental techniquesMolecular Ecology Resources 151366ndash1374DOI 1011111755-099812413

Xu L XuW Sun Q Zhou Z Shen J Zhao X 2008 Flora and vegetation in ShengjinLake Journal of Wuhan Botanical Research 27264ndash270

Zhan A Bailey SA Heath DDMacIsaac HJ 2014 Performance comparison of geneticmarkers for high-throughput sequencing-based biodiversity assessment in complexcommunitiesMolecular Ecology Resources 141049ndash1059DOI 1011111755-099812254

Zhan A MacIsaac HJ 2015 Rare biosphere exploration using high-throughput sequenc-ing research progress and perspectives Conservation Genetics 16513ndash522DOI 101007s10592-014-0678-9

Zhang Y Cao L Barter M Fox AD ZhaoMMeng F Shi H 2011 Changing distributionand abundance of Swan Goose Anser cygnoides in the Yangtze River floodplainthe likely loss of a very important wintering site Bird Conservation International2136ndash48 DOI 101017S0959270910000201

ZhaoM Cao L Fox AD 2013 Distribution and diet of wintering Tundra Bean GeeseAnser fabalis serrirostris at Shengjin LakeYangtze River floodplain ChinaWildfow6052ndash63

ZhaoM Cao L KlaassenM Zhang Y Fox AD 2015 Avoiding competition Site usediet and foraging behaviours in two similarly sized geese wintering in China Ardea10327ndash38 DOI 105253ardev103i1a3

Yang et al (2016) PeerJ DOI 107717peerj2345 2021

ZhaoM Cong P Barter M Fox AD Cao L 2012 The changing abundanceand distribu-tion of Greater White-fronted Geese Anser albifrons in the Yangtze River floodplainimpacts of recenthydrological changes Bird Conservation International 22135ndash143DOI 101017S0959270911000542

Yang et al (2016) PeerJ DOI 107717peerj2345 2121

Page 16: Selection of a marker gene to construct and the application ...We concluded that DNA metabarcoding provides new perspectives for studies of herbivorous waterbird diets and inter-specific

Data AvailabilityThe following information was supplied regarding data availability

The raw data has been supplied as a Supplemental Dataset

Supplemental InformationSupplemental information for this article can be found online at httpdxdoiorg107717peerj2345supplemental-information

REFERENCESAltschul SF GishWMillerWMyers EW Lipman DJ 1990 Basic local alignment

search tool Journal of Molecular Biology 215403ndash410 DOI 101006jmbi19909999Aacutelvarez I Wendel JF 2003 Ribosomal ITS sequences and plant phylogenetic inference

Molecular Phylogenetics and Evolution 29417ndash434DOI 101016S1055-7903(03)00208-2

Ando H Setsuko S Horikoshi K Suzuki H Umehara S Inoue-MurayamaM IsagiY 2013 Diet analysis by nextmdashgeneration sequencing indicates the frequentconsumption of introduced plants by the critically endangered redmdashheaded woodpigeon (Columba janthina nitens) in oceanic island habitats Ecology and Evolution34057ndash4069 DOI 101002ece3773

Barco A RaupachMJ Laakmann S NeumannH Knebelsberger T 2015 Identificationof North Sea molluscs with DNA barcodingMolecular Ecology Resources 16288ndash297DOI 1011111755-099812440

Bik HM Porazinska DL Creer S Caporaso JG Knight R ThomasWK 2012 Sequenc-ing our way towards understanding global eukaryotic biodiversity Trends in Ecologyand Evolution 27233ndash243 DOI 101016jtree201111010

Bohmann K Evans A Gilbert MTP Carvalho GR Creer S KnappM YuWD DeBruynM 2014 Environmental DNA for wildlife biology and biodiversity monitor-ing Trends in Ecology and Evolution 29358ndash367 DOI 101016jtree201404003

Brander LM Florax RJ Vermaat JE 2006 The empirics of wetland valuation acomprehensive summary and a meta-analysis of the literature Environmental andResource Economics 33223ndash250 DOI 101007s10640-005-3104-4

Chase MW Cowan RS Hollingsworth PM Van den Berg C Madrintildeaacuten S Petersen GSeberg O Jorgsensen T Cameron KM Carine M Pedersen N Hedderson TAJConrad F Salazar GA Richardson JE HollingsworthM Barraclough TG KellyL WilkinsonM 2007 A proposal for a standardised protocol to barcode all landplants Taxon 56295ndash299

Cota-Sanchez JH Remarchuk K Ubayasena K 2006 Ready-to-use DNA extracted witha CTAB method adapted for herbarium specimens and mucilaginous plant tissuePlant Molecular Biology Reporter 24161ndash167 DOI 101007BF02914055

Creer S Fonseca VG Porazinska DL Giblin-Davis RM SungW Power DM Packer MCarvalho GR Blaxter ML Lambshead P ThomasWK 2010 Ultrasequencing ofthe meiofaunal biosphere Practice pitfalls and promisesMolecular Ecology 194ndash20DOI 101111j1365-294X200904473x

Yang et al (2016) PeerJ DOI 107717peerj2345 1621

De BarbaMMiquel C Boyer F Mercier C Rioux D Coissac E Taberlet P 2014 DNAmetabarcoding multiplexing and validation of data accuracy for diet assessmentapplication to omnivorous dietMolecular Ecology Resources 14306ndash323DOI 1011111755-099812188

Deagle BE Kirkwood R Jarman SN 2009 Analysis of Australian fur seal diet bypyrosequencing prey DNA in faecesMolecular Ecology 182022ndash2038DOI 101111j1365-294X200904158x

Dunning LT Savolainen V 2010 Broad-scale amplification of matK for DNA barcodingplants a technical note Botanical Journal of the Linnean Society 1641ndash9DOI 101111j1095-8339201001071x

Edgar RC 2010 Search and clustering orders of magnitude faster than BLAST Bioinfor-matics 262460ndash2461 DOI 101093bioinformaticsbtq461

Edgar RC 2013 UPARSE highly accurate OTU sequences from microbial ampliconreads Nature Methods 10996ndash998 DOI 101038NMETH2604

Elliott TL Jonathan Davies T 2014 Challenges to barcoding an entire floraMolecularEcology Resources 14883ndash891 DOI 1011111755-099812277

Fazekas AJ Burgess KS Kesanakurti PR Graham SW Newmaster SG Husband BCPercy DM Hajibabaei M Barrett SC 2008Multiple multilocus DNA barcodesfrom the plastid genome discriminate plant species equally well PLoS ONE 3e2802DOI 101371journalpone0002802

Fonseca VG Carvalho GR SungW Johnson HF Power DM Neill SP Packer MBlaxter ML Labmshead PJD ThomasWK Creer S 2010 Second-generationenvironmental sequencing unmasks marine metazoan biodiversity Nature Commu-nications 1Article 98 DOI 101038ncomms1095

Ford CS Ayres KL Haider N Toomey N Van-Alpen-Stohl J 2009 Selection ofcandidate DNA barcoding regions for use on land plants Botanical Journal of theLinnean Society 1591ndash11 DOI 101111j1095-8339200800938x

Fox AD Bergersen E Tombre IM Madsen J 2007Minimal intra-seasonal dietaryoverlap of barnacle and pink-footed geese on their breeding grounds in SvalbardPolar Biology 30759ndash768 DOI 101007s00300-006-0235-1

Fox AD Cao L Zhang Y Barter M ZhaoMMeng FWang S 2011 Declines in thetuber-feeding waterbird guild at Shengjin Lake National Nature Reserve Chinandashabarometer of submerged macrophyte collapse Aquatic Conservation Marine andFreshwater Ecosystems 2182ndash91 DOI 101002aqc1154

Graham SW Reeves PA Burns AC Olmstead RG 2000Microstructural changes innoncoding chloroplast DNA interpretation evolution and utility of indels andinversions in basal angiosperm phylogenetic inference International Journal of PlantSciences 161S83ndashS96 DOI 101086317583

HasebeM Omori T NakazawaM Sano T KatoM Iwatsuki K 1994 rbcL genesequences provide evidence for the evolutionary lineages of leptosporangiate fernsProceedings of the National Academy of Sciences of the United States of America915730ndash5734 DOI 101073pnas91125730

Yang et al (2016) PeerJ DOI 107717peerj2345 1721

Hibert F Taberlet P Chave J Scotti-Saintagne C Sabatier D Richard-Hansen C 2013Unveiling the diet of elusive rainforest herbivores in next generation sequencing eraThe tapir as a case study PLoS ONE 8e60799 DOI 101371journalpone0060799

Hilu KW Liang H 1997 ThematK gene sequence variation and application in plantsystematics American Journal of Botany 84830ndash839 DOI 1023072445819

Hollingsworth PM Graham SW Little DP 2011 Choosing and using a plant DNAbarcode PLoS ONE 6e19254 DOI 101371journalpone0019254

James HF Burney DA 1997 The diet and ecology of Hawaiirsquos extinct flightless water-fowl evidence from coprolites Biological Journal of the Linnean Society 62279ndash297DOI 101111j1095-83121997tb01627x

Kear J 2005Ducks geese and swans Oxford Oxford University PressKressWJ Erickson DL Jones FA Swenson NG Perez R Sanjur O Bermingham E

2009 Plant DNA barcodes and a community phylogeny of a tropical forest dynamicsplot in Panama Proceedings of the National Academy of Sciences of the United States ofAmerica 10618621ndash18626 DOI 101073pnas0909820106

Li X Yang Y Henry RJ Rossetto MWang Y Chen S 2015 Plant DNA barcoding fromgene to genome Biological Reviews 90157ndash166 DOI 101111brv12104

Ma Z Cai Y Li B Chen J 2010Managing wetland habitats for waterbirds an interna-tional perspectiveWetlands 3015ndash27 DOI 101007s13157-009-0001-6

Madsen J Mortensen CE 1987Habitat exploitation and interspecific competition ofmoulting geese in East Greenl Ibis 12925ndash44DOI 101111j1474-919X1987tb03157x

Noordhuis R Van der Molen DT Van den BergMS 2002 Response of herbivorouswater-birds to the return of Chara in Lake Veluwemeer The Netherlands AquaticBotany 72349ndash367 DOI 101016S0304-3770(01)00210-8

Parameswaran P Jalili R Tao L Shokralla S Gharizadeh B Ronaghi M Fire AZ 2007A pyrosequencing-tailored nucleotide barcode design unveils opportunities for large-scale sample multiplexing Nucleic Acids Research 35e130 DOI 101093nargkm760

Pintildeol J Mir G Gomez-Polo P Agustiacute N 2015 Universal and blocking primermismatches limit the use of high-throughput DNA sequencing for the quanti-tative metabarcoding of arthropodsMolecular Ecology Resources 15819ndash830DOI 1011111755-099812355

Pompanon F Deagle BE SymondsonWO Brown DS Jarman SN Taberlet P 2012Who is eating what diet assessment using next generation sequencingMolecularEcology 211931ndash1950 DOI 101111j1365-294X201105403x

Prop J Black JM Shimmings P 2003 Travel schedules to the high arctic barnaclegeese trademdashoff the timing of migration with accumulation of fat deposits Oikos103403ndash414 DOI 101034j1600-0706200312042x

Rayeacute G Miquel C Coissac E Redjadj C Loison A Taberlet P 2011 New insightson diet variability revealed by DNA barcoding and high-throughput sequenc-ing chamois diet in autumn as a case study Ecological Research 26265ndash276DOI 101007s11284-010-0780-5

Yang et al (2016) PeerJ DOI 107717peerj2345 1821

Reynolds C Miranda NA Cumming GS 2015 The role of waterbirds in the dispersalof aquatic alien and invasive species Diversity and Distributions 21744ndash754DOI 101111ddi12334

Samelius G Alisauskas RT 1999 Diet and growth of glaucous gulls at a large Arcticgoose colony Canadian Journal of Zoology 771327ndash1331 DOI 101139z99-091

Sang T Crawford DJ Stuessy TF 1997 Chloroplast DNA phylogeny reticulateevolution and biogeography of Paeonia (Paeoniaceae) American Journal of Botany841120ndash1136

ShehzadW Riaz T NawazMAMiquel C Poillot C Shah SA Pompanon F CoissacE Taberlet P 2012 Carnivore diet analysis based on next-generation sequencingapplication to the leopard cat (Prionailurus bengalensis) in PakistanMolecularEcology 211951ndash1965 DOI 101111j1365-294X201105424x

Shokralla S Gibson JF Nikbakht H Janzen DH HallwachsW Hajibabaei M 2014Next-generation DNA barcoding using next-generation sequencing to enhance andaccelerate DNA barcode capture from single specimensMolecular Ecology Resources14892ndash901 DOI 1011111755-099812236

Shokralla S Spall JL Gibson JF 2012 Next-generation sequencing technologies forenvironmental DNA researchMolecular Ecology 211794ndash1805DOI 101111j1365-294X201205538x

Srivathsan A Sha J Vogler AP Meier R 2015 Comparing the effectiveness of metage-nomics and metabarcoding for diet analysis of a leaf-feeding monkey (Pygathrixnemaeus)Molecular Ecology Resources 15250ndash261 DOI 1011111755-099812302

Starr JR Naczi RFC Chouinard BN 2009 Plant DNA barcodes and species res-olution in sedge (Carex Cyperaceae)Molecular Ecology Resources 9151ndash163DOI 101111j1755-0998200902640x

Sun C Zhao Y Li H Dong Y MacIsaac HJ Zhan A 2015 Unreliable quantificationof species abundance based on high-throughput sequencing data of zooplanktoncommunities Aquatic Biology 249ndash15 DOI 103354ab00629

Swennen C Yu YT 2005 Food and feeding behavior of the black-faced SpoonbillWaterbirds 2819ndash27 DOI 1016751524-4695(2005)028[0019FAFBOT]20CO2

SymondsonWOC 2002Molecular identification of prey in predator dietsMolecularEcology 11627ndash641 DOI 101046j1365-294X200201471x

Taberlet P Coissac E Pompanon F Brochmann CWillerslev E 2012 Towards next-generation biodiversity assessment using DNA metabarcodingMolecular Ecology212045ndash2050 DOI 101111j1365-294X201205470x

Taberlet P Coissac E Pompanon F Gielly L Miquel C Valentini A Vermat TCorthier G Brocmann CWillerslev E 2007 Power and limitations of the chloro-plast trnL (UAA) intron for plant DNA barcoding Nucleic Acids Research 35e14DOI 101093nargkl938

Taberlet P Gielly L Pautou G Bouvet J 1991 Universal primers for amplifica-tion of three non-coding regions of chloroplast DNA Plant Molecular Biology171105ndash1109 DOI 101007BF00037152

Yang et al (2016) PeerJ DOI 107717peerj2345 1921

Tamura K Stecher G Peterson D Filipski A Kumar S 2013MEGA6 molecular evolu-tionary genetics analysis version 60MolecularBiology and Evolution 302725ndash2729DOI 101093molbevmst197

Tate JA Simpson BB 2003 Paraphyly of Tarasa (Malvaceae) and diverse origins of thepolyploidy species Systematic Botany 28723ndash737 DOI 10104302-641

Valentini A Pompanon F Taberlet P 2009 DNA barcoding for ecologists Trends inEcology and Evolution 24110ndash117 DOI 101016jtree200809011

Wang X Fox AD Cong P Barter M Cao L 2012 Changes in the distribution and abun-dance of wintering Lesser White-fronted Geese Anser erythropus in eastern ChinaBird Conservation International 22128ndash134 DOI 101017S095927091100030X

Wang X Fox AD Cong P Cao L 2013 Food constraints explain the restricted dis-tribution of wintering Lesser white-fronted Geese Anser erythropus in China Ibis155576ndash592 DOI 101111ibi12039

Whitlock BA Hale AM Groff PA 2010 Intraspecific inversions pose a challenge for thetrnH-psbA plant DNA barcode PLoS ONE 5e11533DOI 101371journalpone0011533

Wolfe KH LiWH Sharp PM 1987 Rates of nucleotide substitution vary greatlyamong plant mitochondrial chloroplast and nuclear DNAs Proceedings ofthe National Academy of Sciences of the United States of America 849054ndash9058DOI 101073pnas84249054

Xu C DongW Shi S Cheng T Li C Liu YWu PWuH Gao P Zhou S 2015 Acceler-ating plant DNA barcode reference library construction using herbarium specimensimproved experimental techniquesMolecular Ecology Resources 151366ndash1374DOI 1011111755-099812413

Xu L XuW Sun Q Zhou Z Shen J Zhao X 2008 Flora and vegetation in ShengjinLake Journal of Wuhan Botanical Research 27264ndash270

Zhan A Bailey SA Heath DDMacIsaac HJ 2014 Performance comparison of geneticmarkers for high-throughput sequencing-based biodiversity assessment in complexcommunitiesMolecular Ecology Resources 141049ndash1059DOI 1011111755-099812254

Zhan A MacIsaac HJ 2015 Rare biosphere exploration using high-throughput sequenc-ing research progress and perspectives Conservation Genetics 16513ndash522DOI 101007s10592-014-0678-9

Zhang Y Cao L Barter M Fox AD ZhaoMMeng F Shi H 2011 Changing distributionand abundance of Swan Goose Anser cygnoides in the Yangtze River floodplainthe likely loss of a very important wintering site Bird Conservation International2136ndash48 DOI 101017S0959270910000201

ZhaoM Cao L Fox AD 2013 Distribution and diet of wintering Tundra Bean GeeseAnser fabalis serrirostris at Shengjin LakeYangtze River floodplain ChinaWildfow6052ndash63

ZhaoM Cao L KlaassenM Zhang Y Fox AD 2015 Avoiding competition Site usediet and foraging behaviours in two similarly sized geese wintering in China Ardea10327ndash38 DOI 105253ardev103i1a3

Yang et al (2016) PeerJ DOI 107717peerj2345 2021

ZhaoM Cong P Barter M Fox AD Cao L 2012 The changing abundanceand distribu-tion of Greater White-fronted Geese Anser albifrons in the Yangtze River floodplainimpacts of recenthydrological changes Bird Conservation International 22135ndash143DOI 101017S0959270911000542

Yang et al (2016) PeerJ DOI 107717peerj2345 2121

Page 17: Selection of a marker gene to construct and the application ...We concluded that DNA metabarcoding provides new perspectives for studies of herbivorous waterbird diets and inter-specific

De BarbaMMiquel C Boyer F Mercier C Rioux D Coissac E Taberlet P 2014 DNAmetabarcoding multiplexing and validation of data accuracy for diet assessmentapplication to omnivorous dietMolecular Ecology Resources 14306ndash323DOI 1011111755-099812188

Deagle BE Kirkwood R Jarman SN 2009 Analysis of Australian fur seal diet bypyrosequencing prey DNA in faecesMolecular Ecology 182022ndash2038DOI 101111j1365-294X200904158x

Dunning LT Savolainen V 2010 Broad-scale amplification of matK for DNA barcodingplants a technical note Botanical Journal of the Linnean Society 1641ndash9DOI 101111j1095-8339201001071x

Edgar RC 2010 Search and clustering orders of magnitude faster than BLAST Bioinfor-matics 262460ndash2461 DOI 101093bioinformaticsbtq461

Edgar RC 2013 UPARSE highly accurate OTU sequences from microbial ampliconreads Nature Methods 10996ndash998 DOI 101038NMETH2604

Elliott TL Jonathan Davies T 2014 Challenges to barcoding an entire floraMolecularEcology Resources 14883ndash891 DOI 1011111755-099812277

Fazekas AJ Burgess KS Kesanakurti PR Graham SW Newmaster SG Husband BCPercy DM Hajibabaei M Barrett SC 2008Multiple multilocus DNA barcodesfrom the plastid genome discriminate plant species equally well PLoS ONE 3e2802DOI 101371journalpone0002802

Fonseca VG Carvalho GR SungW Johnson HF Power DM Neill SP Packer MBlaxter ML Labmshead PJD ThomasWK Creer S 2010 Second-generationenvironmental sequencing unmasks marine metazoan biodiversity Nature Commu-nications 1Article 98 DOI 101038ncomms1095

Ford CS Ayres KL Haider N Toomey N Van-Alpen-Stohl J 2009 Selection ofcandidate DNA barcoding regions for use on land plants Botanical Journal of theLinnean Society 1591ndash11 DOI 101111j1095-8339200800938x

Fox AD Bergersen E Tombre IM Madsen J 2007Minimal intra-seasonal dietaryoverlap of barnacle and pink-footed geese on their breeding grounds in SvalbardPolar Biology 30759ndash768 DOI 101007s00300-006-0235-1

Fox AD Cao L Zhang Y Barter M ZhaoMMeng FWang S 2011 Declines in thetuber-feeding waterbird guild at Shengjin Lake National Nature Reserve Chinandashabarometer of submerged macrophyte collapse Aquatic Conservation Marine andFreshwater Ecosystems 2182ndash91 DOI 101002aqc1154

Graham SW Reeves PA Burns AC Olmstead RG 2000Microstructural changes innoncoding chloroplast DNA interpretation evolution and utility of indels andinversions in basal angiosperm phylogenetic inference International Journal of PlantSciences 161S83ndashS96 DOI 101086317583

HasebeM Omori T NakazawaM Sano T KatoM Iwatsuki K 1994 rbcL genesequences provide evidence for the evolutionary lineages of leptosporangiate fernsProceedings of the National Academy of Sciences of the United States of America915730ndash5734 DOI 101073pnas91125730

Yang et al (2016) PeerJ DOI 107717peerj2345 1721

Hibert F Taberlet P Chave J Scotti-Saintagne C Sabatier D Richard-Hansen C 2013Unveiling the diet of elusive rainforest herbivores in next generation sequencing eraThe tapir as a case study PLoS ONE 8e60799 DOI 101371journalpone0060799

Hilu KW Liang H 1997 ThematK gene sequence variation and application in plantsystematics American Journal of Botany 84830ndash839 DOI 1023072445819

Hollingsworth PM Graham SW Little DP 2011 Choosing and using a plant DNAbarcode PLoS ONE 6e19254 DOI 101371journalpone0019254

James HF Burney DA 1997 The diet and ecology of Hawaiirsquos extinct flightless water-fowl evidence from coprolites Biological Journal of the Linnean Society 62279ndash297DOI 101111j1095-83121997tb01627x

Kear J 2005Ducks geese and swans Oxford Oxford University PressKressWJ Erickson DL Jones FA Swenson NG Perez R Sanjur O Bermingham E

2009 Plant DNA barcodes and a community phylogeny of a tropical forest dynamicsplot in Panama Proceedings of the National Academy of Sciences of the United States ofAmerica 10618621ndash18626 DOI 101073pnas0909820106

Li X Yang Y Henry RJ Rossetto MWang Y Chen S 2015 Plant DNA barcoding fromgene to genome Biological Reviews 90157ndash166 DOI 101111brv12104

Ma Z Cai Y Li B Chen J 2010Managing wetland habitats for waterbirds an interna-tional perspectiveWetlands 3015ndash27 DOI 101007s13157-009-0001-6

Madsen J Mortensen CE 1987Habitat exploitation and interspecific competition ofmoulting geese in East Greenl Ibis 12925ndash44DOI 101111j1474-919X1987tb03157x

Noordhuis R Van der Molen DT Van den BergMS 2002 Response of herbivorouswater-birds to the return of Chara in Lake Veluwemeer The Netherlands AquaticBotany 72349ndash367 DOI 101016S0304-3770(01)00210-8

Parameswaran P Jalili R Tao L Shokralla S Gharizadeh B Ronaghi M Fire AZ 2007A pyrosequencing-tailored nucleotide barcode design unveils opportunities for large-scale sample multiplexing Nucleic Acids Research 35e130 DOI 101093nargkm760

Pintildeol J Mir G Gomez-Polo P Agustiacute N 2015 Universal and blocking primermismatches limit the use of high-throughput DNA sequencing for the quanti-tative metabarcoding of arthropodsMolecular Ecology Resources 15819ndash830DOI 1011111755-099812355

Pompanon F Deagle BE SymondsonWO Brown DS Jarman SN Taberlet P 2012Who is eating what diet assessment using next generation sequencingMolecularEcology 211931ndash1950 DOI 101111j1365-294X201105403x

Prop J Black JM Shimmings P 2003 Travel schedules to the high arctic barnaclegeese trademdashoff the timing of migration with accumulation of fat deposits Oikos103403ndash414 DOI 101034j1600-0706200312042x

Rayeacute G Miquel C Coissac E Redjadj C Loison A Taberlet P 2011 New insightson diet variability revealed by DNA barcoding and high-throughput sequenc-ing chamois diet in autumn as a case study Ecological Research 26265ndash276DOI 101007s11284-010-0780-5

Yang et al (2016) PeerJ DOI 107717peerj2345 1821

Reynolds C Miranda NA Cumming GS 2015 The role of waterbirds in the dispersalof aquatic alien and invasive species Diversity and Distributions 21744ndash754DOI 101111ddi12334

Samelius G Alisauskas RT 1999 Diet and growth of glaucous gulls at a large Arcticgoose colony Canadian Journal of Zoology 771327ndash1331 DOI 101139z99-091

Sang T Crawford DJ Stuessy TF 1997 Chloroplast DNA phylogeny reticulateevolution and biogeography of Paeonia (Paeoniaceae) American Journal of Botany841120ndash1136

ShehzadW Riaz T NawazMAMiquel C Poillot C Shah SA Pompanon F CoissacE Taberlet P 2012 Carnivore diet analysis based on next-generation sequencingapplication to the leopard cat (Prionailurus bengalensis) in PakistanMolecularEcology 211951ndash1965 DOI 101111j1365-294X201105424x

Shokralla S Gibson JF Nikbakht H Janzen DH HallwachsW Hajibabaei M 2014Next-generation DNA barcoding using next-generation sequencing to enhance andaccelerate DNA barcode capture from single specimensMolecular Ecology Resources14892ndash901 DOI 1011111755-099812236

Shokralla S Spall JL Gibson JF 2012 Next-generation sequencing technologies forenvironmental DNA researchMolecular Ecology 211794ndash1805DOI 101111j1365-294X201205538x

Srivathsan A Sha J Vogler AP Meier R 2015 Comparing the effectiveness of metage-nomics and metabarcoding for diet analysis of a leaf-feeding monkey (Pygathrixnemaeus)Molecular Ecology Resources 15250ndash261 DOI 1011111755-099812302

Starr JR Naczi RFC Chouinard BN 2009 Plant DNA barcodes and species res-olution in sedge (Carex Cyperaceae)Molecular Ecology Resources 9151ndash163DOI 101111j1755-0998200902640x

Sun C Zhao Y Li H Dong Y MacIsaac HJ Zhan A 2015 Unreliable quantificationof species abundance based on high-throughput sequencing data of zooplanktoncommunities Aquatic Biology 249ndash15 DOI 103354ab00629

Swennen C Yu YT 2005 Food and feeding behavior of the black-faced SpoonbillWaterbirds 2819ndash27 DOI 1016751524-4695(2005)028[0019FAFBOT]20CO2

SymondsonWOC 2002Molecular identification of prey in predator dietsMolecularEcology 11627ndash641 DOI 101046j1365-294X200201471x

Taberlet P Coissac E Pompanon F Brochmann CWillerslev E 2012 Towards next-generation biodiversity assessment using DNA metabarcodingMolecular Ecology212045ndash2050 DOI 101111j1365-294X201205470x

Taberlet P Coissac E Pompanon F Gielly L Miquel C Valentini A Vermat TCorthier G Brocmann CWillerslev E 2007 Power and limitations of the chloro-plast trnL (UAA) intron for plant DNA barcoding Nucleic Acids Research 35e14DOI 101093nargkl938

Taberlet P Gielly L Pautou G Bouvet J 1991 Universal primers for amplifica-tion of three non-coding regions of chloroplast DNA Plant Molecular Biology171105ndash1109 DOI 101007BF00037152

Yang et al (2016) PeerJ DOI 107717peerj2345 1921

Tamura K Stecher G Peterson D Filipski A Kumar S 2013MEGA6 molecular evolu-tionary genetics analysis version 60MolecularBiology and Evolution 302725ndash2729DOI 101093molbevmst197

Tate JA Simpson BB 2003 Paraphyly of Tarasa (Malvaceae) and diverse origins of thepolyploidy species Systematic Botany 28723ndash737 DOI 10104302-641

Valentini A Pompanon F Taberlet P 2009 DNA barcoding for ecologists Trends inEcology and Evolution 24110ndash117 DOI 101016jtree200809011

Wang X Fox AD Cong P Barter M Cao L 2012 Changes in the distribution and abun-dance of wintering Lesser White-fronted Geese Anser erythropus in eastern ChinaBird Conservation International 22128ndash134 DOI 101017S095927091100030X

Wang X Fox AD Cong P Cao L 2013 Food constraints explain the restricted dis-tribution of wintering Lesser white-fronted Geese Anser erythropus in China Ibis155576ndash592 DOI 101111ibi12039

Whitlock BA Hale AM Groff PA 2010 Intraspecific inversions pose a challenge for thetrnH-psbA plant DNA barcode PLoS ONE 5e11533DOI 101371journalpone0011533

Wolfe KH LiWH Sharp PM 1987 Rates of nucleotide substitution vary greatlyamong plant mitochondrial chloroplast and nuclear DNAs Proceedings ofthe National Academy of Sciences of the United States of America 849054ndash9058DOI 101073pnas84249054

Xu C DongW Shi S Cheng T Li C Liu YWu PWuH Gao P Zhou S 2015 Acceler-ating plant DNA barcode reference library construction using herbarium specimensimproved experimental techniquesMolecular Ecology Resources 151366ndash1374DOI 1011111755-099812413

Xu L XuW Sun Q Zhou Z Shen J Zhao X 2008 Flora and vegetation in ShengjinLake Journal of Wuhan Botanical Research 27264ndash270

Zhan A Bailey SA Heath DDMacIsaac HJ 2014 Performance comparison of geneticmarkers for high-throughput sequencing-based biodiversity assessment in complexcommunitiesMolecular Ecology Resources 141049ndash1059DOI 1011111755-099812254

Zhan A MacIsaac HJ 2015 Rare biosphere exploration using high-throughput sequenc-ing research progress and perspectives Conservation Genetics 16513ndash522DOI 101007s10592-014-0678-9

Zhang Y Cao L Barter M Fox AD ZhaoMMeng F Shi H 2011 Changing distributionand abundance of Swan Goose Anser cygnoides in the Yangtze River floodplainthe likely loss of a very important wintering site Bird Conservation International2136ndash48 DOI 101017S0959270910000201

ZhaoM Cao L Fox AD 2013 Distribution and diet of wintering Tundra Bean GeeseAnser fabalis serrirostris at Shengjin LakeYangtze River floodplain ChinaWildfow6052ndash63

ZhaoM Cao L KlaassenM Zhang Y Fox AD 2015 Avoiding competition Site usediet and foraging behaviours in two similarly sized geese wintering in China Ardea10327ndash38 DOI 105253ardev103i1a3

Yang et al (2016) PeerJ DOI 107717peerj2345 2021

ZhaoM Cong P Barter M Fox AD Cao L 2012 The changing abundanceand distribu-tion of Greater White-fronted Geese Anser albifrons in the Yangtze River floodplainimpacts of recenthydrological changes Bird Conservation International 22135ndash143DOI 101017S0959270911000542

Yang et al (2016) PeerJ DOI 107717peerj2345 2121

Page 18: Selection of a marker gene to construct and the application ...We concluded that DNA metabarcoding provides new perspectives for studies of herbivorous waterbird diets and inter-specific

Hibert F Taberlet P Chave J Scotti-Saintagne C Sabatier D Richard-Hansen C 2013Unveiling the diet of elusive rainforest herbivores in next generation sequencing eraThe tapir as a case study PLoS ONE 8e60799 DOI 101371journalpone0060799

Hilu KW Liang H 1997 ThematK gene sequence variation and application in plantsystematics American Journal of Botany 84830ndash839 DOI 1023072445819

Hollingsworth PM Graham SW Little DP 2011 Choosing and using a plant DNAbarcode PLoS ONE 6e19254 DOI 101371journalpone0019254

James HF Burney DA 1997 The diet and ecology of Hawaiirsquos extinct flightless water-fowl evidence from coprolites Biological Journal of the Linnean Society 62279ndash297DOI 101111j1095-83121997tb01627x

Kear J 2005Ducks geese and swans Oxford Oxford University PressKressWJ Erickson DL Jones FA Swenson NG Perez R Sanjur O Bermingham E

2009 Plant DNA barcodes and a community phylogeny of a tropical forest dynamicsplot in Panama Proceedings of the National Academy of Sciences of the United States ofAmerica 10618621ndash18626 DOI 101073pnas0909820106

Li X Yang Y Henry RJ Rossetto MWang Y Chen S 2015 Plant DNA barcoding fromgene to genome Biological Reviews 90157ndash166 DOI 101111brv12104

Ma Z Cai Y Li B Chen J 2010Managing wetland habitats for waterbirds an interna-tional perspectiveWetlands 3015ndash27 DOI 101007s13157-009-0001-6

Madsen J Mortensen CE 1987Habitat exploitation and interspecific competition ofmoulting geese in East Greenl Ibis 12925ndash44DOI 101111j1474-919X1987tb03157x

Noordhuis R Van der Molen DT Van den BergMS 2002 Response of herbivorouswater-birds to the return of Chara in Lake Veluwemeer The Netherlands AquaticBotany 72349ndash367 DOI 101016S0304-3770(01)00210-8

Parameswaran P Jalili R Tao L Shokralla S Gharizadeh B Ronaghi M Fire AZ 2007A pyrosequencing-tailored nucleotide barcode design unveils opportunities for large-scale sample multiplexing Nucleic Acids Research 35e130 DOI 101093nargkm760

Pintildeol J Mir G Gomez-Polo P Agustiacute N 2015 Universal and blocking primermismatches limit the use of high-throughput DNA sequencing for the quanti-tative metabarcoding of arthropodsMolecular Ecology Resources 15819ndash830DOI 1011111755-099812355

Pompanon F Deagle BE SymondsonWO Brown DS Jarman SN Taberlet P 2012Who is eating what diet assessment using next generation sequencingMolecularEcology 211931ndash1950 DOI 101111j1365-294X201105403x

Prop J Black JM Shimmings P 2003 Travel schedules to the high arctic barnaclegeese trademdashoff the timing of migration with accumulation of fat deposits Oikos103403ndash414 DOI 101034j1600-0706200312042x

Rayeacute G Miquel C Coissac E Redjadj C Loison A Taberlet P 2011 New insightson diet variability revealed by DNA barcoding and high-throughput sequenc-ing chamois diet in autumn as a case study Ecological Research 26265ndash276DOI 101007s11284-010-0780-5

Yang et al (2016) PeerJ DOI 107717peerj2345 1821

Reynolds C Miranda NA Cumming GS 2015 The role of waterbirds in the dispersalof aquatic alien and invasive species Diversity and Distributions 21744ndash754DOI 101111ddi12334

Samelius G Alisauskas RT 1999 Diet and growth of glaucous gulls at a large Arcticgoose colony Canadian Journal of Zoology 771327ndash1331 DOI 101139z99-091

Sang T Crawford DJ Stuessy TF 1997 Chloroplast DNA phylogeny reticulateevolution and biogeography of Paeonia (Paeoniaceae) American Journal of Botany841120ndash1136

ShehzadW Riaz T NawazMAMiquel C Poillot C Shah SA Pompanon F CoissacE Taberlet P 2012 Carnivore diet analysis based on next-generation sequencingapplication to the leopard cat (Prionailurus bengalensis) in PakistanMolecularEcology 211951ndash1965 DOI 101111j1365-294X201105424x

Shokralla S Gibson JF Nikbakht H Janzen DH HallwachsW Hajibabaei M 2014Next-generation DNA barcoding using next-generation sequencing to enhance andaccelerate DNA barcode capture from single specimensMolecular Ecology Resources14892ndash901 DOI 1011111755-099812236

Shokralla S Spall JL Gibson JF 2012 Next-generation sequencing technologies forenvironmental DNA researchMolecular Ecology 211794ndash1805DOI 101111j1365-294X201205538x

Srivathsan A Sha J Vogler AP Meier R 2015 Comparing the effectiveness of metage-nomics and metabarcoding for diet analysis of a leaf-feeding monkey (Pygathrixnemaeus)Molecular Ecology Resources 15250ndash261 DOI 1011111755-099812302

Starr JR Naczi RFC Chouinard BN 2009 Plant DNA barcodes and species res-olution in sedge (Carex Cyperaceae)Molecular Ecology Resources 9151ndash163DOI 101111j1755-0998200902640x

Sun C Zhao Y Li H Dong Y MacIsaac HJ Zhan A 2015 Unreliable quantificationof species abundance based on high-throughput sequencing data of zooplanktoncommunities Aquatic Biology 249ndash15 DOI 103354ab00629

Swennen C Yu YT 2005 Food and feeding behavior of the black-faced SpoonbillWaterbirds 2819ndash27 DOI 1016751524-4695(2005)028[0019FAFBOT]20CO2

SymondsonWOC 2002Molecular identification of prey in predator dietsMolecularEcology 11627ndash641 DOI 101046j1365-294X200201471x

Taberlet P Coissac E Pompanon F Brochmann CWillerslev E 2012 Towards next-generation biodiversity assessment using DNA metabarcodingMolecular Ecology212045ndash2050 DOI 101111j1365-294X201205470x

Taberlet P Coissac E Pompanon F Gielly L Miquel C Valentini A Vermat TCorthier G Brocmann CWillerslev E 2007 Power and limitations of the chloro-plast trnL (UAA) intron for plant DNA barcoding Nucleic Acids Research 35e14DOI 101093nargkl938

Taberlet P Gielly L Pautou G Bouvet J 1991 Universal primers for amplifica-tion of three non-coding regions of chloroplast DNA Plant Molecular Biology171105ndash1109 DOI 101007BF00037152

Yang et al (2016) PeerJ DOI 107717peerj2345 1921

Tamura K Stecher G Peterson D Filipski A Kumar S 2013MEGA6 molecular evolu-tionary genetics analysis version 60MolecularBiology and Evolution 302725ndash2729DOI 101093molbevmst197

Tate JA Simpson BB 2003 Paraphyly of Tarasa (Malvaceae) and diverse origins of thepolyploidy species Systematic Botany 28723ndash737 DOI 10104302-641

Valentini A Pompanon F Taberlet P 2009 DNA barcoding for ecologists Trends inEcology and Evolution 24110ndash117 DOI 101016jtree200809011

Wang X Fox AD Cong P Barter M Cao L 2012 Changes in the distribution and abun-dance of wintering Lesser White-fronted Geese Anser erythropus in eastern ChinaBird Conservation International 22128ndash134 DOI 101017S095927091100030X

Wang X Fox AD Cong P Cao L 2013 Food constraints explain the restricted dis-tribution of wintering Lesser white-fronted Geese Anser erythropus in China Ibis155576ndash592 DOI 101111ibi12039

Whitlock BA Hale AM Groff PA 2010 Intraspecific inversions pose a challenge for thetrnH-psbA plant DNA barcode PLoS ONE 5e11533DOI 101371journalpone0011533

Wolfe KH LiWH Sharp PM 1987 Rates of nucleotide substitution vary greatlyamong plant mitochondrial chloroplast and nuclear DNAs Proceedings ofthe National Academy of Sciences of the United States of America 849054ndash9058DOI 101073pnas84249054

Xu C DongW Shi S Cheng T Li C Liu YWu PWuH Gao P Zhou S 2015 Acceler-ating plant DNA barcode reference library construction using herbarium specimensimproved experimental techniquesMolecular Ecology Resources 151366ndash1374DOI 1011111755-099812413

Xu L XuW Sun Q Zhou Z Shen J Zhao X 2008 Flora and vegetation in ShengjinLake Journal of Wuhan Botanical Research 27264ndash270

Zhan A Bailey SA Heath DDMacIsaac HJ 2014 Performance comparison of geneticmarkers for high-throughput sequencing-based biodiversity assessment in complexcommunitiesMolecular Ecology Resources 141049ndash1059DOI 1011111755-099812254

Zhan A MacIsaac HJ 2015 Rare biosphere exploration using high-throughput sequenc-ing research progress and perspectives Conservation Genetics 16513ndash522DOI 101007s10592-014-0678-9

Zhang Y Cao L Barter M Fox AD ZhaoMMeng F Shi H 2011 Changing distributionand abundance of Swan Goose Anser cygnoides in the Yangtze River floodplainthe likely loss of a very important wintering site Bird Conservation International2136ndash48 DOI 101017S0959270910000201

ZhaoM Cao L Fox AD 2013 Distribution and diet of wintering Tundra Bean GeeseAnser fabalis serrirostris at Shengjin LakeYangtze River floodplain ChinaWildfow6052ndash63

ZhaoM Cao L KlaassenM Zhang Y Fox AD 2015 Avoiding competition Site usediet and foraging behaviours in two similarly sized geese wintering in China Ardea10327ndash38 DOI 105253ardev103i1a3

Yang et al (2016) PeerJ DOI 107717peerj2345 2021

ZhaoM Cong P Barter M Fox AD Cao L 2012 The changing abundanceand distribu-tion of Greater White-fronted Geese Anser albifrons in the Yangtze River floodplainimpacts of recenthydrological changes Bird Conservation International 22135ndash143DOI 101017S0959270911000542

Yang et al (2016) PeerJ DOI 107717peerj2345 2121

Page 19: Selection of a marker gene to construct and the application ...We concluded that DNA metabarcoding provides new perspectives for studies of herbivorous waterbird diets and inter-specific

Reynolds C Miranda NA Cumming GS 2015 The role of waterbirds in the dispersalof aquatic alien and invasive species Diversity and Distributions 21744ndash754DOI 101111ddi12334

Samelius G Alisauskas RT 1999 Diet and growth of glaucous gulls at a large Arcticgoose colony Canadian Journal of Zoology 771327ndash1331 DOI 101139z99-091

Sang T Crawford DJ Stuessy TF 1997 Chloroplast DNA phylogeny reticulateevolution and biogeography of Paeonia (Paeoniaceae) American Journal of Botany841120ndash1136

ShehzadW Riaz T NawazMAMiquel C Poillot C Shah SA Pompanon F CoissacE Taberlet P 2012 Carnivore diet analysis based on next-generation sequencingapplication to the leopard cat (Prionailurus bengalensis) in PakistanMolecularEcology 211951ndash1965 DOI 101111j1365-294X201105424x

Shokralla S Gibson JF Nikbakht H Janzen DH HallwachsW Hajibabaei M 2014Next-generation DNA barcoding using next-generation sequencing to enhance andaccelerate DNA barcode capture from single specimensMolecular Ecology Resources14892ndash901 DOI 1011111755-099812236

Shokralla S Spall JL Gibson JF 2012 Next-generation sequencing technologies forenvironmental DNA researchMolecular Ecology 211794ndash1805DOI 101111j1365-294X201205538x

Srivathsan A Sha J Vogler AP Meier R 2015 Comparing the effectiveness of metage-nomics and metabarcoding for diet analysis of a leaf-feeding monkey (Pygathrixnemaeus)Molecular Ecology Resources 15250ndash261 DOI 1011111755-099812302

Starr JR Naczi RFC Chouinard BN 2009 Plant DNA barcodes and species res-olution in sedge (Carex Cyperaceae)Molecular Ecology Resources 9151ndash163DOI 101111j1755-0998200902640x

Sun C Zhao Y Li H Dong Y MacIsaac HJ Zhan A 2015 Unreliable quantificationof species abundance based on high-throughput sequencing data of zooplanktoncommunities Aquatic Biology 249ndash15 DOI 103354ab00629

Swennen C Yu YT 2005 Food and feeding behavior of the black-faced SpoonbillWaterbirds 2819ndash27 DOI 1016751524-4695(2005)028[0019FAFBOT]20CO2

SymondsonWOC 2002Molecular identification of prey in predator dietsMolecularEcology 11627ndash641 DOI 101046j1365-294X200201471x

Taberlet P Coissac E Pompanon F Brochmann CWillerslev E 2012 Towards next-generation biodiversity assessment using DNA metabarcodingMolecular Ecology212045ndash2050 DOI 101111j1365-294X201205470x

Taberlet P Coissac E Pompanon F Gielly L Miquel C Valentini A Vermat TCorthier G Brocmann CWillerslev E 2007 Power and limitations of the chloro-plast trnL (UAA) intron for plant DNA barcoding Nucleic Acids Research 35e14DOI 101093nargkl938

Taberlet P Gielly L Pautou G Bouvet J 1991 Universal primers for amplifica-tion of three non-coding regions of chloroplast DNA Plant Molecular Biology171105ndash1109 DOI 101007BF00037152

Yang et al (2016) PeerJ DOI 107717peerj2345 1921

Tamura K Stecher G Peterson D Filipski A Kumar S 2013MEGA6 molecular evolu-tionary genetics analysis version 60MolecularBiology and Evolution 302725ndash2729DOI 101093molbevmst197

Tate JA Simpson BB 2003 Paraphyly of Tarasa (Malvaceae) and diverse origins of thepolyploidy species Systematic Botany 28723ndash737 DOI 10104302-641

Valentini A Pompanon F Taberlet P 2009 DNA barcoding for ecologists Trends inEcology and Evolution 24110ndash117 DOI 101016jtree200809011

Wang X Fox AD Cong P Barter M Cao L 2012 Changes in the distribution and abun-dance of wintering Lesser White-fronted Geese Anser erythropus in eastern ChinaBird Conservation International 22128ndash134 DOI 101017S095927091100030X

Wang X Fox AD Cong P Cao L 2013 Food constraints explain the restricted dis-tribution of wintering Lesser white-fronted Geese Anser erythropus in China Ibis155576ndash592 DOI 101111ibi12039

Whitlock BA Hale AM Groff PA 2010 Intraspecific inversions pose a challenge for thetrnH-psbA plant DNA barcode PLoS ONE 5e11533DOI 101371journalpone0011533

Wolfe KH LiWH Sharp PM 1987 Rates of nucleotide substitution vary greatlyamong plant mitochondrial chloroplast and nuclear DNAs Proceedings ofthe National Academy of Sciences of the United States of America 849054ndash9058DOI 101073pnas84249054

Xu C DongW Shi S Cheng T Li C Liu YWu PWuH Gao P Zhou S 2015 Acceler-ating plant DNA barcode reference library construction using herbarium specimensimproved experimental techniquesMolecular Ecology Resources 151366ndash1374DOI 1011111755-099812413

Xu L XuW Sun Q Zhou Z Shen J Zhao X 2008 Flora and vegetation in ShengjinLake Journal of Wuhan Botanical Research 27264ndash270

Zhan A Bailey SA Heath DDMacIsaac HJ 2014 Performance comparison of geneticmarkers for high-throughput sequencing-based biodiversity assessment in complexcommunitiesMolecular Ecology Resources 141049ndash1059DOI 1011111755-099812254

Zhan A MacIsaac HJ 2015 Rare biosphere exploration using high-throughput sequenc-ing research progress and perspectives Conservation Genetics 16513ndash522DOI 101007s10592-014-0678-9

Zhang Y Cao L Barter M Fox AD ZhaoMMeng F Shi H 2011 Changing distributionand abundance of Swan Goose Anser cygnoides in the Yangtze River floodplainthe likely loss of a very important wintering site Bird Conservation International2136ndash48 DOI 101017S0959270910000201

ZhaoM Cao L Fox AD 2013 Distribution and diet of wintering Tundra Bean GeeseAnser fabalis serrirostris at Shengjin LakeYangtze River floodplain ChinaWildfow6052ndash63

ZhaoM Cao L KlaassenM Zhang Y Fox AD 2015 Avoiding competition Site usediet and foraging behaviours in two similarly sized geese wintering in China Ardea10327ndash38 DOI 105253ardev103i1a3

Yang et al (2016) PeerJ DOI 107717peerj2345 2021

ZhaoM Cong P Barter M Fox AD Cao L 2012 The changing abundanceand distribu-tion of Greater White-fronted Geese Anser albifrons in the Yangtze River floodplainimpacts of recenthydrological changes Bird Conservation International 22135ndash143DOI 101017S0959270911000542

Yang et al (2016) PeerJ DOI 107717peerj2345 2121

Page 20: Selection of a marker gene to construct and the application ...We concluded that DNA metabarcoding provides new perspectives for studies of herbivorous waterbird diets and inter-specific

Tamura K Stecher G Peterson D Filipski A Kumar S 2013MEGA6 molecular evolu-tionary genetics analysis version 60MolecularBiology and Evolution 302725ndash2729DOI 101093molbevmst197

Tate JA Simpson BB 2003 Paraphyly of Tarasa (Malvaceae) and diverse origins of thepolyploidy species Systematic Botany 28723ndash737 DOI 10104302-641

Valentini A Pompanon F Taberlet P 2009 DNA barcoding for ecologists Trends inEcology and Evolution 24110ndash117 DOI 101016jtree200809011

Wang X Fox AD Cong P Barter M Cao L 2012 Changes in the distribution and abun-dance of wintering Lesser White-fronted Geese Anser erythropus in eastern ChinaBird Conservation International 22128ndash134 DOI 101017S095927091100030X

Wang X Fox AD Cong P Cao L 2013 Food constraints explain the restricted dis-tribution of wintering Lesser white-fronted Geese Anser erythropus in China Ibis155576ndash592 DOI 101111ibi12039

Whitlock BA Hale AM Groff PA 2010 Intraspecific inversions pose a challenge for thetrnH-psbA plant DNA barcode PLoS ONE 5e11533DOI 101371journalpone0011533

Wolfe KH LiWH Sharp PM 1987 Rates of nucleotide substitution vary greatlyamong plant mitochondrial chloroplast and nuclear DNAs Proceedings ofthe National Academy of Sciences of the United States of America 849054ndash9058DOI 101073pnas84249054

Xu C DongW Shi S Cheng T Li C Liu YWu PWuH Gao P Zhou S 2015 Acceler-ating plant DNA barcode reference library construction using herbarium specimensimproved experimental techniquesMolecular Ecology Resources 151366ndash1374DOI 1011111755-099812413

Xu L XuW Sun Q Zhou Z Shen J Zhao X 2008 Flora and vegetation in ShengjinLake Journal of Wuhan Botanical Research 27264ndash270

Zhan A Bailey SA Heath DDMacIsaac HJ 2014 Performance comparison of geneticmarkers for high-throughput sequencing-based biodiversity assessment in complexcommunitiesMolecular Ecology Resources 141049ndash1059DOI 1011111755-099812254

Zhan A MacIsaac HJ 2015 Rare biosphere exploration using high-throughput sequenc-ing research progress and perspectives Conservation Genetics 16513ndash522DOI 101007s10592-014-0678-9

Zhang Y Cao L Barter M Fox AD ZhaoMMeng F Shi H 2011 Changing distributionand abundance of Swan Goose Anser cygnoides in the Yangtze River floodplainthe likely loss of a very important wintering site Bird Conservation International2136ndash48 DOI 101017S0959270910000201

ZhaoM Cao L Fox AD 2013 Distribution and diet of wintering Tundra Bean GeeseAnser fabalis serrirostris at Shengjin LakeYangtze River floodplain ChinaWildfow6052ndash63

ZhaoM Cao L KlaassenM Zhang Y Fox AD 2015 Avoiding competition Site usediet and foraging behaviours in two similarly sized geese wintering in China Ardea10327ndash38 DOI 105253ardev103i1a3

Yang et al (2016) PeerJ DOI 107717peerj2345 2021

ZhaoM Cong P Barter M Fox AD Cao L 2012 The changing abundanceand distribu-tion of Greater White-fronted Geese Anser albifrons in the Yangtze River floodplainimpacts of recenthydrological changes Bird Conservation International 22135ndash143DOI 101017S0959270911000542

Yang et al (2016) PeerJ DOI 107717peerj2345 2121

Page 21: Selection of a marker gene to construct and the application ...We concluded that DNA metabarcoding provides new perspectives for studies of herbivorous waterbird diets and inter-specific

ZhaoM Cong P Barter M Fox AD Cao L 2012 The changing abundanceand distribu-tion of Greater White-fronted Geese Anser albifrons in the Yangtze River floodplainimpacts of recenthydrological changes Bird Conservation International 22135ndash143DOI 101017S0959270911000542

Yang et al (2016) PeerJ DOI 107717peerj2345 2121