Nature Metrics - Environmental Monitoring

Transcript of Nature Metrics - Environmental Monitoring

Page 1: Nature Metrics - Environmental Monitoring
Page 2: Nature Metrics - Environmental Monitoring

Wildlife protection: weak institutions

Should we allocate resources to patrolling, community payments, and/or snare removal? If so, where? Are nature reserves effective?

Page 3: Nature Metrics - Environmental Monitoring
Page 4: Nature Metrics - Environmental Monitoring

iDNA: Collect the collectors

Page 5: Nature Metrics - Environmental Monitoring

21 of 25 leeches contained mammal DNA

Page 6: Nature Metrics - Environmental Monitoring
Page 7: Nature Metrics - Environmental Monitoring

Metabarcoding

Bulk DNA extraction per tube (expensive and difficult)

PCR amplification of 16S gene (16Smam + human blocker) (~89-90 bp amplicon for vertebrates)

Amplicon sequencing: 218 M reads; 1,598 OTUs (98% similarity) assigned to Vertebrata

Bulk samples of leeches (1 sample = 1 collecting session); ~40 leeches per tube with RNAlater; 468 samples from WWF

Page 8: Nature Metrics - Environmental Monitoring

Results of 16Smam primer set: 404 of 468 samples were successfully amplified and sequenced

Raw data: 218,733,600 reads
Denoise and merge: 163,584,969 reads
Quality control: 139,580,727 reads
Merge identical reads: 4,325,333 unique seqs
Remove chimeras: 4,321,587 unique non-chimera seqs
Cluster seqs at 98% (CROP): 2,028 OTUs
BLAST and remove non-target OTUs: 1,598 16S-vert OTUs
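The step counts above can be turned into retention fractions to see where material is lost. The following is a minimal sketch, not part of the original pipeline: the counts are copied from the slide, while the grouping into reads, unique sequences, and OTUs and the helper name `retention` are assumptions made for illustration.

```python
# Counts copied from the slide; grouping and labels are illustrative assumptions.
read_steps = [
    ("raw data", 218_733_600),
    ("denoise and merge", 163_584_969),
    ("quality control", 139_580_727),
]
unique_steps = [
    ("merge identical reads", 4_325_333),
    ("remove chimeras", 4_321_587),
]
otu_steps = [
    ("cluster at 98% (CROP)", 2_028),
    ("BLAST, remove non-target OTUs", 1_598),
]


def retention(steps):
    """Return (label, count, fraction of the first count in the group) per step."""
    first = steps[0][1]
    return [(label, n, n / first) for label, n in steps]


for group in (read_steps, unique_steps, otu_steps):
    for label, n, frac in retention(group):
        print(f"{label:32s} {n:>12,d} {frac:7.1%}")

# Share of the 468 samples that amplified and sequenced successfully.
sample_success = 404 / 468
print(f"samples amplified and sequenced: {sample_success:.1%}")
```

Fractions are computed within each group because reads, unique sequences, and OTUs are different units and are not directly comparable.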

Page 9: Nature Metrics - Environmental Monitoring

A wide range of mammals

Bovidae, Cervidae, Tragulidae, Suidae, Canidae, Mustelidae

Ursidae, Herpestidae, Felidae, Chiroptera, Cercopithecidae, Cricetidae

Hystricidae, Leporidae, Muridae, Sciuridae, Spalacidae, Tupaiidae

Page 10: Nature Metrics - Environmental Monitoring

Douglas Chesters

Taxonomic assignment via phylogenetic placement

Constrained to the topology from Bininda-Emonds et al. (2007) Nature (multi-gene phylogeny)

Page 11: Nature Metrics - Environmental Monitoring
Page 12: Nature Metrics - Environmental Monitoring
Page 13: Nature Metrics - Environmental Monitoring

Bos taurus, Bubalus bubalis, Muntiacus, Rusa unicolor, Tragulus kanchil, Capricornis sumatraensis/milneedwardsii

Sus scrofa, Martes flavigula, Arctonyx collaris, Ursus thibetanus, Herpestes javanicus, Catopuma temminckii

Prionailurus spp., Macaca spp., Nesolagus timminsi, Niviventer, Rhizomys pruinosus, Tupaia belangeri

Page 14: Nature Metrics - Environmental Monitoring

Rusa unicolor

Capricornis sumatraensis/milneedwardsii

Ursus thibetanus

Page 15: Nature Metrics - Environmental Monitoring

We are using leeches and 16S rRNA to search for the saola antelope

Page 16: Nature Metrics - Environmental Monitoring

We have a very low-confidence detection of Saola (Pseudoryx nghetinhensis)

289 reads in Hue Sao La Reserve; 2 reads in Quangnam Sao La National Reserve

[Figure: alignment of reference sequence and query OTU sequence]

Page 17: Nature Metrics - Environmental Monitoring

Carnivora

Page 18: Nature Metrics - Environmental Monitoring
Page 19: Nature Metrics - Environmental Monitoring

Is this nature reserve effective at protecting wildlife?

How to demonstrate effectiveness to the government?

Page 20: Nature Metrics - Environmental Monitoring
Page 21: Nature Metrics - Environmental Monitoring

Using leeches to create a wildlife performance indicator for nature reserves

• 7,520 leeches in 141 sample locations (total area 678 km²)
• mammals, birds, amphibians, reptiles
• repeated samples per ranger for occupancy correction
• environmental data on topography, distance to human settlements
• yearly sampling by rangers to detect contractions and expansions in species ranges
• how to design ranger incentives?
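To illustrate why repeated samples matter for occupancy correction, here is a minimal sketch. It is not the statistical model used in the study, which the slides do not specify; it only shows the standard relationship between a per-replicate detection probability and the chance of detecting a species at least once. The function names and all numeric values are hypothetical.

```python
def p_detect_at_least_once(p: float, k: int) -> float:
    """Probability that a species present at a site is detected
    in at least one of k independent replicate samples."""
    return 1.0 - (1.0 - p) ** k


def corrected_occupancy(naive: float, p: float, k: int) -> float:
    """Crude correction: divide the naive occupancy (share of sites with
    at least one detection) by the cumulative detection probability.
    The result is capped at 1. A full occupancy model would estimate
    p and occupancy jointly; this is only an illustration."""
    return min(1.0, naive / p_detect_at_least_once(p, k))


# Hypothetical values, not taken from the study.
p, k, naive = 0.3, 3, 0.40
print(round(p_detect_at_least_once(p, k), 3))
print(round(corrected_occupancy(naive, p, k), 3))
```

With a low per-replicate detection probability, a single sample per location would understate occupancy considerably, which is the motivation for collecting replicates.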

Page 22: Nature Metrics - Environmental Monitoring

Challenge: To protect wildlife populations

Govts, NGOs, Park Mgmt

Park Mgmt, Rangers, Communities

Patrolling, Snare removal, Payments for Ecological Services

iDNA + a lot of statistics

Informative signal, and cheaper than standard censuses

Not timely, & challenges for incentive design

Page 23: Nature Metrics - Environmental Monitoring

Bioinformatic challenges: Basic pipeline is pretty good

• PCR -> hybrid capture + shotgun?
• Amplicon pipeline improvements?
• Sequence wrangling with QIIME scripts
  • remove primers and indices, split libraries by indices
• Denoise with github.com/lh3/bfc, usearch8 (maxee=1)
• Pair-merge with usearch8
• Chimera removal with uchime
• Cluster with CROP (Hao et al. 2011. Bioinformatics 27:611–618.)
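The `maxee=1` setting refers to expected-error filtering: each base's Phred score Q implies an error probability of 10^(-Q/10), and a read is discarded when the sum over its bases exceeds the threshold. The sketch below shows that calculation only; it is not usearch itself, and it assumes Phred+33 quality encoding. The function names are hypothetical.

```python
def expected_errors(quality: str, offset: int = 33) -> float:
    """Sum of per-base error probabilities for a FASTQ quality string.
    Assumes Phred+33 encoding (offset=33), which is an assumption here."""
    return sum(10 ** (-(ord(ch) - offset) / 10) for ch in quality)


def passes_maxee(quality: str, maxee: float = 1.0) -> bool:
    """Keep a read only if its expected number of errors is at most maxee."""
    return expected_errors(quality) <= maxee


# 'I' encodes Q40 (error probability 0.0001); '+' encodes Q10 (0.1).
high_quality = "I" * 90
low_quality = "+" * 20
print(passes_maxee(high_quality), passes_maxee(low_quality))
```

For a ~90 bp amplicon, a threshold of 1 means a retained read is expected to contain at most about one erroneous base.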

Page 24: Nature Metrics - Environmental Monitoring

Bioinformatic challenges: Taxonomic assignment

• Sequence similarity
  • BLAST (+ MEGAN): Gallus gallus example
  • Naive Bayesian Classifier
  • HMMER?
  • Metagenomic binning programs?

• Sequence similarity + phylogenetic tree
  • SAP (Munch et al.): build local database for speed?
  • Geneious8 Sequence Classifier

• Phylogenetic placement on ML reference trees
  • RAxML/EPA
  • pplacer

• PROTAX
  • unpublished wrapper to call consensus taxonomy, takes into account known species without a sequence
  • SEPP? TIPP?
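As an illustration of the similarity-based route, the sketch below shows the core idea of a lowest-common-ancestor (LCA) consensus over the lineages of several database hits, the general approach that MEGAN applies to BLAST results. It is a simplified sketch, not MEGAN's implementation: score thresholds and hit filtering are omitted, and the function name and the example hit list are hypothetical.

```python
def lowest_common_ancestor(lineages):
    """Return the longest shared prefix of several taxonomic lineages,
    each given as a list ordered from highest to lowest rank."""
    if not lineages:
        return []
    common = []
    for ranks in zip(*lineages):
        if all(r == ranks[0] for r in ranks):
            common.append(ranks[0])
        else:
            break
    return common


# Hypothetical hits for one OTU: two deer genera within the same family.
hits = [
    ["Mammalia", "Artiodactyla", "Cervidae", "Rusa", "Rusa unicolor"],
    ["Mammalia", "Artiodactyla", "Cervidae", "Muntiacus"],
]
print(lowest_common_ancestor(hits))
```

When hits disagree below family level, the assignment is reported at family level rather than forced to a species, which is the conservative behaviour one would want for short amplicons.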

Page 25: Nature Metrics - Environmental Monitoring

100s to 1000s of wild bee species per country

Necessary for pollinating wild and crop plants

Page 26: Nature Metrics - Environmental Monitoring
Page 27: Nature Metrics - Environmental Monitoring
Page 28: Nature Metrics - Environmental Monitoring

200 sites needed to have a 93% probability of detecting a 2% annual decline in species richness and abundance.

3,120 bees × 200 sites × 2 years (years 1 & 5) = 1.25 million bees to be identified to species

Conservation Biology, 27, 113–120 (2012)

Page 29: Nature Metrics - Environmental Monitoring

Conservation Biology, 27, 113–120 (2012)

1.25 million bees identified to species

~£1.33 M @ < 2 minutes per bee, if ID’d visually by experts

Would this dataset hold up in court?
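The monitoring arithmetic on these slides can be checked with a short calculation. This is a sketch under stated assumptions: the figure of 3,120 bees is read as bees per site per survey year, 2 minutes per bee is treated as an upper bound because the slide says "< 2 minutes", and the variable names are illustrative.

```python
# Figures from the slides; the interpretation of each is an assumption.
bees_per_site_year = 3120   # read as bees per site per survey year
sites = 200
survey_years = 2            # years 1 and 5

total_bees = bees_per_site_year * sites * survey_years

minutes_per_bee = 2         # upper bound: the slide says "< 2 minutes per bee"
expert_hours = total_bees * minutes_per_bee / 60

quoted_cost_gbp = 1_330_000  # "~£1.33 M" on the slide
cost_per_bee = quoted_cost_gbp / total_bees

print(f"total bees: {total_bees:,}")
print(f"expert hours at 2 min per bee: {expert_hours:,.0f}")
print(f"implied cost per bee: £{cost_per_bee:.2f}")
```

The product is 1,248,000, which the slides round to 1.25 million.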

Page 30: Nature Metrics - Environmental Monitoring

[Figure: workflow in three steps, applied to bulk bee samples (Sample 1, Sample 2, ... Sample 10)]

1. Mitogenomic skimming (reference construction): DNA extraction, sequencing (HiSeq 2000), assembly (SOAPdenovo-Trans), mito-annotation, protein-coding-gene extraction, alignment (ClustalW2, MEGA6), and verification with reads (BWA). Output: 48 bee mito-references (mito-scaffolds #1, #2, ... #48).

2. Morphological identification: a taxonomist counts bees per species in each bulk sample (count matrix of bee spp. #1, #2, ... #33 by sample).

3. Mitogenomic resequencing: bulk DNA extraction and sequencing (HiSeq 2000), and mapping reads onto references (BWA). Output: biomasses per species and sample.

Page 31: Nature Metrics - Environmental Monitoring

Page 32: Nature Metrics - Environmental Monitoring

[Figure: frequencies (0.0 to 0.8) of true negatives and true positives against % coverage (0 to 100%); an axis in bp (0; 10,000; 16,000); 48 percentage labels ranging from 0.0% to 94.0%]

Species (48): Andrena angustior, Andrena bicolor, Andrena chrysosceles, Andrena cineraria, Andrena dorsata, Andrena flavipes, Andrena fulvago, Andrena haemorrhoa, Andrena labiata, Andrena minutula, Andrena nigroaenea, Andrena nitida, Andrena semilaevis, Andrena subopaca, Apis mellifera, Bombus hortorum, Bombus lapidarius, Bombus lucorum, Bombus pascuorum, Bombus pratorum, Bombus sylvestris, Bombus terrestris, Halictus rubicundus, Halictus tumulorum, Hylaeus confusus, Hylaeus dilatatus, Lasioglossum calceatum, Lasioglossum fulvicorne, Lasioglossum laevigatum, Lasioglossum lativentre, Lasioglossum leucopus, Lasioglossum leucozonium, Lasioglossum malachurum, Lasioglossum minutissimum, Lasioglossum morio, Lasioglossum parvulum, Lasioglossum pauxillum, Lasioglossum punctatissimum, Lasioglossum villosulum, Lasioglossum xanthopus, Nomada fabriciana, Nomada flava, Nomada flavoguttata, Nomada goodeniana, Nomada ruficornis, Osmia bicornis, Sphecodes ephippius, Sphecodes miniatus

Page 33: Nature Metrics - Environmental Monitoring

We can detect bee species in bulk samples (93.7% detection rate)

Panels: Bee count data (Morphological IDs); Mitogenomic read numbers per species and specimen (Mitogenomic resequencing)

Samples (columns in both panels): Codicote_4A_1, Codicote_6A_1, Codicote_8A_1, Collings_12A_1, Collings_6A_1, Crux_1A_2, Crux_6A_2, Crux_7A_3, Malham_2A_3, Tismans_3A_2

Bee species (values in source order; empty cells were lost in extraction, so values are not aligned to samples):

Andrena angustior: (no values)
Andrena bicolor: 2 1534
Andrena chrysosceles: 5 11077
Andrena cineraria: 1 2 5395 44216
Andrena dorsata: 1 7464
Andrena flavipes: 5 14895
Andrena fulvago: (no values)
Andrena haemorrhoa: 1 1 1 5 2830 7781 2648 35563
Andrena labiata: (no values)
Andrena minutula: 9 2 4 1371 242 1810
Andrena nigroaenea: 3 1 4 2 2 21272 1000 14213 19369 7184
Andrena nitida: 1 3 3 2 6511 18994 33721 51098
Andrena semilaevis: 1 2337
Andrena subopaca: 2 1
Apis mellifera: 1 1 1 1 385 10652 733 4609
Bombus hortorum: 1 2005
Bombus lapidarius: 1 44769
Bombus lucorum: 3002 8609
Bombus pascuorum: 1 1439
Bombus pratorum: 1 9657
Bombus sylvestris: (no values)
Bombus terrestris: 2 2 4 8377 1078 5869
Halictus rubicundus: 2 3634
Halictus tumulorum: 1 1520
Hylaeus confusus: (no values)
Hylaeus dilatatus: (no values)
Lasioglossum calceatum: 1 10 10 2 13 15 5 5882 6522 290 3670 12161 1458
Lasioglossum fulvicorne: (no values)
Lasioglossum laevigatum: (no values)
Lasioglossum lativentre: (no values)
Lasioglossum leucopus: 2 2 238 245
Lasioglossum malachurum: 10 4 867 4866 5185
Lasioglossum minutissimum: 4 399 220
Lasioglossum morio: (no values)
Lasioglossum parvulum: 14 6 1293 184 1732
Lasioglossum pauxillum: 5 2 3 1007 2217 852
Lasioglossum punctatissimum: 1
Lasioglossum villosulum: (no values)
Lasioglossum xanthopus: (no values)
Lasioglossum zonulum: 1 160 181
Nomada fabriciana: 1 222
Nomada flava: (no values)
Nomada flavoguttata: (no values)
Nomada goodeniana: 1 385
Nomada ruficornis: 1 513
Osmia bicornis: (no values)
Sphecodes ephippius: 3 1 1883 779
Sphecodes miniatus: 4 2139

11 bee species detected using morphology

10 bee species detected using mitogenomic resequencing

(we think we are correct)

Page 34: Nature Metrics - Environmental Monitoring

We can detect bee species in bulk samples (93.7% detection rate)

Morphological IDs | Mitogenomic resequencing

Bombus lucorum, Bombus terrestris (labelled in both panels)

Species-specific PCR confirmed Bombus lucorum in these two samples

Workers cannot be differentiated morphologically

Page 35: Nature Metrics - Environmental Monitoring

We can estimate bee biomasses (p=0.001, R²=24.9%), and from this, we should be able to follow population trajectories

[Figure: Biomass versus Read numbers. x-axis: Read numbers, z-transformed, mito-nuc ratio-corrected (−1 to 3). y-axis: Biomasses, z-transformed (−1 to 3).]
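The relationship between z-transformed read numbers and z-transformed biomasses can be summarised with a simple linear fit and its R². The sketch below shows the transformation and the R² calculation only. The mito-nuc ratio correction is omitted because the slides do not describe it, and the example values are hypothetical toy numbers, not data from the study.

```python
from statistics import mean, stdev


def z_transform(values):
    """Centre values on their mean and scale by the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def r_squared(x, y):
    """Coefficient of determination for a simple linear regression of y on x
    (equal to the squared Pearson correlation)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Hypothetical toy values for illustration only.
reads = [120, 850, 40, 2300, 600]
biomass = [0.8, 1.1, 0.3, 2.0, 1.9]

print(round(r_squared(z_transform(reads), z_transform(biomass)), 3))
```

R² is unchanged by z-transformation, so the transform mainly serves to put reads and biomasses on a comparable scale for plotting.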

Page 36: Nature Metrics - Environmental Monitoring

Challenge: To protect pollinator populations

UK Govt, NGOs, Citizens

Farmers, UK Govt

Insecticide bans, Nature reserves, Agri-environment funds, Consumer choice

mitogenomics + a lot of statistics

Timely, cheap, and informative signal

Robust to contamination, scalable, and auditable

[Figure: workflow in three steps, applied to bulk bee samples (Sample 1, Sample 2, ... Sample 10)]

1. Reference construction (mitogenome reference construction): DNA extraction; sequencing (HiSeq 2000); assembly (SOAPdenovo-Trans); mitochondrial protein-coding-gene annotation; assembly verification through alignment and read-mapping. Output: 48 bee mito-references (mito-scaffolds #1, #2, ... #48).

2. Morphological identification: count matrix of bee spp. (#1, #2, ... #33) by sample.

3. Mitogenomic resequencing: bulk DNA extraction; sequencing (HiSeq 2000); read-mapping (BWA). Output: biomasses per species and sample.

Page 37: Nature Metrics - Environmental Monitoring

Bioinformatic challenges

• Improve biomass frequency estimate?
  • standard DNA aliquot?

• Assembly of reference species
  • Remove adapter contamination with Trimmomatic
  • Denoise and pair-merge with bfc, usearch8 (maxee=1)
  • Assembly with IDBA, org.asm, DBG2OLC, ?
  • Annotation with MITOS webserver
  • 1 library per reference species: nuDNA

• Taxonomic/Read assignment
  • bwa
  • kraken etc.
  • what to do with nuDNA ref seqs, which will be non-orthologous?

• SNP calling (in bulk, so no phasing unless long-read sequencers)
  • ????

• Pollen metagenomics: reference genomes huge

[Figure: Biomass versus Read numbers. x-axis: Read numbers, z-transformed, mito-nuc ratio-corrected (−1 to 3). y-axis: Biomasses, z-transformed (−1 to 3).]