Genomic studies of speciation and gene flow
evomicsorg.wpengine.netdna-cdn.com/wp-content/...


Page 1:

Genomic studies of speciation and gene flow

Page 2:

Why study speciation genomics?

Find speciation genes

How do genomes diverge?

Long-standing questions (role of geography/gene flow)

Page 3:

Genomic divergence during speciation

evolution.berkeley.edu

1. Speciation as a by-product of physical isolation

2. Speciation due to selection – without isolation

Page 4:

Genomic divergence during speciation

evolution.berkeley.edu

1. Speciation as a by-product of physical isolation

2. Speciation due to selection – without isolation

[Figure: Sd frequency (y-axis) against transect position in km (x-axis, 80 to 180)]

Cline theory - e.g. Barton and Gale 1993
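A cline like the one plotted on this slide is often summarised by a sigmoid curve. The sketch below assumes the standard tanh form used in cline theory, with width defined as the inverse of the maximum slope. The function name, the centre (130 km) and the width (20 km) are hypothetical values chosen for illustration, not estimates from the transect shown.

```python
import math

def cline_frequency(x_km, centre_km, width_km):
    """Expected allele frequency at position x_km for a sigmoid (tanh) cline.

    Width is defined as 1 / (maximum slope), so the slope at the centre
    equals 1 / width_km.
    """
    return 0.5 * (1.0 + math.tanh(2.0 * (x_km - centre_km) / width_km))

# Hypothetical parameters, for illustration only.
CENTRE_KM = 130.0
WIDTH_KM = 20.0

# Tabulate the expected frequency along a transect from 80 to 180 km.
for position in range(80, 181, 20):
    print(position, round(cline_frequency(position, CENTRE_KM, WIDTH_KM), 3))
```

A narrower width gives a steeper transition, which under cline theory points to stronger selection relative to dispersal.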

Page 5:

Genomic divergence during speciation

evolution.berkeley.edu Wu 2001, JEB

1. Speciation as a by-product of physical isolation

2. Speciation due to selection – without isolation

?

Time

Page 6:
Page 7:

Stage 1 - one or few loci under disruptive selection

[Figure: FST along the genome; label: gene under selection]

Feder, Egan and Nosil TiG

Page 8:

Stage 2 - Divergence hitchhiking

[Figure: FST along the genome]

Feder, Egan and Nosil TiG

Page 9:

Stage 2b - Inversion

[Figure: FST along the genome]

Feder, Egan and Nosil TiG

Inversion links co-adapted alleles

Page 10:

Stage 3 - Genome hitchhiking

Feder, Egan and Nosil TiG

[Figure: FST along the genome]

Page 11:

Stage 4 - Genome-wide isolation

Feder, Egan and Nosil TiG

[Figure: FST along the genome]

Page 12:

Some sub-species clearly in stage 1

• Wing pattern “races” of Heliconius melpomene

[Figure: allele-frequency clines against transect position in km (x-axis, 80 to 180); panels labelled b, D, N, Yb, Cr and Sd frequency; legend: 1986, 2011; species labels: Heliconius erato, Heliconius melpomene]

Page 13:

Some sub-species clearly in stage 1

S. H. Martin et al. Genome Res. 23, 1817–1828 (2013). O. Seehausen et al. Nat. Rev. Genet. 15, 176–92 (2014).

• Wing pattern “races” of Heliconius melpomene

B (red/orange patterns) Yb (yellow/white patterns)

Page 14:

Some sub-species clearly in stage 1

• Carrion and hooded crows

Poelstra, J. W. et al. Science 344, 1410–4 (2014).

FST

Page 15:

And an example with multiple islands?

Malinsky et al., Science 350, 1493 (2015).

Page 16:

Other species have islands…but are they real?

Ellegren, et al. Nature 491, 756- (2012).

Page 17:

Anopheles gambiae and A. coluzzii (formerly the M and S forms of A. gambiae)

Clarkson et al. 2014 Nature Communications

Other species have islands…but are they real?

Page 18:

Other species have islands…but are they real?

Page 19:

Seehausen et al., Nature Reviews Genetics, 2014

Page 20:

• Fst measures relative divergence

• Peaks indicate regions of higher-than-expected between-population divergence, given the within-population diversity

• Peaks can therefore result from reduced diversity within species

• This could be due to lower Ne within species (selective sweeps, background selection)

• So peaks NOT NECESSARILY due to reduced gene flow

What do patterns of Fst really mean?
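The argument in these bullets can be made concrete with a small calculation. The sketch below assumes one common formulation of relative divergence, Fst = (dxy - mean within-population diversity) / dxy; the function name and all numeric values are hypothetical and chosen only to illustrate the point.

```python
def fst_from_diversity(pi_pop1, pi_pop2, dxy):
    """Relative divergence: the share of between-population diversity (dxy)
    that is not accounted for by diversity within the two populations."""
    pi_within = (pi_pop1 + pi_pop2) / 2.0
    return (dxy - pi_within) / dxy

# Hypothetical per-site values for two regions with identical absolute divergence.
background = fst_from_diversity(pi_pop1=0.015, pi_pop2=0.015, dxy=0.020)
low_diversity = fst_from_diversity(pi_pop1=0.005, pi_pop2=0.005, dxy=0.020)

print(f"background region:    Fst = {background:.2f}")
print(f"low-diversity region: Fst = {low_diversity:.2f}")
```

Both regions share the same dxy, yet the region with reduced within-population diversity shows a much higher Fst. That is the sense in which a peak need not reflect reduced gene flow.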

Page 21:

Note that sometimes sweeps within species = speciation genes

Page 22:

Sweeps across the species barrier can also lead to Fst peaks

Double peaks??

Nicolas Bierne, Daniel Berner and others

Page 23:

Anopheles M-S divergence

Relative divergence higher in low recombination regions - not significant for absolute divergence

see also: Charlesworth 1998 MBE Measures of divergence… More recently see papers by Reto Burri

Page 24:
Page 25:

No evidence for higher Dxy in wing pattern loci

S. H. Martin et al. Genome Res. 23, 1817–1828 (2013). O. Seehausen et al. Nat. Rev. Genet. 15, 176–92 (2014).

• Wing pattern “races” of Heliconius melpomene

B (red/orange patterns) Yb (yellow/white patterns)

Page 26:

Suggestion that we use absolute measures of divergence?
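As an illustration of what an absolute measure involves, the sketch below computes dxy (mean pairwise differences between populations) and pi (mean pairwise differences within a population) from aligned haplotypes. The function names and the toy sequences are hypothetical; real analyses would also need to handle missing data and window the genome.

```python
from itertools import combinations, product

def pairwise_difference(seq_a, seq_b):
    """Proportion of aligned sites at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)

def dxy(pop1, pop2):
    """Absolute divergence: mean difference over all between-population pairs."""
    pairs = list(product(pop1, pop2))
    return sum(pairwise_difference(a, b) for a, b in pairs) / len(pairs)

def pi(pop):
    """Nucleotide diversity: mean difference over all within-population pairs."""
    pairs = list(combinations(pop, 2))
    return sum(pairwise_difference(a, b) for a, b in pairs) / len(pairs)

# Toy aligned haplotypes, for illustration only.
pop1 = ["AAAA", "AAAT"]
pop2 = ["AATT", "AATT"]

print("dxy:", dxy(pop1, pop2))
print("pi pop1:", pi(pop1), "pi pop2:", pi(pop2))
```

Unlike Fst, dxy does not depend on the diversity within either population, so a selective sweep inside one population does not by itself raise it.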

Page 27:

Understanding genomic divergence

No single statistic will capture the complex history of mutation, migration and selection

Patterns need to be interpreted in the specific context of the study species

Page 28:

Much better to use explicit tests for gene flow

Need to design sampling so the expectations in the absence of gene flow are clear and testable

The key is to identify ‘control’ populations that are not influenced by admixture

Page 29:

• Isolated DNA from bones 38,000 yrs old in Croatia

• We diverged from Neanderthals around 270,000-440,000 yrs ago

• Evidence for gene exchange with humans (1-4% of genome?)

Green et al., 328:710 Science 2010

Explicit tests for gene flow: Neanderthal genome

Page 30:

Explicit tests for gene flow: ABBA-BABA test

Green et al. 2010 Science 328:710-722

EXPECT: 50% ABBA 50% BABA

OBSERVE: 103612 ABBA 94029 BABA
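The imbalance between the two site patterns is usually summarised as the D statistic, D = (ABBA - BABA) / (ABBA + BABA), which is zero under the 50:50 expectation. A minimal sketch using the counts on this slide (the function name is a hypothetical helper):

```python
def d_statistic(n_abba, n_baba):
    """D statistic from counts of ABBA and BABA site patterns.

    D = 0 when both patterns are equally common; D > 0 indicates an excess of ABBA.
    """
    return (n_abba - n_baba) / (n_abba + n_baba)

# Counts taken from the slide above.
d = d_statistic(n_abba=103612, n_baba=94029)
print(f"D = {d:.4f}")
```

With these counts D is roughly 0.05. Because neighbouring sites are linked, significance is normally assessed with a block jackknife across the genome rather than by treating each site as an independent trial.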

Page 31:

Explicit tests for gene flow: ABBA-BABA test

Page 32:

Explicit tests for gene flow: ABBA-BABA test

Page 33:

The genomic landscape of Neanderthal ancestry in present-day humans - Sankararaman et al. Nature 2014

1) Derived alleles at high frequency shared with Neanderthal

2) High divergence to Africa but low to Neanderthal

3) Long haplotype blocks

Explicit tests for gene flow: Combining multiple signals

Page 34:

Sampled 10 complete high coverage genomes per population

Page 35:

Explicit tests for gene flow: Heliconius butterflies

Many sources of reproductive isolation:

• Female hybrids are sterile

• Different host plant use

• Different habitat preference

• Strong assortative mating

Page 36:

Using Simon Martin’s Twisst method to characterise relationships

Page 37:

[Figure: four-taxon tree with tips cyd, melG, melW and outgroup]

Martin et al., MBE 2015

Simon Martin

ABBA-BABA statistics

Page 38:
Page 39:

Figure 2. Four-taxon maximum-likelihood trees for 100 kb windows

(A) Trees were superimposed using Densitree (Bouckaert 2010). There were 2848 trees for the H. cydno – H. melpomene dataset (left) and 2453 for the H. timareta – H. melpomene dataset (right). Tree-lengths were equalized so that all trees could be superimposed, and then a random jitter was added to all branch lengths to show density. Trees supporting each of the four possible topologies are coloured accordingly: blue for the species tree, red for the geography tree, green for the control tree and black for unresolved trees. (B) The four topologies scored, along with the number and

Downloaded from genome.cshlp.org on September 18, 2013 - Published by Cold Spring Harbor Laboratory Press


PstI RAD sequencing (site every ~10kb)

Linkage maps built with Lep-MAP

300+ offspring for each cross type

John Davey

Davey et al., Evol Letters 2017

Page 40:

Pop gen vs actual estimates of recombination rate

Page 41:

Recombination rate strongly correlated with admixture proportion

Simon Martin

Page 42:

Short chromosomes have more admixture:

Page 43:

And chromosome ends have more admixture:

Admixture tree

Species tree

Page 44:

Burri et al., Genome Research 2015

Sequenced 20 individuals per population at 20x coverage

Page 45:
Page 46:
Page 47:

An alternative is to take an explicit modelling approach

IM and IMa Jody Hey

Page 48:

Martin et al., Biorxiv 2015

Page 49:
Page 50:
Page 51:
Page 52:

The effect of background selection on introgression in humans

Harris and Nielsen, Genetics 2016; Juric, Aeschbacher and Coop 2016

Admixture is lower in gene-rich regions, supporting this model…

Page 53:

Population and speciation genomics: Conclusions

• Great power to detect subtle signals of selection and gene flow

• Can make more general observations about genes and regions involved in adaptation

• BUT genomic processes complicate the picture

• Best approaches combine multiple signals to infer process

• Eventually we need to combine background selection, recombination, positive selection

Page 54:

And finally a plug….

Page 55:

Adaptive introgression

Page 56:
Page 57:
Page 58:

Photo credit Andrei Sourakov

Page 59:
Page 60:
Page 61:

43 species

77 species

Page 62:

Fritz Müller

Page 63:

G-test: G = 7.25, d.f. = 1, p = 0.007

[Figure: number of models attacked (y-axis, 0 to 30) for H. melpomene, H. cydno and F1 hybrid; label: Expected number]

Merrill et al., Proc. Roy. Soc 2012
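The reported p-value can be checked from the G statistic, which is compared against a chi-square distribution. The sketch below assumes the usual definition G = 2 Σ O ln(O/E) and uses the closed form of the chi-square tail for one degree of freedom. The function names are hypothetical helpers, and the slide does not give the underlying counts, so only the p-value step is reproduced here.

```python
import math

def g_statistic(observed, expected):
    """Likelihood-ratio (G) statistic: G = 2 * sum(O * ln(O / E)).

    Categories with an observed count of zero contribute nothing to the sum.
    """
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)

def p_value_one_df(g):
    """Upper-tail probability of a chi-square distribution with 1 degree of freedom.

    For one degree of freedom, P(X > g) = erfc(sqrt(g / 2)).
    """
    return math.erfc(math.sqrt(g / 2.0))

# G value reported on the slide; the p-value is recomputed from it.
print(round(p_value_one_df(7.25), 3))
```

The recomputed value rounds to 0.007, in line with the slide.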

Page 64:

102 204 76

Page 65:

van Belleghem et al., Nature Ecol Evol

Peaks of divergence correspond to wing pattern genes

Page 66:

Reed et al., 2011 Science

Page 67:

Wing pattern controlled almost entirely by large effect loci

• Yb - yellow patterns, Chromosome 15, cortex: cell cycle, related to fizzy (Nadeau et al., Nature 2016)

• D - red patterns, Chromosome 18, optix: transcription factor, paints red in pupal wing (Richard Wallbank)

• Ac - band shape, Chromosome 10, WntA: putative morphogen, defines black-fated regions in larval disc (Arnaud Martin & Bob Reed)

Expressed in larva and pupa

[Image labels: knockout, wild type]

Page 68:

[Figure: relative expression of hth, optix and cinnabar at Day 1, Day 3, Day 5 and Day 7 for samples labelled P and HD, in H. melpomene plesseni and H. melpomene malleti; panels A to E, with labels PATTERN, COLOUR, Heliconius and Drosophila]

Richard Wallbank

Page 69:

Adaptive introgression

Heliconius Genome Consortium Nature 2012

Page 70:

Phylogenies across B/D

ML tree based on 50,000 bp

Heliconius Genome Consortium Nature 2012

Page 71:

Phylogenies across B/D

ML tree based on 50,000 bp

Page 72:

Phylogenies across B/D

ML tree based on 50,000 bp

Page 73:

Phylogenies across B/D

ML tree based on 50,000 bp

Page 74:

Phylogenies across B/D

ML tree based on 50,000 bp

Page 75:

Okay, so introgression causes mimicry

But mimicry is weird, right?

Page 76:

Novelty can arise through introgression and recombination

Heliconius cydno cordula (NNbb)

Heliconius melpomene melpomene (nnBB)

Heliconius heurippa (NNBB)

Mavarez et al., Nature 2006

Camilo Salazar

Page 77:

Red Blue

optix 100kb

Wallbank et al., PLoS Biology 2016

Page 78:

Wallbank et al., PLoS Biology 2016

Page 79:

Generate dated trees using this node as a reference point

Wallbank et al., PLoS Biology 2016

Page 80:

[Figure: dated tree with tips H. cydno, H. pachinus, H. heurippa, H. tristero, H. timareta, H. melpomene, H. elevatus, H. pardalinus, H. luciana, H. athis, H. hecale, H. ethilla, H. nattereri, H. besckei, H. ismenius and H. numata; time axis in million years ago (4 to 0); labels A to E]

Wallbank et al., PLoS Biology 2016

Page 81:
Page 82:

Stryjewski & Sorensen Nat Ecol Evol 2017

Page 83:

What about behaviour?

Page 84:

H. melpomene X H. cydno

Page 85:

Backcross design:

Page 86:

[Figure: difference between approaches to cydno and melpomene (y-axis, -10 to 30) by genotype (bb, Bb); additional axis label: number of approaches]

Richard Merrill

Page 87:

5% genome-wide significance threshold

Significant QTL detected on three linkage groups

Z

Richard Merrill unpub.

Page 88: Genomic studies of speciation and gene flowevomicsorg.wpengine.netdna-cdn.com/wp-content/... · Figure 2. Four-taxon maximum-likelihood trees for 100 kb windows (A) Trees were superimposed

[Figure: relative probability of courting H. melpomene (y-axis, 0.0 to 1.0) by genotype at putative QTL (CC vs. CM), shown for LG 1, LG 17, LG 18 and all three combined (CC|CC|CC vs. CM|CM|CM)]

Together, the three QTL explain ~50% of the measured difference between H. melpomene and H. cydno

H. melpomene males

H. cydno males

Z


Heliconius cydno cordula

Heliconius melpomene melpomene

NNbb

nnBB

Heliconius heurippa

NNBB

Mavarez et al., Nature 2006


[Figure: number of approaches (y-axis, 0 to 40) for cordula, heurippa and melpomene]

Melo et al., Evolution 2009


Lamichhaney et al., Nature 2015


Pointy

Blunt

ALX1 associated with beak shape


Most of these studies use phenotype associations to identify introgressed loci

But can we identify them a priori using the ABBA-BABA method?


Green et al. 2010 Science 328:710-722


D is quite dependent on the number of informative sites (the denominator)

D is not at all good at detecting outlier windows

Martin et al., MBE 2014

Where S is the numerator from the D equation, and f is the amount of introgression expressed as a fraction of the maximum possible
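The relationship between D and f can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the cited papers: it assumes per-site derived-allele frequencies for populations P1, P2, P3 and the outgroup O, and it uses the f_d form of the estimator from Martin et al., in which the "maximum possible" denominator is computed by replacing both P2 and P3 with whichever of the two has the higher derived-allele frequency at each site. The function names (`site_patterns`, `d_statistic`, `f_d`) are made up for this example.

```python
from math import nan

def site_patterns(p1, p2, p3, po):
    """Per-site expected ABBA and BABA counts from derived-allele frequencies."""
    abba = [(1 - a) * b * c * (1 - o) for a, b, c, o in zip(p1, p2, p3, po)]
    baba = [a * (1 - b) * c * (1 - o) for a, b, c, o in zip(p1, p2, p3, po)]
    return abba, baba

def d_statistic(p1, p2, p3, po):
    """D = (sum ABBA - sum BABA) / (sum ABBA + sum BABA)."""
    abba, baba = site_patterns(p1, p2, p3, po)
    denom = sum(abba) + sum(baba)  # number of informative sites
    return (sum(abba) - sum(baba)) / denom if denom else nan

def f_d(p1, p2, p3, po):
    """S divided by the S expected under complete introgression (assumed f_d form)."""
    abba, baba = site_patterns(p1, p2, p3, po)
    s = sum(abba) - sum(baba)  # numerator of D
    # Donor frequency: the higher of P2 and P3 at each site.
    pd = [max(b, c) for b, c in zip(p2, p3)]
    abba_max, baba_max = site_patterns(p1, pd, pd, po)
    s_max = sum(abba_max) - sum(baba_max)
    return s / s_max if s_max else nan

# Toy window with two sites; values are illustrative only.
p1, p2, p3, po = [0, 0], [1, 0], [1, 1], [0, 0]
print(d_statistic(p1, p2, p3, po), f_d(p1, p2, p3, po))
```

In this toy window only one site is informative, so D reaches its extreme value, while f_d reports that P2 shares the P3 allele at half of the sites where it could. That is the sense in which a small denominator inflates window-level D. In practice f_d is usually reported only for windows where D is non-negative.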


Martin’s F

But Martin’s F is quite good at finding the introgression outliers

Martin et al., MBE 2014


• Smith and Kronforst argued that introgression could be inferred where ABBA-BABA outliers showed lower Dxy compared to genome-wide average
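As a rough sketch of the quantity being compared here, absolute divergence (Dxy) between two populations can be computed from per-site allele frequencies as the probability that one allele drawn from each population differs. The function name `dxy` and the inputs are assumptions for illustration. The lists are assumed to cover every site in the window, including invariant ones, since averaging over variant sites alone would overstate divergence.

```python
def dxy(p1, p2):
    """Mean per-site probability that alleles drawn from the two populations differ."""
    per_site = [a * (1 - b) + b * (1 - a) for a, b in zip(p1, p2)]
    return sum(per_site) / len(per_site) if per_site else float("nan")

# Toy window: one fixed difference and one invariant site.
print(dxy([1, 0], [0, 0]))
```

Under the argument above, a candidate introgressed window would show a value of `dxy` between P2 and P3 below the genome-wide average.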


• Be wary of window-based D statistics

• F is better than D…

• Sampling design is very important!


Implications for tree-thinking

Archaea

Eukarya

Bacteria

The tree of life is reticulated


Implications for tree-thinking

Mallet, Hahn and Besansky BioEssays 2015

Hahn and Nakhleh Evolution 2015


Okay, so what have we learnt and where do we go from here?