More sequence or more individuals, to combine or not?
![Page 1: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/1.jpg)
Data: how much is needed?
more sequence or more individuals, to combine or not?
![Page 2: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/2.jpg)
14.4. Tue Introduction to models (Jarno)
16.4. Thu Distance-based methods (Jarno)
17.4. Fri ML analyses (Jarno)
20.4. Mon Assessing hypotheses (Jarno)
21.4. Tue Problems with molecular data (Jarno)
23.4. Thu Problems with molecular data (Jarno) Phylogenomics
24.4. Fri Search algorithms, visualization, and other computational aspects (Jarno)
Schedule
![Page 3: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/3.jpg)
The trivial truth
◦ All extant species
◦ The whole genome
Impractical? Well, then
◦ As many species as possible
◦ As much data as possible
How much data?
![Page 4: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/4.jpg)
Finite constraints on resources (time, money)
◦ Know your group: which taxa are the most relevant for your study?
◦ Know what gene sequences are available from previous studies
Choosing taxa or data
![Page 5: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/5.jpg)
The days of single gene datasets are over
Mitochondrial and chloroplast DNA have been popular because they are easy to amplify and sequence
It is worth increasing the number of nuclear genes
One should aim for at least 3 genes, preferably more (maybe 10?)
Number of genes
![Page 6: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/6.jpg)
![Page 7: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/7.jpg)
It is now possible to increase the number of genes being sequenced significantly
Whole genome analyses will allow us to understand:
◦ Intron-exon boundary dynamics
◦ Gene duplication-deletion dynamics
◦ Gene transfer dynamics
Soon we will have a good understanding of the regions of the genome that are most suitable for systematics
Phylogenomics
![Page 8: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/8.jpg)
Sometimes not all genes amplify from all samples
◦ Should these samples be discarded?
Increased taxon sampling, despite missing data, increases resolution
All possible data should be used!
Missing data?
![Page 9: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/9.jpg)
Can separate independent data sets be combined for analysis?
How can we assess the possibility of conflict between different data?
What does the potential conflict then mean?
To combine or not to combine?
![Page 10: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/10.jpg)
For instance
◦ Different genes may have different phylogenetic signal (different history?)
What is the problem?
![Page 11: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/11.jpg)
If both genes have equally strong signal
Possible effects on results
![Page 12: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/12.jpg)
If one gene has a stronger signal than the other
Possible effects on results
![Page 13: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/13.jpg)
If one gene has a stronger signal than the other
Possible effects on results
![Page 14: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/14.jpg)
Never combine
Combine sometimes
Always combine
Schools of thought
![Page 15: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/15.jpg)
The different data sets may represent different evolutionary histories (e.g. different selection pressures)
Big data sets dominate small data sets
When analyzed separately, the different data sets can be tests of each other's phylogenetic hypotheses
Never combine!
![Page 16: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/16.jpg)
Consensus trees of separate analyses
Data set A + Data set B = Their consensus
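A strict consensus keeps only the groups that appear in every input tree. The sketch below assumes each tree is written simply as a set of clades (each clade a frozenset of taxon names). The two trees are hypothetical and are not the ones drawn on the slide.

```python
def strict_consensus(*trees):
    """Return the clades present in every tree (each tree is a set of frozensets)."""
    shared = set(trees[0])
    for tree in trees[1:]:
        shared &= set(tree)
    return shared


# Hypothetical rooted trees on taxa A-E, written as sets of clades.
tree_a = {frozenset("AB"), frozenset("ABC"), frozenset("DE")}  # (((A,B),C),(D,E))
tree_b = {frozenset("AB"), frozenset("CDE"), frozenset("DE")}  # ((A,B),(C,(D,E)))

consensus = strict_consensus(tree_a, tree_b)
for clade in sorted(consensus, key=sorted):
    print(sorted(clade))
```

Only the groups (A,B) and (D,E) survive here. The position of C is unresolved in the consensus because the two data sets disagree about it.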
![Page 17: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/17.jpg)
My own experience:
[figure: diagram with elements labelled A to H]
![Page 18: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/18.jpg)
Would be fantastic to get genealogical histories of individual genes
But!
◦ Single genes generally short, 1000-2000 bases
◦ Lots of homoplasy
◦ Unreliable phylogenies
Problems with the approach
![Page 19: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/19.jpg)
If the data sets are congruent, combine them
If the data sets are incongruent, don’t combine them
One can use the ILD test to decide whether data sets are incongruent
Well, sometimes you can combine...
![Page 20: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/20.jpg)
If there is no conflict between data sets:
◦ The length of the most parsimonious tree from the combined data [L(x+y)] is equal to the sum of the lengths of the MP trees from the separately analyzed data [L(x) + L(y)]
Dxy = L(x+y) - (L(x) + L(y))
Dxy = 0
(Farris et al 1994)
ILD (Incongruence Length Difference)
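The statistic can be worked through on a toy case. The sketch below assumes only four taxa, so the most parsimonious tree can be found by scoring all three unrooted topologies with Fitch's algorithm. The alignments and the function names are hypothetical illustrations, not data from the lecture.

```python
# The three unrooted topologies for four taxa, as pairs of sister taxa (row indices).
QUARTETS = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]


def site_length(column, topology):
    """Fitch parsimony: minimum number of changes for one site on a quartet tree."""
    changes = 0
    cherries = []
    for i, j in topology:
        if column[i] == column[j]:
            cherries.append({column[i]})
        else:
            cherries.append({column[i], column[j]})
            changes += 1
    if not cherries[0] & cherries[1]:
        changes += 1  # one more change on the internal branch
    return changes


def mp_length(alignment):
    """Length of the most parsimonious tree, by exhaustive search over the 3 topologies."""
    columns = list(zip(*alignment))
    return min(sum(site_length(c, t) for c in columns) for t in QUARTETS)


def ild(x, y):
    """Dxy = L(x+y) - (L(x) + L(y)) for two alignments of the same four taxa."""
    combined = [sx + sy for sx, sy in zip(x, y)]
    return mp_length(combined) - (mp_length(x) + mp_length(y))


# Hypothetical alignments, one string per taxon (same taxon order in each).
x = ["AAAA", "AAAA", "GGGG", "GGGG"]   # four sites grouping taxa 1+2 against 3+4
y_congruent = ["CC", "CC", "TT", "TT"]  # same grouping
y_conflict = ["CCC", "TTT", "CCC", "TTT"]  # three sites grouping taxa 1+3 against 2+4

print(ild(x, y_congruent))  # 0
print(ild(x, y_conflict))   # 3
```

With congruent data the combined tree is exactly as long as the two separate trees together. With conflicting data the three extra steps are the homoplasy forced on the smaller data set by the combined tree.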
![Page 21: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/21.jpg)
Combining the data sets leads to increased homoplasy
But is it statistically significant?
Can be tested with the Mann-Whitney U test, where the null hypothesis is that the data sets are combinable
If Dxy > 0
![Page 22: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/22.jpg)
Original: Data set x, Data set y
Combine data: Data sets x + y
Sample randomly to get equally large data sets: Data set p, Data set q
![Page 23: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/23.jpg)
Search for MP trees and calculate Dpq values
Repeat many times (e.g. 1000), which gives us a distribution for the value of D
Compare whether Dxy differs from the random distribution at P < 0.05
However:
◦ ILD-test is sensitive to relative sizes of compared data sets and to the evolutionary history of the different data sets
For the randomly generated data sets
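The randomization procedure can be sketched as follows. It again assumes four taxa so that MP trees can be found exhaustively. Each site is a tuple of four states and all data are hypothetical. The P value is simply the proportion of random partitions with Dpq at least as large as Dxy, which is an assumption of this sketch and not necessarily the test statistic used in the lecture.

```python
import random

# The three unrooted topologies for four taxa, as pairs of sister taxa (indices).
QUARTETS = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]


def site_length(column, topology):
    """Fitch parsimony length of one site (tuple of 4 states) on a quartet tree."""
    changes = 0
    cherries = []
    for i, j in topology:
        if column[i] == column[j]:
            cherries.append({column[i]})
        else:
            cherries.append({column[i], column[j]})
            changes += 1
    if not cherries[0] & cherries[1]:
        changes += 1
    return changes


def mp_length(columns):
    """Shortest tree length over the three possible topologies."""
    return min(sum(site_length(c, t) for c in columns) for t in QUARTETS)


def ild(cols_x, cols_y):
    return mp_length(cols_x + cols_y) - (mp_length(cols_x) + mp_length(cols_y))


def ild_test(cols_x, cols_y, replicates=1000, seed=1):
    """Compare the observed Dxy with Dpq from random partitions of the pooled sites."""
    rng = random.Random(seed)
    observed = ild(cols_x, cols_y)
    pooled = cols_x + cols_y
    null = []
    for _ in range(replicates):
        shuffled = pooled[:]
        rng.shuffle(shuffled)
        # Random data sets p and q have the same sizes as the original x and y.
        p, q = shuffled[:len(cols_x)], shuffled[len(cols_x):]
        null.append(ild(p, q))
    p_value = (1 + sum(d >= observed for d in null)) / (replicates + 1)
    return observed, p_value


# Hypothetical data: ten sites per data set, supporting different groupings.
cols_x = [("A", "A", "G", "G")] * 10
cols_y = [("C", "T", "C", "T")] * 10
observed, p_value = ild_test(cols_x, cols_y)
print(observed, p_value)
```

In this extreme example the two data sets are in complete conflict, so almost no random partition reaches the observed value and the null hypothesis of combinability is rejected.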
![Page 24: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/24.jpg)
But what if the conflict is only partial?
![Page 25: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/25.jpg)
Combining all available data leads to more resolved trees = the combined data has higher explanatory power
“Hidden support” can only be detected through combined analysis
Conflicts at different nodes can only be discovered in a combined analysis framework
The effects of combined analysis can be investigated using indices related to Bremer support
Always combine!
![Page 26: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/26.jpg)
Partitioned Bremer Support (PBS)
◦ Baker & DeSalle 1997: Syst Biol 46:654
Partition Congruence Index (PCI)
◦ Brower 2006: Cladistics 22:378
Hidden Bremer Support (HBS)
◦ Gatesy et al 1999: Cladistics 15:271
Indices related to Bremer Support
![Page 27: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/27.jpg)
The different data partitions in a data set contribute to the Bremer support in an additive way
For each node:
◦ A negative Partitioned Bremer support value indicates conflict
◦ A positive Partitioned Bremer support value indicates congruence
PBS (Partitioned Bremer Support)
![Page 28: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/28.jpg)
PBS in practice
![Page 29: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/29.jpg)
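PBS in practice
[figure: tree with two nodes of Bremer support 7, annotated with partitioned Bremer support values 3,4 and -6,13]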
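The additivity of PBS can be checked on the values in the figure. The sketch assumes that the pairs 3,4 and -6,13 are the PBS values of two partitions at two different nodes, each with Bremer support 7. The function name and node labels are hypothetical.

```python
def summarize_pbs(pbs_values):
    """Sum per-partition PBS into the Bremer support and list conflicting partitions."""
    bremer_support = sum(pbs_values)
    conflicting = [i for i, value in enumerate(pbs_values) if value < 0]
    return bremer_support, conflicting


# PBS values per node, one entry per data partition (node labels are made up).
nodes = {"node 1": [3, 4], "node 2": [-6, 13]}
for name, values in nodes.items():
    support, conflicting = summarize_pbs(values)
    print(name, "Bremer support:", support, "conflicting partitions:", conflicting)
```

Both nodes have the same total support, but at the second node one partition contradicts the node and the other overrides it. Bremer support alone hides that difference.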
![Page 30: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/30.jpg)
Morphology, COI, EF1a, Wgl
Bremer Support
![Page 31: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/31.jpg)
Tells us about the magnitude of conflict between data partitions in a combined analysis
PCI is always equal to or less than BS for a given branch
PCI = BS when there is no conflict
PCI is negative when there is low BS because of strong conflicts between data partitions
Partition Congruence Index
Brower 2006: Cladistics 22:378-386
![Page 32: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/32.jpg)
Underlying phylogenetic signal can be confounded by homoplasy in separate analyses
Combining datasets can bring out this signal, as homoplasy is largely random noise
Can be measured using HBS and Partitioned HBS
Hidden support
![Page 33: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/33.jpg)
Hidden support can be defined as increased support for the node of interest in the simultaneous analysis of all data partitions relative to the sum of support for that node in the separate analyses of each partition
Hidden support
![Page 34: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/34.jpg)
For a particular combined data set and a particular node, HBS is the difference between BS for that node in the combined analysis and the sum of BS values for that node from each data partition
Measuring hidden support
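The definition translates directly into a subtraction. In this minimal sketch the Bremer support values are made-up numbers for a single node, and the function name is hypothetical.

```python
def hidden_bremer_support(bs_combined, bs_by_partition):
    """HBS = BS in the combined analysis minus the sum of BS from the separate analyses."""
    return bs_combined - sum(bs_by_partition)


# Hypothetical node: three partitions analysed separately, then together.
separate = [3, 4, 2]
combined = 12
print(hidden_bremer_support(combined, separate))  # 3
```

A positive value means the combined analysis supports the node more strongly than the separate analyses do together. A negative value would indicate hidden conflict.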
![Page 35: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/35.jpg)
With a small dataset, it is probably always best to combine everything
With large datasets (10 or 20 gene regions?) one can find sets of congruent genes and combine them
But!
◦ Is there a biological reason for incongruence, or is it just a property of the data?
So, what to do?
![Page 36: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/36.jpg)
Problems inherent in molecular data
Niklas Wahlberg
![Page 37: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/37.jpg)
Saturation
Bias in nucleotide composition
Orthology vs paralogy
Lineage sorting
Lateral Gene Transfer
What are the problems?
![Page 38: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/38.jpg)
Saturation
![Page 39: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/39.jpg)
Saturation is due to multiple changes at the same site subsequent to lineage splitting
Models of evolution attempt to infer the missing information through correcting for “multiple hits”
Most data will contain some fast-evolving sites which are potentially saturated (e.g. in protein-coding genes, often the third codon position)
In severe cases the data becomes essentially random and all information about relationships can be lost
Saturation in sequence data
![Page 40: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/40.jpg)
Multiple changes at a single site - hidden changes
Ancest GGCGCG
Seq 1 AGCGAG
Seq 2 GCGGAC
[figure: substitutions along the lineages to Seq 1 and Seq 2, with the number of changes counted at each site]
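The gap between observed and actual change can be illustrated with the two sequences above. The sketch computes the observed proportion of differing sites and applies the Jukes-Cantor correction for multiple hits. The choice of Jukes-Cantor is an assumption made for illustration, since it is the simplest such model.

```python
import math


def p_distance(a, b):
    """Observed proportion of sites that differ between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def jukes_cantor(p):
    """Expected substitutions per site under Jukes-Cantor, given observed distance p."""
    if p >= 0.75:
        raise ValueError("p >= 0.75: fully saturated, distance is undefined")
    return -0.75 * math.log(1 - 4 * p / 3)


seq1 = "AGCGAG"
seq2 = "GCGGAC"
p = p_distance(seq1, seq2)
print(f"observed p-distance: {p:.3f}")
print(f"Jukes-Cantor distance: {jukes_cantor(p):.3f}")
```

The corrected distance is always at least as large as the observed one. As p approaches 0.75 the correction grows without bound, which is the point at which the data are effectively random.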
![Page 41: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/41.jpg)
Saturation
[figure: pairwise distance calculated from sequences (y-axis) plotted against time since divergence (x-axis)]
![Page 42: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/42.jpg)
Homoplasy is a problem with molecular data
Elevated rates of molecular evolution in unrelated lineages
Sparse taxon sampling leading to long branches
Saturation and long branch attraction
![Page 43: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/43.jpg)
The classical long-branch attraction example
Based on one gene 18S
![Page 44: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/44.jpg)
Nardi et al. 2003: Science 299: 1887-1889
![Page 45: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/45.jpg)
Taxon sampling is important
For divergent taxa with few extant species, this can be a problem
More data from different sources
◦ Could be that molecular data are not able to resolve the position of some taxa
◦ Morphological data!
Is saturation a problem?
![Page 46: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/46.jpg)
Biased base composition
![Page 47: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/47.jpg)
Do sequences manifest biased base compositions (e.g. thermophilic convergence) or biased codon usage patterns which may obscure phylogenetic signal?
Biased base compositions?
![Page 48: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/48.jpg)
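% Guanine + Cytosine in 16S rRNA genes

| Group | Species | %GC all sites | %GC variable sites | %GC parsimony sites |
|---|---|---|---|---|
| Thermophiles | Thermotoga maritima | 62 | 72 | 73 |
| Thermophiles | Thermus thermophilus | 64 | 72 | 70 |
| Thermophiles | Aquifex pyrophilus | 65 | 73 | 71 |
| Mesophiles | Deinococcus radiodurans | 55 | 52 | 48 |
| Mesophiles | Bacillus subtilis | 55 | 50 | 38 |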
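Figures of this kind can be computed from an alignment in a few lines. The sketch below uses a tiny hypothetical alignment, not the 16S rRNA data. It reports %GC over all sites and over the variable sites only.

```python
def gc_percent(seq):
    """Percentage of G and C among the unambiguous bases of a sequence."""
    bases = [b for b in seq.upper() if b in "ACGTU"]
    if not bases:
        raise ValueError("no unambiguous bases in sequence")
    return 100 * sum(b in "GC" for b in bases) / len(bases)


def variable_sites(alignment):
    """Indices of alignment columns in which not all sequences share the same state."""
    return [i for i, column in enumerate(zip(*alignment)) if len(set(column)) > 1]


# Hypothetical aligned sequences, one string per taxon.
alignment = ["GCGCAT", "GCGGAT", "GCCGTT"]
variable = variable_sites(alignment)
for seq in alignment:
    at_variable = "".join(seq[i] for i in variable)
    print(seq, round(gc_percent(seq), 1), round(gc_percent(at_variable), 1))
```

Comparing the two columns per taxon shows whether compositional differences are concentrated in the sites that actually carry the phylogenetic signal, as they are in the table above.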
![Page 49: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/49.jpg)
A case study in phylogenetic analysis:Deinococcus and Thermus
Deinococcus are radiation resistant bacteria
Thermus are thermophilic bacteria
BUT:
◦ Both have the same very unusual cell wall based upon ornithine
◦ Both have the same menaquinones (Mk 9)
◦ Both have the same unusual polar lipids
Congruence between these complex characters supports a phylogenetic relationship between Deinococcus and Thermus
![Page 50: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/50.jpg)
An appropriate method can correct for GC bias
[figure: three trees of Aquifex, Thermotoga, Deinococcus, Bacillus and Thermus: Parsimony tree, Jukes & Cantor tree, LogDet tree]
![Page 51: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/51.jpg)
Orthology and paralogy
![Page 52: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/52.jpg)
Are the sequences being generated from different species the same (homologous)?
Gene duplication
◦ duplicate gene degenerates
◦ duplicate gene acquires new function
A problem that is particularly acute currently as we search for new genes
Orthology or paralogy?
![Page 53: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/53.jpg)
ORTHOLOGY
Orthology: gene trees and species trees
[figure: gene phylogeny (a, b, c) alongside the organism phylogeny (A, B, C)]
![Page 54: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/54.jpg)
Darwin’s theory reinterpreted homology as common ancestry.
Ancestral sequence:
ATCGGCCACTTTCGCGATCA
Successive substitutions along two descendant lineages:
ATAGGCCACTTTCGCGATCA ATCGGCCACTTTCGCGATCG
ATAGGCCACTTTCGCGATTA ATCGGCCACTTTCGTGATCG
ATAGGGCAGTTTCGCGATTA ATCGGCCACGTTCGTGATCG
ATAGGGCAGTTTTGCGATTA ATCGGCCACGTTCGCGATCG
ATAGGGCAGTTTCGCGATTA ATCGGCCACCTTCGCGATCG
ATAGGGCAGTCTCGCGATTA ACCGGCCACCTTCGCGATCG
Homologous sequences:
ACCGGCCACCTTCGCGATCG
ATAGGGCAGTCTCGCGATTA
![Page 55: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/55.jpg)
Orthologs arise by speciation
Sequence in ancestral organism: ATCGGCCACTTTCGCGATCA
Speciation event
Orthologous sequences:
ATAGGGCAGTCTCGCGATTA (modern species A)
ACCGGCCACCTTCGCGATCG (modern species B)
Orthologs are “evolutionary counterparts” – Koonin (2001)
![Page 56: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/56.jpg)
Paralogs arise by duplications
Sequence in ancestral organism: ATCGGCCACTTTCGCGATCA
Duplication event
Paralogous sequences:
ATAGGGCAGTCTCGCGATTA (modern duplicate A)
ACCGGCCACCTTCGCGATCG (modern duplicate B)
![Page 57: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/57.jpg)
An evolutionary tale…
Duplication of A in worm
Duplication of A in human
Sonnhammer & Koonin (2002) TIGs 18 619-620
![Page 58: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/58.jpg)
The yeast gene is orthologous to all worm and human genes, which are all co-orthologous to the yeast gene
Evolutionary Relationships
Sonnhammer & Koonin (2002) TIGs 18 619-620
![Page 59: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/59.jpg)
all genes in the HA* set are co-orthologous to all genes in the WA* set
Evolutionary Relationships
Sonnhammer & Koonin (2002) TIGs 18 619-620
![Page 60: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/60.jpg)
The genes HA* are hence ‘inparalogs’ to each other when comparing human to worm.
Evolutionary Relationships
Sonnhammer & Koonin (2002) TIGs 18 619-620
![Page 61: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/61.jpg)
duplication speciation
By contrast, the genes HB and HA* are ‘outparalogs’ when comparing human with worm
Evolutionary Relationships
Sonnhammer & Koonin (2002) TIGs 18 619-620
![Page 62: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/62.jpg)
HB and HA*, and WB and WA* are inparalogs when comparing with yeast, because the animal–yeast split pre-dates the HA*–HB duplication
[figure labels: duplication, speciation]
Evolutionary Relationships
Sonnhammer & Koonin (2002) TIGs 18 619-620
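The distinction depends only on whether the duplication is younger or older than the speciation event used for the comparison. In this small sketch the ages are made-up numbers in arbitrary units before present. The function name is hypothetical, and treating equal ages as outparalogs is an arbitrary choice.

```python
def classify_paralogs(duplication_age, speciation_age):
    """Classify two paralogs relative to a reference speciation event.

    Ages are measured before present, so a larger number means an older event.
    """
    if duplication_age < speciation_age:
        return "inparalogs"   # duplication happened after the lineages split
    return "outparalogs"      # duplication happened before the lineages split


# Hypothetical ages: animal-yeast split, HA*-HB duplication, human-worm split.
animal_yeast_split = 1000
ha_hb_duplication = 800
human_worm_split = 600

print(classify_paralogs(ha_hb_duplication, human_worm_split))    # outparalogs
print(classify_paralogs(ha_hb_duplication, animal_yeast_split))  # inparalogs
```

The same pair of genes changes category with the reference speciation event, so the terms only make sense when the comparison is stated.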
![Page 63: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/63.jpg)
PARALOGY
Paralogy can produce misleading trees
[figure: a gene duplication gives two gene phylogenies (a1, b1, c1 and a2, b2, c2) within the organism phylogeny (A, B, C); sampling the starred copies a1, b2 and c1 gives a misleading tree]
![Page 64: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/64.jpg)
Ancient gene duplications can be used to root the tree of life
Ancestral elongation factor gene
Gene duplication prior to the split into 3 domains of life
EF-Tu/1-alpha + EF-2/EF-G = paralogues of each other
Sequences from one paralogue can be used to root a tree formed using sequences from the other and vice versa
![Page 65: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/65.jpg)
Lineage sorting
![Page 66: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/66.jpg)
Gene trees may not be the same as species trees
Extant populations may retain ancestral polymorphisms
Species-level phylogenies should never rely on a single individual per species
Lineage sorting
![Page 67: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/67.jpg)
Implicit assumption in many studies using mtDNA
The mode of speciation can now be studied using DNA sequences
Theoretical studies predict that DNA lineages pass through several phases in a species
Are species monophyletic?
![Page 68: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/68.jpg)
[figure: lineages A and B descending through time from an ancestral gene pool]
The assumption: monophyly
![Page 69: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/69.jpg)
[figure: lineages A and B through time]
The assumption: monophyly
![Page 70: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/70.jpg)
Paraphyly can occur when one population in a set of locally panmictic populations speciates
Polyphyly occurs when a highly polymorphic population is subdivided
Can be highly informative of the history of divergence
The presence of poly- and paraphyletic lineages
![Page 71: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/71.jpg)
[figure: lineages A and B descending through time from an ancestral gene pool]
Paraphyly
![Page 72: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/72.jpg)
[figure: lineages A and B through time]
Paraphyly
![Page 73: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/73.jpg)
[figure: lineages A and B through time]
Polyphyly
![Page 74: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/74.jpg)
[figure: lineages A and B through time]
Polyphyly
![Page 75: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/75.jpg)
Polyphyly
![Page 76: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/76.jpg)
[figure: tree of individual Phyciodes samples (tharos, cocyta, batesii, pulchella, phaon, mylitta, orseis, pallida, picta, vesta and pallescens, labelled with subspecies, voucher codes and localities) annotated with node support values]
An empirical example:
Phyciodes butterflies
Wahlberg et al. 2003. Syst Ent 28:257-273
![Page 77: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/77.jpg)
Paraphyly of a species can be due to incomplete lineage sorting and/or secondary gene flow
![Page 78: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/78.jpg)
G = generations, starting with ten unrelated females at G = 0
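The loss of maternal lineages over generations can be simulated in a few lines. This is a rough sketch and not the model behind the figure. It assumes a constant population of ten females in which every daughter picks her mother at random from the previous generation. The function name and seed are hypothetical.

```python
import random


def simulate_lineage_sorting(n_females=10, generations=20, seed=0):
    """Count surviving founder lineages in each generation of a constant-size population."""
    rng = random.Random(seed)
    population = list(range(n_females))  # founder label carried by each female at G = 0
    surviving = [len(set(population))]
    for _ in range(generations):
        # Each female of the next generation inherits the label of a random mother.
        population = [rng.choice(population) for _ in range(n_females)]
        surviving.append(len(set(population)))
    return surviving


counts = simulate_lineage_sorting()
for generation, n_lineages in enumerate(counts):
    print(f"G = {generation}: {n_lineages} founder lineages remain")
```

Lineages are lost by chance alone and never regained, so a population that is sampled soon after a split can still carry several ancestral lineages.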
![Page 79: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/79.jpg)
Lateral gene transfer
![Page 80: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/80.jpg)
Widespread in single-celled organisms
◦ Even between distantly related lineages
In multicellular organisms more of a problem in closely related species
◦ hybridization
Lateral Gene Transfer
![Page 81: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/81.jpg)
Is the Tree of Life really a Web of Life?
Lateral Gene Transfer
![Page 82: More sequence or more individuals, to combine or not?](https://reader035.fdocuments.in/reader035/viewer/2022062713/56649f505503460f94c72218/html5/thumbnails/82.jpg)
These “problems” are highly interesting phenomena in themselves!
When the different factors are taken into account, they can be informative about evolutionary history
“When in doubt, get more data” - Brooks and McLennan 2002
Problems inherent in molecular data?