Examensarbete vid Institutionen för geovetenskaper Degree Project at the Department of Earth Sciences
ISSN 1650-6553 Nr 471
Estimating the Early Evolution of
Brachiopods Using an Integrated
Approach Combining Genomics
and Fossils
En uppskattning av armfotingarnas tidiga evolution
– med hjälp av genomik och fossil
Chloé Robert
INSTITUTIONEN FÖR
GEOVETENSKAPER
DEPARTMENT OF EARTH SCIENCES
ISSN 1650-6553
Copyright © Chloé Robert
Published at Department of Earth Sciences, Uppsala University (www.geo.uu.se), Uppsala, 2019
Abstract
Estimating the Early Evolution of Brachiopods Using an Integrated Approach Combining
Genomics and Fossils
Chloé Robert
The Brachiopoda, a major group of the Lophotrochozoa, experienced a rapid early evolutionary
diversification during the well-known Cambrian explosion and subsequently dominated the Palaeozoic
benthos with its diversity and abundance. Even though the phylogeny of the Lophotrochozoa is still
hotly debated, it is now known that the Brachiopoda are a monophyletic grouping. However, the early
evolutionary rates of the Brachiopoda have never been studied in a framework combining
molecular data and fossil time calibration points. In order to investigate the expected higher evolutionary
rates of the Phylum at its origin, we conducted phylogenetic studies combining different methodologies
and datasets. This work has at its foundation Maximum Likelihood and Bayesian analyses of 18S and
28S rRNA datasets followed by analyses of phylogenomic sequences. All material was obtained from
previously available sequences and from sequencing of genetic material from specimens from a
concerted worldwide collection effort.
While the analyses of the phylogenomic dataset produced a robust phylogeny of the Brachiopoda
with good support, both the results of the novel rRNA and phylogenomic dating analyses provided
limited insights into the early rates of evolution of the Brachiopoda from a newly assembled dataset,
demonstrating some limitations in calibration dating using the software package BEAST2. Future
studies implementing fossil calibration, possibly incorporating morphological data, should be attempted
to elucidate the early rates of evolution of Brachiopoda and the effect of the Push of the Past in this
clade.
Keywords: Brachiopoda, evolutionary history, phylogeny, Maximum likelihood, Bayesian inference,
fossil calibrations
Degree Project E1 in Earth Science, 1GV025, 30 credits
Supervisors: Aodhán Butler and Graham Budd
Department of Earth Sciences, Uppsala University, Villavägen 16, SE-752 36 Uppsala (www.geo.uu.se)
ISSN 1650-6553, Examensarbete vid Institutionen för geovetenskaper, No. 471, 2019
The whole document will be available at www.diva-portal.org
Popular science summary
An Estimate of the Early Evolution of Brachiopods Using Genomics and Fossils
Chloé Robert
It is often assumed that evolution (changes in the genetic material of a group of organisms) proceeds at a constant
rate, but whether this is really the case remains uncertain. Large groups of organisms have often been associated
with a higher evolutionary rate, especially close to their origin, which increases their probability of
survival.
Brachiopods (Brachiopoda) are shelled marine invertebrates that were once widespread;
today, bivalves (Bivalvia) are considerably more widespread instead. Brachiopods have existed and evolved
over several million years, originating in the early Cambrian. Through years of research and many fossils,
we have gained more information about the appearance of extinct organisms, which has contributed to a
thousandfold increase in the number of fossil species known to us. Recently there have also been innovations in
molecular techniques that have made it possible to apply this knowledge to extinct species as well. These
molecular techniques have recently helped to determine some of the relationships within
brachiopods that were previously considered very difficult to resolve.
Some relationships within brachiopods are still unknown, and it is not yet
known how fast the group evolved. By investigating the rate of evolution, one can begin to
understand the early success of the group during the Cambrian and Ordovician and the decline that followed. The aim
of this study was to calculate the rate of evolution of brachiopods, with particular focus on the
early diversification of the group. To investigate this, we used molecular data to
analyse the relationships within brachiopods. In addition, we used fossils to date major
events in the evolutionary history of brachiopods. Using statistical analyses, we were able to estimate
the evolutionary rate and the relationships within the group. We concluded that brachiopods
descend from a common ancestor. The date of this event could not be established, as it
was estimated to have occurred millions of years before the oldest animal fossil. More research will be needed to
find out whether brachiopods had a higher evolutionary rate at an early stage.
Keywords: Brachiopoda, evolutionary history, phylogeny, Maximum Likelihood method, Bayesian
statistics, fossil calibration
Degree Project E1 in Earth Science, 1GV025, 30 credits
Supervisors: Aodhán Butler and Graham Budd
Department of Earth Sciences, Uppsala University, Villavägen 16, SE-752 36 Uppsala
(www.geo.uu.se)
ISSN 1650-6553, Examensarbete vid Institutionen för geovetenskaper, No. 471, 2019
The whole publication is available at www.diva-portal.org
Table of Contents
1 Introduction
2 Aims
3 Background
3.1 Brachiopods
3.1.1 Classification
3.1.2. Description
3.1.3 Lophophorata
3.2 Phylogenetic analyses
3.2.1 Maximum Likelihood
3.2.2 Bayesian inference
3.2.3 Model selection
3.3 Molecular clock
3.4 Push of the past
4 Material and methods
4.1 rRNA pilot study
4.1.1 Alignment and model choice
4.1.2 Bayesian analyses
4.1.3 RAxML analysis
4.2 Phylogenomic analyses
4.2.1 Dataset and model choice
4.2.2 RAxML analysis
4.2.3 Bayesian inference for phylogenetic reconstruction
4.2.4 Bayesian inference for molecular dating
5 Results
5.1 rRNA datasets
5.1.1 Topology
5.1.2 rRNA molecular dating
5.2 Phylogenomic datasets
5.2.1 PhyloBayes topology
5.2.2 RAxML LG and Dayhoff
5.2.3 Molecular dating
6 Discussion
6.1 Limitations of rRNA datasets in Brachiopod phylogenetics
6.2 BEAST2: a problematic dating methodology
6.3 Future studies
7 Conclusion
8 Acknowledgments
8 References
APPENDIX
1 Introduction
The first appearance of metazoans and the subsequent diversification of all animal phyla during
the Cambrian are among the key events in the history of life on Earth. The multitude of processes
and drivers for this diversification (e.g. Budd & Jensen, 2000), as well as other effects that may
skew our view of these events, such as the Push of the Past (Nee et al., 1994; Budd & Mann, 2018),
are a central research topic in Evolutionary Biology and Palaeontology.
The Cambrian explosion was an unprecedented event that witnessed the appearance
and diversification of the animal phyla (Conway Morris, 1989). Outside the very specific
conditions and faunas found in Cambrian Lagerstätten (Chengjiang, Sirius Passet, Burgess
Shale), a widespread fauna of chiefly biomineralizing organisms, able to produce
protective structures such as shells, dominates the fossil record. These biomineralizers came to
be overrepresented in fossil diversity up to the present, probably because of the high
fossilization potential of biomineralized structures (calcitic or apatitic). While biomineralizers
remain dominant to this day, their clade abundances and diversification patterns have varied greatly
from the Cambrian to the present, as evidenced by, for example, the relative marginalization of
brachiopods by bivalves today (Gould and Calloway, 1980).
Patterns of diversification through time have been of considerable interest to
evolutionary theory. Changes in diversification patterns can be easily identified in major
radiation events (Budd & Jensen 2000) or after mass extinctions (Erwin 1993, Hull et al., 2011).
These events are responsible for faunal/floral turnovers, major drops in the diversity of several
groups with a concurrent increase in others and a striking shift in fossil morphologies.
Predominant among these events, the Cambrian Explosion and the Great Ordovician
Biodiversification Event have attracted researchers interested in the origin of life and its
subsequent rapid diversification and occupation of most major environmental niches.
Rates of evolution in ancestral lineages can be inferred using molecular and phenotypic
data. Such data were analysed in a previous study focusing on arthropods (Lee et al.,
2013), which shed light on the surprisingly fast rates of evolution of this clade during the
Cambrian. Although a great wealth of information has accumulated over the past decades for
the Brachiopoda, similar analyses for this phylum are lacking. Dominant during the Palaeozoic,
with around 15,000 extinct species described, the Brachiopoda have a diverse fossil record (Foote 1999)
but very few extant representatives, which nonetheless have a broad geographic distribution. Metazoan groups
with a wealth of fossil occurrences, diversity and stratigraphic congruence are prime candidates
for the study of the early diversification of animal groups as well as the effect of the Push of
the Past on evolutionary patterns (Budd & Mann, 2018).
In this project, I investigated whether an early higher rate of diversification of the
Brachiopoda, close to their putative Cambrian emergence, is responsible for the apparent success
of this phylum, in accordance with the hypothesis put forward by the Push of the Past, a concept
explained in a following section. The alternative is that exceptional features of the brachiopod
body plan, such as valves enclosing the body, are singularly responsible for their rather
successful course through time. This question was approached by using genetic data and fossil
calibrations in an attempt to explore the early rates of evolution of the brachiopods.
2 Aims
This project was designed to study the rate of evolution for the Phylum Brachiopoda since its
first appearance and through time. I aim to identify patterns in the rate of evolution and to
clarify, if present, the effects of the Push of the Past in the observed diversification/evolutionary
patterns. To ensure the accuracy of the results, this work is founded on analyses of
genetic sequences obtained through a wide, global collection of specimens and
subsequent laboratory work to extract and sequence their genetic material. To
supplement these analyses, sequences already available in GenBank were also
used (Benson et al., 2008). The statistical methodologies most widely used to approach such
questions are Maximum Likelihood and Bayesian analyses. These
methodologies were used to produce a well-supported, robust phylogenetic tree of the
brachiopods. Molecular clocks are notoriously difficult to
calibrate (e.g. Wilke et al., 2009), and the resulting uncertainty in the timing of the divergence
of clades poses a significant problem for this analysis. To calibrate the
results, I made extensive use of the most up-to-date fossil occurrence data for the Brachiopoda
to date key events in their evolutionary history, such as their first unequivocal appearance.
3 Background
3.1 Brachiopods
3.1.1 Classification
There are nearly 5,000 brachiopod genera described and classified morphologically into 26
orders and eight classes (Carlson 2016, citing Kaesler & Selden 1997–2007). This phylum has
a very rich fossil record: 400 extant species are described, in contrast with the 15,000 extinct
species known (Brusca et al., 2016). Brachiopods are divided into three distinct subphyla:
Linguliformea, Craniformea and Rhynchonelliformea (Kaesler & Selden 1997–2007). These
comprise superfamilies, some of which contain only extinct taxa. The 25 extant species
belonging to Linguliformea are divided into two superfamilies: Linguloidea and Discinoidea.
Almost as many extant species are classified within the subphylum Craniformea, but all belong
to the same superfamily: Cranioidea. Rhynchonelliformea, the last subphylum, contains the
greatest diversity of all, with more than 350 species. The extant species of this subphylum are
divided into three orders: Rhynchonellida, Terebratulida and Thecideida (Kaesler & Selden
1997–2007).
3.1.2. Description
The Brachiopoda constitute a phylum of the animal kingdom. The name reflects the
morphological features of the group and derives from the Greek words brachion (meaning
arm) and pous, podos (foot). They are commonly known as lamp shells because their shape
resembles that of oil lamps. A pair of dorsoventrally oriented valves protects the animal’s body
and organs. These valves are typically unequal in size and are connected to each other either
by muscles or by a hinge system with a tooth-in-socket morphology.
Extant brachiopods are marine, benthic metazoans, mostly found on the continental
shelf. They can be found attached to the sea floor by
their specialised pedicle, an outgrowth of the body wall that provides
attachment. There are two types of pedicle, either developing from an outgrowth of the body
wall or separating dorsal and ventral body walls (Ekman, 1896). Some brachiopods do not
attach themselves to hard substrata, as is the case for Lingulida, which are infaunal burrowers.
Other brachiopod species can be found cemented to a hard substratum. Brachiopods
possess a lophophore, a ciliated and tentacular structure surrounding the mouth (Carlson, 2016).
This, combined with the epistome, a muscular structure bringing food to the mouth, enables
efficient suspension feeding. The cilia of the lophophore tentacles draw water into the mantle cavity
by generating a unidirectional current, making suspension feeding possible.
Most brachiopods are gonochoristic animals, reproducing by external fertilization. They
have a free-swimming juvenile stage, also called the planktotrophic stage, but the
ontogenetic stages differ considerably between subphyla (Brusca et al., 2016). It is thought that
brachiopods originated at least 530 million years ago (e.g. Aldanotreta sunnaginensis Pelman,
1977). Brachiopods were among the most abundant metazoans and suspension-feeding organisms
during the Palaeozoic, and their diversity increased from the Ordovician through the Carboniferous
(Valentine & Jablonski, 1983). They then suffered an extinction event at the Permian-Triassic boundary,
after which they never recovered their previous diversity and steadily declined. This decline
was probably exacerbated by competition with mussels (Thayer, 1985).
3.1.3 Lophophorata
Brachiopods are protostomes and, more precisely, belong to the Lophophorata. For several
centuries after their discovery in the 16th century, brachiopods were grouped with molluscs;
only in the late 19th century were they recognized as lophophorates and grouped with other
Lophophorata under the name Tentaculata (Hatschek, 1888). Traditionally, the lophophorates
have been seen as a monophyletic grouping consisting of the Bryozoa (6000 extant species),
Phoronida (11 species) and Brachiopoda (Brusca et al., 2016).
The members of these phyla are all sessile and generally have a U-shaped gut and a relatively
simple reproductive system (Brusca et al., 2016). However, these organisms were first grouped
together because of their unique feeding structure: the lophophore.
The Lophophorata were first considered deuterostomes (Hennig, 1979;
Schram, 1991) because they share various ontogenetic similarities with this group, such as
a secondary development of the mouth. Molecular data showed that they are actually
genetically closer to the Protostomia, and specifically belong to the Spiralia (Halanych et al.,
1995). The internal phylogenetic relationships of the Spiralia remain a subject of heated debate
and a great amount of uncertainty remains (Fig. 1) (e.g. Eernisse et al., 1992; Hausdorf et al.,
2007; Laumer et al., 2015). The phylogeny of the Lophophorata is, unsurprisingly, also
unresolved (e.g. Hejnol et al., 2009; Nesnidal et al., 2013; Dunn et al., 2014). Phoronida is
often recovered as the sister group of Brachiopoda but is sometimes found to be genetically closer to Bryozoa.
3.2 Phylogenetic analyses
Evolutionary histories can be visualised with phylogenetic trees, which provide information on the
most recent common ancestor shared by two taxa. Recreating patterns of evolution has been the
research focus of many works since the 19th century, starting with Lamarck’s Scala Naturae
ordering organisms from simple to more complex (1809), followed by Lyell’s theory of
common ancestry (1832) and Darwin’s evolutionary theory depicting evolutionary
relationships in the form of a branching tree (1837). Since that first depiction of evolution as a
tree of interconnected clades and organisms, there have been great developments in the
visualisation, methodology and accuracy of the models used to describe the divergence of
organismal groups. Today’s methods for reconstructing phylogenetic trees can achieve precise
estimates of evolutionary history by using mathematical modelling of character evolution, as in
Maximum Likelihood (ML) and Bayesian inference. These methods use a substitution model
that specifies how characters can change between states, based on the data of interest. With
DNA data, for example, the model describes each transition and transversion and
the rates at which these changes occur.
Figure 1 Cladogram depicting the phylogenetic relationships among Protostomia. Tip names written in purple are
Spiralian phyla. Phyla belonging to Lophophorata are indicated by a crown. Nodes with several branches
represent soft polytomies. From Brusca et al., 2016.
3.2.1 Maximum Likelihood
Maximum Likelihood (ML) analyses aim to find the tree that has the highest probability of
giving rise to the observed data. The search starts by computing a single
tree and calculating the likelihood for each site (for example, for each
nucleotide position when DNA is the material of the study), from which the overall likelihood of
the tree is obtained. The parameters of this tree are then modified to maximize its likelihood, and the
same steps are repeated for each possible tree. The highest-likelihood tree is found by comparing
the likelihoods of all trees (Baum & Smith, 2013).
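As a minimal sketch of the likelihood calculation described above, the following Python example reduces the problem to its simplest case: two aligned sequences joined by a single branch, scored under the Jukes-Cantor model. The toy sequences, the function names and the grid search are illustrative assumptions; they are not the procedure used in this thesis, where full trees were analysed with dedicated software.

```python
import math

def jc69_prob(same, t):
    """Jukes-Cantor probability of observing the same base (or one specific
    different base) after a branch of length t (expected substitutions per site)."""
    decay = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * decay if same else 0.25 - 0.25 * decay

def log_likelihood(seq_a, seq_b, t):
    """Sum of per-site log-likelihoods; 0.25 is the equal base frequency."""
    return sum(math.log(0.25 * jc69_prob(a == b, t)) for a, b in zip(seq_a, seq_b))

def ml_branch_length(seq_a, seq_b, step=0.001, t_max=2.0):
    """Grid search for the branch length with the highest likelihood."""
    best_t, best_ll = None, -math.inf
    for i in range(1, int(round(t_max / step)) + 1):
        t = i * step
        ll = log_likelihood(seq_a, seq_b, t)
        if ll > best_ll:
            best_t, best_ll = t, ll
    return best_t, best_ll

# Hypothetical toy alignment: 20 sites, 2 of which differ.
seq_a = "ACGTACGTACGTACGTACGT"
seq_b = "ACGTACGAACGTACCTACGT"
best_t, best_ll = ml_branch_length(seq_a, seq_b)

# Closed-form Jukes-Cantor distance, for comparison with the grid search.
p = sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)
analytic_t = -0.75 * math.log(1.0 - 4.0 * p / 3.0)
print(best_t, analytic_t)
```

The grid search and the closed-form distance should agree to within the grid step. A real ML tree search optimises many branch lengths and model parameters at once and compares topologies, which is why heuristic search software is needed.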
3.2.2 Bayesian inference
Contrary to Maximum Likelihood analysis, Bayesian inference assesses the posterior
probability of the tree, which is the probability that the tree is true given the data, our models
of evolution and our prior beliefs:
$$\Pr(H \mid D) = \frac{\Pr(D \mid H)\,\Pr(H)}{\Pr(D)}$$
Here Pr(H|D) is the probability of the hypothesis (H) given the data (D), also known as the posterior
probability, and Pr(D|H) is the probability of the data given the hypothesis (also known as the
likelihood). Pr(H) is the prior probability of the hypothesis and Pr(D) is the overall probability of
the data, which normalises the posterior. With this mathematical model, the
starting information (the prior) is a determining factor, hence the importance of choosing it with caution
(Baum & Smith, 2013).
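To illustrate how the prior and the likelihood combine into a posterior probability, the short Python sketch below applies Bayes' theorem to three competing tree topologies. The prior and likelihood values are hypothetical numbers chosen for readability, and the function name is an assumption; Bayesian phylogenetic software approximates this calculation by sampling rather than by enumerating all trees.

```python
def posterior(priors, likelihoods):
    """Apply Bayes' theorem to a finite set of competing hypotheses."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(joint.values())  # Pr(D), the normalising constant
    return {h: value / evidence for h, value in joint.items()}

# The three possible rooted topologies for taxa A, B and C (hypothetical values).
priors = {"((A,B),C)": 1 / 3, "((A,C),B)": 1 / 3, "((B,C),A)": 1 / 3}
likelihoods = {"((A,B),C)": 0.006, "((A,C),B)": 0.003, "((B,C),A)": 0.001}

post = posterior(priors, likelihoods)
for tree, prob in post.items():
    print(tree, round(prob, 3))
```

With a flat prior, the posterior is simply proportional to the likelihood, so the first topology receives a posterior probability of 0.6. Changing the prior shifts the posterior accordingly, which is why priors must be chosen with caution.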
3.2.3 Model selection
Before starting a phylogenetic analysis, the substitution model that best fits the data has to be
identified. This model specifies how the characters can evolve between their different states
(Baum & Smith, 2013). Several substitution models exist, ranging from the model with the fewest
parameters, the Jukes-Cantor model (1969), which assumes equal base frequencies and equal
substitution rates for all DNA bases, to more complex models such as the general time-reversible
model (GTR) (Tavaré, 1986). The GTR model assigns different substitution rates to each
transition and transversion as well as variable base frequencies to the characters. The best-fitting
model is chosen with the help of statistical criteria, such as the corrected Akaike Information
Criterion (AICc; Akaike, 1974) or the Bayesian Information Criterion (BIC; Schwarz, 1978).
The model obtaining the smallest value under these criteria is considered to fit the data best.
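The comparison of models by information criteria can be sketched as follows. The log-likelihood values are hypothetical, and counting only the substitution parameters (ignoring branch lengths) is a simplifying assumption made for this illustration. The formulas are the standard definitions of AIC, AICc and BIC.

```python
import math

def aic(log_l, k):
    """Akaike Information Criterion for log-likelihood log_l and k free parameters."""
    return 2 * k - 2 * log_l

def aicc(log_l, k, n):
    """AIC with the small-sample correction for n observations (alignment sites)."""
    return aic(log_l, k) + (2 * k * (k + 1)) / (n - k - 1)

def bic(log_l, k, n):
    """Bayesian Information Criterion."""
    return k * math.log(n) - 2 * log_l

# Hypothetical fits: (log-likelihood, number of free substitution parameters).
# JC69 has none; GTR has 5 relative rates plus 3 free base frequencies.
models = {"JC69": (-12500.0, 0), "GTR": (-12300.0, 8)}
n_sites = 1952  # used here as the sample size

scores = {name: {"AICc": aicc(ll, k, n_sites), "BIC": bic(ll, k, n_sites)}
          for name, (ll, k) in models.items()}
best_aicc = min(scores, key=lambda m: scores[m]["AICc"])
best_bic = min(scores, key=lambda m: scores[m]["BIC"])
print(best_aicc, best_bic)
```

Both criteria reward a higher likelihood and penalise additional parameters. BIC penalises parameters more heavily for large alignments, so the two criteria can disagree when the improvement in likelihood is small.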
3.3 Molecular clock
Obtaining accurate evolutionary timescales and correctly dating past events is fundamental to
the reconstruction of evolutionary histories. Pauling and Zuckerkandl (1963) observed, from
studies on mammals, constant rates of amino-acid change through time, the founding
principle of the molecular clock. Based on the assumption that the number of changes between the
genomes of two organisms is proportional to the time elapsed since they last shared a common
ancestor, the concept of the molecular clock encompasses methods and models that assess
variations in rates of genetic evolution across the tree of life (Lee and Ho, 2016). With
different molecular clock models, one can accommodate variation in the rate of evolution
among genes and lineages in order to approximate the real evolutionary rate, which is then used
to calculate evolutionary time. In Maximum Likelihood and Bayesian inference, a molecular
clock can be used to compute phylograms in which branch lengths are proportional to past
evolutionary changes, yielding information that may complement that
gathered from the fossil record. However, prior knowledge is fundamental to compute such
trees, for example in the form of calibration points that enable the dating of evolutionary events such
as the emergence of a particular clade. Hence the existing fossil record is useful, and fossil data
are essential to calibrate and date the phylogenetic tree. Calibrating a tree consists of using the
age of the oldest known fossil of a monophyletic group as the minimum age for that clade, or even
assigning minimum and maximum age constraints to a specific event. The principles outlined
above were used in the present study.
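The logic of fossil calibration can be illustrated with the simplest possible case, a strict molecular clock with a single calibration point. The Python sketch below uses hypothetical distances and a hypothetical fossil age, and it treats the fossil age as the exact divergence time, whereas in practice a fossil only provides a minimum age. It is far simpler than the Bayesian dating performed in this study.

```python
def calibrate_rate(distance, fossil_age_myr):
    """Substitution rate (substitutions/site/Myr) from a calibrated pair of taxa.
    The distance accumulates along both lineages, hence the factor of two."""
    return distance / (2.0 * fossil_age_myr)

def divergence_time(distance, rate):
    """Divergence time in Myr for another pair, assuming the same constant rate."""
    return distance / (2.0 * rate)

# Hypothetical calibration: two taxa differ by 0.20 substitutions per site and
# the oldest fossil of their clade is 500 million years old.
rate = calibrate_rate(distance=0.20, fossil_age_myr=500.0)

# Hypothetical second pair with half the genetic distance.
age = divergence_time(distance=0.10, rate=rate)
print(rate, age)
```

Under a strict clock, half the genetic distance implies half the divergence time. If the true rate was higher early in the history of the clade, the same calculation would place old divergences too far back in time.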
3.4 Push of the past
Phylogenetic models aim at reconstructing relationships among selected organisms,
highlighting the heterogeneous performance of different clades in a macroevolutionary context.
These variations can partially be explained by differences in their early evolutionary rates. The
effect of the push of the past (POTPa) is a concept introduced in the 1990s, which suggests an
“apparently higher rate of cladogenesis at the beginning of the growth of the actual phylogeny”
(Nee et al., 1994). This effect would be one of the causes of the great diversification and
survivability of today’s extant clades (Budd & Mann, 2018). These clades would have benefited
from a high diversification rate at their origin that consequently increased their probability of
survival. Clades that do not benefit from such greater early diversification rates are thus much
more prone to extinction. This implies that the monophyletic groups that had the “good fortune”
to have a distinctly greater early diversity also had much more time to diversify and survive to
the present. Hence, when the effects of the POTPa are not recognized in phylogenetic analyses
and a rather homogeneous rate of evolution is assumed throughout the tree, the age of the root
will be set excessively early, affecting the accuracy of the evolutionary history depicted in the
phylogenetic tree and resulting in erroneous conclusions. Rates of evolution in ancestral lineages
can be reconstructed using molecular and phenotypic data, as was done in a
previous analysis focusing on arthropods (Lee et al., 2013), where higher rates of evolution
were observed at the origin of the Arthropoda. However, no similar study has yet
addressed the diversification and survival of the Brachiopoda.
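The survivorship bias behind the push of the past can be illustrated with a small simulation. The Python sketch below grows many clades under one and the same birth-death process and compares the early diversity of all clades with that of the clades that survive to the end. All rates, times and names are hypothetical choices made for this illustration, and the simulation is not part of the analyses of this thesis.

```python
import random

def simulate_clade(birth, death, t_early, t_end, rng):
    """Birth-death process starting from one lineage.
    Returns (lineages alive at t_early, lineages alive at t_end)."""
    n, t = 1, 0.0
    n_early = None
    while n > 0:
        t_next = t + rng.expovariate(n * (birth + death))
        if n_early is None and t_next > t_early:
            n_early = n  # no event before t_early, so n is the count at t_early
        if t_next > t_end:
            break  # no further event before the end of the simulation
        t = t_next
        if rng.random() < birth / (birth + death):
            n += 1  # speciation
        else:
            n -= 1  # extinction
    if n_early is None:
        n_early = 0  # the clade died out before t_early
    return n_early, n

rng = random.Random(1)
runs = [simulate_clade(1.0, 1.0, 1.0, 5.0, rng) for _ in range(5000)]

mean_all = sum(early for early, _ in runs) / len(runs)
survivors = [early for early, final in runs if final > 0]
mean_survivors = sum(survivors) / len(survivors)
print(mean_all, mean_survivors)
```

Every clade follows identical rates, yet the clades that happen to survive show a clearly higher early diversity than the average clade. Looking only at surviving clades therefore gives the impression of elevated early diversification, which is the effect described above.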
4 Material and methods
The methodology for this study can be divided into three sections. The first step is the
acquisition of the material for the study. This section includes everything from the collection
of animals to the extraction of genetic material. The second part of the data collection focuses
on retrieving the genes of interest to our study and grouping them in corresponding alignment
files. The third and final step is the actual analysis of the data, using fossil data to calibrate
the molecular clock. The first and second sections were carried out before the time frame of the
present thesis, thanks to the collaboration of other scientists, and are therefore only described
in the Appendix.
Two analyses were conducted using two distinct datasets. First, the method was tested
in a pilot study (proof-of-principle study) using ribosomal RNA (rRNA) sequences; the
second analysis took advantage of a phylogenomic dataset assembled prior to this study.
4.1 rRNA pilot study
To become familiar with the methodology and the software, a pilot study was carried out
using 18S rRNA and 28S rRNA sequences.
The genetic material used in this first study consisted of newly sequenced material
and GenBank material. Accession numbers of these sequences are available in Table 1. Separate
analyses were launched for 18S, 28S and a concatenated dataset grouping both 18S and 28S.
4.1.1 Alignment and model choice
Sequences were realigned using the Q-INS-i algorithm in MAFFT (Katoh & Standley, 2013),
which considers the secondary structure of RNA. With 152 taxa belonging to eight different
phyla (Table 2), the final 18S rRNA alignment contained 1952 sites, while the 28S rRNA
dataset had 140 taxa and 1450 sites. In order to concatenate two datasets, they must contain
the same taxa; therefore, the final concatenated dataset included 76 taxa and 3234 sites (1884
from the 18S partition and 1350 from the 28S partition). Before setting the parameters for the
Bayesian analysis, PAUP* (Swofford, 2003) was used to find the model fitting the data
best, based on a distance matrix and a neighbour joining tree created from the alignment:
#open the alignment in nexus format
exe [filename].nexus
#create a distance matrix and save it in a text file
savedist format=tabText triangle=both diagonal file=[filename]
#create a neighbour joining tree
nj bionj=yes showtree=yes
#find the best model
automodel ShowParamEst=yes
Results are presented in Table 2.
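The concatenation step described above, in which only taxa present in both alignments are kept, can be sketched in Python as follows. The toy sequences and the function name are hypothetical, and the text does not state which tool was used for this step, so this is an illustration of the principle rather than the procedure followed in this study.

```python
def concatenate(alignment_a, alignment_b):
    """Join two alignments (taxon -> aligned sequence) on their shared taxa.
    Returns the combined alignment and the 1-based partition boundaries."""
    shared = sorted(set(alignment_a) & set(alignment_b))
    combined = {taxon: alignment_a[taxon] + alignment_b[taxon] for taxon in shared}
    len_a = len(next(iter(alignment_a.values())))
    len_b = len(next(iter(alignment_b.values())))
    partitions = {"18S": (1, len_a), "28S": (len_a + 1, len_a + len_b)}
    return combined, partitions

# Hypothetical toy alignments; real alignments contain thousands of sites.
rrna_18s = {
    "Discina_striata": "ACGT-A",
    "Lingula_anatina": "ACGTTA",
    "Glottidia_pyramidata": "ACGATA",  # no 28S sequence, so it is dropped
}
rrna_28s = {
    "Discina_striata": "GGCA",
    "Lingula_anatina": "GGTA",
    "Pelagodiscus_atlanticus": "GGCC",  # no 18S sequence, so it is dropped
}

combined, partitions = concatenate(rrna_18s, rrna_28s)
print(sorted(combined), partitions)
```

Keeping the partition boundaries makes it possible to assign a separate substitution model to each gene in the concatenated analysis.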
Table 1 Species names, authorship and GenBank accession numbers of taxa included in this rRNA study. An
asterisk (*) after a GenBank accession number indicates that the sequence was used in the concatenated dataset. A
superscripted B (B) or R (R) after an accession number denotes that
the sequence was used only in the BEAST2 or RAxML analysis, respectively.
Species 18S 28S Authorship
LINGULIFORMEA
Discinoidea
Discinisca stella
JN415657* Dall, 1871
Discinisca tenuis AY842020*; AF202444
Dall, 1871 Discina striata U08333* JQ414037* (Schumacher, 1817)
Discinisca tenuis U08327
(Sowerby, 1847)
Pelagodiscus atlanticus
JQ414034; JQ414035; JQ414036
(King, 1868)
Linguloidea
Glottidia palmeri AF201744* JN618994* Dall, 1871 Glottidia pyramidata U12647
(Stimpson, 1860)
Lingula adamsi U08329
Dall, 1873
Lingula anatina AB747095; U08331; X81631*
Lamarck, 1801
Lingula reevii AB747096B; AH001678;
LC334155
Davidson, 1880
Lingula rostrum AB855774
(Shaw, 1797)
Lingula sp. Not published yet* Not published yet*, AY839250* Brugière, 1791
CRANIFORMEA
Cranioidea
Neoancistrocrania sp. HQ852066* HQ852082*; HQ852083* Laurin, 1992 Neoancistrocrania norfolki AY842019; HQ852055*;
HQ852057*; HQ852058*
HQ852075*; JX575602* Laurin, 1992
Neocrania anomala DQ279934; U08328
(O. F. Müller, 1776) Neocrania huttoni U08334
(Thomson, 1916)
Novocrania anomala HQ837667* HQ852076* (O. F. Müller, 1776)
Novocrania anomala
HQ852068 (O. F. Müller, 1776) Novocrania huttoni
U08334 (Thomson, 1916)
Novocrania lecointei
JX645243*; JX645254* Lee & Brunton, 2001
Novocrania pourtalesi HQ837658*
Dall, 1871 Novocrania sp. HQ852060*; HQ852065*;
HQ837653*; HQ852061*;
HQ852064; Not published
yet*
HQ852070*; HQ852072*;
HQ852073; HQ852077;
HQ852079; HQ852080;
HQ852084*; HQ852086;
HQ852087; HQ852088;
HQ852089; HQ918260*; JN673266*; JN673267*;
JN985738*; Not published yet*
Lee & Brunton, 2001
RHYNCHONELLIFORMEA
Thecideida
Thecideoidea
Lacazella caribbeanensis KC577459; KC577460;
KC577461; KC577462
Cooper, 1977
Minutella sp. KC577466B
Hoffmann & Lüter, 2010 Ospreyella palauensis KC577456
Logan, 2008
Ospreyella depressa KC577455
Lüter, 2003
Ospreyella sp. KC577451; KC577452; KC577453; KC577454
Lüter & Wörheide, 2003
Pajaudina atlantica KC577457; KC577458
Logan, 1988
Thecidellina japonica KC577465B
(Hayasaka, 1838) Thecidellina williamsi KC577464B
Lüter & Logan, 2008
Thecidellina sp. KC577463B
Thomson, 1915
Terebratulida
Cancellothyroidea
Cancellothyris hedleyi AF025929* HQ918285* (Finlay, 1927) Cancellothyris sp. HQ918285
Thomson, 1926
Chlidonophora sp. HQ822064
Dall, 1903
Chlidonophora sp. AF025930* HQ918275* Dall, 1903 Eucalathis murrayi
JN632510 (Davidson, 1878)
Terebratulina retusa GU125759*; U08324* MF619926; MF619927;
MF619928; MF619930; MF619932; MF619933;
MF619934; MF619935;
MF619936*; MF619937*
(Linnaeus, 1758)
Terebratulina septentrionalis
HQ918277; MF619938;
MF619940; MF619941;
MF619944; MF619949
(Couthouy, 1838)
Terebratulina unguicula
HQ918278.1 (Carpenter, 1864)
Dyscolioidea
Abyssothyris wyvillei
HQ918286 (Davidson, 1878) Abyssothyris sp. AF025928* HQ918274* Thomson, 1927
Dyscolia wyvillei HQ918276* HQ918276* (Davidson, 1878)
Dyscolia sp. AF025931
Fischer & Oehlert, 1890 Xenobrochus africanus
HQ918283 (Cooper, 1973)
Gwynioidea
Gwynia capsula AF025940* HQ918267* (Jeffreys, 1859) Kingenidae
Ecnomiosa sp. HQ852096
Cooper, 1977
Fallax neocaledonensis AF025939* HQ918265* Laurin, 1997 Kraussinoidea
Kraussina mercatori HQ852100
Helmcke, 1939
Megerlia truncata U08321
(Linnaeus, 1767) Megerlina sp. AF025943* HQ918269* Deslongchamps, 1884
Pumilus antiquatus HQ852097 HQ918268 Atkins, 1958
Laqueoidea
Coptothyris adamsi KF038315 GQ275348 (Davidson, 1871)
Laqueus californicus U08323; Not published
yetB*; Not published yet*; Not published yet*;
Not published yet*; Not
published yet*; Not published yet*
(Koch, 1848)
Terebratalia coreanica HQ852098
(Adams & Reeve, 1850) Terebratalia transversa AF025945; JF509725;
FJ196115; U12650
(Sowerby, 1846)
Megathyridoidea
Argyrotheca cordata AF119078
(Risso, 1826)
Megathiris detruncata HQ852093
(Gmelin, 1791)
Platidioidea
Amphithyris sp. HQ852094* HQ918270* Thomson, 1918
Terebratelloidea
Anakinetica cumingi HQ852095
(Davidson, 1852) Aneboconcha obscura
GQ261959 Cooper, 1973
Calloria inconspicua AF025938* DQ871292* (Sowerby, 1846)
Calloria variegata
DQ871293 Cooper & Doherty, 1933 Campages sp.
HQ918259 Hedley, 1905
Dallina septigera
JF273494 (Lovén, 1846)
Dallina sp.
GQ265554; GQ261960 Beecher, 1893 Gyrothyris mawsoni AF025941* DQ871294* Thomson, 1918
Gyrothyris williamsi
DQ871295 Bitner, Cohen, Long, Richer de
Forges & Saito, 2008 Magellania venosa
GQ265555; GQ265556;
DQ871297
(Dixon, 1789)
Neothyris lenticularis
DQ871298 (Deshayes, 1839) Neothyris parva AF025944* DQ871299* Cooper, 1982
Nipponithyris sp.
KF313140 Yabe & Hatai, 1934
Terebratella dorsata
GQ261961; GQ265558; GQ265559
(Gmelin, 1791)
Terebratella sanguinea DQ871291* DQ871300; GQ265896* (Leach, 1814)
Terebratuloidea
Dallithyris murrayi
HQ918280 Muir-Wood, 1959
Gryphus vitreus AF025932* HQ918272* (Born, 1778)
Kanakythyris pachyrhynchos HQ852090* HQ918281* Laurin, 1997 Liothyrella neozelanica HQ822070; U08332* GQ261962* Thomson, 1918
Liothyrella uva U08330* GQ261963* (Broderip, 1833)
Magellania fragilis AF202110* DQ871296* Smith, 1907 Stenosarina crosnieri AF025934
(Cooper, 1983)
Stenosarina globosa
HQ918273, HQ918282 Laurin, 1997
Terebratulidae sp.
HQ918284 Gray, 1840 Zeillerioidea
Macandrevia cranium AF025942* HQ918266* (O. F. Müller, 1776)
Macandrevia sp.
HQ918271 King, 1859
Rhynchonellida
Dimerelloidea
Cryptopora gnomon HQ852102B
Jeffreys, 1869
Hemithiridoidea
Hemithiris psittacea HPU08322* GQ265895* (Gmelin, 1791) Notosaria nigricans
AY839243 (Sowerby, 1846)
Norelloidea
Frieleia halli
GQ261964 Dall, 1895 Manithyris rossi
GQ261965 Foster, 1974
Neorhynchia sp. AF025937
Thomson, 1915
Parasphenarina cavernicola HQ852101* JQ663539* Motchuro-Dekova, Saito & Endo,
2002 Tethyrhynchia mediterranea HQ852105B* HQ918264* Logan in Logan & Zibrowius,
1994
Pugnacoidea
Acanthobasiliola doederleini HQ852099
(Davidson, 1886)
Basiliola beecheri HQ852103B* HQ918262* (Dall, 1895)
Basiliola lucida HQ852104B* HQ918263* (Gould, 1862) Eohemithiris grayi AF025936* AY839242* (Woodward, 1855)
Brachiopod sp. AF025935
PHORONIDA
Actinotrocha harmeri KC161253; KC161254
Zimmer, 1964
Actinotrocha sp. KC161255
Müller, 1846 Phoronis architecta AF025946* EU334109*
Phoronis australis AF119079; AF202111;
EU334122*; EU334123*; KT030908; KT030909;
KT030910
EU334110; EU334111*;
EU334112*
Haswell, 1883
Phoronis embryolabi
KY643695* Temereva et Chichvarkhin, 2016 Phoronis emigi AB621913
Hirose, Fukiage, Katoh &
Kajihara, 2014
Phoronis hippocrepia AF202112; JF509726; KT030907; U08325
Wright, 1856
Phoronis ijimai AB752304* KY643697*; KY643698* Oka, 1897
Phoronis muelleri EU334125*; KJ193748 EU334114* Selys-Longchamps, 1903 Phoronis ovalis EU334126*; GU125758B EU334115* Wright, 1856
Phoronis pallida EU334127* EU334116* (Schneider, 1862)
Phoronis psammophila U36271
Cori, 1889 Phoronis sp. AB106269*; KT030901*;
KT030903*; KT030904*;
KT030906*
Wright, 1856
Phoronis vancouverensis AY210450; FJ196118
Pixell, 1912
Phoronopsis californica EU334129* EU334118* Hilton, 1930
Phoronopsis harmeri AF123308*; EU334130* EU334119*; EU334120 Pixell, 1912
ANNELIDA
Urechis caupo AF119076R AF342804 Fisher & MacGinitie, 1928 Phascolosoma granulatum X79874R AF519279R Wesenberg-Lund, 1959
Drawida sp.
HQ728964B Michaelsen, 1900
MOLLUSCA
Abra alba AM774533* KC429505* (W. Wood, 1802)
Astarte undata KPD68176* KP068126* Gould, 1841
Chaetoderma sp. MG595728B* AY145397B* Lovén, 1844 Dentalium octangulatum AY145372B* AY145403B* Donovan, 1804
Koreamya arcuata AB907557B; AB907558B;
AB907560B
(A. Adams, 1856)
Koreamya sp. AB907561B
Lützen, Hong & Yamashita,
2009
NEMERTEA
Antarctonemertes riesgoae KF935322R KF935378 Taboada et al., 2013
Micrura fasciolata
HQ856846B Ehrenberg, 1828 Ovicides paralithodis
AB704416 Kajihara & Kuris, 2013
Tubulanus tamias LCO42092R AB854622B Kajihara, Kakui, Yamasaki &
Hiruta, 2015
PLATYHELMINTHES
Discocelis tigrina U70078B* MF398475B* (Blanchard, 1847)
Geocentrophora wagini AJ012509B* AY157156B* Timoshkin, 1984
Neoheterocotyle rhinobatidis
AF348361B (Young, 1967)
Stenostomum leucops FJ384804B* AY157151B* (Duges, 1828)
BRYOZOA
Favosipora rosea FJ409605B* FJ409581B* Gordon & Taylor, 2001
Fredericella sultana JN680926B* JN681029B* Blumenbach, 1779 Pectinatella magnifica
FJ409576B (Leidy, 1851)
Plumatella emarginata KC461939B* KC462025B* Allmann, 1844
ROTIFERA
Lecane bulla DQ297698B* AY829083B* (Gosse, 1851)
Notommata sp. DQ297710B* DQ297748B* Ehrenberg, 1830 Sinantherina socialis AY210451B* AY210471B* (Linnaeus, 1758)
Table 2 Best-fitting models according to PAUP*. The best models were selected using the BIC and AICc criteria.
The best model found for each dataset includes a proportion of invariant sites and gamma-distributed rate
variation among sites. p.i. stands for proportion of invariant sites. These values were used to parameterise the BEAST2
analyses.
4.1.2 Bayesian analyses
The BEAST2 package (Bouckaert et al., 2014), v.2.4.8, was used for the Bayesian inference.
First, BEAUti was used to create xml files processed by BEAST2 and containing the alignments
(nexus format) and priors, as well as all the parameters of the MCMC chain for the analysis.
This fundamental and determinant part of a Bayesian analysis requires parameters specific to
the selected dataset. In total, three different xml files were created: one containing 18S rRNA
alignment, the second containing 28S rRNA sequences and the third containing a concatenation
of both alignments.
Site model and substitution model parameters were chosen using the results of the
PAUP* analysis (Table 2). The Tamura-Nei model (TrN, also known as TN93) fits the
18S rRNA partition best; it sets equal transversion rates but variable transition rates (Tamura &
Nei, 1993). The general time-reversible (GTR) model, which allows different nucleotide
frequencies and a different rate for every type of change, was chosen for the 28S rRNA
dataset. In addition, gamma-distributed rate variation across sites (Yang, 1994) and a proportion of invariable sites
(Shoemaker & Fitch, 1989) are combined with the models, indicated by +G and +I respectively
(Table 2).
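To make the difference between the two substitution models concrete, the following Python sketch builds a time-reversible rate matrix from six exchangeabilities and four base frequencies. The example uses the 18S values of Table 2, read under two assumptions that are not stated in the text: that the rmatrix column lists the AC, AG, AT, CG and CT rates relative to GT = 1, and that the base frequencies are given for A, C and G with T as the remainder. The function name `build_rate_matrix` is hypothetical.

```python
BASES = "ACGT"


def build_rate_matrix(rates, freqs):
    """Return a reversible rate matrix Q as a nested dict.

    rates: exchangeabilities keyed by unordered pair ("AC", "AG", "AT", "CG", "CT", "GT").
    freqs: equilibrium base frequencies keyed by base.
    Off-diagonal entries are q[i][j] = rate(i, j) * freq(j); each row sums to
    zero and Q is scaled to one expected substitution per unit time.
    """
    q = {i: {j: 0.0 for j in BASES} for i in BASES}
    for i in BASES:
        for j in BASES:
            if i != j:
                pair = "".join(sorted(i + j))
                q[i][j] = rates[pair] * freqs[j]
        q[i][i] = -sum(q[i][j] for j in BASES if j != i)
    scale = -sum(freqs[i] * q[i][i] for i in BASES)
    for i in BASES:
        for j in BASES:
            q[i][j] /= scale
    return q


# 18S row of Table 2 (TrN: all transversions equal, two transition rates).
# Assumed column order: AC, AG, AT, CG, CT, with GT fixed to 1.
rates_18s = {"AC": 1.0, "AG": 1.96, "AT": 1.0, "CG": 1.0, "CT": 4.61, "GT": 1.0}
# Assumed order A, C, G; the frequency of T is taken as the remainder.
freqs_18s = {"A": 0.25, "C": 0.22, "G": 0.29}
freqs_18s["T"] = 1.0 - sum(freqs_18s.values())

q_18s = build_rate_matrix(rates_18s, freqs_18s)
```

Under GTR the six entries of `rates` would all be free to differ, as in the 28S rows of Table 2; the +G and +I components act on top of this matrix and are not modelled here.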
The relaxed clock lognormal model (Drummond et al., 2006) was used for all datasets.
The fossil calibrations used as priors are listed in Table 3. Long MCMC chains of
100 million generations were used, with trees stored every 20 000 generations.
Dataset | Best model | rates | shape | p.i. | lset nst | lset rclass | lset rmatrix | lset base frequencies
18S | TRN + I + G | gamma | 0.49 | 0.26 | 6 | abaaea | 1 1.96 1 1 4.61 | 0.25 0.22 0.29
28S | GTR + I + G | gamma | 0.58 | 0.05 | 6 | abcdef | 0.77 1.86 1.36 0.71 3.28 | 0.18 0.28 0.33
18S reduced | TRN + I + G | gamma | 0.5 | 0.3 | 6 | abaaea | 1 1.91 0.87 0.87 4.27 | 0.26 0.29 0.27
28S reduced | GTR + I + G | gamma | 0.69 | 0.2 | 6 | abcdef | 0.79 1.93 1.44 0.75 3.66 | 0.18 0.27 0.33
Analyses were run either on a local computer (the smaller ones) or on clusters,
such as Rackham, a high-performance computer cluster at the Uppsala Multidisciplinary Center for
Advanced Computational Science (UPPMAX), or on CIPRES (Miller et al., 2010), the
Cyberinfrastructure for Phylogenetic Research. To launch analyses on UPPMAX, bash scripts
were created following this structure:
#!/bin/bash
#SBATCH -A [project-name]
#SBATCH -p [partition-used] -n [number-of-nodes]
#SBATCH -t D-HH:MM:SS
#SBATCH --mail-user=[email]
#SBATCH --mail-type=ALL
#SBATCH -J [job name]
#SBATCH -o [output file name]
#SBATCH -e [error file name]
module load [necessary modules for the analysis]
java -jar [jar file of the program and its directory] -overwrite [input file and its directory]
The three xml files were processed by BEAST2, which output information about
the run as a .log file and information about all the trees explored during the analysis as a .trees
file. Tracer v1.7.1 (Rambaut et al., 2018) was used to summarize parameter estimates and to check
the convergence of the run using the .log file. Then, in order to summarize the results of the
Bayesian analysis contained in the .trees file, maximum clade credibility trees were built using
TreeAnnotator. Different burn-in values were chosen for each analysis: 40 million states for
18S, 10 million states for 28S and 40 million states for the combined analysis. Finally, FigTree
v.1.4.3 (Rambaut, 2017) was used to visualize the final phylogenetic tree.
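The chain length, sampling interval and burn-in values given above translate into numbers of stored and retained trees. The short Python sketch below works through that arithmetic; the function name `burnin_summary` is hypothetical, and the counts ignore the initial state of the chain, which is an assumption about how samples are counted.

```python
def burnin_summary(chain_length, sample_every, burnin_states):
    """Return the burn-in percentage and the stored, discarded and retained sample counts."""
    stored = chain_length // sample_every
    discarded = burnin_states // sample_every
    return {
        "burnin_percent": 100.0 * burnin_states / chain_length,
        "stored": stored,
        "discarded": discarded,
        "retained": stored - discarded,
    }


CHAIN_LENGTH = 100_000_000  # generations, as stated in the text
SAMPLE_EVERY = 20_000       # trees stored every 20 000 generations

summaries = {
    "18S": burnin_summary(CHAIN_LENGTH, SAMPLE_EVERY, 40_000_000),
    "28S": burnin_summary(CHAIN_LENGTH, SAMPLE_EVERY, 10_000_000),
    "18S+28S": burnin_summary(CHAIN_LENGTH, SAMPLE_EVERY, 40_000_000),
}

for dataset, summary in summaries.items():
    print(dataset, summary)
```

Under these assumptions, the 18S and combined analyses keep 3000 of 5000 stored trees (40% burn-in) and the 28S analysis keeps 4500 (10% burn-in).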
Table 3 Fossil calibrations used in this study. For brachiopods, the age refers to the minimum age of the crown
groups, based on the oldest fossil known (Kaesler & Selden, 1997–2007; Peters & McClennen, 2016). Sometimes
a maximum age value is included.
Crown | Age (MYA) | Geological period | Species | Specimen number | Reference
Brachiopoda | 530 | Early Cambrian | Aldanotreta sunnaginensis, Pelman, 1977 | | Pelman, 1977
Rhynchonelliformea | 460.9 | Middle Ordovician | Dorytreta ovata, Cooper 1956 | 117192 | Cooper, 1956
Thecideida | 232 | Late Triassic | Thecospira tirolensis | |
Terebratulida | 419.2 | Boundary Silurian/Devonian | Brachyzyga absoleta | 64/12904 |
Craniformea | 467.5 | Middle Ordovician | Pseudocrania petropolitana, Pander 1830 | 588-42 | Pander, 1830
Linguliformea | 242 | Middle Triassic | Glottidia tenuissima | |
Discinoidea | 471.8 | Early Ordovician | Schizotreta sp. 1, Hansen and Holmer, 2011 | TSGF17017 | Hansen & Holmer, 2011
Annelida | 467.5-636.1 | Middle Ordovician/Late Ediacaran | Xanioprion viivei, Hints & Nõlvak, 2006 | GIT 424–19 | Parry et al., 2014; Benton et al., 2015
Arthropoda | 515-636.1 | Cambrian/Late Ediacaran | Yicaris dianensis, Zhang et al., 2007 | YKLP 10840 | Zhang et al., 2007; Benton et al., 2015
Mollusca | 532-636.1 | Early Cambrian/Late Ediacaran | Aldanella yanjiahensis, Chen 1984 | TU YXII02-02 | Chen, 1984; Benton et al., 2015
4.1.3 RAxML analysis
Similarly to the Bayesian inference, three Maximum Likelihood analyses were launched, and
support values for the phylogenetic trees were calculated using the bootstrap, a method that assesses
the confidence in a statistical analysis by producing random, pseudoreplicate datasets and
analysing each of them to see how often a specific result is obtained (Baum & Smith, 2013). A
RAxML (Stamatakis, 2014) command combining a message passing interface (MPI) and
threads for parallelization was used in CIPRES:
raxmlHPC-HYBRID -T 4 -n [result file] -s [input file] -m [GTRCAT] -p 12345 -f a -N
1000 -x 12345 --asc-corr lewis
This command ran 1000 rapid bootstrap replicates, with 12345 used as the random seed for both the
parsimony starting trees (-p) and the rapid bootstrap (-x), which makes runs reproducible. The Lewis
ascertainment bias correction was selected. The final best ML tree was drawn in FigTree with bootstrap values plotted.
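The idea behind bootstrap support can be sketched in a few lines of Python. This illustrates the standard nonparametric bootstrap (resampling alignment columns with replacement), not RAxML's rapid bootstrap heuristic; the function names, the toy alignment and the per-replicate clade sets are hypothetical.

```python
import random


def bootstrap_replicate(alignment, rng):
    """Build one pseudoreplicate by resampling alignment columns with replacement."""
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[c] for c in columns) for taxon, seq in alignment.items()}


def support_percent(clades_per_replicate, clade):
    """Percentage of replicates whose inferred tree contains `clade` (a frozenset of taxa)."""
    hits = sum(1 for clades in clades_per_replicate if clade in clades)
    return 100.0 * hits / len(clades_per_replicate)


toy_alignment = {
    "taxon_A": "ACGTAC",
    "taxon_B": "ACGTTC",
    "taxon_C": "TCGAAC",
}
rng = random.Random(12345)  # a fixed seed makes the resampling reproducible
replicate = bootstrap_replicate(toy_alignment, rng)

# In a real analysis each replicate would be analysed to obtain a tree; here the
# clades recovered from four imaginary replicates are simply written out by hand.
ab = frozenset({"taxon_A", "taxon_B"})
bc = frozenset({"taxon_B", "taxon_C"})
support_ab = support_percent([{ab}, {ab}, {bc}, {ab}], ab)
```

A clade recovered in three of four replicates receives a support value of 75, which is how the values plotted on the ML trees should be read.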
4.2 Phylogenomic analyses
For these analyses, our own new transcriptome data (n=13) were clustered with published data (n=5)
and orthologous genes from Kocot et al. (2016) across Lophotrochozoa, known as the LB106 gene
subset (n=106). The latter are protein-coding metabolic genes, which were corrected for
heterogeneous evolutionary rates and were analysed as individual partitions. The full
phylogenomic matrix consisted of 103 taxa with 75% matrix occupancy and 22 000 amino acid
sites.
4.2.1 Dataset and model choice
The final dataset used for the study was reduced, which considerably decreased the computation
time of the analysis. Taxa with a poor fossil record were preferentially removed from the analysis.
Sequences used for the phylogenomic analyses are summarized in Table 4. The best-fitting
evolutionary model for this dataset, LG, was determined using PartitionFinder (Lanfear et al.,
2017). This empirical amino acid substitution model was combined with a gamma model of rate
heterogeneity, which accounts for the variability of evolutionary rates across sites.
Table 4 Species names in alphabetical order, authorship, GenBank accession numbers and references of taxa
included in this phylogenomic study.
Species name | Phyla | Accession Number(s) / Version / URL | Authorship | References
Capitella teleta | Annelida | JGI v1.0 | Blake, Grassle and Eckelbarger, 2009 | Kocot et al., 2016
Crassostrea gigas | Mollusca | v9 protein models | (Thunberg, 1793) | Kocot et al., 2016
Daphnia pulex | Arthropoda | JGI filtered gene models v1.1 | Leydig, 1860 | Kocot et al., 2016
Drosophila melanogaster | Arthropoda | Inparanoid 7.0 processed sequences | Meigen, 1830 | Kocot et al., 2016
Glottidia pyramidata | Brachiopoda | SRX731468 | (Stimpson, 1860) | Kocot et al., 2016
Hemithiris psittacea | Brachiopoda | SRX731469 | (Gmelin, 1791) | Kocot et al., 2016
Kraussina rubra | Brachiopoda | PRJNA289348 (BioProject) | (Pallas, 1760) | Laumer et al., 2015
Laqueus erythraeus Kocot | Brachiopoda | | Dall, 1920 |
Lineus lacteus | Nemertea | SRX565174-SRX565175 | (Rathke, 1843) | Kocot et al., 2016
Lineus ruber | | SRX565182-SRX565183 | (Müller, 1774) | Kocot et al., 2016
Lingula anatina | Brachiopoda | | Lamarck, 1801 | Luo et al., 2015
Lingula Gert | Brachiopoda | | Bruguière, 1791 |
Liothyrella uva | Brachiopoda | | (Broderip, 1833) |
Magellania venosa | Brachiopoda | | (Dixon, 1789) |
Mytilus edulis | Mollusca | SRX565221-SRX565224 | Linnaeus, 1758 | Kocot et al., 2016
Novocrania anomala | Brachiopoda | SRX731472 | (O.F. Müller, 1776) | Kocot et al., 2016
Octopus vulgaris | Mollusca | http://datadryad.org/handle/10255/dryad.34644 | Cuvier, 1797 | Kocot et al., 2016
Phoronis architecta | Phoronida | | Andrews, 1890 |
Phoronis australis | Phoronida | | Haswell, 1883 |
Phoronis vancouverensis | Phoronida | SRX731474 | Pixell, 1912 | Kocot et al., 2016
Phoronopsis harmeri | Phoronida | | Pixell, 1912 |
Pomatoceros lamarckii | Annelida | SRR516531 | (Quatrefages, 1866) | Kocot et al., 2016
Priapulus caudatus | Cephalorhyncha | SRX731477 | Lamarck, 1816 | Kocot et al., 2016
Terebratalia transversa | Brachiopoda | | (Sowerby, 1846) |
Tubulanus polymorphus Halanych | Nemertea | SRX732127 | Renier, 1804 | Kocot et al., 2016
Tubulanus polymorphus Struck | Nemertea | SRX534879-SRX534881 | Renier, 1804 | Kocot et al., 2016
X1 BC Laqueus vancouveriensis | Brachiopoda | Not published yet | Davidson, 1887 | This study
X1 SF Hemithiris psittacea | Brachiopoda | Not published yet | (Gmelin, 1791) | This study
X2 Liothyrella sp S Ork | Brachiopoda | Not published yet | Thomson, 1916 | This study
X2 TN Terebratulina sp | Brachiopoda | Not published yet | d'Orbigny, 1847 | This study
X3 Pictothyris picta Japan | Brachiopoda | Not published yet | (Dillwyn, 1817) | This study
X3 Terebratulina septentrionalis | Brachiopoda | Not published yet | (Couthouy, 1838) | This study
X4 GM Mollusc | Mollusca | Not published yet | | This study
X4 lingula reevi Hawaii | Brachiopoda | Not published yet | Davidson, 1880 | This study
X4 Pla1 Platidia sp Japan | Brachiopoda | Not published yet | Costa, 1852 | This study
X5 Laqueus blanfordi Japan | Brachiopoda | Not published yet | (Dunker, 1882) | This study
X5 P2 Pictothyris picta | Brachiopoda | Not published yet | Thomson, 1927 | This study
X6 PA3 Pajaudina atlantica | Brachiopoda | Not published yet | Logan, 1988 | This study
4.2.2 RAxML analysis
Maximum Likelihood analyses of the complete and reduced taxon sets were undertaken using the
PROTCATLG model, which combines the LG amino acid substitution matrix with the CAT
approximation of rate heterogeneity. To find a single best tree, 350
rapid bootstrap replicates followed by a thorough ML search were run. Similarly to the rRNA
datasets, a RAxML version combining a message passing interface and threads for
parallelization was used:
raxmlHPC-HYBRID_8.2.12_comet -s infile -n result -q partition -k -f a -N 350 -p
12345 -x 12345 -m PROTCATLG
As an alternative to the LG empirical model, another ML analysis was carried out using
Dayhoff recoding (Dayhoff, 1978), which consists of sorting the amino acids into six groups
with distinct biochemical properties. This recoding process is used to reduce the effects of
substitution saturation and compositional heterogeneity (Hrdy et al., 2004). The highest
likelihood trees and their corresponding bootstrap values were drawn in FigTree.
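As an illustration of the recoding step, the Python sketch below maps each amino acid to one of six Dayhoff categories. The grouping used here (AGPST, DENQ, HKR, ILMV, FWY, C) is the commonly cited six-state scheme and is assumed to match the one applied in this study; the digit labels and the function name `dayhoff_recode` are arbitrary choices made for the example.

```python
# Commonly used six-state Dayhoff grouping; the digit assigned to each group is arbitrary.
DAYHOFF_GROUPS = {
    "AGPST": "0",  # small
    "DENQ": "1",   # acidic and amide
    "HKR": "2",    # basic
    "ILMV": "3",   # aliphatic hydrophobic
    "FWY": "4",    # aromatic
    "C": "5",      # cysteine
}
RESIDUE_TO_STATE = {
    residue: state for group, state in DAYHOFF_GROUPS.items() for residue in group
}


def dayhoff_recode(sequence):
    """Recode an amino acid sequence into six states.

    Gaps, ambiguity codes and any other unrecognised characters are left unchanged.
    """
    return "".join(RESIDUE_TO_STATE.get(residue, residue) for residue in sequence.upper())


recoded = dayhoff_recode("ACDEFGHIK-")
print(recoded)  # prints 051140232-
```

Substitutions within a group (for example between I, L, M and V) disappear after recoding, which is why the approach dampens saturation at the cost of discarding some signal.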
4.2.3 Bayesian inference for phylogenetic reconstruction
The Bayesian analysis was performed using the CAT-GTR model (Lartillot & Philippe, 2004)
implemented in PhyloBayes (Lartillot & Philippe, 2004 and 2006; Lartillot et al., 2007). This
model is particularly adapted for amino acid alignments as it accounts for site-specific features
of protein evolution. The PhyloBayes analyses were run on Stanford’s cluster Sherlock, using
a bash script:
mpirun -n [number of processes running in parallel] pb_mpi -d [alignment in PHYLIP
format] -cat -gtr [chain name]
Convergence of the chain was assessed using the bpcomp and tracecomp programs
implemented in PhyloBayes. The readpb program produced summary output files, such
as a majority-rule posterior consensus tree.
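A bipartition-based convergence check of the kind reported by bpcomp can be sketched as follows. The sketch assumes that bipartition posterior frequencies have already been extracted from two independent chains; it only shows how a maximum and a mean discrepancy between the chains could be computed. The function name `compare_bipartitions` and the toy frequencies are hypothetical, and the code does not reproduce PhyloBayes itself.

```python
def compare_bipartitions(freq_chain1, freq_chain2):
    """Return (max_diff, mean_diff) of bipartition frequencies between two chains.

    Each argument maps a bipartition (here a frozenset of taxa on one side of
    the split) to its posterior frequency in that chain. A bipartition missing
    from one chain counts as frequency 0 there.
    """
    bipartitions = set(freq_chain1) | set(freq_chain2)
    if not bipartitions:
        return 0.0, 0.0
    diffs = [
        abs(freq_chain1.get(bp, 0.0) - freq_chain2.get(bp, 0.0)) for bp in bipartitions
    ]
    return max(diffs), sum(diffs) / len(diffs)


# Toy frequencies for three splits (hypothetical values).
split_a = frozenset({"Lingula", "Glottidia"})
split_b = frozenset({"Novocrania", "Lingula", "Glottidia"})
split_c = frozenset({"Novocrania", "Kraussina"})

chain1 = {split_a: 1.0, split_b: 0.8}
chain2 = {split_a: 1.0, split_b: 0.6, split_c: 0.1}

max_diff, mean_diff = compare_bipartitions(chain1, chain2)
```

Small discrepancies indicate that the chains sample similar topologies; the thresholds considered acceptable should be taken from the PhyloBayes manual rather than from this sketch.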
4.2.4 Bayesian inference for molecular dating
Molecular dating in PhyloBayes requires a fixed tree topology. From this tree topology and by
specifying the outgroups to root the tree, information related to calibration can be added to a
new analysis to obtain the divergence times. The computation time for these two analyses
(topology and molecular dating) in PhyloBayes was longer than the time frame of this thesis.
For this reason, the molecular dating analysis was run using the BEAST2 package, with the
majority-rule posterior consensus tree from PhyloBayes as the starting tree. This calibration
analysis was run with the reduced dataset containing the transcriptomes of 38 taxa, the same
sequences used for the Bayesian inference for phylogenetic reconstruction.
The gamma site model was used, with the substitution rate, shape and proportion of
invariant sites estimated. The Dayhoff substitution model was used as it was the model
option implemented in BEAST2 most similar to the CAT-GTR model. The relaxed clock lognormal model
was set and the clock rate was estimated. The birth-death model was used, and fossil
calibrations were added using the originate option, so that each calibration gives information on the
time at which the clade diverged from its sister clade. An MCMC chain of 100 million generations was set, with trees saved
every 15 000 generations.
5 Results
5.1 rRNA datasets
5.1.1 Topology
The three maximum clade credibility trees were rooted using the phylum most genetically
distant from brachiopods in our dataset, Rotifera. The rotifer Sinantherina socialis was set as
the outgroup. In the analyses of the 18S rRNA and the concatenated dataset, all the phyla are resolved
as monophyletic. However, within the Brachiopoda, Linguliformea is only monophyletic in the
tree resulting from the Bayesian analysis of the 18S rRNA dataset (Fig. 2). The RAxML tree
has very low support at its basal nodes, such as a bootstrap support (BS) of only 23 for the
node splitting Brachiopoda from Urechis caupo, a Polychaeta. The 28S rRNA tree depicts a
different topology, resolving Brachiopoda as polyphyletic (Fig. 3). Contrary to the 18S rRNA
dataset, the support values are high all over the tree. The polyphyly of Brachiopoda is fully
supported by the bootstrap replicates. The concatenated dataset depicts the same evolutionary
history as 28S (Fig. 4). However, the polyphyly of Brachiopoda is poorly supported by the
bootstrap replicates. Interestingly, in all the RAxML analyses, Linguliformea are depicted as
polyphyletic.
Figure 2 Cladogram depicting phylogenetic relationships among Brachiopoda and relatives based on 18S rRNA.
Topology originated from RAxML analyses. Support values indicated for each node are the result of Maximum Likelihood
bootstrap replicates. Branches leading to brachiopod taxa are highlighted, in dark green for Craniformea, in lighter
green for Linguliformea and in blue for Rhynchonelliformea. LB added to the taxon name indicates a long branch
leading to the tip.
Figure 3 Cladogram depicting phylogenetic relationships among Brachiopoda and relatives based on 28S rRNA.
Topology originated from RAxML analyses. Support values indicated for each node are the result of Maximum
Likelihood bootstrap replicates. Branches leading to Brachiopod taxa are highlighted, in dark green for
Craniformea, in lighter green for Linguliformea and in blue for Rhynchonelliformea.
Figure 4 Cladogram depicting phylogenetic relationships among Brachiopoda and relatives based on a
concatenated dataset of 18S rRNA and 28S rRNA. Topology originated from RAxML analyses. Support values
indicated for each node are the result of Maximum Likelihood bootstrap replicates. Branches leading to
Brachiopod taxa are highlighted, in dark green for Craniformea, in lighter green for Linguliformea and in blue for
Rhynchonelliformea. LB added to the taxon name indicates a long branch leading to the tip.
5.1.2 rRNA molecular dating
The three maximum clade credibility trees depict different but overall very early emergence
dates for all clades (Figs. 5-7). The root age is older than 1.4 billion years for the three
datasets (18S=1487.59 Mya, 28S=1532.95 Mya, 18S+28S=1539.48 Mya). The emergence date
for the Brachiopoda is found around 850 Mya (18S=813.52 Mya, 28S=886.65 Mya,
18S+28S=875.03 Mya). The results for the dating of the different orders fluctuate more, as the
emergence of the Rhynchonelliformea has been estimated to be between 549.3 Mya (28S) and
718.53 Mya (18S). A spread of similar magnitude is found for Craniformea, from 500.98 Mya
(18S) through 567.6 Mya (concatenated dataset) to 661.53 Mya (28S).
Figure 5 Dated maximum clade credibility tree depicting phylogenetic relationships among Brachiopoda and
relatives based on a concatenated dataset of 18S rRNA and 28S rRNA. Topology and node ages originated from
BEAST2 analyses. Node ages are indicated in million years. Branches leading to brachiopod taxa are highlighted,
in dark green for Craniformea, in lighter green for Linguliformea and in blue for Rhynchonelliformea.
Figure 6 Dated maximum clade credibility tree depicting phylogenetic relationships among Brachiopoda and
relatives based on 18S rRNA. Topology and node ages originated from BEAST2 analyses. Node ages are indicated
in million years. Branches leading to brachiopod taxa are highlighted, in dark green for Craniformea, in lighter
green for Linguliformea and in blue for Rhynchonelliformea.
Figure 7 Dated maximum clade credibility tree depicting phylogenetic relationships among Brachiopoda and
relatives based on 28S rRNA. Topology and node ages originated from BEAST2 analyses. Node ages are indicated
in million years. Branches leading to brachiopod taxa are highlighted, in dark green for Craniformea, in lighter
green for Linguliformea and in blue for Rhynchonelliformea.
5.2 Phylogenomic datasets
5.2.1 PhyloBayes topology
This tree was rooted using the scalidophoran Priapulus caudatus as the outgroup (Fig. 8).
All the phyla are resolved as monophyletic, with relatively high support. Nevertheless, the
posterior probability for the monophyly of Brachiopoda is only 0.7. Within Brachiopoda, Linguliformea
and Rhynchonelliformea are monophyletic. However, the two taxa representing
the order Craniformea are located in different parts of the tree. Novocrania anomala is
resolved as the sister taxon to Linguliformea, with a posterior probability of 0.9. The other
Craniformea taxon (X2_BC_Novocrania_sp) forms a clade with Kraussina rubra, a member of the
Rhynchonelliformea, with a high support value of almost 1. The branch of the taxon
X2_BC_Novocrania_sp is excessively long. The orders in Rhynchonelliformea are not all
monophyletic: Terebratulida form a paraphyletic group, as Pajaudina atlantica, a member of the Thecideida,
is found among the Terebratulida.
Figure 8 Phylogenetic relationships among Brachiopoda and relatives based on a phylogenomic dataset.
Topology, branch lengths and support values originated from PhyloBayes analyses. Support for each node is
indicated as a posterior probability.
5.2.2 RAxML LG and Dayhoff
The highest likelihood trees of the RAxML LG (Fig. 9) and RAxML Dayhoff (Fig. 10) analyses have
the same topology and identical support values, which are overall very high. Moreover, the
topology of the tree for the Brachiopoda clade is fully supported by the bootstrap values. As
for the PhyloBayes analysis, the two trees were rooted with Priapulus caudatus; however,
the topology of the outgroups is different in the RAxML trees because Nemertea is resolved as
the sister group to Mollusca. This node has a quite low support value of 55.
For the RAxML analyses, the taxon X2_BC_Novocrania_sp was removed. The
monophyly of Linguliformea and Craniformea is fully supported by these two Maximum
Likelihood trees. Interestingly, a long branch leads to Kraussina rubra, a member of the
Rhynchonelliformea. This same taxon formed a clade with the problematic Craniformea taxon,
X2_BC_Novocrania_sp, in the PhyloBayes tree.
Figure 9 Phylogenetic relationships among Brachiopoda and relatives based on a phylogenomic dataset.
Topology, branch lengths and support values originated from RAxML analyses, using the LG model. Support for
each node is indicated as a Maximum Likelihood bootstrap value. Branches leading to brachiopod taxa are
highlighted, in dark green for Craniformea, in lighter green for Linguliformea and in blue for Rhynchonelliformea.
Figure 10 Phylogenetic relationships among Brachiopoda and relatives based on a phylogenomic dataset.
Topology, branch lengths and support values originated from RAxML analyses, using Dayhoff recoding. Support
for each node is indicated as a Maximum Likelihood bootstrap value. Branches leading to brachiopod taxa are
highlighted, in dark green for Craniformea, in lighter green for Linguliformea and in blue for Rhynchonelliformea.
5.2.3 Molecular dating
The tree depicts very early divergence dates for the brachiopods and the other lophotrochozoan
phyla (Fig. 11). The age of the root is estimated at more than 1.1 billion years (1103.48
Mya). The divergence of Brachiopoda from their sister group Phoronida is estimated at
612.97 Mya, and an age of 591.77 Mya is calculated for Brachiopoda. The clade comprising
the orders Linguliformea and Craniformea is 518.69 million years old. Linguliformea is dated at 225.94
Mya, about half the age of the Rhynchonelliformea (450.75 Mya).
The node bars indicating the uncertainty of the estimated ages are very wide, around
100 million years for the nodes marking the divergence of Phoronida and Brachiopoda as well
as for the Brachiopoda. The interval gets wider towards the deeper nodes of the tree, reaching
more than 200 million years at the divergence between Annelida and the other phyla. The
younger nodes are not, however, better constrained. For example, the speciation event between
Liothyrella uva and Liothyrella sp. is set at 70.86 Mya, but the error bars indicate a 95% confidence
interval of 150 million years.
Figure 11 Dated maximum clade credibility tree depicting phylogenetic relationships among Brachiopoda and
relatives based on a phylogenomic dataset. Topology, branch lengths and node ages originated from BEAST2
analyses. Node ages are indicated in million years, and 95% confidence intervals are drawn for each node as blue
bars. Branches leading to brachiopod taxa are highlighted, in dark green for Craniformea, in lighter green for
Linguliformea and in blue for Rhynchonelliformea.
6 Discussion
6.1 Limitations of rRNA datasets in Brachiopod phylogenetics
In phylogenetic studies, rRNA datasets are a powerful tool for analysing the evolutionary
relationships of the Metazoa, such as identifying the phylogenetic position of particular clades
(e.g. Halanych et al., 1995). In our study, using rRNA datasets, various topologies (monophyly
or polyphyly) for the Brachiopoda were obtained through Maximum Likelihood and
Bayesian analyses. The analyses of the 18S dataset (Fig. 2 and Fig. 6) resolved the Brachiopoda
as monophyletic. Within the Brachiopoda, the suborder Linguliformea is polyphyletic, while the
other two suborders, Craniformea and Rhynchonelliformea, are monophyletic. However, due to
the low support values calculated for the majority of the internal nodes of this tree (Fig. 2), this
topology should not be considered reliable.
With the 28S rRNA dataset and the concatenated dataset (18S rRNA and 28S rRNA), the
Brachiopoda were recovered as polyphyletic, with a low bootstrap value for the concatenated
dataset but a fully supported topology in the 28S analysis (Fig. 3 and Fig. 4). Previous studies
have also used 18S and 28S rRNA to investigate brachiopod and lophotrochozoan
relationships (e.g. Cohen, 2000; Cohen and Weydmann, 2005; Passamaneck and Halanych,
2006), and found similar topologies that were deemed inaccurate. The combined dataset (18S
and 28S) resolved brachiopods as polyphyletic. One study resolved Brachiopoda and Phoronida
as a monophyletic group (Cohen and Weydmann, 2005), while another found the phylum
nested between Phoronida, Mollusca and Annelida (Passamaneck and Halanych, 2006).
These results were attributed to a lack of phylogenetic signal and were later disputed (e.g.
Sperling et al., 2011; Kocot et al., 2017). In contrast, the Brachiopoda are resolved as
monophyletic when systematic error is taken into account and sufficient phylogenetic signal is
ensured. Systematic errors arising from, for example, heterogeneity of nucleotide composition
and rate variation across lineages (Som, 2014) can be reduced by using appropriate models,
increasing or changing the taxon sampling and removing fast-evolving genes. Moreover, several
indices have been developed to test and quantify phylogenetic signal (Münkemüller et al., 2012).
Thus, the results obtained in this study are consistent with previous observations from
rRNA analyses. Our results corroborate the lack of phylogenetic signal in ribosomal gene data,
which makes these markers poorly suited, on their own, to resolving brachiopod phylogeny. In
addition, unusually long branches were recovered in our phylogenetic trees for some taxa
(especially the outgroups Platyhelminthes, Annelida, Bryozoa and Mollusca; Fig. 2 and Fig. 4).
This raises the problem of the tree-building artefact known as long-branch attraction (LBA;
Hendy and Penny, 1989). The artefact arises when evolutionary rates are elevated in a few
branches: the descendants of these branches are grouped in a clade even if they are unrelated
(Baum & Smith, 2013). Such LBA artefacts distort the topology of the tree and therefore the
evolutionary history it depicts. Nevertheless, solutions exist to reduce LBA, such as removing
problematic taxa or adding closely related taxa to break up long branches. Using well-fitting
evolutionary models and carefully selecting genetic markers prior to sequencing also help to
minimise LBA effects.
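As a rough illustration of how candidate long-branch taxa might be screened before deciding
which to remove, the sketch below flags taxa whose root-to-tip distance lies far above the
median. The taxon labels, the distances and the cutoff factor of 2 are hypothetical values
chosen for illustration; they are not taken from the trees of this study.

```python
from statistics import median


def flag_long_branches(root_to_tip, factor=2.0):
    """Return the taxa whose root-to-tip distance exceeds factor * median."""
    cutoff = factor * median(root_to_tip.values())
    return sorted(taxon for taxon, dist in root_to_tip.items() if dist > cutoff)


# Hypothetical root-to-tip distances (substitutions per site), for illustration only.
distances = {
    "Lingula": 0.21,
    "Novocrania": 0.25,
    "Terebratalia": 0.19,
    "Phoronis": 0.23,
    "Platyhelminth_outgroup": 0.95,
}

long_taxa = flag_long_branches(distances)
print(long_taxa)
```

A screen of this kind only points to candidates; whether a flagged taxon should be removed or
its branch broken up with additional sampling remains a judgement call.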
6.2 BEAST2: a problematic dating methodology
Using the phylogenomic dataset, a well-supported topology (Figs. 8, 9 and 10) and a dated
phylogenetic tree have been generated (Fig. 11). However, unrealistic dating values were
obtained for the basal nodes of the tree, if we consider the extensive fossil record of the
Brachiopoda. Dating the molecular phylogeny of the Brachiopoda using the software package
BEAST2 has shown its limitations in this work.
At the current state of the study, a dated tree has been produced, but we still cannot
accurately study the early rate of evolution of the Brachiopoda using rRNA and phylogenomic
data with the methodologies developed for BEAST2. The results obtained for all datasets,
including the single rRNA datasets, the concatenated 18S and 28S rRNA dataset and the
phylogenomic dataset, disagree with the extensive fossil record of the Brachiopoda and the
early Metazoa and cannot be trusted. Estimated ages exceed 1 billion years for the divergence
between the Priapulida and the other protostome taxa included in the phylogenomic study.
The fossil record can provide a useful check on the validity of results obtained from molecular
data (phylogenomics, rRNA). For instance, the controversial oldest protostome Kimberella
(Fedonkin and Waggoner, 1997) has been dated to between 550.25 and 636.1 Ma but has
nevertheless been used as a calibration point in previous studies (e.g. Benton et al., 2015).
When the calibration point for the Brachiopoda was set at 530 Ma on the basis of their
fossil record, the posterior estimate for the age of the clade was 591 Ma. For all fossil
calibrations, the posterior ages were consistently older than the ages set as priors. BEAST2
therefore appears to overestimate node ages systematically, with large error bars. However, a
discrepancy of about 60 Ma at this node suggests that the methodology could still be useful
with further analysis and appropriate parameters.
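The size of this offset can be made explicit with a short calculation. The sketch below is only
an illustration using the two ages quoted above (530 Ma and 591 Ma); the function name is
hypothetical and is not part of BEAST2.

```python
def calibration_offset(prior_age_ma, posterior_age_ma):
    """Return the absolute (Ma) and relative (%) offset of a posterior age from its prior."""
    diff = posterior_age_ma - prior_age_ma
    return diff, 100.0 * diff / prior_age_ma


# Ages quoted in the text for the Brachiopoda calibration.
diff, pct = calibration_offset(530.0, 591.0)
print(f"{diff:.0f} Ma older ({pct:.1f}% of the calibration age)")
# prints "61 Ma older (11.5% of the calibration age)"
```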
I attempted to replicate the study of Lee et al. (2013) without success, as that analysis
was run in an earlier version of BEAST (Drummond et al., 2012) that is no longer available.
Being able to replicate this previous work would have been useful to fully comprehend the
process of molecular dating in BEAST2.
It is very likely that our results are affected by incorrectly set priors, and more
specifically time priors. The software used for molecular dating is complex to configure, and
the validity of the results depends on priors chosen specifically for each dataset. Even though
the parameters for this study were chosen for their consistency, the analyses might have been
wrongly calibrated. It is surprising that the posterior dates differ so much from the fossil
calibration priors set for the study (Table 3). The likelihood that the priors are responsible for
the problems in our models is reinforced by the significant time and computational power
required for the runs to converge, i.e. for the posterior parameters of the analysis to reach
stable values. Long MCMC (Markov chain Monte Carlo) chains were run (100 million
iterations), and high burn-in values (up to 40%, i.e. 40 million iterations) were set to compute
the maximum clade credibility tree. Several studies emphasise the impact of uncertainty in
prior calibrations on the validity of the posterior estimates obtained (e.g. Warnock et al., 2015;
Guindon, 2018). Fossil calibration is a delicate task that requires full control of the parameters
and particular care in setting the priors.
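To give a sense of how much of a chain survives such a burn-in, the following minimal sketch
counts the logged states that remain. The logging interval of 10,000 iterations is an
assumption made purely for illustration, as are the function name and the convention that the
initial state is logged.

```python
def retained_samples(chain_length, log_every, burnin_fraction):
    """Count logged states in total, discarded as burn-in, and retained."""
    total = chain_length // log_every + 1  # assumes the initial state (0) is also logged
    discarded = int(total * burnin_fraction)
    return total, discarded, total - discarded


# 100 million iterations and 40% burn-in as in the text; log_every is an assumed value.
total, discarded, kept = retained_samples(100_000_000, 10_000, 0.40)
print(total, discarded, kept)
```

Under these assumptions roughly 6,000 logged states would remain for summarising the
maximum clade credibility tree; a different logging interval changes the count proportionally.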
6.3 Future studies
The phylogenomic dataset analysis resulted in well-supported trees, especially after removing
one problematic Craniformea taxon causing LBA artefacts (X2_BC_Novocrania_sp, which
contains a large proportion of missing data). Using the reduced LB106 dataset added resolution
to the final phylogenetic tree. The full dataset included several low-occupancy taxa, which
added uncertainty to the final result. Future studies should include several Discinoidea taxa,
so that specimens from every brachiopod order are finally represented. This will make it
possible to study the relationships within the phylum in more detail and to add fossil
calibrations at shallower nodes.
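Low-occupancy taxa of the kind mentioned above can be identified directly from the
alignment. The sketch below is a minimal illustration: it treats gaps, "?" and "X" as missing
data and uses a cutoff of 0.5. The sequences, taxon names and cutoff are hypothetical, and the
criterion actually used to build the reduced dataset may differ.

```python
def occupancy(sequence, missing="-?X"):
    """Fraction of alignment positions that carry data for one taxon."""
    if not sequence:
        return 0.0
    present = sum(1 for char in sequence.upper() if char not in missing)
    return present / len(sequence)


def low_occupancy_taxa(alignment, threshold=0.5):
    """Return the taxa whose occupancy falls below the threshold."""
    return sorted(name for name, seq in alignment.items() if occupancy(seq) < threshold)


# Hypothetical amino-acid alignment, for illustration only.
alignment = {
    "Taxon_A": "MKVLAAGIVR",
    "Taxon_B": "MK--AAG-VR",
    "Taxon_C": "M-------VR",
}

flagged = low_occupancy_taxa(alignment)
print(flagged)
```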
With the current methodology, it was not possible, without further investigation, to
reject or corroborate the hypothesis of fast evolutionary rates in early brachiopod evolution.
Nevertheless, this hypothesis will be tested later using other software. For example, MCMCtree
(dos Reis and Yang, 2019), which also uses molecular clock models combined with soft fossil
constraints to estimate species divergence times, could be used for comparison with the
BEAST2 results. It would also be very interesting to examine the results of a PhyloBayes
molecular dating analysis, as this software uses other substitution models and requires fewer
parameters and constraints to be set.
Additionally, morphological data should be taken into consideration and further utilised
in dating and calibrating phylogenomic studies (e.g. tip dating). These data could be used to
assess the validity of phylogenetic trees. Lastly, morphological evolution could later be mapped
onto the molecular tree, as in a total-evidence study (Eernisse and Kluge, 1993).
7 Conclusion
For the time being, this study has encountered some limitations of BEAST2 for investigating
brachiopod evolutionary history. As early rates of evolution can only be estimated correctly
from correctly dated trees, further study will be required to clarify, as has been shown for the
Arthropoda (Lee et al., 2013), the effect of the Push of the Past on the evolution of the
phylum Brachiopoda. While accurate dating methodologies still need to be determined,
promising, well-supported results for a brachiopod phylogeny have been obtained in this work.
The results obtained reinforce the value of using phylogenomic data in phylogenetic analyses.
8 Acknowledgments
I would like to thank Aodhàn Butler and Graham Budd for this opportunity. You have always
been available when I needed help and advice and I really appreciated that you always were so
positive. Thanks for trusting me with this project.
I also would especially like to thank Anneli, Bobby, Diem, Gianni, Jenni, Katerina and Raquel
for everything they did for me during this thesis. Your presence, advice and friendship have
helped me to chill and feel more confident.
I would also like to thank many people from systbio. Thank you Iker and Martin for your help
with bioinformatics, thank you Mikael and Petra for the fun and thank you Nahid for caring so
much.
I am also grateful to my opponents Eliott Capel and Michael Streng, who came up with
interesting comments on my work. Moreover, fewer results would have been presented without
Stanford University and the Stanford Research Computing Center, which provided
computational resources and support that contributed to these research results.
Finally, thank you to my family for their unwavering support.
9 References
Akaike, H. (1974). A new look at the statistical model identification. In: Selected Papers of
Hirotugu Akaike. Springer, New York, NY, pp. 215-222.
Baum, D. A., & Smith, S. D. (2013). Tree thinking: an introduction to phylogenetic biology.
Greenwood Village (CO): Roberts.
Benson, D. A., Karsch-Mizrachi, I., Lipman, D. J., Ostell, J. & Sayers, E. W. (2008). GenBank.
Nucleic acids research, vol. 37, pp. D26-D31. DOI: 10.1093/nar/gkn723.
Benton, M. J., Donoghue, P. C. J., Asher, R. A., Friedman, M., Near, T. J. & Vinther, J. (2015).
Constraints on the timescale of animal evolutionary history. Palaeontologia Electronica,
18.1.1FC. DOI: 10.26879/424.
Bouckaert, R., Heled, J., Kühnert, D., Vaughan, T., Wu, C-H., Xie, D., Suchard, M. A.,
Rambaut, A., & Drummond, A. J. (2014). BEAST 2: A Software Platform for Bayesian
Evolutionary Analysis. PLoS Computational Biology, vol. 10.
DOI:10.1371/journal.pcbi.1003537.
Brusca, R. C., Moore, W., & Shuster, S. M. (2016). Invertebrates. 3rd ed. Sunderland.
Budd, G. E., & Jensen, S. (2000). A critical reappraisal of the fossil record of the bilaterian
phyla. Biological Reviews, vol. 75, pp. 253-295. DOI: 10.1017/S000632310000548X.
Budd, G. E., & Mann, R. P. (2018). History is written by the victors: The effect of the push of
the past on the fossil record. Evolution, vol. 72, pp. 2276-2291. DOI:10.1111/evo.13593.
Carlson, S. J. (2016). The evolution of Brachiopoda. Annual Review of Earth and Planetary
Sciences, vol. 44, pp. 409-438. DOI: 10.1146/annurev-earth-060115-012348.
Chen, P. (1984). Discovery of lower Cambrian small shelly fossils from Jijiapo, Yichang, West
Hubei and its significance (in Chinese). Professional Papers of Stratigraphy and
Palaeontology, vol. 13, pp. 49-64.
Cohen, B. L. (2000). Monophyly of brachiopods and phoronids: reconciliation of molecular
evidence with Linnaean classification (the subphylum Phoroniformea nov.). Proceedings of the
Royal Society B: Biological Sciences, vol. 267, pp. 225–231. DOI: 10.1098/rspb.2000.0991.
Cohen, B. L. & Weydmann, A. (2005). Molecular evidence that phoronids are a subtaxon of
brachiopods (Brachiopoda: Phoronata) and that genetic divergence of metazoan phyla began
long before the early Cambrian. Organisms Diversity & Evolution, vol. 5, pp. 253–273. DOI:
10.1016/j.ode.2004.12.002.
Cooper, G. A. (1956). Chazyan and related brachiopods. Smithsonian Miscellaneous
Collections, vol. 127, pp. 1-1024.
Darwin, C. (1987). Charles Darwin's notebooks, 1836-1844: geology, transmutation of species,
metaphysical enquiries. Ed. Paul H. Barrett et al. Ithaca, NY, Cornell University Press.
Dayhoff, M. O. (1978). In: Dayhoff, M. O. (ed.) Atlas of protein sequences and structure, vol.
5, suppl. 3. National Biomedical Research Foundation, Washington DC, pp. 345-352.
dos Reis, M. & Yang, Z. (2019). Bayesian molecular clock dating using genome-scale datasets.
In: Anisimova, M. (ed.) Evolutionary Genomics: Statistical and Computational Methods,
Springer (in press).
Drummond, A. J., Ho, S. Y., Phillips, M. J. & Rambaut, A. (2006). Relaxed phylogenetics and
dating with confidence. PLoS Biology, vol. 4 (5). DOI:10.1371/journal.pbio.0040088.
Drummond, A. J., Suchard, M. A., Xie, D. & Rambaut, A. (2012). Bayesian phylogenetics
with BEAUti and the BEAST 1.7. Molecular Biology and Evolution, vol. 29, pp. 1969–1973.
Dunn, C. W., Giribet, G., Edgecombe, G. D. & Hejnol, A. (2014). Animal phylogeny and its
evolutionary implications. Annual review of ecology, evolution, and systematics, vol. 45, pp.
371-395. DOI: 10.1146/annurev-ecolsys-120213-091627.
Eddy, S. R. & the HMMER development team. (2018). HMMer Software. Version 3.2.1.
Available: http://hmmer.org.
Eernisse, D. J., Albert, J. S. & Anderson, F. E. (1992). Annelida and Arthropoda are not
sister taxa: a phylogenetic analysis of Spiralian metazoan morphology. Systematic Biology, vol.
41, pp. 305-330. DOI: 10.2307/2992569.
Eernisse, D. J., & Kluge, A. G. (1993). Taxonomic congruence versus total evidence, and
amniote phylogeny inferred from fossils, molecules, and morphology. Molecular Biology and
Evolution, vol. 10, pp. 1170-1195. DOI: 10.1093/oxfordjournals.molbev.a040071.
Ekman, T. (1896). Beiträge zur kenntnis des stieles der Brachiopoden. Zeitschrift für
wissenschaftliche Zoologie, vol. 6, pp. 169-249.
Erwin, D. H. (1993). The great Paleozoic crisis: life and death in the Permian. Columbia
University Press.
Fedonkin, M. A. & Waggoner, B. M. (1997). The Late Precambrian fossil Kimberella is a
mollusk-like bilaterian organism. Nature, vol. 388, pp. 868-871.
Foote, M. & Sepkoski, J. J. (1999). Absolute measures of the completeness of the fossil record.
Nature, vol. 398, pp. 415-417.
Grabherr, M. G., Haas, B. J., Yassour, M., Levin, J. Z., Thompson, D. A., Amit, I., Adiconis,
X., Fan, L., Raychowdhury, R., Zeng, Q., Chen, Z., Mauceli, E., Hacohen, N., Gnirke, A., Rhind,
N., di Palma, F., Birren, B. W., Nusbaum, C., Lindblad-Toh, K., Friedman, N. & Regev, A.
(2011). Full-length transcriptome assembly from RNA-seq data without a reference genome.
Nature Biotechnology, vol. 29, pp. 644-652. DOI: 10.1038/nbt.1883.
Giribet, G. (2008). Assembling the lophotrochozoan (= spiralian) tree of life. Philosophical
Transactions of the Royal Society B: Biological Sciences, vol. 363, pp. 1513-1522. DOI:
10.1098/rstb.2007.2241.
Gould, S., & Calloway, C. (1980). Clams and Brachiopods-Ships that Pass in the
Night. Paleobiology, vol. 6, pp. 383-396. DOI: 10.1017/S0094837300003572.
Guindon, S. (2018). Accounting for calibration uncertainty: Bayesian molecular dating as a
“doubly intractable” problem. Systematic Biology, vol. 67, pp. 651-661. DOI: 10.1093/sysbio/syy003.
Halanych, K. M., Bacheller, J. D., Aguinaldo, A. M., Liva, S. M., Hillis, D. M. & Lake, J. A.
(1995). Evidence from 18S ribosomal DNA that the Lophophorates are protostome animals.
Science, vol. 267, pp. 1641-1643. DOI: 10.1126/science.7886451.
Hansen, H. and Holmer, L. E. (2011). Taxonomy and biostratigraphy of Ordovician
brachiopods from northeastern Ny Friesland, Spitsbergen. Zootaxa, vol. 3076, pp. 1-122.
Hausdorf, B., Helmkampf, M., Meyer, A., Witek, A., Herlyn, H., Bruchhaus, I., Hankeln, T.,
Struck, T. H. & Lieb, B. (2007). Spiralian phylogenomics supports the resurrection of Bryozoa
comprising Ectoprocta and Entoprocta. Molecular Biology and Evolution, vol. 24, pp. 2723-2729.
DOI: 10.1093/molbev/msm214.
Hejnol, A., Obst, M., Stamatakis, A., Ott, M., Rouse, G. W., Edgecombe, G. D., Martinez, P.,
Baguñà, J., Bailly, X., Jondelius, U., Wiens, M., Müller, W. E., Seaver, E., Wheeler, W. C.,
Martindale, M. Q., Giribet, G. & Dunn, C. W. (2009). Assessing the root of bilaterian animals
with scalable phylogenomic methods. Proceedings of the Royal Society B: Biological
Sciences, vol. 276, pp. 4261-4270. DOI: 10.1098/rspb.2009.0896.
Hendy, M. D. & Penny, D. (1989). A framework for the quantitative study of evolutionary trees.
Systematic Zoology, vol. 38, pp. 297-309. DOI: 10.2307/2992396.
Hrdy, I., Hirt, R. P., Dolezal, P., Bardonová, L., Foster, P. G., Tachezy, J. & Embley, T. M.
(2004). Trichomonas hydrogenosomes contain the NADH dehydrogenase module of
mitochondrial complex I. Nature, vol. 432, pp. 618-621.
Hull, P. M., Norris, R. D., Bralower, T. J., & Schueth, J. D. (2011). A role for chance in marine
recovery from the end-Cretaceous extinction. Nature Geoscience, vol. 4, pp. 856–860.
Jukes, T. H. & Cantor, C. R. (1969). Evolution of Protein Molecules. New York: Academic
Press. pp. 21–132.
Kaesler, R. L. & Selden, P. A. (eds.) (1997-2007). Treatise on Invertebrate Paleontology, Part H
(Revised): Brachiopoda, vols 1-6. Boulder, CO/Lawrence, KS: Geol. Soc. Am./Univ. Kansas.
Katoh, K. & Standley, D. M. (2013). MAFFT Multiple Sequence Alignment Software Version
7: Improvements in Performance and Usability. Molecular Biology and Evolution, vol. 30, pp.
772-780. DOI: 10.1093/molbev/mst010.
Kocot, K. M., Struck, T. H., Merkel, J., Waits, D. S., Todt, C., Brannock, P. M., Weese, D. A.,
Cannon J. T., Moroz, L. L., Lieb, B. & Halanych, K. M. (2017). Phylogenomics of
Lophotrochozoa with consideration of systematic error. Systematic Biology, vol. 66, pp. 256-
282. DOI: 10.1093/sysbio/syw079.
Lamarck, J. B. P. A. (1809). Philosophical zoology: An exposition with regard to the natural
history of animals. Chicago, Chicago, EEUU.
Lanfear, R., Frandsen, P. B., Wright, A. M., Senfeld, T. & Calcott B. (2017). PartitionFinder 2:
New Methods for Selecting Partitioned Models of Evolution for Molecular and Morphological
Phylogenetic Analyses. Molecular Biology and Evolution, vol. 34, pp. 772-773. DOI:
10.1093/molbev/msw260.
Lartillot, N. & Philippe, H. (2004). A bayesian mixture model for across-site heterogeneities in
the amino-acid replacement process. Molecular Biology and Evolution. vol. 21, pp. 1095-1109.
DOI:10.1093/molbev/msh112.
Lartillot, N. & Philippe, H. (2006). Computing Bayes factors using thermodynamic integration.
Systematic Biology, vol. 55, pp. 195-207. DOI: 10.1080/10635150500433722.
Lartillot, N., Brinkmann, H. & Philippe, H. (2007). Suppression of long-branch attraction
artefacts in the animal phylogeny using a site-heterogeneous model. BMC Evolutionary
Biology, vol. 7. DOI: 10.1186/1471-2148-7-S1-S4.
Laumer, C. E., Bekkouche, N., Kerbl, A., Goetz, F., Neves, R. C., Sørensen, M. V., Kristensen,
R.M., Hejnol, A., Dunn, C.W., Giribet, G. & Worsaae, K. (2015). Spiralian phylogeny informs
the evolution of microscopic lineages. Current Biology, vol. 25, pp. 2000-2006. DOI:
10.1016/j.cub.2017.11.026
Lee, M. S. Y., Soubrier J. & Edgecombe, G. D. (2013). Rates of Phenotypic and Genomic
Evolution during the Cambrian Explosion. Current Biology, vol. 23, pp. 1889-1895. DOI:
10.1016/j.cub.2013.07.055.
Lee, M. S. & Ho, S. Y. (2016). Molecular clocks. Current Biology, vol. 26, pp. 399-402.
Lyell, C. (1832). The Principles of Geology (Vol. 3). London: Murray.
Miller, M. A., Pfeiffer, W., & Schwartz, T. (2010). Creating the CIPRES Science Gateway for
inference of large phylogenetic trees. In 2010 Gateway Computing Environments workshop
(GCE), pp. 1-8. DOI: 10.1109/GCE.2010.5676129.
Morris, S. C. (1989). Burgess Shale faunas and the Cambrian explosion. Science, vol. 246, pp.
339-346. DOI: 10.1126/science.246.4928.339.
Münkemüller, T., Lavergne, S., Bzeznik, B., Dray, S., Jombart, T., Schiffers, K., & Thuiller,
W. (2012). How to measure and test phylogenetic signal. Methods in Ecology and Evolution,
vol. 3, pp. 743-756. DOI: 10.1111/j.2041-210X.2012.00196.x.
Nee, S., May, R. M., & Harvey, P. H. (1994). The reconstructed evolutionary
process. Philosophical Transactions of the Royal Society of London. Series B: Biological
Sciences, vol. 344, pp. 305-311.
Nesnidal, M. P., Helmkampf, M., Meyer, A., Witek, A., Bruchhaus, I., Ebersberger, I., Hankeln,
T., Lieb, B., Struck, T. H. & Hausdorf, B. (2013). New phylogenomic data support the
monophyly of Lophophorata and an Ectoproct-Phoronid clade and indicate that Polyzoa and
Kryptrochozoa are caused by systematic bias. BMC evolutionary biology, vol. 13 (253).
Pander, C. H. (1830). Beiträge zur Geognosie des Russischen Reiches. St. Petersburg.
Pauling, L., Zuckerkandl, E., & Henriksen, T. (1963). Chemical paleogenetics. Acta Chemica
Scandinavica, vol. 17, pp. 9-16.
Parry, L., Tanner, A. & Vinther, J. (2014). The origin of annelids. Palaeontology, vol. 57, pp.
1091-1103. DOI: 10.1111/pala.12129.
Passamaneck, Y. J. & Halanych, K. M. (2006). Lophotrochozoan phylogeny assessed with LSU
and SSU data: evidence of lophophorate polyphyly. Molecular Phylogenetics and Evolution,
vol. 40, pp. 20–28. DOI: 10.1016/j.ympev.2006.02.001.
Pelman, Y. L. (1977). Early and Middle Cambrian inarticulate brachiopods of the Siberian
Platform. Trudy Instituta Geologii i Geofiziki, Sibirskoe Otdelenie, vol. 316, pp. 1-168.
Peters, S. E. & McClennen, M. (2016). The Paleobiology Database application programming
interface. Paleobiology, vol. 42, pp. 1-7. DOI: 10.1017/pab.2015.39.
Rambaut, A. (2017). FigTree-version 1.4. 3, a graphical viewer of phylogenetic trees.
Rambaut, A., Drummond, A. J., Xie, D., Baele, G., & Suchard, M. A. (2018). Posterior
summarization in Bayesian phylogenetics using Tracer 1.7. Systematic biology, vol. 67, pp.
901-904. DOI: 10.1093/sysbio/syy032.
Schram, F. (1991). Cladistic analysis of metazoan phyla and the placement of fossil
problematica. In: Simonetta, A. M. & Conway-Morris, S. (eds.) The Early Evolution of Metazoa
and the Significance of Problematic Taxa. UK: Cambridge University Press, pp. 35-46.
Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, vol. 6, pp.
461-464.
Shoemaker, J. S. & Fitch, W. M. (1989). Evidence from nuclear sequences that invariable sites
should be considered when sequence divergence is calculated. Molecular Biology and
Evolution, vol. 6, pp. 270–289. DOI: 10.1093/oxfordjournals.molbev.a040550.
Som, A. (2014). Causes, consequences and solutions of phylogenetic incongruence. Briefings
in Bioinformatics, vol. 16, pp. 536-548. DOI: 10.1093/bib/bbu015.
Sperling, E. A., Pisani, D., & Peterson, K. J. (2011). Molecular paleobiological insights into
the origin of the Brachiopoda. Evolution & Development, vol. 13, pp. 290-303. DOI:
10.1111/j.1525-142X.2011.00480.x.
Stamatakis, A. (2014). RAxML Version 8: A tool for Phylogenetic Analysis and Post-Analysis
of Large Phylogenies. Bioinformatics, vol. 30, pp. 1312-1313. DOI:
10.1093/bioinformatics/btu033.
Swofford, D. L. (2003). PAUP*. Phylogenetic Analysis Using Parsimony (*and Other
Methods). Version 4. Sinauer Associates, Sunderland, Massachusetts.
Tamura, K. & Nei, M. (1993). Estimation of the number of nucleotide substitutions in the
control region of mitochondrial DNA in humans and chimpanzees. Molecular Biology and
Evolution, vol. 10, pp. 512-526. DOI: 10.1093/oxfordjournals.molbev.a040023.
Tavaré, S. (1986). Some probabilistic and statistical problems in the analysis of DNA. Lectures
on Mathematics in the Life Sciences, vol. 17, pp. 57–86.
Thayer, C. W. (1985). Brachiopods versus mussels: competition, predation, and
palatability. Science, vol. 228, pp. 1527-1528. DOI: 10.1126/science.228.4707.1527.
Valentine, J. W., & Jablonski, D. (1983). Larval adaptations and patterns of brachiopod
diversity in space and time. Evolution, vol. 37, pp. 1052-1061. DOI: 10.1111/j.1558-
5646.1983.tb05632.x.
Warnock, R. C., Parham, J. F., Joyce, W. G., Lyson, T. R., & Donoghue, P. C. (2015).
Calibration uncertainty in molecular dating analyses: there is no substitute for the prior
evaluation of time priors. Proceedings of the Royal Society B: Biological Sciences, vol. 282.
DOI: 10.1098/rspb.2014.1013.
Waterhouse, R. M., Seppey, M., Simão, F. A., Manni, M., Ioannidis, P., Klioutchnikov, G.,
Kriventseva, E. V. & Zdobnov, E. M. (2017). BUSCO applications from quality
assessments to gene prediction and phylogenomics. Molecular Biology and Evolution, vol. 35,
pp. 543-548. DOI: 10.1093/molbev/msx319.
Wilke, T., Schultheiß, R., & Albrecht, C. (2009). As time goes by: a simple fool's guide to
molecular clock approaches in invertebrates. American Malacological Bulletin, vol. 27, pp. 25-
46. DOI: 10.4003/006.027.0203
Yang, Z. (1994). Maximum likelihood phylogenetic estimation from DNA sequences with
variable rates over sites: approximate methods. Journal of Molecular Evolution, vol. 39, pp.
306-314.
Zhang, X.G., Siveter, D.J., Waloszek, D. & Maas, A. (2007). An epipodite-bearing crown-
group crustacean from the Lower Cambrian. Nature, vol. 449, pp. 595-598. DOI:
10.1038/nature06138
APPENDIX: Methodological background
Collection, RNA extraction and sequencing.
After world-wide collection of specimens, whole brachiopods and tissues were preserved in
RNAlater at -20°C or -80°C before being processed with an Invitrogen RNAqueous
Micro Total RNA isolation kit. RNA was isolated using silica centrifuge spin columns and
purified by LiCl precipitation with a DNase digestion step. The quality of the resulting RNA
extract (amount and purity) was checked using a NanoDrop spectrophotometer. Finally, cDNA
libraries were prepared from mRNA by polyA selection, and 5-6 taxa were sequenced on an
Illumina HiSeq 2500 lane, at a depth of 50-60 million reads each.
Assembly of transcriptomes and phylogenomic matrix
As few reference genomes are available for Lingulida, transcriptomes were assembled using
de novo transcriptome assembly protocols. Illumina HiSeq 2500 paired short reads were
assembled using the Trinity assembly software (Grabherr et al., 2011) on Stanford’s Sherlock
cluster and at Ludwig-Maximilians-Universität München (LMU). Data quality checks, standard
sequence statistics and the conversion of nucleotide sequences to amino acids were carried out
using the BUSCO gene set (Waterhouse et al., 2017). An ortholog gene set from a pre-existing
multi-gene alignment (Kocot et al., 2017) was used to search for orthologs in the new
brachiopod sequences. Using a custom Python script (https://github.com/wrf/supermatrix),
Hidden Markov Models (HMMs) were generated with the HMMer software (Eddy et al., 2018)
for each gene partition in the phylogenomic matrix. Transcriptomes were then searched
successively for matches above a sequence similarity threshold. The trimmed matching
sequences were aligned using MAFFT (Katoh & Standley, 2013), and overhanging aligned
sections were trimmed to preserve the alignment length. The resulting single matrix and
partition file were saved in FASTA format and were then ready for phylogenetic analysis.
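The final step, in which per-gene alignments are joined into a single matrix with an
accompanying list of partitions, can be sketched as follows. This is a hypothetical, minimal
illustration and not the code of the supermatrix script cited above: the function name, gene
names and sequences are invented, and taxa missing from a gene are simply padded with gaps.

```python
def concatenate(gene_alignments, taxa):
    """Join per-gene alignments into one supermatrix and record partition ranges.

    gene_alignments maps gene name -> {taxon: aligned sequence}.
    Returns (matrix, partitions), where partitions holds (gene, start, end), 1-based.
    """
    pieces = {taxon: [] for taxon in taxa}
    partitions = []
    start = 1
    for gene, aln in gene_alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"sequences in {gene} are not aligned to one length")
        length = lengths.pop()
        for taxon in taxa:
            # Taxa without this gene receive a run of gaps of the same length.
            pieces[taxon].append(aln.get(taxon, "-" * length))
        partitions.append((gene, start, start + length - 1))
        start += length
    matrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return matrix, partitions


# Hypothetical two-gene example.
genes = {
    "gene1": {"Taxon_A": "MKV", "Taxon_B": "MRV"},
    "gene2": {"Taxon_A": "LLGA"},
}
matrix, partitions = concatenate(genes, ["Taxon_A", "Taxon_B"])
print(matrix)
print(partitions)
```

The gap padding is what produces low-occupancy rows in the final matrix when a taxon is
recovered for only a few genes.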