PHYLOGENOMIC APPLICATIONS OF REPETITIVE ELEMENTS
Daniel Vitales
Institut Botànic de Barcelona (IBB, CSIC-Ajuntament de Barcelona)
Laboratori de Botànica, Facultat de Farmàcia i Ciències de l’Alimentació, Universitat de Barcelona
I Simposio Anual de Botánica Española:
Filogenómica para comprender la diversidad y evolución de grupos complejos de plantas
February 8th, 2020
REPETITIVE ELEMENTS: INTRODUCTION
TANDEM REPEATS
• TANDEM REPEAT GENES
  • RIBOSOMAL DNA
  • OTHER MULTIPLE-COPY GENES (e.g. histones)
• SATELLITES
  • SATELLITES
  • MINISATELLITES
  • MICROSATELLITES
DISPERSED REPEATS
• TRANSPOSONS
  • DNA TRANSPOSONS
  • RNA TRANSPOSONS
    • LTR RETROTRANSPOSONS (e.g. copia, gypsy)
    • NON-LTR RETROTRANSPOSONS (e.g. LINEs, SINEs)
• OTHER DISPERSED (e.g. tRNA-like, retropseudogenes)
“REPEATOME”: REPETITIVE FRACTION OF THE GENOME
Species                        Genome size    Repeat content
Genlisea aurea                 63.6 Mbp       (smallest plant genome)
Arabidopsis thaliana           125 Mbp        25%
Sugar beet Beta vulgaris       758 Mbp        63%
Broad bean Vicia faba          12000 Mbp      85%
Rye Secale cereale             8800 Mbp       92%
Onion Allium cepa              15100 Mbp      95%
Paris japonica                 149000 Mbp     (largest plant genome)
Human Homo sapiens             3000 Mbp       >50%
Liu et al. 2013. International journal of molecular sciences, 14(7), 13559-13576.
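As a small worked example, the genome sizes and repeat percentages listed on this slide can be turned into absolute amounts of repetitive DNA. The sketch below only multiplies the two columns; the function name `repeat_mbp` is illustrative and not taken from the talk.

```python
# Genome sizes (Mbp) and repeat fractions copied from the slide's table.
genomes = {
    "Arabidopsis thaliana": (125, 0.25),
    "Beta vulgaris": (758, 0.63),
    "Vicia faba": (12000, 0.85),
    "Secale cereale": (8800, 0.92),
    "Allium cepa": (15100, 0.95),
}


def repeat_mbp(genome_size_mbp, repeat_fraction):
    """Absolute amount of repetitive DNA, in Mbp."""
    return genome_size_mbp * repeat_fraction


for species, (size, fraction) in genomes.items():
    repeats = repeat_mbp(size, fraction)
    # The remainder is the non-repetitive part of the genome.
    print(f"{species}: {repeats:.0f} Mbp repetitive, "
          f"{size - repeats:.0f} Mbp non-repetitive")
```

Printing both quantities side by side makes it easy to compare how much of the difference in genome size between species is accounted for by the repeatome.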
7th Workshop on the Application of Next Generation Sequencing to Repetitive DNA Analysis in Plants. Ceske Budejovice. 22-24 May 2018. http://repeatexplorer.org/
Plant genome composition
Plant repeatome composition
REPETITIVE ELEMENTS: INTRODUCTION
Caveats of using repetitive elements for phylogenetic reconstruction
• Limitations caused by the short length of NGS reads
  • Repeat length > read length
• Copies of repetitive elements accumulate mutations and diverge over time
  • (unless concerted evolution!)
Nieto Feliner & Rosselló. 2012. Plant genome diversity volume 1, 171-193.
Figure: repeats versus reads (e.g. retrotransposon length: ~1000-20000 bp; read length: 100-300 nt)
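To put the "repeat length > read length" caveat in numbers, the sketch below computes the fewest reads that would be needed to cover one repeat copy end to end, using the length ranges quoted on this slide. It assumes reads laid side by side with no overlap, which is a deliberate simplification; the function name is illustrative.

```python
import math


def min_reads_to_span(repeat_length_bp, read_length_nt):
    """Fewest end-to-end reads needed to cover one repeat copy (no overlap)."""
    return math.ceil(repeat_length_bp / read_length_nt)


# Extremes of the ranges quoted on the slide.
print(min_reads_to_span(1000, 300))    # short element, long reads
print(min_reads_to_span(20000, 100))   # long element, short reads
```

Since no single read covers a whole element, reads originating from different copies have to be pieced together, which is why divergence between copies matters for reconstruction.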
REPETITIVE ELEMENTS: INTRODUCTION
Identification of sequence clusters
Reconstruction of repetitive elements
Shotgun genomic sequencing
Dispersed RE (e.g. transposons)
Tandem Repeats (e.g. rDNA, satellites)
Reads
Each cluster is a set of reads that frequently overlap and that are part of the same family of repetitive elements.
Novák, P., Neumann, P., Pech, J., Steinhaisl, J., & Macas, J. (2013). RepeatExplorer: a Galaxy-based web server for genome-wide characterization of eukaryotic repetitive elements from next-generation sequence reads. Bioinformatics, 29(6), 792-793.
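The idea of a cluster as "a set of reads that frequently overlap" can be illustrated with a toy graph: reads are nodes, detected overlaps are edges, and clusters are groups of connected reads. The sketch below uses plain connected components found with union-find. This is a simplification chosen for brevity and is not claimed to be the clustering procedure RepeatExplorer implements; the read names and overlaps are hypothetical.

```python
from collections import defaultdict


def cluster_reads(read_ids, overlaps):
    """Group reads into clusters: connected components of the overlap graph."""
    parent = {r: r for r in read_ids}

    def find(r):
        # Follow parent links to the representative, shortening the path.
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r

    for a, b in overlaps:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two components

    clusters = defaultdict(set)
    for r in read_ids:
        clusters[find(r)].add(r)
    # Largest clusters first.
    return sorted(clusters.values(), key=len, reverse=True)


# Hypothetical reads and pairwise overlaps, for illustration only.
reads = ["r1", "r2", "r3", "r4", "r5", "r6"]
overlaps = [("r1", "r2"), ("r2", "r3"), ("r4", "r5")]
clusters = cluster_reads(reads, overlaps)
for i, members in enumerate(clusters, start=1):
    print(f"CL{i}: {sorted(members)}")
```

Reads with no overlaps (here `r6`) end up alone, which corresponds to low-copy or single-copy sequences that do not form a repeat cluster.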
REPETITIVE ELEMENTS: CHARACTERIZATION
7th Workshop on the Application of Next Generation Sequencing to Repetitive DNA Analysis in Plants. Ceske Budejovice. 22-24 May 2018. http://repeatexplorer.org/
Cluster annotation and quantification
REPETITIVE ELEMENTS: CHARACTERIZATION
7th Workshop on the Application of Next Generation Sequencing to Repetitive DNA Analysis in Plants. Ceske Budejovice. 22-24 May 2018. http://repeatexplorer.org/
Proportion of reads
Novák et al. 2014. PloS one, 9(6).
Clusters: CL1-CL21
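The "proportion of reads" plotted per cluster is simply each cluster's read count divided by the total number of reads analysed. A minimal sketch, with hypothetical read counts and an illustrative function name:

```python
def cluster_proportions(cluster_sizes, total_reads):
    """Fraction of all analysed reads that falls in each cluster."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return {name: n / total_reads for name, n in cluster_sizes.items()}


# Hypothetical read counts, for illustration only.
cluster_sizes = {"CL1": 5000, "CL2": 3000, "CL3": 1000}
total_reads = 100000

proportions = cluster_proportions(cluster_sizes, total_reads)
for name, p in proportions.items():
    print(f"{name}: {p:.1%} of reads")
```

Under random shotgun sampling, this proportion can be read as an estimate of the share of the genome occupied by that repeat family.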
REPETITIVE ELEMENTS: CHARACTERIZATION
Dodsworth et al. 2015. Systematic biology, 64(1), 112-126.
REPETITIVE ELEMENTS: PHYLOGENOMIC METHODS
Phylogenetic reconstruction based on comparative repeat abundances
Dodsworth et al. 2015. Systematic biology, 64(1), 112-126.
Fritillaria
Asclepias
Orobanche Fabeae
Drosophila
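To give a feel for how repeat abundances can carry comparative signal, the sketch below builds an abundance profile per species (one value per cluster) and compares profiles pairwise. It swaps in a plain Euclidean distance on normalised abundances as an easy-to-follow stand-in; it is not the procedure of the cited paper, and the species names and counts are hypothetical.

```python
import math


def normalise(counts):
    """Convert raw per-cluster read counts to proportions summing to 1."""
    total = sum(counts)
    return [c / total for c in counts]


def abundance_distance(profile_a, profile_b):
    """Euclidean distance between two abundance profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(profile_a, profile_b)))


# Hypothetical read counts per cluster (CL1, CL2, CL3) for three species.
abundances = {
    "sp_A": [500, 300, 200],
    "sp_B": [480, 320, 200],
    "sp_C": [100, 200, 700],
}

profiles = {sp: normalise(counts) for sp, counts in abundances.items()}
species = sorted(profiles)
distances = {
    (s, t): abundance_distance(profiles[s], profiles[t])
    for s in species for t in species if s < t
}
for pair, d in distances.items():
    print(pair, round(d, 3))
```

In this toy data set, `sp_A` and `sp_B` have similar repeat profiles and come out close together, while `sp_C` is distant from both.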
REPETITIVE ELEMENTS: PHYLOGENOMIC METHODS
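\[
\mathrm{Similarity}_{A\text{-}B}[\mathrm{CL}_n]
= \frac{\text{Observed } N_{\mathrm{edges}\,A\text{-}B}[\mathrm{CL}_n]}
       {\text{Expected } N_{\mathrm{edges}\,A\text{-}B}[\mathrm{CL}_n]}
= \frac{\text{Observed } N_{\mathrm{edges}\,A\text{-}B}[\mathrm{CL}_n]}
       {\dfrac{(N_{\mathrm{reads}\,A} + N_{\mathrm{reads}\,B})[\mathrm{CL}_n]}
              {N_{\mathrm{reads\,total}}[\mathrm{CL}_n]}}
\]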
Vitales, Garcia & Dodsworth. 2019. BioRxiv. doi: https://doi.org/10.1101/624064
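The similarity index on this slide can be written as a short function. The sketch below follows the expression as it appears on the slide: observed edges between reads of species A and B in a cluster, divided by an expected value taken as the summed read counts of A and B over the total reads in that cluster. Function and parameter names are illustrative, and the input numbers are hypothetical.

```python
def expected_edges(n_reads_a, n_reads_b, n_reads_total):
    """Expected term as written on the slide: (N_A + N_B) / N_total."""
    if n_reads_total <= 0:
        raise ValueError("n_reads_total must be positive")
    return (n_reads_a + n_reads_b) / n_reads_total


def similarity(observed_edges_ab, n_reads_a, n_reads_b, n_reads_total):
    """Observed A-B edges in a cluster divided by the expected term."""
    expected = expected_edges(n_reads_a, n_reads_b, n_reads_total)
    if expected == 0:
        raise ValueError("species A and B contribute no reads to this cluster")
    return observed_edges_ab / expected


# Hypothetical values for one cluster, for illustration only.
value = similarity(observed_edges_ab=120, n_reads_a=300,
                   n_reads_b=200, n_reads_total=1000)
print(value)
```

Dividing by the expected term corrects the raw edge count for how many reads the two species contribute to the cluster, so that pairs are not scored as similar merely because they are well represented.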
REPETITIVE ELEMENTS: PHYLOGENOMIC METHODS
Phylogenetic reconstruction based on repeat similarities
Vitales, Garcia & Dodsworth. 2019. BioRxiv. doi: https://doi.org/10.1101/624064
REPETITIVE ELEMENTS: PHYLOGENOMIC METHODS
Phylogenetic reconstruction based on repeat similarities
Straub et al. 2012. American Journal of Botany, 99(2), 349-364.
Asclepias Sonoran Desert Clade
Vitales, Garcia & Dodsworth. 2019. BioRxiv. doi: https://doi.org/10.1101/624064
REPETITIVE ELEMENTS: PHYLOGENOMIC METHODS
Phylogenetic reconstruction based on repeat similarities
repeat abundances repeat similarities
REPETITIVE ELEMENTS: PHYLOGENOMIC METHODS
Phylogenetic reconstruction based on repeat similarities
Genome representation
Repeat types
Vitales, Garcia & Dodsworth. 2019. BioRxiv. doi: https://doi.org/10.1101/624064
Bello et al. 2012. Annals of Botany 112, 1597-1612.
GS (pg): 13.6, 8.3, 10.7, 12.4, 11.8, 10.6, 9.5, 13.4, 10.1
Rosato et al. 2018. Annals of Botany 122(3), 387-395.
Species                 N pop (N ind)   ITR site N
Heliocauta atlantica    3 (10)          6-17
A. clavatus             14 (38)         0-14
A. homogamos            2 (8)           0
A. linearilobus         3 (5)           19
A. maroccanus           3 (9)           0
A. monanthos            3 (10)          0-4
A. radiatus             3 (8)           0
A. pyrethrum            2 (6)           26-45
A. valentinus           9 (31)          0-10
Interstitial telomeric-like repeats (ITR) variability
Rosato et al. 2017. PloS one, 12(10).
REPETITIVE ELEMENTS: ANACYCLUS CASE STUDY
Hypothesis: activation of the repeat machinery drives homoploid changes in GS
Karyological 45S rDNA site phenotypes
Vitales et al. 2019. Annals of Botany (in press). doi: https://doi.org/10.1093/aob/mcz183
Conservation levels of highly abundant TEs are decoupled from the actual GS of the species
REPETITIVE ELEMENTS: ANACYCLUS CASE STUDY
Comparative repeat composition of Anacyclus species
Sequence conservation by differential stringency mapping
Alternative hypothesis: recombination events between
homologous chromosomes derived from distinct genomes
(i.e. from homoploid hybridization) leading to
chromosome arm exchanges, which would result in
different genome sizes.
REPETITIVE ELEMENTS: ANACYCLUS CASE STUDY
Vitales et al. 2019. Annals of Botany (in press). doi: https://doi.org/10.1093/aob/mcz183
SUMMARY
• Shallow sequencing of gDNA (genome skimming) can provide an in-depth characterization of repetitive DNA.
• Genomic repeat abundances and repeat sequence similarities contain phylogenetic signal and can be used as complementary markers to infer evolutionary histories.
• Combining phylogenetic approaches based on repeat abundances and on repeat sequence similarities can help to understand the mechanisms governing genome and repeatome evolution.
• Further development of these methods should focus on automating the data processing and obtaining support values for phylogenetic trees and networks.
Acknowledgements:
Institut Botànic de Barcelona, CSIC-ICUB: Sònia Garcia, Teresa Garnatje, Jaume Pellicer, Joan Pere Pascual
Real Jardín Botánico, CSIC: Gonzalo Nieto-Feliner, Inés Álvarez, Javier Fuertes
Universitat de Barcelona: Joan Vallès, Oriane Hidalgo
Institute of Biophysics, Brno: Aleš Kovařík
Jardí Botànic de la Universitat de València: Marcela Rosato, Josep Antoni Rosselló
University of Bedfordshire: Steven Dodsworth