Developing phyloinformatic tools for
an evolutionary understanding of
molecular biodiversity
François Lutzoni
Duke University, Department of Biology
Durham, North Carolina, USA
WHERE ARE WE IN ASSEMBLING THE FUNGAL TREE OF LIFE?
[Figure: % of published trees and total # of published trees examined, for 1-locus, 2-locus, 3-locus and 4-locus trees and in total]
560 publications reporting 595 fungal phylogenetic trees: # loci, # species, # orders (Martin Grube, Elizabeth Baloch, Betsy Arnold)
Lutzoni et al., 2004, American Journal of Botany
Summary of survey
83.9% of fungal phylogenies are based exclusively on sequences from the ribosomal RNA tandem repeats.
Few studies have focused on resolving relationships among orders of Fungi
354 of 595 trees examined (59.5%) conveyed relationships within single orders
Although the number of species included in published trees has generally increased over time, most studies have included fewer than 100 species, with an overall mean of 34.2 ± 2.3 species/study (range: 3–1155 species)
Constraints on progress
Lack of coordination toward a common goal
- poor overlap among genes
Taxon sampling
1) What are the loci that have been sequenced for the highest number of fungal species and are appropriate for resolving deep relationships within the Fungi? nucSSU and nucLSU rDNA
2) A total of 13,467 GenBank sequences were considered
3) 1010 unique taxa had both sequences available
Taxonomic name associated with the sequence
4) Taxa excluded: < 600 bp
Many errors in GenBank “definition line”
5) Unpublished sequences added from various projects and AFTOL
Total of 573 taxa
Taxon sampling (part 2)
1) What is the next locus that has been sequenced for the highest number of the 573 species selected and is appropriate for resolving deep relationships within the Fungi? mitSSU rDNA, RPB2
2) nucSSU+nucLSU+mitSSU = 253 taxa (but only Asco- and Basidiomycota)
3) nucSSU+nucLSU+RPB2 = 161 taxa (only Asco- and Basidiomycota)
4) nucSSU+nucLSU+mitSSU+RPB2 = 103 taxa (only Asco- and Basidiomycota)
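The taxon counts for each locus combination above come down to asking, for every taxon, whether all required loci have been sequenced. The following is a minimal sketch of that tally, not the procedure actually used for the survey. The function name, the taxon-to-loci mapping and the placeholder taxon names are assumptions made for illustration; only the locus names come from the slides.

```python
from typing import Dict, Iterable, Set


def taxa_with_all_loci(loci_by_taxon: Dict[str, Set[str]],
                       required: Iterable[str]) -> Set[str]:
    """Return the taxa for which every locus in `required` is available."""
    needed = set(required)
    # A taxon qualifies when the required loci are a subset of its loci.
    return {taxon for taxon, loci in loci_by_taxon.items() if needed <= loci}


# Hypothetical toy data: placeholder taxa, not taken from the survey.
loci_by_taxon = {
    "taxon_A": {"nucSSU", "nucLSU", "mitSSU", "RPB2"},
    "taxon_B": {"nucSSU", "nucLSU", "mitSSU"},
    "taxon_C": {"nucSSU", "nucLSU", "RPB2"},
    "taxon_D": {"nucSSU", "nucLSU"},
    "taxon_E": {"nucSSU"},
}

# The locus combinations listed on the slide.
combinations = [
    ("nucSSU", "nucLSU"),
    ("nucSSU", "nucLSU", "mitSSU"),
    ("nucSSU", "nucLSU", "RPB2"),
    ("nucSSU", "nucLSU", "mitSSU", "RPB2"),
]

overlap_counts = {
    "+".join(combo): len(taxa_with_all_loci(loci_by_taxon, combo))
    for combo in combinations
}

for name, count in overlap_counts.items():
    print(f"{name} = {count} taxa")
```

Each added locus can only shrink the set of usable taxa, which is the "poor overlap among genes" constraint in miniature.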
Constraints on progress
Lack of coordination toward a common goal
- poor overlap among genes
- sampling bias toward macro fungi
Two-locus Bayesian MCMCMC tree: nucSSU+nucLSU
558 species, 430 genera, 68 orders, 5 phyla
Ascomycota
Basidiomycota
Glomeromycota
Zygomycota
Chytridiomycota
Constraints on progress
Lack of coordination toward a common goal
- poor overlap among genes
- sampling bias toward macro fungi
Analytical and bioinformatic advancements in systematic biology were almost exclusively in alignment algorithms, contig assembly and visualization, and in phylogenetics
TWO MAIN OPERATIONS NEEDED FOR LARGE-SCALE PHYLOGENETIC ANALYSES
DATA ACQUISITION
PHYLOGENETIC ANALYSES
TREE OF LIFE
Essence of AFToL in solving data acquisition issues
• COMMUNICATION/COORDINATION
• STANDARDIZATION
• AUTOMATION
PROPOSAL:
* 1500 species of fungi for 8 loci (≈ 10 kb) in 4 years!
6 nuclear loci: nucSSU rDNA, nucLSU rDNA, ITS rDNA, RPB1, RPB2, EF-1
2 mitochondrial loci: mitSSU rDNA, ATP6
* Ultrastructural features common to all fungi
New data and compilation of all existing data
Starting date: January 1, 2003
National Science Foundation
Assembling the Tree of Life (AToL)
See www.aftol.org for more information
David Hibbett (Clark University): 400 Basidiomycota
François Lutzoni (Duke University): 400 Lichen-forming Ascomycota et al. & Bioinformatics
David McLaughlin (University of Minnesota): Subcellular characters
Joseph Spatafora (Oregon State University): 400 Ascomycota
Rytas Vilgalys (Duke University): 300 Chytridiomycota, Glomeromycota & Zygomycota
“ASSEMBLING THE FUNGAL TREE OF LIFE” (AFToL)
One proposal from the entire mycological community
“ASSEMBLING THE FUNGAL TREE OF LIFE” (AFToL)
113 collaborators from 23 countries
Strong bioinformatic & computational biology component
Preceded by an NSF Network Coordination grant “DEEP HYPHA”
[Diagram: AFTOL teams (Duke, Clark, Oregon State, U of Minnesota); data for 1500 species: RPB1 - RPB2; EF1 - ATP6 - mitSSU rDNA; nucLSU - SSU - ITS rDNA; morphological data (SPB, nuclear division, ...)]
Source Information of DNA Samples
DNA SAMPLE
Voucher info
Culture info
Frozen material info
Providers of material/collaborators
Morpho & ultrastructural info
Team: Clark U, Duke L lab, Duke V lab, OSU, UofM
Expected phylum
Entry in DB of source information of DNA (voucher, cultures, frozen sample, DNA, etc.) + team providing the material + morphological/ultrastructural info + expected phylum
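One way to picture a database entry holding the source information listed above is a single record per DNA sample. This is only a sketch under stated assumptions: the class name, field names, `sample_id` and the validation rule are hypothetical and do not describe the actual AFTOL database schema. The team labels and the kinds of information are the ones named on the slide.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Team labels as they appear on the slide.
TEAMS = ("Clark U", "Duke L lab", "Duke V lab", "OSU", "UofM")


@dataclass
class DnaSampleSource:
    """Source information for one DNA sample (hypothetical layout)."""
    sample_id: str                    # hypothetical identifier, not from the slides
    team: str                         # team providing the material
    expected_phylum: str              # phylum expected before sequencing
    voucher_info: Optional[str] = None
    culture_info: Optional[str] = None
    frozen_material_info: Optional[str] = None
    providers: List[str] = field(default_factory=list)
    morpho_ultrastructural_info: Optional[str] = None

    def __post_init__(self) -> None:
        # Assumed rule: only the five listed teams may submit entries.
        if self.team not in TEAMS:
            raise ValueError(f"unknown team: {self.team!r}")

    def has_source_material(self) -> bool:
        """True if at least one kind of source material is documented."""
        return any((self.voucher_info, self.culture_info,
                    self.frozen_material_info))


# Example entry with placeholder values.
record = DnaSampleSource(
    sample_id="sample-0001",
    team="OSU",
    expected_phylum="Ascomycota",
    voucher_info="herbarium voucher (placeholder)",
)
print(record.has_source_material())
```

Storing the expected phylum with the sample is what would let a later step compare it against what the sequence itself suggests.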
Sequencing
Automated sequencer
Raw sequences output
Server #1
Clark U, Duke L, Duke V
When sequencing completed
Locus 3 align, Locus 2 align, Locus 1 align
Server #2
BLAST (in house)
Initiate NJ bootstrap analyses for set of taxa common to more than one locus
NJ analysis of gene 1 for taxa in common between DS2 and DS3
NJ analysis of gene 1 for taxa in common between DS2 and DS4
NJ analysis of gene 2 for taxa in common between DS3 and DS2
NJ analysis of gene 2 for taxa in common between DS3 and DS4
NJ analysis of gene 3 for taxa in common between DS4 and DS2
NJ analysis of gene 3 for taxa in common between DS4 and DS3
Congruence test
OSU, UofM
BLAST checkpoint: Basidiomycete?
No
Phred & Phrap assembler
Yes
GenBank Weekly
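The BLAST checkpoint in this workflow (for example "Basidiomycete?") can be read as a comparison between the expected phylum recorded for a sample and the phylum of its best in-house BLAST hit. The sketch below illustrates that reading only. It assumes the hits have already been parsed and annotated with a phylum; `BlastHit`, its fields and `passes_blast_checkpoint` are hypothetical names, and the slide does not spell out what the pipeline does on each branch.

```python
from typing import NamedTuple, Sequence


class BlastHit(NamedTuple):
    """One already-parsed BLAST hit (hypothetical, simplified layout)."""
    subject_phylum: str   # phylum of the matched database sequence
    bit_score: float      # higher means a better match


def passes_blast_checkpoint(hits: Sequence[BlastHit],
                            expected_phylum: str) -> bool:
    """Return True if the best-scoring hit belongs to the expected phylum.

    A sequence with no hits fails the checkpoint (an assumption of this
    sketch, so that unexplained sequences get a manual look).
    """
    if not hits:
        return False
    best = max(hits, key=lambda hit: hit.bit_score)
    return best.subject_phylum == expected_phylum


# Placeholder hits for a sample expected to be a basidiomycete.
hits = [
    BlastHit(subject_phylum="Ascomycota", bit_score=310.0),
    BlastHit(subject_phylum="Basidiomycota", bit_score=845.0),
]
print(passes_blast_checkpoint(hits, "Basidiomycota"))
```

A check of this kind is cheap to automate and would flag likely contaminants or mislabeled samples before alignment and the NJ analyses.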
WWW based interface to AFTOL database
Secure access for registered users only
Possibility to enter, retrieve and change data
Each author views and modifies only his/her own data
Dynamic web page construction using Zope
Kauff, Cox & Lutzoni, Systematic Biology (in press)
Future goals of AFToL and interplay with Barcode
Continue developing WASABI / export to other projects
Fewer taxa, more genes (25 genes for 160 taxa)
Because phylogenies will need multiple loci, fungal diversity cannot be covered
Because the focus of the Barcode initiative is identification and it is based only on a single short locus, it has the potential to cover all fungal diversity
AFToL and Barcode initiatives can mutually benefit
Why ITS for Fungi?
• Easy to amplify and sequence
• Mixture of highly conserved (5.8S) and variable sites (ITS 1 and 2) [change definition of DNA barcode]
• Most broadly used marker for systematic and ecological studies of fungi at the species level
• The majority of the fungal diversity is unknown
• Highest overlap with other genes (AFToL)
• Alignment issues across a broad phylogenetic spectrum can be dealt with
> 95%
> 95%
Betsy Arnold, University of Arizona, Tucson
Jolanta Miadlikowska, Duke University
≈ 65 000 fungal species
Estimated ≈ 1.5 million species
Part of this diversity is known (herbaria)
Highest overlap with other genes (AFToL)
• Beyond a single marker inventory of life
• What are the putative ecological/physiological roles? What are the most closely related taxa? = phylogenies ≠ similarity
• Reconstruction of trait evolution on comprehensive phylogenies
• Branch lengths are important and trees not significantly worse than the optimal tree are needed = Bayesian MCMC searches
• More genes needed (nucSSU, nucLSU, RPB1, RPB2, mitSSU)
• Most endophytes & endolichenic fungi are Ascomycota.
• Beyond this level, specificity is an open question.
Parasitism vs. mutualism?
Horizontal vs. vertical transmission?
Acknowledgments
• Betsy Arnold, University of Arizona at Tucson
• Cymon Cox, British Museum, England
• Frank Kauff, University of Kaiserslautern, Germany
• Jolanta Miadlikowska, Duke University
• National Science Foundation