Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç...

38
Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity François Lutzoni Duke University Department of Biology Durham, North Carolina USA

Transcript of Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç...

Page 1: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,

Developing phyloinformatic tools for

an evolutionary understanding of

molecular biodiversity

François Lutzoni

Duke University Department of BiologyDurham, North CarolinaUSA

Page 2: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,
Page 3: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,
Page 4: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,

WHERE ARE WE IN ASSEMBLING THE FUNGAL TREE OF LIFE?%

of

pu

bli

shed

tre

es

Tot

al #

pu

bli

shed

tre

es e

xam

ined

1-locus 2-locus 3-locus 4-locus Total

560 publications reporting 595 fungal phylogenetic trees: # loci, # species, # orders (Martin Grube, Elizabeth Baloch, Betsy Arnold)

Lutzoni et al., 2004, American Journal of Botany

Page 5: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,

Summary of survey

83.9% of fungal phylogenies are based exclusively on sequences from the ribosomal RNA tandem repeats.

Few studies have focused on resolving relationships among orders of Fungi

354 of 595 trees examined (59.5%) conveyed relationships within single orders

Although the number of species included in published trees has generally increased over time, most studies have included fewer than 100 species, with an overall mean of 34.2 ± 2.3 species/study (range: 3–1155 species)

Page 6: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,

Constraints on progress

Lack of coordination toward a common goal

- poor overlap among genes

Page 7: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,

Taxon sampling

2) A total of 13,467 GenBank sequences were considered

1) What are the loci that have been sequenced for the highest number of fungal species and appropriate for resolving deep relationhsips within the Fungi? nucSSU and nucLSU rDNA

3) 1010 unique taxa had both sequences available

Taxonomic name associated with the sequence

4) Taxa excluded: < 600 bp Many errors in GenBank “definition line”

3) Unpublished sequences added from various projects and AFTOL Total of 573 taxa

Page 8: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,

Taxon sampling (part 2)

2) nucSSU+nucLSU+mitSSU = 253 taxa (but only Asco- andBasidiomycota)

1) What is the next locus that has been sequenced for the highest number of the 573 species selected and appropriate for resolving deep relationhsips within the Fungi? mitSSU rDNA, RPB2

3) nucSSU+nucLSU+RPB2 = 161 taxa (only Asco- and Basiomycota)

4) nucSSU+nucLSU+mitSSU+RPB2 = 103 taxa (only Asco- and Basiomycota)

Page 9: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,

Constraints on progress

Lack of coordination toward a common goal

- poor overlap among genes

- sampling bias toward macro fungi

Page 10: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,

Two-locus Bayesian MCMCMC treenucSSU+nucLSU

558 species, 430 genera, 68 orders, 5 phyla

Ascomycota

Basidiomycota

Glomeromycota

Zygomycota

Chytridiomycota

Page 11: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,

Constraints on progress

Lack of coordination toward a common goal

- poor overlap among genes

- sampling bias toward macro fungi

Analytical and bioinformatic advancements in systematic biology were almost exclusively in alignment algorithms, contig assembly and visualization, and in phylogenetics

Page 12: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,

DATA ACQUISITION

PHYLOGENETICANALYSES

TWO MAIN OPERATIONS NEEDED FORLARGE-SCALE PHYLOGENETIC ANALYSES

TREEOF

LIFE

Page 13: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,

Essence of AFToL in solving data acquisition issues

• COMMUNICATION/COORDINATION

• STANDARDIZATION

• AUTOMATION

Page 14: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,

PROPOSAL::*1500 species of fungi for 8 loci (≈ 10 kb) in 4 years!

6 nuclear loci:nucSSU rDNA, nucLSU rDNA, ITS rDNA, RPB1, RPB2, EF-1 2 mitochondrial loci: mitSSU rDNA, ATP6

*Ultrastructural features common to all fungiNew data and compilation of all existing data

Starting date: January 1, 2003

National Science FoundationAssembling the Tree of Life

AToL

See www.aftol.org for more information

Page 15: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,

David Hibbett (Clark University): 400 Basidiomycota

François Lutzoni (Duke University): 400 Lichen-forming Ascomycota et al. & Bioinformatics

David McLaughlin (University of Minnesota): Subcellular characters

Joseph Spatafora (Oregon State University): 400 Ascomycota

Rytas Vilgalys (Duke University): 300 Chrytridiomycota, Glomeromycota & Zygomycota

“ASSEMBLING THE FUNGAL TREE OF LIFE”(AFToL)

Page 16: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,

One proposal from the entire mycological community

“ASSEMBLING THE FUNGAL TREE OF LIFE”(AFToL)

113 collaborators from 23 countries

Strong bioinformatic & computational biology component

Preceded by an NSF Network Coordination grant “DEEP HYPHA”

Page 17: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,
Page 18: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,

AFTOL

Duke

Clark

OregonState

U ofMinnesota

RPB1 - RPB2

EF1 - ATP6 - mitSSU rDNA

nucLSU - SSU - ITS rDNA

1500

species

morphologicaldata

(SPB, nucleardivision, ...)

Page 19: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,

Source Information of DNA Samples

DNA SAMPLE

Voucher info

Culture info

Frozen material info

Providers of material/collaborators

Morpho &Ultrastructuralinfo

Team: Clark UDuke L LabDuke V labOSUUofM

Expected phylum

Page 20: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,

Entry in DB of source information of DNA + (voucher, cultures, frozen sample, DNA, etc...) + team providing the material + morphological/ultrastructural info + expected phylym

sequencing

Automatedsequencer

Raw sequences output Server #1

Clark U Duke L Duke V

When sequencingCompleted

Locus 3 alignLocus 2 alignLocus 1 align

Server #2BLAST (in house)

Initiate NJ bootstrap analyses for set of taxa common to more than one locus

NJ analysis of gene 1 For taxa in common between

DS2 and DS3

NJ analysis of gene 1 For taxa in common between

DS2 and DS4

NJ analysis of gene 2 For taxa in common between

DS3 and DS2

NJ analysis of gene 2 For taxa in common between

DS3 and DS4

NJ analysis of gene 3 For taxa in common between

DS4 and DS2

NJ analysis of gene 3 For taxa in common between

DS4 and DS2

Congruence test

OSU UofM

Blast checkpointBasidiomycete?

No

Phred & PhrapassemblerYes

GenBank Weekly

Page 21: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,
Page 22: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,
Page 23: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,
Page 24: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,
Page 25: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,
Page 26: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,
Page 27: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,

WWW based interface to AFTOL database

Secure access for registered users only

Possibility to enter, retrieve and change data

Each author views and modifies only his/her own data

Dynamic web page construction using Zope

Page 28: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,
Page 29: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,

Kauff, Cox & Lutzoni, Systematic Biology (in press)

Page 30: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,

Future goals of AFToL and interplay with Barcode

Continue developing WASABI / export to other projects

Less taxa more genes (25 genes for 160 taxa)

Because phylogenies will need multiple locifungal diversity cannot be covered

Because the focus of the Barcode initiative is identification and is based only on a single short locus it has the potential to cover all fungal diversity

AFToL and Barcode initiatives can mutually benefit

Page 31: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,

Why ITS for Fungi?• Easy to amplify and sequence

• Mixture of highly conserved (5.8S) and variable sites (ITS 1 and 2) [change definition of DNA barcode]

• Most broadly used marker for systematic and ecological studies of fungi at the species level

• The majority of the fungal diversity is unknown

• Highest overlap with other genes (AFToL)

• Alignment issues across a broad phylogenetic spectrum can be dealt with

Page 32: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,

> 95%

> 95%

Page 33: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,

Betsy Arnold, University of Arizona, TucsonJolanta Miadlikowska, Duke University

≈ 65 000 fungal speciesEstimated ≈ 1.5 million speciesPart of this diversity is known (herbaria)

Page 34: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,

Highest overlap with other genes (AFToL)

• Beyond a single marker inventory of life

• What are the putative ecological/physiological roles? What are the most closely related taxa? = phylogenies ≠ similarity

• Reconstruction of trait evolution on comprehensive phylogenies

• Branch lengths are important and trees not significantly worse than the optimal tree are needed = Bayesian MCMC searches

• More genes needed (nucSSU, nucLSU, RPB1, RPB2, mitSSU)

Page 35: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,

•Most endophytes &endolichenic fungi areAscomycota.•Beyond this level,specificity is an open question.Parasitism vs. mutualism?Horizontal vs. vertical transmission?

Page 36: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,
Page 37: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,
Page 38: Developing phyloinformatic tools for an evolutionary understanding of molecular biodiversity Fran ç ois Lutzoni Duke University Department of Biology Durham,

Acknowledgments

• Betsy Arnold, University of Arizona at Tucson

• Cymon Cox, British Museum, England

• Frank Kauff, University of Kaiserslautern, Germany

• Jolanta Miadlikowska, Duke University

• National Science Foundation