iPlant Tree of Life

iPlant Tree of Life Naim Matasci Plant and Animal Genome XIX Conference Jan 15-19, 2011

Description: iPlant Tree of Life. Given at PAG XIX (2011). Overview of the ToL project.

Transcript of iPlant Tree of Life

Page 1: iPlant Tree of Life

iPlant Tree of Life

Naim Matasci

Plant and Animal Genome XIX Conference

Jan 15-19, 2011

Page 2: iPlant Tree of Life

Nothing in biology makes sense except in the light of evolution.

T. G. Dobzhansky

Page 3: iPlant Tree of Life

Phylogenetic Insights

Page 4: iPlant Tree of Life

Tree of Life

A metaphor for the phylogeny of all species or a large group of species

Page 5: iPlant Tree of Life

Scalability

Ackerly, 2009; J. Felsenstein, ca. 1980; Ranger Cluster at TACC

Page 6: iPlant Tree of Life

iPToL Challenges

Large phylogenetic inference: Building a tree of life for up to 500,000 green plants

Tree Visualization: Scalable visualization for small to large trees

Data Assembly and Integration: Acquiring, organizing and processing the data

Taxonomic Intelligence: Sorting out different names for the same species

Tree Reconciliation: Resolving discordant gene and species trees

Trait Evolution: Using trees to understand how traits evolved

Page 7: iPlant Tree of Life

Big Trees

To optimize existing methods to construct phylogenetic trees on the order of 500K taxa.

Page 8: iPlant Tree of Life

Tree Building

Page 9: iPlant Tree of Life

[Figure: factorial growth of the number of possible trees compared with the number of atoms in the universe. Surviving labels: E10, E2.]
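The comparison on this slide rests on how fast the number of possible tree topologies grows. As a rough illustration (the formula is not on the slide): the number of distinct unrooted, fully bifurcating trees on n labelled tips is (2n-5)!!, and 10^80 is a commonly quoted order-of-magnitude estimate for the number of atoms in the observable universe. The function names below are hypothetical.

```python
ATOMS_IN_UNIVERSE = 10 ** 80  # assumption: common order-of-magnitude estimate


def num_unrooted_trees(n_taxa: int) -> int:
    """Count unrooted, fully bifurcating trees on n labelled tips: (2n-5)!!."""
    count = 1
    # Multiply the odd numbers 3 * 5 * ... * (2n-5); empty for n <= 3.
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


def first_n_exceeding(limit: int) -> int:
    """Smallest number of tips whose tree count is larger than `limit`."""
    n = 3
    while num_unrooted_trees(n) <= limit:
        n += 1
    return n


if __name__ == "__main__":
    for n in (4, 5, 10, 20):
        print(n, "tips:", num_unrooted_trees(n), "trees")
    print("tips needed to exceed 10^80 trees:", first_n_exceeding(ATOMS_IN_UNIVERSE))
```

Because Python integers are arbitrary precision, the same function can be evaluated for much larger taxon counts, which is why exhaustive search over topologies is not an option and heuristics are used instead.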

Page 10: iPlant Tree of Life

Big Trees

NINJA (Travis Wheeler)

• Neighbor-Joining implementation that can analyze > 200K species
• Software rewritten from Java to C with MPI
• Six-day run time reduced 32-fold to 4.5 hours for the 220K species data set
• Two- to three-day run time reduced 1,800-fold to 2 minutes for distance matrix calculation on the 220K set

RAxML (Alexandros Stamatakis)

• Large-scale Maximum Likelihood implementation
• Added checkpointing
• Porting the Pthreads implementation to MPI
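For readers unfamiliar with neighbor-joining, the method NINJA implements, here is a minimal sketch of the textbook algorithm (Saitou and Nei's Q criterion). It is not NINJA's optimized code, it returns only the topology (no branch lengths), and the function name and input format are assumptions made for illustration.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Return an unrooted topology as nested tuples.

    names: list of unique tip labels.
    dist:  dict mapping (label_a, label_b) pairs to distances.
    """
    d = {frozenset(pair): value for pair, value in dist.items()}
    nodes = list(names)
    while len(nodes) > 2:
        n = len(nodes)
        # Total distance from each node to all the others.
        r = {a: sum(d[frozenset((a, b))] for b in nodes if b != a) for a in nodes}
        # Join the pair that minimizes Q(a, b) = (n - 2) * d(a, b) - r(a) - r(b).
        a, b = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]],
        )
        new = (a, b)
        # Distance from the new internal node to every remaining node.
        for c in nodes:
            if c != a and c != b:
                d[frozenset((new, c))] = 0.5 * (
                    d[frozenset((a, c))] + d[frozenset((b, c))] - d[frozenset((a, b))]
                )
        nodes = [c for c in nodes if c != a and c != b] + [new]
    return tuple(nodes)


if __name__ == "__main__":
    # Toy additive distances generated from the tree ((A,B),(C,D)).
    toy = {
        ("A", "B"): 3, ("A", "C"): 5, ("A", "D"): 8,
        ("B", "C"): 6, ("B", "D"): 9, ("C", "D"): 5,
    }
    print(neighbor_joining(["A", "B", "C", "D"], toy))
```

This naive version recomputes the Q criterion for every pair at every step, so it is cubic in the number of taxa; that cost is what makes data sets of 200K species demanding.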

Page 11: iPlant Tree of Life

In the works

• Source code releases
• Further improvements
• Building the Tree of Life

Page 12: iPlant Tree of Life

Tree Visualization

To develop an application for viewing, analyzing and exploring large phylogenetic trees.

Page 13: iPlant Tree of Life

Tree Visualization

• > 500K Taxa
• Fast
• Platform independent
• Semantic zooming
• Metadata-driven display of information

Page 14: iPlant Tree of Life

iPlant Tree Viewer Prototype

Page 15: iPlant Tree of Life

My-Plant.org

To easily share information and research, collaborate, and stay on top of the latest news in the field.

Page 16: iPlant Tree of Life

Social Networking for the Plant Sciences

• The usual suspects in social networking features
– Image gallery
– File sharing
– Group/private messaging
– Forums
– Group posts
– “Colleagues”
– User profiles
– Searchable content

Page 17: iPlant Tree of Life
Page 18: iPlant Tree of Life

My Clades

Page 19: iPlant Tree of Life

My Clades

Page 20: iPlant Tree of Life

In the works

• Integration with other social networks/services
– Twitter, Facebook

• Expanding the number of active clades

November 13, 2010

Page 21: iPlant Tree of Life

Sign up today!

• Go to http://my-plant.org
• Click on Registration
• Try it out!

• Check out poster (Software session)

Page 22: iPlant Tree of Life

1KP

Collaboration (1KP) – To support the data analysis of the Thousand Plant Transcriptome Project

Page 23: iPlant Tree of Life

1KP

[Figure: N(genes) versus N(species). Labels: "unexplored territory"; "dozens of species, completed genomes"; "dozens of genes, PCR in 10^4 species".]

Page 24: iPlant Tree of Life

Broad phylogenetic coverage

algae non-flowering flowering (angiosperm)

on role of polyploidy in Darwin’s “abominable mystery”

phylogenomics of 1000 species across plant taxa

Page 25: iPlant Tree of Life

Tuesday Afternoon, 18 January 2011 - 3:50 pm to 6:00 pm

Gene Expression Analysis Workshop - Pacific Salon 3

Organizers: David Galbraith, University of Arizona and Greg May, NCGR, Santa Fe

@ 4:00 pm - Gane Ka-Shu Wong, University of Alberta

"1KP: an International Consortium Sequencing the Transcriptomes of 1000 Phylogenetically Diverse Plants from Angiosperms to Green Algae"

Page 26: iPlant Tree of Life

Taxonomic Name Resolution

Collaboration (BIEN) - To unify and resolve synonymous, erroneous, or other conflicting taxonomic names.

Page 27: iPlant Tree of Life

Taxonomic Intelligence

Arabidopsis thaliana

Arabis thaliana

Page 28: iPlant Tree of Life

Taxonomic Intelligence

Arabidopsis thaliana

Arabis thaliana

Mouse-ear cress

Thale cress

Page 29: iPlant Tree of Life

Taxonomic Intelligence

Arabidopsis thaliana

Arabis thaliana

Mouse-ear cress

Thale cress

1A12_ARATH

Q06402
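The labels collected on these slides all point at one species. A minimal sketch of that idea, assuming a hand-built lookup table (hypothetical, filled only with the labels shown above) plus a fuzzy fallback for misspellings using the standard-library difflib module; this is an illustration, not the Taxonomic Name Resolution Service itself.

```python
import difflib

ACCEPTED = "Arabidopsis thaliana"

# Hypothetical lookup table: every label from the slides, lower-cased,
# mapped to the accepted scientific name.
SYNONYMS = {
    "arabidopsis thaliana": ACCEPTED,
    "arabis thaliana": ACCEPTED,
    "mouse-ear cress": ACCEPTED,
    "thale cress": ACCEPTED,
    "1a12_arath": ACCEPTED,
    "q06402": ACCEPTED,
}


def normalize(name: str) -> str:
    """Lower-case the name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def resolve(name: str, cutoff: float = 0.85):
    """Return the accepted name, or None when no confident match exists."""
    key = normalize(name)
    if key in SYNONYMS:
        return SYNONYMS[key]
    # Fall back to approximate matching to catch simple misspellings.
    close = difflib.get_close_matches(key, list(SYNONYMS), n=1, cutoff=cutoff)
    return SYNONYMS[close[0]] if close else None


if __name__ == "__main__":
    for query in ("Arabis  thaliana", "Arabidopsis thalliana", "Q06402", "cress"):
        print(repr(query), "->", resolve(query))
```

A bare "cress" is deliberately left unresolved here: with a strict cutoff it is too far from any single entry, which mirrors the ambiguity of common names.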

Page 30: iPlant Tree of Life

Taxonomic Intelligence

cress

Page 31: iPlant Tree of Life

Taxonomic Intelligence

cress

Page 32: iPlant Tree of Life

Image courtesy of Brad Boyle.

Taxonomic Name Resolution Service

Page 33: iPlant Tree of Life

Taxonomic Name Resolution Service

Page 34: iPlant Tree of Life

Taxonomic Name Resolution Service (Demo)

Page 35: iPlant Tree of Life

In the works

• Higher and lower taxonomic orders
• Synonymy
• API

Page 36: iPlant Tree of Life

Tree Reconciliation

To reconcile the evolutionary history of genes and species.

Page 37: iPlant Tree of Life

Origins of incongruence

• Lineage sorting and hybridization
• Gene duplications
• Horizontal gene transfer

Tree Reconciliation allows us to infer the occurrence of these evolutionary events
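One classic way to infer such events is LCA (last common ancestor) reconciliation, sketched below for gene duplications only. It is a simplified illustration, not the method used by the iPlant tool: losses, horizontal transfer and lineage sorting are ignored, trees are plain nested tuples, and the "species_copy" tip naming is a hypothetical convention.

```python
def leaf_set(tree):
    """Set of tip labels below a node of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def all_clades(tree):
    """Leaf sets of every node (tips included) of a nested-tuple tree."""
    result = [leaf_set(tree)]
    if not isinstance(tree, str):
        for child in tree:
            result.extend(all_clades(child))
    return result


def lca(species_clades, species):
    """Smallest species-tree clade that contains all given species."""
    return min((c for c in species_clades if species <= c), key=len)


def count_duplications(gene_tree, species_clades, species_of):
    """Return (species clade the node maps to, duplications at or below it)."""
    if isinstance(gene_tree, str):
        return lca(species_clades, frozenset([species_of(gene_tree)])), 0
    children = [count_duplications(c, species_clades, species_of) for c in gene_tree]
    here = lca(species_clades, frozenset().union(*(m for m, _ in children)))
    dups = sum(d for _, d in children)
    # A node is a duplication if it maps to the same clade as one of its children.
    if any(m == here for m, _ in children):
        dups += 1
    return here, dups


if __name__ == "__main__":
    species_tree = (("A", "B"), "C")
    clades = all_clades(species_tree)
    by_prefix = lambda gene: gene.split("_")[0]  # "A_1" belongs to species "A"
    gene_tree = ((("A_1", "B_1"), ("A_2", "B_2")), "C_1")
    print(count_duplications(gene_tree, clades, by_prefix)[1])
```

Note that a gene tree that merely disagrees with the species tree, such as (("A_1", "C_1"), "B_1"), is also explained by a duplication under this mapping, which is one reason fuller models that account for the other sources of incongruence are needed.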

Page 38: iPlant Tree of Life

Species Tree

Page 39: iPlant Tree of Life

Gene Tree

Page 40: iPlant Tree of Life

Gene family data courtesy John Bowers

Tree Reconciliation

Page 41: iPlant Tree of Life

Tree Reconciliation (Demo)

Page 42: iPlant Tree of Life

In the works

• Additional search capabilities
• New visualization style
• Interactive mode

Page 43: iPlant Tree of Life

Trait Evolution

To develop an infrastructure for downstream analysis of large trees.

Page 44: iPlant Tree of Life

Trait Evolution

• Toolkit to study the evolution of traits of interest on very large phylogenies
– Diversification
– Biogeographic patterns
– Adaptation
– Co-evolution
– …

Page 45: iPlant Tree of Life

Current analyses

• Phylogenetically Independent Contrasts (Felsenstein 1985)

• Continuous Ancestral Character Estimation (Schluter et al. 1997, Paradis 2004)

• Discrete Ancestral Character Estimation (Pagel 1994, Paradis 2004)
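As an illustration of the first analysis in this list, here is a minimal sketch of phylogenetically independent contrasts for one continuous trait on a fully bifurcating tree, following the recursion in Felsenstein (1985). The tree encoding (a tip is a string; an internal node is a pair of (subtree, branch_length) tuples) and the function name are assumptions made for this example, not the toolkit's interface.

```python
import math


def pic(node, trait):
    """Return (estimated node value, extra branch length, list of contrasts)."""
    if isinstance(node, str):
        return trait[node], 0.0, []
    (left, v1), (right, v2) = node
    x1, extra1, contrasts1 = pic(left, trait)
    x2, extra2, contrasts2 = pic(right, trait)
    # Lengthen each branch by the uncertainty of the value estimated below it.
    v1 += extra1
    v2 += extra2
    # Standardized contrast between the two daughter values.
    contrast = (x1 - x2) / math.sqrt(v1 + v2)
    # Node value: average of the daughters weighted by inverse branch length.
    value = (x1 / v1 + x2 / v2) / (1 / v1 + 1 / v2)
    # Extra length passed up to the branch above this node.
    extra = v1 * v2 / (v1 + v2)
    return value, extra, contrasts1 + contrasts2 + [contrast]


if __name__ == "__main__":
    # Toy tree ((A:1, B:1):1, C:2) with made-up trait values.
    tree = (((("A", 1.0), ("B", 1.0)), 1.0), ("C", 2.0))
    values = {"A": 1.0, "B": 3.0, "C": 5.0}
    root_value, _, contrasts = pic(tree, values)
    print(root_value, contrasts)
```

A tree with n tips yields n - 1 contrasts; the sign of each contrast depends only on the arbitrary left/right ordering of the daughters.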

Page 46: iPlant Tree of Life

Phylogenetically Independent Contrasts

Video

Page 47: iPlant Tree of Life

In the works

• Various tree stretching models
• Evolutionary model fitting
• Contrasts for discrete traits (Pagel 1994)

• Tighter integration with the Discovery Environment

• More tools and methods from the community

Page 48: iPlant Tree of Life