iPlant Tree of Life

Post on 09-May-2015

iPlant Tree of Life. Given at PAG XIX (2011). Overview of the ToL project.

Naim Matasci

Plant and Animal Genome XIX Conference

Jan 15-19, 2011

Nothing in biology makes sense except in the light of evolution.

T. G. Dobzhansky

Phylogenetic Insights

Tree of Life

A metaphor for the phylogeny of all species or a large group of species

Scalability

Ackerly, 2009; J. Felsenstein, ca. 1980; Ranger Cluster at TACC

iPToL Challenges

Large phylogenetic inference: Building a tree of life for up to 500,000 green plants

Tree Visualization: Scalable visualization for small to large trees

Data Assembly and Integration: Acquisition, organization, and processing of the data

Taxonomic Intelligence: Sorting out different names for the same species

Tree Reconciliation: Resolving discordant gene and species trees

Trait Evolution: Using trees to understand how traits evolved

Big Trees: To optimize existing methods to construct phylogenetic trees on the order of 500K taxa.

Tree Building

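[Figure: the number of possible trees grows factorially with the number of taxa and quickly exceeds the number of atoms in the universe. Axis ticks (E2, E10) not recoverable.]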
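The comparison on this slide can be reproduced with a few lines of arithmetic. The sketch below uses the standard count of unrooted, fully bifurcating trees on n labeled taxa, (2n-5)!!, and compares it with 10^80, a commonly quoted order-of-magnitude estimate for the number of atoms in the observable universe. Both the function names and the 10^80 figure are assumptions introduced here; neither comes from the slides.

```python
def unrooted_tree_count(n_taxa):
    """Number of unrooted, fully bifurcating trees on n labeled taxa: (2n-5)!!"""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n-5.
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


def taxa_needed_to_exceed(limit):
    """Smallest number of taxa whose tree count is larger than `limit`."""
    n = 3
    while unrooted_tree_count(n) <= limit:
        n += 1
    return n


ATOMS_IN_UNIVERSE = 10 ** 80  # assumed order-of-magnitude estimate

if __name__ == "__main__":
    for n in (4, 5, 10, 20):
        print(n, "taxa:", unrooted_tree_count(n), "unrooted trees")
    n = taxa_needed_to_exceed(ATOMS_IN_UNIVERSE)
    print(n, "taxa already give more than 10^80 unrooted trees")
```

Exhaustive search is therefore out of the question long before 500K taxa, which is why the project relies on heuristic and distance-based methods.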

Big Trees

NINJA (Travis Wheeler)
• Neighbor-Joining implementation that can analyze > 200K species
• Software rewritten from Java to C with MPI
• Six-day run time reduced 32-fold to 4.5 hours for the 220K-species data set
• Two- to three-day run time reduced 1,800-fold to 2 minutes for distance matrix calculation on the 220K set
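For readers unfamiliar with neighbor-joining, here is a minimal sketch of a single agglomeration step of the textbook algorithm: pick the pair minimizing the Q criterion, compute the two branch lengths, and reduce the distance matrix. It is a plain O(n^2)-per-step illustration, not NINJA's optimized implementation, and the function name, data layout, and five-taxon distances are all hypothetical.

```python
from itertools import combinations


def nj_step(names, d):
    """Perform one neighbor-joining agglomeration step.

    names: list of taxon labels.
    d: dict mapping frozenset({x, y}) to the distance between x and y.
    Returns (a, b, len_a, len_b, new_names, new_d).
    """
    n = len(names)
    if n < 3:
        raise ValueError("need at least 3 taxa")
    # Row sums of the distance matrix.
    r = {x: sum(d[frozenset((x, y))] for y in names if y != x) for x in names}
    # Join the pair with the smallest Q(a, b) = (n-2) d(a,b) - r(a) - r(b).
    a, b = min(
        combinations(names, 2),
        key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]],
    )
    dab = d[frozenset((a, b))]
    # Branch lengths from a and b to their new parent node.
    len_a = dab / 2 + (r[a] - r[b]) / (2 * (n - 2))
    len_b = dab - len_a
    # Build the reduced matrix with the new node u replacing a and b.
    u = f"({a},{b})"
    rest = [x for x in names if x not in (a, b)]
    new_d = {frozenset(p): d[frozenset(p)] for p in combinations(rest, 2)}
    for x in rest:
        new_d[frozenset((u, x))] = (
            d[frozenset((a, x))] + d[frozenset((b, x))] - dab
        ) / 2
    return a, b, len_a, len_b, rest + [u], new_d


# Hypothetical distances between five taxa.
NAMES = ["a", "b", "c", "d", "e"]
PAIRS = {
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
    ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
    ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3,
}
DIST = {frozenset(k): float(v) for k, v in PAIRS.items()}

if __name__ == "__main__":
    a, b, la, lb, names, dist = nj_step(NAMES, DIST)
    print(f"join {a} and {b} with branch lengths {la} and {lb}")
```

Repeating the step until three nodes remain yields the full tree. Each step scans all pairs, which is the cost that a scalable implementation has to reduce for data sets of more than 200K species.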

RAxML (Alexandros Stamatakis)
• Large-scale Maximum Likelihood implementation
• Added checkpointing
• Re-implementing the Pthreads implementation with MPI

In the works

• Source code releases
• Further improvements
• Building the Tree of Life

Tree Visualization: To develop an application for viewing, analyzing and exploring large phylogenetic trees.

Tree Visualization

• > 500K taxa
• Fast
• Platform independent
• Semantic zooming
• Metadata-driven display of information

iPlant Tree Viewer Prototype

My-Plant.org: To easily share information and research, collaborate, and stay on top of the latest news in the field.

• The usual suspects in social networking features

– Image gallery
– File sharing
– Group/private messaging
– Forums
– Group posts
– “Colleagues”
– User profiles
– Searchable content

Social Networking for the Plant Sciences

My Clades

In the works

• Integration with other social networks/services
– Twitter, Facebook

• Expanding the number of active clades

November 13, 2010

Sign up today!

• Go to http://my-plant.org
• Click on Registration
• Try it out!

• Check out poster (Software session)

1KP: Collaboration (1KP) to support the data analysis of the Thousand Plant Transcriptome Project

1KP

[Figure: N(genes) plotted against N(species)]
• Dozens of species: completed genomes
• Dozens of genes: PCR in 10^4 species
• Unexplored territory
• Phylogenomics of 1000 species across plant taxa

Broad phylogenetic coverage: algae, non-flowering, flowering (angiosperm)

On the role of polyploidy in Darwin’s “abominable mystery”

Tuesday Afternoon, 18 January 2011 - 3:50 pm to 6:00 pm

Gene Expression Analysis Workshop - Pacific Salon 3

Organizers: David Galbraith, University of Arizona and Greg May, NCGR, Santa Fe

@ 4:00 pm - Gane Ka-Shu Wong, University of Alberta

"1KP: an International Consortium Sequencing the Transcriptomes of 1000 Phylogenetically Diverse Plants from Angiosperms to Green Algae"

Taxonomic Name Resolution: Collaboration (BIEN) to unify and resolve synonymous, erroneous, or other conflicting taxonomic names.

Taxonomic Intelligence

Arabidopsis thaliana

Arabis thaliana

Mouse-ear cress

Thale cress

1A12_ARATH

Q06402
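The idea behind this slide, many labels pointing at one species, can be sketched as a lookup table with a fuzzy fallback for misspellings. This is only an illustration using Python's standard `difflib` module; it is not the Taxonomic Name Resolution Service or its API, and the table, the function name `resolve_name`, and the similarity cutoff are assumptions made for the example.

```python
import difflib

ACCEPTED = "Arabidopsis thaliana"

# Toy synonym table built from the names on the slide (keys are lowercase).
NAME_TABLE = {
    "arabidopsis thaliana": ACCEPTED,
    "arabis thaliana": ACCEPTED,
    "mouse-ear cress": ACCEPTED,
    "thale cress": ACCEPTED,
}


def resolve_name(raw, cutoff=0.85):
    """Return the accepted name for `raw`, or None if nothing matches well.

    Exact matches are tried first (ignoring case and extra spaces); if that
    fails, the closest known name above `cutoff` similarity is used.
    """
    key = " ".join(raw.lower().split())
    if key in NAME_TABLE:
        return NAME_TABLE[key]
    close = difflib.get_close_matches(key, list(NAME_TABLE), n=1, cutoff=cutoff)
    return NAME_TABLE[close[0]] if close else None


if __name__ == "__main__":
    for query in ("Arabis thaliana", "Arabidopsis thalliana", "Thale cress", "cress"):
        print(repr(query), "->", resolve_name(query))
```

With this cutoff, a bare "cress" is left unresolved rather than guessed, since a vague common name on its own does not identify a single species.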

Taxonomic Intelligence

cress

Image courtesy of Brad Boyle.

Taxonomic Name Resolution Service

Taxonomic Name Resolution Service (Demo)

In the works

• Higher and lower taxonomic orders
• Synonymy
• API

Tree Reconciliation: To reconcile the evolutionary history of genes and species.

Origins of incongruence

• Lineage sorting and hybridization
• Gene duplications
• Horizontal gene transfer

Tree Reconciliation allows us to infer the occurrence of these evolutionary events
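One classic way to infer such events is LCA reconciliation under a duplication-loss model: each gene-tree node is mapped to the smallest species-tree clade containing all of its species, and a node is called a duplication when it maps to the same clade as one of its children. The sketch below illustrates only that idea. It is not the method used by the iPlant tool, it does not model lineage sorting or horizontal transfer, and the nested-tuple tree encoding and function names are hypothetical.

```python
def leaf_set(tree):
    """Set of leaf labels under a nested-tuple tree (leaves are strings)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def species_clades(tree):
    """List the leaf set of every node in the species tree."""
    result = [leaf_set(tree)]
    if not isinstance(tree, str):
        for child in tree:
            result.extend(species_clades(child))
    return result


def lca_map(gene_node, clades):
    """Smallest species clade containing every species under gene_node."""
    species = leaf_set(gene_node)
    return min((c for c in clades if species <= c), key=len)


def count_duplications(gene_tree, clades):
    """Count gene-tree nodes that map to the same clade as one of their children."""
    if isinstance(gene_tree, str):
        return 0
    here = lca_map(gene_tree, clades)
    is_dup = any(lca_map(child, clades) == here for child in gene_tree)
    return int(is_dup) + sum(count_duplications(c, clades) for c in gene_tree)


# Gene-tree leaves are labeled with the species each gene copy was sampled from.
SPECIES_TREE = (("A", "B"), "C")
CLADES = species_clades(SPECIES_TREE)

if __name__ == "__main__":
    congruent = (("A", "B"), "C")
    duplicated = ((("A", "B"), ("A", "B")), "C")
    print(count_duplications(congruent, CLADES))
    print(count_duplications(duplicated, CLADES))
```

A gene tree that matches the species tree needs no duplications, while two copies of the (A, B) clade sitting side by side imply one duplication before A and B diverged.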

Species Tree

Gene Tree

Gene family data courtesy of John Bowers

Tree Reconciliation

Tree Reconciliation (Demo)

In the works

• Additional search capabilities
• New visualization style
• Interactive mode

Trait Evolution: To develop an infrastructure for downstream analysis of large trees.

Trait Evolution

• Toolkit to study the evolution of traits of interest on very large phylogenies
– Diversification
– Biogeographic patterns
– Adaptation
– Co-evolution
– …

Current analyses

• Phylogenetically Independent Contrasts (Felsenstein 1985)

• Continuous Ancestral Character Estimation (Schluter et al. 1997, Paradis 2004)

• Discrete Ancestral Character Estimation (Pagel 1994, Paradis 2004)
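As an illustration of the first analysis in this list, the following is a minimal sketch of the independent-contrasts recursion of Felsenstein (1985) for a rooted binary tree with branch lengths. At each internal node it records a standardized contrast between the two child values, then passes up a weighted average and a lengthened branch. The tree encoding, trait values, and function name are hypothetical, and this is not the code behind the iPlant tool.

```python
import math


def pic(node, traits, contrasts):
    """Return (value, branch_length) for `node`, appending standardized contrasts.

    A leaf is (name, branch_length); an internal node is
    ((left, right), branch_length). `traits` maps leaf names to trait values.
    """
    label, length = node
    if isinstance(label, str):
        return traits[label], length
    left, right = label
    x1, v1 = pic(left, traits, contrasts)
    x2, v2 = pic(right, traits, contrasts)
    # Standardized contrast: difference scaled by its expected standard deviation.
    contrasts.append((x1 - x2) / math.sqrt(v1 + v2))
    # Ancestral value: average of the children weighted by inverse branch length.
    value = (x1 / v1 + x2 / v2) / (1 / v1 + 1 / v2)
    # Lengthen the branch above to account for uncertainty in the estimate.
    return value, length + v1 * v2 / (v1 + v2)


# Hypothetical three-taxon tree ((A:1, B:1):1, C:2) with made-up trait values.
TREE = (((("A", 1.0), ("B", 1.0)), 1.0), ("C", 2.0)), 0.0
TRAITS = {"A": 1.0, "B": 3.0, "C": 5.0}

if __name__ == "__main__":
    contrasts = []
    root_value, _ = pic(TREE, TRAITS, contrasts)
    print("contrasts:", contrasts)
    print("root value:", root_value)
```

A tree with n tips yields n - 1 contrasts, which can then be used in place of the raw tip values in a regression or correlation.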

Phylogenetically Independent Contrasts

Video

In the works

• Various tree stretching models
• Evolutionary model fitting
• Contrasts for discrete traits (Pagel 1994)

• Tighter integration with the Discovery Environment

• More tools and methods from the community