The iPlant Collaborative: iPToL Data Assembly Workshop, November 21st, 2009. Steve Goff, Sonya Lowry, Martha Narro, Dan Stanzione. University of Arizona, Texas Advanced Computing Center


Page 1:

The iPlant Collaborative

iPToL Data Assembly Workshop

November 21st, 2009

Steve Goff, Sonya Lowry, Martha Narro, Dan Stanzione

University of Arizona, Texas Advanced Computing Center

Page 2:

What is the iPlant Collaborative?

• iPlant’s nature:
  – an organization that enables new conceptual advances through integrative, computational thinking
  – an organization that is by, for, and of the community
  – a service-oriented project, not a research project (creates CI in support of plant science research, but does not perform research outside of prototyping & testing)
• iPlant’s mission:
  – address an evolving array of plant science grand challenge questions
  – to enable the research community to identify the major problems in plant sciences, then to develop & pursue needed CI solutions

Page 3:

What is the process for identifying GC questions?

• Encourage and assist the community in organizing grand challenge workshops, forming grand challenge teams and developing grand challenge ‘white papers’ (‘proposals’)

• Community-representative Board of Directors (drawn from community nominations) evaluates ‘proposals’ and teams and makes recommendations for priorities

• iPlant leadership team decides whether & how to implement the Board’s recommendations and assists GC team leads in assessing needs, specifying requirements and designing ‘Discovery Environments’ to serve the team’s and the broader community’s needs

Page 4:

Biological questions will drive cyberinfrastructure design

• Phylogenetic relationships among species
  – Building large phylogenetic trees (species and gene)
  – Understanding Green Plant species relationships
  – Understanding gene family evolution
  – Addressing taxonomic problems and concepts
  – Facilitating understanding of evolution: form & function
  – Facilitating understanding of evolutionary processes

• Phenotype-Genotype relationships


Page 5:

Biological questions will drive cyberinfrastructure design

• Phylogenetic relationships among species

• Phenotype-Genotype relationships
  – Sharing, accessing, and integrating datasets
  – Analysis and extraction of information and patterns (automated phenotyping, imaging, etc.)
  – Identifying complex relationships; networks & systems
  – Assigning functions to genes, networks & systems
  – Integration of phenological & ecological data with networks and systems of genes, proteins, etc.
  – Understanding responses to environmental changes & stresses (including climate) in natural and ‘ag’ ecosystems


Page 6:

The iPlant Collaborative

Organization chart (boxes only; connecting lines not recoverable):

• Internal Advisory Board
• Science Opportunities Teams
• Administrative Support Team
• Education, Outreach, and Training Team
• Cyberinfrastructure Development Team
• Executive Team
• National Science Foundation
• Board of Directors
• Community Grand Challenge Teams

Page 7:

Executive Team

• Co-Director: Steve Goff
• Co-Director: Dan Stanzione

Cyberinfrastructure Development Team

• Project Managers: Karla Gendler, Michael Gonzales
• Lead Developer: Sonya Lowry
• Semantic Web Architect: Damian Gessler

CI Team

• Phylogenetics Engagement Team Lead: Sheldon McKay
• Gen2Phen Engagement Team Lead: Matt Vaughn
• CI Advisory Team: Greg Andrews, Sudha Ram, Nirav Merchant, Lincoln Stein, Doreen Ware
• IT/Infrastructure: Edwin Skidmore
• Developers, Systems Staff, Research Scientists at ASU, CSHL, UA, UT

Page 8:

Scope: What iPlant won’t do

• iPlant is not a funding agency
  – A large grant shouldn’t become a bunch of small grants
• iPlant will not fund data generation
• iPlant will (probably) not fund <favorite tool x>
  – Whose funding is ending
• iPlant will not replace all online databases
• iPlant will not *impose* community standards

Page 9:

Scope: What iPlant *will* do

• Provide storage, computation, hosting, & programmer effort to support GC projects
• Work with community to support & develop standards
• Provide forums to discuss the role and design of CI in plant science
• Help organize the community to collect data
• Provide appropriate funding for time spent helping us design and test the CI

Page 10:

GC Projects to Date

• 2 Grand Challenges
• 11 Working Groups
• Participation from ~45 scientists from ~25 institutions beyond the original iPlant team

Page 11:

iPToL

• Final Deliverable
  – A web (interface) environment allowing the scientific community to create, access, share, annotate, and visualize phylogenetic tree(s) of varying size and complexity. Included in this environment are the software tools, as well as the infrastructure to host, process, analyze, and store this information.

• 6 working groups

Page 12:

iPToL Working Groups

• Data Assembly
  – Assembling the data to produce the 500K taxa tree
• Big Trees
  – Providing the methods (and the tree) to produce a 500K taxa tree
• Trait Evolution
  – Providing methods to relate phylogenetic trees to the evolution of specific traits
• Tree Reconciliation
  – Developing tools for inferring gene family histories in the context of species trees
• Data Integration
  – Combining data from different sources to initially meet the needs of the iPToL working groups
• Tree Visualization
  – Developing tools to visualize large trees and annotations

Page 13:

iPToL WG Membership

Page 14:

iPG2P

• Final deliverable:
  – Procedure allowing an investigator to begin with a trait of interest in a species possessing limited genetic resources and progress toward the ability to predict trait scores for known genotypes in given, non-constant environments
• Identifying cross-cutting biological use cases to be addressed by working groups
  – Ex: develop informatics tools to reveal regulatory networks underlying photosynthetic differentiation in C3 and C4 plants

• 5 working groups

Page 15:

iPG2P Working Groups

• NextGen Sequencing
  – Establishing an informatics pipeline that will allow the plant community to process NextGen sequence data
• Statistical Inference
  – Developing a platform using advanced computational approaches to statistically link genotype to phenotype
• Modeling Tools
  – Developing a framework to support tools for the construction, simulation and analysis of computational models of plant function at various scales of resolution and fidelity
• Visual Analytics
  – Generating, adapting, and integrating visualization tools capable of displaying diverse types of data from laboratory, field, in silico analyses and simulations
• Data Integration
  – Investigating and applying methods for describing and unifying data sets into virtual systems that support iPG2P activities

Page 16:

iPG2P WG Membership

Page 17:

Additional Efforts (in progress or under consideration)

• Image Analysis Platform – Edgar Spalding, BS Manjunath, Kris Kvilekval, Justin Borevitz, Steve Welch, Ed Buckler
• Semantic Web – Damian Gessler
• APWeb2 –
• Taxonomic Intelligence –

Page 18:

Why Evolution of Plants is more Exciting than Evolution of Humans

www.iplantcollaborative.org

versus

Page 19:

Discussion