BioPAX The Birth of A Data Exchange Language for Biological Pathways

BioPAX: The Birth of a Data Exchange Language for Biological Pathways. Joanne Luciano, BioPAX Core Group (www.biopax.org). 7th International Annual Bio-Ontologies Meeting, 30 July 2004, Glasgow, Scotland, United Kingdom.


Page 1: BioPAX The Birth of A Data Exchange Language for Biological Pathways

BioPAX
The Birth of A Data Exchange Language for Biological Pathways

Joanne Luciano
BioPAX Core Group
www.biopax.org

7th International Annual Bio-Ontologies Meeting
30 July 2004
Glasgow, Scotland
United Kingdom

Page 2: BioPAX The Birth of A Data Exchange Language for Biological Pathways

30 July 2004 7th BioOntologies Workshop 2

Introduction

BioPAX = Biopathway Exchange Language

Emerged at ISMB

• conceived at ISMB ’01
• born at ISMB ’02
• crawling at ISMB ’03 (Level 0.5)
• walking at ISMB ’04 (Level 1.0)
• now approaching the “terrible twos”

Page 3: BioPAX The Birth of A Data Exchange Language for Biological Pathways


What is a pathway?

Depends on who you ask

Page 4: BioPAX The Birth of A Data Exchange Language for Biological Pathways


Research Community Need

[Figure labels]
Pathway Databases: WIT, BioCyc, Reactome, aMAZE, KEGG, BIND, DIP, HPRD, MINT, IntAct, PSI format, CSNDB, TRANSPATH, TRANSFAC, PubGene, GeneWays
Categories: Metabolic, Protein Interaction, Signal Transduction, Gene Regulatory
Integrated Pathway Database

Page 5: BioPAX The Birth of A Data Exchange Language for Biological Pathways


Design Goals

• Encapsulation: An entire pathway in one record
• Compatible: Use existing standards wherever possible
• Computable: From file reading to logical inference
• Successful: Buy-in from the research community

Page 6: BioPAX The Birth of A Data Exchange Language for Biological Pathways


Technical Logistics & Goals

Interoperability
– Integration and exchange of pathway data
– Interchange through a common (standard) representation
– Accommodate existing database representations
– Provide a basis for future databases
– Enable development of tools for searching and reasoning over the database

Page 7: BioPAX The Birth of A Data Exchange Language for Biological Pathways


Technical Logistics (cont’d)

Why OWL? Why OWL DL?

• Expressivity (biology = “complex relationships”)
• W3C Standard (use existing standards), “Semantic Web enabled”
• XML based (the exchange language in computing)
• Machine Computable
  – Facilitate integration of knowledge, data, tool development
  – Uncover inconsistencies and new knowledge
  – OWL DL
    • Enable full reasoning capability for users, from file reading to logical inference
    • Complete: all conclusions are guaranteed to be computed
    • Decidable: all computations will finish in finite time (with OWL Lite, short amount of time)

Page 8: BioPAX The Birth of A Data Exchange Language for Biological Pathways


Social Logistics

Get organized
Make the decision & commitment
2 or 3 dedicated individuals

Small core group
– Bi-weekly conference calls, bi-monthly F2F
– Commitment & resources
  • Participants willing and able to cover their costs
  • Outside funding (DOE)

Special interests and needs form subgroup task forces
• Core group member(s)
• Outside experts

International representation & participation (Outreach & Community Building)
• conferences and mailing lists
• follow-up and individual

Collaborate with complementary/competing representations

Page 9: BioPAX The Birth of A Data Exchange Language for Biological Pathways


Social Logistics

How we engendered buy-in from the field, which made life much easier

Take things in steps:
• Pathway Database vision -> Data Exchange Format as 1st step
• Data Exchange Format -> Release in Levels of increasing complexity
  Level 1 supports Metabolic pathways, Level 2

Early success leads to early adoption, which leads to increased probability of overall project success.

Get “buy in” and get involvement - leads to acceptance later
• Support the existing databases (BioCyc, WIT, BIND, etc.)
  – Got database sources to agree to participate in the development to assure that their DBs will be properly represented
• Got database sources to agree to export in the new format once it is defined

Page 10: BioPAX The Birth of A Data Exchange Language for Biological Pathways


Social Logistics (cont’d)

Get “buy in” (continued)
• Community Involvement and Support
  – Core group (represents voice of community, small, committed)
  – Mailing List
  – User community
  – Subgroups
• International Meetings and Presentations
  – Tool developers
  – Modelers
  – Users (researchers)
  – Ontology developers
  – Database providers
  – Complementary representations (SBML, CellML)
  – Like minds
  – General Community

Page 11: BioPAX The Birth of A Data Exchange Language for Biological Pathways


Implementation of BioPAX

Designed using GKB Editor and Protégé

BioPAX uses OWL to define the Schema

BioPAX Instances to store the data
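The split between schema and instances can be illustrated with a small Python sketch. It is only an analogy: the class and field names below (`PhysicalEntity`, `Reaction`, `left`, `right`, `catalyst`, `schema_violations`) are hypothetical stand-ins chosen for illustration, not the actual BioPAX OWL classes or slots, and plain Python classes do not carry OWL semantics.

```python
from dataclasses import dataclass, field
from typing import List, Optional


# --- "Schema": class definitions (hypothetical, simplified names) ---
@dataclass
class PhysicalEntity:
    name: str


@dataclass
class SmallMolecule(PhysicalEntity):
    pass


@dataclass
class Protein(PhysicalEntity):
    pass


@dataclass
class Reaction:
    name: str
    left: List[PhysicalEntity]
    right: List[PhysicalEntity]
    catalyst: Optional[Protein] = None


@dataclass
class Pathway:
    name: str
    steps: List[Reaction] = field(default_factory=list)


def schema_violations(pathway: Pathway) -> List[str]:
    """Return messages for instances that do not fit the class definitions."""
    problems = []
    for step in pathway.steps:
        for participant in step.left + step.right:
            if not isinstance(participant, PhysicalEntity):
                problems.append(f"{step.name}: {participant!r} is not a PhysicalEntity")
        if step.catalyst is not None and not isinstance(step.catalyst, Protein):
            problems.append(f"{step.name}: catalyst is not a Protein")
    return problems


# --- "Instances": the data, expressed in terms of the schema ---
glucose = SmallMolecule("glucose")
atp = SmallMolecule("ATP")
g6p = SmallMolecule("glucose 6-phosphate")
adp = SmallMolecule("ADP")
hexokinase = Protein("hexokinase")

step1 = Reaction("glucose phosphorylation", [glucose, atp], [g6p, adp], hexokinase)
glycolysis = Pathway("glycolysis (first step only)", [step1])

print(schema_violations(glycolysis))  # an empty list means the data fits the schema
```

The point of the sketch is the division of labour: the class definitions are written once and shared, while each database contributes only instances that conform to them.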

Page 12: BioPAX The Birth of A Data Exchange Language for Biological Pathways


BioPAX – Ontology

Page 13: BioPAX The Birth of A Data Exchange Language for Biological Pathways


OWL(schema)

Instances (Individuals)

data

Page 14: BioPAX The Birth of A Data Exchange Language for Biological Pathways


Complex Relationships Captured

Page 15: BioPAX The Birth of A Data Exchange Language for Biological Pathways


Ontology Slot Definitions

Page 16: BioPAX The Birth of A Data Exchange Language for Biological Pathways


Integration -> Knowledge
Knowledge is Power

Data in the same format:
• Metabolic
• Protein-Protein Interaction
• Signal Transduction
• Gene Regulation

Facilitates
– Centralized public pathway DB
– Share data between existing DBs
– Distribute public and proprietary data
– Knowledge Assembly
– Reasoning

Page 17: BioPAX The Birth of A Data Exchange Language for Biological Pathways


A Common Exchange Language

[Figure labels: Without BioPAX, >100 DBs and tools; BioPAX; Database; Application; User]

Promotes collaboration (big science), accessibility
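A rough way to see why a common exchange language helps is to count converters. The sketch below is an illustration under a simplifying assumption that is not stated on the slide: without a shared format, every tool needs a one-way converter to every other tool, whereas with a shared format each tool needs only one exporter and one importer. The function names are hypothetical.

```python
def point_to_point_converters(n: int) -> int:
    """One-way converters needed if each of n formats translates to every other."""
    return n * (n - 1)


def shared_format_converters(n: int) -> int:
    """One exporter plus one importer per tool when all use a common format."""
    return 2 * n


n_tools = 100  # the slide cites more than 100 DBs and tools
print(point_to_point_converters(n_tools))  # 9900
print(shared_format_converters(n_tools))   # 200
```

In practice not every pair of tools needs to exchange data, so the first number is an upper bound, but the quadratic versus linear growth is the relevant contrast.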

Page 18: BioPAX The Birth of A Data Exchange Language for Biological Pathways


Consistency Checking: Nutrient-related analysis of a BioPAX knowledge base

[Figure labels: Known Nutrient set; Fired Reaction; Unfired Reaction; Essential compounds; Missing essential compound; Biomass]
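The slide gives only the figure labels, so the following is one plausible reading rather than the method actually used: start from the known nutrient set, repeatedly "fire" any reaction whose substrates are all available, add its products to the available pool, and then report essential compounds that were never produced. All names and the toy data are hypothetical.

```python
def fire_reactions(reactions, nutrients):
    """Forward-propagate from the nutrient set.

    reactions maps a reaction name to (substrates, products).
    Returns the set of reachable compounds and the set of fired reactions.
    """
    available = set(nutrients)
    fired = set()
    changed = True
    while changed:
        changed = False
        for name, (substrates, products) in reactions.items():
            if name not in fired and set(substrates) <= available:
                fired.add(name)
                available |= set(products)
                changed = True
    return available, fired


def check_essentials(reactions, nutrients, essentials):
    """Return (missing essential compounds, unfired reactions)."""
    available, fired = fire_reactions(reactions, nutrients)
    missing = set(essentials) - available
    unfired = set(reactions) - fired
    return missing, unfired


# Toy knowledge base (made-up compounds and reactions, for illustration only)
toy_reactions = {
    "R1": (["A"], ["B"]),
    "R2": (["B", "C"], ["D"]),
    "R3": (["E"], ["F"]),  # E is never supplied, so R3 cannot fire
}
missing, unfired = check_essentials(toy_reactions, nutrients={"A", "C"}, essentials={"D", "F"})
print(missing)  # {'F'}
print(unfired)  # {'R3'}
```

A missing essential compound points either to a gap in the knowledge base (an absent reaction) or to an incomplete nutrient set, which is what makes this kind of check useful as a consistency test.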

Page 19: BioPAX The Birth of A Data Exchange Language for Biological Pathways


What Next?

• BioPAX future development
  – Level 2, 3, future levels
  – BOF (check schedule)
  – Talk later today by Gary Bader at BioPathways SIG
  – Poster in Main Conference (check program)
• Development of tools and API
  – libBioPAX
• Semantic Web Life Science Initiatives
  – BOF Sunday

Page 20: BioPAX The Birth of A Data Exchange Language for Biological Pathways


BioPAX Supporting Groups

Groups
• Memorial Sloan-Kettering Cancer Center: G. Bader, M. Cary, J. Luciano, C. Sander
• SRI Bioinformatics Research Group: P. Karp, S. Paley, J. Pick
• University of Colorado Health Sciences Center: I. Shah
• BioPathways Consortium: J. Luciano, E. Neumann, A. Regev, V. Schachter
• Argonne National Laboratory: N. Maltsev, E. Marland
• Samuel Lunenfeld Research Institute: C. Hogue
• Harvard Medical School: E. Brauner, D. Marks, J. Luciano, A. Regev
• NIST: R. Goldberg
• Stanford: T. Klein
• Columbia: A. Rzhetsky
• Dana Farber Cancer Institute: J. Zucker

Collaborating Organizations:
• Proteomics Standards Initiative (PSI)
• Systems Biology Markup Language (SBML)
• CellML
• Chemical Markup Language (CML)

Databases
• BioCyc (www.biocyc.org)
• BIND (www.bind.ca)
• WIT (wit.mcs.anl.gov/WIT2)
• PharmGKB (www.pharmgkb.org)

Grants
• Department of Energy (Workshop)

The BioPAX Community

Page 21: BioPAX The Birth of A Data Exchange Language for Biological Pathways


Exchange Formats in the Pathway Data Space

[Figure labels: PSI; SBML, CellML; Database Exchange Formats; Simulation Model Exchange Formats; Protein Interaction Networks; Regulatory Pathways (Low Detail, High Detail); Metabolic Pathways (Low Detail, High Detail); Biochemical Reactions; Rate Formulas]

Page 22: BioPAX The Birth of A Data Exchange Language for Biological Pathways


Level 1 BioPAX
Released July 2004

[Figure labels: BioPAX Level 1; PSI; SBML, CellML; Database Exchange Formats; Simulation Model Exchange Formats; Genetic Interactions; Molecular Interactions (Pro:Pro, All:All); Interaction Networks (Molecular, Non-molecular; Pro:Pro, TF:Gene, Genetic); Regulatory Pathways (Low Detail, High Detail); Metabolic Pathways (Low Detail, High Detail); Biochemical Reactions; Small Molecules (Low Detail, High Detail); Rate Formulas]

Page 23: BioPAX The Birth of A Data Exchange Language for Biological Pathways


Exchange Formats in the Pathway Data Space

[Figure labels: BioPAX; PSI; SBML, CellML; Database Exchange Formats; Simulation Model Exchange Formats; Genetic Interactions; Molecular Interactions (Pro:Pro, All:All); Interaction Networks (Molecular, Non-molecular; Pro:Pro, TF:Gene, Genetic); Regulatory Pathways (Low Detail, High Detail); Metabolic Pathways (Low Detail, High Detail); Biochemical Reactions; Small Molecules (Low Detail, High Detail); Rate Formulas]