BioPAX: The Birth of A Data Exchange Language for Biological Pathways. Joanne Luciano, BioPAX Group, www.biopax.org. 7th International Annual Bio-Ontologies Meeting, 30 July 2004, Glasgow, Scotland, United Kingdom.


BioPAX
The Birth of A Data Exchange Language for Biological Pathways

Joanne Luciano
BioPAX Group
www.biopax.org

7th International Annual Bio-Ontologies Meeting
30 July 2004
Glasgow, Scotland, United Kingdom


Introduction

BioPAX = Biopathway Exchange Language

Emerged at ISMB
• Conceived at ISMB ’01
• Born just before ISMB ’02 (Protégé workshop)
• Crawling at ISMB ’03 (Level 0.5)
• Walking at ISMB ’04 (Level 1.0)
• Approaching the “terrible twos”


What is a pathway?

Depends on who you ask


Research Community Need

Pathway database types: Metabolic, Protein Interaction, Signal Transduction, Gene Regulatory

Existing resources: WIT, BioCyc, Reactome, aMAZE, KEGG, BIND, DIP, HPRD, MINT, IntAct, PSI format, CSNDB, TRANSPATH, TRANSFAC, PubGene, GeneWays

Need: an Integrated Pathway Database


Design Goals
• Encapsulation: An entire pathway in one record
• Compatible: Use existing standards wherever possible
• Computable: From file reading to logical inference
• Successful: Buy-in from the research community


Technical Logistics & Goals

Interoperability
– Integration and exchange of pathway data
– Interchange through a common (standard) representation
– Accommodate existing database representations
– Provide a basis for future databases
– Enable development of tools for searching and reasoning over the database


Technical Logistics (cont’d)

Why OWL DL?
• Expressivity (biology = “complex relationships”)
• W3C Standard (use existing standards); “Semantic Web enabled”
• XML based (the exchange language in computing)
• Machine Computable: enable full reasoning capability, from file reading to logical inference
  – Facilitate integration of knowledge, data, tool development
  – Uncover inconsistencies and new knowledge
  – OWL DL
    • Complete: all conclusions are guaranteed to be computed
    • Decidable: all computations will finish in finite time (with OWL Lite, in a short amount of time)


Social Logistics
How we engendered buy-in from the field

Made it much easier

Take things in steps:
• Pathway Database -> Data Exchange Format
• Data Exchange Format -> Release in Levels of increasing complexity (early success leads to early adoption, which leads to the possibility of overall project success)

Get “buy-in” and get involvement; this leads to acceptance later
• Support the existing databases (BioCyc, WIT, BIND, etc.)
  – Got database sources to agree to participate in the development, to ensure that their DBs will be properly represented
  – Got database sources to agree to export in the new format once it is defined


Social Logistics (cont’d)

Get “buy-in” (continued)
• Community Involvement and Support
  – Core group (from community, small, meet regularly)
  – Mailing list
  – User community
  – Subgroups
  – International meetings and presentations
  – Tool developers
  – Modelers
  – Users (researchers)
  – Ontology developers
  – Database providers
  – Complementary representations (SBML, CellML)
  – Like minds
  – General community


Social Logistics (cont’d)

Get organized
• Small core group advancing the standard
• International representation via mailing lists
• Collaborate with complementary representations
• Bi-weekly conference calls, bi-monthly face-to-face (F2F) meetings
• Cost paid by participants and DOE
• Special interests have subgroups
  – Core group member + outside experts
  – Tackle specific challenges


Implementation of BioPAX

Designed using GKB Editor and Protégé

BioPAX uses OWL to define the schema

BioPAX instances store the data
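The split between an OWL schema and the instances that hold the data can be sketched in a few lines. The example below is only an illustration under stated assumptions: the namespace `http://example.org/toy-pathway#`, the class names `Pathway` and `Reaction`, and the `name` property are hypothetical and are not taken from the BioPAX ontology. It parses a tiny RDF/XML document with Python's standard library and separates the class definitions (schema) from the typed individuals (data).

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"
# Hypothetical namespace for this toy example; NOT the real BioPAX namespace.
EX = "http://example.org/toy-pathway#"

DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:owl="{OWL}" xmlns:ex="{EX}"
         xml:base="http://example.org/toy-pathway">
  <!-- Schema: class definitions -->
  <owl:Class rdf:ID="Pathway"/>
  <owl:Class rdf:ID="Reaction"/>
  <!-- Data: individuals typed by those classes -->
  <ex:Pathway rdf:ID="glycolysis">
    <ex:name>Glycolysis</ex:name>
  </ex:Pathway>
  <ex:Reaction rdf:ID="hexokinase_step">
    <ex:name>glucose + ATP = G6P + ADP</ex:name>
  </ex:Reaction>
</rdf:RDF>"""


def split_schema_and_instances(xml_text):
    """Return (class names, [(individual id, class name), ...])."""
    root = ET.fromstring(xml_text)
    id_attr = f"{{{RDF}}}ID"
    ex_prefix = f"{{{EX}}}"
    classes, individuals = [], []
    for child in root:
        ident = child.get(id_attr)
        if child.tag == f"{{{OWL}}}Class":
            # An owl:Class element belongs to the schema.
            classes.append(ident)
        elif child.tag.startswith(ex_prefix):
            # An element named after a class is an instance of that class.
            individuals.append((ident, child.tag[len(ex_prefix):]))
    return classes, individuals


classes, individuals = split_schema_and_instances(DOC)
print("schema classes:", classes)
print("instances:", individuals)
```

A real exchange file would carry many more properties per instance; the point here is only that one document format can hold both the definitions and the records that conform to them.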


BioPAX – Ontology


[Figure: OWL (schema) and instances (individuals) holding the data]


Complex Relationships Captured


Ontology Slot Definitions


Integration -> Knowledge
Knowledge is Power

Data in the same format: Metabolic, Protein-Protein Interaction, Signal Transduction, Gene Regulation

Facilitates
– Centralized public pathway DB
– Share data between existing DBs
– Distribute public and proprietary data
– Knowledge Assembly
– Reasoning


A Common Exchange Language

Without BioPAX: >100 DBs and tools

With BioPAX: promotes collaboration (big science), accessibility

[Figure: databases, applications, and users, shown without BioPAX and with BioPAX as the common exchange language]
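The usual argument for a common exchange language is a counting one, sketched below. This is an illustration, not a figure from the talk: it assumes that without a shared format every pair of resources needs a dedicated converter in each direction, and that with a shared format each resource needs only one exporter and one importer. The function names are hypothetical.

```python
def pairwise_translators(n):
    """Converters needed if every resource talks directly to every other one."""
    return n * (n - 1)


def hub_translators(n):
    """Converters needed if every resource only reads and writes a common format."""
    return 2 * n


n = 100  # illustrative value, echoing the ">100 DBs and tools" on the slide
print(pairwise_translators(n), "direct converters vs", hub_translators(n), "via a common format")
```

Under these assumptions the direct approach grows quadratically with the number of resources, while the common-format approach grows linearly.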


Consistency Checking: Nutrient-related analysis of a BioPAX knowledge base

[Figure labels: Known nutrient set, Fired reaction, Unfired reaction, Essential compounds, Missing essential compound, Biomass]
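One plausible reading of the labels on this slide is a forward-propagation check: start from the known nutrient set, fire every reaction whose substrates are all available, add its products to the pool, and repeat until nothing changes. Essential compounds that are never produced are then reported as missing. The sketch below is a minimal version under that assumption; the data structures, function names, and toy reactions are hypothetical and do not come from the talk or from any BioPAX tool.

```python
def forward_propagate(nutrients, reactions):
    """Fire reactions until no new compound becomes available.

    reactions maps a reaction name to (substrates, products).
    Returns (available compounds, names of fired reactions).
    """
    available = set(nutrients)
    fired = set()
    changed = True
    while changed:
        changed = False
        for name, (substrates, products) in reactions.items():
            # A reaction fires once, when all of its substrates are available.
            if name not in fired and set(substrates) <= available:
                fired.add(name)
                available |= set(products)
                changed = True
    return available, fired


def check_consistency(nutrients, reactions, essential):
    """Report fired and unfired reactions and unreachable essential compounds."""
    available, fired = forward_propagate(nutrients, reactions)
    return {
        "fired": sorted(fired),
        "unfired": sorted(set(reactions) - fired),
        "missing_essential": sorted(set(essential) - available),
    }


# Toy knowledge base (illustrative only).
reactions = {
    "r1": (["glucose"], ["pyruvate"]),
    "r2": (["pyruvate"], ["alanine"]),
    "r3": (["chorismate"], ["tryptophan"]),
}
report = check_consistency(["glucose"], reactions, ["alanine", "tryptophan"])
print(report)
```

In this toy case `r3` never fires because `chorismate` is neither a nutrient nor a product of any fired reaction, so `tryptophan` is flagged as a missing essential compound. A gap like this points either to an incomplete nutrient set or to a reaction missing from the knowledge base.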


What Next?

• BioPAX future development
  – Level 2, 3, future levels
  – BOF (check schedule)
  – Talk later today by Gary Bader at BioPathways SIG
  – Poster in Main Conference
• Development of tools and API
  – libBioPAX
• Semantic Web Life Science Initiatives
  – BOF Sunday


The BioPAX Community

BioPAX Supporting Groups

Groups
• Memorial Sloan-Kettering Cancer Center: G. Bader, M. Cary, J. Luciano, C. Sander
• SRI Bioinformatics Research Group: P. Karp, S. Paley, J. Pick
• University of Colorado Health Sciences Center: I. Shah
• BioPathways Consortium: J. Luciano, E. Neumann, A. Regev, V. Schachter
• Argonne National Laboratory: N. Maltsev, E. Marland
• Samuel Lunenfeld Research Institute: C. Hogue
• Harvard Medical School: E. Brauner, D. Marks, J. Luciano, A. Regev
• NIST: R. Goldberg
• Stanford: T. Klein
• Columbia: A. Rzhetsky
• Dana Farber Cancer Institute: J. Zucker

Collaborating Organizations
• Proteomics Standards Initiative (PSI)
• Systems Biology Markup Language (SBML)
• CellML
• Chemical Markup Language (CML)

Databases
• BioCyc (www.biocyc.org)
• BIND (www.bind.ca)
• WIT (wit.mcs.anl.gov/WIT2)
• PharmGKB (www.pharmgkb.org)

Grants
• Department of Energy (Workshop)


Exchange Formats in the Pathway Data Space

[Figure: the pathway data space, covering Protein Interaction Networks, Regulatory Pathways (low detail to high detail), Metabolic Pathways (low detail to high detail), Biochemical Reactions, and Rate Formulas. PSI is shown among the Database Exchange Formats; SBML and CellML are shown among the Simulation Model Exchange Formats.]


BioPAX Level 1
Released July 2004

[Figure: the coverage of BioPAX Level 1 drawn on the pathway data space alongside PSI (Database Exchange Formats) and SBML, CellML (Simulation Model Exchange Formats). Regions shown: Genetic Interactions; Molecular Interactions (Pro:Pro, All:All); Interaction Networks (Molecular, Non-molecular; Pro:Pro, TF:Gene, Genetic); Regulatory Pathways (low detail to high detail); Metabolic Pathways (low detail to high detail); Biochemical Reactions; Small Molecules (low detail to high detail); Rate Formulas.]


Exchange Formats in the Pathway Data Space

[Figure: the coverage of BioPAX drawn on the pathway data space alongside PSI (Database Exchange Formats) and SBML, CellML (Simulation Model Exchange Formats). Regions shown: Genetic Interactions; Molecular Interactions (Pro:Pro, All:All); Interaction Networks (Molecular, Non-molecular; Pro:Pro, TF:Gene, Genetic); Regulatory Pathways (low detail to high detail); Metabolic Pathways (low detail to high detail); Biochemical Reactions; Small Molecules (low detail to high detail); Rate Formulas.]