Presentatie nbic2011templates



Page 1: Presentatie nbic2011templates

A democracy of reporting standards for omics studies

Kees van Bochove

NBIC BioAssist taskforce leader

metabolomics and study capture

[email protected]

Page 2: Presentatie nbic2011templates

Origin of the question: consortia

Their aims:

  Get an overview of the studies of all partners

  Share study data

  Standardization of bioinformatics

Nutritional Phenotype Database Project (dbNP)

http://dbnp.org

Data Support Platform (DSP)

http://www.nmcdsp.org

Page 3: Presentatie nbic2011templates

Central question: How do I turn study descriptions and metadata tables into a persistent, queryable database?

Database

Query
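The central question can be illustrated with a minimal sketch. It is not how GSCF works internally; it only assumes a small tab-delimited metadata table (the column names and values are invented for illustration) and uses Python's built-in sqlite3 module to make it queryable.

```python
import csv
import io
import sqlite3

# Hypothetical tab-delimited sample metadata; columns and values are illustrative only.
METADATA_TSV = (
    "sample\tsubject\tdiet\ttissue\n"
    "S1\tmouse01\thigh fat\tliver\n"
    "S2\tmouse01\thigh fat\tblood\n"
    "S3\tmouse02\tcontrol\tliver\n"
)


def load_metadata(conn, tsv_text):
    """Create a 'samples' table from the header row and insert every data row."""
    rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
    header, data = rows[0], rows[1:]
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    conn.execute(f"CREATE TABLE samples ({columns})")
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f"INSERT INTO samples VALUES ({placeholders})", data)
    conn.commit()


def samples_for_diet(conn, diet):
    """Return the sample names recorded for one diet, sorted by name."""
    cursor = conn.execute(
        "SELECT sample FROM samples WHERE diet = ? ORDER BY sample", (diet,)
    )
    return [row[0] for row in cursor]


# An in-memory database keeps the sketch self-contained;
# passing a file path to sqlite3.connect would make it persistent.
conn = sqlite3.connect(":memory:")
load_metadata(conn, METADATA_TSV)
print(samples_for_diet(conn, "high fat"))
```

A fixed table like this breaks down as soon as partners use different columns, which is the problem the template approach on the later slides addresses.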

Page 4: Presentatie nbic2011templates

We studied many available open source solutions… and finally decided to create our own

•  Pedro
•  OpenBIS
•  WikiLIMS
•  ISACreator
•  SysMO-DB
•  Annotare
•  LabKey
•  MOLGENIS
•  i3Cube
•  And more…

Open formats:
•  XML formats: MAGE-ML, FuGE, LabKey etc.
•  Tab-delimited formats: MAGETAB, ISATAB, XGAP
•  RDF

Page 5: Presentatie nbic2011templates

GSCF: Generic Study Capture Framework

•  Open source web application, developed in Grails (= Groovy on Rails)
•  Groovy is an extension of the Java language; it compiles to Java bytecode (can be run on any Java VM)
•  Development started October/November 2009, with on average 4 full-time programmers since then
•  Current version is 0.8.0
•  Info: http://dbnp.org
•  Test it: http://demo.dbnp.org
•  Source code: http://trac.nbic.nl/gscf

Page 6: Presentatie nbic2011templates

GSCF homepage

Page 7: Presentatie nbic2011templates

Chris Taylor (MIBBI) about data standards

“Coverage of experimental design in current bioinformatics standards is meagre at best”

Page 8: Presentatie nbic2011templates

Study design

Page 9: Presentatie nbic2011templates

GSCF study design wizard

Page 10: Presentatie nbic2011templates

GSCF study design wizard

Page 11: Presentatie nbic2011templates

Use of ontologies: users don’t like long lists of terms

Page 12: Presentatie nbic2011templates

Study design overview

Page 13: Presentatie nbic2011templates

Machiel Jansen about Knowledge Representation

“The representation of knowledge will always depend on its use”

Page 14: Presentatie nbic2011templates

Study overview – which columns should be there?

Page 15: Presentatie nbic2011templates

Different ‘data levels’ in a study

•  Study (meta level)
•  Subject (source organism, e.g. humans, mice, plants, cell lines)
•  Event (e.g. treatment, compound, diet)
•  Sampling Event (e.g. DNA isolation, liver sampling)
•  Sample (e.g. blood sample, urine sample)
•  Assay (e.g. transcriptomics, metabolomics, sequencing)

•  Lines up mostly with both ISATAB and MIBBI Foundry
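As a rough illustration of how these levels relate, the sketch below models them as plain Python dataclasses. The class and attribute names are assumptions made for this example and do not reflect the actual GSCF domain classes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Subject:
    name: str
    species: str


@dataclass
class Event:
    description: str  # e.g. a treatment, compound or diet


@dataclass
class SamplingEvent:
    description: str  # e.g. liver sampling


@dataclass
class Sample:
    name: str
    subject: Subject
    sampling_event: SamplingEvent


@dataclass
class Assay:
    name: str
    technology: str  # e.g. transcriptomics, metabolomics
    samples: List[Sample] = field(default_factory=list)


@dataclass
class Study:
    title: str
    subjects: List[Subject] = field(default_factory=list)
    events: List[Event] = field(default_factory=list)
    sampling_events: List[SamplingEvent] = field(default_factory=list)
    samples: List[Sample] = field(default_factory=list)
    assays: List[Assay] = field(default_factory=list)


def subjects_in_assay(assay):
    """Walk from the assay level back to the subject level via its samples."""
    return sorted({sample.subject.name for sample in assay.samples})


# Hypothetical example study, for illustration only.
mouse1 = Subject("mouse01", "Mus musculus")
mouse2 = Subject("mouse02", "Mus musculus")
liver = SamplingEvent("liver sampling")
s1 = Sample("S1", mouse1, liver)
s2 = Sample("S2", mouse2, liver)
assay = Assay("liver arrays", "transcriptomics", [s1, s2])
study = Study(
    "High fat diet study",
    subjects=[mouse1, mouse2],
    events=[Event("high fat diet")],
    sampling_events=[liver],
    samples=[s1, s2],
    assays=[assay],
)
print(subjects_in_assay(assay))
```

Because each sample keeps a reference to its subject and sampling event, a result at the assay level can be traced back to the study design.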

Page 16: Presentatie nbic2011templates

GSCF template editor – Subject level

Page 17: Presentatie nbic2011templates

GSCF template editor – Event level
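The idea of a template per data level can be sketched as a named set of typed fields against which a record is checked. This is a hypothetical illustration: the names TemplateField, Template and validate, and the example fields, are assumptions and are not taken from the GSCF code base.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass(frozen=True)
class TemplateField:
    name: str
    type: type
    required: bool = False


@dataclass(frozen=True)
class Template:
    name: str
    level: str  # e.g. "Subject" or "Event"
    fields: Tuple[TemplateField, ...]


def validate(template: Template, record: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the record fits the template."""
    problems = []
    known = {f.name for f in template.fields}
    for f in template.fields:
        if f.name not in record:
            if f.required:
                problems.append(f"missing required field: {f.name}")
        elif not isinstance(record[f.name], f.type):
            problems.append(f"wrong type for {f.name}: expected {f.type.__name__}")
    for key in record:
        if key not in known:
            problems.append(f"unknown field: {key}")
    return problems


# Hypothetical subject-level template for mouse studies.
mouse_template = Template(
    name="Mouse",
    level="Subject",
    fields=(
        TemplateField("species", str, required=True),
        TemplateField("age_weeks", int),
        TemplateField("gender", str),
    ),
)

good = validate(mouse_template, {"species": "Mus musculus", "age_weeks": 12})
bad = validate(mouse_template, {"age_weeks": "twelve", "cage": "A3"})
print(good)
print(bad)
```

Different consortia can then define their own templates for the same level without changing the underlying data model.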

Page 18: Presentatie nbic2011templates

Barend Mons about structured data

“Everyone wants structured data, but no one wants to fill out the forms”

Page 19: Presentatie nbic2011templates

Importer – upload Excel file

Page 20: Presentatie nbic2011templates

Importer – map your Excel file onto templates
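The mapping step can be sketched as a lookup from spreadsheet column headers to template field names. To keep the example self-contained, comma-separated text stands in for the Excel file; the column headers, field names and converters are assumptions for illustration, not the actual importer.

```python
import csv
import io

# Stand-in for an uploaded spreadsheet (hypothetical content).
SPREADSHEET = (
    "Animal ID,Species,Age (wk),Remarks\n"
    "mouse01,Mus musculus,12,healthy\n"
    "mouse02,Mus musculus,14,\n"
)

# The user-chosen mapping: spreadsheet column -> template field.
COLUMN_MAPPING = {
    "Animal ID": "name",
    "Species": "species",
    "Age (wk)": "age_weeks",
}

# Per-field converters; fields without an entry are kept as text.
CONVERTERS = {"age_weeks": int}


def import_rows(text, mapping, converters):
    """Turn each spreadsheet row into a record keyed by template field name."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        record = {}
        for column, value in row.items():
            field_name = mapping.get(column)
            if field_name is None:
                continue  # columns the user did not map are skipped
            convert = converters.get(field_name, str)
            record[field_name] = convert(value)
        records.append(record)
    return records


subjects = import_rows(SPREADSHEET, COLUMN_MAPPING, CONVERTERS)
print(subjects)
```

Keeping the mapping separate from the parsing means the same spreadsheet layout can be reused for later uploads.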

Page 21: Presentatie nbic2011templates

Jildau Bouwman about study capturing

“If we really want to do personalized health research, we have to capture everything that might affect our measurements!”

Page 22: Presentatie nbic2011templates

DbNP data model

REST protocol

Page 23: Presentatie nbic2011templates

Transcriptomics module

Page 24: Presentatie nbic2011templates

Metabolomics module

Page 25: Presentatie nbic2011templates

Next Generation Sequencing module

Page 26: Presentatie nbic2011templates

Query composer

Page 27: Presentatie nbic2011templates

Query results on Study level

Page 28: Presentatie nbic2011templates

Query results on Sample level

Page 29: Presentatie nbic2011templates

Query results on Assay level

Page 30: Presentatie nbic2011templates

Next steps

•  Within the NMC DSP project, we will create a ‘GSCF data fetch’ functionality in Galaxy, enabling the execution of workflows on specific data-slices from the database

•  Connect to Semantic Web efforts (OpenPHACTS project) – we also have a pilot with TNO and UvA on using a triple store to enrich GSCF assay results

•  Align with other projects: e.g. Hackathon result gscf4molgenis

•  Employ the NBIC philosophy – these tools are also available to you!

Page 31: Presentatie nbic2011templates

Hackathon results – GSCF – MOLGENIS adapter

http://hackathon.nmcdsp.org | http://trac.nbic.nl/gscf4molgenis

Page 32: Presentatie nbic2011templates

Acknowledgements

Tjeerd Abma, Adem Bilican, Jildau Bouwman, Christine Chichester, Sudeshna Das, Marjan van Erk, Chris Evelo, Prasad Gajula, Roeland van Ham, Thomas Hankemeier, Margriet Hendriks, Guido Hooiveld, Robert Horlings, Peter Horvatovich, Rob Hooft, Machiel Jansen, Jim Kaput, Kostas Karasavvas, Bart Keijser, Matthew Lange, Scott Marshall, Barend Mons, Ben van Ommen, Linette Pellis, Janneke van der Ploeg, Marijana Radonjic, Theo Reijmers, Erik Roos, Marco Roos, Frans Paul Ruzius, Jahn Saito, Susanna Sansone, Siemen Sikkema, Rob Stierum, Eugene van Someren, Morris Swertz, Chris Taylor, Michael van Vliet, Jeroen Wesbeek, Katy Wolstencroft, Suzan Wopereis, Gooitzen Zwanenburg