ISA-Tab Standards at Metabolomics Society Meeting, Tsuruoka 2014, Japan

ISA-Tab as a COSMOS standard Metabolomics Data Standards and Capture Workshop Metabolomics Society Meeting 2014, Tsuruoka, Japan Philippe Rocca-Serra (PhD) University of Oxford e-Research Centre

description

Introducing ISA-Tab Format, key features, patterns, curation issues, supporting software and RDF conversion

Transcript of ISA-Tab Standards at Metabolomics Society Meeting, Tsuruoka 2014, Japan

Page 1: ISA-Tab Standards at Metabolomics Society Meeting, Tsuruoka 2014, Japan

ISA-Tab as a COSMOS standard

Metabolomics Data Standards and Capture Workshop

Metabolomics Society Meeting 2014, Tsuruoka, Japan

Philippe Rocca-Serra (PhD), University of Oxford e-Research Centre

Page 2: ISA-Tab Standards at Metabolomics Society Meeting, Tsuruoka 2014, Japan

Data exchange, Let information flow!

• Tenets of Science: reproducibility of results and findings

• justifies the right to access data

• publishing a manuscript is no longer enough

• data should be published and released alongside it

• A GEO or an ArrayExpress for Metabolomic Data

• What would you do if you had access to 25000 studies in Metabolomics today?

Page 3: ISA-Tab Standards at Metabolomics Society Meeting, Tsuruoka 2014, Japan

Data Provenance and Preservation

It is all about structuring experimental information to make it available to computer and software agents to enable:

Notes in Lab Books (information for humans)

Spreadsheets and Tables (the compromise)

Facts as RDF statements (information for machines)

Page 4: ISA-Tab Standards at Metabolomics Society Meeting, Tsuruoka 2014, Japan

Exchange as Main Goal

• Exchange of experimental description: the Study Plan

• description of subjects and perturbations: ISA-TAB

• Exchange of spectral acquisition file: the Raw Data

• enables review, assessment, appraisal, reuse: mzML, nmrML

• Exchange of findings: the Results and Interpretation

• identified metabolites: mzTab and Metabolite Annotation File (MAF)

Page 5: ISA-Tab Standards at Metabolomics Society Meeting, Tsuruoka 2014, Japan

The essential value of Contextual Data or Metadata

• “Data about the Data”: description of the data (descriptive metadata)

• Lazy way: the “it is all in the file name” approach, e.g. CNL_MOA1_C2_LD_TP1_EWR.cdf

• Is this enough to understand what this experiment is about ... 5 years from now?
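To make the point about file names concrete, here is a minimal Python sketch. It only splits the file name quoted on the slide. It does not attempt to guess what the tokens mean, because the source does not say.

```python
# Illustrative only: split the file name from the slide into its tokens.
filename = "CNL_MOA1_C2_LD_TP1_EWR.cdf"

stem, extension = filename.rsplit(".", 1)
tokens = stem.split("_")

print(tokens)     # ['CNL', 'MOA1', 'C2', 'LD', 'TP1', 'EWR']
print(extension)  # cdf

# The values can be recovered, but their meaning cannot: nothing in the
# name says which token is a subject, a treatment, a dose or a time point.
# Positional keys are the best a program can do without external metadata.
unlabelled = {f"token_{i}": token for i, token in enumerate(tokens, start=1)}
print(unlabelled)
```

Explicit metadata replaces the positional keys with named, documented fields, which is what the ISA-Tab files described next provide.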

Page 6: ISA-Tab Standards at Metabolomics Society Meeting, Tsuruoka 2014, Japan

ISA-Tab format in a nutshell (1)

ISA metadata specifications:

• workflow and process orientated

• compatible with checklist enforcement

• compatible with external vocabulary resources

• compatible by design with existing schemas

Page 7: ISA-Tab Standards at Metabolomics Society Meeting, Tsuruoka 2014, Japan

ISA-Tab format in a nutshell (2)

• Investigation File: cardinality: 1..1

– purpose: think “executive summary”

– layout: rows of key value pairs organized in blocks

– content:

• Why? general study description

• How? methods / protocol declaration

• How? variable declarations (predictor and response variables)

• Who? contact and affiliation information

• Study File: cardinality: 1..n

– layout: true header/row of record table (think “sorting, filtering of samples”)

– content:

• What? Listing all biological materials collected over the study course and their treatments.

• Assay File: cardinality: 1..n

– layout: true header/row of record table (think “sorting, filtering of datafiles”)

– content:

• What? Listing all data acquisition events and data files collected by a given assay and subsequent data transformations
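The two layouts described above (key/value rows grouped in blocks for the Investigation File, a header row followed by records for Study and Assay Files) can be sketched in Python as follows. The file contents are made-up fragments and the function names are hypothetical. This is not the ISA tools' own parser, only an illustration of the two shapes.

```python
import csv
import io

# Made-up Investigation fragment: block names in upper case,
# then one key per row followed by its tab-separated value(s).
investigation_text = (
    "STUDY\n"
    "Study Identifier\tS1\n"
    "Study Title\tExample metabolomics study\n"
    "Study File Name\ts_study.txt\n"
    "STUDY CONTACTS\n"
    "Study Person Last Name\tDoe\n"
)

# Made-up Study fragment: one header row, then one record per row.
study_text = (
    "Source Name\tCharacteristics[organism]\tSample Name\n"
    "subject1\tHomo sapiens\tsample1\n"
    "subject2\tHomo sapiens\tsample2\n"
)


def parse_investigation(text):
    """Group key/value rows under the upper-case block name preceding them."""
    blocks = {}
    current = None
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if not row:
            continue
        if len(row) == 1 and row[0].isupper():
            current = row[0]
            blocks[current] = {}
        elif current is not None:
            blocks[current][row[0]] = row[1:]
    return blocks


def parse_table(text):
    """Read a header/record table into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


blocks = parse_investigation(investigation_text)
samples = parse_table(study_text)

print(blocks["STUDY"]["Study File Name"])       # ['s_study.txt']
print([row["Sample Name"] for row in samples])  # ['sample1', 'sample2']
```

One caveat on the design: real Study and Assay tables can repeat a column header (several Protocol REF columns, for example), and a dictionary per row would keep only the last of them, so a production reader would need to keep column positions.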

Page 8: ISA-Tab Standards at Metabolomics Society Meeting, Tsuruoka 2014, Japan

Worked example - ISA Study Sample File: Describing Study Subjects and their features

ISA syntax: Characteristics[<tag>] for declaring and annotating an ISA Source Name or Sample Name

ISA syntax: Protocol REF with sets of Parameter Value[<tag>], resulting in an ISA node Sample Name

ISA syntax: Factor Value[<tag>] for reporting treatments or study groups as a set of levels of independent variables
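As a rough illustration of the header syntax above, the following Python sketch separates the column kind from its bracketed tag. The tags used (organism, collection time, dose) are invented examples, and `classify_header` is a hypothetical helper rather than part of any ISA tool.

```python
import re

# Matches the three bracketed header kinds named on the slide.
HEADER_PATTERN = re.compile(
    r"^(Characteristics|Parameter Value|Factor Value)\[(.+)\]$"
)


def classify_header(header):
    """Return (kind, tag); tag is None for headers without brackets."""
    match = HEADER_PATTERN.match(header)
    if match:
        return match.group(1), match.group(2)
    return header, None


# Invented header row for a study sample table.
headers = [
    "Source Name",
    "Characteristics[organism]",
    "Protocol REF",
    "Parameter Value[collection time]",
    "Sample Name",
    "Factor Value[dose]",
]

for header in headers:
    print(classify_header(header))
```

Reading the row left to right gives the pattern on the slide: a Source Name annotated by Characteristics, a Protocol REF with its Parameter Values producing a Sample Name, and Factor Values recording the study group.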

Page 9: ISA-Tab Standards at Metabolomics Society Meeting, Tsuruoka 2014, Japan

Worked example - ISA Assay File: reporting signal acquisition events

ISA Pattern for LC-MS: Splitting in 2 distinct assay tables, one per scan polarity

ISA Pattern for GC-MS: Report derivatization as an extra sample prep step

ISA Pattern for NMR:
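The LC-MS pattern above (one assay table per scan polarity) can be sketched in Python as below. The rows, file names and the exact column label `Parameter Value[scan polarity]` are assumptions made for illustration. Only the idea of splitting by polarity comes from the slide.

```python
from collections import defaultdict

# Invented assay records; the column labels are illustrative assumptions.
assay_rows = [
    {"Sample Name": "sample1", "Parameter Value[scan polarity]": "positive",
     "Raw Spectral Data File": "sample1_pos.mzML"},
    {"Sample Name": "sample1", "Parameter Value[scan polarity]": "negative",
     "Raw Spectral Data File": "sample1_neg.mzML"},
    {"Sample Name": "sample2", "Parameter Value[scan polarity]": "positive",
     "Raw Spectral Data File": "sample2_pos.mzML"},
    {"Sample Name": "sample2", "Parameter Value[scan polarity]": "negative",
     "Raw Spectral Data File": "sample2_neg.mzML"},
]


def split_by_polarity(rows, column="Parameter Value[scan polarity]"):
    """Group assay rows into one table per value found in `column`."""
    tables = defaultdict(list)
    for row in rows:
        tables[row[column]].append(row)
    return dict(tables)


tables = split_by_polarity(assay_rows)
for polarity, rows in sorted(tables.items()):
    print(polarity, [row["Raw Spectral Data File"] for row in rows])
```

Each resulting table then describes a single acquisition mode, so every row in it has the same parameter columns.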

Page 10: ISA-Tab Standards at Metabolomics Society Meeting, Tsuruoka 2014, Japan
Page 11: ISA-Tab Standards at Metabolomics Society Meeting, Tsuruoka 2014, Japan

Dealing with Diversity: ISA configurations for ISAcreator

• Different kinds of experiments, Different annotation needs

• CIMR ISA configurations to deal with Biological Specifics

• Clinical Context (= Human as subjects)

• Non-clinical Context (= Animal as subjects)

• Plant Context (= Plants as subjects)

• In-vitro Context (= Cell as subject)

Page 12: ISA-Tab Standards at Metabolomics Society Meeting, Tsuruoka 2014, Japan

30/06/2013

In-vitro study Plant study Clinical study

https://github.com/ISA-tools/Configuration-Files

Dealing with Diversity: Refining CIMR ISA configurations

Page 13: ISA-Tab Standards at Metabolomics Society Meeting, Tsuruoka 2014, Japan

Dealing with Diversity: Refining CIMR ISA configurations

• Different kinds of experiments, Different annotation needs

• additional ISA assay table definitions to deal with technology needs

• Targeted profiling or global metabolomics analysis

• liquid chromatography mass spectrometry

• gas chromatography mass spectrometry

• direct infusion mass spectrometry

• 1D /2D NMR spectroscopy

• Metabolic Flux Analysis (ongoing work with Prof. Marta Cascante)

Page 14: ISA-Tab Standards at Metabolomics Society Meeting, Tsuruoka 2014, Japan

Developed to be a user-friendly way to enter standards-compliant metadata: it has lots of features...

But these are just some of them... we also have a data entry wizard and an import utility...

The ISAcreator: an editor for ISA-Tab format

https://github.com/ISA-tools/ISAcreator

Page 15: ISA-Tab Standards at Metabolomics Society Meeting, Tsuruoka 2014, Japan

The ISAcreator: an editor for ISA-Tab format

https://github.com/ISA-tools/ISAcreator

Page 16: ISA-Tab Standards at Metabolomics Society Meeting, Tsuruoka 2014, Japan

ISAcreator features: visualizing experimental workflows

Work completed while investigating a new approach to creating glyphs, using a taxonomy for guidance. See Maguire et al., Taxonomy-Based Glyph Design – with a Case Study on Visualizing Workflows of Biological Experiments, IEEE Transactions on Visualization and Computer Graphics, 2012

Page 17: ISA-Tab Standards at Metabolomics Society Meeting, Tsuruoka 2014, Japan

This bit of code shows that you must first load an ISA configuration, which defines the expected table layout, before you can proceed

ISAcreator features: API

https://github.com/ISA-tools/ISAcreator/wiki/API

Page 18: ISA-Tab Standards at Metabolomics Society Meeting, Tsuruoka 2014, Japan

https://github.com/ISA-tools/Risa

Page 19: ISA-Tab Standards at Metabolomics Society Meeting, Tsuruoka 2014, Japan

ISAViewer: ISA-Tab viewing component on the web

https://github.com/ISA-tools/ISATab-Viewer

Page 20: ISA-Tab Standards at Metabolomics Society Meeting, Tsuruoka 2014, Japan

ISA patterns for reporting QC samples

Annotation Rule of Thumb: does the reported value satisfy the ‘is_a’ rule?

In this representation, QC1 would be interpreted to be an instance of organism whose type is ‘vanillic acid’ => incorrect

Improved representation: QC1 would be interpreted to be an instance of chemical compound whose type is ‘vanillic acid’, acting as ‘positive control’ => correct

Furthermore, only the 2 actual study subjects will be accounted for
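The ‘is_a’ rule of thumb can be sketched as a small check in Python. The class hierarchy below is a toy stand-in invented for this example. A real check would consult an ontology, and `satisfies_is_a` is a hypothetical helper.

```python
# Toy hierarchy (child -> parent), invented for illustration only.
IS_A = {
    "vanillic acid": "chemical compound",
    "Homo sapiens": "organism",
}


def satisfies_is_a(value, category):
    """True if `value` is `category` or a descendant of it in the toy hierarchy."""
    current = value
    while current is not None:
        if current == category:
            return True
        current = IS_A.get(current)
    return False


# First representation: vanillic acid reported under an organism column.
print(satisfies_is_a("vanillic acid", "organism"))           # False
# Improved representation: vanillic acid reported as a chemical compound.
print(satisfies_is_a("vanillic acid", "chemical compound"))  # True
```

A curation tool could run such a check on every annotated column and flag the cells where it fails.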

Page 21: ISA-Tab Standards at Metabolomics Society Meeting, Tsuruoka 2014, Japan

Why does it matter? It is all about structuring experimental information to make it

available to computer and software agents to enable:

Notes in Lab Books (information for humans)

Spreadsheets and Tables (the compromise)

Facts as RDF statements (information for machines)

Page 22: ISA-Tab Standards at Metabolomics Society Meeting, Tsuruoka 2014, Japan

RDF representation of Metabolomics Experimental information

• Query Expansion and Data Discovery

https://github.com/ISA-tools/isa2owl
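To give a feel for query expansion over facts stored as statements, here is a minimal Python sketch using plain (subject, predicate, object) tuples instead of an RDF library. All identifiers and predicate names are invented. It does not reproduce the isa2owl output or vocabulary.

```python
# Facts as (subject, predicate, object) tuples; every name here is invented.
triples = {
    ("sample1", "derives_from_organism", "Mus musculus"),
    ("sample2", "derives_from_organism", "Rattus norvegicus"),
    ("sample3", "derives_from_organism", "Homo sapiens"),
    ("Mus musculus", "subclass_of", "Rodentia"),
    ("Rattus norvegicus", "subclass_of", "Rodentia"),
    ("Homo sapiens", "subclass_of", "Primates"),
    ("Rodentia", "subclass_of", "Mammalia"),
    ("Primates", "subclass_of", "Mammalia"),
}


def expand(term):
    """Return the term plus every term below it via subclass_of."""
    expanded = {term}
    frontier = [term]
    while frontier:
        parent = frontier.pop()
        for subject, predicate, obj in triples:
            if predicate == "subclass_of" and obj == parent and subject not in expanded:
                expanded.add(subject)
                frontier.append(subject)
    return expanded


def samples_for(term):
    """Find samples annotated with the term or any of its subclasses."""
    terms = expand(term)
    return sorted(
        subject
        for subject, predicate, obj in triples
        if predicate == "derives_from_organism" and obj in terms
    )


print(samples_for("Rodentia"))  # ['sample1', 'sample2']
print(samples_for("Mammalia"))  # ['sample1', 'sample2', 'sample3']
```

A query for a broad term finds samples annotated with narrower ones, which a plain text match on the spreadsheet cell would miss.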

Page 23: ISA-Tab Standards at Metabolomics Society Meeting, Tsuruoka 2014, Japan

RDF representation of Metabolomic Experimental information

• Conversion of 80% of public datasets

• Tests against case-queries report partial success

• Points to the need to enforce stricter curation rules in order to fully benefit from the RDF representation

• Existing conversion already enables easy cohort creation

• Ongoing work: converting MAF file to RDF

• enabling querying from experimental metadata to chemical identities and vice-versa.

https://github.com/ISA-tools/isa2owl
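The last bullet, querying from experimental metadata to chemical identities and back, can be pictured with a small Python sketch. The sample groups, the metabolite assignments and both function names are invented for illustration. The sketch uses plain dictionaries and says nothing about how the MAF-to-RDF conversion itself is implemented.

```python
# Invented study metadata: sample -> study group.
sample_groups = {"sample1": "control", "sample2": "treated", "sample3": "treated"}

# Invented identification results: metabolite -> samples reporting it.
metabolite_samples = {
    "vanillic acid": {"sample2", "sample3"},
    "citric acid": {"sample1", "sample2"},
}


def metabolites_for_group(group):
    """Metadata -> chemistry: metabolites reported in any sample of a group."""
    members = {s for s, g in sample_groups.items() if g == group}
    return sorted(m for m, samples in metabolite_samples.items() if samples & members)


def groups_for_metabolite(metabolite):
    """Chemistry -> metadata: study groups in which a metabolite was reported."""
    samples = metabolite_samples.get(metabolite, set())
    return sorted({sample_groups[s] for s in samples})


print(metabolites_for_group("control"))        # ['citric acid']
print(groups_for_metabolite("vanillic acid"))  # ['treated']
```

Both directions rely on the sample name being the shared key between the experimental description and the identification results.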

Page 24: ISA-Tab Standards at Metabolomics Society Meeting, Tsuruoka 2014, Japan

Contributing to Metabolights and ISA

• BBSRC UK-China Award & BGI funded Hackathon

• venue: BGI Hong Kong

• Participants:

• Metabolights/BGI/ISA/Birmingham/Hong Kong University

• Outcome:

• ISAtab web viewer code

• Functional Specifications & Code for DoE Wizard API

• 4 datasets coded in ISA format

Page 25: ISA-Tab Standards at Metabolomics Society Meeting, Tsuruoka 2014, Japan

Contributing to Metabolights and ISA

• BBSRC UK-China Award funded Hackathon will be back!

• 2nd Meeting to be organised

• Fancy participating? Get in touch!

[email protected]

Page 26: ISA-Tab Standards at Metabolomics Society Meeting, Tsuruoka 2014, Japan

Don’t miss out

• 2 Main Publishers involved in developing Data Journals

• Scott Edmunds (GigaScience) and Susanna Sansone (NPG Scientific Data)

• Representatives from Metabolomics Repository

• Get all the help you need for depositing your data and increase the visibility of your research!

Page 27: ISA-Tab Standards at Metabolomics Society Meeting, Tsuruoka 2014, Japan
Page 28: ISA-Tab Standards at Metabolomics Society Meeting, Tsuruoka 2014, Japan

Questions??

You can email [email protected]

View our blog http://isatools.wordpress.com

Follow us on Twitter @isatools

View our website http://www.isa-tools.org

Thanks for listening...

View our Git repo & contribute

http://github.com/ISA-tools