PSI Mass Spectrometry Standards Working Group Summary
HUPO PSI MS Standards Working Group
PSI MSS & PI WGs Joint Session, 2:00 – 5:30
10 minutes each:
• mzIdentML
• mzQuantML (incl. support for SRM)
• mzTab (incl. support for SRM)
• TraML
• mzML & imzML & compression
• metabolomics & ion mobility & SWATH-MS
• PEFF
• MIAPE: MS & MSI & Quant
• Controlled vocabulary
• Cross-linking
• Protein grouping
• PTM localization
• RNA-seq assisted proteomics
TraML
• Discussed ongoing implementation and use efforts
• Identified some outdated information in documentation. TODO: update
• Online validator has outdated CV. TODO: update
• No schema or CV updates needed at this time
• Discussed adding Waters format to jTraML converter
mzML
• Schema remains stable with no needed changes
• Many updates to the CV continue
• imzML continues to be aligned with mzML
• The mz5 format was recently published as an mzML clone that uses HDF5 instead of XML
• Discussions ongoing by metabolomics groups about how mzML would meet the needs of the metabolomics community. Some terms already added. Expect some more proposals soon.
• File size continues to be a significant problem…
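For context on the file size issue: mzML stores each binary data array as base64 text inside the XML, holding little-endian 32- or 64-bit floats that may optionally be zlib-compressed first. The sketch below is a minimal illustration of that encoding using only the Python standard library. The function names and the synthetic m/z axis are hypothetical and are not taken from any mzML library.

```python
import base64
import struct
import zlib


def encode_array(values, compress=True):
    """Pack floats as little-endian 64-bit, optionally zlib-compress, then base64."""
    raw = struct.pack("<%dd" % len(values), *values)
    if compress:
        raw = zlib.compress(raw)
    return base64.b64encode(raw).decode("ascii")


def decode_array(text, compressed=True):
    """Reverse of encode_array: base64-decode, optionally inflate, unpack doubles."""
    raw = base64.b64decode(text)
    if compressed:
        raw = zlib.decompress(raw)
    return list(struct.unpack("<%dd" % (len(raw) // 8), raw))


# Hypothetical profile-mode m/z axis: 10,000 evenly spaced points.
mz = [400.0 + 0.001 * i for i in range(10000)]
plain = encode_array(mz, compress=False)
packed = encode_array(mz, compress=True)

# Raw bytes, base64 characters without zlib, base64 characters with zlib.
print(len(mz) * 8, len(plain), len(packed))
```

Base64 alone inflates the raw bytes by about one third, which is part of why dense profile-mode data produces large files even before XML overhead is counted.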
mzML (or other) compression
• Andy Dowsey & Faviel Gonzalez Galarza presented their work on compression
• mz5 claims 50%+ space savings by using HDF5
• Implementation discussed vs. alternate HDF5 implementations
• But significant space savings were achieved via tricks that could also work in mzML
• Discussed other work and proposed work in mass-spec aware compression
• Possibility: alternative to zlib internal compression (currently supported) could be a mass-spec aware “mszip” compression scheme. Provided as a simple, open-source routine available in many languages
• Possibility: Develop a variant of zlib that creates files that can be uncompressed normally, but also allows indexing into the compressed file
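One way the last possibility could be approached with stock zlib, rather than a new variant, is to emit a full flush after each record. A full flush byte-aligns the output and resets the compressor state, so a reader can start a raw inflate at any recorded offset, while the stream as a whole still decompresses normally. The sketch below is an illustration under that assumption and is not the scheme discussed by the working group; `compress_indexed` and `read_chunk` are hypothetical names, and the chunks are assumed to be non-empty.

```python
import zlib

ZLIB_HEADER_LEN = 2  # CMF + FLG bytes; assumes no preset dictionary


def compress_indexed(chunks, level=6):
    """Compress chunks into one zlib stream and record where each chunk starts."""
    comp = zlib.compressobj(level)
    out = bytearray()
    index = []
    for chunk in chunks:
        # The first chunk starts right after the 2-byte zlib header.
        index.append(len(out) if out else ZLIB_HEADER_LEN)
        out += comp.compress(chunk)
        # Full flush: byte-align and forget history so the next chunk
        # can be inflated without any earlier data.
        out += comp.flush(zlib.Z_FULL_FLUSH)
    out += comp.flush()  # finish the stream and write the checksum trailer
    return bytes(out), index


def read_chunk(blob, index, i):
    """Inflate only chunk i, using raw deflate from its recorded offset."""
    start = index[i]
    end = index[i + 1] if i + 1 < len(index) else len(blob)
    raw = zlib.decompressobj(-zlib.MAX_WBITS)  # negative wbits = no header/trailer
    return raw.decompress(blob[start:end])


chunks = [b"spectrum-%d " % i * 50 for i in range(5)]
blob, index = compress_indexed(chunks)
print(index)
print(read_chunk(blob, index, 3)[:24])
```

The trade-off is that every full flush discards the compression history, so very small chunks compress noticeably worse than a single continuous stream.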
[Figure: File compression results (sizes in MB), April 2013. Orbitrap profile-mode spectra; ~50% compression using mz5.]
[Figure: Compressing mzML, April 2013. SYNAPT data; chart labels: 57.1%, 50.6%, 36.8%, 45.1%, 43.3%.]
Other presentations
• Shin presented on the use of RDF & TogoDB
• Mathias presented on qcML
Ion mobility MS & SWATH-MS
• Discussed with Waters their ion mobility data
• Discussed required terms and practices for encoding raw IMS and peak-picked IMS data. Proposal to be publicized on lists for further comment
• No schema change necessary
RNA-seq assisted proteomics
• Good discussion of the state of the field on this workflow
• Discussed using/promoting the PEFF format as a useful mechanism for encoding some of the RNA-seq results for use by proteomics searches
• Discussed possible need to update MIAPE documents to capture information about what is done in this type of a workflow
Controlled Vocabulary
• It is time to get the vendors to update their instrument and software terms again. Gerhard will repeat the effort done by Luisa years ago.
• Worked to get rid of the purgatory branch in the CV
• Discussed what to do with multiple SoftwareName:specificTerm entries that are effectively the same concept. Start by grouping similar terms under a common parent
• Discussed constraining some terms with an is_a relationship to a concept like “value between 0 and 1 inclusive”
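The is_a constraint idea can be sketched as follows: a score term inherits an allowed range from a parent concept, and a validator walks the is_a links to find it. This is only an illustration of the proposal. The term ids, names, and the `range` field are made up and do not come from the PSI-MS CV.

```python
# Hypothetical mini-CV: ids, names, and the "range" field are invented.
CV = {
    "X:0001": {"name": "value between 0 and 1 inclusive", "range": (0.0, 1.0)},
    "X:0002": {"name": "ExampleEngine:probability", "is_a": ["X:0001"]},
    "X:0003": {"name": "ExampleEngine:raw score", "is_a": []},
}


def value_range(term_id, cv=CV):
    """Return the (low, high) range found on the term or inherited via is_a."""
    term = cv[term_id]
    if "range" in term:
        return term["range"]
    for parent in term.get("is_a", []):
        found = value_range(parent, cv)
        if found is not None:
            return found
    return None


def is_valid(term_id, value, cv=CV):
    """A value is valid if the term has no range constraint or falls inside it."""
    bounds = value_range(term_id, cv)
    if bounds is None:
        return True
    low, high = bounds
    return low <= value <= high


print(is_valid("X:0002", 0.5), is_valid("X:0002", 1.2), is_valid("X:0003", 42.0))
```

Keeping the constraint on a shared parent means each software-specific term only needs an is_a link, not its own copy of the limits.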
PEFF (PSI Extended Fasta Format)
• Interest in finalising the format specification and making it available
• Cannot expect that (most of the) DB providers will produce it in addition to their existing format
• Cannot expect that (most of the) search engines will fully take advantage of its structure (variants, PTMs, …) in the identification jobs
• A converter («source»-to-PEFF) and a reader already exist. These could serve as a reference implementation
=> Resolve a few minor open issues and finalise the recommendation
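To give a sense of what a PEFF reader has to do, the sketch below parses a single sequence header line. It assumes the `\Key=value` header convention used in the published PEFF specification, which may differ from the draft discussed at this meeting. It is not the reader mentioned above; the function name, accession, and values are invented for illustration.

```python
def parse_peff_header(line):
    """Split a PEFF-style header into (db prefix, accession, key/value dict)."""
    if not line.startswith(">"):
        raise ValueError("not a header line")
    # Items are assumed to be separated by a space followed by a backslash.
    ident, *pairs = line[1:].rstrip().split(" \\")
    prefix, _, accession = ident.partition(":")
    fields = {}
    for pair in pairs:
        key, _, value = pair.partition("=")
        fields[key] = value
    return prefix, accession, fields


# Made-up entry; only the overall layout follows the assumed convention.
header = r">xx:P00001 \PName=Example protein \Length=120 \VariantSimple=(45|K)"
prefix, accession, fields = parse_peff_header(header)
print(prefix, accession, fields)
```

Because the extra information lives entirely in the header line, a tool that ignores the keys can still treat the file as ordinary FASTA.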
MIAPE-MSI
• Document is updated
• Mapping to mzIdentML is validated
• Collection of up-to-date example instance documents ongoing
• Semantic validator for mzIdentML ongoing
=> Prepare submission of version 1.2 to the PSI document process