AnIML: A New Analytical Data Standard

Stuart J. Chalk, Department of Chemistry, University of North Florida. ACS Meeting Boston 2015


Page 1: AnIML: A New Analytical Data Standard

AnIML: A New Analytical Data Standard
Stuart J. Chalk, Department of Chemistry, University of North Florida
ACS Meeting Boston 2015

Page 2: AnIML: A New Analytical Data Standard

Data Formats
Goals for Data Handling
Introduction to AnIML
Sections of an AnIML file
AnIML Schemas and Files
AnIML Technique Definitions
Publishing Instrument Data
Referencing Data Elements
Calculations on Data
Future Developments
Conclusion

Overview

Page 3: AnIML: A New Analytical Data Standard

Native Data Formats
  Proprietary formats
  "Metadata" separated from result data
  Metadata and data in multiple files
  Metadata not available electronically
  No way to link metadata with result data

Interchange Data Formats
  Available for only a few techniques
    ANDI: GC, LC, MS
    JCAMP-DX: UV/Vis, IR, NMR, IMS
  Fixed order, fixed syntax, immutable formats
  Content limitations
  Inconsistent implementations

Current Data Formats

Page 4: AnIML: A New Analytical Data Standard

Extensible
  Easy to add new elements without breaking existing applications
Flexible
  Useful for diverse needs: Interchange, Interconversion, Archiving...
Usable & Maintainable
  Easy to create, use, adapt, maintain...
  Readily available tools
Acceptable
  Use standard mechanisms accepted by mainstream computing
  Human readable
eXtensible Markup Language

Goals for Data Handling

Page 5: AnIML: A New Analytical Data Standard

Extensible Markup Language (XML) specification
Development under ASTM E13.15 ‘AnIML Task Group’
Data standard to:
  “Develop an analytical data standard that can be used to store data from any analytical instrument”

Introduction to AnIML

http://animl.sourceforge.net

Page 6: AnIML: A New Analytical Data Standard

JCAMP-DX http://www.jcamp-dx.org/

ANDI (netCDF)

ThermoML (NIST)

SpectroML

Nguyen, A. D. T., Arslan, A., Travis, J., Smith, M., Schafer, R., & Kramer, G. W. (2004) ‘Molecular Spectrometry Data Interchange Applications for NIST's SpectroML’, JALA 9 (6), 346-354. doi:10.1016/j.jala.2004.09.001

Generalized Analytical Markup Language (GAML) http://www.gaml.org/

First official meeting March 23, 2003 @ ASTM

Brief History of AnIML

Page 7: AnIML: A New Analytical Data Standard

Broad scope
Different types of data
Size of data sets
Everyone calls ‘widget’ something different
Need for metadata dictionaries
One size does not fit all
Getting broad community involvement
  Domain experts
  User communities

What format?

Challenges for AnIML

Page 8: AnIML: A New Analytical Data Standard

AnIML XML elements are ‘pigeon holes’ for metadata
Minimal ‘required’ information
  If it’s not required you don’t have to include the element
Extensible
Store raw data not processed data (except for FT techniques)
Support for legacy data
Record of changes
Validatable
Signable (digital sense)

AnIML Design Philosophy

Page 9: AnIML: A New Analytical Data Standard

AnIML Schemas and Files

Page 10: AnIML: A New Analytical Data Standard

Sections of an AnIML File

Page 11: AnIML: A New Analytical Data Standard

AnIML Technique Definitions

Page 12: AnIML: A New Analytical Data Standard

AnIML - Sample

Page 13: AnIML: A New Analytical Data Standard

AnIML - Sample

Page 14: AnIML: A New Analytical Data Standard

AnIML - Experiment

Page 15: AnIML: A New Analytical Data Standard

AnIML - Result

Page 16: AnIML: A New Analytical Data Standard

Data storage format
Not just for spectral data
Access data, metadata
Manipulate using XSLT
Validate
Signable

AnIML in an ELN

Page 17: AnIML: A New Analytical Data Standard

AnIML Viewer -> Jmol/JSpecView (http://jmol.sourceforge.net)

Publish Supplementary Data

Page 18: AnIML: A New Analytical Data Standard

Conversion of AnIML data to SVG using XSLT

Convert to Image File for Publication
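
The slide names XSLT as the conversion mechanism. As an illustration of the same idea in Python, the minimal sketch below maps a pair of x/y series onto an SVG polyline. The XML fragment is a simplified, hypothetical stand-in for an AnIML result (no namespace, invented values and nesting), and `read_series` and `series_to_svg` are illustrative helpers, not part of any AnIML tool.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical stand-in for an AnIML result: two series of
# float values. Real AnIML files are namespaced and carry far more metadata.
SAMPLE = """
<SeriesSet name="Spectrum">
  <Series name="Wavelength"><F>400</F><F>450</F><F>500</F><F>550</F></Series>
  <Series name="Absorbance"><F>0.10</F><F>0.80</F><F>0.40</F><F>0.05</F></Series>
</SeriesSet>
"""

def read_series(series_set, name):
    """Return the float values of the named Series element."""
    series = series_set.find(f"Series[@name='{name}']")
    return [float(f.text) for f in series.findall("F")]

def series_to_svg(xs, ys, width=400, height=200):
    """Scale x/y values into the viewport and return an SVG document string.

    Assumes each axis has at least two distinct values (no zero range).
    """
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    points = []
    for x, y in zip(xs, ys):
        px = (x - x_min) / (x_max - x_min) * width
        # SVG's y axis points down, so flip it to plot upwards.
        py = height - (y - y_min) / (y_max - y_min) * height
        points.append(f"{px:.1f},{py:.1f}")
    svg = ET.Element("svg", xmlns="http://www.w3.org/2000/svg",
                     width=str(width), height=str(height))
    ET.SubElement(svg, "polyline", points=" ".join(points),
                  fill="none", stroke="black")
    return ET.tostring(svg, encoding="unicode")

series_set = ET.fromstring(SAMPLE)
svg_text = series_to_svg(read_series(series_set, "Wavelength"),
                         read_series(series_set, "Absorbance"))
print(svg_text)
```

Writing `svg_text` to a `.svg` file gives an image that can be opened in a browser or converted to a raster format for publication. An XSLT stylesheet would perform the same scaling declaratively.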

Page 19: AnIML: A New Analytical Data Standard

Expose an AnIML file at a URL Optional: Define a DOI for that URL

Use XPath to reference a specific data point in an AnIML file

//ExperimentStepSet[1]/ExperimentStep[1]/Method[1]/Author[1]/Name[1]

Encode the XPath expression so it can be part of the URL

Open Instrument Data
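
The steps on this slide (reference a data element with XPath, then encode the expression so it can travel in a URL) can be sketched with the Python standard library. This is a minimal sketch under stated assumptions: the XML fragment is a simplified, namespace-free stand-in for an AnIML file, the base URL and the `xpath` query parameter are hypothetical (the slide does not say how the expression is attached to the URL), and `ElementTree` supports only a subset of XPath, so the leading `//` is evaluated as `.//`.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote, unquote

# Simplified, namespace-free stand-in for an AnIML file (hypothetical content).
SAMPLE = """
<AnIML>
  <ExperimentStepSet>
    <ExperimentStep name="Step 1">
      <Method>
        <Author><Name>Jane Analyst</Name></Author>
      </Method>
    </ExperimentStep>
  </ExperimentStepSet>
</AnIML>
"""

XPATH = "//ExperimentStepSet[1]/ExperimentStep[1]/Method[1]/Author[1]/Name[1]"

# Hypothetical location of the published file.
BASE_URL = "https://example.org/data/sample.animl"

def build_reference(base_url, xpath):
    """Percent-encode the XPath so '/', '[' and ']' are safe inside a URL.

    The 'xpath' query parameter name is an assumption for illustration.
    """
    return f"{base_url}?xpath={quote(xpath, safe='')}"

def resolve(root, xpath):
    """Evaluate the expression; ElementTree needs a relative path ('.//...')."""
    return root.find("." + xpath if xpath.startswith("//") else xpath)

reference = build_reference(BASE_URL, XPATH)
print(reference)

# A consumer would decode the parameter and evaluate it against the file.
decoded = unquote(reference.split("?xpath=", 1)[1])
node = resolve(ET.fromstring(SAMPLE), decoded)
print(node.text)
```

If a DOI is registered for the base URL, the same encoded expression could be appended to the resolved address. A full XPath engine would be needed for expressions beyond the subset `ElementTree` understands.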

Page 20: AnIML: A New Analytical Data Standard

Part of a Data Management Plan

Federal agencies are mandating data be made available

Long term archive format for research data
Referenceable if available online
Searchable with XQuery
Publish data processing algorithms (XSLT)

Future proof data -> conversion to future data formats

Page 21: AnIML: A New Analytical Data Standard

The Healthcare and Life Science (HCLS) Community Profile is a Note from the Semantic Web HCLS Interest Group

Access to consistent, high-quality metadata is critical to finding, understanding, and reusing scientific data. This document describes a consensus among participating stakeholders in the Health Care and the Life Sciences domain on the description of datasets using the Resource Description Framework (RDF). This specification meets key functional requirements, reuses existing vocabularies to the extent that it is possible, and addresses elements of data description, versioning, provenance, discovery, exchange, query, and retrieval.

Data Descriptions: HCLS Community Profile

http://www.w3.org/TR/hcls-dataset/

Page 22: AnIML: A New Analytical Data Standard

AnIML 1.0 Deliverables
  Core Schema - Fundamental framework for AnIML documents
  Technique Schema - Fundamental framework for technique definition and extension documents
  AnIML Technique Definition Documents (ATDD) - Rules for content of specific technique file
  AnIML Naming and Design Rules - Specifies rules about data element structure for interoperability
  Standard Practice for AnIML Files - Describes how the specification is supposed to work
  How to Create a Technique Definition Document - Guidelines for creating new technique definition documents
Other documents
  Draft Requirements Specification for AnIML Version 1.0
  Requirements and Goals of the Analytical Information Markup Language

AnIML Specification

http://animl.sourceforge.net

Page 23: AnIML: A New Analytical Data Standard

Documentation
  Core specification
  Technique and extension specification
  Naming and design rules
  Annotated technique definitions (UV/Vis, IR, 1D NMR, MS, Chromatography)
Balloting through ASTM (end of 2015)
Vendor, User, Developer extensions
Semantic extension of AnIML metadata items

Future Developments

Page 24: AnIML: A New Analytical Data Standard

Conclusion
AnIML is a great solution for storing instrument data
  Human readable (UTF-8)
  Platform neutral
  Archivable
  Validatable
AnIML leverages the extensive XML ecosystem of tools
Software engineers know XML