
Metabolite Profiling and Identification Employing High Resolution MS Strategies

David Portwood

Syngenta, Jealotts Hill International Research Centre, UK

Classification: PUBLIC


We bring plant potential to life

Syngenta is one of the world’s leading companies with more than 24,000 employees in over 90 countries dedicated to our purpose: Bringing plant potential to life.

Our Crop Protection and Seeds products help growers increase crop yields and productivity. We contribute to meeting the growing global demand for food, feed and fuel and are committed to protecting the environment, promoting health and improving the quality of life.



Metabolomics

● Metabolomics applications within Syngenta

- Plant breeding
  - Early detection of desirable traits is very valuable
  - Sensory and nutritional profiling, relating chemical composition to desirable traits
- Fundamental research
  - Drought tolerance
  - Ripening processes
  - Effect of genetic manipulation
- New agrochemical products
  - Mode of action studies



Data acquisition workflow

- Extract
- GC or LC-MS
- 3D dataset (time vs. m/z vs. intensity)
- Deconvolution
- Compound identification
- Data compilation
- Normalisation
- QC and data analysis
- Data combination



Study Design

● Interested in metabolic content of fruit and vegetables in support of breeding programs

● Early detection of desirable traits is very valuable

● Also interested in metabolic and genetic basis of ripening

● Here we looked at 4 different varieties of tomato as part of a European Systems Biology project*

● The study compared three non-ripening varieties with a normal (wild type) tomato

* A collaboration between Nottingham University and Syngenta (Charlie Hodgman and Graham Seymour, Charlie Baxter, Mark Earll, Dave Portwood, Mark Seymour). Funded by the BBSRC.

[Figure: ripening vs. genotype (AC+, CNR, NOR, RIN)]



Study Design

● Four different cultivars:

- WT wild type (Ailsa Craig or AC++)

- NOR as WT except non-ripening locus

- RIN ripening inhibited

- CNR colourless non-ripe

● Sampling: post-anthesis (flowering), breaker (fruit starts to change colour) and post-breaker

● Extracts were analysed by GC-MS (Trace DSQII), UPLC-MS/MS (LTQ) and accurate mass UPLC-MS/MS (Orbitrap Velos)

● Data processed using either RefinerMS or mzMine and SIMCA-P (PCA + OPLS)



Deconvolution of 3D MS data

● Methods we use:

- Genedata MS refiner (Commercial)
  - Re-samples 3D MS data onto a common grid and applies alignment
  - Filtering and feature detection produce a peak list across all samples
- mzMine (Open source)
  - Detects features in the mass dimension to extract single ion chromatograms
  - Peak detection, alignment by mass and time
  - Identification by database searching

● In both cases “missing peaks” are back-filled with baseline levels to ensure correct statistical treatment of data
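
The back-filling step can be illustrated with a minimal sketch. It assumes the aligned peak list is a samples-by-features array in which undetected peaks are stored as NaN, and it uses half the smallest observed intensity as the baseline level. Both the array layout and the default baseline are illustrative assumptions, not the behaviour of Genedata MS refiner or mzMine.

```python
import numpy as np

def backfill_missing(peak_table, baseline=None):
    """Replace missing intensities (NaN) with a baseline level.

    peak_table: 2-D array, rows = samples, columns = features.
    baseline:   scalar fill value; if None, half the smallest observed
                intensity is used (an assumed default for illustration).
    """
    table = np.asarray(peak_table, dtype=float)
    if baseline is None:
        baseline = 0.5 * np.nanmin(table)
    return np.where(np.isnan(table), baseline, table)

# Hypothetical peak list: 3 samples x 3 features, NaN = peak not detected
peaks = np.array([[1200.0, np.nan, 310.0],
                  [1100.0,   45.0, np.nan],
                  [np.nan,   50.0, 290.0]])
filled = backfill_missing(peaks)
print(filled)
```

Filling with a small positive value rather than zero keeps log transforms and ratio-based scaling defined for every feature.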



Identification

● Identification of peaks is the single most challenging aspect

● UPLC-MS:

- Spectra have low fragmentation and are often adducts
- Peak recognition by retention time and accurate mass of parent ion and/or adducts
- Hierarchical identification process
  - In-house library of ~300 compounds (automated in mzMine)
  - In-house NIST library search (automated in mzMine)
  - Public databases
  - MS-MS fragments (manual)

● GC-MS:

- Samples derivatised (MOX + TMS); spectra are highly fragmented
- Peak recognition by NIST search and retention index
- Some overlap occurs, particularly with sugars

● Although partly automated, careful manual curation of data is essential to prevent misidentification. Even then the possibility of wrongly identified peaks remains.
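
Peak recognition by retention time and accurate mass can be sketched as a tolerance-window lookup against a small library. The `LibraryEntry` class, the tolerance values and the retention times below are hypothetical; the m/z values are theoretical [M+H]+ masses for three of the tomato metabolites discussed later in the talk. This is an illustration of the idea, not the matching logic of mzMine.

```python
from dataclasses import dataclass

@dataclass
class LibraryEntry:
    name: str
    mz: float  # expected m/z of the parent ion or adduct
    rt: float  # expected retention time in minutes (hypothetical values below)

def ppm_diff(measured, expected):
    """Signed mass difference in parts per million."""
    return (measured - expected) / expected * 1e6

def match_feature(mz, rt, library, ppm_tol=5.0, rt_tol=0.2):
    """Return library entries inside both tolerance windows, best mass match first."""
    hits = [e for e in library
            if abs(ppm_diff(mz, e.mz)) <= ppm_tol and abs(rt - e.rt) <= rt_tol]
    return sorted(hits, key=lambda e: abs(ppm_diff(mz, e.mz)))

library = [
    LibraryEntry("Tryptophan", 205.09715, 3.1),
    LibraryEntry("Naringenin", 273.07575, 9.4),
    LibraryEntry("Chlorogenic acid", 355.10236, 5.2),
]

hits = match_feature(205.09718, 3.15, library)
print([h.name for h in hits])
```

A feature with the right mass but the wrong retention time returns no hit, which is the point of combining the two criteria before any manual curation.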



Quality control of data

● Principal Components Analysis is used as a first-pass quality assurance tool

● All instruments show drift; provided it is consistent and minor, it may be corrected by normalisation

● Composite mixed samples are repeatedly injected to ensure consistent results

● Occasionally instrument problems occur which compromise the data

● Example of GC-MS data with analytical batch-to-batch variation

[Figure annotations: composite mixed sample, technical replicates, injection failure]
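
A first-pass PCA check of this kind can be sketched with NumPy alone. The data below are synthetic and purely illustrative: twelve study samples plus four repeat injections of a composite sample. Mean centring without further scaling is an assumption made for brevity. If the run is stable, the composite injections should sit close together in the scores plot.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Mean-centre X (rows = injections, columns = features) and return PCA scores."""
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

# Synthetic data: 12 study samples and 4 repeat injections of a composite sample
rng = np.random.default_rng(0)
study = rng.normal(100.0, 20.0, size=(12, 50))
composite = study.mean(axis=0) + rng.normal(0.0, 1.0, size=(4, 50))
X = np.vstack([study, composite])

scores = pca_scores(X)
study_spread = scores[:12].std(axis=0).max()
qc_spread = scores[12:].std(axis=0).max()
print(f"study spread {study_spread:.1f}, composite spread {qc_spread:.1f}")
```

A composite spread approaching the study spread, or a single injection far from every other point, would suggest drift or an injection failure worth investigating before modelling.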



Normalisation

● Normalisation is often required to reduce overall amplitude variations caused by spectrometer response or sample dilution factors

● Normalisation has to be applied with caution as it can also remove useful variation

● The plot shows QC, Blank and Amino acid standards repeatedly injected during a batch of LC-MS experiments. In this case total signal normalisation was effective in removing baseline variation.

● Internal standard normalisation works very well with accurate mass data, not so well with low resolution MS (due to interference)

● Syngenta commissioned Umetrics to add normalisation filters within SIMCA-P

[Figure label: total signal normalisation]

Minimising the effects of closure on analytical data. Erik Johansson, Svante Wold, Kristina Sjodin. Analytical Chemistry 1984, 56, 1685.
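
The two normalisation approaches mentioned above can be sketched as follows. The function names and the example matrix are hypothetical, and this is not the filter implemented in SIMCA-P. The example simulates three injections of the same sample at different dilutions, so both methods should return identical rows.

```python
import numpy as np

def total_signal_normalise(X):
    """Scale each row (injection) to the same total signal."""
    totals = X.sum(axis=1, keepdims=True)
    return X / totals * totals.mean()

def internal_standard_normalise(X, is_column):
    """Scale each row so the internal standard feature has the same intensity."""
    ref = X[:, [is_column]]
    return X / ref * ref.mean()

# Same composition injected at three dilution factors (illustrative values)
profile = np.array([100.0, 50.0, 10.0, 40.0])
dilution = np.array([1.0, 0.8, 1.25])
X = np.outer(dilution, profile)

by_total = total_signal_normalise(X)
by_standard = internal_standard_normalise(X, is_column=2)
print(by_total)
```

Total signal normalisation forces every row to the same sum, so a large genuine change in one metabolite shifts all the others. This is the closure effect addressed in the reference above and one reason the method has to be applied with caution.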


OPLS model of combined Polar and Apolar DSQ-GC-MS

● OPLS model built vs development time

● OPLS – able to partition systematic effects from desired effects

● Scores plot on left - green indicates pre-breaker and orange/red post-breaker ripening in 1st component (predictive component)

● Scores plot on right - analytical variation in 2nd orthogonal component
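
How OPLS separates predictive from orthogonal variation can be sketched for a single response and a single orthogonal component. This is a simplified NumPy illustration of the general orthogonal-filtering idea on synthetic data, not the SIMCA-P implementation; the variable names and the simulated "batch" effect are assumptions.

```python
import numpy as np

def opls_one_orthogonal(X, y):
    """Single-response OPLS step that removes one y-orthogonal component."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    w = Xc.T @ yc                  # predictive weight, direction of covariance with y
    w /= np.linalg.norm(w)
    t = Xc @ w                     # scores along the predictive weight
    p = Xc.T @ t / (t @ t)         # corresponding loading
    w_o = p - (w @ p) * w          # part of the loading orthogonal to w
    w_o /= np.linalg.norm(w_o)
    t_o = Xc @ w_o                 # orthogonal scores (uncorrelated with y)
    p_o = Xc.T @ t_o / (t_o @ t_o)
    X_filtered = Xc - np.outer(t_o, p_o)
    return X_filtered @ w, t_o, X_filtered, yc

# Synthetic example: development time plus an unrelated analytical batch effect
rng = np.random.default_rng(1)
time = np.arange(20, dtype=float)
batch = np.tile([0.0, 1.0], 10)
X = (np.outer(time, rng.normal(size=30))
     + 5.0 * np.outer(batch, rng.normal(size=30))
     + rng.normal(scale=0.5, size=(20, 30)))

t_pred, t_orth, X_filtered, yc = opls_one_orthogonal(X, time)
print(np.corrcoef(t_pred, time)[0, 1])
```

By construction the orthogonal scores carry no covariance with the response, so variation such as analytical drift can be inspected separately from the time-related component.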



3-D Plot of Combined Positive and Negative Ion LC-MS (LTQ)

● OPLS model built vs development time

● Three orthogonal components were obtained

● 1st related to AC++ ripening

● 2nd related to CNR diverging



Stability of metabolic trajectory – excellent agreement

- Thermo LTQ: nominal mass, Genedata MS refiner processed, “fresh” samples

- Thermo LTQ Velos Orbitrap: high resolution accurate mass, mzMine processed, samples stored for 6 months at -18 °C



Gene Expression vs Metabolite data

● OPLS vs time models for both gene and metabolite data show strong similarities

● CNR diverges from an early point; AC, RIN and NOR develop jointly up to breaker stage; AC develops further post-breaker



Comparison of genotypes

● Comparing the OPLS loadings for each genotype vs time, we can observe metabolite changes that are unique to a genotype and changes that are shared between genotypes

● Here the wildtype (AC) is compared with the colourless non-ripe tomato (CNR)

[Figure label: delta-tomatine]

Visualization of GC-TOF/MS Based Metabolomics Data for Identification of Biochemically Interesting Compounds Using OPLS Class Models. Susanne Wiklund, Erik Johansson, Lina Sjöström, Ewa J, Anal. Chem. 2008, 80 (1), pp 115–122


Scenarios in Annotation

● Unequivocal assignments

- mass spectrometry data alone is able to assign the identity to an unknown component

● Ambiguous assignments

- mass spectrometry data can lead to more than one potential identity, such as class assignments (e.g. flavonoids)

● Equivocal assignments

- mass spectrometry data alone cannot assign an identity wholly or even in part



New and Developing Technologies

● Advances in technologies

- High resolution, high mass accuracy instrumentation

- E.g. New TOFs and Orbitraps

● On-line databases and library tools

- E.g. ChemSpider, KEGG, HMDB, MassBank, ChEBI

● Development of mass spectrometry based library utilities

- Existing instrument vendor offerings

- In-house specific offerings

- Fragmentation predictors (e.g. Mass Fragment, Mass Frontier)



Advances in Technologies – Mass Accuracy

● Accurate mass measurement is used to determine elemental formulae

● The better the accuracy the less the ambiguity

● Mass accuracy is defined as the ratio of the m/z measurement error to the true m/z, usually expressed in parts per million (ppm)

● External mass calibration methods are less mass accurate than internal calibration methods

- E.g. LTQ-Orbitrap

- External calibration <3ppm

- Internal calibration <1ppm
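
The definition of mass accuracy can be made concrete with a short calculation. The sketch below computes the theoretical m/z of a singly charged ion from its elemental formula and the ppm error of a measurement. The monoisotopic masses are standard values rounded to eight decimal places; the function names are illustrative. The example uses the protonated tryptophan ion (C11H13O2N2) and the measured mass 205.09718 from the tomato example in this talk.

```python
# Monoisotopic masses (u), rounded; electron mass subtracted for a positive ion
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
ELECTRON = 0.00054858

def ion_mz(counts):
    """Theoretical m/z of a singly charged positive ion with the given atom counts."""
    mass = sum(MONOISOTOPIC[element] * n for element, n in counts.items())
    return mass - ELECTRON

def ppm_error(measured, theoretical):
    """Mass accuracy: measurement error relative to the true m/z, in ppm."""
    return (measured - theoretical) / theoretical * 1e6

theoretical = ion_mz({"C": 11, "H": 13, "N": 2, "O": 2})
error = ppm_error(205.09718, theoretical)
print(f"theoretical m/z {theoretical:.5f}, error {error:.3f} ppm")
```

At m/z 205 an error of 1 ppm corresponds to roughly 0.0002 u, which gives a feel for what the external (<3 ppm) and internal (<1 ppm) calibration figures mean in absolute terms.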



Understanding what accurate mass measurement gives us...

[Figure: mass accuracy vs. mass resolution]



Tomato Metabolite Example (30,000 res)

Component  Measured mass (H+)  Proposed formula  Mass error (ppm)  No. of hits within 2 ppm  Proposed annotation
A          205.09718           C11H13O2N2        0.126             1                         Tryptophan
B          273.07611           C15H13O5          1.318             1                         Naringenin
C          355.10245           C16H19O9          0.257             1                         Chlorogenic acid
D          414.33694           C27H44O2N         0.685             1                         Tomatidinol
E          416.35260           C27H46O2N         0.490             1                         Tomatidine
F          578.40503           C33H56NO7         0.172             1                         δ-Tomatine
G          740.45850           C39H66NO12        0.739             2                         γ-Tomatine

Assumption: nitrogen rule only
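
The idea behind the "hits within 2 ppm" column can be sketched as a brute-force search over C, H, N and O compositions, keeping those within the mass tolerance that also satisfy the nitrogen rule for an even-electron [M+H]+ ion. The element set, the search limits and the function name are assumptions for illustration. The vendor software used for the table may apply further constraints, so the number of candidates returned here need not match the counts above.

```python
from itertools import product

MASS = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
ELECTRON = 0.00054858

def candidate_formulae(mz, ppm_tol=2.0, max_n=5, max_o=15):
    """List (C, H, N, O) counts of singly charged ions within ppm_tol of mz."""
    hits = []
    max_c = int(mz // MASS["C"])
    for c, n, o in product(range(1, max_c + 1), range(max_n + 1), range(max_o + 1)):
        heavy = c * MASS["C"] + n * MASS["N"] + o * MASS["O"]
        # Hydrogen count that best fills the remaining mass
        h = round((mz + ELECTRON - heavy) / MASS["H"])
        if h < 0:
            continue
        theoretical = heavy + h * MASS["H"] - ELECTRON
        if abs(mz - theoretical) / theoretical * 1e6 > ppm_tol:
            continue
        # Nitrogen rule for an even-electron ion: odd nominal mass <-> even N count
        nominal = 12 * c + h + 14 * n + 16 * o
        if (nominal + n) % 2 != 1:
            continue
        hits.append((c, h, n, o))
    return hits

hits_a = candidate_formulae(205.09718)
hits_b = candidate_formulae(273.07611)
print(hits_a)
```

Tightening the tolerance shrinks the candidate list quickly, which is the practical meaning of "the better the accuracy the less the ambiguity".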



On-line Database Hits – “Proposed Formula” Search

Component  Measured mass (H+)  Proposed formula  ChEBI  MassBank  HMDB  ChemSpider  KEGG
A          205.09718           C11H12O2N2        90     24        1     124         8
B          273.07611           C15H12O5          82     4         2     459         15
C          355.10245           C16H18O9          16     8         2     68          4
D          414.33694           C27H43O2N         19     0         0     187         1
E          416.35260           C27H45O2N         18     0         0     206         1
F          578.40503           C33H55NO7         2      0         0     30          0
G          740.45850           C39H65NO12        1      0         0     8           0

How plant (tomato) specific are these databases?



Conclusions

● Analysed the metabolome of fruit from wild type and tomato ripening mutants using high resolution chromatography mass spectrometry methods

- GC-MS, UPLC-MS/MS

- accurate mass high resolution UPLC-MS/MS

● Data was processed using a combination of RefinerMS and Sieve followed by multivariate statistics using SIMCA-P (PCA and OPLS-DA)

● Annotation of components using a combination of retention indices and mass spectra (accurate mass etc.)

● Observed distinct metabolic differences between genotypes associated with ability to develop ripening competency



Conclusions

● Comparison with authentic reference material offers the best way to obtain an unequivocal identification

● But a number of procedures can be utilised to improve confidence in annotation of components detected in metabolomics studies

● High resolution and high mass accuracy are essential in obtaining elemental formulae

- The higher the mass accuracy the better!

● On-line database searches are very useful but in many cases do not contain “plant” specific metabolites



Conclusions

● Further corroborative information can be obtained using MS/MS, fragmentation predictor software or in-house libraries

● No excuse for not reading the literature!

● Where no reference material is available many assignments are likely to remain ambiguous; however, this may be adequate for many applications

● And ultimately, where MS does fail to provide the answer then preparative LC and NMR may be required



Acknowledgments

● Nottingham University

- Charlie Hodgman

- Graham Seymour

● Genedata

- Peter Haberl

- Mike Bowyer

● Funding

- ESB-link

- ERA-Net post genomics: TOMQML

● Syngenta

- Charles Baxter

- Mark Seymour

- Mark Earll

- Zsuszanna Ament

