Standardizing Gas Chromatography-Mass Spectrometry ...bioanalysis.web.auth.gr/workshop/Klapa.pdf ·...
Standardizing Gas Chromatography-Mass Spectrometry Metabolomics
Maria I. Klapa
Metabolic Engineering and Systems Biology Laboratory
Institute of Chemical Engineering and High Temperature Chemical Processes, Foundation for Research and Technology-Hellas (FORTH), Patras, Greece
Metabolomic Profiling: a multi-step procedure
Free Metabolite Pool Extraction: Polar, Non-Polar
Data Acquisition: NMR, GC-MS, LC-MS, CE-MS
Data Analysis (GC-MS): Metabolite Identification, Pool Quantification, Multivariate Statistical Analysis
Schematic Diagram of Metabolomic Analysis
Biological Sample -> Dried Metabolite Mixture -> Peak Area Profile -> List of Marker Ion Peak Areas -> Biological Conclusions
Measured Marker Ion i Peak Area = Original Metabolite j Concentration × 1 / RF_j(MI_i^j)
Internal Standard Normalization
Only biases that change RF to the same extent for all metabolites (Type A) might be present, e.g. variation in the injected volumes, variation in drying, variation in replicate division
& Equipment's operating conditions remain constant among runs
Metabolite j: Measured Marker Ion i Peak Area = Original Metabolite j Concentration × 1 / RF_j(MI_i^j)
Internal Standard (IS): Measured IS Marker Ion k Peak Area = Original IS Concentration × 1 / RF_IS(MI_k^IS)
RPA_j = Measured Marker Ion i Peak Area / Measured IS Marker Ion k Peak Area
ratio between 2 states (Metabolite j's Concentration) = ratio (RPA_j)
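The internal-standard normalization above can be sketched in Python as follows. This is only an illustration: the function names and peak-area values are hypothetical, and it assumes that the relative peak area (RPA) is the metabolite's marker-ion peak area divided by that of the internal standard.

```python
def relative_peak_area(pa_metabolite, pa_internal_standard):
    """RPA_j: marker-ion peak area of metabolite j divided by that of the IS."""
    if pa_internal_standard <= 0:
        raise ValueError("internal standard peak area must be positive")
    return pa_metabolite / pa_internal_standard


def fold_change(rpa_state_a, rpa_state_b):
    """Ratio of RPA_j between two states; it matches the concentration
    ratio only when Type A biases alone are present."""
    return rpa_state_a / rpa_state_b


# Hypothetical peak areas (arbitrary units) for one metabolite in two states.
# State B is assumed to be injected at half the volume, which halves both
# its metabolite and IS areas (a Type A bias) and cancels in the RPA.
rpa_a = relative_peak_area(pa_metabolite=4000.0, pa_internal_standard=2000.0)
rpa_b = relative_peak_area(pa_metabolite=500.0, pa_internal_standard=1000.0)
print(fold_change(rpa_a, rpa_b))
```

Dividing by the internal standard removes any factor that scales both peak areas equally, which is why this normalization handles Type A biases only.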
Schematic Diagram of Metabolomic Analysis (GC-MS)
Biological Sample -> Dried Metabolite Mixture -> Mixture of Metabolite Derivatives -> Peak Area Profile -> List of Marker Ion Peak Areas -> Biological Conclusions
From Original Metabolite to Derivative Peak Area
concentration of the original metabolite -> concentration of a derivative of the original metabolite -> measured peak area of the derivative's marker ion(s)
Derivative l: Measured Marker Ion h Peak Area = Derivative l Concentration × 1 / RF_l(MI_h^l)
Internal Standard (IS): Measured IS Marker Ion k Peak Area = Original IS Concentration × 1 / RF_IS(MI_k^IS)
RPA_{deriv. l of Mj} = Measured Marker Ion h Peak Area / Measured IS Marker Ion k Peak Area
ratio between 2 states (Mj Concentration) ? ratio (RPA_{deriv. l of Mj})
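A small numerical sketch of why the question mark above is warranted. It assumes the relation peak area = concentration × 1/RF read from the slide, and a hypothetical case in which the fraction of the metabolite converted to derivative l differs between the two states. All names and numbers are illustrative.

```python
def rpa_of_derivative(conc_metabolite, fraction_as_derivative, rf_derivative,
                      conc_is=1.0, rf_is=1.0):
    """RPA of one derivative, using peak area = concentration * (1 / RF)."""
    pa_derivative = conc_metabolite * fraction_as_derivative / rf_derivative
    pa_is = conc_is / rf_is
    return pa_derivative / pa_is


# Hypothetical case: the metabolite concentration doubles between states,
# but the fraction present as derivative l drops from 0.8 to 0.5.
rpa_state_1 = rpa_of_derivative(1.0, fraction_as_derivative=0.8, rf_derivative=2.0)
rpa_state_2 = rpa_of_derivative(2.0, fraction_as_derivative=0.5, rf_derivative=2.0)

true_ratio = 2.0 / 1.0
observed_ratio = rpa_state_2 / rpa_state_1
print(true_ratio, observed_ratio)
```

When the derivatized fraction is not the same in both states, the RPA ratio of a single derivative no longer reproduces the concentration ratio of the original metabolite.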
Type B Biases
Incomplete derivatization
Multiple Derivatives for some Metabolites
Potential Change in Equipment's Conditions between Runs
Need for a NEW Data Normalization, Correction and Validation Strategy not jeopardizing the high-throughput nature of metabolomic analysis
H. Kanani and M.I. Klapa. 2007. Data Correction Strategy for Metabolomics Analysis using Gas Chromatography-Mass Spectrometry, Metabolic Engineering 9: 39-51
TMS and MeOX Derivatization
Methoximation: R1(R2)C=O + Methoxyamine HCl -> R1(R2)C=N-O-CH3 (syn and anti isomers)
Silylation (MSTFA):
R-COOH -> R-COO-Si(CH3)3
R-OH -> R-O-Si(CH3)3
R-NH2 -> R-NH-Si(CH3)3
Metabolite Category 1
M + MSTFA --(k)--> MD
[M] = [MD] = w_{MD} * RPA_{MD} for t > t_M
w_{MD} = RF_j^{MD} / RF_l^{IS}
[Figure: concentration vs. silylation time]
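The condition t > t_M can be illustrated with a simple kinetic sketch. The slide does not give the rate law. The code below assumes pseudo-first-order silylation (MSTFA in large excess) and defines the completion time by a hypothetical 1% tolerance, so both the model and the numbers are assumptions.

```python
import math


def derivative_fraction(t, k):
    """Fraction of M converted to MD at silylation time t, assuming
    pseudo-first-order kinetics (MSTFA in large excess)."""
    return 1.0 - math.exp(-k * t)


def completion_time(k, tolerance=0.01):
    """Time after which less than `tolerance` of M remains underivatized."""
    return math.log(1.0 / tolerance) / k


k = 0.5  # hypothetical rate constant, 1/hour
t_m = completion_time(k)
print(round(t_m, 2), round(derivative_fraction(t_m, k), 4))
```

Under this assumption, [MD] is a good proxy for [M] only for silylation times beyond the completion time.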
Metabolite Category 2
M + Methoxyamine --(k1)--> MD1ox --(k3, +MSTFA)--> MD1
M + Methoxyamine --(k2)--> MD2ox --(k3, +MSTFA)--> MD2
[M] = [MD1] + [MD2]
[MD1] / [MD2] = k1 / k2 = k0 = (w_{MD1} * RPA_{MD1}) / (w_{MD2} * RPA_{MD2})
[Figure: concentration vs. silylation time]
Data Validation Criterion: [MD1] / [MD2] = k1 / k2 = k0 = (w_{MD1} * RPA_{MD1}) / (w_{MD2} * RPA_{MD2})
[Figure: MD1/MD2 ratio (0 to 1) vs. Injection Number (1 to 801) for Glucose and Fructose]
Published Metabolomic Analysis based on Metabolomic Data Acquired at Different Equipment conditions
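One way to apply the data validation criterion for Category 2 metabolites is to check that the MD1/MD2 ratio stays constant across injections. The sketch below flags injections whose ratio departs from the median. The 20% threshold, the function name and the data are hypothetical choices, not values from the presentation.

```python
from statistics import median


def flag_ratio_outliers(rpa_md1, rpa_md2, rel_tolerance=0.2):
    """Return indices of injections whose MD1/MD2 ratio deviates from the
    median ratio by more than rel_tolerance (a hypothetical threshold)."""
    ratios = [a / b for a, b in zip(rpa_md1, rpa_md2)]
    reference = median(ratios)
    return [i for i, r in enumerate(ratios)
            if abs(r - reference) > rel_tolerance * reference]


# Hypothetical RPAs of the two oxime derivatives over five injections.
rpa_md1 = [0.30, 0.31, 0.29, 0.45, 0.30]
rpa_md2 = [1.00, 1.02, 0.98, 1.00, 1.01]
print(flag_ratio_outliers(rpa_md1, rpa_md2))
```

A flagged injection suggests that the equipment conditions changed for that run, so its data should not be pooled with the rest without correction.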
Metabolite Category 3
M + MSTFA --(k)--> M(TMS)x --(k1, +MSTFA)--> M(TMS)x+1 ... M(TMS)x+n
[M] = \sum_{i=1}^{n} [MD_i] = \sum_{i=1}^{n} w_{MD_i} * RPA_{MD_i}
[Figure: concentration vs. silylation time]
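The Category 3 relation, in which the metabolite is represented by the weighted sum of the RPAs of all its derivatives, can be sketched as below. The weights and RPA values are hypothetical and chosen so that the weighted sum is unchanged while the individual derivative peaks shift with derivatization time.

```python
def effective_rpa(rpas, weights):
    """Sum of w_i * RPA_i over all derivatives of one metabolite."""
    if len(rpas) != len(weights):
        raise ValueError("one weight per derivative is required")
    return sum(w * r for w, r in zip(weights, rpas))


weights = [2.0, 0.5]  # hypothetical w_1, w_2
early = [0.40, 0.40]  # RPAs at an early derivatization time
late = [0.10, 1.60]   # RPAs after more of derivative 1 converted to derivative 2
print(effective_rpa(early, weights), effective_rpa(late, weights))
```

Each individual RPA changes by a factor of four between the two time points, yet the weighted sum does not, which is the property the correction relies on.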
Peak Area Variation with derivatization time among replicates of the same sample: 15-100%
[Figure: Raw Data from Standard Amino Acid Mixture; Relative Peak Area (log scale, 0.0001 to 1) vs. Derivatization Time (5 to 17)]
New Normalization Algorithm
\begin{bmatrix} RPA_{MD_1}^{t_1} & \cdots & RPA_{MD_N}^{t_1} \\ \vdots & & \vdots \\ RPA_{MD_1}^{t_V} & \cdots & RPA_{MD_N}^{t_V} \end{bmatrix} \cdot \begin{bmatrix} w_1^M \\ \vdots \\ w_N^M \end{bmatrix} = \begin{bmatrix} [M]_o / [IS]_o \\ \vdots \\ [M]_o / [IS]_o \end{bmatrix}
N: # of derivatives; V: # of timepoints
RPA_{MD_i}^{t_j}: relative (with respect to the peak area of the internal standard) peak area corresponding to the i-th derivative of metabolite M at derivatization time t_j
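The matrix relation above has one row per derivatization time and one unknown weight per derivative. The transcript does not say how the weights are estimated. The sketch below assumes an ordinary least-squares fit through the normal equations, with a known right-hand side (for example from a standard mixture). The helper names and data are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: RPA columns are not independent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def estimate_weights(rpa, target):
    """Least-squares w for rpa (V x N) . w = target (length V),
    via the normal equations (rpa^T rpa) w = rpa^T target."""
    n = len(rpa[0])
    ata = [[sum(row[i] * row[j] for row in rpa) for j in range(n)]
           for i in range(n)]
    atb = [sum(row[i] * t for row, t in zip(rpa, target)) for i in range(n)]
    return solve_linear(ata, atb)


# Hypothetical RPAs of N = 2 derivatives at V = 3 derivatization times,
# generated from w = (2.0, 0.5) with a constant right-hand side of 1.0.
rpa = [[0.40, 0.40], [0.25, 1.00], [0.10, 1.60]]
target = [1.0, 1.0, 1.0]
print(estimate_weights(rpa, target))
```

The fit needs at least as many derivatization times as derivatives (V >= N), and the derivative profiles must actually change with time. Otherwise the columns of the RPA matrix are proportional and the weights cannot be separated.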
• H. Kanani and M.I. Klapa. 2007. Data Correction Strategy for Metabolomics Analysis using Gas Chromatography-Mass Spectrometry, Metabolic Engineering 9: 39-51
• U.S. Patent Application No. 11/362,717
• Best University of Maryland Invention of the Year 2005 in Information Sciences
Peak Area Variation with derivatization time among replicates of the same sample dropped from 15-100% to 2-8%
[Figure: Normalized Data from Standard Amino Acid Mixture; panels show Relative Peak Area and Effective Peak Area (log scale, 0.0001 to 1) vs. Derivatization Time (5 to 17)]
Category 1 and 2 Metabolites
[Figure: log2(Peak Area at silylation time t / Peak Area at 30 min of silylation) (-1.0 to 2.5) vs. Time After Addition of MSTFA (0 to 500 minutes) for citrate TMS, sorbitol TMS, iso-citrate TMS, ribitol 5 TMS, threonate TMS, fumarate TMS, glycerol 3 TMS, fructose MeOX2 TMS, glucose MeOX1 5 TMS]
Kanani HH, Chrysanthopoulos P, Klapa MI. 2008. Standardizing GC-MS Metabolomics. J. Chromatogr. B Analyt Technol Biomed Life Sci. 871: 191-201
Matrix Effect - Derivatization time 14 hr
[Figure: Ratio (PA MD1 / PA MD2) (log scale, 0.1 to 100) vs. Plant Sample Number (1 to 20) for Fructose, Glutamate, Threonine, Asparagine]
Matrix Effects Limit the Accuracy of the Measurements even in the presence of an automated derivatization scheme
Kanani HH, Chrysanthopoulos P, Klapa MI. 2008. Standardizing GC-MS Metabolomics. J. Chromatogr. B Analyt Technol Biomed Life Sci. 871: 191-201
Identification of Unknown Peaks
Derivatives formed from chemical transformations
Derivatives not present in major public databases
n/d: Not detected consistently in all the samples
No. | Amino acid (M) | Derivative 1 (MD1) | w1 | Derivative 2 (MD2) | w2 | Derivative 3 (MD3) | w3
1 | Alanine* | Alanine N O | 1.025 | Alanine N N O | 0.774 | |
2 | Arginine | Ornithine N N N O | 1.10 | Ornithine N N N O^2 | 0.48 | Ornithine N N N N O | n/d
3 | Asparagine | Asparagine N N O | 0.726 | Asparagine N N N O | 1.904 | Asparagine N N N N O^2,3 (putative) | 1.595
4 | Aspartate* | Aspartate O O^2,3 | 3.824 | Aspartate N O O | 0.224 | |
5 | Cysteine | Cysteine N O^2 | n/d | Cysteine N S O | 2.67 | Cysteine N N O^1 | 0.37
6 | Glutamate | Glutamate N O O | 1.014 | Pyroglutamate N O^1 | 0.988 | |
7 | Glutamine | Glutamine N N O | 0.667 | Glutamine N N N O | 10.3 | Pyroglutamine N N O^1,2,3 (putative) | 9.000
8 | Glycine* | Glycine N O | 9.397 | Glycine N N O | 0.773 | |
9 | Histidine | Histidine O^2 (putative) | n/d | Histidine N O | n/d | Histidine N N O | 1.00
10 | iso-Leucine* | iso-Leucine O | 2.55 | iso-Leucine N O | 0.92 | iso-Leucine N N O^2 | n/d
11 | Leucine | Leucine O | n/d | Leucine N O | 1.000 | Leucine N N O^2 | n/d
12 | Lysine* | Lysine N N O | n/d | Lysine N N N O | 1.005 | Lysine N N N N O^2 | 2.124
13 | Methionine* | Methionine N O | 1.42 | Methionine N N O^2 | 0.369 | |
14 | Phenylalanine | Phenylalanine O | 1.30 | Phenylalanine N O | 0.48 | |
15 | Serine | Serine O O | 2.97 | Serine N O O | 0.299 | Serine N N O O^2 | 7.87
16 | Threonine | Threonine O O | 3.30 | Threonine N O O | 0.321 | Threonine N N O O^2 | 33.5
17 | Tryptophan | Tryptophan O^2 (putative) | n/d | Tryptophan N O | 1.0 | Tryptophan N N O | n/d
18 | Tyrosine | Tyrosine O^2 (putative) | 1.18 | Tyrosine O O | 0.94 | Tyrosine N O O | 0.26
19 | Valine* | Valine O | 1.638 | Valine N O | 0.842 | Valine N N O^2,3 | n/d
20 | Allantoin | Allantoin N N N | 25.3 | Allantoin N N N N | 0.530 | Allantoin N N N N N | 2.12
21 | Beta-Alanine | Beta-Alanine O | 8.88 | Beta-Alanine N O | n/d | Beta-Alanine N N O | 0.80
22 | Gaba | Gaba N O | n/d | Gaba N N O | 1.0 | |
23 | Dopamine | Dopamine N O O | 4.16 | Dopamine N N O O | 0.73 | |
24 | Homoserine | Homoserine O O | 6.51 | Homoserine N O O | 0.231 | Homoserine N N O O | 2.67
25 | Ornithine | Ornithine N N N O | 1.10 | Ornithine N N N O^2 | 0.48 | Ornithine N N N N O | n/d
Derivatives treated as unknowns in public databases
Conclusions
We developed a GC-MS metabolomic data validation, normalization and correction strategy that does NOT jeopardize the high-throughput nature of the analysis
The method is easy to implement and increases the accuracy of measurements by an order of magnitude for some metabolites (NH2-containing compounds)
In light of the importance of metabolomics research, this method is expected to provide a valuable tool for the acquisition of accurate metabolomic data
Objective
To analyze stress-induced molecular interaction networks in the context of plant primary metabolism during the first 30 hours of the stress treatment under a variety of individual or combined perturbations using integrated time-series transcriptomic & metabolomic analyses
Model System: Arabidopsis thaliana Whole Plant Liquid Cultures
Well-controlled growth environment
Experimental Design & Setup
A. thaliana whole-plant liquid cultures: Day 0 to Harvesting on Day 12
Growth environment: LIGHT (80-100 µmole/cm2/s), Humidity 60%, TEMP (23 °C), Gamborg Media + Sucrose (58 mM), Air 0.04% CO2
Harvesting time points: 1 h, 3 h, 6 h, 9 h, 12 h, 18 h, 24 h, 30 h
Control Set
Exp 2: Elevated CO2 (Air 1% CO2)
Exp 3,4,5: NaCl (50 mM) or Trehalose (12 mM) or ACC
Exp 6,7,8: Combined Stresses, NaCl (50 mM) or Trehalose (12 mM) or ACC with Air 1% CO2
Dutta et al. 2008. Time-series integrated "omic" analyses to elucidate short-term stress-induced responses in plant liquid cultures. Biotech. Bioeng. (In Press; E-print Available)
[Figure: map of experiments 1 to 10 along two axes, Media Perturbation effect and Elevated CO2 Effect (Ambient CO2 vs. 1% CO2); media perturbations: Sucrose (58 mM), Glucose (58 mM), NaCl (50 mM), Trehalose (12 mM), ACC (0.01 mM)]
1. Control
2. 1% CO2
3. 50 mM NaCl
4. 1% CO2 + 50 mM NaCl
cDNA Microarray Transcriptomic analysis (Bhaskar Dutta, UMD)
(8 Exp * 20 samples) Trizol extractions
160 mRNA amplifications
640 cDNA syntheses
320 Dye Injections
320 Micro-array hybridizations (flip-dye)
GC-MS Metabolomic analysis (Harin Kanani, UMD)
10 Exp * 20 samples * 2 Injections * 550 Peaks = 220,000 Total Measurements
GC-MS Metabolomic Data Correction Methodology
Paired-SAM analysis (TIGR MeV v.3.1), delta = 1.2, FDR (median) = 0%
Metabolite categories: Cat-1, Cat-2, Cat-3
H. Kanani and M.I. Klapa. 2007. Data Correction Strategy for Metabolomics Analysis using Gas Chromatography-Mass Spectrometry, Metabolic Engineering 9: 39-51
Kanani HH, Chrysanthopoulos P, Klapa MI. 2008. Standardizing GC-MS Metabolomics. J. Chromatogr. B Analyt Technol Biomed Life Sci. 871: 191-201
Without Data Correction (Relative Peak Areas): 27 + 1 significant
Unknown 022, 3,4-Dihydroxybutyrate, Glyoxylate, Glycerol 3 P, 4-Hydroxybutanoate, Unknown 083, Unknown 024, Unknown 116, Unknown 040, Unknown 044, Unknown 048, Unknown 074, Unknown 345, Unknown 161, Unknown 133, Unknown 088, Unknown 285, Unknown 010, Unknown 059, Unknown 097, Unknown 089, Unknown 039, Unknown 136, Unknown 013, Unknown 073, Unknown 195, Nicotinate, Glycerate
With Data Correction (Relative Peak Areas): 26+1+11 significant
Unknown 022, 3,4-Dihydroxybutyrate, Nicotinate, Glycerol 3 P, Unknown 083, Unknown 024, Unknown 116, Unknown 040, Unknown 044, Unknown 048, Unknown 074, Unknown 345, Unknown 161, Unknown 133, Unknown 088, Unknown 285, Unknown 010, Unknown 059, Unknown 097, Unknown 089, Unknown 039, Unknown 136, 4-Hydroxybutanoate, Unknown 381, Unknown 371, Unknown 391, Unknown 368, Unknown 387, Unknown 376, Lysine, Unknown 412, Unknown 390, Unknown 415, Uracil, Glyoxylate, Unknown 013, Unknown 073, Unknown 195
PCA - Metabolomic Data: Individual Stress Response
PC1: 48%, PC2: 14%, PC3: 9%, Total: 71%
[Figure: 3-D score plot on PC 1, PC 2 and PC 3; groups: 1% CO2, Control]
MiTimeS: Microarray Time-series Significance Analysis
Consists of 4 modules
Identification of Significant Genes at each time-point
Analysis of Gene Variability in Significance Level Among Time Points
Correlation Analysis between time-points with respect to their common significant genes
Comparison of significant genes' GO analysis results between time-points
US Patent Application, 2006
Dutta B, Snyder R, Klapa MI. 2007. Significance Analysis of Time-Series Transcriptomic Data: A methodology that enables the identification and further exploration of the differentially expressed genes at each time-point. Biotech. Bioeng. 98: 668-678.
Elevated CO2 stress: Time-profile of No. of Significant Genes
[Figure: Metabolomic Analysis; fraction of total number (295) of metabolites (%) vs. time point (1 h, 3 h, 6 h, 9 h, 12 h, 18 h, 24 h, 30 h, Paired SAM); series: Positively Significant, Negatively Significant, Total Significant]
% FDR: 2.6, 2.1, 2.4, 1.7, 1.9, 1.6, 4.8, 1.8, 0
% FDR: 0, 0, 0.011, 0.014, 0, 0.007, 0, 0, 0
[Figure: Transcriptomic Analysis; fraction of total number (11231) of genes (%) vs. time point (1 h, 3 h, 6 h, 9 h, 12 h, 18 h, 24 h, 30 h, Paired SAM)]
Dutta et al. 2008. Time-series integrated "omic" analyses to elucidate short-term stress-induced responses in plant liquid cultures. Biotech. Bioeng. (In Press; E-print Available)
Acknowledgements
Funding: US NSF Grant QSB-0331312, UMD Minta Martin Foundation, UMD Department of Chemical & Biomolecular Engineering, FORTH/ICE-HT, Bayer HealthCare LLC