Analytical method evaluation and discovery of variation within maize varieties in the context of...


  • Journal of Agricultural and Food Chemistry is published by the American Chemical Society. 1155 Sixteenth Street N.W., Washington, DC 20036. Published by American Chemical Society. Copyright American Chemical Society. However, no copyright claim is made to original U.S. Government works, or works produced by employees of any Commonwealth realm Crown government in the course of their duties.

    Article

    Analytical method evaluation and discovery of variation within maize varieties in the context of food safety: Transcript profiling and metabolomics.

    Weiqing Zeng, Jan Hazebroek, Mary Beatty, Kevin Hayes, Christine Ponte, Carl A. Maxwell, and Cathy Zhong

    J. Agric. Food Chem., Just Accepted Manuscript. DOI: 10.1021/jf405652j. Publication Date (Web): 24 Feb 2014. Downloaded from http://pubs.acs.org on March 16, 2014.

    Just Accepted

    "Just Accepted" manuscripts have been peer-reviewed and accepted for publication. They are posted online prior to technical editing, formatting for publication and author proofing. The American Chemical Society provides "Just Accepted" as a free service to the research community to expedite the dissemination of scientific material as soon as possible after acceptance. "Just Accepted" manuscripts appear in full in PDF format accompanied by an HTML abstract. "Just Accepted" manuscripts have been fully peer reviewed, but should not be considered the official version of record. They are accessible to all readers and citable by the Digital Object Identifier (DOI). "Just Accepted" is an optional service offered to authors. Therefore, the "Just Accepted" Web site may not include all articles that will be published in the journal. After a manuscript is technically edited and formatted, it will be removed from the "Just Accepted" Web site and published as an ASAP article. Note that technical editing may introduce minor changes to the manuscript text and/or graphics which could affect content, and all legal disclaimers and ethical guidelines that apply to the journal pertain. ACS cannot be held responsible for errors or consequences arising from the use of information contained in these "Just Accepted" manuscripts.

    Analytical method evaluation and discovery of variation within maize varieties in the context of food safety: Transcript profiling and metabolomics.

    Weiqing Zeng1*, Jan Hazebroek2, Mary Beatty2, Kevin Hayes3, Christine Ponte1, Carl Maxwell1, and Cathy Xiaoyan Zhong1*

    1DuPont Pioneer, Regulatory Sciences, Wilmington, DE 19880

    2DuPont Pioneer, Analytical & Genomics Technologies, Johnston, IA 50131

    3DuPont Pioneer, Trait Characterization, Johnston, IA 50131

    *Corresponding Authors

    SUMMARY

    Profiling techniques such as microarrays, proteomics, and metabolomics are used widely to assess the overall effects of genetic background, environmental stimuli, growth stage, or transgene expression in plants. To assess the potential regulatory use of these techniques in agricultural biotechnology, we carried out microarray and metabolomic studies of three different tissues from eleven conventional maize varieties. We measured technical variations for both microarrays and metabolomics, compared results from individual plants and corresponding pooled samples, and documented variations detected among different varieties with individual plants or pooled samples. Both microarray and metabolomic technologies are reproducible and can be used to detect plant-to-plant and variety-to-variety differences. A pooling strategy lowered sample variations for both microarray and metabolomics while capturing variety-to-variety variation. However, unknown genomic sequences differing between maize varieties might hinder the application of microarrays. High-throughput metabolomics could be useful as a tool for the characterization of transgenic crops. However, researchers will have to take into consideration the impact of the detection and quantitation of a wide range of metabolites on experimental design as well as on the validation and interpretation of results.

    KEYWORDS

    Metabolomics, Zea mays, maize, microarray

    INTRODUCTION

    Global demand for food is increasing rapidly, a trend that is expected to continue for many years. This trend coincides with the growth of the world population, the limited availability of arable land and irrigation water, and global environmental changes.1-3 In addition to traditional plant breeding, biotechnology has become a main focus in the effort to meet the global food demand. The main crops targeted for genetic engineering include maize, soy, cotton, oilseed, canola/rapeseed, rice, potato, staple cereal plants, and vegetables.2

    The introduction of genetically modified (GM) crops has presented technical, regulatory, and social challenges.4,5 Detailed studies are required to demonstrate that food and feed produced from agricultural products developed through biotechnology are as safe as conventional counterparts and do not pose risks to the environment or human health.6-8 In the early 2000s, the concept of substantial equivalence emerged for testing the equivalence of GM and corresponding conventional crops.5,9 The introduction of a single gene of interest should preferably affect only the desired trait. The biochemical composition of the crop should otherwise be comparable to a parental strain or a variety similar to the parental line.10 Therefore, compositional analysis covering key nutrients and anti-nutrients is recommended by the Organization for Economic Cooperation and Development (OECD). This targeted approach, focusing on the majority of the compositional components,11-14 has been widely accepted by international regulatory agencies as part of the concept of substantial equivalence and applied to the assessment of the safety of GM crops.9,14,15

    The development of -omics profiling offers powerful high-throughput tools for biomedical and agricultural studies. Since non-targeted profiling technologies can screen many components simultaneously, they have the potential to provide insight into complicated metabolic pathways and their interconnections. Such technologies therefore could represent valuable analytical approaches for the assessment of substantial equivalence for GM plants.10,15-17 The challenges in the use of these methods are due to the complexity of the data sets and the use of different technological platforms and software that might generate artifacts, biases, and non-uniform data representations.18

    Although non-targeted surveys of the overall transcriptome, proteome, or metabolome of a plant at one snapshot in time and tissue are gaining attention,19,20 these technologies are not yet fully validated within the regulatory framework and are therefore not at present officially recommended for safety evaluations of GM plants. A major challenge is to determine whether any detected differences are due to genetic manipulation through biotechnology or to natural variation resulting from genetic and environmental effects, interaction of genotypes with environments, or even stochastic differences between plants. For this purpose it is necessary to evaluate both the reproducibility of these analytical methods and the natural variation in the results obtained when they are applied to crop species such as maize. Without this understanding it would be impossible to interpret the -omics data and declare equivalence. Therefore, the International Life Sciences Institute (ILSI) recommended establishing baseline ranges for natural variations and validating these -omics technologies before they can be used for regulatory assessment of biotech crops.15 This paper is directed towards fulfilling this function for transcriptomic and metabolomic methods.

    Microarray analysis of transcriptomes is available for both model and crop plants, including Arabidopsis, maize, rice, potato, tomato, soy, pepper, barley, Brassica, and sugarcane.21 Microarrays provide high-throughput, simultaneous detection of differences in mRNA abundance between samples for thousands of genes. Use of microarray technology for safety assessment of GM crops faces some challenges. First, nucleic acid probe hybridization is not able to detect genes expressed at very low levels or genes with alternative splice forms. Second, it is difficult to achieve high reproducibility for microarray experiments due to variations resulting from sample handling, experimental processes, environmental impact on plants, and crop variety differences.22-24

    Technologies for simultaneous analysis of metabolites have been developed,25,26 and offer the possibility of surveying significantly more metabolites than conventional chemical analyses in a much shorter time and at a much lower cost per analyte. However, comparing data from different laboratories remains challenging. This challenge is usually due to relative rather than absolute quantification and to the different methodologies adopted by different groups, including equipment platforms and statistical analysis methods. High sample-to-sample and experiment-to-experiment variability, even within the same laboratory, and the wide concentration range of the same metabolite between plants add to the complexity of the analysis.10

    We applied microarray and metabolomic technologies to a randomized field study as conventionally used in regulatory studies. To evaluate the reproducibility and technical variations of the microarray and metabolomic technologies, the samples were tested individually or as pools of plants. RNA and metabolites were extracted and analyzed by microarray and GC/MS, respectively. Overall, we evaluated the reproducibility of the microarray and metabolomic technologies in order to explore the capability of these methodologies in our experimental settings to detect the natural variation of gene expression and metabolite levels between plants and maize varieties.

    MATERIALS AND METHODS

    Plant Tissue

    Seven inbred and four non-GM commercial hybrid maize varieties were planted in a randomized plot at DuPont Stine Haskell Research Center, Newark, Delaware, USA. Twenty-five seeds were sown per row for each variety. Leaves at the V5 growth stage and immature kernels at 25 days after pollination (DAP) were collected in the morning between 8:30 AM and noon for microarray and GC/MS-based metabolomics.

    Three leaf punches avoiding midribs were collected at the middle of the V5 leaf area and placed on dry ice immediately after harvest, transported to the lab on dry ice, and stored at -80 °C before processing for metabolomic analysis. The remaining leaf was collected and frozen in liquid nitrogen immediately after harvest, transported to the lab on dry ice, and stored at -80 °C before processing for microarray analysis.

    For 25 DAP kernels, 10 kernels in the middle row of the ear were collected for metabolomics, and the remaining kernels were used for microarray analysis. The ears at 25 DAP were removed from the plants, placed on wet ice immediately after harvest, and transported to the lab on wet ice. Immature kernels were removed from the cobs, frozen in liquid nitrogen, and stored at -80 °C before processing for microarray and metabolomics analyses.

    Mature kernels at the R6 growth stage (about 60 DAP) were also collected for metabolomics analysis. The ears at the R6 stage were removed from the plants, placed on wet ice immediately after harvest, and transported to the lab on wet ice. Ten mature kernels in the middle row of the ear were removed from the cob, frozen in liquid nitrogen, and stored at -80 °C before processing for metabolomics analyses.

    For microarray analysis, tissues were ground into fine powders. For metabolomics analysis, tissues were lyophilized before grinding to fine powders. Additional pooled samples were obtained by combining equivalent amounts of ground material from three individual plants.

    Microarray

    Total RNA was isolated from ground frozen tissue using the EZNA SQ RNA Isolation Kit (Omega Bio-Tek, Norcross, GA), treated with DNase I, and used for mRNA isolation with the Illustra mRNA Purification Kit (GE Biosciences, Pittsburgh, PA). The total RNA and mRNA samples were visualized and quantified on a Bioanalyzer 2100 (Agilent Technologies, Santa Clara, CA). Each mRNA sample was converted into double-stranded cDNA, which was then transcribed in vitro into cRNA labeled with Cy3 fluorescent dye using the Low RNA Input Fluorescent Linear Amplification Kit (Agilent Technologies, Santa Clara, CA). The cRNA product was purified with an Agencourt RNAClean Kit (Beckman Coulter, Indianapolis, IN). Hybridizations were performed overnight with equal amounts of labeled cRNA to a custom 4x44K Maize Oligo Microarray from Agilent Technologies (Santa Clara, CA) according to Agilent's One-Color Microarray-Based Gene Expression Analysis protocol. After hybridization, the microarray slides were washed and immediately scanned with the G2505C DNA Microarray Scanner (Agilent Technologies, Santa Clara, CA). The images were visually inspected for artifacts, and feature intensities were extracted, filtered, and normalized with the Feature Extraction Software (v 10.5.1.1) (Agilent Technologies, Santa Clara, CA). Quality control and downstream analysis were performed using data analysis tools in Genedata Expressionist and the statistical language R. Further data analysis and bioinformatic analyses were carried out according to methods described in Hayes et al.27

    Metabolomics

    Metabolites were extracted from approximately 3 mg (dry weight) of lyophilized tissue for each sample. In a 1.1-mL polypropylene microtube containing two 5/32-inch stainless steel ball bearings, 500 µL of chloroform:methanol:water (2:5:2, v/v/v) solution containing 0.015 mg of ribitol internal standard was added to each sample. Samples were homogenized in a 2000 Geno/Grinder ball mill at setting 1,650 for 1 min and then rotated at 4 °C for 30 min before being centrifuged at 1,454 × g for 15 min at 4 °C. Aliquots (300 µL) were transferred to 1.8-mL high-recovery glass autosampler vials, evaporated to dryness in a speed vac, and re-dissolved in 50 µL of 20 mg mL⁻¹ methoxyamine hydrochloride in pyridine. The vials were capped, agitated with a vortex mixer, and incubated in an orbital shaker at 30 °C for 90 min to form methoxyamine derivatives. Next, 80 µL of N-methyl-N-(trimethylsilyl)trifluoroacetamide (MSTFA) was added to each sample by a Gerstel autosampler 30 min prior to injection to form trimethylsilyl (TMS) derivatives. This just-in-time derivatization minimizes sample variation due to differences in reaction time or temperature. Furthermore, the gas chromatograph inlet liner and septum were replaced daily, mitigating the known influence of sample residue in the inlet on trimethylsilylation completeness.28 However, trimethylsilylation can vary with the sample matrix.28 Thus, for molecules such as amino acids that present multiple reaction sites, leading to the possibility of two or more chemical derivatives, the relative abundance of these trimethylsilylated forms can vary among the three different tissue types assayed in this study.
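The quantities in the extraction protocol can be followed through to the autosampler vial with simple arithmetic. The sketch below is illustrative only: it assumes that the 0.015 mg of ribitol refers to the whole 500 µL of extraction solvent, that extraction is complete, and that reagent volumes are additive. None of these assumptions is stated in the text, and the variable names are hypothetical.

```python
# Values taken from the extraction protocol described above.
TISSUE_MG = 3.0          # approximate dry weight extracted per sample
SOLVENT_UL = 500.0       # extraction solvent added per sample
RIBITOL_MG = 0.015       # internal standard (ASSUMED: total per 500 µL of solvent)
ALIQUOT_UL = 300.0       # supernatant transferred to the autosampler vial
METHOXYAMINE_UL = 50.0   # methoxyamine hydrochloride in pyridine
MSTFA_UL = 80.0          # silylation reagent added before injection


def aliquot_fraction(aliquot_ul: float, solvent_ul: float) -> float:
    """Fraction of the extract that is carried forward to the vial."""
    return aliquot_ul / solvent_ul


fraction = aliquot_fraction(ALIQUOT_UL, SOLVENT_UL)

# Tissue and internal standard represented in the dried aliquot
# (ASSUMES complete extraction and a homogeneous supernatant).
tissue_equivalent_mg = TISSUE_MG * fraction
ribitol_in_vial_mg = RIBITOL_MG * fraction

# Final derivatized volume (ASSUMES the two reagent volumes are additive).
final_volume_ul = METHOXYAMINE_UL + MSTFA_UL

print(f"aliquot fraction:        {fraction:.2f}")
print(f"tissue equivalent (mg):  {tissue_equivalent_mg:.2f}")
print(f"ribitol in vial (mg):    {ribitol_in_vial_mg:.4f}")
print(f"derivatized volume (µL): {final_volume_ul:.0f}")
```

Under these assumptions each vial holds the extract of roughly 1.8 mg of dry tissue in about 130 µL of derivatization mixture.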

    The derivatized samples were separated by gas chromatography on a Restek 30 m × 0.25 mm × 0.25 µm film thickness Rtx-5Sil MS column with a 10 m Integra-Guard column. One-microliter injections were made with a 1:30 split ratio using the Gerstel autosampler. The Agilent 6890N gas chromatograph was programmed for an initial temperature of 80 °C for 0.5 min, increased to 350 °C at a rate of 18 °C min⁻¹, where it was held for 2 min, before being cooled rapidly to 80 °C and held there for 5 min in preparation for the next run. The injector and transfer line temperatures were 230 °C and 250 °C, respectively, and the source temperature was 200 °C. Helium was used as the carrier gas with a constant flow rate of 1 mL min⁻¹ maintained by electronic pressure control. Data acquisition was performed on a LECO Pegasus III time-of-flight mass spectrometer with an acquisition rate of 10 spectra s⁻¹ in the mass range of m/z 45-600. An electron beam of 70 eV was used to generate spectra. Detector voltage was 1,750 V. An instrument auto-tune for mass calibration using PFTBA (perfluorotributylamine) was performed prior to each sample sequence.
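The oven program implies a fixed run length per injection, which can be worked out from the temperatures, ramp rate, and hold times given above. The following is a minimal sketch of that arithmetic. It assumes the ramp rate is 18 °C per minute and that spectra are acquired over the whole run with no solvent delay, which the text does not state; the time needed to cool the oven is not given and is left out.

```python
def ramp_minutes(start_c: float, end_c: float, rate_c_per_min: float) -> float:
    """Time needed to heat linearly from start_c to end_c."""
    return (end_c - start_c) / rate_c_per_min


INITIAL_HOLD_MIN = 0.5      # hold at 80 °C
FINAL_HOLD_MIN = 2.0        # hold at 350 °C
EQUILIBRATION_MIN = 5.0     # hold at 80 °C before the next run
SPECTRA_PER_SECOND = 10     # acquisition rate of the mass spectrometer

ramp_min = ramp_minutes(80.0, 350.0, 18.0)
run_min = INITIAL_HOLD_MIN + ramp_min + FINAL_HOLD_MIN

# Lower bound on the injection-to-injection cycle; oven cool-down time is unknown.
min_cycle_min = run_min + EQUILIBRATION_MIN

# ASSUMPTION: acquisition covers the full run (no solvent delay).
spectra_per_run = run_min * 60 * SPECTRA_PER_SECOND

print(f"ramp: {ramp_min} min, run: {run_min} min, "
      f"cycle >= {min_cycle_min} min, spectra per run: {spectra_per_run:.0f}")
```

With these settings the chromatographic run lasts 17.5 min, which bounds how many samples can be analyzed per day and therefore how quickly just-in-time derivatized samples move through the queue.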

    Metabolomics Data Processing and Analysis

    Raw Leco GC/MS .peg data files were converted into .netcdf (Andi) format using the Leco ChromaTof ver. 4.13 software. Data preprocessing was performed with Genedata Refiner MS ver. 5.2.1 software. For each .netcdf file, retention times were converted into retention indices using an in-house program. Preprocessing consisted of gridding chromatograms in the m/z value (80-437) and retention index dimensions, subtracting chemical noise, aligning the retention indices of each selected ion chromatogram, and detecting nominal mass peaks, using empirically optimized settings for each process. Data from each of the three tissue types were processed separately to maximize alignment and peak picking. The resulting three matrices consisted of intensities for each m/z value_retention index combination and each sample. The aligned and de-noised data matrices were passed to Genedata Analyst ver. 2.1 software, where each intensity value by sample was normalized for both the ribitol internal standard signal and sample dry weight.
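The normalization step can be illustrated as follows. This is a sketch under a stated assumption, not the Genedata Analyst procedure: it assumes that "normalized for both the ribitol internal standard signal and sample dry weight" means dividing each raw intensity by the product of the two. The function names and feature keys are hypothetical.

```python
from typing import Dict


def normalize_intensity(raw: float, ribitol_signal: float, dry_weight_mg: float) -> float:
    """Scale one raw intensity by internal-standard signal and sample dry weight.

    ASSUMPTION: normalization is a simple division by both quantities.
    """
    if ribitol_signal <= 0 or dry_weight_mg <= 0:
        raise ValueError("ribitol signal and dry weight must be positive")
    return raw / (ribitol_signal * dry_weight_mg)


def normalize_sample(features: Dict[str, float],
                     ribitol_signal: float,
                     dry_weight_mg: float) -> Dict[str, float]:
    """Normalize every m/z_retention-index feature of one sample."""
    return {
        key: normalize_intensity(value, ribitol_signal, dry_weight_mg)
        for key, value in features.items()
    }


# Toy intensities for one sample, NOT data from the study.
# Keys follow the "m/z value_retention index" naming used in the text.
sample = {"147_1650": 1200.0, "217_1890": 600.0}
normalized = normalize_sample(sample, ribitol_signal=400.0, dry_weight_mg=3.0)
print(normalized)
```

Dividing by the internal standard corrects for injection and derivatization differences between runs, while dividing by dry weight corrects for differences in the amount of tissue extracted.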

    Since m/z value_retention index fingerprint data are redundant, significant signatures were reduced to named known metabolites based on matching both the retention index and mass spectrum to those of authentic standards. Relative quantitation of each metabolite in each sample was derived from the intensity of each metabolite's representative m/z value obtained from the Genedata Analyst output. In a few cases, peak heights obtained from ChromaTof quantification ion chromatograms were used instead when signals were below the threshold set for fingerprinting and thus not present in the Genedata Analyst output. Metabolite detection from either source was dependent on reaching a conservative limit of detection to guard against false positive peaks that would have an undue effect on subsequent statistical analyses. Percent CV values were calculated for each metabolite across selected samples. Data matrices were reformatted and imported into the PLS_Toolbox version 7.0.1 (Eigenvector Research, Inc.), with which principal component analysis (PCA) was performed on autoscaled (mean centered and each variable scaled to unit variance) data.
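Percent CV and autoscaling are both simple per-metabolite calculations. The sketch below shows one way to compute them with the Python standard library. It assumes the sample standard deviation (n - 1 denominator) is used in both cases, which the text does not specify, and it stops before the PCA step itself. The values are toy numbers, not data from the study.

```python
from statistics import mean, stdev
from typing import List


def percent_cv(values: List[float]) -> float:
    """Percent coefficient of variation: 100 * SD / mean.

    ASSUMPTION: sample standard deviation (n - 1 denominator).
    """
    return 100.0 * stdev(values) / mean(values)


def autoscale(values: List[float]) -> List[float]:
    """Mean-center one variable and scale it to unit variance."""
    center = mean(values)
    spread = stdev(values)
    return [(v - center) / spread for v in values]


# Toy relative abundances of one metabolite across three samples.
metabolite = [90.0, 100.0, 110.0]

print(f"percent CV: {percent_cv(metabolite):.1f}")
print(f"autoscaled: {autoscale(metabolite)}")
```

Autoscaling gives every metabolite the same weight in the PCA regardless of its absolute abundance, which matters when metabolite levels span several orders of magnitude.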

    Experimental Design

    For both microarray and metabolomics experiments, 11 maize varieties were used, including (1) seven inbred lines: PHG9B (high oil), H31 (low oil), PH2WBS (high protein), PH2WBR (low protein), PH0GP (median starch), PH14T (median starch), and 658 (low starch); and (2) four commodity hybrid lines: 38B85, 37Y12, 34A15, and 34P88. These lines were chosen as a partial representation of the range of U.S. cultivated maize diversity, and include lines differing in protein, oil, and starch content. Three types of tissues, V5 leaf, 25 DAP immature kernel, and mature kernel, were used for metabolomic experiments. Because mature kernels are dormant and have very limited gene expression,29,30 only the V5 leaf and the 25 DAP immature kernels were used for microarray experiments. Due to limited tissue availability for some varieties, some microarray or metabolomic experiments were not conducted.

    For microarray technical repeat controls, eight independent RNA samples were isolated from either V5 leaves or 25 DAP immature kernels of a single plant from the high oil variety PHG9B and the low oil line H31, and used for eight different microarray hybridizations. The signal differences among these hybridizations were considered to be the technical variation of the microarray methodology. Similarly, eight independent metabolite extractions were made from bulk collections of V5 leaves or 25 DAP immature kernels of PHG9B and H31, and from bulk mature kernels of PH2WBS (high protein line) and PHG9B. They were used for independent GC/MS analyses and technical variation assessment. The multiple sample preparation and testing steps were used to evaluate the reproducibility of both technical methods. Sample variations were also evaluated by comparing data from different individual plants and different pooled plants.

    qRT-PCR

    Genes, primers, and probes are listed in Supplementary Table 1. Primers and TaqMan probes were designed with Primer Express 3.0.1 (Applied Biosystems, Carlsbad, CA), tested for specificity by BLAST search against the NCBI public sequence database, and purchased from Integrated DNA Technologies, Inc. (Coralville, IA). First-strand cDNA was synthesized from the same mRNA samples used for microarray. Fifteen pooled samples from either V5 leaf or 25 DAP kernel were chosen based on sample availability. For each RT reaction, 240 ng mRNA was used as a template in a total volume of 80 µL following the manufacturer's instructions for the SuperScript VILO cDNA synthesis kit (Invitrogen, Carlsbad, CA). The qPCR reactions were carried out in 384-well plates in a ViiA 7 real-time PCR machine (Applied Biosystems, Carlsbad, CA) using the TaqMan Gene Expression Master Mix (Applied Biosystems). The qPCR program was 50 °C for 2 min and 95 °C for 10 min, followed by 40 cycles of 95 °C for 15 sec and 60 °C for 1 min. Each reaction contained 200 nM of each primer, 100 nM probe, and 2 µL of the RT reaction solution as template in a final volume of 20 µL. Every reaction was repeated 3 times. The ViiA 7 Software V1.2 was used to record and process the data. The Rn (Normalized Reporter) values of each reaction for every cycle were exported and used to calculate the single-well qPCR efficiencies using the Real-Time PCR Miner program.31
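The template input and cycling time of each qPCR reaction follow directly from the quantities listed in the qRT-PCR method. The sketch below works through that arithmetic. It assumes the RT reaction is used undiluted, so that 2 µL carries the cDNA made from a proportional share of the input mRNA, and it counts hold times only, ignoring instrument ramping. The variable names are hypothetical.

```python
RT_MRNA_NG = 240.0       # mRNA input per RT reaction
RT_VOLUME_UL = 80.0      # total RT reaction volume
TEMPLATE_UL = 2.0        # RT reaction added to each qPCR well
QPCR_VOLUME_UL = 20.0    # final qPCR reaction volume
CYCLES = 40

# mRNA-equivalent template per well (ASSUMES the RT product is used undiluted).
mrna_per_ul = RT_MRNA_NG / RT_VOLUME_UL
template_equivalent_ng = mrna_per_ul * TEMPLATE_UL
template_fraction_of_well = TEMPLATE_UL / QPCR_VOLUME_UL

# Program length from hold times only: 2 min at 50 °C, 10 min at 95 °C,
# then 40 cycles of 15 s at 95 °C plus 1 min at 60 °C.
seconds_per_cycle = 15 + 60
program_minutes = 2 + 10 + CYCLES * seconds_per_cycle / 60

print(f"template per well: {template_equivalent_ng} ng mRNA equivalent")
print(f"template share of well volume: {template_fraction_of_well:.0%}")
print(f"program hold time: {program_minutes} min")
```

Under these assumptions each 20 µL well receives the cDNA derived from 6 ng of mRNA, and the thermal program takes just over an hour before ramp times are added.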

    RESULTS

    Microarray Reproducibility

    To compare the reproducibility of expression levels of the same genes on repeated microarrays, the data were analyzed using correlation statistics. The coefficient of variation (CV) for each set of repeats was calculated and compared as an indication of reproducibility.32 Mean CVs of gene transcripts for technical repeats were 0.25, 0.23, 0.33, and 0.23 for 25 DAP PHG9B, 25 DAP H31, V5 leaf PHG9B, and V5 leaf H31 samples, respectively. These values are relatively low compared to the microarray literature,33-35 indicating good technical reproducibility. Expression of most genes on the microarrays had low CV values, with 82.4%, 92.1%, 90.4%, and 91.6% of genes from V5 leaf PHG9B, V5 leaf H31, 25 DAP PHG9B, and 25 DAP H31 microarrays, respectively, exhibiting CV values below 0.5 (Figure 1A). In addition, these CV values showed a log-normal distribution centered at 0.1 (Figure 1B), indicating good reproducibility. However, the variability of the 8 technical repeat microarrays for PHG9B V5 leaves was a little higher than that of the other technical repeat microarrays (Figure 1A). Alternatively, the CV values were log transformed and plotted against the log transformed mean values. Polynomial curve fitting showed, as expected, that CV values decreased as the mean intensities increased (Figure 1C). The inflection points calculated from the polynomial curves showed that the technical repeat microarrays for PHG9B V5 leaves have higher background noise (Figure 1D), similar to that shown by the CV distributions (Figure 1A).
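The per-gene CV summary used here can be sketched in a few lines. The example below computes the CV of each gene across replicate arrays, the share of genes with CV below 0.5, and the log10-transformed (mean, CV) pairs that would feed a curve fit. It uses the sample standard deviation and toy intensities; the gene names and values are hypothetical and the polynomial fit itself is not reproduced.

```python
import math
from statistics import mean, stdev
from typing import Dict, List, Tuple


def gene_cv(intensities: List[float]) -> float:
    """Coefficient of variation (sample SD / mean) across replicate arrays."""
    return stdev(intensities) / mean(intensities)


def fraction_below(cvs: List[float], threshold: float = 0.5) -> float:
    """Share of genes whose CV falls below the threshold."""
    return sum(cv < threshold for cv in cvs) / len(cvs)


# Toy intensities for three genes on four replicate arrays, NOT study data.
replicates: Dict[str, List[float]] = {
    "gene_a": [1000.0, 1100.0, 900.0, 1000.0],
    "gene_b": [20.0, 60.0, 10.0, 30.0],
    "gene_c": [500.0, 520.0, 480.0, 500.0],
}

cvs = {gene: gene_cv(values) for gene, values in replicates.items()}
share_low_cv = fraction_below(list(cvs.values()))

# log10(mean) versus log10(CV), the coordinates used for the curve fit.
log_points: Dict[str, Tuple[float, float]] = {
    gene: (math.log10(mean(values)), math.log10(cvs[gene]))
    for gene, values in replicates.items()
}

for gene in replicates:
    print(gene, round(cvs[gene], 3), log_points[gene])
print("share of genes with CV < 0.5:", round(share_low_cv, 3))
```

In this toy set the low-intensity gene has the highest CV, which mirrors the trend described for Figure 1C.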

    We further investigated the reproducibility of microarray results by a linear regression model correlating data between any pair of microarrays within each group. Four groups were analyzed this way for both V5 leaf and 25 DAP kernel samples, including the 8 technical arrays for H31, the 6 individual biological repeats for H31, the 8 technical arrays for PHG9B, and the 9 individual biological repeats for PHG9B. The box plots represent the distributions of R² values of all pair-wise comparisons of linear regression modeling (Figure 2). The technical replicates had consistently higher correlations than the biological replicates (Figure 2). We conclude that gene expression variation from microarrays resulted primarily from maize variety differences rather than from plant-to-plant differences, pooled sample-to-sample differences, or technical variations, indicating that the method is sensitive enough to detect biological variation among individual or pooled plant samples.
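The pair-wise comparison within a group can be sketched as follows. For a simple linear regression of one array on another, R² equals the squared Pearson correlation, so the example computes that for every pair of arrays in a group. A group of 8 arrays gives 28 pairs, 6 arrays give 15, and 9 arrays give 36. The array names and intensity values are toy numbers, not data from the study.

```python
from itertools import combinations
from statistics import mean, median
from typing import Dict, List


def r_squared(x: List[float], y: List[float]) -> float:
    """R² of a simple linear regression of y on x (squared Pearson r)."""
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def pairwise_r_squared(arrays: Dict[str, List[float]]) -> List[float]:
    """R² for every unordered pair of arrays within one group."""
    return [
        r_squared(arrays[first], arrays[second])
        for first, second in combinations(sorted(arrays), 2)
    ]


# Toy log-intensities for five genes on three arrays per group.
technical = {
    "t1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "t2": [1.1, 2.0, 2.9, 4.1, 5.0],
    "t3": [0.9, 2.1, 3.0, 3.9, 5.1],
}
biological = {
    "b1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b2": [1.5, 1.8, 3.6, 3.5, 5.2],
    "b3": [0.6, 2.5, 2.7, 4.6, 4.4],
}

technical_r2 = pairwise_r_squared(technical)
biological_r2 = pairwise_r_squared(biological)
print("median R² technical: ", round(median(technical_r2), 4))
print("median R² biological:", round(median(biological_r2), 4))
```

The lists returned by `pairwise_r_squared` are the kind of values a box plot per group would summarize.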

    Correlation between gene expression of individual and pooled samples

    Next, CV values for the microarrays from the 6 varieties analyzed both as individual samples (I) and pooled samples (P) were calculated. For each variety, 6 or 9 individual plants and 2 or 3 pools of samples (3 plants per pool) were analyzed. The pools were created by combining equal amounts of RNA extracts from individual plants. Overall, 73.7% (34A15_I) to 93.3% (38B85_P) of the genes had CV values below 0.5 from V5 leaf samples, and 73.8% (38B85_I) to 96.8% (PHG9B_P) of genes had CV values below 0.5 from 25 DAP kernel samples (Supplementary Table 2), representing very good experimental reproducibility. The distributions of the CV values from both 25 DAP kernel and V5 leaf samples are shown in Supplementary Figure 1. When the overall CV distribution patterns of individual or pooled samples were compared, microarrays for 25 DAP kernel samples showed larger CV differences than those for V5 leaf samples. Additionally, log10 (CV) vs. log10 (mean) plots were generated for all the microarrays to reveal the relationship between CV and mean intensities (data not shown). The inflection point values were very similar to what was shown by the CV distribution patterns (Supplementary Figure 1).

    The Pearson correlation coefficient was calculated by comparing mean gene expression between individual samples and pooled samples for each variety and tissue type. The R values were between 0.9743 and 0.9959 (Table 1), indicating that the signals obtained from individual plants and pooled plants were highly correlated and similar. When samples from the same variety but different tissue types were compared, the Pearson correlation R values were between 0.234 and 0.317 (Table 1), indicating significant gene expression differences between leaves and kernels, as expected. For every variety-tissue combination, the pooled samples showed a smaller mean CV value than the corresponding individual samples (Supplementary Table 3). Therefore, the plant-to-plant variation detected within the same variety was reduced by pooling 3 plants into a single sample, essentially transforming plant-to-plant variation into sample-to-sample variation. In addition, the distribution patterns of CV values from I and P samples were very similar (Supplementary Figure 2), indicating that our pooling strategy was efficient in capturing the variations existing among maize varieties while realizing a cost savings.
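Why pooling lowers the CV can be illustrated with a simple idealized model. This is not the analysis performed in the study: it assumes that the plants in a pool contribute equally and vary independently, and that technical noise adds in quadrature to biological variation. Under those assumptions the biological component of the CV shrinks by the square root of the pool size. The function name and input values are hypothetical.

```python
import math


def expected_pooled_cv(cv_plant: float,
                       plants_per_pool: int,
                       cv_technical: float = 0.0) -> float:
    """Expected CV among pools under an idealized independence model.

    ASSUMPTIONS: equal contribution from each plant, independent
    plant-to-plant variation, technical noise added in quadrature.
    """
    if plants_per_pool < 1:
        raise ValueError("a pool needs at least one plant")
    biological_variance = cv_plant ** 2 / plants_per_pool
    return math.sqrt(biological_variance + cv_technical ** 2)


# Hypothetical plant-to-plant CV of 0.30 with pools of three plants.
individual_cv = 0.30
pooled_cv = expected_pooled_cv(individual_cv, plants_per_pool=3)
pooled_cv_with_noise = expected_pooled_cv(individual_cv, 3, cv_technical=0.10)

print(f"individual plants:        {individual_cv:.3f}")
print(f"pools of 3, no tech noise: {pooled_cv:.3f}")
print(f"pools of 3, tech CV 0.10:  {pooled_cv_with_noise:.3f}")
```

The model also shows the limit of pooling: technical variation is not averaged out, so the pooled CV cannot fall below the technical CV however many plants are combined.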

    Gene expression differences between varieties 309

    Page 12 of 44

    ACS Paragon Plus Environment

    Journal of Agricultural and Food Chemistry

  • 13

    To evaluate gene expression variation between maize varieties, mean microarray spot intensities 310

    from 6 or 9 individual samples (I) and 2 or 3 pooled samples, with 3 individuals per each pool 311

    (P) were determined and used for CV calculations comparing the 6 varieties that had both I and P 312

    samples. 313

    When the CV distributions of individual samples representing variety-to-variety variation (Supplementary Figure 2 and Supplementary Table 4) were compared to the CV distributions representing plant-to-plant variation within a given variety (Supplementary Figure 1 and Supplementary Table 2), we found that the former were larger. For V5 leaves, 65.4% of genes showed a CV value less than 0.5 when comparing different maize varieties (Supplementary Figure 2, Supplementary Table 4), but 73.7% (34A15) to 93.1% (38B85) of genes had a CV less than 0.5 when comparing individual plants within a given variety (Supplementary Figure 1, Supplementary Table 2). For 25 DAP kernels, 68% of genes showed a CV value less than 0.5 when comparing different maize varieties (Supplementary Figure 2, Supplementary Table 4), but 73.8% (38B85) to 85.1% (H31) of genes had a CV less than 0.5 when comparing individual plants within a given variety (Supplementary Figure 1, Supplementary Table 2). These results indicate that variation among different maize varieties is greater than variation among individual plants of the same variety, likely due to genetic differences and/or genetic and environmental interactions affecting gene expression among varieties.
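
The CV comparisons above can be sketched in a few lines. This is only an illustration under stated assumptions: the gene names and intensities are hypothetical, and the CV is taken as the sample standard deviation divided by the mean, which the text does not specify.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumed CV definition)."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(values) / m


def fraction_below(cvs, threshold=0.5):
    """Fraction of CV values strictly below the threshold."""
    cvs = list(cvs)
    return sum(cv < threshold for cv in cvs) / len(cvs)


# Hypothetical spot intensities: one value per plant (or per variety mean).
intensities = {
    "gene_a": [100.0, 110.0, 90.0],
    "gene_b": [10.0, 50.0, 120.0],
    "gene_c": [200.0, 205.0, 195.0],
    "gene_d": [5.0, 6.0, 4.0],
}

cvs = {gene: coefficient_of_variation(v) for gene, v in intensities.items()}
share = fraction_below(cvs.values(), threshold=0.5)
print(f"{share:.1%} of genes have CV < 0.5")
```

Running the same two functions once on per-plant values within a variety and once on per-variety means gives the two percentages being compared in the text.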

    In addition, the variety-to-variety variation detected in 25 DAP kernels is similar to that among V5 leaf tissues based on their CV distributions (Supplementary Figure 2), indicating that gene expression variations among different maize varieties are similar between these two tissue types.

    Confirmation of Microarray Results by qRT-PCR

    To confirm the gene expression levels measured by the microarray experiments, two groups of 18 different genes were chosen for V5 leaf and 25 DAP kernel, respectively (Supplementary Table 1). The expression levels of these genes ranked at the 80th or 50th percentile across all microarrays. Expression of these genes was measured by qRT-PCR using the same RNA samples used for microarrays. Due to limited sample availability and possible polymorphisms among different maize varieties at primer annealing locations, we used the Real-Time PCR Miner program (31), which has been validated by many other groups (36-43), to monitor the single-well qRT-PCR efficiency. The dapA gene was used as a control for comparison between microarray and qRT-PCR data. Gene expression levels from qRT-PCR reactions were calculated relative to dapA expression and compared to levels detected on microarrays that were also quantified relative to dapA expression. The ratio of expression levels for each gene detected by these two techniques was log transformed for proper comparison (Figure 3).
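
As a rough illustration of the comparison described above, each platform's signal can be divided by the dapA signal measured on that platform, and the ratio of the two relative values log transformed. The numbers below are hypothetical, and base 2 is an assumption, since the text does not state the base of the logarithm.

```python
import math


def relative_to_reference(signal, reference_signal):
    """Express a signal relative to a reference gene (dapA in the text)."""
    if reference_signal <= 0:
        raise ValueError("reference signal must be positive")
    return signal / reference_signal


def log2_ratio(qpcr_relative, array_relative):
    """log2 of qRT-PCR over microarray relative expression.

    0 means agreement; +1 means qRT-PCR reads two-fold higher.
    """
    return math.log2(qpcr_relative / array_relative)


# Hypothetical linear-scale signals for one gene in one sample.
qpcr_rel = relative_to_reference(signal=8.0, reference_signal=2.0)
array_rel = relative_to_reference(signal=300.0, reference_signal=300.0)
ratio = log2_ratio(qpcr_rel, array_rel)
print(f"log2(qRT-PCR / microarray) = {ratio:.2f}")
```

On this scale, a 32-fold higher qRT-PCR reading corresponds to a value of 5, so large platform disagreements stay readable on the same axis as small ones.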

    For V5 leaf tissue samples, two genes (pco602011 and pco603626) were not amplified from any of the 15 templates by qRT-PCR and were therefore not included in the analysis. Two genes, pco627753 and pco643043 (genes 2 and 11 in Figure 3A, respectively), had expression detected only in some of the samples, and 4 genes, pco624384, pco521467, pco652567, and pco658406 (genes 13, 14, 15, and 16 in Figure 3A, respectively), showed higher expression (ca. 2-32 fold higher) relative to microarray in all 15 samples. For the remaining genes tested, the expression levels were close to those measured by the microarrays, although there were some variety-specific expression differences between the two techniques (Figure 3A). For 25 DAP kernel samples, one gene (pco621453) did not show any amplification by qRT-PCR and was not included in further analysis. Two genes, pco653893 and pco598383 (genes 14 and 15 in Figure 3B, respectively), were amplified from 11 and 8 samples out of 15 varieties, respectively. Two genes, pco601999 and pco632057 (genes 16 and 17 in Figure 3B, respectively), had expression levels very different from microarray across all varieties (Figure 3B). Gene 16 showed 4-60 times lower expression and gene 17 showed 8-60 times higher expression when detected by qRT-PCR compared to microarray results. The rest of the genes tested in 25 DAP kernel samples showed good consistency with microarray data (Figure 3B). A few qRT-PCR expression data points were different from the microarray data, but only for one or two maize varieties.

    Metabolomic Data Analysis

    The three processed data matrices contained 3,891 metabolomic signatures or fingerprints (m/z value_RI combinations) for V5 leaves, 4,300 for 25 DAP immature kernels, and 3,891 for mature kernels. Of these, 87-103 metabolites were successfully identified in the tissues examined. The substantial reductions relative to the raw data set are due to (1) eliminating the inherent redundancy in metabolomics signatures, wherein each metabolite can be represented by multiple m/z values in its electron impact mass spectrum, and (2) ignoring metabolites the identity of which could not be unambiguously established.

    To evaluate technical variation, including sampling, analytical, and data analysis variability, 8 technical repeats were produced from V5 leaves of PHG9B and H31, 25 DAP kernels from PHG9B and H31, and mature kernels from PH2WBS and PHG9B. For each tissue-variety combination, a single bulk tissue sample was aliquoted into 8 extraction tubes, producing 8 metabolomic samples. Mean CV values calculated from technical repeat metabolite relative levels were between 0.33 and 0.54, and median values were between 0.27 and 0.46, indicating good reproducibility in spite of some outliers in the upper ranges (Supplementary Figure 3). The majority of metabolites detected showed CV values less than 0.6, with 76.7% and 87.4% of metabolites from V5 leaves of PHG9B and H31, 76.1% and 80.2% of metabolites from 25 DAP immature kernels of PHG9B and H31, and 77.6% and 60.9% of metabolites from mature seeds of PH2WBS and PHG9B, respectively (Supplementary Figure 3). We also found that pooled samples had lower variances compared to the individual samples (Figure 4, Supplementary Table 5), similar to what was observed from the transcript data. In particular, metabolomics for mature seed samples showed much less variation than for immature seeds (Supplementary Table 5), probably due to a less complex metabolome and the terminal differentiation state of mature kernels.

    Mean metabolite levels detected from individual and pooled samples of the same tissue-variety combination are highly correlated, with Pearson correlation R values all close to 1 (Table 2). For PHG9B and H31, the high correlations are observed despite the fact that the individual samples were from 9 different plants, compared to the technical variation tests where metabolites were extracted from 8 aliquots of the same plant sample. These results also demonstrate consistent derivatization, gas chromatography, MS data acquisition, and data processing.
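
The correlation between individual and pooled means can be illustrated with a small, self-contained Pearson R calculation. The metabolite levels below are hypothetical and serve only to show the computation, not to reproduce Table 2.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical mean relative levels of five metabolites in one
# tissue-variety combination, from individual and from pooled samples.
individual_means = [12.0, 40.0, 3.5, 88.0, 19.0]
pooled_means = [11.0, 43.0, 3.0, 85.0, 21.0]

r = pearson_r(individual_means, pooled_means)
print(f"Pearson R = {r:.4f}")
```

Because metabolite levels span orders of magnitude, a few abundant metabolites can dominate R on the linear scale; repeating the calculation on log-transformed levels is a common cross-check.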

    Tissue or Variety Separation Based on Metabolomics

    When the metabolites detected from the three different tissues were compared, PCA clearly indicated tissue separation (Figure 5), reflecting tissue specificity of metabolic processes, as expected. However, PCA revealed variety specificity for only certain variety-tissue combinations. For example, for V5 leaf tissues, there was clear separation of PH2WBR, PH14T, and H31 from other varieties based on PC1 and PC3 (Figure 6A). Likewise, PH2WBS and PH2WBR in 25 DAP kernels were readily distinguished from other varieties with PC1 and PC4 (Figure 6E). For mature kernels, PH2WBS and PHG9B showed good separation from other varieties based on PC2 and PC3 (Figure 6I).

    By and large, the tissue and variety classifications observed with individual plants were also evident in pooled plant samples, although sometimes with different principal component projections (Figures 5 and 6A, J). This result suggests that pooling did not degrade the discriminating power afforded by individual samples. Interestingly, the combined percent variance included in the PCA scores plots was slightly higher for pooled samples compared to that generated for analogous individual samples, suggesting that pooling removed some uninformative signal.
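
To make the terms scores, loadings, and percent variance concrete, the sketch below extracts the first principal component from a tiny, hypothetical data matrix using power iteration on the covariance matrix. It is not the software or scaling used in the study, which this section does not describe; the sample names, values, and the choice of unscaled covariance are all assumptions.

```python
import math


def center_columns(rows):
    """Subtract each column (metabolite) mean from the data matrix."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def covariance_matrix(centered):
    """Sample covariance matrix of already-centered data."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def first_principal_component(cov, iterations=200):
    """Power iteration: returns (unit loading vector, eigenvalue)."""
    p = len(cov)
    v = [1.0 / (i + 1) for i in range(p)]  # arbitrary non-zero start
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    cv = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
    return v, sum(a * b for a, b in zip(v, cv))


# Hypothetical relative levels: rows are samples, columns are metabolites.
samples = {
    "leaf_1": [10.0, 1.0, 5.0],
    "leaf_2": [11.0, 1.2, 5.1],
    "leaf_3": [9.5, 0.9, 4.9],
    "kernel_1": [2.0, 8.0, 5.0],
    "kernel_2": [2.5, 8.5, 5.2],
    "kernel_3": [1.8, 7.6, 4.8],
}

centered = center_columns(list(samples.values()))
cov = covariance_matrix(centered)
loadings, eigenvalue = first_principal_component(cov)
scores = {name: sum(x * l for x, l in zip(row, loadings))
          for name, row in zip(samples, centered)}
explained = eigenvalue / sum(cov[i][i] for i in range(len(cov)))
print(f"PC1 explains {explained:.1%} of the variance")
```

The sign of a principal component is arbitrary, so only the separation of the two groups along PC1 is meaningful, not which group is positive. Metabolites with the largest absolute loadings are the ones driving that separation.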

    Loadings associated with examples of the above variety classifications were selected graphically (Figures 6C, D; G, H; and K, L; in purple) and are listed in Supplementary Table 6. The very significant increases in the amount of amino acids in developing kernels, including glutamic acid, glutamine, histidine, leucine, lysine, pyroglutamic acid (which could be derived from glutamine during sample preparation), and tryptophan, are expected for PH2WBS, a genotype with elevated grain protein. Explanations for the genotype-specific differences (loadings) in the other tissues are less obvious. For the three examples shown, loadings from pooled plants were very similar to those from individual plants. Thus, pooling generated similar PCA scores and loadings, maintaining the ability to classify sample groups (varieties) as well as to identify the prominent metabolites underlying those classifications.

    In this GC/MS metabolomic study, we also found that some metabolites were detected in only one or two tissue types. Among individual plant samples, there were 19 metabolites uniquely detected in V5 leaves, 2 only in immature kernels, and 3 only in mature kernels (Table 3). It is expected that the metabolome of leaves is more divergent than that of immature or mature kernels. Some of these metabolites may be present but not detected in other tissues, given our conservative limit of peak detection. Moreover, immature and mature kernels contain more polysaccharides by weight than leaves. Since approximately 3 mg dry weight samples were used for all three tissue types, it is expected that the concentration of many small molecule metabolites will be greater in leaf than in kernel samples. This could result in apparent tissue specificity, as seen in Table 3.

    Range and Variations of Metabolite Abundances

    We observed large ranges in relative levels for many metabolites across all varieties. The ratio between the maximum value and the minimum value detected for a metabolite in individual samples ranged from 1.8 to 1,663 for V5 leaf tissues, from 3.3 to 16,815 for immature kernels, and from 2.7 to 585 for mature kernels (Supplementary Table 7). However, when samples were pooled, the range narrowed to 1.4 to 167 for V5 leaves, 2.1 to 4,828 for immature kernels, and 1.6 to 86 for mature kernels (Supplementary Table 7). Similarly, when the mean values within each variety for each metabolite from either individual samples or pooled samples were compared across all varieties with box plots, the pooled samples showed a much narrower distribution compared to the individual samples (data not shown). This observation indicated that the biological variation among individual plants combined with variety variation was very large. However, our pooling strategy effectively decreased the biological variation between plants. The actual relative levels are specific to the current dataset and should not be compared to other datasets, unless they were processed (aligned and scaled) together.
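
The narrowing of the max/min ratio on pooling can be illustrated numerically. The sketch below assumes, as a simplification, that a pooled measurement equals the arithmetic mean of its three members; physical pooling also carries its own technical error, and the levels shown are hypothetical.

```python
def max_min_ratio(levels):
    """Ratio of the largest to the smallest relative level."""
    lowest = min(levels)
    if lowest <= 0:
        raise ValueError("ratio is undefined for non-positive levels")
    return max(levels) / lowest


def pool_means(levels, pool_size=3):
    """Average consecutive groups of pool_size values (idealized pooling)."""
    if len(levels) % pool_size:
        raise ValueError("number of samples must be a multiple of pool_size")
    return [sum(levels[i:i + pool_size]) / pool_size
            for i in range(0, len(levels), pool_size)]


# Hypothetical relative levels of one metabolite in 9 individual plants.
individual = [1.0, 40.0, 4.0, 2.0, 10.0, 3.0, 6.0, 8.0, 1.0]
pooled = pool_means(individual, pool_size=3)

ratio_individual = max_min_ratio(individual)
ratio_pooled = max_min_ratio(pooled)
print(ratio_individual, ratio_pooled)
```

A single extreme plant inflates the individual-sample ratio, while averaging within pools pulls both the maximum and the minimum toward the variety mean.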

    Multiple derivative forms for certain metabolites are characteristic of GC/MS-based metabolomics, as illustrated by asparagine in Supplementary Table 7. Asparagine with four TMS groups (one attached to the carboxyl and three to the amines) was found in mature kernels while asparagine with just three TMS moieties (one attached to the carboxyl and two to the amines) was specific to V5 leaves. This dichotomy might be explained by differential trimethylsilylation due to the different sample matrices (28). Consequently, comparing metabolomes across tissue types or species should be undertaken with caution.

    We also compared the levels of metabolites in all samples across all varieties for a given tissue type, and identified metabolites that are quite stable as well as those that are highly variable among varieties. There were 21 metabolites from V5 leaves, 20 metabolites from 25 DAP immature kernels, and 11 metabolites from the mature kernels that showed a CV value less than 0.4 across all varieties (Table 4), representing tissue-specific stable metabolomes. Among them, sucrose and myo-inositol were identified from all three tissues, and another ten metabolites appeared in two tissue types. On the other hand, there were 17 metabolites from the V5 leaves, 13 from the 25 DAP immature kernels, and 16 from the mature kernels that showed CV values larger than 1, indicating that these metabolites are highly variable among different maize varieties (Table 4). A partially derivatized form of glutamine seemed to be highly variable in all three tissue types, and another three metabolites were highly variable for two tissue types. The high variability of the partially derivatized form of glutamine may be due, at least in part, to inconsistent transformation to pyroglutamic acid, which was also highly variable in two of the tissues. This inconsistent transformation to pyroglutamic acid is a process known to be associated with trimethylsilylation.

    As with gene expression levels detected from microarrays, mean metabolite abundances from individual samples or pooled samples were calculated for each variety and used to calculate CVs among varieties. For all three tissue types, metabolite variations among different varieties detected from individual or pooled samples are well correlated. Linear regression R² values are 0.90, 0.92, and 0.82 for V5 leaves, 25 DAP developing kernels, and mature kernels, respectively. Furthermore, the CV distribution patterns are very similar between individual and pooled samples (Supplementary Figure 4). For V5 leaves, 43.7% of metabolites showed higher CVs in pooled samples compared to individual samples. In 25 DAP developing kernels and mature kernels, the corresponding values are 61.5% and 47.6%. This observation indicated that using pooled samples revealed variety-to-variety metabolomic variation similar to that using individual samples.
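
The two summary statistics used above, the regression R² between individual-based and pooled-based CVs and the share of metabolites whose CV is higher in pooled samples, can be sketched as follows. The CV values are hypothetical; for a simple linear regression with one predictor, R² equals the squared Pearson correlation, which is what the function computes.

```python
def r_squared(x, y):
    """R^2 of the least-squares line y = a + b*x (squared Pearson r)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def share_higher_in_pooled(cv_individual, cv_pooled):
    """Fraction of metabolites whose among-variety CV is larger in pools."""
    higher = sum(p > i for i, p in zip(cv_individual, cv_pooled))
    return higher / len(cv_individual)


# Hypothetical among-variety CVs for five metabolites.
cv_individual = [0.20, 0.35, 0.50, 0.80, 1.10]
cv_pooled = [0.22, 0.30, 0.55, 0.75, 1.20]

r2 = r_squared(cv_individual, cv_pooled)
share = share_higher_in_pooled(cv_individual, cv_pooled)
print(f"R^2 = {r2:.3f}, higher in pooled: {share:.0%}")
```

A share near 50% together with a high R² is the pattern expected when pooling changes per-metabolite CVs only by noise rather than shifting them systematically in one direction.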

    DISCUSSION

    Thorough evaluation of the applicability and limitations of the -omics technologies for food safety assessment is necessary before their acceptance for this purpose. Towards this end, we evaluated high-throughput gene expression and metabolomic technologies by characterizing the transcriptomes and metabolomes of several conventional maize varieties using alternative protocols. Our observations lead us to conclude that in applying these methods to regulatory issues, consideration should be given to natural variation in the maize transcriptome and to the high degree of variation in metabolite concentrations between plant varieties and between individuals of the same variety.

    Technical Variation

    To validate methods for both microarray and metabolomics, selected samples were analyzed multiple times to serve as technical repeats. The CV distribution for the technical repeat microarrays showed small variations between different microarray runs for the same sample (Figure 1), validating the method and technical consistency. When compared to CVs detected from individual plant samples, the technical CVs are much smaller (Figure 1A, Supplementary Table 2), indicating that our microarray technology is consistent and sensitive enough to detect biological variations outside of technical variations. The data correlation analysis among repeat arrays comparing individual samples and technical repeat samples confirmed this conclusion (Figure 2). Technical plus biological CVs detected from metabolomics, however, were much larger compared to microarrays (Supplementary Figure 3). This increase is not unexpected since the levels of many metabolites are dynamically affected by microenvironment. Furthermore, different metabolites have very different physical and biochemical properties as well as concentration ranges, and therefore can be affected by the extraction and derivatization methods employed. Nevertheless, biological variability was found to be greater than analytical variability. The mean CVs observed are similar to those reported in the plant metabolomic literature.

    Sample Pooling

    Profiling techniques are a powerful tool for gene discovery research as long as appropriate statistical tools are used to analyze the data. Pooling of mRNA samples from different individuals of the same variety for microarray hybridizations has the following advantages: it 1) controls cost, 2) generates data when the amounts of individual samples are insufficient, and 3) decreases variation between individuals. A design of multiple pools with multiple individual samples in each pool was established as a compromise (44-47). Thus, the ability to distinguish biological subject-to-subject variation from experimental technical variation is combined with the efficiency of a pooling strategy designed to reduce overall variance. The larger the individual-to-individual variability is, as compared to technical variability, the greater the reduction in variability achieved by pooling samples (44,45).
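
The dependence of the pooling benefit on the balance between biological and technical variance can be shown with a simple additive model. The sketch assumes that a pooled measurement is the mean of its members' true levels plus one independent technical error, so its variance is the biological variance divided by the pool size plus the technical variance. The variance values are hypothetical, and the function names are illustrative only.

```python
def per_sample_variance(var_biological, var_technical, pool_size=1):
    """Variance of one measurement of a pool of pool_size individuals.

    Assumed model: measurement = mean of members + technical error.
    """
    return var_biological / pool_size + var_technical


def pooling_variance_ratio(var_biological, var_technical, pool_size):
    """Pooled-sample variance relative to an individual-sample variance."""
    pooled = per_sample_variance(var_biological, var_technical, pool_size)
    single = per_sample_variance(var_biological, var_technical, 1)
    return pooled / single


# Biological variance dominates: pooling 3 plants helps a lot.
bio_dominant = pooling_variance_ratio(0.09, 0.01, pool_size=3)

# Technical variance dominates: pooling 3 plants helps very little.
tech_dominant = pooling_variance_ratio(0.01, 0.09, pool_size=3)

print(f"{bio_dominant:.2f} vs {tech_dominant:.2f}")
```

Under this model, pooling never removes the technical term, which is why several pools per variety are still needed to estimate sample-to-sample variation.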


    We designed the microarray and metabolomics experiments to include both individual samples and sample pools. Gene expression levels detected from microarrays and metabolite abundances both showed very good correlation between individual and pooled samples within the same tissue type and variety (Tables 1, 2). Using pooled samples lowered sample-to-sample variation, resulting in lower CVs (Figure 4, Supplementary Figure 1, Supplementary Tables 2, 3, 5, 6). Interestingly, pooling microarray samples reduced the CVs more dramatically for 25 DAP samples compared to V5 leaf samples (Supplementary Figure 1C, D), presumably due to the higher transcriptome variation among 25 DAP individual samples compared to V5 leaf individual samples.


    The CVs calculated from the per-variety means of either individual plants or pooled samples represent the variety-to-variety variations. The CVs representing variety-to-variety variations were similar when obtained from either individual plants or pooled samples for both microarrays and metabolomics (Supplementary Figures 2, 4). Furthermore, the distribution patterns of variety CVs were similar for both microarrays and metabolomics. A slight increase of variation in pooled samples compared with individual samples from microarrays was detected (Supplementary Figure 2 and Supplementary Table 3), presumably due to the smaller number of pooled samples.

    Pooling plants prior to analysis also did not adversely affect the ability to classify tissues or varieties, or identify discriminating metabolites by PCA (Figures 5, 6). In fact, pooling appeared to enhance discriminating power, presumably by eliminating some noise from the datasets. Overall, our pooling strategy of three sample pools of three is a cost-saving design that does not sacrifice analytical power.

    qRT-PCR and Microarray

    The use of microarray profiling for comparative assessment of biotech crops requires a gene expression sequence database for probe design, gene annotation, and expression level interpretation. For many plant species, genomes or transcriptomes have not been completely sequenced except for a few model genotypes. The maize genome has an especially high level of DNA sequence polymorphisms, approximately an order of magnitude higher than that in humans (48-50). This high level of genotypic variation in maize introduces challenges for gene expression profiling by microarray or PCR-based technologies, since experimental designs are based on knowledge obtained from just one or two varieties. As most of the genomic sequence and transcriptome for the varieties used in this study are not available, microarray hybridization efficiency is expected to vary between varieties. In the microarray study and in qRT-PCR, the primers and probes were designed using gene sequences of the B73 reference genome. Consequently, we observed substantial variation in single-well qRT-PCR efficiencies for the amplification of the same gene from different maize variety samples (data not shown). This resulted in some inconsistency in expression values detected by qRT-PCR and microarrays across different varieties (Figure 3). For some genes, expression values assayed by qRT-PCR were very different from the corresponding microarray expression levels (Figure 3). This observation raises concerns about the validity of probe homology-dependent methodologies in highly diverse species. In some cases, very large variation in measured expression levels between varieties may be due to presence-absence variation (51). For future studies, caution should be exercised when using microarray technology under similar circumstances. When comparing transgenic and non-transgenic varieties, pairs of lines should be used that are isogenic except for the presence of transgenes.

    Metabolomics

    The physiological concentration range of metabolites is very broad (Supplementary Table 7) (52,53). The lower technical variation for microarrays compared to metabolomics (Figure 1, Supplementary Figure 3) can be partially explained by quantile normalization of microarray data, which helps reduce CV. The CV ranges observed in our study are nevertheless comparable to those seen by others using different systems (54-57).

    The levels of many metabolites measured by metabolomics are extremely sensitive not only to the experimental procedures and instrument type used, but also to the environment where the samples are collected. Nevertheless, large changes in the amount of many metabolites within a plant rarely make significant overall contributions to the nutritional composition or raise safety concerns (12,53,58-60). Genetic background strongly affects metabolite levels (60,61), usually more than transgene insertions (16,20,63). Per sample cost for metabolomics is much lower compared to microarrays, allowing more sample replicates, increasing statistical power and lowering technical variation, while retaining true variation in physiological metabolite levels.

    GC/MS-based high-throughput metabolomics requires a uniform extraction and sample processing protocol for hundreds of metabolites differing in chemical properties and in vivo concentrations, which leads to suboptimal analytical conditions for many metabolites. Most metabolomic techniques lack sufficient analytical breadth to accurately measure hundreds of metabolites with very diverse chemical properties (64-66). Analytical compromises must be made to achieve high throughput and high metabolome coverage, rendering metabolomic data fundamentally different from targeted analysis of specific analytes. Even augmented with LC/MS and CE/MS, metabolomics does not cover all of the compounds presumed to be present in maize leaves or kernels. Also, metabolomics results include a large number of unidentified metabolites that currently cannot be mapped to a biochemical pathway. Thus, a traditional metabolic pathway-centric evaluation of metabolomic data for safety assessment is not conceptually appropriate. The lack of knowledge of metabolic pathways and the limited availability of reference standards and databases also have restricted the use of metabolomic technology for tasks best served by traditional targeted analytical methods. Therefore, it might be preferable to combine non-targeted methods with multivariate tools such as principal component analysis and hierarchical clustering to visualize sample relationships, rather than to focus on individual metabolite tolerance levels (62).

    Although our metabolomic study identified metabolites present at significantly different levels in different maize varieties, the biological significance of these differences should be interpreted with caution (16,67). We reported relative metabolite abundances rather than absolute abundances; therefore, only metabolomic data generated using the same experimental procedures, detection methodologies, and internal controls should be compared to this data set directly. This consideration is additional to significant biological variability. As recommended by Codex Alimentarius (68), the statistical significance of any observed differences should be assessed in the context of the range of natural variations for that parameter to determine its biological significance. Our study strongly supports this recommendation.


    ABBREVIATIONS USED

    GMO - genetically modified organism

    qRT-PCR - quantitative reverse transcription PCR

    OECD - Organization for Economic Co-operation and Development

    GM - genetically modified

    ILSI - International Life Sciences Institute

    mRNA - messenger RNA

    RNA - ribonucleic acid

    GC/MS - gas chromatography / mass spectrometry

    DAP - days after pollination

    DNA - deoxyribonucleic acid

    cRNA - complementary RNA

    MSTFA - N-Methyl-N-(trimethylsilyl) trifluoroacetamide

    PFTBA - perfluorotributylamine

    CV - coefficient of variation

    PCA - principal component analysis

    cDNA - complementary DNA

    RT - reverse transcription

    NCBI - National Center for Biotechnology Information

    qPCR - quantitative PCR

    LC/MS - liquid chromatography / mass spectrometry

    CE/MS - capillary electrophoresis / mass spectrometry

    TMS - trimethylsilyl

    ACKNOWLEDGEMENTS

    The authors express appreciation to the Wilmington Regulatory Science team for assistance in

    tissue generation; John Nau for carrying out the microarray experiments and data processing;

    Teresa Harp for carrying out the metabolomics experiments; Xiaoxiao Kong and Bonnie Hong

    for assistance in data analysis; Antoni Rafalski for assistance in the preparation of the

    manuscript; and Antoni Rafalski, Stan Luck, and Mary Locke for critical review of the

    manuscript.


    SUPPORTING INFORMATION AVAILABLE

    Supplementary Figure 1. CV distribution analysis of microarray data.

    Supplementary Figure 2. Percent of genes with CV values at specific ranges.

    Supplementary Figure 3. CV distribution of metabolite profiling technical repeats derived from

    different sample groups.

    Supplementary Figure 4. CVs among varieties comparing average metabolite levels.

    Supplementary Table 1. Primers and probes used for qRT-PCR reactions.

    Supplementary Table 2. Percentage of genes from microarrays with CV values within different

    ranges.

    Supplementary Table 3. CV summaries of gene expression from microarrays with individual (I)

    and pooled (P) samples.

    Supplementary Table 4. Percentage of genes from microarrays with CV values within different ranges.

    Supplementary Table 5. CV summaries of metabolite levels from I (individual) and P (pooled) samples.

    Supplementary Table 6. Metabolites that contributed most to the classification of PH2WBR in

    leaf samples and in PH2WBS in 25 DAP and mature kernel samples.

    Supplementary Table 7. Max/min ratio of relative levels of each metabolite.

    This material is available free of charge via the Internet at http://pubs.acs.org.

    REFERENCES

    1. Fedoroff, N.V.; Battisti, D.S.; Beachy, R.N.; Cooper, P.J.M.; Fischhoff, D.A.; Hodges,

    C.N.; Knauf, V.C.; Lobell, D.; Mazur, B.J.; Molden, D.; Reynolds, M.P.; Ronald, P.C.;

    Rosegrant, M.W.; Sanchez, P.A.; Vonshak, A.; Zhu, J-K. Radically rethinking agriculture

    for the 21st century. Science 2010, 327, 833-834.

    2. Godfray, H.C.J.; Beddington, J.R.; Crute, I.R.; Haddad, L.; Lawrence, D.; Muir, J.F.;

    Pretty, J.; Robinson, S.; Thomas, S.M.; Toulmin, C. Food security: the challenge of

    feeding 9 billion people. Science 2010, 327, 812-818.


    3. Park, J.R.; McFarlane, I.; Phipps, R.H.; Ceddia, G. The role of transgenic crops in

    sustainable development. Plant Biotech. J. 2010, 9, 2-21.

    4. McGloughlin, M.N. Modifying agricultural crops for improved nutrition. New

    Biotechnol. 2010, 27, 494-504.

    5. Domingo, J.; Bordonaba, J.G. A literature review on the safety assessment of genetically

    modified plants. Environ. Int. 2011, 37, 734-742.

    6. Kuiper, H.A.; Kleter, G.; Noteborn, H.P.; Kok, E.J. Assessment of food safety issues

    related to genetically modified foods. Plant J. 2001, 27, 503-528.

    7. Kok, E.J.; Kuiper, H.A. Comparative safety assessment for biotech crops. Trends

    Biotechnol. 2003, 21, 439-444.

    8. König, A.; Cockburn, A.; Crevel, R.W.R.; Debruyne, E.; Grafstroem, R.; Hammerling,

    U.; Kimber, I.; Knudsen, I.; Kuiper, H.A.; Peijnenburg, A.A.C.M.; Penninks, A.H.;

    Poulsen, M.; Schauzu, M.; Wal, J.M. Assessment of the safety of foods derived from

    genetically modified (GM) crops. Food Chem. Tox. 2004, 42, 1047-1088.

    9. Organization for Economic Cooperation and Development. An Introduction to the

    Food/Feed Safety Consensus Documents of the Task Force. Series on the Safety of Novel

    Foods and Feeds, No. 14; Paris, 2006, pp7-9.

    10. Chassy, BM. Can omics inform a food safety assessment? Regul. Toxicol. Pharmacol.

    2010, 58, S62-S70.

    11. ILSI. Recent developments in the safety and nutritional assessment of nutritionally

    improved foods and feeds. Compr. Rev. Food Sci. Food Saf. 2008, 7, 50-113.

    12. Herman, R.A.; Chassy, B.M.; Parrott, W. Compositional assessment of transgenic crops:

    An idea whose time has passed? Trends Biotechnol. 2009, 27, 565-567.

    13. Davies, H.V.; Shepherd, L.V.T.; Stewart, D.; Frank, T.; Rhlig, R.M.; Engel, K-H.

    Metabolome variability in crop plant species When, where, how much and so what?

    Regul. Toxicol. Pharmacol. 2010b, 58, S54-S61.

    14. Harrigan, G.G.; Glenn, K.C.; Ridley, W.P. Assessing the natural variability in crop

    composition. Regul. Toxicol. Pharmacol. 2010a, 58, S13-S20.

    15. ILSI. Nutritional and safety assessments of foods and feeds nutritionally improved

    through biotechnology. Compr. Rev. Food Sci. Food Saf. 2004, 3, 36-104.

    Page 25 of 44

    ACS Paragon Plus Environment

    Journal of Agricultural and Food Chemistry

  • 26

    16. Harrigan, G.G.; Lundry, D.; Drury, S.; Berman, K.; Riordan, S.G.; Nemeth, M.A.;

    Ridley, W.P.; Glenn, K.C. Natural variation in crop composition and the impact of

    transgenesis. Nat. Biotechnol. 2010b, 28, 402-404.

    17. EFSA. Guidance document of the scientific panel on genetically modified organisms for

    the risk assessment of genetically modified plants and derived food and feed. EFSA J.

    2006, 99, 1-100.

    18. Joyce, A.R.; Palsson, B.O. The model organism as a system: integrating omics data

    sets. Nat. Rev. Mol. Cell Biol. 2006, 7, 198-210.

    19. Li, X.; Huang, K.L.; Zhu, B.Z.; Tang, M.Z.; Luo, Y.B. Potentiality of omics techniques

    for the detection of unintended effects in genetically modified crops. J. Agric. Biotechnol.

    2005, 13, 1082-1088.

    20. Ricroch, A.E.; Berg, J.B.; Kuntz, M. Evaluation of genetically engineered crops using

    transcriptomic, proteomic, and metabolomic profiling techniques. Plant Physiol. 2011,

    155: 1752-1761.

    21. Davies, H. A role for omics technologies in food safety assessment. Food Control

    2010a, 21, 1601-1610.

    22. Baudo, M.M.; Lyons, R.; Powers, S.; Pastori, G.M.; Edwards, K.J.; Holdsworth, M.J.;

    Shewry, P.R. Transgenesis has less impact on the transcriptome of wheat grain than

    conventional breeding. Plant Biotechnol. J. 2006, 4, 369-380.

    23. Batista, R.; Saibo, N.; Lourenco, T.; Oliveira, M.M. Microarray analyses reveal that plant

    mutagenesis may induce more transcriptomic changes than transgene insertion. Proc.

    Natl. Acad. Sci. USA 2008, 105, 3640-3645.

    24. van Dijk, J.P.; Leifert, C.; Barros, E.; Kok, E.J. Gene expression profiling for food safety

    assessment: Examples in potato and maize. Regul. Toxicol. Pharmacol. 2010, 58, S21-

    S25.

    25. Schauer, S.; Fernie, A.R. Plant metabolomics: Towards biological function and

    mechanism. Trends Plant Sci. 2006, 11, 508-516.

    26. Hall, R.D. Plant metabolomics: From holistic hope, to hype, to hot topic. New Phytol.

    2006, 169, 453-468.

    Page 26 of 44

    ACS Paragon Plus Environment

    Journal of Agricultural and Food Chemistry

  • 27

    27. Hayes, K.R.; Beatty, M.; Meng, X.; Simmons, C.R.; Habben, J.E.; Danilevskaya, O.N.

    Maize global transcriptomics reveals pervasive leaf diurnal rhythms but rhythms in

    developing ears are largely limited to the core oscillator. PLoS One 2010, 5, e12887.

    28. Fiehn, O.; Wohlgemuth, G.; Scholz, M.; Kind, T.; Lee, D.Y.; Lu, Y.; Moon, S., Nikolau,

    B. Quality control for plant metabolomics: reporting MSI-compliant studies. Plant J.

    2008, 53, 691-704.

    29. McElver, J.; Tzafrir, I.; Aux, G.; Rogers, R.; Ashby, C.; Smith, K.; Thomas, C.; Schetter,

    A.; Zhou, Q.; Cushman, M.A.; Tossberg, J.; Nickle, T.; Levin, J.Z.; Law, M.; Meinke,

    D.; Patton, D. Insertional mutagenesis of genes required for seed development in

    Arabidopsis thaliana. Genetics 2001, 159, 1751-1763.

    30. Luo, M.; Liu, J.; Lee, R.D.; Guo, B.Z. Characterization of gene expression profiles in

    developing kernels of maize (Zea mays) inbred Tex6. Plant Breed. 2008, 127, 569-578.

    31. Zhao, S.; Fernald, R.D. Comprehensive algorithm for quantitative real-time polymerase

    chain reaction. J. Comput. Biol. 2005, 12 (8), 1045-1062.

    32. Fan, J.; Tam, P.; Woude, G.V.; Ren, Y. Normalization and analysis of cDNA microarrays

    using witin-array replications applied to neuroblastoma cell response to a cytokine. Proc.

    Natl. Acad. Sci. USA 2004, 101, 1135-1140.

    33. Zhou, J.; Thompson, D.K. In Microarray technology and applications in environmental

    microbiology; Spark DL; Advances in Agronomy vol. 82; Academic Press: San Diego,

    CA, 2004, 183-270.

    34. Novak, J.P.; Miller, M.C., III; Bell, D.A. Variation in fiberoptic bead-based

    oligonucleotide microarrays: dispersion characteristics among hybridization and

    biological replicate samples. Biol. Direct 2006, 1, 18.

    35. Sato, F.; Tsuchiya, S.; Terasawa, K.; Tsujimoto, G. Intra-platform repeatability and inter-

    platform comparability of microRNA microarray technology. PLoS One 2009, 4, e5540.

    36. Rutledge, R.G.; Stewart, D. A kinetic-based sigmoidal model for the polymerase chain

    reaction and its application to high-capacity absolute quantitative real-time PCR. BMC

    Biotechnol. 2008, 8, 47.

    37. Cruz, F.; Kalaoun, S.; Nobile, P.; Colombo, C.; Almeida, J.; Barros, L.M.G.; Romano,

    E.; Grossi-de-S, M.F.; Vaslin, M.; Alves-Ferreira, M. Evaluation of coffee reference

    Page 27 of 44

    ACS Paragon Plus Environment

    Journal of Agricultural and Food Chemistry

  • 28

    genes for relative expression studies by quantitative real-time RT-PCR. Mol. Breed.

    2009, 23, 607-616.

    38. Ruijter, J.M.; Ramakers, C.; Hoogaars, W.M.H.; Karlen, Y.; Bakker, O.; van den Hoff,

    M.J.B.; Moorman, A.F.M. Amplification efficiency: linking baseline and bias in the

    analysis of quantitative PCR data. Nucleic Acids Res. 2009, 37, e45.

    39. Capito, C.; Paiva, J.A.P.; Santos, D.M.; Fevereiro, P. In Medicago truncatula, water

    deficit modulates the transcript accumulation of components of small RNA pathways.

    BMC Plant Biol. 2011, 11, 79.

    40. Demidenko, N.V.; Logacheva, M.D.; Penin, A.A. Selection and validation of reference

    genes for quantitative real-time PCR in buckwheat (Fagopyrum esculentum) based on

    transcriptome sequence data. PLoS One 2011, 6, e 19434.

    41. Graeber, K.; Linkies, A.; Wood, A.T.A.; Leubner-Metzger, G. A guideline to family-

    wide comparative state-of-the-art quantitative RT-PCR analysis exemplified with a

    Brassicaceae cross-species seed germination case study. Plant Cell 2011, 23, 2045-2063.

    42. Mafra, V.; Kubo, K.S.; Alves-Ferreira, M.; Ribeiro-Alves, M.; Stuart, R.M.; Boava, L.P.;

    Rodrigues, C.M.; Machado, M.A. Reference genes for accurate transcript normalization

    in citrus genotypes under different experimental conditions. PLoS One 2012, 2, e31263.

    43. Marum, L.; Miguel, A.; Ricardo, C.P.; Miguel, C. Reference gene selection for

    quantitative real-time PCR normalization in Quercus suber. PLoS One. 2012, 4, e35113.

    44. Kendziorski, C.M.; Zhang, Y.; Lan, H.; Attie, A.D. The efficiency of pooling mRNA in

    microarray experiments. Biostatistics 2003, 4, 465-477.

    45. Kendziorski, C.; Irizarry, R.A.; Chen, K-S.; Haag, J.D.; Gould, M.N. On the utility of

    pooling biological samples in microarray experiments. Proc. Natl. Acad. Sci. USA 2005,

    102, 4252-4257.

    46. Peng, X.; Wood, C.L.; Blalock, E.M.; Chen, K.C.; Landfield, P.W.; Stromberg, A.J.

    Statistical implications of pooling RNA samples for microarray experiments. BMC

    Bioinf. 2003, 4, 26.

    47. Zhang, W.; Carriquiry, A.; Nettleton, D.; Dekkers, J.C.M. Pooling mRNA in microarray

    experiments and its effect on power. Bioinformatics 2007, 23, 1217-1224.

    48. Buckler, E.S; Thornsberry, J.M. Plant molecular diversity and applications to

    genomics. Curr. Opin. Plant Biol. 2002, 5, 107111.

    Page 28 of 44

    ACS Paragon Plus Environment

    Journal of Agricultural and Food Chemistry

  • 29

    49. Ching, A.; Caldwell, K.S.; Jung, M.; Dolan, M.; Smith, O.S.; Tingey, S.; Morgante, M.;

    Rafalski, J.A. SNP frequency, haplotype structure and linkage disequilibrium in elite

    maize inbred lines. BMC Genet. 2002, 3 (19), 3-19.

    50. Rafalski, A.; Morgante, M. Corn and humans: recombination and linkage disequilibrium

    in two genomes of similar size. Trends Genet. 2004, 20, 103-111.

    51. Springer, N.M.; Ying, K.; Fu, Y.; Ji, T.; Yeh, C-T.; et al. Maize Inbreds Exhibit High

    Levels of Copy Number Variation (CNV) and Presence/Absence Variation (PAV) in

    Genome Content. PLoS Genet. 2009, 5 (11): e1000734.

    doi:10.1371/journal.pgen.1000734

    52. Eldridge, A.C.; Kwolek, W.F. Soybean isoflavones: effect of environment and variety on

    composition. J. Agric. Food Chem. 1983, 31, 394-396.

    53. Gutierrez-Gonzalez, J.J.; Wu, X.; Zhang, J.; Lee, J.D.; Ellersieck, M.; Shannon, J.G.; Yu,

    O.; Nguyen, H.T.; Sleper, D.A. Genetic control of soybean seed isoflavone content:

    importance of statistical model and epistasis in complex traits. Theor. Appl. Genet. 2009,

    119, 1069-1083.

    54. Morgenthal, K.; Wienkoop, S.; Scholz, M.; Selbig, J.; Weckwerth, W. Correlative GC-

    TOF-MS-based metabolite profiling and LC-MS-based protein profiling reveal time-

    related systemic regulation of metabolite-protein networks and improve pattern

    recognition for multiple biomarker selection. Metabolomics 2005, 1, 109-121.

    55. Sysi-Aho, M.; Katajamaa, M.; Yetukuri, L.; Orei, M. Normalization method for

    metabolomics data using optimal selection of multiple internal standards. BMC Bioinf.

    2007, 8, 93.

    56. Parsons, H.M.; Ekman, D.R.; Collette, T.W.; Viant, M.R. Spectral relative standard

    deviation: a practical benchmark in metabolomics. Analyst 2009, 134, 478-485.

    57. Toubiana, D.; Semel, Y.; Tohge, T.; Beleggia, R.; Cattivelli, L.; Rosental, L.; Nikoloski,

    Z.; Zamir, D.; Fernie, A.R.; Fait, A. Metabolic profiling of a mapping population exposes

    new insights in the regulation of seed metabolism and seed, fruit, and plant relations.

    PLoS Genet. 2012, 8, e1002612.

    58. Harrigan, G.G.; Stork, L.G.; Riordan, S.G.; Reynolds, T.L.; Ridley, W.P.; Masucci, J.D.;

    Macisaac, S.; Halls, S.C.; Orth, R.; Smith, R.G.; Wen, L.; Brown, W.E.; Welsch, M.;

    Riley, R.; Mcfarland, D.; Pandravada, A.; Glenn, K.C. Impact of genetics and

    Page 29 of 44

    ACS Paragon Plus Environment

    Journal of Agricultural and Food Chemistry

  • 30

    environment on nutritional and metabolite components of maize grain. J. Agric. Food

    Chem. 2007, 55, 6177-6185.

    59. Skogerson, K.; Harrigan, G.G.; Reynolds, T.L.; Halls, S.C.; Ruebelt, M.; Iandolino, A.;

    Pandravada, A.; Glenn, K.C.; Fiehn, O. Impact of genetics and environment on the

    metabolite composition of maize grain. J. Agric. Food Chem. 2010, 58, 3600-3610.

    60. Zhou, J.; Harrigan, G.G.; Berman, K.H.; Webb, E.G.; Klusmeyer, T.H.; Nemeth, M.A.

    Stability of the compositional equivalence of grain from insect-protected corn and seed

    from herbicide-tolerant soybean over multiple seasons, locations and breeding

    germplasms. J. Agric. Food Chem. 2010, 59, 8822-8828.

    61. Reynolds, T.L.; Nemeth, M.A.; Glenn, K.C.; Ridley, W.P.; Astwood, J.D. Natural

    variability of metabolites in maize grain: differences due to genetic background. J. Agric.

    Food Chem. 2005, 53. 10061-10067.

    62. Asiago, V.; Hazebroek, J.; Harp, T.; Zhong, C. Effect of genetics and environment on the

    metabolome of commercial maize hybrids: A multisite study. J. Agric. Food Chem. 2012,

    60, 11498-11508.

    63. Catchpole, G.S.; Beckmann, M.; Enot, D.P.; Mondhe, M.; Zywicki, B.; Taylor, J.; Hardy,

    N.; Smith, A.; King, R.D.; Kell, D.B.; Fiehn, O.; Draper, J. Hierarchical metabolomics

    demonstrates substantial compositional similarity between genetically modified and

    conventional potato crops. Proc. Natl. Acad. Sci. USA 2005, 102, 14458-14462.

    64. Goodacre, R.; Vaidyanathan, S.; Dunn, W.R.; Harrigan, G.G.; Kell, D.B. Metabolomics

    by numbers-Acquiring and understanding global metabolite data. Trends Biotechnol.

    2004, 22, 245-252.

    65. Rischer, H.; Oksman-Caldentey, K-M. Unintended effects in genetically modified crops:

    revealed by metabolomics? Trends Biotechnol. 2006, 24, 102-104.

    66. Kusano, M.; Redestig, H.; Hirai, T.; Oikawa, A.; Matsuda, F.; Fukushima, A.; Arita, M.;

    Watanabe, S.; Yano, M.; Hiwasa-Tanase, K.; Ezura, H.; Saito, K. Covering chemical

    diversity of genetically-modified tomatoes using metabolomics for objective substantial

    equivalence assessment. PLoS One 2011, 6, e16989.

    67. Goodman, S. A dirty dozen: Twelve p-value misconceptions. Semin. Hematol. 2008, 45,

    135-140.

    Page 30 of 44

    ACS Paragon Plus Environment

    Journal of Agricultural and Food Chemistry

  • 31

    68. Codex Alimentarius. Guideline for the conduct of food safety assessment of foods

    derived from recombinant-DNA plants.

    http://www.codexalimentarius.net/input/download/standards/10021/CXG_045e.pdf

    (accessed March 4, 2013).

    FIGURE CAPTIONS

    Figure 1. CVs of gene expression calculated from technical repeat microarrays. A. Percentages

    of genes within different CV ranges. B. CV distributions generated by TIBCO Spotfire. X-axis,

    CV values in log scale; Y-axis, gene counts. C. Plots of log10(CV) (y-axis) against log10(mean)

    (x-axis). Curves are polynomial fits generated by TIBCO Spotfire. D. Log10(CV)

    values of inflection points calculated from curves in C.

    Figure 2. Technical reproducibility of microarrays. Pairwise correlation coefficients between

    pairs of technical replicates (T) or between pairs of biological repeats (I). Boxplots were

    generated with TIBCO Spotfire. The white bar represents the median value. Edges of boxes

    represent the 75th and 25th percentiles. Ends of bars represent the ranges of values, with

    dots outside the bars marking outliers.

    Figure 3. Gene expression analysis comparing qRT-PCR to microarray hybridization with V5

    leaf (A) or 25 DAP (B) samples. Numbers are log2 transformed ratios between expression levels

    detected by qRT-PCR and microarray that were defined against dapA expression levels.

    Figure 4. CV distributions of metabolite levels detected in V5 leaf (A), 25 DAP immature

    kernel (B), and mature kernel (C) samples.

    Figure 5. PCA score plots from individual or pooled plants showing tissue specificity of

    metabolomes.

    Figure 6. PCA scores and loadings plots from individual plants (A, C; E, G; I, K) or pooled

    plants (B, D; F, H; J, L) showing classifications of PH2WBR from the leaf metabolome (A-D),

    PH2WBS from the 25 DAP kernel metabolome (E-H) and PH2WBS from the mature kernel

    metabolome (I-L). Significant loadings shown in purple.


    TABLES

    Table 1. Rows 1, 2: Mean gene expression levels detected in individual-plant (I) and pooled-

    plant (P) samples for the same tissue type of a variety are highly correlated. Values are Pearson

    correlation coefficients (R) comparing mean gene expression intensities between all I and all P

    samples for each variety-tissue combination, calculated using the Excel function PEARSON.

    Rows 3, 4: Mean gene expression levels detected by microarrays are not correlated between V5 leaf

    and 25 DAP. Values are Pearson correlation coefficients (R) comparing gene expression intensities

    from V5 leaf and 25 DAP for the same plant samples, calculated using the Excel function

    PEARSON, for individual-plant (I) or pooled-plant (P) samples.

    Row  Sample  34A15   38B85   PH2WBR  PH2WBS  PHG9B   H31
    1    V5Leaf  0.9932  0.9885  0.9959  0.9893  0.9937  0.9846
    2    25DAP   0.9938  0.9743  0.989   0.9836  0.9889  0.9899
    3    I       0.238   0.2527  0.317   0.2757  0.283   0.2472
    4    P       0.234   0.2551  0.3092  0.2595  0.2849  0.239
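The Table 1 caption describes correlating mean expression intensities from individual-plant (I) and pooled-plant (P) samples with the Excel function PEARSON. The following is a minimal sketch of the same statistic in Python, not the authors' workflow. The function names `pearson_r` and `mean_per_gene` and all intensity values are hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient (the statistic Excel's PEARSON returns)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / sqrt(var_x * var_y)


def mean_per_gene(samples):
    """Average each gene's intensity across samples (one inner list per sample)."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]


# Hypothetical intensities for 4 genes: 3 individual-plant arrays, 2 pooled arrays.
individual = [
    [120.0, 15.0, 980.0, 44.0],
    [110.0, 18.0, 1010.0, 40.0],
    [130.0, 12.0, 950.0, 48.0],
]
pooled = [
    [115.0, 17.0, 1000.0, 41.0],
    [121.0, 15.0, 960.0, 47.0],
]

r = pearson_r(mean_per_gene(individual), mean_per_gene(pooled))
print(f"R (mean I vs mean P) = {r:.4f}")
```

For rows 3 and 4 of the table, the same function would be applied to the V5 leaf and 25 DAP intensity vectors of one sample type instead of to the I and P means.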

    Table 2. Mean metabolite levels detected from individual-plant (I) and pooled-plants (P)

    samples for the same tissue type of a variety are highly correlated. Values are Pearson correlation

    coefficients (R) calculated using the Excel function PEARSON.

    Variety V5Leaf 25DAP Mature

    34A15 0.9976 0.9996 0.9732

    37Y12 0.9988 0.9987 0.9921

    38B85 0.9988 0.9994

    PH2WBS 0.9089 0.9952

    PH2WBR 0.9985

    PH0GP 0.9970 0.9971

    658 0.9981 0.9948 0.9984


    PH14T 0.9994 0.9981 0.9992

    PHG9B 0.9993 0.9933 0.9940

    H31 0.9980 0.9891


    Table 3. Apparent tissue specific metabolites.

    Tissue         Analyte                                   Class
    V5 leaf        Tyramine                                  polyamine
                   L-Tryptophan, N,1-bis(trimethylsilyl)-    amino acid
                   Chlorogenic acid                          phenolic acid
                   Citramalic acid                           organic acid
                   Dehydroascorbic acid, secondary peak 1    vitamin
                   Dehydroascorbic acid, secondary peak 2    vitamin
                   Dehydroascorbic acid, secondary peak 3    vitamin
                   Heptadecanoic acid                        fatty acid
                   Itaconic acid                             organic acid
                   Maleic acid                               organic acid
                   Pyruvic acid                              organic acid
                   Salicylic acid                            phenolic acid
                   cis-Caffeic acid                          phenolic acid
                   trans-Caffeic acid                        phenolic acid
                   alpha-Tocopherol                          vitamin
                   Rhamnose                                  sugar
                   Trehalose                                 sugar
                   Glyceric acid-3-phosphate                 phosphorylated acid
                   Phytol                                    alkane alcohol
                   Margaric acid                             fatty acid
    25 DAP         Myristic acid                             fatty acid
                   Cysteine, partial derivative              amino acid
    mature kernel  Adenosine-5-monophosphate                 nucleic acid
                   Pipecolic acid                            organic acid


    Table 4. Metabolites with relatively stable or highly variable levels among different maize

    varieties and their CV values. CV values were calculated from metabolite levels from all

    individual (I) and pooled (P) samples for all varieties. Only metabolites that were detected from

    all I and P samples of all varieties were included for the calculation. Only metabolites with CV

    values of less than 0.30 (relatively stable) and those with CV values greater than 1.00 (highly

    variable) are shown. Metabolites in italic were found in all three tissues and metabolites in bold

    were found in two tissues for the same category.


    V5 Leaf    25 DAP Immature Kernel    Mature Kernel

    0.31 Acetic acid    0.30 Acetic acid    0.29 beta-Sitosterol
    0.30 Arabinose    0.39 Arabinose    0.28 Campesterol
    0.13 beta-Sitosterol    0.34 Aspartic acid part. deri.    0.36 Erythritol
    0.23 Campesterol    0.32 Benzoic acid    0.37 Glycerol-3-phosphate
    0.40 Cellobiose deri. 1    0.30 Ethanolamine    0.37 Linoleic acid
    0.38 cis-Aconitic acid    0.37 Fructose deri. 1    0.38 Malic acid
    0.39 Ferulic acid    0.39 Galactose    0.35 myo-Inositol
    0.33 Galactitol    0.24 Glucose deri. 1    0.31 Palmitic acid
    0.24 Glycerol    0.38 Glyceric acid    0.31 Stigmasterol
    0.38 Glycerol-3-phosphate    0.36 Isoleucine part. deri.    0.19 Sucrose
    0.40 Heptadecanoic acid    0.34 Leucine part. deri.    0.39 Tyrosine
    0.30 Linoleic acid    0.36 Malic acid
    0.17 myo-Inositol    0.36 Mannose
    0.30 Palmitic acid    0.23 myo-Inositol
    0.17 Phytol    0.25 Phosphoric acid
    0.37 Serine part. deri.
    0.35 Stigmasterol    0.30 Succinic acid
    0.14 Sucrose    0.38 Sucrose
    0.36 trans-Caffeic acid    0.31 Threonine part. deri.
    0.31 Xylitol    0.35 Tyrosine
    0.28 Xylose deri. 1    0.32 Xylose deri. 2

    1.60 2-Aminobutyric acid    1.10 beta-Alanine part. deri.    1.24 Asparagine
    1.79 Asparagine part. deri.    2.14 cis-Aconitic acid    2.15 beta-Amyrin
    1.52 Aspartic acid    1.39 Citric acid    2.63 Cellobiose deri. 1
    2.04 Benzoic acid    2.26 Gluconic acid    1.05 Cysteine part. deri.
    1.21 Dehydroascorbic acid    1.49 Glutamic acid    1.28 Dehydroascorbic acid
    1.54 Dehydroascorbic acid 2nd peak 1    2.70 Glutamine part. deri.    1.03 Gaba
    1.71 Ethanolamine    1.84 Histidine    1.04 Glucose deri. 2
    1.36 Glutamic acid    1.63 Isocitric acid    1.83 Glucose-6-phosphate deri. 2
    2.14 Glutamine part. deri.    1.25 Linolenic acid    1.70 Glutamine part. deri.
    1.19 Glycine    2.56 Myristic acid    1.04 Glyceric acid
    1.22 Glycine part. deri.    1.03 Oleic acid    1.60 Maltose
    1.07 Isoleucine    3.43 para-Coumaric acid    1.25 Pipecolic acid
    1.11 Leucine    1.57 Pyroglutamic acid    1.14 Pipecolic Acid part. deri.
    1.45 Ornithine    1.32 Pyroglutamic acid
    1.06 Serine    1.16 Xylitol
    1.60 Trehalose    1.64 Xylose deri. 2
    1.21 Tyrosine
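The Table 4 caption describes computing a CV for each metabolite across all samples, keeping only metabolites detected in every sample, and reporting those below 0.30 or above 1.00. A rough sketch of that selection follows. It assumes the CV is the sample standard deviation divided by the mean, which the caption does not specify. The function names and all metabolite levels are hypothetical.

```python
from statistics import mean, stdev

STABLE_MAX = 0.30    # caption threshold for "relatively stable"
VARIABLE_MIN = 1.00  # caption threshold for "highly variable"


def coefficient_of_variation(levels):
    """CV as a fraction: sample standard deviation / mean (an assumption)."""
    m = mean(levels)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return stdev(levels) / m


def classify_metabolites(levels_by_metabolite):
    """Return {name: (cv, label)} for metabolites that would be listed."""
    shown = {}
    for name, levels in levels_by_metabolite.items():
        # Skip metabolites missing from any sample (None marks "not detected").
        if any(value is None for value in levels):
            continue
        cv = coefficient_of_variation(levels)
        if cv < STABLE_MAX:
            shown[name] = (cv, "relatively stable")
        elif cv > VARIABLE_MIN:
            shown[name] = (cv, "highly variable")
        # CVs between the two thresholds are not reported.
    return shown


# Hypothetical relative levels across five samples.
relative_levels = {
    "Sucrose": [10.0, 11.0, 9.0, 10.5, 9.5],
    "Trehalose": [0.5, 6.0, 0.2, 9.0, 0.3],
    "Glycine": [1.0, 2.0, 3.0, 2.0, 1.0],
    "Phytol": [1.0, None, 1.2, 0.9, 1.1],
}

shown = classify_metabolites(relative_levels)
for name, (cv, label) in shown.items():
    print(f"{name}: CV = {cv:.2f} ({label})")
```

Using the population standard deviation instead would give slightly smaller CVs, so metabolites close to either threshold could be classified differently.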
