High-Resolution Targeted Quantitation:Biomarker Discovery ....pdf · High-Resolution Targeted...

8
High-Resolution Targeted Quantitation: Biomarker Discovery in a Mouse Transgenic Model of Myopathy Claire Dauly 1 , Manuel Fréret 2 , Laurent Drouot 2 , Jenny T.C. Ho 3 , Pascal Cosette 4 , Olivier Boyer 2 1 Thermo Fisher Scientific, Courtaboeuf cedex, France; 2 Inserm, U905, Université de Rouen, Faculté de médecine et de pharmacie, Rouen cedex, France; 3 Thermo Fisher Scientific, Hemel Hempstead, UK; 4 Plate-forme de protéomique, Université de Rouen, Mont-Saint-Aignan, France Introduction Normal muscle fibers do not express detectable levels of class I major histocompatibility complex (MHC-I). In contrast, high-level expression of MHC-I is a hallmark of muscle autoimmune diseases such as polymyositis, inclusion-body myositis or dermatomyositis. It is generally considered that damage of muscle fibers in myositis results from an auto-immune attack by auto- reactive cytotoxic T lymphocytes (CTLs) that recognize muscle autoantigens presented by MHC-I molecules (Figure 1). We used a MHC-I–transgenic mouse model in which a controllable muscle-specific promoter governs the conditional up-regulation of a MHC-I molecule (H-2K b ) in skeletal muscle resulting in severe myopathy. A proteomic study was performed on mouse muscle tissue to evaluate the consequences of this forced H-2K b expression. An experimental workflow was designed including protein digestion and label-free qualitative and quantitative analysis by high-resolution, accurate mass LC-MS/MS. Three muscle tissues from MHC-I–inducible mice were compared to two healthy controls using a differential analysis and targeted quantitation approach. Goal Identification and confirmation of candidate biomarkers of a mouse transgenic model of myopathy by high- resolution LC-MS/MS. Application Note: 555 Key Words • LTQ Orbitrap Velos • Label-Free Protein Quantitation • Protein Identification • Biological Annotation Experimental Sample Preparation Mice (kindly provided to Inserm by NIAMS, NIH, Bethesda, MD) were sacrificed and muscle tissues were collected, snap-frozen in liquid nitrogen for 30 seconds, and stored at -80 °C. Then, 150 mg of muscle tissues were manually homogenized in liquid nitrogen and solubilized in 5.2 mL lysis buffer (urea 7 M, thiourea 2 M, DTT 65 mM, CHAPS 4%, Tris 40 mM, 50 µL of ampholytes and 100 µL of protease inhibitor (Sigma)). Samples were ultra-centrifuged for 30 minutes (84,000 rpm at 4 °C (Beckman, rotor TLA-120). Supernatants were aliquoted in 200 µL fractions and stored at -80 °C. Digestion The sample (50 µg) was cleaned up with a 2-D clean up kit (GE Healthcare) to remove interfering molecules and proteins were separated on SDS-PAGE. Gel migration was stopped in the stacking gel. In-gel digestion was performed after reduction with DTT and alkylation with iodoacetamide. Supernatants from the digestion were collected and peptide-containing fractions were combined. Samples were dried and suspended in water / 0.1% formic acid prior to LC-MS/MS analysis. Figure 1. Schematic of the MHC-I molecules displaying fragments of cytoplasmic proteins on the cell surface in order to be presented to immune cytotoxic T cells. CTL cell MHC-I Cell lysis

Transcript of High-Resolution Targeted Quantitation:Biomarker Discovery ....pdf · High-Resolution Targeted...

Page 1: High-Resolution Targeted Quantitation:Biomarker Discovery ....pdf · High-Resolution Targeted Quantitation: Biomarker Discovery in a Mouse Transgenic Model of Myopathy Claire Dauly1,

High-Resolution Targeted Quantitation: Biomarker Discovery in a Mouse Transgenic Model of MyopathyClaire Dauly1, Manuel Fréret2, Laurent Drouot2, Jenny T.C. Ho3, Pascal Cosette4, Olivier Boyer2

1Thermo Fisher Scientific, Courtaboeuf cedex, France; 2Inserm, U905, Université de Rouen, Faculté de médecine et de pharmacie, Rouen cedex, France; 3Thermo Fisher Scientific, Hemel Hempstead, UK; 4Plate-forme de protéomique, Université de Rouen, Mont-Saint-Aignan, France

IntroductionNormal muscle fibers do not express detectable levels of class I major histocompatibility complex (MHC-I). In contrast, high-level expression of MHC-I is a hallmark of muscle autoimmune diseases such as polymyositis, inclusion-body myositis or dermatomyositis. It is generally considered that damage of muscle fibers in myositis results from an auto-immune attack by auto-reactive cytotoxic T lymphocytes (CTLs) that recognize muscle autoantigens presented by MHC-I molecules (Figure 1). We used a MHC-I–transgenic mouse model in which a controllable muscle-specific promoter governs the conditional up-regulation of a MHC-I molecule (H-2Kb) in skeletal muscle resulting in severe myopathy. A proteomic study was performed on mouse muscle tissue to evaluate the consequences of this forced H-2Kb

expression. An experimental workflow was designed including protein digestion and label-free qualitative and quantitative analysis by high-resolution, accurate mass LC-MS/MS. Three muscle tissues from MHC-I–inducible mice were compared to two healthy controls using a differential analysis and targeted quantitation approach.

GoalIdentification and confirmation of candidate biomarkers of a mouse transgenic model of myopathy by high-resolution LC-MS/MS.

Application Note: 555

Key Words

•LTQOrbitrapVelos

•Label-FreeProtein Quantitation

•ProteinIdentification

•BiologicalAnnotation

Experimental

Sample PreparationMice (kindly provided to Inserm by NIAMS, NIH, Bethesda, MD) were sacrificed and muscle tissues were collected, snap-frozen in liquid nitrogen for 30 seconds, and stored at -80 °C. Then, 150 mg of muscle tissues were manually homogenized in liquid nitrogen and solubilized in 5.2 mL lysis buffer (urea 7 M, thiourea 2 M, DTT 65 mM, CHAPS 4%, Tris 40 mM, 50 µL of ampholytes and 100 µL of protease inhibitor (Sigma)). Samples were ultra-centrifuged for 30 minutes (84,000 rpm at 4 °C (Beckman, rotor TLA-120). Supernatants were aliquoted in 200 µL fractions and stored at -80 °C.

DigestionThe sample (50 µg) was cleaned up with a 2-D clean up kit (GE Healthcare) to remove interfering molecules and proteins were separated on SDS-PAGE. Gel migration was stopped in the stacking gel. In-gel digestion was performed after reduction with DTT and alkylation with iodoacetamide. Supernatants from the digestion were collected and peptide-containing fractions were combined. Samples were dried and suspended in water / 0.1% formic acid prior to LC-MS/MS analysis.

Figure 1. Schematic of the MHC-I molecules displaying fragments of cytoplasmic proteins on the cell surface in order to be presented to immune cytotoxic T cells.

CTL

cell

MHC-I Cell lysis

Figure 1

Page 2: High-Resolution Targeted Quantitation:Biomarker Discovery ....pdf · High-Resolution Targeted Quantitation: Biomarker Discovery in a Mouse Transgenic Model of Myopathy Claire Dauly1,

Table 2. Mass spectrometer parameter settings

MassSpectrometerSettings

Source nano-ESI

Capillary temperature 200 °C

S-lens RF level 60%

Source voltage 1.75 kV

Full MS mass range m/z 400 - 2000

Resolution settings (FWHM at m/z 400) Full MS 60,000

Target value Full MS 1E6

CID (Ion Trap) 5E3

Max. injection times Full MS [ms] Full MS 500

MSn (Ion Trap) 100

Dynamic Exclusion™ Repeat count 1

Exclusion list size 500

Exclusion duration 30 s

Exclusion mass width low/high 5 ppm

MS n parameters

Isolation width 2 µ

Minimum signal required 500

Collision energy (CID) 30%

Activation time 10 ms

Top n MS/MS 20

Charge state screening on: +1 rejected

Monoisotopic precursor selection enabled

Table 3. Data analysis parameter settings

DatabaseSearchParameters(ProteomeDiscovererSoftware)

Peak list generation conditions

Total intensity threshold 100

Minimum peak count 8

SEQUEST search

Database Swissprot _MOUSE (57.9)

Mass tolerance (precursor) 5 ppm

Mass tolerance (fragment) 0.5 Da

Search against decoy database True

Dynamic modifications Oxidation (M)

Static modifications Carbamidomethyl (C)

SIEVE Parameters

m/z width 0.015 Da

RT width 2 min

Intensity threshold 10000

Maximum frames 100000

SEQUEST probability threshold 10

SEQUEST Xcorr threshold

Charge state = 2 2.5

Charge state ≥ 3 3

SEQUEST maximum rank 1

Table 1. LC parameter settings

LCSeparation

HPLC System Easy-nLC

Column C18 Reprosil analytical column (100 µm id x 15 cm packed tip column, Nikkyo Technos Co. Ltd)

Pre-column Peptide trap cartridge (Captrap, Michrom Bioresources)

Mobile Phases 0.1% formic acid in water (eluent A); 0.1% formic acid in acetonitrile (eluent B)

Gradient 5% – 40% B in 100 min

Flow 300 nL/min

LC-MS/MSAnalysisSamples were separated by online reversed-phase chromatography using a Thermo Scientific Proxeon Easy-nLC system equipped with a peptide trap cartridge (CapTrap, Michrom Bioresources) and C18 packed tip column (100 µm ID x 15 cm, Nikkyo Technos Co. Ltd). Peptides were separated using an increasing amount of acetonitrile (5%-40% over 100 minutes) and a flow rate of 300 nL/min. The LC eluent was electrosprayed directly from the analytical column and voltage was applied via a liquid junction of the nanospray source. The chromatography system was coupled to a Thermo Scientific LTQ Orbitrap Velos mass spectrometer. About 1 µg of peptides was used for each LC-MS/MS experiment. Peptides were analyzed with full-scan MS detection in the OrbitrapTM mass analyzer at 60,000 (FWHM) resolving power. Precursors were selected “on the fly” for collision-induced dissociation (CID) fragmentation with ion trap detection. The method was set to analyze the top 20 most intense ions from the survey scan. Details of chromatographic and mass spectrometric settings are listed in Tables 1 and 2.

DataProcessingTandem mass spectra were processed with Thermo Scientific Proteome Discoverer software version 1.2. Spectra were searched against Swiss-Prot® Mouse (version 57.9) using the SEQUEST® algorithm. Differential analysis was performed between the two sample groups with Thermo Scientific SIEVE software. LC-MS files were aligned in retention time (RT) and peak detection was performed. Frames consisting of a 2-minute RT window and 0.015-m/z windows were created around each detected feature and peaks were integrated for each frame and each LC-MS/MS file. Peak areas were normalized against the signal of the peptide FTQAGSEVSALLGR, which was used as an internal standard to account for injection or ionization bias. Results from the database search were imported into SIEVETM software. Database search parameters and SIEVE software parameters are detailed in Table 3.

Page 3: High-Resolution Targeted Quantitation:Biomarker Discovery ....pdf · High-Resolution Targeted Quantitation: Biomarker Discovery in a Mouse Transgenic Model of Myopathy Claire Dauly1,

Thermo Scientific Pinpoint software was used for the targeted quantitative analysis of proteins identified with a ratio < 0.5 or >2 using SIEVE software (see details of proteins/peptides selection and data processing in the Results and Discussion section).

Accession numbers of differentially regulated proteins were submitted to Thermo Scientific ProteinCenter software and statistical analysis was performed against the Swiss-Prot Mouse database for KEGG pathway annotation.

Results and Discussion

Label-Free,UntargetedApproachIn this study, a complete biomarker research workflow was used to identify and confirm proteins which are involved

in the development of myopathy (Figure 2). A database search was performed for the five LC-MS/MS runs separately resulting in the identification of 1693 proteins (1471 protein groups, of which 1096 were identified in at least 2 samples) and 5948 unique peptides with a FDR below 1% after merging the result files. Differential analysis was performed between the two sample groups to find proteins that are differentially expressed in the context of the disease. A total of 1623 proteins could be quantified based on peptide MS trace. Protein ratios were calculated based on the variance weighted average of the peptide ratios. Protein-ratio distribution described an S-type plot with a median at 1.01 indicating that there was no bias in the quantitation (Figure 3).

Figure 3

Log

Ratio

-5

-4

-3

-2

-1

0

1

2

3

4

5

6

0 200 400 600 800 1000 1200 1400 1600 1800

Figure 3. S-Plot of the protein ratios from the label-free approach with SIEVE software

Figure 2. Complete analytical workflow for biomarker discovery including non-targeted label-free differential analysis and biomarker confirmation with targeted quantitation based on high resolution MS measurements

Page 4: High-Resolution Targeted Quantitation:Biomarker Discovery ....pdf · High-Resolution Targeted Quantitation: Biomarker Discovery in a Mouse Transgenic Model of Myopathy Claire Dauly1,

SIEVE software lists all protein references for each candidate peptide and equally considers each protein candidate as a valid result. That is, if a candidate peptide is found in three different proteins, then each of the three proteins are considered for display in the protein report. Protein ratios can therefore be biased by peptides that belong to several proteins. It is then important to qualify potential biomarkers identified with SIEVE software with a targeted approach that considers only peptides that are unique to the target protein. Here 335 proteins were identified with at least two peptides and observed with ratios >2 or <0.5 with SIEVE software. These proteins were subsequently selected for validation with a targeted quantitation approach.

TargetedQuantitationUsingPinpointSoftwareGeneration of a targeted assay begins with selecting the targeted protein(s) and/or peptide(s) required for the biological study. Here, all targeted protein sequences were simultaneously uploaded from a FASTA file exported from Proteome DiscovererTM software. The second step of the workflow is to create a list of peptides to serve as surrogates for the target proteins. A list of theoretical peptides was created through the in silico digestion of the targeted protein sequences (Figure 4A).

Trypsin was selected as the enzyme used in the experimental protocol. Refinement of the resulting in silico peptide list was accomplished by defining a sequence length from 7 to 50 amino acids and excluding peptides containing potentially modified amino acids such as cysteine and methionine. Proteotypic peptides were selected as being unique to the target protein in a reference database containing all known mouse protein sequence in Swiss-Prot. The peptide list was finally refined by only selecting peptides that were already identified by SEQUEST and therefore present in Proteome Discoverer software results files (Figure 4B). A total of 975 peptides corresponding to 289 proteins

A

B

Figure 4

A B

Figure 4

A B

Figure 4. Method setup for targeted quantitation with Pinpoint software. Selection of proteotypic peptides based on in silico digestion of target proteins (A) and previous identification with SEQUEST algorithm through the import of Proteome Discoverer spectral libraries (B).

Page 5: High-Resolution Targeted Quantitation:Biomarker Discovery ....pdf · High-Resolution Targeted Quantitation: Biomarker Discovery in a Mouse Transgenic Model of Myopathy Claire Dauly1,

were selected for targeted quantitation. Some proteins failed to fulfill the peptide selection criteria described above, mostly because they were missing proteotypic peptides resulting from a complete digestion with trypsin. One solution would be to target peptides with one missed cleavage site and verify that the digestion efficiency is similar in all samples. It is also possible to target theoretical proteotypic peptides and use the hydrophobicity factor of the peptide for the retention time validation.

Quantitation was performed based on full-scan MS measurements and retention time confirmed in Proteome Discoverer software. Peptides were quantified by extracting the first, second and third isotope of each charge state of identified peptide (Figure 5). For each mass, an extracted ion chromatogram was generated with 5-ppm mass tolerance and a retention-time-shift tolerance of 2 minutes between the sample runs. Peak areas with a minimum of 10,000 counts were integrated and normalized against the signal of the internal standard

Figure 5. Method setup for targeted quantitation with Pinpoint software: Batch selection of peptides and isotopes which were used for quantitation. Only peptides following in silico digestion rules and present in spectral libraries were selected. The first three isotopes of each identified charge states were used for the quantitation.

Figure 5

975 peptides 289 proteins

Page 6: High-Resolution Targeted Quantitation:Biomarker Discovery ....pdf · High-Resolution Targeted Quantitation: Biomarker Discovery in a Mouse Transgenic Model of Myopathy Claire Dauly1,

peptide FTQAGSEVSALLGR. Peptides and proteins were filtered out based on a coefficient of variation (CV) of normalized signal area below 50% in each sample group.

In total, 230 proteins were successfully analyzed and relative quantitation was performed between MHC-I inducible mice and healthy controls. Of these, 84 proteins were validated as being down-regulated with ratios ≤ 0.5 and 103 proteins were found to be up-regulated with protein expression levels at least two times higher in MHC-I–mice (ratios ≥ 2). Figure 6 displays two myosin forms which were down-regulated. High consistency was observed between all the peptide ratios. This can be explained by the pre-selection of peptides used for quantitation based on previous identification with the SEQUEST algorithm and their uniqueness to the target proteins. Some proteins were found to be highly up-regulated with ratios >100 such as protein-glutamine gamma-glutamyltransferase 2 (Figure 7). In addition, 59 proteins could not be quantified because of

excessive biological variations or inconsistency of the peptide responses. Finally, 42 proteins were found not to be differentially expressed between the two sample groups although they were in the discovery phase. This emphasizes the need of a targeted approach on proteotypic peptides for qualifying biomarkers.

The accession numbers of potential biomarkers were submitted to ProteinCenter data interpretation software in order to retrieve relevant biological information from publicly available protein databases (including gene ontology references and correlation with metabolic pathways). Six of the down-regulated proteins were found to be involved in the “Dilated cardiomyopathy” KEGG pathway (Figure 8A) and six of the up-regulated proteins also showed a correlation with the “Antigen processing and presentation” pathway (Figure 8B). This was anticipated because of the high expression level of H-2Kb protein in MHC-I inducible mice.

Figure 6. Example of quantitation of Myosin regulatory light chain 2 and Myosin light chain 1/3 at the protein level (A and C) and at the peptide level (B and D). R=Ratio.

A B

Myosin regulatory light chain 2

Myosin light chain 1/3

R=0.25

R=0.28

R=0.24 R=0.27 R=0.24

R=0.22 R=0.29 R=0.29 R=0.22

Figure 6 A B

C D

0.00E+00

5.00E+00

1.00E+01

1.50E+01

2.00E+01

2.50E+01

3.00E+01

Control_1

Control_2

iMHC_1

iMHC_2

iMHC_3

0.00E+00

5.00E+00

1.00E+01

1.50E+01

2.00E+01

2.50E+01

Control_1

Control_2

iMHC_1

iMHC_2

iMHC_3 EAFTVIDQNR GADPEDVITGAFK FSQEEIK

EAFLLFD

R

ITLSQVGDVLR

DQGGYEDFVEGLR

HVLATLGEK

Transgenic mice

Control

Transgenic mice

Transgenic mice

Transgenic mice

Transgenic mice

Transgenic mice

Transgenic mice

Transgenic mice

Control

Myosin regulatory light chain 2

Myosin light chain 1/3

R=0.25

R=0.28

R=0.24 R=0.27 R=0.24

R=0.22 R=0.29 R=0.29 R=0.22

Figure 6 A B

C D

0.00E+00

5.00E+00

1.00E+01

1.50E+01

2.00E+01

2.50E+01

3.00E+01

Control_1

Control_2

iMHC_1

iMHC_2

iMHC_3

0.00E+00

5.00E+00

1.00E+01

1.50E+01

2.00E+01

2.50E+01

Control_1

Control_2

iMHC_1

iMHC_2

iMHC_3 EAFTVIDQNR GADPEDVITGAFK FSQEEIK

EAFLLFD

R

ITLSQVGDVLR

DQGGYEDFVEGLR

HVLATLGEK

Transgenic mice

Control

Transgenic mice

Transgenic mice

Transgenic mice

Transgenic mice

Transgenic mice

Transgenic mice

Transgenic mice

Control

Myosin regulatory light chain 2

Myosin light chain 1/3

R=0.25

R=0.28

R=0.24 R=0.27 R=0.24

R=0.22 R=0.29 R=0.29 R=0.22

Figure 6 A B

C D

0.00E+00

5.00E+00

1.00E+01

1.50E+01

2.00E+01

2.50E+01

3.00E+01

Control_1

Control_2

iMHC_1

iMHC_2

iMHC_3

0.00E+00

5.00E+00

1.00E+01

1.50E+01

2.00E+01

2.50E+01

Control_1

Control_2

iMHC_1

iMHC_2

iMHC_3 EAFTVIDQNR GADPEDVITGAFK FSQEEIK

EAFLLFD

R

ITLSQVGDVLR

DQGGYEDFVEGLR

HVLATLGEK

Transgenic mice

Control

Transgenic mice

Transgenic mice

Transgenic mice

Transgenic mice

Transgenic mice

Transgenic mice

Transgenic mice

Control

C D

Page 7: High-Resolution Targeted Quantitation:Biomarker Discovery ....pdf · High-Resolution Targeted Quantitation: Biomarker Discovery in a Mouse Transgenic Model of Myopathy Claire Dauly1,

Figure 7. Example of quantitation of protein-glutamine gamma-glutamyltransferase 2 at the protein level (A) and the peptide level (B).

A

Figure 8. Biological annotation with ProteinCenter. A statistical analysis was performed against Swiss-Prot Mouse database for KEGG pathway annotations. (A) Down-regulated proteins with ratios ≤ 0.5. (B) Up-regulated proteins with ratios ≥ 2. R=Ratio.

Protein-glutamine gamma-glutamyltransferase 2

Figure 7

A B

Control

0.00E+00 5.00E-03 1.00E-02 1.50E-02 2.00E-02 2.50E-02 3.00E-02 3.50E-02 4.00E-02

NEFGELESNK

YPEGSPEER

GLLIEPAANSYLLA

ER

DLYLENPEIK

SVEVSDPVPAGDLVK

Control_1

Control_2

iMHC_1

iMHC_2

iMHC_3

GYEASVDSLTFGAV

TGPDPSEEAGTK

Transgenic mice

Transgenic mice

Transgenic mice

Transgenic mice

B

Figure 8

A

B

R=0.23 R=0.24

R=0.34

R=0.34 R=0.34 R=0.21

R=23.3 R>100

R=4.1

R=2.7 R>100 R=4.3

A

B

Page 8: High-Resolution Targeted Quantitation:Biomarker Discovery ....pdf · High-Resolution Targeted Quantitation: Biomarker Discovery in a Mouse Transgenic Model of Myopathy Claire Dauly1,

Part of Thermo Fisher Scientific

www.thermoscientific.comLegal Notices: ©2011 Thermo Fisher Scientific Inc. All rights reserved. SEQUEST is a registered trademark of the University of Washington. Swiss-Prot is a registered trademark of SIB Institut Suisse de Bioinformatique Foundation Switzerland. All other trademarks are the property of Thermo Fisher Scientific and its subsidiaries. This information results from not-for-profit research and is presented solely as an example of the capabilities of Thermo Fisher Scientific Inc. products. It is not intended to encourage use of these products in any manners that might infringe the intellectual property rights of others. Specifications, terms and pricing are subject to change. Not all products are available in all countries. Please consult your local sales representative for details.

Thermo Fisher Scientific, San Jose, CA USA is ISO Certified.

In addition to these

offices, Thermo Fisher

Scientific maintains

a network of represen­

tative organizations

throughout the world.

Africa-Other +27 11 570 1840Australia +61 3 9757 4300Austria +43 1 333 50 34 0Belgium +32 53 73 42 41Canada +1 800 530 8447China +86 10 8419 3588Denmark +45 70 23 62 60 Europe-Other +43 1 333 50 34 0Finland/Norway/ Sweden +46 8 556 468 00France +33 1 60 92 48 00Germany +49 6103 408 1014India +91 22 6742 9434Italy +39 02 950 591Japan +81 45 453 9100LatinAmerica +1 561 688 8700Middle East +43 1 333 50 34 0Netherlands +31 76 579 55 55NewZealand +64 9 980 6700Russia/CIS +43 1 333 50 34 0SouthAfrica +27 11 570 1840Spain +34 914 845 965Switzerland +41 61 716 77 00UK +44 1442 233555USA +1 800 532 4752

AN63495_E 10/11S

Conclusion• A complete biomarker discovery workflow including

label-free analysis followed by targeted quantitation of potential biomarkers was applied to the study of MHC-I induced mice developing myopathy.

• PinpointTM software was used to perform targeted quantitation based on full-scan MS measurements from five separately acquired LC-MS/MS runs corresponding to two control samples and three transgenic mice with severe myopathy.

• A total of 230 proteins were quantified at high stringency based on the discovery data.

• Several candidate biomarkers were identified and confirmed with protein expression ratios below 0.5 or above 2.

• Biomarkers could be correlated to metabolic pathways which are known to be associated with the disease.