Date posted: 21-Dec-2015
Proteomics and mass spectrometry
Manimalha Balasubramani
Outline
Mass spectrometers
Protein identification
Quantitative proteomics
Protein-protein interactions
A mass spectrum
[Figure: example mass spectrum with labeled peptide peaks. x-axis: Mass (m/z), 799.0 to 2700.0; y-axis: % Intensity, 0 to 100 (full scale 6.3E+4).]
Basically measures mass
Adapted from google
Components…
Adapted from an Analytical chemistry textbook
Ionization process
MALDI: Matrix Assisted Laser Desorption Ionization
ESI: ElectroSpray Ionization
Nobel prize in Chemistry, 2002
Mass analyzers – several designs
Adapted from Aebersold, R.; Mann, M. Nature 2003, 422, 198-207
GPCL inventory
ABI Voyager DE PRO, walk-up use
ABI 4700 Proteomics Analyzer
Thermoelectron LCQ Deca with Surveyor HPLC
ABI Qstar Elite with Ultimate 3000 HPLC
Bruker micrOTOF with Ultimate 3000 HPLC
Bruker 12 Tesla FTMS with Ultimate 3000 HPLC
Time-of-flight (TOF) analyzers
MALDI TOF: Voyager DE PRO
ESI TOF: Ultimate 3000 with micrOTOF
MALDI TOF - principle
$KE = zeV = \frac{1}{2}mv^2$
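The energy balance above fixes the ion velocity, and therefore the time an ion needs to cross a field-free drift tube. The sketch below works that out for a singly defined m/z value. The accelerating voltage (20 kV) and tube length (1.3 m) are illustrative assumptions, not the settings of any instrument named in these slides.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
DALTON = 1.66053906660e-27   # 1 Da in kg

def flight_time_us(mz, accel_volts=20_000.0, tube_m=1.3):
    """Drift time in microseconds for an ion of the given m/z (Da per charge).

    accel_volts and tube_m are hypothetical values chosen for illustration.
    """
    # zeV = (1/2) m v^2  =>  v = sqrt(2 e V / (m/z))
    velocity = math.sqrt(2.0 * E_CHARGE * accel_volts / (mz * DALTON))
    return tube_m / velocity * 1e6

if __name__ == "__main__":
    for mz in (1000.0, 2000.0, 4000.0):
        print(f"m/z {mz:7.1f}: {flight_time_us(mz):6.2f} us")
```

Flight time grows with the square root of m/z, so an ion four times heavier arrives twice as late.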
MS of serum albumin
MALDI TOF
ESI TOF
Tandem mass spectrometer
MALDI TOF/TOF
MS and MS/MS
Ion Trap
MS, MS2, MS3, ….MSn
Quadrupole-q-TOF
ESI QqTOF
…installation phase….
FT MS
…bottom line…
…Resolution and mass accuracy…
FWHM: full width at half maximum of a peak
Resolution and mass accuracy
$R = \frac{M}{\Delta m}$
R = resolution
M = mass of the peak of interest
Δm = width in daltons of the peak
Δm measured at 50% peak height is the Full Width at Half Maximum (FWHM)
Mass accuracy is measured as a parts per million value
$\mathrm{ppm} = \frac{10^6\,\Delta m}{M} = \frac{10^6}{R}$
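A small worked sketch of the two quantities. For resolution, Δm is the peak width (FWHM); for a mass accuracy check, the same formula is applied with Δm taken as the difference between measured and calculated mass. The function names are illustrative, and the FWHM value is a made-up number. The mass pair (2517.09 measured, 2517.16 calculated) is taken from the Mascot peptide table later in this deck.

```python
def resolution(mass, fwhm):
    """R = M / delta_m, with delta_m the peak width at half maximum (Da)."""
    return mass / fwhm

def ppm_error(measured, calculated):
    """Mass error in parts per million: 1e6 * (measured - calculated) / calculated."""
    return 1e6 * (measured - calculated) / calculated

if __name__ == "__main__":
    # Hypothetical peak: m/z 1000 with a 0.1 Da FWHM
    print(f"R = {resolution(1000.0, 0.1):.0f}")
    # Mr(expt) vs Mr(calc) for TDLNPDNLQGGDDLDPNYVLSSR in the Mascot report
    print(f"error = {ppm_error(2517.09, 2517.16):.1f} ppm")
```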
Outline
Mass spectrometers
Protein identification
Quantitative proteomics
Protein-protein interactions
Peptide Mass Fingerprinting - PMF
Database entry NCBI
From: http://gobi.ym.edu.tw/course/mass/2004-0325.pdf
Informatics
Search engines
Mascot, Matrix Science
Sequest, Thermoelectron
Free-ware
Protein Prospector (http://prospector.ucsf.edu/)
TPP tools (http://tools.proteomecenter.org/TPP.php)
Database searching using MASCOT
Overview of the experiment
Submission of data to MASCOT webserver
1D SDS PAGE of proteins
Adapted from Aebersold, R.; Mann, M. Nature 2003, 422, 198-207
Mass spectrum
[Figure: 4700 Reflector Spec #1 MC=>TR [BP = 1479.9, 15779], with labeled peptide peaks. x-axis: Mass to charge ratio (m/z), 699.0 to 3000.0; y-axis: % Intensity, 0 to 100 (full scale 1.6E+4).]
Peak list: compiled from the mass spectra
Mass list: mass list and intensity
Submitted to the search engine
http://www.matrixscience.com/
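The search engine compares the submitted mass list with peptides predicted from each database sequence. A minimal sketch of that prediction step, assuming the common trypsin rule (cleave C-terminal to K or R, but not before P) and no missed cleavages. The input is the first 50 residues of the creatine kinase B sequence shown later in this deck; the function name is illustrative.

```python
def tryptic_digest(seq):
    """Return (start, end, peptide) tuples, 1-based, for a full tryptic digest.

    Assumed rule: cleave after K or R unless the next residue is P.
    """
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        nxt = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and nxt != "P":
            peptides.append((start + 1, i + 1, seq[start:i + 1]))
            start = i + 1
    if start < len(seq):
        # trailing piece that does not end in K or R
        peptides.append((start + 1, len(seq), seq[start:]))
    return peptides

# Residues 1-50 of creatine kinase B, copied from the sequence slide
CKB_1_50 = "MPFSNSHNALKLRFPAEDEFPDLSAHNNHMAKVLTPELYAELRAKSTPSG"

if __name__ == "__main__":
    for start, end, pep in tryptic_digest(CKB_1_50):
        print(f"{start:3d} - {end:3d}  {pep}")
```

Two of the predicted peptides (14 - 32 and 33 - 43) appear with the same positions in the Mascot report further on.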
Mascot scoring
A frequency factor matrix, F, is created, in which each row represents an interval of 100 Da in peptide mass, and each column an interval of 10 kDa in intact protein mass. As each sequence entry is processed, the appropriate matrix elements $f_{i,j}$ are incremented so as to accumulate statistics on the size distribution of peptide masses as a function of protein mass. The elements of F are then normalised by dividing the elements of each 10 kDa column by the largest value in that column to give the Mowse factor matrix M:
$m_{i,j} = \frac{f_{i,j}}{\max_i f_{i,j}}$
After searching the experimental mass values against a calculated peptide mass database, the score for each entry is calculated according to:
$\mathrm{Score} = \frac{50000}{M_{Prot} \times \prod_n m_{i,j}}$
Where $M_{Prot}$ is the molecular weight of the entry and the product term is calculated from the Mowse factor elements $m_{i,j}$ for each match between the experimental data and peptide masses calculated from the entry.
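The procedure described above can be sketched in a few lines. This is a toy illustration, not Mascot's implementation: it assumes the published MOWSE form Score = 50000 / (M_Prot × product of matched Mowse factors), a fixed ±0.5 Da matching tolerance, and a three-entry database invented for the example (peptide masses borrowed from the report later in this deck).

```python
from collections import defaultdict

PEPTIDE_BIN = 100.0      # Da per row, as described in the text
PROTEIN_BIN = 10_000.0   # Da per column, as described in the text

def mowse_factors(database):
    """database: list of (protein_mass, [peptide_masses]). Returns {(i, j): m_ij}."""
    freq = defaultdict(int)
    for prot_mass, peptides in database:
        j = int(prot_mass // PROTEIN_BIN)
        for pep in peptides:
            freq[(int(pep // PEPTIDE_BIN), j)] += 1
    # normalise each protein-mass column by its largest element
    col_max = defaultdict(int)
    for (i, j), f in freq.items():
        col_max[j] = max(col_max[j], f)
    return {(i, j): f / col_max[j] for (i, j), f in freq.items()}

def mowse_score(prot_mass, peptides, observed, factors, tol=0.5):
    """Score one entry; tol (Da) is an assumed matching tolerance."""
    j = int(prot_mass // PROTEIN_BIN)
    product = 1.0
    for pep in peptides:
        if any(abs(pep - obs) <= tol for obs in observed):
            product *= factors[(int(pep // PEPTIDE_BIN), j)]
    return 50_000.0 / (prot_mass * product)

# Hypothetical toy database
TOY_DB = [
    (42591.0, [1231.61, 1302.72, 1585.83]),
    (42674.0, [1253.58, 1302.72]),
    (15000.0, [1231.61]),
]

if __name__ == "__main__":
    factors = mowse_factors(TOY_DB)
    observed = [1231.61, 1585.80]
    for mass, peps in TOY_DB:
        print(mass, round(mowse_score(mass, peps, observed, factors), 3))
```

Rare peptide masses have small Mowse factors, so matching them shrinks the product and raises the score.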
List of common contaminants
Trypsin autolysis peptides
Matrix peaks
Keratin from skin, hair
Other contaminants
Protein Identification
Adapted from Aebersold, R.; Mann, M. Nature 2003, 422, 198-207
Tandem mass spectrum
http://qbab.aber.ac.uk
Tandem mass spectrum
[Figure: 4700 MS/MS Precursor 1570.7 Spec #1 MC [BP = 175.1, 3106], with labeled fragment peaks. x-axis: Mass (m/z), 69.0 to 1658.0; y-axis: % Intensity, 0 to 100 (full scale 3105.9).]
Database Searching
Peptide Mass Fingerprinting
Sequence tag approach
De novo sequencing: inspect raw data (http://qbab.aber.ac.uk)
Tandem mass spectra (MS/MS) can be used for peptide sequencing
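Sequencing from MS/MS relies on the predictable masses of backbone fragments. The sketch below computes singly charged b and y ion m/z values from standard monoisotopic residue masses, using DLFDPIIEDR (a peptide from the Mascot report in this deck) as input. It assumes unmodified residues and only b/y ions; function names are illustrative.

```python
# Monoisotopic residue masses (Da)
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728

def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[a] for a in seq) + WATER

def fragment_ions(seq):
    """Return (b_ions, y_ions) as lists of singly charged m/z values."""
    b_ions, y_ions = [], []
    for k in range(1, len(seq)):
        b_ions.append(sum(MONO[a] for a in seq[:k]) + PROTON)
        y_ions.append(sum(MONO[a] for a in seq[-k:]) + WATER + PROTON)
    return b_ions, y_ions

if __name__ == "__main__":
    pep = "DLFDPIIEDR"
    b, y = fragment_ions(pep)
    print(f"Mr = {peptide_mass(pep):.2f}")
    print("b:", [round(m, 2) for m in b])
    print("y:", [round(m, 2) for m in y])
```

The y1 ion of a peptide ending in arginine falls near m/z 175.12, which is consistent with the base peak at 175.1 in the MS/MS spectrum shown earlier.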
Mascot Search Results
Search title: SampleSetID: 362, AnalysisID: 567, MaldiWellID: 15790, SpectrumID: 17225, Path=\Mani\102004\New Analysis 1
Database: NCBInr 20040606 (1846720 sequences; 611532004 residues)
Timestamp: 20 Oct 2004 at 14:52:50 GMT
Top Score: 681 for gi|180570, creatine kinase [Homo sapiens]
Probability Based Mowse Score
Score is -10*Log(P), where P is the probability that the observed match is a random event. Protein scores greater than 75 are significant (p<0.05).
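Because the score is a base-10 logarithm of a probability, it converts back directly. A short sketch of the conversion in both directions (function names are illustrative):

```python
import math

def score_to_probability(score):
    """Invert Score = -10 * log10(P)."""
    return 10.0 ** (-score / 10.0)

def probability_to_score(p):
    """Score = -10 * log10(P)."""
    return -10.0 * math.log10(p)

if __name__ == "__main__":
    for s in (75, 681):
        print(f"score {s}: P = {score_to_probability(s):.3g}")
```

Every 10 points of score corresponds to a tenfold drop in the probability that the match is random.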
Accession Mass Score Description
1. gi|180570 42591 681 creatine kinase [Homo sapiens]
2. gi|21536286 42617 681 brain creatine kinase; creatine kinase-B [Homo sapiens]
3. gi|33304149 42730 681 creatine kinase, brain [synthetic construct]
4. gi|125292 42674 568 CREATINE KINASE, B CHAIN (B-CK) [Canis familiaris]
5. gi|180572 42658 538 creatine kinase-B
6. gi|125295 42636 514 CREATINE KINASE, B CHAIN (B-CK)
7. gi|180555 42460 507 creatine kinase-B
8. gi|203476 40598 473 creatine kinase-B
9. gi|31542401 42685 471 creatine kinase, brain [Rattus norvegicus]
10. gi|203474 42699 471 creatine kinase
11. gi|40807002 44540 469 Unknown (protein for IMAGE:5598839) [Rattus norvegicus]
12. gi|47477783 44782 469 Ckb protein [Rattus norvegicus]
13. gi|13096153 42551 441 Chain A, Crystal Structure Of Bovine Retinal Creatine Kinase
14. gi|12852054 42700 427 unnamed protein product [Mus musculus]
15. gi|10946574 42686 427 creatine kinase, brain [Mus musculus]
16. gi|47213348 42953 237 unnamed protein product [Tetraodon nigroviridis]
17. gi|627264 40353 236 creatine kinase (EC 2.7.3.2) isozyme IV - African clawed frog
18. gi|27503418 42214 235 Ckb-prov protein [Xenopus laevis]
19. gi|45384340 42844 209 B-creatine kinase [Gallus gallus]
20. gi|6573489 42713 201 Chain A, Crystal Structure Of Chicken Brain-Type Creatine Kinase
Top hits from Mascot Search – there are multiple accession numbers for the same protein
Search returns a cluster of proteins with the same matching peptides
1. gi|180570 Mass: 42591 Score: 681 creatine kinase [Homo sapiens]
Observed Mr(expt) Mr(calc) Delta Start End Miss Ions Peptide
1232.62 1231.61 1231.61 0.00 87 - 96 0 45 DLFDPIIEDR
1232.62 1231.61 1231.61 0.00 87 - 96 0 ---- DLFDPIIEDR
1254.57 1253.56 1253.58 -0.02 97 - 107 0 ---- HGGYKPSDEHK
1303.70 1302.70 1302.72 -0.02 33 - 43 0 ---- VLTPELYAELR
1303.70 1302.70 1302.72 -0.02 33 - 43 0 54 VLTPELYAELR
1458.70 1457.69 1457.67 0.02 139 - 151 1 ---- GFCLPPHCSRGER
1586.81 1585.80 1585.83 -0.03 157 - 172 0 81 LAVEALSSLDGDLAGR
1586.81 1585.80 1585.83 -0.03 157 - 172 0 ---- LAVEALSSLDGDLAGR
1656.79 1655.79 1655.82 -0.03 367 - 381 0 ---- LEQGQAIDDLMPAQK
1657.80 1656.79 1656.83 -0.04 224 - 236 0 47 TFLVWVNEEDHLR
1657.80 1656.79 1656.83 -0.04 224 - 236 0 ---- TFLVWVNEEDHLR
1848.94 1847.93 1847.97 -0.04 342 - 358 0 ---- LGFSEVELVQMVVDGVK
1864.93 1863.92 1863.97 -0.04 342 - 358 0 ---- LGFSEVELVQMVVDGVK
1964.88 1963.88 1963.92 -0.05 321 - 341 0 ---- GTGGVDTAAVGGVFDVSNADR
1964.88 1963.88 1963.92 -0.05 321 - 341 0 139 GTGGVDTAAVGGVFDVSNADR
2120.98 2119.97 2120.02 -0.05 320 - 341 1 ---- RGTGGVDTAAVGGVFDVSNADR
2120.98 2119.97 2120.02 -0.05 320 - 341 1 27 RGTGGVDTAAVGGVFDVSNADR
2169.91 2168.91 2168.96 -0.05 14 - 32 0 ---- FPAEDEFPDLSAHNNHMAK
2225.06 2224.05 2224.17 -0.12 157 - 177 1 ---- LAVEALSSLDGDLAGRYYALK
2439.08 2438.07 2438.14 -0.07 12 - 32 1 31 LRFPAEDEFPDLSAHNNHMAK
2439.08 2438.07 2438.14 -0.07 12 - 32 1 ---- LRFPAEDEFPDLSAHNNHMAK
2518.10 2517.09 2517.16 -0.07 108 - 130 0 92 TDLNPDNLQGGDDLDPNYVLSSR
2518.10 2517.09 2517.16 -0.07 108 - 130 0 ---- TDLNPDNLQGGDDLDPNYVLSSR
3753.61 3752.60 3752.73 -0.13 97 - 130 1 ---- HGGYKPSDEHKTDLNPDNLQGGDDLDPNYVLSSR
3753.61 3752.60 3752.73 -0.13 97 - 130 1 55 HGGYKPSDEHKTDLNPDNLQGGDDLDPNYVLSSR
4. gi|125292 Mass: 42674 Score: 568 CREATINE KINASE, B CHAIN (B-CK)
Observed Mr(expt) Mr(calc) Delta Start End Miss Ions Peptide
1254.57 1253.56 1253.58 -0.02 97 - 107 0 ---- HGGYKPSDEHK
1303.70 1302.70 1302.72 -0.02 33 - 43 0 ---- VLTPELYAELR
1303.70 1302.70 1302.72 -0.02 33 - 43 0 54 VLTPELYAELR
1458.70 1457.69 1457.67 0.02 139 - 151 1 ---- GFCLPPHCSRGER
1586.81 1585.80 1585.83 -0.03 157 - 172 0 81 LAVEALSSLDGDLAGR
1586.81 1585.80 1585.83 -0.03 157 - 172 0 ---- LAVEALSSLDGDLAGR
1624.76 1623.75 1623.85 -0.10 367 - 381 0 ---- LEQGQAIDDLVPAQK
1848.94 1847.93 1847.97 -0.04 342 - 358 0 ---- LGFSEVELVQMVVDGVK
1864.93 1863.92 1863.97 -0.04 342 - 358 0 ---- LGFSEVELVQMVVDGVK
1964.88 1963.88 1963.92 -0.05 321 - 341 0 ---- GTGGVDTAAVGGVFDVSNADR
1964.88 1963.88 1963.92 -0.05 321 - 341 0 139 GTGGVDTAAVGGVFDVSNADR
2120.98 2119.97 2120.02 -0.05 320 - 341 1 ---- RGTGGVDTAAVGGVFDVSNADR
2120.98 2119.97 2120.02 -0.05 320 - 341 1 27 RGTGGVDTAAVGGVFDVSNADR
2169.91 2168.91 2168.96 -0.05 14 - 32 0 ---- FPAEDEFPDLSAHNNHMAK
2225.06 2224.05 2224.17 -0.12 157 - 177 1 ---- LAVEALSSLDGDLAGRYYALK
2439.08 2438.07 2438.14 -0.07 12 - 32 1 31 LRFPAEDEFPDLSAHNNHMAK
2439.08 2438.07 2438.14 -0.07 12 - 32 1 ---- LRFPAEDEFPDLSAHNNHMAK
2518.10 2517.09 2517.16 -0.07 108 - 130 0 92 TDLNPDNLQGGDDLDPNYVLSSR
2518.10 2517.09 2517.16 -0.07 108 - 130 0 ---- TDLNPDNLQGGDDLDPNYVLSSR
3753.61 3752.60 3752.73 -0.13 97 - 130 1 ---- HGGYKPSDEHKTDLNPDNLQGGDDLDPNYVLSSR
3753.61 3752.60 3752.73 -0.13 97 - 130 1 55 HGGYKPSDEHKTDLNPDNLQGGDDLDPNYVLSSR
Nominal mass (Mr): 42591; Calculated pI value: 5.34
Observed mass & pI: 43 kDa, 6.2-6.27
Creatine kinase - B [Homo sapiens]
Match to: gi|21536286; Score: 681
Sequence Coverage: 46%
1 MPFSNSHNAL KLRFPAEDEF PDLSAHNNHM AKVLTPELYA ELRAKSTPSG
51 FTLDDVIQTG VDNPGHPYIM TVGCVAGDEE SYEVFKDLFD PIIEDRHGGY
101 KPSDEHKTDL NPDNLQGGDD LDPNYVLSSR VRTGRSIRGF CLPPHCSRGE
151 RRAIEKLAVE ALSSLDGDLA GRYYALKSMT EAEQQQLIDD HFLFDKPVSP
201 LLSASGMARD WPDARGIWHN DNKTFLVWVN EEDHLRVISM QKGGNMKEVF
251 TRFCTGLTQI ETLFKSKDYE FMWNPHLGYI LTCPSNLGTG LRAGVHIKLP
301 NLGKHEKFSE VLKRLRLQKR GTGGVDTAAV GGVFDVSNAD RLGFSEVELV
351 QMVVDGVKLL IEMEQRLEQG QAIDDLMPAQ K
Creatine kinase B is the highest scoring protein
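The 46% sequence coverage can be reproduced from the Start - End columns of the peptide table. The sketch below takes the union of the matched residue ranges and divides by the sequence length (381 residues). It assumes the ranges listed for gi|180570 also apply to gi|21536286, which has the same score; the function name is illustrative.

```python
def sequence_coverage(ranges, length):
    """Return (residues covered, percent) for 1-based inclusive (start, end) ranges."""
    covered = set()
    for start, end in ranges:
        covered.update(range(start, end + 1))
    return len(covered), 100.0 * len(covered) / length

# Distinct Start - End pairs from the gi|180570 peptide table
MATCHED = [
    (87, 96), (97, 107), (33, 43), (139, 151), (157, 172), (367, 381),
    (224, 236), (342, 358), (321, 341), (320, 341), (14, 32), (157, 177),
    (12, 32), (108, 130), (97, 130),
]

if __name__ == "__main__":
    n, pct = sequence_coverage(MATCHED, 381)
    print(f"{n} of 381 residues covered ({pct:.0f}%)")
```

Overlapping peptides, such as the missed-cleavage variants, are counted once because coverage is computed on the union of positions.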
Outline
Mass spectrometers
Protein identification
Quantitative proteomics
Protein-protein interactions
Sample preparation
Quantitative Proteomics
From 2D gels ….to MALDI or ESI MS
Control Test
Cy3 Cy5
Pool
Image analysis with Delta2D, Decodon
Quantitate
Export spot list to robotic picker
2nd Dimension – SDS PAGE
1st Dimension - Isoelectric focussing
Spot picking
Trypsin gel digest
…it's high-throughput…
Colorectal cancer markers
[Figure: workflow diagram. Boxes: Isolate Nuclear Matrix; 2D or 1D gel; In-gel tryptic digest; Mass spectral analysis (MS and MS/MS spectra, m/z axes); Database Search; Protein Identified? Yes: Immunohistochemistry, Immunoblotting, Validation. No: de novo sequencing.]
Tumor specific markers: CC3, CC4, CC5, CC6a, CC6b
Balasubramani et al., Cancer Res., 2006
Shotgun proteomics
Adapted from Aebersold, R.; Mann, M. Nature 2003, 422, 198-207
…typical workflow to identify biomarkers that distinguish indolent versus aggressive forms of cancer..
Group A, Indolent Group B, Aggressive
Fractionate (e.g. immunodeplete, subcellular) | Fractionate (e.g. immunodeplete, subcellular)
Tryptic peptides | Tryptic peptides
Label with iTRAQ reagent 115 | Label with iTRAQ reagent 116
Combine labeled digests
LC fractionate
MS and MS/MS
Protein ID and Quantitate
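In the workflow above, relative quantitation comes from the low-mass reporter ions that the two iTRAQ labels release in MS/MS. A minimal sketch of that ratio calculation, assuming reporter peaks near m/z 115.1 and 116.1, a ±0.05 tolerance, and a made-up peak list; none of these values come from the slides.

```python
def reporter_intensity(peaks, target_mz, tol=0.05):
    """Sum intensity of (m/z, intensity) peaks within tol of target_mz."""
    return sum(inten for mz, inten in peaks if abs(mz - target_mz) <= tol)

def itraq_ratio(peaks, num_mz=116.1, den_mz=115.1, tol=0.05):
    """Ratio of reporter intensities, e.g. Group B (116) over Group A (115)."""
    den = reporter_intensity(peaks, den_mz, tol)
    if den == 0:
        raise ValueError("denominator reporter ion not observed")
    return reporter_intensity(peaks, num_mz, tol) / den

# Hypothetical MS/MS peak list: (m/z, intensity)
PEAKS = [(115.108, 2000.0), (116.111, 4000.0), (175.119, 3106.0)]

if __name__ == "__main__":
    print(f"116/115 ratio = {itraq_ratio(PEAKS):.2f}")
```

A protein-level estimate would combine such ratios over all peptides assigned to the protein, for example by taking their median.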
Sample handling
HPLC
1D or 2D LC MALDI
In-solution isoelectric focussing
Protein-protein interaction studies
Immunoaffinity pull-downs
Tandem affinity purification
GPCL
Billy W Day
Paul Wood
Mirunalni Thangavelu
Tamanna Sultana
Emanuel M Schreiber
Chris Bolcato
Chris Myers
Patrick Miller
Robert Wolfe
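Definitions
The amu is defined as 1/12th the mass of one neutral $^{12}_{6}\mathrm{C}$ atom
The amu is also called the dalton
$1\ \mathrm{amu} = \frac{1}{12}\left(\frac{12\ \mathrm{g}\ ^{12}\mathrm{C}/\mathrm{mol}\ ^{12}\mathrm{C}}{6.0221 \times 10^{23}\ \mathrm{atoms}\ ^{12}\mathrm{C}/\mathrm{mol}\ ^{12}\mathrm{C}}\right) = 1.6605 \times 10^{-24}\ \mathrm{g/atom}\ ^{12}\mathrm{C}$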
Isotopic species of M
$(M + H)^{+}$: m/z = $(M + 1H)/1$
$(M + 2H)^{2+}$: m/z = $(M + 2H)/2$
$(M + 3H)^{3+}$: m/z = $(M + 3H)/3$
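The charge-state series above can be computed directly. A short sketch, assuming each added charge is a proton of mass 1.00728 Da (the slide writes this simply as H); the function name is illustrative. The first example uses Mr = 1231.61 from the Mascot report, where the observed singly charged peak was 1232.62.

```python
PROTON = 1.00728  # Da, assumed mass of each added charge carrier

def mz(neutral_mass, charge):
    """m/z of the (M + nH)^n+ ion for a neutral mass M and charge n."""
    return (neutral_mass + charge * PROTON) / charge

if __name__ == "__main__":
    for z in (1, 2, 3):
        print(f"z = {z}: m/z = {mz(1231.61, z):.2f}")
```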