Integrative Proteomics 01/24/2001
Mass Spectrometry for Protein Identification
Brian Cox Integrative Proteomics
Translation elongation factor EF-Tu
Protein Characterization by Mass Spectrometry
• Protein identification
– correlative database matching
• Quality control
– recombinant protein, protein modification reactions
• Post-translational modifications
– phosphorylation, glycosylation, methylation, etc.
• Identification of domains
– functional structural domains, pro-enzyme to enzyme cleavage
Sarcoplasmic Reticulum Terminal Cisternae Fraction
[Figure: annotated gel with lanes S and P. Band labels are listed below in the order extracted. Many further bands are labeled only "SFI", and two groups of bands are bracketed "Similar spectra".]
Myosin; PI3 kinase
TAK-1 (TGF beta activated kinase); sarcalumenin precursor
Diamine oxidase; 6-phosphofructokinase, aconitase
GRP 78; HSP 70, GRP 75
Calsequestrin, SK muscle (labeled twice)
SR 53 kDa, ATP synthase alpha
Glyceraldehyde 3-phosphate dehydrogenase; malate dehydrogenase; L-lactate dehydrogenase
Calretinin
Succinate dehydrogenase
GTP:AMP phosphotransferase
3-hydroxyacyl-CoA dehydrogenase type II
Myosin light chain
Myosin regulatory light chain two
Cytochrome C
Chaperonin 10
Nitric oxide synthase
Calcium transporting ATPase (SERCA1)
6-Phosphofructokinase; glycerol-3-phosphate dehydrogenase; ADP, ATP carrier protein
ATP synthase alpha subunit; ATP synthase alpha/beta subunit
ATP synthase beta subunit
Actin
ADP, ATP carrier protein
Cytochrome C
Complex I-MLRQ
HSP60, rabbit serum albumin
ATP synthase beta; creatine kinase, citrate synthase
Fructose bisphosphate aldolase; aspartate aminotransferase
Actin
Voltage independent pH sensitive K+ channel
3-hydroxyacyl-CoA dehydrogenase type II
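Affinity Chromatography - E. coli NusA
[Figure: gel of fractions from an E. coli NusA affinity column. Lanes are labeled by ligand concentration, [NusA], starting from 0, and by salt elution. Labeled bands: RNA polymerase subunits (subunit symbols lost in extraction), a hypothetical ORF, and SUHB.]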
2D Gel Analysis
Identifying and Mapping Protein Domains by Partial Proteolysis
[Figure: gel of NusA with lanes labeled by trypsin concentration, [Trypsin], starting from 0; the bands are the proteolytic products.]
How Do I Identify Isolated Proteins?
• Genetics
– applicable to simple organisms
– time consuming
• Antibody screening
– identifies only known, usually well characterized proteins
• Edman sequencing
– requires large quantities of sample
• Mass spectrometry
– Correlative mass matching (sequence must be known)
– MS/MS fragment data (sequence)
MS Protein Identification
[Diagram: instruments (MALDI-ToF, triple quad, ion trap, Q-ToF, FT-MS) arranged by sample type. Liquid: time consuming, PTMs, protein sequencing. Solid: high throughput, sensitive.]
MALDI-ToF vs Electrospray: Which do I use?
• Experimental design
– What kind of sample will I have?
• Reason for analysis
– Unknown protein, quality control, or mapping of a known protein
• Species
– Is the genome fully sequenced?
MALDI-ToF MS
[Photos: Bruker Daltonics Reflex III ©; 100 well sample plate; 384 well sample plate.]
MALDI-ToF
• Typically a resolved (isolated) protein
• Protein identified by correlative mass mapping of peptides generated by a known chemistry
– Enzymatic or chemical
• Protein sequence must be in a database for identification
• Identification measured by a statistical matching score
Electrospray MS
Applied Biosystems | MDS SCIEX API 3000™ LC/MS/MS System
Applied Biosystems | MDS SCIEX QSTAR™ Hybrid LC/MS/MS Quadrupole TOF System
Electrospray MS
• Usually requires a chromatographic resolving step, as only a limited number of peptides can be analyzed at one time
• Full cell lysate or sub-fraction can be analyzed
• Generates MS/MS data (fragmentation of a peptide for de novo sequence or small sequence tags)
• Fragmentation can generate enough data that only one or two peptides are required for a positive identification
MALDI-ToF Schematic
[Schematic labels: laser, sample plate, accelerating field, timed ion selector, flight tube, reflector, detectors. Ions are drawn as + symbols.]
In-Gel Sample Preparation
Candidate protein bands are excised, reduced, alkylated, and digested overnight with trypsin.
Tryptic peptides are recovered, purified, and analyzed by MALDI-ToF mass spectrometry.
Reflector MALDI Spectrum
Correlative Database Searching
• Based on the principle that each protein generates its own unique “fingerprint” of peptide masses after cleavage by an enzymatic or chemical method into smaller peptides
• A list of experimental masses can then be compared to lists of calculated masses for each known sequence
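The fingerprint idea can be sketched in a few lines. This is a minimal illustration, not the method of any particular search engine: it assumes trypsin cleaves after K or R except before P, ignores missed cleavages and modifications, and uses standard monoisotopic residue masses. The function names and the example sequence are hypothetical.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (assumed rule)."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if is_last or (aa in "KR" and sequence[i + 1] != "P"):
            peptides.append(sequence[start:i + 1])
            start = i + 1
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def fingerprint(sequence):
    """Sorted list of calculated peptide masses for one protein sequence."""
    return sorted(round(peptide_mass(p), 3) for p in tryptic_digest(sequence))


if __name__ == "__main__":
    example = "AAKGGRPLKFF"  # made-up sequence, for illustration only
    print(tryptic_digest(example))
    print(fingerprint(example))
```

Running `fingerprint` over every sequence in a database gives the lists of calculated masses against which the experimental list is compared.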
• Sequence database used to generate lists of peptide masses based on chemistry rules
– Enzymatic (trypsin) or chemical cleavage (CNBr)
• Parameters of search
– Chemistry of peptide generation, database restrictions (mol. wt., taxa), error tolerance for peptide matching
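To make the search parameters concrete, the sketch below filters a toy database by molecular weight and taxon, then counts observed masses that fall within the error tolerance of each entry's calculated peptide masses. The class names, field names, and all masses are invented for illustration; real search engines score matches more carefully than a simple count.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SearchParams:
    tolerance_da: float = 0.5            # error tolerance for peptide matching
    max_protein_mass: float = 100000.0   # mol. wt. restriction, in Da
    taxon: Optional[str] = None          # None means search all taxa


@dataclass
class Entry:
    name: str
    taxon: str
    protein_mass: float
    peptide_masses: List[float]          # calculated masses for this entry


def count_matches(observed, theoretical, tol):
    """Number of observed masses within +/- tol of any calculated mass."""
    return sum(
        1 for m in observed if any(abs(m - t) <= tol for t in theoretical)
    )


def search(observed, database, params):
    """Return (match count, entry name) pairs, best match first."""
    hits = []
    for entry in database:
        if entry.protein_mass > params.max_protein_mass:
            continue  # database restriction: molecular weight
        if params.taxon is not None and entry.taxon != params.taxon:
            continue  # database restriction: taxa
        n = count_matches(observed, entry.peptide_masses, params.tolerance_da)
        hits.append((n, entry.name))
    hits.sort(key=lambda h: h[0], reverse=True)
    return hits


if __name__ == "__main__":
    db = [  # hypothetical entries
        Entry("A", "E. coli", 55000.0, [1000.6, 1500.5, 2001.0, 800.0]),
        Entry("B", "E. coli", 30000.0, [1000.4, 1700.0]),
        Entry("C", "H. sapiens", 250000.0, [1000.5, 1500.7, 2000.9]),
    ]
    print(search([1000.5, 1500.7, 2000.9], db, SearchParams()))
```

Entry C matches every observed mass but is excluded by the molecular weight restriction, which is the kind of filtering these parameters exist to provide.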
• Mann et al., 1993 used intact mass and peptide masses to identify unknown proteins
• Peptide mass error ranged from 0.5 to several Daltons because of poor resolution; only average masses were used
• Protein with the most matching peptides was ranked first (true mass correlation)
• Problem: large proteins (>100 kDa) generate a large number of peptide masses, and so spurious matches
• A restriction on maximum protein size was used to prevent this
• Only 26,000 protein sequences were available in total
Available number of sequences in NCBInr
Nov. 1999: 415,408
June 2000: 510,935
Jan. 2001: 606,272
• Improvements in instrument mass accuracy and resolution
• Countered by dramatic increase in number of sequences available to search
• Develop new searching algorithms using weighted statistical value of a peptide; a larger mass peptide contains more “unique information”
• Knowledge of distribution frequencies of peptide masses allowed for calculation of probability for a match (Perkins et al. 1999 and Eriksson et al. 2000)
– using database size (number of peptides), error tolerance, number of matching peptides (weighted)
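A rough feel for why the number of peptides and the error tolerance matter can be had from a toy model. The sketch below is not the algorithm of Perkins et al. or Eriksson et al.: it assumes peptide masses are spread uniformly over a mass range (real distributions are not uniform) and applies no weighting. All function and parameter names are hypothetical.

```python
from math import comb


def p_single_random_match(n_theoretical, tol_da, mass_range_da):
    """Chance that one random mass lands within +/- tol of at least one of
    n calculated masses, assuming masses are uniform over the range."""
    p_one = min(1.0, 2.0 * tol_da / mass_range_da)
    return 1.0 - (1.0 - p_one) ** n_theoretical


def p_at_least_k(n_observed, k, p):
    """Binomial tail: chance of k or more random matches among n observed
    masses when each matches independently with probability p."""
    return sum(
        comb(n_observed, i) * p ** i * (1.0 - p) ** (n_observed - i)
        for i in range(k, n_observed + 1)
    )


if __name__ == "__main__":
    # Compare a protein with 20 calculated peptides against one with 200,
    # for 15 observed masses, +/- 0.5 Da tolerance, 3000 Da mass range.
    for n_peptides in (20, 200):
        p = p_single_random_match(n_peptides, 0.5, 3000.0)
        print(n_peptides, p_at_least_k(15, 5, p))
```

Even in this crude model, a protein that yields more calculated peptides is much more likely to collect several matches by chance, and tightening the tolerance lowers that chance.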
• More parameters for calculation of a match
– Correlation coefficient (r²) for matching peptides to error tolerance
– Amino acid composition from chemical tags or MS/MS spectrum
– Calculation of a Z score for the certainty of the match
“…a Z score is estimated when the search result is compared against an estimated random match population. Z score is the distance to the population mean in unit of standard deviation.”
Proteometrics at the 48th ASMS annual meeting
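The quoted definition translates directly into arithmetic. The sketch below assumes the random match population is available as a plain list of scores and uses the sample standard deviation; how a real search engine estimates that population is not described on the slide, and the scores shown are invented.

```python
import statistics


def z_score(search_score, random_scores):
    """Distance of a search score from the mean of a random match
    population, in units of that population's standard deviation."""
    mean = statistics.mean(random_scores)
    sd = statistics.stdev(random_scores)  # sample standard deviation
    return (search_score - mean) / sd


if __name__ == "__main__":
    random_population = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical scores
    print(z_score(10.0, random_population))
```

A larger Z score means the search result sits further out in the tail of what random matching produces.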
Reflector MALDI Spectrum
Proteometrics home page: Profound Search
Protein Prospector homepage: MS-Fit Search
Homologous Proteins
Matching peptide to Xenopus protein
Human protein spectrum
[Figure: matched peptides 1 to 19 plotted against residue number, N to 500. Plot label: trans reg pw ang #36.]
# Mass Matching sequence (± error)
1 3100.3 K[1-26]R (+0.35)
2 3123.3 T[27-52]K (+0.21) G[174-201]R (-0.99)
3 1099.1 K[73-81]K (-0.77) T[420-428]K (-0.81)
4 2754.3 G[147-172]K (-1.06)
5 2625.8 E[190-211]K (+0.95)
6 3407.4 S[229-258]K (-2.75)
7 2349.6 Y[273-293]K (+0.20)
8 1648.6 N[294-307]R (+0.18)
9 1355.4 I[308-319]R (+0.12)
10 1677.8 N[320-334]K (+0.01)
11 2855.4 N[320-344]K (-1.24) N[204-228]K (-1.25)
12 1827.0 A[345-361]R (+0.03) E[197-211]K (-1.07) K[128-143]K (+0.16)
13 2154.4 H[377-395]K (-0.07) K[109-127]K (-1.06)
14 2212.4 H[396-412]R (-0.04)
15 2568.7 N[424-444]R (+0.18)
16 1937.2 V[429-444]R (-0.02)
17 1295.1 E[435-444]R (+0.28)
18 1111.0 D[445-454]R (-0.84)
19 1129.0 N[496-504]K (+0.32)
20 1172.0 No matches
21 1241.2 No matches
22 1257.2 No matches
23 1331.7 No matches
24 1447.2 No matches
25 1513.5 No matches
26 1573.5 No matches
27 1603.2 No matches
28 1625.6 No matches
29 2164.2 No matches
30 2274.6 No matches
31 2528.6 No matches
32 2542.8 No matches
33 2685.3 No matches
34 2813.1 No matches
35 2829.0 No matches
36 2959.6 No matches
37 3180.1 No matches
38 3212.9 No matches
[Figure legend: Xenopus, Human EST. Axis labels: linear sequence of protein; peptide match number.]
MS/MS
• Fragmentation of parent ion (a peptide) to produce sequence-dependent data
• Fragmentation induced in MALDI as post-source decay (PSD)
• In electrospray by CID (collision-induced dissociation)
• Searching done by similar principles
Peptide Fragmentation
P. Roepstorff and J. Fohlman, Biomed. Mass Spec. 11 (1984) 601. K. Biemann, Biomed. Env. Mass Spec. 16 (1988) 99.
Peptide: NH3-D R V Y I H P F H L-COOH
y series (fragments containing the C-terminus): P F H L-COOH; Y I H P F H L-COOH
b series (fragments containing the N-terminus): NH3-D R V; NH3-D R V Y I H
Other fragments shown: R V Y; I H P H
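The b and y series for the peptide above can be calculated from residue masses. This is a minimal sketch assuming singly charged, unmodified fragments and standard monoisotopic masses: a b ion is the sum of its N-terminal residues plus a proton, and a y ion is the sum of its C-terminal residues plus water plus a proton. The function names are illustrative.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def b_ions(peptide):
    """Singly charged b ion m/z values, b1 through b(n-1)."""
    values, total = [], 0.0
    for aa in peptide[:-1]:
        total += RESIDUE_MASS[aa]
        values.append(total + PROTON)
    return values


def y_ions(peptide):
    """Singly charged y ion m/z values, y1 through y(n-1)."""
    values, total = [], WATER
    for aa in reversed(peptide[1:]):
        total += RESIDUE_MASS[aa]
        values.append(total + PROTON)
    return values


if __name__ == "__main__":
    peptide = "DRVYIHPFHL"
    b, y = b_ions(peptide), y_ions(peptide)
    print("b3 (DRV):", round(b[2], 3))
    print("y4 (PFHL):", round(y[3], 3))
```

Differences between consecutive members of one series equal residue masses, which is what lets a sequence be read off a fragment spectrum.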
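Peptide Fragmentation: FYEEVHDLER
[Annotated MS/MS spectrum. B ions read F Y E E V H D from the N-terminus; Y ions read R E L D H V E from the C-terminus. Additional peak labels: HD, VH, D, EV, HD, EE, VH, D.]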
Electrospray
[Figure panels: HPLC UV trace; MS scan (deconvoluted); MS/MS of selected ion.]
• Continuous scanning for peptides above a threshold intensity
• Selected peptides undergo MS/MS
• Return to scan to find next candidate
• Slow sample introduction to increase peak width off the HPLC, allowing more time for MS/MS experiments
• MS/MS process takes less than a few seconds
• Data used for search is a peptide mass plus any sequence tags determined from MS/MS
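The scan, select, fragment, return cycle can be sketched as a simple loop. This is an illustration only, not instrument control software: it assumes each survey scan is a list of (m/z, intensity) pairs, picks the most intense peak above threshold, and skips peaks already fragmented. The exclusion list and its tolerance are assumptions added so that the loop moves on to the next candidate; all names are hypothetical.

```python
def pick_candidates(survey_scan, threshold, already_done, mz_tol=0.5):
    """Peaks at or above threshold that have not been fragmented yet,
    most intense first."""
    candidates = [
        (mz, intensity)
        for mz, intensity in survey_scan
        if intensity >= threshold
        and not any(abs(mz - done) <= mz_tol for done in already_done)
    ]
    candidates.sort(key=lambda peak: peak[1], reverse=True)
    return candidates


def acquire(survey_scans, threshold):
    """Walk through survey scans, selecting at most one precursor per scan
    for MS/MS, then returning to scanning for the next candidate."""
    fragmented = []
    for scan in survey_scans:
        candidates = pick_candidates(scan, threshold, fragmented)
        if candidates:
            precursor_mz, _ = candidates[0]
            fragmented.append(precursor_mz)  # MS/MS would be acquired here
    return fragmented


if __name__ == "__main__":
    scans = [  # made-up survey scans
        [(500.3, 100.0), (622.8, 900.0)],
        [(500.3, 400.0), (622.8, 800.0)],
        [(500.3, 50.0)],
    ]
    print(acquire(scans, threshold=200.0))
```

Wider chromatographic peaks keep a peptide above threshold across more survey scans, which gives the loop more chances to select it.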
Unmatched spectrum
PSD candidates
PSDs of candidate peptides
Coverage of EST
[Figure: matched peptides 1 to 7 plotted against residue number, N to 200. Plot label: New Sequence.]
# Mass Matching sequence (± error)
1 2789.4 G[32-57]K (-0.35)
2 2903.9 F[58-84]K (-0.73)
3 1665.0 V[90-103]R (-0.10)
4 2959.7 A[104-130]R (-0.43)
5 1429.4 D[119-130]R (+0.10)
6 917.1 I[131-137]R (+0.00)
7 4879.0 C[138-182]K (-0.53)
8 861.5 No matches
9 880.2 No matches
10 1245.4 No matches
11 1342.5 No matches
12 1373.4 No matches
13 1404.5 No matches
14 1453.3 No matches
15 1562.8 No matches
16 1686.1 No matches
17 1724.1 No matches
18 2120.7 No matches
19 2164.4 No matches
20 2197.9 No matches
21 2274.9 No matches
22 2752.6 No matches
23 2917.3 No matches
24 2973.7 No matches
25 2987.7 No matches
26 3180.9 No matches
27 3780.4 No matches
28 3979.1 No matches
29 4350.6 No matches
30 4395.7 No matches
31 4495.3 No matches
32 4508.2 No matches
33 4893.1 No matches
[Figure legend: EST, PSD. Axis label: linear sequence of protein.]
Conclusion
• Mass spectrometry will continue to be a powerful tool for the analysis of biopolymers
• Further improvements in instrumentation and computational sciences will increase the utility and versatility of mass spectrometry
• Simplification of the process will lead to greater use and availability of this technology
– e.g. the wide variety of molecular biology kits currently available
Contact Us
Integrative Proteomics is located in Downtown Toronto
For employment inquiries please see our web page at
www.integrativeproteomics.com