Date posted: 21-Dec-2015
PROTEOMICS
De novo sequence prediction for: nsi78_11.1803.1806.2.dta

Sequence            Absolute Probability   Relative Probability
CRGSVNFP[PL]FK      3.9%                   36.3%
CRGSVN[DE][PL]FK    2.3%                   24.7%
CRGSVPFN[PN]FK      6.1%                   17.2%
CRGSV[SR]D[PL]FK    3.1%                   6.5%
CRGSVPFNWGDK        <0.1%                  2.7%
Genomics: DNA (gene)
  ↓ Transcription
Transcriptomics: RNA
  ↓ Translation
Proteomics: PROTEIN
  ↓ Enzymatic reaction
Metabolomics: METABOLITE
Functional Genomics
The “omics” nomenclature…
Gen~ome, Transcript~ome, Prote~ome, Metabol~ome = sequence of a complete set of genes, transcripts, proteins, metabolites
Gen~omics, Prote~omics = analysis of the genome, proteome
A few definitions…
Current -omics
The proteome is defined as the set of all expressed proteins in a cell, tissue or
organism (Wilkins et al., 1997).
Proteomics can be defined as the systematic analysis of proteins for their
identity, quantity and function.
Proteome                           Genome
Dynamic                            Static
No amplification possible          Amplification possible
Heterogeneous molecules            Homogeneous molecules
Large variability of the amount    No variability of the amount
Complexity of the proteome
Applications of Proteomics
• Mining: identification of proteins (catalog the proteins)
• Quantitative proteomics: defining the relative or absolute amount of a protein
• Protein-expression profile: identification of proteins in a particular state of the organism
• Protein-network mapping: protein interactions in living systems
• Mapping of protein modifications: how and where proteins are modified
Protein classes for analysis
• Membrane
• Soluble proteins
• Organelle-specific
• Chromosome-associated
• Phosphorylated
• Glycosylated
• Multi-protein complexes
General flow for proteomics analysis
SEPARATION
IDENTIFICATION
Current Proteomics Technologies
• Proteome profiling/separation
– 2D SDS PAGE (two-dimensional sodium dodecyl sulphate polyacrylamide gel electrophoresis)
– 2-D LC/LC (LC = liquid chromatography)
– 2-D LC/MS (MS = mass spectrometry)
• Protein identification
– Peptide mass fingerprint
– Tandem mass spectrometry (MS/MS)
• Quantitative proteomics
– ICAT (isotope-coded affinity tag)
– SILAC (stable isotope labeling by amino acids in cell culture)
The first dimension (separation by isoelectric focusing)
- gel with an immobilised pH gradient
- electric current causes charged proteins to move until they reach their isoelectric point (the position in the pH gradient where their net charge is 0)
2D-PAGE gel
Isoelectric point (pI)
• Separation by charge:
[Figure: gel strip with a stable pH gradient, pH 4 to 10]
High pH: protein is negatively charged
Low pH: protein is positively charged
At the isoelectric point the protein has no net charge and therefore no longer migrates in the electric field.
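The charge behaviour described above can be sketched numerically. The following is a minimal illustration, not a validated pI predictor: it assumes an unfolded peptide, uses approximate textbook-style pKa values (an assumption; published pKa scales differ), and the function names `net_charge` and `isoelectric_point` are hypothetical. It sums Henderson-Hasselbalch charges over the ionisable groups and finds the pH where the net charge crosses zero by bisection.

```python
# Approximate pKa values for ionisable groups. These are illustrative
# assumptions; different published scales give slightly different numbers.
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0, "N_TERM": 9.0}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1, "C_TERM": 2.0}


def net_charge(sequence: str, ph: float) -> float:
    """Net charge of an unfolded peptide at a given pH."""
    charge = 0.0
    # Basic groups carry +1 when protonated (dominant below their pKa).
    for group, pka in PKA_POSITIVE.items():
        count = 1 if group == "N_TERM" else sequence.count(group)
        charge += count / (1.0 + 10 ** (ph - pka))
    # Acidic groups carry -1 when deprotonated (dominant above their pKa).
    for group, pka in PKA_NEGATIVE.items():
        count = 1 if group == "C_TERM" else sequence.count(group)
        charge -= count / (1.0 + 10 ** (pka - ph))
    return charge


def isoelectric_point(sequence: str, low: float = 0.0, high: float = 14.0) -> float:
    """Find the pH of zero net charge by bisection.

    Net charge falls monotonically as pH rises, so the sign of the
    charge at the midpoint tells us which half contains the pI.
    """
    for _ in range(60):
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0
```

Below the computed pI the peptide carries a net positive charge and above it a net negative charge, which is why it stops migrating once it reaches that position in the pH gradient.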
The second dimension (separation by mass)
- pH gel strip is loaded onto an SDS gel
- SDS denatures and linearises the proteins (to make movement solely dependent on mass, not shape)
2D-SDS PAGE gel
2D-gel technique example
Peng, J. and Gygi, S.P. (2001) Proteomics: the move to mixtures. J. Mass Spectrom., 36, 1083-1091.
Some limitations of 2DE:
• Limited dynamic range of detection: bias towards highly abundant proteins
• Co-migration of proteins
• Poor separation of some proteins
– Basic proteins (pI > 10)
– Hydrophobic proteins
– Small and large proteins (< 10 kDa; > 150 kDa)
Methods for protein
identification
Mass Spectrometry (MS) Stages
• Introduce sample to the instrument
• Generate ions in the gas phase
• Separate ions on the basis of differences in m/z with a mass analyzer
• Detect ions
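Because the analyzer separates ions by m/z rather than by mass, the same peptide appears at different positions depending on how many protons it carries. A minimal sketch of that relationship for positive-mode, protonated ions follows; the function names `mz` and `neutral_mass` are hypothetical, and other adducts are ignored.

```python
PROTON_MASS = 1.007276  # mass of a proton in Da


def mz(neutral_mass_da: float, charge: int) -> float:
    """m/z of a molecule of the given neutral mass carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_da + charge * PROTON_MASS) / charge


def neutral_mass(mz_value: float, charge: int) -> float:
    """Inverse of mz(): recover the neutral mass from an observed m/z."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return charge * (mz_value - PROTON_MASS)


# A hypothetical 1000 Da peptide observed as 1+, 2+ and 3+ ions.
CHARGE_SERIES = {z: mz(1000.0, z) for z in (1, 2, 3)}
```

For the hypothetical 1000 Da peptide, the singly protonated ion sits just above m/z 1001 while the doubly protonated ion sits just above m/z 501.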
[Diagram: mass spectrometer components. Samples / HPLC → Ionisation method (MALDI, ESI) → Mass analyser → Detector → Data system; vacuum system]
Aebersold, R. and Mann, M. (2003) Mass spectrometry-based proteomics. Nature, 422, 198-207.
Mass spectrometers used in proteomic research
Principles of MALDI-TOF Mass
Spectrometry
Mann, M., Hendrickson, R.C. and Pandey, A. (2001) Analysis of proteins and proteomes by mass spectrometry. Annu Rev Biochem, 70, 437-473.
Electrospray ionisation
ESI
M + RH+ → MH+ + R (in solution)
Protein identification by Peptide Mass fingerprint
• Use MS to measure the masses of proteolytic peptide fragments.
• Identification is done by matching the measured peptide masses to corresponding peptide masses from protein or nucleotide sequence databases.
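The matching step can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not how any particular search engine works: trypsin is modelled with the common rule "cleave after K or R unless the next residue is P", missed cleavages and modifications are ignored, the two database entries are invented sequences, and all function names are hypothetical.

```python
# Monoisotopic residue masses in Da (standard values, rounded to 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.007276


def tryptic_digest(protein: str) -> list[str]:
    """Cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide
    return peptides


def peptide_mass(peptide: str) -> float:
    """Neutral monoisotopic mass: residues plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def count_matches(observed: list[float], protein: str, tol: float = 0.2) -> int:
    """Number of observed [M+H]+ masses explained by the protein's digest."""
    theoretical = [peptide_mass(p) + PROTON for p in tryptic_digest(protein)]
    return sum(any(abs(o - t) <= tol for t in theoretical) for o in observed)


def best_match(observed: list[float], database: dict[str, str], tol: float = 0.2) -> str:
    """Database entry whose theoretical digest explains the most peaks."""
    return max(database, key=lambda name: count_matches(observed, database[name], tol))


# Invented sequences, for illustration only.
TOY_DATABASE = {"protA": "MKTAYIAKQR", "protB": "GSHMLEDPVK"}
```

Real fingerprint searches score matches statistically and allow for missed cleavages and modifications; the peak-counting score here only shows the principle.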
Mass spectrometry – method of separating molecules based on mass/charge ratio
Compare peptide m/z with protein databases
e.g. MALDI-TOF
(trypsin)
Mass spectrometry (MS)
Protein Identification by MS
Library: database of sequences (e.g. SwissProt) → artificially trypsinated → artificial spectra built
Sample: spot removed from gel → fragmented using trypsin → spectrum of fragments generated
MATCH (sample spectrum against library spectra)
MALDI peptide map and identification of a protein. A 116-kDa band was excised and subjected to tryptic digestion in gel.
Mann, M., Hendrickson, R.C. and Pandey, A. (2001) Analysis of proteins and proteomes by mass spectrometry. Annu Rev Biochem, 70, 437-473.
Advantages vs. Disadvantages
Advantages:
• Determination of MW
• High-throughput capability
• Relatively low costs
Disadvantages:
• Ambiguous results are difficult to interpret
• Requires sequence databases for analysis
Limitations can be overcome by peptide sequencing using tandem mass spectrometry
How does protein sequencing work?
• Use tandem MS: two mass analyzers in series with a collision cell in between
• Collision cell: a region where the ions collide with a gas (He, Ne, Ar), resulting in fragmentation of the ion
• Fragmentation of the peptides in the collision cell occurs in a predictable fashion, mainly at the peptide bonds (also at phosphoester bonds)
• The resulting daughter ions have masses that are consistent with the known molecular weights of dipeptides, tripeptides, tetrapeptides…
Ser-Glu-Leu-Ile-Arg-Trp
Collision Cell
Ser-Glu-Leu-Ile-Arg
Ser-Glu-Leu
Ser-Glu-Leu-Ile
Etc…
Peng, J. and Gygi, S.P. (2001) Proteomics: the move to mixtures. J Mass Spectrom, 36, 1083-1091.
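The fragment ladder for the Ser-Glu-Leu-Ile-Arg-Trp example can be sketched numerically. The code below is a minimal illustration that assumes singly charged fragments and cleavage only at peptide bonds. It uses the standard b-ion (N-terminal) and y-ion (C-terminal) nomenclature, which the slide itself does not name; the function names are hypothetical. Reading the gaps between consecutive ladder peaks recovers the residues, with Leu and Ile indistinguishable because they have the same mass.

```python
# Monoisotopic residue masses in Da (standard values, rounded to 5 decimals).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.007276


def b_ions(peptide: str) -> list[float]:
    """Singly charged N-terminal fragments b1 .. b(n-1)."""
    total, ions = PROTON, []
    for aa in peptide[:-1]:
        total += RESIDUE_MASS[aa]
        ions.append(total)
    return ions


def y_ions(peptide: str) -> list[float]:
    """Singly charged C-terminal fragments y1 .. y(n-1)."""
    total, ions = WATER + PROTON, []
    for aa in reversed(peptide[1:]):
        total += RESIDUE_MASS[aa]
        ions.append(total)
    return ions


def read_ladder(ions: list[float], tol: float = 0.01) -> list[str]:
    """Residue(s) whose mass matches each gap between consecutive peaks.

    Isobaric residues are reported together (e.g. "IL"); "?" marks a gap
    that matches no single residue.
    """
    calls = []
    for lighter, heavier in zip(ions, ions[1:]):
        gap = heavier - lighter
        hits = sorted(aa for aa, mass in RESIDUE_MASS.items() if abs(mass - gap) <= tol)
        calls.append("".join(hits) or "?")
    return calls


# One-letter code for Ser-Glu-Leu-Ile-Arg-Trp.
EXAMPLE_PEPTIDE = "SELIRW"
```

For `EXAMPLE_PEPTIDE`, `read_ladder(b_ions(EXAMPLE_PEPTIDE))` gives the residues after the first one, reported as E, then two Leu/Ile positions, then R. Ambiguous positions like these are the kind of thing the bracket notation in de novo sequencing output expresses.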
Schematic of a quadrupole TOF instrument
After traversing a countercurrent gas stream (curtain gas), the ions enter the vacuum system and are focused into the first quadrupole section (q0). They can be mass-separated in Q1 and dissociated in q2. Ions enter the time-of-flight analyzer through a grid and are pulsed into the reflector and onto the detector, where they are recorded. There are 14,000 pulsing events per second.
Mann, M., Hendrickson, R.C. and Pandey, A. (2001) Analysis of proteins and proteomes by mass spectrometry. Annu Rev Biochem, 70, 437-473.
Peptide Fragmentation
Isolates individual peptide fragments for 2nd mass spec – can obtain peptide sequence
Compare peptide sequence with protein
databases
(trypsin)
Tandem Mass Spectrometry
Advantages vs. Disadvantages
Advantages:
• Determination of MW and amino acid sequence
• Detection of posttranslational modifications
• High-throughput capability
Disadvantages:
• High capital costs
• Requires sequence databases for analysis
Tryptic digested proteins → LC (75 µm RP) → 200 nL to MS → Ion trap MS
Peptide: 1. MW 2. Sequence 3. Modification
Coupling of LC and tandem MS
Polypeptides enter the column in the mobile phase…
…the hydrophobic “foot” of each polypeptide adsorbs to the hydrophobic (non-polar) surface of the reverse-phase material (stationary phase), where it remains until…
…the organic modifier concentration rises to a critical concentration and desorbs the polypeptides
Reverse Phase column
[Figure: two chromatogram traces for run RS_Contest_04, 0 to 200 min; x-axis: Time (min); y-axis scaled 0 to 100. Traces: TIC MS (NL: 2.83E9) and Base Peak, m/z = 400.0-2000.0, F: + c Full ms [400.00-2000.00] (NL: 4.22E8)]
Data acquired - Chromatogram
Triple Play
[Figure: chromatogram detail for run RS_Contest_04, RT: 120.99 - 124.07; x-axis: Time (min); y-axis: Relative Abundance. Base Peak m/z = 400.0-2000.0, F: + c Full ms [400.00-2000.00], NL: 1.14E7. Traces labeled m/z = 626.6, 852.3, 872.5, 865.0, 684.0, 774.5 and 1046.1; scan labels 4504, 4507, 4513, 4516, 4519 and 4528]
[Figure: three spectra; x-axis: m/z; y-axis: Relative Abundance (0 to 100)]
1. + c Full ms [400.00-2000.00]
2. + d Z ms [622.30-632.30]; peaks at m/z 626.1, 626.6, 627.1 and 627.7
3. + c d Full ms2 [email protected] [160.00-1890.00]
Triple Play Dynamic Exclusion
Scan 4501
Scan 4502
Scan 4503
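The zoom scan in a Triple Play resolves the isotope peaks of the precursor, and their spacing reveals the charge state: neighbouring isotope peaks differ by about one neutron-equivalent mass, so they sit roughly 1/z apart on the m/z axis. A minimal sketch follows, using the zoom-scan peak labels from these slides as input. The function names are hypothetical, the readings are taken at face value with one-decimal precision, and the lightest labeled peak is assumed to be the monoisotopic one.

```python
ISOTOPE_SPACING = 1.00335  # 13C - 12C mass difference in Da
PROTON = 1.007276          # proton mass in Da


def charge_from_isotopes(peaks: list[float]) -> int:
    """Estimate charge state from the mean spacing of isotope peaks."""
    if len(peaks) < 2:
        raise ValueError("need at least two isotope peaks")
    ordered = sorted(peaks)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    mean_gap = sum(gaps) / len(gaps)
    # Spacing on the m/z axis is about ISOTOPE_SPACING / z.
    return max(1, round(ISOTOPE_SPACING / mean_gap))


def neutral_mass(mz_value: float, charge: int) -> float:
    """Neutral mass of a protonated ion observed at mz_value."""
    return charge * (mz_value - PROTON)


# Peak labels read from the two zoom scans shown in the slides.
ZOOM_622_632 = [626.1, 626.6, 627.1, 627.7]
ZOOM_848_858 = [852.2, 853.1]
```

With these readings the 626 cluster (spacing about 0.5) comes out as doubly charged and the 852 cluster (spacing about 0.9) as singly charged. Under the same assumptions the 626.1 peak corresponds to a neutral mass of about 1250.2 Da.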
[Figure: three spectra; x-axis: m/z; y-axis: Relative Abundance (0 to 100)]
1. + c Full ms [400.00-2000.00]
2. + d Z ms [848.00-858.00]; peaks at m/z 852.2 and 853.1
3. + c d Full ms2 [email protected] [220.00-2000.00]
Triple Play Dynamic Exclusion
Scan 4504
Scan 4505
Scan 4506
2D - LC/MS
Peng, J. and Gygi, S.P. (2001) Proteomics: the move to mixtures. J Mass Spectrom, 36, 1083-1091.
Multidimensional Protein Identification Technology (MudPIT).
Whitelegge JP (2002) Plant proteomics: BLASTing out of a MudPIT. Proc Natl Acad Sci U S A 99: 11564-6.
Koller A, Washburn MP, Lange BM, Andon NL, Deciu C, Haynes PA, Hays L, Schieltz D, Ulaszek R, Wei J, Wolters D, Yates JR, 3rd (2002) Proteomic survey of metabolic pathways in rice. Proc Natl Acad Sci U S A
5: 5.