Introduction to Proteomics - Metabolomicsmetabolomics.se/sites/default/files/courses_files/Intro to...
Transcript of Introduction to Proteomics - Metabolomicsmetabolomics.se/sites/default/files/courses_files/Intro to...
1
Introduction to Proteomics
Åsa Wheelock, Ph.D.Division of Respiratory Medicine &
Karolinska Biomics [email protected]
In: Systems Biology and the Omics Cascade, Karolinska Institutet, June 9-13, 2008
2
Focus of course: Tools for data analysis
Your analysis is no better than data you have collected...
The goals of this proteomics overview:• Understand possibilities & limitations • Pros and cons of different method• Sources of variance in proteomics• Take advantage of proteomics core facilities • Perform proteomics collaborations • Write a short research proposal in
proteomics
3
Proteomics publications in Pubmed
0
500
1000
1500
2000
2500
3000
1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006
First ”proteomics”publication
4
Why Proteins?!?• Business end of the cell• Detailed information with limited efforts
– As compared to metabolomics
TRANSCRIPTOMICS
PROTEOME
METABOLOME
Limited info from mRNA
Detailed infoRobust technology
Techincally challenging
• Relatively robust methods available
5
Proteomics Methodology• No ”protein PCR”
– 4 nucleotides vs 20+ amino acids– Post-translational modifications (PTM)
3 MAIN PROTEOMICS PLATFORMS• Gel based methods • Shotgun methods (mass spec-based)
”chromatography-based”, ”gel-free”• Array based (antibody based)
6Klose, J. 1975. Humangenetic 26, 231-43O’Farrel, P. 1975. J. Biol.Chem, 250, 4007-21
Gel based: 2-Dimensional Electrophoresis
Isoelectric point (pI)
Net
cha
rge
3 4 5 6 7 8 9 10 11 pH
+2+1
0-1-2
log
Mw
mobility
Molecular weight
SEPARATION VISUALISATION
QUANTIFICATION
Stoichiometric protein stain => 3rd dimension
7
Image acquisition
Image acquisition using fluorescent scanner
8
Quantitative image analysis
1. Detect spots
2. Match spots across gels
3. Quantify spot volumes
Pixel intensity => 3rd dimensionSpot volume = protein quantity
9
Protein identification
⇒ Mass spectrometry(MALDI-TOF/TOF)
Trypsin digestion
⇒ DATABASE SEARCH => IDENTIFICATION(Swiss-prot, EnSemble) (statistical probability)
10
Protein Identification
Protein Trypsine digestion Peptides EXPERIMENTAL Peptide masses
Protein database
DTHKSEIAHRFKDLGEEHFKGLVLIAFSQYLQQCPF DEHVKLVNELTEFAKTCVADESHAGCEKSLHTLFGDELCKVASLRET
Virtual digest
MS analysis
EXPECTED Peptide masses
Statistical matching
11
Peptide mass mapping
LHTLFGDR
MS/MS analysis=> sequence information
MS analysis=> peptide masses
Statistical matching
DLGEEHFK database search
DTHKSEIAHRFKDLGEEHFKGLVLIAFSQYLQQCPF DEHVKLVNELTEFAKTCVADESHAGEKSLHTLFGRELCKVASLRET
Homology search Validate statistical hit
12
Shotgun vs. Gel-based proteomics
Adapted from Patterson and Aebersold, Nature Genetics 2003, 33:311-23. Fig. 3
extract
Separate Quantify
Separate(LC)
Digest
Digest
MS/MS
MS/MS
(2DE)
ID
Quantify
13
Semi-quantitative proteomics
Both 2DE and MS-based methods NOT quantitative by nature
Co-separation: 2 samples => ratiosTags => Semi-quantitative proteomics
14
Shotgun isotope-tagging :Isotope coded affinity tag (ICAT)
15
Multiplexing in 2DE: DIGEMultiplexing in 2DE: DIGE-- Differential Gel ElectrophoresisDifferential Gel Electrophoresis
Protein abundance = 532/633 signal ratio
Co-separation by 2DE
Direct ratiometric normalization
Spot quantification
Statistical analysis
Protein visualization (super-imposable images)
λex 633nm ⇒ Cy5λex 532nm ⇒ Cy3
Treated sample: Covalently labeled (Cy3)
Control sample: Covalently labeled (Cy5)
16
Semi-quantitative proteomics
Both 2DE and MS-based methods NOT quantitative by nature
Co-separation: 2 samples => ratiosTags => Semi-quantitative proteomics
Pooled internal standard + 2-3 samples => Relative quantification
17
Internal Standards in 2DE: DIGEInternal Standards in 2DE: DIGE
Protein abundance = relative to IS
Co-separation by 2DE
Direct ratiometric normalization
Spot quantification
Statistical analysis
Protein visualization (super-imposable images)
λex 633nm ⇒ Cy5λex 532nm ⇒ Cy3
Treated Sample: Covalently labeled (Cy3)
Control Sample:Covalently labeled (Cy5)
λex 488nm ⇒ Cy2
Pooled Internal standard (Cy2)
18
Proteomics in Pubmed
0
500
1000
1500
2000
2500
3000
1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006
proteomics 2DE
First ”proteomics”publication
First DIGE-method
published
19
Sample A
Trypsin digestion (peptides)
31114
PRG
30
115
PRG
29
116
PRG
28
117
PRG
Multiplexing in MS: iTRAQ- isobaric Tag for Relative and Absolute Quantitation
Sample B Sample C Sample D
a b c diTRAQ labelling
MS/MS m/z
ID
Quantification
20
Differential labelling opens up new possibilities
• Cysteine oxidative states
• Identify peptides on plasma membrane surface
• Cellular re-localization
21
2D or not 2D?2D or not 2D?Gel-based methods: 2-D electrophoresis
+ soluble proteins
+ post-translational modifications
22
Post-translational modifications”Spot trains”
Intact proteins
23
2D or not 2D?2D or not 2D?Gel-based methods: 2-D electrophoresis
+ soluble proteins
+ post-translational modifications
- technical variance, time consuming
MS-based (Gel-free) methods: ICAT, iTRAQ
+ membrane proteins
+ low abundance proteins
24
Extremes of physiochemical properties: Peptides
• Charge
- pI range from 3-12
• Size
- Mw range of 5 – 500,000 kDa
• Hydrophobicity
- membrane proteins
25
2D or not 2D?2D or not 2D?Gel-based methods: 2-D electrophoresis
+ soluble proteins
+ post-translational modifications
- technical variance, time consuming
MS-based (Gel-free) methods: ICAT, iTRAQ
+ membrane proteins
+ low abundance proteins
- expensive, data intense
26
Shotgun approcahes and gel-based approaches complementary
No ”true” proteomics technique yet
SHOTGUNGEL-BASED
27
日本Japan
103
100 101 102 103 104 105 106 107 108 109 1010 1011 1012
COPIES of each
PROTEIN
Osaka 大阪
Kyoto 京都
Kobe 神戸
104
1012
DYNAMIC RANGE
28
- 400 reported PTMs
Protein copy numbers in blood (log10)0 1 2 3 4 5 6 7 8 9 10 11 12
Phos
phor
ylat
ion
site
s
7
6
5
4
3
2
1
Post-translational Modifications (PTMs)
29
Variance in 2DE• BIOLOGICAL VARIANCE• Experimental variance
– Pre-fractionation, isolation & labelling of proteins – Protein staining
• Technical variance– Gel-to-gel variation in 2DE– Image acquisition (scanner)
• Post-experimental variance– Software-induced variance– User dependant variance
30
Variance in 2DE• BIOLOGICAL VARIANCE•• Experimental varianceExperimental variance
– Pre-fractionation, isolation & labelling of proteins – Protein staining
• Technical variance– Gel-to-gel variation in 2DE– Image acquisition (scanner)
• Post-experimental variance– Software-induced variance– User dependant variance
31
Remember that variance adds up:Multiple-step method is not your friend...
10% 40%
32
Variance in 2DE• BIOLOGICAL VARIANCE• Experimental variance
– Pre-fractionation, isolation & labelling of proteins – Protein staining
• Technical variance– Gel-to-gel variation in 2DE– Image acquisition (scanner)
• Post-experimental variance– Software-induced variance– User dependant variance
33
Technical variance in 2DE
Solubilization Rehydration Isoelectric focusing
SDS-PAGELoad IPG strip Protein visualization
Recovery?
Recovery?
Inhomogenouselectric field?
Gel-to-gel variations?
-SH modifications?
Staining artifacts?
Backgroundfluorescence?
Biological variance
34
Tools to reduce variance
Technical variance• Internal standard:
– DIGE
• Software algorithms:– Background subtraction– Normalization
35
Dynamic range of scanner
16 bit pixel resolution (216 ∼ 65,000 ~ 105)Make sure you are using the entire range!
36
Variance in 2DE• BIOLOGICAL VARIANCE• Experimental variance
– Pre-fractionation, isolation & labelling of proteins – Protein staining
• Technical variance– Gel-to-gel variation in 2DE– Image acquisition (scanner)
• Post-experimental variance– Software-induced variance– User dependant variance
37
2DE analysis software
• Main purpose: match and quantify spots• Normalization: reduce gel-to-gel variation• Background subtraction:
– Reduce background noise– Increase signal/noise ratio– Increase sensitivity
38
39
40
Global Background SubtractionGlobal Background Subtraction
PDQuestPDQuest: Floating/Rolling Ball: Floating/Rolling Ball
41
Software induced variance Software induced variance
Expression; No background subtraction
0
25
50
75
100
1 51 101 151 201 251 301 351 401 451
CV
(%) n
=5
Expression; No background subtraction
0
25
50
75
100
1 51 101 151 201 251 301 351 401 451
CV
(%) n
=5
Average CV=4.6%Average CV=10%
PDQuest PG200PD Quest; Power mean
0
25
50
75
100
1 51 101 151 201 251 301 351
CV
(%) n
=5
PD Quest; Power mean
0
25
50
75
100
1 51 101 151 201 251 301 351
CV
(%) n
=5
Software variance up to 30% of technical variance
42
Applications of proteomicsApplications of proteomicsBIOMARKER DISCOVERY• Biomarker of disease & susceptibility
CLINICAL APPLICATIONS• Pharmaceutical target identification• Improved diagnostics
MECHANISTIC STUDIES• Protein-protein interactions• Protein adduction /Altered protein expression• Hypothesis generation: avoid local ”maxima”• Systems Biology
43
Proteomics in the future
• Improved sensitivity– Currently: scratching the surface– laser capture microdissection
• Protein microarrays– Antibody arrays (e.g. for cytokines)– Tissue microarrays (Peter Nilsson, Friday)
• In vivo subcellular localization assays• Protein amplicifation method?
– i.e. ”protein-PCR”
44
Focus on INTERPRETINGINTERPRETING data,not on ACQUIRING ACQUIRING data.
Pathway Analysis• Integrate data from omics cascade• Integrate heatmap with biological pathways
Proteomics in the NEAR future...
45
Preview of coming attractionsPreview of coming attractions……KEGG & KEGG & KegArrayKegArray
46
Take home messages...
...keep your variance downand your dynamic range up!
...keep your false positives down,and your power up!