Proteomics Informatics –
description
Transcript of Proteomics Informatics –
![Page 1: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/1.jpg)
Proteomics Informatics – Protein characterization: post-translational
modifications and protein-protein interactions (Week 10)
![Page 2: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/2.jpg)
Top down / bottom up
Top down
Bottom up
mass/charge
inte
nsity
![Page 3: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/3.jpg)
Top down Bottom up
Charge distribution
mass/chargein
tens
itymass/charge
inte
nsity
1+
2+
3+
4+
27+
31+
![Page 4: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/4.jpg)
Top down Bottom upm = 1035 Da m = 1878 Da m = 2234 Da
Isotope distribution
mass/chargein
tens
itymass/charge
inte
nsity
![Page 5: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/5.jpg)
Fragmentation
Top down Bottom up
Fragmentation
![Page 6: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/6.jpg)
Correlations between modifications
Top down
Bottom up
![Page 7: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/7.jpg)
Alternative Splicing
Top down
Bottom up
Exon 1 2 3
![Page 8: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/8.jpg)
Top down
Kellie et al., Molecular BioSystems 2010
Proteinmass
spectraFragment
mass spectra
![Page 9: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/9.jpg)
Protein Complexes
AB
AC
D
Digestion
Mass spectrometry
![Page 10: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/10.jpg)
Sowa et al., Cell 2009
Protein Complexes – specific/non-specific binding
![Page 11: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/11.jpg)
Protein Complexes – specific/non-specific binding
Choi et al., Nature Methods 2010
![Page 12: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/12.jpg)
Tackett et al. JPR 2005
Protein Complexes – specific/non-specific binding
![Page 13: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/13.jpg)
Analysis of Non-Covalent Protein Complexes
Taverner et al., Acc Chem Res 2008
![Page 14: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/14.jpg)
Non-Covalent Protein Complexes
Schreiber et al., Nature 2011
![Page 15: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/15.jpg)
More / better quality interactions
Affinity Capture Optimization Screen
+
Cell extraction
Lysate clearance/Batch Binding
Binding/Washing/Eluting
SDS-PAGE
Filtration
LaCava, Hakhverdyan, Domanski, Rout
![Page 16: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/16.jpg)
Over 20 different extraction and washing conditions ~ 10 years or art.(41 pullouts are shown)
Molecular Architecture of the NPC
Actual model Alber F. et al. Nature (450) 683-694. 2007 Alber F. et al. Nature (450) 695-700. 2007
![Page 17: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/17.jpg)
Cloning nanobodies for GFP pullouts
• Atypical heavy chain-only IgG antibody produced in camelid family – retain high affinity for antigen without light chain
• Aimed to clone individual single-domain VHH antibodies against GFP – only ~15 kDa, can be recombinantly expressed, used as bait for pullouts, etc.
• To identify full repertoire, will identify GFP binders through combination of high-throughput DNA sequencing and mass spectrometry
VHH clone for recombinant expression
![Page 18: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/18.jpg)
Cloning llamabodies for GFP pullouts
Llama GFP immunization
Lymphocytetotal RNA
Crude serum
VHH amplicon
454 DNA sequencing
RT / Nested PCRIgG fractionation &
GFP affinity purification
VHH DNA sequence library
GFP-specific VHH fraction
LC-MS/MS
GFP-specific VHH clones
Bone marrow aspiration Serum bleed
500400300
1000 bp
0
100,000
200,000
300,000
400,000
500,000
No.
of R
eads
Read length (bp)
VH
VHH
Fridy, Li, Keegan, Chait, Rout
![Page 19: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/19.jpg)
CDR3: 100.0% (14/14); combined CDR: 100.0% (33/33); DNA count: 10MAQVQLVESGGGLVQAGGSLRLSCVASGRTFSGYAMGWFRQTPGREREAVAAITWSAHSTYYSDSVKDRFTISIDNTRNTGYLQMNSLKPEDTAVYYCTVRHGTWFTTSRYWTDWGQGTQVTVS
CDR3: 100.0% (14/14); combined CDR: 72.7% (24/33; DNA count: 1MADVQLVESGGGLVQSGGSRTLSCAASGRVLATYHLGWFRQSPGREREAVAAITWSAHSTYYSDSVKGRFTISIDNARNTGYLQMNSLKPEDTAVYYCTVRHGTWFTVSRYWTDWGQGTQVTVS
CDR3: 100.0% (14/14); combined CDR: 72.7% (24/33); DNA count: 1MAQVQLVESGGALVQAGASLSVSCAASGGTISKYNMAWFRRAPGREREAVAAITWSAHSTYYSDSVKDRFTISIDNTRNTGYLQMNSLKPEDTAVYYCTVRHGTWFTTSRYWTDWGQGTQVTVS
CDR3: 100.0% (14/14); combined CDR: 42.4% (14/33); DNA count: 1MAQVQLEESGGGLVQAGDSLTLSCSASGRTFTNYAMAWSRQAPGKERELLAAIDAAGGATYYSDSVKGRFTISIDNTRNTGYLQMNSLKPEDTAVYYCTVRHGTWFTTSRYWTDWGQGTQVTVS
CDR3: 100.0% (14/14); combined CDR: 42.4% (14/33); DNA count: 1MAQVQLVESGGGRVQAGGSLTLSCVGSEGIFWNHVMGWFRQSPGKDREFVARISKIGGTTNYADSVKGRFTISIDNTRNTGYLQMNSLKPEDTAVYYCTVRHGTWFTTSRYWTDWGQGTQVTVS
CDR1 CDR2 CDR3
Underlined regions are covered by MS
Rank sequences according to:CDR3 coverage; Overall coverage;Combined CDR coverage; DNA counts;
Identifying full-length sequences from peptides
![Page 20: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/20.jpg)
Sequence diversity of 26 verified anti-GFP nanobodies
• Of ~200 positive sequence hits, 44 high confidence clones were synthesized and tested for expression and GFP binding: 26 were confirmed GFP binders.
• Sequences have characteristic conserved VHH residues, but significant diversity in CDR regions.
FR1 CDR1 FR2 CDR2 CDR3FR3 FR4
![Page 21: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/21.jpg)
HIV-1
gp120Lipid Bilayer gp41
MA
CA
NC
PRIN
RT
RNA
Particle
Genome
env
rev
vpu
tat
nef
3’ LTR5’ LTR
vif gagpol
vpr
CAMA NC p6
PR RT IN
gp41gp120
9,200 nucleotides
![Page 22: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/22.jpg)
Genetic-Proteomic Approach
Tagged Viral Protein
Tag
Protein ComplexSDS-PAGE
*
Mass Spectrometry
![Page 23: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/23.jpg)
I-Dirt for Specific Interaction
3xFLAG Tagged HIV-1 WT HIV-1
Infection
Light Heavy (13C labeled Lys, Arg)
1:1 Mix
Immunoisolation
MS
I-DIRT = Isotopic Differentiation of Interactions as Random or Targeted
Lys Arg(+6 daltons)(+6 daltons)
Modified from Tackett AJ et al., J Proteome Res. (2005) 4, 1752-6.
![Page 24: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/24.jpg)
IDIRT and Reverse IDIRT
0.40
0.50
0.60
0.70
0.80
0.90
1.00
0.40 0.50 0.60 0.70 0.80 0.90 1.00
Spec
ifici
ty, R
erve
rse
Specificity, Forward
gp160 IDIRT: Forward-Reverse Ratio Comparison
Env-3xFLAG Vif-3xFLAG
Luo, Jacobs, Greco, Cristae, Muesing, Chait, Rout
![Page 25: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/25.jpg)
Protein Exchange
Vif-3F
Heavy labeled Vif-3F lysate
IP in heavy labeled Vif-3F lysate
Vif-3F
Light labeled wt lysate
Incubation with light labeled wt lysate
Vif-3F
15min
Vif-3F
5min
Stable Interactor
Vif-3F
Interactor with fast exchange
60min
![Page 26: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/26.jpg)
Env Time Course SILAC
• Differentially labeled infection harvested at early or late stage of infection
• Distinguish proteins that interact with Env at early or late stage during infection
Early during infection Late during infection
LightHeavy (13C labeled Lys, Arg)
1:1 Mix
Immunoisolation
MS
Early interactor Late interactor
![Page 27: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/27.jpg)
M/Z
PeptidesFragments
Fragmentation
ProteolyticPeptides
Enzymatic Digestion
ProteinComplex
Chemical Cross-Linking
MS
MS/MS
Isolation
Cross-LinkedProtein Complex
Interaction Partners by Chemical Cross-Linking
![Page 28: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/28.jpg)
M/Z
PeptidesFragments
Fragmentation
ProteolyticPeptides
Enzymatic Digestion
ProteinComplex
Chemical Cross-Linking
MS
MS/MS
Isolation
Cross-LinkedProtein Complex
Interaction Sites by Chemical Cross-Linking
![Page 29: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/29.jpg)
Cross-linking
protein
n peptides with reactive groups
(n-1)n/2 potential ways to cross-link peptides pairwise+ many additional uninformative forms
Protein A + IgG heavy chain 990 possible peptide pairs
Yeast NPC ˜106 possible peptide pairs
![Page 30: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/30.jpg)
Protein Crosslinking by Formaldehyde
~1% w/v Fal20 – 60 min
~0.3% w/v Fal5 – 20 min1/100 the volume
LaCava
![Page 31: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/31.jpg)
Protein Crosslinking by Formaldehyde
RED: triplicate experiments, FAl treated grindateBLACK: duplicated experiments, FAl treated cells (then ground)
SCORE: Log Ion Current / Log protein abundance Akgöl, LaCava, Rout
![Page 32: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/32.jpg)
Cross-linkingMass spectrometers have a limited dynamic range and it therefore important to limit the number of possible reactions not to dilute the cross-linked peptides.
For identification of a cross-linked peptide pair, both peptides have to be sufficiently long and required to give informative fragmentation.
High mass accuracy MS/MS is recommended because the spectrum will be a mixture of fragment ions from two peptides.
Because the cross-linked peptides are often large, CAD is not ideal, but instead ETD is recommended.
![Page 33: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/33.jpg)
0
0.2
0.4
0.6
0.8
1
1.2
0 5 10 15 20 25Number of fragment ions
Pro
babi
lity
of L
ocal
izat
ion
Phosphopeptide identification
mprecursor = 2000 DaDmprecursor = 1 DaDmfragment = 0.5 DaPhosphorylation
Localization of modifications
![Page 34: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/34.jpg)
0
0.2
0.4
0.6
0.8
1
1.2
0 5 10 15 20 25
Prob
abili
ty o
f Loc
aliz
atio
n
Number of fragment ions
ID3
Localization (dmin=3)
mprecursor = 2000 DaDmprecursor = 1 DaDmfragment = 0.5 DaPhosphorylation
dmin>=3 for 47% of human tryptic peptides
Localization of modifications
![Page 35: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/35.jpg)
0
0.2
0.4
0.6
0.8
1
1.2
0 5 10 15 20 25
Prob
abili
ty o
f Loc
aliz
atio
n
Number of fragment ions
ID32
Localization (dmin=2)
mprecursor = 2000 DaDmprecursor = 1 DaDmfragment = 0.5 DaPhosphorylation
dmin=2 for 33% of human tryptic peptides
Localization of modifications
![Page 36: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/36.jpg)
0
0.2
0.4
0.6
0.8
1
1.2
0 5 10 15 20 25
Prob
abili
ty o
f Loc
aliz
atio
n
Number of fragment ions
ID321
Localization (dmin=1)
mprecursor = 2000 DaDmprecursor = 1 DaDmfragment = 0.5 DaPhosphorylation
dmin=1 for 20% of human tryptic peptides
Localization of modifications
![Page 37: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/37.jpg)
0
0.2
0.4
0.6
0.8
1
1.2
0 5 10 15 20 25
Prob
abili
ty o
f Loc
aliz
atio
n
Number of fragment ions
ID3211*
Localization(d=1*)
mprecursor = 2000 DaDmprecursor = 1 DaDmfragment = 0.5 DaPhosphorylation
Localization of modifications
![Page 38: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/38.jpg)
Peptide with two possible modification sites
Localization of modifications
![Page 39: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/39.jpg)
Peptide with two possible modification sites
MS/MS spectrum
m/z
Inte
nsity
Localization of modifications
![Page 40: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/40.jpg)
Peptide with two possible modification sites
MS/MS spectrum
m/z
Inte
nsity
Matching
Localization of modifications
![Page 41: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/41.jpg)
Peptide with two possible modification sites
MS/MS spectrum
m/z
Inte
nsity
Matching
Which assignment doesthe data support?
1, 1 or 2, or 1 and 2?
Localization of modifications
![Page 42: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/42.jpg)
AAYYQK
Visualization of evidence for localization
AAYYQK
![Page 43: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/43.jpg)
Visualization of evidence for localization
AAYYQK
AAYYQK
![Page 44: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/44.jpg)
Visualization of evidence for localization
3
2
1
3
2
1
![Page 45: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/45.jpg)
Estimation of global false localization rate using decoy sites
By counting how many times the phosphorylation is localized to amino acids that can not be phosphorylated we can estimate the false localization rate as a function of amino acid frequency.
0
0.005
0.01
0.015
0.02
0 0.05 0.1 0.15
0
0.005
0.01
0.015
0.02
0 0.05 0.1 0.15
Amino acid frequency
Fals
e lo
caliz
atio
n fr
eque
ncy
Y
![Page 46: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/46.jpg)
S21
Sm1
How much can we trust a single localization assignment?
If we can generate the distribution of scores for assignment 1 when 2 is the correct assignment, it is possible to estimate the probability of obtaining a certain score by chance for a given peptide sequence and MS/MS spectrum assignment.
SS mm21
0
2
1
21
2
0
2
1
21
2
2
1
1
dSSFdSSFp
S m
)(
)(
1.
2.
![Page 47: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/47.jpg)
Is it a mixture or not?If we can generate the distribution of scores for assignment 2 when 1 is the correct assignment, it is possible to estimate the probability of obtaining a certain score by chance for a given peptide sequence and MS/MS spectrum assignment.
S12
Sm2
SS mm21
0
12
12
1
0
12
12
11
2)(
)(2
dSSF
dSSFp
Sm
1.
2.
![Page 48: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/48.jpg)
ppppthth
and1
2
2
1 1 and 2 pppp
ththand
1
2
2
1 1 pppp
ththand
1
2
2
1
ppppthth
and1
2
2
1 1 or 2Ø )( ppSS mm
1
2
2
121
Peptide with two possible modification sites
MS/MS spectrum
m/zIn
tens
ity
Matching
Which assignment doesthe data support?
1, 1 or 2, or 1 and 2?
Localization of modifications
![Page 49: Proteomics Informatics –](https://reader035.fdocuments.in/reader035/viewer/2022062323/568166f4550346895ddb52a3/html5/thumbnails/49.jpg)
Proteomics Informatics – Protein characterization: post-translational
modifications and protein-protein interactions (Week 10)