MBP1001 Advanced Cell Biology 2010 Proteomics and Mass Spectrometry Brian Raught...



Page 1: MBP1001 Advanced Cell Biology 2010 Proteomics and Mass Spectrometry Brian Raught brian.raught@uhnres.utoronto.ca.

MBP1001

Advanced Cell Biology 2010

Proteomics and Mass Spectrometry

Brian Raught brian.raught@uhnres.utoronto.ca


Proteomics is an extremely powerful and broadly applicable technology

can be used to identify e.g. low stoichiometry PTMs, components of protein complexes, or to characterize all protein components in an organelle, tissue or organism

the key - but poorly understood - technology in this process is mass spectrometry-based peptide sequencing

today’s lecture will provide a brief overview of the approach, followed by some examples of its utility


First step - sample preparation

the goal - simplify

depending upon the goal of your experiment, you will isolate large or small numbers of proteins for analysis

you may subject your protein population to one or more fractionation steps, e.g.

1D SDS-PAGE
2D gel electrophoresis
strong cation exchange liquid chromatography
newer technologies - free flow electrophoresis

you will then convert your protein sample to peptides


Why are peptides (and not proteins) sequenced?

top-down approaches can identify intact proteins, but...

proteins can be difficult to handle, and not all proteins in your sample may be soluble under the same conditions (e.g. membrane-spanning proteins vs DNA binding prots)

proteins are often significantly processed and modified, resulting in many different isoforms, making identification difficult

ion trap mass spectrometers are most efficient at obtaining sequence info from peptides up to ~40aa in length – ID of prots via peptides is bottom-up proteomics


Proteases are used to convert proteins to peptides

trypsin
stable and very active
cleaves on the carboxy-terminal side of K and R residues (except when modified or followed by a P)
results in information rich, easily interpretable peptide fragment spectra

other commonly used proteases
LysC
AspN
GluC

sequence non-specific proteases are generally avoided, since they divide the peptide signal into multiple overlapping species, and thereby generate unnecessarily complex peptide mixtures
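The trypsin rule above can be sketched in a few lines of Python. This is a minimal illustration, not part of the lecture: the function name `tryptic_digest`, the `missed_cleavages` parameter and the example sequence are made up, and the "except when modified" case is not modelled.

```python
def tryptic_digest(protein, missed_cleavages=0):
    """Cut after K or R unless the next residue is P (modified residues are not modelled)."""
    pieces, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            pieces.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        pieces.append(protein[start:])  # C-terminal peptide that does not end in K/R

    # Optionally join neighbouring pieces to mimic incomplete digestion.
    peptides = []
    for first in range(len(pieces)):
        last_allowed = min(first + missed_cleavages, len(pieces) - 1)
        for last in range(first, last_allowed + 1):
            peptides.append("".join(pieces[first:last + 1]))
    return peptides


example = "AGKPLMRDEEKTTR"  # made-up sequence; the K at position 3 is followed by P
print(tryptic_digest(example))
print(tryptic_digest(example, missed_cleavages=1))
```

Because every cut site is defined by the sequence, each protein yields one predictable set of peptides, which is what makes the later database search possible.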


How are peptides introduced into the mass spectrometer?

1. liquid chromatography (LC) directly coupled (in-line) with MS (LC-MS), introduced via electrospray (ESI)

2. peptides spotted onto metal surface, released into the MS via controlled laser shots (MALDI)


LC-MS

peptides are loaded onto an extremely small (50-150 µm) reversed-phase (silica particles coated with C18) column, and eluted directly into the machine by a gradient of increasing organic solvent (water - acetonitrile, with a small amount of acid – pH ~2)

100-400nl/min flow rates (nanoflow)

separated according to hydrophobicity (standard 1-2hr runs)

eluted into the MS in a very small volume, and therefore at high concentrations


In most MS applications, peptides are positively charged, via the application of a high voltage (~2 kV) to the buffer in the LC column

some amino acids (e.g. K, R, H), as well as the peptide amino terminus, are positively charged at low pH – so most peptides (esp. tryptic peptides) are multiply charged

charge is critical - the MS optics manipulate only charged ions, whereas uncharged peptides are “invisible”

LC column ends in a very fine needle (~5 microns); since the HPLC system is under pressure, and an electrical charge is applied, this results in a fine spray of droplets emanating from the tip containing charged peptides – electrospray ionization (soft ionization = Nobel prize)


Positively charged peptides are guided into the machine by a strong charge potential (and vacuum)

peptides first enter a small heated tube - as the fine droplets containing the peptides traverse the length of the tube, the buffer is rapidly evaporated

as the concentration of positively charged peptides increases in smaller and smaller droplets, they begin to repel one another, resulting in a series of Coulombic explosions

end result - individual positively charged peptides in the gas phase are ready for manipulation and measurement


So what is in a mass spectrometer, anyway?

think of it as a series of boxes, connected to each other via a pipe - each box has the ability to trap and release peptides, some boxes can also smash your peptides

at the end of the pipe sits a peptide counter (detector)

[diagram: boxes 1, 2 and 3 connected in series, followed by the detector]


Step 1

peptides enter the first chamber (Q1), where they are trapped (until the trap is full)

typical ion traps (Paul trap) use a combination of static DC and RF oscillating AC electric fields to move and manipulate the charged molecules

to characterize the contents of the trap, a small amount of the peptides (~10%) is released to the detector

this process is called the parent ion, precursor, or MS scan, and yields the m/z and intensity of all of the peptides in the first chamber at that moment

readout is expressed as intensity of signal (number of counts) for a given mass (actually m/z or mass/charge)
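Since the readout is m/z rather than mass, the same peptide shows up at different positions depending on how many protons it carries. A minimal sketch of the conversion, assuming protonation is the only source of charge; the function names and the 1500 Da example mass are illustrative and not from the lecture.

```python
PROTON = 1.00728  # mass of a proton in Da


def mass_to_mz(neutral_mass, charge):
    """m/z of a peptide carrying `charge` extra protons."""
    return (neutral_mass + charge * PROTON) / charge


def mz_to_mass(observed_mz, charge):
    """Neutral mass recovered from an observed m/z and a known charge state."""
    return charge * (observed_mz - PROTON)


# A hypothetical 1500 Da peptide seen as a 1+, 2+ or 3+ ion.
for z in (1, 2, 3):
    print(z, round(mass_to_mz(1500.0, z), 3))
```

The 2+ ion of this peptide appears near m/z 751, about half the m/z of the 1+ ion, which is why the charge state has to be known before a mass can be assigned.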


a parent ion (MS) scan

[figure: parent ion (MS) spectrum, ion intensity vs. m/z, with one peak marked "select for fragmentation"]


Step 2

collision induced dissociation

a process whereby a (mostly) pure population of a single peptide (actually a small m/z window) is ejected to a second chamber (the collision cell), and mixed with an inert gas

as energy is applied to the isolated peptide population, the peptides collide with the gas particles, and fragment – luckily for us, most of the time peptides fragment at peptide (amide) bonds between amino acids

add just enough energy to the collision cell such that an individual peptide fragments just once

the resulting mixed population of peptide fragments is then analyzed to give a product ion, tandem or MS/MS spectrum


a real CID spectrum


While dependent upon the particular goal of your analysis, the MS is usually programmed to conduct a single MS scan followed by several MS/MS scans

MS/MS scans are usually conducted on the x most abundant peptides (m/z), where x is 1-20

1 MS followed by 4-20 MS/MS scans (depending upon the instrument) is typical

Step 3

The ion trap is emptied, refilled, and the process repeated - the entire MS-MS/MS cycle takes 1-4 seconds and is thus repeated thousands of times per MS analysis

typical LC-MS run is 1-2 hrs

average ~10,000 MS/MS per hour for a complex sample
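The "x most abundant peptides" rule and the cycle arithmetic can be sketched as follows. This is only an illustration: the function names and the peak list are invented, and real instruments apply additional selection criteria that are not shown here.

```python
def pick_precursors(ms_scan, top_n):
    """ms_scan: list of (m/z, intensity) peaks from one MS scan.

    Returns the m/z values of the top_n most intense peaks, i.e. the
    ions that would be sent on for MS/MS in this cycle.
    """
    ranked = sorted(ms_scan, key=lambda peak: peak[1], reverse=True)
    return [peak_mz for peak_mz, _ in ranked[:top_n]]


def msms_per_hour(cycle_seconds, msms_per_cycle):
    """Number of MS/MS scans acquired in one hour of repeated cycles."""
    return int(3600 / cycle_seconds) * msms_per_cycle


# Made-up MS scan: (m/z, intensity)
scan = [(412.2, 1.5e5), (523.8, 9.0e5), (644.3, 3.2e4), (702.4, 4.1e5), (815.9, 7.7e3)]
print(pick_precursors(scan, top_n=3))
print(msms_per_hour(cycle_seconds=2, msms_per_cycle=5))
```

With a 2 second cycle and 5 MS/MS scans per cycle, the arithmetic gives 9000 MS/MS scans per hour, in line with the ~10,000 per hour quoted above.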


How does the MS/MS give you sequence information?

the most common and informative fragment ions are generated by fragmentation of the amide bonds between amino acids

b-ions if charge is retained by the amino-terminal fragment

y-ions if charge is retained by carboxy-terminal fragment

the differences in mass between the peptide fragments can be used to reconstruct the sequence of the original (parent) peptide (this is called de novo sequencing)

but fragmentation pattern matching is used more often (we will talk about this later)
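To make the b-ion / y-ion idea concrete, here is a small sketch that computes singly charged b and y ions for a peptide and then reads residues back from the mass differences between neighbouring ions. The residue masses are standard monoisotopic values; the function names and the test peptide PEPTIDE are illustrative and not taken from the lecture.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER, PROTON = 18.01056, 1.00728


def b_ions(peptide):
    """Singly charged N-terminal fragments b1 .. b(n-1)."""
    return [sum(RESIDUE[a] for a in peptide[:i]) + PROTON for i in range(1, len(peptide))]


def y_ions(peptide):
    """Singly charged C-terminal fragments y1 .. y(n-1)."""
    return [sum(RESIDUE[a] for a in peptide[-i:]) + WATER + PROTON for i in range(1, len(peptide))]


def read_ladder(ion_series, tol=0.01):
    """De novo step: turn mass differences between neighbouring ions into residues."""
    letters = []
    for lighter, heavier in zip(ion_series, ion_series[1:]):
        delta = heavier - lighter
        hits = [a for a, m in RESIDUE.items() if abs(m - delta) <= tol]
        letters.append("/".join(hits) if hits else "?")
    return letters


print([round(m, 3) for m in b_ions("PEPTIDE")])
print([round(m, 3) for m in y_ions("PEPTIDE")])
print(read_ladder(b_ions("PEPTIDE")))
```

Leucine and isoleucine have the same mass, so the ladder reports them as `L/I`; mass differences alone cannot tell them apart.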


a real CID spectrum


getting your sequence – most of the time, we use database searching

a user-defined protein database is subjected to in-silico digestion with the appropriate protease(s) to generate a list of all possible peptides

a theoretical fragmentation pattern is then generated for each peptide

parent ion mass (MS) and fragmentation data (MS/MS) from your analysis are compared to the theoretical data to find the best match

matches may then be subjected to statistical analysis to determine the quality of the ID (p-value)
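The database-search steps above (in-silico digestion, theoretical fragmentation, comparison with the observed parent mass and MS/MS peaks) can be sketched as a toy search. Everything here is a simplified assumption: the two-protein database, the tolerances and the "count matched fragments" score are invented for illustration, and the statistical step (p-value) is not modelled.

```python
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER, PROTON = 18.01056, 1.00728


def digest(protein):
    """In-silico trypsin: cut after K/R unless followed by P."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        if aa in "KR" and protein[i + 1:i + 2] != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def neutral_mass(peptide):
    return sum(RESIDUE[a] for a in peptide) + WATER


def fragments(peptide):
    """Theoretical singly charged b ions followed by y ions."""
    b = [sum(RESIDUE[a] for a in peptide[:i]) + PROTON for i in range(1, len(peptide))]
    y = [sum(RESIDUE[a] for a in peptide[-i:]) + WATER + PROTON for i in range(1, len(peptide))]
    return b + y


def search(database, precursor_mass, peaks, prec_tol=0.02, frag_tol=0.02):
    """Return (score, peptide, protein) for the best match, or None."""
    best = None
    for name, sequence in database.items():
        for peptide in digest(sequence):
            if abs(neutral_mass(peptide) - precursor_mass) > prec_tol:
                continue  # parent ion mass does not fit this candidate
            score = sum(any(abs(f - p) <= frag_tol for p in peaks) for f in fragments(peptide))
            if best is None or score > best[0]:
                best = (score, peptide, name)
    return best


# Hypothetical database and a mock "observed" spectrum built from SAMPLER.
database = {"protein_1": "MAGKSAMPLERVVLK", "protein_2": "MTTRELVISK"}
observed_mass = neutral_mass("SAMPLER") + 0.005
observed_peaks = [m + 0.003 for m in fragments("SAMPLER")[::2]]
print(search(database, observed_mass, observed_peaks))
```

The parent ion mass acts as a cheap first filter, so only candidates of the right mass are compared at the fragment level.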


spectral matching is also becoming more popular

millions of spectra have been generated and searched already

can keep these spectra in a library, then search for the best match to our newly generated spectra in the library

advantages – can identify “messier” spectra, and is very fast

disadvantages – if your peptide of interest has not been observed before, it won’t be in the library, and libraries may not be compatible between different machine types
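Library searching compares a new spectrum directly against stored spectra. One common way to score the similarity is a cosine (normalized dot product) between binned spectra; the sketch below assumes that scoring, a 1 m/z bin width, and an entirely made-up two-entry library.

```python
import math
from collections import defaultdict


def binned(spectrum, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins."""
    vector = defaultdict(float)
    for mz, intensity in spectrum:
        vector[int(mz / bin_width)] += intensity
    return vector


def cosine(spectrum_a, spectrum_b):
    """Cosine similarity between two binned spectra (1.0 = identical pattern)."""
    va, vb = binned(spectrum_a), binned(spectrum_b)
    dot = sum(va[k] * vb[k] for k in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(v * v for v in va.values()))
    norm_b = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_library_match(query, library):
    scores = {name: cosine(query, spectrum) for name, spectrum in library.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Made-up library and a slightly noisy query spectrum: (m/z, intensity)
library = {
    "library_peptide_A": [(148.06, 100), (227.10, 50), (324.16, 95)],
    "library_peptide_B": [(175.12, 100), (262.15, 60), (399.21, 80)],
}
query = [(148.1, 90), (227.1, 40), (324.2, 100), (500.3, 5)]
print(best_library_match(query, library))
```

A peptide that was never observed before has no entry in `library`, so the best score will simply be low; the search cannot return an identity the library does not already contain.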


Real spectral matching


Mass spectrometry identification of proteins

[figure: workflow - protein; proteolytic digestion; peptides; LC separation (trace vs. time, min); peptide selection and fragmentation (spectra vs. m/z); database searching; peptide identification; protein identification]


putting it all together

identification of peptides tells you which proteins were in your sample in the first place

can identify hundreds of proteins in a single MS run

can identify thousands of proteins in multiple MS runs of fractionated samples


questions?

take a break


MBP 1001 Lecture Part 2

Okay, so I understand how to identify peptides - and therefore proteins - so what?

i.e. what can proteomics do for you?


some typical proteomics goals:

global protein analysis

protein machines

protein-protein interactions

PTMs

quantitation


global protein analysis

goal - identification of every protein in a cell, tissue or organism

- can compare state A to state B
e.g. growth conditions, developmental stages, +/- hormone, mitogen or stress

normal vs. disease state?

typically involves extensive upstream protein (or peptide) fractionation

however, some issues:
dynamic range (MS vs serum?)
massive amounts of machine, computer, and analysis time


what proteins are present in each organelle?


protein-protein interactions

most cellular processes are carried out by multiprotein complexes (think transcription, translation, mRNA splicing, proteasomal degradation)

to know your friends is to know you:
interacting partners provide invaluable insight into understanding protein function and regulation

interacting partners also change in response to signaling events, providing further clues to function

signaling or metabolic pathways function in a stepwise fashion - understanding how these pathways are structurally connected


tagged protein/MS analysis - general

[figure: workflow - protein of interest fused to a tag; expression in relevant cell/tissue; isolation; sample fractionation; MS identification; one step is marked "optional"]


epitope tagging
short AA sequence recognized by Ab - FLAG, HA, GluGlu, etc.
metal binding - 6xHis
calcium binding - CaM
other strong bimolecular interactions: biotin/avidin, GST/glutathione, chitinBP/chitin, MBP/maltose

TAP (tandem affinity purification) consists of two protein tags, usually separated by a protease cleavage site

*how might a tag affect protein-protein interactions?
*pros/cons of different tag types?


tandem affinity purification (TAP) strategy

1 express POI as a fusion with 2 peptide tags

[diagram: ProtA and CaMBP tags fused to the protein of interest, with the TEV site and interacting partners labeled]

2 bind to IgG matrix, cleave with Tobacco Etch Virus (TEV) protease


TAP tag strategy (step 2)

3 bind to calmodulin (CaM) matrix

4 elute (EDTA)

5 identify co-purifying proteins


large-scale tagging projects

good:
pull down multiprotein complexes, providing a more realistic picture of interactions
possible to see interactions that are dependent upon PTMs
can do this type of analysis in relevant organism/cell/tissue

not so good:
lots of non-specific interactions; with sepharose, tags, or due to overexpression
detection of low abundance proteins may require scale-up

*how might you deal with these problems?

several large-scale tagging/MS projects now published have identified thousands of novel protein-protein interactions


other problems with large-scale techniques?

all of these techniques are biased toward proteins of higher abundance

-many low stoichiometry interactions may be missed

-usually conducted under a single condition, may miss very interesting regulated interactions


large-scale take-home messages

large-scale prot-prot interaction techniques are extremely valuable for obtaining a snapshot in time, and under a given set of environmental/developmental conditions

this knowledge is extremely valuable - connects formerly unconnected pathways and processes

provides an overview of how protein machines are built and interact with each other

however
-not much fine detail in these studies, much of the data uncorroborated by other methods
-if you are interested in a particular protein, protein machine, or biochemical pathway, present large-scale data will likely be unsatisfactory
-for these types of questions, more focused studies are required


IPs and tagged proteins

high density prot-prot interaction networks

small-scale quantitative proteomics

directed studies


classical IP analysis of protein complexes

[figure: gel with control and experimental lanes; markers at 116 kD, 97 kD, 66 kD and 45 kD]

samples are cleaned up until maximal difference between sample and control is achieved

*pros/cons?

weak interactors are lost
lots of background
extensive optimization required
conditions vary for each sample
specificity of Ab?
what kind of control(s)?


what does my protein do?

generating a high-density interaction map

you have found an interesting protein of unknown function

what does it do?


protein phosphatase 2A (PP2A or PPP2)

[diagram: trimeric complex with adapter (A), regulatory (B) and catalytic (C) subunits]

major Ser/Thr phosphatase in mammalian cells

PPP2 functions in most cases as a trimeric complex

conserved from yeast to human

numerous regulatory subunits (B) thought to confer substrate specificity


additional human PP2A-related phosphatases

[diagram: PPP2 adapter (A), PPP2 regulatory (B) and PPP2 catalytic (C) subunits, next to PPP4 catalytic (C) and PPP6 catalytic (C) subunits whose partners are marked "??"]

two additional phosphatases highly related to PPP2C

PPP4C is 67% identical to PPP2C

PPP6C is 58% identical to PPP2C

molecular organization of PPP4 and PPP6 was unknown

who do PPP4 and PPP6 talk to?


Generating a human protein interaction network

Clone protein of interest into a TAP-tag vector

Stably express TAP-tagged proteins in human 293 cells

Harvest cells, and affinity-purify recombinant proteins, as well as associated proteins

Identify all proteins in the complex by mass spectrometry

Obtain the cDNA for each protein identified

[diagram: interaction network with nodes labeled A to I]


high density data via iterative TAP-tagging reveals mutually exclusive and cooperative interactions in the PPP2 module

[diagram: PPP2 module with PPP2C, PPP2R1, PPP2R2, IGBP1 and PPP2R5; elements numbered 1 to 3]


PTMs

PTMs commonly identified using MS

phosphorylation
ubiquitylation
glycosylation
methylation
acetylation

hundreds of others…

identified primarily via a mass shift of a particular amino acid
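The "mass shift" idea can be sketched as a lookup: compare the observed mass of a modified peptide (or fragment) with the unmodified one and see which PTM explains the difference. The shifts below are standard monoisotopic values, not numbers from the lecture; the function name and the tolerance are assumptions, and heterogeneous modifications such as glycosylation are left out.

```python
# Monoisotopic mass shifts in Da (standard values).
PTM_SHIFTS = {
    "phosphorylation": 79.96633,
    "acetylation": 42.01057,
    "methylation": 14.01565,
    "ubiquitylation (GlyGly remnant left on K after trypsin)": 114.04293,
}


def explain_shift(delta_mass, tol=0.01):
    """Return the PTMs whose mass shift matches delta_mass within tol (Da)."""
    return [name for name, shift in PTM_SHIFTS.items() if abs(shift - delta_mass) <= tol]


unmodified = 799.35994  # neutral mass of a hypothetical peptide
modified = 879.32627    # same peptide observed ~80 Da heavier
print(explain_shift(modified - unmodified))
```

In an MS/MS spectrum the same shift appears only in the b or y ions that contain the modified residue, which is how the modification is localized to a particular amino acid.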


Reading a CID spectrum

i. unmodified peptide

ii. phosphopeptide

iii. sumoylated peptide


enrichment of phosphopeptides

IMAC
immunocapture
chemical capture
affinity chromatography


identification of a Ub conjugation site


quantitation and mass spectrometry

two primary methods

spectral counting - characterizing the number of spectra observed for a given protein, in relation to other proteins, or between samples

stable isotopes (e.g. 13C, 15N) - incorporation of stable isotopes into peptides does not alter biochemical properties (e.g. chromatography is unaffected) but changes the mass of the peptide - this, of course, is a property that the MS can see
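Because "light" and "heavy" versions of a peptide co-elute and differ only in mass, they appear in the same MS scan as a pair of peaks a fixed distance apart, and the ratio of their intensities gives the relative abundance. A minimal sketch, assuming a 13C6-lysine label (+6.02013 Da, one labeled lysine per peptide) and a made-up peak list; the function names are illustrative.

```python
LYS_13C6_SHIFT = 6.02013  # Da; six 12C atoms replaced by 13C in one lysine (assumed label)


def intensity_at(peaks, target_mz, tol=0.01):
    """Highest intensity among peaks within tol of target_mz (0.0 if none)."""
    hits = [intensity for mz, intensity in peaks if abs(mz - target_mz) <= tol]
    return max(hits, default=0.0)


def heavy_to_light_ratio(peaks, light_mz, charge, label_shift=LYS_13C6_SHIFT):
    """Ratio of heavy to light peak intensity for one peptide pair."""
    heavy_mz = light_mz + label_shift / charge  # the spacing shrinks with charge
    light = intensity_at(peaks, light_mz)
    heavy = intensity_at(peaks, heavy_mz)
    return heavy / light if light else None


# Made-up MS scan: a 2+ peptide pair plus an unrelated peak.
ms_scan = [(500.300, 2000.0), (503.310, 1000.0), (640.120, 350.0)]
print(heavy_to_light_ratio(ms_scan, light_mz=500.300, charge=2))
```

For a 2+ ion the pair is separated by about 3.01 m/z rather than 6.02, so the charge state has to be taken into account when looking for the partner peak.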


quantitative proteomics with stable isotopes

[figure: isotope-coding workflow - "light" and "heavy" peptides are separated together by LC; the MS spectrum (intensity vs. m/z) is used for quantitation, where the pair is offset by the isotopic mass difference and intensity is proportional to peptide abundance; the MS/MS spectrum (intensity vs. m/z) is used for identification]


spectral counting in a series of AP-MS analyses

protein A was tagged and isolated, sample subjected to LC-MS/MS

data

protein   condition 1   condition 2   protein B knockout
A         684           599           620
B         131           157           0
C         176           10            204
D         34            0             0

what can you get from this data?
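One way to start on this question is to express each co-purifying protein relative to the bait (protein A), so that differences in how much bait was recovered do not dominate the comparison. The sketch below uses the spectral counts from the slide; normalizing to the bait is a common convention assumed here, not something the lecture prescribes.

```python
conditions = ["condition 1", "condition 2", "protein B knockout"]

# Spectral counts from the slide, in the order of `conditions`.
counts = {
    "A": [684, 599, 620],
    "B": [131, 157, 0],
    "C": [176, 10, 204],
    "D": [34, 0, 0],
}


def normalize_to_bait(counts, bait="A"):
    """Divide every protein's spectral count by the bait count in the same sample."""
    return {
        protein: [n / b for n, b in zip(values, counts[bait])]
        for protein, values in counts.items()
    }


normalized = normalize_to_bait(counts)
for protein, ratios in normalized.items():
    print(protein, [round(r, 3) for r in ratios])
```

On these numbers, C falls sharply in condition 2 but is retained in the protein B knockout, while D is not detected in condition 2 or in the knockout. Spectral counts are only semi-quantitative, so these are leads to follow up rather than conclusions.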


isotopic labeling strategies

[figure: two labeling routes shown side by side. Metabolic labeling (SILAC): cells grown in "light" SILAC and cells grown in "heavy" SILAC; lysis; affinity purification; proteolytic digestion; fractionation; LC-MS/MS. Chemical labeling (ICAT): lysis; affinity purification; proteolytic digestion; labeling with "light" ICAT and labeling with "heavy" ICAT; isolation of ICAT-labeled peptides; fractionation; LC-MS/MS. The exact order of steps in the original figure is not recoverable from the transcript]


absolute quantitation

what if you would like to know absolute levels of your protein/peptide?

e.g. determine stoichiometries of various proteins in protein complexes?

AQUA – peptides synthesized with stable isotopes, to use as internal standards

spiked into sample, and used to quantify endogenous peptide by comparing ion intensities

can be made with standard PTMs
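The AQUA calculation is a simple proportion: the heavy standard is spiked in at a known amount, so the endogenous amount follows from the ratio of the two ion intensities. A minimal sketch with invented numbers; the function names, the fmol unit and the two-subunit example are assumptions for illustration.

```python
def aqua_amount(spiked_fmol, endogenous_intensity, standard_intensity):
    """Endogenous peptide amount from a known amount of heavy internal standard."""
    return spiked_fmol * (endogenous_intensity / standard_intensity)


def stoichiometry(amounts):
    """Express each amount relative to the least abundant component."""
    smallest = min(amounts.values())
    return {name: amount / smallest for name, amount in amounts.items()}


# Hypothetical complex with two subunits, 50 fmol of each heavy standard spiked in.
amounts = {
    "subunit_X": aqua_amount(50.0, endogenous_intensity=2.0e6, standard_intensity=1.0e6),
    "subunit_Y": aqua_amount(50.0, endogenous_intensity=1.0e6, standard_intensity=1.0e6),
}
print(amounts)
print(stoichiometry(amounts))
```

In this made-up case subunit X comes out at 100 fmol and subunit Y at 50 fmol, i.e. a 2:1 ratio in the purified complex.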


END


iTRAQ

[figure: iTRAQ workflow - treat cells (0 min, 30 min, 60 min, 120 min); isolate complex (subunits A, B, C); proteolyze; iTRAQ label (iTRAQ 114, iTRAQ 115, iTRAQ 116, iTRAQ 117); combine; quantitate and identify]